• 0 Posts
  • 53 Comments
Joined 1 year ago
Cake day: March 22nd, 2024




  • I am on mobile and can be more detailed later if you want, but the gist is to sign up (with a payment method) for some API service. There are many. Some neat ones include:

    • Openrouter (a gateway to many, many models from many providers, I’d recommend this first)
    • Cerebras API (which is faster than anything and has a generous free tier)
    • Google Gemini, which you can try for free with no credit card.

    Some great models to look out for, that you may not know of:

    • GLM 4.5 (my all-around favorite)

    • Deepseek (and its uncensored finetunes)

    • Kimi

    • Jamba Large

    • Minimax

    • InternLM for image input

    • Qwen Coder for coding

    • Hermes 405B (which is particularly ‘uncensored’)

    • Gemini Pro/Flash, which is less private but free to try.

    Most (in exchange for charging pennies per request) do not log your prompts. If you are really, really concerned, you can even rent your own GPU instance on demand.

    Anyway, they will give you a key, which is basically a password.

    Paste that key into the LLM frontend of your choice, like Open WebUI or LM Studio, into a web app, or even into the OpenRouter web interface.
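    If you'd rather skip a frontend, here's a minimal sketch of hitting OpenRouter directly with that key, assuming the `openai` Python package; the model slug is just an example, so check their model list for current names:

    ```python
    # Minimal sketch: OpenRouter exposes an OpenAI-compatible API,
    # so any OpenAI client works if you point it at their base URL.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",  # OpenRouter's endpoint
        api_key="sk-or-...",                      # the key they give you
    )

    response = client.chat.completions.create(
        model="z-ai/glm-4.5",  # example slug; swap in any model from their list
        messages=[{"role": "user", "content": "Hello, which model am I talking to?"}],
    )
    print(response.choices[0].message.content)
    ```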




  • if ChatGPT sucks

    Most people don’t know anything beyond ChatGPT and Copilot.

    If we are talking about programmers, maybe include Claude, Gemini, Deepseek, and Perplexity search, though even that is not a given.

    …Point being, OpenAI does have a short-term ‘default’ and brand-recognition advantage, unfortunately.


    That being said, there’s absolutely manipulation of LLMs, though not quite what OP is thinking, per se. I see more of:

    • Benchmaxxing with a huge sycophancy bias (which works particularly well in LM Arena).

    • Benchmaxxing with massive thinking blocks, which is what OP is getting at. I’ve found Qwen is particularly prone to this, and it does drive up costs.

    • Token laziness from some of OpenAI’s older models, as if they were trained to give short responses to save GPU time.

    • “Deep Frying” models for narrow tasks (coding, GPQA style trivia, math, things like that) but making them worse outside of that, especially at long context.

    • …Straight up cheating by training on benchmark test sets.

    • Safety training to a ridiculous extent with stuff like Microsoft Phi, OpenAI, Claude, and such, for political reasons and to avoid bad PR.

    In addition, ‘free’ chat UIs are geared toward gathering data they can use to train on.

    You’re right that there isn’t much like ad injection or deliberate token padding yet, but still.


  • On the training side, it’s mostly:

    • Paying devs to prepare the training runs with data, software architecture, frameworks, smaller scale experiments, things like that.

    • Paying other devs to get the training to scale across 800+ nodes.

    • Building the data centers, where the construction and GPU hardware costs kind of dwarf power usage in the short term.

    On the inference side:

    • Sometimes custom-optimized deployment frameworks like Deepseek uses, though many seem to use something off the shelf like SGLang.

    • Renting or deploying GPU servers individually. They don’t need to be networked at scale like for training; the largest setup I’ve heard of (Deepseek’s optimized framework) is around 18 servers. And again, the sticker price of the GPUs is the big cost here.

    • Developing tool use frameworks.

    On both sides, the biggest players burn billions on Tech Bro “superstar” developers who, frankly, seem to tweet more than they develop interesting things.

    Microsoft talks up nuclear power and such just because they want to cut out the middleman, reduce power costs, and reduce the risk of outages, not because there’s physically not enough power from the grid. It’s just corporate cheapness, not an existential need.
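    To put rough numbers on the “hardware dwarfs power” point, here is a back-of-envelope sketch; every figure is an assumption for illustration, not a quote:

    ```python
    # Rough comparison: GPU sticker price vs. electricity cost over time.
    # All numbers are assumptions (H100-class card, industrial power rate).
    gpu_price_usd = 30_000       # assumed purchase price per GPU
    gpu_power_kw = 0.7           # assumed draw under load
    usd_per_kwh = 0.08           # assumed industrial electricity rate
    hours_per_year = 24 * 365

    power_cost_per_year = gpu_power_kw * hours_per_year * usd_per_kwh
    print(f"Power per GPU per year: ~${power_cost_per_year:,.0f}")  # roughly $500 at these assumptions
    print(f"Years of power in one GPU's price: ~{gpu_price_usd / power_cost_per_year:.0f}")  # roughly 60 years
    ```

    Cooling and networking overhead raise the power figure, but not enough to flip which cost dominates in the first few years.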



  • Shrug. The DoD is notorious for trying to keep competition between its suppliers alive. But I don’t know enough about the airplane business to say whether they’re in a death spiral or not.

    The fab business is a bit unique because of the sheer scaling of planning and capital involved.

    I dunno why you brought up China/foreign interests though. Intel’s military fab designs would likely never get sold overseas, and neither would the military arm of Boeing. I wouldn’t really care about that either way…

    This is just about keeping one of three leading edge processor fabs on the planet alive, and of course the gov is a bit worried about the other two in Taiwan and South Korea.


    The power usage is massively overstated, and a meme perpetuated by Altman so he’ll get more money for ‘scaling’. And he’s lying through his teeth: there literally isn’t enough silicon capacity in the world for that stupid idea.

    GPT-5 is already proof that scaling with no innovation doesn’t work. So are the open-source models nipping at its heels while trained and run on peanuts.

    And tech in the pipe, like bitnet, is coming to disrupt that even more; the future is small, specialized, augmented models, mostly running locally on your phone/PC because they’re so cheap and low-power.
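    For a sense of what bitnet means in practice, here’s a rough sketch of the 1.58-bit idea: weights get squashed to {-1, 0, +1}, so the expensive multiplies mostly collapse into adds. This is only the weight-quantization step (absmean-style, as described in the BitNet b1.58 paper), not a full model:

    ```python
    import numpy as np

    def ternarize(w: np.ndarray) -> tuple[np.ndarray, float]:
        # Sketch: scale by the mean |weight|, then round every entry into {-1, 0, +1}.
        scale = np.abs(w).mean() + 1e-8
        return np.clip(np.round(w / scale), -1, 1), scale

    w = np.random.randn(4, 4)   # stand-in for one layer's weights
    w_q, scale = ternarize(w)
    print(w_q)                  # only -1, 0, and +1 remain
    ```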

    There’s tons of stuff to worry about with LLMs and other generative ML, but future power usage isn’t one of them.


  • Ars is making a mountain out of a molehill.

    James McRitchie

    Kristin Hull

    These are literal activist investors known for taking such stances. It would be weird if they didn’t.

    a company that’s not in crisis

    Intel is literally circling the drain. It doesn’t look like it on paper, but the fab/chip design business is so long term that if they don’t get on track, they’re basically toast. And they’re also important to the military.

    Intel stock is up, short term and YTD. CNBC was oohing and aahing over it today. Intel is not facing major investor backlash.


    Of course there are blatant issues, like:

    However, the US can vote “as it wishes,” Intel reported, and experts suggested to Reuters that regulations may be needed to “limit government opportunities for abuses such as insider trading.”

    And we all know they’re going to insider trade the heck out of it, openly, and no one is going to stop them. Not to speak of the awful precedent this sets.

    But the sentiment (not the way the admin went about it) is not a bad idea. Government ties/history mixed with private enterprise are why TSMC and Samsung Foundry are where they are today, and their bowed-out competitors are not.





  • Yeah, I mean, this can be done with text too.

    People should mostly be using a RAG setup (retrieval-augmented generation), not pure LLM slop like this, for reference lookups. It just hasn’t really been built at scale because Google Search functioned as that well enough, and AI Bros seem to think everything should be done within LLM weights instead of proper databases.

    I mean… WTF. What if human minds were not allowed to use references?

    WolframAlpha was kinda trying to build this, but stalled.
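    As a toy sketch of the difference, here’s what “look it up, then answer” means versus pure weight recall. The word-overlap scoring below is just a stand-in for a real vector index, and the documents and question are made up:

    ```python
    # Toy RAG sketch: retrieve references from a store, then hand them to the
    # model as context, instead of trusting whatever is baked into its weights.
    documents = [
        "The Eiffel Tower is 330 metres tall.",
        "Wolfram|Alpha answers queries from curated structured data.",
        "GLM 4.5 is an open-weight large language model.",
    ]

    def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
        # Naive word-overlap scoring; a real system would use embeddings + a vector DB.
        q = set(query.lower().split())
        return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

    question = "How tall is the Eiffel Tower?"
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer using only these references:\n{context}\n\nQuestion: {question}"
    print(prompt)  # this is what actually gets sent to the LLM
    ```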





  • I think the metaphor is that finetuning an LLM for ‘safety’ is like trying to engineer the blades to be “finger safe”, when the better approach would be to guard against fingers getting inside an active blender at all.

    Finetuning LLMs to be safe is just not going to work, but building stricter usage structures around them will. Like tools.

    This kinda goes against Altman’s assertion that they’re magic crystal balls (in progress), which would pop the bubble he’s holding up. But in the weeds of LLM land, you see a lot more people calling for less censoring and more sensible, narrow usage.
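    A crude sketch of what “guarding the blender” can look like in code: put the restrictions in the structure around the model instead of inside its weights. The tool names and the check here are made up for illustration:

    ```python
    # Sketch: the model can *ask* for anything, but only whitelisted,
    # sandboxed tools ever actually run. Safety lives outside the weights.
    ALLOWED_TOOLS = {"search_docs", "calculator"}  # hypothetical tool names

    def run_tool_call(name: str, args: dict) -> str:
        if name not in ALLOWED_TOOLS:
            return f"refused: '{name}' is not an allowed tool"
        # ... dispatch to the real, sandboxed implementation here ...
        return f"ran {name} with {args}"

    print(run_tool_call("calculator", {"expr": "2+2"}))  # allowed
    print(run_tool_call("delete_all_files", {}))         # blocked by structure, not by finetuning
    ```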