• 0 Posts
  • 53 Comments
Joined 1 year ago
Cake day: March 22nd, 2024




  • I am on mobile and can be more detailed later if you want, but the gist is to sign up (with a payment method) for some API service. There are many. Some neat ones include:

    • Openrouter (a gateway to many, many models from many providers, I’d recommend this first)
    • Cerebras API (which is faster than anything and has a generous free tier)
    • Google Gemini, which you can try for free with no credit card.

    Some great models to look out for, that you may not know of:

    • GLM 4.5 (my all-around favorite)

    • Deepseek (and its uncensored finetunes)

    • Kimi

    • Jamba Large

    • Minimax

    • InternLM for image input

    • Qwen Coder for coding

    • Hermes 405B (which is particularly ‘uncensored’)

    • Gemini Pro/Flash, which is less private but free to try.

    Most (in exchange for charging pennies per request) do not log your prompts. If you are really, really concerned, you can even rent your own GPU instance on demand.

    Anyway, they will give you a key, which is basically a password.

    Paste that key into the LLM frontend of your choice, like Open WebUI or LM Studio, into a web app, or even into the OpenRouter web interface.
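    If you'd rather skip a frontend, here's a minimal sketch of hitting OpenRouter directly with that key, assuming the `openai` Python package; the model slug is just an example, so check their model list for current names:

    ```python
    # Minimal sketch: OpenRouter exposes an OpenAI-compatible API,
    # so any OpenAI client works if you point it at their base URL.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",  # OpenRouter's endpoint
        api_key="sk-or-...",                      # the key they give you
    )

    response = client.chat.completions.create(
        model="z-ai/glm-4.5",  # example slug; swap in any model from their list
        messages=[{"role": "user", "content": "Hello, which model am I talking to?"}],
    )
    print(response.choices[0].message.content)
    ```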




  • if ChatGPT sucks

    Most people don’t know anything beyond ChatGPT and Copilot.

    If we are talking about programmers, maybe include Claude, Gemini, Deepseek, and Perplexity search, though even that is not a given.

    …Point being, OpenAI does have a short-term ‘default’ and brand-recognition advantage, unfortunately.


    That being said, there’s absolutely manipulation of LLMs, though not quite what OP is thinking, per se. I see more of:

    • Benchmaxxing with a huge sycophancy bias (which works particularly well in LM Arena).

    • Benchmaxxing with massive thinking blocks, which is what OP is getting at. I’ve found Qwen is particularly prone to this, and it does drive up costs.

    • Token laziness from some of OpenAI’s older models, as if they were trained to give short responses to save GPU time.

    • “Deep Frying” models for narrow tasks (coding, GPQA style trivia, math, things like that) but making them worse outside of that, especially at long context.

    • …Straight up cheating by training on benchmark test sets.

    • Safety training to a ridiculous extent with stuff like Microsoft Phi, OpenAI, Claude, and such, for political reasons and to avoid bad PR.

    In addition, ‘free’ chat UIs are geared toward gathering data they can use to train on.

    You’re right that there isn’t much like ad injection or deliberate token padding yet, but still.


  • On the training side, it’s mostly:

    • Paying devs to prepare the training runs with data, software architecture, frameworks, smaller scale experiments, things like that.

    • Paying other devs to get the training to scale across 800+ nodes.

    • Building the data centers, where the construction and GPU hardware costs kind of dwarf power usage in the short term.

    On the inference side:

    • Sometimes custom-optimized deployment frameworks like Deepseek uses, though many seem to use something off the shelf like SGLang.

    • Renting or deploying GPU servers individually. They don’t need to be networked at scale like for training; the largest setup I’ve heard of (Deepseek’s optimized framework) is around 18 servers. And again, the sticker price of the GPUs is the big cost here.

    • Developing tool use frameworks.

    On both sides, the biggest players burn billions on Tech Bro “superstar” developers who, frankly, seem to tweet more than they develop interesting things.

    Microsoft talks up nuclear power and such just because they want to cut out the middleman, reduce power costs, and reduce the risk of outages, not because there’s physically not enough power from the grid. It’s just corporate cheapness, not an existential need.
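    To put rough numbers on the “hardware dwarfs power” point, here is a back-of-envelope sketch; every figure is an assumption for illustration, not a quote:

    ```python
    # Rough comparison: GPU sticker price vs. electricity cost over time.
    # All numbers are assumptions (H100-class card, industrial power rate).
    gpu_price_usd = 30_000       # assumed purchase price per GPU
    gpu_power_kw = 0.7           # assumed draw under load
    usd_per_kwh = 0.08           # assumed industrial electricity rate
    hours_per_year = 24 * 365

    power_cost_per_year = gpu_power_kw * hours_per_year * usd_per_kwh
    print(f"Power per GPU per year: ~${power_cost_per_year:,.0f}")  # roughly $500 at these assumptions
    print(f"Years of power in one GPU's price: ~{gpu_price_usd / power_cost_per_year:.0f}")  # roughly 60 years
    ```

    Cooling and networking overhead raise the power figure, but not enough to flip which cost dominates in the first few years.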



  • Shrug. The DoD is notorious for trying to keep competition between its suppliers alive. But I don’t know enough about the airplane business to say whether they’re in a death spiral or not.

    The fab business is a bit unique because of the sheer scaling of planning and capital involved.

    I dunno why you brought up China/foreign interests though. Intel’s military fab designs would likely never get sold overseas, and neither would the military arm of Boeing. I wouldn’t really care about that either way…

    This is just about keeping one of three leading edge processor fabs on the planet alive, and of course the gov is a bit worried about the other two in Taiwan and South Korea.


    The power usage is massively overstated, and a meme perpetuated by Altman so he’ll get more money for ‘scaling’. And he’s lying through his teeth: there literally isn’t enough silicon capacity in the world for that stupid idea.

    GPT-5 is already proof that scaling with no innovation doesn’t work. So are the open-source models nipping at its heels while trained and run on peanuts.

    And tech in the pipe, like bitnet, is coming to disrupt that even more; the future is small, specialized, augmented models, mostly running locally on your phone/PC because they’re so cheap and low-power.
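    For a sense of what bitnet means in practice, here’s a rough sketch of the 1.58-bit idea: weights get squashed to {-1, 0, +1}, so the expensive multiplies mostly collapse into adds. This is only the weight-quantization step (absmean-style, as described in the BitNet b1.58 paper), not a full model:

    ```python
    import numpy as np

    def ternarize(w: np.ndarray) -> tuple[np.ndarray, float]:
        # Sketch: scale by the mean |weight|, then round every entry into {-1, 0, +1}.
        scale = np.abs(w).mean() + 1e-8
        return np.clip(np.round(w / scale), -1, 1), scale

    w = np.random.randn(4, 4)   # stand-in for one layer's weights
    w_q, scale = ternarize(w)
    print(w_q)                  # only -1, 0, and +1 remain
    ```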

    There’s tons of stuff to worry about with LLMs and other generative ML, but future power usage isn’t one of them.


  • Ars is making a mountain out of a molehill.

    James McRitchie

    Kristin Hull

    These are literal activist investors known for taking such stances. It would be weird if they didn’t.

    a company that’s not in crisis

    Intel is literally circling the drain. It doesn’t look like it on paper, but the fab/chip design business is so long term that if they don’t get on track, they’re basically toast. And they’re also important to the military.

    Intel stock is up, short term and YTD. CNBC was oohing and aahing over it today. Intel is not facing major investor backlash.


    Of course there are blatant issues, like:

    However, the US can vote “as it wishes,” Intel reported, and experts suggested to Reuters that regulations may be needed to “limit government opportunities for abuses such as insider trading.”

    And we all know they’re going to insider trade the heck out of it, openly, and no one is going to stop them. Not to speak of the awful precedent this sets.

    But the sentiment (not the way the admin went about it) is not a bad idea. Government ties/history mixed with private enterprise are why TSMC and Samsung Foundry are where they are today, and their bowed-out competitors are not.





  • Yeah, I mean, this can be done with text too.

    People should mostly be using a RAG setup (retrieval-augmented generation), not pure LLM slop like this, for reference lookups. It just hasn’t really been built at scale because Google Search functioned as that well enough, and AI Bros seem to think everything should be done within LLM weights instead of proper databases.

    I mean… WTF. What if human minds were not allowed to use references?

    WolframAlpha was kinda trying to build this, but stalled.
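    As a toy sketch of the difference, here’s what “look it up, then answer” means versus pure weight recall. The word-overlap scoring below is just a stand-in for a real vector index, and the documents and question are made up:

    ```python
    # Toy RAG sketch: retrieve references from a store, then hand them to the
    # model as context, instead of trusting whatever is baked into its weights.
    documents = [
        "The Eiffel Tower is 330 metres tall.",
        "Wolfram|Alpha answers queries from curated structured data.",
        "GLM 4.5 is an open-weight large language model.",
    ]

    def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
        # Naive word-overlap scoring; a real system would use embeddings + a vector DB.
        q = set(query.lower().split())
        return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

    question = "How tall is the Eiffel Tower?"
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer using only these references:\n{context}\n\nQuestion: {question}"
    print(prompt)  # this is what actually gets sent to the LLM
    ```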





  • I think the metaphor is that finetuning an LLM for ‘safety’ is like trying to engineer the blades to be “finger safe”, when the better approach would be to guard against fingers getting inside an active blender at all.

    Finetuning LLMs to be safe is just not going to work, but building stricter usage structures around them will. Like tools.

    This kinda goes against Altman’s assertion that they’re magic crystal balls (in progress), which would pop the bubble he’s holding up. But in the weeds of LLM land, you see a lot more people calling for less censoring and more sensible, narrow usage.
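    A crude sketch of what “guarding the blender” can look like in code: put the restrictions in the structure around the model instead of inside its weights. The tool names and the check here are made up for illustration:

    ```python
    # Sketch: the model can *ask* for anything, but only whitelisted,
    # sandboxed tools ever actually run. Safety lives outside the weights.
    ALLOWED_TOOLS = {"search_docs", "calculator"}  # hypothetical tool names

    def run_tool_call(name: str, args: dict) -> str:
        if name not in ALLOWED_TOOLS:
            return f"refused: '{name}' is not an allowed tool"
        # ... dispatch to the real, sandboxed implementation here ...
        return f"ran {name} with {args}"

    print(run_tool_call("calculator", {"expr": "2+2"}))  # allowed
    print(run_tool_call("delete_all_files", {}))         # blocked by structure, not by finetuning
    ```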