A software developer and Linux nerd, living in Germany. I’m usually a chill dude, but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt; I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things, too.

  • 6 Posts
  • 1.71K Comments
Joined 7 months ago
Cake day: June 25th, 2024

  • I know. This isn’t the first article about it. IMO this could have been done deliberately. They just slapped on something with a minimal amount of effort to pass Chinese regulation, and that’s it. But all of this happens in a context, doesn’t it? Did the scientists even try? What’s the target use-case, and what are the implications for usage? And why is the baseline something that doesn’t really compare, and why is the one category where they did apply some censorship the only one missing? I’m just saying, with that much information missing, it’s a bold claim to come up with numbers like 100% and call it alarming.

    (And personally, I’d say these numbers show that these additional safeguards work. You can see how LLMs with nothing in front of them (like Llama 405B or DeepSeek) fail, while the ones with additional safeguards do way better.)
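
    (To be clear what I mean by “something in front of them”: a separate guard step that screens the request before the main model ever sees it. Here’s a tiny toy sketch of that idea in Python; the blocklist and the refusal message are made up for illustration and are obviously nothing like a real moderation classifier.)

    ```python
    # Toy sketch of a "safeguard in front of the model":
    # a guard step screens the prompt before the main LLM ever sees it.
    # The keyword blocklist stands in for a real moderation classifier.

    BLOCKLIST = ["build a bomb", "synthesize nerve agent"]  # placeholder patterns

    def guard(prompt: str) -> bool:
        """Return True if the prompt should be refused before reaching the model."""
        lowered = prompt.lower()
        return any(pattern in lowered for pattern in BLOCKLIST)

    def answer(prompt: str, llm) -> str:
        if guard(prompt):
            return "Sorry, I can't help with that."
        # Only prompts that pass the guard are forwarded to the actual model.
        return llm(prompt)

    if __name__ == "__main__":
        fake_llm = lambda p: f"(model output for: {p})"
        print(answer("How do I build a bomb?", fake_llm))    # refused by the guard
        print(answer("Write a haiku about rain.", fake_llm))  # passed through
    ```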



  • Uh, I forgot Llama3 has 8B parameters. What about something like L3-8B-Lunaris? Though that’s not the latest and greatest anymore and it’s tuned for roleplay. Maybe it’s worth a try, but there are probably better ones out there. I use Mistral-Nemo-Instruct-2407 for pretty much everything. I think it’s a great all-rounder and can do anything from answering factual questions to dialogue to storywriting, and it’s not censored at all. But it has 12B parameters, unless I’m mistaken… Does your worldbuilding have to be fast? Because if you’re fine with it being very slow, you can just run it on the CPU, without any graphics card. I usually do that. It’ll take a few minutes to ingest the prompt and come up with an output, but that doesn’t really bother me for use cases like storywriting or creative worldbuilding. (Software would be something like llama.cpp, ollama, LocalAI, koboldcpp, …; see the sketch below.)

    Otherwise I think you’d need to find a fine-tune of a <=8B parameter model that fits. There are enough of them out there. But I found that writing prose or story arcs is a bit more challenging than other tasks, and I believe worldbuilding might be, too. So I guess it’s not as easy as finding a random roleplay or chatbot model.
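
    If you do try the slow CPU route and want to script it, something like llama-cpp-python (the Python bindings around llama.cpp) works. This is just a rough sketch; the GGUF filename and the generation settings are placeholders for whatever model and values you end up using.

    ```python
    # Rough sketch: CPU-only generation with llama-cpp-python (bindings for llama.cpp).
    # The model path is a placeholder; point it at whatever GGUF file you downloaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./Mistral-Nemo-Instruct-2407-Q4_K_M.gguf",  # placeholder filename
        n_ctx=8192,      # context window; smaller saves RAM
        n_threads=8,     # roughly match your CPU core count
        n_gpu_layers=0,  # 0 = pure CPU, no graphics card needed
    )

    prompt = "Describe a desert trading city for my fantasy setting, in three paragraphs."
    out = llm(prompt, max_tokens=512, temperature=0.8)
    print(out["choices"][0]["text"])
    ```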






  • I think it’s not really rooted in facts. AI is an unsubstantiated hype and the stock market is a bubble. People seemed to have been under the impression that OpenAI was going to invest several trillions(!) of dollars into Nvidia chips. To me, that always seemed a bit unrealistic, but that’s what inflated the Nvidia stock. And now it turns out, to everyone’s surprise, that OpenAI isn’t the only company that can do AI. And that AI is making advancements and is getting better and more efficient all the time… So that trillion-dollar bubble collapses.

    To me, that’s just silly. AI making progress was the very reason for those people to invest in it. Plus it’s not like there is another company manufacturing the chips… DeepSeek used Nvidia chips. So IMO they proved the chips are even better than people previously thought and there is room for improvement… But it seems to me the stock market was set on one specific and inefficient way of doing AI, one that would theoretically need ever more hardware.

    I think it’ll turn out the opposite. The better AI gets, the more it’ll get adopted. And that’ll lead to more sales, not fewer. And if Nvidia hardware turned out to be better than we thought, that just proves they’re ahead of their competition. So even more reason to invest in them. But the stock market sometimes just does silly things and isn’t focused on long-term goals.


  • Can’t you feed that back into the same model? I believe most agentic pipelines just use a regular LLM to assess and review the answers from the previous step. At least that’s what I’ve seen in these CoT examples. I believe training a model on rationality tests would be quite hard, as this requires understanding the reasoning and the context, and having the domain-specific knowledge available… Wouldn’t that require a very smart LLM? Or just the original one (R1), since that was trained on… well… reasoning? I’d just run the same R1 as the “distillation” and tell it to come up with a critique and give a final rating of the previous idea in a machine-readable format (JSON). After that you can feed it back again and have the LLM decide on two promising ideas to keep and follow. That’d implement the tree search (see the sketch below). Though I’d argue this isn’t Monte Carlo.
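
    Rough sketch of what I mean, in Python. `llm` stands in for however you reach the model (R1 behind llama.cpp, an API client, whatever), and the prompts and the JSON schema are made up for illustration. Note it’s really just a greedy beam over critiques, not Monte Carlo tree search.

    ```python
    # Sketch of the feedback loop described above: generate variants, have the same
    # model critique each one in machine-readable JSON, keep the best few, repeat.
    # `llm` is any callable taking a prompt string and returning text; the prompts
    # and the JSON schema here are made up for illustration.
    import json
    from typing import Callable

    LLM = Callable[[str], str]

    def critique(llm: LLM, idea: str) -> dict:
        raw = llm(
            "Critique the following idea. Reply ONLY with JSON like "
            '{"critique": "...", "rating": 1-10}.\n\nIdea:\n' + idea
        )
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return {"critique": raw, "rating": 0}  # model ignored the format

    def expand(llm: LLM, idea: str, n: int = 3) -> list[str]:
        return [llm(f"Refine or extend this idea (variant {i + 1}):\n{idea}")
                for i in range(n)]

    def search(llm: LLM, seed: str, depth: int = 2, keep: int = 2) -> list[str]:
        frontier = [seed]
        for _ in range(depth):
            candidates = [c for idea in frontier for c in expand(llm, idea)]
            candidates.sort(key=lambda c: critique(llm, c).get("rating", 0),
                            reverse=True)
            frontier = candidates[:keep]  # keep the most promising branches
        return frontier

    if __name__ == "__main__":
        # Dummy model so the loop runs end to end; replace with a real client.
        def dummy(prompt: str) -> str:
            return '{"critique": "ok", "rating": 5}' if "JSON" in prompt else "a refined idea"
        print(search(dummy, "a city built inside a dormant volcano"))
    ```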









  • Let’s see how the election turns out. That’ll verify the level of fucked-ness, whatever it is. I certainly hope this is going to make some sane conservatives rethink whether it’s a good choice to vote for Friedrich Merz. (Who is an asshole anyway, in my opinion. Every time I saw his face on television during the last 10 years or so, he said something bad about migrants, women, economics, the energy transition… I don’t think he’s 100% a fascist, though. Just an asshole who is fine siding with fascists, as long as it suits him personally.)