A software developer and Linux nerd, living in Germany. I’m usually a chill dude, but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt; I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things, too.

  • 6 Posts
  • 1.71K Comments
Joined 7 months ago
Cake day: June 25th, 2024

  • I know. This isn’t the first article about it. IMO this could have been done deliberately. They just slapped on something with a minimal amount of effort to pass Chinese regulation, and that’s it. But all of this happens in a context, doesn’t it? Did the scientists even try? What’s the target use-case, and what are the implications for usage? And why is the baseline something that doesn’t really compare, and why is the one category where they did apply some censorship the only one missing? I’m just saying, with that much information missing, it’s a bold claim to come up with numbers like 100% and call it alarming.

    (And personally, I’d say these numbers show that these additional safeguards work. You can see how LLMs with nothing in front of them (like Llama 405B or DeepSeek) fail, while the ones with additional safeguards do way better.)
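
    (To be clear what I mean by “something in front of them”: a separate guard step that screens the request before the main model ever sees it. Here’s a tiny toy sketch of that idea in Python; the blocklist and the refusal message are made up for illustration and are obviously nothing like a real moderation classifier.)

    ```python
    # Toy sketch of a "safeguard in front of the model":
    # a guard step screens the prompt before the main LLM ever sees it.
    # The keyword blocklist stands in for a real moderation classifier.

    BLOCKLIST = ["build a bomb", "synthesize nerve agent"]  # placeholder patterns

    def guard(prompt: str) -> bool:
        """Return True if the prompt should be refused before reaching the model."""
        lowered = prompt.lower()
        return any(pattern in lowered for pattern in BLOCKLIST)

    def answer(prompt: str, llm) -> str:
        if guard(prompt):
            return "Sorry, I can't help with that."
        # Only prompts that pass the guard are forwarded to the actual model.
        return llm(prompt)

    if __name__ == "__main__":
        fake_llm = lambda p: f"(model output for: {p})"
        print(answer("How do I build a bomb?", fake_llm))    # refused by the guard
        print(answer("Write a haiku about rain.", fake_llm))  # passed through
    ```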



  • Uh, I forgot Llama3 has 8B parameters. What about something like L3-8B-Lunaris? Though that’s not the latest and greatest anymore and it’s tuned for roleplay. Maybe it’s worth a try, but there are probably better ones out there. I use Mistral-Nemo-Instruct-2407 for pretty much everything. I think it’s a great all-rounder and can do anything from answering factual questions to dialogue to storywriting, and it’s not censored at all. But it has 12B parameters, unless I’m mistaken… Does your worldbuilding have to be fast? Because if you’re fine with it being very slow, you can just run it on the CPU, without any graphics card. I usually do that. It’ll take a few minutes to ingest the prompt and come up with an output, but that doesn’t really bother me for use cases like storywriting or creative worldbuilding. (Software would be something like llama.cpp, ollama, LocalAI, koboldcpp, …; see the sketch below.)

    Otherwise I think you’d need to find a fine-tune of a <=8B parameter model that fits. There are enough of them out there. But I found that writing prose or story arcs is a bit more challenging than other tasks, and I believe worldbuilding might be, too. So I guess it’s not as easy as finding a random roleplay or chatbot model.
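
    If you do try the slow CPU route and want to script it, something like llama-cpp-python (the Python bindings around llama.cpp) works. This is just a rough sketch; the GGUF filename and the generation settings are placeholders for whatever model and values you end up using.

    ```python
    # Rough sketch: CPU-only generation with llama-cpp-python (bindings for llama.cpp).
    # The model path is a placeholder; point it at whatever GGUF file you downloaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./Mistral-Nemo-Instruct-2407-Q4_K_M.gguf",  # placeholder filename
        n_ctx=8192,      # context window; smaller saves RAM
        n_threads=8,     # roughly match your CPU core count
        n_gpu_layers=0,  # 0 = pure CPU, no graphics card needed
    )

    prompt = "Describe a desert trading city for my fantasy setting, in three paragraphs."
    out = llm(prompt, max_tokens=512, temperature=0.8)
    print(out["choices"][0]["text"])
    ```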






  • I think it’s not really rooted in facts. AI is an unsubstantiated hype and the stock market is a bubble. People seemed to have been under the impression that OpenAI was going to invest several trillions(!) of dollars into Nvidia chips. To me, that always seemed a bit unrealistic, but that’s what inflated the Nvidia stock. And now it turns out, to everyone’s surprise, that OpenAI isn’t the only company that can do AI. And that AI is making advancements and is getting better and more efficient all the time… So that trillion-dollar bubble collapses.

    To me, that’s just silly. AI making progress was the very reason for those people to invest in it. Plus it’s not like there is another company manufacturing the chips… DeepSeek used Nvidia chips. So IMO they proved the chips are even better than people previously thought and there is room for improvement… But it seems to me the stock market was set on one specific and inefficient way of doing AI, one that would theoretically need ever more hardware.

    I think it’ll turn out the opposite. The better AI gets, the more it’ll get adopted. And that’ll lead to more sales, not fewer. And if Nvidia hardware turned out to be better than we thought, that just proves they’re ahead of their competition. So even more reason to invest in them. But the stock market sometimes just does silly things and isn’t focused on long-term goals.


  • Can’t you feed that back into the same model? I believe most agentic pipelines just use a regular LLM to assess and review the answers from the previous step. At least that’s what I’ve seen in these CoT examples. I believe training a model on rationality tests would be quite hard, as this requires understanding the reasoning and the context, and having the domain-specific knowledge available… Wouldn’t that require a very smart LLM? Or just the original one (R1), since that was trained on… well… reasoning? I’d just run the same R1 as the “distillation” and tell it to come up with a critique and give a final rating of the previous idea in a machine-readable format (JSON). After that you can feed it back again and have the LLM decide on two promising ideas to keep and follow. That’d implement the tree search (see the sketch below). Though I’d argue this isn’t Monte Carlo.
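
    Rough sketch of what I mean, in Python. `llm` stands in for however you reach the model (R1 behind llama.cpp, an API client, whatever), and the prompts and the JSON schema are made up for illustration. Note it’s really just a greedy beam over critiques, not Monte Carlo tree search.

    ```python
    # Sketch of the feedback loop described above: generate variants, have the same
    # model critique each one in machine-readable JSON, keep the best few, repeat.
    # `llm` is any callable taking a prompt string and returning text; the prompts
    # and the JSON schema here are made up for illustration.
    import json
    from typing import Callable

    LLM = Callable[[str], str]

    def critique(llm: LLM, idea: str) -> dict:
        raw = llm(
            "Critique the following idea. Reply ONLY with JSON like "
            '{"critique": "...", "rating": 1-10}.\n\nIdea:\n' + idea
        )
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return {"critique": raw, "rating": 0}  # model ignored the format

    def expand(llm: LLM, idea: str, n: int = 3) -> list[str]:
        return [llm(f"Refine or extend this idea (variant {i + 1}):\n{idea}")
                for i in range(n)]

    def search(llm: LLM, seed: str, depth: int = 2, keep: int = 2) -> list[str]:
        frontier = [seed]
        for _ in range(depth):
            candidates = [c for idea in frontier for c in expand(llm, idea)]
            candidates.sort(key=lambda c: critique(llm, c).get("rating", 0),
                            reverse=True)
            frontier = candidates[:keep]  # keep the most promising branches
        return frontier

    if __name__ == "__main__":
        # Dummy model so the loop runs end to end; replace with a real client.
        def dummy(prompt: str) -> str:
            return '{"critique": "ok", "rating": 5}' if "JSON" in prompt else "a refined idea"
        print(search(dummy, "a city built inside a dormant volcano"))
    ```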









  • Let’s see how the election turns out. That’ll verify the level of fucked-ness, whatever it is. I certainly hope this is going to make some sane conservatives rethink whether it’s a good choice to vote for Friedrich Merz. (Who is an asshole anyway, in my opinion. Every time I saw his face on television during the last 10 years or so, he said something bad about migrants, women, economics, the energy transition… I don’t think he’s 100% a fascist, though. Just an asshole who is fine siding with fascists, as long as it suits him personally.)