This is a perfect demonstration of how LLMs work and why they do not think.
The base question here, that the model is most strongly statistically geared towards, is “How many Rs are in strawberry”. You can see how the response in the screenshot works as the template for the correct answer to this question.
All it did was get the most likely response for the strawberry question (which is the closest, most confident match in structure to the blueberry question), and then substitute specific tokens. This is essentially what it does with every response to any question. It uses the closest match from the data it was trained on, then substitutes individual terms so the result looks appropriate to the question.
Ultimately every answer will only ever be an approximation, but there will never be any certainty to its correctness.
tbh that kinda sounds like it’s “thinking” though, just that it’s not very good at it at all
That’s the easiest way to describe it to people, but it isn’t. It’s just math doing this.
The undefeated argument for explaining it to laypeople is to show just how “linear” the process for an LLM is compared to human thought. When you prompt the LLM, all it ever does is take your input, turn it into a sequence of mathematical objects, and put them through a really long chain of matrix multiplications that lands on an output, which gets converted back into language. At no point does it have branches where it takes some time to introspect, consider, recall, or reflect on anything the way a human does when we receive a question. It’s not thinking.
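To make the “straight line” shape of that pipeline concrete, here is a minimal toy sketch in plain Python. Everything in it is an assumption for illustration: the seven-word vocabulary, the random weights, the three layers, and the averaging step that stands in for attention. It is not how any real model is built or trained, it only shows the text in, numbers, matrix multiplications, text out flow described above.

```python
import math
import random

VOCAB = ["how", "many", "b", "in", "blueberry", "2", "3"]  # toy vocabulary (assumption)
DIM = 4                                                     # toy vector size (assumption)

rng = random.Random(0)  # fixed seed so the made-up weights are reproducible


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


EMBED = rand_matrix(len(VOCAB), DIM)                 # token id -> vector
LAYERS = [rand_matrix(DIM, DIM) for _ in range(3)]   # stand-in for the "really long chain"
UNEMBED = rand_matrix(DIM, len(VOCAB))               # vector -> one score per vocabulary entry


def matvec(vec, mat):
    """Multiply a row vector by a matrix."""
    return [sum(v * mat[i][j] for i, v in enumerate(vec)) for j in range(len(mat[0]))]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(prompt_tokens):
    ids = [VOCAB.index(t) for t in prompt_tokens]    # language -> numbers
    # Crude stand-in for attention: average the input vectors into one state.
    state = [sum(EMBED[i][d] for i in ids) / len(ids) for d in range(DIM)]
    for layer in LAYERS:                             # one fixed pass, no branching
        state = [math.tanh(x) for x in matvec(state, layer)]
    return softmax(matvec(state, UNEMBED))           # numbers -> probabilities per token


probs = next_token_probs(["how", "many", "b", "in", "blueberry"])
prediction = VOCAB[probs.index(max(probs))]          # probabilities -> language
```

With random weights the prediction is meaningless, which is kind of the point: the only thing separating this from a useful model is what numbers sit in those matrices.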
I’ve taken to calling them “synths” because what is it doing that’s fundamentally different from a 1980’s CASIO? A simple input is returning a complex output? waow
Honestly I think if the term “cybernetics” had won over “artificial intelligence” there’d be less of this obfuscation. But “AI” is more marketable, and of course that’s all that matters.
Gippity, the technical term is gippity.
i don’t want to argue w/ people all day but it was a joke
Ultimately every answer will only ever be an approximation, but there will never be any certainty to its correctness.
sounds like pretty much any and all thinking to me, people don’t “know” things, they think they know things. usually they’re right, but memory is weird shit and doesn’t always work properly and there are ten billion and one factors that can influence a person’s recollection of some bit of information. i was like “woah the magic conch is just like me fr fr”
p.s. I do wanna argue though that while i don’t think chatgpt thinks, I do think that consciousness is an emergent property and with enough things like chatgpt all jumbled together you might see something resembling consciousness or thought, at least in a way that if you really interrogate it closely enough you might not be able to meaningfully differentiate it from biological consciousness or thought (which if you really wanna argue could also be reduced to “it’s just math” as well, just math that is way beyond the ability of people to determine. I mean if you had magical deterministic information of the position and interaction of every neuron and neurochemical and every related cellular process etc and could map out and understand it you could look at it and shrug and go “it’s just math” too, j/s doggggggggg)
this is where I’d press a disable inbox reply button IF I HAD IT
you really interrogate it closely enough you might not be able to meaningfully differentiate it from biological consciousness or thought (which if you really wanna argue could also be reduced to “it’s just math” as well, just math that is way beyond the ability of people to determine
Here’s one easy way to differentiate it: my brain is wet and runs on electrochemical processes powered by food. Is that a “significant” difference? That depends on what you think is worth tracking! Defining what counts as “functionally identical” requires you decide which features of a system are “functional” and which are “mere” cosmetic differences. That differentiation isn’t given to us by nature, though, and already reflects a hefty series of evaluative judgements. By carefully defining our functions, we can call any two things “functionally identical.” There’s no right answer, which is both a strength and a limitation of this kind of functionalist framework. Both the AI boosters and the AI “impossibilists” miss this point: functional identity is perspectival, and encodes a bunch of evaluative assumptions about which differences do and don’t matter. That’s ok–all model building does that–but it’s important not to confuse the map and the territory, or think we’re identifying some kind of value-independent feature of the world when we attribute functional identity.
you don’t think in language and words tho
you don’t think in language and words tho
am i missing this being sarcastic in a way because people do think in language and words
you have a running inner monologue, sure, but when you solve something like how many b’s in blackberry, would you honestly say you’re thinking in words about the problem?
you have concepts/ideas/pictures/words/signs/symbols whizzing by, that are not embodied in words until you want them to be. And until you engage in rechecking/reflecting, i don’t think it’s very likely this thinking is in language, more like you can interpret flashes of thoughts into words if you decide to dwell on them, but you’re not required to do so, and i don’t think ordinary engagement with imagination requires language. (could have sworn i linked some article related to math/language/fmri, that showed thinking about ideas (math in that case) is not exactly located in the language areas of the brain)
look i’m not a linguist so i’m not going to make the proper argument here but the defining features of our type of human are the specific adaptations for language, how people behave is culturally defined and culture is understood and communicated through language.
frankly likening the experience of sensations to knowledge of them without language sounds very silly to me.
neither does the computer!!!
I think chatgpt is basically like a computer equivalent of figuring out language processing to an alright degree which is like p. cool and I guess enough to trick people into thinking the mechanical turk has an agenda but yeah still not thinking
i guess my issue is that neural networks as they exist now can’t develop emergent properties; they are fitting to data to predict the next word in the best way possible, or the most probable one in an unknown sentence. It’s not how anybody learns, not mice, not humans.
Something akin to the experiments with free-floating robot arms with bolted-on computer vision seems like a much more viable approach, but there the problem is they don’t have the right architecture to feed it into, at least i don’t think they do, and even then it will probably stall out for a time at animal level.
my problem is at some point they’re gonna smoosh chatgpt and that sort of stuff and other shit together and it might be approximating consciousness but nerds will be like “it’s just math!” and it’ll make commander Data sad
n’ they won’t even care
A mechanical turk is a fake AI with a human behind it
The entire economy is getting refocused onto building a robot that lies to you.
Damn. Not even cable news anchors are safe from automation.
So ChatGPT will be running for office soon?
For reference, the reason this happens is that LLMs aren’t “next word predictors”, but rather “next token predictors”. Each word is broken into tokens, probably ‘blue’ and ‘berry’ in this case. The LLM doesn’t have any access to information below the token level, which means that it can’t count letters directly; it has to rely on the “proximity” of the tokens in its training data. Because there’s a lot on the Internet about letters and strawberries, it counts the r instead of the b in ‘berry’. Chain of Thought (CoT) models like Deepseek-reasoner or ChatGPT-o3 feed their output back into themselves and are more likely to output the text ‘b l u e b e r r y’, which is the trick to doing this. The lack of sub-token information isn’t a critical flaw and doesn’t come up often in real-world use cases, so there isn’t much energy dedicated to fixing it.
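A rough sketch of the token point, in plain Python. The vocabulary and the ID numbers below are made up, the ‘blue’ + ‘berry’ split is only the guess from the comment above, and the greedy longest-prefix matching is a simplification (real tokenizers typically use learned merge rules such as byte-pair encoding). It just shows that the model receives opaque IDs, while counting letters needs the characters themselves, which is why spelling the word out helps.

```python
# Hypothetical vocabulary: token text -> made-up integer ID.
TOY_VOCAB = {"blue": 101, "berry": 202, "straw": 303}


def toy_tokenize(word):
    """Greedy longest-prefix match against TOY_VOCAB (a simplification)."""
    ids = []
    rest = word
    while rest:
        match = max((p for p in TOY_VOCAB if rest.startswith(p)), key=len, default=None)
        if match is None:
            raise ValueError(f"no token covers {rest!r}")
        ids.append(TOY_VOCAB[match])
        rest = rest[len(match):]
    return ids


def count_letter(word, letter):
    """What ordinary code does when it can see the characters."""
    return word.lower().count(letter.lower())


def spell_out(word):
    """The chain-of-thought trick: put every letter on the page separately."""
    return " ".join(word)


model_view = toy_tokenize("blueberry")       # [101, 202]: no letters in sight
true_count = count_letter("blueberry", "b")  # 2
spelled = spell_out("blueberry")             # "b l u e b e r r y"
```

Nothing in `[101, 202]` says how many b’s there are. Once the word is spelled out with spaces, each letter is likely to end up as its own token, so the counting becomes something the model can actually do over its input.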
OpenAI got that sweet DoD gig and now they’re just slapping a UI wrapper on GPT 3.5 and calling it GPT 5.
china’s going to have an actual AI running in some nuclear fusion powered bunker solving climate change and destroying america while america burns up its rivers to power 27000 data centers, 40% of which are dedicated to grok’s boobs
Well yes it’s terrible and hallucinates, it’s a real piece of shit actually, but you see of course this is precisely why we need to commit all of humanity’s resources. To improve it! To allow it to spell a word!
I wonder if I would convince dumb rich investors to buy bags of my poop as the next big innovation
THIS IS INVESTMENT ADVICE: go all in on owl pellet futures
Considering the expected cultural impact of the harry potter show for the general idea of owls as pets, again
Can your poop replace my workers?
AI has its utilities, but capitalists searching for new frontiers and trying to find a genie that can solve climate change, poverty, wealth inequality, and really all of humanity’s problems - directly and indirectly caused by capitalism - is not going to happen.
It’s not AI. AI is a marketing term to get investors to throw money at them. There’s nothing intelligent happening here.
I suppose using it as a shorthand is misleading since it lends credibility to its misnomer; do we just stick to calling it LLMs then?
I just call them what they do. Text generator. Image denoiser. Having used every pre-LLM version of accelerated statistical analysis out there (anything meant to find patterns in data), it’s always been machine learning outputs. AI was only ever a term I heard in video gaming, which still seems more appropriate.
capitalists are not trying to solve any of those problems, they’re just looking for a magic machine that can replace workers
tbh I don’t think capitalists give a shit about those problems
every time i see the failures of the fancy predictive text machine i find myself asking “what exactly was wrong with expert systems?”
like, they actually work for what people need them for?
When is the screenshot from and which model?
GPT-5, which just released yesterday and is “clearly generally intelligent” according to Altman.
Just for more context, Altman was posting images of the Deathstar and acting like he is the AI Oppenheimer right before release.
I wanna see the results if you ask ChatGPT the same question a million times. What percentage of responses would actually get the correct number?
It depends on the temperature. There’s a variable you can play with that adds some randomness to the responses (LLMs are fully deterministic when temperature is 0). Sometimes the F1 or F2 score is used to determine correctness of many questions, but I don’t have a great understanding of how that metric works or what ChatGPT’s is.
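For anyone curious what the temperature knob does, here is a minimal sketch using only the Python standard library. The two candidate answers and their scores are invented for illustration (they are not measurements of ChatGPT or any real model), and treating temperature 0 as “always take the top score” is the usual convention rather than something confirmed for a specific API.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick an index from raw scores; temperature 0 always takes the top score."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]   # higher temperature flattens the scores
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]  # unnormalized softmax weights
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


ANSWERS = ["2", "3"]   # candidate answers to "how many b's in blueberry"
LOGITS = [1.0, 1.5]    # made-up scores that slightly favor the wrong answer "3"


def share_correct(temperature, trials=10_000, seed=0):
    """Ask the toy 'model' the same question many times and report how often it says 2."""
    rng = random.Random(seed)
    hits = sum(
        ANSWERS[sample_with_temperature(LOGITS, temperature, rng)] == "2"
        for _ in range(trials)
    )
    return hits / trials
```

With these invented scores, `share_correct(0)` is exactly 0.0, because the wrong answer has the top score and gets picked every time. Raising the temperature lets the lower-scored correct answer through some of the time, so the “what percentage would be right” question only has an answer once you fix a temperature.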
I think that heavily depends on whether it gets the initial answer right, since it will use that as context
When you’re calling it through an API then you can simply choose not to pass it any context
Fair enough
AI general intelligence achieved, I’ve probably answered this same question with that answer at some point in my life. And I have some level of intelligence.
3 Rs in strawberry
Strawberry and blueberry are both in “berry” category and are more closely associated with each other than any other fruit
B is to Blueberry the way R is to Strawberry
Therefore, blueberry has 3 Bs