Even if LLMs can’t be said to have ‘true understanding’ (however you’re choosing to define it), there is very little to suggest they should be able to predict the correct response to a particular context, abstract meaning, and intent with what primitive tools they were built with.
Did you mean “shouldn’t”? Otherwise I’m very confused by your response
There’s no reason to expect that a program that calculates the probability of the next most likely word in a sentence should be able to do anything more than string together an incoherent sentence, let alone correctly answer even an arbitrary question.
It’s like using a description for how covalent bonds are formed as an explanation for how it is you know when you need to take a shit.
I find this line of thinking tedious.
Even if LLMs can’t be said to have ‘true understanding’ (however you’re choosing to define it), there is very little to suggest they should be able to predict the correct response to a particular context, abstract meaning, and intent with what primitive tools they were built with.
If there’s some as-yet uncrossed threshold to a bare-minimum ‘understanding’, it’s because we simply don’t have the language to describe what that threshold is or know when it has been crossed. If the assumption is that ‘understanding’ cannot be a quality granted to a transformer-based model, or even a quality granted to computers generally, then we need some other word to describe what LLMs are doing, because ‘predicting the next-best word’ is an insufficient description for what would otherwise be a sleight-of-hand trick.
There’s no doubt that there’s a lot of exaggerated hype around these models and LLM companies, but some of these advancements published in 2022 surprised a lot of people in the field, and their significance shouldn’t be slept on.
Certainly don’t trust the billion-dollar companies hawking their wares, but don’t ignore the technology they’re building, either.
You are best off thinking of LLMs as highly advanced auto correct. They don’t know what words mean. When they output a response to your question the only process that occurred was “which words are most likely to come next”.
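For reference, the most literal version of “which words are most likely to come next” is a bigram counter. Here is a minimal sketch, assuming a toy corpus and hypothetical helper names (`train_bigram`, `predict_next`). It is not how a transformer works; it is only the autocorrect-style baseline this comparison invokes.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus, made up for illustration only.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram(corpus)

print(predict_next(model, "the"))      # "cat" follows "the" most often here
print(predict_next(model, "giraffe"))  # None: the word never appeared in training
```

A predictor like this only ever looks at the single previous word, which is why it falls apart on anything longer than a short phrase.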
And we all know how often auto correct is wrong
Yep. Been having trouble with mine recently, it’s managed to learn my typos and it’s getting quite frustrating
That’s only true on a very basic level. I understand that Turing’s maths is complex and unintuitive, even more so than calculus, but it’s a well-established fact that relatively simple mathematical operations can have emergent properties: when they interact, they produce far more complexity than initially expected.
It’s the same way the giraffe gets its spots, and the same way all the hardware of our brain is built: a strand of code is converted into physical structures that interact and result in more complex behaviours. The actual reality is just math, and that math is almost entirely just probability when you get down to it. We’re all just next word guessing machines.
We don’t guess words like a Markov chain; instead we use a rather complex token system in our brain which then gets converted to words. LLMs do this too, which is how they can learn about a subject in one language then explain it in another.
Calling an LLM predictive text is a fundamental misunderstanding of reality. It’s somewhat true on a technical level, but only when you understand that predicting the next word can be a hugely complex operation, which is also the fundamental math behind all human thought.
Plus they’re not really just predicting one word ahead anymore. They do structured generation much like image generators do: first they get the higher-level principles to a valid state, then propagate down into structure and form before making word and grammar choices. You can manually change values in the different layers and see the output change. Exploring the latent space like this makes it clear that it’s not simply guessing the next word, but guessing the next word which will best fit into a required structure to express a desired point. I don’t know how other people are coming up with sentences, but that feels a lot like what I do.
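The point above about simple operations interacting to produce unexpected complexity can be illustrated with an elementary cellular automaton. This is a rough sketch, assuming Rule 110 as the example (my choice, not something named in the thread) and hypothetical function names `step` and `run`. Each cell is updated from only itself and its two neighbours, yet the printed pattern does not settle into anything obviously regular.

```python
def step(cells, rule=110):
    """Apply an elementary cellular automaton rule once, wrapping at the edges."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        out.append((rule >> index) & 1)              # look up that bit of the rule number
    return out


def run(width=31, steps=15, rule=110):
    """Start from a single live cell at the right edge and collect every generation."""
    cells = [0] * width
    cells[-1] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


for row in run():
    print("".join("#" if c else "." for c in row))
```

The whole update rule fits in one 8-bit number, which is the sense in which the underlying operation is “simple”.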
LLMs don’t “learn”; they literally don’t have the capacity to “learn”. We train them on an insane amount of text and then the LLM’s job is to produce output that looks like that text. That’s why when you attempt to correct it nothing happens. It can’t learn; it doesn’t have the capacity to.
Humans aren’t “word guessing machines”. Humans produce language with intent and meaning. This is why you and I can communicate. We use language to represent things. When I say “Tree” you know what that is because it’s the word we use to describe an object we all know about. LLMs don’t know what a tree is. They can use “tree” in a sentence correctly but they don’t know what it means. They can even translate it to another language but they still don’t know what “tree” means. What they know is generating text that looks like what they were trained on.
Here’s a well-made video by Kyle Hill that will teach you a lot better than I could.
Did you mean “shouldn’t”? Otherwise I’m very confused by your response
No, I mean ‘should’, as in:
It’s like using a description for how covalent bonds are formed as an explanation for how it is you know when you need to take a shit.
Fair enough, that just seemed to be the opposite of the point the rest of your post was making, so it seemed like a typo.
I don’t think so…