Instead of diving into the math, I explain *why* LLMs are built as "predict the next word" engines, and present a theory for why they make conceptual errors.
Another example of "painting itself in a corner": if you ask it to give an answer and then a reason for it, it sometimes gets the answer wrong and then tries to justify it somehow. That's how it tries to escape! But if you ask it to write the justification first and then the answer, it can often avoid the trap.
(Another way to avoid the trap might be to admit a mistake, but apparently LLMs don't get much training in doing that.)
A consequence is that whenever you ask an LLM to justify its answer, there's no reason to believe that the justification is the real reason the neural network chose its answer. It invents a plausible justification, much as it would for someone else's answer. If you want the justification to possibly be real (or at least used as input), you need to ask for it before asking for the answer.
Great points.
I wonder whether including some examples of admitting a mistake in the training data – or even slipping them into the prompt (few-shot learning) – might improve performance for logical reasoning.
Do you have an example of asking it to write the justification first?
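The few-shot idea above could be sketched roughly as follows. This is only an illustration of the prompt layout: the worked example, its wording, and the helper name `few_shot_prompt` are all invented here, and whether this actually improves logical reasoning is exactly the open question.

```python
# Hypothetical worked example in which the answerer catches and admits a mistake.
# The wording is invented for illustration; it is not taken from any training set.
MISTAKE_EXAMPLE = (
    "Q: Is 91 a prime number?\n"
    "A: Yes. Let me check: 91 = 7 * 13, so it has divisors other than 1 and itself. "
    "I made a mistake; the correct answer is no.\n"
)


def few_shot_prompt(examples: list[str], question: str) -> str:
    """Prepend worked examples to a new question, ending with an open 'A:'."""
    return "\n".join(examples) + f"\nQ: {question}\nA:"


print(few_shot_prompt([MISTAKE_EXAMPLE], "Is 57 a prime number?"))
```

The resulting string would be sent to the model as the prompt, so the model sees an admitted mistake as part of the pattern it is asked to continue.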
Writing the justification first is sometimes called "chain of thought" reasoning, and ChatGPT will often do it on its own. I would guess that OpenAI trained it to do this because it's well known that it improves results. There's a paper about it: https://arxiv.org/abs/2201.11903
If it doesn't do it on its own, you can ask. Here's an example where I asked ChatGPT to explain its answer and got different results (towards the end, in the update): https://tildes.net/~tech/1567/megathread_7_for_news_updates_discussion_of_ai_chatbots_and_image_generators#comment-7u7t
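As a minimal sketch of "asking" for the justification first, the only thing that changes between the two prompts below is the order of the instructions. The function name `build_prompt`, the instruction wording, and the sample question are assumptions made for illustration; no particular chatbot API is implied.

```python
def build_prompt(question: str, justify_first: bool) -> str:
    """Return a prompt that fixes the order of reasoning and final answer."""
    if justify_first:
        # Reasoning is generated before the answer, so the answer can depend on it.
        instructions = (
            "First write your reasoning step by step. "
            "Only after that, give the final answer on a line starting with 'Answer:'."
        )
    else:
        # The answer is committed to first; any reasoning comes after the fact.
        instructions = (
            "Give the final answer on a line starting with 'Answer:'. "
            "Then explain your reasoning."
        )
    return f"{question}\n\n{instructions}"


question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)
print(build_prompt(question, justify_first=True))
```

Because the model produces text left to right, the justification in the first variant is available as input when the answer is written, which is not true of the second.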
That was very helpful for my understanding, especially the part about how it can plan a little (based on one “pass” through the network), and also that it is just spectacular at making shit up to finish a sentence.
I have to think future versions will both have a memory and be able to plan better. After all, many essays do lay a seed early on and come back to that seed later, so wouldn’t a sufficiently large model be able to copy that structure as well? That seems like another layer of sophistication that a deeper model could emulate without having to be at all original.
And on the originality topic, I have now had ChatGPT-4 write about 30 Shakespearean sonnets on a very broad range of (often silly or highly idiosyncratic) topics. You are right that, although the sonnets are often superficially quite clever and often use words quite skillfully, it never produces something that truly surprises or feels new. I would be very interested to see what “raw” GPT-4 can do, and whether it is any better on that score. I suspect you would say that while it might give a broader variety of answers (not having the layer of RLHF), its answers would still just be a clever remix of the internet and all the novels and essays ever written.
The Microsoft “Sparks of AGI” paper pushed hard on the claim that it really did sometimes produce original ideas and lines of reasoning, but I wonder if what they were really looking at with their supposedly most extreme examples of originality (like the real analysis proof they show in the paper) was just GPT-4 drawing close analogies from its training data. In other words, maybe the Microsoft authors of the paper didn’t really know where OpenAI’s training data came from, and mistook its ability to analogize with training data for evidence that it could come up with a real analysis proof on its own.
That being said, I’d be interested in your thoughts on what really constitutes “originality”. To me, the term seems suspiciously like “creative” or “adaptable”, in that they are often cited as de facto things humans can do that machines and almost all animals cannot. In your second article, you give the example of your coming up with a solution of parsing text into 500-word chunks, and I agree that is a good example of what humans call creative and original. Where I’m not convinced is that a future version of ChatGPT won’t be able to recombine ideas in such a way as to come up with equally original solutions to problems; after all, most people agree that Midjourney is “creative” and “adaptable”, so why wouldn’t a future version of ChatGPT also start being original? I feel that a lot of human originality is (ironically) derivative of what we already know, but remixed, as it were. But maybe reasoning through technical problems really does require something ungraspable merely by looking at the patterns of even a million math books.
Really enjoying your Substack!
> ...many essays do lay a seed early on and come back to that seed later, so wouldn’t a sufficiently large model be able to copy that structure as well?
Certainly the training process ought to spur a model towards incorporating this technique. My question is whether the model would be capable of carrying it out; I suspect that it's too hard for anything resembling the current generation of LLMs. For great writing, you can't just decide "hey, I'll plant a seed now"; you need to work out how all of the components of your story or essay are going to fit together, which means spending a lot of time outlining and plotting, with multiple revisions until you get all the pieces to fit together nicely. (Or so I presume; I don't aspire to this level of writing myself. I do something similar when tackling a difficult programming challenge, however.)
As for "originality": I suspect that, like "intelligence", this is one of those concepts where we're going to have to keep moving the goalposts as we repeatedly observe computers achieving the letter but not the spirit of the idea. Or to put it another way, I don't know that we have a very good idea of what "originality" means, and our rough notion was sufficient when only people had originality and we intuitively understood how it manifested in people, but that rough notion is not nearly adequate to characterize it in a computer. (All of which has also applied to "intelligence".)
Another point: I think we *think* originality is basically about novelty, but it's really about a combination of novelty and quality. A future version of ChatGPT may well be able to come up with original solutions to problems, but that won't be helpful unless they are *good* solutions, or at least valid solutions. I'm sure AIs will get there, but I think it will require more than just the ability to combine or generate ideas in novel ways; it will also require the ability to review, optimize, and explore branching paths that I keep harping on, especially in my "can an AI be a software engineer" post (https://amistrongeryet.substack.com/p/can-ai-do-my-job).
There have been some amazing advances with this approach to AI, but it absolutely cannot reason, not in any human sense of the word. I asked ChatGPT (GPT-4) for the most efficient way to tile a specific rectangular space with rectangular tiles. Dozens of attempts to get it to come up with a solution failed, despite adding lots of description, asking if the solution it provided was correct (it did at least always say that its solution was wrong), and giving it hints. Sentence completion is not reasoning, even if trained on lots of sentences that embed reasoned thinking.
For now I think any tasks that are more about reasoning than language will have to be sent out to plugins with domain-specific knowledge, or to actual feedback loops that can sanity-check the output. Though I should note that Wolfram Alpha didn't even attempt to answer the tiling question, and just gave me a definition of the first keyword in the question, like "floor" or "space".
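To make the "feedback loop" idea concrete, here is a rough sketch of a checker that could sanity-check a proposed tiling before anyone trusts it. Everything in it is an assumption for illustration: the `Rect` representation, the function names, and the restriction to axis-aligned tiles. It only verifies a layout; it does not search for the most efficient one.

```python
from itertools import combinations
from typing import NamedTuple


class Rect(NamedTuple):
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width
    h: float  # height


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect (shared edges are fine)."""
    return a.x < b.x + b.w and b.x < a.x + a.w and a.y < b.y + b.h and b.y < a.y + a.h


def check_layout(room_w: float, room_h: float, tiles: list[Rect]) -> list[str]:
    """Return a list of problems found; an empty list means the layout is valid."""
    problems = []
    for i, t in enumerate(tiles):
        if t.x < 0 or t.y < 0 or t.x + t.w > room_w or t.y + t.h > room_h:
            problems.append(f"tile {i} is outside the room")
    for (i, a), (j, b) in combinations(enumerate(tiles), 2):
        if overlaps(a, b):
            problems.append(f"tiles {i} and {j} overlap")
    return problems


# Two 2x2 tiles side by side in a 4x2 room: no problems reported.
print(check_layout(4, 2, [Rect(0, 0, 2, 2), Rect(2, 0, 2, 2)]))
# A 3x2 tile and a 2x2 tile that share the strip from x=2 to x=3: one overlap.
print(check_layout(4, 2, [Rect(0, 0, 3, 2), Rect(2, 0, 2, 2)]))
```

In a loop, the list of problems could be fed back to the model as a correction, which gives a concrete error message where the prompting above could only ask whether the solution was correct.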