3 Comments

Hmmm, so I've been mulling this too: LLMs, for example, do predictive modeling based on what is most likely to come next, averaged across all the different training examples. They sometimes make stuff up and sometimes get things wrong. It's possible, therefore, to suggest that they have accurately modelled language but not modelled meaning. Because they haven't modelled meaning, they are not "grounded" in either definitions or experience, whereas humans are.

BUT... a counter-argument is that the prompt is the grounding (which is why prompt engineering "works") and RLHF is the "experience". You can maybe take both of these together and have the situation that, real soon now (less than 5 years), we have an LLM sufficiently well trained via RLHF that it responds just like a human would. Maybe... ha.
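To make the "most likely to come next" part concrete, here's a rough sketch using GPT-2 via Hugging Face transformers purely as a stand-in (any causal LM behaves the same way in outline); the two example prompts are just mine, to show how the prompt steers the prediction:

```python
# Minimal sketch: pick the single most likely next token, and note how changing
# the prompt ("grounding") changes which token that is.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def most_likely_next_token(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits          # shape: [1, seq_len, vocab_size]
    next_id = logits[0, -1].argmax().item()      # highest-probability next token
    return tokenizer.decode(next_id)

# Same kind of question, different grounding in the prompt -> (possibly) different continuation.
print(most_likely_next_token("The capital of France is"))
print(most_likely_next_token("In this fantasy novel, the capital of France is"))
```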

author

Yeah, I'm sure we'll see further progress from LLMs trained on even more / better data + more RLHF (and other forms of further training) + better prompt engineering.

At the same time, there are clearly limits to this approach. For instance, no amount of training and RLHF will add long-term memory to an LLM, because there is physically no mechanism by which the LLM can store information beyond the life of its token window. (Yes, people try to work around this using RAG, but as I explained in my recent post on memory, I don't think that will get us to human-equivalent ability to work on long-term projects.) That's the point I'm trying to make – it can be simultaneously true that (a) a given approach, in this case LLMs, will keep making progress and (b) there's no reason to believe that progress will encompass all of human thought; some aspects of thought fit into the word-prediction paradigm, and some do not, and the fact that we keep getting surprised by how many things do fit into the paradigm does not mean that everything will.
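To make the constraint concrete, here's a rough sketch of what the RAG workaround amounts to: nothing persists inside the model itself, so any "memory" has to be re-injected into the finite context window on every call. The embed() function, token budget, and word-count token estimate below are placeholders I've made up for illustration, not any particular library's API:

```python
# Hedged sketch of retrieval-augmented generation (RAG): store notes outside the
# model, retrieve the most relevant ones per query, and stuff only what fits
# into the fixed context window. Anything that doesn't fit is invisible to the model.
import numpy as np

MAX_CONTEXT_TOKENS = 4096   # the hard limit the comment above refers to (value is illustrative)

def embed(text: str) -> np.ndarray:
    """Placeholder for some embedding model; returns a deterministic unit vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def retrieve(query: str, notes: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    return sorted(notes, key=lambda n: float(embed(n) @ q), reverse=True)[:k]

def build_prompt(query: str, notes: list[str]) -> str:
    context, used = [], 0
    for note in retrieve(query, notes):
        cost = len(note.split())                 # crude token estimate
        if used + cost > MAX_CONTEXT_TOKENS:
            break                                # the rest never reaches the model
        context.append(note)
        used += cost
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
```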

Sep 30, 2023 · Liked by Steve Newman

Yeah, agreed. I'm basically responding in a stream-of-consciousness way.

My thought (and you've kind of said the same thing in a more expanded way, with more examples) is that there are gradually increasing layers to AGI. Right now we're going for task automation. We might find that we get something that seems to be human within the domain of certain tasks. In the RAG area, I'm thinking the real challenge isn't RAG per se... it's open-ended RAG. Could be any question vs. could be any knowledge. That's a HARD problem IMO.

On the other hand there's a much simpler version: "hey can I have a decaf americano" (at the drive thru) followed by "sir, this is a bank".

My crappy joke is basically saying that many corporate use cases for RAG, for example, are severely constrained. And if we look at what DeepMind is discovering, it's the severely constrained domains that are easier to "conquer" via AI (see the sketch below).
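To illustrate what I mean by "severely constrained": if you can pre-filter the corpus to one narrow domain before retrieval, the problem is far smaller than "could be any question vs. could be any knowledge". The documents, domains, and naive word-overlap scoring below are all made up for illustration:

```python
# Rough sketch of a constrained-domain retrieval step.
documents = [
    {"domain": "hr_policy",   "text": "Employees accrue 1.5 vacation days per month."},
    {"domain": "hr_policy",   "text": "Expense reports are due within 30 days of purchase."},
    {"domain": "engineering", "text": "Production deploys require two reviewer approvals."},
]

def constrained_retrieve(query: str, domain: str, k: int = 2) -> list[str]:
    # Step 1: hard-constrain the search space to one domain (the "easy" version of RAG).
    in_scope = [d["text"] for d in documents if d["domain"] == domain]
    # Step 2: score by naive word overlap (a real system would use embeddings).
    q_words = set(query.lower().split())
    scored = sorted(in_scope,
                    key=lambda t: len(q_words & set(t.lower().split())),
                    reverse=True)
    return scored[:k]

print(constrained_retrieve("How many vacation days do I accrue?", domain="hr_policy"))
```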

But yeah, I'm wandering off your point; you're talking about AGI and I'm talking about narrow AI that *looks like* AGI within the general domain.

My hope is we're actually somewhere in the middle; it would be good IMO to have very capable narrow AGI in these areas to give us a chance to catch up...

Love your posts by the way.
