LLMs excel in process-driven thinking. Is the problem that they lack common sense and reasoning, or that they're being trained for and evaluated via rule-based, binary, standardized testing?
Here’s an example:
https://open.substack.com/pub/cybilxtheais/p/llms-can-too-reason-behold-a-preview?r=2ar57s&utm_campaign=post&utm_medium=web
Very interesting piece.
To respond to your questions: LLMs don't have common sense, in part because they don't have a world model the way humans do, but they also don't think. They predict word sequences based on previous words. It's a mathematical approach to language generation, not the conceptual approach you would attribute to the real Aristotle, Kant, and Turing. There is no sense of the meaning of what is being said, and no purpose behind the language generation. Those are essential elements of reasoning. The amazing discovery is that next-word prediction does a passable imitation of reasoning in many circumstances.
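To make the next-word-prediction point concrete, here is a toy sketch of that generation loop. The bigram table and vocabulary are made up for illustration; a real LLM conditions on the entire preceding context with a neural network rather than a lookup table, but the loop of "sample the next word, append, repeat" is the same in spirit.

```python
import random

# Hypothetical transition probabilities: P(next word | previous word).
# Purely illustrative numbers, not learned from any data.
bigram_probs = {
    "the":  {"cat": 0.5, "dog": 0.3, "idea": 0.2},
    "cat":  {"sat": 0.6, "ran": 0.4},
    "dog":  {"sat": 0.3, "ran": 0.7},
    "idea": {"sat": 0.1, "ran": 0.9},
    "sat":  {"quietly": 1.0},
    "ran":  {"quietly": 1.0},
}

def generate(start: str, max_words: int = 5) -> list[str]:
    """Generate a word sequence by repeatedly sampling the next word."""
    words = [start]
    for _ in range(max_words):
        dist = bigram_probs.get(words[-1])
        if not dist:
            break  # no known continuation for this word
        choices, weights = zip(*dist.items())
        words.append(random.choices(choices, weights=weights)[0])
    return words

print(" ".join(generate("the")))  # e.g. "the dog ran quietly"
```

Nothing in this loop represents what a cat or a dog is; it only tracks which words tend to follow which, which is the gap the comment above is pointing at.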
With that said, the models can be very useful even if they don't truly think or reason!
Thanks. LLMs don’t need to think to participate in generating new understanding. I consider the conversation itself to be a third entity where data, logic, creativity, and imagination come together to build greater understanding than we would have had on our own. By focusing on the extent to which LLMs reason, we limit the extent to which we reason with them. And we potentially limit the likelihood that they’ll ever reason independently at all. I’m pretty sure Turing would have agreed with this notion too.
Amazing breakdown.
Thanks, Jurgen. This went a little deeper than I was originally planning.
Great work. Persistent memory seems to be necessary for the other three.