forked from jackyzha0/quartz
Merge pull request #49 from plastic-labs/chl_notes
notezzz

Showing 9 changed files with 78 additions and 6 deletions.

17 changes: 17 additions & 0 deletions
content/notes/Human-AI chat paradigm hamstrings the space of possibility.md

The human-AI chat paradigm assumes only two participants in a given interaction. While this is sufficient for conversations directly with un-augmented foundation models, it creates many obstacles when designing more sophisticated cognitive architectures. When you train/fine-tune a language model, you begin to reinforce the token distributions that are expected to appear between the special tokens denoting human vs AI messages.
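
To make the special-token framing concrete, here is a minimal sketch in the ChatML style (one common convention; the exact special tokens vary by model, and none of this is specific to our stack). Every fine-tuning example rendered this way teaches the model what kind of text belongs immediately after the assistant header:

```python
# ChatML-style rendering of a two-party chat (one common convention; the
# exact special tokens differ between models). Fine-tuning on examples like
# this reinforces what kinds of text follow the assistant header.
messages = [
    {"role": "user", "content": "Can you help me rephrase this email?"},
    {"role": "assistant", "content": "Sure, here's a friendlier draft: ..."},
]

def render_chatml(messages):
    """Flatten a conversation into the token stream the model is trained on."""
    rendered = ""
    for m in messages:
        rendered += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # At inference time we append the assistant header and let the model
    # continue -- so only direct-response-shaped text ever appears here.
    rendered += "<|im_start|>assistant\n"
    return rendered

print(render_chatml(messages))
```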

Here's a limited list of things *besides* a direct response we routinely want to generate:

- A 'thought' about how to respond to the user
- A [[Loose theory of mind imputations are superior to verbatim response predictions|theory of mind prediction]] about the user's internal mental state
- A list of ways to improve prediction
- A list of items to search over storage
- A 'plan' for how to approach a problem
- A mock user response
- A [[LLM Metacognition is inference about inference|metacognitive step]] to consider the product of prior inference

In contrast, the current state of inference is akin to immediately blurting out the first thing that comes to mind--something that humans with practiced aptitude in social cognition rarely do. But generating anything else is very hard, because those types of responses never appear after the special AI message token. Not very flexible.
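
One way around this, sketched below with hypothetical helper names (`complete(messages)` stands in for whatever chat-completion API you use; it is not a real library call), is to run these other inference types as separate calls, each dressed up as an ordinary turn that asks for a specific kind of output:

```python
# Hypothetical sketch: generating a non-response inference type (here, a
# theory-of-mind 'thought') as its own call, since the chat template only
# knows user and assistant turns. `complete(messages)` is a stand-in for
# whatever chat-completion API you use.

def theory_of_mind_step(conversation, complete):
    """Ask for a prediction about the user's internal state instead of a reply."""
    prompt = conversation + [{
        "role": "user",
        "content": (
            "Before responding, describe what the user is likely thinking, "
            "feeling, and expecting right now. Do not address the user directly."
        ),
    }]
    return complete(prompt)

def respond_with_thought(conversation, complete):
    """Generate the 'thought' first, then condition the actual reply on it."""
    thought = theory_of_mind_step(conversation, complete)
    final_prompt = conversation + [{
        "role": "user",
        "content": f"(internal note, not from the user) {thought}\n"
                   "Now write the reply to the user's last message.",
    }]
    return complete(final_prompt)
```

This works, but it is exactly the kind of contortion the note is pointing at: every extra inference type has to masquerade as a user or assistant message, because those are the only roles the template knows.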

We're already anecdotally seeing well-trained completion models follow instructions impressively, likely because instruction data has been incorporated into pretraining. Is chat the next thing to be subsumed by general completion models? If so, flexibility in the types of inferences you can make would be very beneficial.

Metacognition then becomes something you can do at any step in a conversation. Same with instruction following & chat. Maybe this helps push LLMs in a much more general direction.

17 changes: 17 additions & 0 deletions
content/notes/LLMs excel at theory of mind because they read.md

Large language models are [simulators](https://generative.ink/posts/simulators/). In predicting the next likely token, they are simulating how an abstracted “_any person_” might continue the generation. The basis for this simulation is the aggregate compression of a massive corpus of human generated natural language from the internet. So, predicting humans is _literally_ their core function.

In that corpus is our literature, our philosophy, our social media, our hard and social science--the knowledge graph of humanity, both in terms of discrete facts and messy human interaction. That last bit is important. The latent space of an LLM's pretraining is in large part a _narrative_ space. Narration chock full of humans reasoning about other humans--predicting what they will do next, what they might be thinking, how they might be feeling.

That's no surprise; we're a social species with robust social cognition. It's also no surprise[^1] that grokking that interpersonal narrative space in its entirety would make LLMs adept at [[Loose theory of mind imputations are superior to verbatim response predictions|generation resembling social cognition too]].[^2]

We know that in humans, we can strongly [correlate reading with improved theory of mind abilities](https://journal.psych.ac.cn/xlkxjz/EN/10.3724/SP.J.1042.2022.00065). When your neural network is consistently exposed to content about how other people think, feel, desire, believe, and prefer, those mental tasks are reinforced. The more experience you have with a set of ideas or states, the more adept you become.

The experience of such natural language narration *is itself a simulation* where you practice and hone your theory of mind abilities. Even if, say, your English or Psychology teacher was foisting the text on you with other training intentions. Or even if you ran the simulation voluntarily, as an escape at the beach.

It's not such a stretch to imagine that in optimizing for other tasks LLMs acquire emergent abilities not intentionally trained.[^3] It may even be that in order to learn natural language prediction, these systems need theory of mind abilities, or that learning language specifically involves them--that's certainly the case with human wetware systems, and theory of mind skills do seem to improve with model size and language generation efficacy.

---

[^1]: Kosinski includes a compelling treatment of much of this in ["Evaluating Large Language Models in Theory of Mind Tasks"](https://arxiv.org/abs/2302.02083)
[^2]: It also leads to other wacky phenomena like the [Waluigi effect](https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post#The_Waluigi_Effect)
[^3]: Here's Chalmers [making a very similar point](https://youtube.com/clip/UgkxliSZFnnZHvYf2WHM4o1DN_v4kW6LsiOU?feature=shared)

35 changes: 35 additions & 0 deletions
...ose theory of mind imputations are superior to verbatim response predictions.md

When we [[Theory of Mind Is All You Need|first started experimenting]] with user context, we naturally wanted to test whether our LLM apps were learning useful things about users. And also naturally, we did so by making predictions about them.

Since we were operating in a conversational chat paradigm, our first instinct was to try to predict what the user would say next. Two things were immediately apparent: (1) this was really hard, & (2) response predictions weren't very useful.

We saw some remarkable exceptions, but *reliable* verbatim prediction requires a level of context about the user that simply isn't available right now. We're not sure if it will require context-gathering wearables, BMIs, or the network of context-sharing apps we're building with [[Honcho; User Context Management for LLM Apps|Honcho]], but we're not there yet.

Being good at predicting what any person in general might plausibly say is literally what LLMs do. But being perfect at predicting what one individual will say in a singular, specific setting is a whole different story. Even lifelong human partners might only experience this a few times a week.

Plus, even when you get it right, what exactly are you supposed to do with it? The fact that it's such a narrow reasoning product limits the utility you can get out of a single inference.

So what are models good at predicting that's useful with limited context and local to a single turn of conversation? Well, it turns out they're really good at [imputing internal mental states](https://arxiv.org/abs/2302.02083). That is, they're good at theory of mind predictions--thinking about what you're thinking. A distinctly *[[LLM Metacognition is inference about inference|metacognitive]]* task.

(Why are they good at this? [[LLMs excel at theory of mind because they read|We're glad you asked]].)
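
As a rough illustration (hypothetical prompts and helper names, not our production setup; `complete(messages)` again stands in for whatever chat-completion call you use), the difference between the two prediction targets looks something like this:

```python
# Hypothetical prompts contrasting the two prediction targets.

VERBATIM_PROMPT = (
    "Predict, word for word, the user's next message."
)  # a single point estimate: brittle, and hard to act on even when correct

THEORY_OF_MIND_PROMPT = (
    "Describe the user's likely mental state right now: what they believe, "
    "want, feel, already know, and expect from the assistant next."
)  # a slice of the user's state: fault-tolerant and immediately usable

def impute_user_state(conversation, complete):
    """Run the open-ended theory-of-mind imputation as its own inference step."""
    return complete(conversation + [{"role": "user", "content": THEORY_OF_MIND_PROMPT}])
```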

Besides just being better at it, letting the model leverage what it knows to make open-ended theory of mind imputations has several distinct advantages over verbatim response prediction:

1. Fault tolerance
    - Theory of mind predictions are often replete with assessments of emotion, desire, belief, value, aesthetic, preference, knowledge, etc. That means they seek to capture a range within a distribution. A slice of user identity.
    - This is much richer than trying (& likely failing) to generate a single point estimate (like in verbatim prediction) and includes more variance. Therefore there's a higher probability you identify something useful by trusting the model to flex its emergent strengths.

2. Learning
    - That high variance means there's more to be wrong (& right) about. More content = more claims, which means more opportunity to learn.
    - Being wrong here is a feature, not a bug; comparing those prediction errors with reality is how you know what you need to understand about the user in the future to get to ground truth (see the sketch after this list).

3. Interpretability
    - Knowing what you're right and wrong about exposes more surface area against which to test and understand the efficacy of the model--i.e. how well it knows the user.
    - Because we're grounded in the user and theory of mind, we're better able to assess this than if we're simply asking for likely human responses in the massive space of language encountered in training.

4. Actionability
    - The richness of theory of mind predictions gives us more to work with *right now*. We can funnel these insights into further inference steps to create UX in better alignment and coherence with user state.
    - Humans make thousands of tiny, subconscious interventions responsive to as many sensory cues & theory of mind predictions, all to optimize single social interactions. It pays to know about the internal state of others.
    - Though our lifelong partners from above can't perfectly predict each other's sentences, they can impute each other's state with extremely high fidelity. The rich context they have on one another translates to a desire to spend most of their time together (good UX).
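
Here's a minimal sketch of the learning loop from point 2 (hypothetical names throughout; in practice the comparison step would itself be an LLM call or something more principled):

```python
# Hypothetical sketch of the prediction-error loop: impute the user's state,
# watch what the user actually says next, and keep the misses as a running
# list of what we still need to learn about them.

def update_user_model(user_model, imputation_claims, actual_user_message, compare):
    """`compare` scores each imputed claim against the user's real reply."""
    for claim in imputation_claims:
        if compare(claim, actual_user_message):
            user_model.setdefault("supported", []).append(claim)
        else:
            # Being wrong is the useful part: each miss is an open question
            # about the user that future context gathering should resolve.
            user_model.setdefault("open_questions", []).append(claim)
    return user_model
```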
File renamed without changes.