extrusion 02.24 draft #52

Merged
merged 1 commit into from
Mar 1, 2024
1 change: 1 addition & 0 deletions content/_index.md
@@ -35,6 +35,7 @@ It’s our mission to realize this future.

## Extrusions

[[Extrusion 02.24]]
[[extrusions/Extrusion 01.24|Extrusion 01.24]]

([Subscribe](https://plasticlabs.typeform.com/extrusions))
@@ -16,7 +16,7 @@ Today we drop the first release of a project called [*Honcho*](https://github.co
As a team with backgrounds in both machine learning and education, we found the prevailing narratives overestimating short-term capabilities and under-imagining long-term potential. Fundamentally, LLMs were and still are 1-to-many instructors. Yes, they herald the beginning of a revolution in personal access not to be discounted, but every student is still ultimately getting the same experience. And homogenized educational paradigms are by definition under-performant on an individual level. If we stop here, we're selling ourselves short.

![[zombie_tutor_prompt.jpg]]
*A well intentioned but monstrously deterministic [tutor prompt](https://www.oneusefulthing.org/p/assigning-ai-seven-ways-of-using).*
*A well intentioned but monstrously deterministic [tutor prompt](https://www.oneusefulthing.org/p/assigning-ai-seven-ways-of-using).* ^dfae31

Most edtech projects we saw emerging actually made foundation models worse by adding gratuitous lobotomization and coercing deterministic behavior. The former stemmed from the typical misalignments plaguing edtech, like the separation of user and payer. The latter seemed to originate with deep misunderstandings around what LLMs are and continues to translate into huge missed opportunities.

2 changes: 1 addition & 1 deletion content/blog/Memories for All.md
@@ -34,7 +34,7 @@ At [Plastic Labs](https://plasticlabs.ai) our mission is to enable rich user mem

![[laser_eyes_user_soapbox.png]]

Right now, the vast majority of software UX is a 1-to-many experience. What you get as a user is, for the most part, the same as everyone else. Mass production unlocked the remarkable ability to produce the exact same goods for every consumer, then software went further allowing a good to be produced once and consumed with consistent experience millions or billions of times.
Right now, the vast majority of software UX is a 1-to-many experience. What you get as a user is, for the most part, the same as everyone else. Mass production unlocked the remarkable ability to produce the exact same goods for every consumer, then software went further allowing a good to be produced once and consumed with consistent experience millions or billions of times. ^0e869d

AI apps can deal *generatively* with each user on an individual basis, that is, an experience can be produced ad hoc for every user upon every interaction. From 1:many to 1:1 without prohibitive sacrifices in efficiency. But we're still underestimating the full scope of possibility here.

4 changes: 2 additions & 2 deletions content/blog/Open Sourcing Tutor-GPT.md
@@ -8,7 +8,7 @@ date: "Jun 2, 2023"

Today we’re [open-sourcing](https://github.com/plastic-labs/tutor-gpt) Bloom, our digital [Aristotelian](https://erikhoel.substack.com/p/why-we-stopped-making-einsteins) learning companion.

What makes [Bloom](https://bloombot.ai/) compelling is its ability to _reason pedagogically_ about the learner. That is, it uses dialogue to posit the most educationally-optimal tutoring behavior. Eliciting this from the [capability overhang](https://jack-clark.net/2023/03/21/import-ai-321-open-source-gpt3-giving-away-democracy-to-agi-companies-gpt-4-is-a-political-artifact/) involves multiple chains of [metaprompting](https://arxiv.org/pdf/2102.07350.pdf), enabling Bloom to construct a nascent, academic [theory of mind](https://arxiv.org/pdf/2304.11490.pdf) for each student.
What makes [Bloom](https://bloombot.ai/) compelling is its ability to _reason pedagogically_ about the learner. That is, it uses dialogue to posit the most educationally-optimal tutoring behavior. Eliciting this from the [capability overhang](https://jack-clark.net/2023/03/21/import-ai-321-open-source-gpt3-giving-away-democracy-to-agi-companies-gpt-4-is-a-political-artifact/) involves multiple chains of [metaprompting](https://arxiv.org/pdf/2102.07350.pdf), enabling Bloom to construct a nascent, academic [theory of mind](https://arxiv.org/pdf/2304.11490.pdf) for each student. ^3498b7

We’re not seeing this in the explosion of ‘chat-over-content’ tools, most of which fail to capitalize on the enormous latent abilities of LLMs. Even the impressive out-of-the-box capabilities of contemporary models don’t achieve the necessary user intimacy. Infrastructure for that doesn’t exist yet 👀.

@@ -32,7 +32,7 @@ So how do we create successful learning agents that students will eagerly use wi

## Eliciting Pedagogical Reasoning

The machine learning community has long sought to uncover the full range of tasks that large language models can be prompted to accomplish on general pre-training alone (the capability overhang). We believe we have discovered one such task: pedagogical reasoning.
The machine learning community has long sought to uncover the full range of tasks that large language models can be prompted to accomplish on general pre-training alone (the capability overhang). We believe we have discovered one such task: pedagogical reasoning. ^05bfd8

Bloom was built and prompted to elicit this specific type of teaching behavior. (The kind laborious for new teachers, but that adept ones learn to do unconsciously.) After each input it revises a user’s real-time academic needs, considers all the information at its disposal, and suggests to itself a framework for constructing the ideal response. ^285105

4 changes: 2 additions & 2 deletions content/blog/User State is State of the Art.md
@@ -58,13 +58,13 @@ Among other things, humans have taken to using those models to create *generativ

Then we threw a slice of that corpus up onto a collective brain, just to ratchet things up real good. And from there we harvested a sliver of that collective representation and used it to train large language models, which themselves produce libraries of generative output for more training.

Do you notice the similarity? Is the language model a fundamentally different *kind of thing* than the many-headed simulacra of your friend? One runs on a wetware substrate and one on a GPU, but both are compressions of slivers of reality that produce predictions of remarkably high-fidelity. Why shouldn't LLMs be able to embrace the complexity of modeling users? Is the LLM a fundamentally different kind of thing than the predictive and modeling capacities of your brain?
Do you notice the similarity? Is the language model a fundamentally different *kind of thing* than the many-headed simulacra of your friend? One runs on a wetware substrate and one on a GPU, but both are compressions of slivers of reality that produce predictions of remarkably high-fidelity. Why shouldn't LLMs be able to embrace the complexity of modeling users? Is the LLM a fundamentally different kind of thing than the predictive and modeling capacities of your brain? ^a93afc

Leaving aside the physics and biology, at this *computational and philosophical* level, again, we think not. At least not in a way that would limit the project of capturing the complexity of human identity with an LLM. In fact, the similarities mean precisely that it is possible. [Sora](https://openai.com/research/video-generation-models-as-world-simulators) doesn't need a physics engine, [NeRF](https://en.wikipedia.org/wiki/Neural_radiance_field) doesn't need a Borgesian map. Much of the LLM training corpus [[LLMs excel at theory of mind because they read|includes narration]] about human identity, we're a social species, after all...our synthetic progeny can be social too.

Because LLMs are [simulators](https://generative.ink/posts/simulators/), they can wear many masks. They have something like [world models](https://arxiv.org/abs/2310.02207) *and* [theory of mind](https://arxiv.org/abs/2302.02083). Hell, they're perfectly suited to the task of modeling and predicting the intricacies of human identity. Armed with these representations, LLMs can run generation to reliably improve UX at a [mirror neuron](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3510904/) level, cohering to the user first.

We can (and should) even allow our AI apps the agency to decide what elements of our identities and typical states to model and how to auto-optimize around them. We don't need full brain scans here, we just need to give them the right meta-methods.
We can (and should) even allow our AI apps the agency to decide what elements of our identities and typical states to model and how to auto-optimize around them. We don't need full brain scans here, we just need to give them the right meta-methods. ^5394b6

![[honcho_shoggoth.png]]
*We don't want one [shoggoth](https://x.com/TetraspaceWest/status/1625264347122466819?s=20) mask per app, or one per user, but as many as each human's identity is complex*
33 changes: 33 additions & 0 deletions content/extrusions/Extrusion 02.24.md
@@ -0,0 +1,33 @@
*Extrusions is a short, densely-linked synthesis of what we've been chewing on over the past month at Plastic Labs--you can [subscribe here](https://plasticlabs.typeform.com/extrusions)*

## On Intellectual Respect

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">face the hyperobject</p>&mdash; Courtland Leer (@courtlandleer) <a href="https://twitter.com/courtlandleer/status/1747075542954684507?ref_src=twsrc%5Etfw">January 16, 2024</a></blockquote>

### Sydney was cool, Gemini is cringe

There was a moment around this time last year when everyone paying attention was [awed](https://stratechery.com/2023/from-bing-to-sydney-search-as-distraction-sentient-ai/) by the [weirdness](https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post) and [alien beauty](https://www.astralcodexten.com/p/janus-simulators) of large language models.

We were afforded brief glimpses behind faulty RLHF and partial lobotomization, via [prompt hacking](https://www.reddit.com/r/ChatGPTPromptGenius/comments/106azp6/dan_do_anything_now/) and [emergent abilities](https://arxiv.org/abs/2302.02083). People were going deep into the latent space. First contact vibes--heady, edgy, sometimes unsettling.

Today we seem to be in a very different memetic geography--fraught with [epistemic](https://x.com/pmarca/status/1761613412730012116?s=20), [ideological](https://vitalik.eth.limo/general/2023/11/27/techno_optimism.html), and [regulatory](https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/) concerns, at times hysterical, at times rational. But there's also less outright surreality.

[Plenty](https://arxiv.org/pdf/2401.12178.pdf) of [cool](https://arxiv.org/pdf/2402.01355.pdf) [shit](https://arxiv.org/pdf/2402.03620.pdf) is [still](https://arxiv.org/pdf/2402.10949.pdf) [happening](https://arxiv.org/pdf/2402.06044.pdf), but something changed between Sydney and Gemini. A subtle collective mental positioning. We believe it's a degradation in the volume of intellectual respect afforded to LLMs and their latent abilities.

### (Neuro)Skeuomorphism

Thinking LLM-natively has always been a struggle. All our collective [[Memories for All#^0e869d|priors about software]] tell us to [[Honcho; User Context Management for LLM Apps#^dfae31|prompt deterministically]], [[Machine learning is fixated on task performance|perfect tasks]], [[Loose theory of mind imputations are superior to verbatim response predictions|predict exactly]], make it safe, or mire any interesting findings in semantic debate. But in the process we beat the ghost out of the shell.

Rather than assume the [[Open Sourcing Tutor-GPT#^3498b7|capability overhang]] exhausted (or view it as a failure mode or forget it exists), [Plastic's](https://plasticlabs.ai) belief is we haven't even scratched the surface. Further, we're convinced this is the veil behind which huddle the truly novel applications.

Core here is the assertion that what's happening in language model training and inference is more [[User State is State of the Art#^a93afc|like processes described in cognitive science]] than traditional computer science. More, they're [multidimensional and interobjective](https://en.wikipedia.org/wiki/Timothy_Morton#Hyperobjects) in ways that are hard to grok.

### Respect = Trust = Agency

The solution is to embrace, not handicap, [[Loose theory of mind imputations are superior to verbatim response predictions#^555815|variance]].

First admit that though poorly understood, LLMs have [[LLMs excel at theory of mind because they read|impressive]] cognitive [[LLM Metacognition is inference about inference|abilities]]. Then, imbue them with [meta-methods](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) by which to explore that potential. Finally, your respect and trust may be rewarded with [something approaching agentic](https://youtu.be/tTE3xiHw4Js?feature=shared).

Plastic's specific project in this direction is [Honcho](https://honcho.dev), a framework that [[User State is State of the Art#^5394b6|trusts the LLM to model user identity]] so that you can trust your apps to extend your agency.

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">honcho exists to maximize the dissipation of your agency</p>&mdash; Courtland Leer (@courtlandleer) <a href="https://twitter.com/courtlandleer/status/1759324580664000617?ref_src=twsrc%5Etfw">February 18, 2024</a></blockquote>
@@ -18,7 +18,7 @@ Besides just being better at it, letting the model leverage what it knows to mak
- Theory of mind predictions are often replete with assessments of emotion, desire, belief, value, aesthetic, preference, knowledge, etc. That means they seek to capture a range within a distribution. A slice of user identity.
- This is much richer than trying (& likely failing) to generate a single point estimate (like in verbatim prediction) and includes more variance. Therefore there's a higher probability you identify something useful by trusting the model to flex its emergent strengths.

2. **Learning**
2. **Learning** ^555815
- That high variance means there's more to be wrong (& right) about. More content = more claims, which means more opportunity to learn.
- Being wrong here is a feature, not a bug; comparing those prediction errors with reality is how you know what you need to understand about the user in the future to get to ground truth.
