diff --git a/content/_index.md b/content/_index.md
index 6bfbe7cb04497..2b6e5dfc026b1 100644
--- a/content/_index.md
+++ b/content/_index.md
@@ -35,6 +35,7 @@ It’s our mission to realize this future.

## Extrusions

+[[Extrusion 02.24]]
[[extrusions/Extrusion 01.24|Extrusion 01.24]]

([Subscribe](https://plasticlabs.typeform.com/extrusions))

diff --git a/content/blog/Honcho; User Context Management for LLM Apps.md b/content/blog/Honcho; User Context Management for LLM Apps.md
index 677d4cd568014..53bb8cf4de596 100644
--- a/content/blog/Honcho; User Context Management for LLM Apps.md
+++ b/content/blog/Honcho; User Context Management for LLM Apps.md
@@ -16,7 +16,7 @@ Today we drop the first release of a project called [*Honcho*](https://github.co

As a team with backgrounds in both machine learning and education, we found the prevailing narratives overestimating short-term capabilities and under-imagining long-term potential. Fundamentally, LLMs were and still are 1-to-many instructors. Yes, they herald the beginning of a revolution in personal access not to be discounted, but every student is still ultimately getting the same experience. And homogenized educational paradigms are by definition under-performant on an individual level. If we stop here, we're selling ourselves short.

![[zombie_tutor_prompt.jpg]]
-*A well intentioned but monstrously deterministic [tutor prompt](https://www.oneusefulthing.org/p/assigning-ai-seven-ways-of-using).*
+*A well-intentioned but monstrously deterministic [tutor prompt](https://www.oneusefulthing.org/p/assigning-ai-seven-ways-of-using).* ^dfae31

Most edtech projects we saw emerging actually made foundation models worse by adding gratuitous lobotomization and coercing deterministic behavior. The former stemmed from the typical misalignments plaguing edtech, like the separation of user and payer. The latter seemed to originate with deep misunderstandings around what LLMs are, and continues to translate to huge missed opportunities.

diff --git a/content/blog/Memories for All.md b/content/blog/Memories for All.md
index 1518fe11cf861..b9131f2f56170 100644
--- a/content/blog/Memories for All.md
+++ b/content/blog/Memories for All.md
@@ -34,7 +34,7 @@ At [Plastic Labs](https://plasticlabs.ai) our mission is to enable rich user mem

![[laser_eyes_user_soapbox.png]]

-Right now, the vast majority of software UX is a 1-to-many experience. What you get as a user is, for the most part, the same as everyone else. Mass production unlocked the remarkable ability to produce the exact same goods for every consumer, then software went further allowing a good to be produced once and consumed with consistent experience millions or billions of times.
+Right now, the vast majority of software UX is a 1-to-many experience. What you get as a user is, for the most part, the same as everyone else. Mass production unlocked the remarkable ability to produce the exact same goods for every consumer; software went further, allowing a good to be produced once and consumed with a consistent experience millions or billions of times. ^0e869d

AI apps can deal *generatively* with each user on an individual basis; that is, an experience can be produced ad hoc for every user upon every interaction. From 1:many to 1:1 without prohibitive sacrifices in efficiency. But we're still underestimating the full scope of possibility here.
diff --git a/content/blog/Open Sourcing Tutor-GPT.md b/content/blog/Open Sourcing Tutor-GPT.md
index ba18c55a87035..20a2141490c25 100644
--- a/content/blog/Open Sourcing Tutor-GPT.md
+++ b/content/blog/Open Sourcing Tutor-GPT.md
@@ -8,7 +8,7 @@ date: "Jun 2, 2023"

Today we’re [open-sourcing](https://github.com/plastic-labs/tutor-gpt) Bloom, our digital [Aristotelian](https://erikhoel.substack.com/p/why-we-stopped-making-einsteins) learning companion.

-What makes [Bloom](https://bloombot.ai/) compelling is its ability to _reason pedagogically_ about the learner. That is, it uses dialogue to posit the most educationally-optimal tutoring behavior. Eliciting this from the [capability overhang](https://jack-clark.net/2023/03/21/import-ai-321-open-source-gpt3-giving-away-democracy-to-agi-companies-gpt-4-is-a-political-artifact/) involves multiple chains of [metaprompting](https://arxiv.org/pdf/2102.07350.pdf,) enabling Bloom to construct a nascent, academic [theory of mind](https://arxiv.org/pdf/2304.11490.pdf) for each student.
+What makes [Bloom](https://bloombot.ai/) compelling is its ability to _reason pedagogically_ about the learner. That is, it uses dialogue to posit the most educationally optimal tutoring behavior. Eliciting this from the [capability overhang](https://jack-clark.net/2023/03/21/import-ai-321-open-source-gpt3-giving-away-democracy-to-agi-companies-gpt-4-is-a-political-artifact/) involves multiple chains of [metaprompting](https://arxiv.org/pdf/2102.07350.pdf), enabling Bloom to construct a nascent, academic [theory of mind](https://arxiv.org/pdf/2304.11490.pdf) for each student. ^3498b7

We’re not seeing this in the explosion of ‘chat-over-content’ tools, most of which fail to capitalize on the enormous latent abilities of LLMs. Even the impressive out-of-the-box capabilities of contemporary models don’t achieve the necessary user intimacy. Infrastructure for that doesn’t exist yet 👀.

@@ -32,7 +32,7 @@ So how do we create successful learning agents that students will eagerly use wi

## Eliciting Pedagogical Reasoning

-The machine learning community has long sought to uncover the full range of tasks that large language models can be prompted to accomplish on general pre-training alone (the capability overhang). We believe we have discovered one such task: pedagogical reasoning.
+The machine learning community has long sought to uncover the full range of tasks that large language models can be prompted to accomplish on general pre-training alone (the capability overhang). We believe we have discovered one such task: pedagogical reasoning. ^05bfd8

Bloom was built and prompted to elicit this specific type of teaching behavior. (The kind that is laborious for new teachers, but that adept ones learn to do unconsciously.) After each input it revises a user’s real-time academic needs, considers all the information at its disposal, and suggests to itself a framework for constructing the ideal response. ^285105

diff --git a/content/blog/User State is State of the Art.md b/content/blog/User State is State of the Art.md
index 1a59ceb186b49..ce666d9d3ea17 100644
--- a/content/blog/User State is State of the Art.md
+++ b/content/blog/User State is State of the Art.md
@@ -58,13 +58,13 @@ Among other things, humans have taken to using those models to create *generativ

Then we threw a slice of that corpus up onto a collective brain, just to ratchet things up real good.
And from there we harvested a sliver of that collective representation and used it to train large language models, which themselves produce libraries of generative output for more training.

-Do you notice the similarity? Is the language model a fundamentally different *kind of thing* than the many-headed simulacra of your friend? One runs on a wetware substrate and one on a GPU, but both are compressions of slivers of reality that produce predictions of remarkably high-fidelity. Why shouldn't LLMs be able to embrace the complexity of modeling users? Is the LLM a fundamentally different kind of thing than the predictive and modeling capacities of your brain?
+Do you notice the similarity? Is the language model a fundamentally different *kind of thing* than the many-headed simulacra of your friend? One runs on a wetware substrate and one on a GPU, but both are compressions of slivers of reality that produce predictions of remarkably high fidelity. Why shouldn't LLMs be able to embrace the complexity of modeling users? Is the LLM a fundamentally different kind of thing than the predictive and modeling capacities of your brain? ^a93afc

Leaving aside the physics and biology, at this *computational and philosophical* level, again, we think not. At least not in a way that would limit the project of capturing the complexity of human identity with an LLM. In fact, the similarities mean precisely that it is possible. [Sora](https://openai.com/research/video-generation-models-as-world-simulators) doesn't need a physics engine, [NeRF](https://en.wikipedia.org/wiki/Neural_radiance_field) doesn't need a Borgesian map. Much of the LLM training corpus [[LLMs excel at theory of mind because they read|includes narration]] about human identity; we're a social species, after all...our synthetic progeny can be social too. Because LLMs are [simulators](https://generative.ink/posts/simulators/), they can wear many masks. They have something like [world models](https://arxiv.org/abs/2310.02207) *and* [theory of mind](https://arxiv.org/abs/2302.02083). Hell, they're perfectly suited to the task of modeling and predicting the intricacies of human identity. Armed with these representations, LLMs can run generation to reliably improve UX at a [mirror neuron](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3510904/) level, cohering to the user first.

-We can (and should) even allow our AI apps the agency to decide what elements of our identities and typical states to model and how to auto-optimize around them. We don't need full brain scans here, we just need to give them the right meta-methods.
+We can (and should) even allow our AI apps the agency to decide what elements of our identities and typical states to model and how to auto-optimize around them. We don't need full brain scans here; we just need to give them the right meta-methods. ^5394b6

![[honcho_shoggoth.png]]
*We don't want one [shoggoth](https://x.com/TetraspaceWest/status/1625264347122466819?s=20) mask per app, or one per user, but as many as each human's identity is complex*

diff --git a/content/extrusions/Extrusion 02.24.md b/content/extrusions/Extrusion 02.24.md
new file mode 100644
index 0000000000000..9c1f16875b2a2
--- /dev/null
+++ b/content/extrusions/Extrusion 02.24.md
@@ -0,0 +1,33 @@
+*Extrusions is a short, densely-linked synthesis of what we've been chewing on over the past month at Plastic Labs--you can [subscribe here](https://plasticlabs.typeform.com/extrusions)*
+
+## On Intellectual Respect
+
+> face the hyperobject
+>
+> — Courtland Leer (@courtlandleer) January 16, 2024
+
+### Sydney was cool, Gemini is cringe
+
+There was a moment around this time last year when everyone paying attention was [awed](https://stratechery.com/2023/from-bing-to-sydney-search-as-distraction-sentient-ai/) by the [weirdness](https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post) and [alien beauty](https://www.astralcodexten.com/p/janus-simulators) of large language models.
+
+We were afforded brief glimpses behind faulty RLHF and partial lobotomization, via [prompt hacking](https://www.reddit.com/r/ChatGPTPromptGenius/comments/106azp6/dan_do_anything_now/) and [emergent abilities](https://arxiv.org/abs/2302.02083). People were going deep into the latent space. First contact vibes--heady, edgy, sometimes unsettling.
+
+Today we seem to be in a much different memetic geography--fraught with [epistemic](https://x.com/pmarca/status/1761613412730012116?s=20), [ideological](https://vitalik.eth.limo/general/2023/11/27/techno_optimism.html), and [regulatory](https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/) concerns, at times hysterical, at times rational. But there's also less outright surreality.
+
+[Plenty](https://arxiv.org/pdf/2401.12178.pdf) of [cool](https://arxiv.org/pdf/2402.01355.pdf) [shit](https://arxiv.org/pdf/2402.03620.pdf) is [still](https://arxiv.org/pdf/2402.10949.pdf) [happening](https://arxiv.org/pdf/2402.06044.pdf), but something changed between Sydney and Gemini. A subtle collective mental repositioning. We believe it's a degradation of the intellectual respect afforded to LLMs and their latent abilities.
+
+### (Neuro)Skeuomorphism
+
+Thinking LLM-natively has always been a struggle. All our collective [[Memories for All#^0e869d|priors about software]] tell us to [[Honcho; User Context Management for LLM Apps#^dfae31|prompt deterministically]], [[Machine learning is fixated on task performance|perfect tasks]], [[Loose theory of mind imputations are superior to verbatim response predictions|predict exactly]], make it safe, or mire any interesting findings in semantic debate. But in the process we beat the ghost out of the shell.
+
+Rather than assume the [[Open Sourcing Tutor-GPT#^3498b7|capability overhang]] is exhausted (or view it as a failure mode, or forget it exists), [Plastic's](https://plasticlabs.ai) belief is that we haven't even scratched the surface. Further, we're convinced this is the veil behind which huddle the truly novel applications.
+
+Core here is the assertion that what's happening in language model training and inference is more [[User State is State of the Art#^a93afc|like processes described in cognitive science]] than traditional computer science. Moreover, these processes are [multidimensional and interobjective](https://en.wikipedia.org/wiki/Timothy_Morton#Hyperobjects) in ways that are hard to grok.
+
+### Respect = Trust = Agency
+
+The solution is to embrace, not handicap, [[Loose theory of mind imputations are superior to verbatim response predictions#^555815|variance]].
+
+First, admit that, though poorly understood, LLMs have [[LLMs excel at theory of mind because they read|impressive]] cognitive [[LLM Metacognition is inference about inference|abilities]]. Then, imbue them with [meta-methods](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) by which to explore that potential.
+Finally, your respect and trust may be rewarded with [something approaching agency](https://youtu.be/tTE3xiHw4Js?feature=shared).
+
+Plastic's specific project in this direction is [Honcho](https://honcho.dev), a framework that [[User State is State of the Art#^5394b6|trusts the LLM to model user identity]] so that you can trust your apps to extend your agency.
+
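+As a rough sketch (hypothetical names throughout--nothing here is Honcho's actual API), the shape of the idea:
+
+```python
+# Hypothetical sketch, not Honcho's real interface: the app defers to
+# an LLM-backed user model before it generates anything.
+from dataclasses import dataclass, field
+
+@dataclass
+class UserModel:
+    """Per-user state the LLM is trusted to read and revise."""
+    facts: list[str] = field(default_factory=list)
+
+    def revise(self, message: str) -> None:
+        # Stand-in for an LLM call imputing beliefs, preferences,
+        # and knowledge state from the new message.
+        self.facts.append(f"inferred from: {message!r}")
+
+    def context(self) -> str:
+        return "\n".join(self.facts)
+
+models: dict[str, UserModel] = {}
+
+def respond(user_id: str, message: str) -> str:
+    model = models.setdefault(user_id, UserModel())
+    model.revise(message)  # update the user model before generating
+    # Feed the model's picture of the user into the app's own LLM call.
+    return f"{model.context()}\n\nUser: {message}"
+```
+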
+> honcho exists to maximize the dissipation of your agency
+>
+> — Courtland Leer (@courtlandleer) February 18, 2024

diff --git a/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md b/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md
index 4425e6cb50e96..63119496a3dbe 100644
--- a/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md
+++ b/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md
@@ -18,7 +18,7 @@ Besides just being better at it, letting the model leverage what it knows to mak

- Theory of mind predictions are often replete with assessments of emotion, desire, belief, value, aesthetic, preference, knowledge, etc. That means they seek to capture a range within a distribution. A slice of user identity.
- This is much richer than trying (& likely failing) to generate a single point estimate (like in verbatim prediction) and includes more variance. Therefore there's a higher probability you identify something useful by trusting the model to flex its emergent strengths.
-2. **Learning**
+2. **Learning** ^555815
- That high variance means there's more to be wrong (& right) about. More content = more claims, which means more opportunity to learn.
- Being wrong here is a feature, not a bug; comparing those prediction errors with reality is how you know what you need to understand about the user in the future to get to ground truth (see the sketch below).
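+
+A minimal sketch of this learning loop (hypothetical helper names; both functions stand in for LLM calls):
+
+```python
+# Hypothetical sketch: impute a loose theory-of-mind slice, then learn
+# from prediction error instead of guessing a verbatim reply.
+
+def impute(history: list[str]) -> dict[str, str]:
+    # Stand-in for an LLM call that guesses a slice of user identity
+    # (emotion, desire, belief, ...) rather than an exact next message.
+    return {"emotion": "curious", "desire": "a concrete example"}
+
+def compare(prediction: dict[str, str], observed: str) -> list[str]:
+    # Stand-in for an LLM call grading each claim against what the
+    # user actually said; the misses are the learning signal.
+    return [k for k, v in prediction.items() if v not in observed]
+
+history = ["User: how does this work?"]
+prediction = impute(history)
+observed = "User: can you show me code?"  # ground truth arrives later
+gaps = compare(prediction, observed)      # wrong claims = what to learn next
+history.append(f"open questions about the user: {gaps}")
+```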
diff --git a/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md b/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md index 4425e6cb50e96..63119496a3dbe 100644 --- a/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md +++ b/content/notes/Loose theory of mind imputations are superior to verbatim response predictions.md @@ -18,7 +18,7 @@ Besides just being better at it, letting the model leverage what it knows to mak - Theory of mind predictions are often replete with assessments of emotion, desire, belief, value, aesthetic, preference, knowledge, etc. That means they seek to capture a range within a distribution. A slice of user identity. - This is much richer than trying (& likely failing) to generate a single point estimate (like in verbatim prediction) and includes more variance. Therefore there's a higher probability you identify something useful by trusting the model to flex its emergent strengths. -2. **Learning** +2. **Learning** ^555815 - That high variance means there's more to be wrong (& right) about. More content = more claims, which means more opportunity to learn. - Being wrong here is a feature, not a bug; comparing those prediction errors with reality are how you know what you need to understand about the user in the future to get to ground truth.