Wikimaxxing is all you need
The most important thing I’ve learned from running OpenClaw over the past few months is that your personal context makes all the difference. You haven’t really used these frontier models until you’ve felt that. And you’re not going to feel it inside the walled garden, no matter how good the model is.
LLMs have the memory of a goldfish, as you know, and every conversation starts from zero. You paste in the relevant files, explain who so-and-so is, clarify that yes, you already tried that, and by the time the model has what it needs to actually be useful, you could’ve just done the thing yourself. An agent that actually works is one that already remembers.
And your context is wider than you’d think. Email, texts, calendar, notes, docs, chat logs, browser history, fitness tracker, transactions, code, pretty much your whole digital existence. These models are able to intuit a shocking amount from all of it if you let them. It’s a little uncanny the first time you see it connect stuff you never said out loud or draw patterns across channels.
I also don’t think we’re far from wearables recording everything we do and say by default. And not just wearables: reusing the security cameras mounted in your house so your assistant knows where you are and what you’re up to while you’re home. Further out, maybe publicly accessible nodes that record people’s presence, actions, and speech for personal agents to tap into. Yeah, I know how that sounds.
Anyway, the practical problem is that curating the perfect context every single time you want to ask something is way too much work, and honestly you don’t even know what the perfect context is. Something real shifts though when you give the model your whole life. The picture comes in grainy and gets sharper and sharper the more you feed it.
I’ve already had enough glimpses of this to know it’s real. What makes it work is unfettered access to your data paired with the ability to do anything a computer can.
Karpathy was on to something
Earlier this month, Andrej Karpathy posted another banger tweet that did the numbers.
He wrote this pattern up in a bit more detail as well. Like a lot of folks that day, I read it and rolled my own. The idea of a personal knowledge base intrigued me right away. My results were mid though.
I didn’t go far enough. I didn’t give it enough sources. So naturally I kept feeding the beast. Let it cook!
Then OpenClaw rolled out its Memory Wiki plugin, clearly inspired by Karpathy’s idea, and this one is the start of something really remarkable.
The core idea is the same. You dump source material into a raw/ directory and let the LLM incrementally compile a wiki of markdown pages with summaries, backlinks, and articles for the concepts it finds.
You can view it in Obsidian. The LLM owns the wiki; you rarely touch it directly. Ask it a question and it reads across the wiki, does its research, and files the answer back in. It all compounds.
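To make that concrete, here’s a sketch of what a single compiled page might look like. Every name, path, and field below is my own invention for illustration, not the plugin’s actual schema:

```markdown
# Rainbow Driveway

Summary: Planned driveway project. Design ideas collected; one contractor quote on file.

Backlinks: [[Home]], [[Personal Finances]]

## Claims
- A contractor has quoted the job. (source: raw/texts/contractor-thread.md, confidence: high)
- The design direction comes from my wife's texts. (source: raw/texts/family.md, confidence: medium)
```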
It sits alongside the active memory plugin, which in my case is QMD, a local-first search sidecar that does BM25, vectors, and reranking in a single binary. QMD handles recall and semantic search. Memory Wiki compiles the durable stuff into a structured vault with entities, concepts, syntheses, and sources, each page carrying provenance, claims, and confidence. They run together in bridge mode so search spans both.
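As a toy illustration of how a keyword ranking and a vector ranking can be fused into one result list, here’s a reciprocal rank fusion sketch. RRF is a common hybrid-search trick; I don’t know that QMD uses exactly this, and all the document names are made up:

```python
# Toy hybrid retrieval: fuse a BM25-style keyword ranking and a vector
# ranking with reciprocal rank fusion (RRF). This is a generic sketch,
# not QMD's actual scoring.

def rrf_fuse(rankings, k=60):
    """Combine ranked lists of doc ids; docs high in several lists win."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["driveway.md", "finances.md", "home.md"]       # keyword ranking
vector_hits = ["home.md", "driveway.md", "contractor.md"]   # semantic ranking

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused[0])  # "driveway.md": ranked near the top by both lists
```

A reranker would then reorder the top few fused hits with a heavier model, but the fusion step is where the two search modes actually meet.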
What I love is the simplicity of it all. Everything is markdown on my hard drive. The vault lives in a directory and renders as an Obsidian vault, so I can read it from any device. You’ve got no vendor lock-in, and no proprietary format you’ll ever lose access to.
I’ve been using it to catalog my entire life. Really. I’m still early but so far I’ve got pages going on relationships, health, fitness, personal finances, career, pets, and home. The sources are my emails, texts, chat logs, notes, and documents. The claw reads, the claw writes. It’s a partnership though, I steer and correct as we go, much like vibe coding. I hand edit more than I expected to, but you can kind of see a future with zero onboarding and negligible maintenance.
When I’m talking to my claw about some random thing, it has the context and the context for the context. My wife wants a rainbow driveway and has a very specific vision for it. My assistant already knows what she’s picturing because she’s been texting me about it. It knows what a contractor quoted because I’ve been texting him about ideas and cost. It pulls from the finances page because yeah, projects cost money, and from the home page because the driveway’s already filed there. On and on. Every page leans on every other page.
You kinda have to live with it for a bit before it clicks. What makes it click is the memory plus full control over every aspect of your assistant’s instructions and soul. Once you’re outside the walled garden, honestly, every other AI feels bland, boring, and useless.
Your LLM needs a diet
Speaking of context, too much of the wrong kind is its own problem. Agents burn tokens on the garbage you never asked them to read. An ls that dumps hundreds of files. A grep that returns the whole log. A test run that prints copious output nobody's reading. You’re paying for all of it, in dollars and in context.
Vincent Koc (@vincent_koc), one of the core maintainers of OpenClaw, built tokenjuice to fix this. Yep, your LLM needs a diet. It runs your commands untouched and trims the fat from the output before it reaches the model’s context.
It hooks in after each tool runs. Your ls still runs; the model just doesn’t have to read every entry. You see it fire all day, a little notice in the tool output, compacting bash output with tokenjuice: 60 items, and life goes on. The escape hatch is right there when you need it: tokenjuice wrap --raw -- <command> forces the unfiltered output through.
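The general shape of that compaction step is easy to sketch. This is a naive head-and-tail trim, not tokenjuice’s actual heuristics:

```python
# Toy post-tool-hook compaction: if a tool's output is long, keep the
# head and tail and note what was dropped, so the model sees the shape
# of the output without paying for every line.

def compact(output: str, max_lines: int = 40) -> str:
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output  # short output passes through untouched
    head = lines[: max_lines // 2]
    tail = lines[-(max_lines // 2):]
    dropped = len(lines) - len(head) - len(tail)
    return "\n".join(head + [f"... [{dropped} lines compacted] ..."] + tail)

# A 500-file directory listing shrinks to ~40 lines plus a marker.
listing = "\n".join(f"file_{i}.txt" for i in range(500))
print(compact(listing).splitlines()[20])
```

The real tool is presumably smarter about what to keep, but even this naive version shows why token counts drop so hard on chatty commands.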
I’ve been using it with Claude Code, Codex CLI, and Pi. One install per host:
npm install -g tokenjuice
tokenjuice install claude-code
tokenjuice install codex
tokenjuice install pi

And as of today, literally today, tokenjuice is bundled in OpenClaw as a plugin. Flip it on, and cold press some tokens:
openclaw config set plugins.entries.tokenjuice.enabled true

People are seeing up to 80% token reduction on tool use.
Feels like 1995
Honestly, this is the most fun I’ve had in computing since I first encountered the internet and Linux in the mid-1990s. I didn’t think I’d feel this again, but it’s that same energy. There’s always just one more prompt.
Happy hacking, everyone.