
I had an idea a week or two ago.
Could an AI agent write an operating system?
And how far could it get? When would the thing self-combust?
And so I booted up Claude, determined to find out, and started with a simple request:
────────────────────────────────────────────────────────────────────────────────
❯ Let's build an operating system together.
────────────────────────────────────────────────────────────────────────────────
What followed over the next 48 hours was insane. At first, I was involved every step of the way. What language should it be? Rust. What bootloader? Limine. And on, for maybe the first hour or two.
At some point I had an idea: what if I built a development environment in Docker for it?
Then I could pass --dangerously-skip-permissions without getting pwned, as they used to say.
I wouldn't have to stay by my computer hitting enter. Great!
Some time later I tired of coming back to my computer, telling it what to do in between tasks. How could I get around that?
After some thinking, it occurred to me that claude could perfectly well use the command line, so why not just install gh?
That way I could file issues on the repo, and claude could solve them in a loop.
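The issue-driven loop described above might look roughly like the sketch below. It is not the actual /auto-engineer, just an illustration under stated assumptions: the helper names (`pick_next_issue`, `build_prompt`, `run_agent`, `main`), the prompt wording, and the "lowest issue number first" policy are all hypothetical. It relies only on `gh issue list --json` and on claude's non-interactive `-p` mode, and is meant to run inside the sandboxed container.

```python
import json
import subprocess


def pick_next_issue(issues):
    """Return the lowest-numbered open issue, or None if there are none.

    "Oldest first" is an assumed policy, not the author's.
    """
    return min(issues, key=lambda issue: issue["number"]) if issues else None


def build_prompt(issue):
    """Turn an issue into a one-shot instruction (hypothetical wording)."""
    return (
        f"Resolve GitHub issue #{issue['number']}: {issue['title']}. "
        "Open a pull request that references the issue when you are done."
    )


def list_open_issues():
    """Ask the GitHub CLI for the open issues of the current repo as JSON."""
    result = subprocess.run(
        ["gh", "issue", "list", "--state", "open", "--json", "number,title"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def run_agent(prompt):
    """Run claude non-interactively; only sensible inside a sandbox."""
    subprocess.run(
        ["claude", "-p", prompt, "--dangerously-skip-permissions"],
        check=True,
    )


def main(max_iterations=10):
    """Work through open issues, bounded so a stuck issue cannot loop forever."""
    for _ in range(max_iterations):
        issue = pick_next_issue(list_open_issues())
        if issue is None:
            break
        run_agent(build_prompt(issue))
```

Calling `main()` from inside the container starts the loop. The iteration cap matters: if the agent fails to close an issue, the same issue is picked again on the next pass.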
I called this /auto-engineer, and you can find it here. And so now here I was, in the days that followed, entranced by watching claude program from afar.
Some innovations (in Vibix) that emerged over the days following /auto-engineer:
/os-researcher: A skill which runs an entire RFC process for an OS research problem. It researches the idea on the web, conducts a full peer-review process from a number of archetypes (all documented through a PR), and if consensus is formed, merges the RFC into the repo. Peruse them here.
/project-report: Get a project report posted to discussions on the readme. One for every day I remembered to run it.
/auto-manager: Middle management layer orchestrating sprints of work with a team of /auto-engineers.
So far, it's going much better than expected. The operating system still boots, and we are partially on the way to having a functioning ext2 filesystem driver.
It's not all sunshine and rainbows though. On day 5 I hit a fork/exec bug that my /auto-engineers spun in circles on.
I had to bisect the project manually, find the commit in question, and direct claude very closely to get it fixed.
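I did the bisect by hand, but the same search could in principle be scripted with `git bisect run`, which treats exit code 0 as "good", 125 as "skip this commit", and other codes from 1 to 127 as "bad". The sketch below is an assumption-heavy illustration, not what I actually ran: the build command, the headless boot command, and the serial marker string are all hypothetical placeholders.

```python
import subprocess

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Hypothetical placeholders; substitute the project's real commands and marker.
BUILD_CMD = ["make", "iso"]
BOOT_CMD = ["make", "run-headless"]
MARKER = "fork/exec ok"
BOOT_TIMEOUT_S = 60


def verdict(build_ok, serial_output):
    """Map a build result and captured boot output to a bisect exit code."""
    if not build_ok:
        # An unbuildable commit says nothing about the bug, so skip it.
        return SKIP
    return GOOD if MARKER in serial_output else BAD


def check_commit():
    """Build and boot the currently checked-out commit, then classify it."""
    build = subprocess.run(BUILD_CMD)
    if build.returncode != 0:
        return verdict(False, "")
    try:
        boot = subprocess.run(
            BOOT_CMD, capture_output=True, text=True, timeout=BOOT_TIMEOUT_S
        )
        output = boot.stdout
    except subprocess.TimeoutExpired as exc:
        # A hang still counts as a run; keep whatever output was captured.
        output = exc.stdout or ""
        if isinstance(output, bytes):
            output = output.decode(errors="replace")
    return verdict(True, output)
```

To use something like this, the script would end with `raise SystemExit(check_commit())` and be driven by `git bisect start <bad> <good>` followed by `git bisect run python3 <script>`.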
So coding isn't solved yet. But it's dang close.
The future of writing software, as an industry, seems to be evolving into something closer to philosophy:
Perhaps today being able to debug a program is a requirement, but will it always be? With the right harness, could claude not debug for you?
How long before someone writes the debug equivalent of Claude Code?
In between other personal projects, I will spend my excess quota on Vibix, if only to see how far it gets.