> You end up wasting tokens on implementation, debugging, execution, and parsing when you could just use the tool (tool description gets used instead).
The token are not wasted, because I rewind to before it started building the tool. That it can build and manipulate its own tools to me is the benefit, not the downside. The internal work to manipulate the tools does not waste any context because it's a side adventure that does not affect my context.
Maybe I'm not understanding the scenario well. I'm imagining an autonomous agent as a sort of baseline. Are you saying the agent says "I need to write a tool", it takes a snapshot, and once it's done, it rewinds to the snapshot but this time, it has the tool it desired? That's actually a really cool idea to do autonomously!
If you mean manually, that's still interesting, but that kind of feels like the same thing to me. The idea is - don't let the agent burn context writing tools, it should just use them. Isn't that exactly what yours is doing? Instead of rewinding to a snapshot, I have a separate code base for it. As tools get more complex, it seems nice to have them well-tested with standardized input and output. Generating tools on the fly, rewinding, and using tools is just the same thing. You even would need to provide some context that says what the tool is and how to use it, which is basically what the mcp server is doing.
> Are you saying the agent says "I need to write a tool", it takes a snapshot, and once it's done, it rewinds to the snapshot but this time, it has the tool it desired? That's actually a really cool idea to do autonomously!
I'm basically saying this except I currently don't give the agent a tool yet to do it automatically because it's not really RL'ed to that extend. So I use the branching and compaction functionality of my harness manually when it should do that.
> If you mean manually, that's still interesting, but that kind of feels like the same thing to me.
It's similar, but it retains the context and feels very naturally. There are many ways to skin the cat :)
The token are not wasted, because I rewind to before it started building the tool. That it can build and manipulate its own tools to me is the benefit, not the downside. The internal work to manipulate the tools does not waste any context because it's a side adventure that does not affect my context.