What do the major and minor semver numbers actually mean for these models? Is each minor release a new fine-tune on a new subset of example data, while major releases are trained from scratch? Or do they even mean anything at this point?
Nothing. The next major increment will happen when the marketing department is confident they can sell it as a major improvement without everyone laughing at them. Which, at this point, seems like never.
I think Anthropic's fearmongering and "leaks" of Mythos were them testing the waters for 5.x, which seems to have backfired.
Shameless self-promo, but I've been working on Optio specifically for coding. It works by taking any harness you want and tasking it with opening GitHub/GitLab PRs based on Notion/Jira/Linear tickets, see: https://news.ycombinator.com/item?id=47520220
It runs on top of k8s, so you can deploy it in your own compute cluster. Right now it's focused only on coding tasks, but I'm working on abstractions so you can orchestrate large runs of any agentic workflow the same way.
@jawiggins I saw your repo. It looks like OpenAI Symphony but better, since it works across multiple agents and issue trackers, and the feedback loop is great. One feature request though: can you add a plan mode? Your issues are so detailed that the issue itself becomes the plan to implement (though I guess your planning currently happens outside of GitHub issues). But say the issue is "implement support for plan mode": there could be a back and forth with the agent, with issue tags pointing to opus max and/or plan mode, so we can correct the agent's plan iteratively, and once the tag is removed it starts implementing. Or something along those lines?
Thanks for the feedback. Early on I expected I'd need more back and forth with the agents before accepting their work, but in general I've found it isn't needed.
I do have some features coming up that will improve the ability to converse with the agent as it's running. I'll make a note to add in a plan setting so you can have that run and converse before it gets going.
Do you just add the issue title like "feat: CLI improvements — status dashboard, workflow commands, shell completions" and it generates the plan in the issue body and starts working on it, OR is the plan generated by another AI agent and copied into the issue body for pickup by Optio?
Really interesting to see Google's approach to this.
Recently I shared my approach, Optio, which is also an Agent Orchestration platform: https://news.ycombinator.com/item?id=47520220
I was much more focused on integrating with ticketing systems (Notion, GitHub Issues, Jira, Linear), and then having coding agents specifically work towards merging a PR.
Scion's support for long-running agents and inter-container communication looks really interesting though; I think I'll have to go plan some features around that. Some of their concepts make less sense to me: I chose to build on top of k8s, whereas they seem to be recreating the control plane. I'm somewhat skeptical that the recreation and grove/hub are needed, but maybe they'll make more sense once I see them in action for the first time.
One pod is an instance of a repo; you can set how many instances of each agent/task can run on a pod at a time. For >1, each agent should be using its own worktree.
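The worktree-per-agent setup is just standard `git worktree`; here's a minimal sketch of how an orchestrator might provision one per agent (function name, branch naming, and paths are illustrative assumptions, not Optio's actual code):

```python
import subprocess
from pathlib import Path

def create_agent_worktree(repo_path: str, agent_id: str,
                          base_branch: str = "main") -> Path:
    """Give each concurrent agent its own git worktree so agents
    sharing a pod can edit the same repo without clobbering each other."""
    worktree_dir = Path(repo_path).parent / "worktrees" / agent_id
    branch = f"agent/{agent_id}"  # hypothetical branch naming scheme
    subprocess.run(
        ["git", "-C", repo_path, "worktree", "add", "-b", branch,
         str(worktree_dir), base_branch],
        check=True,
    )
    return worktree_dir
```

Each worktree gets its own checkout and branch, so two agents running in the same pod never race on the same files.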
Maybe - I do think as the model get better they'll be able to handle more and more difficult tasks. And yet, even if they can only solve the simplest issues now, why not let them so you can focus on the more important things?
Yup. MCP can be configured at the repo level. At task execution time, enabled MCP servers are written out as a .mcp.json file in the agent's worktree. Enabled skills are written as .claude/commands/{name}.md files in the worktree, making them available as slash commands to the agent.
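As a rough sketch of that materialization step (the file layout is as described above; the function and parameter names are mine, not Optio's API):

```python
import json
from pathlib import Path

def materialize_agent_config(worktree: Path,
                             mcp_servers: dict,
                             skills: dict) -> None:
    """At task start, drop repo-level config into the agent's worktree:
    enabled MCP servers as .mcp.json, enabled skills as slash commands."""
    # .mcp.json at the worktree root, in the shape Claude Code expects
    (worktree / ".mcp.json").write_text(
        json.dumps({"mcpServers": mcp_servers}, indent=2)
    )
    # Each skill becomes .claude/commands/{name}.md, i.e. a /name command
    commands = worktree / ".claude" / "commands"
    commands.mkdir(parents=True, exist_ok=True)
    for name, body in skills.items():
        (commands / f"{name}.md").write_text(body)
```

Keeping this per-worktree means two agents on the same pod can run with different servers and skills enabled.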
Generally I've found agents are capable of self-correcting as long as they can bash up against a guardrail and see the errors. So in Optio the agent is resumed and told to fix any CI failures or address review feedback.
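That resume-until-green loop can be sketched roughly like this, with a hypothetical `agent` object standing in for whatever harness drives the session (none of these method names are Optio's real API):

```python
def run_until_green(agent, task, max_attempts: int = 5) -> bool:
    """Let the agent self-correct: feed CI failures and review comments
    back into a resumed session until the checks pass (the guardrail)."""
    agent.start(task)
    for _ in range(max_attempts):
        result = agent.wait_for_ci()  # poll the PR's checks/reviews
        if result.passed and not result.review_comments:
            return True
        feedback = result.failure_log or "\n".join(result.review_comments)
        agent.resume(f"CI or review feedback to address:\n{feedback}")
    return False  # give up after max_attempts; escalate to a human
```

The key point is that the agent never has to be right the first time, it just has to see the error output on each resume.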