One thing's for sure: I won't be buying any SaaS, streaming, or ordering from Amazon if I have no future prospects for work. I already stopped most of my subscriptions because of a layoff unrelated to AI.
We buy food and go for walks as entertainment. It's been refreshing but also obviously scary.
Didn’t get the “scary” part. I also keep my entertainment to the minimum dependencies possible. I try to rely on stuff I own: music CDs, ISO videogames + emulators, physical books or ebooks (thanks Anna), exercise outdoors… ditching streaming like Netflix/YouTube, buying crap on Amazon, Uber, etc.
It’s the combination of AI changing the workplace, the large techs shedding double digit headcount, recruiting / hiring departments being so broken by the AI arms race hitting job applications, and the macro business environment generally being on the downward slope at the moment.
This feels like the same mechanism as climate change. The actors don't care since they're not completely responsible for that outcome and benefit from ignoring it.
Cutting welfare spending will get us nowhere. The major components of the federal budget are Defense, SS/Medicare/Medicaid, and debt payments. Not the forestry service or what we commonly know as welfare. At this point, even cutting everything else to zero still lands us in deficit. (Unless taxes are raised.)
To be serious, we need to talk about what cuts are to be made to SS/Medicare/Medicaid and the military. But no one wants to have that discussion. So we throw out meaningless issues like welfare and the forestry service. We quibble around at the extreme edges, never addressing the central problems. That's the essence of the politics being discussed. Those politics make the issue impossible to fix.
I honestly don't know why it's so hard? I'd be totally willing to countenance the necessary cuts to the sacred cow programs at this point. Why is everyone so opposed to it?
Yes "If" spending is cut or "If" taxes are not cut then you might have a balance.
But just implement balanced budget goals. Accept at most a deficit of 1% in the budget or whatever. Allow for a deviation from this to do QE but require a more qualified majority and limit to 1 year only.
Want to cut taxes? Fine - but don't do it with deficit spending. Want to increase welfare spending? Fine - but remember to then cut somewhere else OR increase taxes.
The fact that one side can implement large tax cuts funded by borrowing over and over (and still be elected again) is absolutely _crazy_ on a scale that is perhaps only rivaled by the healthcare system.
ICE is now larger and more expensive than the entire United States Marine Corps.
Let that sink in.
Not only that, we also seem to start a new war every 6 months. Demanding money for each one of them. SS/Pensions/Medicare seem to trend nowhere but up. And like Santa Claus the party in power keeps handing out tax cuts.
We have to make a change guys. The old ways aren't working. We can't be distracting from the central problems by yelling "welfare!". That doesn't work anymore.
And if I didn't quit my job, I'd still be able to pay rent. Oh no, whatever is there to do in this situation?
You can play the twisty game but the fact is simple - if he didn't have the political capital to cut spending, then cutting taxes is irresponsible governance.
Is it really, at the end of the day, a server that serves your data over JSON (aka a social media post) and offers feeds (ala an RSS model) of the changes of that data and provides services for discovering the schema of that JSON (aka its lexicon)?
I'd say "a verifier" here is a loose term. A great test suite is a verifier. I've done reverse-engineering projects that involved generating trace logs from the object under test, having a reimplementation emit the same logs, and running strict comparisons.
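To make the trace-log idea concrete, here is a minimal sketch of a strict comparison. It assumes both the original and the reimplementation can dump their traces as lists of text lines; the function name `first_divergence` is made up for illustration and is not from any particular tool.

```python
from itertools import zip_longest


def first_divergence(reference, candidate):
    """Return (line_number, ref_line, cand_line) for the first mismatch.

    Returns None when both traces are identical. A trace that ends early
    shows up as None on the shorter side.
    """
    pairs = zip_longest(reference, candidate)
    for line_number, (ref_line, cand_line) in enumerate(pairs, start=1):
        if ref_line != cand_line:
            return (line_number, ref_line, cand_line)
    return None


# Hypothetical traces: the reimplementation writes a different register value.
original = ["load r1 0x10", "add r1 r2", "store r1 0x20"]
rewrite = ["load r1 0x10", "add r1 r3", "store r1 0x20"]
print(first_divergence(original, rewrite))
```

Reporting only the first divergence tends to be the useful part: everything after it is usually noise caused by the same bug.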
OP's post is basically pointing out what certainly many others have independently discovered: Your agent-based dev operation is as good as the test rituals and guard rails you give the agents.
Can you explain your question a little more? The recursive agents will find the minimum that satisfies the deterministic termination condition, including cheating. In other words, the result will be literally correct yet wrong. I would go so far as to call it malicious compliance.
I have a recursive agent that finds trading strategies after recreating academic research and probing the model using its training on everything. It works really well, but I have to force it to write out every line and write a proof that data from after the wall-clock time didn't enter the system. Even then, some stupid thing like not converting the timezone for daylight saving time will allow it to peek 1 hour into the future. These types of bugs are almost impossible to find. Now there needs to be another agent whose only purpose is to write out every line explaining that the timezone for that line of code was correct.
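A rough sketch of the kind of guard that catches this class of bug, assuming each data row carries its own timestamp. The helper names (`to_utc`, `visible_rows`) and the sample bar are hypothetical, and fixed offsets are used only to keep the example self-contained; real code would more likely use `zoneinfo.ZoneInfo` with a named zone.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
FIXED_MINUS_5 = timezone(timedelta(hours=-5))  # vendor stamps in UTC-5 all year
FIXED_MINUS_4 = timezone(timedelta(hours=-4))  # what a DST-aware guess yields in July


def to_utc(ts):
    """Convert to UTC, refusing naive timestamps instead of guessing a zone."""
    if ts.tzinfo is None or ts.utcoffset() is None:
        raise ValueError(f"naive timestamp rejected: {ts!r}")
    return ts.astimezone(UTC)


def visible_rows(rows, as_of):
    """Keep only rows whose timestamp is at or before the as-of wall clock."""
    cutoff = to_utc(as_of)
    return [row for row in rows if to_utc(row[0]) <= cutoff]


as_of = datetime(2024, 7, 1, 15, 30, tzinfo=UTC)

# The bar really happened at 11:00 UTC-5, i.e. 16:00 UTC, after the as-of time.
correct_bar = (datetime(2024, 7, 1, 11, 0, tzinfo=FIXED_MINUS_5), 101.5)
# Same wall-clock digits labelled with the wrong offset land at 15:00 UTC.
mislabelled_bar = (datetime(2024, 7, 1, 11, 0, tzinfo=FIXED_MINUS_4), 101.5)

print(visible_rows([correct_bar], as_of))      # filtered out
print(visible_rows([mislabelled_bar], as_of))  # leaks through: a 1 hour peek
```

The guard only helps if the offset attached to the data is right in the first place, which is why the mislabelled bar still slips past it. Rejecting naive timestamps at least forces that decision to be made explicitly on one line.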
I used it (well, a skill based on the same idea) to optimise a prompt that does data extraction from UGC.
However, there isn't really a "correct" answer that's easy to define in code (I could manually label a training set, but wanted to avoid that), so I had the LLM just analyse the results itself and decide if they are better or not. It wrote deterministic rules for a few things, but overall it just reviewed the results of each round and decided if they are better or not.
Reviewing the before and after results, I would say yes, it's a big improvement in quality. It also optimised the prompt size to reduce input tokens by 25% and switched to a smaller/cheaper model.
It's tangential, but: I’m currently doing a rewrite of the backend of a project, and the verifier is basically the instruction of “maintain v1 functionality if observed from the api side externally”. This allows making a lot of tests based on existing data in the system and how the frontend expects data.
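As a sketch of what such an external check could look like: replay recorded requests against both backends and diff the responses. The handlers below are plain functions standing in for real HTTP calls, and every name (`compare_backends`, `v1_handler`, `v2_handler`) is hypothetical.

```python
import json


def normalize(payload):
    """Serialize with sorted keys so dict ordering alone never counts as a diff."""
    return json.dumps(payload, sort_keys=True)


def compare_backends(requests, v1_handler, v2_handler):
    """Replay each request against both backends and collect mismatches."""
    mismatches = []
    for request in requests:
        old, new = v1_handler(request), v2_handler(request)
        if normalize(old) != normalize(new):
            mismatches.append((request, old, new))
    return mismatches


# Stand-in handlers: v2 reorders keys (fine) but changes the type of "id" (not fine).
def v1_handler(request):
    return {"id": request["id"], "name": "item-%d" % request["id"]}


def v2_handler(request):
    item_id = request["id"] if request["id"] != 2 else str(request["id"])
    return {"name": "item-%d" % request["id"], "id": item_id}


recorded = [{"id": 1}, {"id": 2}]
for request, old, new in compare_backends(recorded, v1_handler, v2_handler):
    print("mismatch for", request, ":", old, "vs", new)
```

Comparing normalized JSON rather than raw bytes is a design choice: it tolerates key reordering while still flagging type changes the frontend would trip over.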
The thing they are really wildly behind on is a business model. They are losing wild amounts of money per customer and it is hard to see how the competitive situation is going to allow them to fix that.
Thing is, if you're using Codex, you're supporting Sam Altman and the idea of Sam Altmans, in the same way that if you use X or buy a Tesla, you're supporting Elon Musk and the idea of Elon Musks. That's a pretty big tax to factor into the usage of such products. If you even got 5% better coding results, would that make up for the future they're trying to build?
Dario wants to replace you with AI as well. Don't be fooled into thinking he's your friend because he said no to Trump that one time. I'll remind you that Musk used to be the left's hero not too long ago.
I'm in the "AI could be good for humanity" camp, and in this camp, we believe that Dario/Anthropic is a radically better choice going forward than the alternatives at this moment. In this camp we are not 'fooled into thinking he's our friend because he said no to Trump that one time', we are evaluating the entire set of available information and figuring that Anthropic's the best bet.
As for Musk ever being "the left"'s "hero" -- that's amazing, that's what Pauli would call 'not even wrong'.
Funny of you to bring up "humanity" while singing praise of the guy who's on record A-OK with aiding mass surveillance of anywhere not the U.S., and who's happy to help kill people but just wouldn't do it fully automatically, merely because he doesn't believe the tech is there yet. All while collaborating with Palantir.
If by "the best bet" you mean slightly less shitty bet then maybe.
I'm getting pretty close too, but I wouldn't switch to Codex; I'd switch to one of the open agents that can use any backing LLM. My reasoning is that if I'm willing to pay the cost of the small changes in usage, I might as well switch to an open source agent that I can add my own convenience features to, like remote sessions and phone-based operation.
Why Codex when you can use something that hasn't been touched by Sam Altman? Surely, your drive to get the very best model isn't stronger than your sense of ethics?
For the topic of remote control, Happy seems to be working pretty well for Claude Code but is also supposed to support Codex. It's a bit rough around the edges, but nice that it is open source: https://github.com/slopus/happy
Given the topic of this article, we've been using Claude Code pointed at Bedrock and have never had any scaling issues. Obviously it is more expensive paying by the token than a monthly plan, but I sometimes have 2 or 3 instances chugging along with Opus 4.7 or Sonnet 4.6 and have never had it be down or error out due to usage limits.
That would be subscription customers, no? Rather than Bedrock or per-API customers? Many of the companies running on Bedrock or by-use have per-day limits above the max monthly subscription costs.
They are going to lose, the question is how much. I don't see Russians losing the territory they occupy right now, unless some black swan event happens. It remains to be seen how much more territory Russia is able to conquer until they lose their appetite.
This is a common story, that Ukraine has lost control over a vital part of its country, and that is true.
But so has Russia: they have also lost control over their stuff, stuff that Ukraine is constantly blowing up.
Ukraine now has a long-range offensive capability which allows it to strike deep into Russia.
Russia's vital strategic assets are slowly depleting due to Ukrainian strikes, while Ukraine has better support from the rest of the world and is now in fact cooperating with some powerful nations across the world.
Russia is managing to advance a few square kilometers of territory per month(?) while losing expensive, hard-to-replace assets continuously.
Have you seen how the Russians have treated conquered territories with torture chambers and having to submit to the dictatorship for life? And life is often only a few months as they round the males up and force them into the Russian army to get killed.
And to replenish their losses, the Ukrainian regime snatches men on the street. There are thousands of videos made by bystanders.[0] Come and see.
P.S. I love the ad I see on that site: "Over 10 million Ukrainians suffer from anxiety due to the war. Free exercises with scientifically proven effectiveness." The most important exercise now is running - it saves your life.
The Ukrainians seem to be dealing with it by switching to bots, drones and the like as much as possible and moving human resources back.
I'm not sure about the Russians. That would kind of make sense for them too but things seem a bit gummed up by bureaucracy - I just read a thread about them having to use firecrackers in drones due to such restrictions https://x.com/ChrisO_wiki/status/2049026651544023271
I hope Russia manages to get some more sensible leadership or policies. The Ilya Remeslo guy being let out and able to criticize Putin seems somewhat promising.
> I just read a thread about them having to use firecrackers in drones due to such restrictions
Haven't read the thread but the post omits a crucial detail. This is about using interceptor drones inside Russia, not on the frontline. Apparently, the thinking is that failed interceptor drones present a hazard of their own, but it might be outdated now.
>I hope Russia manages to get some more sensible leadership or policies.
Like what?
More and more people in Russia are unhappy with Putin dragging his feet with the so-called special military operation. They think that it's long overdue to turn to total war and forget about minimizing civilian losses in the Ukraine.
Well, an obvious solution would be to go back to Russia and do something else. You don't have to invade other countries and have an empire. I'm a Brit and we gave up on that about a century ago and it hasn't been so bad. The whole thing seems anachronistic, I think based on Putin reading too many history books and avoiding modern info on the internet.
>I'm a Brit and we gave up on that about a century ago and it hasn't been so bad.
Wait until Scotland or Northern Ireland gets independent and then China or some other powerful country "midwifes" an anti-British coup there and then we'll talk.