BoumTAC · 2026-03-17T17:46:44 1773769604

To me, mini releases matter much more than SOTA releases, and they better reflect real progress.

The frontier models have become so good that it's getting almost impossible to notice meaningful differences between them.

Meanwhile, when a smaller / less powerful model releases a new version, the jump in quality is often massive, to the point where we can now use them 100% of the time in many cases.

And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.

brikym · 2026-03-17T18:07:53 1773770873

If you're doing something common then maybe there are no differences with SOTA. But I've noticed a few. GPT 5.4 isn't as good at UI work in Svelte. Gemini tends to go off and implement stuff even if I prompt it to discuss, but it's pretty good at UI code. Claude tends to find out less about my code base than GPT, and it abuses the `any` type in TypeScript.

patates · 2026-03-17T19:02:57 1773774177

A big part of these differences may be the system prompts and/or the harness.

pzo · 2026-03-17T17:50:28 1773769828

They are cheaper than SOTA, but they're not getting dramatically cheaper. Actually it's the opposite: GPT 5.4 mini is around 3x more expensive than GPT 5.0 mini.

Similarly, Gemini 3.1 Flash Lite got more expensive than Gemini 2.5 Flash Lite.

BoumTAC · 2026-03-17T18:01:05 1773770465

But they are getting dramatically better.

What's the point of a crazy cheap model if it's shit?

I code most of the time with Haiku 4.5 because it's so good. It's cheaper for me than buying a 23€ subscription from Anthropic.

philipkglass · 2026-03-17T18:18:39 1773771519

The crazy cheap models may be adequate for a task, and low cost matters with volume. I need to label millions of images to determine if they're sexually suggestive (this includes but is not limited to nudity). The Gemini 2.0 Flash Lite model is inexpensive and performs well. Gemini 2.5 Flash Lite is also good, but not noticeably better, and it costs more. When 2.0 gets retired this June my costs are going up.

dev_hugepages · 2026-03-18T07:07:05 1773817625

Time to gather a dataset and train your own model!

ainch · 2026-03-17T23:42:28 1773790948

I use Gemini via its web app, which aggressively autoswitches to Flash over Pro, but I usually notice quickly because the answers are weird or the logic doesn't quite follow. I feel like, at least for 'daily driver' usage, small models are still a little disappointing. That said, they're getting very good for more automation-y work with simple, well-constrained tasks.

mountainriver · 2026-03-18T02:08:13 1773799693

Most annoying part of their web app and a really terrible idea.

I often just think Gemini is terrible, but then it turns out they silently changed the model on me.

ainch · 2026-03-18T15:07:08 1773846428

I got Codex to whip me up a Chrome extension that autoswaps back to Pro whenever I reload the page. It's made Gemini significantly less irritating to use.

zozbot234 · 2026-03-17T20:29:13 1773779353

> And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.

They're not really cheaper than the SOTA open models on third-party inference platforms, and they're generally dumber. I suppose they're still worth it if you must minimize latency for any given level of smarts, but not really otherwise.

XCSme · 2026-03-17T22:32:06 1773786726

Well, in that case, the difference between 5 mini and 5.4 mini is quite minimal.

5.4 mini seems to be a lot more wild/unstable, but with this instability it gets the right answer more often.

https://aibenchy.com/compare/openai-gpt-5-4-mini-medium/open...

sebastiennight · 2026-03-17T19:51:32 1773777092

> 100% of the time in many cases

So, every single time, the new model works most of the time?

kennywinker · 2026-03-17T23:00:23 1773788423

You’ve parsed the sentence wrong.

Read it as: “You can use them full time in many cases”

BoumTAC · 2026-03-12T14:53:38 1773327218

It's because they are getting so good it's impossible to recognize them.

Haiku 4.5 is already so good it's ok for 80% (95%?) of dev tasks.

FuckButtons · 2026-03-12T22:59:07 1773356347

I must be writing very different software than you. I keep Opus on a tight leash and it still comes to the strangest conclusions.

lukan · 2026-03-13T00:52:32 1773363152

Very possible. Some things work like a charm on first try for me, others you can spell it out again and again. And then yet again. Something to do with training data, obviously.

Bolwin · 2026-03-12T22:59:51 1773356391

I've found Haiku to be truly mediocre to work with. If you want cheap models, the open source ones are much better.

BoumTAC · 2025-07-22T02:32:40 1753151560

I'm all about trashing Europe when it's needed, but I think this post is a hidden PR post.

It seems so fake to me and so far from the experience I have here in France.

BoumTAC · 2025-07-21T15:41:42 1753112502

You should not ask indie hackers for advice and you should not hang out with them.

If you build a product for marketers, you should hang out with them and ask them for advice, not indie hackers who know nothing about marketing.

If you build a product for bakers, you should hang out with them to understand what they need, not with indie hackers who have never baked anything in their lives.

That sounds logical, but for certain types of products, it is not.

There is no point in talking with indie hackers. It's only useful if you need knowledge about coding skills, which is rarely the case (especially now with AI).

BoumTAC · 2025-03-28T10:49:50 1743158990

How does it work? How does it find the environment?

Let's say I have a project in `/home/boumtac/dev/myproject` with the venv inside.

If I run `uv python find --script /home/boumtac/dev/myproject/my_task.py`, will it find the venv?

JimDabell · 2025-03-28T11:05:50 1743159950

The philosophy of uv is that the venv is ephemeral; creating a new venv should be fast enough that you can do it on demand.

Do you have a standalone script or do you have a project? --script is for standalone scripts. You don’t use it with projects.

If you tell it to run a standalone script, it will construct the venv itself on the fly in $XDG_CACHE_HOME.

If you have a project, then it will look in the .venv/ subdirectory by default and you can change this with the $UV_PROJECT_ENVIRONMENT environment variable. If it doesn’t find an environment where it is expecting to, it will construct one.
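
For the standalone-script case, here is a minimal sketch, assuming the script carries an inline metadata block (the `# /// script` comment header) and reusing the file name `my_task.py` from the question. The helper `describe_environment` is hypothetical and only reports which interpreter ended up running the script:

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
# The comment block above is inline script metadata; the values are illustrative.
import sys


def describe_environment():
    """Report which interpreter is running and whether it lives in a venv."""
    return {
        "executable": sys.executable,
        # In a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    print(describe_environment())
```

Running this with something like `uv run --script my_task.py` should show an interpreter path inside uv's cache rather than the project's `.venv/`, which would match the description above. That is an expectation to check locally, not a guarantee.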

BoumTAC · 2025-03-28T09:15:10 1743153310

I'm not a native English speaker, but don't you use ";" in English?

To me, it feels like it serves the same purpose as em dashes.

And I discovered the em dash through ChatGPT. I had never seen it before.

layer8 · 2025-03-28T14:10:06 1743171006

A semicolon connects, whereas an em-dash creates more of a pause and therefore separates. In addition, em-dashes can be used in pairs to create a parenthesis, which semicolons can’t. I think with time you will appreciate the difference.

https://thenarrativearc.org/blog/2020/2/4/epic-grammar-battl...

OJFord · 2025-03-28T09:30:45 1743154245

Dashes surround a sub-clause - something like this - which is like a parenthetical addition to a sentence that could stand alone without it; semi-colons (';') connect a further sentence or part of one where perhaps a full-stop and additional word could have been. They also sometimes separate list items following a colon, especially if the things listed are longer sentences perhaps themselves containing commas that'd otherwise be ambiguous.

grey413 · 2025-03-28T09:57:22 1743155842

Em dashes are very similar to semicolons. You use em dashes if your related sentence is in the middle of another sentence, and semicolons if it's at the end.

They're frequently used in skilled and professional grade writing.

mmooss · 2025-03-28T18:51:31 1743187891

So as not to mislead anyone, the parent is mostly incorrect:

Here's an example sentence: Semicolons must have independent clauses—phrases that could form a full sentence on their own—on both sides of them; they are essentially alternatives for periods. Em dashes don't require independent clauses on either side.

In the example sentence,

* "phrases that could form a full sentence on their own" is not an independent clause but is valid between em dashes. "on both sides of them", after the em dashes, is also not an independent clause. (The em dashes function like commas or parentheses here.)

* The parts before and after the semicolon are independent clauses. You could replace the semicolon with a period and you'd have perfectly valid grammar. I just chose to connect the two sentences a bit more.

I don't know if you can use em dashes as the parent comment describes, connecting three independent clauses:

* My favorite fruit is peaches—they are very sweet—I eat them all summer.

I think the above is wrong; it should be one of the following:

* My favorite fruit is peaches—they are very sweet—and I eat them all summer.: The last section is a dependent clause made by "and", not an independent clause.

* My favorite fruit is peaches—they are very sweet; I eat them all summer.: On both sides of the semicolon are independent clauses; I could replace the semicolon with a period.

Maybe there are examples I'm not thinking of? I infer that the rule might be that the punctuation following the em-dashed clauses should be the punctuation that would have been used without the em-dashed clause, but that's based on very limited evidence.

mmooss · 2025-03-28T18:58:50 1743188330

Many people don't use semicolons (;) in English but many do, and they are certainly part of correct grammar.

Semicolons are generally alternatives to periods, when you want more connection between the two sentences. Like periods, semicolons must have two full sentences—that is, what could be full sentences—on either side of them; the potential 'full sentences' are properly called independent clauses. (A dependent clause needs the rest of the sentence to form valid grammar; it can't function on its own. For example, in this paragraph's first sentence, "when you want more connection between the two sentences" is a dependent clause. Often they follow commas.)

Another use of semicolons is for lists in a paragraph where one of the list items has a comma in it (similar to the parsing problem for CSVs where some records contain commas): I only like wine; beer, but only ales; and orange juice.
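
The CSV comparison can be made concrete with a minimal Python sketch. It is only an illustration using the example list from the sentence above; the variable names are made up for the demo. Splitting on commas breaks an item apart, splitting on semicolons keeps it whole, and the standard `csv` module gets the same result by quoting the field instead:

```python
import csv
import io

sentence_list = "wine; beer, but only ales; and orange juice"

# Splitting on commas cuts through the second item, like an unquoted CSV field.
naive = [part.strip() for part in sentence_list.split(",")]

# Splitting on semicolons keeps every item intact.
items = [part.strip() for part in sentence_list.split(";")]

# The csv module handles the embedded comma by quoting the field.
buffer = io.StringIO()
csv.writer(buffer).writerow(items)
csv_line = buffer.getvalue().strip()

# Reading the line back recovers the original three items.
round_trip = next(csv.reader(io.StringIO(csv_line)))

print(naive)
print(items)
print(csv_line)
```

In other words, the semicolon plays the role of a second, higher-level delimiter, while CSV keeps one delimiter and marks the exceptions with quotes.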

BoumTAC · on Nov 2, 2024

Shopify is built using Ruby on Rails, and they successfully handle enormous traffic spikes during Black Friday sales without issues.

So I think we're good with performance.

rockyj · on Nov 2, 2024

Everything can scale if you throw enough servers at it. Of course Shopify scales, they even spent time and money to build a JIT on top of Ruby. As a smaller company, does everyone have the time and money to spend on servers or optimising the language to this extent?

bnferguson · on Nov 2, 2024

That's the nice thing! You don't need to optimise the language and build a JIT as a smaller company, Shopify already did that for you. Just like Google did for JavaScript, which led to JavaScript having any performance at all (which led to Node being a thing).

Also remember that Shopify didn't start out making billions. They started as a small side project on a far, far slower version of Ruby and Rails.

Same with GitHub, same with many others that are either still on Rails or started there.

You can optimise things later once you actually have customers, know the shape of your problem and where the actual pain points are/what needs to be scaled.

To me, I care a ton about performance (it's an area I work in), but there's not a lot of sense in sacrificing development agility for request speed on things that may not matter or be things people will pay for. Especially when you're small.

chucke · on Nov 2, 2024

No, they only have time for features and productivity, which is, as you pointed out earlier, what Rails is good at.

lentil · on Nov 2, 2024

Smaller companies have less traffic, need less expensive servers, and have no need to spend money optimising the language. They can focus on that when they make billions of dollars, like Shopify does.

axelthegerman · on Nov 2, 2024

And in the meantime just passively benefit from the OSS improvements along the way

ksec · on Nov 2, 2024

>So I think we're good with performance.

>On Rails, the most heavy page has a P95 duration of 338 ms. There is of course room for improvement but it's plenty snappy.

I guess everyone will have a different opinion on a P95 of 338 ms. The great thing is that we are getting cheaper CPU cores and YJIT. As long as this trend continues, the definition of Fast Enough will cover more ground for more people.

dajonker · on Nov 2, 2024

There are lots of tricks you can do, such as preloading pages when the user hovers over the link. This makes even a "slow" page load of 400 ms feel pretty much instant to a human.

BoumTAC · on June 28, 2024

Something I don't understand: Boeing 737s have been flying for years without any issues.

Why have there been so many problems with them in the last few months? Why did these problems not appear before?

BoumTAC · on May 13, 2024

Did they state the rate limit for free users?

Because I have the Plus membership, which is expensive ($25/month).

But if the limit is high enough (or my usage low enough), there is no point in me paying that much money.

BoumTAC · on Feb 4, 2024

I think it's because of two things.

- The first one is Chrome OS.

- There's a noticeable trend where the general population is increasingly favoring smartphones over traditional computers for their digital needs. This shift has predominantly affected Windows users, as they represent a significant portion of the casual computing market. As a result, the proportion of dedicated Linux users has become more pronounced.

jszymborski · on Feb 4, 2024

Chrome OS and Android are their own category in this list.