Hacker News | BoumTAC's comments

To me, mini releases matter much more and better reflect the real progress than SOTA models.

The frontier models have become so good that it's getting almost impossible to notice meaningful differences between them.

Meanwhile, when a smaller / less powerful model releases a new version, the jump in quality is often massive, to the point where we can now use them 100% of the time in many cases.

And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.


If you're doing something common then maybe there are no differences with SOTA. But I've noticed a few. GPT 5.4 isn't as good at UI work in Svelte. Gemini tends to go off and implement stuff even if I prompt it to discuss, but it's pretty good at UI code. Claude tends to find out less about my code base than GPT, and it abuses the `any` type in TypeScript.

A big part of these differences may be the system prompts and/or the harness.

They are cheaper than SOTA, but they are not getting dramatically cheaper. It's actually the opposite: GPT 5.4 mini is around 3x more expensive than GPT 5.0 mini.

Similarly, Gemini 3.1 Flash Lite got more expensive than Gemini 2.5 Flash Lite.


But they are getting dramatically better.

What's the point of a crazy cheap model if it's shit?

I code most of the time with Haiku 4.5 because it's so good. It's cheaper for me than buying a 23€ subscription from Anthropic.


The crazy cheap models may be adequate for a task, and low cost matters with volume. I need to label millions of images to determine if they're sexually suggestive (this includes but is not limited to nudity). The Gemini 2.0 Flash Lite model is inexpensive and performs well. Gemini 2.5 Flash Lite is also good, but not noticeably better, and it costs more. When 2.0 gets retired this June my costs are going up.

Time to gather a dataset and train your own model!

I use Gemini via its web app, which aggressively autoswitches to Flash over Pro, but I usually notice quickly because the answers are weird or the logic doesn't quite follow. I feel like, at least for 'daily driver' usage, small models are still a little disappointing. That said, they're getting very good for more automation-y work made up of simple, well-constrained tasks.

Most annoying part of their web app and a really terrible idea.

I often just think Gemini is terrible but then it turns out they silently changed the model on me


I got Codex to whip me up a Chrome extension that autoswaps back to Pro whenever I reload the page. It's made Gemini significantly less irritating to use.

> And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.

They're not really cheaper than the SOTA open models on third-party inference platforms, and they're generally dumber. I suppose they're still worth it if you must minimize latency for any given level of smarts, but not really otherwise.


Well, in that case, the difference is quite minimal between 5 mini and 5.4 mini

5.4 mini seems to be a lot more wild/unstable, but with this instability it gets the right answer more often.

https://aibenchy.com/compare/openai-gpt-5-4-mini-medium/open...


> 100% of the time in many cases

So, every single time, the new model works most of the time?


You’ve parsed the sentence wrong.

Read it as: “You can use them full time in many cases”


It's because they are getting so good it's impossible to recognize them.

Haiku 4.5 is already so good it's ok for 80% (95%?) of dev tasks.


I must be writing very different software than you; I keep Opus on a tight leash and it still comes to the strangest conclusions.

Very possible. Some things work like a charm on first try for me, others you can spell it out again and again. And then yet again. Something to do with training data, obviously.

I've found Haiku to be truly mediocre for working with. If you want a cheap model, the open source ones are much better.

I'm all about trashing Europe when it's needed but I think this post is a hidden PR post.

It seems so fake to me and so far from the experience I have here in France.


You should not ask indie hackers for advice and you should not hang out with them.

If you build a product for marketers, you should hang out with them and ask them for advice, not indie hackers who know nothing about marketing.

If you build a product for bakers, you should hang out with them to understand what they need, not with indie hackers who have never baked anything in their lives.

That sounds logical, but for certain types of products, it is not.

There is no point in talking with indie hackers. It's only useful if you need knowledge about coding skills, which is rarely the case (especially now with AI).


How does it work? How does it find the environment?

Let's say I have a project in `/home/boumtac/dev/myproject` with the venv inside.

If I run `uv python find --script /home/boumtac/dev/myproject/my_task.py`, will it find the venv?


The philosophy of uv is that the venv is ephemeral; creating a new venv should be fast enough that you can do it on demand.

Do you have a standalone script or do you have a project? --script is for standalone scripts. You don’t use it with projects.

If you tell it to run a standalone script, it will construct the venv itself on the fly in $XDG_CACHE_HOME.

If you have a project, then it will look in the .venv/ subdirectory by default and you can change this with the $UV_PROJECT_ENVIRONMENT environment variable. If it doesn’t find an environment where it is expecting to, it will construct one.
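To make the project case concrete, here is a minimal Python sketch of the lookup described above. It is not uv's code: `expected_project_env` is a made-up helper that models only the two rules mentioned (default `.venv/` in the project directory, overridden by `$UV_PROJECT_ENVIRONMENT`). Resolving a relative override against the project directory is an assumption on my part, and the sketch does not create anything if the directory is missing.

```python
import os
from pathlib import Path


def expected_project_env(project_dir, environ=None):
    """Return the directory where the project's environment is expected.

    Hypothetical helper that models only the two rules from the comment
    above; it is not part of uv.
    """
    environ = os.environ if environ is None else environ
    project_dir = Path(project_dir)
    override = environ.get("UV_PROJECT_ENVIRONMENT")
    if override:
        # Assumption: a relative override is resolved against the project
        # directory. Joining with an absolute path yields that absolute path.
        return project_dir / override
    # Default described above: the .venv/ subdirectory of the project.
    return project_dir / ".venv"


if __name__ == "__main__":
    print(expected_project_env("/home/boumtac/dev/myproject", environ={}))
```

So for the project in the question, the sketch points at the `.venv` directory inside `/home/boumtac/dev/myproject` unless the variable is set.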


I'm not a native English speaker, but don't you use the ";" in English?

To me, it feels like it serves the same purpose as the em dash.

And I discovered the em dash through ChatGPT; I had never seen it before.


A semicolon connects, whereas an em-dash creates more of a pause and therefore separates. In addition, em-dashes can be used in pairs to create a parenthesis, which semicolons can’t. I think with time you will appreciate the difference.

https://thenarrativearc.org/blog/2020/2/4/epic-grammar-battl...


Dashes surround a sub-clause - something like this - which is like a parenthetical addition to a sentence that could stand alone without it; semi-colons (';') connect a further sentence or part of one where perhaps a full-stop and additional word could have been. They also sometimes separate list items following a colon, especially if the things listed are longer sentences perhaps themselves containing commas that'd otherwise be ambiguous.


Em dashes are very similar to semicolons. You use em dashes if your related sentence is in the middle of another sentence, and semicolons if it's at the end.

They're frequently used in skilled and professional grade writing.


So as not to mislead anyone, the parent is mostly incorrect:

Here's an example sentence: Semicolons must have independent clauses—phrases that could form a full sentence on their own—on both sides of them; they are essentially alternatives for periods. Em dashes don't require independent clauses on either side.

In the example sentence,

* "phrases that could form a full sentence on their own" is not an independent clause but is valid between em dashes. "on both sides of them", after the em dashes, is also not an independent clause. (The em dashes function like commas or parentheses here.)

* The parts before and after the semicolon are independent clauses. You could replace the semicolon with a period and you'd have perfectly valid grammar. I just chose to connect the two sentences a bit more.

I don't know if you can use em dashes as the parent comment describes, connecting three independent clauses:

* My favorite fruit is peaches—they are very sweet—I eat them all summer.

I think the above is wrong; it should be one of the following:

* "My favorite fruit is peaches—they are very sweet—and I eat them all summer." The last section is a dependent clause made by "and", not an independent clause.

* "My favorite fruit is peaches—they are very sweet; I eat them all summer." On both sides of the semicolon are independent clauses; I could replace the semicolon with a period.

Maybe there are examples I'm not thinking of? I infer that the rule might be that the punctuation following the em-dashed clauses should be the punctuation that would have been used without the em-dashed clause, but that's based on very limited evidence.


Many people don't use semicolons (;) in English but many do, and they are certainly part of correct grammar.

Semicolons are generally alternatives to periods, when you want more connection between the two sentences. Like periods, semicolons must have two full sentences—that is, what could be full sentences—on either side of them; the potential 'full sentences' are properly called independent clauses. (A dependent clause needs the rest of the sentence to form valid grammar; it can't function on its own. For example, in this paragraph's first sentence, "when you want more connection between the two sentences" is a dependent clause. Often they follow commas.)

Another use of semicolons is for lists in a paragraph where one of the list items has a comma in it (similar to the parsing problem for CSVs where some records contain commas): I only like wine; beer, but only ales; and orange juice.
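The CSV comparison can be shown with a few lines of Python. This is only a sketch of the parsing problem, using the list from the sentence above: splitting on commas breaks an item that contains a comma, while splitting on semicolons keeps every item whole.

```python
sentence_list = "wine; beer, but only ales; and orange juice"

# Splitting on commas cuts the second item in half.
by_comma = [part.strip() for part in sentence_list.split(",")]

# Splitting on semicolons keeps each list item whole.
by_semicolon = [part.strip() for part in sentence_list.split(";")]

print(by_comma)      # ['wine; beer', 'but only ales; and orange juice']
print(by_semicolon)  # ['wine', 'beer, but only ales', 'and orange juice']
```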


Shopify is built using Ruby on Rails, and they successfully handle enormous traffic spikes during Black Friday sales without issues.

So I think we're good with performance.


Everything can scale if you throw enough servers at it. Of course Shopify scales; they even spent time and money to build a JIT on top of Ruby. As a smaller company, does everyone have the time and money to spend on servers or optimising the language to this extent?


That's the nice thing! You don't need to optimise the language and build a JIT as a smaller company; Shopify already did that for you. Just like Google did for JavaScript, which led to JavaScript having any performance at all (which led to Node being a thing).

Also remember that Shopify didn't start out making billions. They started as a small side project on a far, far slower version of Ruby and Rails.

Same with GitHub, same with many others that are either still on Rails or started there.

You can optimise things later once you actually have customers, know the shape of your problem and where the actual pain points are/what needs to be scaled.

To me, I care a ton about performance (it's an area I work in), but there's not a lot of sense in sacrificing development agility for request speed on things that may not matter or be things people will pay for. Especially when you're small.


No, they only have time for features and productivity, which is, as you pointed out earlier, what rails is good at.


Smaller companies have less traffic, need less expensive servers, and have no need to spend money optimising the language. They can focus on that when they make billions of dollars, like Shopify does.


And in the meantime just passively benefit from the OSS improvements along the way


>So I think we're good with performance.

>On Rails, the most heavy page has a P95 duration of 338 ms. There is of course room for improvement but it's plenty snappy.

I guess everyone will have a different opinion on a P95 of 338 ms. The great thing is that we are getting cheaper CPU core prices and YJIT. As long as this trend continues, the definition of Fast Enough will cover more ground for more people.
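For anyone unsure what the number means: a P95 of 338 ms says that 95% of requests for that page finished in 338 ms or less. Below is a minimal Python sketch using the nearest-rank method. The durations are made up for illustration, and monitoring tools may interpolate differently, so treat it as a way to read the metric, not as how the quoted figure was computed.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of the values are <= it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up request durations in milliseconds, for illustration only.
durations_ms = [80, 95, 110, 120, 130, 140, 150, 160, 170, 180,
                190, 200, 210, 220, 230, 240, 250, 260, 338, 900]

print(percentile_nearest_rank(durations_ms, 95))  # 338
```

Note that the single 900 ms outlier does not move the P95 here, which is why the metric says little about the very slowest requests.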


There are lots of tricks you can do, such as preloading pages when the user hovers over the link. This makes even a "slow" page load of 400ms feel pretty much instant to a human.
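The arithmetic behind the hover trick, as a small Python sketch. The 300 ms gap between hovering and clicking is an assumption picked for illustration, not a measurement, and `perceived_wait_ms` is a made-up name.

```python
def perceived_wait_ms(load_ms, hover_lead_ms):
    """Time the user still waits after clicking if loading began on hover."""
    return max(0, load_ms - hover_lead_ms)


# Assumption: the pointer rests on the link about 300 ms before the click.
print(perceived_wait_ms(400, 300))  # 100
print(perceived_wait_ms(400, 0))    # 400 (no preloading)
```

The longer the pointer sits on the link before the click, the more of the load time is hidden; once the lead time exceeds the load time, the wait drops to zero.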


Something I don't understand: Boeing 737s have been flying for years without any issues.

Why have there been so many problems with them over the last few months? Why did these problems not appear before?


Did they announce the rate limit for free users?

Because I have the plus membership which is expensive (25$/month).

But if the limit is high enough (or my usage low enough), there is no point in me paying that much money.


I think it's because of two things.

- The first one is Chrome OS.

- There's a noticeable trend where the general population is increasingly favoring smartphones over traditional computers for their digital needs. This shift has predominantly affected Windows users, as they represent a significant portion of the casual computing market. As a result, the proportion of dedicated Linux users has become more pronounced.


Chrome OS and Android are their own category in this list.

