
Ah yes, the company that still can't get its gesture and back-swipe UX functioning properly 7 years after its introduction, and that with Apple giving them 2 years to study it beforehand.

A decade to produce a non-functioning gesture bar / system. Such a titan among titans.


Even Gemini with no memory does hilarious things. Like, if you ask it how heavy the average man is, you usually get the right answer but occasionally you get a table that says:

- 20-29: 190 pounds

- 30-39: 375 pounds

- 40-49: 750 pounds

- 50-59: 4900 pounds

Yet somehow people believe LLMs are on the cusp of replacing mathematicians, traders, lawyers and whatnot. At least for code you can write tests, but even then, how are you gonna trust something that can casually make such obvious mistakes?
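To make the "at least you can write tests" point concrete, here's a minimal sketch (all names invented): even a one-line sanity bound would immediately flag output as wildly wrong as the weight table above.

```python
# Hypothetical sketch: a sanity check that would catch a 4900-pound "average man".
def average_weight_lbs(weights):
    """Mean of a list of body weights in pounds."""
    return sum(weights) / len(weights)

avg = average_weight_lbs([180, 195, 190, 200])
# No adult age bracket averages anywhere near 375+ pounds.
assert 100 < avg < 300, f"implausible average: {avg}"
```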


> how are you gonna trust something that can casually make such obvious mistakes?

In many cases, a human can review the content generated, and still save a huge amount of time. LLMs are incredibly good at generating contracts, random business emails, and doing pointless homework for students.


And humans are incredibly bad at "skimming through a long text to check for errors", so this is not a happy pairing.

As for the homework, there is obviously a huge category that is pointless. But it should not be that way: the fundamental idea behind homework is sound, and the only way to properly learn something is by doing exercises and thinking it through yourself.


Yeah, ChatGPT's paid version is wildly inaccurate on very important and very basic things. I never got onboard with AI to begin with but nowadays I don't even load it unless I'm really stuck on something programming related.

So what? That might happen one out of 100 times. Even if it’s 1 in 10 who cares? Math is verifiable. You’ve just saved yourself weeks or months of work.

You don't think these errors compound? Generated code has hundreds of little decisions. Yes, it "usually" works.

LLMs: sometimes wrong, but never in doubt.

Not in my experience. With a proper TDD framework it does better than most programmers at a company who anecdotally have a bug every 2-3 tasks.

The kind of mistakes it makes are usually strange and inhuman though. Like getting hard parts correct while also getting something fundamental about the same problem wrong. And not in the “easy to miss or type wrong” way.

I wish I had an example saved for you, but it happens to me pretty frequently. Not only that, but it also usually does testing incorrectly at a fundamental level, or builds tests around incorrect assumptions.


I've seen LLMs implement "creative" workarounds. Example: Sonnet 4.5 couldn't figure out how to authenticate a web socket request using whatever framework I was experimenting with, so it decided to just not bother. Instead, it passed the username as part of the web socket request and blindly trusted that user was actually authenticated.

The application looked like it worked. Tests did pass. But if you did a cursory examination of the code, it was all smoke and mirrors.
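A hedged sketch of that failure mode (all names invented, not the actual framework code): the generated handler trusted a client-supplied username, whereas real auth resolves identity from something the server can verify.

```python
# What the model effectively wrote: blind trust in client input.
def whoami_insecure(msg: dict) -> str:
    return msg["username"]  # any client can claim to be anyone

# What it should look like conceptually: resolve identity server-side.
SESSIONS = {"token-abc": "alice"}  # stand-in for real session storage

def whoami(msg: dict) -> str:
    user = SESSIONS.get(msg.get("token"))
    if user is None:
        raise PermissionError("unauthenticated web socket request")
    return user
```

Tests written against the insecure version still pass, which is exactly why "tests passed" proved nothing here.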


Yeah recently it had an issue getting OIDC working and decided to implement its own, throwing in a few thousand extra lines. I'm sure there were no security holes created in there at all. /s

Well, the tests passed, right?

Yes, I wish I had saved some of my best examples too. One I had was super weird, in ChatGPT Pro: it told me that after 30 years my interest would become negative and I would start losing money. It didn't want to accept the error.
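For reference, the math it refused to accept is a two-liner (illustrative numbers, not the original prompt): at any positive rate, compound interest on a positive principal grows monotonically and can never turn negative.

```python
# Compound interest: balance = principal * (1 + rate)^years.
def balance_after(principal: float, rate: float, years: int) -> float:
    return principal * (1 + rate) ** years

b30 = balance_after(1000.0, 0.05, 30)
b31 = balance_after(1000.0, 0.05, 31)
assert b31 > b30 > 1000.0  # still growing after year 30, never negative
```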

Errors compounding is a meme. In iterated as well as verifiable domains, errors dilute instead of compounding because the LLM has repeated chances to notice its failure.

Yes, just use random results. You’ve just saved yourself weeks or months of work of gathering actual results.

> a Finnish band called Steve'n'Seagulls

I wonder if their name is a deep cut Nightwish reference, who are also Finnish: https://youtu.be/gg5_mlQOsUQ?t=17s


Brilliant :) Now I wonder this, too.

Airline tickets used to be transferable. Hell, you used to be able to fly under someone else's name.

9/11 changed a lot. Just like how before World War I, passports and border controls weren't really a thing.


That seems very unlikely.

Chinese AI vendors specifically pointed out that even a few gens ago there was maybe 5-15% more capability to squeeze out via training, but that the cost for this is extremely prohibitive and only US vendors have the capex to have enough compute for both inference and that level of training.

I'd take their word over someone who has a vested interest in pushing Anthropic's latest and greatest.

The real improvements are going to be in tooling and harnessing.


> The real improvements are going to be in tooling and harnessing

I don't have any special knowledge here, but the guy in the podcast (who worked/works with one of the big AI firms) is the one who made the claim. In the future, when (if?) the speed of development slows, I agree it would no longer be true.


Your hardware will age slower if you have consistent load.

Thermal stress from bursty workloads is much more of a wearing problem than electromigration. If you can consistently keep the SoC at a specific temperature, it'll last much longer.

This is also why it was very ironic that crypto-miner GPUs would get sold at massive discounts. Everyone assumed they had been run ragged, but a proper miner would have undervolted the card and run it at consistent utilization, meaning the card would be in better condition than a secondhand gamer GPU that had constantly been shifting between 1% and 80% utilization, or rather, 30°C and 75°C.


The worst is that if you're someone who enjoys multiple niche things and who also interacts with the algorithm (like - dislike - not interested), your account gets marked as a content-discovery vanguard and it will endlessly feed you videos with <1000 views just so they can get more feelers on whether the content is actually good.

Even if you consistently hit "not interested", the algorithm never ever figures out that the overlapping theme is that you (generally) don't like low-view-count, low-subscriber-count content.


AFAIK a lot of the bigger sites / services already hide or outright strip EXIF.

It's better to do it at the source, obviously.
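As a stdlib-only sketch of what "doing it at the source" involves (simplified parser; a real tool should use an image library): JPEG metadata lives in APP1 segments tagged "Exif", which can simply be dropped while copying the rest of the byte stream.

```python
def strip_exif(data: bytes) -> bytes:
    """Return a copy of a JPEG byte stream with APP1 Exif segments removed.

    Simplified: walks marker segments up to start-of-scan, then copies the rest.
    """
    out = bytearray(data[:2])  # keep the SOI marker 0xFFD8
    i = 2
    while i + 4 <= len(data) and data[i] == 0xFF:
        marker = data[i + 1]
        if marker == 0xDA:  # start-of-scan: copy the remainder verbatim
            break
        seg_len = int.from_bytes(data[i + 2:i + 4], "big")
        seg = data[i:i + 2 + seg_len]
        # Drop APP1 (0xFFE1) segments carrying the "Exif" identifier.
        if not (marker == 0xE1 and seg[4:10] == b"Exif\x00\x00"):
            out += seg
        i += 2 + seg_len
    out += data[i:]
    return bytes(out)
```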


But value is meaningless. The ultimate purpose for a producer in a market is to provide infinite value for no cost.

Of course that is a pipe dream, so it should provide the highest value for the lowest cost.

For those who want to argue it should be a balance: consider the opposite position. A producer should provide no value at infinite cost. In this case, everything withers. If party A and party B need each other's products to survive, they can do that when the value is infinite and the cost is zero, but not when the value is zero and the cost is infinite.

The last few decades have shown that giving the finger to the customer and going all in on shareholdermaxxing has nothing but terrible effects and is like sticking a spanner into the wheel of capitalism.


My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies, nor a massive acceleration on quality or breadth (not quantity!) of development.

Microsoft has been going heavy on AI for 1y+ now. But then they replace their cruddy native Windows Copilot application with an Electron one. If tests and dev only have marginal cost now, why aren't they going all in on writing extremely performant, almost completely bug-free native applications everywhere?

And this repeats itself across all big tech and AI-hype companies. They all claim these supposed earth-shattering gains in productivity, but then... there hasn't been anything to show for it in years? Despite that whole subset of tech plus big tech dropping trillions of dollars on it?

And then there is also the really uncomfortable question for all tech CEOs and managers: LLMs are better at 'fuzzy' things like writing specs or documentation than they are at writing code. And LLMs are supposedly godlike. Leadership is a fuzzy thing. At some point the chickens will come home to roost, and tech companies with LLM CEOs / managers and human developers, or even completely LLM-run companies, will outperform human-led / managed ones. The capital class will jeer about that for a while, but the cost of tokens will continue to drop to near zero. At that point, they're out of leverage too.


Your proof-in-pudding test seems to assume that AI is binary -- either it accelerates everyone's development 100x ("let's rewrite every app into bug-free native applications") or nothing ("there hasn't been anything to show for that in years"). I posit reality is somewhere in between the two.

Considering we were promised "AI will replace nearly all devs", "AI will give a 100x boost", and such, it makes sense to question this.

After all, almost all hyped technology is also "somewhere between the two" extremes of not doing what it promises at all and fully doing it. The question is which edge it's closer to.


LLMs are capable of searching information spaces and generating outputs that one can use to do their job.

But they're not taking anyone's job, ever. People are not bots; a lot of the work they do is tacit and goes well beyond the capabilities and abilities of LLMs.

Many tech firms are essentially mature and are currently using too much labour. This will lead to a natural cycle of layoffs if they cannot figure out projects to allocate the surplus labour to. This is normal and healthy - only a deluded economist believes in 'perfect' stuff.


"it’s not taking anyone’s job, ever"

It has already and that doesn't mean new jobs haven't been created or that those new jobs went to those who lost their jobs.


In this entire thread of conversation, I never said that LLMs would take people's jobs, and that is not something I believe.

Leadership is also a very human thing. I think most people would balk at the idea of being led by an LLM.

One of the main functions of leaders is (or should be) to assume responsibility for decisions and outcomes. A computer can't do that.

And finally why should someone in power choose to replace themselves?


>One of the main functions of leaders is (or should be) to assume responsibility for decisions and outcomes. A computer can't do that.

Sure it can. "Assuming responsibility" just means people/the law let you.

It can be totally empty too, like CEOs or politicians "assuming responsibility" for some outcome but nevertheless suffering zero consequences.


Someone in power doesn't get to choose - the board of directors does, whose job is to act in the best interest of shareholders.

Firms tend to follow peers in an industry - once one blinks the rest follow.


> Someone in power doesn't get to choose - the board of directors does, whose job is to act in the best interest of shareholders.

Alas, shareholder value is a great ideal, but in practice it tends to be honoured rather less strictly.

As you can also see when sudden competition leads to rounds of efficiency improvements, cost cutting and product enhancements: even without competition, a penny saved is a penny earned for shareholders. But only when fierce competition threatens to put managers' jobs at risk do they really kick into overdrive.


>shareholder value is a great ideal

It's one of the most horrible ideas ever, responsible for everything from market abuse and enshittification to rent seeking and patent trolling.


The board of directors are also people in power - why not replace them with an LLM as well if it works so well for CEOs?

> Someone in power doesn’t get to choose - the board of directors do

Since the board of directors can decide to replace the CEO, it's not the CEO who holds the (ultimate) power, it's the board of directors.


Since the majority shareholder(s) can decide to replace the board of directors, it’s not the board of directors who holds the (ultimate) power, it’s the majority shareholder(s).

Indeed, and there we reached the end of the chain.

> LLMs are better at 'fuzzy' things like writing specs or documentation than they are at writing code.

At least for writing specs, this is clearly not true. I am a startup founder/engineer who has written a lot of code, but I've written less and less code over the last couple of years and very little now. Even much of the code review can be delegated to frontier models now (if you know which ones to use for which purpose).

I still need to guide the models to write and revise specs a great deal. Current frontier LLMs are great at verifiable things (quite obvious to those who know how they're trained), including finding most bugs. They are still much less competent than expert humans at understanding many 'softer' aspects of business and user requirements.


> Microsoft has been going heavy on AI for 1y+ now. But then they replace their cruddy native Windows Copilot application with an Electron one.

This.

Also, Microsoft is going heavy on AI, but it's primarily chatbot gimmicks they call Copilot agents, and they need to deeply integrate them with all their business products and have customers grant access to all their communications and business data to give the chatbot something to work with. They go on and on at their AI Tour with examples of how a company can run on agents alone, and they tell everyone their job is obsoleted by agents, but they don't seem to dogfood any of these products.


> My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies

This assumes that companies will announce such mass firings (yeah, I'm aware of WARN Act); when in reality they will steadily let go of people for various reasons (including "performance").

From my (tech heavy) social circle, I have noticed an uptick in the number of people suddenly becoming unemployed.


> My proof-in-pudding test is still the fact that we haven't seen gigantic mass firings at tech companies

Jevons paradox.


For Jevons paradox to be a win-win, you need these 3 statements to be true:

1) Workers get more productive thanks to AI.

2) Higher worker productivity translates into lower prices.

3) Most importantly, consumer demand needs to explode in reaction to lower prices. And we're finding out in real time that the demand is inelastic.

Around 1900, 40% of American workers worked in agriculture. Today, it's < 2%.

Which is similar to what we see with coding: demand has not exploded enough to offset each worker being able to produce more, just as it didn't for each farmer being able to produce more food.

