No amount of valuation can fix global supply issues for GPUs for inference unfor...

natpalmer1776 · 2026-04-15T15:17:35 1776266255

Remember when OpenAI wasn’t allowing new subscriptions to their ChatGPT pro plans because they were oversubscribed? Pepperidge Farms remembers.

andai · 2026-04-15T15:25:45 1776266745

Wouldn't that be good? I remember back in the day you could only get Gmail thru an invite, it was an awesome strategy. "Currently closed for applications" creates FOMO. They'd just need to actually get the GPUs in relatively short supply. They could do it in bursts though, right? "Now accepting applications for a short time."

I'm not an internet marketer but that sounds like a win win to me. People feel special, they get extra hype, and the service isn't broken.

hirako2000 · 2026-04-15T15:41:14 1776267674

In the case of Gmail that was fake scarcity.

In the case of Anthropic is fake availability.

Sam Altman explained the idea is to scale the thing up, and see what happens.

He hadn't claimed to offer a solution to the supply problem that would unfold.

bruckie · 2026-04-15T18:08:16 1776276496

Are you sure it was fake scarcity for Gmail? IIRC they did it because they were worried about systems falling over if it grew too fast, and discovered the marketing benefits as a side effect.

iainmerrick · 2026-04-15T17:48:05 1776275285

Are you mixing up Anthropic and OpenAI here?

hirako2000 · 2026-04-16T16:15:26 1776356126

I didn't. Anthropic and others followed the concept of scaling up models and worry about efficiency and availability later. Sam likely didn't invent the idea but he talked about it.

the_gipsy · 2026-04-15T15:32:48 1776267168

Yes, "Pepperidge farm remembers" is usually about how something used to be good.

CoastalCoder · 2026-04-15T17:33:16 1776274396

Yeah, but there was a spoof on that (in Family Guy?). It was a tie in to the movie "I Know what you Did last Summer", IIRC.

joquarky · 2026-04-15T17:23:06 1776273786

Google Wave demonstrated that this doesn't always work.

scratchyone · 2026-04-15T15:33:07 1776267187

maybe, but the response to GPU shortages being increased error rates is the concern imo. they could implement queuing or delayed response times. it's been long enough that they've had plenty of time to implement things like this, at least on their web-ui where they have full control. instead it still just errors with no further information.

skeledrew · 2026-04-15T15:49:24 1776268164

I've been experiencing a good amount of delays (says it's taking extra time to really think, etc), and I'm using during off-peak time.

scratchyone · 2026-04-15T15:51:34 1776268294

i notice that as well. most of the time when i see those it has a retry counter also and i can see it trying and failing multiple requests haha. almost never succeeds in producing a response when i see those though, eventually just errors out completely.

hirako2000 · 2026-04-15T15:38:54 1776267534

Coding is a problem solved. Claud writes the code. I edit it. I code around it.

Engineer roles dead in 6 months.

post-it · 2026-04-15T16:25:00 1776270300

> I edit it. I code around it.

You're never gonna guess what software engineers do.

bulbar · 2026-04-15T18:09:03 1776276543

Because of the context I would think this is sarcasm, but I am not sure.

hirako2000 · 2026-04-16T16:17:56 1776356276

It is.

zachncst · 2026-04-15T15:57:48 1776268668

Sure but we don't need GPUs to log in.

sobellian · 2026-04-15T15:30:46 1776267046

Their issues seem to extend well beyond inference into services like auth.

ryandrake · 2026-04-15T15:45:13 1776267913

Yes. Whenever these outages happen, it always seems that it's their login system that is broken.

bostik · 2026-04-15T17:37:51 1776274671

That implies that either the auth is too heavy (possible, ish) or their systems don't degrade gracefully enough and many different types of failures propagate up and out all the way to their outermost layer, ie. auth (more plausible).

Disclosure: I have scars from a distributed system where errors propagated outwards and took down auth...

AlecSchueler · 2026-04-15T18:13:04 1776276784

> thus the reason why we're seeing them do other things to cut down on inference cost (ie changing their default thinking length).

The dynamic thinking and response length is funny enough the best upgrade I've experienced with the service for more than a year. I really appreciate that when I say or ask something simple the answer now just comes back as a single sentence without having to manually toggle "concise" mode on and off again.

paulddraper · 2026-04-15T17:54:33 1776275673

A. These aren’t rate limit errors from the API.

B. Everything is down, even auth.