More

nikitaga · 2026-03-29T23:53:02 1774828382

Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.

The former relies on fairly controversial ideas about copyright and fair use to qualify as abuse, whereas the latter is direct financial damage – by your own direct competitors no less.

It's fun to poke at a seeming hypocrisy of the big bad, but the similarity in this case is quite superficial.

PunchyHamster · 2026-03-30T01:29:30 1774834170

> Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.

I bet people being fucking DDOSed by AI bots disagree

Also the fucking ignorance assuming it's "static content" and not something needing code running

remus · 2026-03-30T07:42:37 1774856557

I think the parent is just pointing out that these things lie on a spectrum. I have a website that consists largely of static content and the (significant) scraping which occurs doesn't impact the site for general users so I don't mind (and means I get good, up to date answers from LLMs on the niche topic my site covers). If it did have an impact on real users, or cost me significant money, I would feel pretty differently.

0xEF · 2026-03-30T08:35:35 1774859735

Putting everything on a spectrum is what got us into this mess of zero regulation and moving goal posts. It's slippery slope thinking no matter which way we cut it, because every time someone calls for a stop sign to be put up after giving an inch, the very people who would have to stop will argue tirelessly for the extra mile.

Aerroon · 2026-03-30T12:23:10 1774873390

What mess are you talking about? The existence of LLMs? I think it's pretty neat that I can now get answers to questions I have.

This is something I couldn't have done before, because people very often don't have the patience to answer questions. Even Google ended up in loops of "just use Google" or "closed. This is a duplicate of X, but X doesn't actually answer the question" or references to dead links.

Are there downsides to this? Sure, but imo AI is useful.

butlike · 2026-03-30T16:39:31 1774888771

It's just repackaged Google results masquerading as an 'answer.' PageRank pulled results and displayed the first 10 relevant links and the LLM pulls tokens and displays the first relevant tokens to the query.

Just prompt it.

daveidol · 2026-03-30T10:50:46 1774867846

I’d argue putting everything in terms of black and white is the bigger issue than understanding nuance

instig007 · 2026-03-30T11:07:10 1774868830

Generalizing with "everything", "all", etc exclusive markers is exactly the kind of black/white divide you're arguing against. What happened to your nuanced reality within a single sentence? Not everything is black and white, but some situations are.

fc417fc802 · 2026-03-30T12:24:17 1774873457

The person he's replying to argued against putting things on a spectrum. Does that not imply painting everything in black and white? Thus his response seems perfectly sensible to me.

instig007 · 2026-03-30T18:57:20 1774897040

He argued against putting things in a spectrum in many instances where that would be wrong, including the case under the question. What's your argument against that idea? LLM'ed too much lately?

fc417fc802 · 2026-03-30T23:39:54 1774913994

He argued against and the response presented a counterargument. Both were based around social costs and used the same wording (ie "everything").

You made a specious dismissal. Now you're making personal attacks. Perhaps it's actually you who is having difficulty reasoning properly here?

Den_VR · 2026-03-30T02:47:02 1774838822

I miss the www where the .html was written in vim or notepad.

mghackerlady · 2026-03-30T13:20:24 1774876824

It still can be. Do it. Go make your website in M$ Frontpage, for all I care

butlike · 2026-03-30T16:42:05 1774888925

Shameless plug: My music homepage follows the HTML 2.0 spec and is written by hand

https://sampleoffline.com/

mghackerlady · 2026-03-30T17:05:56 1774890356

heck yeah B)

consp · 2026-03-30T07:19:13 1774855153

Just did that for a test frontend for a module I needed to build (not my primary job so don't know anything about UI but running in browsers was a requirement), so basic HTML with the bare minimum of JS and all DOM. Colleagues were very surprized. And yes, vim is still the goto editor and will be for a long time now all "IDE" are pushing "AI" slop everywhere.

holler · 2026-03-30T03:03:21 1774839801

ahh yes, fresh off reading "Html For Dummies" I made my first tripod.com site

sdsd · 2026-03-30T03:43:51 1774842231

For me it was making a petpage for my neopets using https://lissaexplains.com/

It's still up in all its glory.

DigiEggz · 2026-03-30T05:28:27 1774848507

This is great! The name reference also made me smile.

eloisius · 2026-03-30T06:39:22 1774852762

Also wild that from the tech bro perspective, the cost of journalism is just how much data transfer costs for the finished article. Authors spend their blood, sweat and tears writing and then OpenAI comes to Hoover it up without a care in the world about license, copyright or what constitutes fair use. But don’t you dare scrape their slop.

lelanthran · 2026-03-30T08:48:50 1774860530

> Also wild that from the tech bro perspective, the cost of journalism is just how much data transfer costs for the finished article.

Exactly. I think the unfairness can be mitigated if models trained on public information, or on data generated by a model trained on public information, or has any of those two in its ancestry, must be made public.

Then we don't have to hit (for example) Anthropic, we can download and use the models as we see fit without Anthropic whining that the users are using too much capacity.

mikkupikku · 2026-03-30T10:25:00 1774866300

[flagged]

jazzyjackson · 2026-03-30T13:36:52 1774877812

The library's archive is not a service provided by the newspaper

mikkupikku · 2026-03-30T16:15:19 1774887319

So? If the newspaper's website is willing to serve the documents, what's the problem?

The point is, if you're pleading with others to respect ""intellectual property"" then you're a worm serving corporate interests against your own.

jazzyjackson · 2026-03-30T21:29:21 1774906161

I may be a worm but at least I respect that others might have a different take on how best to make creative work an attainable way of life since before copyright law it was basically "have a wealthy patron who steered if not outright commissioned what you would produce"

eru · 2026-03-30T03:44:41 1774842281

> I bet people being fucking DDOSed by AI bots disagree

Are you sure it's a DDoS and not just a DoS?

MattJ100 · 2026-03-30T06:57:05 1774853825

Yes, it is. The worst offenders hammer us (and others) with thousands upon thousands of requests, and each request uses unique IP addresses making all per-IP limits useless.

We implemented an anti-bot challenge and it helped for a while. Then our server collapsed again recently. The perf command showed that the actual TLS handshakes inside nginx were using over 50% of our server's CPU, starving other stuff on the machine.

It's a DDoS.

troyvit · 2026-03-30T04:58:45 1774846725

You should see Cloudflare's control panel for AI bot blocking. There are dozens of different AI bots you can choose to block, and that doesn't even count the different ASNs they might use. So in this case I'd say that a DDoS is a decent description. It's not as bad as every home router on the eastern seaboard or something, but it's pretty bad.

Bilal_io · 2026-03-30T03:55:14 1774842914

Uncoordinated DDoS, when multiple search and AI companies are hammering your server.

catoc · 2026-03-30T06:50:18 1774853418

> Are you sure it's a DDoS and not just a DoS?

I think these days it’s ‘DAIS’, as in your site just DAIS - from Distributed/Damned AI Scraping

SolarNet · 2026-03-30T03:54:20 1774842860

When every AI company does it from multiple data centers... yes it's distributed.

1718627440 · 2026-03-30T11:08:33 1774868913

Off topic, but why is a DoS something considered to act on, often by just shutting down the service altogether? That results in the same DoS just by the operator than due to congestion. Actually it's worse, because now the requests will never actually be responded rather then after some delay. Why is the default not to just don't do anything?

pocksuppet · 2026-03-30T12:39:34 1774874374

It keeps the other projects hosted on the same server or network online. Blackhole routes are pushed upstream to the really big networks and they push them to their edge routers, so traffic to the affected IPs is dropped near the sender's ISP and doesn't cause network congestion.

DDoSers who really want to cause damage now target random IPs in the same network as their actual target. That way, it can't be blackholed without blackholing the entire hosting provider.

echoangle · 2026-03-30T11:25:29 1774869929

I think some people use hosting that is paid per request/load, so having crawlers make unwanted requests costs them money.

ImPostingOnHN · 2026-03-30T11:39:15 1774870755

*> Why is the default not to just don't do anything?

Because ingress and compute costs often increase with every request, to the point where AI bot requests rack up bills of hundreds or thousands of dollars more than the hobbyist operator was expecting to send.

lm411 · 2026-03-30T05:01:39 1774846899

> Also the fucking ignorance assuming it's "static content" and not something needing code running

Wild eh.

If it's not ai now, it's by default labelled "static content" and "near-zero marginal cost".

littlestymaar · 2026-03-30T05:32:30 1774848750

What's a database after all.

nikitaga · 2026-03-30T12:12:34 1774872754

All this reactionary outrage in the comments is funny. And lame.

Yes, for the vast majority of the internet, serving traffic is near zero marginal cost. Not for LLMs though – those requests are orders of magnitude more expensive.

This isn't controversial at all, it's a well understood fact, outside of this irrationally angry thread at least. I don't know, maybe you don't understand the economic term "marginal cost", thus not understanding the limited scope of my statement.

If such DDOSes as you mention were common, such a scraping strategy would not have worked for the scraper at all. But no, they're rare edge cases, from a combination of shoddy scrapers and shoddy website implementations, including the lack of even basic throttling for expensive-to-serve resources.

The vast majority of websites handle AI traffic fine though, either because they don't have expensive to serve resources, or because they properly protect such resources from abuse.

If you're an edge case who is harmed by overly aggressive scrapers, take countermeasures. Everyone with that problem should, that's neither new nor controversial.

ipaddr · 2026-03-30T12:59:33 1774875573

"such DDOSes as you mention were common, such a scraping strategy would not have worked for the scraper at all"

They are common. The strategy works for the llm but not for the website owner or users who can't use a site during this attack.

The majority of sites are not handling AI fine. Getting Ddosed only part of the time is not acceptable. Countermeasures like blocking huge ranges can help but also lock out legimate users.

nikitaga · 2026-03-30T20:15:22 1774901722

> They are common

Any actual evidence of the alleged scope of this problem, or just anecdotes from devs who are mad at AI, blown out of proportion?

ipaddr · 2026-03-30T21:15:24 1774905324

Love AI so can't be that. Not devs website owners. Yes ask AI for stats.

fireflash38 · 2026-03-30T12:52:00 1774875120

It's not a cost for me to scrape LLM.

It is a cost for me for LLM to scrape me.

Why should I care about costs that have when they don't care about the costs I have?

grayhatter · 2026-03-30T12:51:45 1774875105

The extent of the utilization is new.

The number of bots that try to hide who they are, and don't bother to even check robots.txt is new.

juliangmp · 2026-03-30T18:57:34 1774897054

"They are rare edge cases" are we on the same internet?

expedition32 · 2026-03-30T14:00:57 1774879257

One euro is marginal for me for someone else it is their daily meal.

not2b · 2026-03-30T01:10:35 1774833035

I understand why OpenAI is trying to reduce its costs, but it simply isn't true that AI crawlers aren't creating very significant load, especially those crawlers that ignore robots.txt and hide their identities. This is direct financial damage and it's particularly hard on nonprofit sites that have been around a long time.

zer00eyz · 2026-03-30T04:42:21 1774845741

> but it simply isn't true that AI crawlers aren't creating very significant load.

And how much of this is users who are tired of walled gardens and enshitfication. We murdered RSS, API's and the "open web" in the name of profit, and lock in.

There is a path where "AI" turns into an ouroboros, tech eating itself, before being scaled down to run on end user devices.

stingraycharles · 2026-03-30T02:41:36 1774838496

These are ChatGPT and Claude Desktop crawlers we’re talking about? Or what is it exactly? Are these really creating significant load while not honoring robots.txt?

Genuinely interested.

63stack · 2026-03-30T09:06:56 1774861616

Is this the first time you are reading HN? Every day there are posts from people describing how AI crawlers are hammering their sites, with no end. Filtering user agents doesn't work because they spoof it, filtering IPs doesn't work because they use residential IPs. Robots.txt is a summer child's dream.

miki123211 · 2026-03-30T08:48:40 1774860520

They seem to mostly be third-party upstarts with too much money to burn, willing to do what it takes to get data, probably in hopes of later selling it to big labs. Maaaybe Chinese AI labs too, I wouldn't put it past them.

OpenAI et al seem to mostly be well-behaved.

cruffle_duffle · 2026-03-30T02:54:55 1774839295

I bet dollars to doughnuts that 95% of the traffic is from Claude and ChatGPT desktop / mobile and not literal content scraping for training.

crote · 2026-03-30T04:10:46 1774843846

That wouldn't explain the 1000x increase in traffic for extremely obscure content, or seeing it download every single page on a classic web forum.

duttish · 2026-03-30T06:52:19 1774853539

And doing it over, and over, and over and over again. Because sure it didn't change in the last 8 years but maybe it's changed since yesterdays scrape?

lm411 · 2026-03-30T03:18:05 1774840685

That is ridiculous.

You imply that "an expensive llm service" is harmed by abuse, but, every other service is not? Because their websites are "static" and "near-zero marginal cost"?

You have no clue what you are talking about.

camillomiller · 2026-03-30T05:23:49 1774848229

Well he’s a simp

cicko · 2026-03-30T06:19:46 1774851586

Interesting how other people's cost is "near-zero marginal cost" while yours is "an expensive LLM service". Also, others' rights are "fairly controversial ideas about copyright and fair use" while yours is "direct financial damage". I like how you frame this.

sandeepkd · 2026-03-30T02:36:35 1774838195

Lets not try to qualify the wrongs by picking a metric and evaluating just one side of it. A static website owner could be running with a very small budget and the scraping from bots can bring down their business too. The chances of a static website owner burning through their own life savings are probably higher.

expedition32 · 2026-03-30T07:55:15 1774857315

Perhaps the long play is to destroy all small hobby websites until only a AI directed web is left.

miki123211 · 2026-03-30T08:51:37 1774860697

If you're truly running a static site, you can run it for free, no matter how much traffic you're getting.

Github pages is one way, but there are other platforms offering similar services. Static content just isn't that expensive to host.

THe troubles start when you're actually running something dynamic that pretends to be static, like Wordpress or Mediawiki. You can still reduce costs significantly with CDNs / caching, but many don't bother and then complain.

ezrast · 2026-03-30T15:36:07 1774884967

Setting aside the notion that a site presenting live-editability as its entire core premise is "pretending to be static", do the actual folks at Wikimedia, who have been running a top 10 website successfully for many years, and who have a caching system that worked well in the environment it was designed for, and who found that that system did not, in fact, trivialize the load of AI scraping, have any standing to complain? Or must they all just be bad at their jobs?

https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-th...

jazzyjackson · 2026-03-30T13:40:40 1774878040

It's true it can be done but many business owners are not hip to cloudflare r2 buckets or github pages. Many are still paying for a whole dedicated server to run apache (and wordpress!) to serve static files. These sites will go down when hammered by unscrupulous bots.

alsetmusic · 2026-03-30T02:33:33 1774838013

Have you not seen the multiple posts that have reached the front page of HN with people taking self-hosted Git repos offline or having their personal blogs hammered to hell? Cause if you haven't, they definitely exist and get voted up by the community.

AmbroseBierce · 2026-03-30T04:49:13 1774846153

It's not like those models are expensive because the usefulness that they extracted from scraping others without permission right? You are not even scratching the surface of the hypocrisy

VadimPR · 2026-03-30T06:10:01 1774851001

Getting scraped by abusive bots who bring down the website because they overload the DB with unique queries is not marginal. I spent a good half of last year with extra layers of caching, CloudFlare, you name it because our little hobby website kept getting DDoS'd by the bots scraping the web for training data.

Never in 15 years if running the website did we have such issues, and you can be sure that cache layers were in place already for it to last this long.

wolvoleo · 2026-03-30T05:59:19 1774850359

It's more ironic because without all the scraping openai has done, there would have been no ChatGPT.

Also, it's not just the cost of the bandwidth and processing. Information has value too. Otherwise they wouldn't bother scraping it in the first place. They compete directly with the websites featuring their training data and thus they are taking away value from them just as the bots do from ChatGPT.

In fact the more I think of it, I think it's exactly the same thing.

expedition32 · 2026-03-30T07:59:01 1774857541

This leads me to thinking: I ask chatGPT a question and they get the answer from gamefaqs.

But what happens if gamefaqs disappears because of lack of traffic?

Can LLM actually create or only regurgitate content.

Aerroon · 2026-03-30T13:15:56 1774876556

>Can LLM actually create or only regurgitate content.

Contrary to what others say, LLMs can create content. If you have a private repo you can ask the LLM to look at it and answer questions based on that. You can also have it write extra code. Both of these are examples of something that did not exist before.

In terms of gamefaqs, I could theoretically see an LLM play a game and based on that write about the game. This is theoretical, because currently LLMs are nowhere near capable enough to play video games.

wolvoleo · 2026-03-30T08:26:43 1774859203

It will remain in their scraped data so they can keep including it in their later training datasets if they wish. However it won't be able to do live internet searches anymore. And it will not generate new content of course. Especially not based on games released after the site codes down so it doesn't know. Though it could of course correlate data from other sources that talk about the game in question.

stefanka · 2026-03-30T08:08:27 1774858107

They cannot create original content.

wolvoleo · 2026-03-30T08:27:59 1774859279

Well they can make some up, like hallucination. That's an additional problem: when the original site that provided the training data is gone: how can they use verify the AI output to make sure it's correct?

unsungNovelty · 2026-03-30T12:21:29 1774873289

"near-zero marginal costs". For whom exactly????

https://drewdevault.com/2025/03/17/2025-03-17-Stop-externali...

lelanthran · 2026-03-30T08:29:56 1774859396

I don't think a rule along the lines of "Doing $FOO to a corporate is forbidden, but doing $FOO to a charitable initiative is fine" is at all fair.

What "$FOO" actually is, is irrelevant. I'm curious how you would convince people that this sort of rule is fair.

The corp can always ban users who break ToS, after all. They don't need any help. The charitable initiative can't actually do that, can they?

ungreased0675 · 2026-03-30T11:39:03 1774870743

You’re describing the tragedy of the commons. No single raindrop thinks it’s responsible for the flood.

razingeden · 2026-03-30T01:25:17 1774833917

It is direct financial damage if my servers not on an unmetered connection — after years of bills coming in around $3/mo I got a surprise >$800 bill on a site nobody on earth appears to care about besides AI scrapers.

It hasn’t even been updated in years so hell if I know why it needs to be fetched constantly and aggressively, - but fuck every single one of these companies now whining about bots scraping and victimizing them, here’s my violin.

gzread · 2026-03-30T05:31:08 1774848668

If you can identify the scraper you should have a valid legal case to recover damages.

thisislife2 · 2026-03-30T14:37:34 1774881454

Only if they had a robots.txt for their site.

razingeden · 2026-03-30T14:58:21 1774882701

I hadn’t even considered that. Don’t know why that comment is greyed out or downvoted.

It’s a static site that hasn’t been updated since 2016—- so it’s .. since been moved to cloudflare r2 where it’s getting a $0.00 bill, and it now has a disallow / directive. I’m not sure if it’s being obeyed because the cf dash still says it’s getting 700-1300 hits a day even with all the anti bot, “cf managed robots” stuff for ai crawlers in there.

The content is so dry and irrelevant I just can’t even fathom 1/100th of that being legitimate human interest but I thought these things just vacuumed up and stole everyone’s content instead of nailing their pages constantly?

gzread · 2026-03-30T15:02:32 1774882952

No, it's still illegal to DDoS sites that don't have robots.txt.

thisislife2 · 2026-03-30T15:46:21 1774885581

You are right, I hadn't considered that aspect.

grishka · 2026-03-30T08:27:27 1774859247

> Scraping static content from a website at near-zero marginal cost to its server

It's not possible to know in advance what is static and what is not. I have some rather stubborn bots make several requests per second to my server, completely ignoring robots.txt and rel="nofollow", using residential IPs and browser user-agents. It's just a mild annoyance for me, although I did try to block them, but I can imagine it might be a real problem for some people.

I'm not against my website getting scraped, I believe being able to do that is an important part what the web is, but please have some decency.

the_sleaze_ · 2026-03-30T04:14:09 1774844049

60% of our traffic is bot, on average. Sometimes almost 100%.

not_your_vase · 2026-03-30T05:06:24 1774847184

  > net-zero marginal cost

Lol, you single-handedly created a market for Anubis, and in the past 3 years the cloudflare captchas have multiplied by at least 10-fold, now they are even on websites that were very vocal against it. Many websites are still drowning - gnu family regularly only accessible through wayback machine.

Spare me your tears.

xmcqdpt2 · 2026-03-30T11:29:16 1774870156

AI providers also claim to have small marginal costs. The costs of token is supposedly based on pricing in model training, so not that different from eg your server costs being low but the content production costs being high. And in many cases AI companies are direct competitors (artists, musicians etc.)

(TBH it's not clear to me that their marginal costs are low. They seem to pick based on narrative.)

ori_b · 2026-03-30T13:47:38 1774878458

My website serving git that only works from Plan 9 is serving about a terabyte of web traffic monthly. Each page load is about 10 to 30 kilobytes. Do you think there's enough organic, non-scraper interest in the site that scrapers are a near-zero part of the cost?

SkiFire13 · 2026-03-30T05:47:37 1774849657

> Scraping static content

How do you know the content is static?

foobiekr · 2026-03-30T14:33:49 1774881229

You are, of course, ignoring the production costs of the static content that OpenAi is stealing.

Stop justifying their anti-social behavior because it lines your pockets.

bakugo · 2026-03-29T23:55:08 1774828508

The cost is so marginal that many, many websites have been forced to add cloudflare captchas or PoW checks before letting anyone access them, because the server would slow to a crawl from 1000 scrapers hitting it at once otherwise.

mcfedr · 2026-03-30T16:24:38 1774887878

I'm sure the copyright holders would consider your use of their content as direct financial damage

heyethan · 2026-03-30T02:41:15 1774838475

I think this also explains why the checks are moving up the stack.

If the real cost is in actually running the app or the model, then just verifying a browser isn’t enough anymore. You need to verify that the expensive part actually happened.

Otherwise you’re basically protecting the cheapest layer while the expensive one is still exposed.

swagmoney1606 · 2026-03-30T02:09:25 1774836565

And yet I have to pay in my time and cash to handle the constant ddos'es from the constant LLM scraping

gmerc · 2026-03-30T07:04:19 1774854259

It’s not for techbros to decide at what threshold of theft it’s actually theft. “My GPU time is more valuable than your CPU time” isn’t a thing and Wikipedias latest numbers on scraping show that marginal costs at scale are a valid concern

nozzlegear · 2026-03-30T02:35:36 1774838136

Are they, actually?

make3 · 2026-03-30T05:10:53 1774847453

Absolutely not, the former relies on controversial ideas to qualify as legal.

Stealing the content from the whole planet & actively reducing the incentive to visit the sites without financial restitution is pretty bad.

AtlasBarfed · 2026-03-30T00:59:40 1774832380

Because you say it is?

I obviously disagree. I mean, on top of this we are talking about not-open OpenAI.

nickphx · 2026-03-30T10:55:41 1774868141

Speak for yourself.

karlshea · 2026-03-30T00:59:21 1774832361

I don’t know what world you live in but it’s not this one.

platybubsy · 2026-03-30T07:04:20 1774854260

Bait or genuine techbro? Hard to say

andrepd · 2026-03-30T14:19:54 1774880394

> Scraping static content from a website at near-zero marginal cost to its server

The gall. https://weirdgloop.org/blog/clankers

nslsm · 2026-03-29T23:56:59 1774828619

The issue is that there are so many awful webmasters that have websites that take hundreds of milliseconds to generate and are brought down by a couple requests a second.

bakugo · 2026-03-30T00:16:39 1774829799

OpenAI must be the most awful webmasters of all, then, to need such sophisticated protections.

nikitaga · 2026-01-30T00:27:56 1769732876

This is great news, nice win for Scala.

It's a great language, I've been working with it for 10 years now. Full stack Scala with Scala.js on the frontend is so very nice. My experience is mostly in fintech & healthcare startups where the language helped us get correctness, refactorability, clarity, and high velocity at the same time without blowing up the team size.

Initially I learned Scala on the job, but I've been writing open source Scala for years since then. It's a cool language to learn and explore ideas in, since it has lots of elegantly integrated features (especially FP + OOP).

Scala may not be the #1 most popular language, and that's fine. Popular stuff surely gets the benefits of funding and attention, and sometimes lacking such support is really annoying, like a few years ago when Scala 3 was first released, the IDEs took a looong time to catch up. But I still choose Scala despite those occasional annoyances, even though I also have years of experience in JS / TS and other languages. It's just a much better tool for my needs.

nikitaga · 2026-01-29T23:52:57 1769730777

^ This meme is from 10+ years ago when Scala was at the peak of its hype driven by the FP craze. Nobody seriously writes cryptic-symbolic-operator code like that nowadays. Scalaz, the FP library most notorious for cryptic operator/method names, hasn't been relevant for many years. Today everyone uses Cats, ZIO, or plain Tapir or Play, all of which are quite ergonomic.

nikitaga · 2026-01-19T18:05:00 1768845900

The main reason for populism is that the incumbent governments do a consistently poor job satisfying their constituents' preferences and interests, so people get desperate to find something / someone different that might work better. Always has been, always will be, social media or not.

Example: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4230288

We haven't invented a governance structure yet that would be immune to this, although some are better than others. I'm sure the current social media algorithms are harmful as well. You can ban viral algorithms, but the hostile actors whose literal job it is to drive polarization / populism will just find other strategies to effectively deliver their message.

"Education" is nice and all, but millions of people keep smoking despite the obvious harm and decades of education, not to mention the many limitations, taxes, and bans. I mention smoking as an obviously-bad-thing that everyone knows is bad. Education succeeded, and yet, here we are, still puffing poison. But you can also look already-polarized political topics. There's been no shortage of education on those topics either, but if that worked well enough, we wouldn't be decrying populism right now.

lovlar · 2026-01-19T18:54:14 1768848854

> "Education" is nice and all, but millions of people keep smoking despite the obvious harm and decades of education

I think there’s a missed opportunity for media to make it explicit that by giving their time and attention to these platforms, people are directly generating profit. Way too many assume their involvement has no real effect, but it does. I suspect people would be far less willing to log in if it were clear that each session generates, on average, X dollars in revenue. It’s a business model most people still haven’t fully digested.

whattheheckheck · 2026-01-19T21:50:52 1768859452

It needs to be looked at holistically.

Daniel Schmattenberger said one kpi for good society is the inverse of addiction behaviors.

Maybe a 5 why's (and beyond) on people's addictions can help get to the root cause.

It's usually structural and systemic and not a moral failing on the individual choice problem

nikitaga · 2026-01-19T19:32:52 1768851172

I know all about their business models, yet I couldn't care less how much money Facebook gets from ad clicks. Them making a profit is not directly harming me.

The things that are harming me are a lot more complicated than that, but people don't have the attention span to be educated about such complex issues. It's easier than ever to spread "education" now. The fact that it doesn't stick is not some grand conspiracy – most people simply don't care.

nikitaga · 2026-01-19T17:39:28 1768844368

How is MacOS as enshittified as Windows? It doesn't have ads, doesn't push AI on you, their online services are trivial to ignore once and never think about again, etc. I haven't tried Tahoe, and sure, its new glass UI is shit, but merely incompetent UI design is not "enshittification" and is not in any way equivalent to what Microsoft does in Windows.

madeofpalk · 2026-01-19T17:54:13 1768845253

macOS absolutely, definitely, 100% has ads.

Buy a new Apple Watch and notice that the settings app with have a [1] badge trying to upsell you to buy AppleCare+. They obscure dismissing these by clicking the "Add AppleCare Coverage" button and then having a button that says actually no.

pixelready · 2026-01-19T18:11:11 1768846271

The undissmissable badges in settings irk me to no end. Using language like “finish setting up” in iOS to describe me opting out of Apple Intelligence by choice as leaving MY device in some sort of “unfinished state” is user hostile too. With the amount of effort it takes me to push back constantly on these dark patterns, I know for a fact all my less tech savvy friends and family just aren’t bothering and that’s what they count on.

Not as egregious as what windows is doing with copilot everywhere or sneakily flipping user-toggled options during updates, but it’s all some degree of gross.

lynndotpy · 2026-01-19T18:15:47 1768846547

This, on top of the nonstop onslaught of advertisements for F1. It seemed like every one of Apple's services were pushing for that movie. They even put it into maps, wallets, into CarPlay (while people were driving!) It was surprisingly shameless.

It's certainly not as bad _right now_ as what you'll see on Windows 11, but this is something that will almost certainly only get worse over time.

iwontberude · 2026-01-19T18:16:33 1768846593

Windows has third party ads and it’s so trash

nikitaga · 2026-01-19T19:54:18 1768852458

My MacOS "100%" does not have any ads. But I don't use Apple watch or Apple online services, so that's the difference I guess.

You don't need to buy a Windows Watch to get ads on Windows though. They'll be right there anyway, and more of them.

gverrilla · 2026-01-20T08:07:14 1768896434

Didn't know that was possible (not a mac user).

rpdillon · 2026-01-19T18:26:33 1768847193

Agreed. And don't press the play/pause button on your Bluetooth headset, or Apple Music will fire up and ask you to agree to their terms.

Marsymars · 2026-01-19T18:46:48 1768848408

> their online services are trivial to ignore once and never think about again

The workarounds to get rid of the nag to log into your icloud account on macOS are far more difficult than the workarounds to avoid using an MS account in Windows.

latexr · 2026-01-20T11:05:26 1768907126

> The workarounds to get rid of the nag to log into your icloud account on macOS

Do you have an example? I have to set up macOS on the regular, and after saying no to iCloud on the setup screen, it never bothers me again.

They are very aggressive with trying to get me to “update” to Tahoe, though.

Marsymars · 2026-01-20T16:33:19 1768926799

It's this thing I'm talking about, where I've got a permanent notification bubble on the System Preferences app on the Mac where I'm not logged into an Apple ID: https://apple.stackexchange.com/questions/344278/how-can-i-d...

runjake · 2026-01-19T17:44:26 1768844666

I've been getting intrusive first-party ads in Apple's OSes for at least the past 3 major OS releases. News+, Fitness+, Music, Apple TV+, etc etc.

nikitaga · 2026-01-19T19:40:18 1768851618

Surely we can distinguish MacOS – the operating system – from the online services provided by Apple that happen to have a native app?

If you are choosing to use Apple online services, sure, you'll get upsells I guess, as with any other online service. I don't use any of Apple's online services, and never see those ads.

al_borland · 2026-01-19T18:18:06 1768846686

News+ also has a ton of articles behind paywalls, even if paying for the premium version. It’s an ugly experience, probably the worst one.

bigyabai · 2026-01-19T17:59:47 1768845587

macOS does have ads, their online services are worse than Windows, and installing basic software like Homebrew and Git is like having teeth pulled.

Windows is absolutely miserable, but with WSL installed it's far and away the better dev environment. I say that as someone who dailies Linux and hates all three OSes.

coldtea · 2026-01-19T19:22:15 1768850535

While macOS has gown down over time, installing Homebrew and Git on macOS is trivial, a 30 second affair.

nikitaga · 2025-10-09T05:17:01 1759987021

The authenticity of old fashioned forums is often outweighed by their poor UX and in general terrible ergonomics. It's no wonder that so few people want to use them anymore. Reddit's "nested, collapsible comments sorted by upvotes" format is simply superior.

20 years after Reddit started, the best that the forums can offer is perhaps discourse.org, which is barely any better than traditional forums – sleeker UI for sure, but it's still fundamentally the same unworkable linear format. It's like sticking to magnetic tapes in the age of SSDs.

Even Facebook, one of the dumbest discussion platforms, has nested comments. Terribly implemented of course, but how does the platform designed for the lowest-common-denominator kind of user have more advanced discussion features than forums made for discussion connoisseurs? It is utterly baffling.

rhines · 2025-10-09T15:51:34 1760025094

I strongly disagree. But maybe because of a difference of perspective. If you're imagining a Reddit-scale forum, with millions of people with no sense of community and no knowledge of the content they're consuming, then yeah a traditional forum format is awful.

Forums shine as spaces for focused communities, where people have reputations and care about the subject matter. Time-sorted discussions are great because that's what's happening - a discussion in the community. You don't want to read someone's quip first, you want to get the whole context. You don't want there to be upvotes that people try to earn - there's already your reputation in the community. If someone's a troll or gives bad advice or is wrong, they'll get called out, or banned, or simply ignored as everyone knows they aren't respected.

Forums just aren't meant for generic content and it's not because of the UI, it's because the entire concept is not compatible with masses of semi-anonymous users with no commonalities.

A_D_E_P_T · 2025-10-09T13:30:11 1760016611

"Nested comments sorted by upvotes" is, for free and frank discussion, inherently far worse than non-nested in-line comments. With the latter there's no hive-mind effect, no consensus-seeking, no dopamine/approval-chasing. Also, traditional forums tended to encourage longer-form posts (which you can still see in places like Spacebattles), which naturally contained quite a lot of technical detail and pictures, whereas Reddit (and HN) are optimized for very short comments. In Reddit's case, smarmy one-liners, usually.

But the main problem, to repeat for emphasis, is that the upvote/downvote system (even if it's fair and used virtuously, and it usually isn't,) stifles disagreement and debate.

nikitaga · 2025-10-10T01:13:20 1760058800

> stifles disagreement and debate.

When I append "reddit" to my google search query, I'm not looking for "disagreement and debate". I'm looking for specific information on non-political topics, such as repairing my car, finding a good product in the sea of garbage, or learning new techniques. Such topics are typically discussed cooperatively rather than adversarially. For this stuff, consensus-seeking is a feature not a bug, and where the consensus appears inadequate, I'm well capable of looking past the top post. Reddit's format is not perfect, but it's better than having to read through a 30-page thread in which most messages are irrelevant to most other messages. Such threads are linear only artificially through a UI that hides the structure of the underlying conversations.

If you don't like the upvotes aspect of reddit, we could settle on the same nested format but without sorting by upvotes. But with forums, we don't even have that.

Reddit's comments aren't one-liners because Reddit's format encourages that, it's because it's the most popular site where everyone goes. If forums were as widely popular, they would see the same people making the same comments there too.

nikitaga · 2025-09-16T09:59:57 1758016797

> If you don't want that, you gotta bring a wrapper or another reactivity library/framework.

Being able to use a different library with a component, instead of the component being tied to React, is the whole point.

React isn't 100x more popular because its reactivity system or any other feature is 100x better. Half the reason it's popular is network effects – too many frontend components / libraries are made React-only even though they don't need to be React-specific.

Those network effects are the trap, not the reactivity system that's as good as any other for the purpose of writing a Web Component. If you don't want to use simple and small tools like Lit.js, that's fine, but that's your choice, not a limitation of Web Components.

The point of Web Components is not to provide a blessed state management or virtual DOM implementation that will have to stay in JS stdlib forever, it's to make the components you author compatible with most / all UI libraries. For that goal, I don't know of a better solution.

WA · 2025-09-16T10:09:37 1758017377

I get your point. I'm fully with you that it makes no sense to use React and write React apps if you can achieve the same without React. I hate the fact that many great frontend components only work with React, especially considering that React didn't properly support Web components for ages, whereas almost every other framework had no problems with them.

However, out of the box, Web components don't come with almost anything. Comparing React to Web components is comparing apples to oranges.

Lit is great, but Lit is a framework. Now you're comparing React with Lit. Different story than React vs. vanilla Web components.

spankalee · 2025-09-16T13:37:52 1758029872

Lit is not a framework. Lit only helps you make standard web components that you can use anywhere *because they are web components*.

You could take a Lit-based web components a rip Lit out and you would still have the same component that you can still use anywhere. Lit is just an implementation detail.

MrJohz · 2025-09-16T14:00:17 1758031217

Lit is a framework, that's the whole point of it. Lit is a framework that happens to generate web components, but the goal of Lit is to provide the rendering and state management necessary to actually write those components. That's the framework bit.

If you take a Lit-based web component and rip Lit out, you have dead code that won't work because it's dependent on a framework that you have removed.

You could take a Lit-based web component and replace it with a non-Lit-based web component and that would be fine, because Lit uses web components as its core interface, but Lit itself is still a framework.

nikitaga · 2025-09-17T02:25:42 1758075942

> Comparing React to Web components is comparing apples to oranges.

I mean, yes, but you're the one making this comparison, saying that WCs lack reactivity etc.

Web Components are an extension of the DOM – a low level browser API. They are similarly low level. That's expected. I don't need or expect them to be something more.

I am happy that I can use any reactivity system I want to implement a Web Component. That's a feature, not a bug. Having implemented a reactivity system myself, I know that there isn't a perfect one, the design is full of tradeoffs, and I'd rather not have a blessed implementation in the browser, because it will inevitably turn out to be flawed, yet we won't be able to retire it because "we can't break the web". A blessed implementation like that would benefit from network effects just like React does, and would have all the same problems as React, plus the inability to rapidly innovate due to the browser's unique backwards compatibility concerns. I'd rather ship an extra 3KB and avoid all those problems.

WA · 2025-09-17T09:40:34 1758102034

Fair enough. I agree with you. It occured to me that the comment I originally replied to that claimed "Web components are the way out of this trap" doesn't mean that you can't add helpers for reactivity.

MrJohz · 2025-09-16T13:57:34 1758031054

Except the problem with compatibility is almost always the reactivity element, right? Getting, say, Vue's reactivity system to compose properly with Svelte's, or React's with Angular's. And that's not going to work well when Vue is using signals to decide when to rerender a component, React is using its props and state, and Svelte isn't even rerendering components in the first place.

This is especially difficult when you start running into complicated issues like render props in JSX-based frameworks, context for passing state deeply into a component, or slots/children that mean reactivity needs to be threaded through different frameworks.

nikitaga · 2025-08-26T22:39:09 1756247949

Preferred syntax is whatever looks nicer to you. It's not really two different syntaxes, just one more flexible syntax where if you choose to go full braceless, it ends up looking like python. I personally like the new braceless python-like syntax.

Scala has two main camps, one is purist FP (cats / zio / etc.), another is plain Scala, banking on ergonomic OOP+FP fusion. Neither of those is the default. FP advocates are more vocal online but that's because they need a bunch of libraries (thus more OSS work) to make that approach work, whereas the other camp just uses plain Scala and simpler libraries that aren't reinvented every 5 years, so their online presence is not as apparent.

nikitaga · 2025-08-26T22:31:21 1756247481

Scala is very much alive, it's just past the initial hype stage, well into the slope of enlightenment / plateau of productivity depending on which style of Scala one is into. It's now growing slower but based on more sustainable pragmatism instead of just hype.

nikitaga · on March 26, 2025

Yes, this guy:

https://www.youtube.com/@rockthejvm

https://rockthejvm.com/

Great videos.

btreecat · on March 26, 2025

I didn't find them helpful or interesting.

I guess I'm not recursively solving my problems in my day to day challenges