Hacker News | jerf's comments

At the risk of going against the gestalt, Facebook openly and publicly rejecting the ads is actually one of the better outcomes. They could have just put their thumbs on the scale: deprioritizing them, serving them to the people they think are least likely to bite, etc., or lying about the number of times an ad was served, because, after all, who can check? Many of us suspect the ad platforms already do this pretty routinely through one mechanism or another anyhow.

It isn't reasonable to ask a platform to host content that is literally about suing them, not because of "freedom" concerns or whether or not Facebook is being hypocritical, but because in the end there isn't a "fair" way for them to host that. The constraints people want to put on how Facebook would handle that end up solving down to the null set by the time we account for them all. Open, public rejection is actually a fairly reasonable response and means the lawyers at least know what is up and can respond to a clear stimulus.


> It isn't reasonable to ask a platform to host content that is literally about suing them

Explicit rejection is better than opacity but better still is public accountability. Meta’s properties have a combined userbase that amounts to just over 1 in 4 people on earth; these platforms should have been regulated as utilities a long time ago. Suppose I wanted to run ad campaigns advocating for antitrust legislation targeting social media companies and ended up getting booted off of all of the major platforms; what feasible method is there for me to advance these ideas that could possibly compete with the platforms’ own abilities to influence public opinion?


You can really see this in the recent video generation models that try to incorporate text-to-speech into the video. All the tokens flying around, all the video data, all the context of all human knowledge ever put into bytes ingested into them, and the systems still routinely (from what I can tell) fail to put the speech in the right mouth, even with explicit instruction and all the "common sense" making it obvious who is saying what.

There was some chatter yesterday on HN about the very strange capability frontier these models have, and this is one of the biggest examples I can think of: a model that is generating, from scratch, megabyte upon megabyte of really quite good video, yet at the same time is often unclear on the idea that a knock-knock joke does not start with the exact same person saying "Knock knock? Who's there?" in one utterance.


By the nature of the LLM architecture, I think if you "colored" the input via tokens, the model would about 85% "unlearn" the coloring anyhow. Which is to say, it's going to figure out that "test" in the two different colors is the same thing. It kind of has to; after all, you don't want to be talking about a "test" in your prompt and have it be completely unable to connect that to the concept of "test" in its own replies. The coloring would end up as just another language in an already multi-language model. It might slightly help, but I doubt it would be a solution to the problem. And possibly at an unacceptable loss of capability, as it would burn some of its capacity on that "unlearning".
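To make the "coloring" idea concrete, here is a toy sketch (entirely hypothetical; no production tokenizer works this way) of giving user-originated and model-originated text disjoint token id ranges over the same base vocabulary:

```python
# Toy "colored vocabulary": the same word gets a different token id
# depending on who produced it. BASE_VOCAB is a three-word stand-in
# for a real tokenizer's vocabulary.
BASE_VOCAB = {"test": 0, "the": 1, "run": 2}
V = len(BASE_VOCAB)

def colored_id(word: str, source: str) -> int:
    """Offset the base id by V per 'color' (user vs. model)."""
    offset = {"user": 0, "model": V}[source]
    return BASE_VOCAB[word] + offset

# "test" from the user and "test" from the model are now distinct ids,
# so the model would have to relearn, via its embeddings, that they mean
# the same thing -- the "unlearning" cost described above.
print(colored_id("test", "user"), colored_id("test", "model"))  # 0 3
```

Since the two ids share no parameters a priori, the only way the model connects them is by learning embeddings that land near each other, which is exactly the capacity spent "unlearning" the coloring.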

One of the reasons I'm comfortable using them as coding agents is that I can and do review every line of code they generate, and those lines of code form a gate. No LLM-bullshit can get through that gate except in the form of lines of code that I can examine, and even if I do let some bullshit through accidentally, the bullshit is stateless and can be extracted later if necessary, just like any other line of code. Or, to put it another way, the context window doesn't come with the code, forming some huge blob of context to be carried along... the code is just the code.

That exposes me to when the models are objectively wrong and helps keep me grounded with their utility in spaces I can check them less well. One of the most important things you can put in your prompt is a request for sources, followed by you actually checking them out.

And one of the things the coding agents teach me is that you need to keep the AIs on a tight leash. What is the equivalent, in other domains, of them "fixing" the test to pass instead of fixing the code to pass the test? In the programming space I can run "git diff *_test.go" to ensure they didn't hack the tests when I didn't expect it. It keeps me wondering what the equivalent of that is for my non-programming questions. I have unit testing suites to verify my LLM output against. What's the equivalent in other domains? Probably some isolated domains here and there do have equivalents. But in general there isn't one. Things like "completely forged graphs" are entirely expected, but it's hard to catch them when you lack the tools or the understanding to chase down "where did this graph actually come from?".
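That "git diff *_test.go" check can be wrapped into a small guardrail script. A minimal sketch in Python (the function names and the `_test.go` convention are my own; it assumes you run it from inside a git checkout):

```python
import subprocess

def parse_changed_tests(name_only_output: str) -> list[str]:
    """Pick the Go test files out of `git diff --name-only` output."""
    return [
        line for line in name_only_output.splitlines()
        if line.strip().endswith("_test.go")
    ]

def tests_modified(base: str = "HEAD") -> bool:
    """Return True if any *_test.go files differ from `base`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return bool(parse_changed_tests(out.stdout))
```

Run before accepting an agent's changes; a True result means the agent touched the tests and the diff deserves a closer look before anything is merged.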

The success with programming can't be translated naively into domains that lack the tooling programmers built up over the years, and based on how many times the AIs bang into the guardrails the tools provide I would definitely suggest large amounts of skepticism in those domains that lack those guardrails.


I know it's not what people want to hear but my response to a lot of the comments here is just a general, I agree, it's time to stop using Windows.

They won't let you secure your drive the way you want. They won't let you secure your network the way you want (per the top-level comment about Wireguard). In so doing they are demonstrating not just that they can stop you from running these particular programs but that they are very likely going to exert this control on the entire product category going forward, and I see little reason to believe they will stop there. These are not minor issues; these are fundamental to the safety, security, and functionality of your machine. This indicates that Microsoft will continue to compromise the safety, security, and functionality of your machine going forward to their benefit as they see fit. This is intolerable for many, many use cases.

I think it is becoming clear that Microsoft no longer considers Windows users to be their customers. Despite the fact that people do in fact pay for Windows, Microsoft has shifted from largely supporting their customers to out-and-out exploiting them. (Granted, a certain amount of exploitation has been around for a long time, but things like the best backwards compatibility in the industry showed their support as well.)

I suspect this is the result of a lot of internal changes (not one big one), but I also see no particular reason at the moment to expect it to change back. To my eyes, both the first and second derivatives are heading in the direction of more exploitation: more treating users like cattle and less like customers. When new features or work are proposed at Microsoft, it is clear they are being analyzed entirely in terms of how they can benefit Microsoft; users are not at the table.

No amount of wishing this wasn't so is going to change anything. No amount of complaining about how hard it is to get off of Windows is going to change anything; indeed at this point you're just signalling to Microsoft that they are correct and they can treat you this way and there's nothing you will do about it for a long time.


Stop supporting Windows as well.

Open source developers are doing Microsoft a big favor when they support Windows and publish Windows builds and installers. It's a substantial effort, and apparently that effort isn't appreciated.

If all open source software dropped support for Windows, it wouldn't really affect the open source community that much. It would definitely cause headaches for Microsoft however.


It's not that easy.

I agree that supporting Windows helps its ecosystem.

But open source software on Windows is also an important gateway to the free world. When you are already used to Firefox, LibreOffice, and VLC, you can switch to Linux fairly painlessly; but if those didn't run on Windows, switching to Linux would require relearning everything.


Irrelevant. If it's time to stop using Windows, all those Windows users will have to relearn everything either way. Whether they do it in a Windows environment or a Linux one doesn't really change the equation.

A sudden lack of software on Windows will increase user migration. If we all keep publishing for Windows, users will just stay there because their needs are already met.


> If it's time to stop using Windows, all those Windows users will have to relearn everything either way.

No, that's the thing; they ideally would only need to replace the OS. Many long years ago, when I switched from Windows to Ubuntu (this was back when it was good), part of why it was so easy is that I mostly kept the same applications. If you use, e.g., Firefox, VLC, Open/LibreOffice, Audacity, etc., then you can install a new OS, reinstall the same applications, and barely have to change anything. That's huge.


I agree to some extent, but we (or at least I) publish open source software (amongst other reasons) because I like helping others, and it so happens that most users who could benefit are still using Windows, so it doesn't feel right to stop doing that as long as the effort is reasonable (which it is, unlike for macOS).

Nah, it's simpler. Microsoft has just lost its sense of UX, and its touch with reality, to its own internal management vibes.

Look at the Windows start menu. It used to be trivial to switch users. Two clicks, one to open the user list, another to switch - done. Now it's four: user panel, three-dots, switch user, pick user.

Look at the login sequence. They want their Windows Hello and they don't care whether it works well or not - there's no way to get a PIN or password prompt instantly; you have to click three times (once to show a method picker, again to pick PIN entry, and lastly to focus the goddamn field) despite there being no reason to hide this UI.

It's not like they're trying to scam users or sell them on something. It looks like internal decision-makers who never dogfood their own decisions losing touch with common sense.

Apple has that too, and this rot spreads elsewhere. But it's not intentionally malicious; a lot of things simply don't make sense - just a total lack of self-reflection capability at the corporate level.


I think they've been heading that way for a while, and it's only getting clearer.

I've been thinking, and have said before: 90s Microsoft was far from perfect, but they at least seemed to care a lot about the quality of Windows. 2020s Microsoft seems to see Windows users as a captive audience they can exploit for whatever the corporate executives fancy at the moment. It seems to have been a gradual transition.

In any case, it seems to be getting clearer that Linux is destined to be the best OS for power users.


> I think it is becoming clear that Microsoft no longer considers Windows users to be their customers.

Quite obviously. Look at the out of box new user experience on a Windows 11 Home installation. What you get when you open a new $600 laptop from Best Buy for the first time. The entire thing is designed to drive users towards perpetual monthly recurring subscription billing for various MS services for life (OneDrive, Office, Xbox Live, Xbox game store purchased games, etc). It's a platform which is built atop a rent seeking cloud services ideology that shows no sign of ever letting up.


I see two basic cases for the people who are claiming it is useless at this point.

One is that they tried AI-based coding a year or two ago, came to the (IMHO completely correct, at the time) conclusion that it was nearly useless, and have not tried it since then to see that the situation has changed. To which the solution is: try it again. It has changed a lot.

The other case is those who have incorporated into their personal identity that they hate AI and will never use it. I have seen people do things like fire AI at a task they have good reason to believe it will fail at, and, when it does, project that out to all tasks without letting themselves consciously realize that picking a bad task on purpose stacks the deck.

To those people my solution is to encourage them to hold on to their skepticism. I try to hold on to it as well despite the incredible cognitive temptation not to. It is very useful. But at the same time... yeah, there was a step change in the past year or so. It has gotten a lot more useful...

... but a lot of that utility is in ways that don't obviate skilled senior coding skills. It likes to write scripting code without strong types. Since the last time I wrote that, I have in fact used it in a situation where there were enough strong types that it spontaneously originated some, but it still tends to write scripting code out of that context no matter what language it is working in. It is good at very straight-line solutions to code but I rarely see it suggest using databases, or event sourcing, or a message bus, or any of a lot of other things... it has a lot of Not Invented Here syndrome where it instead bashes out some minimal solution that passes the unit tests with flying colors but can't be deployed at scale. No matter how much documentation a project has it often ends up duplicating code just because the context window is only so large and it doesn't necessarily know where the duplicated code might be. There's all sorts of ways it still needs help to produce good output.

I also wonder how many people are failing to prompt it enough. Some of my prompts are basically "take this and do that and write a function to log the error", but a lot of my prompts are a screen or two of relevant context for the project: what it is we are trying to do, why the obvious solution doesn't work, here's some other code to look at, here are the relevant bugs and some wiki documentation on the planning of the project, we should use {event sourcing/immutable trees/stored procedures/whatever}, interact with me for questions before starting anything. "Style transfer" is no longer a complete explanation of what these models are doing, but there are still a lot of ways in which what an LLM really does is style transfer... it takes "take this and do that and write a function to log the error" and style-transforms it into source code. If you want it to do something interesting, it really helps to give it enough information in the first place for the "style transfer" to get a hold of and do something with. Don't feel silly "explaining it to a computer"; you're giving the function enough data to operate on.


I can see huge utility with AI as a guide and helper.

But not having one foot in the code myself is not something I am comfortable with. It starts feeling like management and not development. I feel the abdication very strongly, and it makes me unable and unwilling to put a hard stamp on the quality. I have seen too many hallucinations and half-missed requirements to put that much trust in AI.

It's the same with code reviews of hard tickets. You can scroll past and just approve, but do you really understand what your colleague has built? Are you really in the driver's seat? It feels to me like YOLOing with major consequences.

I don't buy, at all, that people doing 20x output have any idea what they are coding. They are just pressing the YOLO button, and no one - not the engineer, not the AI, and not management - is in the driver's seat. It is a very scary time.


"Also, it seems like all the Copilot 'connected experiences' are really just a chat window without any real integration with the applications they are embedded in."

I was triple-booked today. Two of the meetings in question should have had significant overlap between attendees. I figured, hey, there's this Copilot thing here, I'll ask it what the overlap is, that's the sort of thing an AI should be able to do. It comes back and reports that there is one person in both meetings, and that "one person" isn't even me. That doesn't seem right. One of the autocompleted suggestions for the next thing to ask is "show me the entire list of attendees" so I'm like, sure, do that.

It turns out that the API Copilot has access to can only access the first ten attendees of the meetings. Both meetings were much larger than that.

Insert rant here about hobbling 2026 servers with random "plucked out of my bum" limits on processing based on the capabilities of roughly 2000-era servers for the sheer silliness of a default 10-attendee limit being imposed on any API into Outlook.

But also in general what a complete waste of hooking up an amazingly sophisticated AI model to such an impoverished view of the world.


There are innumerable companies built around the Outlook calendar; you'd think Microsoft could get something right here with AI, but they seem unable to.

"plucked out of my bum" sounds so much more sophisticated than “pulled out of my ass”

Plucked betwixt mine cheeks

I would personally pay money not to have this thing.

It's wonderful and I love that someone else loves it. The care put into it is fantastic. Vive la différence.

(https://en.wiktionary.org/wiki/vive_la_diff%C3%A9rence for those who may not recognize that phrase.)


How much money are you willing to pay? If sufficient I won't commission one and have it sent to your house. @AnthonyDavidAdams on venmo!

"say that AI developers should incorporate more real-world diversity into large language model (LLM) training sets,"

Are you kidding me?

How much more "real-world diversity" could they possibly incorporate into the models than the entire freaking Internet and also every scrap of text written on paper the AI companies could get a hold of?

How on Earth could someone think that AIs speak like this because their training set is full of LLM-speak? This is transparently obviously false.

This is the sort of massive, blinding error that calls everything else written in the article into question. Whatever their mental model of AI is it has no resemblance to reality.


The problem isn't the diversity in the training set - the problem is that the method by design picks the average.

LLM speak isn't even quite the average either. It's something more like the average, then pushed through more training to turn it into the agents we think of today (a fresh-off-the-training-set LLM really is in some sense that "fancy autocomplete" that people called it for a while), then trained by the AI companies to be generally inoffensive and do the other things they want them to do. All of the further actions push the agents away from the original LLM average. The similarity of the "LLM tone" across multiple models and multiple companies, and the fact I don't think this tone has been super directly trained for, strongly suggests that the process of converting the raw LLM into the desirable agents we all use is some sort of strong strange attractor for the LLMs that are pushed through that process.

Maybe they are training for that tone now, either deliberately or accidentally. But my belief that they weren't initially comes from the fact that it's a new tone that I doubt anyone designed with deliberation. It bears strong resemblance to "corporate bland", but it is also clearly distinct from it in that we could all tell those two apart very easily.


Like foxes coming up with floppy ears.

There is a study that shows that what the model is doing behind the scenes in those cases is a lot more than just outputting those tokens.

For an LLM, tokens are thought. They have no ability to think, by whatever definition of that word you like, without outputting something. The token only represents a tiny fraction of the internal state changes made when a token is output.

Clearly there is an optimum for each task (not necessarily a global one), and a concrete model for a given task can be arbitrarily far from it. But you'd need to test it for each case, not just assume that "less tokens = more better". You can be forcing your model to be dumber without realizing it if you're not testing.


High-dimensional vectors are thought (insofar as you can define what that even means). Tokens are the one-dimensional input that steers the thought, and the output that renders it. The "thinking" takes place in the high-dimensional space, not the one-dimensional stream of tokens.
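A toy numpy sketch of that distinction (sizes and random weights are purely illustrative; a real model's readout is a trained unembedding matrix, not noise). The point is just that the emitted token id is a single integer projected out of a much higher-dimensional hidden state:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 512, 8192  # toy sizes; real models are far larger

hidden = rng.standard_normal(d_model)              # the "thought": a 512-dim vector
W_unembed = rng.standard_normal((vocab, d_model))  # stand-in for a trained readout

logits = W_unembed @ hidden        # project the thought onto the vocabulary
token_id = int(np.argmax(logits))  # all that leaves the model: one integer

# The token id carries at most log2(8192) = 13 bits of information; the
# hidden state that produced it is a point in a 512-dimensional
# continuous space, and it never leaves the model.
assert 0 <= token_id < vocab
```

Whether the visible token stream faithfully reflects that hidden computation is exactly the open question being argued over here.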

But isn't the one-dimensional token stream a reflection of the high-dimensional space? What you see is "sure, let's take a look at that", but behind the curtain it's actually an indication that the model is searching a very specific latent space, which might be radically different if those tokens didn't exist. Or not. In any case, you can't just make that claim and treat those two processes as isolated. They might be totally unrelated, but they also might be tightly interconnected.

I assume that in practice, filler words do nothing of value. When words add or mean nothing (their weights are basically 0 in relation to the subject), I don't see why they'd affect what the model outputs (except to cause more filler words)?

Politeness has an impact (https://arxiv.org/abs/2402.14531), so I wouldn't be too quick to make any kind of claim about a technology whose workings we don't exactly understand.

> For an LLM, tokens are thought. They have no ability to think

This is so funny

