I think I 100% agree with you, and yet the other day I found myself telling someone "Did you know OpenClaw was written with Codex and not Claude Code?", and I really think I meant it in the same sense I'd mean a programming language or framework, and I only noticed what I'd said a few minutes later.
Why is learning an appropriate metaphor for changing weights but not for context? There are certainly major differences in what they are good or bad at, and especially in how much data you can feed them effectively each way. They both have plenty of properties we wish the other had. But they are both ways to take an artifact that behaves as if it doesn't know something and produce an artifact that behaves as if it does.
I've learned how to solve a Rubik's cube before, and forgot almost immediately.
I'm not personally fond of metaphors to human intelligence now that we are getting a better understanding of the specific strengths and weaknesses these models have. But if we're gonna use metaphors I don't see how context isn't a type of learning.
I suppose ultimately, the external behaviour of the system is what matters. You can see the LLM as the system at a low level, or even the entire organisation of e.g. OpenAI as the system at a high level.
If it's the former: Yeah, I'd argue they don't "learn" much (!) at inference time. I'd find it hard to argue context isn't learning at all; it's just pretty limited in how much can be learned that way.
If you look at the entire organisation, there's clearly learning, even if relatively slow with humans in the loop. They test, they analyse usage data, and they retrain based on that. That's not a system that works without humans, but it's a system that I would argue genuinely learns. Can we build a version of that that "learns" faster and without any human input? Not sure, but doesn't seem entirely impossible.
Do either of these systems "learn like a human"? Dunno, probably not really. Artificial neural networks aren't all that much like our brains, they're just inspired by them. Does it really matter beyond philosophical discussions?
I don't find it too valuable to get obsessed with the terms. Borrowed terminology is always a bit off. Doesn't mean it's not meaningful in the right context.
It’s not very good in context, for one thing. Context isn’t that big, and RAG is clumsy. Working with an LLM agent is like working with someone who can’t form new long-term memories. You have to get them up to speed from scratch every time. You can accelerate this by putting important stuff into the context, but that slows things down and can’t handle very much stuff.
The article does demonstrate how bad it is in context.
Context has a lot of big advantages over training too, though; it's not one-sided. Upfront cost and time are the big obvious ones, but context also works better than training on small amounts of data, and it's easier to delete or modify.
Even for a big product like Claude Code, from someone who controls the model: although I'm sure they do a lot of training to make the product better, they're not gonna just rely entirely on training and go with a nearly blank system prompt.
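For concreteness, here's a minimal sketch of what leaning on the system prompt/context rather than training looks like, assuming an OpenAI-style chat API; the PROJECT_NOTES.md file, the model name, and the task prompt are all hypothetical stand-ins:

```python
# Minimal sketch: re-injecting "memory" into every session via the system prompt.
# Assumes the openai Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical file of project background the agent would otherwise
# have to be brought up to speed on from scratch each time.
with open("PROJECT_NOTES.md") as f:
    notes = f.read()  # counts against the context window on every single call

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "system", "content": f"Project background:\n{notes}"},
        {"role": "user", "content": "Add retry logic to the upload worker."},
    ],
)
print(response.choices[0].message.content)
```

The tradeoffs from the comments above show up directly here: the notes are trivial to edit or delete, but they eat context and have to be re-sent on every call.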
"I'm not fond of metaphors to human intelligence".
You're assuming that learning during inference is something specific to humans and that the suggestion is to add human elements into the model that are missing.
That isn't the case at all. The training process is already entirely human-specific by way of training on human data. You're already special-casing the model as hard as possible.
Human DNA doesn't contain all the information that fully describes the human brain, including the memories stored within it. Human DNA only contains the blueprints for a general-purpose distributed element, the neuron, and these building blocks are shared by basically any animal with a nervous system.
This means if you want to get away from humans you will have to build a model architecture that is more general and more capable of doing anything imaginable than the current model architectures.
Context is not suitable for learning because it wasn't built for that purpose. The entire point of transformers is that you specify a sequence and the model learns over the entire sequence. This means that any in-context learning you want to perform must fall inside the training distribution, which is a different way of saying that it was just pretraining after all.
The fact that DNA doesn't store all the connections in the brain doesn't mean that enormous parts of the brain, and by extension behaviour, aren't specified in the DNA. Tons of animals have innate knowledge encoded in their DNA, humans among them.
I don't think it's specific to humans at all, I just think the properties of learning are different in humans than they are in training an LLM, and injecting context is different still. I'd rather talk about the exact properties than bemoan that context isn't learning. We should just talk about the specific things we see as problems.
It might be wrong, but I have started flagging this shit daily. Garbage articles that waste my time as a person who comes on here to find good articles.
I understand that reading the title and probably skimming the article makes it a good jumping off point for a comment thread. I do like the HN comments but I don't want it to be just some forum of curious tech folks, I want it to be a place I find interesting content too.
I agree. It seems this is kind of a Schelling point right now on HN and there isn't a clear guideline yet. I think your use of flagging makes sense. Thanks
The problem with LLM-written docs is that I run into so many README.md files where it's clear the author barely read the thing they're expecting me to read, and it's got errors that waste my time and energy.
I don't mind it if I have good reason to believe the author actually read the docs, but that's hard to know from someone I don't know on the internet. So I actually really appreciate it if you're editing the docs to make them sound more human-written.
I think the other aspect is that if the README feels autogenerated without proper review, then my assumption is that the code is autogenerated without proper review as well. And I think that's fine for some things, but if I'm looking at a repo and trying to figure out if it's likely to work, then a lack of proper review is a big signal that the tool is probably going to fall apart pretty quickly if I try and do something that the author didn't expect.
I use this stuff heavily and I have some libraries I use that are very effective for me that I have fully vibed into existence. But I would NOT subject someone else to them, I am confident they are full of holes once you use them any differently than I do.
Hmm. The only button on the screen is ([Apple Logo] Send me a download link). When you scroll it off screen it's replaced with ([Apple Logo] Try Kiki) and a collage of macOS screenshots.
They could certainly put it in the FAQ, which is below the ([Apple Logo] Get the App) button. I don't actually disagree with you, but it is somewhat of a funny complaint to me given the actual content of the page.
The Apple logo character isn't a real symbol; it's just a code point from the Unicode private-use area (the 'anything goes' range that isn't codified and is reserved for niche local uses) that Apple decided would render as the Apple logo on iOS and macOS, probably to let them draw their logo as text. It's not something that should be used in browsers or anything that can render outside of Apple's ecosystem. It's not a great sign that something this front-and-center, immediately apparent on any non-Apple device, wasn't tested by them on any other platform.
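For the curious, a minimal sketch of what's going on with that character (Python just for illustration; the code point and private-use range are standard Unicode facts, but how it renders is entirely up to the fonts on your system):

```python
import unicodedata

# Apple renders U+F8FF as its logo in its own fonts; everywhere else
# the glyph is undefined and typically shows up as a box or blank.
apple_logo = "\uf8ff"

def in_private_use_area(ch: str) -> bool:
    # The Basic Multilingual Plane's private-use area is U+E000..U+F8FF.
    return 0xE000 <= ord(ch) <= 0xF8FF

print(hex(ord(apple_logo)))              # 0xf8ff
print(in_private_use_area(apple_logo))   # True
print(unicodedata.category(apple_logo))  # 'Co' = private use, no standard meaning
```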
"Crisis" is a massively overblown word for this. And the "wordle community" is a drop in the bucket of regular players, and not remotely representative.
I did have a similar reaction personally to the "exciting news" framing but I'm not actually sure it's wrong. The original list of words was an excellent list, and it's been over 4 years.
Suboptimal? Likely. There is some utility: a green letter is more useful than a yellow one. Checking for an 'a' in two locations when 'a' is a very commonly used letter is *useful*. Still, it's likely much more useful to check for the presence of a fifth letter than to take a chance at knowing the location of an 'a' more precisely.
There seems to be a progression of Wordle strategies.
Playing with a set start word (or words, e.g. "SIREN OCTAL DUMPY" or people who go the "AUDIO ADIEU" route).
(Many people also go down the rabbit hole of looking for "optimal" starting words or choices based on the original word lists.)
Then, once you've played that for a while, you find it's not that much of a challenge unless you end up in one of the forms of madness like _A_E_, and you'll switch to playing in "hard mode" (e.g. correct/green letters must be played again in the same place in all subsequent tries, and yellow letters must also be reused each time).
The hard mode starting with the same word gets a bit boring, so people move on to varying the start word each day, either pulling them from a list or just using the answer for the day before.
There's no "correct" approach obviously, people can play the game however they want and extract the fun/anger however they want.
Because a wordle-in-one is meaningless. It doesn't mean you're any good at Wordle, the way a hole-in-one suggests you're good at golf. It definitely doesn't mean that you're a "Genius" as the game puts it, because you were operating with zero information and didn't employ any skill or intuition. It just means you burned some luck points on something that doesn't matter.
I used to use “stare” or “stale” as the starting guess when I played Wordle, thinking you’d want to start off with the most common letters, like R-S-T-L-N-E from Wheel of Fortune.
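As a rough sketch of that "common letters" reasoning, here's how you might score starting guesses by letter frequency; the word list below is a hypothetical stand-in, and a real run would load a full five-letter dictionary:

```python
from collections import Counter

# Stand-in word list for illustration; swap in a real dictionary file.
words = ["stare", "stale", "siren", "crane", "audio", "adieu", "dumpy", "octal"]

# Count how many words each letter appears in (distinct letters per word).
letter_counts = Counter(ch for w in words for ch in set(w))

def coverage(word: str) -> int:
    # Score a starting guess by summing the frequency of its distinct letters.
    return sum(letter_counts[ch] for ch in set(word))

for w in sorted(words, key=coverage, reverse=True):
    print(w, coverage(w))
```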
I'll also recommend Being You by Anil Seth. It makes a lot of sense of consciousness to me. It certainly doesn't answer the question, but it's not just throwing your hands up at "we have no idea why qualia", and it's also not just "here's a list of neural correlates of consciousness and we won't even discuss qualia".
It goes through how sensations fit into a highly constrained, highly functional hallucination that models the outside world as a sort of Bayesian prediction, shaped by your concerns and capabilities as a human, and then it has a very interesting discussion of emotions as they relate to inner bodily sensations.