> but it's increasingly looking like LeCun is right.
This is an absolutely crazy statement vis-a-vis reality and the fact that it’s so upvoted is an indictment of the type of wishful thinking that has grown deep roots here.
If you are paying attention to actual research and guarded benchmarks, and understand how benchmarks are being gamed, I would say there is plenty of evidence that we are approaching a clear plateau, and that Karpathy's march-of-nines thesis is basically correct long-term. Short-term, it remains to be seen how much more we can do with the current tech.
Your best bet would be to look deeply into performance on the fully private ARC-AGI test sets (e.g. https://arcprize.org/blog/arc-prize-2025-results-analysis) and think carefully about the discrepancies there, or just to read broadly in the academic research on classic benchmarks and note the plateaus on classic datasets.
It is very clear when you look at academic papers actually targeting problems specific to reasoning / intelligence (e.g. rotation invariance in images, adversarial robustness) that all the big companies are doing is fitting more data and spending more resources on human raters and other things to boost performance on (open) metrics, and that any apparent gains in genuine intelligence are being made only by milking what we know very well to be a limited approach. I.e. there are trivially basic problems that cannot be solved by curve-fitting models, which makes it clear that most current advances are indeed coming from curve (manifold) fitting. It just isn't clear how far we can exploit these current approaches, and in which domains this kind of exploitation is more than good enough.
EDIT: Are people unaware Google Scholar is a thing? It is trivial to find modern AI papers that can be read without requiring access to a research institution. And e.g. HuggingFace collects trending papers (https://huggingface.co/papers/trending), etc.
At present it's only SWEs that are benefiting from a productivity standpoint. I know a lot of people in finance (from accounting to portfolio management) and they scoff at the outputs of LLMs in their day-to-day jobs.
But the bizarre thing is, even though the productivity of SWEs is increasing, I don't believe there will be much happening in regards to layoffs, because there isn't complete trust in LLMs; I don't see this changing either. In which case the LLM producers will need to figure out a way to increase the value of LLMs and get users to pay more.
Are SWEs really experiencing a productivity uplift? When studies attempt to measure the productivity impact of AI in software, the results I have seen are underwhelming compared to the frontier labs' marketing.
And, again, this is ignoring all the technical debt of produced code that is poorly understood, weakly reviewed, and of questionable quality overall.
I still think this all has serious potential for net benefit, and does now in certain cases. But we need to be clearer about spelling out where that is (webshit, boilerplate, language-to-language translation, etc) and where it maybe isn't (research code, legacy code, large codebases, niche/expert domains).
This Stanford study on developer productivity found 0 correlation between developers' assessment of their own productivity and independent measures of their productivity. Any anecdotal evidence from developers on how AI has made them more or less productive is worthless.
Yup, most progress is also confined to SWEs doing webshit / writing boilerplate code too. For anything specialized, LLMs are rarely useful, and this is all ignoring the future technical debt of debugging LLM code.
I am hopeful about LLMs for SWE, but the progress is currently contextual.
Even if LLMs could write great code with no human oversight, the world would not change overnight. Human creativity is necessary to figure out what stuff to produce that will yield incremental benefits over what already exists.
The humans who possess such capability stand to win long-term; said humans tend to be those from the humanities and liberal arts.