ComplexSystems's comments

Are you kidding? This is an incredible result. Stuff like this is the most important work happening in AI right now. Automated theorem proving? It's not going too far to say the entire point of the technology was to get us to this.


This is me being snarky and ignorant, but if it solved one problem and it's automated, what's stopping it from solving all the others? That's what's ultimately being sold by the tweet, right?


>This is me being snarky and ignorant, but if it solved one problem and it's automated, what's stopping it from solving all the others?

Yeah, that's the crux of the matter. How did the AI do it? By using already-existing math. If we need new math to prove Collatz, Goldbach, or Riemann, LLMs are simply SOL. That's what's missing, and what the hype boys always avoid mentioning.


> How did the AI do it? By using already-existing math. If we need new math to prove Collatz, Goldbach, or Riemann, LLMs are simply SOL.

An unproved theorem now proved is by definition new math. Will LLMs get you to Collatz, Goldbach, or Riemann? Unclear.

But it's not like there's some magical, entirely unrelated to existing math, "new math" that was required to solve all the big conjectures of the past. They proceeded, as always, by proving new theorems one by one.


Yes, "new math" is neither magical nor unrelated to existing math, but that doesn't mean any new theorem or proof is automatically "new math." I think the term is usually reserved for the definition of a new kind of mathematical object, about which you prove theorems relating it to existing math, which then allows you to construct qualitatively new proofs by transforming statements into the language of your new kind of object and back.

I think eventually LLMs will also be used as part of systems that come up with new, broadly useful definitions, but we're not there yet.


>An unproved theorem now proved is by definition new math.

No. By _new math_ I mean new mathematical constructs and theories like (to mention the "newest" ones) category theory, information theory, homotopy type theory, etc. Something like Cantor inventing set theory, or Shannon with information theory, or Euler with graph theory.

AFAIK no new field of mathematics has been created _by_ AI. Feel free to correct me.


Ultimately the main thing that will stop it from solving literally "all the others" are things like the impossibility of solving the halting problem, considerations like P ≠ NP, etc. But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.


> But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.

You seem to be smuggling in the assumption that this problem was "important".


All problems are important. You never know what might be useful.


I think it is way too far to say that!

We've had automated theorem proving since the 60s. What we need is automated theorem discovery. Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence. Instead intelligence is a little game AI plays with Lean.

AI researchers keep trying to reduce intelligence into something tiny and approachable, like automated theorem proving. It's easy: you write the theorem you want proven and hope you get a proof. It works or it doesn't. Nice and benchmarkable.

Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space? Or how to suggest an alternative to Turing machines / lambda calculus that expresses the same underlying idea?


> What we need is automated theorem discovery.

I've been thinking: mathematicians have fun doing math, making discoveries, crafting proofs.

Does the Tour de France & Co. stop making sense now that small, lightweight, and powerful e-bicycles have appeared?

Using a computer as a helper is like the bicycle, which is one thing; using LLMs seems more like the e-bicycle, which is something else entirely.


> We've had automated theorem proving since the 60s.

By that logic, we've had LLMs since the 60s!

> What we need is automated theorem discovery.

I don't see any reason you couldn't train a model to do this. You'd have to focus it on generating follow-up questions to ask after reading a corpus of literature, playing around with some toy examples in Python and making a conjecture out of it. This seems much easier than training it to actually complete an entire proof.
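
Just to make that "toy examples in Python" step concrete, here's a throwaway sketch (the pattern is my own trivial illustration, not anything a real system produced): tabulate some data, then guess a closed form.

  # Toy "experiment -> conjecture" loop: gather data, then guess a pattern.
  def sum_of_first_n_odds(n):
      return sum(2 * k + 1 for k in range(n))

  data = [(n, sum_of_first_n_odds(n)) for n in range(1, 50)]

  # The part a discovery model would have to do itself is choosing what to
  # tabulate and which closed form to guess; here the guess is hard-coded.
  if all(total == n * n for n, total in data):
      print("Conjecture: 1 + 3 + ... + (2n-1) = n^2 for all n >= 1")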

> Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence.

Who says they don't? I wouldn't be surprised if HarmonicMath, DeepMind, etc have also thought about this kind of thing.

> Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space?

That's a good question! It would be interesting to see if this is an emergent property of multimodal LLMs trained specifically on this kind of thing. You would need mathematical reasoning, visual information and language encoded into some shared embedding space where similar things are mapped right next to each other geometrically.


>> We've had automated theorem proving since the 60s.

> By that logic, we've had LLMs since the 60s!

From a bit earlier[0], actually:

  Progressing to the 1950s and 60s
  We saw the development of the first language models.

Were those "large"? I'm sure at the time they were thought to be so.

0 - https://ai-researchstudies.com/history-of-large-language-mod...


(We detached this subthread from https://news.ycombinator.com/item?id=46094763.)


What are some interesting things he has used the SVD to solve?


Model reduction (for control systems), numerical precision control for matrix operations, eigenvalue computation, etc. as far as I understand.
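
For the model-reduction flavor specifically, the underlying trick is low-rank approximation via truncated SVD. A minimal NumPy sketch (the matrix is a made-up stand-in for real system data, not anything from his work):

  import numpy as np

  # Keep only the dominant singular directions - the same idea behind
  # balanced truncation, which keeps the modes with the largest
  # Hankel singular values.
  rng = np.random.default_rng(0)
  A = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 100))  # rank ~8

  U, s, Vt = np.linalg.svd(A, full_matrices=False)
  r = 8
  A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

  print("relative error:", np.linalg.norm(A - A_r) / np.linalg.norm(A))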


I think the article makes decent points but I don't agree with the general conclusion here, which is that all of this investment is wasted unless it "reaches AGI." Maybe it isn't necessary for every single dollar we spend on AI/LLM products and services to go exclusively toward the goal of "reaching AGI?" Perhaps it's alright if these dollars instead go to building out useful services and applications based on the LLM technologies we already have.

The author, for whatever reason, views it as a foregone conclusion that every dollar spent in this way is a waste of time and resources, but I wouldn't view any of that as wasted investment at all. It isn't any different from any other trend - by this logic, we may as well view the cloud/SaaS craze of the last decade as a waste of time. After all, the last decade was also fueled by lots of unprofitable companies, speculative investment and so on, and failed to reach any pie-in-the-sky Renaissance-level civilization-altering outcome. Was it all a waste of time?

It's ultimately just another thing industry is doing as demand keeps evolving. There is demand for building the current AI stack out, and demand for improving it. None of it seems wasted.


That's not what he's saying. The investors are the ones who have put trillions of dollars into this technology on the premise that it will achieve AGI. People like Sam Altman and Marc Andreessen have been going on podcasts saying AGI is imminent and that they're going to automate every job.

The author did not say every dollar was wasted; he said that LLMs will never deliver returns that justify the current level of investment.

It's very frustrating to see comments like this attacking straw men and setting up motte-and-bailey arguments every time there's AI criticism. "Oh, but LLMs are still useful" and "Even if LLMs can't achieve AGI, we'll figure out something that will eventually." Yes, but that isn't what Sam and Andreessen and all these VCs have been saying. Now the entire US economy is a big gamble on a technology that doesn't deliver what they said it would, and because the admin is so cozy with VCs, we're probably all going to suffer for the mistakes of a handful of investors who got blinded by dollar signs in their eyes.


The author quite literally says that the last few years were a "detour" that has wasted a trillion dollars. He explicitly lists building new LLMs, building larger LLMs and scaling LLMs as the problem and source of the waste. So I don't think I am strawmanning his position at all.

It is one thing to say that OpenAI has overpromised on revenues in the short term and another to say that the entire experiment was a waste of time because it hasn't led to AGI, which is quite literally the stance that Marcus has taken in this article.


> The author, for whatever reason, views it as a foregone conclusion that every dollar spent in this way is a waste of time and resources

This is a strawman, the author at no point says that “every dollar is a waste.”


He quite literally says that the dollars spent on scaling LLMs in the past few years are a waste.


Yes and you’re portraying it as “every dollar was wasted” and then arguing against that exaggerated claim which was never made.


Those billions of dollars spent on data centres could instead have hypothetically been spent on researchers with relatively modest computing resources. A few hundred billion dollars will buy you a lot of researchers.


It's the nature of bubbles to overpromise and underdeliver.

So - do Altman and Andreessen really believe that, or is it just a marketing and investment pitch?

A reminder that OpenAI has its own explicit definition of AGI: "Highly autonomous systems that outperform humans at most economically valuable work."

The MS/OAI agreement quantifies that as "profits of $100bn/yr."

This seems rather stupid to me. If you can get multiple AGIs all generating profits of $100bn a year - roughly half an Apple, or two thirds of a Meta - you no longer have anything recognisable as a pre-AI economy, because most of the white collar population is out of work and no longer earning anything. And most of the blue collar population that depends on white collar earnings is in the same boat.

So you have to ask "Profits from doing what, and selling to whom?"

The Altman pitch seems to be "Promise it first, collect investor cash, and worry about the rest later."


> do Altman and Andreessen really believe that, or is it just a marketing and investment pitch?

As for Andreessen, I don't think he even cares. As the author writes:

"for the venture capitalists that have driven so much of field, scaling, even if it fails, has been a great run: it’s been a way to take their 2% management fee investing someone else’s money on plausible-ish sounding bets that were truly massive, which makes them rich no matter how things turn out"

VCs win every time. Even if it's a bubble and it bursts, they still win. In fact, they are the only party that wins.

Heck, the bigger the bubble, the more money is poured into it, and the bigger the commissions. So VCs have an interest in pumping it up.


You are making the same strawman attack you are criticising.

The dollars invested are not justified considering TODAY's revenues.

Just like 2 years ago, people said NVIDIA's stock price was not justified and was a massive bubble considering the revenue of those days. But NVIDIA's revenues 10xed, and now the stock price from 2 years ago looks seriously underpriced - a bargain.

You are assuming LLM revenues will remain flat or increase moderately and not explode.


You seem like someone who might be interested in my nuclear fusion startup. Right now all we have is a bucket of water but in five years that bucket is going to power the state of California.


It's not about "every dollar spent" being a waste of time, it's about acknowledging the reality of opportunity cost. Of course, no one in any movement is likely to listen to their detractors, but in this case the pioneers seem to agree.

https://www.youtube.com/watch?v=DtePicx_kFY https://www.bbc.com/news/articles/cy7e7mj0jmro


I think there is broad agreement that new models and architectures are needed, but I don't see it as a waste to also scale the stack that we currently have. That's what Silicon Valley has been doing for the past 50 years - scaling things out while inventing the next set of things - and I don't see this as any different. Maybe current architectures will go the way of the floppy disk, but it wasn't a waste to scale up production of floppy disk drives while they were relevant. And ChatGPT was still released only 3 years ago!


And notably, Marcus has been banging this drum for years. Even this article points back to articles he wrote years ago suggesting deep learning was hitting the wall... With GPT 3....

It's sour grapes because the methods he prefers have not gotten the same attention (hah...) or funding.

He's continuing to push the ludicrous Apple "reasoning paper" that he described as a "knockout blow for LLMs" even though it was nothing of the sort.

With each of his articles, I usually lose more respect for him.


And a drummer here in the States!


Matrices represent linear transformations. Linear transformations are very natural and "beautiful" things. They are also very clearly not commutative: f(g(x)) is not the same as g(f(x)). The matrix algebra perfectly represents all of this, and as a result, FGx is not the same as GFx. It's only not "beautiful" if you believe that matrix multiplication is a random operation that exists for no reason.
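
If it helps, a two-line check makes this concrete (NumPy, with a rotation and a reflection picked purely for illustration):

  import numpy as np

  F = np.array([[0, -1], [1, 0]])   # rotate 90 degrees counterclockwise
  G = np.array([[1, 0], [0, -1]])   # reflect across the x-axis
  x = np.array([1, 2])

  print(F @ G @ x)  # reflect, then rotate: [2 1]
  print(G @ F @ x)  # rotate, then reflect: [-2 -1]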


A few things:

First, modern LLMs can be thought of, abstractly, as a kind of Markov model. We are taking the entire previous output as one state vector, and from there we have a distribution over the next state vector, which is the updated output with another token added. The point is that there is some subtlety in what a "state" is. So that's one thing.

But the point of the usual Markov chain is that we need to figure out the next conditional probability based on the entire previous history. Making a lookup table based on an exponentially increasing history of possible combinations of tokens is impossible, so we make a lookup table on the last N tokens instead - this is an N-gram LLM or an N'th order Markov chain, where states are now individual tokens. It is much easier, but it doesn't give great results.
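
To make the lookup table concrete, here's a toy sketch with N=2 and a few words of text (a real n-gram model would add smoothing, backoff, and far more data):

  from collections import Counter, defaultdict

  N = 2
  tokens = "the cat sat on the mat the cat ran".split()

  # Lookup table keyed by the last N tokens: counts of what comes next.
  table = defaultdict(Counter)
  for i in range(len(tokens) - N):
      key = tuple(tokens[i:i + N])
      table[key][tokens[i + N]] += 1

  # "Predict" by looking at the observed continuations of the last N tokens.
  print(table[("the", "cat")].most_common())  # [('sat', 1), ('ran', 1)]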

The main reason here is that sometimes, the last N words (or tokens, whatever) simply do not have sufficient info about what the next word should be. Often times some fragment of context way back at the beginning was much more relevant. You can increase N, but then sometimes there are a bunch of intervening grammatical filler words that are useless, and it also gets exponentially large. So the 5 most important words to look at, given the current word, could be 5 words scattered about the history, rather than the last 5. And this is always evolving and differs for each new word.

Attention solves this problem. Instead of always looking at the last 5 words, or last N words, we have a dynamically varying "score" for how relevant each of the previous words is given the current one we want to predict. This idea is closer to the way humans parse real language. A Markov model can be thought of as a very primitive version of this where we always just attend evenly to the last N tokens and ignore everything else. So you can think of attention as kind of like an infinite-order Markov chain, but with variable weights representing how important past tokens are, and which is always dynamically adjusting as the text stream goes on.
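
A toy version of those dynamically varying scores (a NumPy sketch of scaled dot-product attention for a single query; the vectors are random placeholders rather than trained embeddings):

  import numpy as np

  rng = np.random.default_rng(0)
  d = 16
  history = rng.standard_normal((10, d))  # vectors for the 10 previous tokens
  query = rng.standard_normal(d)          # vector for the position being predicted

  # One score per previous token, normalized with a softmax. The weights are
  # recomputed for every new query instead of being a fixed last-N window.
  scores = history @ query / np.sqrt(d)
  weights = np.exp(scores - scores.max())
  weights /= weights.sum()

  context = weights @ history             # weighted summary of the whole history
  print(weights.round(3))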

The other difference is that we no longer can have a simple lookup table like we do with n-gram Markov models. Instead, we need to somehow build some complex function that takes in the previous context and outputs the correct next-token distribution. We cannot just store the distribution of tokens given every possible combination of previous ones (and with variable weights on top of it!), as there are infinitely many. It's kind of like we need to "compress" the hypothetically exponentially large lookup table into some kind of simple expression that lets us compute what the lookup table would be without having to store every possible output at once.

Both of these things - computing attention scores, and figuring out some formula for the next-token distribution - are currently solved with deep networks that just learn from data via gradient descent until they magically start giving good results. But if the network isn't powerful enough, it won't give good results - maybe comparable to a more primitive n-gram model. So that's why you see what you are seeing.


Don't new cars just directly record your location as you drive them?


Do you think that corporate erosion of (or outright hostility to) privacy is somehow a compelling reason to deny rights to those of us who make different choices in an attempt to protect them? Just because some people decided to buy a smartphone on wheels, do I have to suffer and have my freedom of movement narrowed and protection from arbitrary inspection by government agents denied?


Of course not - it makes no sense to gather that from my post. Were you trying to respond to someone else?


You can't beat an alreadist/incrementalist with reason. For them, if you ever start to walk towards a cliff, you must fall from it. Why? Because of their half-dimensional logic.


They do, but it is relatively easy to nuke the onboard modem to permanently disconnect your car. Unfortunately, most people don't know or don't care that their cars are actively spying on them.


My guess would be that your car would develop some covert or overt fault if you did that. You might even lose the warranty, as the manufacturer could claim that a fix they issued via a software update couldn't be installed on your car - or that they couldn't monitor some diagnostics which are a prerequisite for in-warranty repair.

Most telco execs would sell their own mothers before offering reasonable data plans - that your car comes with one for free should be very telling.


> My guess would be that your car would develop some covert or overt fault if you did that.

I used a bypass harness to avoid losing a speaker and the car hasn't developed any fault. I wouldn't buy a car without knowing in advance that this mod is possible. There are plenty of tutorials on how to do this for popular makes/models. It may be as simple as removing a fuse, or you might need to disassemble part of the dash to physically unplug the modem.


I'm a bit skeptical that this is going to be possible in the future - I'd say it's increasingly unlikely that you'll be able to meaningfully change anything in your car.

For example, if you look at where Tesla and co. are going, they are looking to reduce the amount and thickness of wiring harnesses - and they did so by creating a proprietary fieldbus that communicates between proprietary controllers that handle multiple functions, so they can multiplex everything under the sun onto a single bus.


Subaru? Been thinking about doing the same. Was it easy to install?


So does your phone. And the government just buys the data from data brokers.


One wonders if any given Tesla is harvesting the plates of the other cars it sees in traffic as well.


This article was written weirdly:

> For decades, Google has developed privacy-enhancing technologies (PETs) to improve a wide range of AI-related use cases.

They introduce this random "PETs" acronym after a random string of three words and then never use it. In general this article makes some weird stylistic choices (WSCs).


Why? This seems like a reasonable task to benchmark on.


Because you hit guard rails.


Sure, reasonable to benchmark on if your goal is to find out which companies are the best at stealing the hard work of some honest, human souls.


correction: pacman is not a human and has no soul.


Why do you have to willfully misinterpret the person you're replying to? There's truth in their comment.


And only in the very particular scenario of a national-only cartel which has not successfully roped in other international pharma companies.

