yellowflash's comments | Hacker News

Kernel is a bad analogy: if you understand how it behaves, you can understand how it's built. LLMs don't have that; their behaviour is not completely determined by how they are built.

Every abstraction is leaky. It's not that 1 in every 100 tickets I work on needs me to understand that filesystem buffers exist; it's that the knowledge is always there, in the back of my mind. I haven't read the Linux kernel source, but I know of its existence. LLM output doesn't have that.


Today I learned: Stockfish moved to a neural network in 2020. I knew that it was just a minimax with alpha-beta pruning and a really good eval function. Now it's not.


> I knew that it was just a minimax with alpha-beta pruning and a really good eval function. Now it's not.

It is still "just" a minimax with alpha-beta pruning, except the eval function is now a neural network: NNUE, to be specific.

I highly advise anyone who is curious about chess engines, but hasn't heard about NNUE to read about it. I find this technology absolutely fascinating.

The key idea is that a neural network is structured in a way that makes it very cheap to calculate scores for similar positions. This means that during a tree search, each time you advance or backtrack you can update the score efficiently instead of recalculating it from scratch.

Good starting points to read more:

- https://en.wikipedia.org/wiki/Efficiently_updatable_neural_n...

- https://www.chessprogramming.org/NNUE
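To make the "efficiently updatable" idea concrete, here is a toy sketch. All sizes and weights are made up; a real NNUE has tens of thousands of input features (roughly piece-on-square combinations) feeding a few hundred accumulator units, but the principle is the same: when a move changes only a few features, add and subtract only those weight rows instead of recomputing the whole first layer.

```python
import random

# Hypothetical toy dimensions; a real NNUE is far larger.
NUM_FEATURES = 64
HIDDEN = 8

random.seed(0)
W = [[random.random() for _ in range(HIDDEN)] for _ in range(NUM_FEATURES)]

def full_refresh(active):
    """Recompute the first-layer accumulator from scratch."""
    acc = [0.0] * HIDDEN
    for f in active:
        for j in range(HIDDEN):
            acc[j] += W[f][j]
    return acc

def incremental(acc, added, removed):
    """After a move, touch only the features that changed."""
    acc = acc[:]
    for f in added:
        for j in range(HIDDEN):
            acc[j] += W[f][j]
    for f in removed:
        for j in range(HIDDEN):
            acc[j] -= W[f][j]
    return acc

# A "move" removing feature 3 and adding feature 42 yields the same
# accumulator either way, but the incremental path does far less work.
before, after = [3, 10, 20], [10, 20, 42]
a_full = full_refresh(after)
a_inc = incremental(full_refresh(before), added=[42], removed=[3])
assert all(abs(x - y) < 1e-9 for x, y in zip(a_full, a_inc))
```

During a tree search this saves an enormous amount of work, because advancing or backtracking one move changes only a handful of features.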


I mean, it still is. Now it just has a really good neural net-based eval function. Don't be fooled: it's not that stockfish just has "a really good eval function", and that's the only thing that makes it as strong as it is. The actual tree search is _incredibly_ sophisticated, with boatloads of heuristics, optimizations, and pruning methods on top of alpha-beta.
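For reference, the alpha-beta core underneath all those heuristics is small. A minimal sketch over a toy game tree (the `children`/`evaluate` callbacks and the tree itself are made up; Stockfish's real search adds move ordering, transposition tables, null-move pruning, and much more on top of this skeleton):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the opponent will avoid this line
        return value
    else:
        value = float("inf")
        for child in kids:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if beta <= alpha:
                break  # alpha cutoff
        return value

# Toy tree: leaves are ints, internal nodes are lists.
tree = [[3, 5], [6, [9, 1]], [1, 2]]
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n if isinstance(n, int) else 0
best = alphabeta(tree, 10, float("-inf"), float("inf"), True,
                 children, evaluate)
assert best == 6
```

The NNUE change swaps out only `evaluate`; everything around it is the classic search, heavily engineered.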


Very good question. This is totally dependent on the starting point of the search. In such cases the map is not a contraction on the entire domain; there are subsets of the domain on which it is a contraction, while on the whole it is not. Think of multiple pieces, on each of which we can apply the Banach fixed-point theorem.
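A toy illustration of the Banach fixed-point idea (the function choice here is mine, not from the thread): cos is a contraction on [0, 1] since |cos'(x)| = |sin x| < 1 there, so simple iteration converges to its unique fixed point regardless of where in that interval you start. A map that is only piecewise contractive gives this guarantee only while the iteration stays inside one contractive piece.

```python
import math

# Fixed-point iteration x_{n+1} = cos(x_n), starting inside [0, 1].
x = 0.5
for _ in range(100):
    x = math.cos(x)

# Converges to the unique solution of x = cos(x), roughly 0.739.
assert abs(x - math.cos(x)) < 1e-8
```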


Sorry, can you check now?


I am trying to get a better understanding of what reasoning could possibly mean. So far, my thinking is that the more we are able to compress knowledge, the stronger the indicator of reasoning. I would like to understand this better; please tell me where my understanding is lacking, or point me to what I should read up on.


Ask an LLM or Google, "what is a reasoning model in the context of large language models?"


I was wondering if I could get a different way of thinking about reasoning machines as such. Reasoning models just try to externalize reasoning, through chain of thought or by fine-tuning on reasoning-focused datasets.

They all seem very hacky, and not really reasoning. I wanted to see if there are alternative, fundamental ways to think about reasoning as an end in itself.


I think observable notebooks like Pluto.jl are something like his vision, though not exactly. They're just more general and useful?


"Once, Zhuang Zhou dreamed he was a butterfly, a butterfly flitting and fluttering about, happy with himself and doing as he pleased. He didn't know that he was Zhuang Zhou. Suddenly he woke up and there he was, solid and unmistakable Zhuang Zhou. But he didn't know if he was Zhuang Zhou who had dreamt he was a butterfly, or a butterfly dreaming that he was Zhuang Zhou. Between Zhuang Zhou and the butterfly there must be some distinction! This is called the Transformation of Things"


My understanding is rudimentary, but: it doesn't alter the system. It does, however, fix retroactively what state the system must have been in to produce the given observation. Which is counterintuitive in a lot of ways. It can be thought of as if the observation caused the determination of the past state. Do I make sense, or am I just talking absolute nonsense?


One of them is per family and the other is per person (a family of size 1). Is it even comparable? I am not saying it is lower; it's pretty high, but probably 200%, not 560%, if I am right.


A family size of 1 is given as 108 L each, but average usage per member of a family of 2.3 people is given as 89 L each, so presumably an average total household usage in Flanders would be about 2.3 × 89 ≈ 205 L. I'm not sure where OP gets the 202 L number from; it doesn't appear anywhere on the page. Possibly a misreading of a date as a consumption figure, hence I'm using capital L for litres.

That's less than a fifth of the US figure. It's quite plausible that average family size and environmental conditions in Flanders differ quite a bit from those in the US, but it's still a huge disparity, and according to the second link a lot of it seems to come down to domestic irrigation. Tackling that wouldn't be very popular, of course.


The headline says 74,000 L/household/year. Divide by roughly 365 days a year and you get about 202.7 L/day. I made a rounding error and didn't read further down. Still a very useful approximation, I'd say.
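Both figures in this sub-thread check out; the small gap between them just reflects the two different sources:

```python
# Headline figure: litres per household per year, converted to per day.
per_household_year = 74_000
per_day = per_household_year / 365
assert round(per_day, 1) == 202.7  # the ~202 L/day figure

# Flanders per-person figure scaled by the stated average household size.
assert round(2.3 * 89) == 205      # ~205 L/household/day
```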


That's how MongoDB geospatial indexes work, IIRC.


Was about to mention this. If I recall correctly, the 2d-sphere index rounds geospatial coordinates to 5 decimals. Very occasionally, I found it would distort polygon geometries just enough to cause them to become invalid (e.g. overlapping geometries), which causes the index build to fail.

In my recent experience working with collections containing millions of documents, each containing a GeoJSON-style polygon/multipolygon representing a property (i.e. a block of land), I found invalid geometries occurred for about 1 document in 1 million. For a while, I suspected the data vendor was the cause; however, it became more puzzling when other geospatial software confirmed the geometries were valid. Eventually we traced the issue to the 2dsphere index.

A very clever workaround was suggested by a colleague of mine, inspired by [1]. It preserved the original geometries. In each document, we added a new field containing the geometry's extent. A 2d-sphere index was then built on the extent field instead of the original geometry field. Invalid geometries were no longer an issue since we were dealing with much simpler geometries that were substantially larger than the max precision of the index.

When running geoIntersects queries on our collection of millions of documents, we did so in 2 steps (aggregation queries):

1. GeoIntersects on the extent field (uses the index).

2. On the result set from the last step, perform geoIntersects on the original geometry field (operates on a much smaller set of records compared to querying the collection directly)

[1] https://www.mongodb.com/docs/manual/tutorial/create-queries-...
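The two steps above can be written as a single aggregation pipeline. A sketch in pymongo style, where the field names ("extent" for the simplified bounding geometry, "geometry" for the original polygon) and the query polygon are assumptions of mine; the pipeline is built as plain dicts so no server is needed to inspect it:

```python
# A GeoJSON polygon to intersect against (coordinates are made up).
query_poly = {
    "type": "Polygon",
    "coordinates": [[[150.0, -33.0], [150.1, -33.0],
                     [150.1, -33.1], [150.0, -33.1],
                     [150.0, -33.0]]],
}

pipeline = [
    # Step 1: coarse filter on the simplified "extent" field. This is
    # the only stage that can use the 2dsphere index.
    {"$match": {"extent":
        {"$geoIntersects": {"$geometry": query_poly}}}},
    # Step 2: exact test against the original (unindexed) geometry
    # field, run only on the much smaller result set from step 1.
    {"$match": {"geometry":
        {"$geoIntersects": {"$geometry": query_poly}}}},
]

# With a live connection this would run as:
#   results = db.parcels.aggregate(pipeline)
```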


Seems exactly like the broad phase and narrow phase in a game physics engine.
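The parallel is direct. A toy sketch of the broad-phase/narrow-phase split (names and extents here are invented): a cheap bounding-box overlap test prunes most candidates, and the expensive exact geometry test runs only on the survivors.

```python
def aabb_overlap(a, b):
    """a, b are (min_x, min_y, max_x, max_y) extents."""
    return (a[0] <= b[2] and b[0] <= a[2] and
            a[1] <= b[3] and b[1] <= a[3])

boxes = {"p1": (0, 0, 2, 2), "p2": (1, 1, 3, 3), "p3": (10, 10, 11, 11)}
query = (0.5, 0.5, 1.5, 1.5)

# Broad phase: O(n) cheap checks prune most candidates...
candidates = [k for k, box in boxes.items() if aabb_overlap(box, query)]
# ...then the (more expensive) narrow-phase test would run only on these.
assert candidates == ["p1", "p2"]
```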


The same things get invented over and over again and named different things depending on the field. Sometimes it's not immediately clear that they are the same things mathematically.


It's tried and tested for sure; I first encountered them in '94, but I assume they're much older.

A little sleuthing shows that (TIL) they are an application of Z-order curves, which date back to 1906
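The core trick of a Z-order (Morton) code is bit interleaving: weave the bits of the two coordinates together so that points close in 2D space tend to get numerically close codes. A minimal sketch (the function name and bit width are my own):

```python
def morton_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits -> even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits -> odd positions
    return code

assert morton_encode(0b11, 0b00) == 0b0101  # x alone fills even bits
assert morton_encode(0b00, 0b11) == 0b1010  # y alone fills odd bits
```

Sorting by this code walks the plane in the recursive "Z" pattern, which is what geohash-style indexes exploit.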

