
This is an understudied area, to my mind. Yes, we can come up with mathematically optimal ways of solving a problem, but it's much harder to come up with the mathematically optimal way of teaching a _human_ to solve that problem. Obviously there's a lot of work on 'explainability' in machine learning models, but that doesn't really get there.

I often find myself going back to Eurisko as a fascinating example of the road not taken (or perhaps, the road taken to its logical end and abandoned) in machine learning. You had a system that, instead of learning opaque weights in a network, learned heuristics that were fairly explainable. In some ways that's a much more useful companion than a model that's strictly superior in its decisions.

I find this regularly in chess: you can prep with Stockfish and know you're probably memorising the best line, and you can follow the sidelines to find very concretely _why_ that's the case, but it still feels like an inefficient way of learning. It always feels like a machine that yielded up fewer, more abstract principles would be helping me more.

I recently thought about this for solving Rubik's cubes. I don't really want the fastest way to solve one; ideally I'd have a method that simultaneously optimises the number of heuristics (if you have such-and-such a colour here and here, do this; if this face is all white, do that; etc.) and the total number of actions. That seems like a tractable problem for a computer to brute force for you, even without smarts.
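To make that concrete at toy scale, here's a sketch of the search I mean, in Python, using 3-element sorting instead of a cube so plain brute force actually works. The rule language ("if value v sits at position p, apply swap m"), the scoring, and the step cap are all my own stand-ins, not anything from a real cube solver; the point is only the shape of the optimisation: fewest rules first, then fewest total moves across all start states.

    # Toy version of "fewest heuristics, fewest total actions".
    # A heuristic is a rule: "if value v sits at position p, apply swap m".
    from itertools import permutations, product

    MOVES = [(0, 1), (1, 2)]                  # the two adjacent swaps
    STATES = list(permutations(range(3)))     # every scrambled state
    GOAL = (0, 1, 2)
    ATOMIC = list(product(range(3), range(3), range(len(MOVES))))  # (p, v, m)

    def run(rules, state, cap=10):
        """Apply the first matching rule each step; return move count or None."""
        for steps in range(cap):              # cap guards against rule loops
            if state == GOAL:
                return steps
            for p, v, m in rules:
                if state[p] == v:
                    i, j = MOVES[m]
                    s = list(state)
                    s[i], s[j] = s[j], s[i]
                    state = tuple(s)
                    break
            else:
                return None                   # nothing fires: stuck unsolved
        return None

    best = None
    for size in range(1, 4):                  # fewest rules first
        for rules in permutations(ATOMIC, size):  # order matters (first match wins)
            total = 0
            for start in STATES:
                n = run(rules, start)
                if n is None:
                    break
                total += n
            else:
                if best is None or total < best[1]:
                    best = (rules, total)
        if best:
            break                             # smallest rule count wins outright

    rules, total = best
    print(len(rules), "rules,", total, "total moves:", rules)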

I face this in my work constantly: you can come up with the most amazing model based on the most cutting-edge architecture, and you can trust its predictions above any others. But then you've got to explain to a football coach why _they_ should trust it, and once that trust is built, distil it into few enough rules that they can train a team to reflect its expertise.

Sorry for the ramble!



(Author of the article here.) I agree that it's somewhat unsatisfying that the output of all this is an enormous table of numbers rather than some nice simple rules.

I think it would be interesting to try some of the methods for approximately solving MDPs on this game, to see how complicated the machine learning model that approximates the value function (or state-action value function) has to be to get reasonable accuracy. Maybe some simple rules would emerge...
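As a sketch of that experiment: assuming the solved table can be exported as (state, value) pairs (the loader and features below are made-up placeholders), you could sweep the depth of a decision tree and watch where accuracy plateaus. A shallow tree that scores well is literally a small set of rules.

    # Sketch: how simple can a model be and still fit the solved value function?
    # load_solved_states() is a hypothetical stand-in for exporting the real
    # (state, value) table from the solver.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    def board_features(board):
        """Hand-crafted features of a 4x4 board (one idea among many)."""
        flat = np.asarray(board).ravel()
        return np.concatenate([flat, [flat.max(), (flat == 0).sum()]])

    def load_solved_states():
        """Placeholder data; swap in states/values exported from the solver."""
        rng = np.random.default_rng(0)
        boards = rng.integers(0, 12, size=(10_000, 4, 4))   # tile exponents
        values = rng.random(10_000)                         # fake values
        return boards, values

    boards, values = load_solved_states()
    X = np.array([board_features(b) for b in boards])
    X_tr, X_te, y_tr, y_te = train_test_split(X, values, random_state=0)

    # Where R^2 stops improving with depth hints at how many rules you need.
    for depth in (2, 4, 6, 8, 12):
        tree = DecisionTreeRegressor(max_depth=depth).fit(X_tr, y_tr)
        print(depth, round(tree.score(X_te, y_te), 3))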


What has consistently worked for me is to use mostly three directions, forcing all of the large tiles into the bottom row and crowding the space above the largest number(s), so that a move in the fourth direction still keeps those large numbers along the bottom.

All the way to 1024 is autopilot: just keep small numbers out of the bottom row. Then the thinking begins: plan each move to keep the 32s, 64s, and 128s on the same edge of the board until you make an inevitable mistake or have bad luck working in the remaining 2x4 or 2x3 section. This last section is where your analysis algorithm would really add value. Plus, using just three directions until it's time to condense large blocks should reduce your computation a lot.
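A minimal sketch of the three-direction idea in Python, for concreteness: the mechanics are the standard 2048 slide-and-merge, and the fixed down/left/right priority is a crude stand-in for the actual per-move thinking.

    import random

    def merge_row_left(row):
        """Slide and merge one row to the left, 2048-style."""
        tiles = [t for t in row if t]
        out, i = [], 0
        while i < len(tiles):
            if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
                out.append(tiles[i] * 2)
                i += 2
            else:
                out.append(tiles[i])
                i += 1
        return out + [0] * (len(row) - len(out))

    def move(board, direction):
        """Return the board after sliding 'left', 'down', 'right', or 'up'."""
        rot = {'left': 0, 'down': 1, 'right': 2, 'up': 3}[direction]
        b = board
        for _ in range(rot):                     # rotate so the slide is "left"
            b = [list(r) for r in zip(*b[::-1])]
        b = [merge_row_left(r) for r in b]
        for _ in range((4 - rot) % 4):           # rotate back
            b = [list(r) for r in zip(*b[::-1])]
        return b

    def spawn(board):
        """Drop a 2 (90%) or 4 (10%) on a random empty cell."""
        r, c = random.choice([(r, c) for r in range(4) for c in range(4)
                              if board[r][c] == 0])
        board[r][c] = 2 if random.random() < 0.9 else 4

    def three_direction_policy(board):
        """Prefer down, then left, then right; use up only when forced."""
        for d in ('down', 'left', 'right', 'up'):
            if move(board, d) != board:
                return d
        return None                              # no legal move: game over

    board = [[0] * 4 for _ in range(4)]
    spawn(board); spawn(board)
    while (d := three_direction_policy(board)) is not None:
        board = move(board, d)
        spawn(board)
    print("biggest tile:", max(max(row) for row in board))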

Do you think regular use of all four directions is necessary to get more consistently to higher values like 16k or 32k? (Actually, is 32k possible?) I feel like once you have most of the tiles on the board needed to reach 8k, except for perhaps one 256, you can't afford a random 2 or 4 in the corner, or it's Game Over in 10 moves or fewer. Moving in the fourth direction makes the 'corner small number' a guarantee when the board is filled like this.

Playing all large numbers on the bottom also means you can calculate an optimal play (using the smaller numbers) in small (2x3, 2x4, 3x3) regions, as long as the result comes out at the same grid location consistently, near the smallest large number.

Then, optimization of the large numbers becomes a separate problem. Do you put everything sequential in the bottom row?

4096-2048-512-128

Or do you put high values at the ends?

4096-128-512-2048

Or low values at the ends? (This is not the answer. Getting a medium value from one end to the other, consistently, without messing everything else up is improbable and probably not possible.)

Or should the high values make a wedge along two sides?

512
4096-2048-128

In summary, I don't think trying to solve the full state space makes sense. This game feels like two separate, isolatable optimization problems (small numbers and large numbers), plus a little bit of glue for when the isolation breaks down: starting a game, fixing a stuck situation, or collapsing a series of large numbers.

Have you thought about using your worst cases to find rules that make Evil 2048 even more evil? Or seeing how an MDP policy fares against Evil 2048, to get the best outcome in worst-case scenarios?
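For the "more evil" direction, one sketch of adversarial placement, reusing the move() helper from the earlier snippet (score() below is a crude placeholder evaluation, not anything from the article): the spawner tries every empty cell and both tile values, and keeps whichever placement leaves the player's best one-ply reply worst.

    def score(board):
        """Toy evaluation: empty cells plus a small bonus for a big bottom row."""
        empties = sum(cell == 0 for row in board for cell in row)
        return empties + sum(board[3]) / 1024

    def best_reply(board):
        """Best score the player can reach in one move (-inf if stuck)."""
        replies = [move(board, d) for d in ('down', 'left', 'right', 'up')]
        return max((score(b) for b in replies if b != board),
                   default=float('-inf'))

    def evil_spawn(board):
        """Place a 2 or 4 wherever it hurts the player most (needs an empty cell)."""
        worst = None
        for r in range(4):
            for c in range(4):
                if board[r][c]:
                    continue
                for tile in (2, 4):
                    board[r][c] = tile
                    reply = best_reply(board)
                    board[r][c] = 0
                    if worst is None or reply < worst[0]:
                        worst = (reply, r, c, tile)
        _, r, c, tile = worst
        board[r][c] = tile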

Oh, and great article and analysis, by the way.




