If you're going to do that then you might as well just use UUIDs, since you effectively reintroduce their downsides (a vanishingly small but nonzero chance of collisions, the computation involved in generating them, etc.)
The difference is that you can still use sequential IDs internally, while exposing hashed IDs to the outside. This protects your database from collisions under all circumstances, while in the absolute worst case, a single user might experience bugs because two external IDs collide.
This is a weird proposal. If you're using non-hashed IDs internally and exposing hashed IDs externally, you are going to need to map those (securely hashed) IDs back to internal IDs when the client hands them to you.
I guess you could do this with full table scans, hashing the IDs and looking for matches, but that would be horribly inefficient. You could maintain your own internal reverse index of hash -> ID, but then I have to ask what's the point? You aren't saving any storage and you're adding a lot of complexity.
Seems like if you want random unguessable external ids, you're always better off just generating them and using them as primary keys.
Also, you aren't protecting your database "from collisions under all circumstances" - there's no guarantee your hash won't collide even if the input is small.
Yes, it is more reasonable to derive external IDs from your structured/sequential internal IDs with encryption rather than hashing. Recovering the internal ID from the external ID is then computationally trivial, since the ID fits in a single AES block, and you don't have to worry about collisions.
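A minimal sketch of that approach in Python, assuming the pyca/cryptography package; the key, function names, and hex encoding here are illustrative choices, not a fixed scheme:

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# Hypothetical 128-bit key; in practice load a real secret from your config.
KEY = bytes(range(16))

def encrypt_id(n: int) -> str:
    """Map an internal sequential ID to an opaque external token."""
    block = n.to_bytes(16, "big")  # a 64-bit ID easily fits in one AES block
    enc = Cipher(algorithms.AES(KEY), modes.ECB()).encryptor()
    return (enc.update(block) + enc.finalize()).hex()

def decrypt_id(token: str) -> int:
    """Recover the internal ID from the external token: one block decryption."""
    dec = Cipher(algorithms.AES(KEY), modes.ECB()).decryptor()
    return int.from_bytes(dec.update(bytes.fromhex(token)) + dec.finalize(), "big")
```

ECB mode is acceptable here only because each plaintext is a single unique block; note the mapping is deterministic, so the same internal ID always yields the same external token.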
Yes, I tend to like this philosophy in database design, of internal sequential ids which are used for joins between tables etc. and an exposed "external reference". But I typically would use a UUID for my external reference rather than a hash of the internal id.
Doesn't that just add a whole lot of unnecessary complexity? If elements have multiple IDs, one of which should not be leaked to the outside, that's just asking for trouble in my opinion.
Is generating UUIDv4 or UUIDv7 really too much effort? I'd assume that writing the row to the database takes longer than generating the UUID.
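For a rough sense of the cost (a sketch; exact numbers vary by machine), the standard library makes it easy to measure: generating a v4 UUID is on the order of a microsecond, far below typical row-insert latency:

```python
import timeit
import uuid

# Average cost of one uuid4() call over 100k iterations.
per_call = timeit.timeit(uuid.uuid4, number=100_000) / 100_000
print(f"~{per_call * 1e6:.2f} microseconds per UUIDv4")
```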
It also means that once your hashing scheme leaks for whatever reason, or gets brute-forced because of some weakness in your system, it's game over: everybody will forever be able to predict future IDs, guess neighboring IDs, etc., unless you're willing to change the hash and invalidate all existing links to content on your site.
If I'm in a scenario where I think I need consecutive ids internally and random ones externally, I'll just have two fields in my tables.
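A sketch of that two-field layout using the stdlib sqlite3 module and a random external reference (the table and column names are made up for illustration):

```python
import sqlite3
import secrets

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal: joins, FKs
        external_id TEXT NOT NULL UNIQUE,               -- random: safe to expose
        name        TEXT NOT NULL
    )
""")

def create_user(name: str) -> str:
    """Insert a row and hand back only the external reference."""
    ext = secrets.token_urlsafe(16)  # 128 bits of randomness
    conn.execute("INSERT INTO users (external_id, name) VALUES (?, ?)", (ext, name))
    return ext

def get_user(external_id: str):
    """Look up a row by the ID a client handed us."""
    return conn.execute(
        "SELECT id, name FROM users WHERE external_id = ?", (external_id,)
    ).fetchone()
```

The UNIQUE constraint gives you the reverse index for free, so lookups by external ID stay cheap.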
Store just the sequential id, compute the hash on the edge.
This keeps your database simple and performant, and pushes complexity and work to the backend servers. This can be nice because developers are typically more at home at that layer, and scaling the backend can be a lot easier than scaling your database. But it also comes with the downsides listed in this thread.
Good point. Back when we did that we just used a reversible hash function (some would call it encryption). There are some simple algorithms meant for encrypting single integers with a reasonable key.
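One common construction for this is a small Feistel network, which turns any keyed one-way function into a reversible permutation over fixed-width integers. A sketch over 64-bit IDs (the key and round count are placeholders, and this toy is not a vetted cipher):

```python
import hashlib

KEY = b"server-side-secret"  # placeholder; keep the real key server-side

def _round(half: int, r: int) -> int:
    """Keyed round function: 32 bits of SHA-256 over key, round number, half-block."""
    digest = hashlib.sha256(KEY + bytes([r]) + half.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")

def encrypt_id(n: int, rounds: int = 4) -> int:
    """Reversibly scramble a 64-bit integer with a balanced Feistel network."""
    left, right = n >> 32, n & 0xFFFFFFFF
    for r in range(rounds):
        left, right = right, left ^ _round(right, r)
    return (left << 32) | right

def decrypt_id(n: int, rounds: int = 4) -> int:
    """Run the rounds backwards to recover the original integer."""
    left, right = n >> 32, n & 0xFFFFFFFF
    for r in range(rounds - 1, -1, -1):
        left, right = right ^ _round(left, r), left
    return (left << 32) | right
```

Because it's a permutation, distinct inputs always map to distinct outputs, so there are no collisions by construction.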
I might be misremembering, but didn't YouTube do this in the early days? So yeah, that was what I was thinking of when replying, not a traditional hash function.