>Rails developers at the time who were largely encouraged to treat their database as a dumb data store by convention and to let the application do the work
I've worked in a big enterprise once (hated it for all sorts of reasons) where the business logic was entirely in the database, and I get that that could work too. But in the sort of environments where I've worked most, there's a lot of knowledge about the application programming language, and less about the database. No dedicated DBAs. And we all know our SQL: we build basic NOT NULL and foreign key constraints, maybe a simple uniqueness constraint in SQL, but otherwise concentrate all the application logic in... you know, the application. For better or worse, I guess it's a coping mechanism for application developers. (And then it becomes a habit, and maybe for some, like the article said, dogma.)
The logic of the app I currently work on lives 99% in the database... As a C++ plugin to the DB.
Not stored procedures, or triggers, or anything like that... Just a giant plugin loaded when the DB starts that provides business specific functions that can be called by the client instead of SQL queries (think "CreateAccount" or "GetFooDetails").
Oh, and probably 50% of the columns are giant JSON objects.
I was expecting OP to keep blaming the database choice, given how critical their take on the decision making is. But they actually give lots of praise to the database's full-text search features. Apparently he was judging Heroku's choice of database, which I'd guess was not a random one. I can't grasp the incentive for putting it the way the article does. Dramatizing makes for a nice contour, but it gives away a prejudiced bias, as this comment does for me.
(not a rebuttal on your comment, just a continuation of the conversation)
I found it an interesting criticism, and from what we know about the project I don't think I agree with the author.
In the space of which database to use, there are really so few differences between Postgres and its peers that "using what's available" is perfectly sound reasoning for choosing PostgreSQL over, say, MySQL. I think around that time MongoDB was a possible alternative as well, but it was experimental and often mocked. Postgres and MySQL were pretty much at feature parity, even in 2011, except for very niche use cases, which this project did not sound like it would have.
Factoring in that Heroku's PG UI is great, and that using an alternative would mean setting things up yourself (in 2011 the Heroku marketplace was not yet as expansive, iirc), or at least having another account at another SaaS company of dubious maturity, the decision to use PG seems to have been given the correct amount of critical thinking ("will it be able to support all that we need? most likely yes").
The author may also not have been aware that for RoR applications the code is pretty DB-agnostic, an unawareness he hints at in other parts of the story as well. Meaning that in the span of the year that they're working on rebuilding the site in RoR, if they discover a use case that another DB would be extremely helpful with, they could relatively easily switch over to this other DB without having to rewrite all of the code. At most it'd take rewriting of some migrations for FK's and possibly some (hopefully few) custom SQL queries. Remember that they had a one-time-switch-over when the site was done being built, so during this year of development time, the database did not accrue live user data that would have made such a change more difficult.
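To make the DB-agnostic point concrete: in a Rails app the database choice mostly lives in `config/database.yml`, so moving to another supported database is largely a matter of changing the adapter there (plus swapping the driver gem and re-running migrations). A hypothetical sketch; exact keys vary by Rails version:

```yaml
# config/database.yml -- illustrative sketch, not from the project in question
production:
  adapter: postgresql      # change to mysql2 to target MySQL instead
  database: myapp_production
  pool: 5
```

The application code itself goes through ActiveRecord and mostly never mentions the adapter, which is what makes the late switch the parent comment describes feasible.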
You’re not wrong in your assessment but I would still expect a company architecting a total replacement for an existing system to know why they are choosing critical components. In this case it was an extensive lack of database consideration in many decisions.
Much of this and the supporting articles were focused on the security and fraud topics. Since those details were linked out from this article, most of what's left is around the technical bits that didn't get their own posts.
For an audience of programmers on HN, I'm not surprised that this topic got more attention than the fraud but it definitely wasn't intended to be the focus of the post.
Fraud is a broad problem to tackle, and it requires individual awareness. It's more a challenge of educating users at scale than of technical enhancement, which usually ends up meaning OTP and 2FA. The honesty of OP's point of view is the most attractive part of the article, and it's what pulls you through to the end. Which domain gets the focus then varies with each reader's interests.
A couple years ago, the company I was working at had a dedicated team just for JIRA (and other workflow mgmt systems).
They were looking for some people to fill out the team so I asked my SRE friends if they knew any good JIRA people.
I was surprised that the #1 answer was "Dedicated JIRA people? Why don't you just have someone do it from your team. JIRA isn't that hard."
This immediately made me think of the "Why do we even need DBAs? It's all ORM anyway." Which of course leads to "Why is the DB so slow?" etc etc
I thought that was a bad scenario till a friend was telling me at their current employer (which uses a lot of databases) they routinely ask "Is anyone here the DBA?" and the answer is "What's a DBA??"
I think like 95% of a DBA's value is a willingness to touch the db. It's not that hard to find slow running queries or missing indexes. Just that most devs don't know how and/or never check. I'm guilty of it too. Then some DBA comes along, takes five minutes to run a query, and sends the devs 10 easy performance fixes.
I think the OP is referring to the time before Rails natively supported foreign keys.
While I agree there's a place for business rules (the app), properly enforcing referential integrity and uniqueness can now easily be done in both - and I trust PostgreSQL more.
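A quick sketch of what "trusting the database" buys you: with a foreign key constraint declared, the database itself refuses orphaned rows no matter which code path tries to insert them. Illustrated with Python's stdlib `sqlite3` for self-containedness (note SQLite, unlike PostgreSQL, needs FK enforcement switched on per connection); the tables are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id)
    );
""")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO orders (user_id) VALUES (1)")  # fine: user 1 exists

try:
    conn.execute("INSERT INTO orders (user_id) VALUES (42)")  # no such user
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(orphan_allowed)  # the database rejected the orphan row
```

An application-level check can race or be bypassed by a console script; the constraint cannot, which is why enforcing it in both layers is cheap insurance.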
> In order to load just 100 paginated items on his account history page, over 50,000 queries were triggered using so much RAM that it crashed the server.
Issuing 50k queries to get 100 items is just wrong, no matter how you look at it. I can see the attractiveness of writing all your logic in one language; and in a way it's unfortunate there's not a simple way to write your filter in Ruby, pass it to the database, and have it executed there. But you have to face reality and deal with the fact that such complicated logic needs to be close to the data.
Edit: Initially misunderstood to mean 50k items, not 50k queries, which is even worse.
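The classic shape of this failure is the N+1 query: one query for the page of parent rows, then one more query per row for its children, instead of a single batched query or join. A minimal sketch in Python with stdlib `sqlite3` (schema invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY);
    CREATE TABLE events (id INTEGER PRIMARY KEY, account_id INTEGER);
""")
conn.executemany("INSERT INTO accounts VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO events (account_id) VALUES (?)",
                 [(i,) for i in range(100)])

# N+1 pattern: one query for the page, then one query per account
queries = 1
accounts = conn.execute("SELECT id FROM accounts LIMIT 100").fetchall()
for (account_id,) in accounts:
    conn.execute("SELECT * FROM events WHERE account_id = ?", (account_id,))
    queries += 1
print(queries)  # 101 queries for 100 items

# Batched alternative: the same data in two queries total
ids = [a[0] for a in accounts]
placeholders = ",".join("?" * len(ids))
events = conn.execute(
    f"SELECT * FROM events WHERE account_id IN ({placeholders})", ids
).fetchall()
print(len(events))
```

With nested associations the per-row queries multiply, which is how an innocent-looking 100-item page balloons into tens of thousands of round trips like the one quoted above.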
Since this was written Rails has become much better about this kind of thing.
I think Arel and some of the other work that really helped was happening more than 10 years ago now (how time flies!) but it was a non-trivial upgrade so a lot of production sites didn't switch over for a few more years.
It's pretty common for Rails developers to know very little about databases (sql specifically).
It's very frustrating because, right out of the gate, Rails pushes you to think of the db as a graph, leading to highly inefficient queries.
My comment was based on Java environments, and it's the same there. It's unacceptable to build poorly designed schemas in any environment. But good schema design is different from what I thought the article might be referring to, which is building the entire business logic in SQL, using triggers and whatnot. I've seen enterprises with a database team doing exactly that, with application-layer people coming in to build their application on top of this "fait accompli" database. And this means you need two highly skilled teams: the database team and the application team. (Aside from a third, the infrastructure team.)
It’s common for junior developers fresh out of college or code camps. It’s also a good reason not to let the ratio of juniors to seniors get too far out of whack.
I’ve had several jobs now, where I came in to clean up after they hired a bunch of juniors with no experience.
Usually they were quite good at making things pretty.
But fundamental problems abounded: N+1 queries, missing foreign keys, insane / broken build times, no separation of concerns, etc.
It’s sort of nice, because you get to come in and be the hero who fixes everything up.
It can also really suck, because there are often good reasons things got to be that bad.
My biggest issue on this particular project was an almost intentional neglect of database specifics, such as not enforcing uniqueness at the database level even for usernames. There were about 50 users who had two records, which led to all sorts of crazy issues to track down.
Hours to days of data cleanup work that could have been bypassed by setting a simple unique index.
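For reference, this is the one-line guard in question: a unique index makes the database reject the duplicate at insert time, no cleanup required. Sketched with Python's stdlib `sqlite3` (the `username` column is from the story above; the rest is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
# The "simple unique index": duplicate usernames now fail at insert time.
conn.execute("CREATE UNIQUE INDEX idx_users_username ON users (username)")

conn.execute("INSERT INTO users (username) VALUES ('alice')")
try:
    conn.execute("INSERT INTO users (username) VALUES ('alice')")  # duplicate
    duplicated = True
except sqlite3.IntegrityError:
    duplicated = False

print(duplicated)  # the second 'alice' never makes it in
```

An application-side validation alone (e.g. a Rails `validates_uniqueness_of`) can let duplicates through under concurrent requests; the index closes that race.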
> I've worked in a big enterprise once (hated it for all sorts of reasons) where the business logic was entirely in the database, I get that that could work too.
This used to be quite common back in the day. In a sense the database was thought of as the application platform so the reasoning was to move business logic as close to the data as possible.
Performance was also a big driver since the DB was often the most high-powered server / cluster in the datacenter. An argument I heard was that by pushing logic into the DB via stored procedures and the like, you would get better performance that way vs. pulling a result set into application memory and then dealing with it there.
Note that I am not rendering a qualitative judgment on any of this, just sharing my experience. :)