Hello, I noticed that diaspora mentions support for MariaDB in the wiki’s installation guides, e.g. Installation/Ubuntu/Noble, but that is is deprecated. Any particular reason for that? I’m in general an advocator for MariaDB, so I’m wondering what has led to this situation?
It’s correct that we strongly discourage new pods from being set up with anything but PostgreSQL. The in-project support for MySQL/MariaDB is technically not deprecated yet, and it continues to function just fine. So if having MariaDB support is the deciding factor for you between setting up a pod or not - you’re fine.
Some time in the future, we will completely drop anything but PostgreSQL. We have not decided when that happens - and it won’t be anytime soon. If we make that decision, there will be a proper announcement, with a slow and careful timeline, and we will offer support for existing pods to migrate their data. So this is nothing to worry about now.
The reasons for that are complex, and there are quite a few of them. The biggest reason with the most impact to users is performance. With our specific database scheme, on the same system specs, PostgreSQL performs much better - it reduces the stream load time by more than half, for example. We tried a lot to make this perform better, but the query used for generating post streams is surprisingly complex (involves quite a few joins over big tables), and PostgreSQL just handles that much better. That’s also not just our opinion, that has been a constant observation across the tech community.
We also identified a bunch of database-scheme and -querying changes we want to do at some point in the future, to increase performance while reducing database size on disk and in memory, which is relevant for smaller pods. Part of that actually involves de-normalizing some tables and heavily relying on PGs array types and their indexing support on those. One example: in a lab test with the Geraspora production database, I de-normalized the post tags. At the moment, there’s a tags
table with only an id and a name (and an used counter), a taggings
table that is essentially a many-to-many table that references tags and posts. We don’t need any metadata on tags, and putting tags into a string[]
directly in the posts
table drops two tables and fairly big caches from memory, and with an index on that array column, querying for tags is actually faster than our current “proper” implementation. MariaDB does not support array columns - and while workarounds exist (like using the json type and using it as a list), the query performance on multi-valued indexes in MariaDB is absolutely abysmal compared to PostgreSQL - to a point where our current m2m-table-based solution is faster. There are a lot more examples where we want to use Postgres-specific SQL, and maintaining two database layer implementations is simply not feasible for a project as small as diaspora*.
But again, what I’m writing now is a future thing. We have no specific plans, and if we ever make the switch to Postgres-only, we will offer support to migrate existing MariaDB/MySQL-based pods. So while I heavily encourage you to use PG - if you can’t, you’re still fine.
I started my pod on MariaDB a number of years ago (2010!), and query performance quickly degraded to terrible. I migrated to Postgres 9 (I used diaspora to create the postgres schema and then simply copied the data from the mysql tables to pg using a tool I found online). Somewhere around postgres 13 the query optimizer got smart enough to decrease the stream query by about an order of magnitude. Since then it’s only gotten better.