Postgres Takes It All
PostgreSQL is becoming a catch-all solution for simple scenarios, and this is a trend, not an accident. A new article full of evidence reinforces the ideas I wrote down in 2019 about Kafka vs PostgreSQL: let's dig into it.
In the last 15 years, the IT world has produced some very nice solutions to address the ‘Facebook effect’ challenge. A few examples to explain what I mean.
Back in 2009…
Friends
When you have a graph of friends of friends, a relational database can have a hard time extracting all the information you need. If your database has 1 billion people as records (like Facebook used to), your problem seems intractable. NoSQL databases were born.
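To make it concrete, here is a sketch of what such a traversal looks like in plain SQL, assuming a hypothetical friendship(person_id, friend_id) edge table (made up for illustration, not any real Facebook schema). At a billion rows, the recursive join below gets very expensive very fast.

```sql
-- Hypothetical edge table: friendship(person_id, friend_id), one row per direction
WITH RECURSIVE reachable(person_id, depth) AS (
    SELECT 42::bigint, 0                      -- start from person 42
    UNION                                     -- UNION (not ALL) prunes duplicate paths
    SELECT f.friend_id, r.depth + 1
    FROM   reachable r
    JOIN   friendship f ON f.person_id = r.person_id
    WHERE  r.depth < 2                        -- stop at two hops
)
SELECT DISTINCT person_id
FROM   reachable
WHERE  depth = 2;                             -- naive friends-of-friends: does not
                                              -- exclude direct friends or person 42
```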
We are the dot in the dot-com bubble
You have a big shop like Ikea or Amazon: you sell heterogeneous stuff, from plastic bags to reindeer meatballs, passing through desks and chairs. In these scenarios a big t_product table is challenging (Oracle's per-table column limit is 1000). Sure, every one of these things has a price. You can meta-describe them, but believe me, you are entering meta-dimension madness: for instance, Reddit chose to have just two tables to model everything, a very bold (and crazy) move, but okay… it was 2012…
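As an illustration of that madness, here is a minimal sketch in the spirit of the "two tables for everything" idea (hypothetical table and column names, not Reddit's actual schema):

```sql
-- One table for things, one for their attributes as key/value rows
CREATE TABLE thing (
    id   bigserial PRIMARY KEY,
    kind text NOT NULL                 -- 'product', 'user', 'order', ...
);

CREATE TABLE thing_data (
    thing_id bigint NOT NULL REFERENCES thing(id),
    key      text   NOT NULL,          -- 'price', 'colour', 'weight', ...
    value    text   NOT NULL,          -- everything becomes a string
    PRIMARY KEY (thing_id, key)
);

-- Even "products cheaper than 10" turns into a cast-and-join exercise
SELECT t.id
FROM   thing t
JOIN   thing_data d ON d.thing_id = t.id AND d.key = 'price'
WHERE  t.kind = 'product'
  AND  d.value::numeric < 10;
```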
During this phase MongoDB emerged.
The castle of Logs
If you have a social application (LinkedIn), you end up managing a lot of streamed data (from cellphones, websites, advertisement systems, etc.) that you need to process in very different ways. A big trading system has similar needs. Sometimes you want to replay the data to reprocess it from a different angle. Kafka emerged.
In this time frame databases were just databases, and Oracle was the only vendor starting to provide additional features on top of them. PostgreSQL was known to be painfully slow, but less buggy than ten years before.
Fast forward to 2025
Let me address the challenge from another point of view.
You can rent a 64 CPU + 128 GB RAM machine on AWS for 14,304€ per year. This machine has the same CPU count and double the memory of a Sun E10000. Fineco used three of them in 2000 to run its online trading platform. Now it costs less than a 15-day vacation in Europe (or Japan).
PostgreSQL nowadays offers (a quick SQL sketch follows the list):
- JSON management via native type support (json/jsonb)
- Full text search (with stemming included)
- Parallel query execution (parallel SELECT)
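Here is a minimal sketch of the first two points, assuming a hypothetical t_product table (names and columns are made up for illustration):

```sql
-- Hypothetical catalogue table mixing relational and semi-structured data
CREATE TABLE t_product (
    id         bigserial PRIMARY KEY,
    name       text NOT NULL,
    price_eur  numeric(10,2),
    attributes jsonb,        -- free-form, per-product attributes
    search_doc tsvector GENERATED ALWAYS AS (to_tsvector('english', name)) STORED
);

-- JSON: products whose attributes declare colour = red
SELECT name, price_eur
FROM   t_product
WHERE  attributes @> '{"colour": "red"}';

-- Full text search with stemming: 'meatball' also matches 'meatballs'
SELECT name
FROM   t_product
WHERE  search_doc @@ to_tsquery('english', 'meatball');
```

Parallel SELECT asks nothing of you: on big scans the planner fires up parallel workers on its own.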
Cloud providers offer managed PostgreSQL installations with stronger performance, too.
Why a dumb architecture is better than a bad architecture
I am often called to set up new projects (which, 99% of the time, are tech refreshes of legacy ones, for sure). These systems interact with other big systems.
When I ask for a volume estimation, things get incredibly complex. Business stakeholders tend to be a bit optimistic about these values. Also, the data is not always congruent. On the last system I worked on, about 40% of the input transactions were not valid (i.e. they hit the limits or were malformed), so the declared volume was 40% greater than the output sent to other systems (!).
In these scenarios, finding the correct sizing of the system upfront is challenging and often impossible.
On some projects, it makes little sense to propose MongoDB or Kafka without strong evidence (in the form of volumes, scope, provisioning cost tradeoffs, etc.).
Things change…
I want to stress that after the first set of deployments, if your business starts to become successful, things will change very fast. As an example, the first installation of Facebook required just a 1000 USD rented machine and used MySQL.
Last but not least, it seems Big Data is quite dead, for reasons not linked to IT but to other forces (like the legal risk of keeping too much data lying around). These facts bring up another odd situation: a fat MySQL/MariaDB installation can often be on par with a MongoDB cluster, simply because you have too little data to justify MongoDB's super-powers.
The complexity of managing a fault-tolerant Kafka installation is often underestimated: a replicated active-passive database is far simpler to set up.
Conclusion
It becomes crucial to analyze your use case. Modern databases like PostgreSQL (or Oracle) are ideal for exploring different architectural setups: you can implement small queues, structure key/value data in JSON columns, add full text search functionality, and so on.
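For instance, a small work queue needs nothing more than a table and row locking. A minimal sketch, assuming a hypothetical job_queue table (SKIP LOCKED has been available since PostgreSQL 9.5):

```sql
-- Hypothetical job queue table
CREATE TABLE job_queue (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    done_at    timestamptz
);

-- Each worker claims one pending job inside a transaction;
-- SKIP LOCKED makes concurrent workers skip rows already claimed by others.
BEGIN;
SELECT id, payload
FROM   job_queue
WHERE  done_at IS NULL
ORDER  BY created_at
LIMIT  1
FOR UPDATE SKIP LOCKED;

-- ...process the payload, then mark the row returned by the SELECT above...
UPDATE job_queue SET done_at = now() WHERE id = 1;  -- 1 is a placeholder id
COMMIT;
```

It will not replace Kafka at LinkedIn scale, but for moderate volumes it covers the "small queue" case with plain SQL.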
I was amazed at how simple it is to update the search index of a single document in SQLite compared to an index update on an Elasticsearch cluster. It is a simple task, but with SQLite you need to know nothing: it is just an update!
And yes, SQLite has minimal full text indexing capabilities too :)
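A minimal sketch using SQLite's FTS5 module (table and column names are made up): the virtual table is the index, so refreshing a document really is just an UPDATE.

```sql
-- The FTS5 virtual table is the full text index itself
CREATE VIRTUAL TABLE docs USING fts5(title, body);

INSERT INTO docs (title, body) VALUES ('recipe', 'swedish reindeer meatballs');

-- Re-indexing a single document is just an UPDATE
UPDATE docs SET body = 'vegetarian meatballs and gravy' WHERE rowid = 1;

-- Full text query
SELECT title FROM docs WHERE docs MATCH 'meatballs';
```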
After you embrace the “PostgreSQL takes it all” mantra, you can evolve your infrastructure by adding components like RabbitMQ, Kafka, Elasticsearch, vector databases, etc. in a second phase (which, believe me, could be in a far future called “NEVER” :)