Kafka |

Postgres Take it All

Nov 5, 2025 · 5 min read

en software · kafka postgresql sqlite nosql

PostgreSQL is becoming a catch-all solution for simple scenarios, and this is a trend, not an accident. A new article full of evidence reinforced the ideas I wrote down in 2019 about Kafka vs PostgreSQL: let's dig into it.

Avoid Kafka if unsure (think twice series)

Dec 2, 2019 · 2 min read

en featured knowledgebase sql · java nosql kafka

Some co-workers started using Apache Kafka with a bunch of our customers.

Apache Kafka is a community distributed event streaming platform capable of handling trillions of events a day. Initially conceived as a messaging queue, Kafka is based on an abstraction of a distributed commit log[*].
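
To make the "commit log" abstraction concrete, here is a minimal in-memory sketch. It is a toy model and not Kafka's API: the `CommitLog` class, its methods and the event names are all hypothetical, and there is no distribution, persistence or partitioning. It only shows the core idea that records are appended and never removed, and that each consumer tracks its own offset.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Toy append-only log (hypothetical, not Kafka's API)."""
    records: list = field(default_factory=list)

    def append(self, record):
        # The offset is simply the position of the record in the log.
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        # Reading never removes anything: a consumer only moves its own offset.
        return self.records[offset:offset + max_records]


log = CommitLog()
for event in ["order-created", "order-paid", "order-shipped"]:
    log.append(event)

# Two independent consumers, each starting from its own offset.
billing_batch = log.read(0)
audit_batch = log.read(2)
```

This is the main difference from a classic queue, where a consumed message is gone: here the same record can be re-read by any consumer at any time.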

To reach that scale, Apache Kafka needs a complex server setup, even more complex if you want certification from the company behind it (Confluent). Now, if you are planning to use Kafka as a simple Java Message Service (JMS) implementation, think twice before going down this route.

PostgreSQL 12 offers a fair (and open source) partition implementation, whereas if money is not a problem, Oracle 12c can happily scale to billions of records before running into trouble (and Exadata can scale even more).

PostgreSQL and Oracle offer optimizations for partitioned data, called “Partition Pruning” in PostgreSQL terminology:

With partition pruning enabled, the planner will examine the definition of each partition and prove that the partition need not be scanned because it could not contain any rows meeting the query's WHERE clause. When the planner can prove this, it excludes (prunes) the partition from the query plan.

This feature is quite new (it appeared in PostgreSQL 11) but it is essential to a successful partition strategy. Before this feature, partitioning was a black magic art. Now it is simpler to manage.
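
The idea behind pruning can be sketched in a few lines of Python. This is a toy model of the reasoning only, not PostgreSQL's planner: the `Partition` class, the `prune` function and the `events_*` partition names are hypothetical, and I assume simple range partitions on a date column.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Partition:
    name: str
    lower: date  # inclusive bound
    upper: date  # exclusive bound


def prune(partitions, query_from, query_to):
    """Keep only the partitions whose range [lower, upper) can overlap
    the queried range [query_from, query_to)."""
    return [
        p.name
        for p in partitions
        if p.lower < query_to and query_from < p.upper
    ]


partitions = [
    Partition("events_2019_10", date(2019, 10, 1), date(2019, 11, 1)),
    Partition("events_2019_11", date(2019, 11, 1), date(2019, 12, 1)),
    Partition("events_2019_12", date(2019, 12, 1), date(2020, 1, 1)),
]

# Roughly: WHERE created_at >= '2019-11-15' AND created_at < '2019-12-05'
to_scan = prune(partitions, date(2019, 11, 15), date(2019, 12, 5))
```

In this example the October partition is excluded, because its bounds prove it cannot contain a matching row, so only two of the three partitions would be scanned.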

Finding Kafka’s throughput limit in Dropbox infrastructure | Dropbox Tech Blog

Feb 14, 2019 · 1 min read

en · kafka

At Dropbox, Kafka clusters are managed by the Jetstream team, whose primary responsibility is to provide high quality Kafka services. Understanding Kafka’s throughput limit in Dropbox infrastructure is crucial in making proper provisioning decisions for different use cases, and this has been an important goal for the team. Recently, we created an automated testing platform to achieve this objective. In this post, we would like to share our method and findings.

Read Finding Kafka’s throughput limit in Dropbox infrastructure | Dropbox Tech Blog

?FileSystem is faster than RAM [under your Operating System]

Jan 18, 2019 · 2 min read

en featured · development great-ideas java kafka

I am studying Apache Kafka (a "distributed streaming platform") and I stumbled upon this conclusion: the "disk read fear" a lot of projects had in the past is unfounded.

A lot of distributed databases started their tutorials with the statement "disk reads are slow, writes are fast", which is true to some extent, and which was used to justify de-normalizing data and adding memory caches.