// Real-Time Streaming with Apache Kafka
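Kafka’s throughput comes from partitioning: each partition is an independent, ordered log, so more partitions let more consumers in a group work in parallel. They also cost more: more open connections and file handles on the brokers, and longer rebalances when group membership changes. Ordering is guaranteed only within a partition, so choose partition keys that match your ordering and scaling needs, e.g. user_id to keep each user’s events in order, or a random key (or no key) for an even spread when order does not matter.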
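To make the key-to-partition relationship concrete, here is a minimal sketch of hash-based partitioning. It is an illustration, not Kafka’s actual partitioner: the Java client hashes key bytes with murmur2, while this sketch substitutes MD5 from the standard library purely because it is deterministic. The names partition_for and NUM_PARTITIONS and the sample keys are assumptions made up for the example.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 6  # hypothetical partition count for the illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash.

    Stand-in for a real client's partitioner: MD5 is used only because it
    is deterministic across runs, not because Kafka uses it.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Every event for one user maps to one partition, so per-user order holds.
events = [("user-42", "login"), ("user-7", "login"), ("user-42", "checkout")]
for user_id, action in events:
    print(user_id, action, "-> partition", partition_for(user_id))

# Many distinct keys spread roughly evenly across the partitions.
spread = Counter(partition_for(f"user-{i}") for i in range(6000))
print(sorted(spread.items()))
```

The trade-off shows up directly: a single very active key still lands on one partition, so a skewed key distribution gives a skewed load no matter how many partitions exist.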
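Exactly-once semantics require idempotent producers, transactions, and consumer offsets committed as part of the same transaction as the output they produced; anything reading that output must also skip uncommitted records (isolation.level=read_committed). It’s doable but adds complexity; many systems are fine with at-least-once delivery (commit offsets only after processing) plus idempotent sinks.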
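The at-least-once plus idempotent sink pattern can be sketched without a broker. The simulation below is only an illustration under stated assumptions: Record, IdempotentSink, and consume are hypothetical names, the sink deduplicates on (partition, offset), and the "crash" is simulated by replaying part of a batch. A real sink would more likely use a unique key or upsert in the target store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    partition: int
    offset: int
    value: str


@dataclass
class IdempotentSink:
    """Hypothetical sink that ignores records it has already applied."""
    applied: set = field(default_factory=set)
    rows: list = field(default_factory=list)

    def write(self, record: Record) -> bool:
        record_id = (record.partition, record.offset)
        if record_id in self.applied:
            return False  # duplicate delivery: nothing changes
        self.applied.add(record_id)
        self.rows.append(record.value)
        return True


def consume(records, sink, committed):
    """Process first, then commit the offset: at-least-once ordering."""
    for record in records:
        sink.write(record)
        # The committed offset is the next offset to read.
        committed[record.partition] = record.offset + 1


sink = IdempotentSink()
committed = {}
batch = [Record(0, 0, "a"), Record(0, 1, "b"), Record(0, 2, "c")]

consume(batch, sink, committed)
# Simulate a restart where the last commits were lost: records at
# offsets 1 and 2 are delivered a second time.
consume(batch[1:], sink, committed)

print(sink.rows)   # ['a', 'b', 'c']
print(committed)   # {0: 3}
```

Because the duplicate writes are no-ops, the end state matches what exactly-once delivery would have produced, without transactions on the Kafka side.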
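Consumer lag is the canary in the coal mine. Lag is the gap between a partition’s latest offset and the consumer group’s committed offset. Monitor it per partition and set alerts; sudden spikes often mean a slow consumer, a bad rebalance, or a downstream outage. Tune batch size and per-consumer parallelism before adding more consumers, since consumers beyond the partition count sit idle.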
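As a rough sketch of the per-partition check, lag can be computed as the log end offset minus the committed offset. The function names, the sample offsets, and the threshold of 1000 are all assumptions for illustration; in practice the offsets would come from a Kafka client or monitoring tool, and a partition with no committed offset is treated here as starting from 0.

```python
def partition_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Lag per partition = log end offset - committed offset, floored at 0.

    Assumption: a partition with no committed offset counts from 0.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def lagging_partitions(lag: dict, threshold: int) -> list:
    """Return the partitions whose lag exceeds the alert threshold."""
    return sorted(p for p, n in lag.items() if n > threshold)


# Hypothetical snapshot: partition 2 has fallen far behind.
end_offsets = {0: 1500, 1: 1480, 2: 9200}
committed_offsets = {0: 1495, 1: 1480, 2: 4100}

lag = partition_lag(end_offsets, committed_offsets)
print(lag)                            # {0: 5, 1: 0, 2: 5100}
print(lagging_partitions(lag, 1000))  # [2]
```

Checking per partition rather than in total matters: a summed lag of 5105 would hide the fact that only one partition is stuck, which points toward a hot key or a stalled consumer rather than a general capacity shortfall.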
→ Key takeaway: Partition key choice drives both ordering and scalability. Monitor lag and tune consumers before scaling out.