Kafka: 5 years of feedback
Abstract
- one topic for all:
- 👎 one slow message slows down the other messages
- 👎 complex monitoring
- 👎 non-specialized consumers
- functional topics
- 👍 similar messages in processing time
- 👍 simpler debugging and monitoring
- dead letter
- 👍 no message loss
- 👍 limited impact from one tenant to another
- 👎 order not kept
- transactional messages
- 👍 no message loss
- 👍 no distributed transaction
- 👎 queue-based implementation in the database
- 👎 small loss of real-time delivery
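The "transactional messages" point can be sketched as an outbox table: the business write and the message are committed in the same local database transaction, and a separate relay forwards pending rows to Kafka. This is a minimal sketch, not the author's implementation. The table layout, the function names (`open_db`, `update_product`, `relay_once`) and the `send` callback standing in for a real Kafka producer are all assumptions made for illustration.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a business table and an outbox queue."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " topic TEXT NOT NULL, msg_key TEXT NOT NULL, payload TEXT NOT NULL,"
        " sent INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def update_product(conn: sqlite3.Connection, product_id: str, name: str) -> None:
    """Write the product and enqueue its message in ONE local transaction."""
    payload = json.dumps({"type": "UpdateProduct", "id": product_id, "name": name})
    with conn:  # commits on success, rolls back both inserts on error
        conn.execute(
            "INSERT OR REPLACE INTO product (id, name) VALUES (?, ?)",
            (product_id, name),
        )
        conn.execute(
            "INSERT INTO outbox (topic, msg_key, payload) VALUES (?, ?, ?)",
            ("catalog.product.cmd", product_id, payload),
        )


def relay_once(conn: sqlite3.Connection, send) -> int:
    """Forward pending outbox rows in insertion order; return how many were sent.

    `send(topic, key, payload)` is a hypothetical stand-in for a Kafka producer.
    If it raises, the row stays unsent and is retried on the next run.
    """
    rows = conn.execute(
        "SELECT seq, topic, msg_key, payload FROM outbox WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, topic, key, payload in rows:
        send(topic, key, payload)
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

The relay runs on a polling interval, which is where the small loss of real time comes from. A crash between `send` and the `UPDATE` makes the relay resend the row, so consumers should tolerate duplicates.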
Anti-patterns
- batch producer
- let Kafka do the batch
- Kafka message size
- do not send binaries
- store the binaries in a file storage
- send the Kafka message containing the file URL
- message processing time
- have deterministic and constant processing time in the same topic
- set max.poll.interval.ms and max.poll.records depending on the expected consumption
- topic definition
- no one topic for all
- one topic per use case
- define a naming convention
- e.g. <business domain>.<data description>.<classification>: catalog.product.cmd containing the messages UpdateProduct, DeleteProduct
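A naming convention is easier to keep when it is checked automatically, for example at topic creation time. The sketch below validates the three-segment pattern from the example above. The allowed character set and the `ALLOWED_CLASSIFICATIONS` set are assumptions: only `cmd` appears in these notes, so extend the set with your own classifications.

```python
import re

# Assumption: only "cmd" is mentioned in the notes; add your own values here.
ALLOWED_CLASSIFICATIONS = {"cmd"}

# Assumption: lowercase segments made of letters, digits and hyphens.
_SEGMENT = r"[a-z][a-z0-9-]*"
_TOPIC_RE = re.compile(rf"({_SEGMENT})\.({_SEGMENT})\.({_SEGMENT})")


def parse_topic(name: str) -> dict:
    """Split '<business domain>.<data description>.<classification>' or raise ValueError."""
    match = _TOPIC_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"topic {name!r} does not follow domain.data.classification")
    domain, data, classification = match.groups()
    if classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"unknown classification {classification!r} in {name!r}")
    return {"domain": domain, "data": data, "classification": classification}
```

For instance, `parse_topic("catalog.product.cmd")` returns the three parts, while a catch-all name such as `"all-messages"` is rejected.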
Number of partitions
- repartitioning ⇒ ordering loss
- it is better to over-partition than to under-partition
- choose a number of partitions that has many divisors (e.g. 12, 24 or 36)
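Both points can be checked with a little arithmetic. The sketch below lists the consumer counts that split a topic evenly, and shows which keys change partition when the partition count changes. Kafka's default partitioner hashes the key bytes with murmur2. Here `zlib.crc32` is used as a stand-in so the example stays stdlib-only, which means the partition numbers will not match a real cluster. All function names are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition. Stand-in hash (crc32), NOT Kafka's murmur2."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def even_consumer_counts(num_partitions: int) -> list:
    """Consumer counts that give every consumer the same number of partitions."""
    return [c for c in range(1, num_partitions + 1) if num_partitions % c == 0]


def moved_keys(keys, old_count: int, new_count: int) -> list:
    """Keys whose partition changes when the topic is repartitioned."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]
```

`even_consumer_counts(12)` gives `[1, 2, 3, 4, 6, 12]`, whereas a prime count such as 7 only balances with 1 or 7 consumers. Every key returned by `moved_keys` has older messages in one partition and newer ones in another, so nothing guarantees they are consumed in order any more.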
Partition key
- partition by homogeneous business value
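One possible reading of "homogeneous" is a key whose values spread the load evenly while keeping related messages together. Under that assumption, the sketch below measures how evenly a candidate key distributes messages across partitions. As before, `zlib.crc32` stands in for Kafka's murmur2 hash, and the function names and sample keys are hypothetical.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition. Stand-in hash (crc32), NOT Kafka's murmur2."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys, num_partitions: int) -> Counter:
    """Number of messages landing on each partition for the given message keys."""
    return Counter(partition_for(k, num_partitions) for k in keys)


def skew(keys, num_partitions: int) -> float:
    """Busiest partition's load divided by the ideal even load (1.0 = perfectly even)."""
    keys = list(keys)
    load = partition_load(keys, num_partitions)
    return max(load.values()) / (len(keys) / num_partitions)
```

Keying 120 messages by a single hot tenant gives `skew(["tenant-a"] * 120, 12) == 12.0`, because one partition receives everything. Keying by a finer-grained identifier such as a product id spreads the load, and all messages for one product still share a partition and keep their relative order.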
Monitoring
- add a correlation id to the messages
- monitor the message production throughput
- monitor the consumer group lags
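The two monitoring ideas are small enough to sketch. A correlation id is created once and then propagated unchanged through every message derived from the original request. Consumer group lag is the gap between the latest offset of a partition and the offset the group has committed. The header name `correlation-id` and both function names are assumptions, and the offsets are passed in as plain dictionaries rather than fetched from a real cluster.

```python
import uuid
from typing import Dict, Optional


def with_correlation_id(headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return message headers that carry a correlation id.

    An existing id is kept so that one id follows the whole processing chain.
    The header name "correlation-id" is an assumption, not a Kafka standard.
    """
    merged = dict(headers or {})
    if "correlation-id" not in merged:
        merged["correlation-id"] = str(uuid.uuid4())
    return merged


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition = latest offset - committed offset.

    Simplification: a partition with no committed offset is counted from 0.
    """
    return {p: end - committed.get(p, 0) for p, end in end_offsets.items()}
```

With `end_offsets = {0: 100, 1: 50}` and `committed = {0: 90, 1: 50}`, the lag is `{0: 10, 1: 0}`. A lag that keeps growing on one partition usually points at a slow or stuck consumer.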
Access control
- no manual action
- Kafka prod == database prod: treat access to production Kafka as strictly as access to the production database
Checklist
- producers
- acks
- max.in.flight.requests.per.connection
- partition keys
- consumers
- max.poll.interval.ms
- max.poll.records
- monitor consumer group lags
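As a rough illustration of the checklist, the sketch below holds one possible set of producer settings and derives max.poll.records from the expected processing time. The property names are real Kafka client settings. The values, the `safety_factor` and the function names are assumptions to adapt to your own workload.

```python
import math

# Example values only. With idempotence enabled, Kafka preserves ordering
# as long as max.in.flight.requests.per.connection is at most 5.
PRODUCER_CONFIG = {
    "acks": "all",
    "enable.idempotence": "true",
    "max.in.flight.requests.per.connection": "5",
}


def max_poll_records(max_poll_interval_ms: int,
                     worst_case_ms_per_record: float,
                     safety_factor: float = 0.5) -> int:
    """Largest batch that can be processed well within max.poll.interval.ms.

    If a batch takes longer than max.poll.interval.ms, the consumer is
    considered failed and the group rebalances, so only a fraction of the
    interval (safety_factor, an assumption) is used as the processing budget.
    """
    budget_ms = max_poll_interval_ms * safety_factor
    return max(1, math.floor(budget_ms / worst_case_ms_per_record))


def consumer_config(max_poll_interval_ms: int, worst_case_ms_per_record: float) -> dict:
    """Build the two consumer settings from the checklist as string properties."""
    records = max_poll_records(max_poll_interval_ms, worst_case_ms_per_record)
    return {
        "max.poll.interval.ms": str(max_poll_interval_ms),
        "max.poll.records": str(records),
    }
```

With a 300000 ms poll interval and messages that take up to 200 ms each, `max_poll_records(300_000, 200)` allows 750 records per poll. This calculation only works when processing time within a topic is predictable, which is one more reason to keep slow and fast messages in separate topics.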