Batch processing with DDD
Abstract
Merge domain concepts into a single domain model and put adapters around it. Use a hexagonal architecture so that one service can support multiple entry points (REST controllers, bus listeners, CLI, batch launcher, …).
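A minimal sketch of that layout, assuming a hypothetical billing domain: `Invoice`, `InvoiceRepository`, `BillingService` and the two entry-point functions are names made up for illustration, not taken from the talk. The point is only that the REST-style adapter and the batch adapter call the same domain service.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Invoice:
    customer: str
    amount_cents: int


class InvoiceRepository(Protocol):
    """Outbound port: the only thing the domain knows about storage."""

    def for_customer(self, customer: str) -> List[Invoice]: ...


class InMemoryInvoiceRepository:
    """Outbound adapter (hypothetical); a real one might wrap a database."""

    def __init__(self, invoices: List[Invoice]) -> None:
        self._invoices = invoices

    def for_customer(self, customer: str) -> List[Invoice]:
        return [i for i in self._invoices if i.customer == customer]


class BillingService:
    """Domain service: unaware of HTTP, CLI, or batch scheduling."""

    def __init__(self, repo: InvoiceRepository) -> None:
        self._repo = repo

    def total_due(self, customer: str) -> int:
        return sum(i.amount_cents for i in self._repo.for_customer(customer))


# Inbound adapters: each entry point translates its own input format
# and then delegates to the same domain service.
def rest_get_total(service: BillingService, customer: str) -> Dict[str, object]:
    return {"customer": customer, "total": service.total_due(customer)}


def batch_totals(service: BillingService, customers: List[str]) -> Dict[str, int]:
    return {c: service.total_due(c) for c in customers}
```

Adding a CLI or a bus listener would then mean writing one more thin adapter, without touching `BillingService`.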
Use a cache only for the duration of the batch run:
- to speed up the processing,
- to enforce consistent values throughout the calculations.
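One way to picture such a cache is a context manager whose contents are dropped when the batch ends. This is a sketch under assumptions: `BatchScopedCache` and the loader function are hypothetical names, and the loader stands in for any slow or changing source, such as an exchange-rate lookup.

```python
from typing import Callable, Dict


class BatchScopedCache:
    """Cache that lives only inside a `with` block, i.e. one batch run."""

    def __init__(self, loader: Callable[[str], float]) -> None:
        self._loader = loader
        self._values: Dict[str, float] = {}

    def __enter__(self) -> "BatchScopedCache":
        return self

    def __exit__(self, *exc_info) -> None:
        # Nothing survives past the batch, so the next run reloads fresh values.
        self._values.clear()

    def get(self, key: str) -> float:
        # The first lookup hits the source; later lookups in the same batch
        # reuse that value, so every calculation sees the same number.
        if key not in self._values:
            self._values[key] = self._loader(key)
        return self._values[key]
```

Within one `with BatchScopedCache(load_rate) as cache:` block, repeated calls to `cache.get("EUR")` return the same value even if the underlying source changes in the meantime.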
To improve batch processing performance:
- batch on datasets (operate on a whole set of records at a time),
- consider splitting batches into modular batches,
- make results composable (as tuples) by using composable structures such as monoids,
- consider a map-reduce architecture.
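The monoid and map-reduce points fit together: if a partial result has an associative `combine` operation and a neutral element, chunks can be processed separately and merged in any grouping. A minimal sketch, assuming a hypothetical `Summary` result made of a count and a total:

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List


@dataclass(frozen=True)
class Summary:
    """Tuple-like partial result: (count, total)."""

    count: int = 0
    total: int = 0

    def combine(self, other: "Summary") -> "Summary":
        # Associative: (a + b) + c == a + (b + c), field by field.
        return Summary(self.count + other.count, self.total + other.total)


EMPTY = Summary()  # neutral element: combining with it changes nothing


def summarize_chunk(amounts: List[int]) -> Summary:
    """Map step: turn one chunk of the dataset into a partial result."""
    return Summary(len(amounts), sum(amounts))


def summarize(chunks: Iterable[List[int]]) -> Summary:
    """Reduce step: fold the partial results together, starting from EMPTY."""
    return reduce(Summary.combine, map(summarize_chunk, chunks), EMPTY)
```

Because `combine` is associative, the map step could run on separate workers and the partial summaries could be merged in whatever order they arrive in.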
Report progress in domain language (e.g. “Q3 results complete, except week 8” or “France complete, except Paris”).
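As an illustration of such a report, the sketch below turns per-unit statuses into a sentence a domain expert could read. `describe_progress` and its parameters are hypothetical, and the wording for the "everything failed" case is an assumption.

```python
from typing import Dict


def describe_progress(scope: str, statuses: Dict[str, bool]) -> str:
    """Summarize batch progress using domain names instead of job IDs."""
    failed = [name for name, ok in statuses.items() if not ok]
    if not failed:
        return f"{scope} complete"
    if len(failed) == len(statuses):
        return f"{scope} not complete"
    return f"{scope} complete, except {', '.join(failed)}"


print(describe_progress("Q3 results", {"week 7": True, "week 8": False}))
# prints "Q3 results complete, except week 8"
```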
A batch is a form of aggregate:
- unit of success/failure,
- unit of re-run.
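A rough sketch of what these two properties could look like in code, assuming each unit of work is an all-or-nothing callable keyed by a domain name. `BatchRun` and its methods are hypothetical, not an API from the talk or from any library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional, Set


@dataclass
class BatchRun:
    # Unit name -> job that either completes or raises (all-or-nothing).
    work: Dict[str, Callable[[], None]]
    done: Set[str] = field(default_factory=set)
    failed: Set[str] = field(default_factory=set)

    def run(self, names: Optional[Iterable[str]] = None) -> None:
        """Run the given units (default: all); each succeeds or fails as a whole."""
        for name in list(self.work) if names is None else list(names):
            try:
                self.work[name]()
            except Exception:
                self.failed.add(name)
            else:
                self.failed.discard(name)
                self.done.add(name)

    def rerun_failed(self) -> None:
        """Re-run only the units that failed, leaving completed ones untouched."""
        self.run(sorted(self.failed))
```

With units named after domain concepts (for example `"week 8"`), a failure can be retried on its own instead of restarting the whole batch.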
Reviews
Why did I want to read/watch this? I’m currently on a project that needs batch processing, and I have already worked on a DDD-style project. This topic “comes at the right time”.
What did I get out of it? There are lots of concepts that can be applied to design a good batch processing service. However, it seems there is still a lot to design and implement compared to off-the-shelf solutions such as Temporal.