Running scheduled jobs is not a trivial subject: there is a lot to consider, especially when something goes wrong.
Starting from a simple scheduled job:
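A minimal sketch of such a job. The `PaymentJob`, `PaymentRepository`, `Payment`, and `PaymentStatus` names are hypothetical, used only for illustration:

```java
import java.util.List;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Hypothetical job: picks up pending payments and marks them processed.
// Requires @EnableScheduling on a configuration class.
@Component
public class PaymentJob {

    private final PaymentRepository repository;

    public PaymentJob(PaymentRepository repository) {
        this.repository = repository;
    }

    // Runs one minute after the previous run finishes.
    @Scheduled(fixedDelay = 60_000)
    public void processPendingPayments() {
        List<Payment> pending = repository.findByStatus(PaymentStatus.PENDING);
        pending.forEach(Payment::markProcessed);
        repository.saveAll(pending);
    }
}
```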
Recoverability
Add @Transactional so that the whole run commits or rolls back as one unit; if the job fails midway (for example, the server crashes), the transaction rolls back and no partial writes are left behind:
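Applied to the hypothetical job above, this is just one extra annotation on the scheduled method:

```java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.transaction.annotation.Transactional;

// All reads and writes inside one run now share a single database
// transaction: either every payment in the run is committed, or none is.
@Scheduled(fixedDelay = 60_000)
@Transactional
public void processPendingPayments() {
    List<Payment> pending = repository.findByStatus(PaymentStatus.PENDING);
    pending.forEach(Payment::markProcessed);
    repository.saveAll(pending);
}
```

Note that @Transactional only works here because Spring proxies the bean; the scheduled method must be public and invoked through the proxy, not via a self-call.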
Batch processing
If volume increases, loading all records at once can exhaust our application's memory, so we need to process them in batches. However, combined with the previous fix, the whole run becomes one long-running transaction. To mitigate this, use programmatic transaction management and commit each batch separately:
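One way to sketch this is with Spring's `TransactionTemplate`, so each batch runs in its own short transaction (entity and repository names remain the hypothetical ones from before):

```java
import java.util.List;

import org.springframework.data.domain.PageRequest;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.support.TransactionTemplate;

@Component
public class PaymentJob {

    private static final int BATCH_SIZE = 100;

    private final PaymentRepository repository;
    private final TransactionTemplate transactionTemplate;

    public PaymentJob(PaymentRepository repository,
                      TransactionTemplate transactionTemplate) {
        this.repository = repository;
        this.transactionTemplate = transactionTemplate;
    }

    @Scheduled(fixedDelay = 60_000)
    public void processPendingPayments() {
        boolean moreWork = true;
        while (moreWork) {
            // Each iteration opens, commits, and closes its own transaction,
            // so no transaction stays open for the whole run.
            moreWork = Boolean.TRUE.equals(transactionTemplate.execute(status -> {
                List<Payment> batch = repository.findByStatus(
                        PaymentStatus.PENDING, PageRequest.of(0, BATCH_SIZE));
                batch.forEach(Payment::markProcessed);
                repository.saveAll(batch);
                // A full batch suggests there may be more pending rows.
                return batch.size() == BATCH_SIZE;
            }));
        }
    }
}
```

Because each batch commits independently, a crash loses at most the in-flight batch; already-committed batches do not need to be reprocessed.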
High availability
Use database pessimistic locking to ensure that no two nodes or threads process the same batch:
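With Spring Data JPA this can be sketched as a locked repository query; `@Lock(PESSIMISTIC_WRITE)` makes Hibernate issue `SELECT ... FOR UPDATE`, so the selected rows stay locked until the surrounding transaction commits:

```java
import java.util.List;

import jakarta.persistence.LockModeType;

import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

public interface PaymentRepository extends JpaRepository<Payment, Long> {

    // SELECT ... FOR UPDATE: rows in this batch are locked until the
    // enclosing transaction ends, so another node cannot pick them up.
    @Lock(LockModeType.PESSIMISTIC_WRITE)
    @Query("select p from Payment p where p.status = :status order by p.id")
    List<Payment> lockBatch(@Param("status") PaymentStatus status,
                            Pageable pageable);
}
```

The lock only protects the batch while the transaction is open, which is another reason to keep per-batch transactions short, as above.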
What if we want more throughput? The previous mitigation only ensures that a single job instance owns a given batch; every other instance blocks, waiting for that lock to be released.
To scale out, we can use the database's SKIP LOCKED locking clause, so each worker claims only unlocked rows instead of waiting for other transactions to commit.
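A sketch using a native query (PostgreSQL syntax; MySQL 8+ and Oracle support `SKIP LOCKED` as well). Each concurrent worker locks a disjoint set of rows, so multiple nodes process different batches in parallel:

```java
// Inside the same hypothetical PaymentRepository interface:

// SKIP LOCKED makes this SELECT ignore rows already locked by another
// transaction instead of blocking on them, so each worker claims its
// own distinct batch.
@Query(value = """
        select * from payment
        where status = :status
        order by id
        limit :batchSize
        for update skip locked
        """, nativeQuery = true)
List<Payment> claimBatch(@Param("status") String status,
                         @Param("batchSize") int batchSize);
```

This pattern effectively turns the table into a work queue: throughput scales with the number of workers, and no two workers ever hold the same row.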
The previous code sends its inserts and updates to the database one statement at a time. However, we can configure Hibernate to batch them, grouping multiple statements into a single round trip.
To enable it, we can configure Spring JPA like this:
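A minimal `application.properties` fragment; the batch size of 50 is an arbitrary illustrative value:

```properties
# Group up to 50 inserts/updates into one JDBC batch.
spring.jpa.properties.hibernate.jdbc.batch_size=50
# Reorder statements by entity type so Hibernate can actually batch them.
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
```

One caveat worth knowing: Hibernate silently disables insert batching for entities that use `GenerationType.IDENTITY` ids, because it must read back each generated key; sequence-based ids batch fine.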