Moving IO to the edges of your app: Functional Core, Imperative Shell - Scott Wlaschin
Abstract
- Avoid non-pure code in core domain
- Hard to test and unpredictable.
- Instead, do (in order of preference)
- Push I/O to edges (“dependency rejection”)
- Or use dependency injection/parameterization
- Or use the interpreter pattern (only if you benefit from the complexity)
Avoid I/O:
- I/O is not part of the domain: makes business logic harder to comprehend
- I/O is non-deterministic: different results each time ⇒ hard to test
- I/O might fail: lots of code for handling exceptions
Favor domain-centric architecture instead:
--> I/O --> business logic / decisions --> I/O -->
Good vs bad designs
Good: has inputs and output. Bad: no input or output.
Design guidelines: write code that is:
- comprehensible
- everything it needs is explicitly provided as an input
- it has an output
- deterministic
- same inputs ⇒ same outputs
- has no side-effects
I/O does not meet those guidelines. We need I/O but we should keep it on the edges.
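A minimal sketch of the guidelines above, in Python. The greeting example and all names in it are assumptions made for illustration, not taken from the talk. The first function has no input and no output and reads the clock; the second takes everything it needs as a parameter and returns its result.

```python
from datetime import datetime


# Harder to comprehend and test: no input, no output, hidden dependency on the clock.
def greet_impure() -> None:
    hour = datetime.now().hour  # hidden, non-deterministic input
    print("Good morning" if hour < 12 else "Good afternoon")  # side effect


# Meets the guidelines: explicit input, explicit output, deterministic, no side effects.
def greeting(hour: int) -> str:
    return "Good morning" if hour < 12 else "Good afternoon"


# The I/O stays at the edge: read the clock, call the pure function, print the result.
def main() -> None:
    print(greeting(datetime.now().hour))
```

Only `main` touches the outside world, so `greeting` can be checked with plain values.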
Traditional design vs imperative shell
A traditional design would abstract away the differences, i.e. create an interface for the I/O parts. E.g.
- PersonRepository: interface with 2 behaviors: read and save
- CsvPersonRepository: implementation that would read / write from a CSV
- DbPersonRepository: implementation that would read / write from a database
In the “imperative shell” approach, just change the shell code. The business logic is unchanged, so no abstraction and no interfaces are needed. And there is no need for a mock shell at all.
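One way this could look, as a sketch under assumptions: the `Person` type, the `adults` rule, and the `person` table are hypothetical. The core function is shared, and each data source gets its own thin shell function instead of a repository interface.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    age: int


# Functional core: knows nothing about where the people come from.
def adults(people: list[Person]) -> list[Person]:
    return [p for p in people if p.age >= 18]


# Shell variant 1: read from CSV text, then call the core.
def adults_from_csv(text: str) -> list[Person]:
    rows = csv.DictReader(io.StringIO(text))
    return adults([Person(r["name"], int(r["age"])) for r in rows])


# Shell variant 2: read from a database, then call the same core.
def adults_from_db(conn: sqlite3.Connection) -> list[Person]:
    rows = conn.execute("SELECT name, age FROM person").fetchall()
    return adults([Person(name, age) for name, age in rows])
```

Switching from CSV to a database means writing a different shell function; `adults` is not touched.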
Example of migrating to functional core imperative shell
Scenario: update user profile
- retrieve the existing profile data from the database
- compare the new profile data to the existing profile data
- if there are no changes: do nothing
- if either the name or email has changed: update the database
- if the email has changed: send a verification email to the new address
Impure implementation:
Pure implementation:
- pure code had explicit inputs and outputs
- pure code was easy to test
- pure code was not async
- pure code did not need to handle exceptions
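The scenario above could be sketched like this in Python. This is not the code from the talk: the `Profile` and `Decision` types and the `load`, `save`, and `send_verification` callables are assumptions for illustration. The pure function only returns a decision; the shell does the I/O before and after it.

```python
from dataclasses import dataclass
from enum import Enum, auto


@dataclass(frozen=True)
class Profile:
    user_id: int
    name: str
    email: str


class Decision(Enum):
    NO_ACTION = auto()
    UPDATE_PROFILE_ONLY = auto()
    UPDATE_PROFILE_AND_NOTIFY = auto()


# Functional core: explicit inputs and output, not async, nothing to catch.
def decide(existing: Profile, new: Profile) -> Decision:
    if new.email != existing.email:
        return Decision.UPDATE_PROFILE_AND_NOTIFY
    if new.name != existing.name:
        return Decision.UPDATE_PROFILE_ONLY
    return Decision.NO_ACTION


# Imperative shell: load, decide, then act on the decision.
# `load`, `save`, and `send_verification` are hypothetical I/O callables.
def update_profile(new: Profile, load, save, send_verification) -> Decision:
    existing = load(new.user_id)          # I/O
    decision = decide(existing, new)      # pure
    if decision is not Decision.NO_ACTION:
        save(new)                         # I/O
    if decision is Decision.UPDATE_PROFILE_AND_NOTIFY:
        send_verification(new.email)      # I/O
    return decision
```

All three branches of the scenario are covered by `decide`, which can be tested with plain `Profile` values.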
Unit testing vs integration testing
If you do it this way, the boundaries are obvious:
--> I/O --> business logic / decisions --> I/O -->
<----- unit tests ------->
<----------- integration tests ---------->
- You may want to emulate (fake/mock/stub) an I/O component sometimes.
- You will never need mocks in pure code!
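A sketch of the two kinds of test, assuming a hypothetical shipping-fee rule (the rule, the numbers, and the function names are made up for illustration). The unit test calls the pure function with plain values; the integration-style test emulates the I/O with in-memory dictionaries.

```python
# Pure core: the decision depends only on its arguments.
def shipping_fee(order_total: float, is_member: bool) -> float:
    if is_member or order_total >= 50:
        return 0.0
    return 4.99


# Shell: `load_order` and `save_fee` are hypothetical I/O callables.
def charge_shipping(order_id: int, load_order, save_fee) -> float:
    order = load_order(order_id)                             # I/O
    fee = shipping_fee(order["total"], order["is_member"])   # pure
    save_fee(order_id, fee)                                  # I/O
    return fee


# Unit test: no mocks, just values in and values out.
def test_shipping_fee() -> None:
    assert shipping_fee(20.0, False) == 4.99
    assert shipping_fee(20.0, True) == 0.0
    assert shipping_fee(50.0, False) == 0.0


# Integration-style test: the I/O components are faked with dicts.
def test_charge_shipping() -> None:
    orders = {1: {"total": 20.0, "is_member": False}}
    fees: dict[int, float] = {}
    charge_shipping(1, orders.__getitem__, fees.__setitem__)
    assert fees == {1: 4.99}
```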
Important testing concepts:
- system under test should be about business value
- Test workflows, not classes!
- test at the boundaries of a system.
- No need to test internals.
- Tests should be done at the workflow level.
- A “unit” test means the test is isolated.
- A “unit” produces no side effects and can be run in isolation. No I/O.
- A “unit” is not a class!
Where does validation fit in?
- It does not belong in your business logic.
- You do it before you send to the business logic.
- Validation happens at the edges.
- Should be no need for null checking in the core domain.
- If validation fails, you can skip the entire business logic.
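A rough sketch of validation at the edge, assuming a hypothetical signup request (the field names and error messages are illustrative). The shell validates raw input into a domain type; the core only ever receives the validated type, so it needs no null checks.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SignupRequest:
    # Validated domain type: both fields are guaranteed to be non-empty strings.
    name: str
    email: str


# Edge: turn untrusted input into a domain value or a list of errors.
def validate(raw: dict) -> Union[SignupRequest, list[str]]:
    errors = []
    name = (raw.get("name") or "").strip()
    email = (raw.get("email") or "").strip()
    if not name:
        errors.append("name is required")
    if "@" not in email:
        errors.append("email is invalid")
    return errors if errors else SignupRequest(name, email)


# Core: no validation and no null checks here.
def welcome_message(req: SignupRequest) -> str:
    return f"Welcome, {req.name}! We will write to {req.email}."


# Shell: if validation fails, the business logic is skipped entirely.
def handle(raw: dict) -> str:
    result = validate(raw)
    if isinstance(result, list):
        return "Rejected: " + "; ".join(result)
    return welcome_message(result)
```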
What about ORMs? Unit of Work?
Heavy ORMs don’t fit this approach.
These kinds of methods with hidden I/O are common in OO designs (especially with ORMs).
Unit of work doesn’t fit this approach:
Unit of Work is database centric and is hard to test.
⇒ with pure core & imperative shell, you don’t need a heavy ORM.
What if I really need I/O in the middle of the workflow?
--> I/O --> pure code --> I/O --> pure code --> I/O -->
Keep them separate.
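A sketch of that alternating shape, assuming a hypothetical order workflow that sometimes needs a credit score partway through (the threshold, the score cutoff, and the callables are all invented for illustration). The I/O and the pure steps stay in separate functions; only the shell interleaves them.

```python
from typing import Optional

CREDIT_CHECK_THRESHOLD = 1000.0  # hypothetical business rule


# Pure step 1: decide whether more data is needed.
def needs_credit_check(order_total: float) -> bool:
    return order_total > CREDIT_CHECK_THRESHOLD


# Pure step 2: decide the final status from the data gathered so far.
def decide_status(credit_score: Optional[int]) -> str:
    if credit_score is None:
        return "approved"
    return "approved" if credit_score >= 600 else "rejected"


# Shell: I/O -> pure -> I/O -> pure -> I/O.
# `load_total`, `fetch_credit_score`, and `save_status` are hypothetical I/O callables.
def process_order(order_id: int, load_total, fetch_credit_score, save_status) -> str:
    total = load_total(order_id)              # I/O
    score = None
    if needs_credit_check(total):             # pure
        score = fetch_credit_score(order_id)  # I/O in the middle
    status = decide_status(score)             # pure
    save_status(order_id, status)             # I/O
    return status
```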
Code smells for your domain code
- Is it async? You’re doing I/O somewhere.
- Is it catching exceptions? You’re (probably) doing I/O somewhere.
- Is it throwing exceptions? Why not use a proper return value?
- If any of these are true, it’s time to refactor!
Keep your I/O at a distance.
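For the "throwing exceptions" smell, one possible shape of a "proper return value" is a small result type. This is a sketch; the `Ok` and `Err` classes and the withdrawal rule are assumptions, not something shown in the talk.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: float


@dataclass(frozen=True)
class Err:
    reason: str


# The failure cases are part of the return type, so callers cannot overlook them.
def withdraw(balance: float, amount: float) -> Union[Ok, Err]:
    if amount <= 0:
        return Err("amount must be positive")
    if amount > balance:
        return Err("insufficient funds")
    return Ok(balance - amount)
```

The caller inspects the result with `isinstance` (or pattern matching) instead of wrapping the call in `try`/`except`.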
Other ways of managing I/O dependencies
What is a dependency?
- something that introduces non-determinism
- e.g. I/O, random numbers, current time
- could also be something like strategy pattern
- anything else is not a “dependency”
- don’t mock other classes if they are pure!
5 strategies to manage dependencies:
1. Dependency rejection
Don’t have any! Keep them away from pure domain code!
Pros:
- easy to comprehend, easy to test
Cons:
- some extra work to document “decisions”
2. Dependency retention
- Just don’t worry about them.
- Hard-code all the file and database access!
- Used by some simple frameworks (e.g. Ruby on Rails, Django).
- Appropriate for data-heavy activities:
- ETL and data transformation pipelines
- data science
- exploratory coding
- throw-away scripts
3. Dependency injection
Pass dependencies into classes as a whole.
Beware of interface creep, where the interface has too many methods! In this case, prefer dependency parameterization.
Pros:
- Well understood.
- Constructor injection used by many frameworks.
Cons:
- Interface creep.
- Laziness ⇒ doing the wrong thing
- Interfaces often not fine grained.
- E.g. Database has tens of methods that are not needed.
- Danger of unintentional access.
- E.g.
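A sketch of constructor injection and the interface-creep risk described above. The `Database` protocol, its methods, and the `Greeter` class are hypothetical and only serve to illustrate the point.

```python
from typing import Protocol


# Hypothetical wide interface: the whole thing is injected, needed or not.
class Database(Protocol):
    def get_customer(self, customer_id: int) -> dict: ...
    def save_customer(self, customer: dict) -> None: ...
    def delete_all_customers(self) -> None: ...


class Greeter:
    # Constructor injection: the dependency is passed in as a whole.
    def __init__(self, db: Database) -> None:
        self._db = db

    def greet(self, customer_id: int) -> str:
        # Only get_customer is needed, but save and delete are reachable too.
        return f"Hello, {self._db.get_customer(customer_id)['name']}"


# In-memory implementation, usable in tests.
class InMemoryDatabase:
    def __init__(self, customers: dict[int, dict]) -> None:
        self.customers = customers

    def get_customer(self, customer_id: int) -> dict:
        return self.customers[customer_id]

    def save_customer(self, customer: dict) -> None:
        self.customers[customer["id"]] = customer

    def delete_all_customers(self) -> None:
        self.customers.clear()
```

Nothing in the signature of `Greeter` says which of the three methods it uses, which is where the danger of unintentional access comes from.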
4. Dependency parameterization
Pass only the dependencies needed for a particular function/method.
Pros:
- Dependencies are explicit.
- Pass in focused functions, not whole interfaces.
- Prevents interface creep.
- Laziness ⇒ doing the right thing.
Cons:
- Hard to work with deeply nested code.
- Doesn’t help comprehension.
- Only use if moving I/O to edges is not applicable.
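A minimal sketch of parameterization: the function receives only the one lookup it needs, as a function argument. The names `greet` and `get_name` are assumptions for illustration.

```python
from typing import Callable


# The single dependency is explicit in the signature: a function from id to name.
def greet(customer_id: int, get_name: Callable[[int], str]) -> str:
    return f"Hello, {get_name(customer_id)}"


# In production this could wrap a database query; in a test a dict lookup is enough.
names = {1: "Ann", 2: "Bob"}
message = greet(1, names.__getitem__)
```

There is nothing else to reach for, so the lazy path is also the correct one.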
5. Dependency interpretation
Build a domain specific language and interpret it. Replace calls to dependencies with “instructions” that are interpreted later. A separate interpreter processes the “program”.
Pros:
- Pure domain-oriented API.
- No I/O at all.
- Completely decoupled from any API changes.
- Optimization possible (e.g. merging instructions).
- Multiple interpreters possible.
Cons:
- Complex
- Best with limited set of operations.
- E.g. Twitter Stitch library, Facebook’s Haxl
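A toy sketch of the interpreter idea, using a Python generator as the "program". This is one possible encoding chosen for brevity, not how Stitch or Haxl work; the instruction types and handler table are hypothetical. The program only yields instructions; a separate interpreter decides what each one means.

```python
from dataclasses import dataclass


# Instructions: plain data describing what should happen, with no I/O.
@dataclass(frozen=True)
class ReadEmail:
    customer_id: int


@dataclass(frozen=True)
class SendEmail:
    to: str
    body: str


# The "program": yields instructions and receives each result back.
def welcome_program(customer_id: int):
    email = yield ReadEmail(customer_id)
    yield SendEmail(to=email, body="Welcome back!")


# The interpreter: looks up a handler per instruction type and feeds results back in.
def run(program, handlers) -> None:
    result = None
    try:
        while True:
            instruction = program.send(result)
            result = handlers[type(instruction)](instruction)
    except StopIteration:
        pass


# A test interpreter: in-memory lookups instead of a database and a mail server.
outbox: list[tuple[str, str]] = []
test_handlers = {
    ReadEmail: lambda i: {1: "ann@example.com"}[i.customer_id],
    SendEmail: lambda i: outbox.append((i.to, i.body)),
}
run(welcome_program(1), test_handlers)
```

A production interpreter would supply a different handler table for the same program, which is the "multiple interpreters possible" point above.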