Moving IO to the edges of your app: Functional Core, Imperative Shell - Scott Wlaschin

Abstract

  • Avoid non-pure code in core domain
    • Hard to test and unpredictable.
  • Instead, do (in order of preference)
    • Push I/O to edges (“dependency rejection”)
    • Or use dependency injection/parameterization
    • Or use the interpreter pattern (only if you benefit from the complexity)

Avoid I/O:

  • I/O is not part of the domain: makes business logic harder to comprehend
  • I/O is non-deterministic: different results each time, which makes it hard to test
  • I/O might fail: lots of code for handling exceptions

Favor domain-centric architecture instead:

--> I/O --> business logic / decisions --> I/O -->

Good vs bad designs

Good: has inputs and output. Bad: no input or output.

Design guidelines: write code that is:

  • comprehensible
    • everything it needs is explicitly provided as an input
    • it has an output
  • deterministic
    • same inputs, same outputs
  • has no side-effects

I/O does not meet those guidelines. We need I/O but we should keep it on the edges.
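
The guidelines can be illustrated with a small sketch. The class and method names below are hypothetical and not from the talk; the only point is that the first method has a hidden input (the system clock), while the second receives everything it needs as parameters.

```java
import java.time.LocalDate;

class BirthdayRules {
    // Impure: reads the system clock, a hidden input, so the same call
    // can return different results on different days.
    static boolean isBirthdayToday(LocalDate birthDate) {
        return isBirthday(birthDate, LocalDate.now());
    }

    // Pure: everything the function needs is an explicit parameter,
    // and the same inputs always give the same output.
    static boolean isBirthday(LocalDate birthDate, LocalDate today) {
        return birthDate.getMonth() == today.getMonth()
            && birthDate.getDayOfMonth() == today.getDayOfMonth();
    }
}
```

The pure variant can be tested with fixed dates; the impure one can only be tested reliably by controlling the clock.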

Traditional design vs imperative shell

A traditional design would abstract away the differences, i.e. create an interface for the I/O parts. E.g.

  • PersonRepository interface with 2 behaviors: read and save
  • CsvPersonRepository implementation that would read / write from a CSV
  • DbPersonRepository implementation that would read / write from a database

In the “imperative shell” approach, you just change the shell code. There is no need to create an abstraction: the business logic is unchanged and no interfaces are needed. There is also no need for a mock shell at all.
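
A minimal sketch of that idea, assuming a hypothetical `Person` record and an "adults only" rule that are not part of the talk. The core is a plain function over data; the CSV handling lives entirely in the shell, so moving to a database would mean writing a different shell while `PersonLogic` stays untouched.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

record Person(String name, int age) {}

// Functional core: no I/O and no repository interface, just data in and data out.
class PersonLogic {
    static List<Person> adults(List<Person> people) {
        return people.stream().filter(p -> p.age() >= 18).toList();
    }
}

// Imperative shell for CSV files (assumes "name,age" lines without a header).
class CsvShell {
    static void exportAdults(Path input, Path output) throws IOException {
        // I/O: read and parse
        List<Person> people = Files.readAllLines(input).stream()
            .map(line -> line.split(","))
            .map(fields -> new Person(fields[0].trim(), Integer.parseInt(fields[1].trim())))
            .toList();

        // pure code
        List<Person> adults = PersonLogic.adults(people);

        // I/O: write
        Files.write(output, adults.stream().map(p -> p.name() + "," + p.age()).toList());
    }
}
```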

Example of migrating to functional core imperative shell

Scenario: update user profile

  1. retrieve the existing profile data from the database
  2. compare the new profile data to the existing profile data
  3. if there are no changes: do nothing
  4. if either the name or email has changed: update the database
  5. if the email has changed: send a verification email to the new address

Impure implementation:

class ImpureImpl {
  // customerRepository and emailService are fields of this class (not shown)
  void updateCustomer(Customer newCustomer) {
    // I/O: load the current state from the database
    var existingCustomer = customerRepository.findById(newCustomer.id());
 
    // check for changes
    if (
      !existingCustomer.name().equals(newCustomer.name())
      || !existingCustomer.email().equals(newCustomer.email())
    ) {
      // I/O: update customer
      customerRepository.update(newCustomer);
    }
 
    // send verification email if email changed
    if (!existingCustomer.email().equals(newCustomer.email())) {
      var msg = new EmailMessage(newCustomer.email(), "Please verify your email");
      // I/O: send the email
      emailService.send(msg);
    }
  }
}

Pure implementation:

enum UpdateCustomerDecision {
  DO_NOTHING,
  UPDATE_CUSTOMER_ONLY,
  UPDATE_CUSTOMER_AND_SEND_EMAIL
}
 
record UpdateCustomerResult(
  UpdateCustomerDecision decision,
  Customer customer,
  EmailMessage message
) {}
 
class PureImpl {
  UpdateCustomerResult updateCustomer(Customer newCustomer, Customer existingCustomer) {
    // initial decision is DO_NOTHING
    var result = new UpdateCustomerResult(UpdateCustomerDecision.DO_NOTHING, null, null);
 
    // check for changes
    if (
      !existingCustomer.name().equals(newCustomer.name())
      || !existingCustomer.email().equals(newCustomer.email())
    ) {
      // change the decision
      result = new UpdateCustomerResult(UpdateCustomerDecision.UPDATE_CUSTOMER_ONLY, newCustomer, null);
    }
    
    // send verification email if email changed
    if (!existingCustomer.email().equals(newCustomer.email())) {
      // change the decision
      result = new UpdateCustomerResult(
        UpdateCustomerDecision.UPDATE_CUSTOMER_AND_SEND_EMAIL,
        newCustomer,
        new EmailMessage(newCustomer.email(), "Please verify your email")
      );
    }
    
    return result;
  }
}

Imperative shell:

class Shell {
  void update(Customer newCustomer) {
    // I/O
    var existingCustomer = customerRepository.findById(newCustomer.id());
    
    // pure code
    var result = pure.updateCustomer(newCustomer, existingCustomer);
    
    // I/O
    switch(result.decision()) {
      case DO_NOTHING:
        break;
      case UPDATE_CUSTOMER_ONLY:
        customerRepository.update(result.customer());
        break;
      case UPDATE_CUSTOMER_AND_SEND_EMAIL:
        customerRepository.update(result.customer());
        emailService.send(result.message());
        break;
    }
  }
}

  • pure code had explicit inputs and outputs
  • pure code was easy to test
  • pure code was not async
  • pure code did not need to handle exceptions
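
To make the "easy to test" point concrete, here is a hedged sketch of checks against the pure function, with no mocks and no test framework. To keep the block self-contained it restates the types in compact form; the field lists of `Customer` and `EmailMessage`, the `static` method, and the early-return style are assumptions, not the talk's code.

```java
record Customer(int id, String name, String email) {}
record EmailMessage(String to, String body) {}

enum UpdateCustomerDecision { DO_NOTHING, UPDATE_CUSTOMER_ONLY, UPDATE_CUSTOMER_AND_SEND_EMAIL }

record UpdateCustomerResult(UpdateCustomerDecision decision, Customer customer, EmailMessage message) {}

class PureImpl {
    // Same decisions as the pure implementation above, written with early returns.
    static UpdateCustomerResult updateCustomer(Customer newCustomer, Customer existingCustomer) {
        boolean nameChanged = !existingCustomer.name().equals(newCustomer.name());
        boolean emailChanged = !existingCustomer.email().equals(newCustomer.email());
        if (emailChanged) {
            return new UpdateCustomerResult(
                UpdateCustomerDecision.UPDATE_CUSTOMER_AND_SEND_EMAIL,
                newCustomer,
                new EmailMessage(newCustomer.email(), "Please verify your email"));
        }
        if (nameChanged) {
            return new UpdateCustomerResult(UpdateCustomerDecision.UPDATE_CUSTOMER_ONLY, newCustomer, null);
        }
        return new UpdateCustomerResult(UpdateCustomerDecision.DO_NOTHING, null, null);
    }
}

class PureImplTest {
    static void expect(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }

    public static void main(String[] args) {
        var existing = new Customer(1, "Ann", "ann@example.com");

        // No repository, no email service: just values in and a decision out.
        var unchanged = PureImpl.updateCustomer(existing, existing);
        expect(unchanged.decision() == UpdateCustomerDecision.DO_NOTHING, "no changes");

        var renamed = PureImpl.updateCustomer(new Customer(1, "Anne", "ann@example.com"), existing);
        expect(renamed.decision() == UpdateCustomerDecision.UPDATE_CUSTOMER_ONLY, "name changed");

        var moved = PureImpl.updateCustomer(new Customer(1, "Ann", "ann@example.org"), existing);
        expect(moved.decision() == UpdateCustomerDecision.UPDATE_CUSTOMER_AND_SEND_EMAIL, "email changed");
        expect(moved.message().to().equals("ann@example.org"), "verification goes to the new address");
    }
}
```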

Unit testing vs integration testing

If you do it this way, the boundaries are obvious:

--> I/O --> business logic / decisions --> I/O -->
            <----- unit tests ------->
    <----------- integration tests ---------->

  • You may want to emulate (fake/mock/stub) an I/O component sometimes.
  • You will never need mocks in pure code!

Important testing concepts:

  • system under test should be about business value
    • Test workflows, not classes!
  • test at the boundaries of a system.
    • No need to test internals.
    • Tests should be done at the workflow level.
  • A “unit” test means the test is isolated.
    • A “unit” produces no side effects and can be run in isolation. No I/O.
    • A “unit” is not a class!

Where does validation fit in?

  • It does not belong in your business logic.
  • You do it before you send to the business logic.
  • Validation happens at the edges.
  • There should be no need for null checks in the core domain.
  • If validation fails, you can skip the entire business logic.
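
One way to sketch validation at the edge, under the assumption of two hypothetical types: `RawProfile` for unchecked input and `ValidProfile` for input that has passed validation. The validation rules themselves are placeholders.

```java
import java.util.Optional;

record RawProfile(String name, String email) {}    // unvalidated input, fields may be null
record ValidProfile(String name, String email) {}  // only produced by the validator

// Edge: turns raw input into a validated value, or into nothing.
class ProfileValidator {
    static Optional<ValidProfile> validate(RawProfile raw) {
        if (raw == null || raw.name() == null || raw.name().isBlank()) return Optional.empty();
        if (raw.email() == null || !raw.email().contains("@")) return Optional.empty();
        return Optional.of(new ValidProfile(raw.name().trim(), raw.email().trim()));
    }
}

// Core: accepts only validated data, so it contains no null checks.
class ProfileCore {
    static String greeting(ValidProfile profile) {
        return "Hello, " + profile.name();
    }
}

// Shell: validate first; if validation fails, the business logic is skipped entirely.
class ProfileShell {
    static String handle(RawProfile raw) {
        return ProfileValidator.validate(raw)
            .map(ProfileCore::greeting)
            .orElse("invalid profile");
    }
}
```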

What about ORMs? Unit of Work?

Heavy ORMs don’t fit this approach.

void registerUser(String name) {
  var account = new Account(name);
  account.save(); // I/O in the middle
  var ackEmail = new Email("Welcome");
  ackEmail.send(); // I/O in the middle
}

These kinds of methods with hidden I/O are common in OO designs (especially with ORMs).

Unit of work doesn’t fit this approach:

void registerUser(String name, DbContext dbContext) {
  var account = new Account(name);
  dbContext.addAccount(account);
  var ackEmail = new Email("Welcome");
  dbContext.addEmail(ackEmail);
  dbContext.saveChanges();
}

Unit of Work is database centric and is hard to test.

With a pure core and an imperative shell, you don’t need a heavy ORM.
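
As a rough sketch, the `registerUser` example could be split as follows. The `Registration` record and the in-memory lists are assumptions made for illustration; in a real system the shell would call a database and a mail service instead.

```java
import java.util.ArrayList;
import java.util.List;

record Account(String name) {}
record Email(String subject) {}

// The decision made by the core: what should be saved and what should be sent.
record Registration(Account account, Email ackEmail) {}

class RegistrationCore {
    // Pure: builds the values but does not save or send anything.
    static Registration registerUser(String name) {
        return new Registration(new Account(name), new Email("Welcome"));
    }
}

class RegistrationShell {
    // Stand-ins for a database table and an outgoing mail queue.
    final List<Account> savedAccounts = new ArrayList<>();
    final List<Email> sentEmails = new ArrayList<>();

    void registerUser(String name) {
        // pure code
        var registration = RegistrationCore.registerUser(name);

        // I/O, all at the end instead of in the middle
        savedAccounts.add(registration.account());
        sentEmails.add(registration.ackEmail());
    }
}
```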

What if I really need I/O in the middle of the workflow?

--> I/O --> pure code --> I/O --> pure code --> I/O -->

Keep them separate.
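
A hedged sketch of such a pipeline, using an invented pricing scenario: the second lookup is only needed after a pure step has run, so the shell alternates between I/O and pure code without mixing them. The maps stand in for a price database and a discount service.

```java
import java.util.Map;

// Pure steps: plain arithmetic and decisions, no lookups.
class QuoteCore {
    static int subtotalCents(int unitPriceCents, int quantity) {
        return unitPriceCents * quantity;
    }

    static boolean qualifiesForDiscount(int subtotalCents) {
        return subtotalCents >= 10_000;
    }

    static int applyDiscount(int subtotalCents, int discountPercent) {
        return subtotalCents - subtotalCents * discountPercent / 100;
    }
}

class QuoteShell {
    private final Map<String, Integer> priceTable;     // stand-in for a price database
    private final Map<String, Integer> discountTable;  // stand-in for a discount service

    QuoteShell(Map<String, Integer> priceTable, Map<String, Integer> discountTable) {
        this.priceTable = priceTable;
        this.discountTable = discountTable;
    }

    int quote(String sku, int quantity, String customerId) {
        int unitPrice = priceTable.getOrDefault(sku, 0);               // I/O
        int subtotal = QuoteCore.subtotalCents(unitPrice, quantity);   // pure code
        if (!QuoteCore.qualifiesForDiscount(subtotal)) {               // pure code
            return subtotal;
        }
        int percent = discountTable.getOrDefault(customerId, 0);       // I/O in the middle
        return QuoteCore.applyDiscount(subtotal, percent);             // pure code
    }
}
```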

Code smells for your domain code

  • Is it async? You’re doing I/O somewhere.
  • Is it catching exceptions? You’re (probably) doing I/O somewhere.
  • Is it throwing exceptions? Why not use a proper return value?
  • If any of these are true, it is time to refactor!
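
For the "proper return value" point, one possible shape is a small result type with one record per outcome. The withdrawal scenario and all names here are hypothetical and only serve to show the pattern.

```java
// Each outcome is a value the caller has to handle, instead of an exception.
interface WithdrawResult {}
record Withdrawn(int newBalanceCents) implements WithdrawResult {}
record InsufficientFunds(int shortfallCents) implements WithdrawResult {}

class AccountCore {
    // Pure: not async, no try/catch, no throw.
    static WithdrawResult withdraw(int balanceCents, int amountCents) {
        if (amountCents > balanceCents) {
            return new InsufficientFunds(amountCents - balanceCents);
        }
        return new Withdrawn(balanceCents - amountCents);
    }
}
```

The shell can then inspect the result and decide which I/O to perform, in the same way the `Shell` class switches on `UpdateCustomerDecision`.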

Keep your I/O at a distance.

Other ways of managing I/O dependencies

What is a dependency?

  • something that introduces non-determinism
    • e.g. I/O, random numbers, current time
    • could also be something like strategy pattern
  • anything else is not a “dependency”
    • don’t mock other classes if they are pure!

5 strategies to manage dependencies:

1. Dependency rejection

Don’t have any! Keep them away from pure domain code!

Pros:

  • easy to comprehend, easy to test

Cons:

  • some extra work to document “decisions”

2. Dependency retention

  • Just don’t worry about them.
  • Hard-code all the file and database access!
  • Used by some simple frameworks (e.g. Ruby on Rails, Django).
  • Appropriate for data-heavy activities:
    • ETL and data transformation pipelines
    • data science
    • exploratory coding
    • throw-away scripts

3. Dependency injection

Pass dependencies into classes as a whole.

Beware of interface creep, where the interface has too many methods! In this case, prefer dependency parameterization.

Pros:

  • Well understood.
  • Constructor injection used by many frameworks.

Cons:

  • Interface creep.
    • Laziness leads to doing the wrong thing.
  • Interfaces are often not fine-grained.
    • E.g. a Database interface has tens of methods that are not needed.
    • Danger of unintentional access.
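
A minimal constructor-injection sketch, assuming a hypothetical `CustomerStore` interface. It also shows the "unintentional access" risk: `EmailUpdater` only needs two methods, but it receives the whole interface, including `delete`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

interface CustomerStore {
    Optional<String> findEmail(int customerId);
    void saveEmail(int customerId, String email);
    void delete(int customerId); // not needed by EmailUpdater, but still reachable from it
}

// In-memory implementation, standing in for a real database.
class InMemoryCustomerStore implements CustomerStore {
    private final Map<Integer, String> emails = new HashMap<>();

    public Optional<String> findEmail(int customerId) {
        return Optional.ofNullable(emails.get(customerId));
    }

    public void saveEmail(int customerId, String email) {
        emails.put(customerId, email);
    }

    public void delete(int customerId) {
        emails.remove(customerId);
    }
}

class EmailUpdater {
    private final CustomerStore store;

    // The dependency is passed in as a whole through the constructor.
    EmailUpdater(CustomerStore store) {
        this.store = store;
    }

    // Returns true if the stored email was changed.
    boolean updateEmail(int customerId, String newEmail) {
        boolean unchanged = store.findEmail(customerId).map(newEmail::equals).orElse(false);
        if (unchanged) return false;
        store.saveEmail(customerId, newEmail);
        return true;
    }
}
```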

4. Dependency parameterization

Pass only the dependencies needed for a particular function/method.

Pros:

  • Dependencies are explicit.
  • Pass in focused functions, not whole interfaces.
  • Prevents interface creep.
    • Laziness leads to doing the right thing.

Cons:

  • Hard to work with deeply nested code.
  • Doesn’t help comprehension.
    • Only use if moving I/O to edges is not applicable.
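
A sketch of parameterization for an email-update operation, with hypothetical names: the method receives only the two functions it needs rather than a whole store interface.

```java
import java.util.Optional;
import java.util.function.BiConsumer;
import java.util.function.Function;

class EmailUpdate {
    // Only the two capabilities this operation needs are passed in.
    // Returns true if the stored email was changed.
    static boolean updateEmail(
            int customerId,
            String newEmail,
            Function<Integer, Optional<String>> findEmail,
            BiConsumer<Integer, String> saveEmail) {
        boolean unchanged = findEmail.apply(customerId).map(newEmail::equals).orElse(false);
        if (unchanged) return false;
        saveEmail.accept(customerId, newEmail);
        return true;
    }
}
```

A caller can supply lambdas or method references, for example a lookup and a `put` on a `HashMap` in a test. Note that the method still performs I/O through its parameters, which is why this does not help comprehension the way moving I/O to the edges does.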

5. Dependency interpretation

Build a domain specific language and interpret it. Replace calls to dependencies with “instructions” that are interpreted later. A separate interpreter processes the “program”.
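
A very small sketch of the idea, reusing the profile-update scenario. The instruction records and the logging interpreter are assumptions for illustration; a production interpreter would call the database and the email service instead of building log lines.

```java
import java.util.ArrayList;
import java.util.List;

// The "instructions" of the tiny domain-specific language.
interface Instruction {}
record UpdateCustomer(String name, String email) implements Instruction {}
record SendEmail(String to, String body) implements Instruction {}

class ProfileProgram {
    // Pure: returns a program (a list of instructions) and performs no I/O.
    static List<Instruction> updateProfile(String oldName, String oldEmail, String newName, String newEmail) {
        var program = new ArrayList<Instruction>();
        boolean emailChanged = !oldEmail.equals(newEmail);
        if (emailChanged || !oldName.equals(newName)) {
            program.add(new UpdateCustomer(newName, newEmail));
        }
        if (emailChanged) {
            program.add(new SendEmail(newEmail, "Please verify your email"));
        }
        return List.copyOf(program);
    }
}

// One possible interpreter: it only records what it would do.
class LoggingInterpreter {
    static List<String> run(List<Instruction> program) {
        var log = new ArrayList<String>();
        for (Instruction instruction : program) {
            if (instruction instanceof UpdateCustomer update) {
                log.add("UPDATE " + update.name() + " <" + update.email() + ">");
            } else if (instruction instanceof SendEmail send) {
                log.add("EMAIL " + send.to());
            }
        }
        return log;
    }
}
```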

Pros:

  • Pure domain-oriented API.
  • No I/O at all.
  • Completely decoupled from any API changes.
  • Optimization possible (e.g. merging instructions).
  • Multiple interpreters possible.

Cons: