From CRUD to CQRS with Dropwizard (Part 2)

Part 2: Asynchronous Command Handling

This is the second part of a multi-week series of posts describing various implementations of Command Query Responsibility Segregation (CQRS) and Event Sourcing using Dropwizard. Week 1 can be found here. As I mentioned last week, CQRS is a design pattern where commands modifying data are separated from queries requesting it, and the data is denormalized into structures matching the Data Transfer Objects (DTOs) that these queries request. If you want to learn more about CQRS, read Martin Fowler’s blog post on the subject.

Picking up from last week, recall the initial CQRS application we built. It was entirely synchronous: we didn’t send a response to the client until we had already denormalized and written the changes they requested to all data stores. In contrast, this week’s application looks like this:


[Diagram: the asynchronous CQRS application]

The major difference, as you can see, is that instead of directly processing the commands created from the request, we write those commands to a message bus (after validating the request) and then asynchronously read the commands back off the message bus, handle them, and write our data. Handling a command consists of retrieving the existing state of the entity, applying the command, and generating an event if there is a delta. The event should encapsulate the change in the entity’s state and include some mechanism for ensuring idempotency (we’ll get into this more in a future post). So, for example, if a command tells our service to create a product, the event generated upon successful execution of the command would be a ProductCreated event containing the data for the new product’s current state.
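To make this concrete, here is a rough sketch of a handler for that case. The types below are illustrative stand-ins (compact records for brevity), not the exact classes from the linked project:

```java
import java.util.Optional;
import java.util.UUID;

// Stand-in domain types; the real ones live in the linked project.
record CreateProductCommand(String sku, String name) {}
record Product(String sku, String name) {}
record ProductCreated(UUID eventId, Product product) {}

interface ProductRepository {
    Optional<Product> findBySku(String sku);
    void save(Product product);
}

public class CreateProductCommandHandler {

    private final ProductRepository repository; // Mongo-backed source of truth

    public CreateProductCommandHandler(ProductRepository repository) {
        this.repository = repository;
    }

    public Optional<ProductCreated> handle(CreateProductCommand command) {
        // Retrieve the existing state of the entity.
        Optional<Product> existing = repository.findBySku(command.sku());
        if (existing.isPresent()) {
            return Optional.empty(); // no delta, so no event
        }
        // Apply the command and persist the new state.
        Product product = new Product(command.sku(), command.name());
        repository.save(product);
        // The event carries the new state plus an ID consumers can use for idempotency.
        return Optional.of(new ProductCreated(UUID.randomUUID(), product));
    }
}
```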

I’ve created another small Dropwizard application to demonstrate this pattern using Mongo and Kafka. You can find the code and instructions on how to run it in IntelliJ and send commands to it via Postman here.

The steps for a data update request (command) are (a code sketch follows the list):

  1. HTTP request is received to change entity
  2. Request is validated
  3. Request is translated to command(s)
  4. Command(s) are written to message bus
  5. Response is sent to client
  6. Command(s) are pulled off of message bus
  7. Command(s) are handled
    1. Existing entity is retrieved and command(s) are validated
    2. Command(s) are applied to entity and delta is determined
    3. If there’s a delta:
      1. The entity is updated
      2. Event(s) are generated
  8. If the Command results in Event(s):
    1. The event(s) are denormalized to relevant data sources
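Steps 4 and 6 are the new ones this week. With the Kafka client library, they might look roughly like this; the topic name, the string (de)serialization, and the handleCommand hook are all illustrative:

```java
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CommandBus {

    // Step 4: after validating the request, write the command to the bus.
    // Keying by SKU means all commands for one product land on one partition.
    static void publish(KafkaProducer<String, String> producer,
                        String sku, String commandJson) {
        producer.send(new ProducerRecord<>("product-commands", sku, commandJson));
    }

    // Step 6: a separate loop pulls commands off the bus and handles them
    // (steps 7 and 8 happen inside handleCommand, elided here).
    static void consume(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(Collections.singletonList("product-commands"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                handleCommand(record.value());
            }
        }
    }

    static void handleCommand(String commandJson) {
        // deserialize the command and run it through the command handler
    }
}
```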

The steps for a data retrieval request (query) are the same as last week:

  1. HTTP request is made to retrieve entity
  2. Entity is retrieved via key lookup

There are now certainly more moving pieces, and we have changed the user experience. Instead of being able to respond to the client and tell them we’ve completed their request, we can only tell them that their request has been received and appears valid to the best of our current knowledge.

Comparing this approach to last week’s, there are some additional drawbacks:

  • Client doesn’t know when command is handled
  • Additional technology: a message bus
  • Additional complexity: writing to and reading from the message bus

But there are also additional benefits:

  • Write requests are validated and a response is sent to the client more quickly
  • The handling of commands is decoupled from the handling of requests
    • If the format of the request changes tomorrow, the command handling service doesn’t need to change
    • At some point in the future, the client could write directly to the message bus, or commands could be written from several sources, and the command handler wouldn’t change
    • The location of either the request handler or the command handler could change at any time with no need for service discovery or reconfiguration
  • If our message bus can be partitioned (Kafka) then we can horizontally scale out our command handlers and ensure that all commands for a given partition key (e.g., a SKU ID) are handled by the same instance of our command handler
  • If our message bus persists messages (Kafka) then we can replay requests for debugging or disaster recovery by using another consumer or moving the offset back (see the sketch after this list)
    • Transient errors during command handling would automatically be retried with the correct Kafka settings in place.
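To sketch the replay idea: rewinding a consumer to the beginning of a partition redelivers every persisted command. The topic name and partition number are placeholders:

```java
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class CommandReplayer {

    // Rewind to the start of a partition to replay every command,
    // e.g. for debugging or for rebuilding state after a disaster.
    static void replayFromBeginning(KafkaConsumer<String, String> consumer) {
        TopicPartition partition = new TopicPartition("product-commands", 0);
        consumer.assign(Collections.singletonList(partition));
        consumer.seekToBeginning(Collections.singletonList(partition));
        // ...then poll() as usual; each command is redelivered in order
    }
}
```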

Coming up in part 3: CQRS with eventually consistent data denormalization.


From CRUD to CQRS with Dropwizard (Part 1)

Part 1: Synchronicity Everywhere

This is the first part of a multi-week series of posts describing various implementations of Command Query Responsibility Segregation (CQRS) and Event Sourcing using Dropwizard. CQRS is a design pattern where commands modifying data are separated from queries requesting it, and the data is denormalized into structures matching the Data Transfer Objects (DTOs) that these queries request. I’m not going to get deep into the details of CQRS here; if you want to learn more, I highly recommend Martin Fowler’s blog post on the subject. But here is a quick comparison between CRUD (Create, Read, Update, Delete) and CQRS. A typical CRUD application looks like this:

As you can see, there’s a User Interface that writes data to and requests data from an API, which in turn persists it to and retrieves it from a data store.

In contrast, here’s a basic CQRS application:

[Diagram: a basic CQRS application]

The major difference, as you can see, is that we are separating the source of truth written to by the API from the projection read by the API. A denormalizer keeps the two in sync. In future weeks we’ll introduce asynchronicity, message buses like Kafka, and eventual consistency. But for our initial purposes, we will assume that this denormalization is done synchronously with the update of the source of truth, before the response is sent to the user interface.

I’ve created a small Dropwizard application to demonstrate this pattern using Mongo. You can find the code and instructions on how to run it in IntelliJ and send commands to it via Postman here.

The steps for a data update request (command) are (a code sketch follows the list):

  1. HTTP request is received by the API to change entity
  2. Request is translated to command(s)
  3. Command(s) are handled
  4. Existing entity is retrieved and command(s) are validated
  5. Command(s) are applied to entity and delta is determined
  6. If there’s a delta:
    1. The entity is updated
    2. Event(s) are generated
  7. If the Command results in Event(s):
    1. The event(s) are denormalized to relevant data sources
  8. Response is sent to client
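Put together, the synchronous write path might look roughly like the following Dropwizard (JAX-RS) resource. The domain types are compact stand-ins for the real ones in the linked project:

```java
import java.util.Optional;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

// Stand-in types for brevity.
record ProductUpdateRequest(String name) {}
record UpdateProductCommand(String sku, String name) {}
record ProductUpdated(String sku, String name) {}

interface UpdateProductCommandHandler {
    Optional<ProductUpdated> handle(UpdateProductCommand command);
}

interface ProductDenormalizer {
    void denormalize(ProductUpdated event);
}

@Path("/products")
public class ProductResource {

    private final UpdateProductCommandHandler commandHandler;
    private final ProductDenormalizer denormalizer;

    public ProductResource(UpdateProductCommandHandler commandHandler,
                           ProductDenormalizer denormalizer) {
        this.commandHandler = commandHandler;
        this.denormalizer = denormalizer;
    }

    @PUT
    @Path("/{sku}")
    public Response update(@PathParam("sku") String sku, ProductUpdateRequest request) {
        // Step 2: translate the request into a command.
        UpdateProductCommand command = new UpdateProductCommand(sku, request.name());
        // Steps 3-6: validate, apply, and persist if there is a delta.
        Optional<ProductUpdated> event = commandHandler.handle(command);
        // Step 7: denormalize synchronously, before we respond.
        event.ifPresent(denormalizer::denormalize);
        // Step 8: only now does the client hear back.
        return Response.ok().build();
    }
}
```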

And the steps for a data retrieval request (query) are:

  1. HTTP request is made to retrieve entity
  2. Entity is retrieved via key lookup

As you can see, data changes require a few extra steps but data retrieval is extremely simple. This would still be the case regardless of how many sources of truth must be denormalized and combined to form the document we need for the query.
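For illustration, here is what that key lookup could look like with the Mongo Java driver, assuming a products collection keyed by SKU:

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

public class ProductQuery {

    // The entire read path: one key lookup against the denormalized projection.
    static Document findBySku(MongoCollection<Document> products, String sku) {
        return products.find(Filters.eq("_id", sku)).first();
    }
}
```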

However, there are still some drawbacks:

  • Duplicate data storage
  • Transactions across data stores need to be handled by the application
  • Denormalization needs to be carefully managed to avoid inconsistent states
  • Writes are slower/more expensive since we are synchronously denormalizing
  • Reads can result in large payloads depending on domain design

However, in some cases these are outweighed by the benefits:

  • Reads are faster and can be optimized separately from writes
  • Since data is stored as key/value, lower-level, cheaper data storage can be used
  • Fewer HTTP calls on read since documents are integrated on writes
  • UX doesn’t need to change because the consistency model is the same
  • We don’t need message buses (yet!)

Coming up in part 2: CQRS with asynchronous commands


Using Flow to Write More Confident React Apps

Writing your first React app is simple with tools such as Create React App, but as your application and your team grow, it becomes more difficult to write scalable, confident code efficiently. Think about the number of times you pass props around a React application and expect them to look a certain way or be of a certain type, or the times when you’ve passed the wrong type of argument into a function and rendered a page useless. Fortunately, there are many solutions for these preventable problems, and one of the tools we’ve found helpful here at Rent The Runway is Flow.

What Is Flow?

According to Flow’s official website, Flow is a static type checker for JavaScript. After you install Flow, start the Flow background process, and annotate your code, Flow checks for errors while you code and gently alerts you to them. There are even Flow packages available for many popular text editors, which can highlight potential problems and provide explanatory text right beside the code itself.

Many times, Flow can infer how you want your code to work, so you don’t always need to do extra work, but explicit static type annotations let you be certain. In short, Flow checks how data moves through your app, ensuring that you’re using the types of data that you expect and forcing you to think twice before you reassign a variable to a different type or try to access a non-existent property of an object.

Basic Example

Here’s a basic example of how to add Flow type checking to a file.

This is a simple component to display and edit data for a customer’s account.

Telling Flow that it should check this file is as simple as including the `// @flow` flag at the top of the file. Almost immediately, Flow will begin checking the file, and you’ll notice red dots in the text editor (Sublime) warning about potential problems. When we hover over those lines, we see additional information about the error at the bottom of the text editor.

After tackling the highlighted lines one by one, we’re left with a nicely annotated file:
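Something along these lines (a hypothetical reconstruction; the component and field names are illustrative):

```js
// @flow
import React, { Component } from 'react';

class CustomerAccountForm extends Component {
  // Explicitly declare the class's properties and their types.
  state: {
    firstName: string,
    lastName: string,
    isEditing: boolean,
  };

  // The bare minimum: tell Flow that props should be an object.
  constructor(props: Object) {
    super(props);
    this.state = {
      firstName: '',
      lastName: '',
      isEditing: false,
    };
  }

  render() {
    return <div>{this.state.firstName} {this.state.lastName}</div>;
  }
}

export default CustomerAccountForm;
```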

Here, we’re doing the bare minimum to silence the red dots, telling Flow that the `constructor` method accepts a `props` argument that should be an object. (Flow allows you to be more specific, which we’ll show in a bit, but this suffices for now). In addition, we’ve explicitly declared the properties and types of each property for the class, which is part of Flow’s Class type checking capabilities. Now, the shape of state and the types of its properties are guaranteed, and we only had to do a bit of annotation for Flow to get started.

To be more explicit, we can also declare the types of the properties that an object must have when it is supplied as an argument to a function. In the CustomerAccountProfile component below, we specify that the props must have a customer property.

In that component’s parent, CustomerTabAccount, if we try to pass in ‘null’ as that prop, Flow gives us a warning that ‘This type is incompatible with (object type)’.

To go one step further, you could even declare the types of all the properties within props for the component. Now, if you try to reference a property that isn’t declared within props, Flow will let you know that the property is not found.
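For example (the Customer type here is a stand-in for the app’s real shape):

```js
// @flow
import React, { Component } from 'react';

// Hypothetical shape of a customer record.
type Customer = {
  id: number,
  email: string,
};

class CustomerAccountProfile extends Component {
  // Declare every property we expect within props. Passing customer={null},
  // or referencing a prop that isn't declared here, now raises a Flow error
  // while you code.
  props: {
    customer: Customer,
  };

  render() {
    return <div>{this.props.customer.email}</div>;
  }
}

export default CustomerAccountProfile;
```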

Although this is just a sample of the many type checks Flow is capable of, the Flow documentation goes pretty deep into all the possibilities.

Advantages

What’s great about Flow is that it can catch errors early and help development teams communicate better about how components work together within an application. And although React’s PropTypes serve a similar purpose and are arguably simpler to use, they only throw errors at runtime rather than while you’re coding. Plus, since they’re declared below the component, they seem more like an afterthought.

Beyond the basics, adding Flow to your codebase is fairly painless, since it’s opt-in. You tell Flow to analyze a file by including a `// @flow` flag at the top of a file. This will mark it as a Flow file, and the Flow background process will monitor all Flow files. Files without the flag will be ignored, so you get to control which files you check and you can integrate it into a new or existing codebase over time.

It also plays nicely with ES6 and Babel, which strips out all the annotations for production.

Disadvantages

On the other hand, that same opt-in policy leaves room for neglect. Less diligent programmers can opt to forgo Flow altogether. In addition, there is a learning curve for type annotations, especially for new developers who aren’t used to type checking their code. What I find to be a bigger challenge is not getting started, but using Flow well. Basic type checking is useful, but Flow offers the ability to write very use-case-specific type checks, such as “Maybe” types, which check optional values, and “Class” types, which allow you to use a class as a type. It’s easy to begin using Flow, but difficult to master all of its capabilities.

We began adopting Flow slowly in one of our retail applications, and as we continue building new features, we continue to convert older components to use Flow. Overall, it has helped us identify bugs sooner rather than later and has enabled easier refactoring because we can be more confident about the arguments functions accept and how different function invocations change a component’s state.

For more information on Flow and getting started, consult the official Flow documentation.

Resources

Official Flow Documentation

React PropTypes Versus Flow


Efficient ViewController Transitions

Anyone familiar with RTR’s app knows that there are several different viewControllers that the user is able to navigate between; however, anyone not familiar with the code would most likely assume this navigation is done by simply pushing the relevant viewController via the navigation controller or by presenting it modally (example below).
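That assumed approach would look something like this, where DetailViewController and DetailMC are stand-in names:

```swift
import UIKit

// Stand-ins for the app's real types.
final class DetailMC {}
final class DetailViewController: UIViewController {
    init(modelController: DetailMC) { super.init(nibName: nil, bundle: nil) }
    required init?(coder: NSCoder) { fatalError("init(coder:) has not been implemented") }
}

final class SomeViewController: UIViewController {
    func showDetail(with detailMC: DetailMC) {
        let detailVC = DetailViewController(modelController: detailMC)

        // Pushing via the navigation controller...
        navigationController?.pushViewController(detailVC, animated: true)

        // ...or, alternatively, presenting modally:
        // present(detailVC, animated: true, completion: nil)
    }
}
```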

Prior to joining the team, I would have made the same assumption. This assumption is wrong! You are probably now asking yourself “what other way is there to navigate between viewControllers?” Good question!

To handle efficient routing, we implemented a unique pattern for navigating between viewControllers. All of the data on the app’s viewControllers is populated by modelControllers. With that in mind, when a viewController is instantiated, its initialization requires a modelController as a parameter. You don’t have to do it this way, but for reasons related to a whole different (future) blog post, we gather all the relevant information from the network at an earlier stage and persist it throughout the app’s lifecycle.

After having created the viewController and established which modelController the viewController “owns,” it is time to create the route that will link to this viewController. RTR’s app has an entire class (AppRouter) dedicated to routing. One thing to note: we use JLRoutes, a CocoaPod that manages routes for URL schemes, to handle routing within the app. The main function we are concerned with inside AppRouter is adding a new route, which is a function provided by JLRoutes.

addRoute requires a path as a parameter. We store all of the paths required for routing in a class, RTRURL. The enum Route (below) is the path to the specific viewController, and the enum Param (also below) holds the paths to any parameters required to instantiate the viewController. In this example, case Experiment is the path to experimentsMC, which is the modelController required to initialize RoutingVC.
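Here is a rough sketch of the shape of these pieces; the real RTRURL and AppRouter hold much more, and the exact JLRoutes calls may differ:

```swift
import UIKit
import JLRoutes

// Stand-ins for the app's real types.
final class ExperimentsMC {}
final class RoutingVC: UIViewController {
    init(modelController: ExperimentsMC) { super.init(nibName: nil, bundle: nil) }
    required init?(coder: NSCoder) { fatalError("init(coder:) has not been implemented") }
}

class RTRURL {
    // The path to a specific viewController...
    enum Route: String {
        case Experiment = "/experiment"
    }
    // ...and the keys for any parameters required to instantiate it.
    enum Param: String {
        case ExperimentsMC = "experimentsMC"
    }
}

class AppRouter {
    let routes = JLRoutes()

    func addExperimentRoute() {
        // addRoute takes a path and a handler to run when that path is routed.
        routes.addRoute(RTRURL.Route.Experiment.rawValue) { params in
            guard let experimentsMC = params[RTRURL.Param.ExperimentsMC.rawValue] as? ExperimentsMC else {
                return false
            }
            let routingVC = RoutingVC(modelController: experimentsMC)
            // ...push or present routingVC here (elided)
            _ = routingVC
            return true
        }
    }
}
```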

Great! We now have a way of routing the app to a given viewController based on the viewController’s path provided in the addRoute func. So…how do we present it?

We made a class, Presentation, that subclasses RTRURL. This means that Presentation has access to all of the properties held in RTRURL. Yes, the code inside Presentation could live inside RTRURL. However, we made the Presentation class as a way of abstracting implementation details to make the code easier to read. 👏

In order to “tie together” all of the components that contribute to navigating to a given viewController (aka creating a new route and providing the viewController’s route path), we decided to create a “route struct” for every viewController. Specifically, this struct encapsulates all of the information required to instantiate the viewController. It does this by adopting and conforming to the PresentationProtocol.

PresentationProtocol has three requirements: a path, required parameters, and valid parameters. What does this mean, exactly? In this particular example, RoutingVC can’t be instantiated without an experimentsMC; therefore, that property must be provided in the RequiredParams. Similarly, a new route can’t be added without a path; thus, a path must be provided.

ValidParams takes in properties that can be displayed by the viewController but aren’t required in order to instantiate the viewController. ValidParams is encapsulated within the PresentationProtocol but can be an empty array. An example of ValidParams could be the way in which the viewController is presented. Is it pushed? Is it modal? You choose!
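Continuing the sketch from above, the protocol and one route struct might look like this (treating both parameter collections as dictionaries is a simplifying assumption):

```swift
protocol PresentationProtocol {
    var path: String { get }
    var requiredParams: [String: Any] { get }
    var validParams: [String: Any] { get }
}

// The route struct for RoutingVC: it cannot be built without an experimentsMC.
struct Experiment: PresentationProtocol {
    let path = RTRURL.Route.Experiment.rawValue
    let requiredParams: [String: Any]
    let validParams: [String: Any]

    init(experimentsMC: ExperimentsMC, presentationStyle: String? = nil) {
        requiredParams = [RTRURL.Param.ExperimentsMC.rawValue: experimentsMC]
        // Valid-but-optional params, e.g. how the viewController is presented.
        validParams = presentationStyle.map { ["presentationStyle": $0] } ?? [:]
    }
}
```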


FINALLY, we are ready to show the viewController. The params property includes the required experimentsMC parameter as well as presentationStyle, which is a valid (but not required) parameter. These parameters are passed through to the Experiment route struct held on the Presentation class.

After creating an instance of the route struct (in this example, Experiment), we are ready to actually present the viewController. openViewControllerWithPresentation() takes a type conforming to PresentationProtocol as a parameter. After establishing the path, parameters, and URL, routeURL is called. Using JLRoutes, routeURL presents the viewController associated with that URL.

This may seem like a complicated solution for a “simple” transition between viewControllers. So, why did we decide to do it this way? For two key reasons:

  1. By creating a route for each viewController, integrating deep linking with universal links is much simpler. The URL path already exists!
  2. When testing new features on the app, it is much more efficient to link directly to the viewController that contains the feature in question rather than navigating throughout the app to finally get to the relevant viewController.

The next time you use RTR’s app, you now have a “behind the scenes” understanding of what’s happening when you navigate between viewControllers. Thoughts? Comments? Please don’t hesitate to reach out to us!


iOS Data Binding is Better with Signals

Here at Rent the Runway, we want to make sure that all of our users have a responsive and rewarding iOS mobile app experience. With our backend architected to use many highly performant microservices, there is a whole lot of data and user interactions to keep in sync. We have services for user memberships, promotions, products, search categories, user hearts, and so much more. Without a one-to-one mapping of API service calls to application views, it was critical that we come up with a robust and coordinated approach for data binding and content updates.

One of the main iOS communication patterns is delegation. Delegation creates a one-to-one messaging structure between two objects, where one object can act on behalf of another. Using this pattern, it is typical that a completed API call would inform its delegate object that new data exists and is ready for processing and/or display somewhere else in the app.
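In code, the delegate version looks something like this sketch (the protocol name is illustrative):

```swift
// One-to-one: the model controller reports back to exactly one delegate.
protocol HeartsMCDelegate: AnyObject {
    func heartsMCDidFetchHearts(_ heartsMC: HeartsMC)
}

final class HeartsMC {
    weak var delegate: HeartsMCDelegate?

    func fetchHearts() {
        // ...API call elided; on completion, only the single delegate hears about it:
        delegate?.heartsMCDidFetchHearts(self)
    }
}
```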

In the diagram, we can see several UIViewControllers (ProductCollectionsVC, ProductsGridVC, ProductVC) that hold a ModelController (HeartsMC), which is responsible for fetching all of a user’s hearts. While this approach is useful in many situations, it makes it difficult to communicate data updates from one object to many objects at the same time, as the diagram shows.
As an alternative to delegates, we can use the ReactiveCocoa framework and its signals for communication. In short, ReactiveCocoa is a functional reactive programming framework that represents dataflows with the notion of events. An event is just a concept representing that something happened, such as a button being tapped or the status of a network call changing. ReactiveCocoa uses Signals and SignalProducers to send messages representing these events. A Signal will only notify an observing object of events that occur after the observer has started observing, not any past actions. A SignalProducer, on the other hand, allows you to start observing for an event while also having visibility into past and future events.

ReactiveCocoa explains these concepts by comparing them to a person (the observer) watching television (the signal or signal producer). 

[A signal] is like a live TV feed — you can observe and react to the content, but you cannot have a side effect on the live feed or the TV station. [A signal producer] is like an on-demand streaming service — even though the episode is streamed like a live TV feed, you can choose what you watch, when to start watching and when to interrupt it.
— https://github.com/ReactiveCocoa/ReactiveSwift

Signals work very well for UI events like tapping a button, where an observer is only interested in reacting to touch events after it has started observing. A SignalProducer is more useful for a network request, where its on-demand nature provides more context about the request, such as whether it is in progress, whether it has previously failed, and whether we should cancel it.
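Here is a sketch of that on-demand behavior with ReactiveSwift; the endpoint and the decoding are placeholders:

```swift
import Foundation
import ReactiveSwift

enum NetworkError: Error { case requestFailed }

// Nothing happens until start() is called, and each start() is a fresh request.
func fetchHearts(for userID: String) -> SignalProducer<[String], NetworkError> {
    return SignalProducer { observer, lifetime in
        let url = URL(string: "https://api.example.com/users/\(userID)/hearts")!
        let task = URLSession.shared.dataTask(with: url) { data, _, error in
            guard error == nil else {
                observer.send(error: .requestFailed)
                return
            }
            observer.send(value: [])   // decoding of `data` into heart IDs elided
            observer.sendCompleted()
        }
        lifetime.observeEnded { task.cancel() }  // tear down if observation ends
        task.resume()
    }
}

// Usage: fetchHearts(for: "123").startWithResult { result in /* ... */ }
```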

Objects throughout our app can then observe these event-driven signals and take any action they need to bind data to views, process data, or trigger any other actions. This allows multiple views and viewControllers in our app to be notified of data changes at the same time without the need for complex delegation chains. 
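Here is a condensed sketch of that kind of setup, assuming a HeartsMC that broadcasts through a ReactiveSwift pipe (names are illustrative):

```swift
import UIKit
import ReactiveSwift

protocol HeartsUpdating {
    func heartsDidUpdate(_ heartIDs: [String])
}

final class HeartsMC {
    // pipe() returns an output Signal and an input Observer for sending events.
    let (heartsSignal, heartsObserver) = Signal<[String], Never>.pipe()

    private(set) var heartIDs: [String] = []

    func heart(productID: String) {
        // ...persist the change (elided), then broadcast the new state:
        heartIDs.append(productID)
        heartsObserver.send(value: heartIDs)
    }
}

final class ProductsGridVC: UIViewController, HeartsUpdating {
    func bind(to heartsMC: HeartsMC) {
        // Any number of viewControllers can observe the same signal.
        heartsMC.heartsSignal.observeValues { [weak self] heartIDs in
            self?.heartsDidUpdate(heartIDs) // the protocol method below
        }
    }

    func heartsDidUpdate(_ heartIDs: [String]) {
        // refresh the heart icons in the grid (elided)
    }
}
```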

In the code snippet above, we are setting up signals for a modelController. When an observing object receives a signal event, it can call a method defined by our protocol. If we were to configure these methods and interactions with a delegate instead, other viewControllers would not receive any message and would not know to update their data.

In the app, we have a TabBarController that allows users to quickly switch between viewControllers that provide different functionality. We do have some shared data sources between these controllers, though, and our signals allow us to update data on every accessible viewController, so the user can interact with them immediately after switching tabs.

This approach has been very helpful for broadcasting changes to the data source on a ModelController throughout the app. While implementing it, we paid special attention to weak and strong references with our observers to prevent retain cycles, routinely testing that objects were deallocated when expected. Additionally, deadlocks can occasionally occur if signals trigger other signals that depend on the completion of another signal. However, with well-thought-out protocols and implementation patterns, we have been able to provide a responsive and stable app experience for our users.