Dive into React codebase: Handling state changes

State. One of the most complicated concepts in React.js nomenclature. While some of us have already gotten rid of it in our projects (Redux, anyone?) by externalizing state, it is still a widely used feature of React.js.

While convenient, it can cause some issues. Robert Pankowecki, one of the authors of Rails meets React.js, ran into a problem with validations when he started his journey with React.

The story went like this: validations seem quite easy to do, but there is the problem of the untouched form - the first time a user sees an input it should not be validated, even if leaving it empty is invalid. This is definitely stateful behaviour - so Robert concluded it would be good to put his validation messages in state.

So basically he just went and implemented the logic like this:

// ...
changeTitle: function changeTitle (event) {
  this.setState({ title: event.target.value });
  this.validateTitle();
},
validateTitle: function validateTitle () {
  if(this.state.title.length === 0) {
    this.setState({ titleError: "Title can't be blank" });
  }
},
// ...

Bam. This code doesn’t work. Why is that? Because setState works asynchronously: after calling setState, this.state is not changed immediately. This is described in the docs under the “Notes” section:

setState() does not immediately mutate this.state but creates 
a pending state transition. 
Accessing this.state after calling this method can potentially 
return the existing value.

So it is a misunderstanding or a gap in knowledge rather than something really weird. Robert could have avoided his problem by reading the docs. But you’ll agree that it’s an easy mistake for a beginner to make!
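To make the failure mode concrete, here is a tiny stand-alone model of deferred state updates. It is not React code: TinyComponent, flush and the pending queue are hypothetical names invented for this illustration. The only behaviour borrowed from React is that setState records a change instead of applying it right away.

```javascript
// A toy model of deferred state updates (NOT React's implementation).
function TinyComponent(initialState) {
  this.state = initialState;
  this.pending = []; // queued partial states, applied later by flush()
}

// Like React's setState, this only records the requested change.
TinyComponent.prototype.setState = function (partialState) {
  this.pending.push(partialState);
};

// Stands in for the moment when the queued changes are finally applied.
TinyComponent.prototype.flush = function () {
  var nextState = Object.assign({}, this.state);
  this.pending.forEach(function (partial) {
    Object.assign(nextState, partial);
  });
  this.pending = [];
  this.state = nextState;
};

var form = new TinyComponent({ title: 'Hello' });
form.setState({ title: '' });

// A validation that runs here still sees the old title.
var titleSeenByValidation = form.state.title;

form.flush();
var titleAfterFlush = form.state.title;

console.log(JSON.stringify(titleSeenByValidation), JSON.stringify(titleAfterFlush));
```

In this model a validateTitle call placed right after setState would read 'Hello' and report no error, which mirrors what happened in Robert’s form.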

This situation triggered some interesting discussions within the team. What can you expect from setState? What guarantees do you have? Why can you be sure that state will be updated correctly if you change the input and press “Submit” immediately? To understand what’s going on, and to make an interesting journey into React internals, I’ve decided to trace what happens under the hood when you call setState. But first, let’s solve Robert’s problem in an appropriate way.

Solving the validation problem

Let’s see what options you have in React 0.14.7 to solve the problem described above.

You can use the built-in callback capability of setState. setState accepts two arguments - the first is the state change, and the second is a callback to be called when the state has been updated:

  changeTitle: function changeTitle (event) {
    this.setState({ title: event.target.value }, function afterTitleChange () {
      this.validateTitle();
    });
  },
  // ...

Inside afterTitleChange the state is already updated so you can be certain that you’ll read fresh values in this.state.

You can concatenate both state changes. To do so you have to parametrize validateTitle, so that the two state changes become one:

  changeTitle: function changeTitle (event) {
    let nextState = Object.assign({},
                                  this.state,
                                  { title: event.target.value });
    this.validateTitle(nextState);
    this.setState(nextState);
  },
  validateTitle: function validateTitle (state) {
    if(state.title.length === 0) {
      state.titleError = "Title can't be blank";
    }
  }

So the problem disappears, since there are no longer two state changes. This can get a little complicated when your state is bigger and nested (which is not a good practice anyway!). It is also good practice to turn validateTitle into a pure function - so its return value relies only on its arguments and it does not mutate anything:

  changeTitle: function changeTitle (event) {
    let nextState = Object.assign({},
                                  this.state,
                                  { title: event.target.value });
    this.setState(this.withTitleValidated(nextState));
  },
  withTitleValidated: function withTitleValidated (state) {
    let validation = {};
    if(state.title.length === 0) {
      validation.titleError = "Title can't be blank";
    }
    return Object.assign({}, state, validation);
  }

This can be a little more performant than previous solutions thanks to shouldComponentUpdate config. If it’s a custom made shouldComponentUpdate, some state updates can be a no-op and won’t trigger componentDidUpdate, thus validateTitle won’t get called.
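One practical benefit of the pure version is that it can be exercised without mounting a component. The snippet below copies withTitleValidated from the example above into a plain function and calls it directly. The variable names before, after and valid are made up for the demonstration.

```javascript
// Stand-alone copy of the pure validation step shown above.
function withTitleValidated(state) {
  var validation = {};
  if (state.title.length === 0) {
    validation.titleError = "Title can't be blank";
  }
  // A new object is returned; the argument is left untouched.
  return Object.assign({}, state, validation);
}

var before = { title: '' };
var after = withTitleValidated(before);
var valid = withTitleValidated({ title: 'Hello' });

console.log(after.titleError);             // the error message
console.log('titleError' in before);       // false: the input was not mutated
console.log('titleError' in valid);        // false: nothing to report
```

Note that, as written, nothing removes titleError again once the title becomes valid, because the previous state is copied into nextState. A real form would probably reset it explicitly.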

You can externalize the state, performing the validation in another part of the app (a Redux reducer, for example). This way you avoid the problem on the React side by not using component state at all.
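A rough sketch of what the externalized variant could look like is shown below. It is written in the Redux reducer style but needs no Redux import, since a reducer is just a function from (state, action) to a new state. The names formReducer, initialFormState and the 'CHANGE_TITLE' action type are assumptions made for this example, not something taken from the book.

```javascript
// Hypothetical reducer: the title and its validation change in one step,
// so there is never a moment where one is updated and the other is stale.
var initialFormState = { title: '', titleError: null };

function formReducer(state, action) {
  if (state === undefined) {
    state = initialFormState;
  }
  switch (action.type) {
    case 'CHANGE_TITLE': // assumed action name
      return Object.assign({}, state, {
        title: action.title,
        titleError: action.title.length === 0 ? "Title can't be blank" : null
      });
    default:
      return state;
  }
}

// A fresh form has an empty title but no error yet.
var fresh = formReducer(undefined, { type: '@@INIT' });
var typed = formReducer(fresh, { type: 'CHANGE_TITLE', title: 'Hi' });
var cleared = formReducer(typed, { type: 'CHANGE_TITLE', title: '' });

console.log(fresh.titleError, typed.titleError, cleared.titleError);
```

Because the initial state carries no error, the untouched form is not flagged, while clearing the field afterwards is.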

In the book Robert went with the concatenated state solution (the second approach). This adventure convinced him to keep his state outside of React components to avoid thinking about it. But is it a valid case against state? Let’s see…

If you want to skip the React codebase step by step examination, you can just go to the recap section.

Inside the React - down the rabbit hole!

Let’s start with the definition of setState. It is defined on the ReactComponent prototype, which is basically the React.Component class you extend in an ES2015 class definition of a React component. The setState available in components created by React.createClass is the same code - the prototype of the component class returned by React.createClass is ReactClassComponent, which is basically:

var ReactClassComponent = function() {};
assign(
  ReactClassComponent.prototype,
  ReactComponent.prototype,
  ReactClassMixin
);

As you can see, the ReactComponent prototype is here - that means setState from there is used (unless ReactClassMixin does something weird - and it doesn’t). Let’s see the implementation:

ReactComponent.prototype.setState = function(partialState, callback) {
  invariant(
    typeof partialState === 'object' ||
    typeof partialState === 'function' ||
    partialState == null,
    'setState(...): takes an object of state variables to update or a ' +
    'function which returns an object of state variables.'
  );
  if (__DEV__) {
    warning(
      partialState != null,
      'setState(...): You passed an undefined or null state object; ' +
      'instead, use forceUpdate().'
    );
  }
  this.updater.enqueueSetState(this, partialState);
  if (callback) {
    this.updater.enqueueCallback(this, callback);
  }
};

Apart from checking invariants and issuing those great warnings, there are two things this method is doing:

The state change is enqueued in the updater, which will do its job later.
The callback passed as the second argument of setState is enqueued too, if it exists. There’s a reason for that which you’ll understand later.

But what is updater? The name implies that it’ll have something to do with updating components. Let’s see where it is defined for ReactClass and ReactComponent:

  // We initialize the default updater but the real one gets injected by the
  // renderer.
  this.updater = updater || ReactNoopUpdateQueue;

The React.js codebase relies heavily on the dependency injection principle. This makes it possible to substitute parts of React.js based on the environment (server-side vs. client-side, different platforms) in which you’re rendering. ReactComponent is a part of the isomorphic namespace - it will always exist, no matter whether it is React Native, ReactDOM in the browser or server-side rendering. It also contains only pure JavaScript, which should run on every device capable of understanding the ECMAScript 5 standard of JS.

So where does the real updater get injected? In the ReactCompositeComponent part of the renderer (mountComponent method):

    // These should be set up in the constructor, but as a convenience for
    // simpler class abstractions, we set them up after the fact.
    inst.props = publicProps;
    inst.context = publicContext;
    inst.refs = emptyObject;
    inst.updater = ReactUpdateQueue;

This ReactCompositeComponent class is used in many types of React (react-dom, react-native, react-art) to build an environment-independent foundation that every React component has. It is used as a precondition of a Transaction, for example in ReactMount of the react-dom client - so the platform-dependent code goes there and is wrapped with a transaction which sets platform-independent internals correctly.
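The injection described above can be pictured with a small stand-alone model. Everything whose name ends in Sketch is invented for this illustration. Only the shape is borrowed from the snippets above: a default no-op updater set in the constructor, and a real one assigned to the instance when it is mounted.

```javascript
// Simplified model of updater injection (NOT the React source).
var noopUpdaterSketch = {
  enqueueSetState: function (instance, partialState) {
    // Nothing is mounted yet, so the update is silently dropped.
  }
};

function ComponentSketch(props, updater) {
  this.props = props;
  this.state = {};
  // Default updater; a renderer is expected to replace it.
  this.updater = updater || noopUpdaterSketch;
}

ComponentSketch.prototype.setState = function (partialState) {
  this.updater.enqueueSetState(this, partialState);
};

// A pretend renderer-specific updater that just records what it receives.
var recorded = [];
var recordingUpdaterSketch = {
  enqueueSetState: function (instance, partialState) {
    recorded.push(partialState);
  }
};

var inst = new ComponentSketch({});
inst.setState({ ignored: true });       // handled by the no-op updater
inst.updater = recordingUpdaterSketch;  // what mounting does in this model
inst.setState({ title: 'Hello' });      // reaches the injected updater

console.log(recorded.length, recorded[0].title);
```

The component code never changes between environments in this model. Only the object assigned to updater does, which is the point of the injection.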

Now that you know what an updater is, let’s see how enqueueSetState and enqueueCallback are implemented.

Enqueuing state changes and callbacks - ReactUpdateQueue

So inside setState two methods are called: enqueueSetState and enqueueCallback. As you’ve seen, React uses ReactUpdateQueue to implement those methods. Let’s see what the implementations look like.

enqueueSetState:

  enqueueSetState: function(publicInstance, partialState) {
    var internalInstance = getInternalInstanceReadyForUpdate(
      publicInstance,
      'setState'
    );

    if (!internalInstance) {
      return;
    }

    var queue =
      internalInstance._pendingStateQueue ||
      (internalInstance._pendingStateQueue = []);
    queue.push(partialState);

    enqueueUpdate(internalInstance);
  },

enqueueCallback:

  enqueueCallback: function(publicInstance, callback) {
    invariant(
      typeof callback === 'function',
      'enqueueCallback(...): You called `setProps`, `replaceProps`, ' +
      '`setState`, `replaceState`, or `forceUpdate` with a callback that ' +
      'isn\'t callable.'
    );
    var internalInstance = getInternalInstanceReadyForUpdate(publicInstance);

    // Previously we would throw an error if we didn't have an internal
    // instance. Since we want to make it a no-op instead, we mirror the same
    // behavior we have in other enqueue* methods.
    // We also need to ignore callbacks in componentWillMount. See
    // enqueueUpdates.
    if (!internalInstance) {
      return null;
    }

    if (internalInstance._pendingCallbacks) {
      internalInstance._pendingCallbacks.push(callback);
    } else {
      internalInstance._pendingCallbacks = [callback];
    }
    // TODO: The callback here is ignored when setState is called from
    // componentWillMount. Either fix it or disallow doing so completely in
    // favor of getInitialState. Alternatively, we can disallow
    // componentWillMount during server-side rendering.
    enqueueUpdate(internalInstance);
  },

As you can see, both methods reference the enqueueUpdate function which will get inspected soon. The pattern goes like this:

The internal instance is retrieved. What you see as a React Component in your code has a backing instance inside React which has fields which are a part of the private interface. Those internal instances are obtained by a piece of code called ReactInstanceMap to which getInternalInstanceReadyForUpdate is delegating this task.
A change to the internal instance is made. In case of enqueuing callback, callback is added to a pending callbacks queue. In case of enqueuing a state change, pending state change is added to a pending state queue.
enqueueUpdate is called to flush changes made by those methods. Let’s see how it is done.

enqueueUpdate:

function enqueueUpdate(internalInstance) {
  ReactUpdates.enqueueUpdate(internalInstance);
}

Oh, so yet another piece of this puzzle! It’s interesting that at this level ReactUpdates is referenced directly and not injected, though. That is because ReactUpdates is quite generic and its dependencies are injected instead. Let’s see how ReactUpdates works, then!

Performing updates - ReactUpdates

Let’s see how enqueueUpdate is implemented on the ReactUpdates side:

function enqueueUpdate(component) {
  ensureInjected();

  // Various parts of our code (such as ReactCompositeComponent's
  // _renderValidatedComponent) assume that calls to render aren't nested;
  // verify that that's the case. (This is called by each top-level update
  // function, like setProps, setState, forceUpdate, etc.; creation and
  // destruction of top-level components is guarded in ReactMount.)

  if (!batchingStrategy.isBatchingUpdates) {
    batchingStrategy.batchedUpdates(enqueueUpdate, component);
    return;
  }

  dirtyComponents.push(component);
}

There are two moving parts (injections) on the ReactUpdates level which are important to mention. ensureInjected sheds some light on them:

function ensureInjected() {
  invariant(
    ReactUpdates.ReactReconcileTransaction && batchingStrategy,
    'ReactUpdates: must inject a reconcile transaction class and batching ' +
    'strategy'
  );
}

batchingStrategy is the strategy React uses to batch your updates. For now only one is used in the codebase, called ReactDefaultBatchingStrategy. ReactReconcileTransaction is an environment-dependent piece of code which is responsible for “fixing” transient state after updates - for the DOM this means restoring text selection which can be lost after an update, suppressing events during reconciliation and queueing lifecycle methods. More about it here.

The code of enqueueUpdate is a little hard to read. At first glance it seems that nothing special is happening here. batchingStrategy, which is a Transaction, has a field which tells you whether a transaction is in progress. If it’s not, enqueueUpdate stops and registers itself to be performed in a transaction. Then the component is added to the list of dirty components.

So, what now? Pending state and callbacks queues are updated and a component is pushed to the list of dirty components. But nothing so far is causing this state to actually be updated!

To understand what’s going on, you need to jump to the implementation of ReactDefaultBatchingStrategy. It has two wrappers:

FLUSH_BATCHED_UPDATES - which calls flushBatchedUpdates from ReactUpdates after performing a function in a transaction. This is the heart of the state updating code. IMO it’s confusing that such an important piece of code is hidden within the transaction’s implementation - I believe it is done for code reuse.
RESET_BATCHED_UPDATES - responsible for clearing the isBatchingUpdates flag after function is performed within transaction.
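The interplay between enqueueUpdate, the isBatchingUpdates flag and the two wrappers can be modeled in a few lines. This is a simplified sketch, not the React source: the wrappers are inlined as two statements that run after the callback, a real transaction also runs them when the callback throws, and components are plain strings. The log array is an assumption added to make the effect visible.

```javascript
// Simplified model of the batching flow (NOT the React source).
var dirtyComponents = [];
var log = [];

var batchingStrategy = {
  isBatchingUpdates: false,
  batchedUpdates: function (callback, arg) {
    var alreadyBatching = batchingStrategy.isBatchingUpdates;
    batchingStrategy.isBatchingUpdates = true;
    if (alreadyBatching) {
      // Nested call: the outer batch will flush for us.
      callback(arg);
      return;
    }
    callback(arg);
    flushBatchedUpdates();                      // role of FLUSH_BATCHED_UPDATES
    batchingStrategy.isBatchingUpdates = false; // role of RESET_BATCHED_UPDATES
  }
};

function enqueueUpdate(component) {
  if (!batchingStrategy.isBatchingUpdates) {
    // Not in a batch yet: re-run this function inside one.
    batchingStrategy.batchedUpdates(enqueueUpdate, component);
    return;
  }
  dirtyComponents.push(component);
}

function flushBatchedUpdates() {
  while (dirtyComponents.length) {
    var batch = dirtyComponents;
    dirtyComponents = [];
    log.push('flush: ' + batch.join(', '));
  }
}

// Outside of a batch every update is flushed on its own.
enqueueUpdate('A');
enqueueUpdate('B');

// Inside a batch both updates share one flush.
batchingStrategy.batchedUpdates(function () {
  enqueueUpdate('C');
  enqueueUpdate('D');
});

console.log(log);
```

In this model the log ends up with three entries: one flush for A, one for B, and a single flush containing both C and D.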

While RESET_BATCHED_UPDATES is more of a detail, flushBatchedUpdates is extremely important - it is where the logic of updating state really happens. Let’s see the implementation:

var flushBatchedUpdates = function() {
  // ReactUpdatesFlushTransaction's wrappers will clear the dirtyComponents
  // array and perform any updates enqueued by mount-ready handlers (i.e.,
  // componentDidUpdate) but we need to check here too in order to catch
  // updates enqueued by setState callbacks and asap calls.
  while (dirtyComponents.length || asapEnqueued) {
    if (dirtyComponents.length) {
      var transaction = ReactUpdatesFlushTransaction.getPooled();
      transaction.perform(runBatchedUpdates, null, transaction);
      ReactUpdatesFlushTransaction.release(transaction);
    }

    if (asapEnqueued) {
      asapEnqueued = false;
      var queue = asapCallbackQueue;
      asapCallbackQueue = CallbackQueue.getPooled();
      queue.notifyAll();
      CallbackQueue.release(queue);
    }
  }
};

There is yet another transaction (ReactUpdatesFlushTransaction) which is responsible for “catching” any pending updates that appear after running flushBatchedUpdates. This complication is needed because componentDidUpdate or setState callbacks can enqueue further updates which need to be processed. This transaction is additionally pooled (instances are prepared up front instead of being created on the fly - React uses this trick to avoid unnecessary garbage collection), a neat trick which comes from video game development. There is also the concept of asap updates, which will be described a little later.

As you can see, a method named runBatchedUpdates is called. Whew! That is a lot of methods between setState and the end. And it’s not over. Let’s see:

function runBatchedUpdates(transaction) {
  var len = transaction.dirtyComponentsLength;
  invariant(
    len === dirtyComponents.length,
    'Expected flush transaction\'s stored dirty-components length (%s) to ' +
    'match dirty-components array length (%s).',
    len,
    dirtyComponents.length
  );

  // Since reconciling a component higher in the owner hierarchy usually (not
  // always -- see shouldComponentUpdate()) will reconcile children, reconcile
  // them before their children by sorting the array.
  dirtyComponents.sort(mountOrderComparator);

  for (var i = 0; i < len; i++) {
    // If a component is unmounted before pending changes apply, it will still
    // be here, but we assume that it has cleared its _pendingCallbacks and
    // that performUpdateIfNecessary is a noop.
    var component = dirtyComponents[i];

    // If performUpdateIfNecessary happens to enqueue any new updates, we
    // shouldn't execute the callbacks until the next render happens, so
    // stash the callbacks first
    var callbacks = component._pendingCallbacks;
    component._pendingCallbacks = null;

    ReactReconciler.performUpdateIfNecessary(
      component,
      transaction.reconcileTransaction
    );

    if (callbacks) {
      for (var j = 0; j < callbacks.length; j++) {
        transaction.callbackQueue.enqueue(
          callbacks[j],
          component.getPublicInstance()
        );
      }
    }
  }
}

This method takes all dirty components, orders them by the mount order (since updates go from parent to children), enqueues callbacks to the transaction queue and runs performUpdateIfNecessary method from ReactReconciler. So runBatchedUpdates is all about making updates in order.
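The ordering step is easy to try out in isolation. In the sketch below the component objects and their mountOrder field are hypothetical stand-ins for React’s internal instances. The assumption is only that a parent is mounted before its children and therefore gets a lower number.

```javascript
// Hypothetical dirty components, pushed in the order setState was called.
var dirty = [
  { name: 'Child', mountOrder: 2 },
  { name: 'Grandchild', mountOrder: 3 },
  { name: 'Parent', mountOrder: 1 }
];

// Assumed comparator: earlier-mounted components come first.
function byMountOrder(a, b) {
  return a.mountOrder - b.mountOrder;
}

var updateOrder = dirty
  .slice() // keep the original array intact for comparison
  .sort(byMountOrder)
  .map(function (component) { return component.name; });

console.log(updateOrder.join(' -> '));
```

Updating the parent first matters because reconciling it will usually reconcile its children as well, which can leave nothing to do when their turn comes.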

Let’s jump to the last part (whew again!) - ReactReconciler.

ReactReconciler & performUpdateIfNecessary - a final step

Let’s see the ReactReconciler performUpdateIfNecessary method implementation:

  performUpdateIfNecessary: function(
    internalInstance,
    transaction
  ) {
    internalInstance.performUpdateIfNecessary(transaction);
  },

Uh. Yet another call. Fortunately it is quickly found in ReactCompositeComponent:

  performUpdateIfNecessary: function(transaction) {
    if (this._pendingElement != null) {
      ReactReconciler.receiveComponent(
        this,
        this._pendingElement || this._currentElement,
        transaction,
        this._context
      );
    }

    if (this._pendingStateQueue !== null || this._pendingForceUpdate) {
      this.updateComponent(
        transaction,
        this._currentElement,
        this._currentElement,
        this._context,
        this._context
      );
    }
  },

This method is split into two parts:

ReactReconciler.receiveComponent - which compares your components at the element level. Element instances are compared, and if they are not the same or the context changed, receiveComponent is called on the internal instance. It won’t be covered here, but it usually just calls updateComponent, which contains the logic for checking whether the elements are the same.
this.updateComponent is called if there is any pending state.

You may wonder why these checks for pending state or force updates are needed. The state must be pending since you came from setState, right? No. updateComponent is recursive, so you can have components which are updated while their pending state is empty. The conditional for _pendingElement covers the case where something changed in children.

Let’s see how updateComponent is implemented:

  updateComponent: function(
    transaction,
    prevParentElement,
    nextParentElement,
    prevUnmaskedContext,
    nextUnmaskedContext
  ) {
    var inst = this._instance;

    var nextContext = this._context === nextUnmaskedContext ?
      inst.context :
      this._processContext(nextUnmaskedContext);
    var nextProps;

    // Distinguish between a props update versus a simple state update
    if (prevParentElement === nextParentElement) {
      // Skip checking prop types again -- we don't read inst.props to avoid
      // warning for DOM component props in this upgrade
      nextProps = nextParentElement.props;
    } else {
      nextProps = this._processProps(nextParentElement.props);
      // An update here will schedule an update but immediately set
      // _pendingStateQueue which will ensure that any state updates gets
      // immediately reconciled instead of waiting for the next batch.

      if (inst.componentWillReceiveProps) {
        inst.componentWillReceiveProps(nextProps, nextContext);
      }
    }

    var nextState = this._processPendingState(nextProps, nextContext);

    var shouldUpdate =
      this._pendingForceUpdate ||
      !inst.shouldComponentUpdate ||
      inst.shouldComponentUpdate(nextProps, nextState, nextContext);

    if (__DEV__) {
      warning(
        typeof shouldUpdate !== 'undefined',
        '%s.shouldComponentUpdate(): Returned undefined instead of a ' +
        'boolean value. Make sure to return true or false.',
        this.getName() || 'ReactCompositeComponent'
      );
    }

    if (shouldUpdate) {
      this._pendingForceUpdate = false;
      // Will set `this.props`, `this.state` and `this.context`.
      this._performComponentUpdate(
        nextParentElement,
        nextProps,
        nextState,
        nextContext,
        transaction,
        nextUnmaskedContext
      );
    } else {
      // If it's determined that a component should not update, we still want
      // to set props and state but we shortcut the rest of the update.
      this._currentElement = nextParentElement;
      this._context = nextUnmaskedContext;
      inst.props = nextProps;
      inst.state = nextState;
      inst.context = nextContext;
    }
  },

This is a rather big method! Let’s go through it step-by-step:

First of all, there is a check whether the context changed. If that’s the case, the context is processed and stored in the nextContext variable. _processContext is responsible for this, but it won’t be covered here.
Then updateComponent is checking whether props changed or just state got changed. If props changed, the componentWillReceiveProps lifecycle method gets called and next properties are prepared in a similar fashion to nextContext. Otherwise nextProps points just to properties from the passed nextParentElement (won’t change at all).
Then next state gets processed. This is finally a piece we’re interested in. The _processPendingState is responsible for this.
Then there is a check whether a component should update virtual DOM-wise. If that’s the case, all produced next values are passed to the responsible method - _performComponentUpdate. If that’s not the case, variables just get updated in place.
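The last step has a consequence that is easy to miss: even when shouldComponentUpdate vetoes the re-render, the instance still receives the new state. The sketch below models only that final branch. applyUpdate, renderLog and the skipping object are hypothetical names, and props and context handling is left out.

```javascript
// Simplified model of the final branch of updateComponent (NOT the React source).
function applyUpdate(inst, nextState, renderLog) {
  var shouldUpdate =
    !inst.shouldComponentUpdate ||
    inst.shouldComponentUpdate(inst.props, nextState);

  // In both branches the instance ends up holding the new state.
  inst.state = nextState;

  if (shouldUpdate) {
    // Stand-in for the real re-render work.
    renderLog.push('render with ' + JSON.stringify(nextState));
  }
  return shouldUpdate;
}

var renders = [];
var skipping = {
  props: {},
  state: { count: 0 },
  // Hypothetical rule: only re-render for even counts.
  shouldComponentUpdate: function (nextProps, nextState) {
    return nextState.count % 2 === 0;
  }
};

var firstRendered = applyUpdate(skipping, { count: 1 }, renders);
var stateAfterSkippedRender = skipping.state.count;
var secondRendered = applyUpdate(skipping, { count: 2 }, renders);

console.log(firstRendered, stateAfterSkippedRender, secondRendered, renders);
```

So a skipped render does not mean a lost state change: in this model the count is 1 after the first call even though nothing was rendered.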

Let’s see how _processPendingState is implemented:

  _processPendingState: function(props, context) {
    var inst = this._instance;
    var queue = this._pendingStateQueue;
    var replace = this._pendingReplaceState;
    this._pendingReplaceState = false;
    this._pendingStateQueue = null;

    if (!queue) {
      return inst.state;
    }

    if (replace && queue.length === 1) {
      return queue[0];
    }

    var nextState = assign({}, replace ? queue[0] : inst.state);
    for (var i = replace ? 1 : 0; i < queue.length; i++) {
      var partial = queue[i];
      assign(
        nextState,
        typeof partial === 'function' ?
          partial.call(inst, nextState, props, context) :
          partial
      );
    }

    return nextState;
  },

Since setting and replacing state share a queue, there is conditional logic which checks whether state is being replaced. If so, pending state is merged with the replaced state - otherwise it is merged with the current state. There is also a conditional for determining in which style state is being set - you can pass an object or a function as the first argument of setState, and this is checked while processing pending state. Also, calling replaceState flushes all previous state, so the replaced state is always first in the queue.
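The merge loop is small enough to lift out and experiment with. The function below is a stand-alone rewrite under a few assumptions: it drops the replaceState handling and the context argument, does not bind this, and uses Object.assign in place of React’s assign helper. The stale object is a hypothetical stand-in for what this.state holds at the moment both setState calls are made.

```javascript
// Stand-alone rewrite of the merge loop, without replaceState support.
function processPendingState(currentState, queue, props) {
  var nextState = Object.assign({}, currentState);
  queue.forEach(function (partial) {
    Object.assign(
      nextState,
      // Function form receives the state accumulated so far.
      typeof partial === 'function' ? partial(nextState, props) : partial
    );
  });
  return nextState;
}

// Object form: later entries win.
var objectForm = processPendingState({ a: 1 }, [{ a: 2 }, { a: 3 }], {});

// Two increments computed from the same stale this.state collapse into one.
var stale = { count: 0 };
var twoObjectIncrements = processPendingState(stale, [
  { count: stale.count + 1 },
  { count: stale.count + 1 }
], {});

// Two increments in function form build on each other.
function increment(state) {
  return { count: state.count + 1 };
}
var twoFunctionIncrements = processPendingState(stale, [increment, increment], {});

console.log(objectForm.a, twoObjectIncrements.count, twoFunctionIncrements.count);
```

This is why the function form is the safer choice whenever the next state depends on the previous one: each function sees the result of the entries queued before it.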

ReactUpdates.asap

There is one important feature of ReactUpdates which has not been covered so far: the so-called asap method:

function asap(callback, context) {
  invariant(
    batchingStrategy.isBatchingUpdates,
    'ReactUpdates.asap: Can\'t enqueue an asap callback in a context where' +
    'updates are not being batched.'
  );
  asapCallbackQueue.enqueue(callback, context);
  asapEnqueued = true;
}

This is used in flushBatchedUpdates of ReactUpdates like this:

var flushBatchedUpdates = function() {
  // ReactUpdatesFlushTransaction's wrappers will clear the dirtyComponents
  // array and perform any updates enqueued by mount-ready handlers (i.e.,
  // componentDidUpdate) but we need to check here too in order to catch
  // updates enqueued by setState callbacks and asap calls.
  while (dirtyComponents.length || asapEnqueued) {
    if (dirtyComponents.length) {
      var transaction = ReactUpdatesFlushTransaction.getPooled();
      transaction.perform(runBatchedUpdates, null, transaction);
      ReactUpdatesFlushTransaction.release(transaction);
    }

    if (asapEnqueued) {
      asapEnqueued = false;
      var queue = asapCallbackQueue;
      asapCallbackQueue = CallbackQueue.getPooled();
      queue.notifyAll();
      CallbackQueue.release(queue);
    }
  }
};

This is used exclusively by React’s input elements to avoid some subtle problems with updating them. The strategy works like this: regular callbacks are called after all updates, even nested ones. An asap callback is called right after the end of the current update - so if updates happen to be nested, they have to wait until the asap callbacks finish their work.

Recap

Wow, that was quite a journey! Setting state in React is quite a long process which includes:

Calling setState which enqueues a pending state change to ReactUpdateQueue.
ReactUpdateQueue updates the internal instance of the component with an additional pending state entry and calls ReactUpdates to do its work.
ReactUpdates uses batchingStrategy to be sure all state changes are performed in a transaction, which ensures those changes will get flushed.
flushBatchedUpdates is responsible for performing updates in a sequence and in an atomic way.
ReactUpdatesFlushTransaction ensures nested updates are properly handled.
runBatchedUpdates is responsible for ordering updates in a parent-to-child fashion and calling ReactReconciler methods to update components.
performUpdateIfNecessary is responsible for checking whether it was a prop or state change and calling updateComponent which gathers all changes.
updateComponent has logic for distinguishing types of updates and checks whether there is logic in shouldComponentUpdate which can prevent virtual DOM updates. It also orchestrates calling lifecycle methods within the component (shouldComponentUpdate, componentWillReceiveProps, componentWillUpdate, componentDidUpdate).
_processPendingState is all about applying state changes to the component. It can distinguish between state set and state replace, as well as different styles of passing the first argument (object vs. function).
There are asap callbacks which are used for inputs to prevent subtle bugs while reconciling - they work by running the callback straight after the current batch of updates has been processed.

But practically, what can and can’t you expect from setState? What guarantees do you have?

Guarantees

You have a guarantee that state changes will be processed in order. That means you can be sure calling it with arguments { a: 2 }, { a: 3 } will result in state value a set to 3.
You don’t have a guarantee that React will update the DOM twice after calling setState two times in the same context. In fact it is unlikely.
You don’t have a guarantee that after calling setState your this.state will be immediately set to proper values. You must use the setState callback (second argument) or do it in the lifecycle method (componentDidUpdate).
You have a guarantee that callbacks will be fired in order.
You don’t have a guarantee that a setState callback will be called with state matching exactly the first argument of that setState call. For example, when calling setState with {a: 2} and then with {a: 3}, a callback attached to the first setState may run when the component already has {a: 3} in its state.
You have a guarantee that your state changes will get processed before you handle next event. It is solved by the EVENT_SUPPRESSION wrapper inside ReactReconcileTransaction.
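Several of these points can be observed together in a toy model of one batch. As before, this is not React code: BatchedComponentSketch and its flush method are hypothetical, and flush stands in for the end of the batch in which both setState calls were made.

```javascript
// Toy model of one batch of updates with callbacks (NOT the React source).
function BatchedComponentSketch(state) {
  this.state = state;
  this.queue = [];
  this.callbacks = [];
}

BatchedComponentSketch.prototype.setState = function (partial, callback) {
  this.queue.push(partial);
  if (callback) {
    this.callbacks.push(callback);
  }
};

BatchedComponentSketch.prototype.flush = function () {
  var self = this;
  var nextState = Object.assign({}, this.state);
  // State changes are applied in the order they were enqueued.
  this.queue.forEach(function (partial) {
    Object.assign(nextState, partial);
  });
  this.queue = [];
  this.state = nextState;
  // Callbacks run afterwards, also in order, against the final state.
  var callbacks = this.callbacks;
  this.callbacks = [];
  callbacks.forEach(function (callback) {
    callback.call(self);
  });
};

var seen = [];
var component = new BatchedComponentSketch({ a: 1 });
component.setState({ a: 2 }, function () { seen.push('first saw a=' + this.state.a); });
component.setState({ a: 3 }, function () { seen.push('second saw a=' + this.state.a); });
var beforeFlush = component.state.a;
component.flush();

console.log(beforeFlush, seen);
```

In this model the state is still 1 right after both calls, the final value is 3, the callbacks fire in the order they were registered, and the first callback already sees a equal to 3.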

Summary

Tracing how setState works is a great showcase of how transactions are used in the React codebase. It also gives insight into how such asynchronous flows can be managed in a very synchronous manner. It will surely make you a slightly better React programmer - knowing inside-out how it works and knowing your guarantees can save you some hassle in the future. Also, isn’t it cool and aren’t you curious?

Of course it’s quite an advanced topic. If you’d like to learn the basics, we have the React.js by Example book (with a repository and some videos!) that can introduce you gently by creating cool widgets. If you happen to be a Rails developer and you fancy CoffeeScript, there is also the book from which the original problem came, and which inspired me to delve into React internals - Rails meets React.js, which comes with a repo too.

If you have any questions about React internals, you can ask me - I’ll try to answer them if I know the answer. You can also reach me on Twitter and by e-mail. I’m looking forward to hearing back from you!