Virtual DOM Explained


Introduction

The Virtual DOM was one of React's main differentiators when it first appeared. It gave React a significant advantage over earlier frameworks, and newer libraries, such as Vue.js, went on to adopt the same approach.

Despite all the attention the concept has received over the past few years, several questions still surround it. How does it work behind the scenes? Why is it considered faster than direct DOM manipulation? How does it relate to dirty model checking?


What Is It Trying to Solve?

When you're dealing with a client-side application, you quickly face one issue: DOM manipulation is expensive. If your application is big or very dynamic, the time spent manipulating DOM elements rapidly adds up and you hit a performance bottleneck.


The obvious answer to the problem is to avoid manipulating elements unless strictly necessary. The approach used by Angular, arguably the framework that popularized the concept of SPAs (Single Page Applications), is called dirty model checking.

Example model:

{
  subject: 'World'
}

Example template:

<div>
  <div id="header">
    <h1>Hello, {{model.subject}}!</h1>
    <p>How are you today?</p>
  </div>
</div>

With this approach, the framework keeps tabs on all models. If the model changes, it interpolates/executes the corresponding templates, manipulating the DOM directly. If the model doesn't change, it won't touch the DOM.
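To make the idea concrete, here is a minimal sketch of dirty checking. It is not Angular's actual implementation: the names interpolate, createWatcher and writeToDom are made up for this example, and the "DOM" is just an array that records every write so the code can run outside a browser.

```javascript
// Minimal dirty-checking sketch. All names here are hypothetical.

// Fills {{model.key}} placeholders with values from the model.
function interpolate(template, model) {
  return template.replace(/\{\{model\.(\w+)\}\}/g, (_, key) => String(model[key]));
}

function createWatcher(model, template, writeToDom) {
  let lastSnapshot = null; // serialized copy of the model from the previous check

  // Called whenever something might have changed (an event, a timer, ...).
  return function check() {
    const snapshot = JSON.stringify(model);
    if (snapshot === lastSnapshot) {
      return false; // model is clean: leave the DOM alone
    }
    lastSnapshot = snapshot;
    // Model is dirty: re-render the whole template, static parts included.
    writeToDom(interpolate(template, model));
    return true;
  };
}

// Usage: `writes` stands in for the real DOM.
const writes = [];
const model = { subject: 'World' };
const template = '<h1>Hello, {{model.subject}}!</h1><p>How are you today?</p>';
const check = createWatcher(model, template, (html) => writes.push(html));

check();                    // first run: renders
check();                    // nothing changed: no DOM write
model.subject = 'Mom';
check();                    // dirty: renders again, <p> included
console.log(writes.length); // 2
```

Comparing serialized snapshots is only one way to detect a dirty model; it was chosen here because it keeps the sketch short.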


This is a smart solution, but it still has problems. One of the main issues becomes obvious when a change to your model doesn't translate into a change in every part of the template or, even worse, when your model and template are very complex. In the example above, the p tag never changes, yet it is still updated every time your model is considered dirty. There is nothing between your template and the actual DOM, so the whole thing is rewritten each time.


A simple solution to this problem: add a layer between your template and your DOM!


What's a Virtual DOM?

Basically, it's an in-memory representation of the actual elements that are being created for your page.

Let's go back to that previous HTML:

<div>
  <div id="header">
    <h1>Hello, {{state.subject}}!</h1>
    <p>How are you today?</p>
  </div>
</div>

After rendering, your virtual DOM could be represented as something like this:

{
  tag: 'div',
  children: [
    {
      tag: 'div',
      attributes: {
        id: 'header'
      },
      children: [
        {
          tag: 'h1',
          children: 'Hello, World!'
        },
        {
          tag: 'p',
          children: 'How are you today?'
        }
      ]
    }
  ]
}

Now, let's say our state changed - state.subject is now Mom. The new representation will be:

{
  tag: 'div',
  children: [
    {
      tag: 'div',
      attributes: {
        id: 'header'
      },
      children: [
        {
          tag: 'h1',
          children: 'Hello, Mom!'
        },
        {
          tag: 'p',
          children: 'How are you today?'
        }
      ]
    }
  ]
}

We can now diff the two trees and see that only the h1 changed. We then surgically update that single element, with no need to touch the rest of the tree.
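The diffing step could be sketched roughly as follows. This is a deliberately naive comparison over the { tag, attributes, children } shape used above, not the algorithm React uses. The view and diff functions and the patch format (type, path, node) are assumptions made for this example.

```javascript
// Naive tree diff over the node shape used above. All names are hypothetical.

// Builds the example tree for a given state.
function view(state) {
  return {
    tag: 'div',
    children: [
      {
        tag: 'div',
        attributes: { id: 'header' },
        children: [
          { tag: 'h1', children: 'Hello, ' + state.subject + '!' },
          { tag: 'p', children: 'How are you today?' }
        ]
      }
    ]
  };
}

// Returns a list of patches. `path` holds the child indexes leading to a node.
function diff(oldNode, newNode, path = []) {
  if (oldNode === undefined) return [{ type: 'create', path, node: newNode }];
  if (newNode === undefined) return [{ type: 'remove', path }];

  // A different tag or different attributes: replace the whole subtree.
  // (Comparing JSON strings is order-sensitive, which is fine for a sketch.)
  const sameAttributes =
    JSON.stringify(oldNode.attributes || {}) === JSON.stringify(newNode.attributes || {});
  if (oldNode.tag !== newNode.tag || !sameAttributes) {
    return [{ type: 'replace', path, node: newNode }];
  }

  const oldKids = oldNode.children || [];
  const newKids = newNode.children || [];

  // Text content is stored as a plain string in `children`.
  if (typeof oldKids === 'string' || typeof newKids === 'string') {
    return oldKids === newKids ? [] : [{ type: 'replace', path, node: newNode }];
  }

  // Element children: compare position by position (no keys, no reordering).
  const patches = [];
  const length = Math.max(oldKids.length, newKids.length);
  for (let i = 0; i < length; i++) {
    patches.push(...diff(oldKids[i], newKids[i], path.concat(i)));
  }
  return patches;
}

const patches = diff(view({ subject: 'World' }), view({ subject: 'Mom' }));
console.log(JSON.stringify(patches));
// [{"type":"replace","path":[0,0],"node":{"tag":"h1","children":"Hello, Mom!"}}]
```

The single patch points at path [0, 0], the h1 inside the header div. The p tag produces no patch at all, which is exactly the work that the approach without an intermediate layer could not avoid.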


Let's make things a bit more interesting - we'll write our own naive implementation of a Virtual DOM library!
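
What follows is one way such a naive library could look. It is a sketch under stated assumptions, not React's actual algorithm: the names h, createElement, updateElement and view are made up for this example, virtual nodes are assumed to be built with h, and the doc parameter is assumed to be the browser's document (or any object offering createElement and createTextNode). The element methods used (setAttribute, appendChild, removeChild, replaceChild, childNodes) are standard DOM APIs.

```javascript
// A naive Virtual DOM "library" sketch. All names are hypothetical.

// Shorthand for building virtual nodes in the { tag, attributes, children } shape.
function h(tag, attributes, children) {
  return { tag, attributes: attributes || {}, children: children || [] };
}

// Turns a virtual node into a real element.
function createElement(doc, vnode) {
  const el = doc.createElement(vnode.tag);
  for (const [name, value] of Object.entries(vnode.attributes)) {
    el.setAttribute(name, value);
  }
  if (typeof vnode.children === 'string') {
    el.appendChild(doc.createTextNode(vnode.children));
  } else {
    for (const child of vnode.children) el.appendChild(createElement(doc, child));
  }
  return el;
}

// True when the element at this position cannot be reused as it is.
function changed(a, b) {
  const textInvolved = typeof a.children === 'string' || typeof b.children === 'string';
  return a.tag !== b.tag ||
    JSON.stringify(a.attributes) !== JSON.stringify(b.attributes) ||
    (textInvolved && a.children !== b.children);
}

// Walks the old and new trees together and touches the real DOM only where they differ.
function updateElement(doc, parent, oldNode, newNode, index = 0) {
  if (oldNode === undefined) {
    parent.appendChild(createElement(doc, newNode));
  } else if (newNode === undefined) {
    parent.removeChild(parent.childNodes[index]);
  } else if (changed(oldNode, newNode)) {
    parent.replaceChild(createElement(doc, newNode), parent.childNodes[index]);
  } else if (typeof newNode.children !== 'string') {
    const el = parent.childNodes[index];
    const oldKids = oldNode.children;
    const newKids = newNode.children;
    // Remove surplus children from the end so that earlier indexes stay valid.
    for (let i = oldKids.length - 1; i >= newKids.length; i--) {
      el.removeChild(el.childNodes[i]);
    }
    // Update the shared positions, then append whatever is new.
    for (let i = 0; i < newKids.length; i++) {
      updateElement(doc, el, oldKids[i], newKids[i], i);
    }
  }
}

// The example template expressed with h.
const view = (state) =>
  h('div', null, [
    h('div', { id: 'header' }, [
      h('h1', null, 'Hello, ' + state.subject + '!'),
      h('p', null, 'How are you today?')
    ])
  ]);

// In a browser, assuming an element with id "app" exists:
//   const root = document.getElementById('app');
//   const first = view({ subject: 'World' });
//   updateElement(document, root, undefined, first);  // initial render
//   const second = view({ subject: 'Mom' });
//   updateElement(document, root, first, second);     // only the h1 is replaced
```

Children are matched purely by position here. A real library also has to handle reordered lists, event listeners and finer-grained attribute updates, all of which this sketch leaves out.
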

Conclusion

The Virtual DOM is definitely going to be around for a while. It provides a really nice way of decoupling your application's logic from its DOM elements and, therefore, reduces the likelihood of creating unintentional bottlenecks when it comes to DOM manipulation. Other libraries are moving forward with the same approach, further solidifying the concept as one of the preferred strategies for web applications.


It's worth mentioning that dirty model checking and the virtual DOM are not mutually exclusive. Both emerged as solutions to the same problem, but they tackle it in different ways. An MVC framework could very well implement both techniques. In React's case, it just didn't make much sense, since React is mostly a View library after all.


So, in summary, the Virtual DOM implements:

  • A tree structure representing the DOM elements your application creates.
  • A diff algorithm designed to identify changes between DOM representations.
  • A way to replicate said changes in the actual DOM - but only if necessary.

I consider understanding the virtual DOM one of the cornerstones of mastering React. It certainly gave me more context on some of the choices that went into designing the library, and it even helped me improve my own components and optimization techniques.