Node.js event loop: concurrent processing in a single-threaded environment

When you start a Node.js application by running node index.js, you are in fact creating a Node.js process in which your code will run. Every process has a dedicated memory pool which is shared between its threads. In other words, a variable created in one thread can be read from another.

While multithreading can improve throughput, it also adds complexity. For example, creating and releasing threads can incur significant overhead and hurt performance (a thread pool avoids this overhead, but creating and managing the pool is added complexity). Threads also have to communicate with each other to stay in sync and avoid race conditions.

Node.js avoids these complexities by being single-threaded. That means your JavaScript runs on one thread with one call stack, and all the code you write runs on that single thread (even when it serves multiple requests simultaneously).

Being a single-threaded environment, Node.js should not run a time-consuming function in its main thread as it will block other functions which are ready to be run concurrently. This will affect the throughput.

For example, imagine that you created a microservice with an API that spends 10 seconds on a synchronous (blocking) operation. Now consider the scenario in which users A and B invoke the API together and A's request starts processing first. Since Node.js has only one main thread, B's request has to wait for A's request to complete before the thread becomes free. In effect, B has to wait 20 seconds (10 seconds waiting for the thread + 10 seconds processing the request) to get the result. This is called blocking behavior and is to be avoided.
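The scenario above can be sketched in a scaled-down form. This is a minimal illustration, not a real microservice: the helper names (blockingWork, nonBlockingWork, handleBlocking, handleNonBlocking) are hypothetical, the 10 seconds are shortened to 100 ms, and a busy-wait loop stands in for the blocking operation.

```javascript
// Hypothetical helpers for illustration; the 10 s from the text is scaled down to 100 ms.
const WORK_MS = 100;

// Blocking: a busy-wait loop that occupies the main thread for `ms` milliseconds.
function blockingWork(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin: nothing else can run meanwhile */ }
}

// Non-blocking: the wait is handed off to a timer, so the main thread stays free.
function nonBlockingWork(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

function handleBlocking(user, startedAt) {
  blockingWork(WORK_MS);
  return { user, finishedAfterMs: Date.now() - startedAt };
}

async function handleNonBlocking(user, startedAt) {
  await nonBlockingWork(WORK_MS);
  return { user, finishedAfterMs: Date.now() - startedAt };
}

async function main() {
  let start = Date.now();
  const a = handleBlocking('A', start);
  const b = handleBlocking('B', start); // cannot start until A is done
  console.log('blocking:', a, b);       // B finishes after roughly 2 * WORK_MS

  start = Date.now();
  const results = await Promise.all([
    handleNonBlocking('A', start),
    handleNonBlocking('B', start),
  ]);
  console.log('non-blocking:', results); // both finish after roughly WORK_MS
}

main();
```

In the blocking version B's total time is about twice the work time, while in the non-blocking version both requests wait concurrently.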

Hence Node.js comes with C/C++ APIs that provide asynchronous input/output (I/O) and interaction with the operating system (OS), which allows code execution similar to multithreading without the same shortcomings. The event loop was implemented to coordinate the interactions between these asynchronous components and the main application thread.

Callback queueing; an oversimplified view

Before getting deep into the event loop, let’s take a look at an overly simplified view of how Node.js queues the callbacks to achieve concurrent processing.

Think of Node.js as having only three components.

  1. A callback queue
  2. The main thread
  3. Asynchronous helper libraries for performing the blocking operations such as I/O, OS calls, etc.

Every request you make to run a callback function will put the function into the callback queue for the main thread to pick it up when it is free.

The main thread keeps polling the callback queue for functions to be executed. It will keep executing the functions available in the queue until no callbacks are left to process. The execution is synchronous. The callbacks are executed in the order they are put into the queue.

When a function starts a blocking operation, such as I/O or an interaction with the OS, the operation is handed over to the asynchronous helper libraries in Node.js, and the currently executing function is marked as pending and removed from the callback queue. The other callback functions in the queue can then come forward and be executed by the main thread, while the helper libraries continue their work asynchronously without blocking it. When the operation is done, the helper libraries notify Node.js, and the pending function is placed at the rear of the callback queue to continue its execution.

This is how the queueing of the functions helps to avoid blocking the main thread and makes concurrent processing possible.
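The three-component picture can be mimicked with a toy model. The sketch below is only an illustration of the simplified view, not how Node.js is implemented: enqueue, startIo, and run are hypothetical names, and the "helper library" simply pretends that the I/O finishes once the queue has drained.

```javascript
// Toy model of the simplified view; this is NOT how Node.js works internally.
const callbackQueue = []; // 1. the callback queue
const pending = [];       // continuations waiting on the simulated helper libraries
const log = [];

function enqueue(fn) {
  callbackQueue.push(fn);
}

// 3. Simulated helper library: remembers what to run once the "I/O" is done.
function startIo(name, continuation) {
  log.push(`${name}: I/O handed off`);
  pending.push(continuation);
}

// 2. The "main thread": runs callbacks in FIFO order. When the queue drains,
// it pretends the helpers have finished and moves their continuations to the rear.
function run() {
  while (callbackQueue.length > 0 || pending.length > 0) {
    if (callbackQueue.length === 0) {
      callbackQueue.push(...pending.splice(0));
      continue;
    }
    const fn = callbackQueue.shift();
    fn();
  }
}

enqueue(() => startIo('A', () => log.push('A: done')));
enqueue(() => log.push('B: done'));
run();

console.log(log);
// → [ 'A: I/O handed off', 'B: done', 'A: done' ]
```

B completes while A's simulated I/O is still pending, which is the effect the queueing model is meant to convey.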

As mentioned earlier, this is a ridiculously oversimplified version of what’s happening. In reality, Node.js uses an event loop to achieve this kind of processing flow.

Event loop

When Node.js starts, it initializes the event loop, and processes the provided input script which may make async API calls, schedule timers, or call process.nextTick(), then begins processing the event loop.
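A small sketch of this startup order: the input script runs to completion first, a process.nextTick() callback runs right after it, and only then does a timer callback get its turn in the event loop. The order array is just a hypothetical helper for recording what ran when.

```javascript
const order = [];

setTimeout(() => order.push('timeout callback'), 0);     // handled later, in the event loop
process.nextTick(() => order.push('nextTick callback')); // runs right after the script ends
order.push('main script');                               // runs first, synchronously

// Inspect the recorded order once everything above has run.
setTimeout(() => console.log(order), 10);
// → [ 'main script', 'nextTick callback', 'timeout callback' ]
```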

An event loop is, as the name suggests, a loop of phases that Node.js iterates over and over as long as it has got something to execute. The following diagram shows a simplified overview of the event loop’s order of operations.

Phases of an event loop (Image courtesy nodejs.org)

Each phase has a FIFO queue of callbacks to execute. When the event loop enters a given phase, it executes the callbacks in that phase's queue until the queue is empty or the maximum number of callbacks has been executed. Then the event loop moves on to the next phase, and so on.

This continues until there is no more work to do (no pending timers, I/O operations, or callbacks) or the process is terminated.

Overview of the event loop phases

Let’s take a look at the responsibilities that the event loop performs in each phase.

timers

Callbacks scheduled with timer functions such as setTimeout() and setInterval() are processed in the timers phase. The event loop compares the current time with the scheduled execution time of the earliest timers and executes their callbacks if they are due. Note that, by this mechanism, timed callbacks are not executed exactly when the specified time has elapsed. They are executed the first time the event loop reaches the timers phase after the specified time has elapsed. In other words, the time we specify in the timer functions is the minimum time to wait before the callback is invoked.
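The "minimum time" behavior can be observed with a short sketch. The 10 ms and 50 ms values are arbitrary choices for illustration: a timer is requested for 10 ms, but the main thread is kept busy for 50 ms, so the callback cannot run until the event loop gets to the timers phase afterwards.

```javascript
const requestedMs = 10;
const scheduledAt = Date.now();
let actualDelayMs = null;

setTimeout(() => {
  actualDelayMs = Date.now() - scheduledAt;
  console.log(`requested ${requestedMs} ms, ran after ${actualDelayMs} ms`);
}, requestedMs);

// Keep the main thread busy for 50 ms; the event loop cannot reach
// the timers phase until this synchronous code has finished.
const busyUntil = Date.now() + 50;
while (Date.now() < busyUntil) { /* spin */ }
```

The reported delay is at least 50 ms, even though only 10 ms was requested.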

pending callbacks

This phase executes I/O callbacks that were deferred to the next loop iteration. According to the Node.js documentation, these are callbacks for some system operations, such as certain types of TCP errors. For example, if a TCP socket receives ECONNREFUSED when attempting to connect, some *nix systems want to wait before reporting the error, and the callback is queued to run in the pending callbacks phase. Most other I/O callbacks are executed in the poll phase.

idle, prepare

This is a housekeeping phase that Node.js uses only internally. During this phase, the event loop performs internal operations, primarily gathering information and planning what needs to be executed during the next tick of the event loop.

poll

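The poll phase retrieves new I/O events and executes I/O-related callbacks (almost all callbacks except close callbacks, the ones scheduled by timers, and the ones scheduled by setImmediate()). Note that the main script itself, starting at the beginning of the file given to the node command, is executed before the event loop begins. Depending on the code, it may schedule timers or start asynchronous operations whose callbacks are then run during later phases of the event loop.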

During this phase, the event loop manages the I/O workload: it calls the functions in the poll queue until the queue is empty and calculates how long it should wait before moving to the next phase. All callbacks in this phase are called synchronously in the order in which they were added to the queue, from oldest to newest.

Once the poll queue is empty, the event loop will check for timers whose time thresholds have been reached. If one or more timers are ready, the event loop will wrap back to the timers phase to execute those timers’ callbacks.

When the event loop enters the poll phase and there are no timers scheduled, one of the following things will happen:

  1. If the poll queue is not empty, the event loop will iterate through its queue of callbacks executing them synchronously until either the queue has been exhausted, or the system-dependent hard limit is reached.
  2. If the poll queue is empty, one of two more things will happen:
    • If scripts have been scheduled by setImmediate(), the event loop will end the poll phase and continue to the check phase to execute those scheduled scripts.
    • If scripts have not been scheduled by setImmediate(), the event loop will wait for callbacks to be added to the queue, then execute them immediately.

check

This phase allows us to execute callbacks (scheduled with the setImmediate() function) immediately after the poll phase has completed. If the poll phase becomes idle and scripts have been queued with setImmediate(), the event loop may continue to the check phase rather than waiting.
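One observable consequence of the phase order: when setTimeout(..., 0) and setImmediate() are both scheduled from inside an I/O callback, the setImmediate() callback runs first, because the check phase comes right after the poll phase and before the next timers phase. The sketch below uses fs.stat on the current working directory purely as a convenient asynchronous I/O call, and assumes it completes well within the 100 ms before the result is printed.

```javascript
const fs = require('node:fs');

const order = [];

// fs.stat is asynchronous; the event loop runs its callback as an I/O callback.
fs.stat(process.cwd(), () => {
  setTimeout(() => order.push('timeout'), 0);  // waits for the next timers phase
  setImmediate(() => order.push('immediate')); // runs in the check phase of this iteration
});

// Inspect the recorded order once both callbacks have run.
setTimeout(() => console.log(order), 100);
// → [ 'immediate', 'timeout' ]
```

When the same two calls are made directly from the main script instead, the order is not guaranteed and can vary between runs.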

close callbacks

This phase executes the callbacks of close events. For example, if a socket or handle is closed abruptly (for instance with socket.destroy()), its 'close' event is emitted in this phase. This is when the event loop is wrapping up one cycle and is ready to move to the next one. It is primarily used to clean up the state of the application.


To summarize, one tick of the operation cycle of a Node.js application starts with timers and continues as follows.

  1. Callbacks of timers for which the wait time is up are executed in order from smallest wait time to largest (timers phase).
  2. Then, deferred I/O callbacks are executed (pending callbacks phase).
  3. Then, some internal processing takes place (idle, prepare phase).
  4. Then, new I/O events are retrieved and the poll queue callbacks are executed (poll phase).
  5. Then, callbacks of setImmediate() and close event callbacks are called (check and close callbacks phases).

This cycle repeats as long as there is code that needs to be executed.

