Project Reactor – introduction

Reactive programming makes it possible to build scalable, asynchronous, non-blocking, event-driven applications.
Project Reactor is an implementation of the Reactive Programming paradigm based on the Reactive Streams specification. Sounds complicated? Don't worry, in this article I'll start from the basics.

Reactive Programming

Reactive Programming is nothing new; its origins can probably be traced back to the 1970s or even earlier. But the 21st century loves its buzzwords, so of course Reactive Programming had to become one of them.
Let’s take a look at the Reactive Manifesto.

Reactive Manifesto

The Reactive Manifesto consists of four parts:
1. Responsive
2. Resilient
3. Elastic
4. Message Driven

The first three are more about architecture than plain programming, so we will take a closer look at the fourth.

Message Driven

Reactive Systems rely on asynchronous message-passing to establish a boundary between components that ensure loose coupling, isolation and location transparency. This boundary also provides the means to delegate failures as messages. Employing explicit message-passing enables load management, elasticity, and flow control by shaping and monitoring the message queues in the system and applying backpressure when necessary. Location transparent messaging as a means of communication makes it possible for the management of failure to work with the same constructs and semantics across a cluster or within a single host. Non-blocking communication allows recipients to only consume resources while active, leading to less system overhead.

In this part, the three most important ideas for us are:
1. Failures as messages
2. Backpressure
3. Non-blocking

Let’s start from the first one.

Failures as Messages

In the Reactive world everything is a message, even errors. Data is prepared by the Producer and sent to every Subscriber as a message. But what if an error occurs on the Producer side?
The Producer can catch the exception and finish streaming, but what about a Subscriber that wasn't aware of the exception?
That is why we need to send errors to subscribers too. So even errors are passed as messages.
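The JDK's Flow API (available since Java 9) can illustrate this idea; the class and method names below are my own, made up for the sketch. The publisher terminates the stream with an exception, and the subscriber receives it as an ordinary `onError` message rather than a thrown exception:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.atomic.AtomicReference;

public class ErrorAsMessageDemo {

    static String runDemo() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<Throwable> error = new AtomicReference<>();

        SubmissionPublisher<String> publisher = new SubmissionPublisher<>();
        publisher.subscribe(new Flow.Subscriber<String>() {
            public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
            public void onNext(String item) { System.out.println("Received: " + item); }
            // The failure arrives as an ordinary message, not a thrown exception.
            public void onError(Throwable t) { error.set(t); done.countDown(); }
            public void onComplete() { done.countDown(); }
        });

        publisher.submit("some data");
        // Something broke on the producer side: terminate the stream with an error.
        publisher.closeExceptionally(new IllegalStateException("boom"));

        done.await();
        return error.get().getMessage();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Stream ended with: " + runDemo()); // Stream ended with: boom
    }
}
```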


Backpressure

What if the Publisher is really fast and produces a lot of data, but the Subscriber needs some time to process all of it? In other words, what if the Subscriber is slower than the Publisher? In this case we can use the backpressure mechanism, one of the most important mechanisms in the Reactive world. It's quite simple to use and lets us limit the incoming data: the Subscriber can send a message upstream to the Publisher, informing it about the maximum amount of data it can accept.
Let’s look at the diagram below.

As you can see in the diagram, the Subscriber sends a request to the Publisher. When the Subscriber needs less data than the Publisher produces, the data is served from the buffer; otherwise, the Subscriber receives "fresh" data just created by the Publisher.
As you can imagine, the buffer has limited space and cannot keep all the data forever. We can cap the buffer by the amount of data or by time.
By default the limit is effectively unbounded (the maximum Integer value).
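Here is a sketch of that contract with the JDK Flow API (the class name is my own): instead of requesting `Long.MAX_VALUE` up front, the subscriber requests one element at a time, so the publisher never pushes more than was asked for:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1); // initial demand: a single element
    }
    @Override public void onNext(Integer item) {
        received.add(item);
        subscription.request(1); // ask for the next one only when we are ready
    }
    @Override public void onError(Throwable t) { done.countDown(); }
    @Override public void onComplete() { done.countDown(); }

    public static void main(String[] args) throws InterruptedException {
        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        publisher.subscribe(subscriber);
        for (int i = 1; i <= 5; i++) publisher.submit(i); // buffered until requested
        publisher.close();
        subscriber.done.await();
        System.out.println(subscriber.received); // [1, 2, 3, 4, 5]
    }
}
```

Note that `SubmissionPublisher` buffers the submitted items (its default per-subscriber buffer is `Flow.defaultBufferSize()`) and hands them out only as fast as the subscriber requests them.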


Non-blocking

From most Java developers' perspective, the reaction is something like:
– Who cares?
– Modern hardware is so powerful, I can use as many threads as I want!
– A thread per request? Perfection!

Of course, I exaggerate… 🙂 but Java can use a lot of threads, so Java just does it.
Let's take a look at the thread's perspective in the "thread per request" approach.

In the standard approach, servers like e.g. Tomcat create a new thread per request (the default pool size is 200). When an I/O operation occurs, e.g. a DB call, the thread has to wait until the operation is done – the thread is blocked. A blocked thread, even though it does nothing, still consumes resources – as you can imagine, this is highly inefficient.
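In code, the blocking style boils down to this (`findUser` is a made-up stand-in for a real DB driver call):

```java
import java.util.concurrent.TimeUnit;

public class BlockingDemo {

    // Hypothetical blocking DB call: the calling thread simply waits for the reply.
    static String findUser() {
        try { TimeUnit.MILLISECONDS.sleep(100); } // simulated I/O latency
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "user-42";
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String user = findUser(); // this thread is parked for the whole call
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Loaded " + user + " after " + elapsedMs + " ms of doing nothing");
    }
}
```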
The non-blocking approach looks like this:

Non-blocking servers like e.g. Netty do not create a thread per request; instead, a limited number of threads is created. The standard is one thread per CPU core, so if your CPU has eight cores, eight threads will be created when the server starts.
A thread starts processing a request, and when it encounters an I/O operation, instead of waiting for the result it finishes its job right there: it is no longer tied to that particular request, so the thread is never blocked. While the I/O operation is in progress, the thread can process other requests. Once the I/O operation is done, the thread is notified about it. Sounds great, huh?
But how is it notified? And what is the white space between the thread and the DB driver in the diagram? The answer is the Event Loop.
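This is illustrative only (not actual Netty internals), but `CompletableFuture` shows the shape of the non-blocking style: the caller registers a callback and keeps working instead of waiting for the result. The `findUser` method is a made-up stand-in for a non-blocking DB driver:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class NonBlockingDemo {

    // Hypothetical non-blocking DB call: returns a future immediately.
    static CompletableFuture<String> findUser() {
        return CompletableFuture.supplyAsync(() -> {
            try { TimeUnit.MILLISECONDS.sleep(100); } // simulated I/O latency
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return "user-42";
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> user = findUser();            // returns at once
        user.thenAccept(u -> System.out.println("Loaded " + u)); // callback runs later
        System.out.println("Main thread keeps working");         // not blocked
        user.join(); // demo only: keep the JVM alive until the callback fires
    }
}
```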

Event Loop

The Event Loop is a well-known mechanism from the JavaScript world, where everything runs on a single thread. Node.js works on the Google V8 engine, which uses only one thread, and yet Node.js is super efficient.
In the non-blocking Java world, the Event Loop runs on its own single thread. So when e.g. Netty runs 8 threads, one of them is dedicated to the Event Loop.
So it looks like the following:

To be precise, Netty uses Event Loop Groups, each of which aggregates several Event Loops, but the main concept is as described above. For more details, take a look at your server's documentation.
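The core idea can be sketched in a few lines of plain Java. This is a toy model (not Netty's implementation): a single thread drains a queue of events and runs a handler for each one, one at a time.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MiniEventLoop implements Runnable {
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    private volatile boolean running = true;

    // Any thread may post an event; only the loop thread ever executes it.
    public void post(Runnable event) { events.add(event); }

    // Stopping is itself just another event in the queue.
    public void stop() { post(() -> running = false); }

    @Override public void run() { // the single event-loop thread
        try {
            while (running) {
                events.take().run(); // one event at a time, never blocked on I/O
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A real event loop also multiplexes I/O readiness notifications (e.g. via `java.nio.channels.Selector`), but the single-threaded dispatch shown here is the essence.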

Reactive Streams

Now that we know the main Reactive Programming concepts, it's a good time to go deeper into the Reactive world. Let's take a look at Reactive Streams.
Reactive Streams is simply a standard for asynchronous stream processing with non-blocking backpressure. For programmers, it's just an API for Reactive Programming (much like JPA or JDBC). The Reactive Streams API is the product of a collaboration between engineers from Netflix, Pivotal, Red Hat, Twitter, and many others.
The Reactive Streams API is really simple.

As you can see, there are only four interfaces: Publisher, Subscriber, Subscription, and Processor.
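For reference, these are the four interfaces as defined in org.reactivestreams (the java.util.concurrent.Flow versions are identical, just nested inside the Flow class; in the real library each is a public top-level interface):

```java
interface Publisher<T> {
    void subscribe(Subscriber<? super T> s);
}

interface Subscriber<T> {
    void onSubscribe(Subscription s);
    void onNext(T t);
    void onError(Throwable t); // errors arrive as messages
    void onComplete();
}

interface Subscription {
    void request(long n); // backpressure lives here
    void cancel();
}

interface Processor<T, R> extends Subscriber<T>, Publisher<R> {
}
```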

Pure Java did not support Reactive Streams until version 9.
For Java 8, the Reactive Streams API is available as a separate jar – org.reactivestreams.
In Java 9, Reactive Streams became part of the official API under the Flow class in the java.util.concurrent package. Java doesn't offer many implementations of these interfaces – the only built-in publisher is SubmissionPublisher – and this is where external libraries come to help.
The most popular Reactive Streams API implementations are:
- Project Reactor
- RxJava
- Akka Streams

We will focus on the first one, because Spring 5 uses it by default.
Project Reactor and RxJava are really similar, so if you know one of them, switching to the other will be easy.


This post covered the basics of Reactive Programming and Reactive Streams; now you should be ready to explore Project Reactor. In the next post we'll create our first publishers, subscribers, etc., I will show you how to manage threads, and I'll explain the difference between cold and hot streams.
