Understanding message brokers. Learning the mechanics of messaging through ActiveMQ and Kafka. Chapter 1

Hello!

I have started translating a short book:

"Understanding Message Brokers",

author: Jakub Korab, publisher: O'Reilly Media, Inc., publication date: June 2017, ISBN: 9781492049296.

From the introduction to the book:

"... This book will teach you how to reason about broker-based messaging systems by comparing and contrasting two popular broker technologies: Apache ActiveMQ and Apache Kafka. It outlines the use cases and design motivations that led their developers to take very different approaches to the same domain: the exchange of messages between systems through a broker intermediary. We will look at these technologies from the ground up and highlight the impact of various design choices along the way. You will gain a deep understanding of both products, of how they should and should not be used, and of what to look for when considering other messaging technologies in the future ..."

Parts translated to date:

Chapter 1. Introduction

Chapter 3. Kafka

I will post the completed chapters as they are translated.

CHAPTER 1 Introduction

Intersystem messaging is one of the least understood areas of IT. As a developer or architect, you may be well acquainted with various frameworks and databases. However, it is likely that you only have a passing acquaintance with how broker-based messaging technologies work. If you feel that way, don’t worry, you are in good company.

People usually have very limited contact with messaging infrastructure. Often they connect to a system that was set up long ago, or they download a distribution from the Internet, install it in production, and start writing code against it. Once the infrastructure is running in production, the results can be mixed: messages are lost on failure, distribution does not work the way you expected, or brokers "hang" your producers or do not deliver messages to your consumers.

Sound familiar?

A common scenario is that your messaging code works fine, until it doesn't. This period lulls you into a false sense of security, which leads to even more code built on false assumptions about the fundamental behavior of the technology. When things start to go wrong, you are left facing an uncomfortable truth: you never really understood the underlying behavior of the product or the trade-offs its authors chose, such as performance versus reliability, or transactionality versus horizontal scalability.

Without a deep understanding of how brokers work, people make seemingly reasonable statements about their messaging systems, such as:

The system will never lose messages.
Messages will be processed sequentially.
Adding consumers will make the system faster.
Messages will be delivered exactly once.

Unfortunately, some of these statements are based on assumptions that are applicable only in certain circumstances, while others are simply incorrect.

This book will teach you how to reason about broker-based messaging systems by comparing and contrasting two popular broker technologies: Apache ActiveMQ and Apache Kafka. It outlines the use cases and design motivations that led their developers to take very different approaches to the same domain: the exchange of messages between systems through a broker intermediary. We will look at these technologies from the ground up and highlight the impact of various design choices along the way. You will gain a deep understanding of both products, of how they should and should not be used, and of what to look for when considering other messaging technologies in the future.

Before we get started, let's go through the basics.

What is a messaging system and why is it needed

For two applications to communicate with each other, they must first define an interface. Defining this interface involves choosing a transport or protocol, such as HTTP, MQTT, or SMTP, and agreeing on the format of the messages the systems will exchange. This can be a rigorous process, such as defining an XML schema that specifies the message payload, or it can be much less formal, for example, an agreement between two developers that some part of the HTTP request will contain a client identifier.

As long as the message formats and the order in which messages are sent between the systems stay consistent, the systems can interact without worrying about how the other side is implemented. The internals of these systems, such as the programming language or the framework used, may change over time. As long as the contract itself is honored, the interaction can continue without changes on the other side. The two systems are effectively decoupled by this interface.

Messaging systems typically place an intermediary between the two interacting systems in order to further decouple the sender from the recipient or recipients. The messaging system allows the sender to send a message without knowing where the recipient is located, whether it is active, or how many instances of it exist.

Let's look at a couple of analogies for the kinds of problems that a messaging system solves, and introduce some basic terms.

Point-to-point

Alexandra goes to the post office to send Adam a parcel. She walks up to the counter and hands the parcel to the clerk. The clerk takes the parcel and gives Alexandra a receipt. Adam does not need to be at home when the parcel is sent. Alexandra is confident that the parcel will be delivered to Adam at some point in the future and can get on with her day. At some later point, Adam receives the parcel.

This is an example of a point-to-point messaging model. The post office here acts as a parcel distribution mechanism, ensuring that each parcel is delivered once. The use of the post office separates the act of sending the parcel from the delivery of the parcel.

In classic messaging systems, the point-to-point model is implemented through queues. A queue acts as a FIFO (first in, first out) buffer that one or more consumers can subscribe to. Each message is delivered to only one of the subscribed consumers. Queues usually try to distribute messages fairly among their consumers.
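The queue behavior described above can be sketched as a toy in-memory model. This is an illustration only, not how ActiveMQ or any real broker is implemented; the class name `PointToPointQueue` and the round-robin dispatch policy are assumptions made for the example.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: a FIFO buffer where each message goes to exactly one consumer."""

    def __init__(self):
        self._buffer = deque()   # pending messages, oldest first
        self._consumers = []     # subscribed consumer callbacks
        self._next = 0           # counter used for round-robin dispatch

    def subscribe(self, consumer):
        self._consumers.append(consumer)

    def send(self, message):
        self._buffer.append(message)

    def dispatch(self):
        """Hand out buffered messages in FIFO order, rotating over the consumers."""
        while self._buffer and self._consumers:
            message = self._buffer.popleft()
            consumer = self._consumers[self._next % len(self._consumers)]
            self._next += 1
            consumer(message)


received_a, received_b = [], []
queue = PointToPointQueue()
queue.subscribe(received_a.append)
queue.subscribe(received_b.append)
for n in range(4):
    queue.send(f"order-{n}")
queue.dispatch()

print(received_a)  # ['order-0', 'order-2']
print(received_b)  # ['order-1', 'order-3']
```

No message appears in both lists: the two consumers share the work rather than each receiving a copy.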

Queues are often described as durable. Durability is a quality of service that guarantees that the messaging system will retain messages while there are no active subscribers, until a consumer subscribes to the queue to take delivery of them.

Durability is often confused with persistence and, although the two terms are frequently used interchangeably, they serve different purposes. Persistence determines whether the messaging system writes a message to some form of storage between receiving it and dispatching it to a consumer. Messages sent to a queue may or may not be persistent.
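To make the difference between the two properties concrete, here is a minimal sketch under simplifying assumptions. `ToyQueue` and its journal file are hypothetical and do not correspond to any broker's storage format; the journal is never trimmed, so the restart demonstration only holds while nothing has been delivered yet.

```python
import json
import os
import tempfile
from collections import deque


class ToyQueue:
    """Holds messages until a consumer subscribes; optionally journals them to disk."""

    def __init__(self, journal_path=None):
        self._journal_path = journal_path   # None means messages are not persistent
        self._pending = deque()
        self._consumer = None
        # Persistence: on startup, reload whatever was written to the journal.
        if journal_path is not None and os.path.exists(journal_path):
            with open(journal_path, encoding="utf-8") as journal:
                self._pending.extend(json.loads(line) for line in journal)

    def send(self, message):
        if self._journal_path is not None:
            with open(self._journal_path, "a", encoding="utf-8") as journal:
                journal.write(json.dumps(message) + "\n")
        self._pending.append(message)
        self._deliver()

    def subscribe(self, consumer):
        self._consumer = consumer
        self._deliver()

    def _deliver(self):
        while self._consumer is not None and self._pending:
            self._consumer(self._pending.popleft())


# The message is retained although no consumer exists yet, and arrives on subscribe.
volatile = ToyQueue()
volatile.send("deposit-1")
late = []
volatile.subscribe(late.append)

# A simulated restart (a brand-new object) only recovers journaled messages.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "orders.journal")
    before_restart = ToyQueue(journal_path=path)
    before_restart.send("deposit-1")
    before_restart.send("deposit-2")
    recovered = []
    ToyQueue(journal_path=path).subscribe(recovered.append)

lost = []
ToyQueue().subscribe(lost.append)   # a fresh queue without a journal has nothing to deliver
```

In this sketch, retaining messages for an absent consumer and surviving a restart are separate mechanisms, which is why a message can have one property without the other.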

Point-to-point messaging is used when a use case calls for a message to be acted on exactly once. Examples include depositing funds into an account or fulfilling a shipping order. We will discuss later why the messaging system on its own cannot provide exactly-once delivery, and why queues can at best provide an at-least-once delivery guarantee.

Publish-subscribe
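One way to see why at-least-once delivery produces duplicates is the sketch below. It assumes a toy broker that redelivers any message it has not seen acknowledged, and a consumer that processes a message but fails before the acknowledgment gets through. The names `Broker` and `deposit`, and the use of `ConnectionError` to simulate a lost acknowledgment, are all hypothetical. The consumer protects itself by remembering message IDs, a common idempotency technique rather than something the broker provides.

```python
class Broker:
    """Toy broker: a message is dropped only after the consumer acknowledges it."""

    def __init__(self, messages):
        self._unacked = list(messages)   # (message_id, amount) pairs

    def deliver(self, consumer):
        """Deliver all unacknowledged messages and return how many remain unacked."""
        still_unacked = []
        for message in self._unacked:
            try:
                consumer(message)               # returning normally counts as an ack
            except ConnectionError:
                still_unacked.append(message)   # no ack seen, so redeliver later
        self._unacked = still_unacked
        return len(self._unacked)


account = {"balance": 0}
seen_ids = set()
fail_next_ack = {"armed": True}


def deposit(message):
    message_id, amount = message
    if message_id not in seen_ids:    # idempotency check: skip duplicates
        seen_ids.add(message_id)
        account["balance"] += amount
    if fail_next_ack["armed"]:        # simulate losing the ack after processing
        fail_next_ack["armed"] = False
        raise ConnectionError("ack lost after processing")


broker = Broker([("m1", 100), ("m2", 50)])
remaining_first = broker.deliver(deposit)    # m1 is processed but never acknowledged
remaining_second = broker.deliver(deposit)   # m1 is redelivered and ignored as a duplicate

print(account["balance"])  # 150
```

Without the `seen_ids` check, the redelivered `m1` would be deposited twice and the balance would be 250.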

Gabriella dials in to a conference call. While she is connected, she hears everything the speaker says, as do the other participants on the call. When she disconnects, she misses whatever is said. When she reconnects, she hears what is being said from that point on.

This is an example of the publish-subscribe messaging model. The conference call acts as a broadcast mechanism. The speaker does not care how many people are currently on the call; the system ensures that anyone connected at that moment hears what is being said.

In classic messaging systems, the publish-subscribe model is implemented through topics. A topic provides the same kind of broadcast facility as the conference call. When a message is sent to a topic, it is distributed to all subscribed consumers.

Topics are usually nondurable. Just as a listener who drops off a conference call does not hear what is said in the meantime, subscribers to a topic miss any messages sent while they are offline. For this reason, topics can be said to provide an at-most-once delivery guarantee for each consumer.
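A nondurable topic can be sketched in a few lines. This is a toy model for illustration; the `Topic` class and its method names are assumptions, not the API of any messaging product.

```python
class Topic:
    """Toy nondurable topic: every current subscriber gets a copy, nothing is stored."""

    def __init__(self):
        self._subscribers = {}   # subscriber name -> callback

    def subscribe(self, name, callback):
        self._subscribers[name] = callback

    def unsubscribe(self, name):
        self._subscribers.pop(name, None)

    def publish(self, message):
        # Broadcast to whoever is connected right now; absent subscribers miss out.
        for callback in list(self._subscribers.values()):
            callback(message)


dashboard, logger = [], []
temperature = Topic()
temperature.subscribe("dashboard", dashboard.append)
temperature.subscribe("logger", logger.append)

temperature.publish(21.5)
temperature.unsubscribe("dashboard")                   # dashboard goes offline
temperature.publish(21.7)                              # only the logger sees this reading
temperature.subscribe("dashboard", dashboard.append)   # dashboard reconnects
temperature.publish(21.9)

print(dashboard)  # [21.5, 21.9]
print(logger)     # [21.5, 21.7, 21.9]
```

The reading published while the dashboard was offline is never replayed to it, which is the at-most-once behavior described above.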

Publish-subscribe messaging is usually used when messages are informational in nature and the loss of one message is not particularly significant. For example, a topic can transmit temperature readings from a group of sensors once per second. A system that is interested in the current temperature and which subscribes to the topic will not worry if it misses a message - another will arrive in the near future.

Hybrid models

A store's website places order messages on a message queue. The main consumer of these messages is a fulfillment system. In addition, an audit system needs copies of these order messages for later tracking. Neither system can afford to miss messages, even if it is unavailable for some time. The website should not need to know about the other systems.

Use cases often call for a combination of the publish-subscribe and point-to-point messaging models, for example, when multiple systems each need a copy of a message, and both durability and persistence are required to prevent message loss.

In these cases, a destination (the general term for queues and topics) is required that distributes messages largely like a topic, so that each message is sent to every system interested in it, but in which each system can also define multiple consumers that share the incoming messages, which is more like a queue. The read semantics in this case are once per interested party. These hybrid destinations often require durability, so that if a consumer disconnects, the messages sent in the meantime are received once it reconnects.

Hybrid models are not new and can be implemented in most messaging systems, including both ActiveMQ (through virtual or composite destinations, which combine topics and queues) and Kafka (implicitly, as a fundamental property of its destination design).
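The hybrid behavior can be sketched as a destination that keeps a separate durable backlog per interested system. This is a conceptual toy, not the mechanism ActiveMQ or Kafka actually uses; the names `HybridDestination`, `register`, and `attach`, and the group names `processing` and `audit`, are hypothetical.

```python
from collections import deque


class HybridDestination:
    """Toy hybrid: topic-like fan-out across groups, queue-like sharing within a group."""

    def __init__(self):
        self._groups = {}   # group name -> backlog, consumers, round-robin counter

    def register(self, group):
        """Make the destination retain messages for this group, even while it is offline."""
        self._groups.setdefault(group, {"backlog": deque(), "consumers": [], "next": 0})

    def attach(self, group, consumer):
        self.register(group)
        self._groups[group]["consumers"].append(consumer)
        self._drain(group)   # deliver anything that accumulated while offline

    def publish(self, message):
        for group in self._groups:   # each group receives its own copy
            self._groups[group]["backlog"].append(message)
            self._drain(group)

    def _drain(self, group):
        state = self._groups[group]
        while state["backlog"] and state["consumers"]:
            consumer = state["consumers"][state["next"] % len(state["consumers"])]
            state["next"] += 1
            consumer(state["backlog"].popleft())


worker_1, worker_2, audit = [], [], []
orders = HybridDestination()
orders.attach("processing", worker_1.append)
orders.attach("processing", worker_2.append)
orders.register("audit")               # known to the destination, but currently offline
for n in range(3):
    orders.publish(f"order-{n}")
orders.attach("audit", audit.append)   # audit connects and receives its backlog

print(worker_1)  # ['order-0', 'order-2']
print(worker_2)  # ['order-1']
print(audit)     # ['order-0', 'order-1', 'order-2']
```

Each message is read once per interested party: the two processing workers split the orders between them, while the audit group still receives every order despite having been offline when they were published.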

Now that we have some basic terminology and an understanding of what a messaging system might be useful for, let's move on to the details.

Translation completed: tele.gg/middle_java

Next translated part: Chapter 3. Kafka

To be continued...
