
Simple Way to Monitor the Spring Boot Apps
18 Mar 2021

Administration of Spring Boot applications using Spring Boot Admin.

This includes health status, various metrics, log level management, JMX-bean interaction, thread dumps, request traces, and much more. Spring Boot Admin is a community project initiated and maintained by codecentric.

Spring Boot Admin provides a UI to monitor your Spring Boot applications and perform some administrative tasks on them.

The project is open source, so you can add your own customizations if you want to.

Git Repo: Link

That should give you an idea of what this project is about, so we will directly start with an example.

Spring Boot provides actuator endpoints to monitor the metrics of individual microservices. These endpoints are very helpful for getting information about applications, for example whether they are up and whether their components, like the database, are working well. A major drawback of using actuator endpoints directly, though, is that we have to hit each application's endpoints individually to learn its status or health. Imagine a system involving 150 microservices: an administrator would have to hit the actuator endpoints of all 150 applications. To help us deal with this situation, we use the Spring Boot Admin app.
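
To make that concrete, without an admin UI you would end up polling every instance's health endpoint yourself. Here is a naive sketch of that manual approach; the service URLs are hypothetical, and the path is /health on Spring Boot 1.x as used in this article (/actuator/health on 2.x):

import java.util.Arrays;
import java.util.List;
import org.springframework.web.client.RestTemplate;

public class NaiveHealthChecker {
    public static void main(String[] args) {
        // Hypothetical list of service base URLs; in reality this could be 150+ entries
        List<String> services = Arrays.asList(
                "http://stock-service:8080",
                "http://order-service:8080",
                "http://vision-service:8080");

        RestTemplate restTemplate = new RestTemplate();
        for (String baseUrl : services) {
            // Returns the raw health JSON, e.g. {"status":"UP", ...}.
            // A service that is down would throw here, which this naive sketch does not handle.
            String health = restTemplate.getForObject(baseUrl + "/health", String.class);
            System.out.println(baseUrl + " -> " + health);
        }
    }
}

Spring Boot Admin removes the need for this kind of manual polling: each client registers itself with the admin server, which keeps the health overview up to date for you.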

Sample Code:

To implement this, we will create two projects: one for the server and another for the client.

  1. Spring Boot Admin server.
  2. Spring Boot Admin client.

Spring Boot Admin Server:

The project structure should look like that of any Spring Boot application.

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.techwasti</groupId>
    <artifactId>spring-boot-admin</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>spring-boot-admin</name>
    <description>Demo project for Spring Boot</description>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.4.RELEASE</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <!-- admin dependencies -->
        <dependency>
            <groupId>de.codecentric</groupId>
            <artifactId>spring-boot-admin-server-ui-login</artifactId>
            <version>1.5.1</version>
        </dependency>
        <dependency>
            <groupId>de.codecentric</groupId>
            <artifactId>spring-boot-admin-server</artifactId>
            <version>1.5.1</version>
        </dependency>
        <dependency>
            <groupId>de.codecentric</groupId>
            <artifactId>spring-boot-admin-server-ui</artifactId>
            <version>1.5.1</version>
        </dependency>
        <!-- end admin dependencies -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-security</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

We need to configure security as well since we are accessing sensitive information:

package com.techwasti;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

import de.codecentric.boot.admin.config.EnableAdminServer;

@EnableAdminServer
@Configuration
@SpringBootApplication
public class SpringBootAdminApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootAdminApplication.class, args);
    }

    @Configuration
    public static class SecurityConfig extends WebSecurityConfigurerAdapter {
        @Override
        protected void configure(HttpSecurity http) throws Exception {
            http.formLogin().loginPage("/login.html").loginProcessingUrl("/login").permitAll();
            http.logout().logoutUrl("/logout");
            http.csrf().disable();

            http.authorizeRequests().antMatchers("/login.html", "/**/*.css", "/img/**", "/third-party/**").permitAll();
            http.authorizeRequests().antMatchers("/**").authenticated();

            http.httpBasic();
        }
    }

}

application.properties file content:

spring.application.name=SpringBootAdminEx
server.port=8081
security.user.name=admin
security.user.password=admin

Run the app and open localhost:8081 in your browser.

Enter the username and password and click the login button.

As this is a sample example, we hardcoded the username and password, but you can use Spring Security to integrate LDAP or any other authentication mechanism.

Spring Boot Admin can be configured to display only the information that we consider useful.

spring.boot.admin.routes.endpoints=env, metrics, trace, info, configprops

Notifications and Alerts:

We can send notifications and alerts using any of the channels below.

  • Email
  • PagerDuty
  • OpsGenie
  • Hipchat
  • Slack
  • Let’s Chat

Spring Boot Admin Client:

Now that the admin server application is ready, let us create the client application. Create any HelloWorld Spring Boot application, or if you have an existing Spring Boot app, you can use it as the client application.

Add the Maven dependencies below:

<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-starter-client</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Next, update application.properties and add the following properties

spring.boot.admin.url=http://localhost:8081
spring.boot.admin.username=admin
spring.boot.admin.password=admin

That's all the changes needed in the client application; now run it. Once the client application is up and running, check your admin server application again. It will list all your registered applications.

Beautiful Dashboards:

Spring Boot Admin provides a wide range of features. In this article we have covered only a few of them, so dig deeper and explore more.

Stop this Microservices Madness
26 Mar 2021

I’m a big fan of Microservices.
Does it sound weird, given the title? Well, it should not.

Every now and then, a new paradigm or technology arises. It brings hope and hype. This will save us all!
Yeah, it’s true. Maybe it is ground-breaking. Extremely beneficial. However, like everything in life, every solution has its scope.

Have you ever tried to apply Newtonian laws to subatomic particles? Or, have you ever tried to squeeze a television into a sock? Pretty much the same here.
Hype makes you forget this little clause at the end of the contract. And results are catastrophic.

Proper usage of socks and televisions. Photo by Mollie Sivaram on Unsplash

Splitting is the way

In a normal web application, you have a single process serving all requests. Well, you can replicate it for balancing reasons, but in any case there’s a single package/software you have to run. And it bundles all the logic together.

When you use Microservices, instead, you have several, distinct processes, each one serving just a subset of requests. Each process runs a distinct package, with just the code it needs.

Is it really new?

No, it isn’t.

The Microservice pattern is nothing new, actually, like almost every emergent design (besides it has been around for some years).
It is Object-Oriented Programming applied on a higher level: application architecture.

No coincidence that it often makes use of HTTP APIs, particularly RESTful, which are in turn OOP principles applied to API design.
P.S. I hate acronyms.

Let me explain.

When you create a program using Object-Oriented Programming, you split it into several pieces, Classes. Each Class represents a component of the solution and has a specific role. It also works just on its own subset of data. Does it sound familiar?

This is not the right place to discuss about Object-Oriented Programming in general, but I have to point out some aspects. If you are not familiar with it, you can reference online documentation and, soon, my other article about it!

The Two Principles — The Good Tailor

When you design Classes, you must follow two main principles: loose coupling and high cohesion.

To be honest, these should be the guiding stars of each software project, not just OOP ones. But that’s another story. Or is it?

High cohesion is the situation where each component is strongly consistent. All its data and methods refer to a very specific area, or meaning, or role.

Loose coupling is the situation where each component has the least possible amount of interactions with other components. It can work on its own, hiding low-level details, and exposes the essential interfaces.

For Microservices it is the same.
I can’t stress this enough. A good OOP design leads to an optimal Microservice design. There is no middle ground.

This should be almost enough to answer why and when to use Microservices. But since there is more, let’s break it down.

Be a Good Tailor. Loosely sew highly-entangled (coherent) fabrics. Photo by Jeff Wade on Unsplash

When to split?

This is exactly as it is for Object-Oriented Programming.

When you can respect high cohesion and loose coupling.

No differently.

You know how to split code into Classes in the right way. Why would you split your Microservices randomly?
If a bad Class diagram is painful to maintain and manage because of how code is scattered across different files, think of a randomly split Microservices application, where you have different processes altogether.

But why on Earth would I split my application into sub-processes?

Imagine how easy it is to make two Objects interact. You just have to call an Object's method and you are good to go. All within a single program.

Microservices, instead, live in different processes. Possibly, on different machines. You have to communicate over a network, using APIs.

It adds complexity!
You must have excellent reasons to do so.
Not for trend, not for fun. Because, I assure you, if you use Microservices randomly, their management is not fun at all.

Actually, you must have the same reasons you’d have when you decide to use APIs. It’s the same.

APIs let you hide all the implementation details behind them. That’s essential in some cases, and it offers incredible benefits.

1. Scalability for non-uniform traffic

Scenario: you have a Supermarket application.
You have a Stock Microservice, that just shows you the amounts of articles in stock, and a Vision Microservice, which uses GPUs to detect your articles given a picture of them.

If you receive thousands of calls for your Stock APIs and just a bunch of requests for your Vision APIs, you could just replicate your Stock Microservice.

No need to double your GPU resources!

Think of it. If you kept everything in a single machine or application, and you still wanted to scale to match all the incoming requests, you would have to replicate it entirely. A waste.

2. Error resilience

Suppose you have a Bank application. Your Transfer Microservice keeps crashing, maybe for a temporary error or, worse, for a newly introduced bug (damn untested releases…).

Why should the entire Bank application be down? Maybe some customers wouldn’t even notice, because they just want to see their movements or they want to use their cards.

3. Separate deployments

Similarly to the reason above, if you have a new feature to add or a bug to fix, you can just deploy the right Microservice. Your build and deployment times should be shorter.

Also, you could put your Microservices on different machines, in different locations, maybe.

4. Complete isolation

Maybe you need complete isolation between Microservices. In this way, you can use different DBs (SQL and noSQL) or different technologies altogether. This will give you complete freedom!
And it is of course strictly related to the other points as well, about error resilience or requirements.

5. Different requirements

Do you have a heavy computation to perform? Maybe a Machine Learning one, where you need GPUs, their libraries…
Maybe that part would be much better in Python, while the other one could benefit from Java.
Or maybe libraries for the same language would create some conflicts, but you desperately need two specific versions for two distinct tasks.
And finally, perhaps you want to use two different Cloud products to host your two distinct pieces.

There are plenty of reasons, some more common than others.

In those cases, split the application!
Probably it’s easier. But remember to respect the Two Principles! Be a Good Tailor, or you would have hard times sharing data between your Microservices…

Even the highest of Monoliths is made by blocks. It might be atoms, but it is. Photo by Zoltan Tasi on Unsplash

Is it not your case? Go for a Monolith.

If your system is tightly coupled, use a single application. A Monolith.

You’ll have cleaner dependency management, and you won’t need complex orchestrations or distributed systems to trace errors, share data, collect logs, sync calls, secure network interactions… and I can go on.

Is Object-Oriented Programming not suitable for your application architecture? Or, better, is RESTful design not suitable? Microservices won’t be either!

If your application serves a long sequence of API calls, maybe because an API requires the result of the previous ones, this is not ideal. It would also introduce network delays.
If your application is a single procedure that requires user interaction, well, the sentence is self-explaining. It is a Procedure!
You have to share a state between calls, each component is not independent and a single error in the sequence will block it entirely. There is hardly a benefit in splitting the application.

And remember, Monolith doesn’t mean Mess

Monolith.

Mess.

They are also spelled differently.

Someone sees Monolith as a bunch of unordered code.
They claim that the Microservice pattern is the only way to have splits, order, and all.

Well, I’ll reveal something.

It’s — not — true.

Your code design does not depend on process divisions. Object-Oriented Programming doesn’t either. If you split your code right, if you manage your dependencies in a clear hierarchy, as well as your files, you can achieve the same level of cohesion and decoupling.

And another secret: if you divide your APIs into separate packages/modules/controllers/blueprints/whatever in your code, then you can decide to serve them in a single process or split them in Microservices at build-time, or at run-time too, for free.

Wow.
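
As a rough Spring Boot flavored illustration of this idea (package and class names are invented): if the Stock and Vision APIs each live in their own package behind their own controllers, then whether they run as one process or two becomes a packaging decision, not a rewrite.

// Monolith packaging: one application class serves both API packages.
// A split deployment would simply give each package its own application class
// (and its own build artifact) scanning only that package.
package com.example.shop;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication(scanBasePackages = {
        "com.example.shop.stock",   // StockController and friends
        "com.example.shop.vision"   // VisionController and friends
})
public class ShopApplication {
    public static void main(String[] args) {
        SpringApplication.run(ShopApplication.class, args);
    }
}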

Conclusions

I’m a big fan of Microservices, as I said.

If used in the right context, for the right reasons, they solve issues and improve performances or costs.

But please, stop making them the one and only solution for all problems.
Because they can also be the root of all evil.

Split a PDF file into many using Java PDFBox API
05 Mar 2021

Following is an example program to split a PDF into many PDFs using Java.

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;

import java.util.List;
import java.util.Iterator;

public class SplittingPDF {
   public static void main(String[] args) throws IOException {

      // Loading an existing PDF document
      File file = new File("C:/pdfBox/splitpdf_IP.pdf");
      PDDocument doc = PDDocument.load(file);

      // Instantiating the Splitter class
      Splitter splitter = new Splitter();

      // Splitting the pages of the PDF document
      List<PDDocument> pages = splitter.split(doc);

      // Creating an iterator
      Iterator<PDDocument> iterator = pages.listIterator();

      // Saving each page as an individual document
      int i = 1;

      while (iterator.hasNext()) {
         PDDocument pd = iterator.next();
         pd.save("C:/pdfBox/splitOP" + i++ + ".pdf");
         pd.close(); // release the resources held by each split page
      }
      doc.close();
      System.out.println("PDF split completed");
   }
}

Input

Split Input

Output

Split Output

Two Number Sum Problem Solution
06 Feb 2021

Two Number Sum Problem Statement

Given an array of integers, return the indices of the two numbers whose sum is equal to a given target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

Example:

Given nums = [2, 7, 11, 15], target = 9.

The output should be [0, 1]. 
Because nums[0] + nums[1] = 2 + 7 = 9.

Two Number Sum Problem solution in Java

METHOD 1. Naive approach: Use two for loops

The naive approach is to just use two nested for loops and check if the sum of any two elements in the array is equal to the given target.

Time complexity: O(n^2)

import java.util.HashMap;
import java.util.Scanner;
import java.util.Map;

class TwoSum {

    // Time complexity: O(n^2)
    private static int[] findTwoSum_BruteForce(int[] nums, int target) {
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                if (nums[i] + nums[j] == target) {
                    return new int[] { i, j };
                }
            }
        }
        return new int[] {};
    }


    public static void main(String[] args) {
        Scanner keyboard = new Scanner(System.in);

        int n = keyboard.nextInt();
        int[] nums = new int[n];

        for(int i = 0; i < n; i++) {
            nums[i] = keyboard.nextInt();
        }
        int target = keyboard.nextInt();

        keyboard.close();

        int[] indices = findTwoSum_BruteForce(nums, target);

        if (indices.length == 2) {
            System.out.println(indices[0] + " " + indices[1]);
        } else {
            System.out.println("No solution found!");
        }
    }
}
# Output
$ javac TwoSum.java
$ java TwoSum
4 2 7 11 15
9
0 1

METHOD 2. Use a HashMap (Most efficient)

You can use a HashMap to solve the problem in O(n) time complexity. Here are the steps:

  1. Initialize an empty HashMap.
  2. Iterate over the elements of the array.
  3. For every element in the array –
    • Check if its complement (target - element) already exists in the Map. If the complement exists, return the indices of the complement and the current element.
    • Otherwise, put the element in the Map along with its index, and move to the next iteration.

Time complexity: O(n)

import java.util.HashMap;
import java.util.Scanner;
import java.util.Map;

class TwoSum {
    
    // Time complexity: O(n)
    private static int[] findTwoSum(int[] nums, int target) {
        Map<Integer, Integer> numMap = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            int complement = target - nums[i];
            if (numMap.containsKey(complement)) {
                return new int[] { numMap.get(complement), i };
            } else {
                numMap.put(nums[i], i);
            }
        }
        return new int[] {};
    }
}

METHOD 3. Use Sorting along with the two-pointer approach

There is another approach which works when you need to return the numbers instead of their indexes. Here is how it works:

  1. Sort the array.
  2. Initialize two variables, one pointing to the beginning of the array (left) and another pointing to the end of the array (right).
  3. Loop while left < right, and for each iteration
    • if arr[left] + arr[right] == target, then return the two numbers.
    • if arr[left] + arr[right] < target, increment the left index.
    • else, decrement the right index.

This is known as the two-pointer technique. It is a very common pattern for solving array-related problems.

Time complexity: O(n*log(n))

import java.util.Scanner;
import java.util.Arrays;

class TwoSum {

    // Time complexity: O(n*log(n))
    private static int[] findTwoSum_Sorting(int[] nums, int target) {
        Arrays.sort(nums);
        int left = 0;
        int right = nums.length - 1;
        while(left < right) {
            if(nums[left] + nums[right] == target) {
                return new int[] {nums[left], nums[right]};
            } else if (nums[left] + nums[right] < target) {
                left++;
            } else {
                right--;
            }
        }
        return new int[] {};
    }
}


Spring Boot Actuator metrics monitoring with Prometheus and Grafana
03 Mar 2021

Welcome to the second part of the Spring Boot Actuator tutorial series. In the first part, you learned what spring-boot-actuator module does, how to configure it in a spring boot application, and how to interact with various actuator endpoints.

In this article, you’ll learn how to integrate spring boot actuator with a monitoring system called Prometheus and a graphing solution called Grafana.

At the end of this article, you’ll have a Prometheus as well as a Grafana dashboard setup in your local machine where you’ll be able to visualize and monitor all the metrics generated from the Spring Boot application.

Prometheus

Prometheus is an open-source monitoring system that was originally built by SoundCloud. It consists of the following core components –

  • A data scraper that pulls metrics data over HTTP periodically at a configured interval.
  • A time-series database to store all the metrics data.
  • A simple user interface where you can visualize, query, and monitor all the metrics.

Grafana

Grafana allows you to bring data from various data sources like Elasticsearch, Prometheus, Graphite, InfluxDB etc, and visualize them with beautiful graphs.

It also lets you set alert rules based on your metrics data. When an alert changes state, it can notify you over email, slack, or various other channels.

Note that the Prometheus dashboard also has simple graphs, but Grafana's graphs are way better. That's why, in this post, we'll integrate Grafana with Prometheus to import and visualize our metrics data.

Adding Micrometer Prometheus Registry to your Spring Boot application

Spring Boot uses Micrometer, an application metrics facade, to integrate actuator metrics with external monitoring systems.

It supports several monitoring systems like Netflix Atlas, AWS Cloudwatch, Datadog, InfluxData, SignalFx, Graphite, Wavefront, Prometheus etc.

To integrate actuator with Prometheus, you need to add the micrometer-registry-prometheus dependency –

<!-- Micrometer Prometheus registry  -->
<dependency>
	<groupId>io.micrometer</groupId>
	<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Once you add the above dependency, Spring Boot will automatically configure a PrometheusMeterRegistry and a CollectorRegistry to collect and export metrics data in a format that can be scraped by a Prometheus server.

All the application metrics data are made available at an actuator endpoint called /prometheus. The Prometheus server can scrape this endpoint to get metrics data periodically.
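
Besides the JVM, system, and HTTP metrics that come out of the box, you can register your own metrics through Micrometer and they will be exposed on the same endpoint. A minimal sketch (the counter name and the OrderService class are made up for illustration):

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final Counter ordersCreated;

    // Spring injects the auto-configured MeterRegistry (backed by Prometheus here)
    public OrderService(MeterRegistry registry) {
        this.ordersCreated = Counter.builder("orders.created")
                .description("Total number of orders created")
                .register(registry);
    }

    public void createOrder() {
        // ... business logic ...
        ordersCreated.increment();
    }
}

With the Prometheus naming conventions, such a counter would typically show up on the endpoint as orders_created_total.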

Exploring Spring Boot Actuator’s /prometheus Endpoint

Let’s explore the prometheus endpoint that is exposed by Spring Boot when micrometer-registry-prometheus dependency is available on the classpath.

First of all, you’ll start seeing the prometheus endpoint on the actuator endpoint-discovery page (http://localhost:8080/actuator) –

Spring Boot Actuator Prometheus Endpoint

The prometheus endpoint exposes metrics data in a format that can be scraped by a Prometheus server. You can see the exposed metrics data by navigating to the prometheus endpoint (http://localhost:8080/actuator/prometheus) –

# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge
jvm_buffer_memory_used_bytes{id="direct",} 81920.0
jvm_buffer_memory_used_bytes{id="mapped",} 0.0
# HELP jvm_threads_live The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live gauge
jvm_threads_live 23.0
# HELP tomcat_global_received_bytes_total  
# TYPE tomcat_global_received_bytes_total counter
tomcat_global_received_bytes_total{name="http-nio-8080",} 0.0
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary
jvm_gc_pause_seconds_count{action="end of minor GC",cause="Allocation Failure",} 7.0
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="Allocation Failure",} 0.232
jvm_gc_pause_seconds_count{action="end of minor GC",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="Metadata GC Threshold",} 0.01
jvm_gc_pause_seconds_count{action="end of major GC",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of major GC",cause="Metadata GC Threshold",} 0.302
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge
jvm_gc_pause_seconds_max{action="end of minor GC",cause="Allocation Failure",} 0.0
jvm_gc_pause_seconds_max{action="end of minor GC",cause="Metadata GC Threshold",} 0.0
jvm_gc_pause_seconds_max{action="end of major GC",cause="Metadata GC Threshold",} 0.0
# HELP jvm_gc_live_data_size_bytes Size of old generation memory pool after a full GC
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes 5.0657472E7

## More data ...... (Omitted for brevity)

Downloading and Running Prometheus using Docker

1. Downloading Prometheus

You can download the Prometheus docker image using docker pull command like so –

$ docker pull prom/prometheus

Once the image is downloaded, you can type docker image ls command to view the list of images present locally –

$ docker image ls
REPOSITORY                                   TAG                 IMAGE ID            CREATED             SIZE
prom/prometheus                              latest              b82ef1f3aa07        5 days ago          119MB

2. Prometheus Configuration (prometheus.yml)

Next, we need to configure Prometheus to scrape metrics data from Spring Boot Actuator's /prometheus endpoint.

Create a new file called prometheus.yml with the following configurations –

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['127.0.0.1:9090']

  - job_name: 'spring-actuator'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    static_configs:
    - targets: ['HOST_IP:8080']

The above configuration file is an extension of the basic configuration file available in the Prometheus documentation.

The most important stuff to note in the above configuration file is the spring-actuator job inside scrape_configs section.

The metrics_path is the path of the Actuator’s prometheus endpoint. The targets section contains the HOST and PORT of your Spring Boot application.

Please make sure to replace the HOST_IP with the IP address of the machine where your Spring Boot application is running. Note that, localhost won’t work here because we’ll be connecting to the HOST machine from the docker container. You must specify the network IP address.

3. Running Prometheus using Docker

Finally, Let’s run Prometheus using Docker. Type the following command to start a Prometheus server in the background –

$ docker run -d --name=prometheus -p 9090:9090 -v <PATH_TO_prometheus.yml_FILE>:/etc/prometheus/prometheus.yml prom/prometheus --config.file=/etc/prometheus/prometheus.yml

Please make sure to replace the <PATH_TO_prometheus.yml_FILE> with the PATH where you have stored the Prometheus configuration file.

After running the above command, docker will start the Prometheus server inside a container. You can see the list of all the containers with the following command –

$ docker container ls
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
e036eb20b8ad        prom/prometheus     "/bin/prometheus --c…"   4 minutes ago       Up 4 minutes        0.0.0.0:9090->9090/tcp   prometheus

4. Visualizing Spring Boot Metrics from Prometheus dashboard

That’s it! You can now navigate to http://localhost:9090 to explore the Prometheus dashboard.

You can enter a Prometheus query expression inside the Expression text field and visualize all the metrics for that query.

Following are some Prometheus graphs for our Spring Boot application’s metrics –

  • System’s CPU usage –
Spring Boot Actuator Metrics Dashboard Prometheus
  • Response latency of a slow API –
Spring Boot Actuator Prometheus Dashboard Graph Example

You can check out the official Prometheus documentation to learn more about Prometheus Query Expressions.

Downloading and running Grafana using Docker

Type the following command to download and run Grafana using Docker –

$ docker run -d --name=grafana -p 3000:3000 grafana/grafana 

The above command will start Grafana inside a Docker container and make it available on port 3000 on the Host machine.

You can type docker container ls to see the list of Docker containers –

$ docker container ls
CONTAINER ID        IMAGE               COMMAND                  CREATED                  STATUS              PORTS                    NAMES
cf9196b30d0d        grafana/grafana     "/run.sh"                Less than a second ago   Up 5 seconds        0.0.0.0:3000->3000/tcp   grafana
e036eb20b8ad        prom/prometheus     "/bin/prometheus --c…"   16 minutes ago           Up 16 minutes       0.0.0.0:9090->9090/tcp   prometheus

That’s it! You can now navigate to http://localhost:3000 and log in to Grafana with the default username admin and password admin.

Configuring Grafana to import metrics data from Prometheus

Follow the steps below to import metrics from Prometheus and visualize them on Grafana:

1. Add the Prometheus data source in Grafana

Spring Boot Actuator Prometheus Grafana Dashboard

2. Create a new Dashboard with a Graph

Spring Boot Actuator Grafana Dashboard Create Graph

3. Add a Prometheus Query expression in Grafana’s query editor

Spring Boot Actuator Grafana Dashboard Prometheus Metrics Graph

4. Visualize metrics from Grafana’s dashboard

Spring Boot Actuator Grafana Dashboard Visualize Prometheus metrics graph

Read the First Part: Spring Boot Actuator: Health check, Auditing, Metrics gathering and Monitoring

Log Aggregation For Microservice Architecture
08 Mar 2021

In a microservice architecture, logs are scattered across many service instances, which makes tracing issues hard. To solve such problems, a preferred approach is to take advantage of a centralized logging service that aggregates logs from each service instance.

Users can search through these logs from one centralized spot and configure alerts when certain messages appear.

Standard tools are available and widely used by various enterprises.

The ELK Stack is the most frequently used solution, where the logging daemon, Logstash, collects and aggregates the logs, which are indexed by Elasticsearch and can be searched via a Kibana dashboard.

Spring Boot Starter Parent Example
30 Mar 2021

In this Spring Boot tutorial, we will learn about the spring-boot-starter-parent dependency, which Spring Boot applications use as their parent POM. We will also learn what configurations this dependency provides and how to override them.

What is spring-boot-starter-parent dependency?

The spring-boot-starter-parent dependency is the parent POM providing dependency and plugin management for Spring Boot-based applications. It contains the default versions of Java to use, the default versions of dependencies that Spring Boot uses, and the default configuration of the Maven plugins.

A few important configurations provided by this file are shown below. Please refer to this link to read the complete configuration.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-dependencies</artifactId>
        <version>${revision}</version>
        <relativePath>../../spring-boot-dependencies</relativePath>
    </parent>
    <artifactId>spring-boot-starter-parent</artifactId>
    <packaging>pom</packaging>
    <name>Spring Boot Starter Parent</name>
    <description>Parent pom providing dependency and plugin management for applications built with Maven</description>
    <properties>
        <java.version>1.8</java.version>
        <resource.delimiter>@</resource.delimiter> <!-- delimiter that doesn't clash with Spring ${} placeholders -->
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <maven.compiler.source>${java.version}</maven.compiler.source>
        <maven.compiler.target>${java.version}</maven.compiler.target>
    </properties>
    ...
    <resource>
        <directory>${basedir}/src/main/resources</directory>
        <filtering>true</filtering>
        <includes>
            <include>**/application*.yml</include>
            <include>**/application*.yaml</include>
            <include>**/application*.properties</include>
        </includes>
    </resource>
</project>

The spring-boot-starter-parent dependency further inherits from spring-boot-dependencies, which is declared in the parent section at the top of the above POM file.

This file is the one that actually contains the default versions to use for all libraries. The following code shows the versions of various dependencies configured in spring-boot-dependencies:

<properties>
    <!-- Dependency versions -->
    <activemq.version>5.15.3</activemq.version>
    <antlr2.version>2.7.7</antlr2.version>
    <appengine-sdk.version>1.9.63</appengine-sdk.version>
    <artemis.version>2.4.0</artemis.version>
    <aspectj.version>1.8.13</aspectj.version>
    <assertj.version>3.9.1</assertj.version>
    <atomikos.version>4.0.6</atomikos.version>
    <bitronix.version>2.1.4</bitronix.version>
    <byte-buddy.version>1.7.11</byte-buddy.version>
    <caffeine.version>2.6.2</caffeine.version>
    <cassandra-driver.version>3.4.0</cassandra-driver.version>
    <classmate.version>1.3.4</classmate.version>
    ...
</properties>

The above list is very long; you can read the complete list at this link.

How to override default dependency version?

As you can see, Spring Boot has a default version for most dependencies. You can override a version as per your choice or project need in the properties tag of your project's pom.xml file.

For example, Spring Boot uses 2.8.2 as the default version of the Google GSON library.

<groovy.version>2.4.14</groovy.version>
<gson.version>2.8.2</gson.version>
<h2.version>1.4.197</h2.version>

Say I want to use version 2.7 of the gson dependency. I will declare this in the properties tag like this:

<properties>
    <java.version>1.8</java.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <gson.version>2.7</gson.version>
</properties>

Now, in your Eclipse editor, you will see the message: "The managed version is 2.7. The artifact is managed in org.springframework.boot:spring-boot-dependencies:2.0.0.RELEASE".

GSON resolved dependency

Drop me your questions in the comments section.

Happy Learning !!

Spring Boot File Upload / Download with JPA, Hibernate and MySQL database
03 Mar 2021

In this article, you'll learn how to upload and download files in a RESTful Spring Boot web service. The files will be stored in a MySQL database.

This article is a continuation of an earlier article where I’ve shown how to upload files and store them in the local filesystem.

We'll reuse most of the code and concepts described in the last article, so I highly recommend going through it before reading this one.

Following is the directory structure of the complete application for your reference –

Spring Boot File Upload Download JPA, Hibernate, MySQL database example.

JPA and MySQL dependencies

Since we’ll be storing files in MySQL database, we’ll need JPA and MySQL dependencies along with Web dependency. So make sure that your pom file contains the following dependencies –

<dependencies>
	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-web</artifactId>
	</dependency>

	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-data-jpa</artifactId>
	</dependency>

	<dependency>
		<groupId>mysql</groupId>
		<artifactId>mysql-connector-java</artifactId>
		<scope>runtime</scope>
	</dependency>

	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-test</artifactId>
		<scope>test</scope>
	</dependency>
</dependencies>

If you're building the project from scratch, you can generate the project skeleton from the Spring Initializr website.

Configuring the Database and Multipart File properties

Next, we need to configure the MySQL database url, username, and password. You can configure that in the src/main/resources/application.properties file –

## Spring DATASOURCE (DataSourceAutoConfiguration & DataSourceProperties)
spring.datasource.url= jdbc:mysql://localhost:3306/file_demo?useSSL=false&serverTimezone=UTC&useLegacyDatetimeCode=false
spring.datasource.username= root
spring.datasource.password= pass

## Hibernate Properties
# The SQL dialect makes Hibernate generate better SQL for the chosen database
spring.jpa.properties.hibernate.dialect = org.hibernate.dialect.MySQL5InnoDBDialect
spring.jpa.hibernate.ddl-auto = update

## Hibernate Logging
logging.level.org.hibernate.SQL= DEBUG


## MULTIPART (MultipartProperties)
# Enable multipart uploads
spring.servlet.multipart.enabled=true
# Threshold after which files are written to disk.
spring.servlet.multipart.file-size-threshold=2KB
# Max file size.
spring.servlet.multipart.max-file-size=200MB
# Max Request Size
spring.servlet.multipart.max-request-size=215MB

The above properties file also has Multipart file properties. You can make changes to these properties as per your requirements.

DBFile model

Let’s create a DBFile entity to model the file attributes that will be stored in the database –

package com.example.filedemo.model;

import org.hibernate.annotations.GenericGenerator;

import javax.persistence.*;

@Entity
@Table(name = "files")
public class DBFile {
    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid2")
    private String id;

    private String fileName;

    private String fileType;

    @Lob
    private byte[] data;

    public DBFile() {

    }

    public DBFile(String fileName, String fileType, byte[] data) {
        this.fileName = fileName;
        this.fileType = fileType;
        this.data = data;
    }

    // Getters and Setters (Omitted for brevity)
}

Note that the file's contents will be stored as a byte array in the database.

DBFileRepository

Next, we need to create a repository to save files in the database and retrieve them back –

package com.example.filedemo.repository;

import com.example.filedemo.model.DBFile;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface DBFileRepository extends JpaRepository<DBFile, String> {

}

DBFileStorageService

The following DBFileStorageService contains methods to store and retrieve files to/from the database –

package com.example.filedemo.service;

import com.example.filedemo.exception.FileStorageException;
import com.example.filedemo.exception.MyFileNotFoundException;
import com.example.filedemo.model.DBFile;
import com.example.filedemo.repository.DBFileRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.util.StringUtils;
import org.springframework.web.multipart.MultipartFile;
import java.io.IOException;

@Service
public class DBFileStorageService {

    @Autowired
    private DBFileRepository dbFileRepository;

    public DBFile storeFile(MultipartFile file) {
        // Normalize file name
        String fileName = StringUtils.cleanPath(file.getOriginalFilename());

        try {
            // Check if the file's name contains invalid characters
            if(fileName.contains("..")) {
                throw new FileStorageException("Sorry! Filename contains invalid path sequence " + fileName);
            }

            DBFile dbFile = new DBFile(fileName, file.getContentType(), file.getBytes());

            return dbFileRepository.save(dbFile);
        } catch (IOException ex) {
            throw new FileStorageException("Could not store file " + fileName + ". Please try again!", ex);
        }
    }

    public DBFile getFile(String fileId) {
        return dbFileRepository.findById(fileId)
                .orElseThrow(() -> new MyFileNotFoundException("File not found with id " + fileId));
    }
}

FileController (File upload/download REST APIs)

Finally, following are the Rest APIs to upload and download files –

package com.example.filedemo.controller;

import com.example.filedemo.model.DBFile;
import com.example.filedemo.payload.UploadFileResponse;
import com.example.filedemo.service.DBFileStorageService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;
import org.springframework.web.servlet.support.ServletUriComponentsBuilder;

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

@RestController
public class FileController {

    private static final Logger logger = LoggerFactory.getLogger(FileController.class);

    @Autowired
    private DBFileStorageService dbFileStorageService;

    @PostMapping("/uploadFile")
    public UploadFileResponse uploadFile(@RequestParam("file") MultipartFile file) {
        DBFile dbFile = dbFileStorageService.storeFile(file);

        String fileDownloadUri = ServletUriComponentsBuilder.fromCurrentContextPath()
                .path("/downloadFile/")
                .path(dbFile.getId())
                .toUriString();

        return new UploadFileResponse(dbFile.getFileName(), fileDownloadUri,
                file.getContentType(), file.getSize());
    }

    @PostMapping("/uploadMultipleFiles")
    public List<UploadFileResponse> uploadMultipleFiles(@RequestParam("files") MultipartFile[] files) {
        return Arrays.asList(files)
                .stream()
                .map(file -> uploadFile(file))
                .collect(Collectors.toList());
    }

    @GetMapping("/downloadFile/{fileId}")
    public ResponseEntity<Resource> downloadFile(@PathVariable String fileId) {
        // Load file from database
        DBFile dbFile = dbFileStorageService.getFile(fileId);

        return ResponseEntity.ok()
                .contentType(MediaType.parseMediaType(dbFile.getFileType()))
                .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"" + dbFile.getFileName() + "\"")
                .body(new ByteArrayResource(dbFile.getData()));
    }

}
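
The controller and service above reference a few helper classes from the first article of this series: UploadFileResponse, FileStorageException, and MyFileNotFoundException. If you don't have that code handy, minimal sketches consistent with how they are used here could look like the following (the originals may carry extra details, such as @ResponseStatus annotations on the exceptions):

package com.example.filedemo.payload;

public class UploadFileResponse {
    private String fileName;
    private String fileDownloadUri;
    private String fileType;
    private long size;

    public UploadFileResponse(String fileName, String fileDownloadUri, String fileType, long size) {
        this.fileName = fileName;
        this.fileDownloadUri = fileDownloadUri;
        this.fileType = fileType;
        this.size = size;
    }

    // Getters and Setters (omitted for brevity; needed for JSON serialization)
}

package com.example.filedemo.exception;

public class FileStorageException extends RuntimeException {
    public FileStorageException(String message) {
        super(message);
    }

    public FileStorageException(String message, Throwable cause) {
        super(message, cause);
    }
}

package com.example.filedemo.exception;

public class MyFileNotFoundException extends RuntimeException {
    public MyFileNotFoundException(String message) {
        super(message);
    }
}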

Results

I’ll be reusing the complete front-end code that I demonstrated in the first part of this article. You should check out the first article to learn more about the front-end code.

Once you have the front-end code in place, type the following command to run the application –

mvn spring-boot:run

Here is a screenshot of the final app –

Spring Boot File Upload Download with JPA, Hibernate and MySQL database demo

That’s all for now. Thanks for reading!

Micro-frontends: The path to a scalable future — part 2
26 Mar 2021

Introduction

So in Part 1 of this article, we’ve learned what micro-frontends are, what they are meant for, and what the main concepts associated with them are. In this part, we are going to dive deep into various micro-frontend architecture types. We will understand what characteristics they share and what they don’t. We’ll see how complex each architecture is to implement. Then through the documents-to-applications continuum concept, we will identify our application type and choose the proper micro-frontend architecture that meets our application’s needs at most.

Micro-frontend architecture characteristics

In order to define micro-frontend architecture types, let’s reiterate over two micro-frontend integration concepts that we learned in the first part: (Fig. 1 and 2).

  • routing/page transitions

For page transition, we’ve seen that there are server and client-side routings, which in another way are called hard and soft transitions. It’s “hard” because with server-side routing, each page transition leads to a full page reload, which means for a moment the user sees a blank page until the content starts to render and show up. And client-side routing is called soft because on page transition, only some part of the page content is being reloaded (usually header, footer, and some common content stays as it is) by JavaScript. So there is no blank page, and, also the new content is smaller so it arrives earlier than during the full page reload. Therefore, with soft transitions, we undoubtedly have a better user experience than with hard ones.

Figure 1. routing/page transition types

Currently, almost all modern web applications are Single Page Applications (SPAs). This is a type of application where all page transitions are soft and smooth. When speaking about micro-frontends, we can consider each micro-frontend as a separate SPA, so the page transitions within each of them can also be considered soft. From a micro-frontend architectural perspective, by saying page transitions, we meant the transitions from one micro frontend to another and not the local transitions within a micro frontend. Therefore this transition type is one of the key characteristics of a micro-frontend architecture type.

  • composition/rendering

For composition, we’ve spoken about the server and client-side renderings. In the 2000s, when JavaScript was very poor and limited, almost all websites were fully rendered on the server-side. With the evolution of web applications and JavaScript, client-side rendering became more and more popular. SPAs mentioned earlier also are applications with client-side rendering.

Figure 2. composition/rendering types

Currently, when saying server-side rendering or universal rendering, we don’t assume that the whole application is rendered on the server-side — it’s more like hybrid rendering where some content is rendered on the server-side and the other part on the client-side. In the first article about micro-frontends, we’ve spoken about the cases when universal rendering can be useful. So the rendering is the second characteristic of a micro-frontend architecture type.

Architecture types

Now that we have defined two characteristics of micro-frontend architecture, we can extract four types based on various combinations of those two.

Linked Applications

  • Hard transitions;
  • Client-side rendering;

This is the simplest case where the content rendering is being done on the client-side and where the transition between micro-frontends is hard. As an example, we can consider two SPAs that are hosted under the same domain (http://team.com). A simple Nginx proxy server serves the first micro-frontend on the path “/a” and the second micro-frontend on the path “/b.” Page transitions within each micro-frontend are smooth as we assumed they are SPAs.

The transition from one micro-frontend to another happens via links which in fact is a hard transition and leads to a full page reload (Fig. 3). This micro-frontend architecture has the simplest implementation complexity.

Figure 3. Linked application transition model

Linked Universal Applications

  • Hard transitions;
  • Universal rendering;

If we add universal rendering features to the previous case, we’ll get the Linked Universal Applications architecture type. This means that some content of either one or many micro frontends renders on the server-side. But still, the transition from one micro to another is done via links. This architecture type is more complex than the first one as we have to implement server-side rendering.

Unified Applications

  • Soft transitions;
  • Client-side rendering;

Here comes my favorite part. To have a soft transition between micro-frontends, there should be a shell-application that will act as an umbrella for all micro-frontend applications. It will only have a markup layout and, most importantly, a top-level client routing. So the transition from path "/a" to path "/b" will be handed over to the shell-application (Fig. 4). As it is client-side routing and is the responsibility of JavaScript, it can be done softly, making a perfect user experience. This type of application is fully rendered on the client-side; there is no universal rendering in this architecture type.

Figure 4. Unified application with a shell application

Unified Universal Applications

  • Soft transitions;
  • Universal rendering;

And here we reach the most powerful and, at the same time, the most complex architecture type where we have both soft transitions between micro-frontends and universal rendering (Fig. 5).

Figure 5. Micro-frontend architecture types

A logical question can arise like, “oh, why don’t we pick the fourth and the best option?” Simply because these four architecture types have different levels of complexity, and the more complex our architecture is, the more we will pay for it for both implementation and maintenance (Fig. 6). So there is no absolute right or wrong architecture. We just have to find a way to understand how to choose the right architecture for our application, and we are going to do it in the next section.

Figure 6. Micro-frontend architecture types based on their implementation complexities.

Which architecture to choose?

There is an interesting concept called the documents-to-applications continuum (Fig. 7). It's an axis with two ends: on one side, we have content-centric websites, and on the other side, we have pure web applications that are behavior-centric.

  • Content-centric: the priority is given to the content. Think of some news or a blog website where users usually open an article. The content is what matters in such websites; the faster it loads, the more time the users save. There are not so many transitions in these websites, so they can easily be hard, and no one will care.
  • Behavior-centric: the priority is given to the user experience. These are pure applications like online code editors, and playgrounds, etc. In these applications, you are usually in a flow of multiple actions. Thus there are many route/state transitions and user interactions along with the flow. For such types of applications having soft transitions is key for good UX.
Figure 7. Correlation of the document-to-applications continuum and micro-frontend architecture types.

From Fig. 7, we can see that there is an overlap of the content- and behavior-centric triangles. This is the area where the progressive web applications lie. These are applications where both the content and the user experience are important. A good example is an e-commerce web application.

The faster the first page loads, the more pleasant it will be for the customers to stay on the website and shop; and the smoother the shopping flow itself, the more likely they are to complete it.

The first concern is content-centric, while the second one is behavior-centric.

Now let's see where our architecture types lie in the documents-to-applications continuum. Since we initially assumed that each micro-frontend is a SPA by itself, we can assume that all our architecture types lie in the behavior-centric spectrum. Note that this assumption of SPAs is not a strict requirement and is more of an opinionated approach. As both linked universal apps and unified universal apps have universal rendering features, they lie in the progressive web apps area, covering both the content- and behavior-centric spectrums. Linked apps and unified apps are architecture types that are better suited to web applications where the best user experience has a higher priority than the content load time.

First of all, we have to identify our application requirements and where it is positioned in the document-to-application continuum. Then it’s all about what complexity level of implementation we can afford.

Conclusion

This time, since we had already learned the main concepts of micro-frontends, we dived deep into micro-frontend applications' high-level architecture. We defined the architecture characteristics and came up with four types. And at the end, we learned how, and based on what, to choose our architecture type. I'd like to mention that I've gained most of my knowledge on micro-frontends, and especially the things I wrote about in this article, from the book "Micro Frontends in Action" by Michael Geers, which I recommend reading if you want to know much more about this topic.

What’s next?

In the next chapter, we will move towards the third architecture type, Unified apps, as this type of architecture best fits the AI annotation platform. We will learn how to build unified micro-frontends. So stay tuned for the third and final part.

Apache Kafka For Beginners
31 Mar 2021

What is Kafka?

We use Apache Kafka to enable communication between producers and consumers using message-based topics. Apache Kafka is a fast, scalable, fault-tolerant, publish-subscribe messaging system. Basically, it provides a platform for high-end, new-generation distributed applications. It also allows a large number of permanent or ad-hoc consumers. One of the best features of Kafka is that it is highly available, resilient to node failures, and supports automatic recovery. This makes Apache Kafka ideal for communication and integration between components of large-scale data systems in real-world deployments.

Moreover, this technology can replace conventional message brokers such as JMS and AMQP-based systems, with the ability to give higher throughput, reliability, and replication. As its core abstractions, Kafka offers the Kafka broker, the Kafka Producer, and the Kafka Consumer. A Kafka broker is a node in the Kafka cluster whose job is to persist and replicate the data. A Kafka Producer pushes messages into a message container called a Kafka Topic, whereas a Kafka Consumer pulls messages from a Kafka Topic.

Before moving forward in this Kafka tutorial, let's understand the actual meaning of the term Messaging System in Kafka.

a. Messaging System in Kafka

When we transfer data from one application to another, we use a messaging system. As a result, applications can focus on the data itself without worrying about how to share it. Distributed messaging is based on the concept of reliable message queuing: messages are queued asynchronously between client applications and the messaging system. There are two types of messaging patterns available, i.e., point-to-point and publish-subscribe (pub-sub) messaging systems. However, most of the messaging patterns follow pub-sub.

Apache Kafka — Kafka Messaging System
  • Point to Point Messaging System

Here, messages are persisted in a queue. Even though one or more consumers can consume the messages in the queue, a particular message can be consumed by at most one consumer. Also, as soon as a consumer reads a message from the queue, it disappears from that queue.

  • Publish-Subscribe Messaging System

Here, messages are persisted in a topic. In this system, Kafka consumers can subscribe to one or more topics and consume all the messages in those topics. Moreover, message producers are called publishers and message consumers are called subscribers.

History of Apache Kafka

Previously, LinkedIn was facing the problem of low-latency ingestion of a huge amount of data from its website into a lambda architecture that could process events in real time. Since none of the existing solutions could deal with this problem, Apache Kafka was developed in 2010 as the answer.

There were technologies available for batch processing at the time, but their deployment details had to be shared with downstream users, and they were not suitable enough for real-time processing. Then, in 2011, Kafka was made public.

Why Should we use Apache Kafka Cluster?

As we all know, Big Data involves an enormous volume of data, and with it come two main challenges: collecting that large volume of data and analyzing it. To overcome these challenges, we need a messaging system, and this is where Apache Kafka has proved its utility. Apache Kafka offers numerous benefits, such as:

  • Tracking web activities by storing/sending the events for real-time processes.
  • Alerting and reporting the operational metrics.
  • Transforming data into the standard format.
  • Continuous processing of streaming data to the topics.

Therefore, thanks to its wide adoption, this technology gives tough competition to some of the most popular messaging solutions such as ActiveMQ, RabbitMQ, AWS offerings, and others.

Kafka Tutorial — Audience

Professionals who aspire to build a career in Big Data analytics using the Apache Kafka messaging system should refer to this Kafka tutorial article. It will give you a complete understanding of Apache Kafka.

Kafka Tutorial — Prerequisites

You must have a good understanding of Java, Scala, distributed messaging systems, and the Linux environment before proceeding with this Apache Kafka tutorial.

Kafka Architecture

Below, we discuss the four core APIs in this Apache Kafka tutorial:

[Figure: Apache Kafka — Kafka Architecture]

a. Kafka Producer API

The Kafka Producer API permits an application to publish a stream of records to one or more Kafka topics.
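
As an illustration, here is a minimal producer sketch using the Java kafka-clients library; the broker address, topic name, and message values below are placeholders chosen for this example.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed local broker address; adjust for your cluster.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish a few records to a hypothetical topic named "demo-topic".
            for (int i = 0; i < 5; i++) {
                producer.send(new ProducerRecord<>("demo-topic", Integer.toString(i), "message-" + i));
            }
        }
    }
}

Note that send() is asynchronous; closing the producer flushes any records still buffered in memory.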

b. Kafka Consumer API

We use the Kafka Consumer API to subscribe to one or more topics and process the stream of records produced to them.
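
A matching minimal consumer sketch, again with an assumed broker address, group id, and topic name, could look like this (poll(Duration) requires a reasonably recent kafka-clients version):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("group.id", "demo-group");                // consumer group used for offset tracking
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                // Poll the broker for new records and process them one batch at a time.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}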

c. Kafka Streams API

The Kafka Streams API allows an application to act as a stream processor: it consumes an input stream from one or more topics, transforms it, and produces an output stream to one or more output topics.
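
As a small illustration, here is a hedged Kafka Streams sketch that reads from an assumed "input-topic", uppercases each value, and writes the result to an assumed "output-topic":

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");      // assumed application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read from the input topic, transform each value, and write to the output topic.
        KStream<String, String> source = builder.stream("input-topic");
        source.mapValues(value -> value.toUpperCase()).to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}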

d. Kafka Connector API

The Kafka Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
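
In practice, connectors are usually configured rather than coded. As one hedged example, the file source connector that ships with Kafka Connect can be driven by a small properties file similar to the following (the file path and topic name are placeholders):

# Stream each line appended to a local file into a Kafka topic.
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/test.txt
topic=connect-test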

Kafka Components

Kafka achieves messaging using the following components:

a. Kafka Topic

A topic is essentially a collection of messages and is how Kafka stores and organizes messages across its system. In addition, topics can be replicated and partitioned: replication refers to keeping copies, and partitioning refers to dividing a topic into parts. You can visualize topics as the logs in which Kafka stores messages. This ability to replicate and partition topics is one of the factors that enable Kafka’s fault tolerance and scalability.

[Figure: Apache Kafka — Kafka Topic]
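
To make partitioning and replication concrete, here is a sketch that creates a topic with the Java AdminClient; the topic name, partition count, and replication factor are arbitrary example values and assume a cluster with at least two brokers.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // A topic with 3 partitions, each replicated to 2 brokers.
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}

The same result can also be achieved with the kafka-topics command-line tool that ships with Kafka.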

b. Kafka Producer

It publishes messages to a Kafka topic.

c. Kafka Consumer

This component subscribes to one or more topics, then reads and processes the messages from them.

d. Kafka Broker

A Kafka broker manages the storage of messages in the topics. A Kafka deployment with more than one broker is what we call a Kafka cluster.

e. Kafka Zookeeper

Kafka uses ZooKeeper to provide the brokers with metadata about the processes running in the system and to facilitate health checking and broker leadership election.

Kafka Tutorial — Log Anatomy

In this Kafka tutorial, we view the log as the partitions. Basically, a data source writes messages to the log. One advantage is that, at any time, one or more consumers can read from the log they select. The diagram below shows a log being written by the data source and being read by consumers at different offsets.

[Figure: Apache Kafka Tutorial — Log Anatomy]
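
As a sketch of how a consumer picks where in the log to read, the consumer API allows manually assigning a partition and seeking to an offset; the topic, partition number, and offset below are made up for illustration.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReadFromOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manual assignment (no consumer group needed), then jump to a chosen offset.
            TopicPartition partition = new TopicPartition("demo-topic", 0);
            consumer.assign(Collections.singletonList(partition));
            consumer.seek(partition, 42L);  // start reading at offset 42 of partition 0
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}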

Kafka Tutorial — Data Log

Kafka retains messages for a considerable amount of time, so consumers can read them at their convenience. However, if Kafka is configured to keep messages for 24 hours and a consumer is down for longer than 24 hours, the consumer will lose messages. If the consumer’s downtime is only 60 minutes, on the other hand, messages can be read from the last known offset. Kafka does not keep state on what consumers are reading from a topic.
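
Retention is controlled by broker-level or topic-level configuration. A hedged sketch of the relevant broker settings (the values are only examples) looks like this:

# server.properties (broker-level defaults)
# keep messages for 24 hours
log.retention.hours=24
# disable size-based retention (the default)
log.retention.bytes=-1
# the same limit can also be set per topic with the retention.ms topic configuration (milliseconds)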

Kafka Tutorial — Partition in Kafka

Every Kafka broker holds a few partitions, and each partition can be either a leader or a replica for a topic. The leader is responsible for all writes and reads to a topic, as well as for updating the replicas with new data. If the leader fails, a replica takes over as the new leader.

[Figure: Apache Kafka Tutorial — Partition In Kafka]

Importance of Java in Apache Kafka

Apache Kafka is written in Scala and Java, and Kafka’s native API is Java. However, many other languages such as C++, Python, .NET, and Go also have Kafka client support. Still, Java is the one platform where no third-party library is needed, and we can say that writing Kafka code in languages other than Java adds a little overhead.

In addition, we can use the Java language if we need the high processing rates that come standard with Kafka. Java also provides good community support for Kafka consumer clients. Hence, Java is the right choice for implementing Kafka clients.
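
For the Java client sketches above, the only build dependency needed is kafka-clients (plus kafka-streams for the Streams example). A sketch of the Maven coordinates follows; the version shown is just an example, so use whichever release matches your broker:

<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>2.8.0</version>
</dependency>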

Kafka Use Cases

There are several use cases of Kafka that show why we actually use it:

  • Messaging

Kafka works well as a replacement for a more traditional message broker. It has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good solution for large-scale message-processing applications.

  • Metrics

Kafka finds good application in operational monitoring data: aggregating statistics from distributed applications to produce centralized feeds of operational data.

  • Event Sourcing

Since it supports very large stored log data, Kafka is an excellent backend for applications that use event sourcing.

Kafka Tutorial — Comparisons in Kafka

Many applications offer the same functionality as Kafka, such as ActiveMQ, RabbitMQ, Apache Flume, Storm, and Spark. So why should you go for Apache Kafka instead of the others?

Let’s see the comparisons below:

a. Apache Kafka vs Apache Flume

[Figure: Kafka Tutorial — Apache Kafka vs Flume]

i. Types of tool

Apache Kafka – a general-purpose tool for multiple producers and consumers.

Apache Flume – whereas Flume is a special-purpose tool for specific applications.

ii. Replication feature

Apache Kafka – it replicates events using ingest pipelines.

Apache Flume – it does not replicate events.

b. RabbitMQ vs Apache Kafka

RabbitMQ is one of the foremost alternatives to Apache Kafka, so let’s see how the two differ from one another:

[Figure: Kafka Tutorial — Kafka vs RabbitMQ]

i. Features

Apache Kafka – Kafka is distributed, and the data is shared and replicated with guaranteed durability and availability.

RabbitMQ – it offers relatively limited support for these features.

ii. Performance rate

Apache Kafka — its performance rate is high, to the tune of 100,000 messages per second.

RabbitMQ — whereas RabbitMQ’s performance rate is around 20,000 messages per second.

iii. Processing

Apache Kafka — it allows reliable, distributed log processing, and stream-processing semantics are built into Kafka Streams.

RabbitMQ — here, the consumer is simply FIFO-based, reading from the head of the queue and processing messages one by one.

c. Traditional queuing systems vs Apache Kafka

[Figure: Kafka Tutorial — Traditional queuing systems vs Apache Kafka]

i. Messages Retaining

Traditional queuing systems — most queuing systems remove messages after they have been processed, typically from the end of the queue.

Apache Kafka — Here, messages persist even after being processed. They don’t get removed as consumers receive them.

ii. Logic-based processing

Traditional queuing systems — they do not allow processing logic based on similar messages or events.

Apache Kafka — it allows processing logic based on similar messages or events.

So, this was all about the Apache Kafka tutorial. We hope you liked our explanation.

Conclusion: Kafka Tutorial

Hence, in this Kafka tutorial, we have covered the whole concept of Apache Kafka and seen what Kafka is. Moreover, we discussed Kafka’s components, use cases, and architecture, and finally we compared Kafka with other messaging tools. If you have any query regarding this Kafka tutorial, feel free to ask in the comment section. Also, keep visiting Data Flair for more articles on Apache Kafka.