Transformers:
- Transformers are a deep learning model architecture introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017). A transformer turns an input sequence into an output sequence by learning context and tracking relationships between the components of the sequence.
- They are designed to handle sequential data, such as text, but unlike previous models (like RNNs or LSTMs), transformers don’t process the input one token at a time. Instead, they use a mechanism called self-attention to weigh the importance of different words in a sentence regardless of their position.
- The self-attention mechanism allows each token (word or subword) in the input sequence to attend to every other token in the sequence, which helps the model understand context more effectively.
- Autoregressive generation: the model produces output one token at a time, feeding each generated token back in as input for the next step (a sampling sketch appears at the end of these notes).
- Feed-forward and self-attention: in a transformer block, self-attention lets the model focus on different parts of the input sequence by computing relationships between its elements, while the feed-forward layer is a fully connected network that further processes the output of the self-attention layer, adding non-linearity so the model can learn more complex patterns. In short, self-attention provides context-aware representations, and the feed-forward layer refines those representations with non-linear transformations.
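Below is a minimal NumPy sketch of a single self-attention step followed by a feed-forward layer, illustrating the two components just described. The weights are random and the shapes (4 tokens, 8-dimensional embeddings) are made up for clarity; residual connections, layer normalization, and multi-head attention are omitted.

```python
# Illustrative single-head self-attention + feed-forward block in NumPy.
# Weights are random; shapes are chosen for clarity, not from any real model.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 4, 8, 32   # 4 tokens, 8-dim embeddings, 32-dim hidden layer

x = rng.normal(size=(seq_len, d_model))          # token embeddings for one sequence

# Self-attention: project inputs to queries, keys, values.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d_model)              # how much each token attends to every other
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
attended = weights @ V                           # context-aware representations

# Position-wise feed-forward: two linear layers with a ReLU non-linearity.
W1, W2 = rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model))
out = np.maximum(0, attended @ W1) @ W2

print(weights.round(2))   # each row sums to 1: attention over the 4 tokens
print(out.shape)          # (4, 8): one refined vector per input token
```

Each row of `weights` says how strongly that token attends to every token in the sequence; the feed-forward layer then refines each attended vector independently.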
Before Transformers:
- Early deep learning models for natural language processing (NLP) aimed to get computers to understand and respond to natural human language. They guessed the next word in a sequence based on the previous word.
- To understand better, consider the autocomplete feature in your smartphone. It makes suggestions based on the frequency of word pairs that you type. For example, if you frequently type "I am fine," your phone suggests "fine" after you type "am."
- Early machine learning (ML) models applied similar technology on a broader scale. They mapped the relationship frequency between different word pairs or word groups in their training data set and tried to guess the next word. However, early technology couldn’t retain context beyond a certain input length. For example, an early ML model couldn’t generate a meaningful paragraph because it couldn’t retain context between the first and last sentence in a paragraph.
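As a rough illustration of the word-pair idea above, here is a toy bigram counter with a made-up corpus. Real models of that era were more sophisticated, but the core idea of predicting the next word from observed pair frequencies is similar.

```python
# Toy bigram ("word pair") model: count which word follows which, then
# predict the most frequent follower. Corpus and example are made up.
from collections import Counter, defaultdict

corpus = "i am fine . i am happy . i am fine today".split()

follower_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follower_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` in the corpus."""
    followers = follower_counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("am"))   # 'fine' -- seen twice after 'am', vs. 'happy' once
```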
Tokenization:
- Tokenization is the process of breaking down text into smaller chunks, called tokens. These tokens can be words, subwords, or characters, depending on the granularity chosen.
- In modern NLP models like GPT or BERT, subword tokenization (e.g., using methods like Byte Pair Encoding (BPE) or SentencePiece) is commonly used because it balances word-level and character-level granularity, capturing a wide range of linguistic patterns.
- For example, the word "unhappiness" might be tokenized into ["un", "happiness"] or further into smaller subword units like ["un", "happi", "ness"] (see the toy tokenizer sketch after this list).
- Tokenization is crucial for transforming human-readable text into a format that a machine learning model can process.
- The token is the unit of operation for an LLM, and LLM usage is typically metered and billed per token.
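Here is a toy sketch of greedy longest-match subword tokenization, using a hand-picked vocabulary purely for illustration; real tokenizers such as BPE or SentencePiece learn their vocabularies from large corpora and use different merge rules.

```python
# Toy greedy longest-match subword tokenizer. The vocabulary is hand-picked
# for illustration; real BPE/SentencePiece vocabularies are learned from data.
VOCAB = {"un", "happi", "ness", "happiness", "h", "a", "p", "i", "n", "e", "s", "u"}

def tokenize(word):
    tokens, i = [], 0
    while i < len(word):
        # Take the longest vocabulary entry that matches at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens

print(tokenize("unhappiness"))   # ['un', 'happiness']
```

Dropping "happiness" from this toy vocabulary would instead yield ["un", "happi", "ness"], the finer split mentioned above.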
Embedding:
- Embeddings are dense, relatively low-dimensional vector representations of tokens that capture semantic relationships between them (a small lookup sketch follows this list).
- In traditional NLP, each word might be represented by a unique one-hot vector, but embeddings allow words with similar meanings to have similar vector representations. For example, "king" and "queen" would be close in the embedding space.
- The transformer model typically uses positional embeddings in addition to token embeddings. Since transformers don’t inherently process data sequentially, positional embeddings provide information about the order of tokens in the input sequence.
- The embeddings are learned during the training process, and they evolve to capture semantic, syntactic, and contextual relationships between tokens.
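A minimal sketch of how token and positional embeddings combine, assuming made-up table sizes and random values; in a real model both tables are learned parameters.

```python
# Illustrative embedding lookup in NumPy: token IDs index an embedding table,
# and positional embeddings are added so the model knows token order.
# Vocabulary size, dimensions, and values are made up for clarity.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, d_model = 100, 16, 8

token_embeddings = rng.normal(size=(vocab_size, d_model))      # one row per token ID
position_embeddings = rng.normal(size=(max_len, d_model))      # one row per position

token_ids = np.array([42, 7, 99])                              # e.g. output of a tokenizer
positions = np.arange(len(token_ids))                          # 0, 1, 2

x = token_embeddings[token_ids] + position_embeddings[positions]
print(x.shape)   # (3, 8): one d_model-dimensional vector per input token
```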
How They Work Together:
- When text is fed into a transformer model, it first undergoes tokenization (breaking the text into tokens).
- These tokens are then mapped to their respective embeddings (vectors).
- The transformer model processes these embeddings using the self-attention mechanism to capture the relationships and contextual meaning of tokens in the sequence.
- The output embeddings can be used for tasks like text generation, classification, or translation.
- The output of an LLM at each step is a probability distribution over its vocabulary; the next token is sampled or selected from that distribution (see the sketch below).
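The sketch below shows the autoregressive loop and the softmax step that turns logits into a probability distribution. `fake_model` is a hypothetical stand-in for a real transformer and just returns random logits; in practice the logits depend on the tokens seen so far.

```python
# Sketch of autoregressive decoding: at each step the model returns logits
# over the vocabulary, softmax turns them into a probability distribution,
# and the sampled token is appended to the input.
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 50

def fake_model(token_ids):
    """Hypothetical stand-in for a transformer: returns logits for the next token."""
    return rng.normal(size=vocab_size)

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

tokens = [1, 2, 3]                       # prompt, already tokenized
for _ in range(5):                       # generate 5 tokens, one at a time
    probs = softmax(fake_model(tokens))  # probability distribution over the vocab
    next_token = rng.choice(vocab_size, p=probs)
    tokens.append(int(next_token))

print(tokens)   # prompt followed by 5 sampled token IDs
```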