Learning Kafka - Kafka Fundamentals


Hello! I'm Yuvraj. I'm a computer science student. I love to learn, create, and explore new things. I am currently pursuing a Bachelor of Computer Science at the University of Delhi.

Kafka Basic Concepts

Now that we have Kafka running, let's understand the key concepts that make Kafka work. I'll explain these in simple terms without jargon.

The Big Picture

Kafka is a system that lets different parts of your application talk to each other by passing messages. It's designed to handle huge amounts of data reliably.

Here's how it works at a high level:

  1. Producers send messages to Kafka

  2. Kafka stores these messages in Topics

  3. Consumers read messages from Topics

Let's dive into each concept:

Topics: The Message Categories

A Topic is like a category or channel for your messages. Think of it like:

  • A folder where related messages are stored

  • A TV channel that broadcasts specific content

  • A mailbox for a specific type of mail

For example, you might have topics like:

  • user-signups for new user registrations

  • order-placed for new orders

  • payment-processed for payment confirmations

Topics have these important characteristics:

  • They have a name (like "first-topic")

  • They can be split into multiple Partitions (more on this below)

  • They store messages in order within each partition (a topic with several partitions has no single overall order)

Partitions: Splitting Up Topics for Scale

A Partition is a way to divide a topic into multiple parts. This is important because:

  1. It allows Kafka to store more data than can fit on a single server

  2. It enables parallel processing of messages

Think of partitions like:

  • Multiple checkout lines at a grocery store

  • Multiple lanes on a highway

  • Multiple workers on an assembly line

Each partition:

  • Is an ordered sequence of messages

  • Is stored on a single server (called a broker)

  • Can be replicated to other servers for fault tolerance

  • Has messages identified by their position (called an offset)
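
To make the idea of an offset concrete, here is a minimal sketch of a partition as an append-only list. This is a toy model in plain Python, not Kafka's storage code or client API; the `Partition` class and its `append` and `read` methods are names made up for this illustration.

```python
class Partition:
    """Toy in-memory stand-in for one partition: an append-only list."""

    def __init__(self):
        self._log = []

    def append(self, message):
        """Add a message to the end and return the offset it was given."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._log[offset]


partition = Partition()
first = partition.append("order 1001 placed")
second = partition.append("order 1002 placed")
print(first, second)          # offsets are handed out in arrival order
print(partition.read(first))  # reading does not remove the message
```

The offset is simply the position in the sequence, which is why reading the same offset again gives back the same message.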

Messages: The Data Being Sent

A Message is the basic unit of data in Kafka. It's what producers send and consumers read.

A message consists of:

  • A key (optional): Helps determine which partition the message goes to

  • A value: The actual data being sent (can be text, JSON, binary, etc.)

  • A timestamp: When the message was created (or, depending on the topic's configuration, when the broker stored it)

  • Headers (optional): Additional metadata

Messages are immutable - once they're written to Kafka, they don't change.
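
The parts of a message listed above can be sketched as a small Python data class. This is only an illustration of the shape of a message, not the class any Kafka client library uses; the `Message` name and its field names are assumptions made for this example.

```python
import time
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)  # frozen=True mirrors "messages are immutable"
class Message:
    value: bytes                   # the actual data being sent
    key: Optional[bytes] = None    # optional, used to pick a partition
    timestamp_ms: int = field(
        default_factory=lambda: int(time.time() * 1000)
    )                              # creation time in milliseconds
    headers: tuple = ()            # optional (name, value) metadata pairs


msg = Message(
    value=b'{"user_id": 42}',
    key=b"user-42",
    headers=(("source", b"signup-form"),),
)

try:
    msg.value = b"changed"
except FrozenInstanceError:
    print("messages cannot be modified after creation")
```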

Producers: Sending Messages

A Producer is an application that sends messages to Kafka topics.

Producers:

  • Connect to Kafka brokers

  • Serialize messages (convert them to a format that can be transmitted)

  • Can choose which partition to send each message to, or let the client's partitioner decide (usually by hashing the message key)

  • Can wait for acknowledgment that messages were received
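
Two of the producer's jobs, serializing a message and picking a partition from its key, can be sketched in plain Python. This is a simplified model and not a real producer: `serialize` and `choose_partition` are hypothetical helpers, and `zlib.crc32` is a stand-in hash chosen for this sketch, so the partition numbers it produces will not match what a real Kafka client computes.

```python
import json
import zlib


def serialize(payload):
    """Turn a Python dict into bytes, ready to be sent over the network."""
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def choose_partition(key, num_partitions):
    """Map a key to a partition number using a stable hash.

    zlib.crc32 is a stand-in for this sketch; Kafka clients use their
    own hash function, so real partition numbers will differ.
    """
    return zlib.crc32(key) % num_partitions


value = serialize({"order_id": 1001, "total": 59.90})
partition = choose_partition(b"customer-7", num_partitions=3)
print(partition, value)
```

Because the hash is stable, every message with the key `customer-7` lands in the same partition, which keeps that customer's messages in order relative to each other.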

Consumers: Reading Messages

A Consumer is an application that reads messages from Kafka topics.

Consumers:

  • Connect to Kafka brokers

  • Subscribe to one or more topics

  • Deserialize messages (convert them back from transmission format)

  • Keep track of which messages they've read using offsets
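
Here is a rough sketch of how a consumer deserializes messages and remembers its position. It reads from an ordinary Python list rather than a real broker; `partition_log` and `poll` are made-up names for this example, and the sample data is invented.

```python
import json

# Hypothetical partition contents: serialized messages in offset order.
partition_log = [
    b'{"user": "asha"}',
    b'{"user": "ravi"}',
    b'{"user": "meera"}',
]


def poll(log, next_offset, max_messages=2):
    """Return up to max_messages decoded messages and the new position."""
    batch = log[next_offset:next_offset + max_messages]
    decoded = [json.loads(raw.decode("utf-8")) for raw in batch]
    return decoded, next_offset + len(batch)


position = 0  # in this sketch a new consumer starts at the beginning
batch, position = poll(partition_log, position)
print(batch, position)  # first two messages, position is now 2
batch, position = poll(partition_log, position)
print(batch, position)  # the remaining message, position is now 3
```

The consumer only needs to store one number per partition, the offset of the next message to read, in order to pick up where it left off.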

Consumer Groups: Scaling Consumption

A Consumer Group is a set of consumers that work together to process messages from topics.

Consumer groups allow you to:

  • Process messages in parallel (each consumer handles a subset of partitions)

  • Scale processing by adding more consumers

  • Provide fault tolerance (if one consumer fails, the others take over its partitions)

The key rule: within a group, each partition is consumed by only one consumer at a time.

This means:

  • If you have more consumers than partitions, some consumers will be idle

  • If you have fewer consumers than partitions, some consumers will handle multiple partitions
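
The two cases above can be seen in a small sketch that hands out partitions to the consumers in a group. Round-robin is just one simple strategy picked for this example (real Kafka clients support several configurable assignment strategies), and `assign_partitions` is a hypothetical helper, not part of any Kafka library.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin.

    Each partition gets exactly one owner; a consumer may own zero,
    one, or several partitions.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# Fewer consumers than partitions: c1 handles two partitions.
print(assign_partitions(partitions, ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}

# More consumers than partitions: c4 gets nothing and sits idle.
print(assign_partitions(partitions, ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```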

Brokers: The Kafka Servers

A Broker is a Kafka server that:

  • Stores partitions

  • Handles producer and consumer requests

  • Manages replication of partitions

A Kafka cluster consists of one or more brokers working together.
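
To picture how partitions and their replicas end up spread across brokers, here is a simplified sketch. Kafka's real placement logic is more involved, so treat this as an illustration only; `place_replicas` and the broker names are assumptions made for the example.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Give each partition a leader plus followers on distinct brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed broker count")
    placement = {}
    for partition in range(num_partitions):
        # Walk around the broker list so copies land on different brokers.
        replicas = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        placement[partition] = {
            "leader": replicas[0],
            "followers": replicas[1:],
        }
    return placement


brokers = ["broker-1", "broker-2", "broker-3"]
for partition, info in place_replicas(3, brokers, 2).items():
    print(partition, info)
```

With three brokers and two copies of each partition, every broker leads one partition and holds a backup copy of another, so losing a single broker does not lose any data.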

ZooKeeper: The Coordinator

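In ZooKeeper-based deployments, ZooKeeper is a separate service that helps coordinate the Kafka cluster (newer Kafka versions can instead run in KRaft mode, without ZooKeeper):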

  • Keeps track of which brokers are alive

  • Helps elect a controller (a broker that manages the cluster)

  • Stores configuration information

Putting It All Together

Here's how all these concepts work together:

  1. Producers send messages to topics

  2. Topics are divided into partitions for scalability

  3. Partitions are stored on brokers (Kafka servers)

  4. Consumers read messages from topics

  5. Consumer groups allow parallel processing

  6. ZooKeeper coordinates the whole system

Visual Representation

┌─────────────┐     ┌───────────────────────────────────┐     ┌─────────────┐
│             │     │             KAFKA                 │     │             │
│  Producers  │────▶│  ┌─────────┐  ┌─────────┐         │     │  Consumers  │
│             │     │  │ Topic A │  │ Topic B │         │     │             │
└─────────────┘     │  │ Part 1  │  │ Part 1  │         │     └─────────────┘
                    │  │ Part 2  │  │ Part 2  │         │           ▲
                    │  │ Part 3  │  │ Part 3  │         │           │
                    │  └─────────┘  └─────────┘         │           │
                    │                                   │           │
                    └───────────────────────────────────┘           │
                                    │                               │
                                    │                               │
                                    ▼                               │
                    ┌───────────────────────────────────┐           │
                    │           ZooKeeper               │           │
                    │  (Coordinates Kafka Brokers)      │           │
                    └───────────────────────────────────┘           │
                                                                    │
                    ┌───────────────────────────────────┐           │
                    │         Consumer Group            │───────────┘
                    │  (Distributes work among          │
                    │   multiple consumers)             │
                    └───────────────────────────────────┘

Next Steps

Now that you understand the basic concepts, let's create our first Kafka producer in the next section.

Next: Your First Producer
