Apache Kafka is an open-source distributed streaming platform written in Scala and Java.
Question: What are the components of Kafka?
- Producer – Producers are responsible for publishing messages to a Kafka topic.
- Consumer – Consumers subscribe to a topic and read and process messages from it.
- Topic – A topic is a named feed to which producers send messages and from which consumers receive them.
- Broker – Brokers manage the storage of messages in topics.
- ZooKeeper – ZooKeeper coordinates the brokers and the cluster topology.
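The relationships between these components can be illustrated with a toy in-memory model. This is only a sketch of the concepts, not the real Kafka client API; all class and method names here are invented for illustration.

```python
from collections import defaultdict

class Topic:
    """A toy topic: a named group of partitions holding messages."""
    def __init__(self, name, num_partitions=2):
        self.name = name
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)  # partition id -> ordered log of messages

class Producer:
    """Sends messages to a topic, choosing a partition by key hash."""
    def send(self, topic, key, value):
        partition = hash(key) % topic.num_partitions
        topic.partitions[partition].append(value)
        return partition

class Consumer:
    """Reads messages from a topic, tracking its position per partition."""
    def __init__(self):
        self.positions = defaultdict(int)  # (topic name, partition) -> next offset to read
    def poll(self, topic, partition):
        pos = self.positions[(topic.name, partition)]
        messages = topic.partitions[partition][pos:]
        self.positions[(topic.name, partition)] += len(messages)
        return messages

orders = Topic("orders")
producer = Producer()
part = producer.send(orders, key="user-1", value="order placed")
consumer = Consumer()
print(consumer.poll(orders, part))   # the newly produced message
print(consumer.poll(orders, part))   # nothing new -> []
```

In real Kafka the broker stores the partitions durably and serves many producers and consumers over the network; this sketch only shows who talks to whom.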
Question: Explain the role of the offset.
The offset is a sequential ID number assigned to each message within a partition.
It uniquely identifies each message in that partition.
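The idea can be shown with a few lines of Python: treat a partition as an ordered log, and the offset is simply the position a message receives when it is appended.

```python
# A partition is an ordered log; the offset is a message's position
# in that log, assigned sequentially as messages arrive.
partition = []

def append(message):
    """Append a message and return the offset it was assigned."""
    partition.append(message)
    return len(partition) - 1

offsets = [append(m) for m in ["a", "b", "c"]]
print(offsets)                 # [0, 1, 2]
print(partition[offsets[1]])   # "b" -- the offset uniquely identifies the message
```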
Question: What is a Consumer Group?
A Kafka consumer group consists of one or more consumers that jointly consume a set of subscribed topics.
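"Jointly consume" means the topic's partitions are divided among the group's members, so each partition is read by exactly one consumer in the group. A round-robin assignment can be sketched as follows; this is a toy illustration, not Kafka's actual group protocol, which is run by a group coordinator on the broker side.

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment of partitions to group members
    (a toy sketch of what Kafka's group coordinator does)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

print(assign_partitions([0, 1, 2, 3], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1, 3]} -- each partition has exactly one reader in the group
```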
Question: What is the role of ZooKeeper?
Apache Kafka is a distributed system that was built to use ZooKeeper.
ZooKeeper's main role is to coordinate the brokers and the cluster topology.
It also stores periodically committed offsets, so that if a node fails, consumption can recover from the last committed offset.
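Recovery from a committed offset can be sketched in a few lines. This is a toy model of the commit/resume behaviour only; in current Kafka versions consumer offsets are stored in an internal topic rather than in ZooKeeper itself.

```python
committed = {}  # (group, partition) -> last committed offset

def commit(group, partition, offset):
    """Record the last offset a group has fully processed for a partition."""
    committed[(group, partition)] = offset

def resume_position(group, partition):
    """After a failure, a replacement consumer resumes just past the last commit."""
    return committed.get((group, partition), -1) + 1

log = ["m0", "m1", "m2", "m3"]
commit("g1", 0, 1)              # consumer processed up to offset 1, then crashed
start = resume_position("g1", 0)
print(log[start:])              # ['m2', 'm3'] -- consumption resumes after the commit
```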
Question: What is a partition in Kafka?
Every Kafka broker holds a few partitions, and each partition can be either a leader or a replica for a topic.
Question: What are the main features of Kafka?
- High throughput
- Low latency
- Fault tolerance
- Durability
- Scalability
Question: What are the major APIs of Kafka?
- Producer API
- Consumer API
- Streams API
- Connector API
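As one concrete example, a client using the Producer API is configured with a small set of properties. The property names below are the standard ones from the Kafka client; the host, port, and serializer choices are illustrative placeholders.

```properties
# Minimal Producer API client configuration (standard kafka-clients
# property names; values are placeholders)
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
acks=all
```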
Question: What does a Kafka Consumer do?
A Kafka Consumer subscribes to a topic and reads and processes messages from it.
Question: Explain the roles of Leader and Follower in Kafka.
For each partition, one server acts as the Leader and the other servers act as Followers.
The Leader handles all read and write requests for the partition, while the Followers passively replicate it. If the Leader fails, one of the Followers takes over as the new Leader.
Question: Why is replication important in Kafka?
Replication ensures that published messages are not lost and can still be consumed in the event of a machine error, a program error, or routine software upgrades.
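The failover behaviour described above can be sketched as a toy election: when the current leader of a partition fails, one of the surviving replicas is promoted. Real Kafka elections are coordinated by the cluster controller and restricted to in-sync replicas; this is only an illustration.

```python
def elect_leader(replicas, failed):
    """Promote the first surviving replica to leader
    (toy sketch; real elections are run by the Kafka controller)."""
    alive = [r for r in replicas if r not in failed]
    return alive[0] if alive else None

replicas = ["broker-1", "broker-2", "broker-3"]   # broker-1 starts as leader
print(elect_leader(replicas, failed=set()))        # broker-1 still leads
print(elect_leader(replicas, failed={"broker-1"})) # broker-2 takes over
```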
Question: When does a QueueFullException occur?
A QueueFullException typically occurs when the Producer attempts to send messages faster than the Broker can handle at that time.
Question: What is the retention period in Kafka?
The retention period determines how long published records are kept within the Kafka cluster, regardless of whether they have been consumed. It can be changed through configuration.
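Retention is typically tuned with broker-level settings such as the ones below. The property names are real Kafka broker configuration keys; the values shown are illustrative.

```properties
# Broker-level retention settings (values are illustrative)
log.retention.hours=168        # keep records for 7 days, consumed or not
log.retention.bytes=1073741824 # or cap each partition's log at 1 GiB
```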
Question: What is the maximum size of a message that Kafka can receive?
By default, the maximum message size is 1000000 bytes.
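This limit is controlled by the broker setting `message.max.bytes` (a real Kafka configuration key; the raised value below is illustrative).

```properties
# Raising the broker's message size limit (illustrative value)
message.max.bytes=2000000
```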
Question: Can Kafka be used as a multi-tenant solution?
Yes. Multi-tenancy is enabled by configuring which topics can produce or consume data, after which Kafka can easily be deployed as a multi-tenant solution.
Question: What is the role of the Streams API?
The Streams API permits an application to act as a stream processor, consuming input streams from one or more topics and producing output streams to one or more output topics.
Question: What is the role of the Connector API?
The Connector API permits building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems.
Question: Name some companies that use Kafka.
- Netflix
- Mozilla
- Oracle
- and many others