Apache Kafka Interview Questions And Answers. Here Coding Compiler shares a list of 30 Kafka interview questions for experienced professionals. These interview questions on Kafka were asked in various interviews conducted by top MNC companies and compiled by expert Kafka professionals. We are sure that this list of Apache Kafka questions will help you crack your next Kafka job interview. All the best for your future, and happy learning.
Apache Kafka Interview Questions
- What is Apache Kafka?
- In which languages is Kafka written?
- What exactly does Kafka do?
- For which kinds of applications can Kafka be used?
- What are the capabilities of Kafka?
- What are the core APIs of Kafka?
- What does the Producer API do in Kafka?
- What does the Streams API do in Kafka?
- What does the Connector API do in Kafka?
- What does the Consumer API do in Kafka?
- How does Kafka communicate with clients and servers?
- What is a topic in Kafka?
- What is Geo-Replication in Kafka?
- What are Producers in Kafka?
- What are Consumers in Kafka?
Kafka Interview Questions And Answers
1) What is Apache Kafka?
A) Apache Kafka is an open-source stream-processing software platform developed by the Apache Software Foundation.
2) In which languages is Kafka written?
A) Kafka is written in Scala and Java.
3) What exactly does Kafka do?
A) A streaming platform has three key capabilities:
- Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
- Store streams of records in a fault-tolerant durable way.
- Process streams of records as they occur.
4) For which kinds of applications can Kafka be used?
A) Kafka is generally used for two broad classes of applications:
- Building real-time streaming data pipelines that reliably get data between systems or applications
- Building real-time streaming applications that transform or react to the streams of data
5) What are the capabilities of Kafka?
A) Some capabilities of Apache Kafka:
- Kafka is run as a cluster on one or more servers that can span multiple datacenters.
- The Kafka cluster stores streams of records in categories called topics.
- Each record consists of a key, a value, and a timestamp.
6) What are the core APIs of Kafka?
A) Kafka has four core APIs:
- Producer API
- Streams API
- Connector API
- Consumer API
7) What does the Producer API do in Kafka?
A) The Producer API allows an application to publish a stream of records to one or more Kafka topics.
8) What does the Streams API do in Kafka?
A) The Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams.
9) What does the Connector API do in Kafka?
A) The Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems.
10) What does the Consumer API do in Kafka?
A) The Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.
Interview Questions on Kafka
11) How does Kafka communicate with clients and servers?
A) In Kafka, the communication between the clients and the servers is done with a simple, high-performance, language-agnostic TCP protocol. This protocol is versioned and maintains backwards compatibility with older versions.
12) What is a topic in Kafka?
A) A topic is a category or feed name to which records are published. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it. For each topic, the Kafka cluster maintains a partitioned log.
13) What is Geo-Replication in Kafka?
A) Kafka MirrorMaker provides geo-replication support for your clusters. With MirrorMaker, messages are replicated across multiple datacenters or cloud regions. You can use this in active/passive scenarios for backup and recovery; or in active/active scenarios to place data closer to your users, or support data locality requirements.
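For example, a minimal MirrorMaker invocation looks like the following (the two .properties file names here are placeholders for your source-cluster consumer config and target-cluster producer config, and my-topic is an example topic):
> bin/kafka-mirror-maker.sh --consumer.config consumer.properties --producer.config producer.properties --whitelist my-topic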
14) What are Producers in Kafka?
A) Producers publish data to the topics of their choice. The producer is responsible for choosing which record to assign to which partition within the topic. This can be done in a round-robin fashion simply to balance load or it can be done according to some semantic partition function.
15) What are Consumers in Kafka?
A) Consumers label themselves with a consumer group name, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.
Advanced Kafka Interview Questions
16) How can you start the Kafka server?
A) Kafka uses ZooKeeper, so you need to first start a ZooKeeper server if you don’t already have one. You can use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.
> bin/zookeeper-server-start.sh config/zookeeper.properties
[2013-04-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
…
Now start the Kafka server:
> bin/kafka-server-start.sh config/server.properties
[2013-04-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2013-04-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
…
17) How can you create Topic in Kafka?
A) Create a topic – Let’s create a topic named “test” with a single partition and only one replica:
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
We can now see that topic if we run the list topic command:
> bin/kafka-topics.sh --list --zookeeper localhost:2181
test
18) How can you send some messages in Kafka?
A) Kafka comes with a command line client that will take input from a file or from standard input and send it out as messages to the Kafka cluster. By default, each line will be sent as a separate message.
Run the producer and then type a few messages into the console to send to the server.
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
19) How can you Start a consumer in Kafka?
A) Kafka also has a command line consumer that will dump out messages to standard output.
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
This is a message
This is another message
20) What does the AdminClient API do in Kafka?
A) The AdminClient API allows managing and inspecting topics, brokers, and other Kafka objects.
Kafka Interview Questions For Experienced
Kafka Interview Questions # 21) How can you use the Producer API?
A) The Producer API allows applications to send streams of data to topics in the Kafka cluster.
Examples showing how to use the producer are given in the javadocs.
To use the producer, you can use the following Maven dependency:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>1.0.1</version>
</dependency>
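As a minimal sketch (assuming a broker at localhost:9092 and the “test” topic created earlier), a keyed producer can look like this:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumes a local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        // Records with the same key are assigned to the same partition.
        producer.send(new ProducerRecord<>("test", "key-1", "This is a message"));
        producer.close();
    }
}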
Kafka Interview Questions # 22) How can you use the Consumer API?
A) The Consumer API allows applications to read streams of data from topics in the Kafka cluster.
Examples showing how to use the consumer are given in the javadocs.
To use the consumer, you can use the following Maven dependency:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>1.0.1</version>
</dependency>
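A minimal sketch of a consumer subscribing to the “test” topic (the group name test-group is an example value):
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test-group"); // example consumer group name
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("test"));
        while (true) {
            // poll(long timeoutMs) is the signature in the 1.0.x client
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset=%d, key=%s, value=%s%n",
                        record.offset(), record.key(), record.value());
        }
    }
}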
Kafka Interview Questions # 23) How can you use the Streams API?
A) The Streams API allows transforming streams of data from input topics to output topics.
Examples showing how to use this library are given in the javadocs.
Additional documentation on using the Streams API is available in the official Kafka documentation.
To use Kafka Streams, you can use the following Maven dependency:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>1.0.1</version>
</dependency>
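As a sketch of a Streams application (the topic names input-topic and output-topic are example values), this uppercases every record value on its way from one topic to another:
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercasePipe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-pipe"); // example application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Read from the input topic, transform each value, write to the output topic.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.mapValues(value -> value.toUpperCase()).to("output-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}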
Kafka Interview Questions # 24) How can you use the AdminClient API?
A) The AdminClient API supports managing and inspecting topics, brokers, acls, and other Kafka objects.
To use the AdminClient API, add the following Maven dependency:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>1.0.1</version>
</dependency>
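A minimal sketch of the AdminClient in use (the topic name admin-created-topic is an example value):
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class AdminExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Create a topic with 3 partitions and a replication factor of 1.
            NewTopic topic = new NewTopic("admin-created-topic", 3, (short) 1);
            admin.createTopics(Collections.singletonList(topic)).all().get();
            // List the topics in the cluster.
            System.out.println(admin.listTopics().names().get());
        }
    }
}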
Apache Kafka Tutorial Interview Questions
25) What are Broker Configs?
A) The essential broker configurations are the following:
- broker.id: the unique integer id of the broker in the cluster
- log.dirs: the directories in which the broker stores its log data
- zookeeper.connect: the ZooKeeper connection string
Kafka Interview Questions # 26) What is Replication in Kafka?
A) Kafka replicates the log for each topic’s partitions across a configurable number of servers (you can set this replication factor on a topic-by-topic basis). This allows automatic failover to these replicas when a server in the cluster fails so messages remain available in the presence of failures.
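For example, a topic whose single partition is replicated across three brokers can be created with the same tool used earlier (this assumes a three-broker cluster is running):
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic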
Kafka Interview Questions # 27) What is Log Compaction?
A) Log compaction is handled by the log cleaner, a pool of background threads that recopy log segment files, removing records whose key appears in the head of the log. Each compactor thread works as follows:
- It chooses the log that has the highest ratio of log head to log tail.
- It creates a succinct summary of the last offset for each key in the head of the log.
- It recopies the log from beginning to end, removing keys which have a later occurrence in the log. New, clean segments are swapped into the log immediately, so the additional disk space required is just one additional log segment (not a full copy of the log).
The summary of the log head is essentially just a space-compact hash table that uses exactly 24 bytes per entry. As a result, with an 8 GB cleaner buffer, one cleaner iteration can clean around 366 GB of log head (assuming 1 KB messages): 8 GB divided by 24 bytes gives roughly 358 million entries, and 358 million 1 KB messages come to roughly 366 GB.
Kafka Interview Questions # 28) How can you configure the Log Cleaner?
A) The log cleaner is enabled by default; this starts the pool of cleaner threads. To enable log cleaning on a particular topic, set the topic-level property
cleanup.policy=compact
(the corresponding broker-wide default is log.cleanup.policy). This can be done either at topic creation time or by altering the topic's configuration afterwards.
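As a sketch using the AdminClient API described above (the topic name compacted-topic is an example value), a compacted topic can be created like this:
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CompactedTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // cleanup.policy is the topic-level name of the compaction setting.
            NewTopic topic = new NewTopic("compacted-topic", 1, (short) 1)
                    .configs(Collections.singletonMap("cleanup.policy", "compact"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}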
Kafka Interview Questions # 29) What are Quotas?
A) A Kafka cluster has the ability to enforce quotas on requests to control the broker resources used by clients. Two types of client quotas can be enforced by Kafka brokers for each group of clients sharing a quota:
- Network bandwidth quotas define byte-rate thresholds (since 0.9).
- Request rate quotas define CPU utilization thresholds as a percentage of network and I/O threads (since 0.11).
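As an illustration (clientA and the byte rates, 1 MB/s produce and 2 MB/s fetch, are placeholder values), a network bandwidth quota can be set with the kafka-configs.sh tool shipped with Kafka:
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1048576,consumer_byte_rate=2097152' --entity-type clients --entity-name clientA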
Kafka Interview Questions # 30) What are Client groups?
A) The identity of Kafka clients is the user principal, which represents an authenticated user in a secure cluster. In a cluster that supports unauthenticated clients, the user principal is a grouping of unauthenticated users chosen by the broker using a configurable PrincipalBuilder.
Client-id is a logical grouping of clients with a meaningful name chosen by the client application. The tuple (user, client-id) defines a secure logical group of clients that share both user principal and client-id.