Running Kafka in Kubernetes with Strimzi

Faheem

Kubernetes is not the first platform that comes to mind for running Apache Kafka clusters. Indeed, Kafka's strong dependency on storage can be a pain point regarding Kubernetes' way of doing things when it comes to persistent storage. Kafka brokers are unique and stateful: how can we implement this in Kubernetes?

Let's go through the basics of Strimzi, a Kafka operator for Kubernetes maintained by Red Hat, and see what problems it solves.

A particular focus will be put on how to plug additional Kafka tools into a Strimzi installation.

We will also compare Strimzi with other Kafka operators by going over their pros and cons.

Strimzi




Strimzi logo

Strimzi is a Kubernetes Operator aimed at reducing the cost of deploying Apache Kafka clusters on cloud native infrastructures.

As an operator, Strimzi extends the Kubernetes API by providing custom resources to natively manage Kafka resources, including:

  • Kafka clusters
  • Kafka topics
  • Kafka users
  • Kafka MirrorMaker2 instances
  • Kafka Connect instances
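Once the operator is installed, these custom resources are visible through the Kubernetes API. As a quick sketch (assuming kubectl access to a cluster with Strimzi deployed; exact CRD names depend on the Strimzi version):

```shell
# List the CRDs registered by Strimzi
kubectl get crds | grep strimzi.io

# Show the resource kinds provided by the kafka.strimzi.io API group
kubectl api-resources --api-group=kafka.strimzi.io
```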

The project is currently at the "Sandbox" stage at the Cloud Native Computing Foundation.

Note: The CNCF website defines a "sandbox" project as "Experimental projects not yet widely tested in production on the bleeding edge of technology."

With Strimzi, deploying a 3-broker TLS-encrypted cluster is as simple as applying the following YAML file:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.2.3
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.2"
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
        - id: 1
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
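Assuming the operator already watches the kafka namespace, the cluster above can be created and monitored with plain kubectl (the manifest file name is an assumption):

```shell
# Apply the Kafka custom resource and wait until Strimzi reports it Ready
kubectl apply -f kafka-cluster.yaml -n kafka
kubectl wait kafka/my-cluster --for=condition=Ready --timeout=300s -n kafka
```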

A topic looks like this:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 1
  replicas: 1
  config:
    retention.ms: 7200000
    segment.bytes: 1073741824
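Once applied, the Topic Operator materializes the topic in Kafka. It can be checked both through Kubernetes and from inside a broker pod (a sketch; the namespace and pod name are assumptions based on the cluster example above):

```shell
# Strimzi-managed topics, as seen by Kubernetes
kubectl get kafkatopics -n kafka

# The same topics, as seen by Kafka itself
kubectl exec -n kafka my-cluster-kafka-0 -- \
  bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
```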

Both of these examples come from the examples directory of the Strimzi operator. This directory includes many more examples covering all of Strimzi's capabilities.

Security

An interesting aspect of Strimzi is its out-of-the-box security features. By default, intra-broker communication is encrypted with TLS, while communication with ZooKeeper is both authenticated and encrypted with mTLS.

The Apache ZooKeeper clusters backing the Kafka instances are not exposed outside of the Kubernetes cluster, providing additional security.

These configurations are actually impossible to override, though it is possible to access ZooKeeper by using a workaround project by scholzj.

Strimzi PodSets

Kubernetes comes with its own solution for managing distributed stateful applications: StatefulSets.

The official documentation states:

(A StatefulSet) manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.

While StatefulSets have the merit of being Kubernetes native resources, they come with some limitations.

Here are a few examples:

  • Scaling up and down is linear. If you have a StatefulSet with 3 pods (pod-1, pod-2, pod-3), scaling up will create pod-4 and scaling down can only delete pod-4. This can be a problem if you need to remove a specific pod of your deployment. Applied to Kafka, you can end up in a situation where a bad topic makes a broker unstable; with StatefulSets you cannot delete this particular broker and scale out a fresh new one.
  • All pods share the same specs (CPU, memory, number of PVCs, etc.)
  • Node failure requires manual intervention
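For the specific-pod case, Strimzi does document an escape hatch: annotating a broker pod so the operator deletes and recreates it together with its storage. A sketch, assuming the annotation behaves as described in the Strimzi documentation and that pod names follow the cluster example above:

```shell
# Ask the operator to delete this broker pod and its PersistentVolumeClaim
# on the next reconciliation, so both are recreated from scratch
kubectl annotate pod my-cluster-kafka-1 strimzi.io/delete-pod-and-pvc=true -n kafka
```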

These limitations were addressed by the Strimzi team by developing their own resource: the StrimziPodSet, a feature introduced in Strimzi 0.29.0.

The benefits of using StrimziPodSets include:

  • Scaling up and down is more flexible
  • Per-broker configuration
  • Opens the gate for broker specialization once ZooKeeper-less Kafka is GA (KIP-500, more on this topic later in the article)

A drawback of using StrimziPodSets is that the Strimzi Operator instance becomes critical.

If you want to hear more about Strimzi PodSets, feel free to watch the StrimziPodSets – What is it and why should you care? video by Jakub Scholz.

Deploying Strimzi

Strimzi's Quickstart documentation is perfectly complete and functional.

We will focus the rest of the article on addressing useful topics that are not covered by Strimzi.

Kafka UI on top of Strimzi

Strimzi brings a lot of comfort for users when it comes to managing Kafka resources in Kubernetes. We wanted to bring something to the table by showing how to deploy a Kafka UI on top of a Strimzi cluster as a native Kubernetes resource.

There are several open source Kafka UI projects on GitHub, to quote a few:

Let's go for Kafka UI, which has the cleanest UI (in my opinion) among the competition.

The project provides official Docker images, as we can see in the documentation. We will leverage this image and deploy a Kafka UI instance as a Kubernetes Deployment.

The following YAML is an example of a Kafka UI instance configured for a SCRAM-SHA-512 authenticated Strimzi Kafka cluster. The UI itself authenticates users against an OpenLDAP via ldaps.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-kafka-ui
  namespace: kafka
spec:
  selector:
    matchLabels:
      app: cluster-kafka-ui
  template:
    metadata:
      labels:
        app: cluster-kafka-ui
    spec:
      containers:
        - image: provectuslabs/kafka-ui:v0.4.0
          name: kafka-ui
          ports:
            - containerPort: 8080
          env:
            - name: KAFKA_CLUSTERS_0_NAME
              value: "cluster"
            - name: KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS
              value: "cluster-kafka-bootstrap:9092"
            - name: KAFKA_CLUSTERS_0_PROPERTIES_SECURITY_PROTOCOL
              value: SASL_PLAINTEXT
            - name: KAFKA_CLUSTERS_0_PROPERTIES_SASL_MECHANISM
              value: SCRAM-SHA-512
            - name: KAFKA_CLUSTERS_0_PROPERTIES_SASL_JAAS_CONFIG
              value: 'org.apache.kafka.common.security.scram.ScramLoginModule required username="admin" password="XSnBiq6pkFNp";'

            - name: AUTH_TYPE
              value: LDAP
            - name: SPRING_LDAP_URLS
              value: ldaps://myldapinstance.company:636
            - name: SPRING_LDAP_DN_PATTERN
              value: uid={0},ou=People,dc=company
            - name: SPRING_LDAP_ADMINUSER
              value: uid=admin,ou=Apps,dc=company
            - name: SPRING_LDAP_ADMINPASSWORD
              value: Adm1nP@ssw0rd!

            - name: JAVA_OPTS
              value: "-Djdk.tls.client.cipherSuites=TLS_RSA_WITH_AES_128_GCM_SHA256 -Djavax.net.ssl.trustStore=/etc/kafka-ui/ssl/truststore.jks"
          volumeMounts:
            - name: truststore
              mountPath: /etc/kafka-ui/ssl
              readOnly: true
      volumes:
        - name: truststore
          secret:
            secretName: myldap-truststore
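For a quick look at the UI without creating a Service or an Ingress, a port-forward is enough (a sketch; the deployment name matches the manifest above):

```shell
# Forward local port 8080 to the Kafka UI deployment
kubectl port-forward -n kafka deployment/cluster-kafka-ui 8080:8080
# The UI is then reachable at http://localhost:8080
```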

Note: By leveraging a PLAINTEXT internal listener on port 9092, we do not need to provide a KAFKA_CLUSTERS_0_PROPERTIES_SSL_TRUSTSTORE_LOCATION configuration.

With this configuration, users have to authenticate via LDAP to the Kafka UI. Once they are logged in, the underlying user used for interactions with the Kafka cluster is the admin user defined in KAFKA_CLUSTERS_0_PROPERTIES_SASL_JAAS_CONFIG. Role-based access control was recently introduced with this issue.

Schema Registry with Strimzi

We had a functional need to deploy a Schema Registry instance for our Kafka clusters running in Kubernetes.

While Strimzi goes the extra mile by managing additional tools like Kafka Connect or MirrorMaker instances, it is not yet able to deploy a Schema Registry.

To mitigate this issue, the Rubin Observatory Science Quality and Reliability Engineering team worked on the strimzi-registry-operator.

The configurations we used are the ones showcased in the example section of the README.

The only issue we encountered was that the operator is not yet able to deploy a Schema Registry backed by a SCRAM-SHA-512 secured cluster.

What about ZooKeeper-less Kafka?

After a few years of work on KIP-500, the Apache Kafka team finally announced that running Kafka in KRaft mode (ZooKeeper-less) had become production ready. The announcement was made as part of the Kafka 3.3 release.

The Strimzi team started working on KRaft mode in Strimzi 0.29.0. As stated in the Strimzi documentation, the feature is still experimental, at both the Kafka and Strimzi levels.

Strimzi's main contributor, Jakub Scholz, commented the following on the matter:

I think calling it production ready for new clusters is a bit strange. It means that we would need to maintain two parallel code paths with guaranteed upgrades etc. for possibly a long time. So, TBH, I hoped we would have much more progress at this point in time and be more prepared for ZooKeeper removal. But as my personal opinion – I would probably be very reluctant to call anything at this stage production ready anyway.

Following these comments, we can guess that ZooKeeper-less Kafka is not going to be the default configuration in Strimzi in the next release (0.34.0 at the time of writing), but it will definitely happen at some point.

What about storage?

Storage is often a pain point with bare metal Kubernetes clusters, and Kafka is no exception.

The community consensus for provisioning storage on Kubernetes is via Ceph with Rook, though other solutions exist (Longhorn or OpenEBS on the open source side, Portworx or Linstor as proprietary solutions).

Comparing storage engines for bare metal Kubernetes clusters is too big a topic to be included in this article, but feel free to check out our previous article "Ceph object storage within a Kubernetes cluster with Rook" for more on Rook.

We did have the opportunity to compare performance between a 3-broker Kafka installation with Strimzi/Rook Ceph and a 3-broker Kafka cluster running on the same machines with direct disk access.

Here are the specs and results of the benchmark:

Specs

Kubernetes environment:

  • Kafka version 3.2.0 on Kubernetes via Strimzi
  • 3 brokers (one pod per node)
  • 6 RBD devices per broker (provisioned by the Rook Ceph StorageClass)
  • Xms java default (2g)
  • Xmx java default (29g)

Bare metal environment:

  • Kafka version 3.2.0 as a JVM process with the Apache release
  • 3 brokers (one JVM per node)
  • 6 disk devices per broker (JBOD with ext4 formatting)
  • Xms java default (2g)
  • Xmx java default (29g)

Notes: The benchmarks were run on the same machines (HP Gen 7 with 192 GB RAM and 6 x 2 TB disks) with RHEL 7.9. Kubernetes was not running when Kafka ran as a JVM process, and vice versa.

kafka-producer-perf-test \
  --topic my-topic-benchmark \
  --record-size 1000 \
  --throughput -1 \
  --producer.config /mnt/kafka.properties \
  --num-records 50000000

Note: The topic my-topic-benchmark has 100 partitions and 1 replica.

Results

We ran the previous benchmark 10 times on each configuration and averaged the results:

Metric             JBOD bare metal   Ceph RBD   Performance difference
Records/sec        75223             65207      – 13.3 %
Avg latency (ms)   1.45              1.28       + 11.1 %

The results are interesting: while write performance was better on JBOD, latency was lower using Ceph.
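To put the records/sec figures in perspective, they translate to raw write throughput as follows (a back-of-the-envelope sketch; numbers taken from the table above):

```shell
# records/sec multiplied by the 1000-byte record size gives bytes/sec,
# divided by 1,000,000 gives MB/s
record_size=1000   # bytes, from --record-size in the benchmark command
jbod_rps=75223     # records/sec, JBOD bare metal
ceph_rps=65207     # records/sec, Ceph RBD

echo "JBOD: $(( jbod_rps * record_size / 1000000 )) MB/s"   # prints "JBOD: 75 MB/s"
echo "Ceph: $(( ceph_rps * record_size / 1000000 )) MB/s"   # prints "Ceph: 65 MB/s"
```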

Strimzi alternatives

There are two main alternatives to Strimzi when it comes to operating Kafka on Kubernetes:

  • Koperator
  • The Confluent Operator

We did not test Koperator thoroughly, so it would be unfair to compare it to Strimzi in this article.

As for the Confluent operator, it provides many features that we do not have with Strimzi. Here are a few that we found interesting:

  • Schema Registry integration
  • ksqlDB integration
  • LDAP authentication support
  • Out-of-the-box UI (Confluent Control Center) for both admins and developers
  • Alerting
  • Tiered storage

All of these come at the cost (literally) of buying a commercial license from Confluent. Note that the operator and Control Center can be tested during a 30-day trial period.
