1. Introduction
Apache Kafka is a robust distributed streaming platform frequently used to handle real-time data in a fault-tolerant way. The data that Kafka producers send is typically text-based, such as JSON. Compressing that data reduces network bandwidth usage and storage requirements. Spring Boot simplifies working with Apache Kafka, making it easy to configure and incorporate a Kafka message compression method into your application.
When utilising Spring Boot with the Apache Kafka client, several compression choices are available for the producer configuration: gzip, snappy, lz4, and zstd. By applying one of these options, you can be certain that your application will compress messages sent to the Kafka cluster, optimising resource utilisation while maintaining high throughput and low latency. With the correct options and Spring Boot’s simple setup procedure, you can efficiently manage massive volumes of data without sacrificing application performance.
2. Understanding Kafka Message Compression Techniques
Kafka message compression is a powerful feature that allows you to reduce the size of messages, conserving valuable storage and network resources. Compression is enabled on the producer side; the corresponding brokers and consumers don’t need any changes, as consumers decompress messages transparently. You can choose between different compression algorithms depending on your specific requirements.
There are five options available for the compression type in Kafka:
- none: No compression is applied to the messages.
- gzip: Messages are compressed using the widely-used Gzip algorithm, which balances compression ratio and speed well.
- snappy: This option uses the Snappy algorithm, known for its low latency and high-speed compression.
- lz4: The LZ4 algorithm offers a faster compression speed than Gzip but might achieve slightly lower compression ratios.
- zstd: Zstandard is a relatively new compression algorithm that offers excellent compression ratios and fast decompression speeds.
To apply your chosen compression technique, you should set the compression.type parameter in your Kafka producer configuration.
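If you rely on Spring Boot’s auto-configured producer rather than defining the factory bean yourself (as we do later in this article), the same setting can be applied declaratively. Here is a minimal application.yml sketch, assuming a local broker:

spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      # Maps to the producer's compression.type parameter
      compression-type: snappy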
3. Prerequisites and Setup
You may skip this section if you only want to browse the code examples rather than follow the tutorial step by step.
If you want to follow this tutorial in your IDE, I will assume you already have Apache Kafka running inside a Docker container. You may want to read how to run Kafka if you don’t.
3.1. Generate Project Template
Before building the Spring Boot Kafka project, we first need to use Spring Initializr to generate a project template with all the necessary dependencies. Once it has been generated, we can import the template into our IDE and carry on with development.
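As a minimal sketch, the generated build.gradle should contain at least the following dependencies (Lombok is used by the code samples below; versions are managed by the Spring Boot plugin):

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter'
    implementation 'org.springframework.kafka:spring-kafka'
    compileOnly 'org.projectlombok:lombok'
    annotationProcessor 'org.projectlombok:lombok'
}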

4. Kafka Message Compression With Spring Boot
To begin, set up your Spring Boot application for Kafka integration. First, add the Jackson databind dependency to your build.gradle file:
implementation 'com.fasterxml.jackson.core:jackson-databind'
In Spring Boot, you can specify the compression type in the Kafka producer configuration bean in your application configuration. The KafkaTemplate defined there will be used to send messages to the Kafka broker. Here’s an example configuration using the Snappy algorithm:
import static org.apache.kafka.clients.producer.ProducerConfig.*;

@Configuration
public class ProducerConfiguration {

    @Bean
    public ProducerFactory<String, Person> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        // Compress every batch sent by this producer with Snappy
        config.put(COMPRESSION_TYPE_CONFIG, "snappy");
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, Person> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
This is followed by a basic ConsumerConfiguration that can deserialize the Person object from JSON.
import static org.apache.kafka.clients.consumer.ConsumerConfig.*;

@Configuration
public class ConsumerConfiguration {

    @Bean
    public ConsumerFactory<String, Person> consumerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        config.put(GROUP_ID_CONFIG, "test");
        // Allow the JsonDeserializer to instantiate types from any package
        config.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
        return new DefaultKafkaConsumerFactory<>(config);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Person> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Person> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
}
Finally, the actual Person class.
@Data
@AllArgsConstructor
@NoArgsConstructor
public class Person {
    private String name;
    private int age;
}
5. Sending and Receiving Compressed Messages
First, we need to create a PersonProducer class with KafkaTemplate that will be used to send messages:
@Service
@AllArgsConstructor
public class PersonProducer {

    private KafkaTemplate<String, Person> kafkaTemplate;

    public void sendMessage(String topic, Person personMessage) {
        kafkaTemplate.send(topic, personMessage);
    }
}
Next is the simple Kafka consumer that will print received messages:
@Slf4j
@Service
public class PersonConsumer {

    @KafkaListener(topics = "compression_topic", groupId = "test")
    public void listen(Person personMessage) {
        log.info(String.format(
                "Received person: %s, age: %d",
                personMessage.getName(), personMessage.getAge()));
    }
}
Finally, we are going to use @PostConstruct to send messages once the application context is loaded:
@SpringBootApplication
public class KafkaMessageCompressionApplication {

    @Autowired
    private PersonProducer personProducer;

    public static void main(String[] args) {
        SpringApplication.run(KafkaMessageCompressionApplication.class, args);
    }

    @PostConstruct
    public void postConstruct() {
        personProducer.sendMessage("compression_topic", new Person("Jack", 20));
        personProducer.sendMessage("compression_topic", new Person("Tom", 25));
    }
}
Run the application. The producer configuration is printed in the startup log, and the compression.type value should be set to snappy. If you have a Kafka server running locally, the two messages will be published. You can verify this with the following commands:
docker exec -it broker bash   # prefix with winpty on Windows
kafka-console-consumer --bootstrap-server=localhost:9092 \
    --topic compression_topic --from-beginning
5.1. Handling Large Messages
If your application needs to send large messages to Kafka, you can increase the max.request.size option in your ProducerConfiguration. This setting controls the maximum size of a request the producer will send:
config.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 20971520);
The example above limits request sizes to 20 MB. Larger sizes may hurt the performance of your Kafka deployment, so be sure to set this value according to your application’s needs.
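Keep in mind that the broker and topic must also accept messages of that size. As a rough sketch using the TopicBuilder approach covered later in this article, the topic-level max.message.bytes setting could be raised to match (the topic name here is hypothetical):

@Bean
public NewTopic largeMessageTopic() {
    // Hypothetical topic; max.message.bytes must cover the producer's max.request.size
    return TopicBuilder.name("large_message_topic")
            .config(TopicConfig.MAX_MESSAGE_BYTES_CONFIG, "20971520")
            .build();
}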
6. Compression Type on Kafka Topic
You can also apply compression on a per-topic basis by using the TopicConfig.COMPRESSION_TYPE_CONFIG parameter. This allows you to enable a specific compression algorithm for individual topics, catering to the varying needs of your Kafka consumers.
It’s essential to consider the characteristics of your messages when configuring their compression settings. For example, textual content might achieve better compression ratios with Gzip, while binary data could benefit more from LZ4 or Zstandard.
Additionally, it would be best to weigh the potential performance trade-offs when choosing between low-latency compression options like Snappy and more computationally expensive methods like Zstandard.
6.1. Creating Kafka Topic Programmatically
To manage topics in your application, you can use the TopicBuilder class to create new topics. You can also customize topic configurations via the TopicConfig class. For example, to define the replication factor or set the compression type on a topic, we can do the following:
@Bean
public NewTopic yourTopic() {
    return TopicBuilder.name("compression_topic_two")
            .partitions(3)
            .replicas(2)
            .config(TopicConfig.COMPRESSION_TYPE_CONFIG, "snappy")
            .build();
}
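To confirm the topic-level setting, you can describe the topic configuration from inside the broker container (assuming the same container name as before):

docker exec -it broker kafka-configs --bootstrap-server localhost:9092 \
    --entity-type topics --entity-name compression_topic_two --describe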
7. Conclusion
In conclusion, learning Kafka message compression techniques greatly benefits developers who want to improve the performance of their messaging systems. The examples above show how simple it is to send and receive compressed messages and to handle large ones. Creating Kafka topics programmatically and setting their compression types gives even more control over the messaging system’s performance. With these techniques and tools, developers can improve both their messaging systems and the overall efficiency of their applications.