Karafka is an efficient, multi-threaded Kafka processing framework for Ruby and Rails that:
- Supports parallel processing in multiple threads (including work from a single topic partition)
- Has ActiveJob backend support (including ordered jobs)
- Automatically integrates with Ruby on Rails
- Supports in-development code reloading
- Is powered by librdkafka (the Apache Kafka C/C++ client library)
- Has out-of-the-box StatsD/DataDog monitoring with a dashboard template.
# Define what topics you want to consume with which consumers in karafka.rb
Karafka::App.routes.draw do
  topic 'system_events' do
    consumer EventsConsumer
  end
end

# And create your consumers, within which your messages will be processed
class EventsConsumer < ApplicationConsumer
  # Example that utilizes ActiveRecord#insert_all and Karafka batch processing
  def consume
    # Store all of the incoming Kafka events locally in an efficient way
    Event.insert_all messages.payloads
  end
end
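The consumer above stores a whole batch of messages with a single call. The sketch below illustrates that idea in plain Ruby, without Karafka, Kafka, or ActiveRecord. `FakeMessage`, `FakeEventStore`, and `consume_batch` are hypothetical stand-ins invented for this illustration and are not part of the Karafka API.

```ruby
require 'json'

# Hypothetical stand-in for a Kafka message (not Karafka's message class)
FakeMessage = Struct.new(:raw_payload) do
  # Deserialize the raw JSON string into a Ruby hash
  def payload
    JSON.parse(raw_payload)
  end
end

# Hypothetical stand-in for a model that supports bulk inserts
class FakeEventStore
  attr_reader :rows, :write_calls

  def initialize
    @rows = []
    @write_calls = 0
  end

  # Accepts an array of hashes and stores all of them in one call
  def insert_all(records)
    @write_calls += 1
    @rows.concat(records)
  end
end

# One bulk write for the whole batch instead of one write per message
def consume_batch(messages, store)
  store.insert_all(messages.map(&:payload))
end

messages = [
  FakeMessage.new({ 'id' => 1, 'type' => 'signup' }.to_json),
  FakeMessage.new({ 'id' => 2, 'type' => 'login' }.to_json)
]

store = FakeEventStore.new
consume_batch(messages, store)
```

However many messages arrive in the batch, the store is written to once, which is the property the `insert_all` example relies on.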
Karafka uses threads to handle many messages simultaneously in the same process. It does not require Rails but will integrate tightly with any Ruby on Rails application to make event processing dead simple.
If you're entirely new to the subject, you can start with our "Kafka on Rails" article series, which will get you up and running with the terminology and basic ideas behind using Kafka:
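As a rough illustration of handling many messages with several threads in one process, here is a minimal worker-pool sketch using only the Ruby standard library. It does not show how Karafka schedules work internally; `process_in_threads` and its `concurrency:` parameter are hypothetical names chosen for this example.

```ruby
# Hypothetical helper: runs the given block over all items using a fixed
# number of worker threads that share one job queue.
def process_in_threads(items, concurrency: 4, &block)
  jobs = Queue.new
  items.each { |item| jobs << item }
  # Closing the queue makes #pop return nil once it is drained,
  # which ends each worker loop
  jobs.close

  results = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      while (item = jobs.pop)
        results << block.call(item)
      end
    end
  end

  # Wait until every worker has finished
  workers.each(&:join)

  # Drain the thread-safe results queue into a plain array
  Array.new(results.size) { results.pop }
end

processed = process_in_threads(%w[a b c d e], concurrency: 3) do |message|
  message.upcase
end

# Completion order depends on thread scheduling, so sort before printing
puts processed.sort.inspect
```

Results come back in whatever order the threads finish, so code that depends on ordering needs to handle that explicitly.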
- Kafka on Rails: Using Kafka with Ruby on Rails – Part 1 – Kafka basics and its advantages
- Kafka on Rails: Using Kafka with Ruby on Rails – Part 2 – Getting started with Rails and Kafka
If you want to get started with Kafka and Karafka as fast as possible, then the best idea is to visit our Getting started guides and the example apps repository.
We also maintain many integration specs illustrating various use-cases and features of the framework.
Prerequisites: Kafka running. You can start it by following instructions from here.
- Add and install Karafka:
bundle add karafka
bundle exec karafka install
- Dispatch a message to the example topic using the Rails or Ruby console:
Karafka.producer.produce_sync(topic: 'example', payload: { 'ping' => 'pong' }.to_json)
- Run Karafka server and see the consumption magic happen:
bundle exec karafka server
[7616dc24-505a-417f-b87b-6bf8fc2d98c5] Polled 1 message in 1000ms
[dcf3a8d8-0bd9-433a-8f63-b70a0cdb0732] Consume job for ExampleConsumer on example started
{"ping"=>"pong"}
[dcf3a8d8-0bd9-433a-8f63-b70a0cdb0732] Consume job for ExampleConsumer on example finished in 0ms
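The message is dispatched as a JSON string, while the consumer output above shows a Ruby hash. The sketch below reproduces that round trip in plain Ruby. It assumes the consumer deserializes payloads as JSON; it only demonstrates the serialization step and does not talk to Kafka.

```ruby
require 'json'

# What the producer sends: a Ruby hash serialized to a JSON string
payload = { 'ping' => 'pong' }.to_json

# What a JSON-deserializing consumer would work with: a Ruby hash again
parsed = JSON.parse(payload)

puts payload
puts parsed['ping']
```

Here `payload` is the string `{"ping":"pong"}` and `parsed` is a hash equal to the one passed to `to_json`.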
I also sell Karafka Pro subscriptions. A subscription includes a commercial-friendly license, priority support, architecture consultations, and features for high-throughput data processing (virtual partitions, long-running jobs, and more).
20% of the income will be distributed back to other OSS projects that Karafka uses under the hood.
Help me provide high-quality open-source software. Please see the Karafka homepage for more details.
Karafka has Wiki pages for almost everything and a pretty decent FAQ. They cover installation, setup, and deployment, along with other useful details on how to run Karafka.
If you have questions about using Karafka, feel free to join our Slack channel.
Priority support for technical and architectural questions is available as part of the Karafka Pro subscription.