Kafka Streams on Heroku

Table of Contents

  • Connecting Your App
  • Managing Internal Topics and Consumer Groups
  • Scaling Your App
  • Caveats

Last updated June 09, 2026

Kafka Streams is a Java client library for processing streaming data with Apache Kafka on Heroku. Use Kafka Streams to develop lightweight, scalable, and fault-tolerant stream processing apps.

Apps built with Kafka Streams produce and consume data from streams, which are unbounded, replayable, ordered, and fault-tolerant sequences of events. A stream is either represented as a Kafka topic (KStream) or materialized as compacted topics (KTable). By default, the library makes sure that your app handles stream events one at a time and handles late-arriving or out-of-order events.

Apps that use Kafka Streams deploy to Heroku as normal Java services and communicate directly with Kafka clusters. Both dedicated and basic Apache Kafka on Heroku plans support Kafka Streams, but basic plans require some additional setup.

Connecting Your App

See Connecting to a Kafka Cluster for more information.

Managing Internal Topics and Consumer Groups

Kafka Streams uses internal topics for fault tolerance and repartitioning. Apps with Kafka Streams require these topics to work properly.

Creating Kafka Streams internal topics is unrelated to Kafka’s auto.create.topics.enable config. Kafka Streams communicates with clusters directly through an admin client for topic creation.

Dedicated Kafka Plans

Dedicated Kafka clusters don’t require additional configuration.

See dedicated plans and configurations for more details.

Basic Kafka Plans

Basic Kafka plans run on multi-tenant clusters. Kafka Access Control Lists (ACLs) isolate user data and access privileges. Also, basic plans namespace topic and consumer group names with an auto-generated prefix to prevent naming collisions.

Running Kafka Streams apps on basic plans requires two preliminary steps: setting up the application.id and pre-creating internal topics and consumer groups.

Setting Your application.id

Each Kafka Streams app has a unique identifier, the application.id, that identifies the app and its associated topology. If you use an Apache Kafka on Heroku basic plan, make sure that each application.id begins with your cluster’s assigned prefix:

properties.put(StreamsConfig.APPLICATION_ID_CONFIG, String.format("%saggregator-app", HEROKU_KAFKA_PREFIX));
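As a minimal sketch of the line above, the prefix can be read from the environment at startup. The sketch assumes the prefix is available in a KAFKA_PREFIX environment variable, the same variable the CLI commands later in this article reference. The class and method names, and the prefix value in the usage note, are hypothetical. To keep the example free of dependencies, the config key is written as the literal string "application.id", which is the value of StreamsConfig.APPLICATION_ID_CONFIG.

```java
import java.util.Properties;

class PrefixedApplicationId {
    // Prepends the cluster prefix to the app name.
    // Treating a missing prefix as empty is an assumption of this sketch.
    static String applicationId(String prefix, String name) {
        return (prefix == null ? "" : prefix) + name;
    }

    // Builds a Properties object that holds only the application.id.
    static Properties streamsProperties(String prefix, String name) {
        Properties properties = new Properties();
        // "application.id" is the value of StreamsConfig.APPLICATION_ID_CONFIG.
        properties.put("application.id", applicationId(prefix, name));
        return properties;
    }

    public static void main(String[] args) {
        // Assumption: the prefix is exposed as the KAFKA_PREFIX environment variable.
        String prefix = System.getenv("KAFKA_PREFIX");
        Properties properties = streamsProperties(prefix, "aggregator-app");
        System.out.println(properties.getProperty("application.id"));
    }
}
```

With a hypothetical prefix of example-1234., applicationId returns example-1234.aggregator-app. Pass the resulting properties, along with the rest of your configuration, to your Kafka Streams app as usual.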

Pre-Creating Internal Topics and Consumer Groups

Because basic plans on Apache Kafka on Heroku use limited ACLs, Kafka Streams apps can’t interact with topics and consumer groups directly. This limitation is problematic because Kafka Streams uses an internal admin client to create internal topics and consumer groups transparently at runtime. This scenario primarily affects processors in Kafka Streams.

Processors are classes that implement a process method. They receive input events from a stream, process those events, and optionally produce output events to downstream processors. Stateful processors are processors that use state produced by previous events when processing subsequent ones. Kafka Streams provides built-in functionality for storage of this state.

For each stateful processor in your app, create its two internal topics for changelog and repartition. Internal topics follow the naming convention <application.id>-<operatorName>-<suffix>:

$ heroku kafka:topics:create ${KAFKA_PREFIX}aggregator-app-<OPERATOR>-changelog --app example-app
$ heroku kafka:topics:create ${KAFKA_PREFIX}aggregator-app-<OPERATOR>-repartition --app example-app

Additionally, create a single consumer group for your app that matches the application.id:

$ heroku kafka:consumer-groups:create ${KAFKA_PREFIX}aggregator-app --app example-app
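The naming convention above can be sketched as a small helper that lists the internal topics to pre-create for a set of stateful operators. This is an illustration only: the class name, method name, and the operator name word-counts are hypothetical, and the real operator names depend on your app's topology.

```java
import java.util.ArrayList;
import java.util.List;

class InternalTopicNames {
    // Returns the changelog and repartition topic names for each operator,
    // following <application.id>-<operatorName>-<suffix>.
    static List<String> internalTopics(String applicationId, List<String> operatorNames) {
        List<String> topics = new ArrayList<>();
        for (String operator : operatorNames) {
            topics.add(applicationId + "-" + operator + "-changelog");
            topics.add(applicationId + "-" + operator + "-repartition");
        }
        return topics;
    }

    public static void main(String[] args) {
        // Assumption: the prefix is exposed as the KAFKA_PREFIX environment variable.
        String prefix = System.getenv("KAFKA_PREFIX");
        String applicationId = (prefix == null ? "" : prefix) + "aggregator-app";

        // "word-counts" is a hypothetical operator name.
        List<String> operators = new ArrayList<>();
        operators.add("word-counts");

        for (String topic : internalTopics(applicationId, operators)) {
            System.out.println(topic);
        }
    }
}
```

Each printed name corresponds to one heroku kafka:topics:create call. The application.id itself is the name to use for the consumer group.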

See basic plans and configurations for more details.

Scaling Your App

Parallelism Model

In Kafka Streams, the stream task is the fundamental unit of parallelism. Each stream task processes one or more partitions. Partitions therefore set the upper bound for the parallelism of a Kafka Streams service: the number of stream tasks must be less than or equal to the number of partitions.

Each instance of a Kafka Streams app contains stream threads. These threads are responsible for running one or more stream tasks. Kafka Streams makes sure that it spreads input partitions evenly across stream tasks so that it can consume and process all events.

Kafka Streams uses an application.id to make sure that it spreads partitions transparently and evenly across stream tasks. The application.id identifies your Kafka Streams service and is unique per Kafka cluster.

Vertical Scaling

By default, Kafka Streams creates one stream thread per app instance. Each stream thread runs one or more stream tasks. Scale an app instance by scaling its number of stream threads. To do so, modify the num.stream.threads config value in your app. The app transparently rebalances workload across threads within each app instance.
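One way to make the thread count adjustable without a code change is to read it from the environment, sketched below. The STREAM_THREADS variable and the class and method names are hypothetical. The config key is written as the literal string "num.stream.threads" so that the example has no dependencies.

```java
import java.util.Properties;

class StreamThreadsConfig {
    // Returns a copy of the base config with num.stream.threads set.
    static Properties withStreamThreads(Properties base, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be at least 1");
        }
        Properties properties = new Properties();
        properties.putAll(base);
        properties.put("num.stream.threads", Integer.toString(threads));
        return properties;
    }

    public static void main(String[] args) {
        // STREAM_THREADS is a hypothetical variable; fall back to the default of 1.
        String raw = System.getenv("STREAM_THREADS");
        int threads = (raw == null || raw.trim().isEmpty()) ? 1 : Integer.parseInt(raw.trim());

        Properties properties = withStreamThreads(new Properties(), threads);
        System.out.println(properties.getProperty("num.stream.threads"));
    }
}
```

Copying the base config instead of mutating it keeps the helper safe to call from several places. Whether more threads help depends on the dyno size and on how many stream tasks the instance receives.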

Horizontal Scaling

Kafka Streams rebalances workload and local state across instances as the number of app instances changes. This rebalancing works transparently by distributing workload and local state across instances with the same application.id. Scale Kafka Streams apps horizontally by scaling the number of dynos with the heroku ps:scale command.

The number of input partitions is effectively the upper bound for parallelism. Make sure that the number of stream tasks doesn’t exceed the number of input partitions. Otherwise, this over-provisioning results in idle app instances.
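The upper bound can be checked with simple arithmetic before you scale. The sketch below assumes, as a simplification, one stream task per input partition, so any stream threads beyond the partition count have nothing to run. The class and method names and the numbers in main are hypothetical.

```java
class StreamParallelism {
    // Total stream threads across all dynos.
    static int totalThreads(int dynos, int threadsPerDyno) {
        return dynos * threadsPerDyno;
    }

    // Threads left without a task, assuming one task per input partition.
    static int idleThreads(int inputPartitions, int dynos, int threadsPerDyno) {
        return Math.max(0, totalThreads(dynos, threadsPerDyno) - inputPartitions);
    }

    public static void main(String[] args) {
        int inputPartitions = 8;
        int dynos = 3;
        int threadsPerDyno = 4;

        System.out.println("total threads: " + totalThreads(dynos, threadsPerDyno));
        System.out.println("idle threads: " + idleThreads(inputPartitions, dynos, threadsPerDyno));
        // prints "total threads: 12" and "idle threads: 4"
    }
}
```

In this example, scaling down to two dynos with four threads each matches the eight input partitions and leaves no thread idle.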

Caveats

RocksDB Persistence

Kafka Streams uses RocksDB as the default storage engine for persistent stores. Because dynos are backed by an ephemeral filesystem, it isn’t practical to rely on the underlying disk for durable storage. This configuration presents a challenge for using RocksDB with Kafka Streams on Heroku. However, using the default RocksDB configuration isn’t a requirement. Kafka Streams treats RocksDB as a write-through cache, where the source of truth is the underlying changelog internal topic. If there’s no underlying RocksDB, the app replays the state directly from the changelog topics on startup.

By default, replaying state directly from changelog topics incurs additional latency when rebalancing your app instances or when dynos restart. To minimize latency, configure Kafka Streams to fail over stream tasks to their associated standby tasks.

Standby tasks are replicas of stream tasks that maintain fully replicated copies of state. Dynos use standby tasks to resume work immediately instead of waiting for the state to rebuild from the changelog topics.

To change the number of standby tasks, modify the num.standby.replicas config in your app.
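A minimal sketch of that setting follows. The class and method names are hypothetical, and the config key is written as the literal string "num.standby.replicas" to avoid dependencies. The minimumInstances helper rests on an assumption of this sketch: a standby copy is only useful on a different instance than its active task, so each replica needs one more dyno.

```java
import java.util.Properties;

class StandbyReplicasConfig {
    // Returns a config fragment that requests the given number of standby replicas.
    static Properties standbyProperties(int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must not be negative");
        }
        Properties properties = new Properties();
        properties.put("num.standby.replicas", Integer.toString(replicas));
        return properties;
    }

    // Assumption: every standby copy lives on a different instance than the
    // active task, so the app needs at least replicas + 1 instances.
    static int minimumInstances(int replicas) {
        return replicas + 1;
    }

    public static void main(String[] args) {
        int replicas = 1;
        Properties properties = standbyProperties(replicas);
        System.out.println(properties.getProperty("num.standby.replicas"));
        System.out.println("dynos needed: " + minimumInstances(replicas));
    }
}
```

Each standby replica keeps another copy of the state up to date from the changelog topics, so more replicas trade additional resource use for faster failover.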