Table of Contents [expand]
Our model cards contain documentation for each available AI model.
Available Models
The Heroku Managed Inference and Agent add-on is hosted in two regions: us
and eu
. However, the add-on can be provisioned and accessed from apps in any Heroku region.
Each region offers slightly different models.
Region: us
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
Claude 4 Sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art large language model (LLM) that supports chat and tool-calling. |
Claude 3.7-sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat and tool-calling. |
Claude 3.5 Sonnet Latest | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat and tool-calling. |
Claude 3.5 Haiku | text → text |
/v1/chat/completions | Anthropic | A faster, more affordable LLM that supports chat and tool-calling. |
Amazon Nova Lite | text → text |
/v1/chat/completions | Amazon | A fast and cost-effective LLM. |
Amazon Nova Pro | text → text |
/v1/chat/completions | Amazon | A high-performance LLM designed for complex tasks. |
OpenAI gpt-oss-120b | text → text |
/v1/chat/completions | OpenAI | An open-weight LLM that supports chat and tool-calling. |
Cohere Embed Multilingual | text → embedding |
/v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |
Stable Image Ultra | text → image |
/v1/images/generations | Stability AI | A state-of-the-art diffusion (image generation) model. |
Region: eu
Model Documentation | Type | API Endpoint | Model Source | Description |
---|---|---|---|---|
Claude 4 Sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat and tool-calling. |
Claude 3.7 Sonnet | text → text |
/v1/chat/completions | Anthropic | A state-of-the-art LLM that supports chat and tool-calling. |
Claude 3 Haiku | text → text |
/v1/chat/completions | Anthropic | A faster, more affordable LLM that supports chat and tool-calling. |
Amazon Nova Lite | text → text |
/v1/chat/completions | Amazon | A fast and cost-effective LLM. |
Amazon Nova Pro | text → text |
/v1/chat/completions | Amazon | A high-performance LLM designed for complex tasks. |
Cohere Embed Multilingual | text → embedding |
/v1/embeddings | Cohere | A state-of-the-art embedding model that supports multiple languages. This model is helpful for developing RAG (Retrieval Augmented Generation) search. |