Amazon Bedrock pricing

Pricing overview

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

Amazon Bedrock offers flexible pricing options to support customers at every stage of their generative AI journey. Customers can choose from on-demand pricing for pay-as-you-go usage with no upfront commitments, or batch mode for cost-efficient processing of large volumes of input. For high-volume and predictable workloads, provisioned throughput provides dedicated model capacity with discounted pricing. These options help optimize cost while balancing speed, scale, and model access needs.


Service tiers and model import

Standard Tier

The Standard tier provides consistent performance at regular rates for everyday AI tasks. With the Standard tier you pay only for what you use, with no time-based term commitments. For text-generation models, you are charged for every input token processed and every output token generated. For embeddings models, you are charged for every input token processed. A token is the basic unit of text a model processes, typically a few characters; user input and prompts are split into tokens before the model interprets them. For image-generation models, you are charged for every image generated.
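The per-token billing above can be sketched as a small calculator. The rates used here are hypothetical placeholders, not actual Amazon Bedrock prices; real rates vary by model and Region.

```python
# Sketch of on-demand (Standard tier) billing for a text-generation model.
# Input and output tokens are metered separately, each at a per-1K-token rate.
# The rates below are hypothetical placeholders, not real Bedrock prices.

def on_demand_cost(input_tokens: int, output_tokens: int,
                   price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost = input tokens at the input rate plus output tokens at the output rate."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# Example: 2,000 input tokens and 500 output tokens at hypothetical rates of
# $0.003 per 1K input tokens and $0.015 per 1K output tokens.
cost = on_demand_cost(2000, 500, 0.003, 0.015)
print(round(cost, 4))  # 0.0135
```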


Batch

With Batch mode, you can provide a set of prompts as a single input file and receive responses as a single output file, allowing you to run large-scale predictions simultaneously. The responses are processed and stored in your Amazon S3 bucket so you can access them at a later time. Amazon Bedrock offers select foundation models (FMs) from leading AI providers like Anthropic, Meta, Mistral AI, and Amazon for batch inference at a 50% lower price compared to on-demand inference pricing. Refer to the list of supported models.
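The 50% batch discount described above can be expressed as a simple comparison. The on-demand cost used in the example is an arbitrary illustrative figure.

```python
# Sketch: batch inference is priced at 50% of on-demand for supported models.
# The example job cost is an arbitrary illustrative figure.

BATCH_DISCOUNT = 0.5  # batch inference is 50% cheaper than on-demand

def batch_cost(on_demand_cost: float) -> float:
    """Cost of the same workload when run as a batch inference job."""
    return on_demand_cost * (1 - BATCH_DISCOUNT)

# A workload that would cost $12.40 on-demand costs $6.20 in batch mode.
print(batch_cost(12.40))
```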

Tools

Guardrails

Amazon Bedrock Guardrails is the only responsible AI capability from a major cloud provider that helps you build and customize safety, privacy, and truthfulness safeguards for your generative AI applications. It evaluates user inputs and model responses based on use case-specific policies, providing an additional layer of safeguards beyond what's natively available. Safeguards from Guardrails can be applied to models hosted on Amazon Bedrock or with any third-party models (such as OpenAI and Google Gemini) via the ApplyGuardrail API. You can also use Guardrails with an agent framework such as Strands Agents, including agents deployed using Amazon Bedrock AgentCore. Guardrails helps filter hallucinations and improve factual accuracy through contextual grounding checks against RAG content and Automated Reasoning checks to deliver provably truthful responses.
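A minimal sketch of what an ApplyGuardrail request might look like when evaluating text from a third-party model. The guardrail identifier and version below are hypothetical placeholders, and the exact request shape should be checked against the API reference; in practice this dict would be passed to the bedrock-runtime client's ApplyGuardrail operation (e.g. via an AWS SDK).

```python
# Sketch of an ApplyGuardrail request body for evaluating text produced by a
# model hosted outside Amazon Bedrock. The guardrail identifier and version
# are hypothetical placeholders; verify the request shape against the
# ApplyGuardrail API reference before use.

def build_apply_guardrail_request(text: str, source: str = "OUTPUT") -> dict:
    return {
        "guardrailIdentifier": "gr-example123",  # placeholder guardrail ID
        "guardrailVersion": "1",                 # placeholder version
        "source": source,  # "INPUT" for user prompts, "OUTPUT" for model responses
        "content": [{"text": {"text": text}}],
    }

req = build_apply_guardrail_request("Draft reply to the customer...")
print(req["source"])  # OUTPUT
```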


Tools and optimization pricing details

  • Guardrails
  • Amazon Bedrock Guardrails

    On-Demand pricing

    | Guardrails filter*                                 | Price                           |
    |----------------------------------------------------|---------------------------------|
    | Content filters for classic tier (text content)    | €0.2587283 per 1,000 text units |
    | Denied topics for classic tier                     | €0.2587283 per 1,000 text units |
    | Sensitive information filters                      | €0.1724855 per 1,000 text units |
    | Sensitive information filters (regular expression) | Free                            |
    | Word filters                                       | Free                            |
    | Contextual grounding checks                        | €0.1724855 per 1,000 text units |

    * Each guardrail policy is optional and can be enabled based on your application requirements. Charges will be incurred based on the policy type used in the guardrail. For example, if a guardrail is configured with content filters and denied topics, charges will be incurred for these two policies, while there will be no charges associated with sensitive information filters.

    Note: A text unit can contain up to 1000 characters. If a text input is more than 1000 characters, it is processed as multiple text units, each containing 1000 characters or less. For example, if a text input contains 5600 characters, it will be charged for 6 text units.
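The text-unit rule above (one unit per started block of 1,000 characters) can be sketched as:

```python
# Text-unit billing sketch: a text unit covers up to 1,000 characters, so an
# input is charged one unit per started block of 1,000 characters.
import math

def text_units(char_count: int) -> int:
    return math.ceil(char_count / 1000)

print(text_units(5600))  # 6, matching the example above
print(text_units(1000))  # 1
print(text_units(1001))  # 2
```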

The contextual grounding check uses a reference source and a query to determine whether the model response is grounded in the source and relevant to the query. The total number of text units charged is calculated by combining all the characters in the source, query, and model response.
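Putting the two rules together, a grounding-check charge could be estimated as follows, using the per-1,000-text-unit rate from the table above; the character counts in the example are illustrative.

```python
# Sketch of contextual grounding check billing: text units are computed over
# the combined characters of the reference source, the query, and the model
# response, then charged at the rate from the pricing table above.
import math

GROUNDING_RATE_PER_1K_UNITS = 0.1724855  # EUR per 1,000 text units

def grounding_check_cost(source_chars: int, query_chars: int,
                         response_chars: int) -> float:
    # Characters are combined first, then converted to text units.
    units = math.ceil((source_chars + query_chars + response_chars) / 1000)
    return units / 1000 * GROUNDING_RATE_PER_1K_UNITS

# Illustrative example: a 3,000-character source, 200-character query, and
# 800-character response combine to 4,000 characters, i.e. 4 text units.
print(grounding_check_cost(3000, 200, 800))
```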