AWS Step Functions: Standard vs. Express Explained

AWS Step Functions is an amazing orchestration service that enables you to design and manage complex workflows by coordinating various AWS services. It simplifies building and running multi-step applications and ensures that each step happens in the right order and errors are handled gracefully.

In this post, we'll dive into the world of Express and Standard Step Functions, the two types of workflows offered by AWS Step Functions.

We'll break down their differences, explore when to use each one, and discuss the benefits of both, helping you pick the right one for your needs.

Step Functions Comparison

To illustrate the performance differences, we’ve prepared a very simple application that you can deploy and run in your own AWS account.

Comparison Overview

Before we jump into the specifics, here’s a summary comparing the limits and capabilities of both workflow types:

| Type / Category | Standard Workflows | Express Workflows |
|---|---|---|
| ⏳ Maximum duration | One year | Five minutes |
| 🏁 Execution start rate | See API action throttling quotas. As of 09/2024: bucket size 800-1300, refill rate 150-300/s | See API action throttling quotas. As of 09/2024: bucket size 6000, refill rate 6000/s |
| ⚡️ State transition rate | See state throttling quotas. As of 09/2024: bucket size 800-5000, refill rate 800-5000/s | No limit |
| 💸 Pricing | By state transitions | By executions, duration, and memory |
| 📚 Execution history | Listed, described, and debugged via APIs, console, and CloudWatch Logs | Unlimited within 5 minutes, debugged via console and CloudWatch Logs |
| Execution semantics | Exactly-once | Async: at-least-once, Sync: at-most-once |
| Service integrations | All integrations and patterns supported | All integrations, no Job-run (.sync) or Callback (.waitForTaskToken) |
| Distributed Map | Supported | Not supported |
| Activities | Supported | Not supported |

Overview of AWS Step Functions

AWS Step Functions is a fully managed service that enables developers to coordinate multiple AWS services into serverless workflows. It simplifies the process of building and running multi-step applications by providing a visual interface to design and manage workflows.

Flowchart illustrating a decision point leading to invoking either Function A or Function B, both of which lead to writing to a DynamoDB table.

The primary purpose of AWS Step Functions is to help developers orchestrate complex workflows by defining the sequence of steps required to execute a task.

It ensures that each step is executed in the correct order, handles errors gracefully, and manages the state of the workflow throughout its execution.
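To make this concrete, the flowchart described above could be expressed in Amazon States Language (ASL), the JSON format Step Functions uses for workflow definitions. The following is a minimal sketch built as a Python dictionary. The state names, the `$.route` and `$.id` input fields, the function names `function-a` and `function-b`, and the table `example-table` are hypothetical placeholders, not part of the sample application.

```python
import json


def lambda_task(function_name: str, next_state: str) -> dict:
    """Build a Task state that invokes a Lambda function, then moves on."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_name, "Payload.$": "$"},
        "ResultPath": "$.result",  # keep the original input, attach the result
        "Next": next_state,
    }


# Hypothetical names throughout: adjust to your own functions and table.
definition = {
    "StartAt": "Decide",
    "States": {
        "Decide": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.route", "StringEquals": "A", "Next": "InvokeFunctionA"}
            ],
            "Default": "InvokeFunctionB",
        },
        "InvokeFunctionA": lambda_task("function-a", "WriteToDynamoDB"),
        "InvokeFunctionB": lambda_task("function-b", "WriteToDynamoDB"),
        "WriteToDynamoDB": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "example-table",
                "Item": {"id": {"S.$": "$.id"}},
            },
            "End": True,
        },
    },
}


def referenced_states(defn: dict) -> set:
    """Collect every state name that some transition points to."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


# The JSON string is what you would hand to Step Functions as the definition.
definition_json = json.dumps(definition, indent=2)
```

The `referenced_states` helper is only a sanity check for this sketch: every transition target should exist as a key in `States`.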

Key Features and Benefits

Before we dive into the specifics of the two types of step function state machines, let's take a look at their general key features and benefits:

  1. Visual Workflow Design: AWS Step Functions offers a graphical interface to design workflows, making it easier to visualize the sequence of steps and their dependencies.

  2. State Management: It automatically manages the state of each step in the workflow, ensuring that the application can resume from the last successful state in case of a failure.

  3. Error Handling: Built-in error handling capabilities allow developers to define retry policies and fallback mechanisms, improving the reliability of workflows.

  4. Integration with AWS Services: Step Functions seamlessly integrates with a wide range of AWS services, such as AWS Lambda, Amazon S3, Amazon DynamoDB, and more, enabling the creation of complex, serverless applications.

  5. Scalability: The service is designed to scale automatically, handling thousands of parallel executions without requiring manual intervention.

  6. Cost Efficiency: With pay-as-you-go pricing, users only pay for what their workflows actually use (state transitions for Standard Workflows, requests and duration for Express Workflows), making it a cost-effective solution for orchestrating tasks.

  7. Audit and Monitoring: Step Functions provides detailed execution history and logs, allowing developers to monitor and audit the performance of their workflows.

  8. Flexibility: It supports both Standard and Express workflows, catering to different use cases ranging from long-running, durable workflows to high-volume, short-duration tasks.

Standard Step Functions

The Standard Workflows are designed for applications that require long-running, durable, and auditable state management.

💡
The type of state machine you choose cannot be changed after it has been created. This means that when you are setting up your workflow, you need to carefully consider whether you need a Standard or Express workflow.
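Since the type is fixed at creation time, it is worth seeing where it is set. The sketch below builds the arguments for the `CreateStateMachine` API call, where `type` is either `STANDARD` or `EXPRESS`. The helper `build_create_request`, the state machine name, and the role ARN are hypothetical and only serve as illustration; the actual call is left as a comment because it needs AWS credentials.

```python
import json

VALID_TYPES = ("STANDARD", "EXPRESS")


def build_create_request(name: str, definition: dict, role_arn: str,
                         workflow_type: str) -> dict:
    """Return keyword arguments for the Step Functions CreateStateMachine API."""
    if workflow_type not in VALID_TYPES:
        raise ValueError(f"workflow_type must be one of {VALID_TYPES}")
    return {
        "name": name,
        "definition": json.dumps(definition),  # the API expects a JSON string
        "roleArn": role_arn,
        "type": workflow_type,  # chosen once, at creation time
    }


# Placeholder values for illustration only.
request = build_create_request(
    name="order-processing",
    definition={"StartAt": "Done", "States": {"Done": {"Type": "Succeed"}}},
    role_arn="arn:aws:iam::123456789012:role/example-role",
    workflow_type="EXPRESS",
)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("stepfunctions").create_state_machine(**request)
```

Switching types later means creating a new state machine with the other type and moving callers over to it.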

Standard Step Functions guarantee exactly-once processing, ensuring that each step in your workflow is executed precisely one time, even in the face of retries or failures.

A flowchart illustrating a transaction process. The steps are: "Remove Money from Account A" followed by "Add Money to Account B". Both steps need to be executed exactly once. The chart starts with a yellow circle, moves through two rectangular steps, and ends with a green circle.

This is particularly important for applications where data consistency and reliability are critical. For example, consider a financial transaction processing system where each transaction must be processed exactly once.

Moreover, Standard Step Functions are perfect for long-running processes that may span hours, days, or even months. The upper limit is one year; an execution that runs longer will time out. These workflows are designed to handle complex state management and can seamlessly coordinate multiple services and tasks.

A flowchart depicts a process starting from a yellow circle, followed by sequential steps labeled "Step 1", "Step 2", and "Step N", and ending with a green circle. An arrow at the bottom indicates the process can take "up to 1 year".

For instance, in a supply chain management system, Standard Step Functions can track the progress of an order from the initial request through various stages such as inventory check, payment processing, packaging, and shipping.

Additionally, Standard Step Functions are considered durable due to their ability to maintain state (which is stored internally in the Step Functions service) and recover from failures.

A flowchart with a yellow circle labeled "Step 1," followed by rectangles labeled "Step 2" and "Step N," and ending with a green circle. Blue arrows indicate "State" with a floppy disk icon above each step.

This durability ensures that even if a step in the workflow fails or the system experiences an outage, the workflow can resume from the last successful state without losing any progress. You can also inspect the stored state of every step in the AWS console to see what the application's state was at that point in time. This is crucial for applications that require high reliability and fault tolerance.

Lastly, Standard Step Functions are auditable, which means that every step in the workflow is logged and can be reviewed later. This is particularly important for compliance and debugging purposes. You can easily trace the execution path of a workflow, inspect the input and output of each step, and identify where any issues occurred - everything directly in the console.

A flowchart with circular and diamond nodes connected by arrows. The nodes and arrows are in green and orange colors, indicating different paths. Green represents the "path taken" and orange represents "skipped" paths. A red highlighted section indicates "successfully reprocessed".

This level of transparency is important for maintaining the integrity of your processes. Additionally, the audit logs can be used to generate detailed reports, which can then be used, for example, to gain insights into the performance of your workflows.

Express Step Functions

Express Workflows are designed for high-volume, short-duration, and cost-sensitive applications that require fast state management.

Express Step Functions are optimized for high-throughput and low-latency processing, making them ideal for applications where speed and cost-efficiency are important.

Diagram comparing "Exactly Once Processing" and "At-Least-Once Processing." The top shows exactly once processing with simple, non-overlapping circles connected by arrows. The bottom shows at-least-once processing with overlapping circles and arrows, indicating multiple processing.

Unlike Standard Step Functions, Express Workflows do not guarantee exactly-once processing. Asynchronous Express Workflows instead offer at-least-once processing, which is suitable for scenarios where occasional duplicate executions are acceptable.

💡
This is very important to remember: if your implementation can't handle each step idempotently, you can't use Express Step Functions. Because of at-least-once processing, Express Step Functions may start two executions with the same ID almost simultaneously. If your application can't tolerate this, Express Step Functions are not a valid choice.
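As an illustration of what idempotent handling can look like, here is a minimal sketch in Python. The class `IdempotentProcessor`, the key format, and the in-memory set are assumptions made for this example; a real system would need a durable store shared across invocations (for instance a conditional database write) instead of process memory.

```python
class IdempotentProcessor:
    """Apply each operation at most once, keyed by an idempotency key."""

    def __init__(self) -> None:
        # In-memory stand-in for a durable store (an assumption of this sketch).
        self._seen = set()
        self.balance = 0

    def deposit(self, idempotency_key: str, amount: int) -> bool:
        """Return True if the deposit was applied, False if it was a duplicate."""
        if idempotency_key in self._seen:
            return False  # already processed: do nothing the second time
        self._seen.add(idempotency_key)
        self.balance += amount
        return True


processor = IdempotentProcessor()

# The same step is delivered twice, as can happen with at-least-once processing.
first = processor.deposit("execution-42:deposit", 100)
duplicate = processor.deposit("execution-42:deposit", 100)
```

The second call is recognized by its key and skipped, so the balance reflects a single deposit even though the step ran twice.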

Express Step Functions are perfect for short-lived processes that typically complete within minutes. The maximum duration of the workflow can’t exceed 5 minutes. These workflows are designed to handle high-frequency event processing. For instance, in a real-time data processing system, Express Step Functions can process incoming data streams, perform transformations, and store the results in near real-time.

Additionally, Express Step Functions are highly scalable and can handle thousands of executions per second. This scalability ensures that your application can handle sudden spikes in traffic without compromising performance.

Express Step Functions are also cost-effective, as you are billed based on the number of requests and the duration of each execution. This makes them a great choice for applications with unpredictable or bursty workloads.

While Express Step Functions do not offer the same level of durability as Standard Step Functions, they still provide basic error handling and retry mechanisms. This ensures that your workflows can handle transient failures and continue processing.

Express Step Functions are also auditable, but with a focus on performance and cost-efficiency. Execution history is retained for a limited time, allowing you to review recent executions and troubleshoot issues. However, both the level of detail and the retention period are lower than with Standard Step Functions.

Also, the execution details shown in the console are retrieved from CloudWatch. This means that after an execution has finished, it can take some time until it is visualized in the console UI.

💡
If you update the definition of your Express Step Function, the tables and graphs of executions that ran with the previous definition won’t be displayed anymore.

Detailed Comparison: Express vs. Standard

Let’s have a look at the detailed comparison of both workflow types.

Performance and Scalability

Execution speed

Express Workflows are designed for high-throughput, short-duration tasks and can start almost instantly, making them ideal for real-time processing. Standard Workflows, while slightly slower to start, are optimized for long-running, durable tasks.

Concurrency limits

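Express Workflows are built for very high execution start rates, with figures of up to 100,000 executions per second cited for them, which makes them highly scalable for massive workloads. Standard Workflows have a lower start rate, typically around 2,000 executions per second, but offer better durability for each execution. Note that these figures describe how fast executions can be started rather than how many run concurrently, and that the throttling quotas listed in the comparison table above determine what your account can actually sustain.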
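The "bucket size" and "refill rate" in the quota table describe token-bucket throttling: a burst can use up the whole bucket at once, after which requests are limited by the refill rate. The sketch below is a generic token bucket with a simulated clock, not AWS's actual implementation; the class and parameter names are made up for this example, and the numbers come from the Express row of the comparison table.

```python
class TokenBucket:
    """Token bucket with a fixed capacity and a steady refill rate."""

    def __init__(self, bucket_size: float, refill_per_second: float) -> None:
        self.bucket_size = bucket_size
        self.refill_per_second = refill_per_second
        self.tokens = bucket_size  # the bucket starts full
        self.last_time = 0.0       # simulated clock, in seconds

    def try_acquire(self, now: float, count: int = 1) -> bool:
        """Take `count` tokens at time `now`; return False if throttled."""
        elapsed = now - self.last_time
        refilled = self.tokens + elapsed * self.refill_per_second
        self.tokens = min(self.bucket_size, refilled)  # never exceed capacity
        self.last_time = now
        if self.tokens >= count:
            self.tokens -= count
            return True
        return False


# Express start-rate quota from the table: bucket size 6000, refill 6000/s.
bucket = TokenBucket(bucket_size=6000, refill_per_second=6000)

burst_ok = bucket.try_acquire(now=0.0, count=6000)    # a full burst fits
throttled = not bucket.try_acquire(now=0.0, count=1)  # the bucket is now empty
recovered = bucket.try_acquire(now=0.5, count=3000)   # 0.5 s refills 3000 tokens
```

With a smaller bucket and refill rate, as in the Standard row, the same burst would be throttled much earlier.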

Cost

Pricing Model

Express Workflows are billed based on the number of requests and the duration of execution, making them cost-effective for high-frequency, short-duration tasks. Standard Workflows are billed based on the number of state transitions, which can be more economical for long-running workflows with fewer transitions.

Cost Efficiency in Different Scenarios

Express Workflows are more cost-efficient for high-volume, short-duration tasks due to their per-request pricing. Standard Workflows are more cost-efficient for long-duration tasks with fewer state transitions, as they charge per state transition.
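A quick back-of-the-envelope calculation helps when comparing the two pricing models. The sketch below is only a rough model: the default prices are assumptions based on commonly quoted us-east-1 list prices and may be outdated or differ by region, so check the current pricing page before relying on them. It also ignores the free tier, billing granularity, and any tiered duration pricing.

```python
def standard_cost(executions: int, transitions_per_execution: int,
                  price_per_1000_transitions: float = 0.025) -> float:
    """Standard Workflows: pay per state transition (price is an assumption)."""
    transitions = executions * transitions_per_execution
    return transitions / 1000 * price_per_1000_transitions


def express_cost(executions: int, avg_duration_seconds: float, memory_mb: float,
                 price_per_million_requests: float = 1.00,
                 price_per_gb_second: float = 0.00001667) -> float:
    """Express Workflows: pay per request plus duration x memory (prices assumed)."""
    request_cost = executions / 1_000_000 * price_per_million_requests
    gb_seconds = executions * avg_duration_seconds * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


# Hypothetical workload: one million short executions with ten transitions each.
standard = standard_cost(executions=1_000_000, transitions_per_execution=10)
express = express_cost(executions=1_000_000, avg_duration_seconds=1.0, memory_mb=64)
```

Plugging in your own execution counts, transition counts, and durations shows where the break-even point lies for your workload.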

Durability and Reliability

Error handling

Standard Workflows provide robust error handling with built-in retry mechanisms and support for catch and retry policies. Express Workflows also support error handling but are designed for transient errors and may not offer the same level of durability.
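In Amazon States Language, retry and catch policies are declared on the state itself using the `Retry` and `Catch` fields. The following sketch shows a Task state with both, plus a small helper that computes the resulting wait times. The function name `charge-payment` and the state names are hypothetical, and the helper ignores optional settings such as a maximum delay or jitter.

```python
# A Task state with a retry policy and a fallback (hypothetical names).
task_with_error_handling = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {"FunctionName": "charge-payment", "Payload.$": "$"},
    "Retry": [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,   # wait before the first retry
            "MaxAttempts": 3,       # number of retries
            "BackoffRate": 2.0,     # multiplier applied after each retry
        }
    ],
    "Catch": [
        # Fallback: once retries are exhausted, continue in another state.
        {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "NotifyFailure"}
    ],
    "Next": "Done",
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry, following exponential backoff."""
    interval = retrier.get("IntervalSeconds", 1)
    rate = retrier.get("BackoffRate", 2.0)
    attempts = retrier.get("MaxAttempts", 3)
    return [interval * rate ** attempt for attempt in range(attempts)]


delays = retry_delays(task_with_error_handling["Retry"][0])
```

Keep in mind that with Express Workflows the total time spent waiting on retries counts toward the five-minute limit.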

Execution history and state management

Standard Workflows keep a detailed execution history in Step Functions for up to 90 days after an execution ends, making them suitable for auditing and debugging. Express Workflows do not store execution history in Step Functions itself: to inspect an execution you need to enable logging to CloudWatch Logs, and each execution is limited to 5 minutes.

Complexity and Flexibility

Workflow complexity

Standard Workflows support complex, long-running workflows with intricate state transitions and branching logic. Express Workflows are better suited for simpler, high-throughput workflows that require rapid execution.

Integration with other AWS services

Both workflow types integrate seamlessly with other AWS services such as Lambda, S3, and DynamoDB. However, Standard Workflows offer more advanced integration capabilities: only they support the Job-run (.sync) and Callback (.waitForTaskToken) patterns, which rely on long-running tasks and detailed state management.
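Whether a definition relies on one of these patterns can be read from the `Resource` of its Task states, which ends in `.sync` (or `.sync:2`) for Job-run and `.waitForTaskToken` for Callback. The sketch below checks a definition for such states. The helper name, the queue URL, and the table name are hypothetical, and only top-level states are inspected (nested Parallel or Map branches are not).

```python
UNSUPPORTED_SUFFIXES = (".sync", ".sync:2", ".waitForTaskToken")


def express_incompatible_states(definition: dict) -> list:
    """Return names of top-level Task states using Job-run or Callback patterns."""
    return [
        name
        for name, state in definition["States"].items()
        if state.get("Type") == "Task"
        and state.get("Resource", "").endswith(UNSUPPORTED_SUFFIXES)
    ]


# Hypothetical workflow: wait for an external approval, then record it.
definition = {
    "StartAt": "WaitForApproval",
    "States": {
        "WaitForApproval": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
            "Parameters": {
                "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",
                # The task token is passed along so the consumer can call back.
                "MessageBody": {"taskToken.$": "$$.Task.Token"},
            },
            "Next": "Record",
        },
        "Record": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {"TableName": "example-table", "Item": {"id": {"S": "1"}}},
            "End": True,
        },
    },
}

blocked = express_incompatible_states(definition)
```

A non-empty result means the workflow, as written, needs to be a Standard Workflow.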

Choosing the Right Step Function for Your Needs

Choose Express Workflows for

  • 🔥 high-frequency

  • 💨 short-duration tasks that

  • ⚡️ require rapid execution and

  • 🐛 can tolerate transient errors.

Opt for Standard Workflows for

  • 🐌 long-running,

  • 🔒 durable tasks that

  • 🔎 require detailed execution history and

  • 💪 robust error handling.

Example Scenarios and Recommendations

Use Express Workflows for

  • real-time data processing

  • ETL (Extract, Transform, Load - a process that involves extracting data from various sources, transforming it into a suitable format, and loading it into a destination system) tasks, and

  • microservices orchestration

Choose Standard Workflows for

  • complex business processes and

  • any scenario requiring detailed audit trails and long-term state management.

Our Recommendation

Our recommendation is to always start with Express Step Functions unless you have a valid reason against it (like needing the callback pattern). Express Step Functions are considerably cheaper than Standard Step Functions, and their functionality is often enough.
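This recommendation can be written down as a simple decision helper: default to Express and switch to Standard only when one of the limitations from the comparison table applies. The sketch below is one way to encode that; the `Requirements` fields and function names are made up for illustration.

```python
from dataclasses import dataclass

EXPRESS_MAX_SECONDS = 5 * 60  # Express executions are limited to five minutes


@dataclass
class Requirements:
    max_duration_seconds: int
    needs_exactly_once: bool = False
    needs_callback_or_job_run: bool = False  # .waitForTaskToken or .sync
    needs_distributed_map: bool = False
    needs_activities: bool = False


def reasons_against_express(req: Requirements) -> list:
    """List every requirement that rules out an Express Workflow."""
    reasons = []
    if req.max_duration_seconds > EXPRESS_MAX_SECONDS:
        reasons.append("runs longer than five minutes")
    if req.needs_exactly_once:
        reasons.append("needs exactly-once execution")
    if req.needs_callback_or_job_run:
        reasons.append("uses Callback or Job-run patterns")
    if req.needs_distributed_map:
        reasons.append("uses Distributed Map")
    if req.needs_activities:
        reasons.append("uses Activities")
    return reasons


def choose_workflow_type(req: Requirements) -> str:
    """Start with Express; fall back to Standard only if something rules it out."""
    return "STANDARD" if reasons_against_express(req) else "EXPRESS"


quick_job = choose_workflow_type(Requirements(max_duration_seconds=30))
approval_flow = choose_workflow_type(
    Requirements(max_duration_seconds=30, needs_callback_or_job_run=True)
)
```

Returning the list of reasons, rather than just a type, makes it easy to document why a given workflow ended up as Standard.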

One thing to remember is that you need to take care of logging for an Express Step Function yourself, which means you have to configure it explicitly. This often makes debugging in the AWS console noticeably slower. However, the reduction in costs is often worth it.
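Logging is configured through the state machine's `loggingConfiguration`, which names a log level and a CloudWatch Logs log group as destination. Here is a minimal sketch of that structure as it would be passed to the create or update API; the helper name and the log group ARN are placeholders.

```python
VALID_LEVELS = ("ALL", "ERROR", "FATAL", "OFF")


def express_logging_configuration(log_group_arn: str, level: str = "ALL",
                                  include_execution_data: bool = True) -> dict:
    """Build a loggingConfiguration dict for a Step Functions state machine."""
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}")
    return {
        "level": level,
        # Include state input and output in the logs; disable for sensitive data.
        "includeExecutionData": include_execution_data,
        "destinations": [
            {"cloudWatchLogsLogGroup": {"logGroupArn": log_group_arn}}
        ],
    }


# Placeholder ARN; the log group ARN is expected to end with ":*".
logging_config = express_logging_configuration(
    "arn:aws:logs:us-east-1:123456789012:log-group:/example/express-workflow:*"
)
```

Logging at level `ALL` gives the most detail for debugging but also produces the most log volume, so `ERROR` can be a reasonable setting once a workflow is stable.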

Conclusion

In summary, AWS Step Functions offers Standard and Express workflows to fulfill different needs.

Standard Workflows are great for long-running, complex tasks with robust error handling and detailed execution history. Express Workflows are ideal for high-throughput, short-duration tasks, offering rapid execution and cost efficiency.

Choose Standard for detailed, durable processes and Express for fast, high-frequency tasks.

FAQ

  1. What is AWS Step Functions?

    AWS Step Functions is a service that coordinates multiple AWS services into serverless workflows, simplifying multi-step applications.

  2. What is the difference between Standard and Express Workflows in AWS Step Functions?

    Standard Workflows are for long-running, durable tasks with exactly-once processing. Express Workflows are for high-throughput, short-duration tasks with at-least-once processing.

  3. When should I use Standard Workflows?

    Use Standard Workflows for long-running, durable tasks needing robust error handling and detailed execution history.

  4. When should I use Express Workflows?

    Use Express Workflows for high-frequency, short-duration tasks that require rapid execution and can handle transient errors.

  5. How does pricing differ between Standard and Express Workflows?

    Standard Workflows are billed by state transitions. Express Workflows are billed by requests, execution duration, and memory.