Implementing Function Composition (Step Functions / Durable Functions)
Function composition organizes multiple serverless functions into a workflow with state management, branching, parallelism, and error handling. Calling one Lambda or Azure Function directly from another is an antipattern: execution history is lost, error handling gets complicated, and nothing shows how far the process has progressed. Orchestrators solve these problems.
When You Need Function Composition
- A business process consists of multiple steps with state
- You need conditional branching (if step_A succeeded, then step_B, else step_C)
- Multiple functions run in parallel and the workflow waits for all of their results
- The process runs longer than a single function allows (Lambda caps execution at 15 minutes)
- A step requires human approval (the workflow waits for a callback)
AWS Step Functions
Amazon States Language (ASL) describes the workflow declaratively:
{
  "Comment": "Order processing",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:validate-order",
      "Next": "CheckInventory",
      "Retry": [{"ErrorEquals": ["Lambda.ServiceException"], "MaxAttempts": 3}],
      "Catch": [{
        "ErrorEquals": ["ValidationError"],
        "Next": "NotifyInvalidOrder"
      }]
    },
    "CheckInventory": {
      "Type": "Parallel",
      "Comment": "Parallel output is an array of branch results; ResultPath keeps the original input so $.orderId stays available",
      "Branches": [
        {"StartAt": "ReserveItems", "States": {"ReserveItems": {"Type": "Task", "Resource": "arn:...:reserve-items", "End": true}}},
        {"StartAt": "CalculateShipping", "States": {"CalculateShipping": {"Type": "Task", "Resource": "arn:...:calc-shipping", "End": true}}}
      ],
      "ResultPath": "$.inventory",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
      "Parameters": {
        "FunctionName": "arn:...:process-payment",
        "Payload": {
          "taskToken.$": "$$.Task.Token",
          "orderId.$": "$.orderId"
        }
      },
      "Next": "FulfillOrder",
      "TimeoutSeconds": 300
    },
    "FulfillOrder": {"Type": "Task", "Resource": "arn:...:fulfill-order", "End": true},
    "NotifyInvalidOrder": {"Type": "Task", "Resource": "arn:...:notify-invalid", "End": true}
  }
}
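The Retry entry on ValidateOrder sets only MaxAttempts, so the wait between attempts comes from the ASL defaults (IntervalSeconds of 1 and BackoffRate of 2.0). As a rough illustration, the sketch below computes the wait before each retry. The function name retry_delays is hypothetical, and the sketch ignores MaxDelaySeconds and jitter.

```python
def retry_delays(interval_seconds: float = 1, backoff_rate: float = 2.0,
                 max_attempts: int = 3) -> list[float]:
    """Return the wait before each retry attempt, in seconds.

    Defaults mirror the ASL defaults for IntervalSeconds and BackoffRate,
    with MaxAttempts taken from the ValidateOrder state.
    """
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


print(retry_delays())  # [1.0, 2.0, 4.0]
```

With these values a persistently failing ValidateOrder adds about 7 seconds of waiting before the state gives up.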
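The .waitForTaskToken suffix makes Step Functions pause the state until an external system (here, the payment gateway) reports back, so no polling is needed. When the transaction completes, the payment handler calls SendTaskSuccess (or SendTaskFailure) with the token. If neither call arrives within TimeoutSeconds (300 seconds here), the state fails with a States.Timeout error.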
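The callback side might look like the following minimal sketch. It assumes the code that receives the gateway's result already has the task token, and it takes the Step Functions client as a parameter (in practice boto3.client("stepfunctions")) so the logic can be exercised without AWS access. The function name report_payment_result and the shape of details are hypothetical.

```python
import json


def report_payment_result(sfn_client, task_token: str,
                          succeeded: bool, details: dict) -> None:
    """Resume the paused ProcessPayment state with the gateway's outcome."""
    if succeeded:
        # The JSON string in `output` becomes the output of the waiting state.
        sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps(details),
        )
    else:
        # `error` is the name a Catch block can match in "ErrorEquals".
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="PaymentFailed",
            cause=json.dumps(details),
        )
```

A call such as report_payment_result(boto3.client("stepfunctions"), token, True, {"paymentId": "..."}) would let the workflow continue to FulfillOrder.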
Terraform for Step Functions
resource "aws_sfn_state_machine" "order_processing" {
  name     = "order-processing"
  role_arn = aws_iam_role.sfn_role.arn

  definition = templatefile("${path.module}/state_machine.json", {
    validate_lambda_arn = aws_lambda_function.validate_order.arn
    reserve_lambda_arn  = aws_lambda_function.reserve_items.arn
    payment_lambda_arn  = aws_lambda_function.process_payment.arn
    fulfill_lambda_arn  = aws_lambda_function.fulfill_order.arn
  })

  logging_configuration {
    log_destination        = "${aws_cloudwatch_log_group.sfn.arn}:*"
    include_execution_data = true
    level                  = "ERROR"
  }

  tracing_configuration {
    enabled = true # X-Ray tracing
  }
}
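Once the state machine is deployed, something has to start executions. One possible approach, sketched below, starts one execution per order through the StartExecution API. The client is passed in (in practice boto3.client("stepfunctions")), and both the function name start_order_workflow and the order-<id> naming scheme are assumptions made for illustration.

```python
import json


def start_order_workflow(sfn_client, state_machine_arn: str, order: dict) -> str:
    """Start one execution for the given order and return its execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # Hypothetical naming scheme: a name derived from the order ID makes
        # it easy to find the execution that belongs to an order.
        name=f"order-{order['orderId']}",
        input=json.dumps(order),  # becomes the input of the StartAt state
    )
    return response["executionArn"]
```

The state machine ARN would typically come from the Terraform output of aws_sfn_state_machine.order_processing.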
Azure Durable Functions
Durable Functions is an extension of Azure Functions that lets you write the orchestrator as ordinary code in .NET, Node.js, or Python. A Python orchestrator:
# orchestrator function
import azure.durable_functions as df


def orchestrator_function(context: df.DurableOrchestrationContext):
    # Parallel execution: schedule both activities, then wait for both results
    parallel_tasks = [
        context.call_activity("ReserveItems", context.get_input()),
        context.call_activity("CalculateShipping", context.get_input())
    ]
    results = yield context.task_all(parallel_tasks)

    # Wait for external event (human approval)
    approval = yield context.wait_for_external_event("ApprovalReceived")
    if approval:
        return (yield context.call_activity("FulfillOrder", context.get_input()))
    else:
        return (yield context.call_activity("CancelOrder", context.get_input()))


# Entry point that the Functions host binds to
main = df.Orchestrator.create(orchestrator_function)
By default, Durable Functions persists orchestration state in Azure Storage. An orchestrator can wait for an external event indefinitely; to bound the wait, race the event against a durable timer.
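The orchestrator above is a Python generator: each yield hands a task to the runtime, which later resumes the function with that task's result. The simplified driver below illustrates only that hand-off. It is not the Durable Functions runtime, which additionally checkpoints progress and replays the orchestrator from stored history (the reason orchestrator code must be deterministic). The names drive, order_flow, and fake_runtime are hypothetical, and plain tuples stand in for real task objects.

```python
from typing import Any, Callable, Generator


def drive(orchestrator: Generator[Any, Any, Any],
          resolve: Callable[[Any], Any]) -> Any:
    """Run a generator-based orchestrator to completion.

    `resolve` stands in for the runtime: it turns each yielded task
    description into a result.
    """
    try:
        task = next(orchestrator)  # run until the first yield
        while True:
            # Resume the orchestrator with the result of the yielded task.
            task = orchestrator.send(resolve(task))
    except StopIteration as done:
        return done.value  # the orchestrator's return value


def order_flow(order: dict):
    # Tuples stand in for context.call_activity(...) and
    # context.wait_for_external_event(...).
    reserved = yield ("activity", "ReserveItems", order)
    approved = yield ("event", "ApprovalReceived", None)
    if approved:
        return (yield ("activity", "FulfillOrder", reserved))
    return (yield ("activity", "CancelOrder", reserved))


def fake_runtime(task):
    kind, name, payload = task
    if kind == "event":
        return True  # pretend the approval arrived
    return {"step": name, "input": payload}


result = drive(order_flow({"orderId": "42"}), fake_runtime)
print(result["step"])  # FulfillOrder
```

Swapping fake_runtime for one that returns False for the event sends the same flow down the CancelOrder branch.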
Error Handling and Compensating Transactions
A distributed workflow has no built-in transaction that spans its steps. The Saga pattern handles rollback with compensating actions: when a step fails, the workflow runs actions that undo the steps that already completed:
"ProcessPayment": {
  "Type": "Task",
  "Resource": "...",
  "Catch": [{
    "ErrorEquals": ["PaymentFailed"],
    "Next": "CompensateReservation"
  }]
},
"CompensateReservation": {
  "Type": "Task",
  "Resource": "arn:...:release-reservation",
  "Next": "NotifyPaymentFailed"
}
Every step whose effects must be undone on failure gets its own compensating function. Compensations should be idempotent, because they can be retried like any other task.
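Outside of ASL, the same idea can be sketched in a few lines of Python: pair every action with its compensation, and when an action fails, run the compensations of the completed steps in reverse order. This is an illustrative in-process sketch, not how Step Functions executes the state machine; run_saga and the step functions are hypothetical names.

```python
from typing import Callable

# Each step is an (action, compensation) pair operating on a shared context.
Step = tuple[Callable[[dict], None], Callable[[dict], None]]


def run_saga(steps: list[Step], ctx: dict) -> bool:
    """Run actions in order; on failure, undo completed steps in reverse."""
    completed: list[Callable[[dict], None]] = []
    for action, compensate in steps:
        try:
            action(ctx)
        except Exception as exc:
            ctx["error"] = str(exc)
            for undo in reversed(completed):
                undo(ctx)  # compensate only the steps that finished
            return False
        completed.append(compensate)
    return True


def reserve(ctx): ctx["log"].append("reserve")
def release(ctx): ctx["log"].append("release")
def pay(ctx): raise RuntimeError("PaymentFailed")
def refund(ctx): ctx["log"].append("refund")


ctx = {"log": []}
ok = run_saga([(reserve, release), (pay, refund)], ctx)
print(ok, ctx["log"])  # False ['reserve', 'release']
```

The failed payment step is not compensated itself (refund never runs), which matches the state machine fragment: only the reservation is released.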
Visibility and Monitoring
Step Functions renders each workflow as a graph in the AWS Console, and every execution can be inspected step by step. It also publishes CloudWatch metrics, including ExecutionsStarted, ExecutionsSucceeded, ExecutionsFailed, and ExecutionThrottled.
X-Ray integration provides tracing across all Lambda functions in the workflow.
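These metrics can feed alarms. As one possible setup, the sketch below builds the arguments for a CloudWatch alarm that fires when any execution fails within a five-minute window. Step Functions metrics live in the AWS/States namespace with a StateMachineArn dimension; the alarm name, threshold, period, and the SNS topic used for notification are assumptions to adjust.

```python
def failed_executions_alarm(state_machine_arn: str, topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": "order-processing-executions-failed",  # hypothetical name
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 300,            # seconds; assumed evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1,           # alarm on the first failure in the window
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no executions means no alarm
        "AlarmActions": [topic_arn],
    }
```

With boto3 this would be applied as boto3.client("cloudwatch").put_metric_alarm(**failed_executions_alarm(arn, topic)).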
Express vs Standard Workflows
| | Standard | Express |
|---|---|---|
| Duration | Up to 1 year | Up to 5 minutes |
| Execution history | Full, in the console and API | Sent to CloudWatch Logs |
| Price | $0.025 per 1,000 state transitions | Per request plus execution duration |
| Best for | Business processes | High-volume, short workflows |
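To get a feel for the Standard price, the arithmetic below estimates a monthly bill from the number of executions and the transitions each one makes. The figures of 100,000 executions and 6 transitions per execution are made-up inputs, any free tier is ignored, and Express workflows follow a different pricing model that this sketch does not cover.

```python
STANDARD_PRICE_PER_1K_TRANSITIONS = 0.025  # USD, from the table above


def standard_monthly_cost(executions: int, transitions_per_execution: int) -> float:
    """Estimate the monthly Standard workflow cost in USD."""
    transitions = executions * transitions_per_execution
    return transitions / 1000 * STANDARD_PRICE_PER_1K_TRANSITIONS


# Hypothetical load: 100,000 orders a month, 6 state transitions each.
print(round(standard_monthly_cost(100_000, 6), 2))  # 15.0
```

Retries count as transitions too, so a workflow that retries often costs more than its happy-path transition count suggests.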
Implementation Timeline
- State machine design + ASL specification — 2-3 days
- Lambda functions for each step — 3-7 days
- Step Functions state machine + IAM — 2-3 days
- Error handling + compensations — 2-3 days
- Monitoring + alerts + testing — 2-3 days







