Implementing Serverless Event-Driven Architecture
In a serverless event-driven architecture, components communicate through events rather than direct calls: a Lambda function doesn't know who else is subscribed to its results. This provides loose coupling, independent scaling, and the ability to add new consumers without modifying the source.
Basic Concepts
Event Source — event origin: API Gateway (HTTP request), S3 (file upload), DynamoDB Streams (record change), SQS (queue message), EventBridge (custom event), Kinesis (data stream).
EventBridge (event bus) — event router. Source publishes an event, the bus delivers it to the right consumers according to rules.
Consumer (Lambda) — function that reacts to an event.
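For orientation, this is roughly the envelope EventBridge hands a consumer Lambda. The concrete values below are made up for illustration, and 'detail' arrives already parsed as a dict:

```python
def sample_order_created_event() -> dict:
    # Illustrative shape of an EventBridge event as delivered to a Lambda consumer.
    # All values here are invented; only the field names reflect the real envelope.
    return {
        "version": "0",
        "id": "00000000-0000-0000-0000-000000000000",  # id assigned by EventBridge
        "detail-type": "OrderCreated",    # matched against the rule's detail-type filter
        "source": "com.company.orders",   # matched against the rule's source filter
        "account": "123456789012",
        "time": "2024-01-01T12:00:00Z",
        "region": "us-east-1",
        "detail": {"orderId": "o-1", "totalAmount": 99.5},  # the producer's payload
    }
```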
Architecture by Example: e-commerce
Order processing without event-driven: PlaceOrder → ValidateInventory → ProcessPayment → SendEmail → UpdateAnalytics — all sequential, tightly coupled.
With event-driven:
[Client] → PlaceOrder Lambda
                ↓
     EventBridge: order.created
      /         |         \
ValidateInv  SendEmail  Analytics
     ↓
EventBridge: inventory.reserved
     ↓
ProcessPayment
     ↓
EventBridge: payment.processed
      /           \
FulfillOrder   SendReceipt
Each service reacts to events independently. A new service (e.g., fraud detection) subscribes to order.created without modifying existing code.
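This works because EventBridge selects consumers by matching rules against event fields, so adding a consumer never touches the producer. A much-simplified, pure-Python sketch of that matching (real patterns also support prefixes, numeric ranges, and nested fields; the names here are illustrative):

```python
def matches(pattern: dict, event: dict) -> bool:
    # A field matches if the event's value is in the pattern's list of allowed values.
    # Simplification: only exact-match lists on top-level fields.
    return all(event.get(field) in allowed for field, allowed in pattern.items())

# A rule like the fraud-detection service would register:
rule = {"detail-type": ["OrderCreated"], "source": ["com.company.orders"]}

order_event = {"detail-type": "OrderCreated", "source": "com.company.orders"}
payment_event = {"detail-type": "PaymentProcessed", "source": "com.company.payments"}

print(matches(rule, order_event))    # → True
print(matches(rule, payment_event))  # → False
```

A new consumer just registers another rule and target; existing producers and consumers stay untouched.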
AWS EventBridge: Implementation
# Custom event bus
resource "aws_cloudwatch_event_bus" "orders" {
  name = "orders-bus"
}

# Routing rule
resource "aws_cloudwatch_event_rule" "order_created" {
  name           = "order-created"
  event_bus_name = aws_cloudwatch_event_bus.orders.name
  event_pattern = jsonencode({
    "detail-type": ["OrderCreated"],
    "source": ["com.company.orders"]
  })
}

# Each Lambda target also needs an aws_lambda_permission allowing
# events.amazonaws.com to invoke it (omitted for brevity)
resource "aws_cloudwatch_event_target" "process_inventory" {
  rule           = aws_cloudwatch_event_rule.order_created.name
  event_bus_name = aws_cloudwatch_event_bus.orders.name
  arn            = aws_lambda_function.validate_inventory.arn
}

resource "aws_cloudwatch_event_target" "send_confirmation" {
  rule           = aws_cloudwatch_event_rule.order_created.name
  event_bus_name = aws_cloudwatch_event_bus.orders.name
  arn            = aws_lambda_function.send_email.arn
}
Publishing an event from Lambda:
import boto3
import json
from datetime import datetime

events = boto3.client('events')

def publish_order_created(order: dict):
    events.put_events(
        Entries=[{
            'EventBusName': 'orders-bus',
            'Source': 'com.company.orders',
            'DetailType': 'OrderCreated',
            'Detail': json.dumps({
                'orderId': order['id'],
                'customerId': order['customer_id'],
                'items': order['items'],
                'totalAmount': order['total'],
                'timestamp': datetime.utcnow().isoformat()
            }),
            'Time': datetime.utcnow()
        }]
    )
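On the receiving side, a Lambda invoked directly by EventBridge gets 'detail' as an already-parsed dict (unlike SQS record bodies, which arrive as JSON strings). A minimal consumer sketch, with a made-up return value standing in for real inventory logic:

```python
def handler(event, context):
    # EventBridge delivers 'detail' already parsed as a dict,
    # so no json.loads is needed here.
    detail = event["detail"]
    # Hypothetical reaction: report which order we would reserve stock for.
    return {"reserved_for": detail["orderId"]}

# Local invocation with a trimmed-down event:
print(handler({"detail-type": "OrderCreated", "detail": {"orderId": "o-1"}}, None))
# → {'reserved_for': 'o-1'}
```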
SQS for Reliable Delivery
EventBridge + SQS = fault-tolerant delivery with retry and dead letter queue:
# Dead letter queue for messages that repeatedly fail processing
resource "aws_sqs_queue" "inventory_dlq" {
  name = "inventory-updates-dlq"
}

resource "aws_sqs_queue" "inventory_updates" {
  name                       = "inventory-updates"
  visibility_timeout_seconds = 300  # should exceed the Lambda timeout

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.inventory_dlq.arn
    maxReceiveCount     = 3  # After 3 failed attempts → DLQ
  })
}

resource "aws_lambda_event_source_mapping" "inventory_processor" {
  event_source_arn        = aws_sqs_queue.inventory_updates.arn
  function_name           = aws_lambda_function.process_inventory.arn
  batch_size              = 10
  function_response_types = ["ReportBatchItemFailures"]
}
ReportBatchItemFailures — only the failed messages from a batch are returned to the queue; successfully processed ones are not retried.
Handler with Partial Failure
def handler(event, context):
    failed_message_ids = []
    for record in event['Records']:
        try:
            process_message(json.loads(record['body']))
        except Exception:
            # Only this record goes back for retry; the rest of the batch is acked
            failed_message_ids.append({'itemIdentifier': record['messageId']})
    return {'batchItemFailures': failed_message_ids}
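To make the contract concrete, here is a self-contained run of the same pattern with a stubbed process_message that rejects one record (message ids and bodies are made up):

```python
import json

def process_message(msg: dict) -> None:
    # Stub: fail deliberately when the payload asks for it.
    if msg.get("fail"):
        raise ValueError("simulated processing failure")

def handler(event, context):
    failed = []
    for record in event["Records"]:
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            failed.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failed}

event = {"Records": [
    {"messageId": "m1", "body": json.dumps({"fail": False})},
    {"messageId": "m2", "body": json.dumps({"fail": True})},
]}
print(handler(event, None))  # → {'batchItemFailures': [{'itemIdentifier': 'm2'}]}
```

Only m2 is reported back to SQS for retry; m1 is considered done.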
Idempotency
In event-driven systems, events may be delivered twice (at-least-once delivery). Each handler must be idempotent:
import time

import boto3

dynamodb = boto3.resource('dynamodb')
processed_events = dynamodb.Table('processed_events')

def handler(event, context):
    for record in event['Records']:
        event_id = record['messageId']
        # Atomically claim the event id; the put fails if it was already processed
        try:
            processed_events.put_item(
                Item={'event_id': event_id, 'ttl': int(time.time()) + 86400},
                ConditionExpression='attribute_not_exists(event_id)'
            )
        except processed_events.meta.client.exceptions.ConditionalCheckFailedException:
            continue  # Already processed
        process_event(record)
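The same first-writer-wins semantics, sketched with an in-memory set standing in for the DynamoDB conditional put (purely illustrative; the atomic conditional write is what makes this safe across concurrent invocations):

```python
processed: set[str] = set()
results: list[str] = []

def handle_once(event_id: str, payload: str) -> None:
    # Mirrors the conditional put: the first delivery wins, duplicates are skipped.
    if event_id in processed:
        return
    processed.add(event_id)
    results.append(payload)

# "e1" is delivered twice, as at-least-once delivery allows:
for eid, payload in [("e1", "a"), ("e2", "b"), ("e1", "a-duplicate")]:
    handle_once(eid, payload)

print(results)  # → ['a', 'b']
```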
Monitoring an Event-Driven System
Key metrics:
- Event lag (SQS ApproximateAgeOfOldestMessage) — age of the oldest unprocessed message, i.e. how far behind consumers are
- DLQ depth — number of events in dead letter queue (nonzero = problem)
- Processing rate vs production rate — is the system keeping up with events
- End-to-end latency — time from event to result across the entire chain
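As a sketch of pulling the first metric, the boto3 get_metric_statistics call for an SQS queue's oldest-message age might look like this (the queue name, time window, and period are assumptions):

```python
from datetime import datetime, timedelta, timezone

def lag_query_params(queue_name: str) -> dict:
    """Build the CloudWatch query for the oldest-message age of an SQS queue."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": now - timedelta(minutes=15),
        "EndTime": now,
        "Period": 300,  # 5-minute datapoints
        "Statistics": ["Maximum"],
    }

def max_event_lag_seconds(queue_name: str) -> float:
    # boto3 client created lazily so the query builder stays usable offline
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    resp = cloudwatch.get_metric_statistics(**lag_query_params(queue_name))
    return max((dp["Maximum"] for dp in resp.get("Datapoints", [])), default=0.0)
```

Alerting on this value catching up versus the production rate tells you whether consumers are keeping pace.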
Implementation Timeline
- Event schema design + bus architecture — 2-3 days
- EventBridge setup + routing rules — 2-3 days
- SQS + DLQ + Lambda event sources — 2-3 days
- Handler idempotency — 2-4 days
- Distributed tracing + monitoring — 2-3 days
- Integration testing — 2-4 days