Apache Superset on Pulumi Stack: Reference Deployment Pattern

Step-by-step production deployment of Apache Superset on Pulumi Stack. Covers networking, storage, secrets, autoscaling, and operational habits.

The PADISO Team ·2026-06-18

Why Superset on Pulumi Matters
Pre-Deployment Architecture Decisions
Networking and Security Foundation
Storage, Secrets, and State Management
Building the Superset Stack
Autoscaling and Load Balancing
Observability and Operational Habits
Disaster Recovery and Backup Strategy
Cost Optimisation and Governance
Common Pitfalls and How to Avoid Them
Next Steps and Scaling

Why Superset on Pulumi Matters

Apache Superset is a modern, open-source data visualisation and business intelligence platform. When deployed on Pulumi Stack, it becomes a repeatable, version-controlled, infrastructure-as-code asset that your team can ship, audit, and scale without manual configuration drift.

Pulumi lets you define cloud infrastructure in general-purpose languages such as Python, TypeScript, Go, and C#. Instead of learning a separate domain-specific configuration language, you describe infrastructure in a real programming language. This means you can use loops, conditionals, functions, and version control the same way you would for application code.

For organisations running Superset across multiple environments—development, staging, production—or across multiple cloud providers, this approach saves weeks of manual deployment work and eliminates the human error that comes with point-and-click cloud consoles.

At PADISO, we’ve deployed Superset on Pulumi for teams modernising their analytics infrastructure. The pattern we’ve refined here is production-tested across insurance, retail, government, and financial services clients. Whether you’re a startup building your first analytics layer or an enterprise consolidating BI tools, this guide walks you through the complete pattern.

Pre-Deployment Architecture Decisions

Before writing a single line of Pulumi code, you need to make five critical architectural decisions. These choices shape everything downstream—cost, performance, security, and operational burden.

Cloud Provider Selection

Pulumi supports AWS, Azure, Google Cloud, Kubernetes, and others. For this guide, we’ll focus on AWS, the most common choice for analytics workloads in Australia and globally.

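If you’re running government or defence workloads in Australia, check your data-sovereignty requirements first. AWS GovCloud regions are located in the United States, so Australian workloads typically need an IRAP-assessed configuration in an Australian region or a sovereign alternative. Platform Development in Canberra | PADISO specialises in IRAP/PROTECTED-aligned architecture.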

Compute Model: Containers or Serverless

Superset runs as a Python application. You have two main paths:

Path 1: Container-based (ECS Fargate or Kubernetes)

You control resource allocation, scaling policies, and cost.
Superset runs in a container, orchestrated by AWS ECS Fargate or self-managed Kubernetes.
More operational overhead; more control.
Better for teams with existing container orchestration experience.

Path 2: Serverless (AWS Lambda with API Gateway)

Simpler operational model; AWS manages scaling.
Superset’s long-running web server doesn’t fit Lambda’s execution model well.
Not recommended for production Superset deployments.

We recommend Path 1: Container-based on ECS Fargate. Fargate removes the need to manage EC2 instances while keeping the flexibility you need for a long-running web application like Superset, whose state lives in the metadata database and cache rather than in the container.

Database Backend

Superset requires a metadata database (to store dashboards, users, and configuration) and typically connects to one or more data warehouses (to query your actual analytics data).

Metadata Database Options:

PostgreSQL on RDS: Managed, highly available, easy to back up. Standard choice. Costs $20–100/month depending on instance size.
MySQL on RDS: Similar to PostgreSQL. Slightly cheaper in some regions.
Aurora PostgreSQL: Higher availability, auto-scaling storage. Better for large deployments; ~$50–200/month.

For most teams, PostgreSQL on RDS is the right balance of cost and reliability. The PostgreSQL Runtime Configuration documentation provides tuning guidance if you need to optimise for Superset’s workload.

Caching Layer

Superset benefits from a caching layer to reduce database load and improve dashboard load times. Redis is the standard choice.

Options:

ElastiCache Redis: AWS-managed, highly available, supports encryption in transit. ~$15–50/month for a small instance.
Self-managed Redis on EC2: Cheaper but requires operational overhead.

We recommend ElastiCache Redis. The Redis Cache Documentation covers operational best practices; for Superset, you’ll use Redis for caching query results and session storage.

Data Warehouse Connection

Superset queries external data warehouses. Common options:

Amazon Redshift: AWS-native, excellent for analytics. Costs scale with cluster size.
Snowflake: Cloud-agnostic, pay-per-query. Popular in Australia for financial services and retail.
BigQuery: Google Cloud; good if your data is already there.
ClickHouse: Open-source, fast columnar database. Increasingly popular for cost-conscious teams.

In our Platform Development in Melbourne | PADISO and Platform Development in Sydney | PADISO engagements, we’ve seen Superset + ClickHouse replace expensive per-seat BI tools, cutting costs by 60–70% while improving query speed.

Networking and Security Foundation

Superset handles sensitive data and user credentials. Your network and security posture must be locked down before you deploy.

VPC and Subnet Design

Create a VPC with public and private subnets. Superset runs in private subnets; only the load balancer sits in public subnets.

# Pulumi code snippet (Python)
import pulumi
import pulumi_aws as aws

config = pulumi.Config()
environment = config.require('environment')

# VPC
vpc = aws.ec2.Vpc(f'{environment}-superset-vpc',
    cidr_block='10.0.0.0/16',
    enable_dns_hostnames=True,
    enable_dns_support=True,
    tags={'Environment': environment})

# Public subnet for ALB
public_subnet = aws.ec2.Subnet(f'{environment}-public-subnet',
    vpc_id=vpc.id,
    cidr_block='10.0.1.0/24',
    availability_zone='ap-southeast-2a',
    map_public_ip_on_launch=True)

# Private subnet for Superset and RDS
private_subnet = aws.ec2.Subnet(f'{environment}-private-subnet',
    vpc_id=vpc.id,
    cidr_block='10.0.2.0/24',
    availability_zone='ap-southeast-2a')

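This pattern isolates Superset from the internet. Traffic flows: User → ALB (public) → Superset (private) → RDS (private). The snippet omits the internet gateway, route tables, and NAT gateway (or VPC endpoints); tasks in the private subnet need a NAT gateway or VPC endpoints to pull the container image and reach Secrets Manager and CloudWatch. An ALB and an RDS subnet group each require subnets in at least two availability zones, so a production VPC also needs a second public and a second private subnet.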
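Planning the subnet layout for more than one availability zone can be sketched in plain Python before any resources are declared. The helper below is a hypothetical illustration, not part of the original pattern: `plan_subnets` and its layout (one public and one private /24 per zone) are assumptions, and the resulting CIDR blocks should be adjusted to match any subnets you have already created.

```python
import ipaddress

def plan_subnets(vpc_cidr, azs, new_prefix=24):
    """Assign one public and one private subnet per availability zone."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    next(blocks)  # skip 10.0.0.0/24 so numbering starts at 10.0.1.0/24, as in the guide
    plan = []
    for az in azs:
        plan.append({'az': az, 'tier': 'public', 'cidr': str(next(blocks))})
        plan.append({'az': az, 'tier': 'private', 'cidr': str(next(blocks))})
    return plan

subnet_plan = plan_subnets('10.0.0.0/16', ['ap-southeast-2a', 'ap-southeast-2b'])
for entry in subnet_plan:
    print(entry['az'], entry['tier'], entry['cidr'])
```

Each entry can then feed an aws.ec2.Subnet declaration in a loop, which keeps the CIDR arithmetic in one place instead of scattered string literals.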

Security Groups

Define security groups with least-privilege rules.

# ALB security group: allow HTTPS from internet
alb_sg = aws.ec2.SecurityGroup(f'{environment}-alb-sg',
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol='tcp',
            from_port=443,
            to_port=443,
            cidr_blocks=['0.0.0.0/0'],  # HTTPS from anywhere
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol='tcp',
            from_port=80,
            to_port=80,
            cidr_blocks=['0.0.0.0/0'],  # HTTP (redirect to HTTPS)
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol='-1',
            from_port=0,
            to_port=0,
            cidr_blocks=['0.0.0.0/0'],
        ),
    ])

# Superset security group: allow traffic from ALB only
superset_sg = aws.ec2.SecurityGroup(f'{environment}-superset-sg',
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol='tcp',
            from_port=8088,  # Superset default port
            to_port=8088,
            security_groups=[alb_sg.id],
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol='-1',
            from_port=0,
            to_port=0,
            cidr_blocks=['0.0.0.0/0'],
        ),
    ])

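# RDS security group: allow traffic from Superset only
rds_sg = aws.ec2.SecurityGroup(f'{environment}-rds-sg',
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol='tcp',
            from_port=5432,  # PostgreSQL
            to_port=5432,
            security_groups=[superset_sg.id],
        ),
    ])

# Redis security group: allow traffic from Superset only
# (referenced later by the ElastiCache definition)
redis_sg = aws.ec2.SecurityGroup(f'{environment}-redis-sg',
    vpc_id=vpc.id,
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol='tcp',
            from_port=6379,  # Redis
            to_port=6379,
            security_groups=[superset_sg.id],
        ),
    ])

This ensures Superset can only be reached through the load balancer, and RDS and Redis can only be reached from Superset.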

TLS/SSL Certificates

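Superset should always run over HTTPS. AWS Certificate Manager (ACM) issues free public certificates and renews them automatically, provided the DNS validation records stay in place. The certificate must reach the Issued state before the HTTPS listener can use it.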

# Request a certificate for your domain
cert = aws.acm.Certificate(f'{environment}-superset-cert',
    domain_name='analytics.yourcompany.com',
    validation_method='DNS',
    tags={'Environment': environment})

If you’re running in Australia and need compliance audit readiness, review PADISO’s AI Quickstart Audit | PADISO — Fixed-fee 2-week diagnostic, which includes infrastructure security assessment.

Storage, Secrets, and State Management

Secrets Management

Superset needs credentials for:

PostgreSQL (metadata database)
Redis (cache)
Data warehouse connections (Redshift, Snowflake, etc.)
SMTP (for email alerts)
OAuth/SAML (for SSO)

Store these in AWS Secrets Manager, not in code or plaintext environment variables.

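# Create a secret for PostgreSQL
db_secret = aws.secretsmanager.Secret(f'{environment}-superset-db-secret',
    description='PostgreSQL credentials for Superset metadata database',
    tags={'Environment': environment})

# rds_instance is defined under "RDS PostgreSQL Instance" below; in a single
# program it must be declared before this block.
db_secret_version = aws.secretsmanager.SecretVersion(
    f'{environment}-superset-db-secret-version',
    secret_id=db_secret.id,
    # Output.json_dumps resolves the Outputs inside the dict before serialising
    secret_string=pulumi.Output.secret(pulumi.Output.json_dumps({
        'username': 'superset_user',
        'password': config.require_secret('db_password'),
        'engine': 'postgresql',
        'host': rds_instance.address,  # hostname only; .endpoint is host:port
        'port': 5432,
        'dbname': 'superset_metadata',
    })))

# Create a secret for Redis
redis_secret = aws.secretsmanager.Secret(f'{environment}-superset-redis-secret',
    description='Redis connection string for Superset caching',
    tags={'Environment': environment})

# redis_cluster is defined under "ElastiCache Redis Instance" below.
# An f-string cannot interpolate an Output, so build the URL with Output.concat.
redis_secret_version = aws.secretsmanager.SecretVersion(
    f'{environment}-superset-redis-secret-version',
    secret_id=redis_secret.id,
    # rediss:// selects TLS, matching transit_encryption_enabled on the cache
    secret_string=pulumi.Output.secret(pulumi.Output.concat(
        'rediss://', redis_cluster.cache_nodes[0].address, ':6379/0')))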

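When a Superset task starts, ECS reads these secrets from Secrets Manager and injects them into the container as environment variables. This keeps sensitive data out of container images and out of the plaintext task definition. Pulumi state still records the values, but Pulumi encrypts anything wrapped in pulumi.Output.secret.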
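Because the database secret is a JSON document, a container usually wants one field from it rather than the whole string. ECS supports this by appending a JSON key (and optionally a version stage and version ID) to the secret ARN in valueFrom. A minimal sketch of that reference format follows; the helper name `secret_key_ref`, the variable names DATABASE_USER and DATABASE_PASSWORD, and the example ARN are all hypothetical.

```python
def secret_key_ref(secret_arn, json_key, version_stage='', version_id=''):
    """Build an ECS 'valueFrom' string that selects one key of a JSON secret.

    Format: <secret-arn>:<json-key>:<version-stage>:<version-id>
    An empty stage and id select the current version of the secret.
    """
    return f'{secret_arn}:{json_key}:{version_stage}:{version_id}'

# Placeholder ARN for illustration only
example_arn = ('arn:aws:secretsmanager:ap-southeast-2:123456789012:'
               'secret:prod-superset-db-secret-AbCdEf')

container_secrets = [
    {'name': 'DATABASE_USER', 'valueFrom': secret_key_ref(example_arn, 'username')},
    {'name': 'DATABASE_PASSWORD', 'valueFrom': secret_key_ref(example_arn, 'password')},
]
print(container_secrets[1]['valueFrom'])
```

Inside a Pulumi program the ARN is an Output, so the same string would be assembled with pulumi.Output.concat(db_secret.arn, ':password::') rather than an f-string.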

Pulumi State Backend

Pulumi stores the state of your infrastructure (resource IDs, outputs, etc.) in a state backend. For production, use AWS S3 with encryption and versioning enabled.

# Configure Pulumi to use S3 backend
pulumi login s3://your-pulumi-state-bucket

Enable S3 bucket versioning and encryption:

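# Create this bucket in a separate bootstrap stack (or by hand): a stack cannot
# store its state in a bucket that it creates itself.
state_bucket = aws.s3.Bucket(f'{environment}-pulumi-state',
    versioning=aws.s3.BucketVersioningArgs(
        enabled=True,
    ),
    server_side_encryption_configuration=aws.s3.BucketServerSideEncryptionConfigurationArgs(
        rule=aws.s3.BucketServerSideEncryptionConfigurationRuleArgs(
            apply_server_side_encryption_by_default=aws.s3.BucketServerSideEncryptionConfigurationRuleApplyServerSideEncryptionByDefaultArgs(
                sse_algorithm='AES256',
            ),
        ),
    ),
    tags={'Environment': environment})

# Public access settings live on a separate resource, not on the bucket itself
state_bucket_public_access = aws.s3.BucketPublicAccessBlock(
    f'{environment}-pulumi-state-public-access',
    bucket=state_bucket.id,
    block_public_acls=True,
    block_public_policy=True,
    ignore_public_acls=True,
    restrict_public_buckets=True)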

Persistent Storage for Uploads

Superset allows users to upload CSV files for analysis. Store these in S3, not in the container.

superset_uploads_bucket = aws.s3.Bucket(f'{environment}-superset-uploads',
    versioning=aws.s3.BucketVersioningArgs(enabled=True),
    server_side_encryption_configuration=aws.s3.BucketServerSideEncryptionConfigurationArgs(
        rule=aws.s3.BucketServerSideEncryptionConfigurationRuleArgs(
            apply_server_side_encryption_by_default=aws.s3.BucketServerSideEncryptionConfigurationRuleApplyServerSideEncryptionByDefaultArgs(
                sse_algorithm='AES256',
            ),
        ),
    ),
    tags={'Environment': environment})

# Block public access (a separate resource, not a Bucket argument)
uploads_public_access = aws.s3.BucketPublicAccessBlock(
    f'{environment}-superset-uploads-public-access',
    bucket=superset_uploads_bucket.id,
    block_public_acls=True,
    block_public_policy=True,
    ignore_public_acls=True,
    restrict_public_buckets=True)

# IAM role for ECS task to access S3
superset_task_role = aws.iam.Role(f'{environment}-superset-task-role',
    assume_role_policy=pulumi.Output.json_dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Action': 'sts:AssumeRole',
            'Effect': 'Allow',
            'Principal': {'Service': 'ecs-tasks.amazonaws.com'},
        }],
    }))

# Policy to read/write to S3 uploads bucket
s3_policy = aws.iam.RolePolicy(f'{environment}-superset-s3-policy',
    role=superset_task_role.id,
    # Output.json_dumps resolves the bucket ARN Output before serialising
    policy=pulumi.Output.json_dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': ['s3:GetObject', 's3:PutObject', 's3:DeleteObject'],
            'Resource': pulumi.Output.concat(superset_uploads_bucket.arn, '/*'),
        }],
    }))

Building the Superset Stack

RDS PostgreSQL Instance

Create a managed PostgreSQL instance for Superset’s metadata database.

# Create a DB subnet group (required for RDS in a VPC)
# RDS requires subnets in at least two availability zones; add a second
# private subnet (see private_subnet_2 under Pitfall 4) to this list.
db_subnet_group = aws.rds.SubnetGroup(f'{environment}-superset-db-subnet',
    subnet_ids=[private_subnet.id],
    tags={'Environment': environment})

# Create the RDS instance
rds_instance = aws.rds.Instance(f'{environment}-superset-db',
    allocated_storage=20,
    storage_type='gp3',
    engine='postgres',
    engine_version='15.3',  # Use a minor version that RDS currently offers
    instance_class='db.t3.micro',  # Start small; scale up as needed
    db_name='superset_metadata',
    username='superset_user',
    password=config.require_secret('db_password'),
    db_subnet_group_name=db_subnet_group.name,
    vpc_security_group_ids=[rds_sg.id],
    skip_final_snapshot=False,  # Always snapshot before deletion
    # Static identifier: a timestamp here would change on every `pulumi up`
    final_snapshot_identifier=f'{environment}-superset-db-final',
    backup_retention_period=7,  # Keep 7 days of backups
    multi_az=True,  # High availability
    storage_encrypted=True,
    tags={'Environment': environment})

pulumi.export('rds_endpoint', rds_instance.endpoint)

The multi_az=True setting ensures your metadata database is highly available. If the primary instance fails, RDS automatically promotes the standby replica.

ElastiCache Redis Instance

Create a managed Redis instance for caching.

# Create a cache subnet group
cache_subnet_group = aws.elasticache.SubnetGroup(f'{environment}-superset-cache-subnet',
    subnet_ids=[private_subnet.id],
    tags={'Environment': environment})

# Create the Redis cluster
# aws.elasticache.Cluster creates a single Redis node. Encryption at rest,
# transit_encryption_mode, and automatic failover are replication-group
# settings (aws.elasticache.ReplicationGroup), so they are omitted here.
redis_cluster = aws.elasticache.Cluster(f'{environment}-superset-redis',
    engine='redis',
    engine_version='7.0',
    node_type='cache.t3.micro',
    num_cache_nodes=1,
    parameter_group_name='default.redis7',
    port=6379,
    subnet_group_name=cache_subnet_group.name,
    security_group_ids=[redis_sg.id],
    transit_encryption_enabled=True,  # Clients must connect with TLS (rediss://)
    tags={'Environment': environment})

pulumi.export('redis_endpoint', redis_cluster.cache_nodes[0].address)

For production deployments with higher availability requirements, use a Redis replication group instead of a single cluster node.

ECS Cluster and Task Definition

Create an ECS cluster and define a task to run Superset.

# Create ECS cluster
ecs_cluster = aws.ecs.Cluster(f'{environment}-superset-cluster',
    settings=[aws.ecs.ClusterSettingArgs(
        name='containerInsights',
        value='enabled',
    )],
    tags={'Environment': environment})

# CloudWatch log group for Superset
log_group = aws.cloudwatch.LogGroup(f'{environment}-superset-logs',
    retention_in_days=7,
    tags={'Environment': environment})

# ECS task execution role (allows ECS to pull image and access Secrets Manager)
task_execution_role = aws.iam.Role(f'{environment}-superset-task-execution-role',
    assume_role_policy=pulumi.Output.json_dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Action': 'sts:AssumeRole',
            'Effect': 'Allow',
            'Principal': {'Service': 'ecs-tasks.amazonaws.com'},
        }],
    }))

# Attach the standard ECS task execution policy
task_execution_policy_attachment = aws.iam.RolePolicyAttachment(
    f'{environment}-superset-task-execution-policy',
    role=task_execution_role.name,
    policy_arn='arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy')

# Allow task execution role to access Secrets Manager
secrets_policy = aws.iam.RolePolicy(f'{environment}-superset-secrets-policy',
    role=task_execution_role.id,
    # Output.json_dumps resolves the secret ARN Outputs before serialising
    policy=pulumi.Output.json_dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': ['secretsmanager:GetSecretValue'],
            'Resource': [db_secret.arn, redis_secret.arn],
        }],
    }))

# ECS task definition
task_definition = aws.ecs.TaskDefinition(f'{environment}-superset-task',
    family=f'{environment}-superset',
    network_mode='awsvpc',
    requires_compatibilities=['FARGATE'],
    cpu='512',
    memory='1024',
    execution_role_arn=task_execution_role.arn,
    task_role_arn=superset_task_role.arn,
    # Output.json_dumps resolves the Outputs nested in the definition
    container_definitions=pulumi.Output.json_dumps([{
        'name': 'superset',
        'image': 'apache/superset:latest-dev',  # Use a pinned version in production
        'portMappings': [{
            'containerPort': 8088,
            'hostPort': 8088,
            'protocol': 'tcp',
        }],
        'logConfiguration': {
            'logDriver': 'awslogs',
            'options': {
                'awslogs-group': log_group.name,
                'awslogs-region': 'ap-southeast-2',
                'awslogs-stream-prefix': 'ecs',
            },
        },
        'secrets': [
            {
                # db_secret holds JSON, so this injects the whole document;
                # append ':<json-key>::' to the ARN to inject a single field
                'name': 'SUPERSET_DATABASE_URL',
                'valueFrom': db_secret.arn,
            },
            {
                'name': 'REDIS_URL',
                'valueFrom': redis_secret.arn,
            },
        ],
        'environment': [
            {'name': 'SUPERSET_LOAD_EXAMPLES', 'value': 'false'},
            # Plain environment values are visible in the task definition;
            # move the secret key to Secrets Manager for production
            {'name': 'SUPERSET_SECRET_KEY', 'value': config.require_secret('superset_secret_key')},
            # Custom variable: your own superset_config.py must read it
            {'name': 'SUPERSET_UPLOADS_FOLDER',
             'value': pulumi.Output.concat('s3://', superset_uploads_bucket.id, '/uploads')},
        ],
        'essential': True,
    }]),
    tags={'Environment': environment})

This task definition pulls the official Apache Superset Docker image, configures logging to CloudWatch, and injects secrets at runtime. For production, pin the image to a specific version (e.g., apache/superset:2.1.0) rather than latest-dev.
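The container still has to turn those injected variables into Superset settings, which normally happens in superset_config.py. The sketch below is one hedged way to do that, assuming the variable names used in this guide (SUPERSET_DATABASE_URL and REDIS_URL) and assuming the database variable may hold either a ready-made URI or the JSON secret document. The function name, the demo values, and the psycopg2 driver choice are illustrative assumptions.

```python
import json
from urllib.parse import quote_plus

def database_uri_from_env(env):
    """Return a SQLAlchemy URI from SUPERSET_DATABASE_URL.

    Accepts either a ready-made URI or the JSON document stored in the
    database secret (username, password, host, port, dbname).
    """
    raw = env['SUPERSET_DATABASE_URL']
    if not raw.lstrip().startswith('{'):
        return raw  # already a URI
    secret = json.loads(raw)
    user = quote_plus(secret['username'])
    password = quote_plus(secret['password'])  # escape characters such as '@' or '/'
    return (f"postgresql+psycopg2://{user}:{password}"
            f"@{secret['host']}:{secret['port']}/{secret['dbname']}")

# Demo values only; in the container these come from the ECS task definition.
demo_env = {
    'SUPERSET_DATABASE_URL': json.dumps({
        'username': 'superset_user', 'password': 'p@ss/word',
        'host': 'db.internal.example', 'port': 5432, 'dbname': 'superset_metadata',
    }),
    'REDIS_URL': 'redis://cache.internal.example:6379/0',
}

# In superset_config.py, pass os.environ instead of demo_env.
SQLALCHEMY_DATABASE_URI = database_uri_from_env(demo_env)
CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 300,
    'CACHE_KEY_PREFIX': 'superset_',
    'CACHE_REDIS_URL': demo_env['REDIS_URL'],
}
print(SQLALCHEMY_DATABASE_URI)
```

URL-encoding the username and password matters because generated passwords often contain characters that would otherwise break URI parsing.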

Application Load Balancer

Create an ALB to distribute traffic to Superset tasks.

# Create target group
target_group = aws.lb.TargetGroup(f'{environment}-superset-tg',
    port=8088,
    protocol='HTTP',
    target_type='ip',
    vpc_id=vpc.id,
    health_check=aws.lb.TargetGroupHealthCheckArgs(
        healthy_threshold=2,
        unhealthy_threshold=2,
        timeout=5,
        interval=30,
        path='/health',
        matcher='200',
    ),
    tags={'Environment': environment})

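# Create ALB
# An ALB needs subnets in at least two availability zones; add a second
# public subnet in another AZ to this list before deploying.
alb = aws.lb.LoadBalancer(f'{environment}-superset-alb',
    internal=False,
    load_balancer_type='application',
    security_groups=[alb_sg.id],
    subnets=[public_subnet.id],
    tags={'Environment': environment})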

# HTTPS listener
https_listener = aws.lb.Listener(f'{environment}-superset-https',
    load_balancer_arn=alb.arn,
    port=443,
    protocol='HTTPS',
    ssl_policy='ELBSecurityPolicy-TLS-1-2-2017-01',
    certificate_arn=cert.arn,
    default_actions=[aws.lb.ListenerDefaultActionArgs(
        type='forward',
        target_group_arn=target_group.arn,
    )])

# HTTP listener (redirect to HTTPS)
http_listener = aws.lb.Listener(f'{environment}-superset-http',
    load_balancer_arn=alb.arn,
    port=80,
    protocol='HTTP',
    default_actions=[aws.lb.ListenerDefaultActionArgs(
        type='redirect',
        redirect=aws.lb.ListenerDefaultActionRedirectArgs(
            port='443',
            protocol='HTTPS',
            status_code='HTTP_301',
        ),
    )])

pulumi.export('alb_dns_name', alb.dns_name)

ECS Service

Finally, create an ECS service to run Superset tasks.

service = aws.ecs.Service(f'{environment}-superset-service',
    cluster=ecs_cluster.arn,
    task_definition=task_definition.arn,
    desired_count=2,  # Run 2 tasks for high availability
    launch_type='FARGATE',
    network_configuration=aws.ecs.ServiceNetworkConfigurationArgs(
        subnets=[private_subnet.id],
        security_groups=[superset_sg.id],
        assign_public_ip=False,
    ),
    load_balancers=[aws.ecs.ServiceLoadBalancerArgs(
        target_group_arn=target_group.arn,
        container_name='superset',
        container_port=8088,
    )],
    tags={'Environment': environment},
    # depends_on is a resource option, not a Service argument
    opts=pulumi.ResourceOptions(depends_on=[https_listener]))

Autoscaling and Load Balancing

ECS Service Autoscaling

Configure the ECS service to scale based on CPU and memory utilisation.

# Create autoscaling target
autoscaling_target = aws.appautoscaling.Target(f'{environment}-superset-autoscaling-target',
    max_capacity=5,
    min_capacity=2,
    resource_id=pulumi.Output.concat('service/', ecs_cluster.name, '/', service.name),
    scalable_dimension='ecs:service:DesiredCount',
    service_namespace='ecs')

# Keep average CPU near 70%: add tasks above the target, remove them below it
cpu_scaling_policy = aws.appautoscaling.Policy(f'{environment}-superset-cpu-scaling',
    policy_type='TargetTrackingScaling',
    resource_id=autoscaling_target.resource_id,
    scalable_dimension=autoscaling_target.scalable_dimension,
    service_namespace=autoscaling_target.service_namespace,
    target_tracking_scaling_policy_configuration=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationArgs(
        target_value=70.0,
        predefined_metric_specification=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationPredefinedMetricSpecificationArgs(
            predefined_metric_type='ECSServiceAverageCPUUtilization',
        ),
        scale_out_cooldown=60,
        scale_in_cooldown=300,
    ))

# Keep average memory near 80%
memory_scaling_policy = aws.appautoscaling.Policy(f'{environment}-superset-memory-scaling',
    policy_type='TargetTrackingScaling',
    resource_id=autoscaling_target.resource_id,
    scalable_dimension=autoscaling_target.scalable_dimension,
    service_namespace=autoscaling_target.service_namespace,
    target_tracking_scaling_policy_configuration=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationArgs(
        target_value=80.0,
        predefined_metric_specification=aws.appautoscaling.PolicyTargetTrackingScalingPolicyConfigurationPredefinedMetricSpecificationArgs(
            predefined_metric_type='ECSServiceAverageMemoryUtilization',
        ),
        scale_out_cooldown=60,
        scale_in_cooldown=300,
    ))

With these policies, your Superset deployment will automatically scale from 2 to 5 tasks as demand increases, and scale back down during quiet periods.
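To build intuition for how a target-tracking policy picks a task count, the arithmetic can be approximated as "scale the current count by the ratio of the observed metric to the target, then clamp to the configured limits". The function below is a simplified model under that assumption; the real Application Auto Scaling service works through CloudWatch alarms and cooldowns and is deliberately more conservative when scaling in.

```python
import math

def desired_tasks(current_tasks, metric_value, target_value,
                  min_capacity=2, max_capacity=5):
    """Approximate target tracking: scale so the metric returns to the target."""
    proposed = math.ceil(current_tasks * metric_value / target_value)
    return max(min_capacity, min(max_capacity, proposed))

print(desired_tasks(2, 95.0, 70.0))  # 2 tasks averaging 95% CPU
print(desired_tasks(4, 20.0, 70.0))  # 4 tasks averaging 20% CPU
print(desired_tasks(5, 99.0, 70.0))  # already at max_capacity
```

The third call shows why max_capacity deserves attention: once the service is pinned at the ceiling, sustained load shows up as latency rather than as additional tasks.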

Database Connection Pooling

Each Superset worker process keeps its own pool of metadata-database connections, so pool settings multiply across workers and tasks. Configure Superset’s SQLALCHEMY_ENGINE_OPTIONS to bound them.

# In your Superset configuration (superset_config.py or via environment variable)
SQLALCHEMY_ENGINE_OPTIONS = {
    'pool_size': 10,
    'pool_recycle': 3600,
    'pool_pre_ping': True,
    'max_overflow': 20,
}

These settings ensure:

pool_size=10: Keep up to 10 persistent connections per worker process.
pool_recycle=3600: Replace connections older than one hour (prevents stale connections).
pool_pre_ping=True: Test connections before reusing them.
max_overflow=20: Allow up to 20 additional temporary connections if the pool is exhausted.
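The pool settings above apply per process, so the worst-case load on the metadata database is the product of tasks, worker processes, and pool capacity. A small sketch of that budget check follows; the worker count per task and the database connection limit are hypothetical inputs that you would replace with your own Gunicorn settings and your instance's max_connections value.

```python
def worst_case_connections(tasks, workers_per_task, pool_size=10, max_overflow=20):
    """Upper bound on metadata-database connections opened by Superset."""
    return tasks * workers_per_task * (pool_size + max_overflow)

def fits(tasks, workers_per_task, db_max_connections, reserved=10, **pool):
    """True if the worst case leaves `reserved` connections for admin use."""
    needed = worst_case_connections(tasks, workers_per_task, **pool)
    return needed <= db_max_connections - reserved

print(worst_case_connections(tasks=5, workers_per_task=2))  # 5 * 2 * (10 + 20)
print(fits(5, 2, db_max_connections=100))
print(fits(5, 2, db_max_connections=100, pool_size=3, max_overflow=2))
```

Running the check at max_capacity rather than at the usual task count is the useful case, since connection exhaustion tends to appear exactly when autoscaling has added tasks.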

Observability and Operational Habits

CloudWatch Monitoring

Set up CloudWatch dashboards and alarms to monitor Superset’s health.

# Create a CloudWatch dashboard
dashboard = aws.cloudwatch.Dashboard(f'{environment}-superset-dashboard',
    dashboard_name=f'{environment}-superset',  # required by the resource
    # Output.json_dumps resolves the resource names embedded in the body
    dashboard_body=pulumi.Output.json_dumps({
        'widgets': [
            {
                'type': 'metric',
                'properties': {
                    # Metrics need dimensions, otherwise the widget shows no data
                    'metrics': [
                        ['AWS/ECS', 'CPUUtilization', 'ClusterName', ecs_cluster.name,
                         'ServiceName', service.name, {'stat': 'Average'}],
                        ['.', 'MemoryUtilization', '.', '.', '.', '.', {'stat': 'Average'}],
                    ],
                    'period': 300,
                    'stat': 'Average',
                    'region': 'ap-southeast-2',
                    'title': 'ECS Task CPU and Memory',
                },
            },
            {
                'type': 'metric',
                'properties': {
                    'metrics': [
                        ['AWS/ApplicationELB', 'TargetResponseTime', 'LoadBalancer',
                         alb.arn_suffix, {'stat': 'Average'}],
                        ['.', 'RequestCount', '.', '.', {'stat': 'Sum'}],
                        ['.', 'HTTPCode_Target_5XX_Count', '.', '.', {'stat': 'Sum'}],
                    ],
                    'period': 60,
                    'stat': 'Average',
                    'region': 'ap-southeast-2',
                    'title': 'ALB Performance',
                },
            },
        ],
    }))

# SNS topic that receives alarm notifications (subscribe your on-call channel)
sns_topic = aws.sns.Topic(f'{environment}-superset-alerts',
    tags={'Environment': environment})

# Alarm: ECS service CPU > 85% for 2 minutes
cpu_alarm = aws.cloudwatch.MetricAlarm(f'{environment}-superset-cpu-alarm',
    comparison_operator='GreaterThanThreshold',
    evaluation_periods=2,
    metric_name='CPUUtilization',
    namespace='AWS/ECS',
    period=60,
    statistic='Average',
    threshold=85,
    alarm_description='Alert when Superset ECS task CPU exceeds 85%',
    alarm_actions=[sns_topic.arn],  # Send to SNS topic
    # dimensions is a plain mapping of dimension name to value
    dimensions={
        'ClusterName': ecs_cluster.name,
        'ServiceName': service.name,
    })

# Alarm: ALB target health
target_health_alarm = aws.cloudwatch.MetricAlarm(f'{environment}-superset-target-health-alarm',
    comparison_operator='LessThanThreshold',
    evaluation_periods=1,
    metric_name='HealthyHostCount',
    namespace='AWS/ApplicationELB',
    period=60,
    statistic='Average',
    threshold=1,
    alarm_description='Alert when fewer than 1 healthy Superset target',
    alarm_actions=[sns_topic.arn],
    dimensions={
        'TargetGroup': target_group.arn_suffix,
        'LoadBalancer': alb.arn_suffix,
    })

Application Logging and Log Insights

Superset logs to stdout, which ECS captures and sends to CloudWatch. Query logs with CloudWatch Logs Insights.

# Find errors in Superset logs
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by @message

For more sophisticated observability, integrate with Datadog, New Relic, or Prometheus.

Operational Runbook

Document operational procedures for your team:

Deploying a new version: Update the image tag in the task definition and run pulumi up; ECS then replaces the running tasks with the new revision.
Scaling up: Modify desired_count in the ECS service or let autoscaling handle it.
Database maintenance: Use AWS RDS console to create snapshots, modify parameter groups.
Secrets rotation: Update secrets in AWS Secrets Manager; ECS tasks will pick up changes on next restart.
Debugging a failed task: Check CloudWatch logs, ECS task details, and security group rules.

Disaster Recovery and Backup Strategy

RDS Automated Backups

RDS automatically backs up your metadata database. Configure retention and testing.

# Already configured in the RDS instance definition above:
# backup_retention_period=7  # Keep 7 days of backups
# multi_az=True  # Synchronous standby replica

To restore from a backup:

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier superset-restored \
  --db-snapshot-identifier superset-2024-01-15-03-00

Snapshots and Point-in-Time Recovery

Create manual snapshots before major changes.

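# Manual snapshot via Pulumi (run before deployments)
# aws.rds.Snapshot targets a single RDS instance; ClusterSnapshot is for Aurora clusters.
# 'snapshot_label' is a hypothetical config key: change it only when you want a
# new snapshot, because a timestamp would force a replacement on every update.
manual_snapshot = aws.rds.Snapshot(f'{environment}-superset-snapshot-pre-deploy',
    db_instance_identifier=rds_instance.identifier,
    db_snapshot_identifier=f"superset-pre-deploy-{config.require('snapshot_label')}")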

Redis Persistence

Superset uses Redis as a cache, so losing its contents costs warm-up time rather than data. If you still want backups, note that ElastiCache writes RDB snapshots only when a retention period is set.

# Add to the Redis definition above to keep daily snapshots:
# snapshot_retention_limit=7

For critical workloads, use a Redis replication group with Multi-AZ automatic failover.

S3 Uploads Bucket Versioning

Enable versioning and lifecycle policies on the uploads bucket.

# Already configured above:
# versioning=aws.s3.BucketVersioningArgs(enabled=True)

# Add lifecycle rule to archive old versions
lifecycle_rule = aws.s3.BucketLifecycleConfigurationV2(f'{environment}-superset-uploads-lifecycle',
    bucket=superset_uploads_bucket.id,
    rules=[
        aws.s3.BucketLifecycleConfigurationV2RuleArgs(
            id='archive-old-versions',
            status='Enabled',
            noncurrent_version_transitions=[
                aws.s3.BucketLifecycleConfigurationV2RuleNoncurrentVersionTransitionArgs(
                    storage_class='GLACIER',
                    noncurrent_days=30,  # days after a version becomes noncurrent
                ),
            ],
            noncurrent_version_expiration=aws.s3.BucketLifecycleConfigurationV2RuleNoncurrentVersionExpirationArgs(
                noncurrent_days=90,
            ),
        ),
    ])

Disaster Recovery Testing

Monthly, test your recovery procedures:

Restore RDS from snapshot to a test instance.
Verify Superset can connect and query the restored database.
Document any issues and update runbooks.

Cost Optimisation and Governance

Right-Sizing Compute

Start with small instance types and scale based on actual usage. For a pilot deployment:

ECS Fargate: 512 CPU units (0.5 vCPU), 1 GB memory per task; 2 tasks = ~$15/month.
RDS: db.t3.micro = ~$20/month.
ElastiCache: cache.t3.micro = ~$10/month.
ALB: ~$15/month.
NAT Gateway: ~$30/month (if using one).

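Total pilot cost: ~$90/month on these estimates. Treat the figures as rough: Fargate bills per vCPU-hour and GB-hour for every running task, and Multi-AZ roughly doubles the RDS instance charge, so price your own region and configuration before budgeting. As you scale, costs grow roughly in line with task count and instance size.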
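Since Fargate pricing is a function of vCPU-hours and GB-hours, the compute line of the estimate is easy to recompute for your own configuration. The sketch below shows the arithmetic only; the two hourly rates are assumed example values, not quoted prices for any particular region, and should be replaced with figures from the current AWS price list.

```python
HOURS_PER_MONTH = 730

def fargate_monthly_cost(tasks, vcpu, memory_gb, vcpu_hour_rate, gb_hour_rate,
                         hours=HOURS_PER_MONTH):
    """Monthly cost of always-on Fargate tasks at the given hourly rates."""
    per_task_hour = vcpu * vcpu_hour_rate + memory_gb * gb_hour_rate
    return tasks * per_task_hour * hours

# Assumed example rates in USD; check the AWS pricing page for your region.
ASSUMED_VCPU_HOUR_RATE = 0.04048
ASSUMED_GB_HOUR_RATE = 0.004445

monthly = fargate_monthly_cost(tasks=2, vcpu=0.5, memory_gb=1,
                               vcpu_hour_rate=ASSUMED_VCPU_HOUR_RATE,
                               gb_hour_rate=ASSUMED_GB_HOUR_RATE)
print(round(monthly, 2))
```

Re-running the function with max_capacity tasks gives the upper bound of the compute bill under autoscaling, which is a useful number to have before setting budget alarms.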

Reserved Instances and Savings Plans

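Once your workload has stabilised, commit to capacity for 30–40% discounts: Reserved Instances cover RDS, reserved nodes cover ElastiCache, and Compute Savings Plans cover Fargate.

# Example: Purchase a 1-year RDS reserved instance
# The offering_id below is a placeholder; look up a real offering first.
rds_reservation = aws.rds.ReservedInstance(f'{environment}-superset-db-reservation',
    offering_id='12345678-1234-1234-1234-123456789012',
    reservation_id=f'{environment}-superset-db-reservation',
    instance_count=1)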

Cost Allocation Tags

Tag all resources consistently for cost tracking.

common_tags = {
    'Environment': environment,
    'Project': 'Superset',
    'CostCenter': 'Analytics',
    'Owner': 'Data Platform Team',
}

Activate these keys as cost allocation tags in the Billing console, then use AWS Cost Explorer to filter costs by tag and identify optimisation opportunities.
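Consistent tagging is easier to enforce when every resource builds its tags through one helper instead of repeating the dictionary. A minimal sketch follows, assuming a hypothetical `tags_for` helper and a hypothetical Component tag key; the environment value is hard-coded here only to keep the example self-contained.

```python
environment = 'prod'  # in the Pulumi program this comes from config.require('environment')

common_tags = {
    'Environment': environment,
    'Project': 'Superset',
    'CostCenter': 'Analytics',
    'Owner': 'Data Platform Team',
}

def tags_for(component, **extra):
    """Merge the shared tags with per-resource tags; per-resource values win."""
    return {**common_tags, 'Component': component, **extra}

print(tags_for('database'))
print(tags_for('cache', Owner='SRE'))
```

Each resource would then pass tags=tags_for('database') and so on, so a new shared tag only has to be added in one place.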

Unused Resource Cleanup

Regularly audit your deployment for unused resources:

# List all Superset-related resources
aws resourcegroupstaggingapi get-resources \
  --tag-filters Key=Project,Values=Superset

Common Pitfalls and How to Avoid Them

Pitfall 1: Database Connection Exhaustion

Problem: Superset tasks can’t connect to RDS because the database has run out of connection slots.

Symptom: Errors like FATAL: remaining connection slots are reserved for non-replication superuser connections.

Solution: Configure connection pooling and monitor connection count.

# Monitor RDS connections
# Assumes the RDS instance defined earlier is bound to `db_instance` (hypothetical name).
connection_count_alarm = aws.cloudwatch.MetricAlarm(f'{environment}-rds-connections-alarm',
    comparison_operator='GreaterThanThreshold',
    evaluation_periods=2,
    metric_name='DatabaseConnections',
    namespace='AWS/RDS',
    dimensions={'DBInstanceIdentifier': db_instance.identifier},  # Scope to this instance
    period=60,
    statistic='Average',
    threshold=80,  # Alert if > 80 connections
    alarm_description='Alert when RDS connection count exceeds 80',
    alarm_actions=[sns_topic.arn])
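The alarm only tells you when connections run high; the pool itself still needs sizing. A rough way to budget it is to divide the connections you are willing to use by the number of worker processes that can open them. The sketch below assumes a connection budget of 80 (taken from the alarm threshold, not from the instance's actual `max_connections`); the task and worker counts, and the helper name `per_worker_pool_size`, are hypothetical.

```python
def per_worker_pool_size(
    max_connections: int,
    reserved: int,
    tasks: int,
    workers_per_task: int,
) -> int:
    """Return the largest per-worker connection ceiling that fits the budget.

    max_connections: connections you are willing to consume on the database
    reserved: connections held back for admin access, migrations, and so on
    tasks: maximum number of Superset tasks after autoscaling
    workers_per_task: worker processes per task, each with its own pool
    """
    if tasks < 1 or workers_per_task < 1:
        raise ValueError("tasks and workers_per_task must be at least 1")
    usable = max_connections - reserved
    workers = tasks * workers_per_task
    if usable < workers:
        raise ValueError("budget is too small for one connection per worker")
    return usable // workers


# Example: budget of 80, 5 held back, up to 4 tasks with 3 workers each.
print(per_worker_pool_size(80, 5, 4, 3))
```

In SQLAlchemy terms, the result is a ceiling for `pool_size` plus `max_overflow` in each worker process. Size against the autoscaling maximum rather than the current task count, otherwise a scale-out event can exhaust the database.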

Pitfall 2: Unencrypted Secrets in Task Definition

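Problem: Secrets passed as plain environment variables are stored in plaintext in every ECS task definition revision, where anyone who can read task definitions can see them.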

Solution: Use AWS Secrets Manager (as shown above) instead of hardcoded environment variables.

Pitfall 3: No Health Checks on ALB

Problem: Unhealthy Superset tasks continue to receive traffic, causing 502 errors.

Solution: Configure ALB health checks with appropriate thresholds (as shown above).

Pitfall 4: Single-AZ Deployment

Problem: If an availability zone goes down, Superset becomes unavailable.

Solution: Deploy Superset tasks across multiple AZs.

# Create a second private subnet in a different AZ
private_subnet_2 = aws.ec2.Subnet(f'{environment}-private-subnet-2',
    vpc_id=vpc.id,
    cidr_block='10.0.3.0/24',
    availability_zone='ap-southeast-2b')  # Different AZ

# Update ECS service to use both subnets
service = aws.ecs.Service(...,
    network_configuration=aws.ecs.ServiceNetworkConfigurationArgs(
        subnets=[private_subnet.id, private_subnet_2.id],  # Both subnets
        ...
    ),
    ...)
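When you add more AZs, hand-picking CIDR blocks gets error-prone. The standard-library `ipaddress` module can carve one block per AZ out of the VPC range. This is a sketch under assumptions: the `10.0.0.0/16` VPC range and the `ap-southeast-2a` zone are illustrative, `plan_private_subnets` is a hypothetical helper, and `start=2` is chosen only so that the second block matches the `10.0.3.0/24` used above.

```python
import ipaddress
from itertools import islice


def plan_private_subnets(
    vpc_cidr: str,
    azs: list[str],
    new_prefix: int = 24,
    start: int = 0,
) -> dict[str, str]:
    """Assign one non-overlapping CIDR block per availability zone.

    Blocks are taken in order from the VPC range, skipping the first
    `start` blocks (for example, those already used by public subnets).
    """
    network = ipaddress.ip_network(vpc_cidr)
    blocks = list(islice(network.subnets(new_prefix=new_prefix), start, start + len(azs)))
    if len(blocks) < len(azs):
        raise ValueError("VPC range is too small for the requested subnets")
    return {az: str(block) for az, block in zip(azs, blocks)}


plan = plan_private_subnets(
    "10.0.0.0/16",
    ["ap-southeast-2a", "ap-southeast-2b"],
    start=2,
)
for az, cidr in plan.items():
    print(az, cidr)
```

Each entry in the returned mapping can feed one `aws.ec2.Subnet` in a loop, and the resulting subnet IDs go into the service's `subnets` list.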

Pitfall 5: No Log Retention

Problem: CloudWatch logs grow unbounded, increasing costs.

Solution: Set retention policies (as shown above).

Next Steps and Scaling

Beyond the Baseline

Once your baseline Superset deployment is stable, consider:

Multi-region deployment: Deploy Superset in multiple AWS regions for global availability. Use Route 53 for DNS failover.
Kubernetes migration: If you’re running other workloads on Kubernetes, migrate Superset to your existing cluster for operational simplicity.
Advanced caching: Implement query result caching with TTLs to reduce database load.
Custom plugins: Build Superset plugins for domain-specific visualisations or data connectors.

Integration with Data Platforms

Superset is most powerful when connected to a well-designed data platform. Consider:

Data warehouse: Use PADISO's Platform Development in Australia service to design a ClickHouse or Snowflake data warehouse optimised for Superset queries.
ETL/ELT pipelines: Orchestrate data ingestion with Apache Airflow or Prefect.
Data governance: Implement lineage tracking and metadata management.

For teams in specific regions, PADISO offers specialised platform engineering:

Platform Development in Sydney for financial services and retail.
Platform Development in Melbourne for insurance and health.
Platform Development in Canberra for government and defence.
Platform Development in San Francisco for AI and SaaS.
Platform Development in Boston for biotech and pharma.
Platform Development in Seattle for cloud-native and aerospace.
Platform Development in Atlanta for fintech and logistics.
Platform Development in Denver for energy and aerospace.

Compliance and Auditing

If your organisation requires SOC 2 or ISO 27001 compliance, use Vanta to automate compliance evidence collection. Because Pulumi provisions ordinary AWS resources, Vanta’s continuous compliance monitoring can cover them like any other part of your account.

PADISO specialises in Security Audit (SOC 2 / ISO 27001) readiness. If you need guidance on audit preparation, book a consultation.

Operational Excellence

As your Superset deployment matures, invest in:

Infrastructure-as-Code maturity: Modularise your Pulumi code into reusable stacks and components.
CI/CD pipelines: Automate Superset deployments with GitHub Actions or GitLab CI.
Cost optimisation: Use AWS Compute Optimizer and Cost Anomaly Detection to identify savings.
Disaster recovery drills: Test your backup and recovery procedures quarterly.

Getting Help

If you need hands-on support building or scaling Superset on Pulumi, PADISO offers fractional CTO and platform engineering services. Visit PADISO to learn more.

For a rapid assessment of your current analytics stack and Superset readiness, book an AI Quickstart Audit, a fixed-fee 2-week diagnostic. We’ll tell you where you are, what to ship first, and what 90 days could unlock.

Summary

Deploying Apache Superset on Pulumi Stack is a proven pattern for shipping repeatable, auditable analytics infrastructure. This guide covers:

Architecture decisions: Cloud provider, compute model, databases, caching, and data warehouse selection.
Networking and security: VPC design, security groups, TLS, and secrets management.
Storage and state: Persistent uploads, Pulumi state backend, and database backups.
Core infrastructure: RDS PostgreSQL, ElastiCache Redis, ECS Fargate, and Application Load Balancer.
Autoscaling: ECS service scaling policies and database connection pooling.
Observability: CloudWatch dashboards, alarms, and operational runbooks.
Disaster recovery: Automated backups, snapshots, and recovery testing.
Cost optimisation: Right-sizing, reserved instances, and cost allocation.
Common pitfalls: Connection exhaustion, unencrypted secrets, missing health checks, single-AZ risk, and unbounded logs.

With this pattern, you can ship Superset to production in days, not weeks. Your infrastructure is version-controlled, auditable, and ready to scale. The operational habits described here—monitoring, backup testing, cost tracking—keep your deployment healthy and cost-effective.

Start small (pilot deployment, ~$90/month), validate your use case, then scale confidently. If you need support, PADISO’s platform engineering teams are experienced with Superset deployments across Australia and the US.
