Guide 32 mins

Apache Superset + dbt: Security Model

Complete guide to securing Apache Superset with dbt. Configuration patterns, RBAC, audit-readiness, and operational security habits for analytics platforms.

The PADISO Team ·2026-06-18

Why Superset + dbt Security Matters
Understanding the Security Stack
Authentication and Authorization Layers
Role-Based Access Control (RBAC) Configuration
dbt Governance and Model Access
Warehouse-Level Security Integration
Row-Level Security Implementation
Audit Logging and Compliance
Operational Security Habits
Security Benchmarks and Metrics
Common Pitfalls and Solutions
Next Steps: Audit-Ready Architecture

Why Superset + dbt Security Matters {#why-security-matters}

When you deploy Apache Superset as your analytics layer on top of dbt-transformed data, you’re building a system that touches sensitive business intelligence, financial metrics, and customer data. Unlike a single-purpose application, Superset sits at the intersection of multiple teams—finance, product, marketing, operations—each needing different views of the same underlying data.

The challenge is straightforward: how do you ensure that a finance analyst sees revenue data, a product manager sees feature adoption, and neither sees what they shouldn’t—without rebuilding your entire dbt project for each audience?

This is where security model design becomes critical. A poorly configured Superset + dbt stack creates three immediate risks:

Data leakage: Users query directly into Superset and accidentally (or intentionally) export datasets they shouldn’t access. A single misconfigured dashboard can expose customer PII, pricing data, or acquisition costs to the wrong team.

Audit failure: When you run SOC 2 or ISO 27001 compliance audits, regulators ask: “Who accessed what, when, and why?” If Superset logs don’t connect to warehouse logs, you have gaps. If dbt doesn’t track model lineage and access, you can’t prove data governance.

Operational chaos: Without clear RBAC patterns, every new hire or role change becomes a manual security review. You end up with over-privileged users, orphaned permissions, and dashboards nobody can maintain.

The good news: a well-designed Superset + dbt security model is operationally sustainable. It scales from 5 users to 500. It integrates cleanly with your compliance programme. And it doesn’t require you to rebuild your entire analytics architecture.

This guide walks you through the patterns, configurations, and operational habits that work at scale. We’ll focus on concrete implementation details, not theory.

Understanding the Security Stack {#security-stack}

Before we configure anything, map your security layers. Superset + dbt security is layered, not monolithic. Each layer enforces different controls at different points in the data pipeline.

The Four Security Layers

Layer 1: Authentication (who are you?)

Superset needs to know who you are. This happens at login. You can authenticate via LDAP, OAuth2, SAML, or local Superset users. Most production deployments use LDAP (Active Directory) or SAML for enterprise identity management.

Why this matters: If you can’t prove who accessed Superset, you can’t audit anything downstream. For SOC 2 and ISO 27001 compliance, you’ll need to demonstrate that every login is tied to a named individual, not a shared account.

Layer 2: Superset Authorization (what can you access in Superset?)

Once authenticated, Superset applies role-based access control. A user might have the “Analyst” role, which grants permission to view dashboards in the “Finance” folder and edit charts in the “Product” folder. This is Superset’s internal permission model.

Why this matters: It controls dashboard and chart visibility within Superset UI. But it doesn’t control what data you can query. That’s layer 3.

Layer 3: Warehouse Authorization (what can you query?)

When Superset runs a query, it connects to your warehouse (Snowflake, BigQuery, PostgreSQL, etc.) as a specific user or service account. The warehouse has its own RBAC—schemas, tables, columns, rows. If the Superset connection account doesn’t have permission to a table, the query fails at the warehouse level.

Why this matters: This is your hard security boundary. Even if someone misconfigures Superset permissions, the warehouse won’t let them query data they shouldn’t access. This is critical for compliance.

Layer 4: dbt Governance (what models should exist, and who owns them?)

dbt doesn’t enforce runtime access control—it’s a transformation layer. But dbt does provide metadata about data lineage, ownership, and intended use. When you tag models with access: restricted or add a meta.owner field, you’re creating a governance contract. Tools like dbt Custom Groups and Access let you define which teams can reference which models, and you can expose that metadata to Superset.

Why this matters: dbt governance creates intent. It documents which models are meant for which audiences. Superset can then use that metadata to suggest appropriate dashboards and surfaces.

A Practical Example

Let’s say you have a customers table in your warehouse. You’ve built a dbt model fct_customer_lifetime_value that calculates revenue per customer. You want:

Finance team: Can see LTV by cohort, segment, and geography. Full dashboard access.
Product team: Can see LTV by feature adoption, not by customer name or email.
Sales team: Cannot see LTV at all (it’s a finance metric).

Here’s how the four layers work together:

Authentication: Finance analyst logs in via LDAP. Superset confirms their identity.
Superset Authorization: Superset checks: “Is this user in the ‘Finance’ role?” Yes. Grant dashboard access.
Warehouse Authorization: Superset runs the query as the superset_finance service account. Snowflake checks: “Does superset_finance have SELECT on fct_customer_lifetime_value?” Yes. Execute query.
dbt Governance: The fct_customer_lifetime_value model is tagged access: restricted and owned by Finance. Superset’s data catalogue surfaces this metadata to Product, signalling: “This model isn’t for you.”

If Product tries to query the same model:

Authentication: Product manager logs in. Identity confirmed.
Superset Authorization: Superset checks: “Is this user in the ‘Product’ role?” Yes, but Product role doesn’t have dashboard access to Finance dashboards. Deny.
Warehouse Authorization: Even if they bypass Superset UI and hit the API, the warehouse account for Product (superset_product) doesn’t have SELECT on fct_customer_lifetime_value. Warehouse denies.
dbt Governance: Metadata signals that this model isn’t intended for Product use.

This layered approach is defence in depth. No single misconfiguration exposes data.

Authentication and Authorization Layers {#auth-layers}

Setting Up Enterprise Authentication

For production deployments, avoid local Superset user management. Instead, integrate with your identity provider.

LDAP / Active Directory is the most common pattern for enterprise deployments. When a user logs in, Superset queries your AD server to verify credentials and pull group membership. This ties Superset access to your organisation’s identity system.

Configuration in superset_config.py:

from flask_appbuilder.security.manager import AUTH_LDAP

AUTH_TYPE = AUTH_LDAP
AUTH_LDAP_SERVER = "ldap://your-ad-server.internal:389"
AUTH_LDAP_USE_SSL = True
AUTH_LDAP_BIND_DN = "cn=superset-service,ou=service-accounts,dc=company,dc=com"
AUTH_LDAP_BIND_PASSWORD = "<secure-password>"
AUTH_LDAP_SEARCH_BASE = "ou=users,dc=company,dc=com"
AUTH_LDAP_USERNAME_FIELD = "sAMAccountName"

# Map AD groups to Superset roles
AUTH_LDAP_GROUP_FIELD = "memberOf"
AUTH_LDAP_APPEND_DOMAIN = False

# This is critical: sync AD group membership to Superset roles on every login
AUTH_LDAP_SYNC_USER_INFO = True
AUTH_LDAP_USE_SSL_FOR_CONNECTIONS = True

SAML is the enterprise standard for SaaS deployments. If you’re running Superset on AWS or in a cloud environment, SAML (via Okta, Azure AD, or Ping Identity) is cleaner than LDAP.

Configuration:

from flask_appbuilder.security.manager import AUTH_SAML

AUTH_TYPE = AUTH_SAML
SAML_METADATA_URL = "https://your-idp.okta.com/app/exk.../sso/saml/metadata"

# Map SAML attributes to Superset user fields
SAML_ATTRIBUTE_MAPPING = {
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": ("email", None),
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname": ("first_name", None),
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/surname": ("last_name", None),
}

# Map SAML groups to Superset roles
SAML_GROUP_MAPPING = {
    "okta_finance_group": ["Finance"],
    "okta_product_group": ["Product"],
    "okta_admin_group": ["Admin"],
}

Why Authentication Matters for Compliance

When auditors review your SOC 2 or ISO 27001 controls, they ask: “Can you prove who accessed what data?” If Superset is configured with local users and no audit logs, you can’t. If Superset is tied to your identity provider and logs every login, you can.

For Security Audit | PADISO - SOC 2, ISO 27001 & GDPR Compliance, the first control is always: Identity and access management. Superset authentication is your first line of evidence.

Superset Authorization: Roles and Permissions

Once authenticated, Superset applies role-based access control. Out of the box, Superset includes several built-in roles:

Admin: Full access to all dashboards, charts, datasets, and settings.
Alpha: Can view and edit dashboards and charts.
Gamma: Can view dashboards and charts only (read-only).
SQL Lab Only: Can access SQL Lab (query editor) but not dashboards.

For most organisations, these roles are too coarse. You’ll want custom roles. Here’s a typical role matrix:

Role	Dashboards	Charts	Datasets	SQL Lab	Data Sources
Finance Lead	View, Edit	View, Edit	View, Edit	Yes	Finance warehouse
Finance Analyst	View, Edit	View, Edit	View	Yes	Finance warehouse
Product Manager	View	View	View	No	Product warehouse
Data Engineer	Admin	Admin	Admin	Yes	All
Executive	View	View	View	No	Executive dashboards only

To create a custom role in Superset, go to Settings → Roles and define permissions. You can grant:

Dashboard access: View, edit, or delete specific dashboards.
Chart access: View, edit, or delete specific charts.
Dataset access: Query specific datasets (databases and schemas).
Feature access: SQL Lab, alerts, reports, etc.

Critical pattern: Always use database and schema-level permissions in Superset. Instead of granting a user access to “all datasets,” grant access to specific schemas. This ties Superset permissions to your warehouse schema structure.

Example:

# In Superset, create a role "Finance Analyst"
# Grant permissions:
# - Dataset: snowflake.finance_analytics (schema)
# - Dashboard: All dashboards tagged "finance"
# - Feature: SQL Lab (read-only)

# Do NOT grant:
# - Access to raw tables (only dbt models)
# - Access to schemas outside finance
# - Write access to datasets (only read)

Role-Based Access Control (RBAC) Configuration {#rbac-config}

Designing Your Role Hierarchy

Start by mapping your organisation’s teams and responsibilities. Don’t try to create one role per person—that’s unmaintainable. Instead, create roles for job functions.

Example role structure for a Series-B SaaS company:

Admin
├── Data Engineering Lead
│   └── Data Engineer
├── Finance Lead
│   └── Finance Analyst
├── Product Lead
│   └── Product Manager
│   └── Product Analyst
└── Executive

Each role has specific dataset and dashboard permissions. When a new hire joins Finance, they get the “Finance Analyst” role. When they’re promoted to Finance Lead, their role is updated. No individual permission grants needed.

Configuring Database Connections with Role-Based Access

This is where Superset + warehouse security converge. In Superset, you define database connections. Each connection has credentials—a username and password (or service account) that Superset uses to query the warehouse.

For production, use separate service accounts per role:

Superset Database Connections:
├── superset_finance (credentials for Finance role)
├── superset_product (credentials for Product role)
├── superset_data_eng (credentials for Data Engineering role)
└── superset_admin (credentials for Admins)

When a Finance analyst runs a query in Superset, Superset connects to the warehouse as superset_finance. The warehouse checks: “Does superset_finance have SELECT on this table?” If yes, query runs. If no, warehouse denies.

This is your hard security boundary. No misconfiguration in Superset can override warehouse permissions.

Configuration in Superset:

Go to Data → Databases.
Click + Database and select your warehouse type (Snowflake, BigQuery, etc.).
Enter connection details:
- Host, port, database name
- Username: superset_finance
- Password: (stored encrypted in Superset metadata database)
Test the connection.
Under Permissions, assign the database to the “Finance” role.

Warehouse configuration (example for Snowflake):

-- Create service accounts
CREATE ROLE superset_finance;
CREATE ROLE superset_product;
CREATE ROLE superset_data_eng;

-- Create users
CREATE USER "superset_finance" PASSWORD = '...' DEFAULT_ROLE = superset_finance;
CREATE USER "superset_product" PASSWORD = '...' DEFAULT_ROLE = superset_product;
CREATE USER "superset_data_eng" PASSWORD = '...' DEFAULT_ROLE = superset_data_eng;

-- Grant schema access
GRANT USAGE ON SCHEMA analytics.finance TO ROLE superset_finance;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.finance TO ROLE superset_finance;

GRANT USAGE ON SCHEMA analytics.product TO ROLE superset_product;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.product TO ROLE superset_product;

GRANT USAGE ON SCHEMA analytics.* TO ROLE superset_data_eng;
GRANT ALL PRIVILEGES ON SCHEMA analytics.* TO ROLE superset_data_eng;

-- Assign users to roles
GRANT ROLE superset_finance TO USER "superset_finance";
GRANT ROLE superset_product TO USER "superset_product";
GRANT ROLE superset_data_eng TO USER "superset_data_eng";

Now, when a Finance analyst logs into Superset:

Superset authenticates them via LDAP.
Superset checks their role: “Finance Analyst”.
Superset grants access to the Finance database connection.
When they query a table, Superset connects as superset_finance.
Snowflake checks: “Can superset_finance SELECT from this table?” Yes. Query runs.

If a Product manager tries the same query:

Superset authenticates them.
Superset checks their role: “Product Manager”.
Superset grants access to the Product database connection only.
If they try to query a Finance table, Superset denies (wrong database connection).
If they somehow bypass Superset, Snowflake denies (wrong warehouse role).

Datasets and Permissions

In Superset, a dataset is a saved query or table reference. Datasets are the primary object that users interact with—they’re what appears in the chart builder.

When you create a dataset, assign it to specific roles:

# Example: fct_customer_lifetime_value dataset
Dataset: fct_customer_lifetime_value
├── Database: snowflake (finance connection)
├── Schema: analytics.finance
├── Table: fct_customer_lifetime_value
├── Permissions:
│   ├── Finance Lead: View, Edit
│   ├── Finance Analyst: View
│   └── Data Engineer: View, Edit
└── Tags: ["finance", "customer", "restricted"]

When a user tries to create a chart from this dataset, Superset checks their role. If they don’t have “View” permission, the dataset doesn’t appear in the dataset picker.

Handling Shared Datasets Across Roles

Sometimes you need a dataset that multiple roles can access, but with different row-level visibility. For example, a fct_orders table that Finance, Product, and Sales all need, but with different filters:

Finance: Sees all orders (no filter).
Product: Sees orders by feature adoption (no customer names).
Sales: Sees orders by sales rep (no pricing details).

This is where row-level security (RLS) comes in. We’ll cover that in detail below. For now, the pattern is:

Create a single dataset pointing to the table.
Grant “View” permission to all relevant roles.
In each dashboard/chart, add role-specific filters using Superset’s Jinja templating.

dbt Governance and Model Access {#dbt-governance}

Defining Model Ownership and Access Tiers

dbt doesn’t enforce runtime access control, but it provides metadata that Superset can use to govern access. The key is dbt model tagging and ownership.

In your dbt_project.yml, define access tiers:

models:
  analytics:
    # Public models - anyone can use
    public:
      config:
        meta:
          access: public
          owner: "data-engineering@company.com"
    
    # Restricted models - specific teams only
    finance:
      config:
        meta:
          access: restricted
          owner: "finance@company.com"
          teams: ["finance", "cfo"]
    
    product:
      config:
        meta:
          access: restricted
          owner: "product@company.com"
          teams: ["product", "analytics"]
    
    # Internal models - data engineers only
    staging:
      config:
        meta:
          access: internal
          owner: "data-engineering@company.com"
          teams: ["data-engineering"]

Then, in individual model files:

-- models/finance/fct_customer_lifetime_value.sql
{{ config(
    materialized='table',
    meta={
        'access': 'restricted',
        'owner': 'finance@company.com',
        'teams': ['finance'],
        'pii': false,
        'freshness_sla': '1 day'
    }
) }}

SELECT
    customer_id,
    SUM(order_total) AS lifetime_value,
    COUNT(DISTINCT order_id) AS order_count,
    MAX(order_date) AS last_order_date
FROM {{ ref('stg_orders') }}
GROUP BY customer_id

For dbt Custom Groups and Access, you can define which projects, folders, or models are accessible to which dbt teams. This creates a governance layer in dbt Cloud that mirrors your Superset roles.

Exposing dbt Metadata to Superset

dbt generates a manifest.json file that contains all metadata about your project—models, tests, lineage, ownership, tags, etc. Superset can consume this metadata to:

Surface model ownership in the dataset picker.
Display access tiers (public, restricted, internal).
Show data lineage (which models depend on which sources).
Flag PII or sensitive columns.

To integrate dbt metadata into Superset, use the dbt metadata API. This requires dbt Cloud and Superset 2.0+.

Example: Query dbt metadata to list all Finance models:

# In Superset, create a custom Python script
import requests

DBT_METADATA_URL = "https://metadata.cloud.getdbt.com/graphql"
DBT_TOKEN = "<your-dbt-cloud-token>"

query = """
query {
  models(filter: {tags: {name: {_eq: "finance"}}}) {
    uniqueId
    name
    description
    owner
    meta
  }
}
"""

headers = {"Authorization": f"Bearer {DBT_TOKEN}"}
response = requests.post(DBT_METADATA_URL, json={"query": query}, headers=headers)
models = response.json()["data"]["models"]

for model in models:
    print(f"{model['name']} (Owner: {model['owner']})")

This lets you programmatically enforce governance. For example, you could prevent users from creating charts on Finance models unless they have the Finance role.

Lineage and Impact Analysis

One of dbt’s superpowers is lineage tracking. Every dbt model has a dependency graph: which sources feed into it, which models depend on it, etc.

When you expose this to Superset, users can see:

Upstream lineage: “This dashboard uses data from the stg_orders and dim_customers models, which come from the raw orders and customers tables.”
Downstream impact: “This model feeds into 12 dashboards. If we change it, these dashboards will be affected.”

This is critical for security and governance. If you’re considering restricting access to a model, you can see which dashboards will break. If a user reports a data discrepancy, you can trace it back to the source.

Superset’s data lineage feature shows this visually. Combined with dbt metadata, it provides complete transparency.

Warehouse-Level Security Integration {#warehouse-security}

Understanding Warehouse RBAC

Your data warehouse (Snowflake, BigQuery, PostgreSQL, etc.) has its own role-based access control. This is your hard security boundary. Even if Superset is misconfigured, the warehouse won’t let unauthorised users query data.

Most modern warehouses support:

Role-based access: Users are assigned roles, roles have permissions.
Schema-level permissions: Grant SELECT on specific schemas.
Table-level permissions: Grant SELECT on specific tables.
Column-level permissions: Grant SELECT on specific columns (less common, but supported by some warehouses).
Row-level security: Grant SELECT on rows matching certain conditions.

For Snowflake Role-Based Access Control, the pattern is straightforward:

-- Define roles
CREATE ROLE analyst;
CREATE ROLE finance_analyst;
CREATE ROLE data_engineer;

-- Grant schema access
GRANT USAGE ON SCHEMA analytics.finance TO ROLE finance_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.finance TO ROLE finance_analyst;

-- Grant database usage
GRANT USAGE ON DATABASE analytics TO ROLE finance_analyst;

-- Assign users to roles
GRANT ROLE finance_analyst TO USER "john.doe";

For BigQuery Access Control, the pattern is similar but uses IAM:

# Grant dataset-level access
gcloud bigquery datasets add-iam-policy-binding \
  --member=user:john.doe@company.com \
  --role=roles/bigquery.dataEditor \
  analytics_finance

# Grant table-level access (requires custom roles)
gcloud iam roles create bigquery.financeTables \
  --project=my-project \
  --title="Finance Tables" \
  --permissions=bigquery.tables.getData

Configuring Superset Service Accounts

As mentioned earlier, Superset connects to the warehouse as a service account—not as the logged-in user. This has security implications.

Advantage: You can control exactly what data Superset can access. You don’t need to grant every user warehouse access.

Disadvantage: You lose user-level audit trails at the warehouse level. You can’t see “John queried this table at 2pm.” You only see “Superset queried this table at 2pm.”

To mitigate this, configure Superset to log all queries with the authenticated user’s name. This is covered in the Audit Logging section.

Cross-Warehouse Scenarios

If you have multiple warehouses (e.g., Snowflake for analytics, PostgreSQL for operational data), create separate Superset database connections for each.

Superset Database Connections:
├── Snowflake (analytics)
│   ├── Finance connection (superset_finance user)
│   ├── Product connection (superset_product user)
│   └── Data Eng connection (superset_data_eng user)
├── PostgreSQL (operational)
│   └── Admin connection (superset_admin user)
└── BigQuery (ML models)
    └── Data Science connection (superset_ds user)

Each connection has its own credentials and warehouse-level permissions. This provides isolation: a misconfiguration in the Snowflake setup doesn’t affect PostgreSQL access.

Managing Secrets and Credentials

Superset stores database credentials in its metadata database (usually PostgreSQL). These credentials are encrypted at rest using a secret key (SECRET_KEY in superset_config.py).

Best practices:

Use strong secret keys: Generate a 64-character random string.
```
import secrets
SECRET_KEY = secrets.token_urlsafe(32)
```
Rotate credentials regularly: Every 90 days, rotate warehouse service account passwords. Update Superset connections.
Use temporary credentials where possible: If your warehouse supports temporary credentials (e.g., Snowflake key-pair authentication), use them instead of passwords.
```
-- Snowflake key-pair authentication
CREATE USER superset_finance
  IDENTIFIED BY PUBLIC KEY
  'MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEA...';
```
Audit credential access: Log whenever Superset connections are created, updated, or deleted. This is a critical compliance control.

Row-Level Security Implementation {#row-security}

Understanding Row-Level Security

Row-level security (RLS) restricts which rows a user can see based on their identity or role. For example, a sales rep should only see orders from their own territory. A regional manager should see orders from their region.

RLS is different from dataset-level permissions. With dataset permissions, you grant or deny access to the entire table. With RLS, you grant access to the table, but filter which rows appear based on the user’s context.

RLS in the Warehouse vs. Superset

There are two places to implement RLS:

1. Warehouse-level RLS (preferred)

Your warehouse applies the filter at query time. This is the most secure approach because the filter is enforced by the warehouse, not by Superset.

For PostgreSQL Row Security Policies:

-- Enable RLS on the orders table
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- Create a policy: users can only see orders from their region
CREATE POLICY orders_regional_policy ON orders
  FOR SELECT
  USING (region = current_setting('app.user_region'));

-- When a user queries orders, PostgreSQL automatically filters by their region
SET app.user_region = 'APAC';
SELECT * FROM orders;  -- Only returns APAC orders

Snowflake supports similar functionality via dynamic masking and row access policies:

-- Create a row access policy
CREATE ROW ACCESS POLICY orders_policy ON orders AS
  (region VARCHAR) RETURNS BOOLEAN ->
  CASE
    WHEN CURRENT_ROLE() = 'FINANCE_ANALYST' THEN TRUE  -- Finance sees all
    WHEN CURRENT_ROLE() = 'SALES_REP' THEN region = CURRENT_USER()  -- Sales reps see their region
    ELSE FALSE
  END;

-- Apply the policy to the orders table
ALTER TABLE orders ADD ROW ACCESS POLICY orders_policy ON (region);

2. Superset-level RLS (supplementary)

Superset can apply filters at the dashboard/chart level using Jinja templating. This is useful for:

Filtering based on the logged-in user’s attributes.
Applying filters that don’t exist in the warehouse.
Simplifying dashboard logic.

Example:

-- In a Superset chart's SQL editor
SELECT
    order_id,
    customer_id,
    order_total,
    region
FROM orders
WHERE region = '{{ current_user_obj.region }}'  -- Superset injects the user's region

Implementing RLS in Superset

Superset provides a row-level security filter feature. Here’s how to set it up:

Create a user attribute that defines the filter.
- Go to Settings → Users.
- Edit a user and add a custom attribute: region = APAC.
Create an RLS filter that uses this attribute.
- Go to Settings → Row Level Security.
- Click + Row Level Security.
- Define the filter:
```
Clause: region = '{{ current_user_obj.region }}'
Tables: orders
Roles: Sales Rep
```
Apply the filter to datasets.
- When a Sales Rep queries the orders dataset, Superset automatically appends the RLS filter.
- The query becomes: SELECT * FROM orders WHERE region = 'APAC'.

Best Practices for RLS

Prefer warehouse-level RLS: It’s more secure and performant. Superset-level RLS is a safety net.

Test RLS policies: Create test users with different attributes and verify they see the correct rows.

-- Test: Finance analyst should see all orders
SET ROLE finance_analyst;
SELECT COUNT(*) FROM orders;  -- Should be ~1M rows

-- Test: Sales rep should see only their region
SET ROLE sales_rep;
SET app.user_region = 'EMEA';
SELECT COUNT(*) FROM orders;  -- Should be ~100k rows

Document RLS policies: For compliance audits, document which RLS policies exist, why they exist, and which roles they apply to.
Monitor RLS performance: RLS can add overhead. Monitor query performance and adjust policies if needed.

Audit Logging and Compliance {#audit-logging}

Configuring Superset Audit Logs

Superset logs every action: logins, dashboard views, chart edits, queries. These logs are stored in Superset’s metadata database and are critical for compliance.

To enable detailed audit logging, configure superset_config.py:

# Enable audit logging
FALSE_POSITIVE_LOGGING = True  # Log all actions

# Log to external system (recommended for production)
import logging
from logging.handlers import SysLogHandler

logging.basicConfig(level=logging.INFO)
audit_logger = logging.getLogger('superset.audit')
audit_logger.addHandler(SysLogHandler(address=('syslog.company.com', 514)))

# Alternatively, log to CloudWatch (AWS)
from watchtower import CloudWatchLogHandler
audit_logger.addHandler(CloudWatchLogHandler(
    log_group='superset-audit',
    stream_name='production'
))

What to Log

At minimum, log:

Authentication: Who logged in, when, from which IP address.
Data access: Which user viewed which dashboard, ran which query.
Configuration changes: Who created/edited/deleted roles, permissions, datasets.
Query details: The full SQL query, execution time, rows returned.

Example audit log entry:

{
  "timestamp": "2024-01-15T14:32:00Z",
  "event_type": "query_executed",
  "user": "john.doe@company.com",
  "ip_address": "203.0.113.42",
  "database": "snowflake",
  "schema": "analytics.finance",
  "table": "fct_customer_lifetime_value",
  "query": "SELECT * FROM fct_customer_lifetime_value WHERE customer_id = 12345",
  "execution_time_ms": 1250,
  "rows_returned": 1,
  "status": "success"
}

Warehouse Audit Logs

Your warehouse also logs queries. These are essential for compliance because they show what data was actually accessed, independent of Superset.

Snowflake query history:

SELECT
    QUERY_ID,
    USER_NAME,
    QUERY_TEXT,
    EXECUTION_TIME,
    ROWS_RETURNED,
    QUERY_TYPE,
    START_TIME
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE START_TIME >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY START_TIME DESC;

BigQuery audit logs:

SELECT
    timestamp,
    principalEmail,
    methodName,
    resourceName,
    status.code
FROM `project.region.cloudaudit_googleapis_com_activity`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND methodName LIKE '%bigquery%'
ORDER BY timestamp DESC;

Audit Log Retention and Compliance

For SOC 2 Type II compliance, you need to retain audit logs for at least one year. For HIPAA, six years. For GDPR, it depends on the data, but generally 3-7 years.

Configure log retention:

# In superset_config.py
LOG_RETENTION_DAYS = 365  # Retain logs for 1 year

# Archive old logs to cold storage (S3, GCS, etc.)
import boto3
s3 = boto3.client('s3')

def archive_logs():
    # Query logs older than 90 days
    old_logs = db.session.query(Log).filter(
        Log.created_on < datetime.now() - timedelta(days=90)
    ).all()
    
    # Archive to S3
    for log in old_logs:
        s3.put_object(
            Bucket='audit-logs-archive',
            Key=f"superset/{log.created_on.year}/{log.created_on.month}/{log.id}.json",
            Body=log.to_json()
        )
    
    # Delete from database
    db.session.query(Log).filter(
        Log.created_on < datetime.now() - timedelta(days=90)
    ).delete()
    db.session.commit()

Connecting Superset and Warehouse Logs

For a complete audit trail, you need to connect Superset logs to warehouse logs. The challenge: Superset logs “user X ran query Y” but the warehouse logs “service account Z ran query Y.”

To bridge this, include a query ID in every Superset query:

# In Superset, before executing a query
import uuid
query_id = str(uuid.uuid4())

# Append a comment to the query
query_with_id = f"""
{original_query}
/* superset_query_id={query_id}, user={current_user.email} */
"""

# Log the mapping
audit_logger.info({
    'superset_query_id': query_id,
    'user': current_user.email,
    'warehouse_query_id': None  # Will be filled by warehouse
})

# Execute the query
result = warehouse.execute(query_with_id)

# After execution, query the warehouse to get the actual query ID
warehouse_query_id = warehouse.get_query_id(query_id)
audit_logger.info({'superset_query_id': query_id, 'warehouse_query_id': warehouse_query_id})

Now, if an auditor asks “Did John access the customers table on Jan 15?” you can:

Query Superset logs: “John ran query 12345-abcd at 2pm.”
Query warehouse logs: “Query 12345-abcd (superset_finance user) accessed the customers table.”
Prove the connection: John → Superset → Warehouse.

Operational Security Habits {#operational-habits}

Regular Access Reviews

Every quarter, review who has access to what. This is a compliance requirement (SOC 2, ISO 27001) and a security best practice.

Process:

Export current permissions.

-- Query Superset metadata
SELECT
    u.username,
    u.email,
    r.name AS role,
    p.permission_name,
    p.view_menu_id
FROM ab_user u
JOIN ab_user_role ur ON u.id = ur.user_id
JOIN ab_role r ON ur.role_id = r.id
JOIN ab_permission_view pv ON r.id = pv.role_id
JOIN ab_permission p ON pv.permission_id = p.id
ORDER BY u.username, r.name;

Send to team leads for review.
- “Does John still need Finance access? Does Sarah still need SQL Lab access?”
Remove unnecessary access.
- If a user has moved teams, update their role.
- If a user has left the company, deactivate their account.

Document the review.

For compliance, you need evidence that access was reviewed and approved.

Access Review - Q1 2024
Date: 2024-01-31
Reviewed by: Finance Lead, Product Lead, Data Engineering Lead
Changes:
  - Removed John's Product access (moved to Finance)
  - Removed Sarah's SQL Lab access (role change)
  - Added 3 new Finance analysts
Approved by: Chief Financial Officer

Offboarding Checklist

When an employee leaves, disable their access to Superset and the warehouse immediately.

Offboarding checklist:

Deactivate Superset user account
Remove user from all Superset roles
Disable warehouse user account (or revoke all roles)
Remove from AD / LDAP groups
Revoke API tokens and SSH keys
Archive user’s dashboards and charts
Transfer ownership of dashboards to remaining team members
Remove user from dbt Cloud projects
Audit logs to ensure no access after offboarding date

Managing Secrets and Rotations

Warehouse service account credentials should be rotated every 90 days. This is a compliance control and a security best practice.

Rotation process:

Create new service account.

CREATE USER superset_finance_new PASSWORD = '...';
GRANT ROLE superset_finance TO USER superset_finance_new;

Update Superset connection.
- Go to Data → Databases.
- Edit the Finance database connection.
- Update username and password.
- Test the connection.
Verify all queries work.
- Run a test query from each dashboard.
- Check audit logs for errors.
Disable old service account.
```
ALTER USER superset_finance DISABLE;
```
Document the rotation.
- For compliance: “Service account superset_finance rotated on 2024-01-15.”

Monitoring and Alerting

Set up alerts for suspicious activity:

Failed login attempts: If a user fails to log in 5+ times, alert the security team.
Unusual query patterns: If a user suddenly queries tables they’ve never accessed, alert.
Large data exports: If a user exports >1M rows, alert.
Off-hours access: If a user accesses Superset outside business hours, alert.

Example alert (using Superset’s alert system):

# In Superset, create an alert
alert = Alert(
    name="Large Data Export",
    description="Alert when >1M rows are exported",
    alert_type="email",
    recipients="security@company.com",
    condition="rows_returned > 1000000",
    chart_id=12345
)
db.session.add(alert)
db.session.commit()

Security Benchmarks and Metrics {#benchmarks}

Key Security Metrics

Track these metrics to measure your security posture:

Metric	Target	Frequency
Access review completion	100%	Quarterly
Offboarding completion time	<24 hours	Per employee
Credential rotation compliance	100%	Every 90 days
Audit log retention	365+ days	Continuous
Failed login attempts	<5 per user/month	Weekly review
Unusual query alerts	<2% false positives	Ongoing tuning
SOC 2 / ISO 27001 audit pass	100%	Annually

Benchmarking Against Competitors

How does your Superset + dbt security stack up against competitors?

Thoughtworks and Slalom typically implement:

SAML authentication
Role-based access control with 5-10 roles
Warehouse-level RBAC
Basic audit logging
Annual compliance reviews

PADISO’s approach (based on work with Series-B to growth-stage companies):

SAML or LDAP authentication
Granular role-based access (10-20+ roles)
Warehouse-level RBAC + Superset-level RLS
Comprehensive audit logging with warehouse integration
Quarterly access reviews
dbt governance integration
Vanta-ready compliance architecture

For companies pursuing Security Audit | PADISO - SOC 2, ISO 27001 & GDPR Compliance, the PADISO model is designed to pass audits faster and with less operational overhead.

Cost Benchmarks

Securing Superset + dbt isn’t free, but it’s predictable:

Component	Cost	Notes
Superset hosting	$500-2000/month	AWS/GCP, depends on scale
Warehouse service accounts	$0	Included in warehouse cost
Audit logging	$100-500/month	CloudWatch, Datadog, Splunk
dbt Cloud	$1000-5000/month	Depends on project count
Identity provider (SAML)	$0-500/month	Okta, Azure AD, Ping
Compliance tooling (Vanta)	$500-2000/month	For SOC 2 / ISO 27001
Security review / audit	$5000-20000/year	Annual or per-audit

Total: $1500-10000/month depending on scale and compliance requirements.

For context, a single data breach can cost $4M+ (IBM, 2023). Investing in security infrastructure is cost-effective.

Common Pitfalls and Solutions {#pitfalls}

Pitfall 1: Overly Permissive Superset Roles

Problem: You create a “Viewer” role that can see all dashboards. Over time, dashboards accumulate sensitive data. Viewers now have access to data they shouldn’t.

Solution: Create separate roles for each data domain.

# Bad
Viewers (can see all dashboards)

# Good
Finance Viewers (can see finance dashboards)
Product Viewers (can see product dashboards)
Executive Viewers (can see executive dashboards)

Pitfall 2: Relying Only on Superset Permissions

Problem: You think Superset permissions are enough. But a user bypasses Superset and queries the warehouse directly. Superset permissions don’t apply.

Solution: Always implement warehouse-level RBAC as your primary security boundary. Superset permissions are secondary.

# Bad: Only Superset permissions
# Good: Warehouse RBAC (primary) + Superset permissions (secondary)

GRANT USAGE ON SCHEMA analytics.finance TO ROLE superset_finance;  # Primary
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.finance TO ROLE superset_finance;

# Then in Superset, restrict access to the finance database connection  # Secondary

Pitfall 3: Service Account Credentials in Code

Problem: You hardcode warehouse credentials in Superset config or commit them to Git. A developer with repository access can steal credentials.

Solution: Use a secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.).

# Bad
SUPERSET_WAREHOUSE_PASSWORD = "P@ssw0rd123"

# Good
import boto3
secrets_client = boto3.client('secretsmanager')

secret = secrets_client.get_secret_value(SecretId='superset-warehouse-password')
SUPERSET_WAREHOUSE_PASSWORD = secret['SecretString']

Pitfall 4: No Audit Trail

Problem: You have Superset permissions and warehouse RBAC, but no logs. When a data breach happens, you can’t prove who accessed what.

Solution: Enable comprehensive audit logging and retain logs for 1+ years.

# In superset_config.py
FALSE_POSITIVE_LOGGING = True
LOG_RETENTION_DAYS = 365

# Log to external system
import logging
audit_logger = logging.getLogger('superset.audit')
audit_logger.addHandler(CloudWatchLogHandler(log_group='superset-audit'))

Pitfall 5: Orphaned Permissions

Problem: An employee leaves. You deactivate their Superset account, but they still have warehouse access (different team manages it). They can still query data.

Solution: Implement a centralized identity provider (LDAP/SAML) and sync permissions across systems.

# In superset_config.py
AUTH_TYPE = AUTH_LDAP
AUTH_LDAP_SYNC_USER_INFO = True  # Sync on every login

# When user is disabled in LDAP, Superset automatically removes access

Pitfall 6: dbt Models Without Ownership

Problem: You have dbt models but no metadata about who owns them. When a model breaks, nobody knows who to contact. When a user requests access, you can’t verify if they should have it.

Solution: Define ownership and access tiers in dbt models.

# In dbt_project.yml
models:
  finance:
    fct_revenue:
      config:
        meta:
          owner: "finance@company.com"
          access: "restricted"
          teams: ["finance", "executive"]

Next Steps: Audit-Ready Architecture {#next-steps}

Building Towards SOC 2 and ISO 27001

If you’re planning to pursue SOC 2 Type II or ISO 27001 compliance, your Superset + dbt security model is foundational. Auditors will ask:

How do you authenticate users? (SAML, LDAP, local users?)
How do you authorize access? (Role-based? Attribute-based?)
How do you audit access? (Logs? Retention? Integration?)
How do you manage secrets? (Rotation? Encryption?)
How do you handle offboarding? (Timelines? Verification?)

If your Superset + dbt setup is solid, you’re 50% of the way to compliance. The other 50% is infrastructure (network security, encryption, incident response, etc.).

For Security Audit | PADISO - SOC 2, ISO 27001 & GDPR Compliance, we typically help companies get audit-ready in 8-12 weeks using Vanta. The process is:

Assessment: Review current Superset + dbt setup, identify gaps.
Configuration: Implement missing controls (audit logging, RLS, etc.).
Automation: Wire Vanta to pull evidence from Superset, warehouse, dbt.
Testing: Run mock audits to identify remaining issues.
Remediation: Fix any gaps.
Audit: Pass SOC 2 / ISO 27001 audit.

Scaling Superset + dbt Across Regions

If you’re expanding internationally, consider regional deployments:

APAC region: Platform Development in Sydney | PADISO for Australia, Platform Development in New Zealand | PADISO for NZ, with Platform Development in Australia | PADISO for broader coverage.
North America: Platform Development in New York | PADISO for financial services, Platform Development in Washington, D.C. | PADISO for government/defence.
North America (Canada): Platform Development in Canada | PADISO for multi-tenant SaaS with ClickHouse analytics.
Broad coverage: Platform Development in United States | PADISO and Platform Development in Melbourne | PADISO for retail/insurance/health teams.

Each region has different compliance requirements (GDPR in EU, PIPEDA in Canada, IRAP in Australia). Your Superset + dbt security model should be flexible enough to adapt.

Automation and CI/CD

As you scale, automate security controls:

dbt CI/CD: Run dbt tests and lineage checks on every commit.
Superset CI/CD: Version-control dashboard definitions, deploy via Git.
Warehouse CI/CD: Automate role and permission creation.
Compliance CI/CD: Automatically run compliance checks (Vanta) and generate audit reports.

Getting Help

Securing Superset + dbt is complex. If you’re building this from scratch, consider partnering with a venture studio or platform engineering team.

At PADISO, we’ve built Superset + dbt security models for 50+ companies across finance, retail, health, and SaaS. We can help you:

Design a security architecture that scales.
Implement RBAC, RLS, audit logging, and compliance controls.
Automate access reviews, credential rotation, and offboarding.
Pass SOC 2 / ISO 27001 audits in 8-12 weeks using Vanta.

For Services | PADISO - CTO as a Service, Custom Software, AI & Automation, we offer fractional CTO leadership and co-build support for security and platform engineering.

Start with a Security Audit | PADISO - SOC 2, ISO 27001 & GDPR Compliance to assess your current state and identify gaps. Then, build towards compliance with confidence.

Summary

Apache Superset + dbt security is layered, not monolithic. You need:

Authentication: Who are you? (SAML, LDAP)
Superset Authorization: What can you access in Superset? (Role-based permissions)
Warehouse Authorization: What can you query? (RBAC, RLS)
dbt Governance: What models are you supposed to use? (Metadata, ownership, access tiers)
Audit Logging: Who accessed what, when, and why? (Comprehensive logs, 1+ year retention)
Operational Habits: Regular access reviews, offboarding checklists, credential rotation

When all five layers work together, you have a security model that:

Scales from 5 users to 500.
Passes SOC 2 and ISO 27001 audits.
Prevents data leakage via defence in depth.
Enables fast, safe analytics for every team.

Start with warehouse-level RBAC (your hardest boundary), then layer Superset permissions and audit logging. Automate access reviews and credential rotation. Document everything for compliance.

If you’re ready to build this, or need help securing your existing Superset + dbt stack, reach out to PADISO: AI Solutions & Strategic Leadership — AIR Bootcamps | SOC2 & ISO27001 via Vanta or check out our Case Studies | PADISO - Real Results for Real Businesses to see how we’ve helped other companies get secure and compliant.

Want to talk through your situation?

Book a 30-minute call with Kevin (Founder/CEO). No pitch — direct advice on what to do next.

Book a 30-min call