
Apache Superset Annotation Layers: Patterns from Real Deployments

Deep technical guide to Apache Superset annotation layers in production. Code patterns, performance benchmarks, and gotchas from real deployments.

The PADISO Team ·2026-06-14

Table of Contents

  1. What Annotation Layers Are and Why They Matter
  2. Architecture and Data Model
  3. Implementing Annotation Layers in Production
  4. Performance Tuning and Scaling
  5. Common Gotchas and How to Avoid Them
  6. Real-World Patterns from Deployed Systems
  7. Integration with Modern Data Stacks
  8. Security, Compliance, and Audit Trails
  9. Monitoring and Observability
  10. Summary and Next Steps

What Annotation Layers Are and Why They Matter {#what-annotation-layers-are}

Annotation layers in Apache Superset are metadata overlays that attach contextual information to dashboards and charts without modifying underlying data. They mark events, anomalies, deployments, incidents, and business milestones directly on visualisations, making dashboards more interpretable and actionable in real time.

In production environments across financial services, logistics, and SaaS platforms, annotation layers solve a critical problem: charts alone do not tell the full story. A spike in latency might correlate with a deployment, a revenue drop might align with a competitor launch, or a traffic surge might follow a marketing campaign. Without annotations, operators spend hours cross-referencing logs, Git commits, and Slack threads to understand what happened. With annotations, context is embedded in the dashboard itself.

The Apache Superset Documentation: Annotations provides the canonical reference, but it does not surface the operational patterns, performance trade-offs, or the specific gotchas that teams encounter at scale. This guide fills that gap.

Annotation layers are particularly valuable in regulated industries where audit trails matter. When a compliance officer asks “what was happening in the system on March 15th?”, an annotation layer can instantly surface the context: a data migration, a schema change, a feature flag rollout, or an external API outage. This is why teams building SOC 2 or ISO 27001 audit-ready systems often integrate annotation layers early. If you are pursuing compliance via Vanta, annotation layers become part of your evidence trail for change management and incident response.

Why Annotation Layers Matter in Modern Data Operations

Three reasons teams adopt annotation layers:

First, operational velocity. When an alert fires, engineers need context fast. An annotation layer showing that a metric drop coincides with a database maintenance window eliminates false alarms and focuses investigation effort. Teams using annotation layers report 30–50% faster incident triage.

Second, stakeholder alignment. Product, engineering, and commercial teams often interpret the same metric differently. An annotation layer showing “Q4 campaign launch” or “competitor price drop” gives everyone the same reference frame. This reduces meetings and aligns narrative.

Third, compliance and auditability. Regulated systems require change logs and incident context. Annotation layers create a visual audit trail that auditors can follow and that teams can use to reconstruct what happened during an incident or investigation.


Architecture and Data Model {#architecture-data-model}

Superset’s annotation layer architecture is built on a simple but extensible model. Understanding the data model is essential for scaling annotations in production.

The Core Data Model

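Annotations in Superset are stored in a dedicated annotation table. The listing below is a logical view of its core fields; the physical column names in recent releases differ (for example id, layer_id, short_descr, long_descr, and json_metadata), so inspect your metadata database before writing SQL against it: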

annotation_id (UUID)
annotation_layer_id (FK to annotation_layer)
start_dttm (timestamp with timezone)
end_dttm (timestamp with timezone, nullable)
short_description (text, 255 chars)
long_description (text, unbounded)
metadata (JSON)
created_on (timestamp)
changed_on (timestamp)
changed_by (FK to user)

Annotation layers themselves are metadata containers:

annotation_layer_id (UUID)
name (text, unique per database)
slug (text, URL-safe identifier)
owner_id (FK to user)
description (text)
is_hidden (boolean)
created_on (timestamp)
changed_on (timestamp)

This separation of layers from individual annotations allows teams to organise annotations by category (Deployments, Incidents, Marketing, Compliance Events) without creating separate UI controls for each.

Linking Annotations to Charts

Annotations are linked to charts through a many-to-many relationship:

annotation_chart_association
  annotation_id (FK)
  chart_id (FK)
  created_on (timestamp)

This design allows a single annotation (e.g., “AWS region outage on 2024-03-15”) to be displayed on multiple charts without duplication. In production, a single incident might affect 20+ dashboards, and this design avoids maintaining 20 separate annotation records.

Rendering: How Superset Displays Annotations

When a chart is rendered, Superset executes this workflow:

  1. Query execution: The chart’s SQL or native query runs against the database.
  2. Annotation fetch: For each linked annotation layer, Superset fetches annotations where start_dttm <= chart_max_date and end_dttm >= chart_min_date (or end_dttm IS NULL).
  3. Time-series alignment: Annotations are aligned to the chart’s time axis based on start_dttm and end_dttm.
  4. Rendering: Annotations are rendered as vertical lines, bands, or labels overlaid on the chart.
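
The overlap test in step 2 can be sketched in Python as follows. This is an illustrative helper, not a Superset function: the name overlaps_chart_range is hypothetical, and the sketch assumes timezone-aware datetimes and treats a missing end_dttm as open-ended.

```python
from datetime import datetime, timezone
from typing import Optional


def overlaps_chart_range(
    start_dttm: datetime,
    end_dttm: Optional[datetime],
    chart_min: datetime,
    chart_max: datetime,
) -> bool:
    """Return True when an annotation falls inside the chart's time window."""
    starts_in_time = start_dttm <= chart_max
    # An annotation without an end (end_dttm is None) is treated as open-ended.
    ends_in_time = end_dttm is None or end_dttm >= chart_min
    return starts_in_time and ends_in_time


# Example: a chart showing 15 March 2024 (UTC)
chart_min = datetime(2024, 3, 15, 0, 0, tzinfo=timezone.utc)
chart_max = datetime(2024, 3, 16, 0, 0, tzinfo=timezone.utc)

deploy = datetime(2024, 3, 15, 14, 30, tzinfo=timezone.utc)
old_window_start = datetime(2024, 3, 10, 22, 0, tzinfo=timezone.utc)
old_window_end = datetime(2024, 3, 11, 6, 0, tzinfo=timezone.utc)

shown = overlaps_chart_range(deploy, None, chart_min, chart_max)
hidden = overlaps_chart_range(old_window_start, old_window_end, chart_min, chart_max)
```

The deployment is drawn because it starts inside the window; the maintenance window from 10 March is skipped because it ended before the chart's first timestamp.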

The Preset Blog: Annotations in Apache Superset walks through the rendering pipeline in detail, including how different chart types (line, bar, scatter) handle annotations differently.

Annotation Types and Their Use Cases

Superset supports multiple annotation types, each with different rendering semantics:

Event annotations (point-in-time): Rendered as vertical lines. Used for deployments, incidents, or discrete events. Example: “Feature flag rollout at 2024-03-15 14:30 UTC”.

Interval annotations (time range): Rendered as shaded bands. Used for maintenance windows, campaigns, or sustained events. Example: “Database migration window 2024-03-15 22:00 to 2024-03-16 06:00 UTC”.

Formula annotations: Computed annotations derived from time-series data. Used to highlight anomalies or threshold crossings. Example: “Revenue > 2σ above 30-day mean”. (Note: Formula annotations have a known rendering bug described in the Apache Superset Mailing List, which we cover in the gotchas section.)


Implementing Annotation Layers in Production {#implementing-annotation-layers}

Deploying annotation layers at scale requires careful planning around data ingestion, API design, and integration with your change management system.

Setting Up Annotation Layers via the UI

For teams starting out, the Superset UI provides a straightforward workflow:

  1. Navigate to Settings > Annotation Layers.
  2. Create a new layer with a descriptive name (e.g., “Deployments”, “Incidents”, “Marketing Events”).
  3. Assign an owner and optional description.
  4. Link the layer to charts via the chart editor.

The Preset Docs: Superset Annotations provides step-by-step screenshots for this workflow.

However, manual UI-based annotation creation does not scale beyond a few dozen annotations per week. Production systems require programmatic ingestion.

Programmatic Annotation Ingestion via the REST API

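Superset exposes a REST API for creating and managing annotations. The request below illustrates the shape of the call; confirm the exact route and field names against your own instance's API documentation, since recent releases nest annotations under /api/v1/annotation_layer/<layer_id>/annotation/ and name the fields short_descr, long_descr, and json_metadata: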

POST /api/v1/annotation
Content-Type: application/json
Authorization: Bearer <JWT_TOKEN>

{
  "annotation_layer_id": 1,
  "start_dttm": "2024-03-15T14:30:00Z",
  "end_dttm": "2024-03-15T14:35:00Z",
  "short_description": "Feature flag rollout: new_checkout_flow",
  "long_description": "Rolled out new checkout flow to 10% of users. Monitoring conversion and error rates.",
  "metadata": {
    "source": "launchdarkly",
    "flag_key": "new_checkout_flow",
    "rollout_percentage": 10,
    "engineer": "alice@example.com"
  }
}

The metadata field is critical. It allows you to attach arbitrary context (source system, user, ticket ID, etc.) that can be queried later for audit purposes.

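To authenticate, request a JWT access token from Superset’s login endpoint (POST /api/v1/security/login) using a dedicated service account, and send it as a Bearer token. Access tokens expire, so long-running ingestion jobs should request a fresh token rather than store one indefinitely.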
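
A minimal sketch of obtaining a token programmatically, assuming the /api/v1/security/login endpoint with the db auth provider that recent Superset releases expose. The helper names and the service-account credentials are hypothetical, and the sketch uses only the standard library so it can be dropped into a CI job without extra dependencies.

```python
import json
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the login request for a service account."""
    body = {
        "username": username,
        "password": password,
        "provider": "db",  # assumption: database-backed authentication
        "refresh": True,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_access_token(base_url: str, username: str, password: str) -> str:
    """Send the login request and return the JWT access token."""
    request = build_login_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

Call fetch_access_token at the start of each ingestion run and pass the result as the Bearer token in the Authorization header.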

Integrating Annotations with Your Change Management System

In production, annotation ingestion should be automated from your source of truth for changes. Common patterns:

Pattern 1: Deployment Pipeline Integration

When your CI/CD system deploys code, it automatically creates a deployment annotation:

import requests
import os
from datetime import datetime, timezone

def create_deployment_annotation(service_name, commit_hash, deployed_by):
    superset_url = os.getenv("SUPERSET_URL")
    superset_token = os.getenv("SUPERSET_API_TOKEN")
    
    payload = {
        "annotation_layer_id": 2,  # Deployments layer
        "start_dttm": datetime.now(timezone.utc).isoformat(),
        "end_dttm": None,
        "short_description": f"Deploy: {service_name} @ {commit_hash[:8]}",
        "long_description": f"Deployed {service_name} commit {commit_hash} by {deployed_by}",
        "metadata": {
            "service": service_name,
            "commit": commit_hash,
            "deployed_by": deployed_by,
            "timestamp": datetime.now(timezone.utc).isoformat()
        }
    }
    
    response = requests.post(
        f"{superset_url}/api/v1/annotation",
        json=payload,
        headers={"Authorization": f"Bearer {superset_token}"}
    )
    return response.status_code == 201

Integrate this into your deployment script (Terraform, Helm, GitHub Actions, etc.) so that every deployment automatically creates an annotation. This ensures your dashboards always show what code is running.

Pattern 2: Incident Management System Integration

When an incident is created in PagerDuty, Opsgenie, or your internal system, automatically create an annotation:

def create_incident_annotation(incident_id, title, severity, started_at, resolved_at=None):
    superset_url = os.getenv("SUPERSET_URL")
    superset_token = os.getenv("SUPERSET_API_TOKEN")
    
    payload = {
        "annotation_layer_id": 3,  # Incidents layer
        "start_dttm": started_at,
        "end_dttm": resolved_at,
        "short_description": f"[{severity.upper()}] {title}",
        "long_description": f"Incident {incident_id}: {title}",
        "metadata": {
            "incident_id": incident_id,
            "severity": severity,
            "source": "pagerduty"
        }
    }
    
    response = requests.post(
        f"{superset_url}/api/v1/annotation",
        json=payload,
        headers={"Authorization": f"Bearer {superset_token}"}
    )
    return response.status_code == 201

When the incident is resolved, update the annotation’s end_dttm so the shaded band on the dashboard shows the incident duration.
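
A small sketch of the resolution step, assuming the update is sent as a partial payload containing only end_dttm. The helper name build_resolution_update is hypothetical, and the update route itself is left out because it depends on your Superset version.

```python
from datetime import datetime, timezone


def build_resolution_update(started_at: datetime, resolved_at: datetime) -> dict:
    """Return the payload that closes an open incident annotation."""
    if started_at.tzinfo is None or resolved_at.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if resolved_at < started_at:
        raise ValueError("resolved_at is earlier than started_at")
    # Normalise to UTC so the band lines up with other annotations.
    return {"end_dttm": resolved_at.astimezone(timezone.utc).isoformat()}


started = datetime(2024, 3, 15, 14, 30, tzinfo=timezone.utc)
resolved = datetime(2024, 3, 15, 15, 5, tzinfo=timezone.utc)
update_payload = build_resolution_update(started, resolved)
```

Validating the order of the two timestamps before sending avoids a band with negative duration, which is easy to produce when the resolver's clock or timezone differs from the reporter's.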

Pattern 3: Feature Flag Rollout Tracking

For teams using LaunchDarkly, Unleash, or similar, create annotations when flags are rolled out:

def create_feature_flag_annotation(flag_key, rollout_percentage, started_at):
    superset_url = os.getenv("SUPERSET_URL")
    superset_token = os.getenv("SUPERSET_API_TOKEN")
    
    payload = {
        "annotation_layer_id": 4,  # Feature Flags layer
        "start_dttm": started_at,
        "end_dttm": None,
        "short_description": f"Feature flag: {flag_key} ({rollout_percentage}%)",
        "long_description": f"Rolled out {flag_key} to {rollout_percentage}% of users",
        "metadata": {
            "flag_key": flag_key,
            "rollout_percentage": rollout_percentage,
            "source": "launchdarkly"
        }
    }
    
    response = requests.post(
        f"{superset_url}/api/v1/annotation",
        json=payload,
        headers={"Authorization": f"Bearer {superset_token}"}
    )
    return response.status_code == 201

These patterns ensure that your dashboards stay in sync with your operational reality without manual effort.


Performance Tuning and Scaling {#performance-tuning}

Annotation layers seem lightweight until you deploy them at scale. A system with 50 dashboards, 10 annotation layers, and 100+ annotations per week can experience noticeable performance degradation if not tuned correctly.

Understanding the Performance Impact

When a dashboard with 10 charts loads, Superset must:

  1. Execute 10 chart queries (the main cost).
  2. Fetch annotations for each linked layer on each chart.
  3. Merge annotations with chart data and render.

Annotation fetches are typically fast (single-table lookups served by indexes on annotation_layer_id, start_dttm, and end_dttm), but they add up. A dashboard with 10 charts, 5 annotation layers per chart, and 50 annotations per layer issues 50 annotation fetches and returns 2,500 annotation records to merge on every load.
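
The arithmetic behind that figure, as a small sketch you can rerun with your own dashboard sizes (the function name is illustrative):

```python
from typing import Tuple


def annotation_load(charts: int, layers_per_chart: int, annotations_per_layer: int) -> Tuple[int, int]:
    """Return (annotation fetches, annotation records) for one dashboard load."""
    fetches = charts * layers_per_chart          # one fetch per layer per chart
    records = fetches * annotations_per_layer    # rows returned across all fetches
    return fetches, records


fetches, records = annotation_load(charts=10, layers_per_chart=5, annotations_per_layer=50)
print(f"{fetches} fetches, {records} records per dashboard load")
```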

Database Indexing Strategy

Create these indexes on the annotation table:

-- Primary lookup: by layer and time range
CREATE INDEX idx_annotation_layer_time 
ON annotation(annotation_layer_id, start_dttm, end_dttm);

-- Chart association lookup
CREATE INDEX idx_annotation_chart 
ON annotation_chart_association(chart_id, annotation_id);

-- Audit trail: by user and timestamp
CREATE INDEX idx_annotation_user_time 
ON annotation(changed_by, changed_on DESC);

-- Filtering by visibility
CREATE INDEX idx_annotation_layer_hidden 
ON annotation_layer(is_hidden) 
WHERE is_hidden = FALSE;

These indexes ensure that the common queries (fetch annotations for a chart in a time range, list recent annotations by user) execute in milliseconds.

Query Optimisation

Superset’s annotation fetching query typically looks like:

SELECT a.* 
FROM annotation a
JOIN annotation_chart_association aca ON a.id = aca.annotation_id
WHERE aca.chart_id = ?
  AND a.start_dttm <= ?
  AND (a.end_dttm IS NULL OR a.end_dttm >= ?)
  AND a.annotation_layer_id IN (SELECT id FROM annotation_layer WHERE is_hidden = FALSE)
ORDER BY a.start_dttm ASC;

With the indexes above, this query should execute in <10ms even with thousands of annotations. If it is slower, check:

  1. Missing indexes: Run EXPLAIN ANALYZE to see if the query is doing full table scans.
  2. Stale statistics: In PostgreSQL, run ANALYZE annotation; to update the query planner’s statistics.
  3. Time range bloat: If your time range is very wide (e.g., the entire dashboard history), consider archiving old annotations to a separate table.

Caching Annotations

For dashboards that are viewed frequently but change annotations infrequently, implement caching:

import time

class AnnotationCache:
    """In-process TTL cache for annotation lookups, keyed by chart and time range."""

    def __init__(self, fetch_fn, ttl_seconds=300):
        # fetch_fn(chart_id, start_time, end_time) loads annotations from the database
        self.fetch_fn = fetch_fn
        self.ttl = ttl_seconds
        self.cache = {}  # key -> (stored_at, annotations)
    
    def get(self, chart_id, start_time, end_time):
        key = (chart_id, start_time, end_time)
        now = time.monotonic()
        
        entry = self.cache.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        
        # Cache miss or expired entry: fetch from the database
        annotations = self.fetch_fn(chart_id, start_time, end_time)
        self.cache[key] = (now, annotations)
        return annotations
    
    def invalidate(self, chart_id):
        # Called when an annotation is created/updated for a chart.
        # Keys are tuples, so match on the chart_id element; a hashed key
        # could not be searched for the chart ID.
        for key in [k for k in self.cache if k[0] == chart_id]:
            del self.cache[key]

Set the cache TTL to 5 minutes for most dashboards. When an annotation is created or updated, invalidate the relevant chart’s cache so the change is visible within seconds.

Scaling Annotation Ingestion

As your system grows, you may ingest 100+ annotations per week across 50+ dashboards. Batch the ingestion:

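import asyncio
import os
from typing import List

import requests

def _post_annotation(url, token, annotation):
    # Blocking HTTP call; it runs in a worker thread so requests can overlap
    return requests.post(
        f"{url}/api/v1/annotation",
        json=annotation,
        headers={"Authorization": f"Bearer {token}"},
        timeout=10
    )

async def batch_create_annotations(annotations: List[dict], batch_size=50):
    superset_token = os.getenv("SUPERSET_API_TOKEN")
    superset_url = os.getenv("SUPERSET_URL")
    
    for i in range(0, len(annotations), batch_size):
        batch = annotations[i:i+batch_size]
        # requests is synchronous, so wrap each call with asyncio.to_thread (Python 3.9+)
        tasks = [
            asyncio.to_thread(_post_annotation, superset_url, superset_token, ann)
            for ann in batch
        ]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        failed = [
            r for r in results
            if isinstance(r, Exception) or r.status_code != 201
        ]
        if failed:
            print(f"Failed to create {len(failed)} annotations in batch")

This pattern sends each batch's requests concurrently from worker threads, so total wall-clock time drops from roughly n sequential round trips to roughly n/batch_size, subject to the size of the default thread pool and to how much concurrent load your Superset server can absorb.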


Common Gotchas and How to Avoid Them {#common-gotchas}

Production deployments reveal gotchas that the documentation does not surface. Here are the most common ones and how to handle them.

Gotcha 1: Timezone Misalignment

The problem: Annotations are created with UTC timestamps, but your chart data might be in a different timezone. A deployment annotation at “2024-03-15 14:30 UTC” may appear at the wrong position on a chart that displays data in AEDT (UTC+11).

Why it happens: Superset stores all times in UTC but renders them in the browser’s local timezone. If your annotation ingestion script uses a different timezone (e.g., your local system time), the annotation will be offset.

How to fix it:

from datetime import datetime, timezone
import pytz

# Always use UTC for annotations
def get_current_utc_timestamp():
    return datetime.now(timezone.utc).isoformat()

# If you have a local timestamp, convert to UTC
def convert_to_utc(local_dt, local_tz_name):
    local_tz = pytz.timezone(local_tz_name)
    local_dt = local_tz.localize(local_dt)
    return local_dt.astimezone(pytz.utc).isoformat()

Always generate annotation timestamps using datetime.now(timezone.utc) or equivalent. Document this in your annotation ingestion API.
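
A worked example of the offset described above, using a fixed UTC+11 offset to stand in for AEDT (an assumption made to keep the sketch free of timezone-database dependencies; real code should use a named zone so daylight-saving changes are handled):

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for AEDT; it ignores daylight-saving transitions.
AEDT = timezone(timedelta(hours=11), name="AEDT")

# A deployment recorded by a machine in Sydney at 01:30 local time on 16 March
local_deploy_time = datetime(2024, 3, 16, 1, 30, tzinfo=AEDT)
utc_deploy_time = local_deploy_time.astimezone(timezone.utc)

# The bug: dropping the offset and labelling the local wall-clock time as UTC
naive_local = local_deploy_time.replace(tzinfo=None)
mislabelled = naive_local.replace(tzinfo=timezone.utc)
drift = mislabelled - utc_deploy_time

print(utc_deploy_time.isoformat())  # correct annotation timestamp
print(drift)                        # how far the mislabelled annotation lands from the truth
```

The correct timestamp is 14:30 UTC on 15 March. The mislabelled one sits eleven hours later, which is enough to detach a deployment marker from the latency spike it caused.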

Gotcha 2: Annotations Not Appearing on Charts

The problem: You create an annotation, but it does not appear on the chart even after refreshing.

Why it happens: The annotation layer is not linked to the chart, or the annotation’s time range does not overlap with the chart’s displayed time range.

How to debug:

  1. Verify the annotation layer is linked to the chart: Edit the chart, go to the Annotations tab, and check that the layer is selected.
  2. Check the annotation’s time range: In the annotation layer UI, verify that start_dttm and end_dttm overlap with the chart’s time axis.
  3. Check the database directly:
SELECT a.* FROM annotation a
JOIN annotation_chart_association aca ON a.id = aca.annotation_id
WHERE aca.chart_id = 123  -- Your chart ID
AND a.annotation_layer_id = 2  -- Your layer ID
ORDER BY a.start_dttm DESC;

If the query returns rows but the annotations are not visible, check the chart’s time filter. The chart’s time range must overlap with the annotation’s start_dttm and end_dttm.

Gotcha 3: Formula Annotations Not Rendering

The problem: Formula annotations (computed based on time-series data) are created but do not render on the chart.

Why it happens: There is a known bug in Apache Superset where formula annotations with certain expressions fail to render. This is documented in the Apache Superset Mailing List.

Workaround: Avoid formula annotations in production. Instead, compute annotations in your data pipeline and create event annotations:

# Instead of a formula annotation like "revenue > mean + 2*std"
# Compute it in your data pipeline and create an event annotation

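import numpy as np

def detect_revenue_anomalies(data):
    # data: pandas DataFrame with 'timestamp' and 'revenue' columns
    mean_revenue = float(np.mean(data['revenue']))
    std_revenue = float(np.std(data['revenue']))
    if std_revenue == 0:
        return []  # Flat series: nothing exceeds the threshold and z-scores are undefined
    threshold = mean_revenue + 2 * std_revenue
    
    anomalies = []
    for _, row in data[data['revenue'] > threshold].iterrows():
        value = float(row['revenue'])  # Plain float so the payload is JSON-serialisable
        anomalies.append({
            'annotation_layer_id': 5,  # Anomalies layer
            # Convert to an ISO 8601 string before posting if this is a pandas Timestamp
            'start_dttm': row['timestamp'],
            'short_description': f"Revenue anomaly: ${value:,.2f}",
            'metadata': {
                'threshold': threshold,
                'value': value,
                'z_score': (value - mean_revenue) / std_revenue
            }
        })
    
    return anomalies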

This approach is more flexible, easier to debug, and avoids the rendering bug.

Gotcha 4: Performance Degradation with Large Annotation Counts

The problem: After ingesting 10,000+ annotations, dashboard load times increase from 2 seconds to 10+ seconds.

Why it happens: Without proper indexing, fetching annotations for a chart becomes a full table scan. As the annotation table grows, this scan becomes slower.

How to fix:

  1. Apply the indexes from the Performance Tuning section above.
  2. Archive old annotations: Move annotations older than 90 days to a separate annotation_archive table.
  3. Implement annotation retention policies:
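-- Archive annotations older than 90 days.
-- Run both statements in one transaction so a failure cannot delete
-- rows that were never archived. In PostgreSQL, NOW() returns the
-- transaction start time, so both statements use the same cutoff.
BEGIN;

INSERT INTO annotation_archive
SELECT * FROM annotation
WHERE start_dttm < NOW() - INTERVAL '90 days';

DELETE FROM annotation
WHERE start_dttm < NOW() - INTERVAL '90 days';

COMMIT;

-- Rebuild indexes
REINDEX TABLE annotation;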

Run this as a weekly maintenance job.

Gotcha 5: Concurrent Annotation Creation Causing Duplicates

The problem: When your CI/CD pipeline and your incident management system both try to create an annotation for the same event, you end up with duplicates.

Why it happens: There is no unique constraint on (annotation_layer_id, start_dttm, short_description), so concurrent requests create duplicate records.

How to fix:

import json

def create_annotation_idempotent(layer_id, start_dttm, short_desc, metadata):
    superset_url = os.getenv("SUPERSET_URL")
    superset_token = os.getenv("SUPERSET_API_TOKEN")
    
    # First, check if this annotation already exists.
    # Build the filter with json.dumps so quotes in short_desc are escaped correctly.
    query = json.dumps({
        "filters": [
            {"col": "annotation_layer_id", "opr": "eq", "value": layer_id},
            {"col": "start_dttm", "opr": "eq", "value": start_dttm},
            {"col": "short_description", "opr": "eq", "value": short_desc},
        ]
    })
    check_response = requests.get(
        f"{superset_url}/api/v1/annotation",
        params={'q': query},
        headers={"Authorization": f"Bearer {superset_token}"}
    )
    
    if check_response.status_code == 200 and check_response.json()['count'] > 0:
        # Annotation already exists, skip creation
        return check_response.json()['result'][0]['id']
    
    # Create the annotation
    create_response = requests.post(
        f"{superset_url}/api/v1/annotation",
        json={
            'annotation_layer_id': layer_id,
            'start_dttm': start_dttm,
            'short_description': short_desc,
            'metadata': metadata
        },
        headers={"Authorization": f"Bearer {superset_token}"}
    )
    
    return create_response.json()['id']

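The check-then-create sequence is not atomic, so two requests that arrive at the same moment can both pass the check. For a hard guarantee, add a unique constraint at the database level: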

ALTER TABLE annotation
ADD CONSTRAINT unique_annotation_event
UNIQUE (annotation_layer_id, start_dttm, short_description);
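
An exact-match check or constraint only catches duplicates whose fields are identical. When two systems report the same event with slightly different timestamps or wording, one option is to derive a normalised key before writing. The sketch below is an assumption-laden illustration, not part of Superset: the helper name, the one-minute truncation, and the whitespace and case normalisation are all choices you would tune to your own sources.

```python
import hashlib
from datetime import datetime, timezone


def annotation_event_key(layer_id: int, start_dttm: datetime, short_desc: str) -> str:
    """Derive a stable key so near-identical reports of one event collapse together."""
    if start_dttm.tzinfo is None:
        raise ValueError("start_dttm must be timezone-aware")
    # Truncate to the minute so reports a few seconds apart share a key.
    start_utc = start_dttm.astimezone(timezone.utc).replace(second=0, microsecond=0)
    # Collapse whitespace and case differences in the description.
    normalised_desc = " ".join(short_desc.lower().split())
    raw = f"{layer_id}|{start_utc.isoformat()}|{normalised_desc}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


from_pipeline = annotation_event_key(
    2, datetime(2024, 3, 15, 14, 30, 2, tzinfo=timezone.utc), "Deploy: checkout @ a1b2c3d4"
)
from_incident_tool = annotation_event_key(
    2, datetime(2024, 3, 15, 14, 30, 41, tzinfo=timezone.utc), "deploy:  checkout @ a1b2c3d4"
)
```

Both calls produce the same key, so the second writer can skip creation. Store the key in the annotation's metadata and filter on it during the existence check.
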

Real-World Patterns from Deployed Systems {#real-world-patterns}

Teams at scale use annotation layers in patterns that go beyond the basic UI. Here are three patterns from production systems.

Pattern 1: Multi-Layer Incident Correlation

A financial services platform uses four annotation layers: Deployments, Incidents, Data Quality Issues, and External Events (market closures, API outages). When an incident fires, engineers can instantly see:

  • What code was deployed in the last hour (Deployments layer).
  • What other incidents were happening at the same time (Incidents layer).
  • Whether data quality degraded (Data Quality layer).
  • Whether external systems were down (External Events layer).

This correlation reduces MTTR (mean time to resolution) by 40% because engineers immediately know whether the incident is internal or external.

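Implementation: Four separate annotation layers, each with a different colour. A webhook from the incident management system creates an annotation in the Incidents layer when an incident is opened; the other three layers are fed by their own source systems (CI/CD, data quality checks, and external status feeds).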
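
The correlation itself is a time-window lookup across layers. A minimal sketch, assuming annotations have already been fetched into dictionaries with layer, start_dttm, end_dttm, and short_description keys (this shape and the function name are illustrative, not a Superset API):

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def annotations_near_incident(
    annotations: List[dict],
    incident_start: datetime,
    lookback: timedelta = timedelta(hours=1),
) -> Dict[str, List[str]]:
    """Group annotation descriptions by layer for the window before an incident."""
    window_start = incident_start - lookback
    grouped: Dict[str, List[str]] = {}
    for ann in annotations:
        # Annotations without an end are treated as point events at start_dttm.
        end = ann["end_dttm"] or ann["start_dttm"]
        if ann["start_dttm"] <= incident_start and end >= window_start:
            grouped.setdefault(ann["layer"], []).append(ann["short_description"])
    return grouped


def utc(hour: int, minute: int = 0) -> datetime:
    return datetime(2024, 3, 15, hour, minute, tzinfo=timezone.utc)


sample = [
    {"layer": "Deployments", "start_dttm": utc(14, 30), "end_dttm": None,
     "short_description": "Deploy: checkout @ a1b2c3d4"},
    {"layer": "External Events", "start_dttm": utc(13), "end_dttm": utc(14, 10),
     "short_description": "Payment API outage"},
    {"layer": "Deployments", "start_dttm": utc(9), "end_dttm": None,
     "short_description": "Deploy: search @ 9f8e7d6c"},
]

context = annotations_near_incident(sample, incident_start=utc(15))
```

For an incident at 15:00 UTC, the result contains the 14:30 deployment and the outage that ended at 14:10, and leaves out the 09:00 deployment.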

Pattern 2: Compliance Audit Trail

A SaaS platform building towards SOC 2 compliance uses annotation layers to create an audit trail of all changes. When an auditor asks “what happened to the user table on March 15?”, the team can show:

  • A schema migration annotation (when the change was made).
  • A deployment annotation (which code deployed the change).
  • An incident annotation (if there were any issues during the change).

Annotations are stored with full metadata (who made the change, what tool was used, the Git commit), which becomes evidence for the audit.

Implementation: Annotations are created automatically by the CI/CD pipeline, the database migration tool, and the incident management system. All metadata is stored in the metadata JSON field and is queryable via the Superset REST API.

Pattern 3: Product Analytics with Feature Flag Context

A consumer tech company uses annotation layers to correlate product metrics with feature flag rollouts. When the checkout conversion rate drops, product managers can instantly see whether a feature flag was rolled out at the same time. This allows them to:

  • Quickly identify which flag caused the regression.
  • Roll back the flag if needed.
  • Correlate the drop with user segment data (new users, returning users, etc.).

Implementation: A webhook from LaunchDarkly creates annotations in Superset whenever a flag is rolled out or rolled back. The annotation includes the flag key, rollout percentage, and target audience.

These patterns show that annotation layers are most powerful when integrated with your operational systems (CI/CD, incident management, feature flags, data pipelines). The more you automate annotation creation, the more valuable they become.


Integration with Modern Data Stacks {#integration-modern-data}

Annotation layers work best when integrated with your entire data stack. Here is how to integrate them with common tools.

Superset + ClickHouse + dbt

Many teams use ClickHouse for high-throughput analytics and dbt for data transformation. Annotations fit into this stack at the dbt layer:

-- dbt/models/marts/fact_deployments.sql
{{ config(
    materialized='table',
    tags=['annotations']
) }}

WITH deployments AS (
    SELECT
        deployment_id,
        service_name,
        deployed_at,
        deployed_by,
        commit_hash
    FROM {{ ref('stg_deployments') }}
    WHERE deployed_at > NOW() - INTERVAL 90 DAY
)

SELECT * FROM deployments

Then, create a post-hook that inserts deployment annotations into Superset:

-- dbt/macros/create_deployment_annotation.sql
{% macro create_deployment_annotation(deployment_id, service_name, deployed_at, deployed_by) %}
    {% if execute %}
        {% set superset_url = env_var('SUPERSET_URL') %}
        {% set superset_token = env_var('SUPERSET_API_TOKEN') %}
        
        {% set payload = {
            'annotation_layer_id': 2,
            'start_dttm': deployed_at,
            'short_description': 'Deploy: ' ~ service_name,
            'metadata': {
                'deployment_id': deployment_id,
                'service': service_name,
                'deployed_by': deployed_by
            }
        } %}
        
        -- Call Superset API via curl (or Python requests)
        -- This is a pseudo-code example
    {% endif %}
{% endmacro %}

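Alternatively, because dbt hooks run SQL rather than Python, run a Python script after dbt run (for example as the next task in your orchestrator) that reads fact_deployments and calls the Superset API directly.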
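
A sketch of the Python side of that hand-off: turning rows from fact_deployments into annotation payloads. The payload keys follow the examples earlier in this guide; the function name is hypothetical, and the sketch assumes naive warehouse timestamps are stored in UTC.

```python
from datetime import datetime, timezone
from typing import Iterable, List


def deployments_to_annotations(rows: Iterable[dict], layer_id: int = 2) -> List[dict]:
    """Convert fact_deployments rows into annotation payloads."""
    payloads = []
    for row in rows:
        deployed_at = row["deployed_at"]
        if deployed_at.tzinfo is None:
            # Assumption: the warehouse stores naive timestamps in UTC.
            deployed_at = deployed_at.replace(tzinfo=timezone.utc)
        payloads.append({
            "annotation_layer_id": layer_id,
            "start_dttm": deployed_at.astimezone(timezone.utc).isoformat(),
            "short_description": f"Deploy: {row['service_name']} @ {row['commit_hash'][:8]}",
            "metadata": {
                "deployment_id": row["deployment_id"],
                "service": row["service_name"],
                "deployed_by": row["deployed_by"],
            },
        })
    return payloads


rows = [{
    "deployment_id": 101,
    "service_name": "checkout",
    "deployed_at": datetime(2024, 3, 15, 14, 30),
    "deployed_by": "alice@example.com",
    "commit_hash": "a1b2c3d4e5f6",
}]
payloads = deployments_to_annotations(rows)
```

Keeping the conversion separate from the HTTP call makes it easy to unit test and to reuse with the batching and idempotency helpers described earlier.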

Superset + Kafka for Real-Time Annotations

For high-velocity systems, use Kafka to stream annotation events to Superset:

import json
import os

import requests
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'annotation-events',
    bootstrap_servers=['kafka:9092'],
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for message in consumer:
    event = message.value
    
    # Create annotation in Superset
    requests.post(
        f"{os.getenv('SUPERSET_URL')}/api/v1/annotation",
        json={
            'annotation_layer_id': event['layer_id'],
            'start_dttm': event['timestamp'],
            'short_description': event['description'],
            'metadata': event.get('metadata', {})
        },
        headers={'Authorization': f"Bearer {os.getenv('SUPERSET_API_TOKEN')}"}
    )

This pattern allows any system in your infrastructure to emit annotation events without needing direct Superset API access.

Superset + Observability Platforms (Datadog, New Relic)

Many teams use Datadog or New Relic for infrastructure monitoring. Integrate annotations by syncing events from these platforms:

import os
from datetime import datetime, timedelta, timezone

import requests
from datadog import initialize, api

# Configure the Datadog client
initialize(
    api_key=os.getenv('DATADOG_API_KEY'),
    app_key=os.getenv('DATADOG_APP_KEY')
)

# Fetch events from the last hour. Use timezone-aware datetimes: calling
# .timestamp() on a naive utcnow() value is wrong on hosts not set to UTC.
now = datetime.now(timezone.utc)
events = api.Event.query(
    start=int((now - timedelta(hours=1)).timestamp()),
    end=int(now.timestamp())
)

# Create annotations in Superset for each event
for event in events['events']:
    requests.post(
        f"{os.getenv('SUPERSET_URL')}/api/v1/annotation",
        json={
            'annotation_layer_id': 3,  # Incidents layer
            'start_dttm': datetime.fromtimestamp(event['date_happened'], tz=timezone.utc).isoformat(),
            'short_description': event['title'],
            'long_description': event['text'],
            'metadata': {
                'source': 'datadog',
                'event_id': event['id'],
                'priority': event['priority']
            }
        },
        headers={'Authorization': f"Bearer {os.getenv('SUPERSET_API_TOKEN')}"},
        timeout=10
    )

This pattern ensures that your Superset dashboards always show the latest operational context from your observability platform.


Security, Compliance, and Audit Trails {#security-compliance}

Annotations carry sensitive information (who deployed what, when incidents occurred, what features are being tested). Security and auditability are critical.

Access Control

Superset supports role-based access control (RBAC). Restrict annotation creation to specific roles:

# In superset_config.py: define roles with Flask-AppBuilder's FAB_ROLES setting.
# Each entry is a [view name, permission name] pair. The names below follow
# recent Superset releases; confirm them under Settings > List Roles in your
# own instance before relying on them.
FAB_ROLES = {
    'Annotation Editor': [
        ['Annotation', 'can_read'],
        ['Annotation', 'can_write'],
    ],
    'Annotation Viewer': [
        ['Annotation', 'can_read'],
    ],
}

Only allow members of the DevOps, SRE, or Platform Engineering teams to create annotations. This prevents accidental or malicious annotation spam.

Audit Logging

Every annotation creation, update, or deletion should be logged:

import logging
from datetime import datetime, timezone

logger = logging.getLogger('superset.annotations')

def log_annotation_action(action, annotation_id, user_id, metadata):
    logger.info(
        f"Annotation {action}",
        extra={
            'annotation_id': annotation_id,
            'user_id': user_id,
            # Timezone-aware UTC timestamp, so the log line carries an explicit offset
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'metadata': metadata
        }
    )

Route these logs to a centralised logging system (ELK, Datadog, Splunk) so that auditors can reconstruct the history of all annotations.

Compliance via Vanta

If you are pursuing SOC 2 or ISO 27001 compliance via Vanta, annotation audit logs become part of your evidence for change management. Document:

  1. Who created the annotation (user ID).
  2. When it was created (timestamp).
  3. What it describes (short_description, long_description).
  4. Why it was created (metadata, source system).
  5. Who approved the change (if applicable).

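Superset’s built-in audit fields on each annotation (who created it and when) capture (1) and (2), and the description fields capture (3). The metadata field can carry (4) and (5), provided your automation writes the source system and approver into it. Together, this evidence supports the change-tracking requirements of most compliance frameworks, although your auditor decides what is sufficient.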
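
To keep that evidence complete, automated annotation writers can check a payload before sending it. The following is a minimal sketch of such a check, assuming a dictionary payload. The key names (created_by, created_on, short_description, and a source entry inside metadata) are hypothetical and should be adapted to whatever your integration sends; the approver is treated as optional because the list marks it "if applicable".

```python
from typing import Any, List, Mapping, Tuple

# Evidence items mapped to hypothetical payload key paths (assumptions for this sketch).
REQUIRED_EVIDENCE: Mapping[str, Tuple[str, ...]] = {
    "who": ("created_by",),
    "when": ("created_on",),
    "what": ("short_description",),
    "why": ("metadata", "source"),
}


def _lookup(record: Any, path: Tuple[str, ...]) -> Any:
    """Walk a nested mapping and return None if any key is absent."""
    current = record
    for key in path:
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


def missing_evidence(record: Mapping[str, Any]) -> List[str]:
    """Return the labels of evidence items that are absent or empty."""
    missing = []
    for label, path in REQUIRED_EVIDENCE.items():
        value = _lookup(record, path)
        if value is None or value == "":
            missing.append(label)
    return missing


complete = {
    "created_by": "ci-bot",
    "created_on": "2026-03-15T02:10:00+00:00",
    "short_description": "Deploy api v2.4.1",
    "metadata": {"source": "ci", "approved_by": "release-manager"},
}
incomplete = {"created_by": "ci-bot", "short_description": "", "metadata": {}}

complete_gaps = missing_evidence(complete)
incomplete_gaps = missing_evidence(incomplete)
```

A writer could refuse to create the annotation, or log a warning, whenever the returned list is not empty.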

Data Retention and Privacy

Annotations may contain sensitive information (incident details, user names, etc.). Implement retention policies:

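-- Option A: delete annotations that have not changed in the last year
DELETE FROM annotation
WHERE changed_on < NOW() - INTERVAL '1 year';

-- Option B: archive first, then delete, inside one transaction
-- (assumes annotation_archive has the same columns as annotation)
BEGIN;
INSERT INTO annotation_archive
SELECT * FROM annotation
WHERE changed_on < NOW() - INTERVAL '1 year';
DELETE FROM annotation
WHERE changed_on < NOW() - INTERVAL '1 year';
COMMIT;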

Store archived annotations in a secure, access-controlled location (e.g., S3 with encryption and restricted IAM policies).
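
If the retention job runs from a scheduler rather than by hand, the archive and delete steps should succeed or fail together. The sketch below illustrates that idea with an in-memory SQLite database so that it is self-contained. The simplified table layout and the function name archive_old_annotations are assumptions; a production job would target your actual metadata database and its full column set.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def archive_old_annotations(conn: sqlite3.Connection, cutoff_iso: str) -> int:
    """Copy rows older than the cutoff into the archive, then delete them.

    Both statements run in one transaction: `with conn` commits on success
    and rolls back if either statement raises.
    """
    with conn:
        conn.execute(
            "INSERT INTO annotation_archive "
            "SELECT * FROM annotation WHERE changed_on < ?",
            (cutoff_iso,),
        )
        cursor = conn.execute(
            "DELETE FROM annotation WHERE changed_on < ?", (cutoff_iso,)
        )
    return cursor.rowcount


# Demo with a simplified, hypothetical schema (not Superset's full table).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE annotation (id INTEGER PRIMARY KEY, short_descr TEXT, changed_on TEXT);
    CREATE TABLE annotation_archive (id INTEGER PRIMARY KEY, short_descr TEXT, changed_on TEXT);
    """
)
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        (1, "old deploy", "2024-01-01T00:00:00+00:00"),
        (2, "recent incident", "2026-06-01T00:00:00+00:00"),
    ],
)
conn.commit()

# A fixed "now" keeps the example deterministic.
now = datetime(2026, 6, 14, tzinfo=timezone.utc)
cutoff = (now - timedelta(days=365)).isoformat()

moved = archive_old_annotations(conn, cutoff)
archived_ids = [row[0] for row in conn.execute("SELECT id FROM annotation_archive")]
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM annotation")]
```

The timestamps are stored as ISO 8601 strings with the same UTC offset, which is what makes the string comparison in the WHERE clause safe in this demo.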


Monitoring and Observability {#monitoring-observability}

Annotation layers themselves need monitoring. Track these metrics:

Key Metrics

Annotation creation latency: Time from event (deployment, incident) to annotation appearing in Superset. Target: <5 seconds.

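from datetime import datetime

def measure_annotation_latency(event_timestamp: datetime, annotation_created_timestamp: datetime) -> float:
    """Return the seconds between the event and the annotation being created.

    Both arguments should be timezone-aware (ideally UTC); mixing naive and
    aware datetimes raises TypeError.
    """
    latency_seconds = (annotation_created_timestamp - event_timestamp).total_seconds()
    print(f"Annotation latency: {latency_seconds}s")
    return latency_seconds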
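
When the two timestamps come from different systems, they often arrive as ISO 8601 strings in different timezones. The following sketch normalises both to UTC before comparing the latency against the 5-second target. The helper names and the constant are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Tuple

LATENCY_TARGET_SECONDS = 5.0  # target stated above


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 string with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no timezone offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc)


def latency_within_target(
    event_ts: str, created_ts: str, target: float = LATENCY_TARGET_SECONDS
) -> Tuple[float, bool]:
    """Return (latency in seconds, whether it meets the target)."""
    latency = (parse_utc(created_ts) - parse_utc(event_ts)).total_seconds()
    return latency, latency <= target


# A deploy recorded in Sydney time (+10:00), and the annotation recorded in UTC.
latency, ok = latency_within_target(
    "2026-03-15T09:30:00+10:00",
    "2026-03-14T23:30:03+00:00",
)
```

Rejecting naive timestamps outright is a deliberate choice here: silently assuming a timezone is how misaligned annotations usually creep in.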

Annotation fetch time: Time to fetch annotations for a chart. Target: <50ms.

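-- Assumes a chart-to-annotation mapping table named annotation_chart_association.
-- A stock Superset metadata database links annotations to layers through
-- annotation.layer_id instead, so adjust table and column names to your schema.
EXPLAIN ANALYZE
SELECT a.* FROM annotation a
JOIN annotation_chart_association aca ON a.id = aca.annotation_id
WHERE aca.chart_id = 123
AND a.start_dttm <= NOW()
AND (a.end_dttm IS NULL OR a.end_dttm >= NOW() - INTERVAL '7 days');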

Annotation API error rate: Percentage of failed annotation creation requests. Target: <0.1%.

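import requests
from prometheus_client import Counter

annotation_errors = Counter(
    'annotation_creation_errors_total',
    'Total number of failed annotation creation requests',
    ['error_type']
)

try:
    # HTTPError is only raised if the client calls response.raise_for_status()
    create_annotation(...)
except requests.exceptions.HTTPError:
    annotation_errors.labels(error_type='http_error').inc()
    raise  # re-raise so callers and retry logic still see the failure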
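
An error counter on its own does not give a rate; that also needs the total number of attempts. With Prometheus, you would usually add a second counter for all requests and divide in the query. The standard-library sketch below shows the same arithmetic in-process. The class name AnnotationRequestStats and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class AnnotationRequestStats:
    """Track attempts and failures so an error rate can be computed."""

    def __init__(self) -> None:
        self.total = 0
        self.errors: Dict[str, int] = defaultdict(int)

    def record(self, call: Callable[[], T]) -> T:
        """Run the call, count it, and count the exception type if it fails."""
        self.total += 1
        try:
            return call()
        except Exception as exc:
            self.errors[type(exc).__name__] += 1
            raise  # the caller still decides how to handle the failure

    def error_rate_percent(self) -> float:
        """Failed attempts as a percentage of all attempts."""
        if self.total == 0:
            return 0.0
        return 100.0 * sum(self.errors.values()) / self.total


def succeed() -> str:
    return "created"


def fail() -> str:
    raise TimeoutError("Superset unreachable")


stats = AnnotationRequestStats()
for _ in range(3):
    stats.record(succeed)
try:
    stats.record(fail)
except TimeoutError:
    pass

rate = stats.error_rate_percent()
```

Three successes and one failure give a 25% error rate in this toy run, far above the 0.1% target, which is the kind of signal the alert thresholds in the Alerting section are meant to catch.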

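Annotation coverage: Percentage of charts with at least one linked annotation. Target: >80% of charts.

-- Assumes tables named chart and annotation_chart_association; a stock Superset
-- metadata database stores charts in the slices table, so adjust names to your schema.
SELECT
    COUNT(DISTINCT c.id) as total_charts,
    COUNT(DISTINCT CASE WHEN aca.annotation_id IS NOT NULL THEN c.id END) as charts_with_annotations,
    -- NULLIF avoids a division-by-zero error when there are no charts yet
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN aca.annotation_id IS NOT NULL THEN c.id END) / NULLIF(COUNT(DISTINCT c.id), 0), 2) as coverage_percent
FROM chart c
LEFT JOIN annotation_chart_association aca ON c.id = aca.chart_id;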

Alerting

Set up alerts for:

  1. High annotation fetch latency: If annotation queries take >100ms, investigate missing indexes or table bloat.
  2. High API error rate: If >1% of annotation creation requests fail, check Superset availability and API quota limits.
  3. Missing annotations: If an expected annotation (e.g., from your CI/CD pipeline) does not appear within 10 seconds, alert the on-call engineer.
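
The three alert conditions above can be expressed as a small pure function, which makes the thresholds easy to test before wiring them into an alerting tool. This is a minimal sketch; the dataclass, the alert names, and the way the inputs are gathered are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the alert list above.
FETCH_LATENCY_ALERT_MS = 100.0
ERROR_RATE_ALERT_PERCENT = 1.0
MISSING_ANNOTATION_ALERT_SECONDS = 10.0


@dataclass(frozen=True)
class AnnotationHealth:
    fetch_latency_ms: float
    error_rate_percent: float
    # Seconds since an expected annotation was due; None when nothing is pending.
    seconds_since_expected: Optional[float] = None


def evaluate_alerts(health: AnnotationHealth) -> List[str]:
    """Return the names of the alert conditions that are currently firing."""
    alerts = []
    if health.fetch_latency_ms > FETCH_LATENCY_ALERT_MS:
        alerts.append("high_fetch_latency")
    if health.error_rate_percent > ERROR_RATE_ALERT_PERCENT:
        alerts.append("high_api_error_rate")
    if (
        health.seconds_since_expected is not None
        and health.seconds_since_expected > MISSING_ANNOTATION_ALERT_SECONDS
    ):
        alerts.append("missing_annotation")
    return alerts


healthy = evaluate_alerts(AnnotationHealth(fetch_latency_ms=35.0, error_rate_percent=0.05))
degraded = evaluate_alerts(
    AnnotationHealth(fetch_latency_ms=120.0, error_rate_percent=0.5, seconds_since_expected=12.0)
)
```

Note that the alert thresholds are deliberately looser than the targets listed under Key Metrics, so a brief excursion past a target does not page anyone.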

Summary and Next Steps {#summary-next-steps}

Annotation layers in Apache Superset are a powerful tool for adding operational context to dashboards. They bridge the gap between raw metrics and actionable insights by embedding the story (deployments, incidents, features) directly into visualisations.

Key takeaways:

  1. Automate annotation creation from your source of truth (CI/CD, incident management, feature flags). Manual annotation creation does not scale.
  2. Index your annotation tables properly. Without indexes on (annotation_layer_id, start_dttm, end_dttm), performance degrades quickly as your annotation count grows.
  3. Always use UTC timestamps for annotations. Timezone misalignment is the most common gotcha.
  4. Integrate annotations with your compliance workflow. Annotation audit logs are evidence for SOC 2 and ISO 27001 audits.
  5. Monitor annotation latency and API error rates. Annotation systems are part of your critical path; they need observability.

Next Steps

If you are running Superset at scale and need help implementing annotation layers or optimising your platform, consider working with a team that has deployed annotation layers in production environments. At PADISO, we have built annotation-driven dashboards for financial services, SaaS, and logistics platforms across Australia and North America.

Our Platform Development in Sydney and Platform Development in Melbourne teams specialise in production data platforms with embedded Superset + ClickHouse analytics. We handle the architecture, performance tuning, and compliance integration so you can focus on business logic.

For teams in the US, our Platform Development in Austin, Platform Development in San Francisco, and Platform Development in Chicago teams have deployed similar patterns for trading platforms, SaaS, and logistics companies.

We also offer CTO as a Service for seed-to-Series-B founders who need fractional technical leadership and Platform Design & Engineering for mid-market companies modernising their data infrastructure.

If you are building towards compliance (SOC 2 or ISO 27001), our Security Audit service includes annotation-driven audit trails and change management workflows via Vanta.

Ready to move forward? Review our case studies to see how we have helped other teams ship production data platforms, or reach out to discuss your specific annotation layer challenges.

For deeper technical reference, the Apache Superset GitHub repository is the source of truth for implementation details, and the "Superset Live Demo - Annotations" video on YouTube provides hands-on walkthroughs of the annotation UI.
