Table of Contents
- Why Embedded Analytics Matter in Manufacturing
- Understanding Apache Superset for Embedded Deployments
- Data Modelling Foundations for Manufacturing Analytics
- Dashboard Design Principles for Manufacturing
- Implementing Embedded Superset in Your Product
- Rollout Patterns and Operational Excellence
- Security, Governance, and Compliance
- Common Pitfalls and How to Avoid Them
- Summary and Next Steps
Why Embedded Analytics Matter in Manufacturing {#why-embedded-analytics}
Manufacturing organisations sit on mountains of operational data—production line metrics, asset utilisation, supply chain movements, quality inspection results, and equipment downtime logs. Yet most organisations never unlock the value locked in those datasets. They either export data to Excel, run monthly PDF reports that arrive too late to act on, or force customers to buy separate BI tools at $2,000–$5,000 per seat per year.
Embedded analytics flips this model. Instead of asking customers to leave your product and open a separate tool, you embed real-time dashboards directly into the application they already use every day. A plant manager sees production efficiency metrics without switching context. A supply chain director tracks supplier performance without licensing additional software. A customer sees their own operational KPIs in real time, embedded in your SaaS platform.
The business case is clear: embedded analytics reduce customer churn, unlock upsell opportunities, and differentiate your product in crowded markets. In manufacturing specifically, where operational visibility directly impacts profitability, embedding analytics into your platform becomes table stakes.
Apache Superset is the open-source tool that makes this possible at scale. Unlike closed-source BI platforms, Superset offers embedding capabilities via its official SDK, costs a fraction of per-seat tools, and runs on your infrastructure—critical for manufacturing organisations with data sovereignty and compliance requirements.
Understanding Apache Superset for Embedded Deployments {#understanding-superset}
What Makes Superset Suitable for Embedded Analytics
Apache Superset is a modern, open-source data visualisation platform that has matured significantly since its inception. It is a top-level Apache Software Foundation project with substantial enterprise adoption, and it runs well on cloud-native infrastructure. What makes it particularly suited to embedded manufacturing analytics is its modular architecture, SQL-native query builder, and first-class embedding support.
Unlike monolithic BI tools, Superset separates concerns cleanly: the API layer, the query engine, the visualisation layer, and the embedding SDK are independently deployable and scalable. You can run Superset on Kubernetes, scale the query backend independently, and embed dashboards into your product without forcing customers to log into a separate interface.
For manufacturing, this matters because your data model is often complex—multiple data sources (MES systems, ERP platforms, IoT sensors, quality management systems), varying update frequencies, and strict latency requirements. Superset’s SQL-first approach means your analysts write queries against a unified semantic layer, not proprietary drag-and-drop interfaces. When your data model evolves (and it will), you update SQL, not the tool itself.
The Embedded SDK and Guest Tokens
Superset’s embedding story centres on two components: the embedded SDK and guest tokens. The official embedding documentation explains the full technical flow, but here’s the operational reality.
When a customer logs into your manufacturing SaaS platform, your backend generates a short-lived guest token (typically valid for 15–60 minutes) that grants that specific customer access to their own dashboards. You embed the dashboard URL with the token into an iframe or use the JavaScript SDK for deeper integration. The customer sees real-time analytics without ever authenticating to Superset directly. Your application controls what data each customer can see through row-level security (RLS) rules that Superset applies to every query the dashboard runs.
This separation is crucial. Your product remains the system of record for user identity and access control. Superset becomes a pure analytics engine. If a customer loses access to your product, they lose access to embedded analytics immediately—no dangling Superset accounts to clean up.
Deployment Architecture Considerations
For manufacturing organisations, deployment topology matters. You’ll typically run Superset in one of three patterns:
Single-tenant Superset (one instance per customer). Rare, expensive, and only justifiable if a customer has extreme data volume or compliance isolation requirements. Most manufacturing organisations don’t need this.
Multi-tenant Superset with row-level security. All customers share one Superset instance. Row-level security rules in SQL ensure each customer sees only their own data. This is the standard pattern and what we focus on throughout this guide.
Hybrid: Superset cluster with customer-specific data warehouses. Large organisations with multiple business units or regional operations sometimes run a central Superset cluster but point different dashboards at different data warehouses. This adds complexity but enables truly independent scaling.
For most manufacturing organisations shipping embedded analytics to customers, multi-tenant Superset with RLS is the right choice. It’s operationally simpler, cost-effective, and scales to hundreds or thousands of customers without architectural redesign.
Data Modelling Foundations for Manufacturing Analytics {#data-modelling}
Why Data Modelling Precedes Dashboard Design
The most common mistake teams make is designing dashboards first, then trying to model data to fit those dashboards. This leads to brittle, slow, unmaintainable analytics. The correct sequence is: data model first, dashboards second.
A sound data model for manufacturing embedded analytics requires you to think about three layers: raw data ingestion, a semantic layer (dimensional or fact-table model), and the presentation layer (dashboards). Superset sits primarily at the presentation layer, but your success depends entirely on the semantic layer beneath it.
Manufacturing data is inherently hierarchical and temporal. You have assets (machines, production lines, facilities), operations (production runs, orders, inspections), and events (downtime, quality failures, maintenance). Your data model must reflect this structure, not fight it.
Building a Dimensional Model for Manufacturing
A dimensional model (also called a star schema) is the proven pattern for manufacturing analytics. You have facts (measurable events: production output, downtime duration, defect count) and dimensions (descriptive attributes: machine ID, product type, shift, facility, operator).
Here’s a simplified example for an automotive parts manufacturer:
Fact table: production_events
- event_id (primary key)
- machine_id (foreign key to dim_machine)
- product_id (foreign key to dim_product)
- shift_id (foreign key to dim_shift)
- facility_id (foreign key to dim_facility)
- event_timestamp (when the event occurred)
- units_produced (measure)
- cycle_time_seconds (measure)
- defects_detected (measure)
- downtime_seconds (measure)
Dimension tables:
- dim_machine: machine_id, machine_name, line_id, equipment_type, installation_date, manufacturer
- dim_product: product_id, product_name, product_category, customer_id
- dim_shift: shift_id, shift_name, start_time, end_time
- dim_facility: facility_id, facility_name, city, region, country
- dim_date: date_id, calendar_date, year, month, day_of_week, is_holiday
This structure allows you to answer questions like:
- “What’s the average cycle time per machine, by shift, over the last 30 days?”
- “Which products have the highest defect rate?”
- “How much downtime did each facility experience last month, and why?”
Without this structure, you’ll write the same complex SQL joins and aggregations in every dashboard. With it, dashboards become simple queries against clean, consistently modelled fact and dimension tables.
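As a rough illustration of the first question above, the sketch below builds a cut-down version of the star schema in an in-memory SQLite database and computes average cycle time per machine and shift. SQLite stands in for the data warehouse, and the sample rows, machine names, and the `window_start` parameter are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_machine (machine_id INTEGER PRIMARY KEY, machine_name TEXT);
CREATE TABLE dim_shift (shift_id INTEGER PRIMARY KEY, shift_name TEXT);
CREATE TABLE production_events (
    event_id INTEGER PRIMARY KEY,
    machine_id INTEGER REFERENCES dim_machine(machine_id),
    shift_id INTEGER REFERENCES dim_shift(shift_id),
    event_timestamp TEXT,
    cycle_time_seconds REAL
);
-- Sample data, invented for the example.
INSERT INTO dim_machine VALUES (1, 'Press A'), (2, 'Press B');
INSERT INTO dim_shift VALUES (1, 'Day'), (2, 'Night');
INSERT INTO production_events VALUES
    (1, 1, 1, '2024-01-15 08:00:00', 40.0),
    (2, 1, 1, '2024-01-15 09:00:00', 44.0),
    (3, 1, 2, '2024-01-15 22:00:00', 50.0),
    (4, 2, 1, '2024-01-15 08:30:00', 30.0);
""")

# One fact table joined to two dimensions: the whole dashboard query.
AVG_CYCLE_TIME_SQL = """
SELECT m.machine_name, s.shift_name, AVG(f.cycle_time_seconds) AS avg_cycle_time
FROM production_events AS f
JOIN dim_machine AS m ON m.machine_id = f.machine_id
JOIN dim_shift AS s ON s.shift_id = f.shift_id
WHERE f.event_timestamp >= :window_start
GROUP BY m.machine_name, s.shift_name
ORDER BY m.machine_name, s.shift_name
"""

# window_start would normally be "now minus 30 days"; fixed here for repeatability.
rows = conn.execute(AVG_CYCLE_TIME_SQL, {"window_start": "2023-12-16 00:00:00"}).fetchall()
for row in rows:
    print(row)
```

The same query shape works for the other questions by swapping the measure and the dimensions in the GROUP BY.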
Handling Time-Series and Real-Time Data
Manufacturing analytics often requires real-time or near-real-time data. A production line manager needs to know current downtime status, not yesterday’s aggregates. This creates tension: dimensional models are traditionally batch-oriented (daily snapshots), but manufacturing demands streaming or sub-minute updates.
The solution is a hybrid approach:
- Ingest raw events into a streaming layer (Kafka, Kinesis, or a message queue). Each event is immutable: machine started, machine stopped, part inspected, defect detected.
- Maintain a real-time state table that tracks the current status of each asset (machine X is currently running, has been for 47 minutes, zero defects so far). Update this table as events arrive.
- Aggregate into dimensional tables on a schedule (every 5 minutes, every hour, every day, depending on your use case). These aggregations are immutable snapshots: “On 2024-01-15 at 14:00, machine X produced 240 units with 2 defects.”
- Embed Superset dashboards that query both layers. Real-time KPIs come from the state table (current production rate, current downtime). Historical trends come from dimensional aggregates (30-day average cycle time, month-over-month defect rate).
This pattern scales from small manufacturers (single facility, 10–20 machines) to large enterprises (multiple facilities, thousands of assets) without rearchitecting your analytics infrastructure.
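The hybrid pattern can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a streaming implementation: the event shape (`machine_id`, `type`, `ts`, `defect`) and the function names are hypothetical, and a real pipeline would read from the streaming layer and write to warehouse tables.

```python
from collections import defaultdict
from datetime import datetime


def bucket_start(ts, minutes=5):
    """Round a timestamp down to the start of its aggregation bucket."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def apply_events(events):
    """Fold immutable events into a current-state table and per-bucket aggregates."""
    state = {}  # machine_id -> latest known status (the "real-time state table")
    aggregates = defaultdict(lambda: {"units": 0, "defects": 0})
    for event in sorted(events, key=lambda e: e["ts"]):
        machine = event["machine_id"]
        if event["type"] in ("machine_started", "machine_stopped"):
            status = "running" if event["type"] == "machine_started" else "stopped"
            state[machine] = {"status": status, "since": event["ts"]}
        elif event["type"] == "part_inspected":
            key = (machine, bucket_start(event["ts"]))
            aggregates[key]["units"] += 1
            aggregates[key]["defects"] += 1 if event.get("defect") else 0
    return state, dict(aggregates)


events = [
    {"machine_id": "X", "type": "machine_started", "ts": datetime(2024, 1, 15, 14, 0, 5)},
    {"machine_id": "X", "type": "part_inspected", "ts": datetime(2024, 1, 15, 14, 1, 0), "defect": False},
    {"machine_id": "X", "type": "part_inspected", "ts": datetime(2024, 1, 15, 14, 3, 30), "defect": True},
    {"machine_id": "X", "type": "part_inspected", "ts": datetime(2024, 1, 15, 14, 6, 0), "defect": False},
]
state, aggregates = apply_events(events)
print(state["X"]["status"])
print(aggregates[("X", datetime(2024, 1, 15, 14, 0))])
```

Real-time tiles would read from `state`, while trend charts would read from the bucketed `aggregates` once they are written to the dimensional tables.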
Data Quality and Freshness Guarantees
Manufacturing data quality is non-negotiable. A single bad sensor reading or misconfigured MES export can corrupt an entire month’s analytics. Before you build dashboards, establish data quality rules.
For each fact table, define:
- Freshness SLA: How recent must data be? (e.g., production events within 5 minutes)
- Completeness rules: What percentage of events must arrive? (e.g., 99.5% of production events must be logged)
- Validity rules: What makes a record valid? (e.g., cycle_time_seconds must be > 0 and < 3600)
- Uniqueness rules: What should never be duplicated? (e.g., no two events with the same machine_id and event_timestamp)
Implement these as automated checks in your data pipeline. If data quality drops below thresholds, alert your ops team and mark dashboards as “stale” or “degraded” in the UI. Manufacturing leaders make decisions based on analytics; bad data leads to bad decisions.
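A minimal sketch of such checks, assuming events arrive as dictionaries with the fact-table column names used earlier. The thresholds are the example values from the list above; the function name and the `expected_count` input are hypothetical, and freshness is left out for brevity.

```python
def check_quality(events, expected_count):
    """Apply validity, uniqueness, and completeness rules to a batch of events."""
    seen_keys = set()
    invalid_ids, duplicate_ids = [], []
    for event in events:
        # Uniqueness: no two events share machine_id and event_timestamp.
        key = (event["machine_id"], event["event_timestamp"])
        if key in seen_keys:
            duplicate_ids.append(event["event_id"])
        seen_keys.add(key)
        # Validity: cycle_time_seconds must be > 0 and < 3600.
        if not 0 < event["cycle_time_seconds"] < 3600:
            invalid_ids.append(event["event_id"])
    # Completeness: share of expected events that actually arrived.
    completeness = len(events) / expected_count if expected_count else 0.0
    return {
        "invalid_event_ids": invalid_ids,
        "duplicate_event_ids": duplicate_ids,
        "completeness": completeness,
        "completeness_ok": completeness >= 0.995,
    }


batch = [
    {"event_id": 1, "machine_id": "M1", "event_timestamp": "2024-01-15T14:00:00", "cycle_time_seconds": 42.0},
    {"event_id": 2, "machine_id": "M2", "event_timestamp": "2024-01-15T14:00:00", "cycle_time_seconds": 39.5},
    {"event_id": 3, "machine_id": "M1", "event_timestamp": "2024-01-15T14:00:00", "cycle_time_seconds": 41.0},
    {"event_id": 4, "machine_id": "M3", "event_timestamp": "2024-01-15T14:00:05", "cycle_time_seconds": 0.0},
]
report = check_quality(batch, expected_count=4)
print(report)
```

A pipeline step could fail the load or raise an alert whenever either ID list is non-empty or `completeness_ok` is false.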
Dashboard Design Principles for Manufacturing {#dashboard-design}
The Role-Based Dashboard Strategy
Not every user needs the same dashboards. A plant manager cares about overall facility efficiency. A production engineer cares about individual machine performance. A supply chain director cares about supplier lead times and inventory turns. A customer (if you’re embedding analytics in your SaaS platform) cares about their own order status and delivery performance.
Design dashboards by role, not by data source. Start with these core manufacturing personas:
Plant Manager Dashboard
- Facility-wide KPIs: overall equipment effectiveness (OEE), production volume vs. target, downtime by category, quality metrics
- Alerts: machines below target OEE, unexpected downtime, quality failures
- Drill-down: ability to click into a specific line or machine for deeper investigation
Production Engineer Dashboard
- Machine-level detail: real-time cycle time, defect rate, downtime duration and reason, maintenance history
- Comparison: this machine vs. similar machines, this shift vs. historical average
- Action triggers: ability to log maintenance, adjust parameters, or escalate issues
Supply Chain Director Dashboard
- Supplier performance: on-time delivery %, lead time trends, quality issues by supplier
- Inventory status: days of supply by product, stock-out risk, obsolescence alerts
- Demand vs. capacity: production capacity utilisation, forecast vs. actual, bottleneck identification
Customer-Facing Dashboard (embedded in your SaaS)
- Order status: current order, expected completion, quality checks completed
- Delivery performance: on-time delivery history, lead time trends
- Cost visibility: cost per unit, total order cost, cost vs. budget
Each dashboard should fit on a single screen without scrolling (or minimal scrolling). If you need more than 6–8 visualisations to tell the story, split into multiple dashboards.
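Several of these dashboards lead with OEE, so it helps to pin down how the number is derived. The sketch below uses the standard definition (availability × performance × quality); the parameter names and the sample shift are assumptions for illustration, not values from a real plant.

```python
def oee(planned_seconds, downtime_seconds, ideal_cycle_seconds, units_produced, defects):
    """Overall equipment effectiveness as availability x performance x quality."""
    run_seconds = planned_seconds - downtime_seconds
    if planned_seconds <= 0 or run_seconds <= 0 or units_produced <= 0:
        return 0.0
    availability = run_seconds / planned_seconds
    performance = (ideal_cycle_seconds * units_produced) / run_seconds
    quality = (units_produced - defects) / units_produced
    return availability * performance * quality


# Hypothetical 8-hour shift: 48 minutes of downtime, 60 s ideal cycle time,
# 400 units produced, 8 of them defective.
shift_oee = oee(
    planned_seconds=8 * 3600,
    downtime_seconds=48 * 60,
    ideal_cycle_seconds=60,
    units_produced=400,
    defects=8,
)
print(f"OEE: {shift_oee:.1%}")
```

Computing the three factors separately is useful in practice, because a plant manager usually wants to know which factor is dragging the headline figure down.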
Choosing the Right Visualisation Types
Superset supports dozens of visualisation types. For manufacturing, focus on these core types:
Time-series line charts. Perfect for trend analysis: OEE over time, defect rate by week, production volume by day. Always include a target line (e.g., “target OEE = 85%”) for context.
Gauge or progress charts. Ideal for KPIs: current OEE, current downtime duration, production vs. target. Make the target explicit and use colour coding (green = on-track, yellow = warning, red = critical).
Heatmaps. Excellent for identifying patterns: machine performance by hour of day, defect rate by product type and shift, downtime frequency by day of week. Manufacturing leaders instantly spot anomalies in heatmaps.
Bar charts (horizontal). Use for comparisons: OEE by machine, defect rate by supplier, production volume by facility. Horizontal bars are easier to read when category names are long (machine descriptions, product names).
Pivot tables. Necessary for detailed analysis: production by machine by shift, defect count by product by quality code. Keep pivot tables to a maximum of 3 dimensions; beyond that, the data becomes unreadable.
Scatter plots. Useful for correlation analysis: cycle time vs. defect rate, temperature vs. downtime, operator experience vs. quality. Manufacturing engineers often spot root causes in scatter plots.
Avoid pie charts, 3D visualisations, and unnecessary decoration. Manufacturing data is serious; visualisations should be clear and functional.
Embedding Dashboards: Design for Context
When you embed dashboards into your manufacturing SaaS platform, design for seamless context switching. A customer should not feel like they’ve left your application.
Key design principles:
Remove chrome. Hide the Superset navigation bar, branding, and footer. The dashboard should feel like a native part of your product. The Preset embedded dashboard guide provides detailed guidance on customising embedded experiences.
Inherit styling. Match the dashboard colour scheme, typography, and spacing to your product. Superset’s theming system allows you to customise the entire visual appearance.
Contextualise filters. If a customer is viewing their own order, pre-filter the dashboard to show only their data. Use guest tokens with row-level security to enforce this at the database level; use URL parameters or embedded filters to enforce it at the UI level.
Provide export options. Manufacturing users often need to download data for reports, presentations, or further analysis. Ensure dashboards support CSV and PDF export.
Monitor embed performance. Embedded dashboards run in iframes; slow query performance is immediately visible to your customers. Establish SLA targets (e.g., dashboard loads in < 3 seconds, queries complete in < 10 seconds) and monitor them continuously.
Implementing Embedded Superset in Your Product {#implementing-embedded}
Architecture: Multi-Tenant Superset with Row-Level Security
Here’s the reference architecture for embedding Superset in a manufacturing SaaS platform:
Your SaaS Application
    |
    +-- Authentication Layer (your auth system)
    |   |
    |   +-- Generate guest token for logged-in user
    |   +-- Include customer_id in token claims
    |
    +-- Embed Layer (JavaScript SDK or iframe)
    |   |
    |   +-- Load dashboard with guest token
    |   +-- Pass customer_id as filter parameter
    |
    +-- Superset Instance (multi-tenant)
    |   |
    |   +-- Validate guest token
    |   +-- Enforce row-level security based on customer_id
    |   +-- Execute query against data warehouse
    |
    +-- Data Warehouse (Postgres, Snowflake, BigQuery, etc.)
        |
        +-- Fact tables (production_events, quality_events, etc.)
        +-- Dimension tables (machines, products, facilities, customers)
The critical security boundary is between Superset and the data warehouse. Row-level security rules in Superset ensure that when customer A requests a dashboard, only customer A’s data is returned, even if the underlying query could theoretically access all data.
Setting Up Row-Level Security in Superset
Superset’s row-level security (RLS) system works by injecting WHERE clauses into queries based on user attributes. Here’s how to set it up:
- Define RLS rules in Superset’s admin interface. For example:
  - Rule name: “Customer isolation”
  - Clause: customer_id = '{{ current_username() }}' (Superset’s Jinja macros are called as functions and need template processing enabled)
  - Tables: production_events, quality_events, orders, shipments
- Tie the guest token to the customer. When your backend requests a guest token, set the guest user’s username to the customer ID, or attach the clause directly through the request’s rls field:
  { "user": { "username": "customer_12345" }, "rls": [{ "clause": "customer_id = 'customer_12345'" }] }
  Superset sets the token’s expiry itself when it signs the token.
- Superset validates the token and enforces RLS. When the dashboard query runs, Superset injects the RLS clause: WHERE customer_id = 'customer_12345'.
This ensures that even if a customer tries to manipulate the request or bypass filters, they cannot see data from other customers. The security is enforced at the database query level, not just the UI level.
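To make the effect of clause injection concrete, here is a small simulation using an in-memory SQLite database. It is a conceptual sketch only: `apply_rls` is a hypothetical helper that mimics the outcome, not Superset's internal implementation, and the table contents are invented.

```python
import sqlite3


def apply_rls(base_query: str, clause: str) -> str:
    """Wrap a dataset query so the RLS clause filters every row it returns."""
    return f"SELECT * FROM ({base_query}) AS rls_scoped WHERE {clause}"


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE production_events (customer_id TEXT, machine_id TEXT, units_produced INTEGER);
INSERT INTO production_events VALUES
    ('customer_12345', 'M1', 240),
    ('customer_12345', 'M2', 180),
    ('customer_67890', 'M9', 999);
""")

# The dashboard's own query has no customer filter at all.
dashboard_query = "SELECT customer_id, machine_id, units_produced FROM production_events"

# The clause comes from the server-side rule or token, never from the browser.
scoped_query = apply_rls(dashboard_query, "customer_id = 'customer_12345'")
rows = conn.execute(scoped_query).fetchall()
for row in rows:
    print(row)
```

Because the filter is added on the server after the dashboard query is built, removing or changing UI filters in the browser cannot widen the result set.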
Generating Guest Tokens Securely
Guest tokens are the trust boundary between your application and Superset. Generate them securely:
- Use a service account. Create a Superset user account with minimal permissions (guest token generation only). Authenticate to Superset using this account, never a human user account.
- Call the guest token API. Your backend makes a POST request to Superset’s /api/v1/security/guest_token/ endpoint, authenticated as the service account.
- Include the right claims. Pass the guest user, the dashboards the token may open, and the RLS clauses needed for isolation:
  { "user": { "username": "customer_12345", "first_name": "Customer", "last_name": "12345" }, "resources": [{ "type": "dashboard", "id": "<dashboard embed ID>" }], "rls": [{ "clause": "customer_id = 'customer_12345'" }] }
- Set a short expiry. Guest tokens should expire in 15–60 minutes; Superset takes the lifetime from its GUEST_TOKEN_JWT_EXP_SECONDS setting. If a user needs a new token, they request it from your application. This limits the window for token compromise.
- Cache tokens carefully. Don’t cache tokens across user sessions. Generate a new token for each page load or session. If you cache, use a short TTL (time-to-live) and invalidate on logout.
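The steps above can be sketched as a small backend helper. This is a hedged outline using only the standard library: the request body follows the shape Superset's guest token endpoint expects (`user`, `resources`, `rls`), but the helper names, the customer ID format check, and the placeholder dashboard ID are assumptions. Obtaining the service account's access token is not shown, and some deployments also require a CSRF token.

```python
import json
import re
import urllib.request

SUPERSET_URL = "https://superset.yourcompany.com"  # domain used in this guide's examples


def build_guest_token_payload(customer_id: str, dashboard_embed_id: str) -> dict:
    """Build the guest token request body for one customer and one dashboard."""
    # The ID is interpolated into SQL, so reject anything outside a strict allow-list.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", customer_id):
        raise ValueError("customer_id contains unexpected characters")
    return {
        "user": {"username": f"guest_{customer_id}", "first_name": "Customer", "last_name": customer_id},
        "resources": [{"type": "dashboard", "id": dashboard_embed_id}],
        "rls": [{"clause": f"customer_id = '{customer_id}'"}],
    }


def request_guest_token(access_token: str, payload: dict) -> str:
    """POST the payload to Superset and return the signed guest token."""
    request = urllib.request.Request(
        f"{SUPERSET_URL}/api/v1/security/guest_token/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {access_token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["token"]


# Building the payload needs no network access, so it is easy to unit test.
payload = build_guest_token_payload("customer_12345", "hypothetical-embed-id")
print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call means the isolation logic (the `rls` clause) can be tested without a running Superset instance.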
Embedding via iframe vs. JavaScript SDK
Superset supports two embedding approaches:
iframe embedding. Simplest approach. You create an iframe pointing to the Superset dashboard URL with the guest token:
<iframe
  src="https://superset.yourcompany.com/superset/dashboard/123/?guest_token=abc123"
  width="100%"
  height="600"
  frameborder="0"
></iframe>
Pros: Simple, no JavaScript required, works everywhere. Cons: Limited customisation, harder to coordinate interactions between dashboard and host page.
JavaScript SDK embedding. More powerful. You use Superset’s SDK to load dashboards programmatically, customise styling, and listen for events:
import { embedDashboard } from '@superset-ui/embedded-sdk';

// embedDashboard() renders the dashboard inside mountPoint.
embedDashboard({
  id: '123', // the embed ID Superset generates when embedding is enabled for the dashboard
  supersetDomain: 'https://superset.yourcompany.com',
  mountPoint: document.getElementById('dashboard-container'),
  // Called by the SDK whenever it needs a guest token from your backend.
  fetchGuestToken: () =>
    fetch('/api/guest-token', { method: 'POST', credentials: 'include' })
      .then((r) => r.json())
      .then((data) => data.token),
});
Pros: Full customisation, can coordinate with host app, better performance. Cons: Requires JavaScript, more complex setup.
For most manufacturing SaaS applications, start with iframe embedding. It’s simpler to implement and debug. As your embedded analytics mature and you need deeper integration (e.g., clicking a machine name in the dashboard to navigate to the machine detail page in your app), migrate to the JavaScript SDK.
Performance Optimisation for Embedded Dashboards
Embedded dashboards must be fast. If a dashboard takes 10 seconds to load, customers will perceive your entire product as slow, even if the slowness is in Superset.
Query optimisation is the foundation. Before you worry about caching or infrastructure, ensure your underlying queries are efficient:
- Index all foreign keys and filter columns in your fact tables.
- Partition large fact tables by date (e.g., monthly partitions for production_events).
- Pre-aggregate common metrics (e.g., daily OEE by machine) in a separate table, queried for historical trends.
- Use approximate aggregations (e.g., HyperLogLog for unique count) when exact precision is not required.
Enable Superset’s cache layer. Superset can cache query results for a configurable TTL (time-to-live). For manufacturing dashboards, a 5-minute cache is often appropriate: data is fresh enough for decision-making, but repeated views of the same dashboard don’t hit the database.
Configure cache settings in Superset’s config:
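# DATA_CACHE_CONFIG caches chart query results; CACHE_CONFIG covers Superset metadata.
DATA_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/1',
    'CACHE_DEFAULT_TIMEOUT': 300,  # 5 minutes
    'CACHE_KEY_PREFIX': 'superset_data_',
}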
Use Superset’s query result cache. This is different from the chart cache. When multiple dashboards query the same underlying data, Superset can reuse the result set. Configure this in the database connection settings.
Monitor query performance. Set up alerts if dashboard queries exceed your SLA (e.g., > 10 seconds). Use Superset’s native query performance logging or integrate with a monitoring tool like Datadog or New Relic.
Rollout Patterns and Operational Excellence {#rollout-patterns}
The Phased Rollout Approach
Don’t launch all embedded analytics at once. A phased rollout reduces risk and lets you iterate based on customer feedback.
Phase 1: Internal pilot (weeks 1–4)
- Deploy Superset to a staging environment.
- Create dashboards for your internal team (product, engineering, support).
- Test RLS, performance, and token generation at small scale.
- Gather feedback and iterate on dashboard design.
Phase 2: Early customer access (weeks 5–8)
- Deploy Superset to production.
- Onboard 5–10 early-adopter customers.
- Embed one or two core dashboards (e.g., order status, production efficiency).
- Monitor performance, gather feedback, fix bugs.
- Measure adoption: how many customers log in, how often, which dashboards are used most.
Phase 3: Gradual rollout (weeks 9–16)
- Enable embedded analytics for 25% of your customer base.
- Add more dashboards based on Phase 2 feedback.
- Establish support processes: how do customers request new dashboards, report issues, etc.
- Document best practices for your support team.
Phase 4: Full rollout (week 16+)
- Enable for all customers.
- Continue monitoring adoption and performance.
- Plan enhancements: new dashboards, deeper customisation, API access for advanced customers.
This phased approach typically takes 4–6 months from initial Superset deployment to full customer rollout. It’s slower than a big-bang launch, but far lower risk.
Operationalising Superset in Production
Once Superset is embedded in your product, it becomes part of your operational responsibility. Here’s what you need:
Infrastructure as code. Deploy Superset using Kubernetes manifests or Terraform. Version-control your Superset configuration (databases, dashboards, RLS rules). If Superset crashes, you should be able to redeploy in minutes, not hours.
Monitoring and alerting. Track these metrics:
- Uptime: Superset API availability (target: 99.9%)
- Query performance: P50, P95, P99 latency of dashboard queries
- Cache hit rate: percentage of queries served from cache
- Error rate: percentage of failed queries
- Token generation latency: time to generate a guest token (target: < 100ms)
Set up alerts if any metric exceeds thresholds. Use tools like Prometheus + Grafana or Datadog.
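As a sketch of how the latency metrics might be computed before they reach a monitoring tool, the snippet below derives P50, P95, and P99 from a list of query durations using the standard library. The function name and the 10-second threshold are assumptions; the threshold mirrors the example SLA used earlier in this guide.

```python
import statistics


def latency_report(latencies_ms, p95_threshold_ms=10_000):
    """Summarise dashboard query latencies and flag a P95 breach."""
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    report = {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
    report["breach"] = report["p95"] > p95_threshold_ms
    return report


# Synthetic sample: 100 queries taking 100 ms, 200 ms, ..., 10,000 ms.
sample = list(range(100, 10_100, 100))
report = latency_report(sample)
print(report)
```

Tracking P95 and P99 rather than the mean matters here because a handful of slow tenants can hide behind a healthy-looking average.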
Backup and disaster recovery. Superset stores dashboard definitions, RLS rules, and user metadata in a database (typically PostgreSQL). Back this up daily. Test recovery procedures monthly. If your Superset database is corrupted, you should be able to restore from backup without losing dashboard definitions.
Log aggregation. Centralise Superset logs (application logs, query logs, error logs) in a tool like ELK Stack or Datadog. When a customer reports “the dashboard is slow,” you need to be able to query logs and find the exact query, execution time, and any errors.
Capacity planning. As you add customers and dashboards, Superset’s load grows. Plan for this:
- Monitor database connection pool utilisation. If it’s > 80%, you’ll start seeing connection timeouts.
- Monitor Superset pod memory and CPU usage. If consistently > 80%, add more replicas or increase resource limits.
- Monitor data warehouse query queue. If queries are queuing (waiting for a free slot), your data warehouse is the bottleneck, not Superset.
Supporting Customers with Embedded Analytics
Embedded analytics create new support scenarios:
“The dashboard is blank.” Usually means the RLS rule is too restrictive, or the customer has no data. Check:
- Does the customer have data in the underlying tables?
- Is the RLS rule correctly configured? (e.g., is customer_id spelled correctly?)
- Is the guest token valid and not expired?
“The dashboard is slow.” Check:
- Is the underlying query slow? (Query the data warehouse directly to verify.)
- Is Superset’s cache warm? (If a dashboard is rarely viewed, cache misses are common.)
- Is the data warehouse under load? (Check CPU, query queue, connection count.)
“I need a custom dashboard.” Establish a process: customer requests a dashboard → your analytics team reviews the request → if feasible, creates the dashboard in Superset → customer gets access. Set SLAs (e.g., “custom dashboards delivered within 2 weeks”).
“Can I export this data?” Yes. Superset supports CSV and JSON export. If customers need more advanced export (e.g., scheduled reports emailed daily), you can integrate Superset with a reporting tool or build a custom export API.
Document these scenarios in your support knowledge base. Train your support team to troubleshoot basic issues (blank dashboards, slow performance) without escalating to engineering.
Security, Governance, and Compliance {#security-governance}
Authentication and Authorisation
Superset has its own user and permission system, but for embedded analytics, you delegate authentication to your application. Here’s the security model:
-
Your app authenticates the user. The user logs into your SaaS platform using your auth system (OAuth, SAML, username/password, etc.).
-
Your app generates a guest token. When the user navigates to a page with embedded analytics, your backend generates a short-lived guest token and passes it to the frontend.
-
The frontend embeds the dashboard with the token. The dashboard loads with the token, and Superset validates it.
-
Superset enforces RLS. The token includes the customer ID, which Superset uses to filter data.
This model means Superset never authenticates users directly. It only validates tokens. If a token is compromised, it expires quickly (15–60 minutes). If a user’s access to your app is revoked, they can no longer generate tokens, so they lose access to embedded analytics.
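On the Superset side, this model is switched on and tuned in superset_config.py. The fragment below is a minimal sketch: the setting names are Superset's own, while the environment variable name and the 15-minute lifetime are assumptions chosen to match the range recommended above.

```python
import os

# Enable the embedded dashboard feature.
FEATURE_FLAGS = {"EMBEDDED_SUPERSET": True}

# Secret used to sign guest tokens. The environment variable name is
# hypothetical, and the fallback value must never be used in production.
GUEST_TOKEN_JWT_SECRET = os.environ.get("SUPERSET_GUEST_TOKEN_SECRET", "replace-me")

# Guest token lifetime in seconds (15 minutes here).
GUEST_TOKEN_JWT_EXP_SECONDS = 15 * 60
```

Shorter lifetimes reduce exposure if a token leaks, at the cost of more frequent token refreshes from your backend.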
Data Access Control via Row-Level Security
RLS is your primary mechanism for ensuring customers see only their own data. However, RLS is only as good as your implementation. Common mistakes:
Mistake 1: RLS rules that are too permissive. Example: customer_id IN (SELECT customer_id FROM customers WHERE region = '{{ current_user_region }}')
If a user is in the “APAC” region and there are 50 customers in APAC, this rule gives them access to 50 customers’ data. This might be intentional (a regional manager seeing all regional customers), but it’s not customer isolation.
Mistake 2: RLS rules that don’t apply to all tables. If you define RLS on the production_events table but not on the quality_events table, a customer can still see quality data from other customers.
Mistake 3: Hardcoded customer IDs in RLS rules. Don’t do this. Use dynamic values instead: Jinja macros such as {{ current_username() }}, or RLS clauses that your backend attaches to each guest token.
Best practice: For each RLS rule, document the intent (e.g., “Customer isolation: each customer sees only their own data”), the tables affected, and the WHERE clause. Review RLS rules quarterly to ensure they still match your security model.
Compliance Considerations for Manufacturing
Manufacturing organisations often have compliance requirements (ISO 9001 for quality, IATF 16949 for automotive, FDA regulations for pharmaceuticals). How does embedded analytics fit?
Data retention. Your compliance framework may require you to retain production data for 5–7 years. Ensure your data warehouse and Superset backups align with this requirement.
Audit trails. Some frameworks require you to log who accessed what data and when. Superset has native audit logging; enable it and ensure logs are immutable (stored in an append-only system).
Data sovereignty. If you operate in Europe, GDPR restricts transfers of personal data outside the EU, so the simplest approach is usually to process EU data in the EU. If your customers are in Australia, they may require data to stay in Australia. When you deploy Superset, ensure it’s in the same region as your data warehouse.
SOC 2 / ISO 27001. If you’re pursuing these certifications, embedded analytics are part of your audit scope. Document your RLS implementation, access controls, and monitoring. PADISO can guide you through SOC 2 and ISO 27001 compliance if you’re building manufacturing SaaS.
For manufacturing organisations in Australia, platform engineering in Australia includes guidance on data residency and compliance-ready architecture.
Monitoring Access and Detecting Anomalies
Once embedded analytics are live, monitor for suspicious access patterns:
Metric 1: Guest token generation rate. If a single customer account generates 1,000 tokens in one hour, something is wrong (automated scraping, token theft, or a bug in your token generation code). Set alerts for unusual spikes.
Metric 2: Query latency per customer. If customer A’s queries suddenly become 10x slower, either their data volume increased dramatically, or they’re running malicious queries (e.g., full table scans). Investigate.
Metric 3: Failed queries per customer. If a customer’s queries start failing (permission denied, query timeout), it could indicate:
- RLS rule is misconfigured.
- Customer’s data has become corrupted.
- They’re trying to access tables they shouldn’t have access to.
Set up dashboards in Superset to monitor these metrics. Alert your ops team if thresholds are exceeded.
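The first metric lends itself to a simple sliding-window counter in the service that issues guest tokens. The class below is a minimal in-memory sketch; the class name is hypothetical, the default limit of 1,000 tokens per hour comes from the example above, and a multi-instance deployment would need shared storage such as Redis instead of a local dictionary.

```python
from collections import defaultdict, deque


class TokenRateMonitor:
    """Flag customers who request more guest tokens than allowed in a time window."""

    def __init__(self, max_tokens=1000, window_seconds=3600):
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._issued = defaultdict(deque)  # customer_id -> issue times (seconds)

    def record(self, customer_id, now):
        """Record one token issuance; return True if the customer is over the limit."""
        window = self._issued[customer_id]
        window.append(now)
        # Drop issuances that have fallen out of the window.
        while window and window[0] <= now - self.window_seconds:
            window.popleft()
        return len(window) > self.max_tokens


# Small numbers so the behaviour is easy to follow: at most 3 tokens per 60 s.
monitor = TokenRateMonitor(max_tokens=3, window_seconds=60)
flags = [monitor.record("customer_12345", t) for t in (0, 1, 2, 3)]
print(flags)
```

When `record` returns True, the caller could raise an alert, throttle the customer, or both, depending on how aggressive the response should be.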
Common Pitfalls and How to Avoid Them {#common-pitfalls}
Pitfall 1: Slow Dashboards Due to Inefficient Queries
The problem: A dashboard loads in 30 seconds. Customers complain. You check Superset and see that the underlying query scans 500 million rows.
Root cause: The query has no indexes, no partitioning, and no aggregation. It’s doing a full table scan every time.
Solution: Before you embed a dashboard, profile the query:
- Run the query in your data warehouse and measure execution time.
- Check the query plan. Are indexes being used? Is the query parallelised?
- If execution time > 5 seconds, optimise:
  - Add indexes on filter columns.
  - Partition the table by date.
  - Pre-aggregate common metrics.
  - Use approximate aggregations (HyperLogLog) if exact precision isn’t required.
- Set a performance SLA (e.g., “all dashboard queries must complete in < 3 seconds”) and monitor it.
In manufacturing specifically, if you’re embedding dashboards for customers, slow performance is immediately visible and damages trust. Invest in query optimisation before rollout.
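Checking the query plan before and after an optimisation can be automated. The sketch below uses SQLite's EXPLAIN QUERY PLAN purely as a stand-in for the data warehouse; the table contents and index name are invented, and the plan syntax and output differ on Postgres, Snowflake, or BigQuery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production_events ("
    "event_id INTEGER PRIMARY KEY, machine_id INTEGER, cycle_time_seconds REAL)"
)
conn.executemany(
    "INSERT INTO production_events (machine_id, cycle_time_seconds) VALUES (?, ?)",
    [(i % 50, 40.0 + i % 7) for i in range(5000)],
)

QUERY = "SELECT AVG(cycle_time_seconds) FROM production_events WHERE machine_id = ?"


def query_plan(connection, sql, params):
    """Return the planner's description of how it will execute the query."""
    rows = connection.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail text


plan_before = query_plan(conn, QUERY, (7,))  # expected to describe a full scan
conn.execute("CREATE INDEX idx_events_machine ON production_events (machine_id)")
plan_after = query_plan(conn, QUERY, (7,))  # expected to mention the new index
print(plan_before)
print(plan_after)
```

A check like this can run in CI against a staging warehouse so that a dashboard query which falls back to a full scan is caught before customers see it.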
Pitfall 2: RLS Rules That Are Incomplete or Incorrect
The problem: You configure RLS to isolate customers, but a customer reports seeing data from another customer.
Root cause: RLS rules are applied to some tables but not others. Or the RLS rule has a logic error.
Solution:
- List all tables that contain customer-specific data.
- For each table, define an RLS rule.
- Document the rule and the intent.
- Test the rule: log in as customer A and verify they can’t see customer B’s data.
- Repeat for all tables.
- Review RLS rules quarterly.
For manufacturing, this typically means RLS rules on:
- production_events (each customer sees only their own production data)
- quality_events (each customer sees only their own quality data)
- orders (each customer sees only their own orders)
- shipments (each customer sees only their own shipments)
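The checklist above lends itself to an automated audit. The sketch below assumes you keep an inventory of customer-specific tables and can export your RLS rules as simple dictionaries with a `tables` list; that export format, the rule names, and both function names are hypothetical and are not Superset's API response shape.

```python
CUSTOMER_TABLES = ["production_events", "quality_events", "orders", "shipments"]


def tables_missing_rls(customer_tables, rls_rules):
    """Return customer-specific tables that no RLS rule covers."""
    covered = set()
    for rule in rls_rules:
        covered.update(rule["tables"])
    return sorted(set(customer_tables) - covered)


def leaked_rows(rows, expected_customer_id):
    """Return rows visible to a customer that belong to someone else."""
    return [row for row in rows if row["customer_id"] != expected_customer_id]


# Hypothetical rule inventory: orders and shipments were forgotten.
rules = [
    {"name": "tenant_production", "tables": ["production_events"]},
    {"name": "tenant_quality", "tables": ["quality_events"]},
]
missing = tables_missing_rls(CUSTOMER_TABLES, rules)

# Hypothetical result of querying a dashboard dataset as customer 1.
rows_seen_by_customer_1 = [
    {"customer_id": 1, "order_id": "A-100"},
    {"customer_id": 2, "order_id": "B-200"},  # this row should never be visible
]
leaks = leaked_rows(rows_seen_by_customer_1, expected_customer_id=1)
```

Running a check like this in CI, and again as part of the quarterly review, catches the common case where a new table is added to a dashboard without a matching rule.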
Pitfall 3: Stale or Missing Data
The problem: A customer logs into the dashboard and sees data from 3 days ago. They assume it’s a bug and lose trust in embedded analytics.
Root cause: Your data pipeline failed silently. Data stopped flowing into the warehouse, but no one was alerted.
Solution:
- Establish data freshness SLAs (e.g., “production data updated within 5 minutes”).
- Implement automated checks: if data is older than the SLA, alert your ops team.
- Flag dashboards as “stale” or “degraded” if underlying data is older than the SLA (for example, with a last-updated indicator on the dashboard or a banner in your host application).
- Communicate transparently with customers: “Dashboard data was last updated 2 hours ago due to a pipeline issue. We’re investigating.”
Manufacturing leaders make real-time decisions based on analytics. Stale data can lead to bad decisions. Treat data freshness as seriously as you treat uptime.
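A freshness check can be as small as comparing the newest event timestamp against the SLA. The sketch below is one way to do it, assuming you can query the latest timestamp per dataset (for example with `MAX(event_time)`); the function name and banner wording are illustrative.

```python
from datetime import datetime, timedelta, timezone


def freshness_banner(latest_event_at, sla, now=None):
    """Return None if data is within the SLA, otherwise a customer-facing message."""
    now = now or datetime.now(timezone.utc)
    age = now - latest_event_at
    if age <= sla:
        return None
    minutes = int(age.total_seconds() // 60)
    return f"Dashboard data was last updated {minutes} minutes ago. We're investigating."


# Fixed timestamps so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sla = timedelta(minutes=5)

fresh = freshness_banner(datetime(2024, 1, 1, 11, 58, tzinfo=timezone.utc), sla, now)
stale = freshness_banner(datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc), sla, now)
```

The same boolean that drives the banner can drive the ops alert, so customers and your team learn about a stalled pipeline at the same time.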
Pitfall 4: Embedding Without Proper Testing
The problem: You embed a dashboard into your product and launch to customers. Immediately, customers report:
- The dashboard doesn’t load.
- The dashboard loads but shows no data.
- The dashboard shows data from the wrong customer.
Root cause: You didn’t test the embedding in a production-like environment before launch.
Solution: Before you embed a dashboard:
- Test in a staging environment that mirrors production.
- Generate guest tokens using your production token generation code.
- Embed the dashboard in your staging app.
- Test as different customers: verify each customer sees only their own data.
- Test edge cases: a customer with no data, a customer with millions of rows, a customer with special characters in their name.
- Test performance: measure dashboard load time and query latency.
- Test failure scenarios: what happens if Superset is down, if the token is invalid, if the query times out.
Use a staging environment that’s as close to production as possible. If you only test in development, you’ll miss infrastructure issues (network latency, database performance, etc.) that only appear in production.
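Some of these checks can run as unit tests before anything reaches staging. The sketch below builds the request body for Superset's guest token endpoint (`POST /api/v1/security/guest_token/`) with `user`, `resources`, and `rls` fields, as we understand that API; confirm the exact shape against the documentation for your Superset version. The function name, the `customer_id` column, and the integer-only rule are assumptions for this example, and no network call is made.

```python
def build_guest_token_payload(customer_id, dashboard_uuid, username):
    """Build a guest token request body scoped to a single customer.

    customer_id must be an integer so that nothing user-controlled can be
    interpolated into the RLS clause.
    """
    if isinstance(customer_id, bool) or not isinstance(customer_id, int):
        raise ValueError("customer_id must be an integer")
    return {
        "user": {"username": username},
        "resources": [{"type": "dashboard", "id": dashboard_uuid}],
        "rls": [{"clause": f"customer_id = {customer_id}"}],
    }


# Edge case: special characters in the customer's name must not affect the clause.
payload = build_guest_token_payload(42, "hypothetical-dashboard-uuid", "O'Brien & Söhne")

# Edge case: an injection attempt through the tenant identifier must be rejected.
try:
    build_guest_token_payload("1 OR 1=1", "hypothetical-dashboard-uuid", "attacker")
    rejected = False
except ValueError:
    rejected = True
```

Keeping the clause construction in one small, tested function means the staging tests exercise the same code path production uses, which is the point of generating tokens with your production token generation code.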
Pitfall 5: Underestimating the Operational Burden
The problem: You deploy Superset and embedded analytics, then discover:
- Superset requires daily maintenance (backups, log rotation, database cleanup).
- Customers request new dashboards constantly, and you don’t have a process for handling them.
- When a dashboard is slow, you don’t know how to debug it.
Root cause: You treated Superset as a one-time project, not as an ongoing operational responsibility.
Solution:
- Assign ownership: who is responsible for Superset uptime, performance, and feature requests?
- Establish processes:
  - How do customers request new dashboards? (e.g., through your product feedback system)
  - How do you prioritise dashboard requests? (e.g., by customer segment, revenue impact)
  - How do you handle performance issues? (e.g., escalation path, SLA)
- Document runbooks: how to restart Superset, how to restore from backup, how to debug a slow dashboard.
- Plan for growth: as you add customers and dashboards, Superset’s infrastructure needs will grow. Budget for this.
Embedded analytics are a feature, not a one-time project. Plan accordingly.
Summary and Next Steps {#summary}
Embedded customer analytics on Apache Superset is a powerful way to differentiate your manufacturing SaaS product, reduce customer churn, and unlock upsell opportunities. But success requires careful planning and execution.
Key Takeaways
- Data model first, dashboards second. Invest in a clean dimensional or fact-table model before you design dashboards. This makes dashboards faster, simpler, and more maintainable.
- Multi-tenant Superset with row-level security. This is the standard pattern for embedded analytics. All customers share one Superset instance; RLS ensures data isolation.
- Security is non-negotiable. Implement RLS correctly, use short-lived guest tokens, monitor access patterns, and test thoroughly before launch.
- Performance matters. Slow dashboards damage trust. Optimise queries, enable caching, and monitor performance continuously.
- Plan for operations. Superset is not a fire-and-forget tool. Assign ownership, establish processes, and budget for ongoing maintenance.
- Phased rollout reduces risk. Start with an internal pilot, then early-adopter customers, then gradual rollout. Iterate based on feedback.
Immediate Next Steps
If you’re just starting:
- Assess your current data infrastructure. Where is your manufacturing data stored? (MES system, ERP, data warehouse, data lake?)
- Design a dimensional model for your core manufacturing metrics (production, quality, downtime).
- Deploy Superset to a staging environment and create 3–5 pilot dashboards.
- Test RLS and guest token generation.
- Get feedback from internal stakeholders (product, engineering, support).
If you’re already running Superset:
- Audit your RLS rules. Are they complete? Are they correct?
- Profile your dashboard queries. Are they within your performance SLA?
- Set up monitoring: uptime, query latency, cache hit rate, token generation latency.
- Document your operational processes: how do you handle customer requests, how do you debug issues, how do you plan capacity.
- Plan your rollout strategy: which customers get embedded analytics first, and why?
If you need expert guidance:
Building and operating embedded analytics at scale is complex. If you’re a manufacturing SaaS company in Australia or the US, PADISO can help. We specialise in platform engineering for manufacturing organisations, including data architecture, Superset deployment, and embedded analytics.
For manufacturing organisations in specific regions, we have dedicated expertise:
- Platform development in Sydney for financial services, retail, and media with bank-grade architecture and embedded Superset analytics.
- Platform development in Brisbane for logistics and health teams with fleet telematics and embedded ops analytics.
- Platform development in Adelaide for defence, space, and advanced manufacturing with sovereign-aligned architecture and MES integration.
- Platform development in Chicago for trading, logistics, and manufacturing with low-latency data platforms and embedded Superset analytics.
- Platform development in Austin for semiconductors and tech with multi-tenant SaaS and embedded analytics.
- Platform development in Seattle for cloud-native tech and aerospace with well-architected AWS/Azure platforms.
- Platform development in Denver for aerospace, energy, and tech startups with scalable data platforms and telemetry pipelines.
- Platform development in New York for financial services with low-latency data platforms and Superset replacing per-seat BI.
- Platform development in Dallas for finance, telecom, and logistics with enterprise data consolidation and Superset analytics.
- Platform development in Toronto for financial services with PIPEDA-aware architecture and Superset analytics.
- Platform development across the United States with embedded Superset + ClickHouse analytics tuned to each city.
- Platform development in Dunedin for education, health, and manufacturing with governed data platforms and reproducible research pipelines.
- Platform development across Australia with backend, data platforms, and embedded Superset analytics.
We also offer fractional CTO and CTO advisory in Chicago for technical leadership on architecture, reliability, and vendor decisions.
If you’re building embedded analytics for manufacturing, you’re solving a real problem. Get the architecture right from the start, and embedded analytics will become a core differentiator for your product. Start with a clear data model, implement RLS correctly, and plan for ongoing operations. The payoff—happier customers, reduced churn, and new revenue from analytics upsells—is worth the effort.
See our case studies to learn how we’ve helped companies across industries build and scale data platforms and embedded analytics.