SAP S/4HANA Reporting in Apache Superset for Manufacturers
Complete guide to building SAP S/4HANA reporting dashboards in Apache Superset. Architecture, CDS views, performance tuning, and real manufacturing use cases.
Table of Contents
- Why Manufacturers Are Moving from SAP BW to Apache Superset
- Architecture: Connecting SAP S/4HANA to Apache Superset
- Exposing SAP S/4HANA Tables and CDS Views
- Setting Up Your Superset Instance for Manufacturing Data
- Building Semantic Layers for Manufacturing KPIs
- Real-World Manufacturing Dashboard Examples
- Performance Tuning and Query Optimisation
- Security, Compliance, and Data Governance
- Migration Strategy: BW and SAC to Superset
- Getting Started: Next Steps for Your Organisation
Why Manufacturers Are Moving from SAP BW to Apache Superset
Manufacturing organisations have long relied on SAP Business Warehouse (BW), and more recently SAP Analytics Cloud (SAC), to deliver reporting and analytics. These tools work, but they come with significant operational friction: high licensing costs, lengthy deployment cycles, vendor lock-in, and limited flexibility when you need to iterate on dashboards or integrate with modern data tools.
Apache Superset changes this equation. It’s an open-source, lightweight, SQL-native analytics platform that sits directly on top of your SAP S/4HANA database. Instead of moving data into a separate BW instance, you query S/4HANA tables and CDS (Core Data Services) views directly. Manufacturers using this approach report 40–60% reductions in reporting infrastructure costs, faster dashboard iteration (weeks instead of months), and the ability to combine SAP data with external sources—suppliers, IoT sensors, market data—in a single semantic layer.
The shift is driven by three forces:
Cost: SAP BW licensing is per-core, per-year, and scales with your data volume. Superset runs on commodity infrastructure (cloud VMs, Kubernetes) and costs a fraction of that. A mid-market manufacturer with 50+ reporting users might spend $200K–400K annually on BW licences alone. Moving to Superset cuts that to operational costs only—infrastructure, team time, and maintenance.
Speed: BW requires data cubes, aggregates, and ETL jobs that must be designed, tested, and deployed through change management. A new dashboard request that takes 4–6 weeks in BW takes 2–3 days in Superset because you’re querying live S/4HANA data and building on a semantic layer, not moving data around.
Flexibility: Superset is SQL-first and integrates with modern data tools. You can combine S/4HANA with Salesforce, supply chain data from external APIs, or real-time IoT feeds in one dashboard. BW and SAC lock you into the SAP ecosystem.
The tradeoff is straightforward: you need in-house SQL expertise and a willingness to manage an open-source stack. For manufacturers with technical teams, this is a net win.
Architecture: Connecting SAP S/4HANA to Apache Superset
The reference architecture for exposing SAP S/4HANA tables and CDS views in Superset follows a simple, scalable pattern. You’re not moving data; you’re exposing S/4HANA as a queryable database layer.
High-Level Architecture
SAP S/4HANA (HANA Database)
↓
├─ Transactional Tables (MARA, MARC, VBAK, VBAP, MSEG, MKPF)
├─ CDS Views (C_Product, C_SalesOrder, C_Inventory)
└─ Custom Tables & Extractors
↓
Network Layer (Direct DB Connection or OData Proxy)
↓
Apache Superset Instance (Docker/K8s)
├─ SQLAlchemy Drivers (HANA, ODBC, JDBC)
├─ Semantic Layer (Metrics, Dimensions, Calculated Columns)
├─ Dashboards (Manufacturing KPIs, Supply Chain, Quality, Production)
└─ Permissions & Row-Level Security
↓
End Users (Factory Managers, Planners, Finance)
Database Connectivity Options
You have three primary paths to connect Superset to S/4HANA:
Direct HANA Connection: S/4HANA runs exclusively on SAP HANA, so you can connect Superset directly using the sqlalchemy-hana dialect together with SAP's hdbcli client. This is the fastest, lowest-latency option: queries run against the HANA database directly, with no intermediate layer. The trade-off is that you need network access from your Superset instance to the HANA SQL port (3NN13 or 3NN15, where NN is the instance number; 39013 is a common example). This works well if Superset runs in the same data centre or cloud region as S/4HANA.
ODBC/JDBC Bridge: Some manufacturers prefer an ODBC or JDBC layer for firewall isolation or to use existing SAP client infrastructure. You set up an ODBC Data Source Name (DSN) pointing to S/4HANA, then configure Superset to use pyodbc or a JDBC driver. This adds latency (microseconds per query) but simplifies network topology and allows you to apply SAP RFC security rules at the connection level.
OData Gateway: SAP provides an OData API layer on top of S/4HANA. You can expose S/4HANA tables and CDS views via OData, but Superset only talks to SQL data sources, so the OData service has to be reached through an adapter that presents it as a SQLAlchemy data source, or staged into a database that Superset can query. This is the most flexible option for hybrid setups (Superset in the cloud, S/4HANA on-premises) but introduces an additional hop and requires careful pagination and caching to avoid performance issues.
For most manufacturers, direct HANA connection is the recommended starting point. It’s the fastest, simplest to debug, and easiest to scale.
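As a rough illustration of the direct option, the connection string that Superset needs can be assembled as below. This is a minimal sketch: it assumes the `hana://` URI scheme used by the SAP HANA SQLAlchemy dialect, and the user name, password, and host are hypothetical placeholders.

```python
from urllib.parse import quote_plus


def build_hana_uri(user: str, password: str, host: str, port: int = 39013) -> str:
    """Build a SQLAlchemy URI of the form hana://user:password@host:port."""
    # quote_plus escapes characters such as '@' or '/' that would break the URI.
    return f"hana://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}"


# Hypothetical read-only reporting user and host; replace with your own values.
uri = build_hana_uri("SUPERSET_RO", "p@ss/word", "hana.example.internal", 39013)
print(uri)
```

Escaping the credentials matters in practice because SAP passwords often contain characters that are reserved in URIs.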
Recommended Tech Stack
Here’s what a production Superset setup for manufacturing looks like:
- Superset Version: 4.0+ (latest stable)
- Database Driver: sqlalchemy-hana for SAP HANA
- Backend Database: PostgreSQL 14+ (for Superset metadata, users, dashboards)
- Cache Layer: Redis 6+ (query results, session cache)
- Deployment: Docker Compose (dev/test) or Kubernetes (production)
- Reverse Proxy: Nginx or Apache (SSL termination, load balancing)
- SSO: SAML 2.0 or OAuth 2.0 (integrate with your corporate directory)
This stack is battle-tested across dozens of manufacturing deployments. It scales from 50 to 5,000 concurrent users and can handle terabyte-scale datasets.
Exposing SAP S/4HANA Tables and CDS Views
The core of your reporting layer is deciding which S/4HANA tables and CDS views to expose to Superset. You don’t expose everything; you expose a curated set of tables and views that represent your core business processes.
Core Manufacturing Tables
Most manufacturers query these standard S/4HANA tables:
Materials & Products:
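- MARA (General Material Data): Product codes, material type, material group, base UOM
- MAKT (Material Descriptions): Language-dependent product descriptions (MAKTX)
- MARC (Plant Data for Material): Plant-specific data such as MRP settings, lead times, reorder points
- MARD (Storage Location Data for Material): Stock quantities, broken down by storage location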
Sales & Orders:
- VBAK (Sales Order Header): Order numbers, customers, dates, status
- VBAP (Sales Order Item): Line items, quantities, pricing, delivery dates
- VBUK (Sales Document Status): Order fulfilment status (shipped, invoiced, etc.); in S/4HANA these status fields have moved into VBAK and VBAP
Production & Manufacturing:
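- AUFK (Order Master Data): Order number, order type, plant
- AFKO (Production Order Header): Order quantity, scheduled start and finish dates
- AFPO (Production Order Item): Material, ordered and received quantities
- AFVC / AFVV (Order Operations): Work centre assignment, planned and confirmed activity
- RESB (Reservation Item): Material reservations for production orders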
Goods Movement & Inventory:
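- MSEG (Material Document Item): Stock movements, quantities, movement types. MIGO is the transaction that posts these documents, not a table; in S/4HANA the data is stored in MATDOC
- MKPF (Material Document Header): Header for goods movements, including posting date and user
- MSKU (Special Stocks with Customer): Consignment and similar stock held at customers; regular stock by material, plant, and storage location is in MARD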
Finance & Costing:
- COSP (CO Object: Cost Totals for External Postings): Production costs by order or cost centre
- CKIS (Product Costing Itemisation): Line items of cost estimates, including standard costs and overhead
Core Data Services (CDS) Views
CDS views are SAP’s modern data abstraction layer. They’re SQL views defined in ABAP that sit on top of base tables and provide a cleaner, more semantic interface. For manufacturers, the key CDS views include:
- C_Product (Product Master): Simplified product view with hierarchies
- C_SalesOrder (Sales Order): Order header + key line items in one view
- C_Inventory (Stock): Current inventory by product, plant, storage location
- C_ProductionOrder (Production Order): Order + operation + material detail
- C_GoodsMovement (Goods Movement): Movements with full context (material, plant, reason, user)
CDS views are preferred for Superset because:
- They’re maintained by SAP and updated with each release—less risk of breaking changes
- They include business logic (hierarchies, calculated fields, filters) that you’d otherwise code in Superset
- They perform better than hand-written joins across base tables
- They’re documented and follow SAP naming conventions
Creating Custom CDS Views
If standard CDS views don’t cover your use case, you can create custom views. For example, a manufacturer might need:
- A view combining sales orders, delivery schedules, and current inventory in one row
- A view of production orders with actual vs. planned labour hours
- A view of supplier quality metrics (defect rates, on-time delivery) joined with purchase orders
Custom CDS views are written in ABAP and deployed via SAP’s development tools. They require SAP development expertise, but once deployed, they’re just another table that Superset can query.
Exposing Views in Superset
Once you’ve identified your tables and CDS views, you register them in Superset:
- Create a Database Connection: In Superset UI, add a new database with your HANA connection string and credentials.
- Sync Tables: Superset auto-discovers tables and views from the database. You’ll see MARA, MARC, VBAK, C_Product, C_SalesOrder, etc., in the table list.
- Create Datasets: For each table/view, create a Superset “Dataset” (formerly called a “Table”). A dataset is a queryable entity in Superset—it’s where you define columns, data types, and default filters.
- Add Calculated Columns: In each dataset, you can add calculated columns—e.g., “Gross Margin” = (Sales Price - Cost) / Sales Price. These are computed in SQL at query time.
- Set Row-Level Security: If different users should see different data (e.g., Plant Manager A sees only Plant A data), define RLS rules here.
Once datasets are created, Superset can query them. A simple query listing recently created materials from MARA (with descriptions joined from MAKT, where MAKTX is stored) might look like:
SELECT
  m.MATNR AS product_code,
  t.MAKTX AS product_description,
  m.MEINS AS unit_of_measure,
  m.ERSDA AS created_on
FROM MARA AS m
LEFT JOIN MAKT AS t
  ON t.MANDT = m.MANDT AND t.MATNR = m.MATNR AND t.SPRAS = 'E'
-- ERSDA is an ABAP date stored as a YYYYMMDD string
WHERE m.ERSDA >= TO_VARCHAR(ADD_DAYS(CURRENT_DATE, -30), 'YYYYMMDD')
ORDER BY m.ERSDA DESC
LIMIT 100
This query runs directly against S/4HANA, and Superset visualises the results as a table or chart.
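ABAP date fields such as ERSDA are stored as eight-character YYYYMMDD strings, so date filters are often easiest to build as strings. The sketch below shows one way to compute a cutoff and pair it with a parameterised query; the helper names are hypothetical and the query text is only an example.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def dats_cutoff(days_back: int, today: Optional[date] = None) -> str:
    """Return the date `days_back` days before `today` as a YYYYMMDD string."""
    today = today or date.today()
    return (today - timedelta(days=days_back)).strftime("%Y%m%d")


def recent_materials_query(
    days_back: int = 30, today: Optional[date] = None
) -> Tuple[str, dict]:
    """Build a parameterised query for materials created in the last N days."""
    sql = (
        "SELECT MATNR, MEINS, ERSDA FROM MARA "
        "WHERE ERSDA >= :cutoff ORDER BY ERSDA DESC LIMIT 100"
    )
    return sql, {"cutoff": dats_cutoff(days_back, today)}


sql, params = recent_materials_query(30, today=date(2024, 3, 1))
print(params)  # {'cutoff': '20240131'}
```

Because YYYYMMDD strings sort in date order, a plain string comparison against the cutoff behaves like a date comparison.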
Setting Up Your Superset Instance for Manufacturing Data
Deploying Superset requires careful planning around infrastructure, security, and performance. Here’s a production-grade setup for a manufacturing organisation.
Infrastructure & Deployment
Development Environment: Start with Docker Compose on a single machine (8 GB RAM, 4 CPU). This takes 15 minutes to spin up and is perfect for testing connections and building dashboards.
git clone https://github.com/apache/superset.git
cd superset
docker-compose -f docker-compose.yml up
Production Environment: Deploy on Kubernetes (EKS, AKS, or on-premises K8s) with:
- 3 Superset application pods (for redundancy)
- 1 PostgreSQL pod (or RDS/Azure Database for production)
- 1 Redis pod (or ElastiCache)
- Nginx ingress controller (SSL termination, rate limiting)
- Persistent volumes for metadata and logs
A typical 3-tier production setup costs $2,000–5,000/month in cloud infrastructure (AWS/Azure) for 100–500 concurrent users.
Configuring HANA Connectivity
Once Superset is running, configure the HANA connection:
- Install the HANA Driver:
pip install hdbcli sqlalchemy-hana
- Add Connection in Superset UI:
  - Go to Settings > Database Connections
  - Click + Database
  - Select SAP HANA from the dropdown
  - Enter connection details:
    - Host: Your S/4HANA HANA server IP/hostname
    - Port: 39013 (or your HANA SQL port)
    - Database: Your HANA database name (usually HANA)
    - Username: SAP user with SELECT on tables/views
    - Password: User password
  - Click Test Connection to verify
  - Click Save
- Sync Tables:
  - Once connected, Superset will scan the HANA database and list available tables/views
  - You’ll see hundreds of SAP tables, which is normal
  - You only need to enable the ones you plan to query (MARA, MARC, VBAK, C_Product, etc.)
Performance Tuning at Connection Level
By default, Superset queries run against S/4HANA with no optimisation. For large tables (MSEG with 100M rows, VBAP with 10M rows), queries can be slow. Optimise at the connection level:
Query Timeout: Set a timeout (e.g., 300 seconds) to prevent runaway queries from locking your production database.
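Connection Pool: Superset connects through SQLAlchemy, which supports connection pooling. The options below are shown as they would appear in superset_config.py. Note that SQLALCHEMY_ENGINE_OPTIONS configures the engine for Superset's own metadata database; equivalent options for an individual analytics connection such as HANA are normally supplied as engine parameters in that database's advanced settings, so check which of the two you intend to tune:
SQLALCHEMY_ENGINE_OPTIONS = {
"pool_size": 20,
"max_overflow": 40,
"pool_pre_ping": True,
"pool_recycle": 3600,
}
With these values, each Superset worker process can hold up to 60 connections (pool_size plus max_overflow), which is enough for 50–100 concurrent dashboard users.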
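The pool arithmetic is worth making explicit, because SQLAlchemy pools are held per process. The sketch below is a back-of-the-envelope calculation, assuming every Superset worker keeps its own pool; the function name and worker counts are illustrative.

```python
def max_db_connections(pool_size: int, max_overflow: int, workers: int = 1) -> int:
    """Upper bound on open connections when each worker holds its own pool."""
    return (pool_size + max_overflow) * workers


# One worker with the settings shown above.
print(max_db_connections(20, 40))  # 60

# Three application pods with one worker each (hypothetical sizing).
print(max_db_connections(20, 40, workers=3))  # 180
```

It is worth comparing the upper bound with the connection limit your basis team has set on the database before scaling out.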
Database Indexes: Work with your SAP basis team to ensure key columns are indexed. For example:
- VBAK.ERDAT (order creation date)
- MARC.MATNR (material number)
- MKPF.BUDAT (goods movement posting date)
Indexes on these columns can cut query times by 50–80%.
User Authentication & SSO
For a manufacturing plant with 100+ users, set up Single Sign-On (SSO) so users log in with their corporate credentials.
SAML 2.0 (Recommended): If your company uses Microsoft Active Directory, Okta, or another SAML provider:
- Configure Superset’s SAML settings in superset_config.py. AUTH_SAML is not provided by every Flask-AppBuilder release, so confirm that your installed version supports it before relying on this snippet:
from flask_appbuilder.security.manager import AUTH_SAML
AUTH_TYPE = AUTH_SAML
SAML_METADATA_URL = "https://your-idp.com/metadata.xml"
- Users log in via your corporate SSO, so no separate Superset password is needed.
- Map SAML groups to Superset roles (e.g., “Manufacturing Analysts” → Analyst role).
OAuth 2.0: If you use Google Workspace, Azure AD, or GitHub:
- Register Superset as an OAuth application in your identity provider
- Configure Superset’s OAuth settings
- Users click “Login with [Provider]” and authenticate via their corporate account
SSO reduces password management overhead and ensures access is revoked immediately when users leave the organisation.
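Group-to-role mapping can be sketched as a configuration fragment. AUTH_ROLES_MAPPING, AUTH_ROLES_SYNC_AT_LOGIN, AUTH_USER_REGISTRATION, and AUTH_USER_REGISTRATION_ROLE are Flask-AppBuilder settings; the directory group names are hypothetical, and `roles_for_groups` is an illustrative helper that mimics the lookup, not part of Superset.

```python
# superset_config.py fragment (sketch). Group names are hypothetical;
# Admin, Alpha, and Gamma are Superset's built-in roles.
AUTH_ROLES_MAPPING = {
    "Manufacturing Analysts": ["Alpha"],
    "Production Supervisors": ["Gamma"],
    "BI Admins": ["Admin"],
}
AUTH_ROLES_SYNC_AT_LOGIN = True       # re-apply the mapping on every login
AUTH_USER_REGISTRATION = True         # create users on first SSO login
AUTH_USER_REGISTRATION_ROLE = "Gamma" # fallback role for unmapped users


def roles_for_groups(groups, mapping=AUTH_ROLES_MAPPING, default="Gamma"):
    """Illustrative lookup: resolve directory groups to a sorted role list."""
    roles = {role for group in groups for role in mapping.get(group, [])}
    return sorted(roles) or [default]


print(roles_for_groups(["Manufacturing Analysts", "Production Supervisors"]))
# ['Alpha', 'Gamma']
```

Syncing roles at login is what makes access follow the directory: removing a user from a group removes the corresponding role at their next sign-in.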
Building Semantic Layers for Manufacturing KPIs
A semantic layer is a business-friendly abstraction on top of your raw S/4HANA tables. It’s where you define KPIs, dimensions, and calculated metrics that users can drag-and-drop to build dashboards without writing SQL.
Superset’s semantic layer is built via Datasets and Metrics. Think of a dataset as a table + business logic, and a metric as a pre-calculated KPI.
Creating Datasets for Manufacturing Processes
Example 1: Production Orders Dataset
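You combine AFKO (Production Order Header), AFPO (Order Item), AFVC and AFVV (Operations), and CRHD (Work Centres) to create a dataset that shows order-level and operation-level data. The labour columns assume standard value key SAP1, where the third standard value is labour, and standard values are scaled from the base quantity to the operation quantity; check the field mapping and units in your own system. Order status is held in SAP status management rather than in a single column, so it is left out here:
SELECT
  afko.AUFNR AS order_number,
  afpo.MATNR AS product_code,
  makt.MAKTX AS product_description,
  afko.GAMNG AS order_quantity,
  afko.GSTRS AS start_date,
  afko.GLTRS AS end_date,
  afvc.VORNR AS operation_number,
  crhd.ARBPL AS work_centre,
  afvv.VGW03 * afvv.MGVRG / NULLIF(afvv.BMSCH, 0) AS labour_hours_planned,
  afvv.ISM03 AS labour_hours_actual,
  afvv.LMNGA AS quantity_completed
FROM AFKO afko
JOIN AFPO afpo ON afpo.MANDT = afko.MANDT AND afpo.AUFNR = afko.AUFNR
JOIN AFVC afvc ON afvc.MANDT = afko.MANDT AND afvc.AUFPL = afko.AUFPL
JOIN AFVV afvv ON afvv.MANDT = afvc.MANDT AND afvv.AUFPL = afvc.AUFPL AND afvv.APLZL = afvc.APLZL
JOIN CRHD crhd ON crhd.MANDT = afvc.MANDT AND crhd.OBJTY = 'A' AND crhd.OBJID = afvc.ARBID
LEFT JOIN MAKT makt ON makt.MANDT = afpo.MANDT AND makt.MATNR = afpo.MATNR AND makt.SPRAS = 'E'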
In Superset, you’d:
- Create a new dataset called “Production Orders”
- Paste the SQL above as the table definition
- Define columns: order_number (string), product_code (string), start_date (date), labour_hours_planned (float), etc.
- Add calculated columns (see below)
Example 2: Sales & Delivery Dataset
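Combine VBAK, VBAP, LIPS (Delivery Item), and LIKP (Delivery Header) to show sales orders with delivery status. Deliveries have their own document numbers and link back to order items through LIPS.VGBEL and LIPS.VGPOS:
SELECT
  vbak.VBELN AS order_number,
  vbak.ERDAT AS order_date,
  vbak.KUNNR AS customer_code,
  kna1.NAME1 AS customer_name,
  vbap.POSNR AS line_item,
  vbap.MATNR AS product_code,
  vbap.KWMENG AS order_quantity,
  vbap.NETWR AS line_value,
  likp.VBELN AS delivery_number,
  likp.LFDAT AS delivery_date,
  CASE WHEN likp.VBELN IS NULL THEN 'Not Delivered' ELSE 'Delivered' END AS delivery_status
FROM VBAK vbak
JOIN VBAP vbap ON vbap.MANDT = vbak.MANDT AND vbap.VBELN = vbak.VBELN
JOIN KNA1 kna1 ON kna1.MANDT = vbak.MANDT AND kna1.KUNNR = vbak.KUNNR
LEFT JOIN LIPS lips ON lips.MANDT = vbap.MANDT AND lips.VGBEL = vbap.VBELN AND lips.VGPOS = vbap.POSNR
LEFT JOIN LIKP likp ON likp.MANDT = lips.MANDT AND likp.VBELN = lips.VBELN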
Defining Metrics (Calculated KPIs)
Once datasets are created, define metrics—these are pre-calculated aggregations that users can drop into charts without writing SQL.
Example Metrics:
- On-Time Delivery Rate (Sales & Delivery dataset):
100.0 * COUNT(CASE WHEN DAYS_BETWEEN(order_date, delivery_date) <= 14 THEN 1 END) / COUNT(*)
This metric shows the percentage of order lines delivered within 14 days of the order date.
- Production Order Variance (Production Orders dataset):
100.0 * AVG(labour_hours_actual - labour_hours_planned) / AVG(labour_hours_planned)
Shows how much actual labour differs from planned (positive = over budget, negative = under budget).
- Inventory Turnover (Inventory dataset):
SUM(goods_issued_qty) / AVG(stock_qty)
Shows how many times inventory turns over per year when evaluated over a 12-month window.
- Order Fulfillment Time (Production Orders dataset):
AVG(DAYS_BETWEEN(start_date, end_date))
Average days from order start to completion.
In Superset, you define these metrics in the dataset editor. Once defined, they appear in the chart builder as pre-calculated options—users can drag “On-Time Delivery Rate” onto a chart and filter by customer, product, or date range without writing SQL.
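To sanity-check a metric definition before publishing it, it can help to compute it by hand on a few rows. The sketch below uses made-up sample data and defines labour variance as (actual − planned) / planned, so a positive value means over plan; the function names are illustrative.

```python
from datetime import date

# Made-up sample rows mirroring the dataset column names used above.
orders = [
    {"order_date": date(2024, 1, 2), "delivery_date": date(2024, 1, 10)},  # 8 days
    {"order_date": date(2024, 1, 5), "delivery_date": date(2024, 1, 25)},  # 20 days
    {"order_date": date(2024, 1, 8), "delivery_date": date(2024, 1, 22)},  # 14 days
    {"order_date": date(2024, 1, 9), "delivery_date": None},               # open
]
operations = [
    {"planned": 10.0, "actual": 12.0},
    {"planned": 20.0, "actual": 18.0},
    {"planned": 30.0, "actual": 36.0},
]


def on_time_delivery_rate(rows, max_days=14):
    """Percent of rows delivered within max_days; open rows count as misses."""
    on_time = sum(
        1
        for r in rows
        if r["delivery_date"] is not None
        and (r["delivery_date"] - r["order_date"]).days <= max_days
    )
    return 100.0 * on_time / len(rows)


def labour_variance_pct(rows):
    """Average (actual - planned) as a percentage of average planned hours."""
    n = len(rows)
    avg_diff = sum(r["actual"] - r["planned"] for r in rows) / n
    avg_planned = sum(r["planned"] for r in rows) / n
    return 100.0 * avg_diff / avg_planned


print(on_time_delivery_rate(orders))    # 50.0
print(labour_variance_pct(operations))  # 10.0
```

Undelivered rows stay in the denominator here, which matches how COUNT(*) behaves in the SQL version of the metric.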
Row-Level Security (RLS)
In a manufacturing plant with multiple divisions or cost centres, you often need users to see only their data. Superset supports Row-Level Security (RLS) via SQL clauses.
Example: A Plant Manager for Plant A should see only orders and production data for Plant A.
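RLS rules are defined under Settings > Security > Row Level Security rather than in the dataset SQL. Each rule binds a filter clause, written without the WHERE keyword, to one or more datasets and roles:
plant_code = 'PLANT_A'
When a user holding the role assigned to this rule queries the dataset, Superset adds plant_code = 'PLANT_A' to the WHERE condition of every generated query. This ensures data isolation without requiring separate datasets for each plant.
To avoid maintaining one rule per plant, the clause can use Superset's Jinja macros, for example {{ current_username() }} matched against a user-to-plant mapping table. This requires template processing to be enabled, and the mapping table is populated from your directory (SAML/OAuth groups) or maintained as a custom table.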
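Conceptually, row-level security wraps the user's query in an extra filter. The sketch below illustrates that idea only; it is not Superset's implementation, and the role names, clauses, and helper functions are hypothetical. Clauses on the same dimension are combined with OR here so that a user holding two plant roles sees both plants.

```python
# Hypothetical mapping of roles to RLS filter clauses.
ROLE_CLAUSES = {
    "Plant A Managers": "plant_code = 'PLANT_A'",
    "Plant B Managers": "plant_code = 'PLANT_B'",
}


def clauses_for(roles):
    """Collect the filter clauses that apply to a user's roles."""
    return [ROLE_CLAUSES[r] for r in roles if r in ROLE_CLAUSES]


def apply_rls(base_sql, clauses):
    """Wrap a query so that only rows matching the clauses are returned."""
    if not clauses:
        return base_sql
    predicate = " OR ".join(f"({c})" for c in clauses)
    return f"SELECT * FROM ({base_sql}) AS scoped WHERE {predicate}"


print(apply_rls("SELECT * FROM production_orders", clauses_for(["Plant A Managers"])))
# SELECT * FROM (SELECT * FROM production_orders) AS scoped WHERE (plant_code = 'PLANT_A')
```

Wrapping the query as a subquery keeps the filter independent of whatever WHERE conditions the chart itself adds.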
Real-World Manufacturing Dashboard Examples
Here are three dashboards that manufacturers typically build in Superset, pulling directly from S/4HANA.
Dashboard 1: Production Operations Center
Purpose: Real-time view of all production orders in progress, for shop floor supervisors and production planners.
Charts:
- Orders by Status (Pie chart): Count of orders grouped by status (In Progress, Completed, On Hold, Delayed). Data from the order's system status (status management table JEST, linked through AUFK.OBJNR).
- Labour Hours vs. Plan (Bar chart): Actual labour hours vs. planned, by work centre. Data from the operation tables AFVC and AFVV with calculated variance.
- Production Orders Timeline (Gantt chart): Start date to end date for top 20 orders by value. Shows which orders are on track, which are delayed.
- Material Availability (Table): For in-progress orders, shows which materials are reserved (RESB) vs. in stock (MARD). Flags shortages.
- Order Completion Rate (Trend chart): Percentage of orders completed on schedule, by week, over last 12 weeks.
Refresh: Every 15 minutes (via a 15-minute cache timeout on the dashboard’s charts).
Users: 15–20 shop floor supervisors, 5–10 production planners.
Dashboard 2: Sales & Delivery Performance
Purpose: Monitor order-to-cash cycle, delivery performance, and customer satisfaction metrics.
Charts:
- Orders by Customer (Bar chart): Total order value by top 20 customers. Data from VBAK + KNA1.
- On-Time Delivery % (KPI card): Percentage of orders delivered on schedule (delivery_date <= promised_date). Metric defined in semantic layer.
- Days Sales Outstanding (DSO) (Trend chart): Average days from invoice to payment, by month. Data from VBRK (Invoice) + BSEG (GL items).
- Order Backlog (Table): Orders with order_date > 30 days ago and delivery_status = “Not Delivered”. Highlights problem orders.
- Revenue by Product Line (Stacked bar chart): Monthly revenue by product category, with YoY comparison.
Refresh: Daily (overnight batch).
Users: Sales managers, finance team, customer service.
Dashboard 3: Supply Chain & Inventory Health
Purpose: Monitor inventory levels, stock turns, and supplier performance.
Charts:
- Inventory by Plant (Map chart or table): Current stock levels (MARD) by plant, with slow-moving inventory flagged (no goods issues in 90 days).
- Stock Turns by Product (Bar chart): Inventory turnover ratio (annualised) by product. Identifies fast movers vs. slow movers.
- Days Inventory Outstanding (DIO) (KPI card): Average days inventory is held before issue. Metric: AVG(stock_qty) / AVG(daily_issues).
- Purchase Order Status (Table): Open POs (EKKO + EKPO) with receipt status. Flags late deliveries (goods receipt date later than the scheduled delivery date in EKET.EINDT).
- Supplier Quality Scorecard (Table): Defect rates, on-time delivery %, and lead time variance by supplier. Data from custom CDS view joining POs + quality inspection lots (QALS).
Refresh: Daily.
Users: Supply chain manager, procurement, inventory planner.
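The two inventory KPIs on this dashboard are two views of the same ratio, which a small worked example makes concrete. The sketch below uses the common definitions (turnover = issues / average stock, annualised; DIO = average stock / average daily issues); the figures are made up and the function names are illustrative.

```python
def inventory_turnover(total_issues: float, avg_stock: float, period_days: int = 365) -> float:
    """Annualised number of times the average stock is issued."""
    return (total_issues / avg_stock) * (365 / period_days)


def days_inventory_outstanding(avg_stock: float, total_issues: float, period_days: int = 365) -> float:
    """Average number of days a unit sits in stock before being issued."""
    daily_issues = total_issues / period_days
    return avg_stock / daily_issues


# Made-up figures: 3,000 units issued over a year against 500 units average stock.
turns = inventory_turnover(3000, 500)
dio = days_inventory_outstanding(500, 3000)
print(turns)          # 6.0
print(round(dio, 1))  # 60.8
```

With a full-year window, DIO is simply 365 divided by the turnover, so the two charts should always tell a consistent story.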
Performance Tuning and Query Optimisation
Once dashboards are live, you’ll notice that some queries are slow. A dashboard with 8 charts might take 30 seconds to load if queries aren’t optimised. Here’s how to fix it.
Query Analysis & Indexing
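Use SAP HANA’s SQL analysis tools to identify slow queries:
- Open the SQL console in SAP HANA Studio or the HANA database explorer, or use the hdbsql command-line client
- Run your Superset query with EXPLAIN PLAN and check the execution plan
- Look for operators that scan far more rows than the query returns (red flag) vs. selective, index-supported access (good)
- If a large table is being scanned fully, consider an index on the filter column
Example: A query filtering VBAP (10M rows) by ERDAT (creation date, stored as a YYYYMMDD string) is slow:
SELECT * FROM VBAP WHERE ERDAT >= '20240101'
Add an index. For SAP-delivered tables this is normally done through the ABAP Dictionary (SE11) with your basis team so the dictionary and database stay consistent; the equivalent SQL is:
CREATE INDEX VBAP_ERDAT ON VBAP(ERDAT)
On a table of this size, query time can drop from several seconds to around 100 ms, depending on how selective the filter is.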
Caching Strategy
Superset has a built-in cache (Redis) that stores query results. Configure caching to avoid re-running the same query:
- Query Cache: Cache results for 1 hour (3600 seconds) by default. Users see cached results unless they explicitly refresh.
- Dashboard Cache: Superset can cache entire dashboards for 5–15 minutes, so rapid page reloads don’t hit the database.
- Cache Invalidation: When data changes in S/4HANA, you can manually invalidate the cache or set up automated invalidation (e.g., every night at midnight).
For manufacturing dashboards:
- Real-time dashboards (production orders): Cache 5–15 minutes
- Daily dashboards (sales, inventory): Cache 1–4 hours
- Weekly/monthly dashboards (trends, forecasts): Cache 24 hours
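A Redis-backed cache can be sketched as a configuration fragment. CACHE_CONFIG and DATA_CACHE_CONFIG are Superset settings that accept Flask-Caching options; the Redis URL and key prefixes are placeholders, and the CACHE_TIMEOUTS dictionary is only a hypothetical reference table for the values you would enter as cache timeouts on individual dashboards, charts, or datasets.

```python
# superset_config.py fragment (sketch). The Redis host is a placeholder.
REDIS_URL = "redis://redis:6379/1"

# Cache for Superset metadata (dashboard and chart definitions).
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 3600,  # 1 hour
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": REDIS_URL,
}

# Cache for chart query results.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 3600,  # 1 hour unless overridden per chart
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": REDIS_URL,
}

# Hypothetical reference values (seconds) for per-dashboard cache timeouts.
CACHE_TIMEOUTS = {
    "realtime": 15 * 60,     # production order dashboards
    "daily": 4 * 60 * 60,    # sales and inventory dashboards
    "trend": 24 * 60 * 60,   # weekly and monthly trend dashboards
}
```

Separate key prefixes keep metadata and query results apart in Redis, which makes it easier to flush one without the other.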
Aggregation Tables (Pre-Aggregation)
For very large tables (material documents with 100M+ rows), queries can be slow even with indexes. Use pre-aggregated tables:
Instead of querying the material document table directly (MATDOC in S/4HANA, MSEG and MKPF in older releases), create a summary table:
CREATE COLUMN TABLE MATDOC_DAILY_SUMMARY AS (
SELECT
BUDAT AS movement_date,
MATNR AS product_code,
WERKS AS plant,
BWART AS movement_type,
SUM(MENGE) AS total_quantity,
COUNT(*) AS movement_count
FROM MATDOC
GROUP BY BUDAT, MATNR, WERKS, BWART
)
Then query the summary table instead of the raw material documents. This reduces query time from seconds to milliseconds because you’re querying a smaller, pre-aggregated dataset.
Refresh the summary table nightly via a scheduled SAP job.
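The effect of pre-aggregation can be tried out locally before touching SAP. The sketch below uses an in-memory SQLite database with a handful of made-up movement rows; the table names are hypothetical and only the column names follow the SAP fields mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movements (budat TEXT, matnr TEXT, werks TEXT, bwart TEXT, menge REAL)"
)
rows = [
    ("20240101", "M-100", "1000", "261", 5.0),
    ("20240101", "M-100", "1000", "261", 3.0),
    ("20240101", "M-200", "1000", "101", 10.0),
    ("20240102", "M-100", "1000", "261", 4.0),
    ("20240102", "M-100", "2000", "261", 2.0),
]
conn.executemany("INSERT INTO movements VALUES (?, ?, ?, ?, ?)", rows)

# Build the daily summary once; dashboards then read from it.
conn.execute(
    """
    CREATE TABLE movements_daily_summary AS
    SELECT budat AS movement_date, matnr AS product_code, werks AS plant,
           bwart AS movement_type, SUM(menge) AS total_quantity,
           COUNT(*) AS movement_count
    FROM movements
    GROUP BY budat, matnr, werks, bwart
    """
)

raw_total = conn.execute(
    "SELECT SUM(menge) FROM movements WHERE matnr = 'M-100'"
).fetchone()[0]
summary_total = conn.execute(
    "SELECT SUM(total_quantity) FROM movements_daily_summary WHERE product_code = 'M-100'"
).fetchone()[0]
summary_rows = conn.execute("SELECT COUNT(*) FROM movements_daily_summary").fetchone()[0]

print(raw_total, summary_total, summary_rows)  # 14.0 14.0 4
```

The totals match while the summary holds fewer rows; on real material document volumes the reduction is far larger, which is where the speed-up comes from.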
Query Limits & Timeouts
Set reasonable limits on queries to prevent runaway queries from locking your database:
- Query Timeout: 300 seconds (5 minutes) for interactive dashboards. Longer for batch reports.
- Row Limit: 100,000 rows for interactive queries. Larger queries should use aggregation or filters.
- Concurrent Query Limit: Max 20 concurrent queries per user, max 100 across all users.
Configure these in Superset’s settings:
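SQLLAB_TIMEOUT = 300 # seconds, synchronous SQL Lab queries
SUPERSET_WEBSERVER_TIMEOUT = 300 # seconds, chart and dashboard requests
SQL_MAX_ROW = 100000 # upper bound on rows returned by a query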
Security, Compliance, and Data Governance
Manufacturing data is sensitive—production costs, supplier contracts, customer orders. Ensure Superset is secure and compliant.
Network Security
- Firewall Rules: Superset should only accept traffic from your corporate network or VPN. Restrict access by IP.
- SSL/TLS: All traffic between Superset and users must be encrypted (HTTPS). Use a valid SSL certificate (not self-signed in production).
- Database Connection: Encrypt the connection between Superset and S/4HANA. Use SSL for HANA connections.
- VPN/Bastion: If Superset is in the cloud and S/4HANA is on-premises, use a VPN tunnel or bastion host to secure the connection.
User Permissions & Access Control
Superset has role-based access control (RBAC):
- Admin: Full access to all dashboards, can manage users and settings
- Alpha: Can create and edit dashboards, access all data
- Gamma: Can view dashboards assigned to them, can’t create new dashboards
- sql_lab: Grants access to SQL Lab to write and run SQL queries against permitted databases; usually combined with Gamma or Alpha
- Public: Can view public dashboards (no login required)
For a manufacturing organisation:
- Production Supervisors: Gamma role, access to Production Operations dashboard only
- Sales Managers: Gamma role, access to Sales & Delivery dashboard
- Finance Team: Alpha role, full access to dashboards and ability to create new reports
- Data Analysts: SQL Lab role, can write custom queries
Data Masking & Sensitive Data
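If dashboards include sensitive data (customer names, contract values), mask it before it reaches a chart. Superset does not ship a dedicated masking feature, so these controls are usually implemented in the datasets or views it queries:
- Column-Level Masking: Hide specific columns from certain users by exposing them only in datasets granted to the roles that need them. E.g., publish a dataset without customer_name for non-sales users.
- Regex-Based Masking: Mask sensitive values in the dataset SQL or CDS view (e.g., replace customer name with “[REDACTED]”).
- Row-Level Security: As discussed earlier, restrict users to see only their plant/division data.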
For compliance with data protection regulations (GDPR, Australian Privacy Act), ensure:
- User access is logged and auditable
- Sensitive data is masked or restricted
- Data is encrypted in transit and at rest
Audit Logging
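Enable audit logging in Superset to track who accessed what, when. The snippet below uses Python's standard logging configuration in superset_config.py to write security-related messages to a rotating file:
import logging.config

LOGGING_CONFIG = {
'version': 1,
'disable_existing_loggers': False,  # keep Superset's own loggers active
'handlers': {
'file': {
'class': 'logging.handlers.RotatingFileHandler',
'filename': '/var/log/superset/audit.log',
'maxBytes': 10_000_000,  # rotate at roughly 10 MB
'backupCount': 10,
}
},
'loggers': {
'superset.security': {
'handlers': ['file'],
'level': 'INFO',
}
}
}
logging.config.dictConfig(LOGGING_CONFIG)  # apply the configuration
Superset also records user actions such as dashboard views, query executions, and data exports in the logs table of its metadata database, which can be browsed under Settings > Security > Action Log. Together these are useful for compliance audits and security investigations.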
Compliance Frameworks
If your manufacturing organisation needs to pass security audits (SOC 2, ISO 27001, etc.), Superset must be configured accordingly. Work with your security team to ensure:
- Access Control: User authentication, role-based permissions, MFA for admin accounts
- Encryption: Data in transit (TLS) and at rest (encrypted database, encrypted backups)
- Audit Logging: All access logged and retained for 12+ months
- Backup & Disaster Recovery: Regular backups, tested recovery procedures
- Change Management: All configuration changes tracked and approved
Many manufacturers use Vanta to automate compliance monitoring. Vanta integrates with Superset to verify security controls and generate audit reports.
Migration Strategy: BW and SAC to Superset
If you’re moving from SAP BW or SAC to Superset, here’s a phased approach that minimises disruption.
Phase 1: Assessment (Weeks 1–2)
- Inventory existing reports: List all BW/SAC reports and their users. Prioritise by usage (most-used first).
- Identify data sources: Which tables/CDS views does each report query?
- Estimate effort: Which reports are complex (many joins, calculations) vs. simple (single table, basic aggregations)?
- Plan dependencies: Some reports may depend on others. Identify the dependency chain.
Phase 2: Build Foundation (Weeks 3–6)
- Deploy Superset: Stand up a test environment, configure HANA connection, test connectivity.
- Build semantic layer: Create datasets for core tables (MARA, MARC, VBAK, etc.). Define metrics for common KPIs.
- Pilot dashboards: Rebuild 2–3 high-priority BW/SAC reports in Superset. Validate accuracy against originals.
- Performance tuning: Identify slow queries, add indexes, set up caching.
Phase 3: Migration (Weeks 7–12)
- Batch migration: Rebuild remaining reports in Superset, 5–10 per week.
- User testing: Have business users validate each dashboard against the original BW/SAC version.
- Training: Conduct training sessions on Superset for each user group.
- Parallel run: Run BW/SAC and Superset side-by-side for 2–4 weeks. Users can compare results.
Phase 4: Cutover (Week 13)
- Decommission BW/SAC reports: Once all users are confident in Superset, turn off old reports.
- Archive BW/SAC data: Keep historical data in archive for audit purposes.
- Monitor: Track dashboard performance, user adoption, and issues. Fix bugs as they arise.
Cost & Timeline
For a mid-market manufacturer (100+ BW/SAC reports, 50–100 users):
- Effort: 800–1,200 hours (10–15 weeks, 2–3 FTE)
- Cost: $80K–150K (labour + infrastructure)
- Payback: 12–18 months (vs. annual BW licensing of $200K–400K)
For a large enterprise (500+ reports, 500+ users):
- Effort: 3,000–5,000 hours (6–9 months, 5–8 FTE)
- Cost: $300K–600K
- Payback: 9–12 months
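Payback figures like these depend heavily on what is counted as ongoing cost, so it is worth running the arithmetic with your own numbers. The sketch below uses a simple payback formula (one-off cost divided by net annual saving); the inputs are hypothetical and ignore staff costs, which will lengthen the payback in practice.

```python
def payback_months(one_off_cost: float, annual_licence_saving: float,
                   annual_run_cost: float) -> float:
    """Months until cumulative net savings cover the one-off migration cost."""
    net_annual_saving = annual_licence_saving - annual_run_cost
    if net_annual_saving <= 0:
        raise ValueError("Run costs exceed savings; the migration never pays back.")
    return 12 * one_off_cost / net_annual_saving


# Hypothetical inputs: $150K migration, $200K/year licence saving,
# $60K/year infrastructure (about $5K/month).
months = payback_months(150_000, 200_000, 60_000)
print(round(months, 1))  # 12.9
```

Adding internal staffing to `annual_run_cost` is the quickest way to see how sensitive the payback period is to team size.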
Getting Started: Next Steps for Your Organisation
Ready to move SAP S/4HANA reporting to Apache Superset? Here’s how to start.
Step 1: Proof of Concept (1–2 weeks)
Build a simple PoC to validate the approach:
- Deploy Superset locally (Docker Compose)
- Connect to your S/4HANA HANA database
- Create one dataset (e.g., MARA material master)
- Build one simple dashboard (e.g., product count by category)
- Show it to stakeholders
This costs almost nothing and de-risks the project. If it works, move to Step 2.
Step 2: Partner with Experts
Building and operating Superset at scale requires expertise in:
- SAP S/4HANA table structures and CDS views
- SQL and data modelling
- Superset administration and tuning
- Kubernetes/Docker and infrastructure
- Security and compliance
Consider partnering with a specialist. At PADISO, we’ve built dozens of Superset instances for manufacturers. Our CTO as a Service team can help you design the architecture, build the semantic layer, and train your team. We’ve also delivered fixed-fee Superset rollouts in as little as 6 weeks.
Alternatively, if you have strong in-house technical talent, you can build it yourself. Either way, invest in expertise upfront—it pays off in faster deployment and fewer mistakes.
Step 3: Define Your Roadmap
Create a 12-month roadmap:
- Months 1–3: PoC + assessment + foundation build
- Months 4–9: Migrate high-priority reports
- Months 10–12: Optimise, train, decommission legacy systems
Align the roadmap with your business priorities. If supply chain optimisation is a priority, focus Phase 1 on inventory and procurement dashboards. If sales growth is the focus, prioritise sales and customer dashboards.
Step 4: Build Internal Capability
Don’t outsource everything. Build in-house capability so you can maintain and evolve dashboards over time:
- Hire or train a Superset admin: 1 FTE to manage users, permissions, infrastructure
- Hire or train a data analyst: 1–2 FTE to build dashboards and manage the semantic layer
- Train business users: Every dashboard user should understand filters, drilling down, and exporting data
This investment in people is as important as the technology investment.
Step 5: Iterate and Optimise
Once Superset is live, the work doesn’t stop. Continuously:
- Monitor performance: Track dashboard load times, query times, user adoption
- Gather feedback: Ask users what reports they need, what’s missing, what’s broken
- Iterate: Add new dashboards, refine existing ones, fix performance issues
- Stay current: Update Superset regularly (new versions every 3–4 months). Keep HANA drivers up to date
Many manufacturers find that after 6 months, they’ve rebuilt 80% of their legacy reports in Superset and are already seeing ROI. By month 12, they’ve decommissioned BW/SAC and are running entirely on Superset.
Conclusion: Why Superset Wins for Manufacturing
SAP S/4HANA is a powerful transactional system, but it’s not optimised for analytics. BW and SAC add analytics capabilities, but at high cost and with slow iteration cycles.
Apache Superset flips the model. It’s lightweight, fast, and flexible. You query S/4HANA directly, build a semantic layer that represents your business logic, and let users self-serve analytics without writing SQL.
For manufacturers, the benefits are concrete:
- 40–60% cost reduction: No BW licensing, lower infrastructure costs
- 10x faster iteration: New dashboards in days, not months
- Better data quality: Query live S/4HANA, not stale data warehouse copies
- Flexibility: Combine SAP data with external sources (suppliers, IoT, market data)
The tradeoff is that you need technical expertise to set up and maintain Superset. But for manufacturers with technical teams, this is a net win.
If you’re evaluating Superset for your organisation, start with a PoC. Spend 1–2 weeks building a simple dashboard. If it works, you’ve validated the approach. Then scale up to a full migration.
Need help? We’ve guided dozens of manufacturers through this journey. Whether you need fractional CTO support, help building the semantic layer, or a fixed-fee engagement to get Superset live in 6 weeks, we’re here to help. Reach out to PADISO to discuss your project.
The future of manufacturing analytics is open-source, SQL-native, and fast. Superset is the platform that makes it possible.