
Apache Superset for Embedded Customer Analytics in Hospitality

Design and operate embedded customer analytics on Apache Superset for hospitality. Data modelling, dashboard design, and rollout patterns.

The PADISO Team ·2026-06-04

Table of Contents

  1. Why Embedded Analytics Matter in Hospitality
  2. Understanding Apache Superset for Hospitality
  3. Data Modelling for Hospitality Analytics
  4. Dashboard Design Patterns
  5. Building Embedded Analytics Architecture
  6. Security and Access Control
  7. Implementation and Rollout Strategy
  8. Optimisation and Performance Tuning
  9. Common Pitfalls and How to Avoid Them
  10. Next Steps and Getting Started

Why Embedded Analytics Matter in Hospitality

Hospitality operators—hotels, restaurants, venues, and booking platforms—live and die by data. Revenue per available room (RevPAR), occupancy rates, guest satisfaction scores, and labour cost ratios determine survival. Yet most hospitality teams still rely on emailed spreadsheets, manual dashboards, or expensive per-seat BI tools that don’t scale across properties or franchise partners.

Embedded analytics change this. Instead of asking staff to log into a separate tool, you embed real-time insights directly into the operations platform they already use every day. A hotel manager sees occupancy forecasts and pricing recommendations in their property management system (PMS). A restaurant owner sees table turnover and food cost variance in their point-of-sale (POS) dashboard. A booking platform shows partners their performance metrics without giving them access to your entire analytics layer.

This is where Apache Superset excels. Apache Superset is an open-source analytics and visualization platform that supports embedded dashboards, role-based access control, and cost-effective deployment at scale. Unlike per-seat BI tools (Tableau, Looker, Power BI), Superset has no licence fee: your costs come from the infrastructure you run it on, not from the number of users. That means you can embed analytics for 500 hotel partners, 10,000 restaurant franchisees, or internal teams without the licensing bill exploding.

Hospitality organisations using embedded Superset analytics typically see:

  • 40–60% faster decision-making because data lives where decisions happen
  • 20–30% improvement in operational metrics (occupancy, turnover, labour efficiency) through real-time visibility
  • 50%+ reduction in BI licensing costs compared to per-seat tools
  • Faster onboarding for new properties or partners, since dashboards are pre-built and role-locked

This guide walks you through the complete design and operation of embedded customer analytics on Superset for hospitality, from data modelling through rollout and optimisation.


Understanding Apache Superset for Hospitality

What Makes Superset Suitable for Hospitality

Apache Superset is a modern, open-source analytics platform built for speed and simplicity. It’s not a data warehouse—it’s a visualization and query layer that sits on top of your data infrastructure (PostgreSQL, MySQL, Snowflake, BigQuery, ClickHouse, etc.).

For hospitality, Superset’s strengths are clear:

Cost efficiency: No per-seat licensing. You pay for the infrastructure (a cloud VM or Kubernetes cluster) and the data warehouse underneath. Scale to 5,000 embedded users for the same cost that Tableau charges for 50.

Embedded-first design: Superset’s embedding API and token-based authentication make it trivial to embed dashboards in your own application. A guest can see their booking analytics without leaving your platform; a property manager sees revenue reports in their PMS.

SQL flexibility: Superset lets you write SQL directly or use its visual query builder. For hospitality, this means you can model complex metrics (RevPAR, ADR, length of stay trends) without waiting for a data engineer to build a semantic layer.

Role-based access control (RBAC): You control which users see which dashboards, which data, and which filters. A franchisee sees only their property; a corporate user sees all properties; a guest sees only their booking.

Open-source and deployable anywhere: Run Superset on your own servers, in your own cloud account, or on a managed platform like Preset. No vendor lock-in.

Superset vs. Alternatives for Hospitality

You might be comparing Superset to Tableau, Looker, Power BI, or Metabase. Here’s the trade-off:

Tableau: Excellent UX, powerful data storytelling, but per-seat licensing ($70–$100/user/month) makes it prohibitive for embedding across hundreds of partners. Better for internal analytics teams, not customer-facing.

Looker: Tightly integrated with Google Cloud, strong semantic layer (LookML), but again, per-seat licensing and slower embedded workflows. Best for enterprises with large BI teams.

Power BI: Microsoft ecosystem integration is strong, but licensing is complex and per-seat. Embedding requires separate licensing tiers.

Metabase: Open-source and simpler than Superset, but less flexible for complex hospitality metrics and weaker embedding support.

Superset: Open-source, embedding with no per-user licence cost, SQL-first, and costs that scale with infrastructure rather than user count. Trade-off: requires more hands-on data engineering and SQL knowledge than Tableau or Looker.

For hospitality at scale—especially multi-property operations, franchise networks, or booking platforms—Superset is the pragmatic choice.


Data Modelling for Hospitality Analytics

Core Hospitality Metrics and Dimensions

Before you build a single dashboard, you need a clean data model. Hospitality data is inherently temporal and hierarchical: properties → rooms/tables → bookings/transactions → guests. Your data model must reflect this structure.

Core fact tables (measurable events):

  • Bookings: One row per booking. Columns: booking_id, property_id, room_id, guest_id, check_in_date, check_out_date, room_type, rate_type, total_revenue, currency, booking_source, cancellation_flag.
  • Transactions: One row per transaction (payment, refund, ancillary charge). Columns: transaction_id, booking_id, property_id, transaction_date, amount, transaction_type (room, food, parking, etc.), payment_method.
  • Operational events: One row per shift, service, or activity. Columns: event_id, property_id, event_date, event_type (housekeeping, maintenance, staff shift), duration, cost, status.
  • Guest interactions: One row per review, contact, or complaint. Columns: interaction_id, booking_id, guest_id, interaction_date, interaction_type (review, complaint, feedback), sentiment_score, resolution_time.

Core dimension tables (context):

  • Properties: property_id, property_name, city, country, property_type (hotel, restaurant, venue), star_rating, opening_date.
  • Rooms: room_id, property_id, room_type, capacity, amenities, rate_category.
  • Guests: guest_id, guest_name, country, loyalty_status, lifetime_value, first_booking_date.
  • Dates: date, day_of_week, week, month, quarter, year, is_holiday, is_weekend.
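The Dates dimension is usually generated rather than loaded from a source system. A minimal sketch of one way to do that, assuming ISO week numbering and a holiday set that you supply yourself (the function name and the holidays parameter are illustrative, not part of any library):

```python
from datetime import date, timedelta

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def build_date_dimension(start: date, end: date, holidays=frozenset()) -> list:
    """Return one dict per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        # isocalendar() yields (ISO year, ISO week, ISO weekday 1..7)
        _, iso_week, iso_weekday = current.isocalendar()
        rows.append({
            "date": current,
            "day_of_week": DAY_NAMES[iso_weekday - 1],
            "week": iso_week,
            "month": current.month,
            "quarter": (current.month - 1) // 3 + 1,
            "year": current.year,
            "is_holiday": current in holidays,   # holidays is an assumed input
            "is_weekend": iso_weekday >= 6,      # Saturday or Sunday
        })
        current += timedelta(days=1)
    return rows


rows = build_date_dimension(date(2024, 1, 5), date(2024, 1, 7))
```

Load the result into your warehouse once and extend it yearly; joining facts to this table keeps weekday and holiday logic out of individual dashboard queries.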

Designing the Data Warehouse Layer

You don’t need a complex data warehouse for hospitality analytics. A single PostgreSQL or MySQL database with well-indexed tables is often enough for a small-to-mid-sized operation. For larger organisations (100+ properties), use ClickHouse, Snowflake, or BigQuery.

Schema design principles:

  1. Denormalise for query speed: Store derived metrics (total_revenue, guest_count, occupancy_rate) in the fact table rather than calculating them in every query. This is faster and more consistent.

  2. Use date-based partitioning: Partition booking and transaction tables by check_in_date or transaction_date. This dramatically speeds up date-range queries (e.g., “last 12 months”).

  3. Pre-aggregate where possible: Create a daily summary table with one row per property per date, containing occupancy, revenue, guest count, and average rating. This is faster than querying millions of booking rows every time someone opens a dashboard.

  4. Maintain a slowly changing dimension for properties: Track property metadata changes (name, city, star rating) over time. Use effective_date and end_date columns so historical reports are accurate.

  5. Soft-delete bookings and transactions: Don’t actually delete rows; mark them as deleted with a deleted_flag or deleted_date. This preserves historical accuracy and makes auditing easier.
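To make principle 4 concrete, here is a small sketch of how a report picks the property version that was valid on a given date. The sample rows and the property_as_of helper are hypothetical; the only things taken from the text are the effective_date and end_date columns. It assumes half-open ranges (effective_date inclusive, end_date exclusive) and an end_date of None for the current version.

```python
from datetime import date
from typing import Optional

# Hypothetical history for one property: its star rating changed on 2023-07-01.
PROPERTY_HISTORY = [
    {"property_id": 1, "star_rating": 3,
     "effective_date": date(2020, 1, 1), "end_date": date(2023, 7, 1)},
    {"property_id": 1, "star_rating": 4,
     "effective_date": date(2023, 7, 1), "end_date": None},
]


def property_as_of(history: list, property_id: int, on: date) -> Optional[dict]:
    """Return the version of a property that was valid on the given date."""
    for row in history:
        if row["property_id"] != property_id:
            continue
        started = row["effective_date"] <= on
        not_ended = row["end_date"] is None or on < row["end_date"]
        if started and not_ended:
            return row
    return None  # no version covers that date


rating_2022 = property_as_of(PROPERTY_HISTORY, 1, date(2022, 5, 1))["star_rating"]
rating_now = property_as_of(PROPERTY_HISTORY, 1, date(2024, 5, 1))["star_rating"]
```

In SQL the same rule becomes a join condition on the booking date falling between effective_date and end_date, so a 2022 booking is reported against the 3-star version of the property.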

Sample SQL for Key Hospitality Metrics

Once your data model is in place, define reusable SQL snippets or views for common metrics. Superset can then reference these views, keeping dashboards simple and consistent.

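-- Assumptions: bookings carries room_id, check-in/check-out are DATE columns
-- (PostgreSQL), and total_revenue is room revenue for the whole stay.
-- Simplification: each booking is attributed to its check-in date. For
-- per-night accuracy, expand bookings into one row per stay night first.

-- Revenue Per Available Room (RevPAR) = room revenue / rooms available
SELECT
  b.property_id,
  DATE(b.check_in_date) AS date,
  SUM(b.total_revenue) / NULLIF(r.rooms_available, 0) AS revpar
FROM bookings b
JOIN (
  SELECT property_id, COUNT(*) AS rooms_available
  FROM rooms
  GROUP BY property_id
) r ON r.property_id = b.property_id
WHERE b.check_in_date >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY b.property_id, DATE(b.check_in_date), r.rooms_available;

-- Occupancy Rate = rooms occupied / rooms available
SELECT
  property_id,
  DATE(check_in_date) AS date,
  COUNT(DISTINCT room_id) * 100.0 / (
    SELECT COUNT(*) FROM rooms WHERE property_id = b.property_id
  ) AS occupancy_pct
FROM bookings b
WHERE check_in_date >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY property_id, DATE(check_in_date);

-- Average Daily Rate (ADR) = room revenue / room nights sold
SELECT
  property_id,
  DATE(check_in_date) AS date,
  SUM(total_revenue) / NULLIF(SUM(check_out_date - check_in_date), 0) AS adr
FROM bookings
WHERE check_in_date >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY property_id, DATE(check_in_date);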

Store these as views or materialised tables in your warehouse. Superset will query them directly, keeping dashboards fast and maintainable.
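A quick way to sanity-check whatever views you build is to compute the three metrics by hand for one property-night. The sketch below uses the standard industry definitions (RevPAR is room revenue over rooms available, ADR is room revenue over rooms sold, occupancy is rooms sold over rooms available). The hotel_kpis function name and the sample figures are illustrative.

```python
import math


def hotel_kpis(room_revenue: float, rooms_sold: int, rooms_available: int) -> dict:
    """Compute occupancy, ADR and RevPAR for one property over one night."""
    if rooms_available <= 0:
        raise ValueError("rooms_available must be positive")
    occupancy = rooms_sold / rooms_available
    # ADR is undefined when nothing was sold; report 0.0 in that case
    adr = room_revenue / rooms_sold if rooms_sold else 0.0
    revpar = room_revenue / rooms_available
    return {"occupancy": occupancy, "adr": adr, "revpar": revpar}


# Example night: 100 rooms, 80 sold, 12,000 in room revenue
kpis = hotel_kpis(room_revenue=12_000.0, rooms_sold=80, rooms_available=100)

# RevPAR should always equal ADR x occupancy (up to floating-point error)
identity_holds = math.isclose(kpis["revpar"], kpis["adr"] * kpis["occupancy"])
```

For this night occupancy is 0.8, ADR is 150.0 and RevPAR is 120.0. If a dashboard's RevPAR does not equal ADR times occupancy for the same period, the denominator in one of the queries is probably wrong.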


Dashboard Design Patterns

The Role-Based Dashboard Strategy

You won’t have one dashboard. You’ll have a family of dashboards, each tailored to a specific user role and use case. This is the key to adoption.

Property Manager Dashboard (daily operations):

  • Occupancy rate (today, this week, this month vs. last year)
  • Revenue (today, week-to-date, month-to-date)
  • Top booking sources
  • Staff scheduling and labour costs
  • Guest complaints and resolution status
  • Housekeeping and maintenance queue

Revenue Manager Dashboard (pricing and forecasting):

  • RevPAR by room type and date
  • Occupancy forecast (next 30 days)
  • Average daily rate (ADR) trend
  • Booking curve (pace of bookings vs. last year)
  • Cancellation rate by source
  • Pricing recommendations (if using dynamic pricing logic)

Guest Experience Dashboard (quality and loyalty):

  • Average guest rating by property
  • Net Promoter Score (NPS) trend
  • Complaint categories and resolution time
  • Guest repeat rate
  • Loyalty program enrollment and spend
  • Guest lifetime value by cohort

Corporate/Franchise Dashboard (portfolio view):

  • Revenue, occupancy, and RevPAR across all properties
  • Year-over-year comparison
  • Peer benchmarking (property A vs. property B)
  • Top and bottom performers
  • Variance analysis (why did occupancy drop?)
  • Drill-down to individual properties

Finance Dashboard (P&L and cost analysis):

  • Total revenue by property and category (rooms, food, parking, etc.)
  • Cost breakdown (labour, utilities, supplies, marketing)
  • Gross operating profit (GOP) and margin
  • Cash flow and receivables
  • Budget vs. actual

Each dashboard should be single-purpose and filterable. A property manager should be able to filter by date range, room type, and booking source. A revenue manager should filter by property, rate plan, and market segment. This is where Superset’s filter and drill-down capabilities shine.

Design Principles for Embedded Dashboards

Embedded dashboards live inside your application, not in a separate tool. This means they must feel native and fast.

1. Minimise cognitive load: Show 4–6 key metrics per dashboard. Don’t cram 20 charts into one view. If a user needs to see more, they drill down or navigate to a detail dashboard.

2. Lead with the key metric: Place the most important number (revenue, occupancy, or rating) at the top left in a large card. Use colour coding: green for on-target, yellow for caution, red for below target.

3. Show trend, not just current value: A 95% occupancy rate means nothing without context. Is it up from 92% last week? Down from 98% last year? Add a sparkline or small trend chart next to each metric.

4. Use consistent colour schemes: Assign colours to properties, room types, or booking sources and stick with them across all dashboards. This makes pattern-spotting instant.

5. Make filters obvious: Filters should be at the top of the dashboard, not hidden in a sidebar. A property manager should see “Property: Grand Hotel Sydney” and a date range picker immediately.

6. Avoid real-time updates for embedded dashboards: If you embed a dashboard in your app and it refreshes every 30 seconds, it will drain your data warehouse and frustrate users. Refresh every 1–4 hours instead. Superset lets you set a cache timeout per chart, dataset, or database.

7. Optimise for mobile: If your app is mobile-first (many hospitality apps are), design dashboards to stack vertically. Avoid wide tables; use cards instead.
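Principles 2 and 3 come down to two small calculations per metric card: a status colour against target, and a change against a comparison period. A hedged sketch follows; the 5% caution band and both function names are assumptions to tune per metric, not Superset features.

```python
def metric_status(actual: float, target: float, caution_band: float = 0.05) -> str:
    """Return 'green' at or above target, 'yellow' when within
    caution_band (as a fraction of target) below it, otherwise 'red'."""
    if target <= 0:
        raise ValueError("target must be positive")
    shortfall = (target - actual) / target
    if shortfall <= 0:
        return "green"
    if shortfall <= caution_band:
        return "yellow"
    return "red"


def change_in_points(current_rate: float, previous_rate: float) -> float:
    """Change between two rates expressed in percentage points."""
    return round((current_rate - previous_rate) * 100, 1)


status = metric_status(actual=0.87, target=0.90)   # 3.3% under target
delta = change_in_points(0.95, 0.92)               # occupancy up from 92% to 95%
```

You can compute these in the SQL behind each card or in your application before rendering; what matters is that every dashboard uses the same thresholds.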

Sample Dashboard Layouts

Property Manager Dashboard:

[Filter: Property | Date Range]

[Occupancy %]  [Revenue (AUD)]  [Avg Rating]  [Complaints]

[Booking Sources - Bar Chart] [Revenue Trend - Line Chart]

[Labour Cost vs Budget] [Housekeeping Queue]

Revenue Manager Dashboard:

[Filter: Property | Room Type | Date Range]

[RevPAR]  [Occupancy %]  [ADR]  [Booking Pace]

[RevPAR Trend - Line] [Occupancy Forecast - Area]

[ADR by Room Type - Bar] [Cancellation Rate - Gauge]

Keep layouts grid-based and responsive. Superset’s drag-and-drop builder makes this easy.


Building Embedded Analytics Architecture

Superset Deployment Options for Hospitality

You have three main deployment options:

1. Self-hosted Superset (on your cloud account or on-premises):

  • Full control over data security and compliance
  • No third-party vendor access to your data
  • Lower cost at scale (no SaaS markup)
  • Requires DevOps and ongoing maintenance
  • Best for: Large enterprises, regulated industries, or organisations with in-house infrastructure teams

2. Preset (Superset-as-a-Service):

  • Managed hosting and updates
  • Built-in embedding and SSO
  • Preset handles scaling and backups
  • Higher cost per user/dashboard (but no infrastructure cost)
  • Best for: Mid-market organisations, faster time-to-value, less DevOps overhead

3. Hybrid (self-hosted for internal, Preset for customer-facing):

  • Internal teams use self-hosted Superset for cost and control
  • Customer-facing dashboards use Preset’s embedded dashboard service for reliability and ease
  • Balances cost and operational burden

For most hospitality organisations, self-hosted Superset is the right choice. You control the data, the rollout, and the cost. If you’re building a booking platform or multi-tenant SaaS, self-hosted gives you the flexibility to embed dashboards for thousands of customers without per-seat licensing.

Architecture: Data Flow from PMS to Dashboard

Here’s a typical architecture for a hotel chain:

Property Management System (PMS)
  ↓ (nightly ETL or real-time sync)
Data Warehouse (PostgreSQL, ClickHouse, or Snowflake)
  ↓ (SQL queries)
Apache Superset
  ↓ (embedded dashboards via API)
Your Application (web or mobile)
  ↓
Property Managers, Revenue Managers, Guests

Data ingestion: Extract data from your PMS (Opera, Micros, Protel, etc.) nightly or via real-time API. Transform it into the data model described earlier. Load it into your warehouse. Use tools like Fivetran, Stitch, or custom Python scripts for this.

Superset instance: Deploy Superset on a Kubernetes cluster or a cloud VM (AWS EC2, Google Compute, Azure VM). Allocate 2–4 CPU cores and 4–8 GB RAM for small-to-mid deployments. Use a managed PostgreSQL instance for Superset’s metadata store.

Embedding layer: Use Superset’s REST API and token-based authentication to embed dashboards. When a property manager logs into your app, your backend requests a short-lived guest token (a signed JWT) from Superset that allows them to view their dashboard. No Superset login required.

Caching: Use Redis to cache query results. This is critical for embedded dashboards—if 500 property managers open the same dashboard at 8 AM, you don’t want 500 warehouse queries. Cache the result for 1–4 hours.
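The effect of the cache is easy to see in miniature. The sketch below is an in-memory stand-in for what Redis does for Superset, not Superset's own implementation. TTLCache, run_occupancy_query and the cache key are all hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh entry: skip the warehouse
        value = compute()            # missing or expired: run the query
        self._store[key] = (now, value)
        return value


warehouse_calls = {"count": 0}


def run_occupancy_query() -> dict:
    """Stand-in for an expensive warehouse query."""
    warehouse_calls["count"] += 1
    return {"occupancy_pct": 82.5}


cache = TTLCache(ttl_seconds=3600)  # 1 hour, the short end of the 1–4 hour range
for _ in range(500):                # 500 managers open the same chart
    cache.get_or_compute("occupancy:portfolio", run_occupancy_query)
```

After the loop the warehouse has been queried once. Note that charts filtered by row-level security produce different queries per property, so in practice you get one cached result per distinct filter set rather than one overall.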

Embedding Dashboards: The Technical Pattern

Superset’s embedding API is straightforward. Here’s the pattern:

  1. Create a dashboard in Superset with filters (property_id, date_range, etc.).
  2. Generate a guest token from your backend, specifying which user can see what:
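    import requests
    
    # Superset expects the identity under a "user" object. dashboard_id is the
    # embedded-dashboard UUID shown in the dashboard's "Embed dashboard" dialog.
    response = requests.post(
      'https://your-superset.com/api/v1/security/guest_token/',
      json={
        'user': {'username': 'guest_user', 'first_name': 'Guest', 'last_name': 'User'},
        'resources': [{'type': 'dashboard', 'id': dashboard_id}],
        'rls': [{
          # int() stops a crafted property id from injecting SQL into the clause
          'clause': f"property_id = {int(user_property_id)}"
        }]
      },
      headers={'Authorization': f'Bearer {admin_token}'},
      timeout=10,
    )
    response.raise_for_status()  # fail loudly instead of embedding with a bad token
    guest_token = response.json()['token']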
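  3. Embed the dashboard in your app. Superset’s supported route is the @superset-ui/embedded-sdk package, which creates the iframe and hands it the guest token (the token is not passed as a URL parameter):
    import { embedDashboard } from "@superset-ui/embedded-sdk";
    
    embedDashboard({
      id: dashboardId, // embedded-dashboard UUID, as used in step 2
      supersetDomain: "https://your-superset.com",
      mountPoint: document.getElementById("dashboard-container"),
      // fetchGuestTokenFromBackend is a placeholder for a call to your own backend
      fetchGuestToken: () => fetchGuestTokenFromBackend(),
    });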
  4. Apply row-level security (RLS) so the guest token can only query their own data. The rls clause in step 2 is carried inside the signed token: a tampered token fails signature verification, and a valid one only returns rows where property_id matches the user’s property.

This pattern scales to thousands of embedded users. Each user sees only their data, and Superset handles caching and query optimisation.
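Because the RLS clause is a raw SQL fragment, it is worth building the guest-token request body in one place and validating the property ID before it reaches the clause. A sketch, assuming the request shape used by Superset's guest-token endpoint (user, resources, rls) as we understand it; check the keys against the API docs for your Superset version. build_guest_token_payload and its parameter names are illustrative.

```python
def build_guest_token_payload(username: str, dashboard_uuid: str, property_id) -> dict:
    """Build a guest-token request body scoped to a single property."""
    # Accept only ints or integer-like strings. bool is excluded explicitly
    # because it is a subclass of int in Python.
    if isinstance(property_id, bool) or not isinstance(property_id, (int, str)):
        raise TypeError("property_id must be an int or a numeric string")
    safe_property_id = int(property_id)  # raises ValueError on '42 OR 1=1'
    return {
        "user": {"username": username, "first_name": "Embedded", "last_name": "User"},
        "resources": [{"type": "dashboard", "id": dashboard_uuid}],
        "rls": [{"clause": f"property_id = {safe_property_id}"}],
    }


payload = build_guest_token_payload("pm_42", "your-embedded-dashboard-uuid", "42")
clause = payload["rls"][0]["clause"]
```

The resulting clause is "property_id = 42". Deriving property_id from your own session data, never from anything the browser sends, is what actually keeps tenants apart; the integer check is a second line of defence.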


Security and Access Control

Row-Level Security (RLS) for Multi-Property Operations

In a hotel chain or franchise network, security is paramount. A property manager in Melbourne must not see data from a property in Sydney. A guest must not see booking data from other guests.

Superset’s RLS feature enforces this in its query layer. When a user opens a dashboard, Superset automatically appends a WHERE clause, based on their role and attributes, to every query it generates against the protected datasets.

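Example RLS configuration (the {user_id} and {user_region} placeholders are illustrative; in practice the values come from Jinja templating, such as {{ current_user_id() }}, or from the rls clause in a guest token):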

Role: Property Manager
RLS clause: property_id IN (SELECT property_id FROM property_assignments WHERE manager_id = {user_id})

Role: Regional Manager
RLS clause: property_id IN (SELECT property_id FROM properties WHERE region = {user_region})

Role: Guest
RLS clause: booking_id IN (SELECT booking_id FROM bookings WHERE guest_id = {user_id})

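Every chart query against a protected dataset is rewritten to include this clause, so a user cannot remove it by changing dashboard filters or calling the chart API. Note that the rule is applied by Superset, not by the database itself: anyone with direct database credentials falls outside it, and SQL Lab may not apply RLS depending on your Superset version and configuration. For defence in depth, pair Superset RLS with least-privilege database users or the database’s own row-level policies.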

Authentication and Single Sign-On (SSO)

If you’re embedding dashboards in your app, users should never see a Superset login screen. Use SSO or JWT tokens.

Option 1: OIDC/SAML SSO. Configure Superset to trust your identity provider (Okta, Auth0, Azure AD, etc.). When a user logs into your app, they’re automatically authenticated in Superset.

Option 2: JWT tokens. Generate short-lived JWT tokens in your backend and have Superset validate them. Superset does not accept arbitrary JWTs out of the box, so this usually means a custom security manager; it is the simpler route only if you already have an auth system that issues JWTs.

Option 3: Guest tokens (for public or semi-public dashboards). Superset’s guest token API lets you create short-lived, scoped tokens for users who have no Superset account. Useful for sharing dashboards with partners or guests.
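For the guest-token route, a handful of settings in superset_config.py control embedding. The fragment below is a sketch using setting names from Superset's default configuration as we understand them; confirm them against the config.py shipped with your version. The environment variable name is an assumption.

```python
import os

# Turn on the embedded-dashboard feature
FEATURE_FLAGS = {"EMBEDDED_SUPERSET": True}

# Role whose permissions guest-token users inherit; grant it only what
# embedded viewers need
GUEST_ROLE_NAME = "Public"

# Secret used to sign guest tokens. SUPERSET_GUEST_TOKEN_SECRET is a
# hypothetical variable name; never ship the fallback value.
GUEST_TOKEN_JWT_SECRET = os.environ.get(
    "SUPERSET_GUEST_TOKEN_SECRET", "change-me-before-production"
)

# Token lifetime in seconds; short lifetimes limit the damage if a token leaks
GUEST_TOKEN_JWT_EXP_SECONDS = 300
```

With a five-minute lifetime your frontend has to request a fresh token periodically, so the backend endpoint that issues tokens should be cheap and should re-check the user's session each time.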

Data Encryption and Compliance

Hospitality data includes personally identifiable information (PII): guest names, emails, phone numbers, payment methods. You must encrypt this data in transit and at rest.

In transit: Use HTTPS/TLS for all Superset traffic. Enforce this at the load balancer level.

At rest: Encrypt your data warehouse backups. If using AWS, enable RDS encryption. If using PostgreSQL on-premises, use encrypted filesystems or column-level encryption.

PII handling: Mask or redact PII in dashboards where possible. For example, show guest names as “Guest A”, “Guest B” instead of full names. Show payment methods as “Card ending in 1234” instead of full card numbers.

For organisations subject to the Australian Privacy Act or GDPR, consider implementing SOC 2 and ISO 27001 compliance frameworks. PADISO can help you design and audit your analytics infrastructure to meet these standards, ensuring your embedded analytics platform is audit-ready and secure.


Implementation and Rollout Strategy

Phase 1: Foundation (Weeks 1–4)

Week 1–2: Data modelling and warehouse setup

  • Design your fact and dimension tables (as described earlier)
  • Set up a PostgreSQL or ClickHouse instance
  • Build ETL pipelines to ingest data from your PMS
  • Validate data quality (no nulls in key columns, consistent date formats, etc.)

Week 3: Superset deployment

  • Deploy Superset on a cloud VM or Kubernetes cluster
  • Configure PostgreSQL as the metadata store
  • Connect Superset to your data warehouse
  • Set up Redis for caching
  • Configure HTTPS and basic authentication

Week 4: First dashboard

  • Build a simple Property Manager Dashboard with occupancy, revenue, and guest count
  • Test with a small group of internal users
  • Gather feedback and iterate

Phase 2: Scaling (Weeks 5–8)

Week 5–6: Multi-dashboard build-out

  • Build Revenue Manager, Guest Experience, and Corporate dashboards
  • Implement RLS so each user sees only their data
  • Set up caching to optimise query performance

Week 7: Embedding and SSO

  • Implement guest tokens or JWT authentication
  • Embed dashboards in your application
  • Test with property managers from 2–3 properties
  • Measure load times and query performance

Week 8: Rollout to pilot properties

  • Deploy to 5–10 properties
  • Provide training and documentation
  • Monitor performance and gather feedback
  • Fix bugs and optimise slow queries

Phase 3: Full Rollout (Weeks 9–12)

Week 9–10: Rollout to all properties

  • Deploy to all properties simultaneously or in waves
  • Provide on-site training if needed
  • Monitor system performance and user adoption

Week 11–12: Optimisation and handover

  • Identify and optimise slow queries
  • Add new metrics based on user feedback
  • Document the system for ongoing maintenance
  • Train your ops team to manage Superset

Change Management and Training

Technology is only half the battle. You need to prepare your teams for change.

Pre-rollout communication: Explain why you’re implementing analytics (better decisions, faster insights, less manual work). Show examples of how dashboards will help their daily work.

Training: Provide role-specific training. Property managers need to know how to filter by date and room type. Revenue managers need to understand RevPAR and occupancy forecasts. Keep training to 30 minutes—people are busy.

Quick reference guides: Create 1-page cheat sheets for each dashboard. Show what each metric means and what action to take if it’s red.

Support channel: Assign someone (or a team) to answer questions during the first month. Most questions will be simple (“How do I filter by room type?”). Having quick answers builds confidence.

Celebrate wins: When a property manager catches a pricing issue early or optimises labour based on a dashboard, highlight it. This builds momentum and shows ROI.


Optimisation and Performance Tuning

Query Performance: The Critical Path

If a dashboard takes 10 seconds to load, users won’t use it. Optimisation is non-negotiable.

1. Index your data warehouse:

  • Add indexes on foreign keys (property_id, guest_id, room_id)
  • Add indexes on date columns (check_in_date, transaction_date)
  • Add indexes on filter columns (booking_source, rate_type)
CREATE INDEX idx_bookings_property_date ON bookings(property_id, check_in_date);
CREATE INDEX idx_bookings_guest ON bookings(guest_id);
CREATE INDEX idx_transactions_property_date ON transactions(property_id, transaction_date);

2. Pre-aggregate data: Instead of summing millions of booking rows every time someone opens a dashboard, pre-aggregate into a daily summary table:

CREATE TABLE daily_summary AS
SELECT
  property_id,
  DATE(check_in_date) AS summary_date,
  COUNT(*) AS booking_count,
  SUM(total_revenue) AS daily_revenue,
  AVG(total_revenue) AS avg_booking_value,
  COUNT(DISTINCT room_id) AS rooms_occupied
FROM bookings
GROUP BY property_id, DATE(check_in_date);

CREATE INDEX idx_daily_summary_property_date ON daily_summary(property_id, summary_date);

Now your dashboards query this summary table instead of the raw bookings table. Queries run 10–100x faster.

3. Use materialised views: Create views for common metrics so they’re pre-computed:

CREATE MATERIALIZED VIEW revpar_by_property AS
SELECT
  b.property_id,
  DATE(b.check_in_date) AS date,
  -- RevPAR divides by rooms available, not rooms booked
  SUM(b.total_revenue) / NULLIF(r.rooms_available, 0) AS revpar,
  COUNT(DISTINCT b.room_id) * 100.0 / NULLIF(r.rooms_available, 0) AS occupancy_pct
FROM bookings b
JOIN (
  SELECT property_id, COUNT(*) AS rooms_available
  FROM rooms
  GROUP BY property_id
) r ON r.property_id = b.property_id
GROUP BY b.property_id, DATE(b.check_in_date), r.rooms_available;

REFRESH MATERIALIZED VIEW revpar_by_property;

Refresh this view nightly. Superset queries it instead of computing RevPAR on the fly.

4. Partition large tables: If you have millions of bookings, partition by date:

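-- PostgreSQL declarative partitioning (MySQL uses a different PARTITION BY syntax)
CREATE TABLE bookings (
  booking_id INT,
  property_id INT,
  check_in_date DATE,
  ...
) PARTITION BY RANGE (check_in_date);

CREATE TABLE bookings_2023 PARTITION OF bookings
  FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE bookings_2024 PARTITION OF bookings
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE bookings_2025 PARTITION OF bookings
  FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');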

When a query filters by date range, the database only scans the relevant partitions.

Caching Strategy

Superset’s caching layer (Redis) is critical for embedded dashboards.

Cache settings:

  • Short TTL (1 hour) for dashboards that change frequently (today’s revenue, current occupancy)
  • Medium TTL (4 hours) for dashboards that update daily (yesterday’s metrics)
  • Long TTL (24 hours) for historical comparisons (year-over-year revenue)
# In superset_config.py
# DATA_CACHE_CONFIG caches chart query results; CACHE_CONFIG covers
# Superset's metadata cache.
DATA_CACHE_CONFIG = {
  'CACHE_TYPE': 'RedisCache',
  'CACHE_REDIS_URL': 'redis://localhost:6379/0',
  'CACHE_DEFAULT_TIMEOUT': 3600,  # 1 hour default
}

# Overrides are set in the Superset UI per chart, dataset, or database
# (the "Cache timeout" field), not per dashboard:
# Charts on Property Manager Daily: cache timeout 3600 (1 hour)
# Charts on Historical Trends: cache timeout 86400 (24 hours)

Monitoring and Alerting

Set up monitoring to catch performance issues early.

Key metrics to monitor:

  • Query latency (p50, p95, p99)
  • Cache hit rate
  • Database CPU and memory
  • Superset API response time
  • Number of concurrent users

Tools:

  • Prometheus + Grafana for infrastructure monitoring
  • Superset’s own metrics (emitted through the STATS_LOGGER setting, for example to StatsD) and its action log
  • Database query logs (PostgreSQL slow query log, ClickHouse system.query_log)

Alerts:

  • Query latency > 5 seconds
  • Cache hit rate < 70%
  • Database CPU > 80%
  • Superset API error rate > 1%

When an alert fires, check the slow query log, identify the offending query, and optimise it (add an index, pre-aggregate, or rewrite the query).
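The alert rules above are simple enough to prototype before wiring them into Prometheus or Grafana. A sketch using only the standard library; it assumes the latency alert is evaluated on p95 (the text does not specify a percentile), and the function names are illustrative.

```python
from statistics import quantiles


def latency_percentiles(latencies_s: list) -> dict:
    """Return p50, p95 and p99 of query latencies given in seconds."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile
    cuts = quantiles(latencies_s, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def firing_alerts(latencies_s: list, cache_hits: int, cache_misses: int,
                  db_cpu_pct: float, api_error_rate: float) -> list:
    """Evaluate the four alert thresholds and return those that fire."""
    alerts = []
    lookups = cache_hits + cache_misses
    hit_rate = cache_hits / lookups if lookups else 1.0
    if latency_percentiles(latencies_s)["p95"] > 5.0:  # p95 is an assumption
        alerts.append("query latency > 5 seconds")
    if hit_rate < 0.70:
        alerts.append("cache hit rate < 70%")
    if db_cpu_pct > 80:
        alerts.append("database CPU > 80%")
    if api_error_rate > 0.01:
        alerts.append("API error rate > 1%")
    return alerts


# 90 fast queries and 10 slow ones; 60% cache hit rate
sample = [0.2] * 90 + [8.0] * 10
alerts = firing_alerts(sample, cache_hits=60, cache_misses=40,
                       db_cpu_pct=50.0, api_error_rate=0.0)
```

For this sample both the latency and cache-hit alerts fire. Alerting on a percentile rather than the mean matters here: the mean latency is under one second even though one query in ten takes eight.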


Common Pitfalls and How to Avoid Them

Pitfall 1: No Data Quality Checks

Problem: Your data warehouse has nulls, duplicates, or inconsistent date formats. Dashboards show incorrect metrics. Users lose trust.

Solution: Implement data validation in your ETL pipeline.

import pandas as pd

# Validation checks; df is the freshly loaded bookings DataFrame
assert df['property_id'].notna().all(), "Null property_id"
assert df['check_in_date'].notna().all(), "Null check_in_date"
assert pd.api.types.is_numeric_dtype(df['total_revenue']), "Revenue is not numeric"
assert (df['total_revenue'] >= 0).all(), "Negative revenue"
assert not df.duplicated(subset=['booking_id']).any(), "Duplicate bookings"

Run these checks after every data load. If they fail, halt the load and alert your data team.

Pitfall 2: Slow Queries Kill Adoption

Problem: A dashboard takes 15 seconds to load. Users wait, get frustrated, and stop using it.

Solution: Measure query latency during development. Set a target of < 2 seconds for interactive dashboards. Use the query profiling steps described in the “Optimisation” section.

Pitfall 3: RLS Misconfiguration Leaks Data

Problem: A property manager in Melbourne accidentally sees data from Sydney. Compliance nightmare.

Solution: Test RLS thoroughly. Create test users with different roles and verify they see only their data. Automate RLS tests:

# Test: a Melbourne property manager sees only Melbourne rows.
# generate_guest_token and query_dashboard are hypothetical helpers that wrap
# your backend and Superset's API; query_dashboard returns a list of row dicts.
token = generate_guest_token(user_id=manager_1, property_id=melbourne_property)
rows = query_dashboard(token, dashboard_id=property_dashboard)
properties_seen = {row['property'] for row in rows}
assert properties_seen == {'Melbourne Property'}, f"RLS leak: {properties_seen}"

Pitfall 4: Metrics Misalignment

Problem: Your Revenue Manager dashboard shows RevPAR = $150, but the PMS shows $145. Users don’t trust the dashboard.

Solution: Reconcile metrics between your dashboard and source systems. For each key metric, write a reconciliation query:

-- Reconciliation: Dashboard revenue vs. PMS revenue
SELECT
  'Dashboard' AS source,
  SUM(total_revenue) AS total
FROM bookings
WHERE property_id = 1 AND check_in_date = '2024-01-15'
UNION ALL
SELECT
  'PMS' AS source,
  SUM(revenue) AS total
FROM pms_export
WHERE property_id = 1 AND check_in_date = '2024-01-15';

If the numbers differ, investigate. Is it a data type issue? A rounding difference? A missing filter? Fix it before rolling out.
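Once the two totals are in hand, the comparison itself should be exact-decimal and tolerance-based so that rounding noise does not raise false alarms. A small sketch; the reconcile function and the one-cent default tolerance are assumptions.

```python
from decimal import Decimal


def reconcile(dashboard_total: str, pms_total: str,
              tolerance: Decimal = Decimal("0.01")) -> dict:
    """Compare two currency totals, passed as strings to avoid float error."""
    difference = Decimal(dashboard_total) - Decimal(pms_total)
    return {
        "difference": difference,
        "within_tolerance": abs(difference) <= tolerance,
    }


matching = reconcile("15000.00", "15000.00")
mismatch = reconcile("150.00", "145.00")
```

The first check passes; the second reports a difference of 5.00 and fails. Run the check per property per day in your ETL job, and block the dashboard refresh or alert the data team when it fails.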

Pitfall 5: Embedding Without Security

Problem: You embed dashboards without row-level security. A guest token is intercepted and used to view all guests’ data.

Solution: Always use guest tokens with RLS clauses. Never embed a dashboard without access control. Test token expiration:

import time

# Test: Expired token is rejected
# generate_guest_token and query_dashboard are hypothetical helpers around your backend
token = generate_guest_token(user_id=123, ttl=1)  # 1 second TTL
time.sleep(2)
response = query_dashboard(token)
assert response.status_code == 401, "Expired token was not rejected"

Next Steps and Getting Started

Immediate Actions (This Week)

  1. Audit your data: What data do you currently have? Where is it stored (PMS, booking system, accounting software)? What’s missing?

  2. Define your key metrics: Work with your operations and revenue teams to agree on the 5–10 most important metrics for your business.

  3. Sketch your data model: Draw out the fact and dimension tables you’ll need. Don’t overthink it; a simple model is better than a complex one.

  4. Evaluate your infrastructure: Do you have a data warehouse? If not, you’ll need to set one up (PostgreSQL, ClickHouse, or Snowflake). Budget 2–4 weeks for this.

Short-Term Plan (Next 4–8 Weeks)

  1. Deploy Superset: Self-host on a cloud VM or use Preset. Configure authentication and caching.

  2. Build your first dashboard: Start with the Property Manager Dashboard. Keep it simple: 4–6 metrics, 2–3 charts.

  3. Implement RLS: Ensure each user sees only their data.

  4. Embed in your app: Use guest tokens to embed dashboards without requiring a separate Superset login.

  5. Pilot with 5–10 users: Gather feedback and iterate.

Long-Term Strategy (Months 2–6)

  1. Expand to all dashboards: Build Revenue Manager, Guest Experience, and Corporate dashboards.

  2. Optimise performance: Profile slow queries, add indexes, pre-aggregate data.

  3. Rollout to all users: Train your teams and monitor adoption.

  4. Integrate with other systems: Connect booking data, POS data, guest feedback, and operational metrics.

  5. Build custom metrics: Work with your teams to define new metrics specific to your business (e.g., “upsell rate”, “complaint resolution time”).

Getting Professional Help

If you’re building embedded analytics for a multi-property operation or a booking platform, consider partnering with a platform engineering team. PADISO specialises in platform development for hospitality, including embedded Superset analytics.

If you’re in Sydney, PADISO’s platform development team in Sydney can help you design and implement Superset for your specific use case. They’ve built embedded analytics for hotels, restaurants, and booking platforms across Australia.

For organisations in other regions, PADISO also offers platform development in Melbourne, Gold Coast, Auckland, and across Australia and the United States.

If you need to ensure your analytics platform is audit-ready for SOC 2 or ISO 27001, PADISO’s security audit service can help you design and validate your infrastructure. They work with Vanta to streamline the audit process and get you compliance-ready in weeks, not months.

Resources for Further Learning

For deeper technical knowledge, refer to the Apache Superset official documentation and the GitHub repository. The community is active and helpful.

For embedded analytics best practices, Preset’s embedded analytics guide and their blog on customer-facing data applications are excellent resources.

For hospitality-specific data strategies, consult with your revenue management team and consider benchmarking against industry standards (STR, formerly Smith Travel Research, publishes hospitality KPIs).


Summary

Apache Superset is a pragmatic choice for embedded customer analytics in hospitality. It’s cost-effective, flexible, and scales to thousands of users without per-seat licensing. By following the data modelling, dashboard design, and rollout patterns in this guide, you can deploy embedded analytics in 8–12 weeks and start seeing operational improvements (faster decisions, better pricing, improved labour efficiency) immediately.

The key to success is:

  1. Clean data: Invest in data quality and a well-designed warehouse.
  2. Role-based dashboards: Build different dashboards for different users.
  3. Fast queries: Optimise with indexes, pre-aggregation, and caching.
  4. Secure embedding: Use row-level security and guest tokens.
  5. Change management: Train your teams and celebrate wins.

Start small: build one dashboard, pilot with a few users, gather feedback, and iterate. Within months, you’ll have an analytics platform that drives real business results.

If you need help designing or implementing embedded Superset analytics for your hospitality business, PADISO’s platform engineering team has deployed analytics platforms across Australia, New Zealand, and the United States. They can help you design your data model, build your dashboards, and ensure your platform is secure and performant. Book a call to discuss your specific use case.
