Migrating from Looker to Superset for PE Portco Organisations
Table of Contents
- Why PE Portcos Are Moving from Looker to Superset
- Understanding the Scope of Your Migration
- Cost Benchmarks and Financial Planning
- Governance and Access Control Architecture
- Technical Assessment and Readiness
- Migration Planning and Sequencing
- The Cutover Pattern: Phased Rollout
- Post-Migration Operations and Optimisation
- Common Pitfalls and How to Avoid Them
- Next Steps and Getting Help
Why PE Portcos Are Moving from Looker to Superset {#why-pe-portcos-are-moving}
When a private equity firm acquires a portfolio company running Looker, the first conversation usually centres on three hard truths: per-seat licensing costs, vendor lock-in, and the friction of managing multiple BI platforms across the roll-up.
Looker’s per-user subscription model, typically $70–$150 USD per user per month depending on tier, compounds across a 10-company portfolio. A single mid-market portco with 200 analytical users pays roughly $168K–$360K a year; across ten such portcos, the portfolio faces $1.7M–$3.6M in annual BI licensing alone. That’s before considering the engineering effort required to maintain Looker’s LookML layer, manage embedded Looker instances, or navigate Google Cloud’s pricing and governance overhead.
Apache Superset, by contrast, is open-source and free to self-host. You pay for infrastructure, not per-user licenses. A typical Superset deployment costs £5,000–£25,000 per month to run on managed cloud infrastructure (AWS RDS, Kubernetes, or similar), and that cost scales with query load and deployment size rather than with seat count. For a PE-backed roll-up consolidating BI across multiple acquisitions, that delta is material: comparing licensing with hosting alone, a 60–80% reduction in annual spend is plausible, although total savings are smaller once warehouse and engineering costs are included.
But cost alone doesn’t drive migrations. The real lever is operational control. Superset is open-source, self-hosted, and semantically flexible. Your teams own the data transformation layer. You’re not locked into Google’s product roadmap, API versioning, or regional availability constraints. For PE operators modernising a portco’s tech stack—whether that’s moving to a platform engineering approach in Sydney, Melbourne, or across the United States—Superset fits naturally into a composable, cloud-native data architecture.
Three specific portco scenarios drive Looker-to-Superset migrations:
Scenario 1: Roll-up consolidation. You’ve acquired three companies, each running Looker on separate Google Cloud projects. Consolidating to a single Superset instance cuts licensing by 70%, centralises governance, and gives you a unified audit trail for SOC 2 compliance.
Scenario 2: Cost-out mandate. Your PE sponsor has a 24-month cost-reduction target. Migrating BI to Superset is a quick win: no new revenue required, immediate savings, and a clear line item on the board deck.
Scenario 3: Embedded analytics at scale. You’re embedding BI into your product for customers or internal stakeholders. Superset’s lightweight, REST-based embedding model is cheaper and faster to operationalise than Looker’s embedded dashboards, especially when you’re scaling to hundreds of embedded instances.
Regardless of your scenario, a structured migration playbook—covering scoping, governance, cost benchmarks, and cutover—is non-negotiable. Without it, you’ll lose institutional knowledge, break dependencies, and face months of unplanned rework. This guide walks you through the entire journey.
Understanding the Scope of Your Migration {#understanding-scope}
Before you move a single dashboard, you need a complete inventory of what you’re actually moving. Most PE portcos underestimate this step. They assume “we’ll just export dashboards and rebuild them in Superset.” That approach fails because Looker dashboards are not portable objects—they’re tightly coupled to LookML models, explores, and Google Cloud infrastructure.
Audit Your Current Looker Estate
Start with a full audit of your Looker instance. Use the Looker API to export metadata:
- Dashboards: Count, ownership, refresh cadence, embedded vs. standalone, user reach, and last-modified dates.
- Looks (saved queries): Identify which are actively used, which are orphaned, and which are dependencies for downstream dashboards.
- LookML models: Document your semantic layer—explores, dimensions, measures, derived tables, and any custom SQL or Liquid templating.
- Data connections: List all databases, warehouses, and source systems feeding Looker (Snowflake, BigQuery, Redshift, etc.).
- User and group structure: Export your access control matrix—who owns what, which teams have read vs. write access, and how permissions cascade.
- Scheduled deliveries: Identify all automated reports, email distributions, and alert rules.
- Custom extensions and blocks: Note any custom visualisations, Marketplace blocks, or bespoke integrations.
You’ll likely find that 30–50% of your Looker assets are unused or redundant. This is normal in a roll-up environment: each acquired company brought its own BI practice, and consolidation reveals duplication. Use this audit to right-size your scope and eliminate technical debt upfront.
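The right-sizing step can be sketched in a few lines of Python. This is a minimal illustration, not the Looker API itself: the `LookerAsset` fields, the 180-day staleness threshold, and the sample inventory are all assumptions standing in for whatever metadata you export from your own instance.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class LookerAsset:
    """One row of exported Looker metadata (hypothetical shape)."""
    asset_id: str
    kind: str                     # "dashboard" or "look"
    owner: str
    last_viewed: Optional[date]   # None when no view was ever recorded
    views_90d: int                # views in the trailing 90 days


def triage(assets, today, stale_after_days=180):
    """Bucket assets into 'migrate' or 'retire' based on recent usage."""
    cutoff = today - timedelta(days=stale_after_days)
    buckets = {"migrate": [], "retire": []}
    for asset in assets:
        never_or_long_ago = asset.last_viewed is None or asset.last_viewed < cutoff
        unused = never_or_long_ago and asset.views_90d == 0
        buckets["retire" if unused else "migrate"].append(asset.asset_id)
    return buckets


# Illustrative inventory; real data would come from your metadata export.
inventory = [
    LookerAsset("dash-1", "dashboard", "finance", date(2024, 5, 30), 412),
    LookerAsset("dash-2", "dashboard", "sales", date(2023, 1, 12), 0),
    LookerAsset("look-7", "look", "ops", None, 0),
]
result = triage(inventory, today=date(2024, 6, 1))
retire_share = len(result["retire"]) / len(inventory)
print(result, f"{retire_share:.0%} flagged for retirement")
```

Treat the "retire" bucket as a list of candidates to confirm with asset owners, not as an automatic deletion list.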
Map Looker Concepts to Superset
Looker and Superset speak different languages. Understanding the translation is critical:
| Looker Concept | Superset Equivalent | Notes |
|---|---|---|
| LookML model + explore | dbt project + semantic layer (Cube or dbt) | Superset doesn’t include a built-in semantic layer; you’ll use dbt or Cube |
| Look (saved query) | Chart (or saved query in SQL Lab) | A Look bundles a query with a visualisation, so it usually maps to a Superset chart; most logic can be preserved |
| Dashboard | Dashboard | Superset dashboards are similar but use different layout and refresh logic |
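| Scheduled delivery | Alerts and reports | Superset has native alerts and scheduled reports (email or Slack), but they must be enabled and need Celery workers plus a headless browser for screenshots |
| Row-level security (RLS) | Superset RLS filters + warehouse-level RLS | Superset’s built-in RLS attaches SQL filter clauses to roles and datasets; it is less expressive than LookML user attributes, so complex rules usually move to the warehouse |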
| Custom visualisation | Custom plugin or community viz | Superset has fewer native viz types; you may need custom development |
This mapping exercise is crucial because it reveals what can be lifted-and-shifted versus what requires rework. Most migrations discover that 60% of Looker assets can be recreated in Superset with minimal effort, 30% require moderate refactoring, and 10% don’t warrant migration (they’re replaced by better Superset workflows or eliminated entirely).
Define Your Target Architecture
Superset is a presentation layer. It doesn’t include semantic modelling. Your migration must specify how you’ll handle the semantic layer—the business logic that defines metrics, dimensions, and relationships.
Three patterns are common:
Pattern A: dbt + Superset. You migrate your LookML logic to dbt (a modern data transformation framework) and use Superset to query the resulting tables and views. This is the most common pattern for PE portcos because dbt is open-source, version-controlled, and integrates cleanly with modern data warehouses. Your data warehouse becomes the source of truth; Superset is purely a viz layer.
Pattern B: Cube + Superset. You use Cube (or similar semantic layer) as a metrics engine between your warehouse and Superset. Cube handles aggregations, caching, and metric definitions; Superset connects to Cube through Cube’s Postgres-compatible SQL API, as if it were another database. This pattern is heavier but gives you a dedicated metrics layer that can serve multiple BI tools, APIs, and embedded analytics.
Pattern C: Direct warehouse + Superset. You query your warehouse directly from Superset without an intermediate semantic layer. This works for simple use cases but doesn’t scale well for complex metrics, role-based access, or multi-tenant scenarios. We don’t recommend it for PE portcos unless you’re purely replacing Looker’s presentation layer with minimal transformation logic.
For most PE-backed organisations, Pattern A (dbt + Superset) is the sweet spot. It’s cost-effective, operationally simple, and aligns with modern data stack practices. If you already have a platform or data engineering team, it is likely familiar with dbt.
Cost Benchmarks and Financial Planning {#cost-benchmarks}
PE sponsors want a clear financial picture before they approve a migration. Here’s how to model it.
Looker Cost Baseline
Calculate your current annual Looker spend:
- Per-user licensing: Count your Looker users across all instances. Include developers, analysts, and business users. Multiply by your per-user monthly rate ($70–$150 USD depending on tier) and by 12 months.
- Google Cloud infrastructure: Check your GCP bills for Looker-related resources—Cloud SQL, Compute Engine, Cloud Storage, and data transfer costs. Looker typically consumes £2,000–£10,000 per month in GCP infrastructure for a mid-market portco.
- Managed Looker service fees: If you’re using Looker Cloud (managed by Google), add any premium support or SLA fees.
- Custom development and maintenance: Estimate the annual cost of your BI engineering team maintaining LookML, custom visualisations, and Looker integrations. This is often 1–2 FTE at £100,000–£200,000 per FTE per year.
Example: A 10-company portco with 150 analytical users across Looker
- Per-user licensing: 150 users × $85 USD (~£65 GBP) × 12 = £117,000
- GCP infrastructure: £5,000 × 12 = £60,000
- BI engineering (1.5 FTE): £225,000
- Total annual Looker cost: £402,000
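The baseline above is simple arithmetic, and keeping it in a small function makes it easy to rerun with your own figures. The function name and parameters below are illustrative; the inputs are the example values from this section (150 users, about £65 per user per month, £5,000 per month of GCP, 1.5 FTE at an assumed £150,000 each).

```python
def annual_looker_cost(users, gbp_per_user_month, infra_gbp_month,
                       fte, gbp_per_fte_year):
    """Return the annual Looker cost breakdown in GBP."""
    licensing = users * gbp_per_user_month * 12
    infrastructure = infra_gbp_month * 12
    engineering = fte * gbp_per_fte_year
    return {
        "licensing": licensing,
        "infrastructure": infrastructure,
        "engineering": engineering,
        "total": licensing + infrastructure + engineering,
    }


baseline = annual_looker_cost(
    users=150,
    gbp_per_user_month=65,
    infra_gbp_month=5_000,
    fte=1.5,
    gbp_per_fte_year=150_000,
)
for line_item, amount in baseline.items():
    print(f"{line_item:>15}: £{amount:,.0f}")
```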
Superset Cost Model
Superset is free software, but you’ll pay for infrastructure and engineering:
- Managed Superset hosting: £5,000–£15,000 per month for a production-grade deployment on AWS, including RDS, Kubernetes, load balancing, and monitoring. Larger deployments (500+ users) may cost £15,000–£25,000 per month.
- Data warehouse costs: Superset queries your warehouse directly. If you’re moving from BigQuery (Looker’s native home) to Snowflake or Redshift, your warehouse costs may shift. Budget £2,000–£10,000 per month depending on query volume and data size.
- dbt Cloud or self-hosted dbt: If using dbt, budget £500–£2,000 per month for dbt Cloud (managed SaaS) or £1,000–£3,000 per month for self-hosted dbt infrastructure.
- BI engineering (ongoing): You’ll still need 0.5–1 FTE to maintain Superset, manage the semantic layer, and support users. This is lower than Looker because you’re not maintaining LookML. Budget £75,000–£150,000 per year.
Example: Same 10-company portco on Superset
- Managed Superset hosting: £10,000 × 12 = £120,000
- Warehouse costs (Snowflake): £5,000 × 12 = £60,000
- dbt Cloud: £1,000 × 12 = £12,000
- BI engineering (0.75 FTE): £112,500
- Total annual Superset cost: £304,500
Migration Investment
You’ll also incur one-time migration costs:
- Discovery and planning: 4–8 weeks, 2–3 FTE. Cost: £40,000–£80,000.
- LookML-to-dbt conversion: 6–12 weeks, 2–4 FTE. Cost: £60,000–£150,000.
- Dashboard and look recreation: 4–8 weeks, 2–3 FTE. Cost: £40,000–£80,000.
- Testing and cutover: 2–4 weeks, 2 FTE. Cost: £20,000–£40,000.
- External support (if using a partner like PADISO): £50,000–£150,000 depending on scope and complexity.
Total migration investment: £210,000–£500,000 (typically 4–6 months of effort).
Financial Summary
Using the example above:
- Year 1 savings: £402,000 (Looker) – £304,500 (Superset) – £350,000 (migration) = –£252,500 (net cost in Year 1 due to migration investment).
- Year 2+ savings: £402,000 – £304,500 = £97,500 per year (recurring savings).
- Payback period: 3.6 years (migration cost ÷ annual savings).
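The same three figures can be reproduced with a short helper, which is useful when you want to test different migration budgets. This is a sketch using the example totals from this section and a £350,000 migration cost (the midpoint assumed above); the function name is hypothetical.

```python
def payback(looker_annual, superset_annual, migration_cost):
    """Return Year 1 net position, recurring saving, and payback in years."""
    annual_saving = looker_annual - superset_annual
    if annual_saving <= 0:
        raise ValueError("Superset must cost less per year for payback to exist")
    return {
        "annual_saving": annual_saving,
        "year1_net": annual_saving - migration_cost,
        "payback_years": migration_cost / annual_saving,
    }


summary = payback(looker_annual=402_000,
                  superset_annual=304_500,
                  migration_cost=350_000)
print(f"Recurring saving: £{summary['annual_saving']:,}")
print(f"Year 1 net:       £{summary['year1_net']:,}")
print(f"Payback:          {summary['payback_years']:.1f} years")
```

Rerunning it with the low and high ends of the migration range (£210,000 and £500,000) gives a quick sensitivity check for the board deck.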
For PE sponsors, the narrative is: “We invest £350K upfront, save £97.5K per year, and break even in 3.6 years. But we also reduce vendor lock-in, gain operational control, and build a modern data stack that supports M&A and product innovation.” That’s a compelling story.
For larger portcos (20+ companies, 500+ users), payback periods drop to 18–24 months because the licensing savings scale while migration costs stay relatively flat.
Governance and Access Control Architecture {#governance-architecture}
Looker’s row-level security (RLS) and role-based access control (RBAC) are tightly integrated into the LookML layer. Superset’s access model is different—it’s database-centric with application-level RBAC. Understanding this difference is critical for PE portcos managing multi-tenant data and regulatory compliance.
Access Control Patterns
Looker’s model: Users have explore-level and look-level permissions. RLS is defined in LookML using user attributes and SQL filters. This is powerful but creates tight coupling between your semantic layer and access logic.
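Superset’s model: Access is controlled via database connections, schemas, and datasets. Superset has dataset-level and dashboard-level permissions, plus built-in row-level security filters that attach a SQL clause to a role and a dataset. Those filters are simpler than LookML’s user-attribute RLS, so anything beyond basic filtering is best implemented in your data warehouse (using views, row access policies, or materialised tables).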
For PE portcos, the best approach is warehouse-native RLS:
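- Define access policies in your warehouse. Use Snowflake’s row access policies (with dynamic data masking for sensitive columns), Redshift’s row-level security, or BigQuery’s authorised views to enforce access at the data layer.
- Map Superset users to warehouse roles. With user impersonation enabled on the database connection (supported for some engines), queries run as the logged-in user and the warehouse enforces access based on their role. Where impersonation is not available, use one connection and service account per role.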
- Manage permissions in dbt. If using dbt, define access rules in your dbt project and materialise them as views in your warehouse. This keeps your access logic version-controlled and testable.
This approach has three advantages:
- Single source of truth: Access is defined once, in your warehouse, and automatically enforced across all tools (Superset, SQL clients, APIs, etc.).
- Scalability: You’re not managing access in Superset’s UI; you’re managing it in code. Changes are version-controlled and auditable.
- Compliance: Warehouse-native RLS is easier to audit and demonstrate to regulators. If you’re pursuing SOC 2 or ISO 27001 compliance, warehouse-level access controls are more defensible than application-level controls.
Multi-Tenant and Portco-Specific Governance
For PE roll-ups, you often need to isolate data by portco company. Here’s a pattern that works:
Data Warehouse (Snowflake / Redshift)
├── Raw layer (all data, no RLS)
├── Transformed layer (dbt models)
└── Semantic layer (views with RLS)
    ├── company_a_views (masked/filtered for Company A)
    ├── company_b_views (masked/filtered for Company B)
    └── company_c_views (masked/filtered for Company C)

Superset
├── Database connection 1 (points to company_a_views)
├── Database connection 2 (points to company_b_views)
├── Database connection 3 (points to company_c_views)
└── RBAC: Users are assigned to datasets (e.g., "Company A Analyst" can only query company_a datasets)
This pattern is cleaner than Looker’s approach because:
- Each portco company has its own set of views, not mixed RLS logic in LookML.
- Superset’s database-level access control naturally aligns with this structure.
- It’s easier to onboard new companies during M&A: just add a new database connection and new views.
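Onboarding a new company under this pattern mostly means generating one more set of filtered views. The sketch below shows one way to template that DDL in Python. It assumes every transformed table carries a tenant column, here called `company_key`, and that each company gets a schema named `semantic_<company>`; both names are hypothetical, and in a dbt project you would more likely express this as a macro.

```python
import re

IDENTIFIER = re.compile(r"[a-z][a-z0-9_]*")


def company_view_ddl(company_key, source_table, schema_prefix="semantic"):
    """Build the DDL for one per-company view over a shared transformed table."""
    # Validate identifiers because they are interpolated into SQL text.
    if not IDENTIFIER.fullmatch(company_key):
        raise ValueError(f"unsafe company key: {company_key!r}")
    parts = source_table.split(".")
    if not all(IDENTIFIER.fullmatch(part) for part in parts):
        raise ValueError(f"unsafe table name: {source_table!r}")

    view_name = f"{schema_prefix}_{company_key}.{parts[-1]}"
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT *\n"
        f"FROM {source_table}\n"
        f"WHERE company_key = '{company_key}';"
    )


ddl = company_view_ddl("company_a", "transformed.fct_revenue")
print(ddl)
```

Each Superset database connection then points at one `semantic_<company>` schema, so the isolation lives in the warehouse and not in dashboard filters.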
Audit and Compliance
Superset logs all dashboard views and query executions. For SOC 2 / ISO 27001 compliance, you’ll want to:
- Enable Superset’s audit logging. Superset tracks who accessed what, when. Export these logs to your SIEM (Splunk, Datadog, etc.).
- Log warehouse queries. Your data warehouse (Snowflake, Redshift) also logs all queries. Correlate warehouse logs with Superset logs to create a complete audit trail.
- Implement dashboard change tracking. Superset records who last changed each dashboard and when; for a fuller history, export dashboard definitions to version control on a schedule.
- Document access policies. Keep your dbt RLS logic and warehouse access policies in version control with clear comments and change history.
When an auditor (or Vanta if you’re using it for compliance automation) asks “who accessed the revenue dashboard last month?” you can pull a report from Superset’s audit logs in seconds. This is harder with Looker because RLS logic is embedded in LookML, not easily queryable.
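As an illustration of that auditor question, the sketch below answers it from an exported log. The row shape (`user`, `dashboard`, `timestamp`) is an assumption about how you flatten the export, not Superset’s actual log schema, and the sample rows are made up.

```python
from collections import Counter
from datetime import datetime


def dashboard_viewers(log_rows, dashboard, year, month):
    """Return (user, view_count) pairs for one dashboard in one month."""
    counts = Counter(
        row["user"]
        for row in log_rows
        if row["dashboard"] == dashboard
        and row["timestamp"].year == year
        and row["timestamp"].month == month
    )
    # Most active viewers first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


rows = [
    {"user": "alice", "dashboard": "Revenue", "timestamp": datetime(2024, 5, 3, 9, 15)},
    {"user": "bob", "dashboard": "Revenue", "timestamp": datetime(2024, 5, 7, 14, 0)},
    {"user": "alice", "dashboard": "Revenue", "timestamp": datetime(2024, 5, 20, 8, 30)},
    {"user": "carol", "dashboard": "Churn", "timestamp": datetime(2024, 5, 21, 11, 0)},
    {"user": "bob", "dashboard": "Revenue", "timestamp": datetime(2024, 6, 1, 10, 0)},
]
viewers = dashboard_viewers(rows, "Revenue", 2024, 5)
print(viewers)
```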
Technical Assessment and Readiness {#technical-assessment}
Before you start the migration, you need to assess your technical readiness. This step often reveals hidden dependencies and risks.
Data Warehouse Compatibility
Superset supports most modern data warehouses: Snowflake, Redshift, BigQuery, Databricks, ClickHouse, and others. However, query performance and feature support vary.
If you’re currently on BigQuery (a common pairing with Looker): No warehouse change is needed; BigQuery works well with Superset. Query performance should be broadly similar, because both tools generate SQL that BigQuery executes. Any difference comes from the SQL each tool generates and from caching, so benchmark your heaviest dashboards before promising a speed-up.
If you’re on Snowflake or Redshift: Superset works well. Just ensure your warehouse is properly tuned (indexes, clustering, partitioning) because Superset doesn’t abstract away performance issues like Looker sometimes does.
If you’re on legacy data warehouses (Teradata, Vertica, etc.): You’ll want to modernise as part of this migration. Superset supports these systems, but they’re operationally expensive and don’t scale well. Use the migration as a catalyst to move to Snowflake or Redshift.
Data Transformation Layer
You need a clear answer: are you migrating your LookML logic to dbt, Cube, or something else?
dbt is the industry standard. If your team has SQL expertise, dbt is the right choice. dbt models are SQL SELECT statements; your team will be productive immediately. dbt integrates with your CI/CD pipeline, supports testing and documentation, and is free and open-source.
However, dbt requires discipline. Without proper governance (naming conventions, testing, documentation), your dbt project becomes a mess. For PE portcos, we recommend:
- Start small. Migrate 20% of your LookML logic to dbt first. Learn the patterns. Then scale.
- Invest in governance. Define naming conventions, testing standards, and documentation requirements upfront.
- Use dbt Cloud (or self-hosted dbt). Managed dbt Cloud costs £500–£2,000 per month but saves you from managing dbt infrastructure. For most PE portcos, it’s worth it.
Cube is heavier but more powerful. If you need a dedicated metrics engine (e.g., for embedded analytics, multi-tenant scenarios, or complex aggregations), Cube is worth considering. It’s a separate service that sits between your warehouse and Superset, handling metric definitions, caching, and access control. It’s overkill for simple migrations but essential if you’re building a product with embedded analytics.
Infrastructure and DevOps
Superset is a Python (Flask) application with a React frontend. You’ll deploy it on Kubernetes, Docker, or managed cloud services (AWS ECS, Google Cloud Run, etc.).
For PE portcos, we recommend managed Superset hosting via Preset (a managed Superset SaaS) or self-hosted on Kubernetes. Here’s why:
- Preset (managed): £500–£5,000 per month depending on scale. Preset handles upgrades, scaling, and backups. Good for portcos that want operational simplicity.
- Self-hosted on Kubernetes: £5,000–£15,000 per month for infrastructure, but you own the deployment. Good for portcos with strong DevOps teams or regulatory requirements (e.g., sovereign cloud, air-gapped networks).
- Self-hosted on Docker (single server): Cheap (£500–£2,000 per month) but doesn’t scale. Only suitable for very small deployments.
For most PE roll-ups, self-hosted Kubernetes is the best choice because it gives you control, scales well, and aligns with modern platform engineering practices. If you’re working with a platform engineering team in Sydney or New York, they’ll likely recommend Kubernetes as well.
Integration Points
Identify all systems that integrate with Looker:
- BI tools: Tableau, Power BI, Qlik, etc. (usually separate from Looker; no change needed).
- Embedded analytics: Apps or websites that embed Looker dashboards. Superset has a REST API and embedding SDK; most integrations can be recreated.
- Scheduled reports: Tools like Zapier, Make, or custom scripts that trigger Looker reports. These will need to be repointed to Superset’s API.
- Data apps: LookML-derived applications (e.g., Looker’s Action Hub). You’ll need to rebuild these using Superset’s webhooks or custom code.
- SSO and identity: Looker likely integrates with your identity provider (Okta, Azure AD, etc.). Superset also supports SSO; the migration is straightforward.
Most of these integrations are simpler with Superset because Superset’s API is cleaner and more RESTful than Looker’s. Plan for 2–4 weeks to migrate integrations.
Migration Planning and Sequencing {#migration-planning}
A successful migration is a project, not a switchover. It requires careful planning and sequencing.
Phase 1: Discovery and Planning (Weeks 1–4)
Deliverables: Complete inventory of Looker assets, target architecture diagram, migration roadmap, and risk register.
Activities:
- Audit your Looker instance (dashboards, looks, models, users, integrations).
- Identify high-value and high-risk assets. High-value = heavily used, business-critical. High-risk = complex LookML, custom visualisations, embedded dashboards.
- Define your target architecture (dbt + Superset, Cube + Superset, or direct warehouse + Superset).
- Map Looker assets to Superset equivalents.
- Create a migration roadmap with priorities and sequencing.
- Identify risks (data quality, performance, access control, integrations) and mitigation strategies.
Team: 2–3 people (BI lead, data engineer, architect). If using an external partner like PADISO, they’ll lead this phase with your team.
Phase 2: Infrastructure Setup (Weeks 3–6)
Deliverables: Production-ready Superset instance, Kubernetes cluster (if self-hosted), dbt project structure, and CI/CD pipeline.
Activities:
- Provision Superset infrastructure (Kubernetes, RDS, load balancer, monitoring).
- Set up dbt project structure, version control, and testing framework.
- Configure SSO (Okta, Azure AD) in Superset.
- Set up audit logging and SIEM integration.
- Create database connections from Superset to your warehouse.
- Deploy dbt to your warehouse and validate initial models.
Team: 2–4 people (DevOps, data engineer, platform engineer). This phase overlaps with Phase 1.
Phase 3: LookML-to-dbt Conversion (Weeks 4–10)
Deliverables: dbt models that replicate LookML logic, tested and documented.
Activities:
- Convert high-priority LookML models to dbt.
- Implement RLS in your warehouse (Snowflake masking, Redshift RLS, etc.).
- Test dbt models against original Looker output (row counts, aggregations, etc.).
- Document dbt models and create a style guide.
- Set up dbt testing and CI/CD.
Team: 2–4 people (data engineer, analytics engineer). This is the longest phase and the most critical.
Pro tip: Don’t try to convert 100% of your LookML. Aim for 80%. The remaining 20% often consists of one-off reports or complex custom SQL that’s easier to rebuild from scratch in Superset.
Phase 4: Dashboard and Look Recreation (Weeks 8–14)
Deliverables: All high-priority dashboards and looks recreated in Superset, tested and documented.
Activities:
- Recreate high-priority dashboards in Superset.
- Test dashboard performance and accuracy.
- Implement dashboard-level access control in Superset RBAC.
- Migrate scheduled reports and alerts.
- Document dashboard ownership and refresh cadence.
Team: 2–3 people (BI analyst, BI engineer). This phase can run in parallel with Phase 3.
Pro tip: Use Superset’s API to bulk-create dashboards if you have many similar dashboards. This saves weeks of manual work.
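A bulk-creation script usually starts by generating one request body per dashboard from a template. The sketch below only builds those bodies and does not send them. It assumes the dashboard-creation endpoint of Superset’s REST API accepts `dashboard_title`, `slug`, and `published` fields; check the field names against the API documentation for your Superset version before use. Note that a dashboard created this way is an empty shell, so charts and layout still have to be attached or imported.

```python
import json
import re


def slugify(text):
    """Lower-case the text and collapse non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def dashboard_payloads(template_title, companies, published=False):
    """Build one dashboard-creation request body per company."""
    return [
        {
            "dashboard_title": f"{template_title} ({company})",
            "slug": slugify(f"{template_title}-{company}"),
            "published": published,
        }
        for company in companies
    ]


payloads = dashboard_payloads("Revenue Overview", ["Company A", "Company B"])
for body in payloads:
    # In a real script each body would be POSTed with an authenticated session.
    print(json.dumps(body))
```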
Phase 5: Integration Migration (Weeks 10–14)
Deliverables: All external integrations (embedded dashboards, scheduled reports, data apps) working with Superset.
Activities:
- Migrate embedded dashboards to Superset’s embedding SDK.
- Update scheduled report workflows to use Superset’s API.
- Rebuild data apps using Superset webhooks or custom code.
- Test all integrations end-to-end.
Team: 1–2 people (BI engineer, application engineer). Often the smallest phase.
Phase 6: Testing and Cutover (Weeks 14–16)
Deliverables: Signed-off testing plan, cutover runbook, and rollback plan.
Activities:
- Run parallel testing: users query both Looker and Superset, compare results.
- Stress-test Superset with production query volume.
- Conduct user acceptance testing (UAT) with key stakeholders.
- Create a detailed cutover runbook (who does what, in what order, with rollback steps).
- Execute cutover (disable Looker, enable Superset).
- Monitor Superset for 2 weeks post-cutover.
Team: 3–4 people (BI lead, BI engineer, DevOps, QA). This phase is high-intensity and requires 24/7 availability.
Phase 7: Post-Cutover Optimisation (Weeks 16–24)
Deliverables: Superset tuned for production performance, documentation complete, team trained.
Activities:
- Monitor query performance and optimise slow dashboards.
- Gather user feedback and fix issues.
- Decommission Looker (after 4–8 weeks to ensure no surprises).
- Complete documentation and runbooks.
- Train operations team on Superset maintenance.
Team: 1–2 people (BI engineer, DevOps). Low-intensity, ongoing.
Timeline and Dependencies
Phases 1 and 2 overlap in weeks 3–4. Phase 3 (LookML conversion) is the critical path: Phase 4 (dashboard recreation) can start in week 8, once the first converted models are validated, and overlaps with the end of Phase 3. Phase 5 (integrations) starts in week 10 and runs alongside Phase 4. Phase 6 (cutover) happens after Phases 3, 4, and 5 are complete.
Total duration: 16–24 weeks (4–6 months) for a mid-market portco with 150 users and 200+ dashboards.
Smaller migrations (50 users, 50 dashboards) can be done in 8–12 weeks. Larger migrations (500+ users, 1000+ dashboards) may take 6–9 months.
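To sanity-check the sequencing, it helps to hold the phase plan as data and compute the overlaps. The sketch below encodes the week ranges from the phase headings above; the `PHASES` structure and `overlap_weeks` helper are illustrative names, not part of any planning tool.

```python
# Phase number -> (name, first week, last week), taken from the phase headings.
PHASES = {
    1: ("Discovery and planning", 1, 4),
    2: ("Infrastructure setup", 3, 6),
    3: ("LookML-to-dbt conversion", 4, 10),
    4: ("Dashboard and look recreation", 8, 14),
    5: ("Integration migration", 10, 14),
    6: ("Testing and cutover", 14, 16),
    7: ("Post-cutover optimisation", 16, 24),
}


def overlap_weeks(first, second):
    """Return the list of weeks in which two phases are both active."""
    _, first_start, first_end = PHASES[first]
    _, second_start, second_end = PHASES[second]
    start, end = max(first_start, second_start), min(first_end, second_end)
    return list(range(start, end + 1)) if start <= end else []


total_weeks = max(end for _, _, end in PHASES.values())
print("Phases 1 and 2 overlap in weeks", overlap_weeks(1, 2))
print("Phases 3 and 4 overlap in weeks", overlap_weeks(3, 4))
print("Phases 4 and 5 overlap in weeks", overlap_weeks(4, 5))
print("Plan length:", total_weeks, "weeks")
```

The overlap weeks are where the same data engineers are most likely to be double-booked, so they are worth checking against your staffing plan.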
The Cutover Pattern: Phased Rollout {#cutover-pattern}
Cutover is the riskiest phase. A poorly executed cutover can break reporting for weeks, damage stakeholder trust, and jeopardise the entire project. Here’s a battle-tested pattern.
Pre-Cutover Validation
Week 14–15: Parallel Testing
Run Looker and Superset in parallel for 2 weeks. Users run the same queries in both systems and compare results. This catches data discrepancies, performance issues, and missing features before cutover.
- Test coverage: Aim for 100% of high-priority dashboards, 80% of medium-priority, 50% of low-priority.
- Metrics to compare: Row counts, aggregations, date ranges, filters, drill-downs.
- Performance targets: Superset dashboards should load in < 5 seconds. If slower, optimise before cutover.
- Sign-off: Get explicit sign-off from business stakeholders (CFO, COO, functional leaders) that Superset results match Looker.
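The metric comparison can be automated so that sign-off rests on a repeatable check and not on spot checks. The sketch below compares two dictionaries of metric values with a relative tolerance; the metric names, sample values, and 0.1% tolerance are assumptions to adapt to your own dashboards.

```python
import math


def reconcile(looker, superset, rel_tol=0.001):
    """Return (metric, looker_value, superset_value) for every mismatch."""
    mismatches = []
    for metric in sorted(set(looker) | set(superset)):
        expected = looker.get(metric)
        actual = superset.get(metric)
        if expected is None or actual is None:
            # The metric exists in only one system: always a mismatch.
            mismatches.append((metric, expected, actual))
        elif not math.isclose(expected, actual, rel_tol=rel_tol):
            mismatches.append((metric, expected, actual))
    return mismatches


looker_values = {"row_count": 120_000, "total_revenue": 1_850_000.00,
                 "active_customers": 4_210}
superset_values = {"row_count": 120_000, "total_revenue": 1_850_000.40,
                   "active_customers": 4_198}
issues = reconcile(looker_values, superset_values)
print(issues)
```

Row counts are often better compared with `rel_tol=0`, since any difference there usually points to a join or filter discrepancy and not to rounding.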
Week 15: Stress Testing
Simulate production query load on Superset. Use load-testing tools (Apache JMeter, Locust) to send 100+ concurrent queries. Monitor Superset’s CPU, memory, database connections, and query latency.
- Target: Superset should handle 2x peak production load without degradation.
- If it fails: Optimise database indexes, enable query caching, increase infrastructure capacity, or defer low-priority dashboards to post-cutover.
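Dedicated tools such as JMeter or Locust are the right choice for the real test, but the shape of the measurement is easy to show with the standard library. The sketch below fires concurrent calls from a thread pool and reports error count and 95th-percentile latency. `fake_dashboard_query` is a stand-in for an authenticated HTTP request to a Superset dashboard or chart endpoint, and the concurrency figures are placeholders.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(query_fn, concurrency, total_requests):
    """Call query_fn(i) from a thread pool and summarise latency and errors."""
    def timed_call(i):
        started = time.perf_counter()
        try:
            query_fn(i)
            ok = True
        except Exception:
            ok = False
        return time.perf_counter() - started, ok

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(elapsed for elapsed, _ in results)
    p95_rank = math.ceil(0.95 * len(latencies))  # nearest-rank percentile
    return {
        "requests": len(results),
        "errors": sum(1 for _, ok in results if not ok),
        "p95_seconds": latencies[p95_rank - 1],
    }


def fake_dashboard_query(i):
    """Stand-in for an HTTP call to Superset; every 10th call fails."""
    time.sleep(0.001)
    if i % 10 == 0:
        raise RuntimeError("simulated timeout")


report = run_load(fake_dashboard_query, concurrency=8, total_requests=40)
print(report)
```

To approximate the 2x target, set `concurrency` to twice your observed peak of simultaneous dashboard loads and compare `p95_seconds` with the 5-second budget.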
Week 15: Cutover Runbook
Create a detailed runbook:
CUTOVER RUNBOOK – Looker to Superset Migration
Date: [Date]
Duration: 8 hours (e.g., 6 PM Friday to 2 AM Saturday)
Team: BI Lead (lead), BI Engineer, DevOps, Database Admin
Rollback Owner: [Name]
PRE-CUTOVER (Friday 5 PM)
1. Backup Looker instance (Google Cloud snapshots)
2. Backup Superset database (RDS snapshots)
3. Backup warehouse (Snowflake, Redshift snapshots)
4. Notify stakeholders: "Cutover starting at 6 PM. Looker will be offline for 8 hours."
5. Disable Looker scheduled reports (to avoid duplicate reports)
CUTOVER (Friday 6 PM – Saturday 2 AM)
1. 6:00 PM: Disable Looker instance (set to read-only or offline)
2. 6:15 PM: Run final dbt models to sync any last-minute data changes
3. 6:30 PM: Validate Superset data freshness (row counts, latest dates)
4. 7:00 PM: Enable Superset in production (DNS switch, load balancer update)
5. 7:15 PM: Smoke test: BI Lead and BI Engineer run 20 critical queries
6. 8:00 PM: Open Superset to 10 power users (invite-only testing)
7. 10:00 PM: Open Superset to all users
8. 12:00 AM: Monitor for issues (check Superset logs, database performance, user feedback)
9. 2:00 AM: Declare cutover successful if no critical issues
POST-CUTOVER (Saturday morning)
1. 8:00 AM: Debrief with team (what went well, what didn't)
2. 10:00 AM: Notify stakeholders: "Cutover complete. Superset is live."
3. 12:00 PM: Begin monitoring (2-week post-cutover support window)
4. Week 1: Gather user feedback, fix bugs, optimise slow queries
5. Week 4: Decommission Looker
ROLLBACK PLAN
If critical issues arise (data loss, security breach, widespread performance issues):
1. Disable Superset
2. Re-enable Looker (restore from backup only if the instance was changed during cutover)
3. Notify stakeholders
4. Investigate root cause
5. Fix and retry cutover
This level of detail is non-negotiable. It ensures everyone knows their role and minimises chaos during cutover.
Phased User Rollout
Instead of opening Superset to all users on cutover night, as the sample runbook above does, you can roll users out in waves over the first three days:
Wave 1 (Day 1): Power Users (10 users). Analytics team, BI engineers, CFO’s office. These users are forgiving and can provide real-time feedback.
Wave 2 (Day 2): Functional Leaders (50 users). CFO, COO, VP Sales, VP Marketing, etc. They use dashboards daily and will quickly spot issues.
Wave 3 (Day 3): Operational Users (100+ users). Everyone else. By this point, you’ve fixed the major issues and built confidence.
This phased approach gives you time to catch and fix issues before they affect the entire organisation.
Post-Cutover Support
Plan for 2–4 weeks of intensive post-cutover support:
- On-call BI engineer: Available 24/7 for critical issues.
- Daily syncs: BI team meets daily to review issues, prioritise fixes, and communicate status.
- User feedback loop: Actively solicit feedback from users and fix issues within 24 hours.
- Monitoring dashboard: Create a Superset dashboard that monitors Superset itself (query latency, error rates, user activity). Use this to spot issues early.
After 4 weeks, you can move to normal support operations (8/5 coverage, standard SLAs).
Post-Migration Operations and Optimisation {#post-migration-ops}
Migration doesn’t end at cutover. The next 6 months are critical for bedding in Superset and extracting value.
Query Performance Optimisation
Superset queries your data warehouse directly. Performance depends on three factors:
- Warehouse query optimisation: Are your dbt models and SQL queries efficient? Use your warehouse’s EXPLAIN output or query profile to identify slow queries. Add clustering, partitioning, or sort keys where needed.
- Superset caching: Superset can cache query results with a configurable timeout, set globally or per database, dataset, or chart. For dashboards that don’t need real-time data, enable caching to reduce warehouse load and improve load times.
- Infrastructure scaling: If Superset or your warehouse is underpowered, scale up. Monitor CPU, memory, and database connections.
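Caching is configured in `superset_config.py`, which is itself a Python file. The fragment below is a minimal sketch of a Redis-backed chart data cache using Flask-Caching keys; the Redis URL, key prefix, and 300-second timeout are placeholder assumptions, and you should confirm the setting names against the documentation for your Superset version.

```python
# superset_config.py (fragment)

# Default lifetime for cached chart query results, in seconds (assumed value).
CHART_CACHE_SECONDS = 300

# Cache for chart data queries, backed by Redis via Flask-Caching.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": CHART_CACHE_SECONDS,
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": "redis://redis:6379/1",  # placeholder host and database
}
```

A short default with longer per-dataset overrides for slowly changing data tends to be easier to reason about than one long global timeout.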
Pro tip: Create a “slow query dashboard” in Superset that shows which dashboards and queries are slowest. Review it weekly and optimise the slowest 10% of queries. In practice, a small share of queries usually accounts for most of the load time, so this is where most of the performance improvement comes from.
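Picking the weekly review list is a ranking problem. The sketch below selects the slowest 10% of dashboards from a mapping of average load times; the dashboard names and timings are invented, and in practice the input would come from your query logs.

```python
import math


def slowest_share(avg_seconds_by_dashboard, fraction=0.10):
    """Return the slowest `fraction` of dashboards, slowest first."""
    ranked = sorted(avg_seconds_by_dashboard.items(),
                    key=lambda item: item[1], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))  # always review at least one
    return ranked[:keep]


timings = {
    "Revenue Overview": 2.1,
    "Pipeline Health": 11.4,
    "Churn Cohorts": 6.8,
    "Headcount": 0.9,
    "Cash Position": 3.3,
}
review_list = slowest_share(timings)
print(review_list)
```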
User Adoption and Training
Technically, Superset is simpler than Looker. But users need training to be productive:
- Recorded training videos: Show users how to create queries, build dashboards, set filters, and export data. Keep videos short (5–10 minutes) and task-focused.
- Live training sessions: Run 2–3 cohorts of live training (1 hour each) for different user groups.
- Documentation and FAQs: Create a wiki or Notion page with common tasks and troubleshooting.
- Slack channel: Create a #superset-help channel for users to ask questions. BI team responds within 24 hours.
Expect 70% of users to be productive within 2 weeks, 90% within 4 weeks, and 95% within 8 weeks. Some legacy users may never fully adopt; that’s normal.
Cost Optimisation
After cutover, look for cost savings:
- Warehouse query optimisation: Fewer queries, faster queries = lower warehouse costs. Quantify this monthly.
- Infrastructure rightsizing: You might have over-provisioned Superset or your database. Measure actual usage and scale down if needed.
- Scheduled report consolidation: Eliminate redundant scheduled reports. If 10 users run the same report, schedule it once and distribute results instead.
Example: A portco reduced warehouse costs by 25% in the 6 months post-migration by optimising dbt models and consolidating reports. That’s an extra £15,000 annual saving on top of the Looker licensing savings.
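To find consolidation candidates, group scheduled reports by what they run and when. The sketch below assumes a hypothetical export of schedules with `report`, `cron`, and `owner` fields; adapt the field names to whatever your scheduling export actually contains.

```python
from collections import defaultdict


def find_duplicate_schedules(schedules):
    """Return {(report, cron): [owners]} for schedules that more than one user runs."""
    groups = defaultdict(list)
    for schedule in schedules:
        key = (schedule["report"], schedule["cron"])
        groups[key].append(schedule["owner"])
    # Only groups with several owners are candidates for a single shared schedule.
    return {key: owners for key, owners in groups.items() if len(owners) > 1}
```

Each key in the result is a report that could be scheduled once and distributed to the listed owners.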
Governance and Maintenance
Establish ongoing governance:
- Semantic layer maintenance: dbt models change as business logic evolves. Establish a change control process (code review, testing, documentation).
- Dashboard lifecycle: Dashboards become stale. Quarterly, audit dashboards for last-modified date and usage. Archive or delete unused dashboards.
- Access reviews: Quarterly, review who has access to what. Revoke access for leavers, update access for role changes.
- Compliance: If you’re pursuing SOC 2 or ISO 27001, maintain audit logs, document access policies, and demonstrate controls to auditors.
For PE portcos, this governance is not optional. It’s the foundation of a secure, compliant, scalable BI platform.
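The quarterly dashboard audit is easy to automate. Here is a minimal sketch, assuming a hypothetical list of dashboard records with `name`, `last_modified`, and `views_last_quarter` fields. The 90-day and one-view thresholds are illustrative defaults, not rules from this guide.

```python
from datetime import date, timedelta


def stale_dashboards(dashboards, today, max_age_days=90, min_views=1):
    """Return names of dashboards that are both old and unused."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        d["name"]
        for d in dashboards
        # A dashboard is flagged only if it fails both checks.
        if d["last_modified"] < cutoff and d["views_last_quarter"] < min_views
    ]
```

Treat the output as a review list for dashboard owners rather than an automatic delete list; an old dashboard with steady usage is usually fine.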
Common Pitfalls and How to Avoid Them {#common-pitfalls}
We’ve seen hundreds of BI migrations. Here are the most common pitfalls and how to avoid them.
Pitfall 1: Underestimating LookML Conversion Effort
The problem: Teams assume “we’ll just export LookML and convert it to dbt.” In reality, LookML is a domain-specific language with unique concepts (explores, derived tables, Liquid templating). Converting it to SQL + dbt is not a 1:1 translation.
The fix: Budget 2–4 weeks per 100 LookML models. Start with a small pilot (10 models) to calibrate your estimate. If you’re converting 500 models, budget 10–20 weeks, not 4.
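The arithmetic behind that estimate, plus the pilot calibration, can be sketched as follows. The function names are hypothetical, and the linear scaling from a pilot is a simplifying assumption: complex models with heavy Liquid templating will take longer than simple ones.

```python
def conversion_weeks(model_count, weeks_per_100=(2, 4)):
    """Return the (low, high) effort range in weeks for a given model count."""
    low, high = weeks_per_100
    return (model_count * low / 100, model_count * high / 100)


def calibrated_weeks(pilot_models, pilot_weeks, total_models):
    """Scale the effort measured on a pilot linearly to the full model count."""
    return total_models * pilot_weeks / pilot_models


# 500 models at 2-4 weeks per 100 models gives a 10-20 week range.
low, high = conversion_weeks(500)
```

If the pilot figure lands outside the range from `conversion_weeks`, trust the pilot and revisit the plan.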
Pitfall 2: Ignoring Data Quality Issues
The problem: Looker masks data quality issues with its semantic layer. When you move to Superset + dbt, those issues surface. Users see inconsistent row counts, missing dates, or incorrect aggregations.
The fix: During the discovery phase, audit data quality. Use dbt tests to validate row counts, nullability, and referential integrity. Fix issues before cutover, not after.
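In a dbt project you would declare these checks with the built-in `not_null`, `unique`, and `relationships` tests. To show the logic they express, here is a self-contained sketch in plain Python using an in-memory SQLite database; the tables and values are invented for illustration. As with dbt tests, each check is a query that returns the offending rows, so an empty result means the check passes.

```python
import sqlite3


def failing_rows(conn, sql):
    """Run a check query and return the rows that violate it."""
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES
        (10, 1, '2024-01-05'),
        (11, 2, NULL),          -- missing date
        (12, 3, '2024-01-07');  -- customer 3 does not exist
""")

checks = {
    "order_date not null": "SELECT order_id FROM orders WHERE order_date IS NULL",
    "order_id unique": (
        "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1"
    ),
    "customer exists": """
        SELECT o.order_id FROM orders o
        LEFT JOIN customers c ON c.customer_id = o.customer_id
        WHERE c.customer_id IS NULL
    """,
}

# Map each check name to its failing rows; an empty list means it passed.
results = {name: failing_rows(conn, sql) for name, sql in checks.items()}
```

Running the same checks against both the Looker-era tables and the new dbt models makes it easier to tell pre-existing data problems from migration bugs.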
Pitfall 3: Neglecting Access Control
The problem: “We’ll set up RLS after cutover.” Then cutover happens, and suddenly everyone can see everyone else’s data. Chaos ensues.
The fix: Design your RLS strategy in Phase 1. Implement it in Phase 3 (alongside dbt). Test it in Phase 5. Don’t defer it.
Pitfall 4: Underinvesting in Infrastructure
The problem: Teams deploy Superset on a single £500/month EC2 instance. It works fine for 50 users, then collapses under production load.
The fix: Right-size your infrastructure upfront. For a mid-market portco, budget £8,000–£15,000 per month for Superset + database + monitoring. Scale up if needed, but start with a solid foundation.
Pitfall 5: Losing Institutional Knowledge
The problem: The analyst who built the original Looker dashboards leaves during the migration. No one knows why certain filters exist, why certain measures are defined that way, or what the business logic is.
The fix: Document everything as you go. For each dbt model, add comments explaining the business logic. For each dashboard, document the owner, refresh cadence, and intended audience. This takes 10% extra effort but saves 10x effort when someone asks “why is this metric calculated this way?”
Pitfall 6: Premature Looker Decommissioning
The problem: Teams decommission Looker too quickly. Then users discover a dashboard or report that wasn’t migrated, and they panic.
The fix: Keep Looker running for 4–8 weeks post-cutover. This gives you time to catch any missed dashboards. Only after confirming 100% coverage should you decommission Looker.
Pitfall 7: Not Planning for Ongoing Maintenance
The problem: “Once we’re on Superset, we’re done.” Then dbt models break, dashboards become stale, and users lose trust.
The fix: Budget 0.5–1 FTE ongoing for BI operations. This person maintains dbt, manages dashboard lifecycle, handles user support, and monitors performance. It’s not optional.
Next Steps and Getting Help {#next-steps}
If you’re a PE portco planning a Looker-to-Superset migration, here’s your action plan:
Week 1: Internal Alignment
- Get sponsor approval: Present the financial case (£97.5K annual savings, 3.6-year payback) to your PE sponsor and board.
- Assign a project lead: Nominate a senior person (CTO, VP Engineering, or BI Lead) to own the migration.
- Scope the work: Use the audit framework in this guide to count your Looker assets and estimate effort.
- Build the business case: Document the financial benefit, timeline, risks, and mitigation strategies.
Week 2–3: Technical Assessment
- Audit your Looker instance: Export metadata using the Looker API. Count dashboards, looks, models, users, and integrations.
- Define target architecture: Decide on dbt + Superset vs. Cube + Superset vs. direct warehouse + Superset.
- Assess your data warehouse: Ensure your warehouse is compatible with Superset and optimised for performance.
- Identify integrations: List all systems that integrate with Looker (embedded dashboards, scheduled reports, data apps).
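The asset count in the audit step can be sketched as below. The function takes an already-initialised client and assumes it exposes `all_dashboards`, `all_looks`, `all_lookml_models`, and `all_users` with a `fields` argument, as we understand the official Looker Python SDK to do. Verify the method names and pagination behaviour against the SDK version you use before relying on the totals.

```python
def count_looker_assets(sdk):
    """Count top-level Looker assets using an already-initialised SDK client.

    Assumption: `sdk` behaves like the client returned by the Looker
    Python SDK, e.g. `sdk = looker_sdk.init40()`.
    """
    return {
        # Request a single field per item to keep the responses small.
        "dashboards": len(sdk.all_dashboards(fields="id")),
        "looks": len(sdk.all_looks(fields="id")),
        "lookml_models": len(sdk.all_lookml_models(fields="name")),
        "users": len(sdk.all_users(fields="id")),
    }
```

Passing the client in as a parameter keeps credentials out of the function and lets you test it with a stub before pointing it at production.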
Week 3–4: Partner Selection (Optional)
If you don’t have internal capacity, engage a partner. Look for:
- BI migration experience: Have they done Looker-to-Superset migrations before? Ask for references.
- dbt expertise: Can they build and maintain a dbt project? This is critical.
- Platform engineering knowledge: Do they understand modern data stacks, Kubernetes, and cloud infrastructure? For PE portcos, this matters.
- Compliance experience: Have they implemented SOC 2 / ISO 27001 controls? If you’re pursuing compliance, this is non-negotiable.
PADISO is a Sydney-based venture studio and AI digital agency that specialises in exactly this work. We’ve helped PE portcos migrate from Looker to Superset, implement dbt, and pass compliance audits. We offer fractional CTO advisory to help you navigate the technical and organisational aspects of the migration. We also have platform engineering teams across Australia, the US, and other key markets who can execute the migration end-to-end.
Other reputable partners include Slalom, Thoughtworks, and Deloitte Digital, though they tend to be more expensive and slower than boutique BI specialists.
Month 1–2: Migration Kickoff
If using a partner, they’ll lead this. If going solo:
- Form the migration team: Assign a BI lead, 2–3 data engineers, 1 DevOps engineer, and 1 project manager.
- Set up governance: Define naming conventions, testing standards, documentation requirements, and code review processes.
- Provision infrastructure: Deploy Superset, set up Kubernetes, create dbt project structure, configure CI/CD.
- Start LookML conversion: Begin with high-priority, low-complexity models. Learn the patterns. Then scale.
Month 2–4: Execution
Execute the migration plan (Phases 1–6 from earlier in this guide). Track progress weekly. Escalate risks immediately.
Month 4–6: Cutover and Stabilisation
Execute cutover using the runbook. Support users intensively for 4 weeks post-cutover. Then transition to normal operations.
Beyond Month 6: Optimisation and Value Realisation
Optimise query performance, reduce costs, and realise the full value of the migration. By month 12, you should be seeing the full £97.5K annual savings (or more, depending on your scale).
Final Thoughts
Migrating from Looker to Superset is not trivial, but it’s highly achievable. For PE portcos, the financial case is compelling: 60–80% cost reduction, improved operational control, and a modern data stack that supports M&A and product innovation.
The key to success is planning. Spend weeks on discovery, architecture, and governance upfront. Then execute the migration in phases, with clear sign-offs and risk mitigation at each stage. Invest in post-cutover support and ongoing maintenance. And don’t hesitate to engage a partner if you lack internal capacity.
If you’re ready to start, book a call with PADISO. We’ll help you scope the work, assess risks, and build a migration plan that works for your portco. Whether you need fractional CTO guidance, platform engineering support, or a complete end-to-end migration, we’ve got the expertise and experience to deliver.
The modern data stack is within reach. Let’s build it together.