How to Migrate Tableau Custom SQL to Data Warehouses
Tableau, a powerful data visualization tool, has revolutionized the way organizations analyze and present data. Its user-friendly interface and ability to connect to various data sources have made it a popular choice among data analysts and business users alike. However, one feature that often raises concerns from a data governance and performance perspective is the use of custom SQL queries within Tableau workbooks.
Tableau custom SQL allows users to write their own SQL queries to extract and manipulate data directly within Tableau. While this flexibility can be advantageous in certain situations, it also poses significant challenges in terms of data governance, workbook performance, and maintainability.
In this article, we will explore the disadvantages of using custom SQL queries in Tableau, highlight the benefits of migrating them to a centralized data warehouse, and see how this migration process can be streamlined by using Select Star - an intelligent data discovery platform.
The pitfalls of using custom SQL queries in Tableau
1. Lack of data governance
When users create custom SQL queries within Tableau workbooks, it becomes difficult to enforce data governance policies and ensure data consistency across the organization. Each workbook may contain its own version of the data logic, leading to discrepancies and potential errors. Inconsistent data definitions, limited auditing and traceability further complicate the enforcement of governance policies. Without a centralized and governed data model, organizations struggle to maintain a single version of the truth, protect sensitive data, and ensure data accuracy and consistency across the organization.
2. Performance overhead
Custom SQL queries embedded within Tableau workbooks can have a significant impact on performance, especially when dealing with large datasets. As the complexity of the queries increases, the workbook’s loading time and responsiveness may suffer. This performance overhead can lead to a frustrating user experience and hinder the adoption of Tableau within the organization.
Using custom SQL in Tableau can also have cost implications, particularly when establishing a live connection to cloud-based data sources. Each time a custom SQL query is executed, it consumes computational resources and may incur additional costs based on the pricing model of the data source provider. Moreover, inefficient or poorly optimized custom SQL queries can lead to longer query execution times, resulting in higher resource consumption and increased costs.
3. Limited scalability
When custom SQL is scattered across multiple Tableau workbooks, scaling and maintaining the data infrastructure becomes challenging. As the number of workbooks and users grows, managing and optimizing such queries becomes a time-consuming and error-prone process.
4. Difficulty in troubleshooting
Debugging issues related to custom SQL within Tableau workbooks can be a complex and time-consuming task. When problems arise, data analysts need to dive into individual workbooks and parse through the queries manually to identify and fix the problematic queries. This process becomes even more challenging when multiple users are involved, leading to increased downtime and reduced productivity.
The benefits of migrating custom SQL to a data warehouse
1. Centralized data model
By migrating custom SQL queries from Tableau workbooks to a data warehouse, organizations can create a centralized and standardized data model. This approach ensures that all data logic and transformations are defined and managed in a single location, promoting consistency and accuracy across the entire organization.
2. Enhanced data governance and quality
Moving custom SQL to a data warehouse allows organizations to implement robust data governance practices and quality checks. Data stewards can define and enforce data standards, ensuring that the data entering the warehouse is reliable and consistent. This centralized control enables better data lineage tracking, data validation, and error handling, ultimately improving the overall quality and trustworthiness of the data.
3. Improved performance and scalability
By offloading the data processing and aggregation to the data warehouse, Tableau workbooks can benefit from improved performance and faster loading times. The data warehouse is designed to handle large volumes of data efficiently, leveraging advanced query optimization techniques and parallel processing capabilities.
In addition, by migrating custom SQL queries to a centralized data warehouse and optimizing them, organizations can better control the costs associated with data source usage in Tableau.
4. Streamlined data discovery and analysis
With a centralized data warehouse, data discovery and analysis processes become more streamlined and efficient. The data warehouse acts as a single source of truth, providing a consistent and reliable data foundation for all analytical activities. This centralized approach also promotes collaboration and knowledge sharing among team members, as everyone works with the same data model.
5. Future-proofing your data infrastructure
Migrating custom SQL queries to a data warehouse future-proofs your data infrastructure by decoupling the data logic from the visualization layer. As business requirements evolve and new data sources emerge, the data warehouse can adapt and scale independently of Tableau workbooks.
Preparing for the migration
Before embarking on the migration, it is crucial to conduct a thorough assessment of the current state of custom SQL usage within your Tableau environment. This assessment usually involves several steps:
1. Create an inventory of all Tableau workbooks that utilize custom SQL queries
This inventory will provide a comprehensive view of the scope and complexity of the migration effort.
2. Analyze the custom SQL queries and categorize them based on their purpose, complexity, workbook usage, frequency of querying, and the data sources they interact with
This categorization will help prioritize the migration process and identify common patterns or requirements.
3. Identify the dependencies between custom SQL queries and Tableau workbooks
Understand how the queries are used within the workbooks and how they interact with other data sources or calculations. This dependency mapping will ensure that the migration process takes into account all the necessary relationships and avoids any disruptions to the functionality of the workbooks.
4. Assess the business logic embedded within the custom SQL queries
Document the specific calculations, transformations, and data manipulations performed by each query. This documentation will serve as a reference during the migration process and help in recreating the same logic within the centralized data model in the data warehouse.
Use Case: Speed-up migration of a custom SQL query using Select Star
Now that we have a general idea of how custom SQL can be migrated, let’s see how this process can be expedited using Select Star capabilities.
Before we jump in, let’s learn a bit more about Select Star. As an intelligent data discovery platform, Select Star automatically analyzes & documents your data. From data catalog, lineage, usage analysis, and AI assistants, Select Star provides an easy to use data portal, where data teams can govern and manage their data with automation. Today, Select Star is used as a co-pilot of data teams for data governance, data migration, self-service analytics / data democratization, and cost optimization initiatives.
Find all data sources with ‘custom SQL’ in their name
Go to the Dashboards section on the left side of the screen, and select Tableau Online. Select Data Sources in the Data Type menu in the Filters panel on the right side of the screen. If needed, filter to a specific data source type (embedded or published) using the Data Source Type filter.
If you want to explore the custom SQL Query dataset from your search, click on its name, and you will be directed to this dataset’s dedicated page.
Check if a specific workbook has custom SQL data sources
If your goal is to optimize a specific workbook, you can go to the main page of your Tableau Online connection (the Dashboards section on the left side of the screen, and select Tableau Online). By default, the view shows all workbooks across all projects, and you can use the filter above the workbooks list to search for a specific name. Alternatively, you can go directly to the project where this workbook is saved, using the navigation tree on the left side, and find the workbook there.
Once you are on the workbook’s overview page, go to the Lineage section to see Upstream data sources. In this example, there is a data source named Custom SQL Query. Click on the arrow next to the data source name to see which tables are used in this SQL query.
If you click on the set of angle brackets next to the custom SQL Query name, the text of the SQL query will be displayed in a new pop-up window. That enables you to quickly copy the code, without the need to open the workbook on Tableau Online.
Explore the lineage of a custom SQL data source
If you want to learn more about this custom SQL query, click on the square with an arrow next to the custom SQL Query name to go to this data source overview page.
On this page, you can see which Fields are used in the query, understand the Lineage between this data source, data warehouse tables, and Downstream workbooks and dashboards. You can also see the SQL code used to create this data source by clicking the SQL button in the top right corner.
You can also update the data source’s documentation directly from this page by using Select Star’s AI capabilities. It allows you to quickly generate table and column descriptions based on the metadata, existing documentation, and other data context that Select Star has analyzed.
If your custom SQL is a complex one, you might want to focus on the upstream Lineage section of this page. In the example below, you get a clear picture of which tables in my data warehouse (BigQuery) and which dbt models are used in this query.
From here, you can delve deeper into the column-level lineage. Click on the Lineage button located in the top right corner to explore the interactive representation of the query’s lineage.
Using this view, you can:
- See all data tables and dbt models connected to your custom query
- Understand which dashboards are dependent on that query
- See the SQL query directly from this view
- Quickly understand which data tables the individual fields in this query are coming from, or which fields are used in calculations in the query
- You can also use filters above the lineage to see or hide certain type of objects, such as dbt models or dashboards
Now that you understand what logic is contained in your custom SQL query, and which fields and data sources are used, you can copy the code directly in the Select Star interface, and transfer the query from Tableau to the data warehouse.
By following the steps above, you can successfully migrate your custom SQL queries from Tableau to your data warehouse, establishing a single source of truth and improving data governance, quality, and performance. Remember to involve relevant stakeholders, communicate the benefits and progress of the migration, and provide adequate support and training to ensure a smooth transition for all users.
Conclusion
Migrating custom SQL to a centralized data warehouse allows organizations to establish a single source of truth, improve data quality, and enhance the performance and scalability of their Tableau dashboards. It is a strategic step towards achieving a more efficient, reliable, and scalable data ecosystem.
Regular review of custom SQL queries used in Tableau should become a part of the ongoing data governance process after the initial migration to the data warehouse. By doing so, organizations can ensure that the benefits of the migration are sustained over time and that any new custom SQL queries are properly integrated into the centralized data model.
The article was originally published on Select Star’s blog.