Analyzing 1M+ Looker Error Messages: Our Findings
At Spectacles, we automatically test our customers' Looker instances and LookML changes for various issues like invalid SQL, content validation problems, and data test failures.
We find a lot of errors! Of the 10,000+ Pull Requests we've tested, 45% of them had at least one of these problems.
We took the plunge into over a million Looker error messages to understand the errors Looker developers encounter every day. We'll share some recommendations based on what we learned to make LookML development on your team smoother.
Spoiler alert: The best thing you can do to improve LookML code quality and reduce errors is to sign up for Spectacles. We test every code change automatically to ensure you're shipping LookML that always works.
Problem: Invalid SQL
It's deceptively easy to write invalid SQL when writing LookML. Looker doesn't check the SQL you write in definitions for derived tables, dimensions, measures, or SQL fields like sql_where. As a result, 64% of the errors we see are invalid SQL errors.
Why should you care? SQL errors in LookML lead to scary error messages in popular Explores and critical dashboards. When your LookML doesn't run in your database, your users lose confidence in your data team, and your developers lose time re-doing their work.
Here are the most common types of SQL errors we found.
Missing columns and tables
70% of SQL errors happened when a LookML developer specified a column, table, or schema name that didn't exist in the database.
Here are some examples of missing column and table errors:
- Invalid identifier 'ORDERS.SHIPPING_METHOD'
- Name renewal_date not found inside opportunities
Why do these errors happen?
- A developer makes a typo in the SQL, accidentally writing something like "sales_price" instead of "sale_price."
- The schema of the data warehouse table beneath a Looker view changes without a corresponding update to the view's LookML.
Syntax errors
Syntax errors are caused by SQL that the database can't parse—i.e., it isn't valid or correct SQL. Misplaced characters like periods or unclosed parentheses usually are the culprit.
Here are a few examples of syntax errors:
- Query execution failed: - Syntax error: Unexpected ")" at [151:33]
- Syntax error line 3 at position 69 unexpected '.'.
- Expression references marketing.utm_campaign which is neither grouped nor aggregated
When you write SQL in a SQL IDE, the IDE will typically catch these errors before query execution, but syntax errors become much more common when you write SQL in Looker.
Type errors
While not as common as missing column or syntax issues, type errors also trip up Looker developers frequently. For example, ever provided an INT column to a DATE_TRUNC function? We've all done something like this, especially during a data warehouse migration or while trying to learn a new SQL flavor.
Here are some examples of type errors:
- Function DATE_TRUNC does not support VARCHAR argument type
- No matching signature for operator = for argument types: INT64, STRING
- Invalid cast from TIME to TIMESTAMP
If SQL errors are so prevalent in LookML, what can we do about it?
What can we do about invalid SQL?
Here are some solutions for reducing the number of SQL errors in your LookML.
Proofread
It's easy to overlook this tip. Don't forget to check your work and proofread your SQL! When developing in a hurry, it's easy to forget a character or use the wrong column name. So before you commit your changes, give them one more read for accuracy.
Enabling Pull Requests and code review in Looker is also a best practice that ensures you're not the only one looking at your code before it's released.
Test changes manually in Explores
Before shipping LookML, you can switch to the Explore page and test any new dimensions, derived tables, or measures. Then, run sample queries with any new or changed fields to ensure they'll work as expected in your database. You can also manually load any important dashboards or Looks that depend on your changes.
Validate SQL with continuous integration
Manual testing is time-consuming—what can we do to automate the process?
If you use Spectacles, you can directly test the SQL defined in LookML as part of your Pull or Merge Request process. Spectacles runs lightweight queries to test your dimension SQL and reports any errors found in the database. If your LookML changes pass in Spectacles, you can be confident that your changes will work as expected in production.
Problem: Broken Content
In Looker, dashboards and Looks are built from fields defined in LookML. When you change LookML, you can accidentally remove or modify a field that a piece of content depends on, breaking that dashboard for all its users. 35% of the errors we see are content validation errors like this.
The only way to check for these kinds of errors in Looker is to use the Content Validator. Unfortunately, it's easy for developers to forget this step or skip it because it's become overrun with errors. When developers miss content validation, they will unknowingly break dashboards or Looks that your users depend on.
Let's look at two types of content validator errors.
Missing fields or Explores
It's probably not surprising that 97% of content validation errors we analyzed are caused by missing fields or Explores. These errors occur when a developer removes or changes the name of an Explore, dimension, measure, or another field. Unfortunately, the content still points to the old reference, so it fails to run.
Table calculation fields missing from query
If you're using a table calculation in your content, remember to keep the fields that constitute it operational. If a field goes missing from that query without changing the table calculation, it will cause the content to fail.
What can we do about broken content?
Encourage Content Validation
As a best practice, Looker developers should run the Content Validator on their branch before merging any changes. Get started by setting up some time as a team to clean up your Content Validator in production. This will ensure developers can identify errors that they're responsible for quickly.
Developers should also avoid renaming dimensions or measures directly. Instead, use Looker's alias parameter to avoid breaking existing references.
Validate Content with Continuous Integration
You can enforce content validation through continuous integration with Spectacles for even more seamless governance.
With Spectacles, you can run content validation as part of your Pull or Merge Request process, ensuring developers understand the implications of their code changes on dashboards and Looks.
Spectacles also enables ignoring content in personal spaces or specific folders, making mandatory content validation doable even if your Looker instance already has many content errors in production. You can also run the content validator incrementally, only displaying content errors that are caused by a certain branch. This means you don't have to clean up production until you're ready.
Problem: Looker and database misconfiguration
Even with perfect SQL and valid content references, Looker developers can still run into problems if Looker isn't configured correctly. However, these issues are less common, only representing about 7% of the errors in our dataset.
Missing user attributes
These errors occur when a user attribute is required for a Looker query, but that user attribute hasn't been set for the user running the query.
For example, if you define an access filter dependent on the "company" user attribute, but don't set that user attribute for a user who needs it, you'll end up with an error like, "No value of user attribute "company" set for current user."
Invalid LookML
If developers on your team prefer to develop and commit LookML in their text editor instead of the Looker IDE, you may encounter LookML syntax errors in production.
This issue can also occur when you commit and generate LookML through a script or automated process.
LookML that's written and committed outside the Looker IDE doesn't get checked by the LookML validator. So if you don't test that LookML before you release it, you can end up with errors like this in production:
"A LookML model issue occurred. Multiple primary key definitions for view 'dim_animation': 'dim_animation.animation_style' and 'dim_animation.animation_ssid'"
User access, connection, and permission issues
Because Looker sits on top of your data warehouse, any database access issues also affect Looker.
There are tons of possible access issues you could encounter, but here's a sample of what we've come across:
- IP 33.212.54.194 is not allowed to access Snowflake
- Incorrect username or password was specified
- User temporarily locked
- User does not have permission to query table
- Permission denied for relation users_fact
- User does not have bigquery.jobs.create permission
What can we do about Looker and database misconfigurations?
Manually test access for new users
When creating new users, double-check that they have the necessary user attributes defined. You should also test their access to important content and Explores to ensure you've configured their user correctly.
Looker user attributes should be documented in a central location so admins can determine how to set them for new users.
Require LookML validation for releases
If you allow LookML development outside of the Looker IDE, you should ensure that LookML is validated before it's released. For example, as part of your release process, you could manually check out the branch in Looker and run syntax validation.
Alternatively, you could use a tool like Spectacles to run LookML validation automatically on all Pull Requests.
Learning from our mistakes
If you've been developing LookML for some time, you've probably encountered most of these issues. Luckily, there are development best practices your team can take advantage of to reduce the number of bugs you ship to production.
In summary, here are some best practices you can implement today:
- Turn on Pull Requests and require at least one approving review for releases to production
- Test new dimensions and measures for SQL errors
- Require Content Validation and clean up your content validator in production
- Require LookML validation for all releases
Or, sign up for a free trial of Spectacles today to run all of these tests automatically on every Pull or Merge Request! In the words of Andrea, one of our customers, "You'll wonder how you ever developed in Looker without it."
If you've got other ideas for improving LookML code quality, or want to share your team's best practices for reducing errors, tweet them at @SpectaclesCI!