March 20, 2023

How Epidemic Sound Launched LookML Linting

Valentin Leister
Freelance Data Consultant

Thanks to Valentin Leister and the team at Epidemic Sound for these valuable insights into how they pitched and rolled out LookML linting to 20+ developers using the Spectacles Style Validator. If you're inspired by the thought of working on a team that takes LookML code quality this seriously, Epidemic Sound is hiring. Connect with Valentin on LinkedIn for more recommendations on creating a scalable Looker project.

TL;DR

This spring, we rolled out LookML linting to our team of more than 20 developers as a required CI check for new LookML and fixed all existing issues in our project. We thought the Looker community could benefit from our learnings and experience. It's still early days, but we've already observed the following benefits:

  • New joiners learn our style guide hands-on when running the Style Validator on their Draft PRs, helping them onboard faster without requiring Senior Developer time for PR review.
  • PR quality has improved, allowing reviewers to focus on more important aspects like performance, scalability, and project architecture.
  • Rolling out the linter primary key check has raised awareness of the importance of good primary key tagging and improved our team's understanding of aggregate awareness/fan-out risks.
  • The improved awareness of fan-out risks has resulted in a follow-up project to improve the test coverage of key metrics.

Here's a summary of what we learned, but I encourage you to read the full post!

  • Go with a managed linter solution and invest the time saved to spring clean your codebase instead
  • Cherry pick the style rules that matter - not every rule will be relevant for your Looker project, so apply only the ones that actually make a difference
  • Leverage custom rules to fix your particular code style issues
  • Communication is key for smooth sailing: involve analytics leadership and developers early and explain why and how you are rolling out the CI check
  • People learn by doing - enforcing good coding style as part of CI/CD has done more for developer upskilling than writing endless coding conventions on Confluence
  • Reduced code reviews & time to deployment - PR reviews take less time and require fewer iterations and modifications

Why we decided to roll out the Spectacles Style Validator

What is a linter again?

A linter (or style validator, style checker) is a tool that checks whether you write good code.

In the context of LookML, good code mainly means LookML code that is

  • Easy for developers to navigate, upgrade, and maintain
  • Thoroughly documented and business user-friendly
  • Resilient to bugs and mistakes caused by code ambiguity or misunderstandings

When to put in place a LookML linter?

If all your Looker developers have several years of experience writing LookML and can recite your LookML coding conventions in their sleep, you probably don’t need one. 

On the other hand, if your development team comes from a mixed bunch of backgrounds and prior experience with LookML, chances are their PRs will not always comply with all coding conventions and best practices you have documented so nicely on Confluence.

If any of the following apply to your team, consider implementing a linter:

  • Junior team members with limited previous Looker/LookML experience
  • Team members from business backgrounds with limited software development experience
  • Large, complex project structure and extensive coding conventions
  • Pressure to ship new functionality quickly 

Managed solution vs. open source linters

Looker linters have been around for a while and open source solutions such as LAMS offer a vast range of style rules that you can apply to your project.

We picked Spectacles’ new managed linter solution (Style Validator by its official name) for two main reasons:

Time savings

  • It saves us time when cherry picking and customising the rules that are truly relevant
  • It saves us time when rolling out the CI check in GitHub and presenting the results in an easily digestible form

Developer experience

  • All our other Looker CI checks (SQL Validator, Content Validator) were already running in Spectacles, so having all runs and logs in one place just makes for a more streamlined developer experience.
  • The Style Validator’s run log is also very user-friendly: It’s nicely formatted and includes links to the relevant code and .lkml files causing a style issue plus a verbose explanation of the issue.
[Screenshot: the Spectacles Style Validator output.]

Steps and key learnings

Step 1: Define a timeline and get the green light from stakeholders & leadership

Lay out the steps, time and resources required for a proper, smooth rollout. Contrast that effort with the long-term technical debt of having to maintain and migrate bad code in your Looker repo, and present a convincing proposal to the key stakeholders of your Looker project. Decide whether passing the Style Validator should be required for any Pull Request to be merged or just optional.

Recommendation: Understand which code style rules are key to aligning the Looker codebase with analytics leadership’s long-term vision of the analytics platform and strategy.

Step 2: Configure and customise the Style Validator

Cherry pick rules

The Style Validator ships with 24 pre-defined rules, and you can configure which ones to apply via an lkmlstyle.yaml configuration file placed in the root folder of your project. While these rules make for a great starting point, we decided that only eight were critical enough to apply to our entire codebase.

Those eight rules relate to the following themes:

  • Data integrity and fan-out avoidance: making sure each view has exactly one primary key, explicitly defining join relationships (D107, J100, V110)
  • DRY (Don’t Repeat Yourself) code: measures use substitution operators instead of direct column references, and no duplicate views refer to the same table (M110, V112)
  • Naming consistency: Fields, Views and Explores in snake case (D101, M103, V100)

We deemed another 12 rules “best practice”, i.e. relevant but not critical. We ignore the remaining 4 rules entirely, as we don’t find them relevant.
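As a rough sketch, selecting only the eight critical rules might look like the fragment below. The rule codes are the ones listed above, but the top-level `select` key is an assumption made for illustration; check the Style Validator documentation for the exact key name before copying this.

```yaml
# lkmlstyle.yaml (sketch only)
# Assumption: a top-level `select` key restricts the run to the listed rules.
select:
  # Data integrity and fan-out avoidance
  - D107
  - J100
  - V110
  # DRY (Don't Repeat Yourself) code
  - M110
  - V112
  # Naming consistency
  - D101
  - M103
  - V100
```

Grouping the codes by theme with comments keeps the rationale for each rule visible to anyone editing the file later.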

Recommendation: Cherry pick only the rules that are actually relevant to your Looker project - keep it simple and make your developers’ job as hard as necessary but as easy as possible.

Use custom rules for ownership and data quality

The lkmlstyle.yaml configuration file also allows you to define custom rules, using regex syntax, for pretty much every object in your Looker project. We made ample use of this feature and defined four custom rules. For example, one rule ensures that all of our explores have a proper Data Quality flag and clear Domain Ownership within the Analytics Domain.

custom_rules:
  - title: "Explore group_label is not one of 'Domain 1', 'Domain 2', 'Domain 3'"
    code: CE102
    rationale: The label needs to clearly state the ownership of a specific explore for maintenance purposes.
    select:
      - explore.group_label
    regex: ^(?:Domain 1|Domain 2|Domain 3)$
    negative: false
    type: PatternMatchRule
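A data quality flag could be enforced with the same schema. The fragment below is a hypothetical sketch, not one of our actual rules: the code `CE199`, the `[Certified]` and `[Beta]` prefixes, and the `explore.label` selector (assumed to work like `explore.group_label` above) are all made up for illustration.

```yaml
custom_rules:
  - title: Explore label does not start with a data quality flag
    # Hypothetical rule code, chosen so it does not clash with existing ones
    code: CE199
    rationale: Business users should see at a glance how trustworthy an explore is.
    select:
      # Assumption: label can be selected the same way as group_label
      - explore.label
    # Hypothetical flags; single quotes keep the backslashes literal in YAML
    regex: '^\[(?:Certified|Beta)\] '
    negative: false
    type: PatternMatchRule
```

As in the ownership rule, `negative: false` means the pattern has to match for the explore to pass.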

Recommendation: Consider designing custom rules to address your specific code style requirements.

Choose the right directory scope

If you follow best practices for layering Looker projects and make ample use of auto-generated views and refinement layers, chances are that some auto-generated code will not comply with certain rules. To avoid your developers banging their heads against their keyboards in anger every time they add auto-generated views in their PR, make sure to carve those directories out from the scope of the linter.

file_overrides:
  - path: 1_schema # Excluded as it contains exclusively auto-generated views
    ignore:
      - D107
      - M110
      - V110
      - CE101
      - CE103

Recommendation: Avoid frustration by only linting relevant directories in your Looker project.

Step 3: Inform and educate your developers

Inform all your developers about the rollout and the exact timeline. That way, they can plan their work accordingly and avoid missing deadlines because they need to fix some style issues that were not previously flagged. Also present all your cherry-picked and custom rules for review - experienced developers may have valuable input on how to improve rule definitions or scope. We ultimately decided not to enforce descriptions on all fields given the size of our team and current capacity.

Most importantly, accompany this communication with a rationale for each of the rules and, ideally, some upskilling on related basic software development principles such as DRY (Don’t Repeat Yourself) and ETC (Easy To Change) code. This will allow developers to perceive the Style Validator as helpful guidance and an additional quality badge for their work, rather than an additional hurdle to getting work done.

Recommendation: Explain why you are introducing the linter and why you've chosen specific rules.

Step 4: Repo spring clean

To make your entire Looker repo, and not just each incremental PR, comply with the critical rules defined earlier, treat the linter rollout as an opportunity for a proper refactor and spring clean.

Of course you could roll out the Style Validator in incremental mode only, but frankly, to me that feels like spray painting over a car part held together with duct tape.

Add the Style Validator to your repo, create a suite in Spectacles and perform an ad-hoc run on the current main branch to identify all style issues in your repo. Then, distribute or centralise the work to clean up all style issues, depending on how the work is performed most efficiently. 

Note that we created one separate refactor branch for everyone to contribute to, as we wanted to deal with all breaking changes (e.g. renaming of fields to snake case) in one go.

Recommendation: Try to invest some developer time into cleaning up the existing codebase, rather than just applying the Style Validator on new additions (incrementally).

Step 5: Required CI check roll-out

This is where it all comes together: You need to merge your spring-cleaning PR into production, fix any breaking changes with the Looker Content Validator and roll out a required CI check for merging any new PRs that could potentially violate your style requirements.

We solved this with zero disruption by deploying the clean-up at close of business, fixing the breaking changes after business hours and requiring the Style Validator check for any PR to be merged from midnight onward.

When we activated the required CI check, we reiterated the message to developers to create Draft PRs and run the Style Validator early. Since the Style Validator cannot (for now) be run locally or inside the Looker Developer UI, this helps to avoid frustration and the rewriting of larger code contributions.

Recommendation: Pick a time of the day and week with low usage to activate the required CI check. Advise your developers to run the Style Validator early when preparing a PR.