Complexity: the new analytics frontier

Anna Filippova

Sep 18, 2022

What it's like doing analytics engineering six years on.

Read →

1 Comment

Taylor A Murphy

Sep 19, 2022

Other things that came to mind reading this based on my GitLab experience:

* Doing CI for dbt can be very confusing for new users. Building the mental model of what each step of a CI pipeline is doing and giving the decision tree for whether or not to use a specific test was a huge pain. I wanted it to run automatically but could never get it there (this was pre defer though...)

* PR review was tricky. Does this person need a style or substance review? Do i *really* have enough context to understand the scope of what they're trying to change? Do I *really* know the downstream impact of what the change is?

* How you orchestrate all this is another layer of confusion. Is it possible to have multiple Airflow DAGs that select parts of your dbt DAG but leave you unwittingly not selecting a part of that DAG but it still runs b/c there's an old table in the warehouse and you just don't have tests on that part of it yet? Yes.

* Creating a space where folks outside the team can build models in a less structured way that doesn't affect the rest of the dag (but uses earlier parts of the dag) was a fun challenge. They want to view their work in the BI tool but need "production" to do that!

* How do you surface tests that were built specifically for other ops teams to care about but those folks don't want or need the SQL skill to do their regular job? They just want the output of the failed test to action their work in their tool.

Those are just some of the things that came to mind. But having to worry about those rather than all the other fiddly bits dbt takes care of is much better I'd say :)

Expand full comment

The Analytics Engineering Roundup

Complexity: the new analytics frontier