Discover more from The Analytics Engineering Roundup
Three missions for the Community
Where we go from here
One of the first things I did this year after coming back from Coalesce was reread the dbt Viewpoint.
If you’re looking for the foundational document of the dbt Community and analytics engineering, you don’t need to look any further than the Viewpoint. It lays down a set of guiding principles about how data teams can operate like software engineers - writing modular code, collaborating and building workflows designed for scalability and maintainability.
An interesting thing about this document is that depending on when you read it, you’d probably have a different reaction.
When I first read the Viewpoint in 2017 - it felt borderline transgressive. The idea that data practitioners could own their code workflows and begin to operate like software teams felt bold and new.
If you’d read it in 2020 or 2021, you probably had a sense that this was a natural evolution, the next step for the industry.
And reading it today - well it feels, dare I say it, obvious. Maybe even a little cliche?
As a bit of a historian of analytics engineering and the dbt Community - I want to take a second and reflect on this transition.
The early days of a movement have this fiery, energetic quality to them. You see this deep truth about a way the world could be different and just want everyone to know about it.
Then - if you’re very lucky, you just might get to see that actually happen.
That was my overwhelming feeling at this Coalesce - that the Viewpoint is no longer a wild vision of a world that might be, but rather a day to day reality for data practitioners worldwide.
And then, if you’ll believe it, I had a bit of an existential crisis. If we’ve gotten this far, then what’s left to do?
Well, it turns out - most of the interesting stuff. What you learn by sticking with a problem for a while and watching it grow and evolve is that you get a sense of its true depths: its edges, its nooks and crannies, and the parts that are really important.
So today I thought we’d talk about three large scale missions ahead for us at dbt Labs and the dbt Community overall:
Spreading analytics engineering everywhere - despite the rapid growth and adoption of dbt, the simple fact of the matter is that most people who will ever use dbt have not started yet. To these people, the ideas and values expressed are going to be just as fresh and new as they ever were.
Solving the second order challenges created by the adoption of the Viewpoint - obviously we believe (as do our 90,000 closest friends in the dbt Community) that dbt is a game-changing unlock for data teams - a true before and after moment in the data world. But that doesn’t mean there aren’t challenges left to solve - in fact some of the thorniest challenges that exist can only truly be tackled at scale.
Bringing to life the second order opportunities created by the adoption of the Viewpoint - right now there is a whole new category of problems that are solvable today because the adoption of the dbt Viewpoint was a prerequisite.
Let’s take a look at each of these.
The next generation of analytics engineers
A few years back, if you were a dbt user, we could be pretty sure about your organizational background, location and industry. You probably worked at a venture backed startup (most likely SaaS or ecommerce), you probably had a modestly sized data team, and you were probably physically located in a major tech hub.
While there are still plenty of people in the Community that fit this profile, one of the great joys of this era of the dbt Community is seeing just how many new people are coming into this world and the incredible variety of experiences they bring with them.
One of the most interesting and unintuitive features of being on an exponential curve is that from wherever you are standing, most of the value has been accrued relatively recently.
What this means for dbt is that we can expect that over the next few years we’ll see drastically more people engaging, bringing fresh ideas, perspective and excitement.
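The exponential-curve intuition above can be checked with a little arithmetic. Here is a minimal sketch using made-up numbers: the doubling rate and the ten periods are assumptions chosen for illustration, not dbt adoption data, and the function name is hypothetical.

```python
def share_from_recent_periods(growth_factor: float, periods: int, recent: int) -> float:
    """Fraction of the cumulative total contributed by the last `recent` periods.

    Assumes the amount added in each period grows geometrically:
    1, g, g**2, ... for `periods` periods (an illustrative model only).
    """
    additions = [growth_factor ** t for t in range(periods)]
    return sum(additions[-recent:]) / sum(additions)


# Hypothetical scenario: new additions double every period, for 10 periods.
last_one = share_from_recent_periods(growth_factor=2.0, periods=10, recent=1)
last_two = share_from_recent_periods(growth_factor=2.0, periods=10, recent=2)

print(f"Share of the total from the most recent period: {last_one:.1%}")
print(f"Share of the total from the two most recent periods: {last_two:.1%}")
```

Under these assumptions, just over half of everything that has ever been added arrived in the single most recent period, and about three quarters arrived in the last two. Slower growth rates soften the effect, but the shape of the argument stays the same.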
We can expect the practice of analytics engineering to grow across a number of dimensions.
Org sizes - from solo data teams to global-scale enterprises
Industries / use cases - from SaaS and e-commerce to industrial, agricultural and public sector use cases
Geographies - from tech hubs in the US to the entire globe
Bringing on board this next generation requires the same spirit that has always made the Community great. The delight when a new user has a lightbulb moment. The spark of recognition that this person’s problems are a lot like our own. The joy of rolling up our sleeves and figuring it out together.
The only difference is now a whole bunch more people have a seat at the table.
Solving for the second order
A big theme of Coalesce 2023 is creating the systems and tools to help solve the second order challenges of the adoption of dbt.
Any widespread adoption of a powerful technology or idea will necessarily lead to a huge number of wide ranging second order effects.
Right now - the biggest second order challenge we’re seeing is a rise in the complexity that can happen as larger teams adopt dbt.
Solving these second order effects is actually a necessary component of following through on your initial vision. It’s not enough to have the first unlock, the spark of an idea if you can’t also build the systems and processes to enable widespread adoption.
Second order problems have a bit of a different flavor than their first order counterparts. They tend to have a thorniness about them - reality is sticky and works to avoid being changed. Solving them requires us to have the intellectual humility to recognize them while at the same time maintaining the tenacity to keep trying new things.
This year’s Coalesce saw our first big swing at tackling the second order problems created by the rise of analytics engineering - dbt Mesh. dbt Mesh looks a little different from some of our other releases: it is not a single feature, but a set of features, patterns and best practices.
The goal is to create a new set of primitives, abstractions and access points that combine to allow for autonomy, agency and fluidity while doing data work - even on the largest and most complex projects.
But the truth is that while the details of Mesh look different, the tenets are straight from the original Viewpoint. Cross-project ref enables modularity on a larger scale. The governance features of groups, contracts and versions allow us to truly design for maintainability and create service level guarantees. And dbt Explorer allows you to work with your documentation in a state- and metadata-aware environment.
It’s the Viewpoint moving up the stack.
The road goes ever on and on
The flip side of second order problems is that once you have widespread adoption of a new standard - you unlock a whole category of second order opportunities. These are problems that can only be solved once you have achieved a baseline level of adoption and usage.[1]
In many ways, this is the whole game of this epoch of the industry: moving from one-off solutions to standards. From ad hoc transformation pipelines to strongly typed, version controlled, documented, tested DAGs.
The Semantic Layer is our big bet on the next wave of unlocks enabled by the adoption of the dbt Viewpoint.
While the idea of a Semantic Layer is not new, the conditions for a universal Semantic Layer couldn’t exist until we had seen widespread adoption of an open standard for data transformation. There simply wasn’t the nexus of technology, tooling, alignment, ecosystem and expertise needed to make this a reality.
Now there is - and there’s a fully featured, generally available Semantic Layer built on top of dbt that can provide organizations with flexible but trusted access to all of their most important metrics.
🎉 If you missed Coalesce, you can still catch the highlight reel of dbt product launches:
November 1st, 11:00 AM CST (North America Friendly)
November 2nd, 11:00 AM AEDT (APAC Friendly)
November 2nd, 10:00 AM GMT+1 (EMEA Friendly)
What is the next wave of second order opportunities that gets unlocked here?
No one can quite say yet. Maybe it’s the fact that we now have the ability to create standards not just around how we model our data but around the actual metrics we use (check out SOMA for a peek into the future here).
Maybe it’s the fact that LLMs and Generative AI systems are going to need a standard interface to access our organizational data.
Probably it’s something that was discussed in a quiet corner of this year’s Coalesce over afternoon donuts or on a nighttime stroll to grab tacos.
We can’t know for sure what comes next or where this all goes - and that’s what makes this fun.
What we do know for sure is what makes this worth it - all of you. To everyone who came out to San Diego, who joined online, who has ever had a strong opinion about dbt, the Viewpoint and how to make this all better, thank you. It’s been an absolute honor to watch this community grow, learn and transform.
[1] One way to understand dbt and the modern data stack is as a second order opportunity of cloud computing and particularly cloud data warehouses.