For the love of the game
Why analytics engineering is game design
A category of failure we don't often discuss as a community is the failure of collective imagination. We talk a lot about tools and trends, then go right back to doing things the exact same way, even as new technologies zoom forward around us. What if the reason we seem stuck making jokes about people downloading CSVs and screenshotting charts from our fancy BI tools into Excel and PowerPoint is not, in fact, about BI tooling? What if it’s about us?
The rising tide of LLMs, data apps, and the Semantic Layer is summoning a sea change, but I worry that if we don't shift our approach now, we're going to simply tack a more accessible chat interface onto the process of bringing a dataset into a spreadsheet or a chart into a deck. This isn't necessarily bad, but it's so much less than what it could be, and I've recently come to believe that the limiting factor is how we think about storytelling.
Storytelling in data is often likened, by default, to a novel: a linear experience. You make a chart, it starts at last year, it goes to today. Line go up, line go right, bank account go brrr. We like this vision of storytelling: it's simple, it's static. Analytics engineers have for years been referring to themselves as the librarians of the data stack, enabling the creation, cataloging, and consumption of these static, linear stories. Unfortunately, I think this paradigm might be all wrong. Analytics engineering is about storytelling, but it's about interactive, non-linear storytelling. Analytics engineering isn't writing a novel, it's game design. We talk about data as a guide to decision making, but I'm increasingly convinced data is not just the map, but the whole game.
Fun(ctions) and Games
My beloved colleague Kshitij wrote a piece recently that resonated deeply with me. Here's a quote that really hit home:
I can enter a flow state in dbt in a way I never can in Terraform. I can feel my dbt models coming to life thanks to the tight local feedback loop. I don’t need to flit between reading documentation and writing code, since dbt fits in my working memory. I’m free to play in a dbt sandbox without worrying about bankrupting my employer. That last feeling – play – makes working in dbt a deeply personal experience in a way Terraform isn’t.
This captures the experience I and many other analytics engineers have working in dbt, and the spirit of a satisfying and fun technical experience in just about any domain. This is the experience we're failing to create for our data end users. This is why people continue to download files into Excel: it's a game they know how to play and have fun with. The way we're designing our marts and BI layers leaves no room for this play and the creation of satisfying, non-linear experiences, because, frankly, we're not building them — and as long as we treat the modern data stack as a library for generating linear stories, we won't.
The seed of this perspective was a deep dissatisfaction with the design of user-facing data products. It felt incongruous to me that I could feel so fast and powerful transforming data, and so slow and limited using it. I started looking outside data for inspiration: the designs of Donald Judd, the architecture of Louis Kahn, the music of Laurie Spiegel, Brian Eno’s Oblique Strategies, and eventually games. I inhaled articles dissecting classic games, instructions on designing your first board game, and textbooks on video game design. Boiling down this obsessive research, the basic elements of a game are:
core mechanics: running and jumping, fighting, moving blocks around, uncovering clues
environments: a level in Mario, a chess board, the state of California, the kingdom of Hyrule
goals to achieve in that environment with the core mechanics: save the princess, score more points than your opponent, take over the world, solve a mystery
Let's look at how spreadsheets meet these needs:
core mechanics: inputting values, connecting and processing cells with reactive programming functions, visualizing datasets
environment: an n by n grid of cells
goals: it's a sandbox, so goals are determined by the user.
I can take any goal I imagine, go into a spreadsheet and build what I need. I can learn new moves, I get fast feedback, and by nature of the activity I'm goal oriented while doing it.
Let's contrast BI tools as they're most commonly used today:
core mechanics: filtering and...filtering.
environment: a visual display of charts
goals: “looking at numbers real quick”
Have you ever played a game that held your hand too much? It doesn't matter how great it looks and sounds, it loses meaning as a game because you're just clicking the occasional button as an experience-on-rails rattles forward. You're watching, not playing. This is the experience we've built for our end users in most BI implementations.
Forced to choose between accuracy and flexibility, we've veered hard into tight governance at the expense of creatively pursuing goals. Writing books instead of building games. This is the most exciting thing about the Semantic Layer to me: it lets us dissolve this tension and bring expansive flexibility and consistency to our data products.
We no longer have to choose between accurate metrics and freedom to explore, nor is this power confined to a single proprietary platform. We can build anything, and many things, on top of the Semantic Layer, and I suggest we build more games.
We do this by carefully crafting environments (balancing the depth, breadth, and interconnectivity of our data modeling), understanding and enabling users’ goals, and building core mechanics that enable progress towards those goals within the environment.
Quests over questions
So far this might sound a bit semantic or abstract, but I want you to understand this as something deeply pragmatic. So let's compare a traditional approach and a game design approach to the same common problem:
Question: What is monthly ARR?
Approach: Build a dashboard with a chart of monthly ARR at the top, and allow filtering on time periods and relevant dimensions. Move the card to done.
Quest: Increase monthly ARR.
Approach: Talk with stakeholders to identify possible levers for increasing ARR. Build a data app that integrates operational actions and views into these levers. Set a goal for the next 30 days and build monitoring for progress towards that goal into the dashboard. Slice the metric(s) down to where you can receive feedback on a human time scale. Have people play the game. Take action and refine repeatedly.
You’ll notice there’s a lot more interaction and iteration with the game design approach. It takes more resources to do fewer things. The trade-off is that those things might actually matter. If we want to avoid building multi-billion dollar vacuous panopticons of metrics whose impact we ourselves have little faith in, we have to reckon with our lack of intentionality. We have to start treating our internal stakeholders as real customers who deserve satisfying, delightful experiences that help them achieve their goals. And we have to get clear on what those goals are.
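To make the "set a goal and keep score" part of the quest a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Quest class, its field names, the straight-line pacing rule, and the ARR figures and dates are all made up, and a real data app would pull the current value from your metrics layer rather than hard-coding it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Quest:
    """Hypothetical quest: move a metric from a baseline to a target by a deadline."""
    name: str
    baseline: float  # metric value when the quest started
    target: float    # metric value we want to reach
    start: date
    end: date

    def progress(self, current: float) -> float:
        """Fraction of the baseline-to-target gap closed so far."""
        return (current - self.baseline) / (self.target - self.baseline)

    def expected_progress(self, today: date) -> float:
        """Fraction of the quest window elapsed, clamped to [0, 1]."""
        total_days = (self.end - self.start).days
        elapsed_days = (today - self.start).days
        return min(max(elapsed_days / total_days, 0.0), 1.0)

    def status(self, current: float, today: date) -> str:
        """Compare actual progress against a straight-line pace (an assumed rule)."""
        if self.progress(current) >= self.expected_progress(today):
            return "on pace"
        return "behind pace"


# Made-up numbers: a 30-day quest to grow monthly ARR from 100k to 110k.
quest = Quest(
    name="Increase monthly ARR",
    baseline=100_000.0,
    target=110_000.0,
    start=date(2023, 7, 1),
    end=date(2023, 7, 31),
)

# Halfway through the window, 40% of the gap has been closed.
print(quest.name, "->", quest.status(current=104_000.0, today=date(2023, 7, 16)))
```

The point of the sketch is the shape, not the math: a quest carries a baseline, a target, and a deadline, so anyone playing can see at a glance whether their moves are working, which a bare chart of monthly ARR can't tell them.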
We don't often leave you with homework in the Roundup, but I would encourage you, if this is resonant, to try applying this practically in your work over the next month. What's one data product (an app, a dashboard, a dataset) that you could bring this game design approach to? Identify it, and then work through:
What are the different available environments? How are you balancing depth vs breadth and why?
Who are the characters? What are their goals and conflicts?
What sort of mechanics are there for interacting with the environments and the characters?
How do people learn about the way to play?
And most importantly: What are the quests? How do you progress? How do you keep score?
Meanwhile, in other parts of the map
The illustrious Josh Wills gives a state of the union on my personal favorite dbt adapter, dbt-duckdb (and its pirate cousins Buena Vista and DuckDBT). I laughed, I cried, I gasped, and I enjoyed a small sense of pride as dbt Labs’ very own DuckDB-based project, the Core Quickstart, got a shout out.
Justin brings his signature clarity to a topic that’s under-studied in analytics engineering, the ORM. You know how the software developers you work with do things that break your data? That stuff was probably handled by an ORM! If you want to better understand and communicate with the devs who generate a lot of the data you work with, understanding these systems is key. Especially relevant as our ecosystem grapples more seriously with mesh structures and data contracts.
Expand and Contract
Speaking of contracts, PayPal released their data contract template in May. Honestly, I still look at it every couple weeks. If you’re equally fascinated by the ideas of mesh structures and a YAML spec the length of War and Peace, you should join me in regular perusing. There are a lot of ideas in this repo.
Particularly me, enthusiastically.
Sue me for dreaming of retirement to an adorably haunted library in Portland with my sassy magical cat familiar.
Potential future roundups, all.
Within these constraints, the story of No Man's Sky is illuminating. This was a game that touted a groundbreaking volume of procedurally generated levels, but was roundly dismissed as an empty disappointment on delivery. The developers stuck with it, and it's now a beloved game, but they didn't get there by saying 'let's double the number of planets'. No, they focused on adding more unique mechanics, more goals for you to choose and achieve.