Learning to fall
Also in this issue: vector databases, avoiding mediocre decisions, and rollups that anticipate that "real quick" follow-up question.
It’s finally starting to feel like summer out here in the Bay Area, so in keeping with the season, I’m back with another wakeboarding post 😎 Hope you’re having a good weekend wherever you are!
P.S. If you happen to be at either of the big data summits happening next week, stop by the dbt Labs booth. We’ll see you there!
Learning to Fall
Someone once told me that to truly get comfortable in a new sport, you need to learn how to fall.
The idea is simple: whether you’re learning new tricks, or just learning the basics, the thing that is often in your way isn’t your body or your equipment — it’s your brain. Until you feel comfortable enough recovering from a bad situation, you won’t be able to really push yourself past what your brain thinks is possible. We often say it helps to build “muscle memory” when learning something new, but what does that really mean? Your muscles will do whatever your brain tells them to do. It’s new brain memories that you really need — and you typically won’t make progress on those new brain memories until you straight up eat it a few times.
Here’s a drill I learned recently that really blew my mind. It blew my mind not because it’s particularly difficult or strenuous, but because it requires a lot of trust. Two kinds of trust, to be specific: learning to trust your rope, and learning to trust the person driving the boat.
It goes something like this:
you’re getting pulled by a boat going at 20 miles an hour and you stay upright primarily by providing resistance against the line that is towing you behind the boat — this is your base position
you get as far to the side of the boat as you possibly can
you lean back while staying as tall as you possibly can. Ever tried a trust fall during a team-building event, where you fall into someone’s arms? It’s kind of like that, but someone’s arms are the water below you
you lean back some more
you lean back some more
… you get the idea
The drill is designed to teach you exactly how much the rope you are holding on to will be able to support you before you fall — and it turns out the answer is so much more than your brain is telling you is OK. You go through several long seconds of “pull up pull up PULL UP” before you realize you’re still OK, and all of a sudden the next time you do a set your brain goes — “oh OK this is fine. I’ve seen this before.”
There’s another thing happening when you’re doing this drill: your coach, who is (typically) also the person driving the boat, is spotting the shore for you, making sure you’re not slamming your head against a tree branch or, worse, a boat going the other way. They’re watching you and ready to slow down at the first sign of trouble. They have your back, quite literally, as you lean further and further back towards the water.
Inevitably, if you do the drill right, you’ll fall. But when you do, you realize how much further you can go than your brain originally thought you could. And that’s the whole point.
Here’s what all of this has to do with data:
the boat is your warehouse. It sets your speed and defines the structure within which you are generally operating.
the water is (obvs) your data (lake, river or ocean — gotta love them water metaphors)
the rope is your data stack — it’s the only thing between you, the water and the boat, or you, the data and the warehouse layer
and the person spotting you from inside the boat is your friendly neighborhood data person. They’re setting the speed, they’re watching out for branches and oncoming traffic, they’re plotting a route down the water body that gives you the best possible conditions to do the thing you want to do on the water. Or to break the metaphor for a second, they’re structuring your data in a way that makes it more useful and intuitive, they’re deciding which data sets are best positioned to answer the type of question you likely have and making those available to you more easily — in other words, creating the best conditions for you to do the thing you want to do with your data.
What does “learning to fall” mean when you’re learning to work more with data? I think it means letting yourself trust your data stack waaay more than you have before. Definitely read the docs, use the best practices, etc. etc. to make sure your form is all good. And then allow yourself to fall — open yourself up to the possibility of making a mistake… and then really go for it. Do something you wouldn’t have done before without asking someone for help. Go down that rabbit hole.
When you’re done, show your work to someone on your friendly neighborhood data team — your coach and your boat driver. You’re going to find that what you were able to achieve on your own is far more than you thought possible. And it’s far more correct and insightful than you imagined.
Go try this wakeboarding-inspired drill at home! What did it shift for you?
Thanks for reading The Analytics Engineering Roundup! Subscribe for free if you want more wakeboarding tips (er, data commentary) 💜
Elsewhere on the internet…
I love a good Maslow’s “hierarchy of needs” diagram, don’t you? Thank you, as always, Winnie for hitting the nail on the proverbial head.
Why adding “just one more metric” to your report doesn’t have to blow your entire project timeline: Greg Meyer wants to change your mind on the beauty of the entity rollup, and why it’s an overlooked tool in your data modeling arsenal. My favorite part of this write-up is imagining what it looks like to anticipate related metrics, and the benefits that you get by doing so. This is where I get really excited about the intersection of semantic layers and AI: once you have well-defined entities and relationships between them in machine-readable form, you can start to let your data stack do some of the grunt work for you in terms of anticipating that next metric someone will want to look at. Done well, the UX will be invisible: by the time your stakeholder asks you a question, the answer will already be pre-calculated and maybe even have a historical trend.
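To make the idea of an entity rollup a little more concrete, here is a minimal sketch in Python. It is not Greg’s implementation: the order records, the field names, and the `rollup_customers` helper are all made up for illustration. The point is only that one pass over the raw events can produce one row per entity with several related metrics, so the “real quick” follow-up is already sitting there.

```python
from collections import defaultdict

# Hypothetical order events; field names and values are invented for illustration.
orders = [
    {"customer_id": "c1", "order_date": "2023-05-02", "amount": 40.0},
    {"customer_id": "c1", "order_date": "2023-06-10", "amount": 60.0},
    {"customer_id": "c2", "order_date": "2023-06-11", "amount": 25.0},
]


def rollup_customers(orders):
    """Build one row per customer, computing several related metrics in one pass."""
    rows = defaultdict(lambda: {
        "order_count": 0,
        "total_revenue": 0.0,
        "first_order": None,
        "last_order": None,
    })
    for order in orders:
        row = rows[order["customer_id"]]
        row["order_count"] += 1
        row["total_revenue"] += order["amount"]
        # ISO-formatted dates sort correctly as plain strings.
        date = order["order_date"]
        row["first_order"] = date if row["first_order"] is None else min(row["first_order"], date)
        row["last_order"] = date if row["last_order"] is None else max(row["last_order"], date)
    # The "anticipated" metric: nobody asked for it yet, but it is cheap to derive here.
    for row in rows.values():
        row["avg_order_value"] = row["total_revenue"] / row["order_count"]
    return dict(rows)


customer_rollup = rollup_customers(orders)
print(customer_rollup["c1"])
```

In a real project this would more likely live in a SQL model in your warehouse than in a Python script; the shape of the output (one row per entity, many metrics) is the part that carries over.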
Sarah Floris has published a brilliant high-level explanation of vector databases. My brain immediately connected the dots between Greg’s post and Sarah’s: imagine that same well-defined semantic layer of concept relationships that represent your data on top of a vector database of permutations of those concepts as they relate to each other. 🤯 Hello semantic search and natural language query results that give you something you didn’t anticipate but is actually useful. I can haz?
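If “vector database” still feels abstract, the core operation is finding the stored vectors closest to a query vector. Here is a toy sketch in plain Python, not taken from Sarah’s post. The three-dimensional “embeddings”, the concept names, and the `nearest` helper are all invented for illustration; real embeddings have hundreds of dimensions and come from a model, and a real vector database uses approximate indexes rather than the brute-force scan shown here.

```python
import math

# Toy 3-dimensional "embeddings" for a few metric concepts (made-up numbers).
concept_vectors = {
    "monthly revenue": [0.9, 0.1, 0.0],
    "average order value": [0.8, 0.3, 0.1],
    "support ticket volume": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, vectors, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Pretend this is the embedding of the question "how much money did we make?"
query = [1.0, 0.2, 0.0]
print(nearest(query, concept_vectors))
```

With these made-up numbers, the two revenue-flavored concepts rank ahead of support ticket volume, which is the “something related you didn’t ask for” effect in miniature.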
Why making mediocre decisions isn’t about the pool of options in front of you… but the process by which you arrive at that pool of options. Sean Byrnes has a great abstraction for us this week that applies to so many areas of running a business or your personal life: choosing your business strategy for the next few years, choosing a new home or a new place to live, enabling your kids to choose the best school possible for them… Sean’s entire series on decision making is a must-read.
And finally… one more thing on OLAP vs OLTP from Mark Freeman. If you’re familiar with the dichotomy, you’re not going to learn anything ground-breakingly new on OLAP, OLTP or the philosophical differences between them. But the engineering team you’re currently working with to set up some data contracts will absolutely have an a-ha moment. This post does such a great job bridging the two mental spaces occupied by data teams vs product engineering teams.