Ep 9: Friday Night (Data) Fights w/ Benn Stancil of Mode Analytics

Is the Modern Data Stack a real thing or just marketing buzz?

Oct 21, 2021

Benn is Chief Analytics Officer and a Co-founder at Mode Analytics, but you may know him from his Substack and Twitter, where each Friday he dives into a semi-controversial topic (recent examples: “Is BI Dead?” and “BI is Dead”).

In this episode, Benn, Tristan & Julia finally hash out some of these debates IRL: what *is* the modern data stack, why is the metrics layer important, and what’s the point of all of this?

Show Notes

Key points from Benn in this episode:

What is your take on the modern data stack? Is there a definition or is it just marketing buzz?

It's probably mostly marketing buzz.

Well, there certainly is a new cohort of tools and things like that. In terms of what is and what isn't in the modern data stack, you can come up with a variety of different definitions. I think some of them are tools that follow the philosophies that Fishtown and dbt have pioneered. A lot of what dbt believes has been very influential in this.

I said this as a joke to answer this question a couple of months ago, but I think I actually like it as an answer more than I thought, which was that the modern data stack is data tools that were launched on Product Hunt. And I actually think that's not a bad answer because it says two things: 1) it's roughly the timeline in which they were launched, and 2) it's the audience that they were meant for. If a tool came out like "Let's immediately sell this to the Fortune 100, it's going to be a very IT-focused sort of thing", I think a lot of people would be like "Is that part of the modern data stack? I don't know". Those are the kind of tools where you go to the website and you have to spend 20 minutes being like "What in the world is this product?"

And I think there is this sense that the modern data stack is the tools that are a little bit more immediately understandable. They are tools that fit into the philosophies of the way that Silicon Valley thinks about tool development and software development these days. And in some ways, again, the shorthand for that, to me, is "That's kind of the stuff that people put on Product Hunt".

Share with us a little bit more about the article you wrote about the hazards of categorization. What are the risks if we over-categorize things?

So, generally I agree with the modular, composable approach ("composable" is a word that I think I 50% understand). But the challenge of it obviously is if you go too far: how much do you divide everything up?

And there are lots of examples of places where you could do this, where you want political territories to be a certain way, but you can't divide them up infinitely small. And it feels like, in the technology landscape of the modern data stack, whatever that is, there is a lot of jockeying for "How do I wedge my idea into this thing?", in part because it is something that lots of people are talking about, there's a lot of money in it, and it seems like the sort of thing that's going to be the next big thing. And there are lots of little places where it doesn't do a good job. There are lots of products, there are lots of gaps in products, you can find lots of holes to stick stuff in. And it feels like that's a little bit of the position we're in, where we try to look for new places where there's "Oh, here's a thing to sand down. Let's stick a product in it".

This is one I am guilty of being part of; literally one of the things I've written on this Substack is "The missing piece of the modern data stack". But I think in the short term, that's probably good. You need to figure out where these problems go, and you need to bring on new technologies and new innovation. The challenge to me seems like: how do you eventually sand it down into something that's actually smooth?

I think that there are lines that make sense to me around what are the boundaries of "Okay, these things should probably be so closely tied together that they exist as one product". I'm sure there's lots of ways to dispute that. But I think you end up with a really bad experience if you fragment it too much, because everybody's just trying to solve their one little tiny part of the assembly line, and nobody's thinking about the bigger picture of what this thing is supposed to be producing.

There's lots of questions about how that all shakes out. And I think that's one of the reasons why consolidation to some degree seems inevitable, because regardless of whether the experience is bad or gets better, these things all probably can't be standalone businesses. At some point the economics is going to become a factor in this too.

But I think that we have to have some sense of where this whole thing is supposed to go. Each of us is some cog in this bigger machine, so what is that machine supposed to do? And we can't just say, "Well, we're going to solve our problem and hand it off to the next one and hope that they do a good job of it and the final experience of using it cohesively is all nice and neat". The product we ship is a reflection of the org charts that we have and, in our case, our org charts are a bunch of different companies.

What's your take on the metrics layer? How does that have to evolve if we have independent or discrete metrics?

I don't think it has to, but I think this is a place where integration could be better, and there is sort of a better solution.

So, to me, the point of a metrics layer is basically governance. Transformation and data modeling are effectively governance in the sense of defining what data means and making sure it's consistently reported on and things like that, not governance in the sense of who has access to what; that's sort of a different thing. But it's effectively governance of saying "Okay, all of our metrics say the same thing. When I go look at AR and you look at AR, we see the same stuff. If we compute it by month or by customer or whatever, we are getting reliable things".

In terms of the natural boundaries of where something should live, that to me makes a lot more sense in a layer that looks like dbt or is adjacent to dbt than it does in a BI tool, because in a BI tool that part of governance is only consumable by the BI tool. So say you have dbt and a tool like Looker: you end up splitting your governance, half in dbt and half in Looker. And if some other application of data wants to use the thing that is governed in Looker, you don't really have a clean path to get there.
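As a rough illustration of that idea (an editorial sketch, not something shown in the episode), a metric can be defined once and then sliced by whichever tool consumes it. Everything here is hypothetical: the `Metric` class, the `REVENUE` metric, the sample rows, and the `dashboard_view` and `notebook_view` consumers are made-up names, and this is not how dbt or Looker implement anything.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class Metric:
    """A hypothetical metric: one shared definition of what gets summed."""
    name: str
    measure: Callable[[Row], float]

    def by(self, rows: List[Row], dimension: str) -> Dict[str, float]:
        # Every consumer goes through this one method, so the definition
        # of the metric cannot drift between tools.
        totals: Dict[str, float] = defaultdict(float)
        for row in rows:
            totals[str(row[dimension])] += self.measure(row)
        return dict(totals)


# Made-up sample data, for illustration only.
ROWS: List[Row] = [
    {"month": "2021-09", "customer": "acme", "amount": 100.0},
    {"month": "2021-09", "customer": "globex", "amount": 50.0},
    {"month": "2021-10", "customer": "acme", "amount": 120.0},
]

# The single governed definition, living outside any one consuming tool.
REVENUE = Metric(name="revenue", measure=lambda row: float(row["amount"]))


def dashboard_view(rows: List[Row]) -> Dict[str, float]:
    """Stand-in for a BI dashboard: the metric sliced by month."""
    return REVENUE.by(rows, "month")


def notebook_view(rows: List[Row]) -> Dict[str, float]:
    """Stand-in for an ad hoc analysis: the same metric sliced by customer."""
    return REVENUE.by(rows, "customer")


if __name__ == "__main__":
    print(dashboard_view(ROWS))
    print(notebook_view(ROWS))
```

Because both consumers call the same definition, the totals agree no matter how the metric is cut, which is the kind of consistency described above.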

And Looker in this case basically represents the way that BI has traditionally been built, for reasons that are more architectural and technical and stuff like that. That is how BI has often come around: it's BI as a modeling layer and a visualization layer smashed together.

I think we have gotten to a point where we have the technology to be able to pull out the modeling layer, and I think soon the metrics layer, and certainly there are companies that believe that. If that leaves the BI layer as just kind of a drag-and-drop visual thing, you could continue to do that, and then you could have an analytics layer of data science and those sorts of things, with all of those consumption pieces as other tools. To me, what essentially happens though is you just cut the stack differently. Rather than cutting it where there is modeling and BI on one side and data science, talking to the granular data in the database, on the other side, you have modeling as half of it and then everything from dashboarding and BI down to advanced analytics and data science all living in one place.

The reason you combine those things is there is a very thin and fuzzy line between "What is BI? What is a dashboard? What is a report? What is data science? What is ad hoc analysis?" All of those things are kind of the same and they get jumbled up in a way that it doesn't make sense for a consumer of data to be like, "Oh, this answer lives in BI, and this answer lives in my data science tool". It's like you're all just trying to make sense of data.

So in some cases, it's with interfaces that make sense for people who want to do drag and drop. In some cases, it's interfaces for people who want to write code. In some cases people want to mix the two, and in some cases you start with one and want to move to the other. And so, if those things all work together in a way that's more fluid than just integrations between products, I think you're able to answer questions much more quickly, and if the governance lives in another place, you're able to answer those questions much more reliably.

What makes a great analyst?

It's not to be a "benngineer". This is the hard part about this: are analysts technical? My answer is they shouldn't be, and then it's like, "What are they instead?" And I come up with just a bunch of word salad that doesn't make any sense.

It's this: you have a kind of fuzzy problem and you can turn it into something that you can actually answer. It's like, "How do you problem solve?" I've always viewed it as being a detective of sorts, which I don't know anything about, short of watching movies and stuff. But I imagine what makes good detectives isn't their technical forensic ability (I'm sure that helps). It's more that you walk into a crime scene and you see the things that are out of place. You see this thing that's out of place and that thing that's out of place, and you're able to deduce from those two things: "Oh, wait, if that happened and this happened, what might've happened here? If this is the thing that happened, what are other implications of that that I might be able to go find?"

And so in data, you have a question and you need to figure out what the clues are that might point you in one direction or another. If you see something that looks out of place, do you notice that? Do you understand what the implications are, like what other things might be out of place because this is out of place? Can you go find those things? Can you build those stories in your head? And be like "All right, if this is like this, then maybe it's one of these three things. If it's this one, then I need to go look at this problem; if it's this other thing, then I need to go look at that one". It's how you come up with theories in your head and test them with indirect ways of measuring them.

You have to be curious enough to look, then you have to be observant enough to see it, and then you have to be analytical enough to connect the dots.

Looking 10 years out, what do you hope to be true for the data industry?

I hope that we basically spend a lot more time on the detective work and a lot less time searching for clues. Really, to me, the thing that I think makes data valuable is you have it when you need it, you don't have to go fishing for it in a bunch of haystacks. And your job as whoever it is that's using it, whether it's a person in sales who is trying to make a decision about who to go talk to, or a marketer trying to figure out what campaigns to run, or a CEO trying to figure out what company to acquire next, is being able to weigh those decisions and ask, what are the implications of this?

If I'm a CEO, should I start a team? Like build a sales team in Europe? There's a lot of things that go into making that decision. Part of it is collecting a bunch of data and being like, "What kind of view do we have into this market?". We spend too much time on things like the collection and figuring out all of the inputs to that, and too little time on, "All right, we have all this information. What decision do we make about whether or not the sales team should be in Europe or whether or not it should be in Asia or whatever?".

And I would rather we be able to spend all of our time on trying to reason our way through those problems and talk about what we do instead of "Okay, what data do we have? How do we get there?" That kind of stuff.

And so there's a small version of this, which is you have an executive dashboard and an executive meeting or whatever. How much time do you spend debating "Is this number right?" or "Why is this number what it is, what changed, and what's underneath it?" versus "What do we do about it?"

And right now, I think even on that small scale, we spend a lot of time asking, "Is this number right? What does it mean? What’s changed? Did something break?", wondering kind of, "Is the data accurate? Or is it representative of something?" instead of saying, "Now that we know this, what do we do?".

And there are small decisions, there are big decisions. There are decisions all the way from company-changing decisions of "Do we launch new products?" and stuff like that, down to the small, minute decisions of "Who do I as a BDR call next on my list of 200 people?". But the more we can think about "What action are we actually going to take?" instead of "What does this data mean?", I think that is the ambition to me.
