Ep 31: Data activation everywhere (w/ Julie Beynon of Clearbit)
As Head of Analytics at Clearbit, Julie serves as a data team of one in a 200+ person company (wow!).
In this conversation with Tristan and Julia, Julie dives into how she's helped Clearbit implement data activation throughout the business, and realize the glorious dream of self-serve analytics.
Listen & subscribe from:
Show Notes
Key points from Julie in this episode:
What does Clearbit do? How big is the team? How long have you been there? How is the data work set up?
So I joined four years ago, when we were about 40 people. At that time we were selling APIs to connect data to your tools, so everyone was pretty technical, and the data team was sort of anyone who could write some SQL and pull a report.
The company as a product has grown as our data org has grown, and we've grown into activating your data. When I came in, we had to just clean up the data. We had to turn it into something that we could actually use. And none of that existed yet.
And then we moved into self-serve data. I wanted to be as little of a blocker as possible. So how do you take the data you've collected and put it in a place where people can actually use it and self-serve, whether that's Mode or Tableau or Salesforce or wherever?
Where we're at now, we're still data 201 internally. If you look at Clearbit, you'll see we have a ton of data engineers and a ton of data professionals, but they work on the product that we sell. They're not doing any internal reporting or internal performance analytics.
So now we've grown into a team whose goal is to activate the data we collect, and so has our product. Our data team has grown in lockstep with the product.
What other destinations have you thought about as being super important to actualize self-serve data?
So the first thing I did when I started was call Tristan and say, "We need to bring someone in to help us".
So truthfully, I started and worked with Claire, and we built out all of the models that were required to give us that information. We didn't even have a table of all the users, so there was no way to know every single person in our world because they were in a whole bunch of different places. Account-level funnel metrics, web analytics, all of that had to be built, and I think to your point it's still something you have to do. I don't think any data person will say their data is perfect in any capacity.
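A minimal sketch of what that kind of "table of all the users" model might look like in dbt, stitching people together from the different places they lived. The staging model names and columns here are illustrative assumptions, not Clearbit's actual schema:

```sql
-- Hypothetical dbt model: one row per person, unioned from the separate
-- systems users lived in before a single table existed.
-- stg_product__users and stg_salesforce__contacts are assumed staging models.

with product_users as (
    select email, created_at, 'product' as source
    from {{ ref('stg_product__users') }}
),

crm_contacts as (
    select email, created_at, 'crm' as source
    from {{ ref('stg_salesforce__contacts') }}
),

unioned as (
    select * from product_users
    union all
    select * from crm_contacts
)

select
    email,
    min(created_at)        as first_seen_at,
    count(distinct source) as source_count
from unioned
group by 1
```

Downstream models (account-level funnel metrics, web analytics) can then join against this one canonical list of people.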
So you can always go back and iterate on getting the data ready, and as new data comes in, you have to go back and set that up for success as well. We eventually got to a place I like to call 80% - there's no scientific reason behind it - but I was 80% happy, and we had enough data to move to the next stage where we could answer 80% of the questions people were asking on a consistent basis. I had a lot of stumbles at this stage.
Getting the data ready had some genuinely hard pieces, but that's the relatively easy part. Where I find it harder is taking all this data you've done all this work on and making it useful. We definitely tried a lot of Mode reporting, and the number of reports I have sitting in what I call the report graveyard is really embarrassing. There's a ton of reports I spent hours and hours building, loaded with dynamic parameters so that someone could answer any question they could possibly think of, and no one ever used them. What I learned from that is we just need to go simple. Let's get to the easiest question that everyone can answer, and once everyone can answer that question, let's move to the next stage. That came in year two. So I fumbled for about a year trying to answer these questions and building these complex dashboards and crazy models, and nothing ever worked, because no one could answer the basic question.
One thing I'm trying: we want to eat our own dog food, so to speak. I want to put our data - the data that's not currently in our instance of Clearbit - in there, so that we can start to get insights inside of our own tool and understand how to be the best users, the best customers for our own product. So that's my new goal: how do I help us learn to be better with our tool and also get legitimate insights? Because our sales team has good insights.
Our marketing team still has to fumble around a little bit at the top of the funnel with Google Analytics. They're good at it, but I would love to help even more - there's a lot of data they can't access themselves. So by getting it into our own tools, we become our own top customer.
But right now my current journey is: how do we grow our total data team? Because we can't stay at one person anymore. And the reason I think we've stayed at one person is that I've struggled with being on this island by myself. I've seen a lot of smaller companies put data on the growth team or the marketing team, because that's where the data is, and then they have this one person they can get data from whenever they need it. That's where I've seen it start, and that's where I personally started.
It works when you're smaller, but as we start to grow, the problem you see is: we have data over here, we have data over there, everybody wants data in some capacity, and it's not consolidated. So my next real project is consolidating our data - the product and engineering data, our GTM data, our finance data - into one centralized place, because it's been kind of wild west. We have BigQuery, we have Snowflake, we have Redshift, and there are so many different paths to get to roughly the same answer. That's my big next project.
But as a cool project, getting our own data inside of Clearbit is another one I'm focusing on - how do we make data less of an overhead department?
You were focused on the persona you were serving internally, and you didn't have a lot of ego invested in it. Your users were at a certain place and you met them there. Is that a fair way of putting it?
Yeah. We spent all this money on Mode, and I tried to make it work, and it does work for some reporting. But it's not one size fits all, where every piece of data has to go through Mode. That's the struggle, and where we finally said, "What if we just put the data where the sales teams are comfortable?" If you're serving GTM teams, there's a ton of salespeople involved, right? Then we remove one barrier to getting the data, which is learning a new tool. Let's just put it where they're at, and they can do the reporting there.
The first time I felt truly successful in self-serve was when we got our attribution set up and our demand team was able to answer questions like "what's the most successful campaign we've run in terms of pipeline?" without ever having to ask me. And once that came in, we were even able to figure out influence. So maybe a campaign wasn't the thing that ultimately pushed someone into our funnel, but they read this blog post or downloaded this ebook before becoming a lead, and the team could get that influence view without ever asking me. That's when I said we've solved attribution self-serve, to a capacity. It can definitely be improved, but we're now at a place where I'm not involved in any of it. Nobody asks me, which is great.
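As a rough illustration of the "influence" idea (hypothetical table and column names, not Clearbit's actual attribution model): any campaign touch that happened before the opportunity was created gets influence credit, even if it wasn't the final converting touch.

```sql
-- Hypothetical influenced-pipeline query.
-- opportunities and campaign_touches are assumed tables, not a real schema.

with influenced as (
    select distinct
        t.campaign_name,
        o.opportunity_id,
        o.amount
    from opportunities o
    join campaign_touches t
      on t.email = o.contact_email
     and t.touched_at < o.created_at   -- touch happened before the opp existed
)

select
    campaign_name,
    count(*)    as influenced_opportunities,
    sum(amount) as influenced_pipeline
from influenced
group by 1
order by influenced_pipeline desc
```

The distinct in the CTE keeps an opportunity from being counted twice for the same campaign when it was touched multiple times.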
Do you think too much consolidation into one destination is a bad thing, or is it never enough? Do you just keep one stack that everyone can consume from, and the more consolidation you have, the better?
I think it depends on the type of data. Each warehouse has its pros, and there are a lot of cons to it too. So it depends on what you're trying to do with the data.
I'm not necessarily suggesting we take every piece of data that we have and put it into the same place. It's more about consolidating the processes, naming conventions, and logic, so that everything at least follows a similar path. We might collect things differently, but then we apply dbt so that it all follows a similar naming convention. Obviously, something that was created four years ago and something being created now aren't done the same way.
So we continue with that, but then have a layer where everything comes in and follows the same rules and practices regardless of how the data was collected. It can be stored wherever - we don't care where it's stored - but everyone follows a similar process.
That's probably where we'll be a year from now: the same rules applied to data, no matter what. If it's this type of data, do this; if it's that type of data, do that - but everyone follows the same rules.
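A minimal sketch of what that standardizing layer might look like as a dbt staging model, under the assumption of a hypothetical raw Salesforce source. However the data arrives and wherever it's stored, this layer renames and casts everything into one shared convention:

```sql
-- Hypothetical dbt staging model: whatever shape the raw source arrives in,
-- rename and cast columns into the shared convention
-- (snake_case names, *_id keys, *_at timestamps).
-- The source('salesforce', 'accounts') reference is illustrative only.

with source as (
    select * from {{ source('salesforce', 'accounts') }}
)

select
    id                              as account_id,
    name                            as account_name,
    industry                        as industry,
    cast(created_date as timestamp) as created_at
from source
```

Every other source gets its own staging model that applies the same rules, so downstream models don't care where or how the raw data was collected.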
High data quality is just essential for the Clearbit product. So I was wondering what kind of overlap there is between the data team and the product?
Yeah, so my view on this is that data alone is kind of a commodity. You can swap it in or out, and it's hard for it to be successful on its own.
We've had data for a while. I remember back when I first started, it was "let's do something with this data, let's make this data actionable". We had a ton of data, we just didn't know how to actually do anything with it. So the one thing I appreciate about Clearbit is that they take it a step further: let's activate your ICP, which is essentially find your best customer and actually do something effective with that. And that's what we've been growing to do.
But I think it's combining all of those tools: Enrichment allowed you to see who your best customers were, Prospector let you go after more of them, and Reveal let you know when they were visiting your site. Back in 2017 or 2018 when I joined, those three products worked separately but together, and now we've moved to a place where it's one platform that does it all in and of itself. There are no separate products, no separate APIs.
So I've just never thought of data alone as being very interesting. The reason a lot of data companies don't succeed is that they focus on the data quality, which is important, but everyone is going to be 1% or 2% off from each other. The data itself, depending on your market and your industry, is going to be pretty close in quality - it's what you do with it that counts. And that's why I've seen other companies fail.