Coalesce 2024 edition: What’s next for data teams?
Brooklyn Data Co. Founder Scott Breitenother joins Tristan in conversation at Coalesce 2024 in Las Vegas
Welcome to Season 6 of The Analytics Engineering Podcast. Thank you for listening, and please reach out at podcast@dbtlabs.com for questions, comments, and guest suggestions.
Scott Breitenother, founder of data consultancy Brooklyn Data Co., joins Tristan at Coalesce 2024 in Las Vegas to discuss the early days of dbt, the evolution of data teams, and what's next for the dbt community.
Key takeaways from this episode
So, you have some news to share that I think you've been talking with some different folks about. So, I don't think we're going to break this live or anything, but it's a little bit of new information. Do you want to get it out there?
Scott Breitenother: Sure. I started Brooklyn Data in the summer of 2018. We grew rapidly, with big thanks, I think, to dbt and all the success the dbt community has had. We built something really special.
And about a year and a half ago, we went through an acquisition. We were acquired by Velir, which is an amazing digital agency based out of Boston. I couldn't have asked for a better home for Brooklyn Data. For the last year and a half, I've been working hard on integrating. Any acquisition is like a marriage. I feel like I can do a whole podcast about integrations and acquisitions and everything I've learned.
But the story is that the integration has been great. We're one company, and at the end of October I'm going to be stepping back from the day-to-day. I'll still be a board member of Velir and Brooklyn Data. I'm going to be involved in the strategy, the big picture, all that kind of fun stuff.
But I won't be doing any more hands-on work, and I’m not quite sure what's next. No plans. So I feel like I'm gonna have to ask you for some help on coming up with hobbies.
When you were at Casper, you were very involved in dbt and the dbt community, and then you started Brooklyn Data. I remember asking how you were managing to be a husband, a new father, and a founder. What did it take to get this thing off the ground?
I think the answer is I didn't sleep a lot. You hear the saying it isn't work if you love what you're doing. And I mean, it's true. The reason I have to find a hobby is because building Brooklyn Data has been a hobby. Any free processor time I had on my mental CPU, I was thinking about new and innovative ways to grow Brooklyn Data. Not because I was on the clock, not because I had to, but because I loved it.
We met when you were working with us at Casper and helping us set up dbt. And I felt very privileged to, I don't know, be in the know on something like dbt. It was special and cool.
I told all my friends because dbt was great and it was changing my life, my team's life, and the data scene, especially in New York.
We had a lot of people asking us for advice because at that point we had probably one of the more sophisticated dbt setups in 2017, 2018. There was a point where we were like, there's a business here. There were very smart people working on very cool pieces of software—dbt, Snowflake, Looker at that time. But there was a gap in professional services to guide companies on this journey to get the most out of these tools.
The classic consulting companies had not really woken up to this, and you couldn't just show up to Accenture and say, “Help me build a modern data stack and move faster.”
Do you think that the McKinsey of the modern data environment is just McKinsey? Is there still a great independent consultancy to be built out of this era of data?
I don't think there’ll be a pure-play large-scale data consultancy. It's too early. You need to be a broader agency.
What's happened over the years is that the industry has matured, and you need scale: for brand awareness, and to invest in systems, account management, sales teams, sponsoring conferences, and all the things that, when it was just me in my Brooklyn apartment coding, I couldn't have even imagined or afforded. And so you need that scale.
One of the reasons we chose to join Velir, which is a digital agency, is you need multiple complementary service lines. Data work by its very nature will ebb and flow. Not every organization is going to refactor their data warehouse every single year.
Don't get me wrong, we have long, durable, amazing relationships with our clients. It typically starts with kind of a bigger project, then it evolves to different projects. And for many of our clients we have a long, durable, similar-sized relationship for many years.
The best way to kind of keep those long durable relationships is to take that trust that you have built with this client and help them with other stuff. The beautiful thing about data is it plugs into everything.
I started my career at Deloitte, and I've now gotten to see things from the other side partnering with the largest consulting organizations in the world. Trust is just not replicable without years and years of relationship building.
People buy from people. With services, the people are the product, so it's doubly true. I reinforce to my colleagues at Brooklyn Data that we're somewhat in the hospitality business. We're in the surprise and delight business. Listen and read between the lines, give people what they want, but also give them what they need. Truly successful consultancies are empathetic.
At the end of the day, there is going to be a transactional relationship to any third-party consultancy, but it should feel as much like a partnership as possible, and I think we've done that. A lot of our clients call us a consultancy that doesn't feel like a consultancy, and I think that’s a compliment.
You are a little bit emblematic of a data practitioner of a particular era. You started as an econ person, right?
Yes, I studied business. I did strategy consulting for four years. Excel, PowerPoint. And it was very quantitative. Very quantitative. No SQL. I built economic forecast models in Excel. I remember waking up at 2 in the morning early in my career with Excel nightmares. Now I look back and I see the testing and all the version control that you have in data would have saved a lot of heartache early in my career.
Your Excel macros were strong?
I’m extremely good at Excel shortcuts.
In 2009, I remember hanging out with some friends and someone asked, “What is the thing you think you're top 10,000 in the world at?” And I said I'm really good at Excel. Everybody else had better answers, but it’s a point of pride.
I mean, it's still my happy spot. I love when I code dbt. I don't do it very often, but when I do, it's very tangible, very fulfilling. I love when I do Excel. And no one's looking, but I'm still gonna format it really nicely. No one's gonna see it. This is just for me.
I really like it to always be the correct font size across the entire sheet. And then the
One hundred percent.
You got a job running data at Casper. How did that happen?
Yeah, that's a head scratcher for me. I'm not quite sure how I got it. I was living in London at the time, working consulting. Moved to New York, had taken a sabbatical from my consulting job and decided I was going to work in tech. And I didn't know what that meant.
I just literally went to meetups and reached out to anybody I had the most tenuous connection with on LinkedIn, went to a meetup every night for months.
I do really, really recommend to everybody, networking is key. Build your network. Go to meet people. Actually meet people. Meeting people in real life builds a much stronger relationship than digitally.
A friend of a friend introduced me to the founders of Casper. They brought me on as a freelancer and I remember in the interview, they asked, “why should we hire you?” And I said, “You don't want to hire someone that knows the tool and only knows the tool, you want to hire someone who knows how to think and can learn any tool.”
You were the first data hire?
Yep, I was employee number 16 at Casper. I was there for four years. The data team grew to 15 or 16 people. The Casper experience was amazing. I learned so much. It set my whole career up. People talk about the PayPal mafia, I think there's a Casper mafia. Everybody is doing very cool things.
Someone took them private. I mean, I still buy Casper mattresses. I don't know anybody that works there, but I buy it out of brand affinity.
I bought the top-of-the-line mattress.
Oh, I'm still cheap. It's supposed to keep you cooler. Does it work?
I love it. It's actually worth it.
You're convincing me.
We had a podcast about data and now we're talking mattresses.
Here's the reason that I wanted to go back in time to this. Historically, data teams weren’t a thing. There were IT organizations that did data things because executives needed dashboards. But all of a sudden, all the tools were changing, and there were startups with existing data infrastructure. You got the opportunity to take people who understood strategy and give them control of the infrastructure. And it produced very interesting things.
I agree. The evolution of the Casper data team exemplified that. When we started, everybody in the Casper data team looked a lot like me, former management consultants. And actually we looked a lot like our stakeholders. We’d start to specialize, so we'd have people on the data team that’d report with the marketing team, and they’d go to marketing meetings and be embedded.
But we got to the point, I’d say two years in, where the scale of the problem, of the data, of the complexity of the infrastructure we were managing got to a point that we saw a bit of a divergence and evolution of the team where we started seeing more STEM degrees. More people with engineering or tech backgrounds focused on writing good, clean pull requests. I look back and those people that I'm describing, those were early analytics engineers.
We organically saw the split of the analyst and the analytics engineer.
I think your vision is to get them all speaking a common language. And that was SQL. That was a big aha moment six, seven years ago, when there was a clear debate, Python versus SQL.
And we all forget about that. And I'm not saying that Python lost, but SQL has very much become the language of communication between analysts and engineers.
What do you think has created the opportunity for change over the past decade? Because things in data have changed a lot.
Compute and storage got cheaper. If you strip away all the things, that's the thing. The cloud drove compute and storage to be cheap enough that you don't just save some data, you save it all. Let's not look at five days of data. Let's look at all the data. I think Redshift was very much the big unlock. It's not like, because I remember early days I was chatting with, you know, I think
Fivetran made it easier to bring data in. dbt made it easier to transform and scale and use GitHub and version control. So it's just all these technologies that allow people to do something easier and easier and cheaper and cheaper.
What are the companies that are going to get created now, like the professional services companies? What are they going to be built on?
The modern data stack wave was gigantic, and I feel very privileged to have been in exactly the right place at the right time when it happened. Kind of like the dbt community, I don't know if I will ever see something like that in my lifetime again.
You see people talking about AI, of course.
Are there boutique AI consulting shops?
I haven't looked around, but I'm sure, there has to be, right? I think the only difference maybe with AI, though, is the modern data stack flew under the radar for many years. Even today, we win against the big global consultants because we know the tools better. I've been using dbt since 2016. Yeah. You know, it wasn't enterprise ready back then. It wasn't on the radar. You look at Accenture making one to two billion dollars a year on AI work, that didn't exist.
The whole AI industry, we all agree that there's a there there. But I also think we all universally agree that 90 percent of the startups we see will fail. And these VCs are paying really high valuations for investments. I think a lot of people are going to get washed. But they'll also find that one next thing, too. It's going to be beautiful, creative destruction in the AI space.
I'm more bullish on the underlying technology than I am on the equity returns currently.
One hundred percent. I'm reading a book right now on what moves markets. Not many people made money on canals and railroads. They were hugely valuable for America. But a lot of people got wiped out. It’s all about timing.
Do you have any thoughts on Iceberg? Are you finding customers engaged in this?
As a concept, yes. I don't think any of the use cases are perfectly ready for primetime. I think everyone's excited about the direction. It's really exciting to see Snowflake support Iceberg, both as a native format and external tables. And you play that forward, theoretically, you know, you're abstracting storage. May the best compute platform win.
Everybody I talk to is excited about this. But I think they're all in a holding pattern. They're experimenting with it. They're using it for ad hoc use cases.
I don't know of anybody with their entire data lake on Iceberg and they use eight different compute platforms.
But I'm excited to experiment. I'm excited to flip on Iceberg as a materialization method in dbt and play around with it. The team is already starting to test into it. I think there might be a moment where we go to our clients and say it's hit the tipping point and we should convert everything to Iceberg and start thinking about it from an Iceberg-first mindset when doing architecture.
Fivetran has gone very far on Iceberg. They're making it very easy to take data and deliver it directly to Iceberg and then you can use it from there however you want.
That aligns, because when we bring Fivetran into enterprise clients a lot of them do want some sort of landing zone in some sort of cloud storage before bringing it into their data warehouse. And I think that speaks to both an Iceberg trend and just a need in the enterprise to have a separate landing zone.
One of the other interesting things that I've heard this week is about dbt Semantic Layer. The thing that I'm hearing over and over again is people saying they are finally ready to make the investment to do this.
Have you seen conversations with clients where there's an interest in a semantic layer that spans BI tools, or is this another conversation like Iceberg, where in theory it sounds great, but we're just not there yet?
One hundred percent there's interest. People are really excited about doing this. There are a few things that have happened. It was only GA last summer, right? There are the network effects of all the integrations. One of our clients works at a large enterprise, and when you announced the Power BI integration, people were really excited about that. That's game-changing.
It does definitely feel like a moment. It’s much more tangible and less theoretical than Iceberg at the moment.
I'll take that as a compliment. Okay, so let's close on the community. You have seen a real journey from a couple folks in a room at Casper. Here at Coalesce there are people who this is the first time they've been a part of a community that's outside of the people that they work with at a large enterprise.
What do you think this group of people needs next? Whether it's from their software or maybe more importantly, from each other. What is the dbt community?
First I’ll step back and just say how grateful I am just to be an early member. You say the dbt community, I just say my friends. I go to Coalesce for a few things. I go to see the product announcements. I go to see which vendors are cool. I go to see people. And I think for a lot of people that's why, because it's the community.
I mean we love the Slack version, but nothing beats in person. It's just been such a wonderful experience. And what I've loved is how inclusive it is. "Come as you are" used to mean one thing, now it also means wear a button-down and a suit to Coalesce. Or come in a t-shirt. Be an executive at a Fortune 500 or be an analyst at a startup.
The dbt community is very welcoming. dbt is a collection of people that have shared interests, shared challenges, and now have a common language to express those challenges, a common community to learn and interact. The community is only going to get bigger.
The personas are only going to get more varied. I really love the One dbt aspect because it's community but it's also the tool. There are people that are never going to learn SQL. There are people that might be able to but need an onramp that's a little less intimidating than the command line. And that's where the Cloud IDE came in. And so now the drag-and-drop is going to be really interesting, and it's a good opportunity to make the tool more welcoming.
It's going to be very interesting to see what the talks look like next year. Because there's going to be talks for people that literally know dbt only as a drag-and-drop interface. It's a different community. And so we just need to keep being welcoming and open to all the different personas.
This newsletter is sponsored by dbt Labs. Discover why more than 30,000 companies use dbt to accelerate their data development.