Ep 60: Being pro-human in the AI era
Barry McCardel, cofounder and CEO of Hex, on AI's impact on data practitioners
Barry McCardel is the co-founder and CEO of Hex. Hex is an analytics tool that's structured around a notebook experience, but as you'll hear in the episode, goes well beyond the traditional notebook.
We're big fans of Hex at dbt Labs, and use it for a bunch of our internal data work. In this episode, Barry and Tristan discuss notebooks and data analysis, before zooming out to discuss the hype cycle of data science, how AI is different, the experience of building AI products, and how AI will impact data practitioners.
dbt Labs recently published the 2024 State of Analytics Engineering—the annual survey of data practitioners and leaders. This year’s report delves into:
The top priorities and challenges of data teams
How data team objectives align to those of the broader organization
How data teams are investing for the future—including how they think about AI
Key takeaways from this episode.
Talk to me about market structure in the space that you're in. If you were going to construct a bear case on why not to invest in any tool at the BI layer, one argument is that Power BI is coming to eat everyone's lunch. The other is that there's just a tremendous amount of dispersion in the layer, and that there's value in coming together.
Do you think that there is consolidation to be had in this layer of the stack?
Barry McCardel: Totally, yeah. We see this at every layer of the data stack, and not just the data stack; any area of enterprise technology and software right now is seeing consolidatory pushes. You can look in HR tech, you can look in finance tech, you can look in dev tools. All of it is seeing this consolidation push. Whether there's a standalone market for vector databases is a really interesting question right now.
And so you're seeing this all over the place. Some of it is just natural cycles. Some of this is sort of the end of the zero-interest rate environment and just tighter budgets. We’re personally, in some ways, poised to be beneficiaries of that. But this is one of the reasons from very early on that we never thought of ourselves or aspired to just be like a really great notebook. In fact, I never felt like there was a standalone market for notebooks.
I remember in my seed pitch to investors saying there's not a notebook market. And if you don't invest in us, that's fine. But any company that has “notes” in their name or is saying that there's gonna be a standalone notebook thing, it's just not gonna be a big enough market for them even if they build a great product. And I know that sounds ironic given that we hear from users all the time that we've built the best notebook. But for us, that was always an entry point into a much bigger market.
Now, I think there's a real question of whether Power BI or Tableau, the dominant last-generation BI players, will come and roll everything up. I don't know, and it's certainly very hard to shadow-box with Microsoft and the juggernaut they've built.
It's funny because we can talk about things that are living memories for us, like Slack and Teams, but there are generations of this. I was talking about this with my uncle, who was a very successful tech exec in the 2000s, and he was talking about Microsoft snuffing out markets that I'd never heard of. They've just been very good at this for a very long time.
It is entirely possible that Hex or dbt or any of us do a really, really great job and Microsoft winds up just steamrolling our market. I don't know how to ward against that other than just trying to do a great job. And it's a big-ass market, so that's helpful.
Ultimately though, I think there's going to be a lot of consolidation. We see it and we see a ton of customers who are coming to us and saying they’ve got too many tools, and want to consolidate. They're rolling up workflows within Hex because it's so flexible and because it can serve a lot of different purposes. We're going to continue to focus on how we can broaden the value for our customers and do more for them, because I think that trend will continue. And if we can keep doing a great job solving more problems for our customers, we'll help them.
When you started, it was fairly common for Hex to be deployed inside of organizations that also had a more classic BI tool. Is that still common or is Hex often now the system of analytical record by itself inside of an organization?
Yeah, it's a really good question. We've seen this start to change quite a bit. We made a bunch of improvements over the years, and especially last year, on the ability to do really nice reporting: stuff that's going to look nice and feel nice, little things like having tabs, bigger things like investing in new caching systems to make sure everything feels really great.
We didn't do that as a Tableau killer—or whatever, you pick your dashboarding-centric-tool-of-choice killer. Because most of our customers do still have something like that floating around; those are pretty well entrenched. And our general position is that if those are doing a great job generating your KPI dashboards, let's leave them there. That's fine.
When we show people Hex, it's pretty obvious that we don't do things slightly better than Tableau; we just do a whole set of things that those types of tools weren't even built for. You can do things in Hex that are just not possible at all. It's very obvious and visceral to people why we can serve a lot of different purposes.
We see more and more customers starting to consolidate. Typically, they'll keep that legacy dashboarding solution around for a narrower set of things, but they wind up moving a lot of new projects into Hex.
I think it's a really interesting question that we think about a lot internally. Does this de facto make us BI? Are we secretly a BI company? I really hate that term in a lot of ways. It comes with a lot of baggage, and I think it connotes this dashboard-centric world that’s useful for certain things and a complete mess for others.
As much as we want to help our customers do a better job on a lot of those workflows, and help reduce their overall spend by consolidating, we just have a much bigger vision and a much bigger ambition for what we can build. We don't think of ourselves, and we don't pitch ourselves, as "Hey, replace Tableau one for one." We want to come in and help people do net new things, or help consolidate a lot of broken workflows that just aren't possible today at all.
One more Hex question before we get into industry stuff. What is coming that we should know about that you're ready to talk about today?
We've been a little quieter than normal the last few months because we've been heads down on some big, big foundational pieces at every layer of the stack. Some of it we're not ready to talk about yet. I don't know when this podcast will air, but a couple of things we've launched recently that we're pretty excited about are a next iteration of our built-in AI tools.
Generally, how AI is going to fit in with data is a very, very interesting topic right now. But we spent an enormous amount of time the last year iterating on both the UX and the underlying generation and retrieval engines for that.
And then another thing I'm really excited about, which I don't think is as sexy to a lot of people but might be to this audience, is the stuff around collaboration and governance. Like I mentioned, the reviews feature. There's stuff coming around better governance and endorsement. We've invested a lot in search, updates, and changes for admins. All of this gets really, really important as you have a lot of people building with data in your organization. We have customers that have built thousands and thousands and thousands of projects in Hex. So helping them govern that well, discover the right things, and set people up to work with the right data is really, really important.
How similar is AI today and machine learning in 2015? Are we going to look back on this in 10 years and say, remember when all we wanted to do was hit this nail with this hammer? Or is there something different about this?
I think it's very fundamentally different. You can look at them and say that the hype cycle feels similar and we can talk about ways in which that's true, but having lived through and been pretty deep in the trenches on both of these hype cycles, they feel very, very different. And the reason is the ML wave in the mid 2010s required an enormous amount of work to get things to work.
Most ML projects in that era wound up as shelved R&D. And the question I would always ask when I interviewed candidates who had experience with those types of projects would be, “Is it in prod? Is it serving predictions right now?” And the vast majority of projects that people would do, especially in these big enterprises, would never make it there. Contrast that with where we are today with large language models. Anyone can go to ChatGPT right now and experience it working live.
There are a lot of things that aren't working, and in many ways we'll look back at the current vintage of models as remarkably primitive. The way we interact with them is remarkably primitive. You have to yell at them in very specific ways. It's like you're lecturing a rambunctious seven-year-old sometimes when you're doing prompt engineering. “You will not reply with Markdown when I ask you this question.”
It's ridiculous, and I think we'll look back at this and be like, you know, well, that was weird. But the fundamental technology works well, and there's a bunch of applications where it clearly does a great job. One of the first that people got broad exposure to, even before ChatGPT, was GitHub Copilot. If you're doing a task that is fundamentally about writing code, these things can write code pretty well.
It's not perfect code, and within analytics there's a very specific set of challenges around querying the data in exactly the way you want. But as someone building a product where people spend a lot of time writing code, SQL, Python, other languages, it was very obvious to us, even before ChatGPT, that this was something we wanted to incorporate.
We see it used and loved by thousands of our users now, the sort of Magic features we've built in. And so in that way, it feels very real. We have some PhDs here, but we don't have a massive fleet of PhDs and a lot of infrastructure to deploy this stuff.
We're hitting these endpoints through APIs that exist, and we can just pay for it with a credit card. And in that way, it is incredibly materially different than 10 years ago. And I think that's what makes me so bullish, which is ironic, because when we started the company, you could almost characterize my co-founders and me as ML skeptics. Our Palantir experience and going through that hype cycle in the mid 2010s made us very skeptical.
I was ready for you to be more cynical than you are on AI, but you're making a strong case that this is different than the way it was before.
It is, and it's easy in this moment to point out specific quirks with GPT-4 or other models or the way we're deploying it, but I still think we're just so early in understanding this stuff, even the UX and how it should be incorporated into our workflows.
So many people are saying, look, we'll just build a chat bot for this and a chat bot for that. And maybe there's places where a chat is a really good UX, but man, it would be depressing if all of our software just regressed into being chat.
Within Hex, we're thinking about this every day of like, how does this fold into your workflow in a way that's helpful and useful? And in some ways, what we talk about internally is that it’s okay to live in the future a little bit. There are things today that the models don't get right, or that they feel really slow. And we're making a choice with some of what we're doing to embrace that and say, that's okay.
We're going to build for the future, because since we started building with this, inference costs have gone down orders of magnitude, speed has gone up an order of magnitude, accuracy has gone up many times over, and that's in a year and a half. So if you extrapolate forward, you're just going to keep seeing these gains. I think it's okay to be skeptical of certain things today, but I think you have to be very specific on what exactly you're skeptical of and whether that's short-term or something more structural.
So here's the thing that makes me really bullish about specifically Hex and dbt being able to do really effective AI code gen. The place where AI does not do a great job today is large, unstructured problems. My sense is that AI can pretty reliably write a program that has well-defined input, well-defined output, and is 100 lines or less. But the more code it has to write, and the harder it is to define what good looks like at the beginning and the end, the more it drifts.
But it turns out that a notebook cell or a dbt model or a dbt test spec, whatever, these are very well-understood. You can clearly define success, and they're modular enough that you're not asking AI to generate so much code. Oftentimes the hardest problem of getting a good response is structuring the problem in a way where you can expect a good response. And I think both of our products do that.
Yes, totally. One of the things we really discovered and appreciated early on was the naturally modular, cellular, notebook-inspired UI we wound up with.
It lends itself very, very well to these types of things. Under the hood, one of the many things we did from a product perspective that goes beyond what other notebook products have done in the past is that underneath every Hex project is an execution DAG.
We do static code analysis on all the code in your project. We execute the code based on the dependency graph. We did it because it fixes a lot of problems with state management and a lot of the typical reproducibility issues. Hex solves a lot of that and it's more performant and all that.
We built that well before Gen AI, but it pays this massive dividend, because when you're asking something of our AI features, you're asking at the level of a cell, and we can pass in the upstream context and understand the upstream dependencies of the code. And we have ways to give more sophisticated context and structure to what you're trying to do.
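To make that concrete, here's a minimal sketch of the idea: statically analyze each cell to see what it defines and references, build a dependency graph between cells, and hand only the upstream code to the model as context. The cell contents, helper names, and graph-walking logic are invented for illustration and aren't Hex's actual implementation.

```python
# Hypothetical sketch: derive cell dependencies via static analysis, then
# collect the transitive upstream cells as prompt context. Not Hex's code.
import ast

cells = {
    "load": "import pandas as pd\norders = pd.read_csv('orders.csv')",
    "clean": "clean_orders = orders.dropna(subset=['amount'])",
    "plot": "clean_orders.groupby('region')['amount'].sum().plot.bar()",
}

def defined_and_used(code: str):
    """Return (names assigned, names referenced) in a cell's source."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            (defined if isinstance(node.ctx, ast.Store) else used).add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            defined.update(alias.asname or alias.name.split(".")[0]
                           for alias in node.names)
    return defined, used

info = {cid: defined_and_used(src) for cid, src in cells.items()}

# A cell depends on any other cell that defines a name it references.
deps = {
    cid: {other for other, (d, _) in info.items()
          if other != cid and d & info[cid][1]}
    for cid in cells
}

def upstream_context(cell_id: str) -> str:
    """Concatenate the source of every transitive upstream cell."""
    seen, stack, chunks = set(), list(deps[cell_id]), []
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(deps[cid])
        chunks.append(f"# cell: {cid}\n{cells[cid]}")
    return "\n\n".join(reversed(chunks))

print(upstream_context("plot"))  # the 'load' and 'clean' cells, not 'plot'
```

The payoff is that a prompt only carries the code the model actually needs, rather than the entire project.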
There are other dividends on the UX side for us. The way we're able to incorporate it into the UI in many ways brings the best of chat: you can sit in a prompt bar at the bottom of the project, just type something you want, and watch it generate cells in a much more functional UI.
Among the many reasons I don't think data work just evolves into being chatbots is that you don't want it to; that's not a highly functional way to interact with the work you're doing. You want to be able to point and click. You want to be able to type. You want to be able to change things. You don't want to have to change the chart colors via prompt.
And that's how we describe it, the multimodal front end for this. You can jump in and out of editing or generating with prompts at any point. The notebook format UI makes a lot of sense for that. And also, as you said, it allows us to structure the context for the models in a really clean way. We've done a ton of work on this, but we're still early in probing that frontier.
We have a very clear vision for how we want this to fit into the workflows for folks. This is built to augment, not replace human insight. I am fundamentally bullish on data practitioners in this new era. I'm fundamentally pro-human. That is very different than a lot of the rhetoric and hype you hear. But we think that's the right approach from a product perspective. And we're doubling down on that.
There's a lot of focus around AI in the consumption layer. But I'm really interested in the data pipeline world. There is “help make me a data pipeline” AI. Cool.
The other thing that happens is that you're constantly debugging problems with existing data pipelines. I am really interested in “ship a ton of metadata and figure out the underlying causality that caused a failure.”
I think this is 100% tractable. Right now, I don't think assessing causality is the place where LLMs are strongest, but you can get them to do it in certain cases. Do you agree with this?
Yes, I'm so excited for this. We can talk about assessing causality. I think how these things work is very interesting. We have a feature called Autofix: if you have an error in a cell you're running, we'll suggest that we can fix it automatically for you with AI.
Can you say anything about how you're getting the model to do that? Is it just a really clever prompt? What are you doing?
Yeah, it's prompt engineering. At a high level, we have a lot of prompt templates we use; in this case, we have a prompt template for Autofix. We'll take the code from the cell that executed and errored. We'll take the error message itself. And then we'll also take upstream context, as I mentioned, parsing upstream in the DAG to pass in the context of what's happening upstream in the project.
And you pass that through to the model, and with some high percentage, not 100%, it's important to know, but with some high enough percentage of time, it's going to take that information and go and write you a new version that will run.
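Barry is describing the recipe at a high level, so here's a hypothetical sketch of how a prompt template like that might be filled in and sent to a model. The template wording, function names, and use of the OpenAI client are illustrative assumptions, not Hex's actual code.

```python
# Hypothetical Autofix-style sketch: combine the errored cell's code, the
# error message, and upstream context into one prompt, then ask a model for
# a corrected cell. Illustrative only; not Hex's implementation.
from openai import OpenAI

AUTOFIX_TEMPLATE = """You are fixing a single cell in a data notebook.

Upstream cells (already executed successfully):
{upstream_context}

Cell that failed:
{cell_code}

Error message:
{error_message}

Rewrite the failed cell so it runs. Reply with only the corrected code."""

def autofix(cell_code: str, error_message: str, upstream_context: str,
            model: str = "gpt-4o") -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    prompt = AUTOFIX_TEMPLATE.format(
        upstream_context=upstream_context,
        cell_code=cell_code,
        error_message=error_message,
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Example: a KeyError caused by a column that was renamed upstream.
print(autofix(
    cell_code="revenue = clean_orders['amt'].sum()",
    error_message="KeyError: 'amt'",
    upstream_context="clean_orders = orders.rename(columns={'amt': 'amount'})",
))
```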
Why does that work? How does it reason about that? Can it infer the causality between a bug in your code and the actual error message? How these things work under the hood is a little bit spooky.
It almost doesn't really matter.
Yeah, it also doesn't matter. We know through our experimentation that that prompt template generates a response that more often than not is fixing it for people.
In some ways, working with AI is so strange because it's so stochastic. Software engineering is a very formalistic way of working, so it is a little weird that sometimes we don't know why it works, but it works. In most software engineering, you would not tolerate such an answer for why a feature works.
But we do that, and I think you can imagine in the case you laid out, Tristan, of building data pipelines or maybe repairing data pipelines, if it's happening overnight, or at the very least if it's happening where I'm not having to stare at it as a user, you can also just try multiple times. You could kick off 10 different prompts and try to fix it and see which ones get the tests to pass.
Yeah, you may run up inference costs or database query costs, but at some level, you can. We've even seen this in some things we've experimented with. You can almost brute-force things sometimes and embrace the stochasticity of it. And again, it feels a little weird. It feels like you're throwing a bunch of stuff against the wall and seeing what sticks, which is usually not how you approach an engineering task, but it is very fun to have this different way of working that challenges your assumptions.
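As a rough sketch of that brute-force pattern: sample several candidate fixes, run each against the tests, and keep the first one that passes. Both helpers below are hypothetical stand-ins, one for an LLM call sampled at a nonzero temperature and one for the pipeline's own test suite.

```python
# Hypothetical sketch of brute-forcing a fix: generate N candidates, test
# each, and keep the first that passes. The helpers are stand-ins, not real APIs.

def generate_fix(broken_code: str, attempt: int) -> str:
    """Stand-in for one sampled LLM completion; guesses vary per attempt."""
    guesses = ["amt", "total_amt", "amount"]
    return broken_code.replace("amt", guesses[attempt % len(guesses)])

def run_tests(candidate_code: str) -> bool:
    """Stand-in for running the pipeline's tests against the candidate."""
    return "['amount']" in candidate_code  # the column the tests expect

def brute_force_fix(broken_code: str, attempts: int = 10):
    for attempt in range(attempts):
        candidate = generate_fix(broken_code, attempt)
        if run_tests(candidate):
            return candidate  # first passing candidate wins
    return None  # nothing passed; escalate to a human for review

print(brute_force_fix("revenue = clean_orders['amt'].sum()"))
# -> revenue = clean_orders['amount'].sum()
```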
I think it's going to be completely tractable to say, “figure out what the problem is and submit a PR to fix the problem. I don't want to actually look into data pipeline problems. I want to review PRs that solve them.”
100%. There's a massive percent of these pipeline problems that could be fixed effectively on autopilot and have someone review it, have someone merge it in. Now, the really interesting question to me, Tristan, is that you can draw a very clean line between you and I on this conversation right now, getting really excited about that, and people hearing that and saying, “My job is fixing data pipelines. What do you mean the robot's going to do it for me? What's going to happen to my job?”
It's this implicit assumption that these LLMs can write code and so we'll replace humans with the LLMs that can write code. And I fundamentally don't believe that for a bunch of reasons. I don't think anyone wants to be manually going and fixing data pipelines because of upstream format changes. That's the worst part of anyone's workflow.
One of our company values is we believe in moving up the stack. There's always more and more valuable work to be done. And no one pays anyone's salary to fix data pipelines. They pay people salaries to solve business problems.
Yes, 100%. Well, that's what they want them to be doing. And when they find out that they're actually stuck fixing these data pipelines, everyone gets frustrated, because it's not what we want these people to be doing.
I think anyone who's thinking that these things will reduce the net count of people in data analyst or data scientist or analytics engineering roles is missing the point. That presupposes that there's some finite demand for insight.
What company are you aware of that is like, yes, all of our questions are being answered with this many people? No one. There's an infinite demand for insight and an infinite demand for data work. And if you make it cheaper and faster to be able to do that, you're going to get more of it, not less. And I'm super, super bullish on that.
We are taking a bet on data teams and data practitioners, and we think there's gonna be more of these people and not less. I fundamentally am bullish on this new golden era of data teams, and that's what we're going to keep pushing on.
Yeah, I'm with you.
This newsletter is sponsored by dbt Labs. Discover why more than 30,000 companies use dbt to accelerate their data development.