Ep 61: From Moneyball to Gen AI
Eric Avidon, a journalist at TechTarget, on hoping it's not BS
Eric Avidon is a journalist at TechTarget who's interviewed Tristan a few times, and now Tristan gets to flip the script and interview Eric. Eric is a veteran journalist who has covered everything from finance to the Boston Red Sox, but these days he spends a lot of time with vendors in the data space and has a broad view of what's going on. Eric and Tristan discuss AI and analytics and how mature these features really are today, data quality and its importance, the AI strategies of Snowflake and Databricks, and a lot more. Plus, partway through you can hear Tristan reacting to a mild earthquake that hit the East Coast.
Key takeaways from this episode.
Compare and contrast writing about analytics and data management versus the Boston Red Sox. I have to imagine that your fans now are far more passionate than those Red Sox fans.
Eric Avidon: I first heard of analytics through sports, really around 2000 or 2001. You started hearing about these much more advanced statistics around then.
The Billy Beane era.
That's it. They started valuing different things. He created an advantage for his team and was able to develop a team on a much smaller budget than teams like the Yankees.
The Yankees were spending four times as much as the A's on payroll, but the A's were able to compete with them because they were ahead of the curve in discovering that certain things should be valued above others.
You've been at TechTarget for five years, covering business analytics and now data management as well. Is that right? Yeah. How would you define the boundaries around those areas?
Business analytics is relatively straightforward. It's really statistical analysis at its core, and using statistical analysis to drive decisions.
Data management is under the hood. It's the deep technology, complex stuff; every time I write a story, I'm going back and looking up things that I wrote about two weeks earlier to describe some complex technology.
When I first started covering analytics, one of the analysts I frequently spoke with was at Forrester, and he kept saying that analytics was a mature technology, even back in 2020.
And now I understand what he means as I'm now covering data management. In analytics, there's really not a whole lot new that's happening other than generative AI, of course, but it's more about enhancing what's already there. Whereas data management is always new stuff. And there's all these new vendors that are popping up with specialties and big vendors are buying up those startups to add those capabilities into what they can offer.
That's the difference I'm seeing between the two: one is really moving fast, and the other is enhancing what's already there.
Everything that leads up to the moment when you're actually visualizing the data and trying to make a decision based on it, I would call data management, because it's preparing the data for that moment.
Generative AI, what's your read on how real this is today? People that log into their Tableau dashboard or their Power BI, whatever, are they mostly using Gen AI capabilities today, or are we not quite there yet?
What I hear for the most part is that it's largely still theoretical. There are some tools that are starting to trickle out now. I would say over the last three months, a handful of vendors have made some tools generally available. They're more in the AI assistant realm where you can ask questions without having to write code.
Tableau has one Gen AI tool that's GA and one that's now in beta testing. MicroStrategy has a chat interface that's GA.
We released our State of Analytics Engineering, and a shocking percentage of data practitioners surveyed were either currently powering Gen AI capabilities with their pipelines or have active projects in the works.
Yeah, I'm hearing the same thing.
Among the analytics vendors, I think you're getting the chatbots, the AI assistants that they're rolling out, and you're not getting a ton of use yet. But among the data management vendors, you're getting a lot of work to facilitate the development of Gen AI applications and models, and customers are taking advantage of that.
Are you a Gen AI bull or a bear? Do you think that the way that data people work in five years is going to be completely transformed?
Five years, I don't know if I can project that far. I think vendors have to get a handle on accuracy before gen AI can really be everywhere.
When I talk to vendors, they’ll have introduced some capability a year ago that's still in preview. And that's obviously not a typical release cycle. A typical release cycle is you release something in preview, and then three months later it's GA.
I asked them what's taking so long, and it's about accuracy. It's about making sure that when someone asks a question of their data, they're not getting incorrect answers.
You can almost leave Gen AI aside when you talk about data quality. There are some problems that the data ecosystem has made progress on over many decades, and some problems where we haven't made as much progress. The ability to store and compute over large volumes of data? We got way better at that. The ability to scalably build data transformation pipelines on top of that data? We've gotten way better at that too.
When we survey folks, we find that quality is their number one issue that is a problem today. I'm curious if you see anything in your coverage of the space that makes you optimistic that this might be one of those problems that we solve as an industry.
The more you emphasize something, the more it's going to be addressed. People weren't talking about data quality previously; data just got passed up the line and used. I didn't hear a whole lot about data quality when I first started with TechTarget. I've definitely heard more now, but I think that's tied to AI, as AI has exploded over the last year or two. There's been a heightened emphasis on data quality because if you're suddenly relying on an automated system, what's going into that automated system had damn well better be good, or else what comes out of it is going to mess us up.
What do you hope to be true of the data ecosystem in five years?
I hope that all the stuff I've been writing about isn't BS. That I haven't wasted my time covering all this stuff. And that readers haven't wasted their time reading about it. And that vendors haven't wasted people's time promising it. And that it actually comes to fruition.
One of the statistics you hear is that in organizations, 25% of people use analytics as part of their job, and that has been stuck there for two decades or so, maybe even more.
The hope is that the natural language processing capabilities promised by Gen AI can really break through that. Maybe it's not going to be 100%, but maybe 75% of people can use analytics without needing extensive data literacy training, because they can simply type something into a Google-like interface and get responses that help them.
Will we begin, over the next year or two, to see that 25% creep up to 40%? Hopefully someone, whether it's Gartner or someone else, will do a study in early 2025 to see where that number is.
This newsletter is sponsored by dbt Labs. Discover why more than 30,000 companies use dbt to accelerate their data development.