Ep 58: How the media covers Gen AI
Matthew Lynley of Supervised joins to discuss AI for the analytics world
Matthew Lynley is a bit of a hybrid. He's been a long-time journalist covering enterprise tech, currently writing the fantastic AI and data newsletter Supervised, and he's also been a hands-on data practitioner. Matthew has covered the analytics tech stack before, but this time Tristan turns the tables to get Matthew's perspective on the world of AI.
Matthew and Tristan discuss the rise of Gen AI as a topic in the popular press, what's going on in the space today, and where AI is headed.
We are currently accepting proposals for qualified speakers to join us at Coalesce 2024 in Las Vegas. There is SO MUCH changing in data today, and Coalesce is the single biggest stage for you to highlight the amazing ways that you and your data team are innovating in this fast-changing landscape.
Submit your ideas before the 4/3 deadline!
Listen & subscribe from:
Key takeaways from this episode.
Why did ChatGPT generate so much coverage? Is it because it’s just a good story and it generates eyeballs? Is there cynicism there?
Matthew Lynley: I think what was different about ChatGPT, almost in the same way as Instagram and Foursquare and some of the early iPhone 4 apps, was that you got it right away. The second it was in your hands, you could understand what it was. And it was working in front of you and it was doing really crazy stuff and really weird stuff, right? It was also getting a lot of stuff wrong.
It wasn't like you needed to explain to your CTO what weights and biases were. It was really freaking obvious.
Right. I think also part of it was just that some of the answers were so ridiculous and comical that they went viral on Twitter. And we had to throw a name on them; we called them hallucinations, which I still think is a bad name, personally. So there was an era of comedy around what was happening with it to a certain extent too. But it was really cool.
I think when it first came out, there had been such a pent-up craving for something really new, something that actually had a chance to change the way we interacted with technology.
What did we get out of [earlier tech]? We got Uber and DoorDash; they're super useful, but are they life-altering, generational technologies? Probably not really. And so there was, in general, pessimism in tech.
Especially as the Cambridge Analytica stuff was happening with Facebook, on the consumer side and on the enterprise side too, we were pessimistic about a lot of product developments. Even though during this whole time, storage costs were dropping and things were great.
And so there was a craving for something new and really cool and really interesting. And with ChatGPT, we had never seen a technology like it before. That really hasn't happened since the iPhone, right? 13 years or 12 years or something like that. One of the things it did was wake people up out of the pessimistic "I'm addicted to Instagram and I hate this; all I'm doing is flipping through TikTok" mindset.
On the consumer side, there was just something to get excited about for the first time in a really, really long time. And obviously, when that happens, the first thing that happens is everyone is like, “OK, how do we integrate this into our business?”
The point is there was just so much potential. Everyone saw this insane amount of potential and everyone saw where it could go, and there were infinite permutations. Everyone was hacking away at it, hacking on it, jail-breaking it, making it do crazy stuff. But it was just finally something new.
Can you give me some specific opinions on some different AI topics? Do you think that there is one Harvard Business School case that gets written about the last 18 months of AI? What is the case and what makes it interesting?
I'm not gonna name any one specific company, because the story hasn't really played out all the way just yet.
There’s a real question around a lot of these startups and apps that basically just use GPT-4 for everything and just call OpenAI APIs for everything, in the same way that, back to the iPhone 4 era, some apps were just wrappers on top of Foursquare. And there are a number of companies that were launched and funded and grew, and then maybe dropped off and hit that trough. And some of them were just UX on top of OpenAI, or Claude or Gemini or something on a Hugging Face Space. It could be anything.
And there are some companies that have done a pretty good job of that. It can be technical. It can be they've just gotten enough GPUs, because those things are fricking expensive. Or they have a really powerful community. Hugging Face is a good example of where they just have this very big fan base; it's GitHub for AI. That’s locked up and taken. Don't even bother going after it.
But the story hasn't played out yet. The potential challenge with foundation models is that if everything is trained on the same data, you can't really differentiate from one another. There may be some technical wizardry under the hood or optimizations that can't be transferred back and forth, but it's not clear how you create real staying power.
Data is probably the closest thing to a moat, insofar as maybe I can collect data on how people are using my products, understand it in better detail than Company A or Company C to my left and right, and refine my products more quickly.
Bringing it back around to the world of analytics, you have written about metadata platforms being very relevant in the age of AI. Can you say more about that? What’s your belief about how metadata from structured data analytics can be incorporated into AI or drive value?
For a long time, the notion of being able to use AI was, “I need to have a GPU in my VPC, and I need to host it myself. I can't trust sending OpenAI my data and open source is really hard to use.”
What's changed in the last four months is that the same principles that made Snowflake possible, like having everything accessible through an abstraction layer such as Databricks or Snowflake, naturally extend to the usage of AI. That basically means really aggressive lineage, maybe even a higher bar than typical analytics, because I need to know what's going into the model and what's coming out, how it's trained, how it's customized, all that kind of stuff. And IAM (identity and access management) is a big thing too: permissioning and access management and all that kind of stuff.
And all this governance tooling, all these governance requirements (observability, evaluation on the other side), are increasingly what it takes to get these things into production. Snowflake and Databricks justifiably recognized that, and both just announced that Mistral models are going to be on their platforms, because all those governance tools are already baked in and built in.
That means the same tooling that went into us moving to a more ELT abstraction mindset for data capture and access and analysis can be applied to AI, and to putting a chatbot into production that won't give a plane ticket away to a customer for free.
And all the companies that were powering this have a second shot at fitting into new, emerging workflows. Anyone who's working in lineage, anyone who's in catalogs, I think is in a really good position to have a second go-around here on governance and lineage and observability layers. And for all of this, again, you still need high-quality data in the first place to do anything.
Say I want to run a customer chatbot. It should probably know about my products, and it should probably know the history of my company. It needs to get stuff from me specifically to achieve the specific goals I have for it. And so you still need the same data quality tooling, the same stuff you used for Snowflake and Databricks.
Everything old is new again. It comes back around for a second time. It's just that this time it's for some non-deterministic use cases that we really hope to God are right.
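To make the "same tooling, new use case" point concrete, here is a minimal, hypothetical sketch of how a customer chatbot could be grounded in the same curated, governed tables a team already maintains for analytics, with a crude guardrail so the non-deterministic part can't promise things the underlying data doesn't support. The table names, policies, and the `call_model` placeholder are illustrative assumptions, not anything Matthew or the platforms mentioned above actually ship.

```python
# Hypothetical sketch: ground a support chatbot in governed, documented tables
# rather than letting the model answer from its training data alone.
from dataclasses import dataclass

@dataclass
class Policy:
    name: str
    text: str          # canonical policy text, owned by the data/docs team
    source_table: str  # lineage: where this fact lives in the warehouse

# In practice these would be loaded from curated, tested tables
# (the same ones analytics already depends on), not hard-coded.
POLICIES = [
    Policy("refunds", "Refunds are available within 30 days of purchase.",
           "analytics.dim_refund_policy"),
    Policy("changes", "Flight changes incur a fee unless the fare class allows free changes.",
           "analytics.dim_fare_rules"),
]

def retrieve(question: str) -> list[Policy]:
    """Naive keyword retrieval; a real system would use search or embeddings."""
    q = question.lower()
    return [p for p in POLICIES if p.name in q] or POLICIES

def call_model(prompt: str) -> str:
    """Placeholder for whatever hosted or in-VPC model the team has governed access to."""
    raise NotImplementedError("swap in your model call here")

def answer(question: str) -> str:
    context = retrieve(question)
    prompt = (
        "Answer using ONLY the policies below. If the answer is not covered, say so.\n\n"
        + "\n".join(f"- {p.text} (source: {p.source_table})" for p in context)
        + f"\n\nCustomer question: {question}"
    )
    draft = call_model(prompt)
    # Crude guardrail: block drafts that promise something absent from the grounded
    # context, e.g. the free-plane-ticket scenario mentioned above.
    if "free ticket" in draft.lower() and not any(
        "free ticket" in p.text.lower() for p in context
    ):
        return "Let me connect you with an agent who can confirm that for you."
    return draft
```

The point of the sketch is simply that the retrieval step and the guardrail are only as good as the documented, tested tables behind them, which is exactly the data quality and lineage work analytics teams already do.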
This newsletter is sponsored by dbt Labs. Discover why more than 30,000 companies use dbt to accelerate their data development.