Let's Talk about ChatGPT
Some of my favorite technology strategists' takes on the implications of the newest LLM-meets-chatbot.
Episode 36 of the Analytics Engineering Podcast is out! In it, Julia and I talk to Sean Taylor and Vijaye Raji about deploying experimentation at scale: how to watch out for spillover effects in experiments, how to avoid bias, how to run an experiment review, and why experiment throughput is a better indicator of success than individual experiment results.
Not many folks have the kind of insight into digital experimentation at scale that these two do, and it was a real pleasure to chat with them.
Enjoy the issue, see you in 2023 :)
- Tristan
Anna briefly touched on ChatGPT last week; I’m going to take it a bit further and turn to some of my favorite, although less-commonly-linked-here, authors. It’s rare for a data topic to go so mainstream that it gets written about in essentially every tech-focused blog I read, but that’s what’s happened over the past two weeks.
—
Ben Thompson never (rarely?) disappoints. His recent post on ChatGPT compares its content to other untrusted content online, suggesting that in many ways this doesn’t change anything: content had already become free and what’s scarce is trust, because ChatGPT outputs are frequently incorrect (although convincing).
The solution will be to start with Internet assumptions, which means abundance, and choosing Locke and Montesquieu over Hobbes: instead of insisting on top-down control of information, embrace abundance, and entrust individuals to figure it out. In the case of AI, don’t ban it for students — or anyone else for that matter; leverage it to create an educational model that starts with the assumption that content is free and the real skill is editing it into something true or beautiful; only then will it be valuable and reliable.
There were already ways to get your programming problem solved online (Stack Overflow)…it’s just that previously you had to rely on humans + algorithms working together. Now ChatGPT wraps that human knowledge in one more layer of abstraction, and in so doing makes trust even harder to come by. The structural problem, however, hasn’t changed: the person responsible for the work (do we still call them the “author”?) still needs to validate and take ownership of the final result.
One of the points Ben and others bring up is that there is still a tremendous amount of creativity involved in writing good prompts. We don’t generally think in these terms for interactions with computers, but it is not at all foreign in other contexts: there is a reason the Frost/Nixon interviews are so highly regarded. This kind of creativity is exactly what you see in /r/ChatGPT; after spending too long browsing there, it’s clear that this community is collectively learning how to elicit creative and interesting responses.
Ben Evans wants to know what use cases will turn out to be well-served by this tech:
One of the ways I used to describe machine learning was that it gives you infinite interns. You don’t need an expert to listen to a customer service call and hear that the customer is angry and the agent is rude, just an intern, but you can’t get an intern to listen to a hundred million calls, and with machine learning you can. But the other side of this is that ML gives you not infinite interns but one intern with super-human speed and memory - one intern who can listen to a billion calls and say ‘you know, after 300m calls, I noticed a pattern you didn’t know about…’ That might be another way to look at a generative network - it’s a ten-year-old that’s read every book in the library and can repeat stuff back to you, but a little garbled and with no idea that Jonathan Swift wasn’t actually proposing, modestly, a new source of income for the poor.
What can they make, then? It depends what you can ask, and what you can explain to them and show to them, and how much explanation they need. This is really a much more general machine learning question - what are domains that are deep enough that machines can find or create things that people could never see, but narrow enough that we can tell a machine what we want?
Completely agree. The capabilities we’re seeing out of ChatGPT are not actually new (or rather, they are ~2 years old, given that they come from GPT-3)…what’s new of late is the attention generated by the accessibility of the chat interface. And attention is often exactly what’s required to get people focused on deployment, on use cases, on building products.
What feels obviously true to me is that we aren’t ready to think at scale about how to build such products. In the wake of the internet, it took quite a long time (I’d put it at ~15 years, from ‘95 to ‘10) until we started to have reasonably (although certainly not universally) consistent, mature UX / design practices. We still have a lot of figuring out to do when it comes to 100% NLP, open-ended human-to-computer conversational products.
Fred Wilson writes about the impact of endless AI content generation on identity:
So what do we do about this world we are living in where content can be created by machines and ascribed to us? I think we will need to sign everything to signify its validity. When I say sign, I am thinking cryptographically signed, like you sign a transaction in your web3 wallet.
I can’t tell how much of this is an “everyone” problem vs. an “important person” problem. Fred is a highly identifiable figure, probably in the top .01% of humanity in terms of being worth impersonating for nefarious purposes. As impersonation gets easier, will this be a thing that all of us need to deal with? Hard to tell…although I could imagine the spear-phishing potential is very high.
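If you’re curious what the mechanics look like, here’s a minimal sketch of content signing using Ed25519 keys via Python’s cryptography library. The message is obviously illustrative; a web3 wallet does the same dance with keys it holds on your behalf, recording signatures somewhere publicly checkable.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The author generates a keypair once; the public key is published openly.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

# Signing: the author attaches this signature alongside the post.
post = b"Some words I actually wrote."
signature = private_key.sign(post)

# Verifying: anyone holding the public key can check authenticity.
try:
    public_key.verify(signature, post)
    print("Valid: this content really came from the keyholder.")
except InvalidSignature:
    print("Invalid: content was altered, or came from someone else.")
```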
For a last post on ChatGPT…and more as a postscript…I’ll leave you with Venkatesh Rao’s The Dawn of Mediocre Computing. As with most of Rao’s work, it contributes new connections to my neural net but takes some effort to parse.
It’s tempting, as someone who may have been following the progress of large language models over the course of multiple years now, to be somewhat cynical about all of the attention ChatGPT is getting. I get it. But the attention and accessibility really are beneficial. This is the part where we need to get the imaginations of an entire industry cranking, and corral the zeitgeist towards the more positive, optimistic directions this innovation unlocks.
Let’s talk about something else, shall we?
Enough speculation about LLMs!
Joseph Monti writes a fantastic post on APIs and how to apply the lessons that software engineering has learned about building great APIs to data engineering. Could not agree more. There’s a lot more to mature systems architecture than simply data contracts, and the history briefly reviewed in this article contains so many lessons to be mined.
—
Facebook recently wrote about UPM, its framework for static analysis / rewriting of SQL. It is very interesting. This is the type of functionality that we have long been interested in delivering inside of dbt, but supporting it across arbitrarily many data platforms is a big lift and we haven’t yet been able to make that investment.
As the arc of the product continues to bend, however, we’ve started to see things that were always in the “someday” pile start to get put in the “today” pile. At some point I have no doubt that will happen here as well.
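For a feel of what static analysis and rewriting of SQL looks like in practice, here’s a minimal sketch using the open-source sqlglot parser (to be clear: this isn’t UPM’s implementation or anything dbt ships today, and the query, table names, and dialects are illustrative):

```python
import sqlglot
from sqlglot import exp

sql = "SELECT user_id, SUM(amount) AS total FROM payments GROUP BY user_id"

# Parse the SQL string into an AST we can inspect programmatically.
tree = sqlglot.parse_one(sql, read="postgres")

# Static analysis: walk the tree and collect every table referenced.
tables = [t.name for t in tree.find_all(exp.Table)]
print(tables)  # ['payments']

# Rewriting: point every reference to `payments` at a new table.
def rename_payments(node):
    if isinstance(node, exp.Table) and node.name == "payments":
        node.set("this", exp.to_identifier("payments_v2"))
    return node

rewritten = tree.transform(rename_payments)

# Emit the rewritten query in a different SQL dialect entirely.
print(rewritten.sql(dialect="bigquery"))
```

Doing this reliably across every dialect and every weird corner of SQL is exactly the “big lift” part.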
—
Gergely Orosz writes about the Staff Engineer role. Or rather, he publishes selected passages from Tanya Reilly’s new book, The Staff Engineer’s Path.
I’ll be the first to say that my current job is the first time I’ve ever worked directly with Staff Software Engineers, or at a company with a well-defined Staff SWE role. For those who have not had this pleasure, it really is a fascinating role, and one that develops uniquely qualified humans. I link to it here because I think that, in analytics engineering in particular but across data careers broadly, we lack an understanding of how ladders should extend once you’ve been through the Junior » Senior route.
Being a Staff SWE isn’t just doing the same work as a Senior SWE with more experience. The role, and how you make an impact on a team, is qualitatively different. Highly recommended reading.
—
Classic Stephen Bailey in In Search of the Übergraph:
Absent transfer, data is a linguistic crutch, an incidental aspect of doing-business, like how classes taught in English are not therefore “English” classes. Amongst the business, obsessing about data is a sign of disorder, not virtue.
But if you aim to turn data into knowledge and disseminate it broadly — now, you enter the proper domain of data.
The problem becomes neither the technology nor its application, but its lineage: its ancestry, context, reliability, fidelity, centrality. Those who work in this area learn that data is never wrong — only misunderstood.
It’s like a riverside village: everyone drinks the water, but no one thinks about where it flows from or where it goes, only whether they have enough. But for the data professional, the river, not the water, is the mystery.
I could not agree more—with this intro, or with the post as a whole. The Graph is what matters, and frictionlessly mapping more of it is the fascinating journey that I’ve been on over the past 6+ years. I think we’re just getting started.
Hey! Thanks for the post - how the heck do you keep track of all these articles?? Lol
The thing that most resonated with me is at the end of your post:
“The problem becomes neither the technology nor its application, but its lineage: its ancestry, context, reliability, fidelity, centrality. Those who work in this area learn that data is never wrong — only misunderstood.”
On this topic, I think that something data professionals get wrong, and therefore everyone who uses their data gets wrong, is the difference between your code working and your business logic making sense. As in, just because the query works doesn’t mean it’s logically right. And while this may seem super obvious, it’s the reason y’all created dbt metrics - code worked for two different people, but the logic was different.
One thing that I’m curious about for the future is more extensive dbt testing. Writing tests that test the logic of interrelated models, not just whether the columns are unique/not null. Or maybe it’s a standard set of logical questions that get asked for each model - how many rows are expected? How did you get the expected number of rows? What are the relationships of the joins and how can we prove the results are intended?
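Something like this toy sketch is the shape I have in mind: logic-level assertions across models, not just column constraints. (Table names are invented; assume a sqlite3-style connection to the warehouse.)

```python
import sqlite3

def fetch_scalar(conn: sqlite3.Connection, sql: str):
    # Run a query that returns a single value and unwrap it.
    return conn.execute(sql).fetchone()[0]

def test_orders_reconcile(conn: sqlite3.Connection):
    # Logic check: every distinct order in the staging lines should
    # roll up to exactly one row in the fact table.
    orders = fetch_scalar(conn, "SELECT COUNT(*) FROM fct_orders")
    expected = fetch_scalar(
        conn, "SELECT COUNT(DISTINCT order_id) FROM stg_order_lines"
    )
    assert orders == expected, f"Expected {expected} orders, got {orders}"

    # Logic check: a join we believe is one-to-one must not fan out.
    joined = fetch_scalar(conn, """
        SELECT COUNT(*)
        FROM fct_orders o
        JOIN dim_customers c ON o.customer_id = c.customer_id
    """)
    assert joined == orders, "Join to dim_customers fanned out rows"
```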
Overall, when comparing an analytics engineer to a software engineer, the AE's test coverage seems super... bare.
Anyways, thanks again for the post <3