Should we even care about using LLMs to query enterprise data?
It's looking like we'll be able to. Does it matter?
Hi folks!
Last time we chatted, I presented evidence, drawing on the work of Juan Sequeda and the data.world team, that knowledge graphs and the dbt Semantic Layer substantially boost our ability to correctly answer natural language questions about enterprise data.
This post generated some great discussion and even a replication. But the thing that really got me thinking was a parallel conversation in which some very smart people asked whether we should even care about text-to-SQL or text-to-trusted-dataset.
I really appreciated this aspect of the discussion because it refocuses us on what matters - how we can actually drive better organizational outcomes, not just how we can integrate fancy new tools.
I’m more optimistic than Abhi about the naive use case: a chat interface that lets stakeholders ask questions about their organizational data with high confidence. This would take some pressure off data teams and give more agency to the rest of the organization. I expect the real, if modest, gains would be enough for many organizations to consider this a worthwhile pursuit.
I actually tried to set up this very workflow in 2017 when running a data team (using a Slack bot connected to Looker). While the implementation was rough at the time and we never productionized it, it was clear that, for my own use case, it would have generated value.
But this is very far from the most interesting way one might imagine this branch of the tech tree progressing. If we squint a bit, we can imagine some much broader and more flexible applications.
So pull up a chair, get comfortable - last time we were scientists showing empirical stats. Today we’re getting just a little bit s p e c u l a t i v e .
First we’ll look at how a single workflow might adapt to incorporate natural language processing and understanding.
Then, we’ll go all out and imagine a world where these natural language questions power decision-making for LLM systems themselves.
The goal here is not to predict exactly how things will go, but to offer a few speculative glances around the corner at how they might.
Solving the stakeholder's dilemma
Flow states are a bit of an obsession of mine and of others at dbt Labs. They’re the reason I fell in love with dbt in the first place - nothing in my work as a data practitioner had ever let me slip into a flow state as reliably as developing dbt models.
These days I don’t get to write as much dbt code as I used to. Instead, I write a whole lot of Notion documents.
I write Notion documents about how dbt features are being used and how they might be better. About how successfully we’re engaging the dbt Community, and how we can create programs that make data practitioners more successful in their day-to-day work.
And I actually really love this part of the job - it’s fun, interesting, and intellectually stimulating. But at a certain point in the process, I need to connect my ideas to the on-the-ground reality of what’s happening in the business. I need to pull in data about the systems I’m describing.
Like many managers before me, I’ve made the sad realization that my connection to the day-to-day technical work isn’t as close as it used to be. Sure, I can usually get what I need. But in the past I was surfing the intricacies of a project I knew inside and out - every detail, every nuance, every little trick to effortlessly get the answers I needed. Nowadays, it’s more of a struggle.
A lot of the time, I end up asking someone on the data team for help. That’s right - I’ve become a stakeholder.
I call this the stakeholder’s dilemma - do I bother the data team with a request, do I try to pull it myself, or do I handwave away the question and leave out the numbers I was looking for?
Let me tell you, folks, nothing kills a flow state like the stakeholder’s dilemma. More often than I’d care to admit, I end up not looking for that piece of data that would complement (or, more importantly, contradict) my point.
And this is where the interesting bit comes in.
It’s a pretty simple jump to imagine an interface that connects both to the document I’m writing and to my Semantic Layer - allowing it to make some pretty good guesses about the types of data we have that would be most relevant to what I’m writing about.
If I’m writing about the health of one of our products, I probably want to see the metrics and feature usage associated with it. If I’m looking at our best-practices guides, I probably want to see how those are performing.
This fits really neatly into the copilot / assistant paradigm of AI products - building tools that assist and augment knowledge workers to increase the quality and speed of work.
We can imagine a world where I checkpoint my planning document by highlighting some text I’m writing and asking the assistant to find the subset of questions that:
Would help validate or improve my thinking
Are answerable based on the information we have baked into the Semantic Layer
And just like that, I get back a vetted list of questions, and their associated answers, from my organizational data - and I can keep cooking.
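That filtering step - "answerable based on the Semantic Layer" - can be sketched in a few lines of Python. This is purely illustrative: the metric catalog and the keyword matching below are stand-ins for a real Semantic Layer metadata API and a real LLM relevance check.

```python
# Hypothetical sketch: keep only candidate questions the Semantic Layer could
# actually answer. AVAILABLE_METRICS and the keyword matching are stand-ins
# for a real metadata API and a real LLM relevance check.
AVAILABLE_METRICS = {
    "weekly_active_projects": ["active", "projects", "usage"],
    "docs_page_views": ["guide", "docs", "views"],
}

def answerable_questions(candidates: list[str]) -> list[tuple[str, str]]:
    """Pair each answerable question with the metric that could answer it."""
    matched = []
    for question in candidates:
        text = question.lower()
        for metric, keywords in AVAILABLE_METRICS.items():
            if any(keyword in text for keyword in keywords):
                matched.append((question, metric))
                break
    return matched

candidates = [
    "How many active projects used this feature last week?",
    "What do our competitors charge?",  # no matching metric -> dropped
]
print(answerable_questions(candidates))
```

The point of the sketch is the shape of the workflow, not the matching logic: generate candidate questions, then let the semantic layer's catalog act as the vetting filter before anything reaches the reader.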
Will it be perfect? Probably not. Could it be useful? Absolutely.
Stakeholder ex machina
An interesting thing about the example above is that while the end consumer of the data was still a human, both the natural language question and the answer to that natural language question were generated by the LLM.
What if the stakeholders asking the natural language questions of our enterprise data aren’t only people? What if we have one big stakeholder - LLM systems themselves using text as the universal interface?
We’re increasingly seeing that while LLM systems are powerful in and of themselves, they get much more powerful when they have reliable tools.
The first big example of this was when Code Interpreter was added to ChatGPT and, all of a sudden, a huge wave of new use cases was unlocked. Then came browsing. Then vision.
These additions provide superlinear benefits - each new tool can access, and route information through, the other tools via the LLM acting as the central kernel, the router between different systems.
OpenAI’s Andrej Karpathy has been calling this the LLM OS.
Tool use is a key capability of large language models - and accessing trusted datasets is a very potent tool.
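As one concrete (and hypothetical) illustration, here is what exposing a trusted-metrics query as a tool might look like, using the general shape of OpenAI's function-calling schema. The tool name, parameter names, and metric examples are all invented for the sketch:

```python
import json

# Hypothetical tool definition an LLM could call to answer questions from
# governed metrics instead of guessing. All names here are illustrative.
query_metrics_tool = {
    "type": "function",
    "function": {
        "name": "query_semantic_layer",
        "description": "Answer a question using governed metrics "
                       "defined in the dbt Semantic Layer.",
        "parameters": {
            "type": "object",
            "properties": {
                "metric": {"type": "string",
                           "description": "Metric name, e.g. 'revenue'"},
                "group_by": {"type": "array", "items": {"type": "string"}},
                "time_grain": {"type": "string",
                               "enum": ["day", "week", "month"]},
            },
            "required": ["metric"],
        },
    },
}

# Tool definitions are typically serialized and sent alongside the prompt.
print(json.dumps(query_metrics_tool, indent=2))
```

The interesting property is that the LLM never sees SQL - it sees a small, governed vocabulary of metrics and dimensions, which is exactly the narrowing that made the semantic layer results from last time so much more accurate.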
If we move up the ladder of abstraction one more time, we get LLM systems that have a specific goal and the ability to interact with the digital (or physical) world in some way. This doesn’t need to be as sci-fi as it sounds - you can imagine systems with relatively bounded goals that operate somewhat independently.
An advertising agent that can test and deploy new creative, then measure the efficacy of new campaigns
A support agent that not only answers a ticket, but validates that the expected in-product behaviors follow the ticket’s resolution (or escalates if not)
(Slightly more speculative) An inventory control agent which is able to suggest when products in a physical store need to be restocked or have their schedules updated
The pattern here is the same - a human worker creates a bounded AI agent to accomplish some meaningful goal. A key part of ensuring that goal is achieved successfully is grounding the agent in organizational data.
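To make that concrete, a single decision step for the inventory example might look like the sketch below. Everything here is assumed for illustration: `get_metric` is a stub standing in for a real Semantic Layer query, and the threshold is invented.

```python
# Minimal sketch of one decision step in a bounded inventory agent.
# get_metric() is a stub; a real agent would query governed metrics.
def get_metric(name: str) -> float:
    fake_warehouse = {"units_on_hand": 12.0}
    return fake_warehouse.get(name, 0.0)

def restock_decision(threshold: float = 20.0) -> str:
    """Suggest a restock when governed inventory data falls below threshold."""
    on_hand = get_metric("units_on_hand")
    if on_hand < threshold:
        return f"restock: only {on_hand:.0f} units on hand"
    return "ok: stock above threshold"

print(restock_decision())
```

The agent's autonomy is bounded precisely because the data it acts on is trusted: the decision is only as reliable as the metric behind it, which is why the connection to governed organizational data matters.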
And so now, we return to the original question that took us down this long and winding path - should we even care about connecting enterprise data to natural language queries by LLMs?
In the end, it depends on how valuable, and how likely, you think each of the following paradigms is.
Chatbot - question answering that allows anyone in the org to answer ad-hoc questions about your organizational data
Assistant - integrated workflows that act as assistants to knowledge workers, bringing relevant data to the systems they are working in, as they need it
Agent - systems that are able to reliably access and make decisions in the real world using your organizational data
If you think all three of these scenarios are at least somewhat possible and that they would really transform how organizations engage with data - you should care a lot! If you think only the first one is possible and you’re not that interested in it, it’s super reasonable to wait and see - or to take a totally different tack on how data + LLM systems will interact.
We’re at the beginning of understanding and working with a very new and very different way of working with computers! The way this ultimately shakes out is sure to surprise us.
The best way for us to figure this out? Try things! Share learnings and ideas. Talk to your peers. Ask questions. Be skeptical! And together, we’ll get there.
Before we go - I just want to say thanks for reading the Roundup this year. It’s a real joy to get to connect, share ideas and see the work that everyone in the industry is doing. We appreciate you very much - looking forward to 2024.