The data jobs to be done (w/ Erik Bernhardsson)
Erik Bernhardsson, the CEO and co-founder of Modal Labs, on his serverless platform for AI, data and ML teams, and his take on the future of data engineering
This is Season 6 of The Analytics Engineering Podcast. Please reach out at podcast@dbtlabs.com for questions, comments, and guest suggestions.
Erik Bernhardsson, the CEO and co-founder of Modal Labs, joins Tristan to talk about Gen AI, the lack of GPUs, the future of cloud computing, and egress fees. They also discuss whether the job title of data engineer is something we should want more or less of in the future. Erik is not afraid of a spicy take, so this is a fun one.
Key takeaways from this episode
You might be only the second person who's a repeat guest. You were on during Season One, and back then you were still in stealth mode with the project that has now become Modal. What have you been up to in the last few years?
Erik Bernhardsson: When we talked, it was the depths of COVID. I had just quit my job and was hacking on something that's now turned into Modal. And back then, my idea was I wanted to build a general-purpose platform for running compute in the cloud. And in particular, focus a lot on data, AI, machine learning use cases. Three years later, I'm still working on that.
What we discovered along the way was that AI, and Gen AI in particular, is a great application, because when we started building this, I didn't have a clear use case in mind. I just had an idea that if I built this platform, people would use it, because it seemed like a gap in the market. It turned out that Gen AI is a killer app for what we built.
We've seen a lot of interest in large-scale applications for audio, video, and images: diffusion models, biotech models, video processing. So we run a big compute cloud, a big pool of GPUs and CPUs in the cloud. And the other side of that is we offer an easy-to-use Python SDK that makes it very easy to take code and deploy it into the cloud, in a way where you don't have to think about scaling and provisioning and containers and all that stuff you typically have to do if you build your own stack.
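For the curious, here is roughly what that workflow looks like with Modal's Python SDK (a minimal sketch following Modal's public docs; the app name and the function body are hypothetical stand-ins):

```python
import modal

app = modal.App("example-inference")

# The decorator asks Modal for a container with an A100 GPU; provisioning,
# scaling, and containerization all happen behind it.
@app.function(gpu="A100")
def generate(prompt: str) -> str:
    # (hypothetical) load a model and run inference here
    return f"output for: {prompt}"

@app.local_entrypoint()
def main():
    # .remote() executes the function in Modal's cloud, but the call site
    # reads like ordinary Python.
    print(generate.remote("a watercolor of a data pipeline"))
```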
You are building a compute platform that's built on top of the cloud providers. Why do people need Modal? How is it different than just operating directly with the services that organizations like AWS provide?
There's more room in the cloud space. AWS, Oracle, Azure, and the rest have done a fantastic job building the foundational layer of storage and compute.
I've used AWS for a good 15 years, and I love AWS for what it enables me to build. But it's still not easy to use. And it still gets in the way of iterating quickly and shipping things. AWS and others have a very solid place in that stack; they're an amazing compute and storage provider. But there’s plenty of room to innovate in the layer above, which I think of as the sort of layer that you were talking about, which is the Snowflake layer, and that's also where I would put Modal. We repackage a lot of the cloud primitives in a way that suits what data, AI, and machine learning teams want to accomplish. And by focusing on one particular use case, we can offer a much better user experience.
Cloud providers are hard to use because they try to build for every user at the same time, which means no user is going to have a good user experience. At the end of the day, they're massively successful businesses generating tons of money from storage and compute. That's where I think they belong. That's the core part of the stack.
Anything above that is going to drive usage of storage and compute. They just want to drive demand to the underlying services. And you've seen this with the success of Snowflake. Snowflake builds on top of AWS and the other clouds and, in a way, competes with them, but not really, because at the end of the day, AWS and the other clouds get the money either way.
So there's room for a layer on top of the hyperscalers. Databricks and Snowflake are both sitting in that place today, and you're a more nascent entrant there. How do I frame the thing that you're helping users with versus Databricks—especially Databricks, because they're very focused on AI use cases and you're doing a lot of work there too?
I hesitate to position ourselves against Databricks because in many ways they're a fantastic company. But aspirationally, we share the same vision of what we want to accomplish. They're obviously massively ahead of us by 10, 15 years, but their vision is the same; they want to build an end-to-end platform to serve data, AI, and machine learning needs.
We come at it with a very different architectural approach. We basically said: you can't run this yourself. We're going to be not just the software layer; we're also going to be a hosted infrastructure-as-a-service platform—building for containerization, building for cloud, building for this multi-tenant, super-elastic compute pool. That meant that we could make very different architectural decisions. Obviously I have bias, but I think we have the right tailwinds because we're thinking about architecture in a very different way.
There are a couple of companies recently that are doing some version of helping you have an abstraction layer across the different hyperscalers and across the different availability zones to do resource pooling. Why is that happening today?
I honestly think it just comes down to the fact that GPUs are expensive. In order to make the economics work, you have to run them at very high utilization. And because you run them at very high utilization, you're going to have poor availability, which means that any single availability zone may actually be close to capacity most of the time. Which means that in order to do this well, you need to go to different availability zones, you're going to need to go to different regions, you need to go to different clouds.
It's a big part of what we've been spending time doing—integrating with a bunch of different cloud vendors, using all the different regions and zones, and then just getting capacity to the customer wherever we can find it. That's actually a fun, interesting problem in itself. We monitor prices, which change dynamically, 24/7. We solve a mixed-integer programming problem to figure out the optimal placement given the resource constraints: how do we allocate the pool of machines in the cheapest possible way?
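As a toy illustration of that placement problem (all prices and capacities are hypothetical, and this uses the open-source PuLP solver; Modal's real formulation is certainly richer):

```python
# Toy version of the placement problem: meet GPU demand at minimum cost,
# subject to per-region capacity. All numbers are hypothetical.
import pulp

regions = ["us-east-1", "us-west-2", "eu-west-1"]
price = {"us-east-1": 2.10, "us-west-2": 2.40, "eu-west-1": 2.25}  # $/GPU-hour
capacity = {"us-east-1": 40, "us-west-2": 80, "eu-west-1": 60}     # free GPUs
demand = 120                                                       # GPUs needed

prob = pulp.LpProblem("gpu_placement", pulp.LpMinimize)
alloc = {
    r: pulp.LpVariable(f"alloc_{r}", lowBound=0, upBound=capacity[r], cat="Integer")
    for r in regions
}
prob += pulp.lpSum(price[r] * alloc[r] for r in regions)  # minimize total $/hour
prob += pulp.lpSum(alloc.values()) == demand              # meet all demand

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for r in regions:
    print(r, int(alloc[r].value()))  # cheapest regions fill up first
```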
There's a fixed number of GPUs in the world. There are fewer GPUs than workloads in the world, I think. Is that a true statement?
The fundamental economics of GPUs is that most of the cost goes to Nvidia. To recoup that cost, you need to run them at very high utilization. Most of the cost of a CPU is power, which makes it more of a variable cost, which means you don't really care about the utilization of CPUs as much. So AWS can just over-provision and run things at much lower utilization. But with GPUs, to make the economics work, you need to run them at high, high utilization. Hopefully GPU prices will come down. That's what I'm hoping for. But right now, it's a supply and demand problem.
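A back-of-envelope calculation (illustrative numbers, not Modal's) shows why utilization dominates GPU economics:

```python
# Amortize a hypothetical $25,000 GPU over three years of wall-clock time.
capex = 25_000              # up-front cost, mostly going to Nvidia
hours = 3 * 365 * 24        # amortization window: 26,280 hours

for utilization in (1.0, 0.5, 0.3):
    busy_hours = hours * utilization
    print(f"{utilization:.0%} utilized -> ${capex / busy_hours:.2f} per useful GPU-hour")

# 100% -> ~$0.95; 50% -> ~$1.90; 30% -> ~$3.17 per useful GPU-hour.
# A CPU's dominant cost is power, paid only while it runs, so an idle CPU
# is cheap; an idle GPU keeps burning capital.
```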
If somehow GPU prices come down over time and they become more like x86 processors in the way that the market works, do we still care about all this hard work that you're doing to combine resource pools of GPUs across multiple availability zones?
Maybe it becomes less relevant, but on the other hand, the value of the platform then becomes more important. I think a lot about this for a lot of AI startups: are you long GPU prices or are you short GPU prices? If GPU prices go up, what happens to the value of your company? Does it go up? The truth is, if GPU prices were to crash, it would be hard for us in the short term, because we have a bunch of long-term contracts and the revenue would go down quicker.
But I actually think in the long run it would be good for us, because having an abundance of GPUs is very good for customers. It's good for the world. And like a lot of infrastructure providers, in a way, we focus on the software, not the hardware. And if the underlying cost of the hardware goes down, the relative value of the software goes up.
Let's talk about egress fees. One of the driving forces of being cross-cloud and cross-availability-zone is GPUs. In the past, one of the reasons not to do that was that it's just really expensive to move your data around. I think in certain cases that's still true, but it's starting to change. Can you say more about what's happening with egress fees?
AWS still has very high egress fees, but they're coming down. I think there's a lot of pressure from Cloudflare's R2, for instance. The interesting thing about Gen AI is that egress fees don't really matter that much, and that's driven a weird re-architecture of a lot of compute. Part of why we've been able to build a multi-region, multi-cloud architecture is that, if you think about it, Gen AI doesn't need a lot of bandwidth. It's very compute-hungry. The data moved is actually very small compared to the compute.
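A rough per-request comparison (all numbers illustrative) makes the point that compute dwarfs egress for Gen AI workloads:

```python
# Per-request cost of compute vs. egress for a hypothetical
# image-generation endpoint. All figures are illustrative.
gpu_price_per_hour = 2.00        # $/GPU-hour
gpu_seconds_per_request = 2.0    # diffusion inference time
egress_price_per_gb = 0.09       # typical cloud egress price
response_mb = 1.0                # one generated image

compute_cost = gpu_price_per_hour * gpu_seconds_per_request / 3600
egress_cost = egress_price_per_gb * response_mb / 1024
print(f"compute ${compute_cost:.5f} vs egress ${egress_cost:.5f} per request")
# ~$0.00111 of compute vs ~$0.00009 of egress: compute dominates by >10x,
# so it pays to run wherever GPUs are available, not wherever the data lives.
```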
I have a feeling that over time, egress fees will come down and the distinction between regions will matter less and less, except for latency-sensitive applications. But it turns out a lot of stuff is actually not that latency-sensitive.
One of the biggest conversations in the data industry today is what's going on in the file format and catalog wars—Unity, Iceberg, Delta—and there's a lot of focus on making sure that different systems can talk to each other. But one of the things that we're not talking about yet is that if you have a global company, you're inevitably probably not using a single cloud provider in a single availability zone. So you also have to solve a fabric issue of where the data is physically located. It's not just a format thing.
Yeah, that seems hard.
Okay, let's go from the future of the cloud to the future of data engineering. You spent a long time as a data engineer or building tooling for data engineers.
I was at Spotify for seven years, where I built a lot of the music recommendation systems. And then I was at a company called Better for many years as the CTO. I've been focused on data, AI, and machine learning. The precursor to all of that, or the prerequisite, is that you've got to get the data. And that necessitates doing a lot of data engineering, data cleaning, and building data pipelines. I ended up building my own workflow engine called Luigi.
No one uses it today, but you're skipping by a time period in which a lot of people used it.
A lot of people used it 10 years ago. I was deep in that swamp. Data engineering is funny because in a way, I kind of don't want it to exist. My prediction has always been it's going to go away at some point.
I want to explore this with you because in many ways, I agree with you. At the same time, there are a ton of humans in the world who call themselves data engineers and use dbt. And so the last thing in the world I want to do is say that data engineers suck, because I don't believe that. The funny thing, though, is that technology progresses over time, and the jobs to be done that humans need to do change.
I've seen this so many times: a team of data scientists benefits tremendously from injecting a lot of data engineering skills. Suddenly they can get the data.
I don't really like the idea of it being someone's job to shuffle data around. I want everyone to think about what the business needs and to build business applications. I would say the same thing about any internal platform team. All these internal platform teams tend to be somewhat ephemeral and transient.
All these titles too, right? There are data engineers, data scientists, analytics engineers. To me, it doesn't matter. There are always going to be people who need to work with data, AI, and machine learning, and that slice is going to grow and grow and grow. But the actual composition of those teams is going to change a lot. And so I don't really pay that much attention to titles.
I just wrote a white paper on the analytics development life cycle. And there are three different jobs to be done in the analytics development life cycle. There's a developer—people who create reusable assets for other people. There's an analyst—people who interact with the data to try to draw conclusions about the real world. And then there's a decision maker—people who get the recommendations from the analyst and make decisions. If you start slicing it up more than that, I think you inject friction into the process as opposed to adding clarity.
I have to compliment you on creating the label analytics engineer, because at that time there were so many people out there unsure how they fit into their organizations. You brought them an identity. That was eye-opening for so many people and created a sense of belonging in a community.
My favorite thing was when people told me that they got an analytics engineer title and their pay went up 50%. I was like, great, you're doing valuable work. You should be paid for it.
You gave a lot of people recognition and I think that you should get a lot of credit for that.
We've talked about how data engineering is a valuable thing to be done, and there's a trajectory here where probably fewer humans need to turn the knobs and dials. What about machine learning and AI? There are ML engineers, and more and more people describe themselves as AI engineers. Do we need specific titles for these things, or are we all just software engineers?
The difference between AI engineers and machine learning engineers is that AI engineers use TypeScript and machine learning engineers use Python, for the most part. A lot of the recent stuff around LLMs—not to trivialize it—is a low-code machine learning tool for people for whom machine learning felt unapproachable. Suddenly they were given a new tool, and they could stitch together a bunch of prompt engineering and get it to work.
Let's wrap up with the question that I love to ask everyone. What do you hope will be true of the data AI software industry over the coming five years?
I hope that people will never have to think about containers, infrastructure, provisioning, or resource management. I really hope that all of that will be abstracted away in the next five, 10 years. That people will focus on application code and business logic and building cool AI shit, and then rely on infrastructure to take care of all the other stuff.
This newsletter is sponsored by dbt Labs. Discover why more than 30,000 companies use dbt to accelerate their data development.