Developer productivity on GitHub Copilot

GitHub Next's Dr. Eirini Kalliamvakou on making sure tracking productivity reflects reality

Sep 08, 2024

Welcome to Season 6 of The Analytics Engineering Podcast. Thank you for listening, and please reach out at podcast@dbtlabs.com for questions, comments, and guest suggestions.

Dr. Eirini Kalliamvakou is a senior researcher at GitHub Next. Eirini has built a career on studying software engineers, how to measure their productivity, and how developer experience impacts productivity.

Recently, Eirini has been working on quantifying the impact of GitHub Copilot. Does it help software engineers be more productive? Tristan and Eirini explore how to quantify developer productivity in the first place and, finally, whether Copilot makes a difference. In the search for real business value, this research is a bellwether for things to come.

Key takeaways from this episode

Can you give us a little bit of background on Copilot? I think Copilot came out ahead of ChatGPT, which means that it clearly predates the groundswell of attention that AI has gotten.

Eirini Kalliamvakou: Fundamentally, Copilot is an AI-powered code completion tool. So as developers are in their developer environment, in their editor and they're typing, they're writing their code, they get suggestions from Copilot. Sometimes that means completing the line of code that they are writing. Sometimes it means that it gives them much longer snippets. Sometimes it means that they write a comment and then it suggests code.

The concept of auto-completion was not new. The AI-powered completions and the way that Copilot suggests them, that's where the interface comes in. It doesn't have a special interface per se, but suggestions appear as italicized text, which we call ghost text. That makes it very easy to just ignore a suggestion and keep typing if it's not something that you're interested in or it doesn't fit exactly what you were going to write. But it also makes it very helpful when you do accept it, because it gives you this boost of progress.

Developer productivity has historically had a bit of a bad rap. Can you introduce us to what you believe is a good way to measure developer productivity?

You used a word earlier about developer productivity: obsessed. I think that definitely captures it. I think the tech industry is kind of obsessed, which sometimes seems unfair, because I think productivity generally is something that other industries and other sectors are interested in.

I don't find myself talking about the productivity of researchers, for example, or doctors and lawyers, but we talk a lot about developer productivity.

What I think is best is if the metrics, the measurements, and the models that we're using to track productivity reflect reality. One of the problems with how we've traditionally been thinking about developer productivity is that it does not match the reality of software engineers, because it comes from archaic concepts of productivity from more industrial settings, whereas now we're talking about knowledge work. And those two don't match.

It's an error of judgment when we only go for things that we can measure or, even more so, things that we can measure easily. It's very easy to count lines of code or numbers of PRs or all of these things. And they do have their place, right? They show a pace of activity for the developer. But we have to be careful with the conclusions that we draw from that. If we see a slower pace, is it the developer that is at fault? Is it the system that they're working with? Is it the process by which development happens in their team or in their organization that is the problem? Most of the time we don't do that diagnosis afterwards.

I see it in my work a lot of the time where people ask for a lot of metrics and a pluralistic view of productivity, but they actually say, “Can you please narrow it down to this one metric that I can put on my dashboard and I can look at every day?”

I think the correct thing is to have a view of productivity that matches developers’ reality, some of which is perceptual and some of it is observed. We need to have metrics and ways to measure that capture both of those.

This is something that I have tried to practice whenever I'm called to do, or I'm interested in doing, measurements of productivity. We know, and we definitely live and breathe that at GitHub too, that it's not just a single thing. So when the time came to do productivity evaluations for Copilot, yes, we looked at the acceleration of developers as one aspect of productivity, but we also looked at satisfaction. We looked at the state of flow. We looked at cognitive load.

Can you talk a little bit about the SPACE framework?

The SPACE framework and the metrics that you mentioned are examples of metrics that you could use. The interesting and impactful part about the SPACE framework is that it says you need to look at more than one thing.

S stands for Satisfaction and Wellbeing. P is Performance. A is Activity, which is usually what we think of when we think about automated measurements. C is Communication and Collaboration, and E is Efficiency and Flow. So there's a lot more there than just Activity, and the SPACE framework is a great reminder of that.

Don't just count how many things are happening or how fast. Think about some of these other dimensions.
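As a rough illustration of "look at more than one thing," here is a minimal Python sketch of a productivity snapshot that has one slot per SPACE dimension and flags when only Activity has been filled in. The class name, field names, and helper functions are all hypothetical and not part of the SPACE framework or of any GitHub tooling; the scores are placeholders for whatever survey or system metric a team actually uses.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SpaceSnapshot:
    """One hypothetical measurement per SPACE dimension (None = not measured)."""
    satisfaction: Optional[float] = None   # S: Satisfaction and Wellbeing
    performance: Optional[float] = None    # P: Performance
    activity: Optional[float] = None       # A: Activity (e.g. PR counts)
    communication: Optional[float] = None  # C: Communication and Collaboration
    efficiency: Optional[float] = None     # E: Efficiency and Flow


def missing_dimensions(snapshot: SpaceSnapshot) -> List[str]:
    """Return the names of the dimensions that have no measurement yet."""
    return [f.name for f in fields(snapshot) if getattr(snapshot, f.name) is None]


def is_activity_only(snapshot: SpaceSnapshot) -> bool:
    """True when Activity is the only dimension that was measured."""
    measured = [f.name for f in fields(snapshot) if getattr(snapshot, f.name) is not None]
    return measured == ["activity"]


# A dashboard that only counts things would look like this:
dashboard = SpaceSnapshot(activity=42.0)
if is_activity_only(dashboard):
    print("Only Activity is measured; still missing:", missing_dimensions(dashboard))
```

The point of the sketch is the check rather than the numbers: a view of productivity that fills in a single field is exactly the "one metric on my dashboard" situation described earlier.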

When you talk about developer experience or developer productivity to senior executives, CIOs, CDOs, et cetera, do they get it? Do they care? Do they consider this a strategic priority?

It's definitely a very different conversation to have with, as you say, a CTO or a CIO versus a head of engineering versus a developer. We're talking to completely different audiences and the conversation ends up very different.

When I talk with executives about this, there is some reluctance. And there are a lot of questions. The first wave of questions is always, “But what is the developer experience?” And I think sometimes it comes with some preconceived notions that developer experience is all about, I don't know, having perks at work and so on.

Ping pong tables.

The beers on a Friday and the ping pong tables and so on. At the very beginning, that's all people thought developer experience was. Now, fast forward a few years and a few studies, and we realize that there's a lot more to that.

So the first wave of questions is always, what is it? And what we talk about is the developer experience, which is essentially how satisfied or hampered developers feel with the tools and the processes that they use, so that we start developing an understanding of what it is that boosts them.

It starts with how satisfied people are with the tools and the processes that they're using today. We have narrowed it down to three main factors, and when I say we, I mean as a research community. One has to do with the flow state. I could talk about the flow state for hours if someone lets me, but essentially you measure that as satisfaction, which is very perceptual, with the amount of deep work that developers are able to do, the frequency of interruptions, and, surprise, surprise, whether they find their work engaging versus boring and repetitive or non-impactful.

Then there are feedback loops, which is the time it takes to get feedback. And that's something that you can get from a perceptual angle.

You can ask people approximately how much time something takes them, or how slow or fast a particular feedback loop is. You can also add data to this, because for at least a few things you can see in your systems the time from the start of a flow to the end. You can also divide it into value-added work and not.
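To make the "start of a flow to the end" idea concrete, here is a small Python sketch that takes timestamped stages of one feedback loop and splits the elapsed time into value-added work and waiting. The stage names, timestamps, and the `Stage` and `summarize_loop` names are made up for illustration; which stages count as value-added is an assumption each team would have to decide for itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Stage:
    """One hypothetical step in a feedback loop, e.g. for a single pull request."""
    name: str
    start: datetime
    end: datetime
    value_added: bool  # assumption: the team labels each stage itself


def summarize_loop(stages: List[Stage]) -> Tuple[timedelta, timedelta, timedelta]:
    """Return (total, value_added, waiting) durations for a loop.

    Assumes the stages are in chronological order and do not overlap.
    """
    total = stages[-1].end - stages[0].start
    value_added = sum(
        (s.end - s.start for s in stages if s.value_added), timedelta()
    )
    return total, value_added, total - value_added


# Made-up example data: one change going from first edit to merge-ready.
day = datetime(2024, 1, 1)
loop = [
    Stage("write code", day.replace(hour=9), day.replace(hour=10), True),
    Stage("wait for CI", day.replace(hour=10), day.replace(hour=10, minute=30), False),
    Stage("wait for review", day.replace(hour=10, minute=30), day.replace(hour=14, minute=30), False),
    Stage("address review", day.replace(hour=14, minute=30), day.replace(hour=15), True),
]

total, value_added, waiting = summarize_loop(loop)
print(f"total={total}, value-added={value_added}, waiting={waiting}")
```

Observed numbers like these cover only part of the picture; the perceptual side, asking people how slow or fast the loop feels, still has to come from a survey.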

Then there's cognitive load, which is how easy it is for people to understand and read code that already exists, or the artifacts that they're dealing with. It also depends on how easy it is to deploy code: not just to understand it, but also to be able to work with it. And how intuitive are the tools and the processes that people have to work with daily? So these three pillars, the moment we start talking about those, they start making a lot more sense and become a lot more relatable.

And then the next wave of questions is usually, why does that matter? Of all the things that I can use my finite resources for, why should I put them towards improving developer experience, and how do I go about it? And that's why we did a study a few months ago that was specifically focused on measuring the tangible impact of introducing improvements.

And you showed meaningful results. What was it? 50 percent?

It's a 50 percent productivity boost for developers to have deep work that is dedicated, meaning it's blocked off and it's a significant amount of deep work. There's so much evidence piling up about the state of flow and the state of deep work for developers, and how it's not only fundamental to their being able to make progress, which is a more traditional view of productivity, but also fundamental to them feeling good about what they're doing. We did find significant results. And I mean that both in the statistical sense and in the sense that we found impactful ways that improvements in developer experience can affect the bottom line for enterprises.

That helps move the conversation a little bit further than before, because now we have evidence, we have data that we can show about what you can expect to get in return.

