Data teams: embedded or centralized? Reactive or self-directed?

Why neither dichotomy is the right level of abstraction. Also in this issue: practitioner stories, how to retain customers, who quantitative UX researchers are, and the ultimate theory of burnout.

Apr 17, 2022

Data team leaders: how much of your time have you spent thinking of the optimal way to organize your limited human resources to satisfice the unlimited number of things on your teams’ todo lists?

Folks on data teams: how often have you been moved from being embedded on a team to a central data function, or vice versa?

How often have you been a part of a conversation that went something like:

“I wish my team had more time to do the real work of researching the fundamental problems of the business.”

TJ @teej_m

@imrobertyi My team’s workload is at least 50% self directed. The difference is that we drive decisions for a profitable business, not a growth-at-all-costs VC-backed startup. Models exist where the short term needs of the business dont account for > 100% of data team capacity.

There is a constant tension in every data team between the amount of time it spends on reactive work (short-term priorities set by people outside of the team) and proactive research/data organization (longer-term priorities set by the data team itself).

Things that we data leaders lean on to keep some of this tension at bay:

ticket queues and rotations to staff those queues
staffing an analytics team based on resource needs of other departments, with departments allocating headcount rather than the data team
embedding an entire human’s/set of humans’ time with another function to free up other folks on the team to think about the data platform, data model or bigger picture analyses

The first is usually employed by centralized teams. The second, by embedded data functions. The third is a hybrid — reserving some capacity for “self-directed” work on a central team, with embedded folks acting as human shields for the majority of inbound requests.

Data team structure: embedded or centralised? by Mikkel Dengsøe

All of these models have some sort of downside. Centralized teams get disconnected from broader business priorities (so we hire data PMs). Embedded teams lead to knowledge fragmentation, a lot of which is lost and needs to be rebuilt when someone leaves. Hybrid teams can lead to resentment if the different work sub-teams do isn’t framed just right: why does my teammate get to work on the “important, self directed” business problem while I’m answering questions all day? 1

Underneath this tension is an implicit value judgement: priorities set by others are less good. Priorities set by data people, for data work, are better because we data people know what is best for the business.

We’ve come to accept this as normal. It’s just how data teams work. You move from embedded to centralized models as you gain more trust from your stakeholders, and back more towards embedded models if you lose some of that trust.

But why do we work this way? And do we have to?

Let’s start with the why.

What are we really trying to accomplish? We say we are driving data-driven decision making. But what does that mean? Are we successful if teams outside of the data function are basing their decisions on numbers? When teams outside of data check that those numbers are right before they use them? When teams outside of data change their mind after looking at data?

We say we’re trying to drive cultural change and build a data culture in our organizations. So let’s think about what culture means.

Culture is a sense of shared identity in a group of people. That shared identity is made up of group norms (what is the right and expected way to do something), a shared language (when we say “x” do we both mean the same thing), shared beliefs and values (what is important to us above everything else?), shared methods of self-expression (music, art, jokes, memes). You get the picture.

So data culture means:

norms around data testing, quality, control, access
shared language around describing the business
values like accuracy, timeliness, reproducibility
shared self expression like jokes, memes and other cultural references

But Anna, you might say, doesn’t teaching every person in the company to appreciate all those things you’ve listed mean we’re just turning them into Analysts? Isn’t this why we just hire more data people on our teams to help build data culture?

We’re actually doing very little data culture building in all of these models (centralized, embedded, reactive, self-directed) because the data team remains the keeper of all of these practices. The data team defines itself in opposition to “data stakeholders”. It makes inside jokes about how nobody understands “us, data people”. You can sometimes even hear talk about embedded analysts being on “an island” and having “no team” rather than their business unit being their team.

There is another way.

It is similar to how engineering culture is adopted and lived through practices in an organization: engineering systems (monitoring, paging, alerting), processes (pager rotations, architecture decision records, CI/CD, version control, and so many more), and governance, alignment and ownership structures (code owners, style guides, planning processes, commit philosophy).

We don’t solve for engineering culture by embedding engineering team members to teach others how to “do engineering”. Nor do we centralize our engineering teams and have them develop new product features in a “self-directed” way.

Instead, we build solid engineering cultures using tools, systems and frameworks to help humans make fewer mistakes, and we partner engineers with humans on other teams (such as product, design, sales, etc.) so that they work together to drive the right business outcomes.

And while engineering cultures may differ across organizations, the principles on which they are built, and therefore the tools that enable those cultures, don’t vary dramatically from one organization to the next. Whether you think monorepos are still the best or that microservices rule, you still leverage version control and a clear release process to enable easy rollbacks in case of problems. Whether you think pager rotations should be limited to the engineering team working on their specific feature, or be broad across an entire organization, the concept of an “on-call” exists and tools help facilitate (and often teach us) how it works.

Why not do the same for data culture? Instead of moving people around the company to build data culture (read: using our data human resources to plug literal holes in our data processes), why not develop best practices and norms based on the values that we hold about “good data process”? Why not set up systems that codify those values and make sure our data processes are easy to follow regardless of someone’s level of analytics chops? Instead of focusing on who gets to ask the questions, why not focus on building a shared language around important business concepts with the rest of the company? Why not codify that shared language in the data models that are accessible to the whole company, and invite folks outside the data team to learn and develop this shared language with you?

Maybe if we do this, we’ll spend less time making memes about bad data questions, and more time making impact.

Elsewhere on the internet

Thinking about evolving your data stack

By Emily Thompson

Emily’s post this week is such a great and specific illustration of what I’m generally proposing above. Emily reminds us to focus on business outcomes, and not what we think we need to do more of, to drive the process of designing and building (or evolving) our data stack. She also reminds us to focus on the workflows of humans who will be working with the data, and the challenges that they face with being successful with data. And she encourages us to talk to others about how they’ve solved these problems, because they’re likely not unique to our organizations. 🙌 🙌 🙌

How to sell products and retain customers

By Sarah Krasnik

Sarah’s post this week really resonated with me. In it, she talks about the importance of solving real problems with products, and then talking about how your products solve those real problems. Those boring problems that Benn wrote about last week. She also reminds us to not underestimate the importance of a simple onboarding experience, regardless of how valuable people already find your product:

A user is already excited to use a tool that will remove manual work and inefficiency. And the first thing, as a vendor, that you’re asking the user to do is go through a manual inefficient flow? That’s a blow to the gut. (Let me caveat this with talking about post-Series A products—in MVP stage, well that might be a different story and the price would likely be lower as well.)
Eventually, you’ll have users that don’t have this hacky process and want to use the product. Not having both types of users rave about ease of use is just a missed opportunity.

What Sorts of Questions Quantitative UX Researchers get

By Randy Au

Have you heard of the new data job title on the block? Randy writes about what it means to be a “Quantitative UX Researcher”, and it turns out there’s a whole conference on this topic coming up in a couple of months. It’s hard to predict what job titles will stick, but this one lands very well for me: this feels like a very distinct specialization in what is otherwise a very muddy field of “Data Science”. I can imagine teams of quant and qual UX researchers working together to help businesses understand both what to build next, and how users feel about this with both rich context and hard data. Questions that remain unanswered for me that perhaps the folks at the conference would get into: when do you think about making your quant UX research hire? What are the right qualifications to look for? Must they have a PhD, an ML background, or a survey methodology background? Where in the organization do they sit? Do they report to Design, to Data, to Product? There are rarely right answers to these questions, but I’m looking forward to the conversation!

How to know when to stop

By Andy Johns (via Lenny’s Newsletter)

I don’t know if you’ve caught this conversation from early March in which Andy opens up about mental health, but it stayed with me. And I’m so glad Andy gave it a deeper and even more thoughtful look because what he is sharing is so important.

A few things that I didn’t have language for that I do now thanks to Andy:

Levels of tolerance, or what are you willing to put up with in your work and what are deal breakers? Just like in a relationship, we all have “red flags” or things that make us (or should, ideally, if we have the awareness) rethink our commitments. Those flags are different for everyone, and probably change throughout our lives as our priorities change too. But knowing what they are, and not compromising on them, is step 1 towards mental health. I’m putting this on my wall.
Figuring out boundaries before they take you to the intolerance danger zone. What are the warning signs for you?
Finally, what helps you flourish? These aren’t just work related either. In fact, they’re mostly not. They are things that recharge you, that bring you balance, calm and energy. One sign you are losing sight of your boundaries is that you stop doing the things that help you flourish.

Andy puts all of this much better, but I hope I’ve convinced you to give the entire piece a read.

In my last roundup, I made a call for folks who want to build more writing experience or share their practitioner stories to reach out, and you have — THANK YOU! That’s awesome! Keep it coming!

I also really love what Stephen Bailey did this week by flipping the format of his blog and inviting his audience to tell him about themselves: Thread: How did your college degree improve your data practice? Check it out, and maybe you’ll be inspired to join the conversation!

Until next time,

Anna

1 Yes, we can solve this with career ladders that encourage senior data humans to move more towards self-directed work, or with different job titles that distinguish the different work “analysts” and “data scientists” do. But this isn’t the right level to optimize for.

The Analytics Engineering Roundup
