Your Data PM is not a panacea
Congrats! You managed to convince someone to let you hire a Data PM. Are you setting them up for success? Three things to think about before you publish that job description.
Howdy 👋
Yup, you’re in the right place. This is still the Analytics Engineering Roundup.
Because we’ve added a new author (ME! 🥳) we’re playing around with the From: address — it now reads “Analytics Engineering Roundup” instead of Tristan’s name. If you haven’t already, whitelist this e-mail address to make sure these e-mails continue landing in your inbox.
With that preamble, let’s jump right into today’s Roundup 🦘
-Anna
The Data Product Manager
Today we’re talking about Rifat Majumder’s The Data Product Manager. Things I absolutely loved about this article:
1. It defines "data product" and "customer" upfront. A "customer", per Rif, is any “employee in an organization”, and a "data product" (a) is internally facing; (b) represents “every piece of data”; and (c) also contains “the tools used to generate, access, and analyze that data” [1].
   One might disagree with these definitions (and I'm about to! 😉) but it's an important baseline to have a meaningful shared conversation. Great to see these stated explicitly!
2. The article points out some of the biggest barriers to building a data-informed culture that are within the control of data teams to overcome:
   building data products without a clear purpose creates only an "illusion of progress" towards this goal; and
   building data products that aren't delivered just in time for when the business actually needs them leads to a significant waste of resources (yesss 🙌 we must be honest with ourselves here).
3. The article doesn't just tell us that we need Data PMs to help bust down these barriers. It tells us what will make Data PMs successful given the definitions in #1 and barriers in #2:
   focusing on demonstrating the business value of data, and
   evangelizing the work of data teams.
So we're done here, right? Case closed on Data PMs. </Roundup> Cue "Elsewhere on the internet..." Thanks for coming everyone, please stack your chairs on the way out.
👀
I agree with just about everything Rif says in this article, except two things. I'm going to argue that to build purposeful, just-in-time data products you need narrower definitions of “data product” and “customer”, and a good amount of pre-existing buy-in on the business value your data products provide. Both of these things should exist before you hire your first Data PM.
Allow me to elaborate. If you think of your customers and data products in the same way as the article does, here's how hiring a Data PM plays out in practice:
Your data org gets one PM 💪 Everyone celebrates 🎉
Your Data PM asks: "What is my product and who are my customers?" You say: "why, all of our data, of course, and everyone who uses it at the company".
Your Data PM:
Problem #1: When everything is a product, everything is a priority
Let’s say you’re running a centralized data organization. That is, you own both the analytics done at the company, as well as the data engineering infrastructure that powers said analytics.
At any given moment, you are probably negotiating some combination of:
The priority of your analytics backlog with your stakeholders (because everything is needed yesterday).
Some amount of system maintenance and support for your warehouse/lake/lakehouse and the BI tools used by the entire business.
Maybe even designing some reusable dbt models in your team’s copious spare time 😏.
Each of these is important, and each depends on the others. Where does your PM start? The answer is… 🥁 none of the above 🥁 because they are not data products.
My hot take:
💡 A data product satisfies the needs of a specific group of users (customers) through data insight, and goes through a lifecycle: development, release, maturity, and eventual retirement. [2]
Your analytics backlog is not a product because you’ll never retire it, and you’re rarely incentivized to mature the process. Your data ware(lake)house is similarly not a product because it doesn’t have a specific group of customers: your ware(lake)house is a platform that enables your data products to exist, each serving a set of customers. On their own, your nicely defined dbt models are not a product until you can measure that they are satisfying the needs of your customers in obtaining insights.
Therefore, before you hire a Data PM, first ask yourself:
❓What are my data products? Which data products are a priority?
Your answer can't be “everything”.
Priority data products are those that provide the highest leverage for a specific set of customers. For internal data products, this means having a significant impact on the business. High-leverage internal data products are those without which the business could not function. What does this look like for you?
High leverage does not mean things you are spending a lot of time on today, or things your organization doesn’t know how to do well yet. Those are details of execution, and the domain of a project manager.
If you don’t define your highest priority products up front, you’ll find that your Data Product Manager gets pulled into all your 🔥s simultaneously, and ends up doing relatively little strategic thinking about what your customers need long-term. What you’ll get instead is an overworked Project Manager, and you’ll be no closer to building a data-informed culture.
Problem #2: You have to have buy-in first
It's hard to hire someone to do the work of demonstrating business value for you if you don’t already have buy-in across the company. Most people want a job where they oversee a well-defined area with proven business value. Yes, a select few enjoy trailblazing. But your PM is not a panacea, and they won’t magically make your company develop a data-informed culture without some legs to stand on. Before you hire your first Data PM, ask yourself:
❓Do I have a track record of demonstrating business value with my data products? Which ones?
Ask your customers where they find the most value. Do some interviews, or run a survey. You may be surprised to learn that what you think is most valuable is not necessarily what your customers value in your organization.
And then start there. Start with areas of your data organization that drive known business value, and optimize them with your new PM. Gather an (objective) list of the most common customer pain points. If it’s not at least a little hard for you to read this list, you’re not done collecting data on this yet. Then create the best possible customer experience in areas you’re already doing well in, and use that momentum to fry bigger fish with your Data PM bestie.🐟
Problem #3: Reporting structures
Ask yourself:
❓Whom does my Data PM report to? What does their career progression look like? Who allocates resources to them when they need additional headcount/funding?
There are a few models, and you should pick one and understand its tradeoffs before bringing someone onboard.
Option 1: Your Data PM reports to the head of your data organization.
Pros: you have their full attention. You might have more flexibility in background when hiring: e.g. bringing onboard someone who is interested in doing the work, has a strong data background, but is newer to the Product Manager role.
Cons: it’s harder to get additional resources to support your existing PM if they want to grow a team. If you do, you end up creating a shadow Product team in your Data organization. If you don’t, that limits your Data PM’s career progression. Mentorship and growth may be challenging for your PM without a Product leader to report to, especially if they are new to the role. Your Data PM may not be able to get into every conversation they need to be in, depending on how rigid your organization’s chains of communication are.
Option 2: Your Data PM reports to a Product leader, and is embedded with your team.
Pros: A healthier set-up for your Data PM in terms of personal growth and career progression. You get lots of insider info on Product priorities, and can use this information to negotiate for more resources towards data products that are aligned with the Product organization (if any).
Cons: You are more likely to receive only a part of someone’s time. You may end up working with several people over short stints if there are changes to the structure on your Product team or their priorities. Your PM may not have as much context on what you actually do and the value your team brings. It may be harder to prioritize cross functional data efforts that are not a priority for the Product organization.
How do you decide? There are no perfect answers. You just need to decide where you prefer to spend your calories:
If it is important to you to have full autonomy over your Data PM’s time, you’re probably better off hiring (or promoting) a data professional who is great at strategy and stakeholder communication to be your Data PM. Know that you will need to put in extra effort to get alignment, buy-in, and resources, and to support their growth.
If you want to strengthen your relationship with your Product organization, an embedded PM may be a better option for you. Be prepared to be very decisive about where to direct their attention, and make sure you keep good records of your initiatives in case of any unexpected team shakeups.
Simple, no? 😉
What else should we be asking ourselves before making that first, critical Data PM hire? Reply to this e-mail, leave a comment on this Substack, or hit me up on the blue bird site. I want to hear what you think!
Elsewhere on the Internet:
☠️ CRUD tables need to die. I realize I’m preaching to the ‘all data must be events’ choir a little bit, but I found this article particularly ✨ because it’s written by a software engineer for software engineers. Moving away from CRUD is the first step in moving toward microservices. Yes, it’s not for every organization. But if we are to blur the boundaries between data teams and software teams through things like the Data Mesh, it helps if our software engineering colleagues are bought in too. 😬 Warning: Medium Paywall. 👎
📈 120 years of timezones visualized. I’m a sucker for ambitious data visualizations, and this one does not disappoint. If you’re a history nerd, and/or you dislike changing clocks as much as I do, you might appreciate seeing the emergence of daylight saving time in 1916, when the German Empire wanted to minimize the use of artificial lighting and save fuel for the war effort.
🏗️ Cloud Infrastructure as SQL. Light on details, but heavy on opinions from both the data and software communities. Check out the HN thread for a (genuinely) insightful discussion, although I admit my favorite take so far is from the Twitter 💩posters.
🤖 No code self-serve analytics with GPT-3. If you ever wanted to build a bot that turned natural language into SQL, somebody just beat you to it. Michael even used the classic Jaffle Shop data model for their demo. Watch our trusty dbt Community members trying to break it in this Slack thread. 🙃
📊 Is BI Dead? If you’ve been following Benn Stancil’s takes on the Modern Data Experience, check out the next installment that just dropped this week:
“BI tools are just one of several destinations for a company’s data. If BI tools require their own legs—for example, if they rely on a semantic layer to define metrics—they’ll be duplicative, because the same metrics also need to be defined in other destinations. Instead, BI tools should sit on top of global governance layers like dbt and metric stores. Eventually, a semantic layer will be a bug, not a feature.”
Hear, hear!
Until next time 👋
[1] See Run your data team like a product team and Data as a Product vs Data as a Service for earlier takes on the definition of a data product.
[2] Inspiration for my hot take.