Michael O'Neill on The Analytics Engineering Roundup


I stumbled on this post from a job req for a Principal PM, and it reminds me of an interview I deliberately tanked with Databricks because I wanted to point out what I believe is a gap in user personas within the data infrastructure world: product managers like me.

Every commercial data platform I've used seems designed for data engineers to get rows and columns into the hands of data analysts and data scientists. As a product manager, I'm not one of those key personas, and I have yet to encounter a data platform that gives me the tools to wrap those rows and columns up into an actual product.

Self-directed discovery? The only tools I have as a product manager to hide science experiments are permissions, non-prod environments and obnoxious/obscure naming conventions. But... how do I warn users that a table or column is deprecated without breaking the schema and without resorting to direct communication? How do I distinguish a beta that shouldn't really have production dependencies from the current release? How do I get documentation into a user's hands when they're poking around a tree of tables?

Correctness? My data platform should let me publish the results of my data quality investments, from visualizations to expectations about the tolerable range of imperfection (with trillions of executions, I'm going to get weird stuff), as part of the product's UX. Instead, I have to hope my users know to go to my DQ dashboards. Once I've built them from scratch.

Timeliness? My users shouldn't need to spelunk in an orchestration tool to find my data engineers' batch jobs. And streaming data... it should be easier to visualize how long it takes from customer action to processed data.

Feedback? I'm desperate for that feedback loop, but to get it, I have to schedule meetings and cajole busy people into attending. The only guaranteed feedback I get is when something goes wrong.

I could go on, but in short, how do I manage a full product lifecycle in a platform designed to output rows and columns?

In fact, I did go on and on in that interview with Databricks. I could tell within ten minutes that I didn't want the job, so I spent the remaining 20 telling this poor hiring manager about all the features that product managers would need to deliver a mature, self-serve data product. :-)
