Values, community, and public performance.

What can a post with some lighthearted poems teach us about the data community today?

Feb 26, 2023

In Episode 40 of the Analytics Engineering Podcast, Julia and I speak to Sarah Nagy of Seek and Chris Aberger of Numbers Station, two startups building analytics products on top of large language models. Over the course of an hour I learned a ton about the technology and its applications to analytics.

Enjoy the issue!

- Tristan

—

This is a bit random, but I’m going with it.

One of my favorite bloggers at one of my favorite firms, Rachel Stephens at RedMonk, wrote the most trivial / lighthearted / just downright pleasant post on Valentine’s Day. RedMonk is an industry analyst firm focused on software developers, and Rachel is a senior analyst there. The post is titled “Roses are Red,” and it contains a whole series of four-line poems just like this one:

Roses are red
(not checkered like flannel)
Please ask this again
In a public channel

On first read, this post feels like nothing at all. Someone took 15 minutes out of their day, wrote some silly poems, and hit submit. Moving on.

But stop and sit with this post for a little while and it becomes…notable. Somehow Rachel considered this unique expression of her own identity to be worth spending 15 minutes on. She probably had some fun with it and figured that her readers would as well. Not only that, Rachel posted this on her professional blog at redmonk.com. When’s the last time you fired up your company’s WordPress editor and wrote some poetry?

When I really look at what’s going on here, it feels like a pretty stunning statement of values. Here’s what I see coming through:

Vulnerability / psychological safety.
Humanity—an integration of professional and personal identities.
Quirkiness.
Self-deprecation.
Elevation of the collective good (Roses are red / Open source is free / Thank you, maintainers / For bettering humanity).

None of this would be worth pointing out except that, as a (fairly) longtime observer of the software engineering ecosystem, I actually think that this blog post is tapping into a consistent and very positive strain in the culture of the modern industry.

Certainly, there is and has been plenty of toxicity in software engineering’s online milieu; no argument there. But if you look in the right places, the values above are regularly on display. How did this happen? How did software engineering as a profession create a culture that supports these values? Or am I cherry-picking my data points and creating something out of nothing?

I…don’t know. And if I go much deeper in commenting on an industry that I am only on the periphery of, I am in danger of being truly out over my skis. So maybe I’ll stick to making a more personal statement. When I read the best of this kind of writing I don’t just get smarter on a particular topic. I also:

…feel something. And feelings are what actually motivate behavior.
…build a personal connection with the writer. Think: scalable community-building.
…observe someone I respect modeling a behavior in public. Culture is the collective set of behaviors we consistently engage in.

I cannot help but reflect, then, on the discourse within our own industry. What values does our public discourse put on display?

I worry that the answer today is not what I would want. I worry that the data commentariat (including me!) is, as data people can be, overly obsessed with nuance and impressing ourselves with sophistication. That rather than human and vulnerable we are clever and a little distant. That we see too little public participation because we unintentionally create impostor syndrome.

Maybe that’s overly critical. I didn’t really come here to answer the question, but to ask it. Because I think it matters a lot. The values we live out in public spaces—online, at meetups, at conference talks—will, over the space of a decade, become the values that our profession is built upon.

I do think that, in our more interactive spaces, we’ve done a great job of modeling values like friendliness / helpfulness / supportiveness, which is fantastic. You don’t get a lot of “RTFM” in the dbt Community, and that’s very much by design. And there are certainly far more, and far more diverse, public voices in data than there were a decade ago. But it is my fondest hope that we are still in the early innings and that our most important voices have yet to be heard. Our culture is still very, very nascent.

Rachel: if you ever read this, thanks for the poems. They made me smile :)

—

This year’s MAD (Machine Learning, Artificial Intelligence, and Data) Landscape, by Matt Turck at FirstMark, just dropped. Every year it’s a good way to stay on top of the changes in the ecosystem. Probably more interesting than the graphic itself for me this year is Matt’s narration of the changes since 2021. In Part 2, he gets into the impacts of the capital markets on the ecosystem (definitely the best overview on this topic I’ve seen), and in Part 3 he specifically talks about the modern data stack.

I pretty much agree with Matt’s thoughts on the MDS (although it’s the first time I’ve heard it suggested that it’s “elitist!”…not sure there). The question of convergence during economic pressure is not new, the fact that observability and cataloging are categories that are sorting themselves out a bit… All fair.

The most interesting part for me was the section titled “Data mesh, products, contracts: dealing with organizational complexity.” This is where my brain has been at recently—you’ll see a lot more from me on this topic in the near future—and it was nice to see Matt recognize it as a top-level concern.

IMO we’ve been too focused on the complexity-from-multiple-tools problem and not focused enough on complexity-from-lots-of-people-participating. The former will be a transitory thing while the latter isn’t going anywhere, as the trajectory towards greater participation has barely even gotten started.

—

Great posts that I ran out of time to write about:

The future analytics developer experience, by Petr Janda. Strong agree with this perspective. I do have to mention: some of this is available in dbt Cloud today. Most/all of it is aligned with our long-term vision.
Sync Back to dbt, from Dani Kellogg @ Castor Data. A data catalog that lets you push changes back to the source code. Yes!!
David Jayatillake writes about the metrics layer space since the news of our acquisition of Transform. He poses some good questions at the end; we have a community AMA coming shortly where we’ll get into some answers. Follow #dbt-cloud-semantic-layer in dbt Slack for more info.
A recent deep dive into the open table standards battle. “Currently, Iceberg sits in the driver’s seat in terms of garnering that market momentum, Baer said, while Delta Table is sitting just behind. In many ways, the market has bifurcated into a Databricks vs. Snowflake war, and it’s clear which open table format those two vendors favor.”
Stephen Bailey encourages us all to be more opinionated. Even if we’re not totally sure we’re right—lay down that gauntlet.

The Analytics Engineering Roundup
