The Magic of the Analyst. Fictional (but all-too-real) Data Teams. Technology is Not the Hard Part. [#256]

Episode 2 of the Analytics Engineering Podcast dropped this week! In it, Julia and I talk to Venkat Venkataramani, CEO of Rockset, on building massive-scale data platforms at Facebook and the future of streaming.

At this early stage of the podcast, reviews matter a lot! If you're enjoying the show, I’d be forever grateful if you’d leave a review on your platform of choice.

Also! We just launched an apprenticeship program!! This actually came out of a recent issue of the roundup where I talked about Airbnb’s apprenticeship program and praised them for it, and a kind reader responded saying “why don’t you do this??” So…we are. Read the blog post for more, and please please apply and/or forward this to folks in your network. We’d love to use this as a way to pull folks from non-traditional backgrounds into the data and software engineering career paths.

-Tristan


Let’s start off this week’s issue with a picture! The below was shared on Locally Optimistic Slack this past week. I love the concept…this is a great way of visualizing the set of humans and how they interact with the jobs-to-be-done IMO. I have gripes around the edges, but I think the framework is more important than the details.

Always looking into the future, I think roles get more complicated as we introduce new capabilities in the modern data stack. Who is responsible for:

  • curating the data catalog?

  • facilitating the self-service experience? (see: below)

  • managing security/governance/compliance?

  • observability tooling? (!!)

  • …etc?

I don’t have answers, but I will say that I don’t think the industry is done with our role/task realignment. Which is great—more things to learn for more people, better pay, more career options, and of course, at the end, better data capabilities.

Relatedly, at what point do we stop thinking of these humans as “the data team” and just say that, like Excel, this stuff is a required skillset of all knowledge workers? My guess is that this will take another decade, but that things will continue to move in this direction.

Have thoughts? Jump into the thread in LO and hash it out.

Two 🔥 Benn Posts

I’ve run out of superlatives about Benn Stancil’s Substack so I’ll just tell you to sign up if you haven’t already. I want to talk about two recent posts today.

I believe that “Analytics is at a crossroads” will become one of the most important blog posts in pushing the conversation about our collective profession forwards. I am not kidding. It is that important.

The central point of this article is that technical skills are not what is fundamentally hard about the job. I really could not agree more with this, and I have observed it first-hand. I have, and more broadly dbt Labs has, trained a lot of people to be competent—or even great!—with a certain set of technical skills required to be an analytics engineer. We know how to train these skills and can do it in a fairly well-defined period of time (days or weeks depending on the learning modality).

But we don’t—and I don’t!—truly know how to train someone to be a great data analyst. Honestly I haven’t seen anyone do a good job of this. Universally, the truly great data analysts I know have a combination of:

  • an overabundance of curiosity on an often broad range of topics

  • a depth of experience that allows them to traverse non-linear thought patterns within their field at an instinctual level. Typically this means at least five years of experience within a specific domain, with no real shortcuts.

But that’s kind of all they have in common.

As someone who considers themselves a pretty competent data analyst in a certain set of domains, I can tell you exactly the feeling I have when I’m doing great analytics work. I visualize it as ballet…effortlessly dancing from thought to thought, propelled by curiosity, only limited by the speed with which I’m able to satisfy that next impulse of that curiosity. It is in this mode that I often write my worst code. And that’s ok! The engineering part of the job is, for the moment, kind of beside the point.

As the person who may have been the most responsible for the saying “analytics is a subfield of software engineering,” I want to move beyond it. I think that, in emphasizing the technical aspects of analytics, my goal in 2016 was to push the field in this direction, or at least to differentiate my own personal approach from what was commonplace at the time. It was an intentionally controversial statement to start an industry-wide conversation. I believed, and still believe, that certain technical skills are important to effectively pursuing a career in the field. But these technical skills are by no means the heart of the trade! And they can be taught in a small number of weeks.

Much to my surprise, the pendulum has swung really far towards emphasizing the technical parts of analytics during the past half-decade…perhaps too far. I agree with Benn that, in having internalized the importance of technical skills like coding (in whatever language) and git, the field is overall ready to develop another more nuanced layer of self-awareness. I’d love to see the industry agree on what exactly the magic is that goes on in the brains of experienced analysts and how to train, fairly compensate, and create career paths for those humans. We do these things very poorly today.

I’ll stop there for now. I could talk about this forever, and maybe that’s just a cue that Julia and I should have Benn as a guest on the Analytics Engineering podcast…! I’m also excited to share some thinking that our community team has been doing on this topic of late…more soon.

Benn’s had a very productive week, and I would be remiss to not also mention “Self-serve is a feeling.” Rather than prattling on, I’ll let him summarize it:

So often, as data teams, we chase the self-serve experience that we think we’re supposed to build. We should be more critical of that, and chase the self-serve experience that makes us and our customers feel most at home.

We’re early as an industry on figuring out what “self-service” means and how to do it.

Three mindset changes that helped me as an analyst

Speaking of that magical-ness of analytics done well…

To succeed on your own, you’ll need to drop the passive mindset you learned in school and adopt an active one instead. School taught you that learning came from sit-down-and-listen consumption. The philosophy: Go to class. Listen to the teacher. Repeat what they say on the test. That’s nonsense.

The truth is learning is a messy process. Wisdom is earned through action, and consumption is only the first step. You have to build too.

This short post is a useful window into how one does analytics well.

For SQL

This post hit an emotional register that almost nothing I read in the data space hits, and I loved it. It was in reaction to a recent post called “Against SQL” that made the rounds on Hacker News. Our pro-SQL author, Pedram Navid, shares his concerns openly:

My real fear is that [posts like this] discourage people from learning SQL and that they make those whose primary language is SQL feel inadequate.

Throughout the post, Pedram refuses to even engage with the core points brought against SQL as a language and rather argues that the language is not the point:

The people who do [this] work are not any less skilled. And the tools and languages they use are not any less useful because they don’t have features of other languages. A good analyst is worth their weight in gold. They’re ruthless in their precision and excellent communicators. They’re empathetic to the business and have endless curiosity. Data people are some of the smartest, kindest people I’ve met.

So, while there might be people out there Against SQL, know that there are many of us who are very much For SQL. And the world is better for it.

Thanks Pedram. Good shit.

There’s some complex identity stuff going on here… The software engineering community definitely dominates online conversation, but the data community is trying to carve out a space for itself and its own distinct identity. In this environment, a technical conversation can easily spill into being about much more.

Building a data team at a mid-stage startup: a short story

Julia and I had the pleasure of talking to Erik Bernhardsson on a soon-to-be-released episode of the podcast, just in the wake of what will be (IMO) potentially his most important post to date.

It’s a 22-minute read, and emulates books like The Goal and Five Dysfunctions of a Team (apparently called business fables) in that it helps readers understand something real by telling a fictionalized version of it. If you’re on a data team, you’ll feel your lived experience being played out in Erik’s writing. If you don’t currently work on a data team, you’ll develop real empathy for folks who do.


Just a quick end note before signing off. This issue feels like an important one. The conversations on topics that I care about are being had by more people, more substantively, than ever before. It really feels like a coming-together, an increasing of kinetic energy…the kind of joint refining of a set of ideas that can actually move us forwards as a collective.

I have felt for years that I was a bit alone in caring about the things that I’ve cared about (how to distill knowledge from data at scale, how to build teams that do this, etc). It’s so rewarding to learn from others who are now pushing the conversation forwards. We’re doing something good here.

That’s all for now, see you in two weeks.

Tristan