Data Science is Different Now. DS in Startups. Job Applications. Podcasts. [DSR #174]


This week's best data science articles

Data Science is Different Now

Holy shit. This post is gospel. From the inimitable (and Philly-based!) Vicki Boykis, this extremely well-researched piece tells the story of an industry that is being super-saturated with junior-level talent with poor expectations, a lack of differentiated skills, and no clear way to get in the door. It shares many data points on how the job is minimally about model training and primarily about data cleansing and transport. It recommends prioritization of skills related to a) SQL, b) engineering, and c) the cloud.

But the scope is bigger than that. It positions the industry on a long-term continuum of hype-vs-maturity and is the first place I’ve read that suggests that the field is oversaturated from a human capital perspective. Most analyses of human capital in a field talk about headcount: “We need 10,000 machinists.” But in such a new and such a technical field, Vicki’s point is that while we may need some more data scientists, we need far fewer than the data scientist training ecosystem is now printing out. Instead, what we actually need is more data scientists with experience…and that can only happen with time.

I cannot say enough about how much I like this post. You should read it. You should send it to your friends. You should particularly send it to your friends who are thinking about getting into data science today.

Finally: being a Twitter native, Vicki quotes a lot of tweets in the post. Here is my favorite, quoting Hinton:




Hinton on ML research:
“We should be going for radically new ideas. We know a radically new idea in the long run is going to be much more influential than a tiny improvement. That’s the downside when we only have a few senior guys and a gazillion juniors.”

7:15 AM - 15 Dec 2018

Succeeding as a Data Scientist in Startups

This is probably the best post I’ve seen on this topic. Here’s my favorite bit:

A common trap I see is people who come out of Data Science programs joining these positions expecting to be using sexy things like Spark and applying RNNs to their work. But sadly, they want to live on top of a mountain of foundation work that needs to be done first, both from an engineering standpoint and from a cultural standpoint. The mismatch is brutal.

Being the first person specifically hired to handle data, it’s very unlikely any pieces of the pyramids are sturdy. It’s a multi-year, cross-functional, full company effort to get all the pieces in place. Nurturing all those pieces in parallel is a big part of the job.

“I’m a data scientist, it’s not my job to handle [infrastructure / reporting / etc.].” If you’re early at a company, regardless of your title, your job is to do whatever is needed to help the company grow using data. What most folks miss is that it’s actually an incredible opportunity to understand, and/or to have built, the core foundations that data science relies upon at your org. This path will frequently be a long one, but it can lead to your having an outsize impact on the company, and in turn on your career.


What No One Will Tell You About Data Science Job Applications

Great post. I don’t want to spoil it, just read it. Short and to the point. I know the title is a bit click-bait-y, but it actually delivers.


One of many GOV | DNA views of the data.


This year’s winner of the World Data Visualization Prize interactive category is impressive, to say the least. Spend a couple of minutes clicking around to get the full extent of what’s available.

So few datasets ever get this type of treatment. We’re still at the very beginning of building useful interfaces for data; most of the time we’re just throwing together a dashboard full of line graphs. Building interfaces like this one takes too much time and expertise. In the future this will be less true.


The Ultimate List of Data Science Podcasts

The ultimate list of data science podcasts! Over a dozen shows that discuss topics in big data, data analysis, statistics, machine learning, and artificial intelligence.

Several of these were new to me. If you’re an avid podcast listener like I am, this list will likely give you some new fodder for your feed.


Thanks to our sponsors!

Fishtown Analytics: Analytics Consulting for Startups

At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.


Stitch: Simple, Powerful ETL Built for Developers

Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.


By Tristan Handy

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.


915 Spring Garden St., Suite 500, Philadelphia, PA 19123