DataCamp. Training NNs. Top 10 DS Coding Mistakes. Constructing KPIs. [DSR #184]

Apr 28, 2019

Quick note: If you’re in New York, come hang out IRL on May 23 to talk about dbt, analytics engineering, and more. We just finished lining up the talks and they’re good—I’m very excited :D

Learn more and reserve your slot here. 🤝

This week's best data science articles

PSA: Julia Silge - Writing a Letter to DataCamp

As a member of the data community, you should read this. Reading this post made me upset, angry, and frustrated.

DataCamp’s board and Jonathan Cornelissen, its CEO and the offender, have since both responded to what has become a real crisis for the company. The CEO is now on an indefinite leave of absence. Hopefully that becomes permanent.

Julia’s post is remarkably constructive, all things considered:

I am an optimist and still hold out hope for the folks at DataCamp to demonstrate that I (and the broader community) should trust them.

I hope that, for the sake of the community, jettisoning Cornelissen begins a process that regains that trust.

juliasilge.com

A Recipe for Training Neural Networks

Wow. Andrej Karpathy doesn’t post often, but when he does it’s worth reading. This most recent post is golden. He starts by explaining why the one-line tutorials on model training don’t reflect how training networks actually happens in real life, then goes deep on how that process should go. Along the way, he drops wonderful little tidbits like:

Now that we understand our data can we reach for our super fancy Multi-scale ASPP FPN ResNet and begin training awesome models? For sure no. That is the road to suffering.

Incredible resource from one of the leading practitioners in AI today.

karpathy.github.io

Redacted

A brilliantly simple visualization to answer the question “How much of the recently-released Mueller report was redacted?” Click through for the code.

flowingdata.com

Top 10 Coding Mistakes Made by Data Scientists

If you’re a long-time reader, this won’t be brand new for you, but this list is both concise and spot-on. Maybe use it as a gentle nudge to that one coworker?

towardsdatascience.com

5 Advanced Features of Python and How to Use Them

This article expands on one point from the article above: writing for loops. Iteration performance is one of the most critical things to consider when programming with data. Writing a plain for loop that traverses 1B items is not a good idea.

This short-ish post digs into lambdas, maps, itertools, and generators. You’ve probably used each of these before but may not have them deeply ingrained in your programming workflow. Now is the time to change that.

towardsdatascience.com
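
For a feel of how lambdas, maps, itertools, and generators fit together, here is a minimal sketch. It is my own illustration, not code from the linked article, and the function name `squares` and the 1B limit are made up for the example. The point is that every stage is lazy, so nothing close to a billion items is ever computed or held in memory.

```python
from itertools import islice


def squares(limit):
    """Hypothetical generator: yields n*n lazily, one value at a time."""
    for n in range(limit):
        yield n * n


# map() with a lambda is lazy in Python 3: no values are computed yet,
# even though the underlying generator could produce a billion of them.
doubled = map(lambda x: 2 * x, squares(1_000_000_000))

# islice() pulls only the first five items through the whole pipeline.
first_five = list(islice(doubled, 5))
print(first_five)  # [0, 2, 8, 18, 32]
```

Swapping the generator for a list comprehension over the same range would try to materialize every value up front, which is exactly the kind of loop the blurb above warns about.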

KPI Principles

Today I’m going to outline a few key principles related to KPIs aimed specifically at analysts and data scientists. I’ll discuss: a) how KPIs should be structured within an organization, and b) how, and why Analytics, can best partner with stakeholders in determining the best KPIs for them.

Solid post on an important topic.

www.locallyoptimistic.com

How Your Data Scientists Become Bottlenecked

There are three things data scientists/analysts commonly lack, and the gaps hold them back: a) customer empathy, b) stakeholder empathy, and c) product knowledge.

My favorite recommendation in this post is that data scientists should actually use the product. Revolutionary! And surprisingly rare.

Short, useful.

www.gregkamradt.com

Thanks to our sponsors!

Fishtown Analytics: Analytics Consulting for Startups

At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.

www.fishtownanalytics.com

Stitch: Simple, Powerful ETL Built for Developers

Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.

www.stitchdata.com

By Tristan Handy

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.

915 Spring Garden St., Suite 500, Philadelphia, PA 19123
