DataCamp. Training NNs. Top 10 DS Coding Mistakes. Constructing KPIs. [DSR #184]
Quick note: If you’re in New York, come hang out IRL on May 23 to talk about dbt, analytics engineering, and more. We just finished lining up the talks and they’re good—I’m very excited :D
Learn more and reserve your slot here. 🤝
This week's best data science articles
As a member of the data community, you should read this. Reading this post made me upset, angry, and frustrated.
DataCamp’s board and Jonathan Cornelissen, its CEO and the offender, have since both responded to what has become a real crisis for the company. The CEO is now on an indefinite leave of absence. Hopefully that becomes permanent.
Julia’s post is remarkably constructive, all things considered:
I am an optimist and still hold out hope for the folks at DataCamp to demonstrate that I (and the broader community) should trust them.
I hope that, for the sake of the community, jettisoning Cornelissen begins a process that regains that trust.
Wow. Andrej Karpathy doesn’t post often, but when he does it’s worth reading. This most recent post is golden. He starts by explaining why the one-line tutorials on model training don’t reflect how training networks actually works in real life, then goes deep on how that process should go. Along the way, he drops wonderful little tidbits like:
Now that we understand our data can we reach for our super fancy Multi-scale ASPP FPN ResNet and begin training awesome models? For sure no. That is the road to suffering.
Incredible resource from one of the leading practitioners in AI today.
A brilliantly simple visualization to answer the question “How much of the recently-released Mueller report was redacted?” Click through for the code.
If you’re a long-time reader, this won’t be brand new for you, but this list is both concise and spot-on. Maybe use it as a gentle nudge to that one coworker?
This article expands on one point from the above article: writing for loops. Iteration performance is one of the most critical things to consider when programming with data. Writing a for loop that traverses 1B items is not a good idea.
This short-ish post digs into lambdas, maps, itertools, and generators. You’ve probably used each of these before but may not have them deeply engrained in your programming workflow. Now is the time to change that.
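If you want a feel for how these pieces fit together, here is a minimal sketch of my own (not taken from the linked post). The function names and the 1B-item range are made up for illustration. The point is that generators, `filter`, `map`, and `itertools.islice` are all lazy, so only the handful of items you actually ask for are ever computed.

```python
from itertools import islice


def squares(values):
    """Generator: yields one squared value at a time instead of building a list."""
    for v in values:
        yield v * v


def first_even_squares(n, limit=1_000_000_000):
    """Return the first n even squares drawn lazily from range(limit).

    `limit` is a hypothetical stand-in for a very large input. Because every
    stage is lazy, only as many items as needed are ever processed.
    """
    evens = filter(lambda x: x % 2 == 0, squares(range(limit)))
    return list(islice(evens, n))


# map is lazy as well: nothing runs until the result is consumed.
doubled = map(lambda x: x * 2, range(5))
doubled_list = list(doubled)

first_five = first_even_squares(5)
print(first_five)    # [0, 4, 16, 36, 64]
print(doubled_list)  # [0, 2, 4, 6, 8]
```

Swapping the generator for a list comprehension over the full range would try to materialize every item up front, which is exactly the pattern to avoid on large inputs.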
Today I’m going to outline a few key principles related to KPIs aimed specifically at analysts and data scientists. I’ll discuss: a) how KPIs should be structured within an organization, and b) how, and why Analytics, can best partner with stakeholders in determining the best KPIs for them.
Solid post on an important topic.
There are three areas data scientists/analysts commonly lack which hold them back: a) customer empathy, b) stakeholder empathy, and c) product knowledge.
My favorite recommendation in this post is that data scientists should actually use the product. Revolutionary! And surprisingly rare.
Thanks to our sponsors!
At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.