100s of PBs @ Uber. 5 Stat Concepts. KPIs @ Airbnb. A Massive Border Flyover. My Little Ponies. [DSR #159]
This week's best data science articles
Uber's Big Data Platform: 100+ Petabytes with Minute Latency
Uber’s data platform has evolved significantly as the company went from processing 100s of GB to 10s of PB to 100s of PB. This in-depth post walks through each stage of that evolution and the tools and technology associated with it.
There are few companies operating at this level and I liked seeing the choices that Uber has made along the way. Impressive work.
The 5 Basic Statistics Concepts Data Scientists Need to Know
If you skip the intro (which is a bit of buzzword salad), the content is quite good, if basic. It might be relevant for you, or for someone you work with. 4800 claps on Medium in the past week says it’s worth it.
towardsdatascience.com

An interactive look at the barriers that divide the US and Mexico
What is along the nearly 2,000 miles of border that divides the U.S. from Mexico?
I’ve never seen anything like this—it’s a truly impressive piece of interactive content from the Washington Post. It’s worth a look purely as an experimental piece of data viz whether or not you’re personally invested in the topic.
www.washingtonpost.com
How Leaving Data Science Made Me a Better Data Scientist
This slide deck from a recent talk tells the story of Joel Grus’ path into, and then out of, data science. His takeaways are familiar ones: data scientists should focus on the readability, reproducibility, and test coverage of their code. If you’ve been reading the Roundup for long, you’ll know that this viewpoint resonates deeply with me.
If you read through the talk (which I did, and it’s worth it), make sure to read the slide notes.
12 weeks, 1 internship, 1 very good blog post. The post is an interesting overview of working in data @ Airbnb, but also a very insightful look at how Airbnb trains employees to think about and design KPIs. KPI design is covered in the later parts of the post, so make sure to get there; it’s a great treatment of the topic.
New My Little Ponies, Designed by Neural Network
The author used a neural network to create new My Little Pony names. The names, and the author’s commentary on them, are hilarious. I actually LOLed at several of them. Some of my favorites:
Sob Dancer
Tardy Pony
Princess Sweat
…ok that’s probably enough. If you think this is the dumbest thing I’ve ever linked to, well, to each their own I suppose.
Thanks to our sponsors!
Fishtown Analytics: Analytics Consulting for Startups
At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.
www.fishtownanalytics.com
Stitch: Simple, Powerful ETL Built for Developers
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.