The AI 100. Behavioral Science @ Uber. Argparse. Minimally Sufficient Pandas. [DSR #173]
This week's best data science articles
The most promising 100 AI startups working across the artificial intelligence value chain, from hardware and data infrastructure to industrial applications.
I found this list valuable—it represents the frontier of current AI commercialization efforts within startups. Obviously there is plenty of AI work going on inside of big tech as well, but we end up hearing about that (and using it!) on a daily basis already.
My most interesting find from this list was Applitools. Very useful application of AI to software testing.
The article covers how Uber applies behavioral science research to a new product called Express Pool. I've pulled the two most interesting paragraphs inline here:
…we dove into the behavioral science literature to gather insights about people’s perceptions of time and waiting. We identified three concepts that are important in presenting wait time: idleness aversion, operational transparency, and the goal gradient effect. [Tristan’s note: explanations of each in the article itself!]
Given these insights, we recommended highlighting progress during wait times by explaining each granular step going on behind the scenes, like identifying other riders traveling the same way and finding a car for the trip. (…) The Express POOL team tested these ideas in an A/B experiment and observed an 11 percent reduction in the post-request cancellation rate.
Behavioral science is distinctly different from data science, but the two are quite complementary. It’s unusual and fascinating to see a company applying behavioral science so formally—this is well beyond the tool kit of a traditional product manager.
If you plan to be a software developer with Python, you’ll want to be able to use argparse for your scripting needs. If you’re a data scientist, you’ll likely find yourself needing to port your code from a Jupyter Notebook to a reproducible script.
You should always be refactoring useful code you’ve written in Jupyter into modular scripts that can be called and piped together from the command line. Argparse is a great tool in your kit as you do that.
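To make that concrete, here is a minimal sketch of what a notebook cell might look like once it has been moved into a script with an argparse interface. The script's purpose, the `summarize` function, and the `--stat` flag are all hypothetical and chosen only for illustration; they do not come from the linked article.

```python
import argparse


def build_parser():
    """Build the command-line interface for a hypothetical summary script."""
    parser = argparse.ArgumentParser(
        description="Summarize a list of numbers (illustrative example)."
    )
    # Positional argument: one or more numbers, converted to float by argparse.
    parser.add_argument("values", nargs="+", type=float, help="numbers to summarize")
    # Optional flag restricted to a fixed set of choices, with a default.
    parser.add_argument(
        "--stat",
        choices=["mean", "max", "min"],
        default="mean",
        help="statistic to compute (default: mean)",
    )
    return parser


def summarize(values, stat):
    """The kind of logic that might previously have lived in a notebook cell."""
    if stat == "mean":
        return sum(values) / len(values)
    if stat == "max":
        return max(values)
    return min(values)


def main(argv=None):
    # With argv=None, argparse reads sys.argv[1:]; passing a list makes testing easy.
    args = build_parser().parse_args(argv)
    result = summarize(args.values, args.stat)
    print(result)
    return result

# In a real script, call main() under an `if __name__ == "__main__":` guard.
```

Saved as, say, `summarize.py` with that guard added, this could be run as `python summarize.py 1 2 3 --stat max`. Keeping the parsing in `build_parser` and the logic in `summarize` means the function can still be imported back into a notebook without touching the command line.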
I love this. Straightforward, very useful.
In this article, I will offer an opinionated perspective on how to best use the Pandas library for data analysis. My objective is to argue that only a small subset of the library is sufficient to complete nearly all of the data analysis tasks that one will encounter. This minimally sufficient subset of the library will benefit both beginners and professionals using Pandas.
Speaking of machine learning, does linear regression really qualify as machine learning?
Yes, linear regression is lumped into the “machine learning” tool bag.
Awesome, I do that in Excel all the time. So can I call myself a machine learning practitioner too?
Sigh... technically, yes.
This amusing dialog between a current and a wannabe data scientist highlights some of the ridiculousness of the field today. Only slightly useful, but will definitely make you grin :)
Thanks to our sponsors!
At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.
915 Spring Garden St., Suite 500, Philadelphia, PA 19123