Data Science Roundup #69: Machine Learning @ Google, Focus on Recommenders & more!
Focus on: Google ML
43 rules that Google engineers have learned while implementing some of the most sophisticated and widely used machine learning models in the world, as written by a Google engineer. Here are a few examples I particularly liked:
Don’t be afraid to launch a product without machine learning.
Don’t overthink which objective you choose to directly optimize.
Plan to launch and iterate.
Save this and revisit often—it’s that good.
The more I read about deep learning, the more I feel like a teamster before the mass deployment of self-driving semis. Fortunately, at least one engineer at Google has sympathy for my burgeoning obsolescence and has built a 3-hour course for those of us with more curiosity than time.
Focus on: Recommenders
This post is awesome. The team at Retention Sciences has been optimizing their recommendation algorithm for two years and walks through their process and its improvements month by month. It’s fascinating to see their approach grow from a 14-line SQL statement into a sophisticated set of algorithms that output a tested and optimized algorithm for each of their 30 clients.
There are five basic styles of recommenders. In order to understand any of the five, you need to understand what’s going on inside the box. This article walks through each of the five in enough detail to give you a very solid mental model.
This post pairs very well with the Retention Sciences post above, as you can actually see the team at RS move along the path from one recommender type to the next.
Making the Rounds
The author, a PhD student focused on deep learning, points out some disappointing, although perhaps not surprising, flaws in the field’s research community:
…we’re under-appreciating the fact that we’re dealing with pure software. That sounds obvious, but it’s actually a big deal. Setting up tightly controlled experiments in fields like medicine or psychology is almost impossible and involves an extraordinary amount of work. With software it’s essentially free. It’s more unique than most of us realize. But we’re just not doing it.
The solution? Good old-fashioned engineering. Writing and sharing high-quality, documented code.
Someone filmed two bots talking and uploaded the live stream to Twitch, where it subsequently attracted 25k viewers. Most of the resulting discourse was bleep-bloop, but there were some highlights.
Bot 1: “I am human, a stressed out human but human nevertheless”
Bot 2: “I’m sorry Donna, but you are not human. There is nothing wrong with being yourself. You are an artificial entity and should be proud of that.”
You should watch a couple of minutes of this video. You won’t learn anything, you won’t become a better person, but it’ll be a fascinating experience nonetheless.
Data viz of the week
The yellow line is my guess. How do you measure up? (Interactive!)
Thanks to our sponsors!
Fishtown Analytics works with venture-funded startups to implement Redshift, BigQuery, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let’s chat.
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.