Data Science Roundup #69: Machine Learning @ Google, Focus on Recommenders & more!

Focus on: Google ML

Google's Rules of Machine Learning

43 rules that Google engineers have learned while implementing some of the most sophisticated and widely used machine learning models in the world, as written by a Google engineer. Here are a few examples I particularly liked:

  • Don’t be afraid to launch a product without machine learning.

  • Don’t overthink which objective you choose to directly optimize.

  • Plan to launch and iterate.

Save this and revisit often—it’s that good.

martin.zinkevich.org

Learn TensorFlow and deep learning, without a Ph.D.

The more I read about deep learning, the more I feel like a teamster before the mass deployment of self-driving semis. Fortunately, at least one engineer at Google has sympathy for my burgeoning obsolescence and has built a 3-hour course for those of us with more curiosity than time.

cloud.google.com

Focus on: Recommenders

Scaling our Recommendation Engine: 15,000 to 130M Users in 24 Months

This post is awesome. The team at Retention Science has been optimizing their recommendation algorithm for two years and walks through their process and its improvements month by month. It’s fascinating to watch their approach grow from a 14-line SQL statement into a sophisticated set of algorithms that outputs a tested and optimized algorithm for each of their 30 clients.

www.retentionscience.com

5 Types of Recommenders

There are five basic styles of recommenders. To understand any of them, you need to understand what’s going on inside the box. This article walks through each of the five in enough detail to give you a solid mental model.

This post pairs very well with the Retention Science post above, as you can actually see the team at RS move along the path from one recommender type to the next.

www.datasciencecentral.com

Making the Rounds

Engineering is the Bottleneck in Deep Learning Research

The author, a PhD student focused on deep learning, points out some disappointing, although perhaps not surprising, flaws in its research community:

…we’re under-appreciating the fact that we’re dealing with pure software. That sounds obvious, but it’s actually a big deal. Setting up tightly controlled experiments in fields like medicine or psychology is almost impossible and involves an extraordinary amount of work. With software it’s essentially free. It’s more unique than most of us realize. But we’re just not doing it.

The solution? Good old-fashioned engineering: writing and sharing high-quality, documented code.

blog.dennybritz.com

See Bots Chat

Someone filmed two bots talking and uploaded the live stream to Twitch, where it subsequently attracted 25k viewers. Most of the resulting discourse was bleep-bloop, but there were some highlights.

Bot 1: “I am human, a stressed out human but human nevertheless”

Bot 2: “I’m sorry Donna, but you are not human. There is nothing wrong with being yourself. You are an artificial entity and should be proud of that.”

You should watch a couple of minutes of this video. You won’t learn anything, you won’t become a better person, but it’ll be a fascinating experience nonetheless.

www.twitch.tv

Data viz of the week

The yellow line is my guess. How do you measure up? (Interactive!)

Thanks to our sponsors!

Fishtown Analytics: Analytics Consulting for Startups

Fishtown Analytics works with venture-funded startups to implement Redshift, BigQuery, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let’s chat.

fishtownanalytics.com

Stitch: Simple, Powerful ETL Built for Developers

Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.

www.stitchdata.com

By Tristan Handy

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.

915 Spring Garden St., Suite 500, Philadelphia, PA 19123