Data Science Roundup #86: AI in Medicine, Opportunity Sizing @ Etsy & Deep Learning Advice.

May 21, 2017

Unambiguously positive results in medicine and education. A great talk from Etsy, practical advice for deep learning projects, and more. Solid week. Enjoy 😊

- Tristan

❤️ Want to support us? Forward this email to three friends!

🚀 Forwarded this from a friend? Sign up to the Data Science Roundup here.

Two Posts You Can't Miss

Applying Artificial Intelligence in Medicine: Our Early Results

Two quotes:

“One year ago, we teamed up with UCSF Cardiology to start the mRhythm study, which 6,158 Cardiogram users enrolled in. Cardiogram trained a deep neural network on the Apple Watch’s heart rate readings and was able to obtain an AUC of 0.97, enabling us to detect atrial fibrillation with 98.04% sensitivity and 90.2% specificity.”

“The most promising finding of our study is that consumer-grade wearables can be used to detect disease.”

Hardware and algorithm improvements in wearables will be tightly coupled, and it only takes a couple of consumer-grade successes to get the flywheel started.
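If the two metrics in that first quote are rusty: sensitivity is the share of true cases the model catches, and specificity is the share of healthy cases it correctly clears. Here is a minimal sketch of the arithmetic. The counts are made up for illustration and are not the study's data, and the function name is my own.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: sick and flagged, fn: sick but missed,
    tn: healthy and cleared, fp: healthy but flagged.
    """
    sensitivity = tp / (tp + fn)  # fraction of actual positives detected
    specificity = tn / (tn + fp)  # fraction of actual negatives cleared
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the Cardiogram study.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

High sensitivity with lower specificity, as in the quote, means few missed cases at the cost of some false alarms.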

blog.cardiogr.am

Duolingo: How We Learn How You Learn

This is a fascinating use case. Duolingo took a piece of 100-year-old research on educational theory called the “forgetting curve” and turned it into an algorithm that now sits at the heart of their product. The intuition is simple: student retention occurs best when practice is spaced out over time. Not shocking to anyone who has ever crammed for a test.

What I find so interesting about this piece is that the solution wasn’t particularly hard. That is not at all to say that it wasn’t challenging, but it didn’t take a millennium of GPU time or fundamental breakthroughs in methods. It was a bunch of smart people thinking hard about how to apply existing techniques to a very specific, high-value domain. This will be increasingly common as data science moves further into the deployment phase.
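To make the forgetting-curve intuition concrete, here is a minimal sketch. It assumes the textbook exponential form, where recall probability halves every "half-life" days. The function name and the half-life values are hypothetical, and this is not necessarily the model Duolingo ships.

```python
def recall_probability(days_since_practice, half_life_days):
    """Exponential forgetting curve: recall halves every half_life_days."""
    return 2.0 ** (-days_since_practice / half_life_days)


# Compare a weakly learned word (short half-life) with a well-practiced one.
# Both half-life values are illustrative assumptions.
for half_life in (1.0, 4.0):
    curve = [round(recall_probability(d, half_life), 3) for d in (0, 1, 2, 4, 8)]
    print(f"half-life {half_life} days -> recall at days 0,1,2,4,8: {curve}")
```

Under this assumption, scheduling comes down to estimating each word's half-life and prompting a review before the predicted recall drops too far.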

making.duolingo.com

This Week's Top Posts

Etsy: An Intro to Opportunity Sizing

The author walks through Etsy’s process for opportunity sizing. If you’ve never spent much time thinking about opportunity sizing, this is a must-read. It’s quite likely that many of the projects you engage in just aren’t that valuable.

datadriven.club

Questions & Intuition for Tackling Deep Learning Problems

“In this article, I’d like to recount five key lessons that I’ve learned after one too many walks down dead alleyways.”

Very, very good. If you’re thinking about tackling a problem using deep learning, read this first.

blog.semantics3.com

The A16Z AI Playbook

This microsite is intended to help newcomers (both non-technical and technical) begin exploring what’s possible with AI. We’ve met with hundreds of Fortune 500 / Global 2000 companies, startups, and government policy makers asking: “How do I get started with artificial intelligence?” and “What can I do with AI in my own product or company?” This site is designed as a resource for anyone asking those questions, complete with examples and sample code to help you get started.

Great resource, maybe for you, definitely for people you work with.

aiplaybook.a16z.com

The Cost of Doing Data Science on Laptops

This article makes a great point: it’s just not a great idea to train a model on your laptop if you’re doing anything even moderately sophisticated. Using the cloud is just too easy and too cheap today.

While this is obviously content marketing for Domino (they’re selling a cloud-based data science product), that doesn’t make it any less true. Sure, use Domino…or just spend a couple of hours getting off the ground with AWS.

blog.dominodatalab.com

The Behance Artistic Media Dataset

This is a gold mine. Amazing images, strong classifier, really straightforward to access. I’m not sure whether this is more useful for making memes or as training data, but this data set is one-of-a-kind.

bam-dataset.org

A Deep Reinforced Model for Abstractive Summarization

Abstractive summarization, last covered in #49, remains only partially solved. Existing models don’t perform well enough to eliminate humans in critical use cases. MetaMind just released research that improves upon existing models’ performance, getting us one step closer.

If you’re new to abstractive summarization, this post is a great intro.

metamind.io

From Physics to Finance: My First Year in Industry

The author has two master’s, a doctorate, and a post-doc fellowship under his belt, but after making the switch he has this to say:

“According to Vitae, only 18% of the young researchers who made this transition would go back to academia. I am confident I am one of the 82%.”

Good read if you’re considering your options.

www.linkedin.com

Data Viz of the Week

"What parties support what policies?" Perfect answer.

Thanks to our sponsors!

Fishtown Analytics: Analytics Consulting for Startups

At Fishtown Analytics, we work with venture-funded startups to implement Redshift, Snowflake, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let’s chat.

fishtownanalytics.com

Stitch: Simple, Powerful ETL Built for Developers

Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.

www.stitchdata.com

End note:

I’m attending CogX London 2017, a conference that will explore the impact of AI across industry. I have a 50% discount on Early Bird tickets for Data Science Roundup readers. Just use the code 877f!5ht0wn50.

Let me know if you plan on being there!

By Tristan Handy

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.

