Unambiguously positive results in medicine and education. A great talk from Etsy, practical advice for deep learning projects, and more. Solid week. Enjoy 😊
Two Posts You Can't Miss
One year ago, we teamed up with UCSF Cardiology to start the mRhythm study, which 6,158 Cardiogram users enrolled in. Cardiogram trained a deep neural network on the Apple Watch’s heart rate readings and was able to obtain an AUC of 0.97, enabling us to detect atrial fibrillation with 98.04% sensitivity and 90.2% specificity.
The most promising finding of our study is that consumer-grade wearables can be used to detect disease.
Hardware and algorithm improvements in wearables will be tightly coupled, and it only takes a couple of consumer-grade successes to get the flywheel started.
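If the sensitivity and specificity figures above feel abstract, here is a minimal sketch of how those two rates are computed from a confusion matrix. The counts are made-up round numbers for illustration only, not data from the mRhythm study, and the function names are my own.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people who have the condition that the model flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the condition that the model clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, NOT study data: 100 people with the condition,
# 100 without.
tp, fn = 98, 2
tn, fp = 90, 10

print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```

High sensitivity means few missed cases; high specificity means few false alarms. A screening tool needs both, which is why the two numbers are reported together.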
This is a fascinating use case. Duolingo took a piece of 100-year-old research on educational theory called the “forgetting curve” and turned it into an algorithm that now sits at the heart of their product. The intuition is simple: student retention occurs best when practice is spaced out over time. Not shocking to anyone who has ever crammed for a test.
What I find so interesting about this piece is that the solution wasn’t particularly hard. That is not at all to say that it wasn’t challenging, but it didn’t take a millennium of GPU time or fundamental breakthroughs in methods. It was a bunch of smart people thinking hard about how to apply existing techniques to a very specific, high-value domain. This will be increasingly common as data science moves further into the deployment phase.
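To make the forgetting curve concrete, here is a rough sketch using the common exponential-decay form, where recall probability halves every "half-life". The half-life parameterization, the function names, and the numbers are my assumptions for illustration; this is not Duolingo's production model.

```python
import math


def recall_probability(days_since_practice: float, half_life_days: float) -> float:
    """Exponential forgetting curve: recall halves every half_life_days."""
    return 2.0 ** (-days_since_practice / half_life_days)


def days_until_recall_drops_to(half_life_days: float, target_recall: float) -> float:
    """Invert the curve: how long until recall decays to target_recall."""
    return -half_life_days * math.log2(target_recall)


# Hypothetical word with a 4-day half-life.
half_life = 4.0
for days in (0, 2, 4, 8):
    print(days, round(recall_probability(days, half_life), 3))

# Schedule the next review for when recall is expected to hit 50%.
print(days_until_recall_drops_to(half_life, 0.5))
```

The spacing intuition falls out of the second function: the better a student knows a word (longer half-life), the longer you can wait before the next practice session.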
This Week's Top Posts
The author walks through Etsy’s process for opportunity sizing. If you’ve never spent much time thinking about opportunity sizing, this is a must-read. It’s quite likely that many of the projects you engage in just aren’t that valuable.
In this article, I’d like to recount five key lessons that I’ve learned after one too many walks down dead alleyways.
Very, very good. If you’re thinking about tackling a problem using deep learning, read this first.
This microsite is intended to help newcomers (both non-technical and technical) begin exploring what’s possible with AI. We’ve met with hundreds of Fortune 500 / Global 2000 companies, startups, and government policy makers asking: “How do I get started with artificial intelligence?” and “What can I do with AI in my own product or company?” This site is designed as a resource for anyone asking those questions, complete with examples and sample code to help you get started.
Great resource, maybe for you, definitely for people you work with.
This article makes a great point: training a model on your laptop is a bad idea if you’re doing anything even moderately sophisticated. Using the cloud is too easy and too cheap today.
While this is obviously content marketing for Domino (they’re selling a cloud-based data science product), that doesn’t make it any less true. Sure, use Domino…or just spend a couple of hours getting off the ground with AWS.
This is a gold mine. Amazing images, strong classifier, really straightforward to access. I’m not sure whether this is more useful for making memes or as training data, but this data set is one-of-a-kind.
Abstractive summarization, last covered in #49, remains only partially solved. Existing models don’t perform well enough to eliminate humans in critical use cases. MetaMind just released research that improves upon existing models’ performance, getting us one step closer.
If you’re new to abstractive summarization, this post is a great intro.
The author has two master’s degrees, a doctorate, and a postdoctoral fellowship under his belt, but after leaving academia he has this to say:
According to Vitae, only 18% of the young researchers who made this transition would go back to academia. I am confident I am one of the 82%.
Good read if you’re considering your options.
Data Viz of the Week
"What parties support what policies?" Perfect answer.
Thanks to our sponsors!
At Fishtown Analytics, we work with venture-funded startups to implement Redshift, Snowflake, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let’s chat.
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
I’m attending CogX London 2017, a conference that will explore the impact of AI across industry. I have a 50% discount on Early Bird tickets for Data Science Roundup readers: just use the code 877f!5ht0wn50.
Let me know if you plan on being there!
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.