Deep Learning Framework Growth. Why Job Postings Suck. Fraud Detection. Warby's A/R. [DSR #181]

Apr 07, 2019

This week's best data science articles

Which Deep Learning Framework is Growing Fastest?

The author collected data and evaluated the most popular frameworks on a variety of indicators, including search interest, GitHub activity, and job postings. The image above shows the final “growth scores” the author arrived at.

Comprehensive data; the results are a useful proxy for where we stand today.

towardsdatascience.com

The Problem with Data Science Job Postings

This is my favorite post about the hiring process that I’ve read in some time. The overall advice: make sure you can solve the problem a company wants to solve, and worry less about the specific technologies a posting asks for. There are so many quotable lines; here’s my favorite:

If a company asks for more than 6 years of deep learning experience, then their posting was written by someone who has zero technical knowledge (AlexNet came out in 2012, so this basically narrows the field down to Geoff Hinton’s entourage).

towardsdatascience.com

Warby Parker’s Virtual Try-On Placement Algorithm

Using Apple’s augmented reality technology and TrueDepth camera, along with proprietary frame placement and fit system, Warby Parker developed a new virtual try-on tool for iPhone X series phones.

Pretty wild. Check out the animated gif at the end to see just how accurate the simulation is—very impressive.

medium.com

Implicit Generation and Generalization Methods for Energy-Based Models

We’ve made progress towards stable and scalable training of energy-based models (EBMs) resulting in better sample quality and generalization ability than existing models. Generation in EBMs spends more compute to continually refine its answers and doing so can generate samples competitive with GANs at low temperatures, while also having mode coverage guarantees of likelihood-based models. We hope these findings stimulate further research into this promising class of models.

Energy-based models were brand new to me prior to reading this post. This article kicked off a long rabbit hole of reading, so beware! Of course, Yann LeCun wrote the original paper back in 2006.
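If "spending more compute to refine its answers" sounds abstract, here is a toy sketch of one common way to sample from an energy-based model: noisy gradient descent on the energy, often called Langevin dynamics. This is my own illustration, not code from the post. The quadratic energy, the function names, and every parameter value are assumptions chosen to keep the example small; a real EBM would use a learned neural network as the energy.

```python
import math
import random


def energy_grad(x):
    """Gradient of the toy energy E(x) = 0.5 * x**2.

    Assumption: this stands in for the gradient of a learned energy network.
    """
    return x


def langevin_sample(x0, steps, step_size, temperature, rng):
    """Refine a sample by repeatedly stepping downhill on the energy plus noise.

    More steps means more compute spent refining the same sample.
    A temperature of 0 removes the noise and leaves plain gradient descent.
    """
    x = x0
    noise_scale = math.sqrt(2.0 * step_size * temperature)
    for _ in range(steps):
        x = x - step_size * energy_grad(x) + noise_scale * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)  # fixed seed so runs are repeatable

# Zero temperature: the sample settles at the energy minimum (x = 0).
cold = langevin_sample(x0=5.0, steps=200, step_size=0.1, temperature=0.0, rng=rng)

# Temperature 1: the sample keeps moving around the minimum instead of settling.
warm = langevin_sample(x0=5.0, steps=200, step_size=0.1, temperature=1.0, rng=rng)

print(cold, warm)
```

Lowering the temperature trades diversity for samples that sit closer to low-energy regions, which is roughly the knob the quoted summary refers to.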

openai.com

Where in the U.S. Are You Most Likely to Be Audited by the IRS?

In a baffling twist of logic, the intense IRS focus on Humphreys County is actually because so many of its taxpayers are poor. More than half of the county’s taxpayers claim the earned income tax credit, a program designed to help boost low-income workers out of poverty. As we reported last year, the IRS audits EITC recipients at higher rates than all but the richest Americans, a response to pressure from congressional Republicans to root out incorrect payments of the credit.

The study estimates that Humphreys, with a median annual household income of just $26,000, is audited at a rate 51 percent higher than Loudoun County, Virginia, which boasts a median income of $130,000, the highest in the country.

Apparently audits aren’t just an attempt to maximize tax dollars collected.

projects.propublica.org

Fraud detection with cost-sensitive machine learning

I quite enjoyed this. Accuracy (as measured by F1 score) is not the only measure of an algorithm's success, because not all errors have the same cost. This post walks through creating and optimizing different types of models to minimize actual costs incurred rather than to maximize accuracy.

It’s a long-ish post; if time is tight, just read the conclusion and the graphs at the end. Good takeaways.
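To make the idea concrete, here is a minimal sketch of picking a decision threshold by total cost instead of by error count. It is my own illustration, not the post's method. The labels, scores, and dollar costs are made up, and the function names are hypothetical.

```python
def total_cost(y_true, scores, threshold, fp_cost, fn_cost):
    """Sum the cost of every misclassification at a given threshold."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 0:
            cost += fp_cost  # legitimate transaction flagged as fraud
        elif not flagged and label == 1:
            cost += fn_cost  # fraud that slipped through
    return cost


def best_threshold(y_true, scores, fp_cost, fn_cost, candidates):
    """Return the candidate threshold with the lowest total cost (first one on ties)."""
    return min(
        candidates,
        key=lambda t: total_cost(y_true, scores, t, fp_cost, fn_cost),
    )


# Made-up data: 1 = fraud, 0 = legitimate, with model scores in [0, 1].
y_true = [0, 1, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.15, 0.20, 0.35, 0.45, 0.55, 0.70, 0.90]
candidates = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Treating every error as equally bad favors a high threshold.
equal_cost_threshold = best_threshold(y_true, scores, 1.0, 1.0, candidates)

# Assumed costs: a false alarm costs $5 to review, a missed fraud costs $200.
cost_aware_threshold = best_threshold(y_true, scores, 5.0, 200.0, candidates)
cost_at_choice = total_cost(y_true, scores, cost_aware_threshold, 5.0, 200.0)

print(equal_cost_threshold, cost_aware_threshold, cost_at_choice)
```

With these made-up numbers, the cost-aware threshold ends up much lower than the equal-cost one: a few extra false alarms are cheaper than a single missed fraud.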

towardsdatascience.com

Thanks to our sponsors!

Fishtown Analytics: Analytics Consulting for Startups

At Fishtown Analytics, we work with venture-funded startups to build analytics teams. Whether you’re looking to get analytics off the ground after your Series A or need support scaling, let’s chat.

www.fishtownanalytics.com

Stitch: Simple, Powerful ETL Built for Developers

Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.

www.stitchdata.com

By Tristan Handy

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.
