New Deep Learning Architectures, GPU Databases, Histograms, & more! [DSR #99]
Two Posts You Can't Miss

An Intuitive Guide to Deep Network Architectures
I got pretty tired of reading "guides to deep learning" a while ago, but am always on the lookout for ones that bring something new to the table. This is the first rundown I've seen on the major advances in network architecture over the past couple of years. Very digestible, very interesting. Even if you're not going to put this to work tomorrow, highly recommended.
medium.com

Image Augmentation for Deep Learning using Keras and Histogram Equalization
To combat the high expense of collecting thousands of training images, image augmentation was developed to generate training data from an existing dataset. Image augmentation is the process of taking images that are already in a training dataset and manipulating them to create many altered versions of the same image. This provides more images to train on and can also expose our classifier to a wider variety of lighting and coloring situations, making it more robust.
This is the single best post I've seen on the topic of image pre-processing, an increasingly critical skill in a wide range of use cases. The writeup and code for histogram normalization (pictured above) were particularly cool.
Whether or not you work with image data today, this is a must-read.
medium.com
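
For readers who want a feel for the two ideas in that post before clicking through, here is a minimal sketch in plain Python. It is not the code from the linked article (which uses Keras); the function names `flip_horizontal` and `equalize_histogram` and the tiny 3x3 sample image are made up for illustration, and the image is assumed to be grayscale with integer pixel values from 0 to 255.

```python
from typing import List

Image = List[List[int]]  # assumed: grayscale image, pixel values in 0..255


def flip_horizontal(img: Image) -> Image:
    """Mirror each row: a simple augmentation that leaves the label unchanged."""
    return [row[::-1] for row in img]


def equalize_histogram(img: Image, levels: int = 256) -> Image:
    """Spread pixel intensities over the full range using the cumulative histogram."""
    # Count how many pixels fall on each intensity level.
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1

    # Running total of the counts (the cumulative distribution).
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in img]

    # Remap so the darkest occupied level goes to 0 and the brightest to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [round((c - cdf_min) * scale) if c >= cdf_min else 0 for c in cdf]
    return [[lookup[p] for p in row] for row in img]


# Hypothetical low-contrast image: every pixel sits between 100 and 104.
low_contrast = [
    [100, 101, 102],
    [101, 102, 103],
    [102, 103, 104],
]

# One original image becomes three training examples.
augmented = [
    low_contrast,
    flip_horizontal(low_contrast),
    equalize_histogram(low_contrast),
]
```

After equalization the same nine pixels span the whole 0 to 255 range instead of a five-level sliver, which is the "wider variety of lighting" effect the post describes. In practice you would reach for a library routine rather than hand-rolled loops.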
This Week's Top Posts
Histograms are a way to summarize a numeric variable. They use counts to aggregate similar values together and show you the overall distribution. However, they can be sensitive to parameter choices! We're going to take you step by step through the considerations with lots of data visualizations.
tinlizzie.org
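
The point about histograms being sensitive to parameter choices is easy to see with a few lines of plain Python. This is only a sketch, not anything from the linked piece: the helper `histogram_counts` and the ten sample values are invented for illustration, and the helper assumes every value lies within the stated range.

```python
from typing import List


def histogram_counts(values: List[float], bins: int, lo: float, hi: float) -> List[int]:
    """Count values into `bins` equal-width bins covering [lo, hi].

    Assumes every value satisfies lo <= value <= hi.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # A value sitting exactly on the top edge belongs to the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Made-up sample with two clusters, one near 2 and one near 8.
data = [1.8, 2.0, 2.1, 2.2, 2.4, 7.6, 7.9, 8.0, 8.1, 8.3]

# Same data, different bin counts, noticeably different pictures.
for bins in (1, 2, 5, 10):
    counts = histogram_counts(data, bins, lo=0.0, hi=10.0)
    print(f"{bins:>2} bins: {counts}")
```

With one bin the two clusters vanish entirely, with two bins they look like equal halves, and with five or ten bins a bin edge falls inside each cluster and splits it across neighboring bars. Nothing about the data changed, only the bin count.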
Hype or Not? Some Perspective on OpenAI's DotA 2 Bot
The OpenAI article I linked last week churned up quite a storm in the geek community, where overlap in interests between gaming and AI is high. Apparently several pros were able to beat the bot consistently within six hours of its release. Here's a lengthy thread on Hacker News about the topic.
OpenAI's accomplishment is still impressive, but its work in this type of real-time, collaborative, informationally-obscured environment is still very early.
www.wildml.com
Currently, the three primary cloud analytic database platforms (Redshift / Snowflake / BigQuery) use CPUs. Other data-intensive applications have made the switch to GPUs to take advantage of their superior parallel processing, but this change is only beginning in the world of analytic databases.
Several companies have begun to play in this space; my hope is that the tech gets incorporated into an offering from AWS or GCP. This represents a real opportunity to decrease query response times.
The Data Journalism Awards are the first international awards recognizing outstanding work in the field of data journalism worldwide.
Much of this work is really stunning. Especially worth a look is the WSJ piece on lyrical styles in Hamilton.
www.datajournalismawards.org
The Top 100 Medium Writers on AI
My aim with this research is to allow me to quickly find the most relevant and appreciated articles [on AI], so that I can improve my knowledge about the subject, without having to read 100 articles to find the 4–5 of them that are interesting…
This is a solid piece of data journalism, an interesting new open data set to play with, and a great index of content to peruse if you're just getting into the space. Chris Dixon's posts, in particular, are foundational.
Data viz of the week

It's always satisfying when you find the very best version of a thing. This entire site is the most informative presentation I've ever seen of global arms trade information. Click around; it's worth it.
Thanks to our sponsors!
Fishtown Analytics: Analytics Consulting for Startups
At Fishtown Analytics, we work with venture-funded startups to implement Redshift, Snowflake, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let's chat.
fishtownanalytics.com
Stitch: Simple, Powerful ETL Built for Developers
Developers shouldn't have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.