How to Become a Data Scientist. Plus, DOTA, Subways, Python as Poetry & more! [DSR #98]
So much good stuff this week! Enjoy!
- Tristan
❤️ Want to support us? Forward this email to three friends!
Forwarded this from a friend? Sign up to the Data Science Roundup here.
Two Posts You Can't Miss
How to Become a Data Scientist
This is the best “how to become a data scientist” post I’ve ever read (and I have an entire section of my Pocket just dedicated to this micro-genre of posts!). There are three reasons why it’s so brilliant:
The author is a recruiter, not a data scientist, and is focused solely on hiring for data science positions. He/she talks to dozens of data scientists a day and the post reflects this broad view of the landscape.
The post features interviews with practitioners in a wide range of roles, giving an up-close-and-personal look at the breadth of roles available.
The post focuses on the motivations, not just the capabilities, of the best data scientists. This is so critically important. I love this quote from Dylan Hogg, Head of Data Science at The Search Party:
Regardless of education or experience, there’s something more fundamental, which is your nature of curiosity, determination and tenacity. There are so many times when you hit a problem: perhaps the algorithm isn’t performing in the way it needs to, or perhaps the technology is being a pain. Either way, you can study machine learning algorithms or software engineering best practice, but if you’re not really determined, you’re going to give up and not get through it.
If you or anyone you know is currently hoping to get into the field, this is a must-read.
medium.com
We’ve created a bot which beats the world’s top professionals at 1v1 matches of Dota 2 under standard tournament rules. The bot learned the game from scratch by self-play, and does not use imitation learning or tree search. This is a step towards building AI systems which accomplish well-defined goals in messy, complicated situations involving real humans.
I’ll admit that I’ve spent probably more hours than I should have playing Dota, so this really hits home for me. Whereas chess and Go are turn-based and therefore involve far more discrete choices, Dota is real-time: the entire game is one continuous stream of decisions. Real-time decision-making is obviously a critical step towards OpenAI’s goal of AGI.
OpenAI’s bot is now the best in the world in a 1v1 setting, but Dota’s competitive scene is mostly focused on 5v5 teams. The next step is training a group of bots to perform at that level. The results have the potential to be truly fascinating: will bot teams play similarly to human teams? My bet is no.
While OpenAI is working on Dota 2, DeepMind is working on Starcraft 2. It’s a good time to be a fan of real-time strategy games.
blog.openai.com
This Week's Top Posts
This is the best article I’ve read in a while on organizational behavior and data science. Very highly recommended.
Data science is best viewed as a form of company culture, rather than a set of technologies. However, many firms will try to create that company culture by acquiring data-science technology, rather than working on their culture.
blog.richardweiss.org
Code readability is a much bigger deal than many data scientists realize:
Python code is more like poetry than prose. Poets and Python programmers don’t wrap lines once they hit an arbitrary length; they wrap lines when they make sense for readability and beauty.
Especially important if you’re working on a team.
treyhunner.com
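A minimal sketch of the line-wrapping idea, not taken from the linked post: the `employees` data and the variable names are made up for illustration. Both versions build the same list. The second wraps where the structure changes rather than where the line gets long.

```python
# Hypothetical data, for illustration only.
employees = [
    {"name": "Ada", "team": "data", "active": True},
    {"name": "Grace", "team": "platform", "active": True},
    {"name": "Alan", "team": "data", "active": False},
]

# One long line: the output expression, the loop, and the filter blur together.
names_one_line = [e["name"].upper() for e in employees if e["active"] and e["team"] == "data"]

# Wrapped for readability: what is produced, what is looped over,
# and what is filtered each get their own line.
names_wrapped = [
    e["name"].upper()
    for e in employees
    if e["active"] and e["team"] == "data"
]

# The two forms are equivalent; only the layout differs.
assert names_one_line == names_wrapped
print(names_wrapped)
```

On a team, the wrapped form also tends to produce smaller diffs, since changing the filter touches only the filter line.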
10 Significant Visualization Developments: January to June 2017
The Oscars of data viz. This season’s winners are impressive.
www.visualisingdata.com
What New York Subway Stations Actually Look Like
Subway stations’ complex tunnel systems are a mystery even to most regular riders. Architect Candy Chan’s new X-ray maps demystify the paths in and around them.
A unique, very cool mapping concept.
www.citylab.com
Facebook: Transitioning Entirely to Neural Machine Translation
…we recently switched from using phrase-based machine translation models to neural networks to power all of our backend translation systems, which account for more than 2,000 translation directions and 4.5 billion translations each day.
State-of-the-art work on translation at scale. Worth a read even if you’re not directly working with NLP, just to stay up to date on what is now achievable.
Dots vs. polygons: How I Choose the Right Visualization
If you’re mapping geographical data, would you use a dot density, choropleth, hexbin, or heatmap chart? I hadn’t thought much about this particular set of viz choices before reading this article and learned a lot from it. The author is a designer at Mapbox and likely spends more time thinking about mapping visualizations than almost anyone in the world.
blog.mapbox.com
Data Science: Challenges and Directions
This paper proposes an updated answer to the question “What is data science?” that focuses on its interdisciplinary nature:
Data science is a new trans-disciplinary field that builds on and synthesizes a number of relevant disciplines and bodies of knowledge, including statistics, informatics, computing, communication, management, and sociology, to study data following “data science thinking”.
It’s a fascinating, but dense, read. The results of this research will trickle out elsewhere…
cacm.acm.org
Data viz of the week
Because sometimes the point is to have some fun :)
Thanks to our sponsors!
Fishtown Analytics: Analytics Consulting for Startups
At Fishtown Analytics, we work with venture-funded startups to implement Redshift, Snowflake, Mode Analytics, and Looker. Want advanced analytics without needing to hire an entire data team? Let’s chat.
fishtownanalytics.com
Stitch: Simple, Powerful ETL Built for Developers
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.