Steam-powered ML. Jeff Dean. AI Job Impacts. Model size. Data org maturity. [DSR #208]
Short issue this week for the holiday! Hope all of you Americans out there had a happy Thanksgiving!
- Tristan
❤️ Want to support this project? Forward this email to three friends!
Forwarded this from a friend? Sign up to the Data Science Roundup here.
This week's best data science articles
We're still in the steam-powered days of machine learning
I have no desire to add anything to this post other than to say that it's fantastic and you should absolutely read it. Vicki Boykis continues to put out excellent work.
It's not often Jeff Dean puts out new work. This paper is brand new, from a talk at a recent conference, and is jam-packed with interesting stuff if you care about the intersection of chip design and ML. I was particularly interested in the section "Machine-Learning-Specialized Hardware," which was the best overview of the differences between a classic microprocessor and a TPU that I've read.
In related news: Microsoft is now offering Graphcore processors in Azure.
Deep learning has a size problem
The sizes of some recently released language models are intense. This is problematic for two reasons:
First, it hinders democratization. If we believe in a world where millions of engineers are going to use deep learning to make every application and device better, we won't get there with massive models that take large amounts of time and money to train.
Second, it restricts scale. There are probably less than 100 million processors in every public and private cloud in the world. But there are already 3 billion mobile phones, 12 billion IoT devices, and 150 billion micro-controllers out there. In the long term, it's these small, low power devices that will consume the most deep learning, and massive models simply won't be an option.
This is the best post I've read on model efficiency. It goes deep in certain tactical areas but remains extremely accessible at all points.
Brookings: What Jobs are Affected by AI?
My feelings on this report by the Brookings Institution: ¯\_(ツ)_/¯
The main takeaway is that white-collar jobs are likely to be more affected than blue-collar jobs by the widespread deployment of AI, and they got there via a bunch of NLP work using a couple of different datasets. Here's the big problem with the analysis, though:
…the exposure measure employed here only suggests that in particular occupations some kind of impact can be expected, whether positive or negative.
The report is just saying that certain fields are more "AI-exposed" than others. For instance, software engineers are listed as being very highly AI-exposed. That seems quite obvious, given that software engineers literally…build AI systems. Other top areas listed also fall under "I could have just told you that without needing to do a bunch of NLP."
I include this link here because it has certainly made the rounds in the last couple of weeks. It's worth a scan just to be ready for the water cooler conversation.
The Three Levels of Data Analysis- A Framework for Assessing Data Organization Maturity
There are three tiers of data analysis: reporting, insights, and prediction.
dbt community member Emilie Schario has an excellent post up on the GitLab blog about the stages of organizational maturity. As an industry, we're still having a hard time getting reporting right.
Thanks to our sponsors!
dbt: Your Entire Analytics Engineering Workflow
Analytics engineering is the data transformation work that happens between loading data into your warehouse and analyzing it. dbt allows anyone comfortable with SQL to own that workflow.
getdbt.com
Stitch: Simple, Powerful ETL Built for Developers
Developers shouldn't have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.