One email per week, 5 links.

Do you want to keep up to date with the latest trends in machine learning, data science, and artificial intelligence?

Keeping up with all the blogs, podcasts, and articles is time-consuming, so why not let someone else curate the content for you?

With our weekly newsletter, you get 5 top stories hand-picked and delivered to your inbox every Monday, with topics ranging from neural networks, deep learning, Markov chains, and natural language processing to scientific papers and even the basics of statistics, data science, and data visualisation.

Escape the distractions of social media and own your focus. Check out the latest issue and subscribe!

AI Digest #138

This week's favorite

Which Machine Learning Classifiers are best for small datasets?

Although "big data" and "deep learning" are dominant, my own work at the Gates Foundation involves a lot of small (but expensive) datasets, where the number of rows (subjects, samples) is between 100 and 1000. For example, detailed measurements throughout a pregnancy and subsequent neonatal outcomes from pregnant women. A lot of my collaborative investigations involve fitting machine learning models to small datasets like these, and it's not clear what best practices are in this case.

NLP Datasets: 611 text datasets in 467 languages

Datasets is a lightweight Python library providing two main features: one-line data loaders for public datasets and efficient data pre-processing.

Why I’m lukewarm on graph neural networks

GNNs can provide wins over simpler embedding methods, but we’re at a point where other research directions matter more.

DALL·E: Creating Images from Text

We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language.

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

The Pile is an 825 GiB diverse, open-source language modelling dataset that consists of 22 smaller, high-quality datasets combined together.