Digests » 150

This week's favorite

Finetune GPT2-XL and GPT-NEO on a single GPU

I needed to finetune the GPT2 1.5 billion parameter model for a project, but the model didn't fit on my GPU. So I figured out how to run it with DeepSpeed and gradient checkpointing, which together reduce the required GPU memory. Now it fits on just one GPU.
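
For a rough idea of what such a setup involves, here is a minimal sketch of a DeepSpeed configuration built as a Python dictionary. The keys are standard DeepSpeed config options, but the helper name `build_deepspeed_config` and every value (batch size, accumulation steps, ZeRO stage, CPU offload) are assumptions chosen for illustration, not the settings used in the linked project. Gradient checkpointing is enabled on the model rather than in this file; with Hugging Face Transformers that is typically done with `model.gradient_checkpointing_enable()`.

```python
import json


def build_deepspeed_config(micro_batch_size=1, grad_accum_steps=8):
    """Return an illustrative DeepSpeed config (hypothetical helper and values)."""
    return {
        # Small per-GPU batch keeps activation memory low; accumulation
        # restores a larger effective batch size.
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        # Half precision shrinks weights and activations held on the GPU.
        "fp16": {"enabled": True},
        # ZeRO stage 2 with the optimizer state moved to CPU memory
        # (an assumed choice, not taken from the linked project).
        "zero_optimization": {
            "stage": 2,
            "offload_optimizer": {"device": "cpu"},
        },
    }


config = build_deepspeed_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and passed to a DeepSpeed-enabled training script; the right values depend on the GPU and the model.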

Google AI introduces a new system for open-domain long-form question answering

Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP) that involves retrieving documents relevant to a given query and using them to generate a detailed paragraph-length answer.

Pervasive label errors in ML datasets destabilize benchmarks

We identify label errors in the test sets of 10 of the most commonly-used computer vision, natural language, and audio datasets, and subsequently study the potential for these label errors to affect benchmark results.

Redefining what a map can be with new information and AI

Sixteen years ago, many of us held a printout of directions in one hand and the steering wheel in the other to get around, without information about the traffic along your route or details about when your favorite restaurant was open. Since then, we’ve been pushing the boundaries of what a map can do, propelled by the latest machine learning.

An approximation scheme for reflected stochastic differential equations

We apply our result to derive some geometric properties of coupled reflected Brownian motion, especially those properties which have been used in the recent work on the “hot spots” conjecture for special domains.