this week's favorite
Modern language models often require massive compute for pretraining, making them infeasible to train without access to tens or hundreds of GPUs or TPUs. In theory, multiple individuals could pool their resources, but in practice such distributed training methods have seen limited success because connection speeds over the Internet are far slower than those inside high-performance GPU supercomputers.
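To make the communication bottleneck concrete, here is a minimal sketch (not the linked work's actual method) of one communication-efficient idea, local SGD with periodic parameter averaging: each simulated worker takes many cheap local steps, and the slow over-the-Internet synchronization happens only once per round. The worker count, learning rate, and toy regression task are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data, split across hypothetical workers.
X = rng.normal(size=(1000, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1000)

NUM_WORKERS = 4    # assumption: number of volunteer peers
LOCAL_STEPS = 20   # cheap local updates between synchronizations
ROUNDS = 10        # expensive averaging rounds over the network
LR = 0.01

shards = np.array_split(np.arange(1000), NUM_WORKERS)
w = np.zeros(10)   # shared initial parameters

for _ in range(ROUNDS):
    local_ws = []
    for shard in shards:
        w_local = w.copy()
        for _ in range(LOCAL_STEPS):
            Xs, ys = X[shard], y[shard]
            grad = 2 * Xs.T @ (Xs @ w_local - ys) / len(shard)
            w_local -= LR * grad
        local_ws.append(w_local)
    # One slow network round-trip per round instead of per step.
    w = np.mean(local_ws, axis=0)

print("parameter error:", np.linalg.norm(w - true_w))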
Understand how AI creates new images using Generative Adversarial Networks in 2 minutes.
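For readers who want the core mechanism in code rather than prose, below is a toy sketch of the GAN idea the explainer covers: a generator learns to map noise to samples a discriminator cannot distinguish from real data. Real "images" here are just 1-D Gaussian samples, and all layer sizes and hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator maps 8-D noise to a 1-D "sample"; discriminator scores samples.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0   # "real" data: N(3, 0.5)
    fake = G(torch.randn(64, 8))

    # Discriminator: label real samples 1, generated samples 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator output 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

with torch.no_grad():
    samples = G(torch.randn(1000, 8))
print("generated mean/std:", samples.mean().item(), samples.std().item())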
A research team from Microsoft, Zhejiang University, Johns Hopkins University, Georgia Institute of Technology, and the University of Denver proposes Only-Train-Once (OTO), a one-shot DNN training-and-pruning framework that produces a slim architecture from a full, heavy model without fine-tuning, while maintaining high performance.
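As a hedged illustration of the general idea behind one-shot structured pruning (not OTO's actual algorithm, which relies on zero-invariant groups and a specialized optimizer), the sketch below deletes whole hidden units whose weights are near zero, yielding a genuinely smaller network with identical outputs and no fine-tuning step. The layer sizes and the hand-zeroed units are illustrative assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)
full = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))

# Pretend training drove some units' weights to zero (e.g. via a
# group-sparsity penalty); here we zero half of them by hand.
with torch.no_grad():
    full[0].weight[32:] = 0.0
    full[0].bias[32:] = 0.0

# Keep only hidden units whose incoming weights are non-negligible.
norms = full[0].weight.norm(dim=1)
keep = (norms > 1e-6).nonzero().squeeze(1)

slim = nn.Sequential(nn.Linear(16, len(keep)), nn.ReLU(),
                     nn.Linear(len(keep), 4))
with torch.no_grad():
    slim[0].weight.copy_(full[0].weight[keep])
    slim[0].bias.copy_(full[0].bias[keep])
    slim[2].weight.copy_(full[2].weight[:, keep])  # drop matching columns
    slim[2].bias.copy_(full[2].bias)

x = torch.randn(8, 16)
print(torch.allclose(full(x), slim(x), atol=1e-6))  # True: same outputs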
This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix.
In pursuit of the fundamentals of the natural world, scientists have made discoveries through both bottom-up and top-down approaches. Neuroscience is a great example of the former. The Spanish anatomist Santiago Ramón y Cajal discovered the neuron in the late 19th century. While scientists' understanding of these building blocks of the brain has grown tremendously over the past century, much about how the brain works as a whole remains an enigma. In contrast, fluid dynamics makes use of the continuum assumption, which treats a fluid as a continuous medium. The assumption ignores the fluid's atomic makeup yet makes accurate calculation simpler in many circumstances.