Digests » 123
This week's favorite
Since its release, GPT-3, OpenAI's massive language model, has been the topic of much discussion among developers, researchers, entrepreneurs, and journalists. Most of that discussion has focused on the capabilities of the AI-powered text generator: users have been publishing the results of interesting experiments that use the model to generate everything from articles to website code.
Speech-to-text has traditionally had high barriers to entry for a number of reasons: hard-to-collect data, costly annotation, high data and compute requirements, and reliance on obsolete, hard-to-use technologies.
KILT (Knowledge Intensive Language Tasks) is a new unified benchmark to help AI researchers build models that are better able to leverage real-world knowledge to accomplish a broad range of tasks.
In our recent paper, “Geometric Dataset Distances via Optimal Transport,” we propose the Optimal Transport Dataset Distance, or OTDD for short, an approach to defining and computing similarities, or distances, between classification datasets. The OTDD relies on optimal transport (OT), a flexible geometric method for comparing probability distributions, and can be used to compare any two datasets, regardless of whether their label sets are directly comparable.
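The core idea, comparing two datasets by comparing them as probability distributions under optimal transport, can be illustrated in one dimension, where the OT (Wasserstein-1) distance has a closed form via `scipy.stats.wasserstein_distance`. This is only a toy sketch of the OT building block, not the OTDD itself, which additionally lifts label information into the transport cost:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Two toy 1-D "datasets" drawn from Gaussians with different means.
a = rng.normal(0.0, 1.0, size=1000)
b = rng.normal(3.0, 1.0, size=1000)

# OT distance between a distribution and itself is zero...
d_same = wasserstein_distance(a, a)

# ...while shifting the mean by 3 yields a distance close to 3,
# since W1 between two equal-variance Gaussians is the mean gap.
d_diff = wasserstein_distance(a, b)

print(d_same, d_diff)
```

In higher dimensions (and with labels in the cost, as in OTDD) the distance no longer has a closed form and is computed with OT solvers, but the intuition is the same: the distance measures how much "work" it takes to morph one dataset's distribution into the other's.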
We show that, unlike pruning after training, accuracy is the same or higher when randomly shuffling which weights these methods prune within each layer or sampling new initial values. As such, the per-weight pruning decisions made by these methods can be replaced by a per-layer choice of the fraction of weights to prune. This property undermines the claimed justifications for these methods and suggests broader challenges with the underlying pruning heuristics, the desire to prune at initialization, or both.
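The shuffle ablation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's code: it builds a magnitude-pruning mask for one layer at initialization, then randomly permutes the mask within the layer, which preserves only the per-layer fraction of surviving weights while discarding every per-weight decision:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))  # one layer's weights at initialization

# Per-weight decision: prune the 80% smallest-magnitude weights.
frac = 0.8
k = int(frac * w.size)
thresh = np.sort(np.abs(w).ravel())[k]
mask = (np.abs(w) >= thresh).astype(float)

# The shuffle ablation: permute the mask within the layer. Only the
# per-layer sparsity survives; which weights are kept is now random.
shuffled = rng.permutation(mask.ravel()).reshape(mask.shape)

print(mask.sum(), shuffled.sum())  # identical surviving-weight counts
```

If accuracy is unchanged under this shuffle, the method's per-weight choices carried no information beyond the per-layer pruning fraction, which is exactly the paper's argument.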