Digests » 65
this week's favorite
The early prediction of deterioration could have an important role in supporting healthcare professionals, as an estimated 11% of deaths in hospital follow a failure to promptly recognize and treat deteriorating patients1. To achieve this goal requires predictions of patient risk that are continuously updated and accurate, and delivered at an individual level with sufficient context and enough time to act.
Chris Albon's notes on using data science and artificial intelligence to fight for something that matters.
Generative Adversarial Networks, or GANs, are a new machine learning technique developed by Goodfellow et al. (2014). GANs are generally known as networks that generate new things like images, videos, text, music or nealry any other form of media. This is not the only application of GANs, however. GANs can be used for image reconstruction as well as you’ll see in this post where we’re building a watermark remover tool.
Natural language understanding (NLU) and language translation are key to a range of important applications, including identifying and removing harmful content at scale and connecting people across different languages worldwide. Although deep learning–based methods have accelerated progress in language processing in recent years, current systems are still limited when it comes to tasks for which large volumes of labeled training data are not readily available.
We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks. The HSIC bottleneck is an alternative to conventional backpropagation, that has a number of distinct advantages. The method facilitates parallel processing and requires significantly less operations. It does not suffer from exploding or vanishing gradients. It is biologically more plausible than backpropagation as there is no requirement for symmetric feedback. We find that the HSIC bottleneck provides a performance on the MNIST/FashionMNIST/CIFAR10 classification comparable to backpropagation with a cross-entropy target, even when the system is not encouraged to make the output resemble the classification labels. Appending a single layer trained with SGD (without backpropagation) results in state-of-the-art performance.