Digests » 72



this week's favorite

Deep Learning Demystified: Loss Functions Explained

In any deep learning project, configuring the loss function is one of the most important steps in ensuring the model behaves as intended. The loss function gives you a lot of practical flexibility: it defines exactly how the network's output is compared against the training targets, and therefore what the network actually learns to optimise.
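To make the idea concrete, here is a minimal pure-Python sketch (not tied to any particular framework) of two common losses. The same predictions score very differently under each objective, which is why the choice of loss shapes what the network learns. The function names and example values are illustrative assumptions.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: the usual choice for regression outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy: the usual choice for sigmoid (probability) outputs.
    eps guards against log(0)."""
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for t, p in zip(y_true, y_pred)
    ) / len(y_true)

y_true = [1.0, 0.0, 1.0]
y_pred = [0.9, 0.2, 0.8]
print(round(mse(y_true, y_pred), 4))                   # → 0.03
print(round(binary_cross_entropy(y_true, y_pred), 4))  # → 0.1839
```

Swapping one loss for the other changes the gradients flowing back into the network, even though the predictions themselves are unchanged.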

Africa Is Building an A.I. Industry That Doesn’t Look Like Silicon Valley

Over the last three years, academics and industry researchers from around the African continent have begun sketching the future of their own A.I. industry at a conference called Deep Learning Indaba. The conference brings together hundreds of researchers from more than 40 African countries to present their work and discuss everything from natural language processing to A.I. ethics.

Forecasting Models for Tidy Time Series

The R package fable provides a collection of commonly used univariate and multivariate time series forecasting models including exponential smoothing via state space models and automatic ARIMA modelling. These models work within the fable framework, which provides the tools to evaluate, visualise, and combine models in a workflow consistent with the tidyverse.
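fable itself is an R package, but the simplest of the models it implements, simple exponential smoothing, is easy to sketch in a few lines of Python (this is an illustration of the method, not fable's API; the function name and data are assumptions):

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecast via simple exponential smoothing:
    level_t = alpha * y_t + (1 - alpha) * level_{t-1}.
    alpha in (0, 1] controls how quickly old observations are discounted."""
    level = series[0]          # initialise the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level               # flat forecast for all future horizons

demand = [10.0, 12.0, 11.0, 13.0, 12.0]
print(simple_exponential_smoothing(demand, alpha=0.5))  # → 12.0
```

fable's versions add state-space formulations, automatic parameter selection, and prediction intervals on top of this basic recursion, all within the tidyverse workflow.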

Using machine learning to predict what file you need next

At Dropbox, we are building smart features that use machine intelligence to help reduce people’s busywork. Since introducing content suggestions, which we described in our previous blog post, we have been improving the underlying infrastructure and machine learning algorithms that power content suggestions.
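The post does not spell out the model internals, but a common baseline for "which file next" problems is a frecency-style heuristic: rank files by exponentially decayed access counts. The sketch below is an assumption for illustration, not Dropbox's actual algorithm; all names and numbers are made up.

```python
import math

def frecency_rank(events, now, half_life=3600.0):
    """Rank file paths by exponentially decayed access counts.
    events: list of (path, unix_timestamp); half_life in seconds.
    A recent open counts nearly 1; an open one half-life ago counts 0.5."""
    scores = {}
    decay = math.log(2) / half_life
    for path, ts in events:
        scores[path] = scores.get(path, 0.0) + math.exp(-(now - ts) * decay)
    return sorted(scores, key=scores.get, reverse=True)

now = 10_000.0
events = [
    ("report.docx", 9_900.0),   # opened very recently
    ("slides.pptx", 5_000.0),   # opened long ago
    ("report.docx", 8_000.0),   # opened again earlier
]
print(frecency_rank(events, now))  # → ['report.docx', 'slides.pptx']
```

A production system would fold in many more signals (sharing activity, calendar context, file type) and learn the weights, but a decayed-count baseline like this is a common starting point to beat.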

Exascale Deep Learning for Scientific Inverse Problems

We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer.
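One ingredient of overlapping computation with communication is grouping gradient tensors into fixed-size buckets, so each bucket's allreduce can start as soon as its gradients are ready rather than waiting for the whole backward pass. The toy sketch below shows greedy bucketing only; it is an assumption for illustration, not the paper's actual graph-aware implementation.

```python
def group_gradients(tensor_sizes, bucket_bytes):
    """Greedily pack gradient tensors (byte sizes, in the order backprop
    produces them) into buckets of at most bucket_bytes each.
    Returns lists of tensor indices; each list is one allreduce call."""
    buckets, current, current_bytes = [], [], 0
    for i, size in enumerate(tensor_sizes):
        if current and current_bytes + size > bucket_bytes:
            buckets.append(current)        # bucket full: flush it
            current, current_bytes = [], 0
        current.append(i)
        current_bytes += size
    if current:
        buckets.append(current)            # flush the final partial bucket
    return buckets

# Six gradient tensors, bucketed at 4 MB so communication overlaps compute.
sizes = [3_000_000, 500_000, 2_000_000, 2_500_000, 1_000_000, 4_000_000]
print(group_gradients(sizes, bucket_bytes=4_000_000))
# → [[0, 1], [2], [3, 4], [5]]
```

Bucket size is a latency/bandwidth trade-off: tiny buckets launch many small allreduces, while huge buckets delay communication until late in the backward pass.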