Digests » 97

this week's favorite

Document Classification: 7 pragmatic approaches for small datasets

In big organizations the datasets are large and training deep learning text classification models from scratch is a feasible solution but for the majority of real-life problems your dataset is small and if you want to build your machine learning model you need to be smart.

Time Series Anomaly Detection using LSTM Autoencoders with PyTorch in Python

In this tutorial, you’ll learn how to detect anomalies in Time Series data using an LSTM Autoencoder. You’re going to use real-world ECG data from a single patient with heart disease.

High-Resolution Daytime Translation Without Domain Labels

Modeling daytime changes in high resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows to apply image translation at high resolution.

Stanza - A Python NLP Library for Many Human Languages

Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism.

Neural Networks are Surprisingly Modular

The learned weights of a neural network are often considered devoid of scrutable internal structure. In order to attempt to discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images.