Digests » 171

This week's favorite

Deep learning implementations

This is a collection of simple PyTorch implementations of neural networks and related algorithms. The implementations are documented with explanations, and the website renders them as side-by-side formatted notes. We believe they will help you understand these algorithms better.

Ensuring that new language-processing models don't backslide

The models behind machine learning (ML) services are continuously being updated, and the new models are usually more accurate than the old ones. But an overall improvement in accuracy can still be accompanied by regression — a loss of accuracy — in particular cases.
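To make the idea concrete, here is a minimal sketch of how such a regression can hide behind a better overall score. The labels and predictions are made-up toy data, and the helper names `accuracy` and `negative_flip_rate` are illustrative assumptions, not taken from the linked article.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def negative_flip_rate(old_preds, new_preds, labels):
    """Fraction of examples the old model got right and the new model gets wrong."""
    flips = sum(
        o == y and n != y
        for o, n, y in zip(old_preds, new_preds, labels)
    )
    return flips / len(labels)


# Toy data (hypothetical): ten binary-labelled examples.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
old_preds = [1, 0, 0, 1, 1, 0, 0, 1, 1, 1]
new_preds = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]

old_acc = accuracy(old_preds, labels)
new_acc = accuracy(new_preds, labels)
flip_rate = negative_flip_rate(old_preds, new_preds, labels)

print(f"old accuracy: {old_acc:.1f}")
print(f"new accuracy: {new_acc:.1f}")
print(f"negative flip rate: {flip_rate:.1f}")
```

In this toy data the new model is more accurate overall, yet it still gets wrong one example that the old model handled correctly. That is the kind of case-level regression the blurb describes.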

Using pre-trained language models for database queries on unstructured data

By organizing information, databases are an essential component of nearly every computer program and online service. But the rigid structure of conventional database systems also constrains how they can be used. These systems require preset schemas and can only answer queries with well-defined semantics written in SQL (Structured Query Language). Queries must be exact to return correct information. Moreover, the data must be stored in a form that complies with the schema, which makes it hard to take advantage of the abundance of available unstructured data.
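The rigidity described above can be seen in a small sketch using Python's built-in `sqlite3` module. The `people` table and its rows are hypothetical and exist only for illustration. The sketch shows only the limitation of a conventional database, not the approach in the linked article.

```python
import sqlite3

# In-memory database with a preset, fixed schema (hypothetical example table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL, city TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ada", "London"), ("Linus", "Helsinki")],
)

# An exact query returns the expected row.
exact = conn.execute(
    "SELECT name FROM people WHERE city = ?", ("London",)
).fetchall()

# A loosely phrased value matches nothing, even though a person would
# understand that it refers to the same city.
loose = conn.execute(
    "SELECT name FROM people WHERE city = ?", ("london, UK",)
).fetchall()

# Data that does not fit the schema (an extra free-text field) is rejected.
try:
    conn.execute(
        "INSERT INTO people VALUES (?, ?, ?)",
        ("Grace", "New York", "worked on early compilers"),
    )
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)

print(exact)
print(loose)
print(schema_error)
```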

The fundamentals of data warehouse and data lake

Data warehouses and data lakes have evolved over the last few years to become more specialized, yet also more siloed, in their respective landscapes. Each of these data management technologies has its own identity and is best suited to certain tasks and needs; however, both also struggle to provide some important capabilities.

AI-powered command-line photo search tool

rclip is a command-line photo search tool based on OpenAI's awesome CLIP neural network.