Digests » 92
this week's favorite
A new document in a word processor can be a magical thing, a blank page onto which thoughts and ideas are put forth as quickly as we can input text. We can select words and phrases to underline and highlight and add images, shapes, and bulleted lists, and when we need editorial help, we can run a grammar and spell checker. The experience can feel so seamless at times that perhaps we don’t give much thought to how it all works.
In this post we’ll demo how to train a “small” model (84 M parameters = 6 layers, 768 hidden size, 12 attention heads) – that’s the same number of layers & heads as DistilBERT – on Esperanto. We’ll then fine-tune the model on a downstream task of part-of-speech tagging.
This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed.
Computer Vision is often seen by software developers and others as a hard field to get into. In this article, we'll learn Computer Vision from basics using sample algorithms implemented within Microsoft Excel, using a series of one-liner Excel formulas. We'll use a surprise trick that helps us implement and visualize algorithms like Face Detection, Hough Transform, etc., within Excel, with no dependence on any script or a third-party plugin.
This post focuses on the past decade's most important developments and applications based on our work, mentions related work, and concludes with an outlook on the 2020s that also addresses privacy and data markets.