AI Digest
Digests » 120
sponsor
RudderStack: An Open Source Segment Alternative
An open source customer data platform built for developers, offering Segment API compatibility, multiple hosting options, fixed infrastructure-based pricing, and powerful real-time transformations.
this week's favorite
Domain-specific language model pretraining for biomedical natural language processing
In this blog post, we present our recent advances in pretraining neural language models for biomedical NLP. We question the prevailing assumption that pretraining on general-domain text is necessary and useful for specialized domains such as biomedicine.
Interpretable machine learning models
Straightforward implementations of interpretable ML models, plus demos of how to use various interpretability techniques. The code is optimized for readability.
Hopfield Networks is All You Need
This blog post explains the paper Hopfield Networks is All You Need and the corresponding new PyTorch Hopfield layer.
Software Engineering Tips and Best Practices for Data Science
If you’re into data science, you’re probably familiar with this workflow: you start a project by firing up a Jupyter notebook, then begin writing your Python code, running complex analyses, or even training a model. As the notebook file grows with all the functions, classes, plots, and logs, you find yourself with an enormous blob of monolithic code sitting in one place in front of you.
Visual Guide to Random Forests
Random Forests are a widely used Machine Learning technique for both regression and classification. In this video, we show you how decision trees can be ensembled to create powerful predictive models.