Digests » 66


Scikit-Learn vs MLR for Machine Learning

Scikit-Learn is known among Python users for its easily understandable API, while MLR became an alternative to the popular Caret package, with a larger suite of algorithms available and an easy way of tuning hyperparameters. The two packages are somewhat in competition because of an ongoing debate in which many people involved in analytics turn to Python for machine learning and to R for statistical analysis.

HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict

Runs through all sklearn models (both classification and regression) with all possible hyperparameters, and ranks them using cross-validation.
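To make the idea concrete, here is a minimal sketch of ranking several models by cross-validated score. It does not use HungaBunga's own API; it relies only on scikit-learn's `GridSearchCV`, and the helper name `rank_models`, the three candidate estimators, and their small parameter grids are assumptions chosen for illustration, not the full search the package performs.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def rank_models(X, y, candidates, cv=5):
    """Grid-search each (estimator, param_grid) pair, then rank by best mean CV score.

    `rank_models` is a hypothetical helper for this sketch, not part of any library.
    """
    results = []
    for estimator, param_grid in candidates:
        search = GridSearchCV(estimator, param_grid, cv=cv)
        search.fit(X, y)
        results.append(
            (type(estimator).__name__, search.best_score_, search.best_params_)
        )
    # Highest cross-validated score first.
    return sorted(results, key=lambda r: r[1], reverse=True)


X, y = load_iris(return_X_y=True)

# Illustrative candidates only; a true brute-force run would cover far more
# estimators and far larger grids.
candidates = [
    (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, None]}),
]

ranking = rank_models(X, y, candidates)
for name, score, params in ranking:
    print(f"{name}: {score:.3f} {params}")
```

Each extra estimator or grid value multiplies the number of fits, so an exhaustive search over every model and parameter combination can take a long time on anything but small datasets.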

Towards Data Science

Last week I published my 3rd post in TDS. Before the next post, I wanted to publish this quick one. I hope this post helps people who want to get into data science or who have just started learning it. In this post, I will share the resources and tools I use: basically all the apps and links that are part of my day-to-day activities. Everything listed below is something I have used or will be using in the future. If you want me to add any other info, please post it in the comment section and I will include it.

Getting Started with AmpliGraph

In this tutorial we’re going to use the Game of Thrones knowledge graph. Please note: this isn’t the greatest dataset for demonstrating the power of knowledge graph embeddings, but it is small, intuitive and should be familiar to most users.

What is Data Engineering? Can you swim in a data lake?

Before I started as a Data Engineer two years ago, I had no idea what the role entailed or how it differed from data science and data analytics. Job titles with the word "data" in them are known to be an enigmatic black box. That's true even for folks in technical roles. This post is what I would have wanted to read when I was trying to fit the pieces of the data pipeline together.