Machine Learning Study Goals – January 2023: SHAP paper & getting started with Scala

In this monthly post, I tell you what I plan to study or improve on in the area of machine learning (including an update at the end of the month).

Oh, and if you want to receive (bi)weekly updates on how I’m doing with these goals, consider subscribing to my newsletter 😉


Goal 1: Read the SHAP research paper

I’ve always been quite interested in HOW machine learning works. Explainable and interpretable ML is not quite the same as having a solid intuition and/or mathematical proof of why an algorithm works the way it does, but it still provides some transparency about the predictions our models make.
I experimented with SHAP values last month, and now I want to strengthen that understanding by reading the original research paper that introduced them:

A Unified Approach to Interpreting Model Predictions

The paper is freely available on arXiv.

This also ties into my tentative yearly goal of reading one paper per month to get over my fear of research papers and feeling too stupid for them.

Goal 2: Do 10 Scala katas

I recently started learning a new programming language – Scala. Just to be clear, I don’t think this is a hugely in-demand skill for an ML engineer or data scientist. However, I work with (Py)Spark, and while some ML algorithms in that framework have Python wrappers so you can use them from Python, the underlying code is all written in Scala. We ran into some issues where we needed algorithms that weren’t readily available, and there is more code for them on GitHub etc. in Scala than in Python.

Another reason is that I simply wanted to get familiar with another language. It’s fun and exposes me to different ways of thinking in code 🙂

Back to the 10 katas: katas are small coding challenges, similar to Leetcode problems. I use Codewars (free version), which provides katas in increasing difficulty from 8 kyu (easiest) to 1 kyu (very hard) in many languages, including Scala.

The next suggested Scala challenge currently for me on Codewars

I want to do 10 of these (since I’m writing this in the middle of January, I already did 7) to get familiar with the syntax and some basics like

  • sorting & filtering lists
  • building loops
  • basic math operations like power functions
  • converting between different types
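
To make those basics a bit more concrete, here is a minimal sketch of what each of them can look like in Scala. It is not taken from any actual kata: the object and function names (KataBasics, sortedEvens, sumTo, square, parseOrZero) are made up for illustration, and toIntOption assumes Scala 2.13 or newer.

```scala
object KataBasics {
  // Sorting & filtering a list: keep the even numbers, then sort ascending.
  def sortedEvens(xs: List[Int]): List[Int] =
    xs.filter(_ % 2 == 0).sorted

  // Building a loop: sum the numbers 1 to n with a for loop and a mutable accumulator.
  def sumTo(n: Int): Int = {
    var total = 0
    for (i <- 1 to n) total += i
    total
  }

  // Power function: math.pow takes Doubles and returns a Double.
  def square(x: Int): Double = math.pow(x.toDouble, 2)

  // Converting between types: String to Int, falling back to 0 if the text is not a number.
  def parseOrZero(s: String): Int = s.trim.toIntOption.getOrElse(0)

  def main(args: Array[String]): Unit = {
    println(sortedEvens(List(5, 2, 8, 3, 4))) // List(2, 4, 8)
    println(sumTo(10))                        // 55
    println(square(3).toInt)                  // 9
    println(parseOrZero("42") + 1)            // 43
  }
}
```

In idiomatic Scala, the loop in sumTo would more likely be written as (1 to n).sum, but the explicit for loop is closer to what I am practicing with the katas.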

Previous months’ study goals & results

November 2022: Anomaly Detection & Ensembles

October 2022: Time Series

September 2022: Transformers

August 2022: Random Forests

July 2022: Decision Trees
