By now, this is turning into a monthly tradition on this blog: I share my goals for improving at Machine Learning with you, and at the end of the month I update you on how I did, to keep myself accountable.
This one looks quite ambitious, but I have some free brain capacity due to time off from work – at least that’s the idea… Let’s get into my lofty goals:
Table of Contents
- Goal 1: September’s Playground Kaggle Challenge (Time series prediction :o)
- Goal 2: First steps towards Transformers
- Goal 3: “Attention Is All You Need” Paper
- Goal 4: Read Storytelling with Data Chapter 2
- Goal 5: Read Hands-On Machine Learning Chapter 2
- Update: Medium success due to vacation and sickness :/
- What I learned about my goals: Time series modelling is challenging
Goal 1: September’s Playground Kaggle Challenge (Time series prediction :o)
If you’ve read my previous two goal posts (July & August), this will not be much of a surprise. I enjoy doing these simulated challenges a lot.
As always, my “deliverables” will be:
- submit two different predictions myself, no matter how bad they are
- read the code submissions of two other Kagglers and learn something from them
This month’s challenge is a time series prediction, which is a bit of a weak spot for me, so I’m happy (and nervous) to try my hand at it and improve.
Goal 2: First steps towards Transformers
Since I need to use some model for the time series challenge above, I might as well use neural networks (a secret passion of mine) and try my hand at a transformer architecture for the first time.
Transformers are everywhere lately – most visibly to me in image generation tasks like DALL-E, which generates an image purely from a textual prompt – and honestly, I feel kind of left out because I’ve never worked with NLP or transformers.
I found a tutorial on transformers for time series prediction, so the goal is to make this code work for the Kaggle challenge – wish me luck, please.
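Whatever model ends up being used, most time series tutorials (transformer ones included) first cast the series into a supervised format: a sliding window of past values as input, a future value as the target. A minimal sketch of that framing – the function name and window sizes are my own choices, not from the tutorial:

```python
import numpy as np

def make_windows(series, window, horizon=1):
    """Slice a 1-D series into (X, y) pairs: each row of X holds
    `window` past values, y holds the value `horizon` steps ahead."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i : i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)   # toy series: 0.0 .. 9.0
X, y = make_windows(series, window=3)
print(X.shape, y.shape)               # 7 windows of length 3, 7 targets
```

Each row of `X` is then one input sequence for the model – which for a transformer becomes the sequence it attends over.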
Goal 3: “Attention Is All You Need” Paper
Building on top of goal no. 2, I want to read the “Attention Is All You Need” research paper that introduced the transformer architecture back in 2017.
I will probably not understand half of it, but I’ll share my insights regardless.
I feel like this paper was the “Big Bang” of transformers, and I’m interested in seeing where it all began. I also hope that this first paper will be less complex than recent research and architectures in the area.
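If you want a preview before I share my insights: the paper’s central operation, scaled dot-product attention, fits in one line, where Q, K, and V are the query, key, and value matrices and d_k is the key dimension:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

Everything else in the architecture – multi-head attention, positional encodings, the encoder-decoder stacks – is built around this formula.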
Goal 4: Read Storytelling with Data Chapter 2
Since I have 3 weeks off in September and am away from my computer, I want to do some reading of books I started ages ago and never finished. First up is “Storytelling with Data”, which focuses on presenting the insights we generate as data scientists in a way that suits the respective audience.
I shared my thoughts on the first chapter here: How to present data in context – Chapter 1 Summary Storytelling with Data
Goal 5: Read Hands-On Machine Learning Chapter 2
More reading. This book is a great beginner book, but it does include some nice tricks with scikit-learn that I never used in university because we implemented a lot from scratch.
So this will probably be easy reading for me, but it will be useful in the long term.
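To give a taste of the kind of scikit-learn trick I mean: chaining preprocessing steps into a single object instead of applying them by hand. A small sketch under my own assumptions – the toy columns and values below are invented for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy frame: one numeric column with a gap, one categorical.
df = pd.DataFrame({
    "size": [1.0, 2.0, np.nan, 4.0],
    "color": ["red", "blue", "red", "blue"],
})

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill gaps with the median
        ("scale", StandardScaler()),                   # zero mean, unit variance
    ]), ["size"]),
    ("cat", OneHotEncoder(), ["color"]),               # one column per category
])

X = preprocess.fit_transform(df)
print(X.shape)  # 4 rows; 1 scaled numeric column + 2 one-hot columns
```

In university I would have written the imputation and scaling as separate loops; bundling them like this keeps train and test data preprocessed identically.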
Update: Medium success due to vacation and sickness :/
Not only was I out of the country for the planned two weeks, I also added a week of complete bedrest after getting sick on arrival back at home. I am literally still coughing today on October 1st. So that changed some things…
Goal 1 – Time Series: Fail. I barely opened the Kaggle challenge data before leaving, so this one was a definite fail.
Goal 2 – Transformer: Fail. I did implement one following a tutorial, but I didn’t get to the training stage, and frankly, I think this goal was too ambitious. Time series and transformers are both challenging even in their classical approaches; combining them makes it even more complicated to find answers to noob questions.
Goal 3 – Attention Is All You Need: Success! I did read the paper on vacation. As suspected, I only understood about 20% of it, but I found it interesting nonetheless. I haven’t shared my (limited) insights yet, but I might do that soon – it will be interesting to look back on in the future.
Goal 4 – Storytelling with Data Chapter 2: Success! I did read that chapter, even though it wasn’t the most captivating. Glad I made progress in the book though.
Goal 5 – Hands-On Machine Learning Chapter 2: Success! I did read that chapter, and though a lot of it was knowledge I already had, I did learn a handful of new things to consider when preprocessing data for a machine learning model.
What I learned about my goals: Time series modelling is challenging
I’m not mad at myself – much of this month was a gamble, and getting sick simply happens. I got to relax plenty on my vacation, soak up some sun, and reset my mindset by getting some distance.
It’s nice that I set these monthly goals, because it gives me a natural reset point: I can leave the old month behind with its unfinished goals and reconsider whether I want to carry anything over to the next month. In the case of transformers, I will likely leave them behind for now and get familiar with time series and maybe NLP as a whole before diving into a new architecture in detail.