
Makemore4 Becoming a Backprop Ninja

A look at episode #5: The spelled-out intro to language modeling: Building makemore Part 4: Becoming a Backprop Ninja from Andrej Karpathy’s amazing tutorial series. We go back to the previous N-gram character-level MLP model from session #4 and dive into a hands-on manual backpropagation session. Computing Gradients by Hand: this lesson is a bit of a different format. It’s a lot more exercise-centric and less of a type-along lecture.
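
For a flavor of the exercises, here is a minimal sketch (my own toy example, not the lecture’s code) of checking a hand-derived gradient against PyTorch’s autograd:

```python
# Toy example: derive a gradient by hand and compare it with autograd.
import torch

x = torch.randn(4, requires_grad=True)
w = torch.randn(4, requires_grad=True)

y = (x * w).sum()   # forward pass
y.backward()        # autograd fills x.grad and w.grad

# Manual backprop for y = sum(x * w): dy/dx = w, dy/dw = x
dx_manual = w.detach()
dw_manual = x.detach()

print(torch.allclose(x.grad, dx_manual))  # True
print(torch.allclose(w.grad, dw_manual))  # True
```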

Learning With Others: Joining Recurse Center

Today I’m excited to join a community of learners. For the next 12 weeks I’m going to participate in the Fall-2 2023 Recurse Center batch. What is it? If this blog is “learning in public”, Recurse Center is “learning with others”. RC is a code retreat for self-directed, passionate programmers. It’s a place to work on your own projects and learn new things, surrounded by other people doing the same.

Makemore3 Internals of MLP and Visualization

A look at episode #4: The spelled-out intro to language modeling: Building makemore Part 3: Activations & Gradients, BatchNorm from Andrej Karpathy’s amazing tutorial series. It re-uses the N-gram character-level MLP from session #3 and discusses three kinds of incremental improvements to training. Initial weights: while the model was training even with totally random weights, this episode gives an intuition of why normally distributed values lead to faster training. Assigning clever weights at initialization time improves the loss of the first batches from 27 to 3.
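
As a rough illustration of the initialization point (toy shapes, not the lecture’s network), scaling down the output-layer weights brings the first-batch loss close to the uniform-guess value of -ln(1/27) ≈ 3.3:

```python
# Toy illustration: overconfident logits from naive init vs. scaled-down init.
import torch
import torch.nn.functional as F

vocab = 27
h = torch.randn(32, 200)                  # fake hidden activations for one batch
y = torch.randint(0, vocab, (32,))        # fake targets

W_big   = torch.randn(200, vocab)         # naive init -> huge, overconfident logits
W_small = torch.randn(200, vocab) * 0.01  # scaled init -> near-uniform logits

print(F.cross_entropy(h @ W_big, y))      # large loss, well above 10
print(F.cross_entropy(h @ W_small, y))    # ≈ 3.3, i.e. -log(1/27)
```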

Makemore2 Implement a MLP N-gram Character Level Language Model

A look at episode #3: The spelled-out intro to language modeling: building makemore Part 2: MLP from Andrej Karpathy’s amazing tutorial series. It picks up where the previous makemore video ended, going from a bigram character-level Language Model to an N-gram MLP character-level Language Model. Meaning: “given the last N characters’ embeddings, guess the next character”. It’s still trained on a list of names to produce new unique name-sounding words.
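
For context, here is a rough sketch of the forward pass of such an N-gram MLP (the sizes are placeholders, not the lecture’s exact ones):

```python
# Sketch: embed the last N characters, concatenate, and predict the next one.
import torch
import torch.nn.functional as F

vocab, block_size, emb_dim, hidden = 27, 3, 10, 200

C  = torch.randn(vocab, emb_dim)               # character embedding table
W1 = torch.randn(block_size * emb_dim, hidden)
b1 = torch.randn(hidden)
W2 = torch.randn(hidden, vocab)
b2 = torch.randn(vocab)

X = torch.randint(0, vocab, (32, block_size))  # a fake batch of contexts
emb = C[X].view(32, -1)                        # (32, block_size * emb_dim)
hact = torch.tanh(emb @ W1 + b1)               # hidden layer
logits = hact @ W2 + b2                        # next-character scores
probs = F.softmax(logits, dim=1)
```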

Makemore Implement a Bigram Character-level Language Model

Let’s look at episode #2: The spelled-out intro to language modeling: building makemore from Andrej Karpathy’s amazing tutorial series. It covers an intro to Language Models using a very barebones, from-scratch approach with a Bigram Character-level Language Model. It means: “given a single character, guess the next character”. For this session the NN is trained on a list of names to produce new unique name-sounding words. The lecture goes from calculating the probabilities of each letter by hand, to automatically generating the probabilities as the set of weights of a very simple one-layer NN that produces the exact same results.
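
A tiny sketch of the counting half of the lecture (using a placeholder word list instead of the real names dataset):

```python
# Count character bigrams, normalize rows into probabilities, and sample.
import torch

words = ["emma", "olivia", "ava"]                   # placeholder names
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                                       # start/end token
itos = {i: c for c, i in stoi.items()}

N = torch.zeros(len(stoi), len(stoi))
for w in words:
    cs = ["."] + list(w) + ["."]
    for a, b in zip(cs, cs[1:]):
        N[stoi[a], stoi[b]] += 1

P = N / N.sum(dim=1, keepdim=True)                  # row-wise probabilities
ix = torch.multinomial(P[0], num_samples=1).item()  # sample a first character
print(itos[ix])
```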

Google Introduction to Generative AI Learning Path

A look at Google’s Introduction to Generative AI Learning Path https://www.cloudskillsboost.google/journeys/118. “Learning Path” is a bit of a misnomer, as the site does not involve much learning. The course is split into 5 chapters consisting of short videos and quizzes. Chapter 1: Introduction to Generative AI covers a bit of semantics: AI (theoretical field) vs ML (practical methods), supervised vs unsupervised, discriminative vs generative models. Chapter 2: Introduction to Large Language Models brushes up on fine-tuning, PETM (Parameter-Efficient Tuning Methods), and the Google ecosystem.

Emacs Copilot Jupyter

Setup Emacs (for Windows): I needed to set up Emacs on Windows for the first time. Here are my steps: download it from GNU http://ftp.gnu.org/gnu/emacs/windows/emacs-29/emacs-29.1.zip, extract it to C:\emacs-29.1, then navigate to C:\emacs-29.1\bin\runemacs.exe > right-click > Show more options > Pin to taskbar. This gives us a basic working Emacs, but for some reason the font seemed super choppy so I went and downloaded Menlo for a bit more of an OS X feel.

Starting a Grimoire

In the dimly lit chamber of an ancient, forgotten castle, a hooded figure, draped in a robe adorned with enigmatic symbols, stood within a pentagram formed by tall, flickering candles that cast eerie shadows upon the aged, mystical tapestries that lined the walls. With a chalice filled with shimmering liquid in one hand and a gleaming obsidian dagger in the other, the figure chanted in an arcane tongue, invoking the dormant powers of the castle’s very stones.

Huggingface NLP Course Chapter 7 to 9 and Retrospective

Wrapping up 🤗 Hugging Face: NLP Course. Chapter 7: Main NLP Tasks is another repository of copy/pastable snippets, though a lot of the topics covered already appeared in previous chapters (e.g. Token Classification, Question Answering). Chapter 8: How to Ask for Help: if you’re coming from a Software Engineering background you can safely skip this chapter entirely. It covers reading stacktraces, how to post questions, open GitHub issues, and a very basic use of pdb.

Huggingface NLP Course Chapter 6

Continuing with Chapter 6: The 🤗 Tokenizers Library. Theory: this is a good, dense chapter covering the theory behind tokenizers. It covers their architecture (the tokenization pipeline) and the tradeoffs happening during the normalization phase, followed by a tour of the 3 most popular subword tokenization algorithms: BPE (aka GPT-2), WordPiece (aka BERT), and Unigram (aka T5). I highly recommend going over the videos to get a good feel for the implementations. Practical Coding: the chapter also goes over how the question answering pipeline manages contexts that are bigger than the allowed number of tokens.
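
As a quick illustration of how those three algorithm families split text differently (the model names below are just common examples of each family, not ones the chapter prescribes):

```python
# Compare how BPE, WordPiece and Unigram tokenizers split the same sentence.
from transformers import AutoTokenizer

text = "Tokenization is fun"
for name in ["gpt2", "bert-base-uncased", "t5-small"]:  # BPE / WordPiece / Unigram
    tok = AutoTokenizer.from_pretrained(name)
    print(name, tok.tokenize(text))
```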

Huggingface NLP Course Chapter 3 to 5

Continuing with 🤗 Hugging Face: NLP Course. Chapter 3: Fine-Tuning a Pretrained Model goes over fine-tuning. It’s OK as a source of copy/pastable snippets, but there isn’t much insight to glean from it. Chapter 4: Sharing Models and Tokenizers: nothing to see here, just an advertisement for the 🤗 platform. Chapter 5: The 🤗 Datasets Library is another OK source of copy/pastable snippets. But this time we also get a treat: an intro to FAISS (Facebook AI Similarity Search).
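
For a taste of FAISS, here is a minimal sketch using random vectors as stand-ins for real embeddings:

```python
# Build a flat (exact) L2 index and query the nearest neighbours.
import faiss
import numpy as np

d = 64                                               # embedding dimension
xb = np.random.random((1000, d)).astype("float32")   # "database" vectors
xq = np.random.random((5, d)).astype("float32")      # query vectors

index = faiss.IndexFlatL2(d)                         # exact L2 search
index.add(xb)
distances, indices = index.search(xq, 4)             # 4 nearest neighbours each
print(indices)
```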

Huggingface NLP Course Chapter 1 & 2

Today it’s time for the 🤗 Hugging Face: NLP Course. Chapter 1: Transformer Models. The first chapter is just a very quick glance at Hugging Face transformer capabilities. It’s a rapid-fire set of examples going through two-liners for: sentiment analysis, NER (entity extraction), classification, text generation / question answering, fill mask, and translation. For example: from transformers import pipeline; classifier = pipeline("sentiment-analysis"); classifier(["I've been waiting for a HuggingFace course my whole life."])
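
Another two-liner in the same spirit (a sketch; the first run downloads a default model, which I’m assuming uses the <mask> token):

```python
# Fill-mask with the default model chosen by the pipeline.
from transformers import pipeline

unmasker = pipeline("fill-mask")
print(unmasker("Hugging Face makes <mask> easy.", top_k=2))
```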

Micrograd Intro to Neural Network and Backpropagation

Today I’m talking about Andrej Karpathy’s excellent tutorial series The spelled-out intro to neural networks and backpropagation: building micrograd. This session covers a full introduction to backpropagation, starting with building a strong intuition of derivatives and their use in ML: beginning with numerical derivation, followed by symbolic derivation, and finally automating it by wrapping Python’s primitive operations (+, -, *, /, exp, tanh) with code. It then goes into learning, covering what a Neuron is
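
The numerical-derivation starting point looks roughly like this (plain Python, in the spirit of the lecture’s opening example rather than a copy of it):

```python
# Estimate a derivative numerically and compare with the symbolic answer.
def f(x):
    return 3 * x**2 - 4 * x + 5

h = 1e-5
x = 3.0
numerical = (f(x + h) - f(x)) / h   # slope estimate at x
symbolic  = 6 * x - 4               # d/dx of 3x^2 - 4x + 5
print(numerical, symbolic)          # both ≈ 14
```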

First Setup for ML

Let’s get a place to run code on the GPU (on Windows, don’t judge me, it’s where my GPU lives). I’m going to set up: Python, Jupyter, and PyTorch with CUDA. Download and install Python: https://www.python.org/downloads/. For some reason my Python didn’t come with pip, so I also ran python -m ensurepip and python -m pip install --upgrade pip. For Jupyter I’m also going through pip, installing everything globally. I’ll use venv later for project-specific dependencies:
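
Once PyTorch is installed, a quick sanity check that the CUDA build is actually being picked up:

```python
# Verify that PyTorch sees the GPU.
import torch

print(torch.__version__)
print(torch.cuda.is_available())  # True if the CUDA build found the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```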

Learn in Public

This is the start of an experiment. There’s an internet theory that you can learn faster about any topic if you learn it in public (source). So I’m going to be rambling in here while I transition from Software Engineer to ML Engineer. To start, I need a place to ramble, so we set up a blog using: GitHub to host the code, Hugo (with the loveit theme) to convert markdown to pretty HTML, Cloudflare Pages to serve the HTML, and Utterances for comments.
sudo apt install hugo
hugo new site SWE-to-MLE
cd SWE-to-MLE/
git init .