Einsum for Tensor Manipulation

In the ethereal dance of the cosmos, where the arcane whispers intertwine with the silent echoes of unseen dimensions, the Ioun Stone of Mastery emerges as a beacon of unparalleled prowess. This luminescent orb, orbiting its bearer’s head, is a testament to the mastery of both magical and mathematical realms, offering a bridge between the manipulation of arcane energies and the intricate ballet of tensor mathematics. As the stone orbits, it casts a subtle glow, its presence a constant reminder of the dual dominion it grants over the spellbinding complexities of magic and the abstract elegance of multidimensional calculations, making the wielder a maestro of both mystical incantations and the unseen algebra of the universe.

ViT - Vision Transformer

Veiled in a mist of arcane energy, the Orb of Scrying rests silently upon its ancient pedestal. Crafted from crystal as clear as mountain spring water, it waits for the touch of a seer. To the untrained eye, it’s merely a beautiful artifact, but to a wielder of magic, it’s a window to the unseen. Whispering the old words, the mage’s eyes lock onto the orb’s depths. Visions swirl within, revealing secrets hidden across lands and time, as the orb bridges the gap between the known and the unknown.

Positional Encoding for Self Attention

In the dimly lit chambers of his ancient library, the wizard Eldron carefully weaves his spell over a complex array of arcane symbols. With each precise gesture, he transmutes these symbols, imbuing them with a hidden layer of meaning: the magic of positional encoding. This enchantment allows the symbols to hold not just the essence of words, but also their place in the grand tapestry of language. Eldron’s eyes gleam with satisfaction as the embeddings shimmer, now ready to reveal their secrets in perfect harmony and sequence.

GAN, WGAN, and Instance Noise

The Mirror of Life Trapping, a relic of ancient magic, ensnares the souls of those who dare gaze upon its deceptive surface. Within its mystical depths, trapped spirits linger, awaiting release or eternal confinement. The Quest: craft a Mirror of Life Trapping and capture the visual essence of a target. GAN (Generative Adversarial Network): a GAN is an architecture that pits two networks against each other. The discriminator wants to predict whether its input is real or fake; the generator wants to produce fakes indistinguishable from the real ones. GAN Discriminator: the discriminator is a simple binary classifier.
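
Since the discriminator is described as a binary classifier, here is a minimal sketch of how the two competing objectives could be scored with binary cross-entropy. The function names are hypothetical, and the generator loss shown is one common choice, not necessarily the one the post uses.

```python
import math

def bce(prediction, label):
    """Binary cross-entropy for one probability in (0, 1) and a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prediction, eps), 1 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real inputs scored 1 and fakes scored 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """The generator wants its fakes scored as real (assumed formulation)."""
    return bce(d_fake, 1.0)

# d_real / d_fake stand for the discriminator's output probabilities.
print(discriminator_loss(d_real=0.9, d_fake=0.1))
print(generator_loss(d_fake=0.1))
```

A confident, correct discriminator gets a low loss, while the same situation gives the generator a high loss: that tension is the competition between the two networks.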

Daedalus Generating Mazes With Autoencoders and Variational Autoencoders

Daedalus, master craftsman of ancient myths, conceived the Labyrinth: a maze of bewildering complexity. Its winding paths and endless turns, a testament to his genius, were designed to confine the fearsome Minotaur, blurring the line between architectural marvel and cunning trap. (Image: Daedalus designing the labyrinth, by DALL-E.) The Quest: train a network on Daedalus’s work to generate new mazes. Autoencoder: an autoencoder is a type of network shaped like an hourglass.

Neural Style Transfer

Born of ancient magic, the Chromatic Chameleon prowls the shadows with scales that pulse and shimmer in a dance of arcane radiance. Its form, a living canvas, shifts through the hues of twilight, an elusive guardian draped in spectral energies. Only those with keen senses may glimpse the majestic, ever-changing creature lurking in the mystic realms. The Quest: repurpose an image classifier to do style transfer from a donor style image to a receiver content image.

Deepdream and Mechanistic Interpretability

A Beholder awakens. Its myriad eyes, each a facet of mechanistic insight, gaze upon the intricate layers of information, revealing hidden patterns in the dreams of code. In the tapestry of deepdream, the Beholder becomes the guardian of interpretability, its central eye illuminating the enigmatic connections woven within the digital labyrinth. Beauty is in the eye of the Beholder. The Quest: produce deepdreams from an image classifier. Try to identify specific features in the network, and alter them to blind the network.

Fooling an Image Classifier

In the dimly lit corridors of the ancient dungeon, where shadows dance and secrets lie in wait, an eerie silence is suddenly shattered by the faint creaking of wooden planks. Unbeknownst to the adventurers, a malevolent presence lurks among the mundane, adopting the guise of an innocuous chest or treasure trove. Beware the mimic, a shape-shifting aberration that hungers for the thrill of deception and the taste of unsuspecting intruders.

Unsupervised Clustering

Gelatinous Cube. The Quest: get a feel for how unsupervised clustering algorithms work and how they differ. Unsupervised Clustering: a set of algorithms used to identify groups within an unlabeled dataset. If we go back to the word embeddings example, running a clustering algorithm would return groups of words with similar meanings, or sharing a common topic (e.g. [king, queen, prince, princess], [apple, lemon, banana, coconut]). Let’s run through a few popular clustering algorithms.
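
The excerpt doesn't say which algorithms the post covers, so as an illustration here is a minimal k-means sketch, picked only because it is a well-known example of the idea. The 2-D points and the starting centroids are made up for the demo.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments

# Toy data (hypothetical): two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(assignments)
```

No labels are involved at any point: the groups emerge purely from distances, which is what would make the same idea applicable to word embeddings.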

Grokking With Weight Decay

Say hi to our new bestiary friend, Grok. Our lovely ogre: Grok the cruel. The Quest: let’s explore how a network can generalize after already reaching perfect loss on its training data. Grokking: grokking is the model’s ability to move beyond rote learning of training data and develop a broader understanding that allows it to generalize well to unseen inputs. The Model: we’ll try to reproduce this effect using a model trained to predict modular addition, (a + b) % vocab.
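
To make the task concrete, here is a rough sketch of how a modular addition dataset could be built and split. The `vocab=7` value, the 50/50 split and the seed are assumptions chosen for the demo, not values from the post.

```python
import random

def modular_addition_dataset(vocab, train_fraction=0.5, seed=0):
    """Enumerate every (a, b) pair with label (a + b) % vocab, then split."""
    samples = [((a, b), (a + b) % vocab) for a in range(vocab) for b in range(vocab)]
    random.Random(seed).shuffle(samples)  # seeded so the split is reproducible
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

train, val = modular_addition_dataset(vocab=7)
print(len(train), len(val))
print(train[0])
```

Holding back part of the pairs matters here: the held-out pairs are what would reveal whether the network generalized or only memorized the training set.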

DQN: Deep Q-Learning a Maze

Adding a new entry to the bestiary, the Minotaur. (Image: Minotaur, by Stable Diffusion.) The Quest: as a first step toward Reinforcement Learning (RL), let’s write a maze solver using a Deep Q-Network (DQN). Bellman’s Equation: to me, DQN seems to be the RL technique requiring the least effort. All you need to do is balance the left side of Bellman’s equation with its right side: $$Q(s, a) = R + \gamma .
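
The excerpt cuts the equation short, so as a rough sketch assume the textbook Q-learning form, where the right-hand side is the reward plus the discounted best Q-value of the next state. That form, `gamma=0.9` and both helper names are assumptions rather than details from the post, and a real DQN would get its Q-values from a network instead of plain lists.

```python
def bellman_target(reward, next_q_values, gamma=0.9, done=False):
    """Assumed right-hand side: R + gamma * max over a' of Q(s', a')."""
    if done:  # no future reward after a terminal state
        return reward
    return reward + gamma * max(next_q_values)

def td_error(q_value, reward, next_q_values, gamma=0.9, done=False):
    """Gap between the two sides; training tries to push it toward zero."""
    return bellman_target(reward, next_q_values, gamma, done) - q_value

# One transition: current estimate Q(s, a) = 0.5, reward 1.0,
# and two candidate actions in the next state.
print(td_error(q_value=0.5, reward=1.0, next_q_values=[0.2, 0.8]))
```

"Balancing" the equation then means using this gap (typically squared) as the loss for the Q-value estimator.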

Embeddings Necronomicon

This is the first post going off-road for our own little adventure and not following an online course. Today’s quest will consist of slaying a dragon: building an intuition for embeddings. The Quest: the original idea was to try to reproduce the word arithmetic examples from Google’s Word2Vec demo: King - Man + Woman = Queen and Paris - France + Italy = Rome. Spoiler alert: it turned out to be more of an experiment on how to create, handle, and visualize word embeddings.

Let's Build NanoGPT

A look at episode #7, Let’s build GPT: from scratch, in code, spelled out, from Andrej Karpathy’s amazing tutorial series. For the final episode of the series 😭 we keep all the little things about reading, partitioning and tokenizing the dataset from previous videos, and start a new model from scratch to generate some Shakespeare-sounding text. The model: the model is inspired by GPT-2 and the Attention Is All You Need paper.

Makemore5 Building a WaveNet

A look at episode #6, The spelled-out intro to language modeling: Building makemore Part 5: Building a WaveNet, from Andrej Karpathy’s amazing tutorial series. Starting from the makemore3 code (a 3-gram character-level MLP model) as a base, it implements a deeper, more structured model (while maintaining roughly the same number of parameters) to improve the loss. Improve the structure of the code: the first half of the video focuses on bringing more structure to the code.

Makemore4 Becoming a Backprop Ninja

A look at episode #5, The spelled-out intro to language modeling: Building makemore Part 4: Becoming a Backprop Ninja, from Andrej Karpathy’s amazing tutorial series. We go back to the previous N-gram character-level MLP model from session #4 and dive into a hands-on manual backpropagation session. Computing Gradients by Hand: this lesson has a bit of a different format. It’s a lot more exercise-centric and less of a type-along lecture.

Learning With Others: Joining Recurse Center

Today I’m excited to join a community of learners. For the next 12 weeks I’m going to participate in the Fall-2 2023 Recurse Center batch. What is it? If this blog is “learning in public”, Recurse Center is “learning with others”. RC is a code retreat for self-directed, passionate programmers. It’s a place to work on your own projects and learn new things, surrounded by other people doing the same.

Makemore3 Internals of MLP and Visualization

A look at episode #4, The spelled-out intro to language modeling: Building makemore Part 3: Activations & Gradients, BatchNorm, from Andrej Karpathy’s amazing tutorial series. It re-uses the N-gram character-level MLP from session #3 and discusses three kinds of incremental improvements to training. Initial weights: while the model was training even with totally random weights, this episode gives an intuition of why normally distributed values lead to faster training. Assigning clever weights at initialization time improves the loss of the first batches from 27 to 3.

Makemore2 Implement a MLP N-gram Character Level Language Model

A look at episode #3, The spelled-out intro to language modeling: building makemore Part 2: MLP, from Andrej Karpathy’s amazing tutorial series. It picks up where the previous makemore video ended, going from a bigram character-level Language Model to an N-gram MLP character-level Language Model. Meaning: “given the embeddings of the last N characters, guess the next character”. It’s still trained on a list of names to produce new unique name-sounding words.
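
As a small illustration of “given the last N characters, guess the next character”, here is a sketch of how training pairs could be cut out of a single name. The `.` padding/end marker, `n=3` and the helper name are assumptions; in the actual model each context character would then be mapped to an embedding.

```python
def build_contexts(word, n=3, pad="."):
    """Return (context, next_char) pairs using the last n characters as context."""
    # Leading pads give the first letter a full context; the trailing pad
    # acts as an end-of-word target.
    padded = pad * n + word + pad
    return [(padded[i:i + n], padded[i + n]) for i in range(len(word) + 1)]

for context, target in build_contexts("emma"):
    print(context, "->", target)
```

Each pair is one training example: the context is the input and the following character is the label.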

Makemore Implement a Bigram Character-level Language Model

Let’s look at episode #2, The spelled-out intro to language modeling: building makemore, from Andrej Karpathy’s amazing tutorial series. It covers an intro to Language Models using a very barebones, from-scratch approach: a Bigram Character-level Language Model. It means: “given a single character, guess the next character”. For this session the NN is trained on a list of names to produce new unique name-sounding words. The lecture goes from calculating the probabilities of each letter by hand, to automatically generating the probabilities as the set of weights of a very simple one-layer NN that produces the exact same results.
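
The “by hand” half of that lecture boils down to counting which character follows which. Here is a minimal sketch of that counting approach; the three-name list and the `.` boundary marker are made up for the demo and are not the dataset used in the episode.

```python
from collections import Counter, defaultdict

def bigram_probabilities(words, boundary="."):
    """Estimate P(next char | current char) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for word in words:
        # The boundary marker lets the model learn how names start and end.
        chars = [boundary] + list(word) + [boundary]
        for current, following in zip(chars, chars[1:]):
            counts[current][following] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        current: {ch: n / sum(followers.values()) for ch, n in followers.items()}
        for current, followers in counts.items()
    }

probs = bigram_probabilities(["emma", "ava", "mia"])  # hypothetical toy names
print(probs["m"])
```

Sampling repeatedly from these rows, starting at the boundary marker until it comes up again, is what would generate new name-like words.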

Google Introduction to Generative AI Learning Path

A look at Google’s Introduction to Generative AI Learning Path. “Learning Path” is a bit of a misnomer, as the site does not involve much learning. The course is split into 5 chapters consisting of short videos and quizzes. Chapter 1: Introduction to Generative AI. Covers a bit of semantics: AI (theoretical field) vs ML (practical methods), supervised vs unsupervised, discriminative vs generative models. Chapter 2: Introduction to Large Language Models. Brushes up on fine-tuning, PETM (Parameter-Efficient Tuning) and the Google ecosystem.