Python on OranLooney.com
https://www.oranlooney.com/tags/python/
Recent content in Python on OranLooney.com
Hugo -- gohugo.io
en
© Copyright {year} Oran Looney
Sun, 12 May 2024 00:00:00 +0000

Let's Play Jeopardy! with LLMs
https://www.oranlooney.com/post/jeopardy/
Sun, 12 May 2024 00:00:00 +0000
https://www.oranlooney.com/post/jeopardy/
Update 2024-05-14: Hot off the presses, the benchmark now includes the recently released GPT-4o model!
How good are LLMs at trivia? I used the Jeopardy! dataset from Kaggle to benchmark ChatGPT and the new Llama 3 models. Here are the results:
There you go. You’ve already gotten 90% of what you’re going to get out of this article. Some guy on the internet ran a half-baked benchmark on a handful of LLM models, and the results were largely in line with popular benchmarks and received wisdom on fine-tuning and RAG.

Kaprekar's Magic 6174
https://www.oranlooney.com/post/kaprekar/
Sun, 25 Feb 2024 00:00:00 +0000
https://www.oranlooney.com/post/kaprekar/
Kaprekar’s routine is a simple arithmetic procedure which, when applied to four-digit numbers, rapidly converges to the fixed point 6174, known as the Kaprekar constant. Unlike other famous iterative procedures such as the Collatz function, the somewhat arbitrary nature of the Kaprekar routine doesn’t hint at fundamental mathematical discoveries yet to be made; rather, its charm lies in its intuitive definition (requiring no more than elementary mathematics), its oddly off-center fixed point of 6174, and its surprisingly rapid convergence (which requires only five iterations on average and never more than seven).
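As a rough illustration of the routine described above, here is a minimal sketch. It assumes the usual formulation (sort the four digits descending and ascending, then subtract), and the function names `kaprekar_step` and `kaprekar_iterations` are made up for this example rather than taken from the article.

```python
def kaprekar_step(n: int) -> int:
    """One step: largest digit arrangement minus smallest, keeping four digits."""
    digits = f"{n:04d}"  # pad with leading zeros so 999 is treated as 0999
    ascending = int("".join(sorted(digits)))
    descending = int("".join(sorted(digits, reverse=True)))
    return descending - ascending


def kaprekar_iterations(n: int) -> int:
    """Count the steps needed to reach 6174 from a four-digit number."""
    if len(set(f"{n:04d}")) == 1:
        # Repdigits such as 1111 collapse to 0 and never reach 6174.
        raise ValueError("all four digits are identical")
    steps = 0
    while n != 6174:
        n = kaprekar_step(n)
        steps += 1
    return steps


print(kaprekar_step(3524))        # 5432 - 2345 = 3087
print(kaprekar_iterations(3524))  # 3524 -> 3087 -> 8352 -> 6174, so 3
```

Looping `kaprekar_iterations` over every non-repdigit four-digit number is a quick way to check the "never more than seven" claim for yourself.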

Cracking Playfair Ciphers
https://www.oranlooney.com/post/playfair/
Wed, 13 Sep 2023 00:00:00 +0000
https://www.oranlooney.com/post/playfair/
In 2020, the Zodiac 340 cipher was finally cracked after more than 50 years of trying by amateur code breakers. While the effort to crack it was extremely impressive, the cipher itself was ultimately disappointing. A homophonic substitution cipher with a minor gimmick of writing diagonally, the main factor that prevented it from being solved much earlier was the several errors the Zodiac killer made when encoding it.
Substitution ciphers, which operate at the level of a single character, are children’s toys, the kind of thing you might get a decoder ring for from the back of a magazine.

ML From Scratch, Part 6: Principal Component Analysis
https://www.oranlooney.com/post/mlfromscratchpart6pca/
Mon, 16 Sep 2019 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart6pca/
In the previous article in this series we distinguished between two kinds of unsupervised learning (cluster analysis and dimensionality reduction) and discussed the former in some detail. In this installment we turn our attention to the latter.
In dimensionality reduction we seek a function \(f : \mathbb{R}^n \to \mathbb{R}^m\) where \(n\) is the dimension of the original data \(\mathbf{X}\) and \(m\) is less than or equal to \(n\). That is, we want to map some high dimensional space into some lower dimensional space.
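To make the idea of a map from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) concrete, here is a small pure-Python sketch for the case \(n = 2\), \(m = 1\). It finds the first principal component by power iteration on the sample covariance matrix; that choice of algorithm, and the names `first_principal_component` and `project`, are assumptions made for illustration and not necessarily how the article itself proceeds.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (column means, unit vector along the direction of largest variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the data are not constant, otherwise the norm below is zero.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Map each point in R^n to a single coordinate along `direction`."""
    return [sum((x - m) * d for x, m, d in zip(row, means, direction))
            for row in rows]


data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on the line y = 2x
means, direction = first_principal_component(data)
scores = project(data, means, direction)  # one number per two-dimensional point
```

Because the toy data lie exactly on a line, the recovered direction is proportional to (1, 2) and the one-dimensional scores lose no information; real data would lose whatever variance falls outside the retained components.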

A Seriously Slow Fibonacci Function
https://www.oranlooney.com/post/slowfibonacci/
Sat, 06 Jul 2019 00:00:00 +0000
https://www.oranlooney.com/post/slowfibonacci/
I recently wrote an article which was ostensibly about the Fibonacci series but was really about optimization techniques. I wanted to follow up on its (extremely moderate) success by going in the exact opposite direction: by writing a Fibonacci function which is as slow as possible.
This is not as easy as it sounds: any program can trivially be made slower, but this is boring. How can we make it slow in a fair and interesting way?

ML From Scratch, Part 5: Gaussian Mixture Models
https://www.oranlooney.com/post/mlfromscratchpart5gmm/
Wed, 05 Jun 2019 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart5gmm/
Consider the following motivating dataset:
Unlabeled Data
It is apparent that these data have some kind of structure; which is to say, they certainly are not drawn from a uniform or other simple distribution. In particular, there is at least one cluster of data in the lower right which is clearly separate from the rest. The question is: is it possible for a machine learning algorithm to automatically discover and model these kinds of structures without human assistance?

Adaptive Basis Functions
https://www.oranlooney.com/post/adaptivebasisfunctions/
Tue, 21 May 2019 00:00:00 +0000
https://www.oranlooney.com/post/adaptivebasisfunctions/
Today, let me be vague. No statistics, no algorithms, no proofs. Instead, we’re going to go through a series of examples and eyeball a suggestive series of charts, which will imply a certain conclusion, without actually proving anything; but which will, I hope, provide useful intuition.
The premise is this:
For any given problem, there exist learned feature representations which are better than any fixed/human-engineered set of features, even once the cost of the added parameters necessary to also learn the new features is taken into account.

ML From Scratch, Part 4: Decision Trees
https://www.oranlooney.com/post/mlfromscratchpart4decisiontree/
Fri, 01 Mar 2019 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart4decisiontree/
So far in this series we’ve followed one particular thread: linear regression -> logistic regression -> neural network. This is a very natural progression of ideas, but it really represents only one possible approach. Today we’ll switch gears and look at a model with a completely different pedigree: the decision tree, sometimes also referred to as Classification and Regression Trees, or simply CART models. In contrast to the earlier progression, decision trees are designed from the start to represent nonlinear features and interactions.

A Fairly Fast Fibonacci Function
https://www.oranlooney.com/post/fibonacci/
Tue, 19 Feb 2019 00:00:00 +0000
https://www.oranlooney.com/post/fibonacci/
A common example of recursion is the function to calculate the \(n\)th Fibonacci number:
    def naive_fib(n):
        if n < 2:
            return n
        else:
            return naive_fib(n-1) + naive_fib(n-2)

This follows the mathematical definition very closely but its performance is terrible: roughly \(\mathcal{O}(2^n)\). This is commonly patched up with dynamic programming. Specifically, either memoization:

    from functools import lru_cache

    @lru_cache(100)
    def memoized_fib(n):
        if n < 2:
            return n
        else:
            return memoized_fib(n-1) + memoized_fib(n-2)

or tabulation:
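The summary is cut off at this point, so the following is only a sketch of what a tabulated (bottom-up) version typically looks like. It is not the author's code, and the name `tabulated_fib` is an assumption.

```python
def tabulated_fib(n: int) -> int:
    """Bottom-up dynamic programming: fill a table from F(0) up to F(n)."""
    if n < 2:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        # Each entry depends only on the two entries before it.
        table[i] = table[i - 1] + table[i - 2]
    return table[n]


print(tabulated_fib(10))  # 55
```

This does a linear number of additions with no recursion, at the cost of storing the whole table; keeping only the last two values would reduce the memory to constant.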

ML From Scratch, Part 3: Backpropagation
https://www.oranlooney.com/post/mlfromscratchpart3backpropagation/
Sun, 03 Feb 2019 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart3backpropagation/
In today’s installment of Machine Learning From Scratch we’ll build on the logistic regression from last time to create a classifier which is able to automatically represent nonlinear relationships and interactions between features: the neural network. In particular I want to focus on one central algorithm which allows us to apply gradient descent to deep neural networks: the backpropagation algorithm. The history of this algorithm appears to be somewhat complex (as you can hear from Yann LeCun himself in this 2018 interview) but luckily for us the algorithm in its modern form is not difficult, although it does require a solid handle on linear algebra and calculus.

ML From Scratch, Part 2: Logistic Regression
https://www.oranlooney.com/post/mlfromscratchpart2logisticregression/
Thu, 27 Dec 2018 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart2logisticregression/
In this second installment of the Machine Learning From Scratch series we switch the point of view from regression to classification: instead of estimating a number, we will be trying to guess which of two possible classes a given input belongs to. A modern example is looking at a photo and deciding whether it’s a cat or a dog.
In practice, it’s extremely common to need to decide between \(k\) classes where \(k > 2\) but in this article we’ll limit ourselves to just two classes, the so-called binary classification problem, because generalizations to many classes are usually both tedious and straightforward.

ML From Scratch, Part 1: Linear Regression
https://www.oranlooney.com/post/mlfromscratchpart1linearregression/
Thu, 29 Nov 2018 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart1linearregression/
To kick off this series, we’ll start with something simple yet foundational: linear regression via ordinary least squares.
While not exciting, linear regression finds widespread use both as a standalone learning algorithm and as a building block in more advanced learning algorithms. The output layer of a deep neural network trained for regression with MSE loss, simple AR time series models, and the “local regression” part of LOWESS smoothing are all examples of linear regression being used as an ingredient in a more sophisticated model.
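For readers who want something concrete, here is a minimal sketch of ordinary least squares for a single feature, using the textbook closed-form slope and intercept. The function name `fit_ols` and the toy data are assumptions for illustration; the article itself may well use a more general matrix formulation.

```python
def fit_ols(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed form: slope = S_xy / S_xx. Assumes the xs are not all equal.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_ols([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)  # 2.0 1.0, since the points lie exactly on y = 2x + 1
```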

ML From Scratch, Part 0: Introduction
https://www.oranlooney.com/post/mlfromscratchpart0introduction/
Sun, 11 Nov 2018 00:00:00 +0000
https://www.oranlooney.com/post/mlfromscratchpart0introduction/
Motivation
“As an apprentice, every new magician must prove to his own satisfaction, at least once, that there is truly great power in magic.” - The Flying Sorcerers, by David Gerrold and Larry Niven
How do you know if you really understand something? You could just rely on the subjective experience of feeling like you understand. This sounds plausible; surely you of all people should know, right? But this runs head-first into the Dunning-Kruger effect.

Craps Variants
https://www.oranlooney.com/post/crapsgamevariants/
Wed, 11 Jul 2018 00:00:00 +0000
https://www.oranlooney.com/post/crapsgamevariants/
Craps is a surprisingly fair game. I remember calculating the probability of winning craps for the first time in an undergraduate discrete math class: I went back through my calculations several times, certain there was a mistake somewhere. How could it be closer than \(\frac{1}{36}\)?
(Spoiler Warning: If you haven’t calculated these odds for yourself then you may want to do so before reading further. I’m about to spoil it for you rather thoroughly in the name of exploring a more general case.)
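In the spirit of that spoiler, here is one way the calculation might be sketched with exact fractions. It assumes the standard pass-line rules (win on a come-out 7 or 11, lose on 2, 3, or 12, otherwise keep rolling until the point or a 7 appears); the function name `pass_line_win_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def pass_line_win_probability() -> Fraction:
    """Exact probability of winning a pass-line bet with two fair dice."""
    # Count how many of the 36 equally likely rolls produce each total.
    counts = {}
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    p = {total: Fraction(c, 36) for total, c in counts.items()}

    win = p[7] + p[11]  # immediate win on the come-out roll
    for point in (4, 5, 6, 8, 9, 10):
        # Once a point is set, only the point and 7 matter, so the chance of
        # making the point is p[point] / (p[point] + p[7]).
        win += p[point] * p[point] / (p[point] + p[7])
    return win


win = pass_line_win_probability()
print(win)                   # 244/495
print(Fraction(1, 2) - win)  # 7/990
```

Under these assumed rules the gap from an even game is 7/990, which is indeed smaller than 1/36.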

Semantic Code
https://www.oranlooney.com/post/semanticcode/
Wed, 30 Apr 2008 00:00:00 +0000
https://www.oranlooney.com/post/semanticcode/
semantic (si-man’tik) adj. 1. Of or relating to meaning, especially meaning in language.
Programming destroys meaning. When we program, we first replace concepts with symbols and then replace those symbols with arbitrary codes — that’s why it’s called coding.
At its worst programming is write-only: the program accomplishes a task, but is incomprehensible to humans. See, for example, the story of Mel. Such a program is correct, yet at the same time meaningless.