A/B Tests and Experiment Size
Let’s say you’re running an A/B test. Maybe you want to test how many conversions you will get if you change the design of the Signup page or the wording. Yo...
Often with Natural Language Processing (NLP) applications a pipeline is useful to take the raw text and process it and extract relevant features before input...
Imputing missing values
imgproxy is a fast and secure standalone server for resizing and converting remote images. The main principles of imgproxy are simplicity, speed, and securit...
In Bayesian statistics, the Beta distribution is often used as the prior distribution for some unknown parameter in a Bernoulli experiment. In this post we compu...
There are times when our sample size is too small for asymptotic results to be valid. We cannot simply use Slutsky’s theorem to replace the true variance wit...
Definition
Log-Likelihood
In a previous post, I walked through the maths of back-propagation (“backprop”). Here I will go through the implementation in Python (heavily based on Andrew...
Design matrix
I’m sure that every developer who has learned a little bit of ML dreams of applying this to the stock market and getting rich. At first glance RNNs and LSTMs...
This guide will be about setting up the fiddly bits when deploying a Jupyter Hub to an AWS instance. It won’t go into explicit detail about absolutely every ...
Jupyter Notebooks are a great and widely used tool in data science. Quite often they are run on localhost or have to be accessed via SSH tunnelling.
This year has been like no other in living memory. Coronavirus has shaken our collective sense of what normality means and will undoubtedly continue to affec...
Deciding if a coin is fair
Instead of just having a single random variable $X$, we may have an experiment for which we are recording several random variables, which we can consider as ...
Building on the univariate delta method
The Delta Method
The Django ORM makes interacting with the database a breeze, but without due care can also lead to poor performance.
Where does the mysterious $(n-1)$ factor come from when computing the unbiased sample variance?
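As a quick numerical illustration of the question, the sketch below averages the $1/n$ and $1/(n-1)$ variance estimators over many small samples. The sample size, the Gaussian population, and the function name `average_variance_estimates` are assumptions chosen for illustration, not taken from the post.

```python
import random
import statistics


def average_variance_estimates(n=5, sigma=2.0, trials=20000, seed=0):
    """Average the 1/n and 1/(n-1) variance estimators over many samples.

    Assumed setup: each sample holds n draws from a Gaussian with mean 0
    and standard deviation sigma, so the true variance is sigma ** 2.
    """
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased_total += statistics.pvariance(sample)   # divides by n
        unbiased_total += statistics.variance(sample)  # divides by n - 1
    return biased_total / trials, unbiased_total / trials


biased, unbiased = average_variance_estimates()
print(f"true variance:             {2.0 ** 2:.3f}")
print(f"mean of 1/n estimator:     {biased:.3f}")
print(f"mean of 1/(n-1) estimator: {unbiased:.3f}")
```

With these assumed settings the $1/n$ average should land near $\frac{n-1}{n}\sigma^2$, while the $1/(n-1)$ average should land near $\sigma^2$.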
If we bet a fraction $f$ of our capital on each round, and a win resulted in a gain of $f\times b$, and a loss resulted in a loss of $f \times a$, then after...