I preface this post by saying that it's going to make me seem like a fucking luddite and is more of a rant about git’s shocking UI than anything substantial – sorry! Throughout the late 70s and 80s there was the famous format war of Betamax vs. VHS. Despite Betamax having superior video and audio quality, VHS won the battle. Following this, throughout the mid 00s and 10s, there was a format war between two distributed version control systems: git and mercurial (hg)....
Trains 🚄, Planes ✈️, and Automobiles 🚙
I recently took the train over 1200 km from Huntingdon in the East of England to Vienna in Austria (see my Twitter threads here and here). This is a long journey. On the outbound I went from Huntingdon to London (Great Northern), London to Amsterdam (Eurostar), Amsterdam to Vienna (ÖBB NightJet), taking over 24 hours. On the return I went from Vienna to Frankfurt (DB ICE), Frankfurt to Brussels (DB ICE), Brussels to London (Eurostar), London to Stevenage (Thameslink), Stevenage to Huntingdon (Thameslink), taking about 16 hours....
HTTPS 🔒 should be compulsory
Recently, I spoke about how Manjaro was completely incompetent for forgetting to renew its SSL certificate for a fifth time. This got me thinking about HTTPS (SSL-encrypted HTTP traffic). HTTPS should be compulsory and browsers should refuse to serve non-encrypted websites to users. In 2015, Mozilla, the creators of Firefox, announced their intention to deprecate non-secure HTTP. Seven years later, HTTP still exists. Admittedly, it isn’t very common these days, but in some places it does still exist....
Manjaro is 🐕💩
Recently Manjaro, an Arch Linux-based distribution, forgot to renew its SSL certificate for at least the fifth time. This is bonkers. Briefly, an SSL certificate allows you to communicate with websites using encryption (https) rather than plain http. Thanks to Let’s Encrypt and certbot, this is now very easy. Once it’s all set up, all that’s needed is to enable the systemd timer certbot-renewal, and then it will literally never expire. How Manjaro’s team let it expire even once, let alone FIVE times, is beyond me....
Stop using p=0.05
How many of you use p=0.05 as an absolute cut-off? p ≥ 0.05 means not significant. No evidence. Nada. And then p < 0.05? Great, it’s significant. This is a crude way of using p-values, and hopefully I will convince you of this. What is a p-value? A lot of us use p-values with this arbitrary cut-off but don’t actually know the theoretical background of a p-value. A p-value is the probability, under the null hypothesis, of observing data at least as extreme as the observed data....
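To make that definition concrete, here is a minimal sketch of an exact two-sided p-value for a coin-flip experiment. The numbers (15 heads in 20 flips) and the function name `binomial_two_sided_p` are made up for illustration and are not taken from the post; "at least as extreme" is interpreted here as "no more probable under the null than what was observed", which is one common convention among several.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value for k successes in n trials.

    Sums the null probability of every outcome that is no more
    likely than the observed count k (hypothetical helper).
    """
    # Probability of each possible count 0..n under the null hypothesis
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))


# Hypothetical data: 15 heads in 20 flips of a supposedly fair coin
p_value = binomial_two_sided_p(15, 20)
print(f"p = {p_value:.4f}")  # p = 0.0414
```

With 14 heads instead of 15 the same calculation gives a p-value above 0.05, so a single extra head flips the verdict under a hard cut-off even though the evidence has barely changed.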
An introduction to Generalized Estimating Equations
A key assumption underpinning generalized linear models (of which linear regression is one type) is the independence of observations. In longitudinal data this will simply not hold. Observations within an individual (between time points) are likely to be more similar than those between individuals. So, how do you deal with this? One option is to fit a generalized linear mixed model in which there are random intercept and slope terms for each individual....
Stop testing for normality
I see a lot of data scientists using tests such as the Shapiro-Wilk test and the Kolmogorov–Smirnov test to check for normality. Stop doing this. Just stop. If you’re not yet convinced (and I don’t blame you!), let me show you why these are a waste of your time. Why do we care about normality? We should care about normality. It’s an important assumption that underpins a wide variety of statistical procedures....