Building Intuition through Visualization


From political science to cancer genomics, Markov Chain Monte Carlo (MCMC) has proved to be a valuable tool for statistical analysis in a wide variety of fields. At a high level, MCMC describes a collection of iterative algorithms that obtain samples from distributions that are difficult to sample from directly. These Markov chain-based algorithms, combined with a stunning increase in computing power over the past 30 years, have allowed researchers to sample from extremely complicated theoretical distributions. …
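To make the idea of an iterative sampler concrete, here is a minimal sketch of random-walk Metropolis, one of the simplest MCMC algorithms. It is an illustration under stated assumptions, not necessarily the method the article goes on to visualize: the function name `metropolis`, its parameters, and the standard normal target are all chosen here for the example. The target is deliberately easy so the output can be checked against known values.

```python
import math
import random


def metropolis(log_density, n_samples, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density: log of the target density, known only up to a constant.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        # On rejection the chain stays put and the old state is recorded again.
        samples.append(x)
    return samples


def log_standard_normal(x):
    # Unnormalized log density of N(0, 1); the constant cancels in the ratio.
    return -0.5 * x * x


draws = metropolis(log_standard_normal, n_samples=20000)
kept = draws[2000:]  # discard an initial burn-in stretch
mean = sum(kept) / len(kept)
variance = sum((d - mean) ** 2 for d in kept) / len(kept)
print(f"mean ~ {mean:.2f}, variance ~ {variance:.2f}")
```

Because only the ratio of densities appears in the acceptance step, the sampler never needs the normalizing constant, which is a large part of why this family of methods works on distributions that cannot be sampled directly.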

Use permutation feature importance to discover which features in your dataset are useful for prediction, implemented from scratch in Python.

Photo by Arno Senoner on Unsplash


Advanced topics in machine learning are dominated by black box models. As the name suggests, black box models are complex models where it’s extremely hard to understand how model inputs are combined to make predictions. Deep learning models like artificial neural networks and ensemble models like random forests, gradient boosting learners, and model stacking are examples of black box models that yield remarkably accurate predictions in a variety of domains from urban planning to computer vision.
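One way to peek inside a black box is permutation feature importance: shuffle one feature's values, re-score the model, and see how much the error grows. The sketch below is a minimal from-scratch version using only the standard library. It is not the article's own implementation; the helper names, the toy data, and the `predict` function standing in for a fitted model are all assumptions made for illustration.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    predict: any function mapping one row (a list of floats) to a prediction.
    X: list of rows, y: list of targets. Names are illustrative assumptions.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy data: the target depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * row[0] + 0.5 * row[1] for row in X]


def predict(row):
    # Hypothetical stand-in for a fitted model's prediction function.
    return 3.0 * row[0] + 0.5 * row[1]


for j, score in enumerate(permutation_importance(predict, X, y)):
    print(f"feature {j}: importance {score:.3f}")
```

In this toy setup the scores should rank feature 0 above feature 1, with feature 2 at zero because the stand-in model never reads it. With a real model, the same function is usually applied to held-out data so that the scores reflect predictive value rather than memorized noise.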

Image Source: Matchroom Pool Facebook Page

Evaluating FargoRate and BilliauRate Simulations

Four days prior to the 2021 World Pool Masters, I posted an article with my tournament predictions based on FargoRate and BilliauRate simulations. This article breaks down what I got right and what I got wrong.

Alexander Kazakis wins the 2021 World Pool Masters

The 2021 World Pool Masters was a story of redemption for Alex Kazakis. After a crushing disappointment in the 2019 final, Kazakis had a magical 2021 tournament. After beating Justin Sajich, Skyler Woodward, and Eklent Kaci on his way to the final, Kazakis whitewashed American Shane Van Boening to claim his first major victory on the world stage.

In FargoRate and BilliauRate…

Source: Matchroom Pool

Simulating the 2021 World Pool Masters using FargoRate and Other Probabilistic Models

The Dafabet 2021 World Pool Masters presented by Matchroom Pool is one of the most highly anticipated 9-ball pool invitational events of the year. With $100,000 on the line, twenty-four of the world’s best will compete over four days, with the likes of 2020 Mosconi Cup MVP Jayson Shaw, 2021 World Cup of Pool Champion Joshua Filler, and Predator Championship League Pool winner Albin Ouschan headlining the star-studded field.

With the event less than a week away and the tournament draw set in stone, everyone is wondering: Who’s leaving Gibraltar as 2021 World Pool Masters Champion? To find out, I simulated…

Rtex with Overleaf

Everything you need to know about Rtex files, how to make them, and how Rtex compares with R Markdown

An open letter to my fellow statistics students,

I decided to major in statistics during my sophomore fall while taking Stat 110: Introduction to Probability because I loved solving fun problems. Coin flipping! Gambling! Game show hosts with a strange affinity for goats! (Oh my!) As the semester progressed, I also began to love writing my own solutions. Each week, I enjoyed learning how to use LaTeX to turn pen-and-paper sketches into typeset documents. …

Coauthored by Eric Sun and Seth Billiau

Harvard’s Undergraduate Council Finance Committee is a major source of financial support for hundreds of student organizations, doling out nearly $150,000 of student term bill money each semester to support campus life. But who is all that money really going to? And how is it spent? This investigation by the Harvard Open Data Project attempts to answer these questions. Certain types of student groups receive more money than others, and a few groups receive the most. The UC’s flawed accountability system also leaves substantial room for improvement in the grants system.

Data Collection Methods


Seth Billiau

