Review: A previous post discussed how the decision tree algorithm classifies items using the Gini impurity index, a simple score that is easily calculated. Please visit this post for some background information or a refresher if necessary. Pitfalls of Decision Trees: Decision trees are relatively straightforward, regardless of whether they are built on the Gini impurity index… Continue reading A Mathless Breakdown of the Random Forest Classification Algorithm

## Support Vector Machine: The Kernel Trick Explained with Minimal Math

Introduction: Support vector machines are supervised learning algorithms that help classify things into different groups. Suppose that the shapes below had to be separated somewhere on the line. Where should the line be drawn? It could be drawn like this, which might allow for some misclassification: In this example, it's pretty intuitive that the dividing… Continue reading Support Vector Machine: The Kernel Trick Explained with Minimal Math

## Fraud Classification with AutoML

Introduction: Data scientists follow a routine of obtaining data, cleaning it, and then feeding it through their many fancy models to determine which one has the best "predictive" capabilities. This insight is highly valued by many companies either as a means to have some foresight into future conditions or to revise current systems to… Continue reading Fraud Classification with AutoML

## A Mathless Breakdown of Decisions Trees and the Gini Impurity Index

The decision tree algorithm is one of the first learned by most aspiring data scientists because, while the underlying math may be tedious, the overall logic is pretty simple to understand, at least on a surface level. At its core, a decision tree is a classification algorithm, which means it has to make a decision… Continue reading A Mathless Breakdown of Decisions Trees and the Gini Impurity Index

## Good Metrics & Bad Targets: A Summary on Goodhart’s Law

Every child has learned the usual scientific method: the variable you change is the independent variable, and the variable that changes as a result is the dependent variable. To establish that the independent variable caused the change in the dependent variable, only one independent variable can be changed at a time, while everything else must be kept constant between trials.… Continue reading Good Metrics & Bad Targets: A Summary on Goodhart’s Law