Introduction: Data scientists follow a routine of obtaining data, cleaning it, and then feeding it through their many fancy models to determine which one has the best "predictive" capabilities. This insight is highly valued by many companies, either as a means of gaining some foresight into future conditions or of revising current systems to… Continue reading Fraud Classification with AutoML
The decision tree algorithm is one of the first learned by most aspiring data scientists because, while the underlying math may be tedious, the overall logic is pretty simple to understand, at least on a surface level. At its core, a decision tree is a classification algorithm, which means it has to make a decision… Continue reading A Mathless Breakdown of Decisions Trees and the Gini Impurity Index
Every child has learned the usual scientific method: the variable you change is the independent variable, and the variable that changes as a result is the dependent variable. To attribute a change in the dependent variable to its cause, only one independent variable can be changed at a time, while everything else must be kept constant between trials.… Continue reading Good Metrics & Bad Targets: A Summary on Goodhart’s Law
"It's fine to celebrate success, but it is more important to heed the lessons of failure." -Bill Gates. A Preface: To be blunt, I have failed. More accurately, my models have failed (ironically, by not being accurate; pun intended). Despite weeks of work, I have failed to provide anything conclusive about the objective… Continue reading Evaluating Hospital Effectiveness: The First of Many Failures
As I approach the end of my data science journey, I still find it extremely hard to explain to parents, colleagues, and friends (outside the tech field) exactly what I'll be aiming to do as a data scientist, and quite frankly, I still have a hard time pinpointing it myself. Through my studies, I've… Continue reading Data, Do You Mined?