Putting the “day” in data science

Dec 23, 2020

9:45am. I’m going to write a quick regression model. Just want to see the coefficients.

10:00. I do this all the time in R. Shouldn’t be too bad in Python.

10:15. Oh, not too bad. That worked nicely. Thanks Statsmodels.

10:30am. Let me just do some feature preprocessing. Center and scale. Impute the median. Should be easy. I want better inference, after all.

4:00pm. Okay, finally got that working with Statsmodels and SKLearn.

4:01. Dang. There goes my day.

4:02. What was I trying to do? Oh, yeah. The regression. I wanted to see the coefficients.

4:05. Didn’t seem to change much. Well, till tomorrow.

This is why I fully embrace automated machine learning. Too many questions go unanswered because people are wasting time learning a new language on a new tech stack. Whether it’s SKLearn, R, caret, H2O, Spark, or whatever, they’re all so painfully different that loading a dataset and getting the inference you want is never a 5-second task.

2030 will be a great year. I’ll probably be spending my time figuring out why my Quantum AutoML tool isn’t generating its 1,000,000 permutations as fast as I want it to. What’s the point of my subconscious if I have to wait for results?

Introducing AutoML newsfeed. Swipe right if you like the model. Swipe left if it’s useless to you.

Written by Bryan Whiting