Putting the “day” in data science

Bryan Whiting
1 min read · Dec 23, 2020

9:45am. I’m going to write a quick regression model. Just want to see the coefficients.

10:00. I do this all the time in R. Shouldn’t be too bad in Python.

10:15. Oh, not too bad. That worked nicely. Thanks Statsmodels.

10:30am. Let me just do some feature preprocessing. Center and scale. Impute the median. Should be easy. I want better inference, after all.

4:00pm. Okay, finally got that working with Statsmodels and SKLearn.

4:01. Dang. There goes my day.

4:02. What was I trying to do? Oh, yeah. The regression. I wanted to see the coefficients.

4:05. Didn’t seem to change much. Well, till tomorrow.

This is why I fully embrace automated machine learning. Too many questions are going unanswered because employees are wasting time learning how to use a new language on a new tech stack. Whether it’s SKLearn, R, caret, H2O, Spark, or whatever, they’re all so painfully different that it’s not a 5-second task to load a dataset and get the inference you want.

2030 will be a great year. I’m sure I’ll be spending my time learning why my Quantum AutoML tool isn’t generating the 1,000,000 permutations as fast as I want it to. What’s the point of my subconscious if I have to wait for results?

Introducing AutoML newsfeed. Swipe right if you like the model. Swipe left if it’s useless to you.

Bryan Whiting

The world is defined by writers | Silicon Valley Data Scientist | Google