Data Products: Standards up front
- Danielle Costa Nakano
- Aug 11, 2021
Updated: Jan 20
If you are establishing a new department at work, or want to play with modeling for the first time, you should think about standards.
When is it done? How do you know it's good enough to productize?
Standards and protocols.
When should standards be established?
Set standards at the beginning. Even if your standards are defaults set by an educational institution, that's good enough. They only count if you write them down, though.
Where do I start?
At the very least, there are always best approaches and most appropriate models for a given problem and data source (if it's not time-series data, you won't use ARIMA).
At the start of the project, take the time to write a project plan that includes the models you intend to use in the data exploration and proof-of-concept phases.
Each model should have a standard. Here are a couple of examples:
Logistic Regression, AUC > 0.7
Linear Regression, R-squared > 0.6
Do not allow forest models to be used. The nature of these models makes it hard to get at the metrics needed for deep analysis. Reserve these models for boosting.
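One way to write the standards above down is as a small lookup table that a project can check results against. This is only a minimal sketch: the names `STANDARDS`, `DISALLOWED`, and `meets_standard`, and the model keys, are hypothetical and chosen for illustration. The thresholds are the two examples listed above.

```python
# Hypothetical standards registry: model name -> (metric name, minimum value).
# The thresholds come from the two examples in the post.
STANDARDS = {
    "logistic_regression": ("auc", 0.7),
    "linear_regression": ("r_squared", 0.6),
}

# Model families the plan rules out (assumed key name for forest models).
DISALLOWED = {"random_forest"}


def meets_standard(model_name, metrics):
    """Return True if the model is allowed and its metric exceeds the written threshold."""
    if model_name in DISALLOWED:
        return False
    if model_name not in STANDARDS:
        # No written standard means the result cannot be judged yet.
        raise KeyError(f"No written standard for {model_name!r}")
    metric_name, minimum = STANDARDS[model_name]
    return metrics[metric_name] > minimum


if __name__ == "__main__":
    print(meets_standard("logistic_regression", {"auc": 0.75}))
    print(meets_standard("linear_regression", {"r_squared": 0.55}))
```

Raising an error for a model with no written standard is a design choice here: it keeps an unplanned model from passing by default.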
What are the basic protocols to always use?
Establish and codify your standards in the planning phase.
Always deliver conclusions with visuals such as ROC curves, ggplot charts, elbow charts, etc.
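As an illustration of the ROC visual, here is a rough sketch of how the points on an ROC curve, and the area under it, can be computed by hand before passing them to whatever plotting library you use. The function names `roc_points` and `auc_from_points` and the sample data are made up for this example. It assumes binary labels (0 or 1) with at least one example of each class.

```python
def roc_points(y_true, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("Need at least one positive and one negative label")

    # Sweep the decision threshold from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so they produce a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc_from_points(points):
    """Area under the curve using the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    labels = [0, 0, 1, 1]
    model_scores = [0.1, 0.4, 0.35, 0.8]
    curve = roc_points(labels, model_scores)
    print(curve)
    print(auc_from_points(curve))  # compare against the written standard, e.g. AUC > 0.7
```

The list of points is what you would plot as the ROC curve; the single AUC number is what you would compare against the standard you wrote down.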
Even as a professional, it's always fun to kick off an experiment. Drop me a note and tell me how you do it.
-DCN