Danielle Costa Nakano

r&d - data products - predictive analytics - AI/ML - data science

dcostanakano@gmail.com

Danielle Costa Nakano
- Aug 15, 2021
- 1 min read

Data Science Products: PCA and Feature Selection

I'm working on my MS in Analytics and someone recently asked about using Principal Component Analysis (PCA) for feature selection.

Real life story: 40 features

The team takes a 4-column 2MM sample data set and turns that into 40 features.
- Feature engineering enables you to build more complex models than you could with only raw data. It also allows you to build interpretable models from any amount of data.
What next: feature selection or PCA? Maybe both.

What is Feature Selection and why do you do it?

Feature selection helps you limit these features to a manageable number by keeping the original variables that matter most and dropping the rest.
Methods
- Forward, backward or stepwise. Tiny Algorithm: Each feature must meet target criteria or be dropped.
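- Lasso, Ridge, Elastic Net. Lasso and Elastic Net can shrink a coefficient all the way to zero, which drops the feature; Ridge only shrinks coefficients, so it doesn't select on its own. Deep divers only, check out https://towardsdatascience.com/whats-the-difference-between-linear-regression-lasso-ridge-and-elasticnet-8f997c60cf29.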
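
Here's a minimal sketch of the forward flavor of that tiny algorithm, using only numpy. The function name `forward_select`, the `min_gain` threshold, and the synthetic data are all my own assumptions for illustration; the "target criteria" here is simply "improves in-sample R^2 by at least `min_gain`". In real life you would more likely score on held-out data or use something like AIC.

```python
import numpy as np

def forward_select(X, y, min_gain=0.01):
    """Greedy forward selection (illustrative sketch).

    Start with no features. At each step, add the feature that raises
    in-sample R^2 the most; stop when the best gain is below min_gain.
    """
    n_samples, n_features = X.shape
    selected = []
    best_r2 = 0.0
    while len(selected) < n_features:
        scores = {}
        for j in range(n_features):
            if j in selected:
                continue
            cols = selected + [j]
            # Ordinary least squares with an intercept on the candidate set.
            A = np.column_stack([np.ones(n_samples), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            scores[j] = 1.0 - resid.var() / y.var()
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break  # no remaining feature meets the criterion
        selected.append(j_best)
        best_r2 = scores[j_best]
    return selected, best_r2

# Synthetic data (made up): only columns 1 and 4 actually drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=500)

selected, r2 = forward_select(X, y)
print("selected columns:", selected, "R^2:", round(r2, 3))
```

The features that survive are original columns, so you can still name them in a meeting. Backward selection is the mirror image: start with everything and drop the feature whose removal hurts least.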

What is PCA and why do you do it?

The professor's classic answer: PCA is a dimensionality reduction tool. It builds new components that are combinations of all your original features, ordered by how much variation each one captures.
PCA may help with feature selection if the most important variables also have the most variation.
It is a great tool for observing trends, clusters, and outliers, and for reducing data sets for exploration.
Here's a great in-depth article if you want to dig into it: https://towardsdatascience.com/pca-is-not-feature-selection-3344fb764ae6.
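
To see why "most variation" and "most important" are not the same thing, here is a small sketch with made-up data, again numpy only. I'm assuming two hypothetical features: `noisy` has a lot of variance and nothing to do with the target, while `signal` has little variance and drives it. PCA is done by hand with an SVD on mean-centered data, and I deliberately skip standardizing so the variance gap stays visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
noisy = rng.normal(scale=10.0, size=n)   # large variance, unrelated to y
signal = rng.normal(scale=1.0, size=n)   # small variance, drives y
X = np.column_stack([noisy, signal])
y = 2.0 * signal + rng.normal(scale=0.1, size=n)

# PCA via SVD of the mean-centered data. Rows of Vt are the components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)

# Project the data onto the components to get the component scores.
scores = Xc @ Vt.T

# How strongly does each component track the target?
corr = [abs(np.corrcoef(scores[:, k], y)[0, 1]) for k in range(2)]

print("explained variance ratio:", np.round(explained_ratio, 3))
print("PC1 loadings (noisy, signal):", np.round(Vt[0], 3))
print("|corr(PC, y)|:", np.round(corr, 3))
```

In this setup the first component captures nearly all the variance and is almost entirely the `noisy` feature, yet it is the second component that tracks the target. Keeping only the top component would throw away the one thing you wanted.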

Drop me a note and tell me how you do it.

-DCN
