Danielle Costa Nakano

Data Science Products: PCA and Feature Selection

I'm working on my MS in Analytics and someone recently asked about using Principal Component Analysis (PCA) for feature selection.


Real life story: 40 features

  • The team takes a 4-column, 2MM sample data set and turns it into 40 features.

    • Feature engineering lets you build more complex models than you could with raw data alone. It can also help keep those models interpretable, whatever the size of the data set.

  • What next? Feature selection or PCA? Maybe both.


What is Feature Selection and why do you do it?


What is PCA and why do you do it?

  • The professor's classic answer - PCA is a dimensionality reduction tool.

  • PCA may help with feature selection if the most important variables also happen to have the most variation.

  • PCA is a great tool for spotting trends, clusters, and outliers, and for reducing data sets for exploration.

  • Here's a great in-depth article if you want to dig into it: https://towardsdatascience.com/pca-is-not-feature-selection-3344fb764ae6.
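
To make the "most variation is not always most important" point concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the feature names `noisy` and `signal`, the `target`, and the numbers are my assumptions, not the team's data. It runs PCA by hand on two unscaled features, where one has a lot of variance but no link to the target and the other has little variance but drives the target.

```python
import math
import random

random.seed(0)
n = 2000

# Hypothetical data: "noisy" has lots of variance but no link to the target;
# "signal" has little variance but drives the target.
noisy = [random.gauss(0, 10) for _ in range(n)]
signal = [random.gauss(0, 1) for _ in range(n)]
target = [2.0 * s + random.gauss(0, 0.1) for s in signal]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))


# PCA on two features: eigen-decompose the 2x2 covariance matrix [[a, b], [b, c]].
a, b, c = cov(noisy, noisy), cov(noisy, signal), cov(signal, signal)
half_trace = (a + c) / 2
spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1, lam2 = half_trace + spread, half_trace - spread  # lam1 >= lam2

# The unit eigenvector for lam1 is PC1; PC2 is perpendicular to it.
vx, vy = lam1 - c, b
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)
pc2 = (-pc1[1], pc1[0])

# Project the centered data onto each component.
m_noisy, m_signal = mean(noisy), mean(signal)
centered = [(x - m_noisy, y - m_signal) for x, y in zip(noisy, signal)]
pc1_scores = [pc1[0] * x + pc1[1] * y for x, y in centered]
pc2_scores = [pc2[0] * x + pc2[1] * y for x, y in centered]

explained_by_pc1 = lam1 / (lam1 + lam2)
print(f"variance explained by PC1: {explained_by_pc1:.3f}")
print(f"PC1 weights (noisy, signal): ({pc1[0]:.3f}, {pc1[1]:.3f})")
print(f"|corr(PC1, target)|: {abs(corr(pc1_scores, target)):.3f}")
print(f"|corr(PC2, target)|: {abs(corr(pc2_scores, target)):.3f}")
```

In this toy setup the first component soaks up nearly all of the variance and is almost entirely the `noisy` feature, while the component that tracks the target is the second one. Keeping only PC1 would throw away the useful feature. Note that I left the features unscaled on purpose; standardizing them first, which is common before PCA, would even out the variances and change the picture.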


Drop me a note and tell me how you do it.


-DCN

