Data Science: PCA and Feature Selection

  • Writer: Danielle Costa Nakano
  • Aug 15, 2021
  • 1 min read

Updated: Dec 1, 2024

I'm working on my MS in Analytics and someone recently asked about using Principal Component Analysis (PCA) for feature selection.


Real life story: 40 features

  • The team takes a 4-column 2MM sample data set and turns that into 40 features.

    • Feature engineering enables you to build more complex models than you could with only raw data. It also allows you to build interpretable models from any amount of data.

  • What next: feature selection or PCA? Maybe both.
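
To make the "4 columns in, many features out" step concrete, here is a minimal sketch. It is not the team's actual pipeline: the column names, the random data, and the choice of transforms (logs, squares, pairwise products and ratios) are all my assumptions for illustration. It turns 4 raw columns into 24 candidate features.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 5 rows, 4 positive numeric columns (made-up names).
raw_names = ["price", "quantity", "weight", "distance"]
raw = rng.uniform(1.0, 10.0, size=(5, 4))

# Start with the raw columns themselves.
features = {name: raw[:, i] for i, name in enumerate(raw_names)}

# Single-column transforms: log and square of each raw column.
for i, name in enumerate(raw_names):
    features[f"log_{name}"] = np.log(raw[:, i])
    features[f"{name}_squared"] = raw[:, i] ** 2

# Pairwise interactions: product and ratio for each pair of raw columns.
for i, j in combinations(range(len(raw_names)), 2):
    a, b = raw_names[i], raw_names[j]
    features[f"{a}_x_{b}"] = raw[:, i] * raw[:, j]
    features[f"{a}_per_{b}"] = raw[:, i] / raw[:, j]

X = np.column_stack(list(features.values()))
print(X.shape)
print(list(features)[:6])
```

Many of these derived columns will be redundant or uninformative, which is exactly why the "what next?" question comes up.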


What is Feature Selection and why do you do it?


What is PCA and why do you do it?

  • The professor's classic answer - PCA is a dimensionality reduction tool.

  • PCA may help with feature selection if the most important variables also happen to have the most variation.

  • PCA is a great tool for observing trends, clusters, and outliers, and for reducing data sets for exploration.

  • Here's a great in-depth article if you want to dig into it: https://towardsdatascience.com/pca-is-not-feature-selection-3344fb764ae6.
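
Here is a small sketch of why "most variation" and "most important" are not the same thing. Everything in it is an assumption for illustration: the two made-up columns (`noisy` and `signal`), the synthetic target `y`, and the choice to run PCA via NumPy's SVD on centered, unscaled data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data: "noisy" has a large variance but no link to the target;
# "signal" has a small variance but drives the target.
noisy = rng.normal(0.0, 10.0, size=n)
signal = rng.normal(0.0, 1.0, size=n)
X = np.column_stack([noisy, signal])
y = 3.0 * signal + rng.normal(0.0, 0.1, size=n)

# PCA via SVD on the centered (deliberately unscaled) data.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)
first_pc = Vt[0]  # loadings of the first principal component

# Correlation of each raw column with the target.
corr_with_y = [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]

print("explained variance ratio:", explained_ratio.round(3))
print("first PC loadings (noisy, signal):", first_pc.round(3))
print("correlation with y (noisy, signal):", np.round(corr_with_y, 3))
```

In this setup the first component is almost entirely the `noisy` column, even though only `signal` tells you anything about `y`. PCA never looks at the target, so keeping the top components here would keep the wrong information. Standardizing the columns first would change the picture, since both would then have the same variance, but it still would not make PCA aware of `y`.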


Drop me a note and tell me how you do it.


-DCN


 
 
 

©2026 by Danielle Costa Nakano.
