Machine Learning

Feature Engineering Basics

Reshape raw columns into inputs a model can learn from.

5 min read · intro

Why it matters

A model can only learn from the features you give it. Good features often beat fancy algorithms, because they expose the structure that matters in a form the model can use.

Common transformations

  • Scaling puts numeric features on a comparable range, which helps distance-based and gradient-based methods.
  • Encoding turns categories into numbers, such as one-hot columns for nominal values.
  • Binning groups a continuous value into ranges, which can capture thresholds.
  • Interactions multiply or combine two features to expose joint effects.
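
The four transformations above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`standardize`, `one_hot`, `bin_value`, `interaction`) and the sample values are assumptions made for this example, and standardization is only one of several ways to scale.

```python
import bisect
from statistics import mean, pstdev


def standardize(values):
    """Scaling: shift to zero mean and divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # a constant column has sigma 0; avoid dividing by it
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encoding: one 0/1 column per category, in sorted category order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def bin_value(value, edges):
    """Binning: index of the range that value falls into, given sorted edges."""
    return bisect.bisect_right(edges, value)


def interaction(a, b):
    """Interaction: element-wise product of two numeric features."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical sample data, for illustration only.
ages_scaled = standardize([20, 30, 40])
colors, color_rows = one_hot(["red", "green", "red"])
age_bin = bin_value(25, [18, 35, 65])   # edges split ages into four ranges
area = interaction([2, 3], [4, 5])      # e.g. width * height
```

In practice a library would usually handle these steps; the hand-written versions are only meant to show what each transformation does to the data.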

Domain features

The biggest wins usually come from domain knowledge.

  • From a timestamp, extract the hour, day of week, or whether it is a holiday.
  • From a price and a count, derive a per-unit value.
  • From text, count keywords or compute a length.
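
As a rough sketch of the three derivations above, the following uses only the Python standard library. The `HOLIDAYS` set, the function names, and the sample inputs are hypothetical; a real project would load its own holiday calendar and keyword list.

```python
from datetime import date, datetime

# Hypothetical holiday calendar; replace with one that fits your domain.
HOLIDAYS = {date(2024, 1, 1)}


def timestamp_features(ts):
    """Derive hour, day of week, and a holiday flag from a timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday is 0, Sunday is 6
        "is_holiday": ts.date() in HOLIDAYS,
    }


def unit_price(total_price, count):
    """Derive a per-unit value; return None when the count is zero."""
    return total_price / count if count else None


def text_features(text, keywords):
    """Derive the character length and the number of keyword hits."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return {
        "length": len(text),
        "keyword_count": sum(w in keywords for w in words),
    }


ts_feats = timestamp_features(datetime(2024, 1, 1, 9, 30))
per_unit = unit_price(12.0, 4)
txt_feats = text_features("Free shipping, free returns!", {"free"})
```

Returning `None` for a zero count is one design choice among several; depending on the model, a sentinel value or a separate "missing" flag may work better.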

Pitfalls

  • Avoid features that encode the target, directly or indirectly; this is a form of leakage.
  • Fit any scaler or encoder on training data only, then apply it unchanged to validation and test data.
  • More features are not always better: noisy features can hurt accuracy and slow training.
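
The "fit on training data only" rule can be sketched as follows. `SimpleScaler` is a hypothetical stand-in written for this example, not a library class, and the numbers are made up. The point is that the mean and standard deviation come from the training rows alone and are then reused on the test rows.

```python
from statistics import mean, pstdev


class SimpleScaler:
    """Hypothetical minimal standardizer used only for illustration."""

    def fit(self, values):
        # Learn the statistics from the data passed in (training data only).
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # guard against a constant column
        return self

    def transform(self, values):
        # Apply the stored statistics without recomputing them.
        return [(v - self.mean_) / self.std_ for v in values]


train_values = [10.0, 20.0, 30.0]
test_values = [40.0]

scaler = SimpleScaler().fit(train_values)     # statistics from training rows only
train_scaled = scaler.transform(train_values)
test_scaled = scaler.transform(test_values)   # reuses the training mean and std
```

Fitting on the combined training and test values would let information about the test set shape the features, which makes evaluation scores look better than they should.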

Key idea

Feature engineering reshapes raw data into informative inputs through scaling, encoding, and domain-derived features, with care to avoid leakage.

Check yourself

1. Which often improves a model the most?

2. What is a leakage risk in feature engineering?