From e28016ea34d0eb864a15025ce8e147c7b44d3b7f Mon Sep 17 00:00:00 2001 From: AmrutaJayanti <142327526+AmrutaJayanti@users.noreply.github.com> Date: Fri, 7 Jun 2024 11:50:36 +0530 Subject: [PATCH] Create feature-engineering.md --- docs/python/feature-engineering.md | 864 +++++++++++++++++++++++++++++ 1 file changed, 864 insertions(+) create mode 100644 docs/python/feature-engineering.md diff --git a/docs/python/feature-engineering.md b/docs/python/feature-engineering.md new file mode 100644 index 000000000..078a6c8b4 --- /dev/null +++ b/docs/python/feature-engineering.md @@ -0,0 +1,864 @@

**Feature:**

In the context of machine learning, a feature (also known as a variable
or attribute) is an individual measurable property or characteristic of
a data point that is used as input for a machine learning algorithm.
Features can be numerical, categorical, or text-based, and they
represent different aspects of the data that are relevant to the
problem at hand.

For example, in a dataset of housing prices, features could include the
number of bedrooms, the square footage, the location, and the age of
the property. In a dataset of customer demographics, features could
include age, gender, income level, and occupation.

**What is Feature Engineering?**

Feature engineering is the process of transforming raw data into
features that are suitable for machine learning models. In other words,
it is the process of selecting, extracting, and transforming the most
relevant features from the available data to build more accurate and
efficient machine learning models.

**Why do we need it?**

*Improve Model Performance:* The primary reason we engineer features is
to give the model more informative inputs. Features that expose the
patterns hidden in raw data make those patterns easier for an algorithm
to learn, which improves predictive accuracy.

*Reduce Model Complexity:* Well-constructed features allow simpler,
faster models to perform well, reducing training time and making the
resulting models easier to interpret and maintain.

*Handle Real-World Data:* Raw data is rarely model-ready: it contains
missing values, outliers, mixed scales, and categorical text. Feature
engineering turns this messy input into a clean, numerical
representation that algorithms can consume.

*Reduce Overfitting:* By selecting only the relevant features and
discarding noisy or redundant ones, we reduce the chance that a model
memorizes noise in the training data instead of learning patterns that
generalize.

*Meet Algorithm Requirements:* Many algorithms make assumptions about
their input — distance-based methods need comparable scales, and linear
models need numerical inputs. Feature engineering ensures the data
satisfies these requirements.

![FE](https://tse3.mm.bing.net/th?id=OIP.sCoM-hxdiEZW73coQYeQawHaDA&pid=Api&P=0&h=180)

**Processes involved:**

- Feature Transformation

- Feature Construction

- Feature Extraction

- Feature Selection

**FEATURE TRANSFORMATION**

Feature transformation is the process of transforming the features into
a more suitable representation for the machine learning model. This is
done to ensure that the model can effectively learn from the data. It
includes:

- Missing Value Imputation
- Handling Categorical Values
- Outlier Detection
- Feature Scaling
**1.) Missing Value Imputation**

Missing value imputation is a critical step in data preprocessing where
missing data points are filled in with estimated values. Common
techniques include:

- Simple Imputation

- K-Nearest Neighbours (KNN) Imputation

- Multivariate Imputation by Chained Equations (MICE)

**SIMPLE IMPUTATION**

``` python
# Simple Imputation
# Let's first create a sample dataset
import numpy as np
import pandas as pd

# Sample DataFrame
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, 5],
    'C': ['cat', 'dog', np.nan, 'mouse', 'rabbit']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
```

Output:

    Original DataFrame:
         A    B       C
    0  1.0  NaN     cat
    1  2.0  2.0     dog
    2  NaN  3.0     NaN
    3  4.0  4.0   mouse
    4  5.0  5.0  rabbit

Mean Imputation

``` python
from sklearn.impute import SimpleImputer

# Mean Imputation for numerical columns
mean_imputer = SimpleImputer(strategy='mean')
df['A'] = mean_imputer.fit_transform(df[['A']])

print("\nDataFrame after Mean Imputation:")
print(df)
```
Output:

    DataFrame after Mean Imputation:
         A    B       C
    0  1.0  NaN     cat
    1  2.0  2.0     dog
    2  3.0  3.0     NaN
    3  4.0  4.0   mouse
    4  5.0  5.0  rabbit

The code above performs mean imputation: it replaces each missing value
with the mean of its column.

`SimpleImputer` is a class in the `sklearn.impute` module of the
scikit-learn library, used for handling missing data by providing basic
strategies for imputing missing values. It replaces missing values with
a specified constant or with a statistic (such as the mean, median, or
mode) of the corresponding column.

Here it selects column A and replaces each `NaN` with the mean of that
column.

``` python
# Median Imputation for numerical columns
median_imputer = SimpleImputer(strategy='median')
df['B'] = median_imputer.fit_transform(df[['B']])

print("\nDataFrame after Median Imputation:")
print(df)
```

Output:

    DataFrame after Median Imputation:
         A    B       C
    0  1.0  3.5     cat
    1  2.0  2.0     dog
    2  3.0  3.0     NaN
    3  4.0  4.0   mouse
    4  5.0  5.0  rabbit

The code above performs median imputation: it replaces missing values
with the median of the column, which is useful for skewed
distributions. Here it selects column B and replaces each `NaN` with
the median of that column. The mode (most frequent value) strategy,
shown below, covers the remaining categorical column.
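Column C is categorical, so a mean or median cannot be computed for it.
The same `SimpleImputer` supports a `most_frequent` strategy (mode
imputation) that works for both numerical and string columns. A minimal
sketch, continuing the same DataFrame (the `.ravel()` call just
flattens the imputer's 2-D output for column assignment):

``` python
# Mode (most frequent) imputation for the categorical column
mode_imputer = SimpleImputer(strategy='most_frequent')
df['C'] = mode_imputer.fit_transform(df[['C']]).ravel()

print("\nDataFrame after Mode Imputation:")
print(df)
```

Since every category in C appears exactly once here, the mode is
ambiguous; scikit-learn resolves such ties by choosing the smallest
value, so the `NaN` is filled with `'cat'`.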
**KNN IMPUTATION**

``` python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Sample DataFrame
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, 5],
    'C': [7, 8, 9, np.nan, 11]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Initialize the KNNImputer
knn_imputer = KNNImputer(n_neighbors=3)

# Fit the imputer and transform the data
df_imputed = pd.DataFrame(knn_imputer.fit_transform(df), columns=df.columns)

print("\nDataFrame after KNN Imputation:")
print(df_imputed)
```

Output:

    Original DataFrame:
         A    B     C
    0  1.0  NaN   7.0
    1  2.0  2.0   8.0
    2  NaN  3.0   9.0
    3  4.0  4.0   NaN
    4  5.0  5.0  11.0


    DataFrame after KNN Imputation:
              A    B          C
    0  1.000000  3.0   7.000000
    1  2.000000  2.0   8.000000
    2  2.333333  3.0   9.000000
    3  4.000000  4.0   9.333333
    4  5.000000  5.0  11.000000

KNN (K-Nearest Neighbours) imputation replaces missing values by
considering the values of the nearest neighbours: the imputer finds the
k nearest neighbours of an instance with missing values and uses their
values to fill in the gaps. It tends to be more robust than simpler
imputation methods such as mean or median imputation.

Scikit-learn provides a `KNNImputer` class in the `sklearn.impute`
module, which makes it straightforward to perform KNN imputation; note
that it operates on numerical features, since it relies on a
(NaN-aware) Euclidean distance between rows.

**MULTIVARIATE IMPUTATION BY CHAINED EQUATIONS**

``` python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import IterativeImputer

# Sample DataFrame
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, 4, 5],
    'C': [7, 8, 9, np.nan, 11]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Initialize the IterativeImputer
mice_imputer = IterativeImputer(max_iter=10, random_state=0)

# Fit the imputer and transform the data
df_imputed = pd.DataFrame(mice_imputer.fit_transform(df), columns=df.columns)

print("\nDataFrame after MICE Imputation:")
print(df_imputed)
```

Output:

    Original DataFrame:
         A    B     C
    0  1.0  NaN   7.0
    1  2.0  2.0   8.0
    2  NaN  3.0   9.0
    3  4.0  4.0   NaN
    4  5.0  5.0  11.0

    DataFrame after MICE Imputation:
             A         B          C
    0  1.00000  0.999988   7.000000
    1  2.00000  2.000000   8.000000
    2  3.00005  3.000000   9.000000
    3  4.00000  4.000000   9.999993
    4  5.00000  5.000000  11.000000

Multivariate Imputation by Chained Equations (MICE), also known as
Fully Conditional Specification (FCS), is a method for handling missing
data by iteratively imputing each missing value using a regression
model. It allows for complex relationships between variables and can
provide more accurate imputations than simpler methods.

In this example, the `IterativeImputer` iteratively imputes the missing
values in columns 'A', 'B', and 'C' using regression models. This
imputation method takes the relationships between all the columns into
account, providing a more accurate imputation than simpler methods.

**2.) Handling Categorical Values**

Handling categorical values is a critical step in feature
transformation for machine learning. Categorical data can be
transformed into numerical values in various ways to make it suitable
for modeling.

Common techniques used to handle categorical values:

- One-Hot Encoding

- Label Encoding

- Ordinal Encoding

**One-Hot Encoding**

One-hot encoding transforms categorical variables into a set of binary
columns. Each category is represented by its own binary column (0 or 1).

``` python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Sample DataFrame
data = {'Color': ['Red', 'Blue', 'Green', 'Blue', 'Red']}
df = pd.DataFrame(data)

# Initialize the OneHotEncoder
onehot_encoder = OneHotEncoder(sparse_output=False)

# Fit and transform the data
onehot_encoded = onehot_encoder.fit_transform(df[['Color']])

# Create a DataFrame with the encoded features
onehot_encoded_df = pd.DataFrame(onehot_encoded, columns=onehot_encoder.get_feature_names_out(['Color']))

print(onehot_encoded_df)
```

Output:

       Color_Blue  Color_Green  Color_Red
    0         0.0          0.0        1.0
    1         1.0          0.0        0.0
    2         0.0          1.0        0.0
    3         1.0          0.0        0.0
    4         0.0          0.0        1.0

The code above performs one-hot encoding; `get_feature_names_out`
returns the names of the encoded columns, one per category.
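For quick, exploratory work, pandas can produce the same encoding
without scikit-learn via `pd.get_dummies`. A minimal sketch on the same
data (inside an ML pipeline, `OneHotEncoder` is usually preferable
because it remembers the categories seen during fitting and can be
applied consistently to new data):

``` python
import pandas as pd

# Same sample data as above
data = {'Color': ['Red', 'Blue', 'Green', 'Blue', 'Red']}
df = pd.DataFrame(data)

# One-hot encode directly with pandas
onehot_df = pd.get_dummies(df['Color'], prefix='Color')
print(onehot_df)
```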
+ +``` python +from sklearn.preprocessing import LabelEncoder + +# Sample DataFrame +data = {'Color': ['Red', 'Blue', 'Green', 'Blue', 'Red']} +df = pd.DataFrame(data) + +# Initialize the LabelEncoder +label_encoder = LabelEncoder() + +# Fit and transform the data +df['Color_Encoded'] = label_encoder.fit_transform(df['Color']) + +print(df) +``` + +Output: + + Color Color_Encoded + 0 Red 2 + 1 Blue 0 + 2 Green 1 + 3 Blue 0 + 4 Red 2 + +`LabelEncoder` assigns each color in the above code with a numerical +value. + +Blue - 0 + +Green - 1 + +Red - 2 + +**Ordinal Encoding** + +Ordinal encoding is useful for ordinal categorical data where there is +an inherent order. It assigns integers to categories while preserving +the order. + +``` python +from sklearn.preprocessing import OrdinalEncoder + +# Sample DataFrame +data = {'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small']} +df = pd.DataFrame(data) + +# Initialize the OrdinalEncoder +ordinal_encoder = OrdinalEncoder(categories=[['Small', 'Medium', 'Large']]) + +# Fit and transform the data +df['Size_Encoded'] = ordinal_encoder.fit_transform(df[['Size']]) + +print(df) +``` + +Output: + + Size Size_Encoded + 0 Small 0.0 + 1 Medium 1.0 + 2 Large 2.0 + 3 Medium 1.0 + 4 Small 0.0 + +Above code encodes the ordinal data \[Small,Medium,Large\] based on +their ranking by assigning a numerical value to it . + +3.)Outlier Detection:Outlier detection is an important step in data +preprocessing as outliers can significantly affect the performance of +machine learning models. Outliers are data points that differ +significantly from other observations in the dataset. Detecting and +handling outliers can improve model accuracy and reliability. + +Here are some common techniques for detecting outliers: + +- Z-score +- Interquartile Range (IQR) + +**Z-Score (Standard Score)** + +Z-score is a measure of how many standard deviations a data point is +from the mean. It assumes that the data follows a Gaussian (normal) +distribution. + +![Zscore](https://tse4.mm.bing.net/th?id=OIP.WF_pHaZPSWI5RNvMOcH8zQAAAA&pid=Api&P=0&h=180) + +``` python +import numpy as np +import pandas as pd + +# Sample DataFrame +data = {'Value': [10, 12, 12, 13, 12, 14, 100, 12, 15, 10, 12]} +df = pd.DataFrame(data) + +# Calculate Z-scores +df['Z-Score'] = (df['Value'] - df['Value'].mean()) / df['Value'].std() + +# Identify outliers +threshold = 3 +df['Outlier'] = df['Z-Score'].abs() > threshold + +print(df) +``` + + +Output: + + Value Z-Score Outlier + 0 10 -0.384024 False + 1 12 -0.308591 False + 2 12 -0.308591 False + 3 13 -0.270874 False + 4 12 -0.308591 False + 5 14 -0.233158 False + 6 100 3.010478 True + 7 12 -0.308591 False + 8 15 -0.195441 False + 9 10 -0.384024 False + 10 12 -0.308591 False + + +Particular data item z-score is calculated and if value is greater than +given threshold value then it returns `True` else `False` + + +**Interquartile Range (IQR)** + +The IQR method is based on the quartiles of the data. Outliers are +defined as points outside the range \[Q1 - 1.5 \* IQR, Q3 + 1.5 \* +IQR\], where Q1 is the first quartile and Q3 is the third quartile. 
+ +``` python +# Calculate IQR +Q1 = df['Value'].quantile(0.25) +Q3 = df['Value'].quantile(0.75) +IQR = Q3 - Q1 + +# Define outlier thresholds +lower_bound = Q1 - 1.5 * IQR +upper_bound = Q3 + 1.5 * IQR + +# Identify outliers +df['Outlier'] = (df['Value'] < lower_bound) | (df['Value'] > upper_bound) + +print(df) +``` + +Output: + + Value Z-Score Outlier + 0 10 -0.384024 False + 1 12 -0.308591 False + 2 12 -0.308591 False + 3 13 -0.270874 False + 4 12 -0.308591 False + 5 14 -0.233158 False + 6 100 3.010478 True + 7 12 -0.308591 False + 8 15 -0.195441 False + 9 10 -0.384024 False + 10 12 -0.308591 False + + +lower_bound and upper_bound mentions the range. Below lower_bound or +above upper_bound then data point is said to be an outlier + + + + +4.)Feature Scaling:Feature scaling is a crucial step in data +preprocessing for machine learning. It ensures that the numerical +features are on a similar scale, which can improve the performance of +many machine learning algorithms. + +Here are some common methods of feature scaling: + +- Standardization +- Normalisation + + +**Standardization** + +It is also called z-score Normalisation. + +`Z = (X - μ) / σ` + + +``` python +from sklearn.preprocessing import StandardScaler + +# Initialize the StandardScaler +scaler = StandardScaler() + +# Fit and transform the data +df['Value_Standardized'] = scaler.fit_transform(df[['Value']]) + +print(df) +``` + +Output: + + Value Z-Score Outlier Value_Standardized + 0 10 -0.384024 False -0.402768 + 1 12 -0.308591 False -0.323653 + 2 12 -0.308591 False -0.323653 + 3 13 -0.270874 False -0.284095 + 4 12 -0.308591 False -0.323653 + 5 14 -0.233158 False -0.244538 + 6 100 3.010478 True 3.157416 + 7 12 -0.308591 False -0.323653 + 8 15 -0.195441 False -0.204980 + 9 10 -0.384024 False -0.402768 + 10 12 -0.308591 False -0.323653 + +Above code uses `StandardScaler` to standardize the values + +**Normalisation** + +- MinMax Scaling + +- Robust Scaling + +- MaxAbs Scaling + + +**MinMax Scaling** + +Min-max scaling transforms the features to a fixed range, usually \[0, +1\]. + +![MinMax](https://tse4.mm.bing.net/th?id=OIP.vBO3wyehnnOLY67eMrXt7wHaCS&pid=Api&P=0&h=180) + + +``` python +import pandas as pd +from sklearn.preprocessing import MinMaxScaler + +# Sample DataFrame +data = {'Value': [10, 20, 30, 40, 50]} +df = pd.DataFrame(data) + +# Initialize the MinMaxScaler +scaler = MinMaxScaler() + +# Fit and transform the data +df['Value_Scaled'] = scaler.fit_transform(df[['Value']]) + +print(df) +``` + +Output: + + Value Value_Scaled + 0 10 0.00 + 1 20 0.25 + 2 30 0.50 + 3 40 0.75 + 4 50 1.00 + +`MinMaxScaler` is used to perform MinMaxScaling. + + +**Robust Scaling** + +Robust scaling uses the median and the interquartile range (IQR). It is +useful for data with outliers. + +![Robust](https://tse1.mm.bing.net/th?id=OIP.g0PtmCXLTAQJmzKyAUJQPAAAAA&pid=Api&P=0&h=180) + +``` python +from sklearn.preprocessing import RobustScaler + +# Initialize the RobustScaler +scaler = RobustScaler() + +# Fit and transform the data +df['Value_Robust'] = scaler.fit_transform(df[['Value']]) + +print(df) +``` +Output: + + Value Value_Scaled Value_Robust + 0 10 0.00 -1.0 + 1 20 0.25 -0.5 + 2 30 0.50 0.0 + 3 40 0.75 0.5 + 4 50 1.00 1.0 + +`RobustScaler' is used for performing Robust Scaling + +**MaxAbs Scaling** + +MaxAbs scaling scales each feature by its maximum absolute value. The +result is a dataset where each feature has a range of \[-1, 1\]. 
**FEATURE CONSTRUCTION**

Feature construction involves creating new features from the existing
ones to improve the performance of machine learning models.

- Polynomial Features

- Interaction Features

- Logarithmic and Exponential Transformations

**Polynomial Features**

Creating polynomial features involves generating new features by
raising existing features to a power and combining them.

``` python
from sklearn.preprocessing import PolynomialFeatures

# Sample data
X = [[2, 3], [3, 4], [4, 5]]

# Create polynomial features
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

print(X_poly)
```

Output:

    [[ 1.  2.  3.  4.  6.  9.]
     [ 1.  3.  4.  9. 12. 16.]
     [ 1.  4.  5. 16. 20. 25.]]

If you have a single feature X, its polynomial features could be X^2,
X^3, and so on. For the two features X1 and X2 above, the output
columns are [1, X1, X2, X1^2, X1*X2, X2^2].

**Interaction Features**

Interaction features are created by multiplying two or more existing
features to capture interactions between variables.

``` python
import pandas as pd

# Sample data
X = pd.DataFrame({'X1': [1, 2, 3], 'X2': [4, 5, 6]})

# Create interaction features
X['X1_X2'] = X['X1'] * X['X2']

print(X)
```
Output:

       X1  X2  X1_X2
    0   1   4      4
    1   2   5     10
    2   3   6     18

For features X1 and X2, the interaction feature is X1 * X2.

**Logarithmic and Exponential Transformations**

Applying logarithmic or exponential transformations can stabilize
variance and make the data more normally distributed.

``` python
import numpy as np

# Sample data
X = np.array([1, 2, 3, 4, 5])

# Apply logarithmic transformation
X_log = np.log(X)

print(X_log)
```
Output:

    [0.         0.69314718 1.09861229 1.38629436 1.60943791]

For a feature X, a logarithmic transformation is log(X). For data
containing zeros, `np.log1p` (which computes log(1 + X)) avoids taking
the logarithm of zero; `np.exp` and `np.expm1` are the corresponding
inverse, exponential transformations.

**FEATURE EXTRACTION**

Feature extraction is the process of transforming raw data into a set
of features that can be used for machine learning models. The goal is
to reduce the dimensionality of the data while preserving its relevant
information.

- Principal Component Analysis
- Linear Discriminant Analysis

**Principal Component Analysis (PCA)**

Principal Component Analysis (PCA) is a technique that transforms the
data into a new coordinate system such that the greatest variances by
any projection of the data come to lie on the first coordinates (called
principal components).

``` python
from sklearn.decomposition import PCA
import numpy as np

# Sample data
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Apply PCA
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)

print(X_pca)
```

Output:

    [[ 4.24264069]
     [ 1.41421356]
     [-1.41421356]
     [-4.24264069]]

`PCA` is used for principal component analysis; here the four 2-D
points are projected onto a single principal component.
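To judge how much information such a projection keeps, the fitted PCA
object exposes an `explained_variance_ratio_` attribute. Continuing the
example above (in this toy dataset the points are perfectly collinear,
so one component captures all of the variance):

``` python
# Fraction of the total variance captured by each retained component
print(pca.explained_variance_ratio_)  # prints [1.] for this collinear data
```

On real data, one would typically choose `n_components` so that the
retained components explain most (say, 95%) of the total variance.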
+ +``` python +from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA +import numpy as np + +# Sample data +X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) +y = np.array([0, 1, 0, 1]) + +# Apply LDA +lda = LDA(n_components=1) +X_lda = lda.fit_transform(X, y) + +print(X_lda) +``` +Output: + + [[-1.06066017] + [-0.35355339] + [ 0.35355339] + [ 1.06066017]] + +`LinearDiscriminantAnalysis` is used + +**Feature Selection** + +Feature selection is the process of selecting a subset of relevant +features (variables, predictors) for use in model construction. It helps +in improving model performance, reducing overfitting, and decreasing +computational cost. + +- Filter methods + +- Wrapper methods + +- Embedded methods + +**Filter Methods** + +Filter methods apply statistical measures to score the relevance of +features. They are computationally efficient and independent of any +machine learning algorithms. + +Examples: + +1. Correlation Coefficient +2. Chi-Square Test +3. ANOVA + +**Wrapper Methods** + +Wrapper methods evaluate the performance of a subset of features using a +specific machine learning algorithm. They are more computationally +intensive compared to filter methods. + +Examples: + +1. Forward Selection +2. Backward Elimination +3. Recursive Feature Elimination (RFE) + +**Embedded Methods** + +Embedded methods perform feature selection as part of the model training +process. They include methods like regularization and tree-based +methods. + +Examples: + +1. Lasso (L1 Regularization) +2. Ridge (L2 Regularization) +3. Decision Trees