diff --git a/4-data-analytics.md b/4-data-analytics.md
index 96c5e0c..e022c6b 100644
--- a/4-data-analytics.md
+++ b/4-data-analytics.md
@@ -12,7 +12,7 @@
     - [Numeric with meaningful magnitude:](#numeric-with-meaningful-magnitude)
     - [Have enough samples](#have-enough-samples)
     - [Bring human insight to problem](#bring-human-insight-to-problem)
-  - [Process of Feature Engineering](#process-of-feature-engineering)
+  - [Methods of Feature Engineering](#methods-of-feature-engineering)
     - [Scaling](#scaling)
       - [Rationale:](#rationale)
       - [Methods:](#methods)
@@ -29,7 +29,16 @@
         - [k means binning](#k-means-binning)
         - [decision trees](#decision-trees)
     - [Encoding](#encoding)
+      - [Definition](#definition)
+      - [Reason](#reason)
+      - [Methods](#methods)
+        - [One hot encoding](#one-hot-encoding)
+        - [Ordinal encoding](#ordinal-encoding)
+        - [Count / frequency encoding](#count-frequency-encoding)
+        - [Mean / target encoding](#mean-target-encoding)
     - [Transformation](#transformation)
+      - [Reasons](#reasons)
+      - [Methods](#methods)
     - [Generation](#generation)
@@ -83,8 +92,8 @@
 ### Numeric with meaningful magnitude:
 
 - It does not mean that **categorical** features can't be used in training:
-  simply, they will need to be **transformed** through a process called one-hot
-  encoding
+  simply, they will need to be **transformed** through a process called
+  [encoding](#encoding)
 - Example: Font category: (Arial, Times New Roman)
 
 ### Have enough samples
@@ -99,7 +108,7 @@
   **curious mind**
 
 - This is an iterative process, need to use **feedback** from production usage
 
-## Process of Feature Engineering
+## Methods of Feature Engineering
 
 ### Scaling
@@ -153,7 +162,7 @@
 #### Reason for binning
 
 - Example: Solar energy modeling
-  - Acelleration calculation, by binning, and reduce the number of simulation
+  - Accelerates the calculation: binning reduces the number of simulations
     needed
 - Improves **performance** by grouping data with **similar attributes** and has
   **similar predictive strength**
@@ -192,6 +201,93 @@
 ### Encoding
 
+#### Definition
+
+- The inverse of binning: creating numerical values from categorical variables
+
+#### Reason
+
+- Machine learning algorithms require **numerical** input data, so
+  **categorical** data must first be converted to **numerical** data
+
+#### Methods
+
+(Each of the four methods below is illustrated in the short code sketch at the
+end of this section.)
+
+##### One hot encoding
+
+- Replaces a categorical (nominal) variable with a set of binary variables,
+  one per category
+- **Eliminates** **ordinality**: nominal categories shouldn't be ranked, and
+  encoding them as a single number would make the algorithm assume an ordering
+  between the categories that does not exist
+- Improves performance by allowing the model to capture complex relationships
+  in the data that may be **missed** if categorical variables are treated as
+  **single** entities
+- Cons
+  - High dimensionality: makes the model more complex and slower to train
+  - Produces sparse data
+  - May lead to overfitting, especially if there are many categories and the
+    sample size is small
+- Usage:
+  - Good for algorithms that look at all features at the same time: neural
+    networks, clustering, SVM
+  - Used for linear regression, but **keep k-1** binary variables to avoid
+    **multicollinearity**:
+    - In linear regression, the presence of all k binary variables for a
+      categorical feature (where k is the number of categories) introduces
+      perfect multicollinearity. This happens because the k-th variable is a
+      linear **combination** of the others (e.g., if "Red" and "Blue" are 0,
+      "Green" must be 1).
+  - Don't use for tree-based algorithms
+
+##### Ordinal encoding
+
+- Ordinal variable: comprises a finite set of discrete values with a **ranked**
+  ordering
+- Ordinal encoding replaces each label with an ordered number
+- Does not by itself give the variable more predictive power; it only
+  preserves the existing ranking
+- Usage:
+  - For categorical data with ordinal meaning
+
+##### Count / frequency encoding
+
+- Replaces each label with its count (or frequency) of occurrences
+- Cons:
+  - Loses unique categories: if two categories have the same frequency, they
+    are treated as the same
+  - Doesn't handle unseen categories
+  - Prone to overfitting, especially when frequencies are generally low
+
+##### Mean / target encoding
+
+- Replaces each category with the average of the _target_ values for that
+  category
+- Creates a monotonic relationship between the variable and the target
+- Does not expand the feature space
+- Con: prone to overfitting
+- Usage:
+  - High-cardinality data (many distinct categories), by leveraging the
+    target variable's statistics to retain predictive power
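+
+A minimal sketch of the four encoders above, using pandas; the toy DataFrame
+and its column names are hypothetical, chosen only for illustration:
+
+```python
+import pandas as pd
+
+# Hypothetical toy data: "color" is nominal, "size" is ordinal
+df = pd.DataFrame({
+    "color": ["red", "blue", "green", "blue", "red", "red"],
+    "size": ["S", "M", "L", "M", "S", "L"],
+    "target": [1, 0, 1, 0, 1, 1],
+})
+
+# One hot encoding: drop_first=True keeps k-1 dummies, avoiding the
+# multicollinearity issue described above for linear models
+df = df.join(pd.get_dummies(df["color"], prefix="color", drop_first=True))
+
+# Ordinal encoding: an explicit rank for an ordered category
+df["size_ord"] = df["size"].map({"S": 0, "M": 1, "L": 2})
+
+# Count / frequency encoding: replace each label with its relative frequency
+df["color_freq"] = df["color"].map(df["color"].value_counts(normalize=True))
+
+# Mean / target encoding: replace each label with the mean target value of
+# its category (fit on training data only, since it overfits easily)
+df["color_mean"] = df["color"].map(df.groupby("color")["target"].mean())
+```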
+
 ### Transformation
 
+#### Reasons
+
+- Linear/logistic regression models make assumptions about the relationship
+  between the predictors and the outcome.
+  - Transformation may help create this relationship and so avoid poor
+    performance.
+  - Assumptions:
+    - Linear dependency between the predictors and the outcome
+    - Multivariate normality (every variable X should follow a Gaussian
+      distribution)
+    - No or little multicollinearity
+    - Homogeneity of variance (homoscedasticity)
+  - Example:
+    - assuming y > 0.5 leads to class 1, otherwise class 2
+    - ![page 1](./assets/4-analytics-line-regression.webp)
+    - ![page 2](./assets/4-analytics-line-regression-2.webp)
+- Some other ML algorithms do not make any assumptions but may still benefit
+  from better-distributed data
+
+#### Methods
+
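+One common transformation (among others such as square root, reciprocal, and
+Box-Cox) is the log transform, which compresses a long right tail. A minimal
+sketch, assuming a hypothetical right-skewed positive feature:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+
+# Hypothetical right-skewed positive feature (e.g., income-like data)
+x = rng.lognormal(mean=3.0, sigma=1.0, size=1_000)
+
+# Log transform: log1p (log(1 + x)) stays defined when x contains zeros;
+# it compresses the right tail so the result is closer to Gaussian
+x_log = np.log1p(x)
+
+def skewness(a):
+    """Sample skewness: roughly 0 for a symmetric distribution."""
+    return np.mean((a - a.mean()) ** 3) / a.std() ** 3
+
+print(f"skewness before: {skewness(x):.2f}")      # strongly positive
+print(f"skewness after:  {skewness(x_log):.2f}")  # much closer to 0
+```
+
 ### Generation
 
diff --git a/assets/4-analytics-line-regression-2.webp b/assets/4-analytics-line-regression-2.webp
new file mode 100644
index 0000000..02d72b9
Binary files /dev/null and b/assets/4-analytics-line-regression-2.webp differ
diff --git a/assets/4-analytics-line-regression.webp b/assets/4-analytics-line-regression.webp
new file mode 100644
index 0000000..348c6ad
Binary files /dev/null and b/assets/4-analytics-line-regression.webp differ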