EBU6502_cloud_computing_notes/1-4-scalability.md

# Scalability

## Definition

- Ability to use more resources without major changes to the original setup

## Reasons (Not important)

- Increasing data
- Globalization
- New technologies

## Trends

### Moore's law

### Parallel computing

### Distributed and cloud computing

### Virtualization

### Cluster computing

- Definition: loosely or tightly coupled pool of computers that work together
  collectively and cooperatively as a single computing resource to solve the
  same or common task
- Benefits
    - Enables scalable parallel computing
    - Achieves high availability through standalone operation and failover
    - Modular growth: easy to upgrade

## Measurements of scalability

- Functional scalability: adding new functions causes no degradation in memory
  usage or performance
- Geographical scalability: distributing the system globally causes no
  degradation in performance
- Administrative scalability: adding more users causes no degradation
- Heterogeneous and generational: adding features and components from different
  vendors and manufacturers causes no degradation
- Load scalability: adding more load causes no degradation

## Strategies for scalability: horizontal and vertical

- Horizontal: Add more nodes to an existing cluster
- Vertical: Add more resources to a **single node**

## Performance and Hardware scalability

### Relationship (Important)

- Scaling **increases** performance
- Performance is **not** directly proportional to resources added
- Once diminishing returns set in, tuning is a better choice than adding more
  resources

### Amdahl's law (Important)

- Formula
    - Speedup factor:
      $$S = \frac{T}{\alpha T + (1 - \alpha) T / n} = \frac{1}{\alpha + (1 - \alpha) / n}$$
    - $T$: execution time on a single processor
    - $\alpha$: fraction of serial computation
    - $1 - \alpha$: fraction that can be parallelized
    - $n$: processors used
- The maximum speedup of $n$ is only achieved when $\alpha$ reaches zero, which
  means the program is fully parallelized
- TODO: work on page 19, draw graph
- Assumption: the **same amount of workload** is used for both the
  single-processor run and the $n$-processor run
- System efficiency formula:
    - $$E = S / n = 1 / ( \alpha \times n + 1 - \alpha )$$
- Efficiency is low when $n$ is large, because most nodes are idling (waiting
  for the serial computation to complete)
- Also called **fixed workload efficiency**
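
The two formulas above can be tried out numerically. This is a minimal sketch: the function names and the serial fraction `alpha = 0.25` are assumptions chosen for illustration, not values from the course material. It shows the speedup levelling off below $1 / \alpha$ while efficiency keeps dropping as $n$ grows.

```python
def amdahl_speedup(alpha: float, n: int) -> float:
    """Fixed-workload speedup S = 1 / (alpha + (1 - alpha) / n)."""
    return 1.0 / (alpha + (1.0 - alpha) / n)


def amdahl_efficiency(alpha: float, n: int) -> float:
    """Efficiency E = S / n, equal to 1 / (alpha * n + 1 - alpha)."""
    return amdahl_speedup(alpha, n) / n


if __name__ == "__main__":
    alpha = 0.25  # assumed serial fraction, for illustration only
    for n in (1, 4, 16, 256):
        s = amdahl_speedup(alpha, n)
        e = amdahl_efficiency(alpha, n)
        print(f"n={n:4d}  speedup={s:6.3f}  efficiency={e:6.3f}")
```

With `alpha = 0.25` the speedup can never reach 4, however many processors are added.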

### Gustafson’s Law

- To enhance efficiency, scale the workload to match the capacity
- Also called the scaled-workload speedup
- Formula
    - Scale the workload to: $$W' = \alpha W + (1 - \alpha) \times n \times W$$
    - $n$: processor count
    - $W$: original workload
    - $W'$: scaled workload
- Only the parallelizable portion is scaled
- **Scaled workload** speedup
    - $$S' = W' / W = \alpha + (1 - \alpha) \times n$$
- **Scaled** efficiency:
    - $$E' = S' / n = \alpha / n + (1 - \alpha)$$
- TODO: do some calculation
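
A small calculation with the scaled-workload formulas above might look like the following sketch. The function names and the sample values (`alpha = 0.25`, `n = 4`, `w = 100`) are assumptions for illustration only.

```python
def scaled_workload(alpha: float, n: int, w: float = 1.0) -> float:
    """Scaled workload: only the parallel part (1 - alpha) grows with n."""
    return alpha * w + (1.0 - alpha) * n * w


def gustafson_speedup(alpha: float, n: int) -> float:
    """Scaled-workload speedup: alpha + (1 - alpha) * n."""
    return alpha + (1.0 - alpha) * n


def gustafson_efficiency(alpha: float, n: int) -> float:
    """Scaled efficiency: speedup / n, equal to alpha / n + (1 - alpha)."""
    return gustafson_speedup(alpha, n) / n


if __name__ == "__main__":
    alpha, n, w = 0.25, 4, 100.0  # assumed example values
    print("scaled workload:", scaled_workload(alpha, n, w))
    print("scaled speedup:", gustafson_speedup(alpha, n))
    print("scaled efficiency:", gustafson_efficiency(alpha, n))
```

For the same `alpha` and `n`, the scaled efficiency is higher than the fixed-workload efficiency because the extra processors are given extra parallel work instead of idling.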

## Availability (Important)

- Formula:
    - $$SA = MTTF / (MTTF + MTTR)$$
    - $MTTF$: Mean time to **failure**, the longer the better
    - $MTTR$: Mean time to **repair**, the shorter the better
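
As a quick check of the formula above, here is a minimal sketch. The function name and the MTTF/MTTR figures are made up for illustration, and both values must be in the same time unit.

```python
def availability(mttf: float, mttr: float) -> float:
    """System availability: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    # Hypothetical system: fails on average every 999 hours,
    # and takes 1 hour on average to repair.
    a = availability(mttf=999.0, mttr=1.0)
    print(f"availability = {a:.3%}")
```

Halving MTTR here raises availability more cheaply than doubling MTTF would, which is why both terms matter.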

## Types of scaling

- Strong: How the performance changes when increasing **processors** for a
  **fixed** total problem size
- Weak: How the performance changes when increasing **processors** for a
  **fixed problem size per processor**
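
The two kinds of scaling are often summarised by an efficiency figure computed from measured run times. The sketch below uses the commonly quoted definitions, which are an assumption here because the notes do not give them. The timings are hypothetical.

```python
def strong_scaling_efficiency(t1: float, tn: float, n: int) -> float:
    """Fixed total problem: ideal time on n processors is t1 / n."""
    return t1 / (n * tn)


def weak_scaling_efficiency(t1: float, tn: float) -> float:
    """Fixed problem per processor: ideal time stays equal to t1."""
    return t1 / tn


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print("strong:", strong_scaling_efficiency(t1=100.0, tn=30.0, n=4))
    print("weak:", weak_scaling_efficiency(t1=100.0, tn=125.0))
```

Strong scaling corresponds to the fixed workload of Amdahl's law, and weak scaling to the scaled workload of Gustafson's law.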

## Scalability in cloud computing

- Elasticity: allocated resources can be altered (increased or decreased) as
  needed
- Virtualization scaling: adding an existing system to the cloud
    - Performance issues: virtualized components may perform slower
      than bare metal
- Scale by migrating resources: Regions and Availability Zones
    - Region: a physical geographical location that consists of one or more zones
- Load balancing: AWS Elastic Load Balancing distributes incoming application
  traffic across multiple targets
- Auto scaling: Amazon EC2 Auto Scaling
- CDN: AWS CloudFront uses edge caching at edge locations to serve content
  from a location closer to the viewer, in order to achieve lower **latency**
  and higher **transfer speed**
- TODO: work the questions