# Scalability

## Definition

- Ability to use more resources without major changes to the original setup

## Reasons (not important)

- Increasing data
- Globalization
- New technologies

## Trends

### Moore's law

### Parallel computing

### Distributed and cloud computing

### Virtualization

### Cluster computing

- Definition: a loosely or tightly coupled pool of computers that work together
  collectively and cooperatively as a single computing resource to solve the
  same or a common task
- Benefits
  - Enables scalable parallel computing
  - Achieves high availability through stand-alone operation and failover
  - Modular growth: easy to upgrade

## Measurements of scalability

- Functional scalability: add new functions with no degradation in memory usage
  or performance
- Geographical scalability: distribute globally with no degradation in
  performance
- Administrative scalability: add more users with no degradation
- Heterogeneous and generational scalability: add features and components from
  different vendors and manufacturers with no degradation
- Load scalability: add more load with no degradation

## Strategies for scalability: horizontal and vertical

- Horizontal: add more nodes to an existing cluster
- Vertical: add more resources to a **single node**

## Performance and hardware scalability

### Relationship (Important)

- Scaling **increases** performance
- Performance is **not** directly proportional to the resources added
- Once diminishing returns set in, tuning is a better choice than adding more
  hardware

### Amdahl's law (Important)

- Formula
  - Speedup factor:
    $$S = T / (\alpha \times T + (1 - \alpha) \times T / n) = 1 / (\alpha + (1 - \alpha) / n)$$
  - $\alpha$: fraction of the computation that is serial
  - $1 - \alpha$: fraction that can be parallelized
  - $n$: processors used
- The maximum speedup of $n$ is only achieved when $\alpha$ reaches zero, which
  means the program is fully parallelized
- TODO: work on page 19, draw graph
- Assumption: the **same fixed workload** is used for both the sequential and
  the parallel runs
- System efficiency formula:
  - $$E = S / n = 1 / (\alpha \times n + 1 - \alpha)$$
- Efficiency is low when the cluster size $n$ is large, because most nodes are
  idling (waiting for the serial computation to complete)

### Gustafson's law

- To improve efficiency, scale the workload to match the capacity
- Also called scaled-workload speedup
- Formula
  - Scale the workload to: $$W' = \alpha W + (1 - \alpha) \times n \times W$$
  - $n$: processor count
  - $W$: original workload
  - $W'$: scaled workload
- Only the parallelizable portion is scaled
- **Scaled-workload** speedup:
  - $$S' = W' / W = \alpha + (1 - \alpha) \times n$$
- **Scaled** efficiency:
  - $$E' = S' / n = \alpha / n + (1 - \alpha)$$
- TODO: do some calculation
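The two TODO items above ask for a graph and a calculation. Purely as an
illustration (not from the notes), the short Python sketch below evaluates the
Amdahl and Gustafson formulas; $\alpha = 0.25$ and $n = 16$ are made-up example
values.

```python
# Minimal sketch (not part of the original notes) that plugs made-up numbers
# into the Amdahl and Gustafson formulas above.

def amdahl_speedup(alpha: float, n: int) -> float:
    """Fixed-workload speedup: S = 1 / (alpha + (1 - alpha) / n)."""
    return 1.0 / (alpha + (1.0 - alpha) / n)


def amdahl_efficiency(alpha: float, n: int) -> float:
    """Fixed-workload efficiency: E = S / n = 1 / (alpha * n + 1 - alpha)."""
    return amdahl_speedup(alpha, n) / n


def gustafson_speedup(alpha: float, n: int) -> float:
    """Scaled-workload speedup: S' = alpha + (1 - alpha) * n."""
    return alpha + (1.0 - alpha) * n


def gustafson_efficiency(alpha: float, n: int) -> float:
    """Scaled-workload efficiency: E' = S' / n = alpha / n + (1 - alpha)."""
    return gustafson_speedup(alpha, n) / n


if __name__ == "__main__":
    alpha, n = 0.25, 16  # made-up example values
    print(f"Amdahl:    S  = {amdahl_speedup(alpha, n):5.2f}  "
          f"E  = {amdahl_efficiency(alpha, n):.0%}")
    print(f"Gustafson: S' = {gustafson_speedup(alpha, n):5.2f}  "
          f"E' = {gustafson_efficiency(alpha, n):.0%}")
```

With those inputs, Amdahl's law gives $S \approx 3.37$ ($E \approx 21\%$) while
Gustafson's law gives $S' = 12.25$ ($E' \approx 77\%$), which is the point of
the efficiency bullet above: on a fixed workload most of a large cluster sits
idle, whereas scaling the workload keeps it busy. The same fixed-versus-scaled
workload contrast reappears as strong versus weak scaling later in the notes.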
## Availability (Important)

- Formula (a worked example with made-up numbers appears at the end of these
  notes):
  - $$SA = MTTF / (MTTF + MTTR)$$
  - $MTTF$: mean time to **failure**, the longer the better
  - $MTTR$: mean time to **repair**, the shorter the better

## Types of scaling

- Strong scaling: how performance changes as **processors** are added for a
  **fixed** total problem size
- Weak scaling: how performance changes as **processors** are added for a fixed
  **problem size per processor**

## Scalability in cloud computing

- Elasticity: resources can be added or removed as needed
- Virtualization scaling: adding an existing system to the cloud
  - Performance issue: virtualized components may be slower than bare metal
- Scale by migrating resources: Regions and Availability Zones
  - Region: a physical geographic location that consists of one or more
    Availability Zones
- Load balancing: AWS Elastic Load Balancing distributes incoming application
  traffic across multiple targets (a toy sketch of the idea appears at the end
  of these notes)
- Auto scaling: AWS EC2 Auto Scaling
- CDN: AWS CloudFront uses edge caching at edge locations to serve content from
  a point closer to the viewer, achieving lower **latency** and higher
  **transfer speed**
- TODO: work the questions
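As a quick worked example of the availability formula above (the numbers are
made up for illustration): a system with $MTTF = 1000$ hours and $MTTR = 2$
hours has

$$SA = 1000 / (1000 + 2) \approx 0.998 = 99.8\%$$

so availability improves either by making failures rarer (larger $MTTF$) or by
repairing and failing over faster (smaller $MTTR$).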
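The load-balancing bullet above says traffic is distributed across multiple
targets. The sketch below is a deliberately tiny, generic round-robin
dispatcher written only to illustrate that idea; it is not how AWS Elastic Load
Balancing works or is configured, and the target addresses are made up.

```python
from itertools import cycle

# Made-up backend targets; a real load balancer would also health-check these.
TARGETS = ["10.0.1.10", "10.0.2.10", "10.0.3.10"]


class RoundRobinBalancer:
    """Toy dispatcher: each incoming request goes to the next target in turn."""

    def __init__(self, targets):
        self._targets = cycle(targets)

    def pick_target(self, request_id: int) -> str:
        target = next(self._targets)
        print(f"request {request_id} -> {target}")
        return target


if __name__ == "__main__":
    lb = RoundRobinBalancer(TARGETS)
    for i in range(6):  # six requests spread evenly over three targets
        lb.pick_target(i)
```

A real load balancer layers health checks and smarter target selection on top
of this basic rotation.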