# Scalability

## Definition

- Ability to use more resources without a major change to the original setup

## Reasons (Not important)

- Increasing data
- Globalization
- New technologies

## Trends

### Moore's law

### Parallel computing

### Distributed and cloud computing

### Virtualization

### Cluster computing

- Definition: a loosely or tightly coupled pool of computers that work together collectively and cooperatively, as a single computing resource, to solve the same or a common task
- Benefits
    - Enables scalable parallel computing
    - Achieves high availability through standalone operation and failover
    - Modular growth: easy to upgrade

## Measurements of scalability

- Functional scalability: add new functions with no degradation in memory usage or performance
- Geographical scalability: distribute globally with no degradation in performance
- Administrative scalability: add more users with no degradation
- Heterogeneous and generational scalability: add features and components from different vendors and manufacturers with no degradation
- Load scalability: add more load with no degradation

## Strategies for scalability: horizontal and vertical

- Horizontal: add more nodes to an existing cluster
- Vertical: add more resources to a **single node**

## Performance and hardware scalability

### Relationship (Important)

- Scaling **increases** performance
- Performance is **not** directly proportional to the resources added
- Diminishing returns set in; past that point, tuning is a better choice

### Amdahl's law (Important)

- Formula
    - Speedup factor: $$S = \frac{T}{\alpha T + (1 - \alpha) T / n} = \frac{1}{\alpha + (1 - \alpha)/n}$$
    - $\alpha$: fraction of serial computation
    - $1 - \alpha$: fraction that can be parallelized
    - $n$: processors used
- The maximum speedup of $n$ is achieved only when $\alpha$ reaches zero, i.e. the program is fully parallelized
- TODO: work on page 19, draw graph (a calculation sketch is at the end of these notes)
- Assumption: the **same amount of workload** is used for both the sequential and the parallel execution (fixed workload)
- System efficiency formula:
    - $$E = S / n = \frac{1}{\alpha n + 1 - \alpha}$$
    - Efficiency is low when $n$ is large, because most nodes are idle (waiting for the serial computation to complete)
    - Also called **fixed-workload efficiency**

### Gustafson's law

- To improve efficiency, scale the workload to match the capacity
- Also called the **scaled-workload speedup**
- Formula
    - Scale the workload to: $$W' = \alpha W + (1 - \alpha) n W$$
    - $n$: processor count
    - $W$: original workload
    - $W'$: scaled workload
    - Only the parallelizable portion is scaled
- **Scaled workload** speedup:
    - $$S' = W' / W = \alpha + (1 - \alpha) n$$
- **Scaled** efficiency:
    - $$E' = S' / n = \alpha / n + (1 - \alpha)$$
- TODO: do some calculation (see the sketch at the end of these notes)

## Availability (Important)

- Formula (worked example at the end of these notes):
    - $$SA = \frac{MTTF}{MTTF + MTTR}$$
    - $MTTF$: mean time to **failure**; the longer the better
    - $MTTR$: mean time to **repair**; the shorter the better

## Types of scaling

- Strong: how performance changes as **processors** increase for a **fixed** total problem size
- Weak: how performance changes as **processors** increase with a fixed **problem size per processor**

## Scalability in cloud computing

- Elasticity: resources can be added or removed on demand
- Virtualization scaling: adding existing systems to the cloud
- Performance issues: virtualized components may be slower than bare metal
- Scale by migrating resources: Regions and Availability Zones
    - Region: a physical geographical location that consists of one or more zones
- Load balancing: AWS Elastic Load Balancing distributes incoming application traffic across multiple targets
- Auto scaling: AWS EC2 Auto Scaling
- CDN: AWS CloudFront uses edge caching at edge locations to serve content from a point closer to the viewer, in order to achieve lower **latency** and higher **transfer speed**
- TODO: work the questions
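Below is a minimal sketch of the Amdahl's law calculation referenced above. The serial fraction $\alpha = 0.05$ and the processor counts are assumed values chosen for illustration, not numbers from the course material.

```python
# Amdahl's law (fixed workload): speedup and efficiency as n grows.
# alpha is the serial fraction; n is the number of processors.

def amdahl_speedup(alpha: float, n: int) -> float:
    """S = 1 / (alpha + (1 - alpha) / n)"""
    return 1 / (alpha + (1 - alpha) / n)

def amdahl_efficiency(alpha: float, n: int) -> float:
    """E = S / n = 1 / (alpha * n + 1 - alpha)"""
    return 1 / (alpha * n + 1 - alpha)

alpha = 0.05  # assumed: 5% of the program is serial
for n in (1, 4, 16, 64, 256):
    print(f"n={n:3d}  S={amdahl_speedup(alpha, n):6.2f}  "
          f"E={amdahl_efficiency(alpha, n):5.2f}")
# Speedup saturates near 1/alpha = 20, while efficiency collapses
# because most processors sit idle during the serial portion.
```

The upper bound $1/\alpha$ follows from letting $n \to \infty$ in the speedup formula, which matches the note that the full speedup of $n$ requires $\alpha = 0$.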
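And a matching sketch for Gustafson's law, using the same assumed $\alpha = 0.05$. Per the $W'$ formula above, only the parallelizable portion of the workload is scaled with $n$.

```python
# Gustafson's law (scaled workload): W' = alpha*W + (1 - alpha)*n*W,
# so S' = W'/W = alpha + (1 - alpha)*n and E' = S'/n.

def gustafson_speedup(alpha: float, n: int) -> float:
    return alpha + (1 - alpha) * n

def gustafson_efficiency(alpha: float, n: int) -> float:
    return alpha / n + (1 - alpha)

alpha = 0.05  # same assumed serial fraction as the Amdahl sketch
for n in (1, 4, 16, 64, 256):
    print(f"n={n:3d}  S'={gustafson_speedup(alpha, n):7.2f}  "
          f"E'={gustafson_efficiency(alpha, n):5.2f}")
# Scaled speedup grows linearly with n, and efficiency stays near
# 1 - alpha instead of collapsing as in the fixed-workload case.
```

Comparing the two sketches shows why Gustafson's law is the "scale the workload to match the capacity" answer to Amdahl's diminishing returns.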
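A quick worked example of the availability formula, with assumed numbers (MTTF = 1000 h, MTTR = 2 h):

$$SA = \frac{MTTF}{MTTF + MTTR} = \frac{1000}{1000 + 2} \approx 0.998 = 99.8\%$$

Raising MTTF or cutting MTTR both push $SA$ toward 1, matching the "longer/shorter the better" notes above.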