EBU6502_cloud_computing_notes/4-3-NOSQL.md

# Cloud Databases

## Cloud Databases

### Traditional database

#### Intro

- Most traditional DB are relational
- Relational data model has been developed to serve application that focus on
  consistency and reliability
- Data consistency and reliability are guaranteed by ACID properties

### ACID properties

- Atomicity: No partial transaction, if it fail, rolls back
- Consistency: Changes made by transaction respect all database integrity
  constraints, and database remain in consistent state
- Isolation: Transactions can not see changes made by concurrent transactions
- Durability: Changes stored permanently, persistent data

### NOSQL Database System

- Not relational database, called (Not Only SQL)
- Features:
    - Relax one of ACID properties
    - Key-Value data model, for storing data
    - schemeless, or scheme lacks the constraints
    - Uses data partitioning, has data replication and distribution
    - Has no joins, and use data sharding

### NOSQL vs. relation DB

| Comparison              | NoSQL               | Relational Database |
| ----------------------- | ------------------- | ------------------- |
| Type of data            | Mainly unstructured | Structured only     |
| Volume of data          | High volume         | Low volume          |
| Single point of failure | No                  | Yes                 |
| Data access             | Many locations      | Few locations       |

### Advantages of NOSQL

- Cheap
- Data replication and partition, leading to no single point of failure
- Easy to distribute
- No schema reqired

### Issues with relational database

- Vertical scaling is hard
- Horizontal scaling will break ACID

## CAP

### Theorem

#### Content

- A distributed system can't have all of the three
- Consistency of storage: having the same value for all copies, every read
  receive the most write or error
- Availability: Request received from non-failing node, results in a response
- Partitioning tolerance: System continues to operate despite a number of
  messages being dropped by network

#### NoSQL and CAP

- Cloud databases's integrity are not expected
- Availability is characterized by a small latency
- No Clear border between availability and network partitions

#### Trade offs

- SQL: Strong consistency and availability, but not partition tolerance
- NoSQL, for example Dynamo: availability and partition tolerance, but weak
  consistency
    - High latency losts revenues, so sacrificed strong consistency
    - Now has a model of eventual consistency

### Relaxing Availability:

- Not possible, we want our website to be available at all time

### Consistency

#### Definition

- Strong consistency: replicas duplicate linearly in same total order
    - No apparent conflicts
    - The traditional way
    - ACID SQL database are strongly consistent
- Distributed and ACID: Need to form a **concensus** by synchronizing the
  replica states

#### Eventual Consistency

- May not converge to same total order
- Update accepted by local node, and propagated to other nodes
- No synchronization needed
    - Eventually all replicas are updated, if there's no update for a long
      enough time

#### Breakdown of a example

- TODO: read the diagrams from P25

#### Example: Facebook

- When checking out a story, it may not be available at first, because the
  consistency is not yet achieved
- Because it's impossible without eventual consistency, with 2.7 billion users.
- This reduces the load and improve the availability

#### Example: Dropbox

- Based on personal syncing, not real time collaboration, so immediate
  consistency is not that important
- Eventual consistency, your updates are propagated

#### Example: ATM

- Event ATM uses eventual consistency
- Because of higher availability needed
- You can overdraw money, if the machined is partitioned from network
- It has a limit on the amount of withdraw, and you will be charged

#### Strong consistency

- Counterpart of Eventual consistency
- All read must return data from latest write
- Very hard to achieve with high availability and high partition tolerance

## Replication of data

### Data partitioning and replication

- Reasons:
    - Amount of data exceeds the capacity of a single machine
    - Scaling for load balancing
    - Ensure reliability and availability by replication
- NoSQL databases usually maximize availability when partitioning
- Techniques to achieve this
    - Memory caches
    - Separating RW
    - Clustering and sharding

### Caches

- Memory caches are transient, partly partitioned and replicated in-memory
  databases
- Replicates most requested parts to main memories of a number of servers
- Advantages:
    - Fast response to clients
    - Off-loading of database servers

### Separating Reads from Writes

- Master server dedicated to **writes**
- Some replicas servers (Slaves) dedicated to **reads**
- Master replicates updates to slaves
- Error handling: if the master crashes
    - Before completing replication: Write operation is lost
    - Otherwise: Most up to date slave take the master role

### Clustering

- HA (High availability) clusters are **groups** of computers that support
  server applications
- Using redundant computers in groups to provide continued service, in case one
  of the components fail, to elimimate **Single Point of Failure** (SPOF)
- **Failover**: When an HA cluster detects a fault, the application is
  immediately restarted on another system without requiring intervention

### Sharding

- Data partitioning scheme, that
    - data that is accessed and modified frequently are stored on the same shard
    - Each shard is on a different node
- To spread load across different servers
- This technique is not appropriate for relational database, since they are
  normalized
- Popular with non-relational databases

## Cloud DB Services

### Types of NOSQL, and examples

- Key value databases:
    - Single value or row, indexed by a key
    - `Memcached`, `vertica`
- Column Family databases
    - A sparse, distributed, persistent multidimensional sorted **map**
    - `Google BigTable`, `Apache Cassandra`
- Document databases
    - Multi-field documents or objects with JSON access
    - `MongoDB`
- Graph databases
    - Nodes, edges, properties
    - `Neo4J`, `sones`

### Comparing solutions based on CAP

- CA: `RDBMS`
- CP: `BigTable`, `MongoDB`
- AP: `Dynamo`, `Cassandra`

### Examples

#### Memcached

- Features
    - High performance distributed in-memory caching service
    - Similar to hash table, manages objects
    - Use standard `get` and `set`
- To serve requests for read-heavy workloads
- Used in `Facebook`, `LinkedIn`
- Example pseudocode:
    ```text
    public String getFoo(String fooId)
        String foo = memcached.get(fooId)
        if foo != null return foo
        foo = fetchFooFromDatabase(fooId)
        memcached.set(fooId, foo)
        return foo
    ```
- Size:
    - Has set number of bits 128 bits
    - Each value no more than 1MB
    - LRU policy: when full, new one use old one
        - Least used item removed
        - This way, frequently accessed items are in cache, regardless of
          capacity
- Clustering:
    - Multiple nodes, each one of them has different keys
    - Servers are not aware there are multiple of them (transparent): **Smart
      Clients & Dumb Servers**
        - Client has function that maps keys to servers, while
        - Servers are not aware there are multiple of them

#### Cassandra

##### Features:

- Elastic scaling: read and write increases linearly
- Replication, per node and per data center
- Decentralized
- Column based, multi dimensional hash table without join capabilities
- Consistency is tunable

##### Consistent hashing: Point on a circle

- Each database object is mapped to a point on circle, by hashing key
- Node is also mapped to a point on the same circle
- To find an object, it:
    - Hashes the object's key to a point on the circle
    - Walk clockwise, until it encounters the first node, the node will store
      the object
- Benefits: reduce the **impact** of node entering and leaving the ring
- Node changes: redistribute existing data objects
    - Leave: node in the clockwise direction will store objects that belong to
      left code
    - Add: mapped to a point, map objects that should belong to it to the node

##### Replication

- Replication factor: how many copies of my data exists: data item replicated at
  N hosts, this is associated to a key. And replicated nodes are chosen
  clockwise
- Coordinator node: replication of the data items
- Consistency level: how recent and in-sync all replicas of a row of data are.
  It prefers availability over consistency, and let you tune how consistent it
  would be, replica consistency levels:
    - ANY: one replica node, including handoff need to reply, in order for write
      to success
    - ONE: one node
    - QUORUM: N / 2 + 1
        - `LOCAL_QUORUM`: local datacenter
        - `EACH_QUORUM`: each datacenter
    - ALL: all replicas
- Consistency policies: Rack unaware, rack aware, datacenter aware

##### Data Model

- 3D hash table:
- A table: column families
- Column family: row keys inside
- Row key: paired with name-value columns

## Example

### EBay

- Managed by cassandra
Add NOSQL 2025-01-04 13:12:54 +08:00			`# Cloud Databases`

			`## Cloud Databases`

			`### Traditional database`

			`#### Intro`

			`- Most traditional DB are relational`
			`- Relational data model has been developed to serve application that focus on`
			`consistency and reliability`
			`- Data consistency and reliability are guaranteed by ACID properties`

			`### ACID properties`

			`- Atomicity: No partial transaction, if it fail, rolls back`
			`- Consistency: Changes made by transaction respect all database integrity`
			`constraints, and database remain in consistent state`
			`- Isolation: Transactions can not see changes made by concurrent transactions`
			`- Durability: Changes stored permanently, persistent data`

			`### NOSQL Database System`

			`- Not relational database, called (Not Only SQL)`
			`- Features:`
			`- Relax one of ACID properties`
			`- Key-Value data model, for storing data`
			`- schemeless, or scheme lacks the constraints`
			`- Uses data partitioning, has data replication and distribution`
			`- Has no joins, and use data sharding`

			`### NOSQL vs. relation DB`

			`\| Comparison \| NoSQL \| Relational Database \|`
			`\| ----------------------- \| ------------------- \| ------------------- \|`
			`\| Type of data \| Mainly unstructured \| Structured only \|`
			`\| Volume of data \| High volume \| Low volume \|`
			`\| Single point of failure \| No \| Yes \|`
			`\| Data access \| Many locations \| Few locations \|`

			`### Advantages of NOSQL`

			`- Cheap`
			`- Data replication and partition, leading to no single point of failure`
			`- Easy to distribute`
			`- No schema reqired`

			`### Issues with relational database`

			`- Vertical scaling is hard`
			`- Horizontal scaling will break ACID`

			`## CAP`

			`### Theorem`

			`#### Content`

			`- A distributed system can't have all of the three`
			`- Consistency of storage: having the same value for all copies, every read`
			`receive the most write or error`
			`- Availability: Request received from non-failing node, results in a response`
			`- Partitioning tolerance: System continues to operate despite a number of`
			`messages being dropped by network`

			`#### NoSQL and CAP`

			`- Cloud databases's integrity are not expected`
			`- Availability is characterized by a small latency`
			`- No Clear border between availability and network partitions`

			`#### Trade offs`

			`- SQL: Strong consistency and availability, but not partition tolerance`
			`- NoSQL, for example Dynamo: availability and partition tolerance, but weak`
			`consistency`
			`- High latency losts revenues, so sacrificed strong consistency`
			`- Now has a model of eventual consistency`

			`### Relaxing Availability:`

			`- Not possible, we want our website to be available at all time`

			`### Consistency`

			`#### Definition`

			`- Strong consistency: replicas duplicate linearly in same total order`
			`- No apparent conflicts`
			`- The traditional way`
			`- ACID SQL database are strongly consistent`
			`- Distributed and ACID: Need to form a concensus by synchronizing the`
			`replica states`

			`#### Eventual Consistency`

			`- May not converge to same total order`
			`- Update accepted by local node, and propagated to other nodes`
			`- No synchronization needed`
			`- Eventually all replicas are updated, if there's no update for a long`
			`enough time`

			`#### Breakdown of a example`

			`- TODO: read the diagrams from P25`

			`#### Example: Facebook`

			`- When checking out a story, it may not be available at first, because the`
			`consistency is not yet achieved`
			`- Because it's impossible without eventual consistency, with 2.7 billion users.`
			`- This reduces the load and improve the availability`

			`#### Example: Dropbox`

			`- Based on personal syncing, not real time collaboration, so immediate`
			`consistency is not that important`
			`- Eventual consistency, your updates are propagated`

			`#### Example: ATM`

			`- Event ATM uses eventual consistency`
			`- Because of higher availability needed`
			`- You can overdraw money, if the machined is partitioned from network`
			`- It has a limit on the amount of withdraw, and you will be charged`

			`#### Strong consistency`

			`- Counterpart of Eventual consistency`
			`- All read must return data from latest write`
			`- Very hard to achieve with high availability and high partition tolerance`

			`## Replication of data`

			`### Data partitioning and replication`

			`- Reasons:`
			`- Amount of data exceeds the capacity of a single machine`
			`- Scaling for load balancing`
			`- Ensure reliability and availability by replication`
			`- NoSQL databases usually maximize availability when partitioning`
			`- Techniques to achieve this`
			`- Memory caches`
			`- Separating RW`
			`- Clustering and sharding`

			`### Caches`

			`- Memory caches are transient, partly partitioned and replicated in-memory`
			`databases`
			`- Replicates most requested parts to main memories of a number of servers`
			`- Advantages:`
			`- Fast response to clients`
			`- Off-loading of database servers`

			`### Separating Reads from Writes`

			`- Master server dedicated to writes`
			`- Some replicas servers (Slaves) dedicated to reads`
			`- Master replicates updates to slaves`
			`- Error handling: if the master crashes`
			`- Before completing replication: Write operation is lost`
			`- Otherwise: Most up to date slave take the master role`

			`### Clustering`

			`- HA (High availability) clusters are groups of computers that support`
			`server applications`
			`- Using redundant computers in groups to provide continued service, in case one`
			`of the components fail, to elimimate Single Point of Failure (SPOF)`
			`- Failover: When an HA cluster detects a fault, the application is`
			`immediately restarted on another system without requiring intervention`

			`### Sharding`

			`- Data partitioning scheme, that`
			`- data that is accessed and modified frequently are stored on the same shard`
			`- Each shard is on a different node`
			`- To spread load across different servers`
			`- This technique is not appropriate for relational database, since they are`
			`normalized`
			`- Popular with non-relational databases`

			`## Cloud DB Services`

			`### Types of NOSQL, and examples`

			`- Key value databases:`
			`- Single value or row, indexed by a key`
			- `Memcached`, `vertica`
			`- Column Family databases`
			`- A sparse, distributed, persistent multidimensional sorted map`
			- `Google BigTable`, `Apache Cassandra`
			`- Document databases`
			`- Multi-field documents or objects with JSON access`
			- `MongoDB`
			`- Graph databases`
			`- Nodes, edges, properties`
			- `Neo4J`, `sones`

			`### Comparing solutions based on CAP`

			- CA: `RDBMS`
			- CP: `BigTable`, `MongoDB`
			- AP: `Dynamo`, `Cassandra`

			`### Examples`

			`#### Memcached`

			`- Features`
			`- High performance distributed in-memory caching service`
			`- Similar to hash table, manages objects`
			- Use standard `get` and `set`
			`- To serve requests for read-heavy workloads`
			- Used in `Facebook`, `LinkedIn`
			`- Example pseudocode:`
			```text
			`public String getFoo(String fooId)`
			`String foo = memcached.get(fooId)`
			`if foo != null return foo`
			`foo = fetchFooFromDatabase(fooId)`
			`memcached.set(fooId, foo)`
			`return foo`
			```
			`- Size:`
			`- Has set number of bits 128 bits`
			`- Each value no more than 1MB`
			`- LRU policy: when full, new one use old one`
			`- Least used item removed`
			`- This way, frequently accessed items are in cache, regardless of`
			`capacity`
			`- Clustering:`
			`- Multiple nodes, each one of them has different keys`
update on memcached 2025-01-06 14:27:06 +08:00			`- Servers are not aware there are multiple of them (transparent): **Smart`
			`Clients & Dumb Servers**`
			`- Client has function that maps keys to servers, while`
			`- Servers are not aware there are multiple of them`
Add NOSQL 2025-01-04 13:12:54 +08:00
			`#### Cassandra`

			`##### Features:`

			`- Elastic scaling: read and write increases linearly`
			`- Replication, per node and per data center`
			`- Decentralized`
			`- Column based, multi dimensional hash table without join capabilities`
			`- Consistency is tunable`

			`##### Consistent hashing: Point on a circle`

			`- Each database object is mapped to a point on circle, by hashing key`
			`- Node is also mapped to a point on the same circle`
			`- To find an object, it:`
			`- Hashes the object's key to a point on the circle`
			`- Walk clockwise, until it encounters the first node, the node will store`
			`the object`
			`- Benefits: reduce the impact of node entering and leaving the ring`
			`- Node changes: redistribute existing data objects`
			`- Leave: node in the clockwise direction will store objects that belong to`
			`left code`
			`- Add: mapped to a point, map objects that should belong to it to the node`

			`##### Replication`

			`- Replication factor: how many copies of my data exists: data item replicated at`
			`N hosts, this is associated to a key. And replicated nodes are chosen`
			`clockwise`
			`- Coordinator node: replication of the data items`
			`- Consistency level: how recent and in-sync all replicas of a row of data are.`
			`It prefers availability over consistency, and let you tune how consistent it`
			`would be, replica consistency levels:`
Update consistency level 2025-01-06 14:47:10 +08:00			`- ANY: one replica node, including handoff need to reply, in order for write`
			`to success`
Add NOSQL 2025-01-04 13:12:54 +08:00			`- ONE: one node`
			`- QUORUM: N / 2 + 1`
			- `LOCAL_QUORUM`: local datacenter
			- `EACH_QUORUM`: each datacenter
			`- ALL: all replicas`
			`- Consistency policies: Rack unaware, rack aware, datacenter aware`

			`##### Data Model`

			`- 3D hash table:`
			`- A table: column families`
			`- Column family: row keys inside`
			`- Row key: paired with name-value columns`

			`## Example`

			`### EBay`

			`- Managed by cassandra`