Add NOSQL
This commit is contained in:
parent
3b7ee26a8c
commit
886477b6f0
292
4-3-NOSQL.md
Normal file
292
4-3-NOSQL.md
Normal file
|
@ -0,0 +1,292 @@
|
|||
# Cloud Databases
|
||||
|
||||
## Cloud Databases
|
||||
|
||||
### Traditional database
|
||||
|
||||
#### Intro
|
||||
|
||||
- Most traditional DB are relational
|
||||
- Relational data model has been developed to serve application that focus on
|
||||
consistency and reliability
|
||||
- Data consistency and reliability are guaranteed by ACID properties
|
||||
|
||||
### ACID properties
|
||||
|
||||
- Atomicity: No partial transaction, if it fail, rolls back
|
||||
- Consistency: Changes made by transaction respect all database integrity
|
||||
constraints, and database remain in consistent state
|
||||
- Isolation: Transactions can not see changes made by concurrent transactions
|
||||
- Durability: Changes stored permanently, persistent data
|
||||
|
||||
### NOSQL Database System
|
||||
|
||||
- Not relational database, called (Not Only SQL)
|
||||
- Features:
|
||||
- Relax one of ACID properties
|
||||
- Key-Value data model, for storing data
|
||||
- schemeless, or scheme lacks the constraints
|
||||
- Uses data partitioning, has data replication and distribution
|
||||
- Has no joins, and use data sharding
|
||||
|
||||
### NOSQL vs. relation DB
|
||||
|
||||
| Comparison | NoSQL | Relational Database |
|
||||
| ----------------------- | ------------------- | ------------------- |
|
||||
| Type of data | Mainly unstructured | Structured only |
|
||||
| Volume of data | High volume | Low volume |
|
||||
| Single point of failure | No | Yes |
|
||||
| Data access | Many locations | Few locations |
|
||||
|
||||
### Advantages of NOSQL
|
||||
|
||||
- Cheap
|
||||
- Data replication and partition, leading to no single point of failure
|
||||
- Easy to distribute
|
||||
- No schema reqired
|
||||
|
||||
### Issues with relational database
|
||||
|
||||
- Vertical scaling is hard
|
||||
- Horizontal scaling will break ACID
|
||||
|
||||
## CAP
|
||||
|
||||
### Theorem
|
||||
|
||||
#### Content
|
||||
|
||||
- A distributed system can't have all of the three
|
||||
- Consistency of storage: having the same value for all copies, every read
|
||||
receive the most write or error
|
||||
- Availability: Request received from non-failing node, results in a response
|
||||
- Partitioning tolerance: System continues to operate despite a number of
|
||||
messages being dropped by network
|
||||
|
||||
#### NoSQL and CAP
|
||||
|
||||
- Cloud databases's integrity are not expected
|
||||
- Availability is characterized by a small latency
|
||||
- No Clear border between availability and network partitions
|
||||
|
||||
#### Trade offs
|
||||
|
||||
- SQL: Strong consistency and availability, but not partition tolerance
|
||||
- NoSQL, for example Dynamo: availability and partition tolerance, but weak
|
||||
consistency
|
||||
- High latency losts revenues, so sacrificed strong consistency
|
||||
- Now has a model of eventual consistency
|
||||
|
||||
### Relaxing Availability:
|
||||
|
||||
- Not possible, we want our website to be available at all time
|
||||
|
||||
### Consistency
|
||||
|
||||
#### Definition
|
||||
|
||||
- Strong consistency: replicas duplicate linearly in same total order
|
||||
- No apparent conflicts
|
||||
- The traditional way
|
||||
- ACID SQL database are strongly consistent
|
||||
- Distributed and ACID: Need to form a **concensus** by synchronizing the
|
||||
replica states
|
||||
|
||||
#### Eventual Consistency
|
||||
|
||||
- May not converge to same total order
|
||||
- Update accepted by local node, and propagated to other nodes
|
||||
- No synchronization needed
|
||||
- Eventually all replicas are updated, if there's no update for a long
|
||||
enough time
|
||||
|
||||
#### Breakdown of a example
|
||||
|
||||
- TODO: read the diagrams from P25
|
||||
|
||||
#### Example: Facebook
|
||||
|
||||
- When checking out a story, it may not be available at first, because the
|
||||
consistency is not yet achieved
|
||||
- Because it's impossible without eventual consistency, with 2.7 billion users.
|
||||
- This reduces the load and improve the availability
|
||||
|
||||
#### Example: Dropbox
|
||||
|
||||
- Based on personal syncing, not real time collaboration, so immediate
|
||||
consistency is not that important
|
||||
- Eventual consistency, your updates are propagated
|
||||
|
||||
#### Example: ATM
|
||||
|
||||
- Event ATM uses eventual consistency
|
||||
- Because of higher availability needed
|
||||
- You can overdraw money, if the machined is partitioned from network
|
||||
- It has a limit on the amount of withdraw, and you will be charged
|
||||
|
||||
#### Strong consistency
|
||||
|
||||
- Counterpart of Eventual consistency
|
||||
- All read must return data from latest write
|
||||
- Very hard to achieve with high availability and high partition tolerance
|
||||
|
||||
## Replication of data
|
||||
|
||||
### Data partitioning and replication
|
||||
|
||||
- Reasons:
|
||||
- Amount of data exceeds the capacity of a single machine
|
||||
- Scaling for load balancing
|
||||
- Ensure reliability and availability by replication
|
||||
- NoSQL databases usually maximize availability when partitioning
|
||||
- Techniques to achieve this
|
||||
- Memory caches
|
||||
- Separating RW
|
||||
- Clustering and sharding
|
||||
|
||||
### Caches
|
||||
|
||||
- Memory caches are transient, partly partitioned and replicated in-memory
|
||||
databases
|
||||
- Replicates most requested parts to main memories of a number of servers
|
||||
- Advantages:
|
||||
- Fast response to clients
|
||||
- Off-loading of database servers
|
||||
- Counterpart of Eventual consistency
|
||||
- All read must return data from latest write
|
||||
- Very hard to achieve with high availability and high partition tolerance
|
||||
|
||||
### Separating Reads from Writes
|
||||
|
||||
- Master server dedicated to **writes**
|
||||
- Some replicas servers (Slaves) dedicated to **reads**
|
||||
- Master replicates updates to slaves
|
||||
- Error handling: if the master crashes
|
||||
- Before completing replication: Write operation is lost
|
||||
- Otherwise: Most up to date slave take the master role
|
||||
|
||||
### Clustering
|
||||
|
||||
- HA (High availability) clusters are **groups** of computers that support
|
||||
server applications
|
||||
- Using redundant computers in groups to provide continued service, in case one
|
||||
of the components fail, to elimimate **Single Point of Failure** (SPOF)
|
||||
- **Failover**: When an HA cluster detects a fault, the application is
|
||||
immediately restarted on another system without requiring intervention
|
||||
|
||||
### Sharding
|
||||
|
||||
- Data partitioning scheme, that
|
||||
- data that is accessed and modified frequently are stored on the same shard
|
||||
- Each shard is on a different node
|
||||
- To spread load across different servers
|
||||
- This technique is not appropriate for relational database, since they are
|
||||
normalized
|
||||
- Popular with non-relational databases
|
||||
|
||||
## Cloud DB Services
|
||||
|
||||
### Types of NOSQL, and examples
|
||||
|
||||
- Key value databases:
|
||||
- Single value or row, indexed by a key
|
||||
- `Memcached`, `vertica`
|
||||
- Column Family databases
|
||||
- A sparse, distributed, persistent multidimensional sorted **map**
|
||||
- `Google BigTable`, `Apache Cassandra`
|
||||
- Document databases
|
||||
- Multi-field documents or objects with JSON access
|
||||
- `MongoDB`
|
||||
- Graph databases
|
||||
- Nodes, edges, properties
|
||||
- `Neo4J`, `sones`
|
||||
|
||||
### Comparing solutions based on CAP
|
||||
|
||||
- CA: `RDBMS`
|
||||
- CP: `BigTable`, `MongoDB`
|
||||
- AP: `Dynamo`, `Cassandra`
|
||||
|
||||
### Examples
|
||||
|
||||
#### Memcached
|
||||
|
||||
- Features
|
||||
- High performance distributed in-memory caching service
|
||||
- Similar to hash table, manages objects
|
||||
- Use standard `get` and `set`
|
||||
- To serve requests for read-heavy workloads
|
||||
- Used in `Facebook`, `LinkedIn`
|
||||
- Example pseudocode:
|
||||
```text
|
||||
public String getFoo(String fooId)
|
||||
String foo = memcached.get(fooId)
|
||||
if foo != null return foo
|
||||
foo = fetchFooFromDatabase(fooId)
|
||||
memcached.set(fooId, foo)
|
||||
return foo
|
||||
```
|
||||
- Size:
|
||||
- Has set number of bits 128 bits
|
||||
- Each value no more than 1MB
|
||||
- LRU policy: when full, new one use old one
|
||||
- Least used item removed
|
||||
- This way, frequently accessed items are in cache, regardless of
|
||||
capacity
|
||||
- Clustering:
|
||||
- Multiple nodes, each one of them has different keys
|
||||
- Servers are not aware there are multiple of them (transparent)
|
||||
|
||||
#### Cassandra
|
||||
|
||||
##### Features:
|
||||
|
||||
- Elastic scaling: read and write increases linearly
|
||||
- Replication, per node and per data center
|
||||
- Decentralized
|
||||
- Column based, multi dimensional hash table without join capabilities
|
||||
- Consistency is tunable
|
||||
|
||||
##### Consistent hashing: Point on a circle
|
||||
|
||||
- Each database object is mapped to a point on circle, by hashing key
|
||||
- Node is also mapped to a point on the same circle
|
||||
- To find an object, it:
|
||||
- Hashes the object's key to a point on the circle
|
||||
- Walk clockwise, until it encounters the first node, the node will store
|
||||
the object
|
||||
- Benefits: reduce the **impact** of node entering and leaving the ring
|
||||
- Node changes: redistribute existing data objects
|
||||
- Leave: node in the clockwise direction will store objects that belong to
|
||||
left code
|
||||
- Add: mapped to a point, map objects that should belong to it to the node
|
||||
|
||||
##### Replication
|
||||
|
||||
- Replication factor: how many copies of my data exists: data item replicated at
|
||||
N hosts, this is associated to a key. And replicated nodes are chosen
|
||||
clockwise
|
||||
- Coordinator node: replication of the data items
|
||||
- Consistency level: how recent and in-sync all replicas of a row of data are.
|
||||
It prefers availability over consistency, and let you tune how consistent it
|
||||
would be, replica consistency levels:
|
||||
- ANY: one replica node, including handoff
|
||||
- ONE: one node
|
||||
- QUORUM: N / 2 + 1
|
||||
- `LOCAL_QUORUM`: local datacenter
|
||||
- `EACH_QUORUM`: each datacenter
|
||||
- ALL: all replicas
|
||||
- Consistency policies: Rack unaware, rack aware, datacenter aware
|
||||
|
||||
##### Data Model
|
||||
|
||||
- 3D hash table:
|
||||
- A table: column families
|
||||
- Column family: row keys inside
|
||||
- Row key: paired with name-value columns
|
||||
|
||||
## Example
|
||||
|
||||
### EBay
|
||||
|
||||
- Managed by cassandra
|
Loading…
Reference in a new issue