
Hadoop Architecture

Features

  • Designed to run in clusters of PCs
  • Scales up linearly
  • Suitable for local networks or data centers
  • Design principles
    • Data is distributed around the network
    • Computation is sent to data: code is sent to run on nodes
    • Basic architecture is master / worker
  • Offers the following:
    • Redundant, fault tolerant data storage
    • Parallel computation framework
    • Job coordination

Structure of a MapReduce job

  • Job: a program to be executed across the entire dataset
    • Packaged as a jar file, with all the code needed (a minimal driver sketch follows this list)
    • The job is assigned a cluster-unique ID
    • Data attached to the job is replicated across the cluster
  • Task: an execution on a slice of data
  • Task Attempt: an instance of a task executing on a node (a failed task may be attempted again)
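  • A minimal driver sketch, assuming the word-count classes sketched under "Job execution flow" below; the class and job names are illustrative, while Job, setJarByClass and waitForCompletion are the standard org.apache.hadoop.mapreduce API:
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCountDriver.class); // this jar is shipped to the nodes
            job.setMapperClass(TokenizerMapper.class);  // sketched below
            job.setReducerClass(IntSumReducer.class);   // sketched below
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // Input/output paths omitted here; see the Operation section.
            boolean ok = job.waitForCompletion(true);   // submits, then polls until done
            System.out.println("Cluster-unique ID: " + job.getJobID());
            System.exit(ok ? 0 : 1);
        }
    }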

Job execution flow

  • Split the input data into chunks
  • Assign each chunk to a node manager
  • Run many mappers
  • Shuffle and sort the mapper output
  • Run many reducers
  • Results from the reducers form the job output (see the word-count sketch below)
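  • To make the flow concrete, a minimal word-count sketch (the canonical example; class names are illustrative): mappers emit (word, 1) pairs, shuffle and sort groups the pairs by key, and reducers sum each group:
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Runs on the node holding this input split: emit (word, 1) per token.
            for (String token : line.toString().split("\\s+")) {
                if (token.isEmpty()) continue;
                word.set(token);
                context.write(word, ONE);
            }
        }
    }

    class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            // Shuffle and sort delivered every count for this key to one reducer.
            int sum = 0;
            for (IntWritable c : counts) sum += c.get();
            context.write(key, new IntWritable(sum)); // written to the job output
        }
    }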

The Optional Combiner

  • The bottleneck in map-reduce frameworks:
    • Map and reduce tasks scale close to linearly, so they are not the bottleneck
    • The potential bottleneck is the shuffle and sort phase (between map and reduce)
      • Data needs to be copied over the network
      • Mappers emit many keys, and sorting them is costly
  • A combiner can be executed before shuffling and sorting
  • Reason
    • Acts as a preliminary reducer
    • Executed at each mapper node, just before sending all the pairs for shuffling
    • Reduces amount of data emitted by mapper, to improve efficiency
  • Restrictions and rules:
    • Optional: the job must work correctly without it
    • Idempotent: the number of times the combiner is applied must not change the output
      • No side effects, otherwise it is not idempotent
    • Preserves the keys: must not change the keys, which would disrupt the sort order or change the partitioning
  • Example of using reducer code as the combiner (sketched here against the org.apache.hadoop.mapreduce Reducer API):
    // Imports as in the word-count sketch above. Pre-sums the counts for each
    // key at the mapper node, before the pairs are sent for shuffle and sort.
    class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : values) {
                sum += count.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
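  • To enable it, register the class on the job in the driver; Job.setCombinerClass is the standard hook (the reducer class itself is often passed when, as above, it obeys the combiner rules):
    job.setCombinerClass(SumCombiner.class); // optional: the job must stay correct without it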
    
  • TODO review the combiner diagram

Apache Hadoop

Architecture of Hadoop

  • Executes on nodes connected by network
  • Each node runs a set of daemons
    • Computing:
      • ResourceManager
      • NodeManager
    • Storage:
      • NameNode
      • SecondaryNameNode as backup
      • DataNode
  • Nodes are in Master Slave architecture
    • Master node: NameNode, ResourceManager
      • Aware of slave nodes
      • Receives external requests
      • Decides how the work is split among slaves
      • Notifies slaves
    • Slave node, also called Worker node: DataNode, NodeManager
      • Executes the tasks received from the master

What Hadoop Does

  • Resource Management: the existence and availability of resources
  • Job Allocation: needed resources for job, and the split of work
  • Job Execution: Run job, make sure it's completed, deal with failures

Job execution: YARN

Intro

  • YARN estimates how many map and reduce tasks a job needs, based on the input dataset and the job definition
  • Ideally, a different node handles each map / reduce task

Deciding the number of workers

Mapper Parallelization

  • A different input split is processed by each mapper
  • The input data size is known
  • Number of mappers = input size / split size (worked sketch below)
    • Small files are not split or combined: an input of many small files uses one mapper per file, so more mappers
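  • A worked sketch of the calculation, under assumed sizes (1 GiB of input, the 64 MB default split size):
    // Ceiling division: a final partial split still needs its own mapper.
    long inputSize = 1L << 30;  // 1 GiB of input data (assumed)
    long splitSize = 64L << 20; // 64 MB default split size
    long numMappers = (inputSize + splitSize - 1) / splitSize; // = 16 mappers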

Reducer

  • The number of reducers is user-defined, since it is hard to determine automatically (set as sketched below)
    • Keys are partitioned across reducers; too many partitions add overhead in shuffle and sort
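  • Choosing the count is one call on the job; setNumReduceTasks is the standard Job method (4 is an arbitrary example):
    job.setNumReduceTasks(4); // user-defined; also fixes the number of output files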

Execution Daemons

ResourceManager

  • On master: one per cluster
  • Responsibility:
    • Receives job requests from clients
    • Creates an ApplicationMaster per job to manage it
    • Allocates containers on slave nodes, with the assigned resources
    • Tracks the health of NodeManager nodes

NodeManager

  • Responsibility:
    • Coordinates the execution of tasks on its node
    • Sends health information to the ResourceManager

ApplicationMaster

  • Only one per job
  • Responsibility: Job allocation and job execution
    • Implements a specific framework, for example MapReduce
    • Negotiates with the ResourceManager for the resources required
    • Decides which node runs which task, inside the containers granted by the ResourceManager
    • Destroyed when the job completes

Storage: HDFS

Definition

  • Hadoop Distributed File System
  • This is the storage for Hadoop's input and output
  • Features:
    • Tailored for MapReduce jobs
    • Large block size (64 MB)
    • Not a POSIX-compliant file system: accessed through its own client API (sketched below)
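  • A minimal sketch of reading a file through the HDFS client API rather than POSIX calls (the path is hypothetical):
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsRead {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // reads core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);     // client handle, not a mounted file system
            Path p = new Path("/user/notes/input/part-0"); // hypothetical path
            try (BufferedReader r = new BufferedReader(new InputStreamReader(fs.open(p)))) {
                System.out.println(r.readLine()); // read from whichever replica is closest
            }
        }
    }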

Data distribution: key element of map reduce

  • Job code (jars) moved to where data is stored
  • Blocks are replicated on the cluster, by default three times, to ensure reliability

Storage daemon

  • DataNode: many per cluster
    • Stores blocks of HDFS files
    • Reports its stored blocks to the NameNode
  • NameNode: one per cluster
    • Keeps the index and location of every block
    • Performs no job computation, as its metadata workload is already heavy
    • Single point of failure
  • SecondaryNameNode
    • Communicates directly with the NameNode
    • Stores a backup of the index table

Data Replication

  • Definition: creating and maintaining multiple copies of data across different nodes in HDFS (setting the factor is sketched after this list)

  • Format: CSV; for example:

    Filename  numReplicas  block-ids
    part-0    r:2          {1, 3}
    part-1    r:3          {2, 4, 5}

  • Significance

    • Fault tolerance
    • Data availability
    • System reliability
    • Supports parallel processing
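  • A sketch of setting the replication factor per file, reusing the FileSystem handle fs and the path p from the HDFS read sketch above:
    fs.setReplication(p, (short) 3); // per-file override; the cluster-wide default comes from dfs.replication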

Failure recovery

  • Identifying failure: a node that stops sending heartbeat signals is considered failed
  • Replication: the NameNode initiates re-replication once a node failure is detected
  • Maintaining integrity: data is restored to the predetermined replication factor
  • Mitigating potential disruptions: dynamic data management

Operation

  • TODO: See the graph in p59
  • Input data:
    • Mappers are assigned input splits from the HDFS input path (default split size 64 MB)
    • Data locality: the ApplicationMaster tries to assign each mapper to a node where its data is stored
  • Output data:
    • Copied to HDFS, one file per reducer (wired up as sketched below)
    • Replicated like any other HDFS data
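  • The driver wires up these paths; FileInputFormat / FileOutputFormat are the standard classes, and the paths are illustrative:
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // In the driver, before submission:
    FileInputFormat.addInputPath(job, new Path("/user/notes/input"));    // source of input splits
    FileOutputFormat.setOutputPath(job, new Path("/user/notes/output")); // one part-r-NNNNN file per reducer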