Starfish: A Self-tuning System for Big Data Analytics
Contents
Analysis in the Big Data Era
Hadoop MapReduce Ecosystem
Practitioners of Big Data Analytics
Tuning Challenges
Starfish: Self-tuning System
What are the Tuning Problems?
Starfish’s Core Approach to Tuning
Starfish Architecture
MapReduce Job Execution
What Controls MR Job Execution?
Effect of Configuration Settings
MapReduce Job Tuning in a Nutshell
Job Profile
Job Profile Fields
Generating Profiles by Measurement
What-if Engine
Virtual Profile Estimation
Job Optimizer
Workflow Optimization Space
Optimizations on TF-IDF Workflow
New Challenges
Cluster Sizing Problem
Multi-objective Cluster Provisioning
Experimental Evaluation
Job Optimizer Evaluation
Estimates from the What-if Engine
Profiling Overhead vs. Benefit
Multi-objective Cluster Provisioning