| 
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Objectorg.apache.hadoop.conf.Configured
org.apache.hadoop.mapred.JobClient
public class JobClient
JobClient is the primary interface for the user-job to interact
 with the JobTracker.
 
 JobClient provides facilities to submit jobs, track their 
 progress, access component-tasks' reports/logs, get the Map-Reduce cluster
 status information etc.
 
 
The job submission process involves:
InputSplits for the job.
   DistributedCache 
   of the job, if necessary.
   JobTracker and optionally monitoring
   it's status.
   JobConf and then uses the JobClient to submit 
 the job and monitor its progress.
 
 Here is an example on how to use JobClient:
     // Create a new JobConf
     JobConf job = new JobConf(new Configuration(), MyJob.class);
     
     // Specify various job-specific parameters     
     job.setJobName("myjob");
     
     job.setInputPath(new Path("in"));
     job.setOutputPath(new Path("out"));
     
     job.setMapperClass(MyJob.MyMapper.class);
     job.setReducerClass(MyJob.MyReducer.class);
     // Submit the job, then poll for progress until the job is complete
     JobClient.runJob(job);
 
 
 At times clients would chain map-reduce jobs to accomplish complex tasks which cannot be done via a single map-reduce job. This is fairly easy since the output of the job, typically, goes to distributed file-system and that can be used as the input for the next job.
However, this also means that the onus on ensuring jobs are complete (success/failure) lies squarely on the clients. In such situations the various job-control options are:
runJob(JobConf) : submits the job and returns only after 
   the job has completed.
   submitJob(JobConf) : only submits the job, then poll the 
   returned handle to the RunningJob to query status and make 
   scheduling decisions.
   JobConf.setJobEndNotificationURI(String) : setup a notification
   on job-completion, thus avoiding polling.
   
JobConf, 
ClusterStatus, 
Tool, 
DistributedCache| Nested Class Summary | |
|---|---|
static class | 
JobClient.TaskStatusFilter
 | 
| Field Summary | |
|---|---|
static int | 
CLUSTER_INCREMENT
 | 
static long | 
COUNTER_UPDATE_INTERVAL
 | 
static int | 
FILE_NOT_FOUND
 | 
static int | 
HEARTBEAT_INTERVAL_MIN
 | 
static String | 
MAP_OUTPUT_LENGTH
The custom http header used for the map output length.  | 
static float | 
MAX_INMEM_FILESIZE_FRACTION
Constant denoting the max size (in terms of the fraction of the total size of the filesys) of a map output file that we will try to keep in mem.  | 
static float | 
MAX_INMEM_FILESYS_USE
Constant denoting when a merge of in memory files will be triggered  | 
static String | 
RAW_MAP_OUTPUT_LENGTH
The custom http header used for the "raw" map output length.  | 
static int | 
SUCCESS
 | 
static String | 
TEMP_DIR_NAME
Temporary directory name  | 
static String | 
WORKDIR
 | 
| Constructor Summary | |
|---|---|
JobClient()
Create a job client.  | 
|
JobClient(InetSocketAddress jobTrackAddr,
          Configuration conf)
Build a job client, connect to the indicated job tracker.  | 
|
JobClient(JobConf conf)
Build a job client with the given JobConf, and connect to the 
 default JobTracker. | 
|
| Method Summary | |
|---|---|
 void | 
close()
Close the JobClient. | 
 JobStatus[] | 
getAllJobs()
Get the jobs that are submitted.  | 
 ClusterStatus | 
getClusterStatus()
Get status information about the Map-Reduce cluster.  | 
static Configuration | 
getCommandLineConfig()
return the command line configuration  | 
 int | 
getDefaultMaps()
Get status information about the max available Maps in the cluster.  | 
 int | 
getDefaultReduces()
Get status information about the max available Reduces in the cluster.  | 
 FileSystem | 
getFs()
Get a filesystem handle.  | 
 RunningJob | 
getJob(JobID jobid)
Get an RunningJob object to track an ongoing job. | 
 RunningJob | 
getJob(String jobid)
Deprecated. Applications should rather use getJob(JobID). | 
 TaskReport[] | 
getMapTaskReports(JobID jobId)
Get the information of the current state of the map tasks of a job.  | 
 TaskReport[] | 
getMapTaskReports(String jobId)
Deprecated. Applications should rather use getMapTaskReports(JobID) | 
 TaskReport[] | 
getReduceTaskReports(JobID jobId)
Get the information of the current state of the reduce tasks of a job.  | 
 TaskReport[] | 
getReduceTaskReports(String jobId)
Deprecated. Applications should rather use getReduceTaskReports(JobID) | 
 Path | 
getSystemDir()
Grab the jobtracker system directory path where job-specific files are to be placed.  | 
 JobClient.TaskStatusFilter | 
getTaskOutputFilter()
Deprecated.  | 
static JobClient.TaskStatusFilter | 
getTaskOutputFilter(JobConf job)
Get the task output filter out of the JobConf.  | 
 void | 
init(JobConf conf)
Connect to the default JobTracker. | 
 JobStatus[] | 
jobsToComplete()
Get the jobs that are not completed and not failed.  | 
static void | 
main(String[] argv)
 | 
 int | 
run(String[] argv)
Execute the command with the given arguments.  | 
static RunningJob | 
runJob(JobConf job)
Utility that submits a job, then polls for progress until the job is complete.  | 
 void | 
setTaskOutputFilter(JobClient.TaskStatusFilter newValue)
Deprecated.  | 
static void | 
setTaskOutputFilter(JobConf job,
                    JobClient.TaskStatusFilter newValue)
Modify the JobConf to set the task output filter.  | 
 RunningJob | 
submitJob(JobConf job)
Submit a job to the MR system.  | 
 RunningJob | 
submitJob(String jobFile)
Submit a job to the MR system.  | 
| Methods inherited from class org.apache.hadoop.conf.Configured | 
|---|
getConf, setConf | 
| Methods inherited from class java.lang.Object | 
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Methods inherited from interface org.apache.hadoop.conf.Configurable | 
|---|
getConf, setConf | 
| Field Detail | 
|---|
public static final int HEARTBEAT_INTERVAL_MIN
public static final int CLUSTER_INCREMENT
public static final long COUNTER_UPDATE_INTERVAL
public static final float MAX_INMEM_FILESYS_USE
public static final float MAX_INMEM_FILESIZE_FRACTION
public static final int SUCCESS
public static final int FILE_NOT_FOUND
public static final String MAP_OUTPUT_LENGTH
public static final String RAW_MAP_OUTPUT_LENGTH
public static final String TEMP_DIR_NAME
public static final String WORKDIR
| Constructor Detail | 
|---|
public JobClient()
public JobClient(JobConf conf)
          throws IOException
JobConf, and connect to the 
 default JobTracker.
conf - the job configuration.
IOException
public JobClient(InetSocketAddress jobTrackAddr,
                 Configuration conf)
          throws IOException
jobTrackAddr - the job tracker to connect to.conf - configuration.
IOException| Method Detail | 
|---|
public static Configuration getCommandLineConfig()
public void init(JobConf conf)
          throws IOException
JobTracker.
conf - the job configuration.
IOException
public void close()
           throws IOException
JobClient.
IOException
public FileSystem getFs()
                 throws IOException
IOException
public RunningJob submitJob(String jobFile)
                     throws FileNotFoundException,
                            InvalidJobConfException,
                            IOException
RunningJob which can be used to track
 the running-job.
jobFile - the job configuration.
RunningJob which can be used to track the
         running-job.
FileNotFoundException
InvalidJobConfException
IOException
public RunningJob submitJob(JobConf job)
                     throws FileNotFoundException,
                            InvalidJobConfException,
                            IOException
RunningJob which can be used to track
 the running-job.
job - the job configuration.
RunningJob which can be used to track the
         running-job.
FileNotFoundException
InvalidJobConfException
IOException
public RunningJob getJob(JobID jobid)
                  throws IOException
RunningJob object to track an ongoing job.  Returns
 null if the id does not correspond to any known job.
jobid - the jobid of the job.
RunningJob handle to track the job, null if the 
         jobid doesn't correspond to any known job.
IOException
@Deprecated
public RunningJob getJob(String jobid)
                  throws IOException
getJob(JobID).
IOException
public TaskReport[] getMapTaskReports(JobID jobId)
                               throws IOException
jobId - the job to query.
IOException
@Deprecated
public TaskReport[] getMapTaskReports(String jobId)
                               throws IOException
getMapTaskReports(JobID)
IOException
public TaskReport[] getReduceTaskReports(JobID jobId)
                                  throws IOException
jobId - the job to query.
IOException
@Deprecated
public TaskReport[] getReduceTaskReports(String jobId)
                                  throws IOException
getReduceTaskReports(JobID)
IOException
public ClusterStatus getClusterStatus()
                               throws IOException
ClusterStatus.
IOException
public JobStatus[] jobsToComplete()
                           throws IOException
JobStatus for the running/to-be-run jobs.
IOException
public JobStatus[] getAllJobs()
                       throws IOException
JobStatus for the submitted jobs.
IOException
public static RunningJob runJob(JobConf job)
                         throws IOException
job - the job configuration.
IOException@Deprecated public void setTaskOutputFilter(JobClient.TaskStatusFilter newValue)
newValue - task filter.public static JobClient.TaskStatusFilter getTaskOutputFilter(JobConf job)
job - the JobConf to examine.
public static void setTaskOutputFilter(JobConf job,
                                       JobClient.TaskStatusFilter newValue)
job - the JobConf to modify.newValue - the value to set.@Deprecated public JobClient.TaskStatusFilter getTaskOutputFilter()
public int run(String[] argv)
        throws Exception
Tool
run in interface Toolargv - command specific arguments.
Exception
public int getDefaultMaps()
                   throws IOException
IOException
public int getDefaultReduces()
                      throws IOException
IOExceptionpublic Path getSystemDir()
public static void main(String[] argv)
                 throws Exception
Exception
  | 
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||