org.apache.hadoop.contrib.index.main
Class UpdateIndex
java.lang.Object
  org.apache.hadoop.contrib.index.main.UpdateIndex

public class UpdateIndex
extends Object
 
A distributed "index" is partitioned into "shards". Each shard corresponds
 to a Lucene instance. This class contains the main() method, which runs a
 Map/Reduce job to analyze documents and update Lucene instances in parallel.
 
 The main() method in UpdateIndex requires the following information for
 updating the shards:
   - Input formatter. This specifies how to format the input documents.
   - Analysis. This defines the analyzer to use on the input. The analyzer
     determines whether a document is being inserted, updated, or deleted.
     For inserts or updates, the analyzer also converts each input document
     into a Lucene document.
   - Input paths. This provides the location(s) of updated documents,
     e.g., HDFS files or directories, or HBase tables.
   - Shard paths, or index path with the number of shards. Either specify
     the path for each shard, or specify an index path, in which case the
     shards are the sub-directories of the index directory.
   - Output path. When the update to a shard is done, a message is put here.
   - Number of map tasks.
 All of this information can be specified in a configuration file. All but
 the first two items can also be specified as command line options. See
 conf/index-config.xml.template for other configurable parameters.

 Note: Because of the parallel nature of Map/Reduce, the behaviour of
 multiple inserts, deletes or updates to the same document is undefined.
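
 The two ways of naming the shards described above (an explicit path per
 shard, or an index path plus a shard count) can be illustrated with a small
 standalone sketch. This is not the code used by UpdateIndex. The class name
 ShardPathSketch, the method resolveShards, and the zero-padded sub-directory
 names are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardPathSketch {

    /**
     * Resolves shard paths from either an explicit comma-separated list or
     * an index path plus a shard count. Exactly one form must be supplied.
     * (Hypothetical helper, not part of the Hadoop API.)
     */
    static List<String> resolveShards(String shardList, String indexPath, int numShards) {
        boolean hasList = shardList != null && !shardList.isEmpty();
        boolean hasIndex = indexPath != null && !indexPath.isEmpty();
        if (hasList == hasIndex) {
            throw new IllegalArgumentException(
                "Specify either the shard paths or an index path, not both or neither");
        }
        if (hasList) {
            // Explicit form: one path per shard.
            return new ArrayList<>(Arrays.asList(shardList.split(",")));
        }
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // Index-path form: each shard is a sub-directory of the index directory.
        String parent = indexPath.endsWith("/") ? indexPath : indexPath + "/";
        List<String> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            // Assumed naming scheme: zero-padded shard number.
            shards.add(parent + String.format("%05d", i));
        }
        return shards;
    }

    public static void main(String[] args) {
        System.out.println(resolveShards(null, "/indexes/docs", 3));
        System.out.println(resolveShards("/a/shard0,/a/shard1", null, 0));
    }
}
```

 Rejecting the case where both forms are given keeps the shard set
 unambiguous, which matters because each shard is updated independently.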

Field Summary
static org.apache.commons.logging.Log  LOG

Method Summary
static void  main(String[] argv)
          The main() method

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

LOG
public static final org.apache.commons.logging.Log LOG

Constructor Detail

UpdateIndex
public UpdateIndex()

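Method Detail

main
public static void main(String[] argv)
    The main() method. Runs a Map/Reduce job that analyzes the input
    documents and updates the shards in parallel.
    Parameters:
        argv - the command line arguments, such as the input paths, the
               shard paths or index path, the output path and the number
               of map tasks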
 
 
Copyright © 2008 The Apache Software Foundation