Implementation: Heibaiying BigData-Notes WordCountReducer Reduce
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Computing, Big_Data |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
A concrete Reducer implementation, provided by the BigData-Notes repository, that aggregates per-word occurrence counts into total word frequencies.
Description
The WordCountReducer class extends Hadoop's Reducer<Text, IntWritable, Text, IntWritable> and implements the reduce phase of the word count pipeline. For each unique word (key), the reduce() method iterates over all associated count values and computes their sum. It then emits a single (word, totalCount) pair representing the total number of occurrences of that word across the entire input dataset.
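The reduce body boils down to a running-sum loop over the values for one key. A minimal plain-Java sketch of that logic (the Hadoop Reducer plumbing, Text/IntWritable wrappers, and Context are omitted here so the snippet runs standalone; the class and method names are illustrative, not the repository's):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class ReduceSketch {
    // Mirrors the reduce() logic: sum all counts associated with one word,
    // then emit a single (word, totalCount) pair.
    static Map.Entry<String, Integer> reduce(String word, Iterable<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c; // corresponds to sum += value.get() on each IntWritable
        }
        return new SimpleEntry<>(word, sum);
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 1, 1, 1, 1);
        Map.Entry<String, Integer> out = reduce("Hadoop", counts);
        System.out.println(out.getKey() + "\t" + out.getValue()); // prints "Hadoop\t5"
    }
}
```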
Because the summation operation is both associative and commutative, this reducer can also be reused as a Combiner for local pre-aggregation on the mapper side.
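Why this reuse is safe: because addition is associative and commutative, summing per-mapper partial sums at the reducer yields the same total as summing every raw count. A quick plain-Java check of that equivalence (the counts below are hypothetical; no Hadoop classes are involved):

```java
import java.util.Arrays;
import java.util.List;

public class CombinerEquivalence {
    static int sum(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Raw counts for one word as emitted by two hypothetical mappers.
        List<Integer> mapper1 = Arrays.asList(1, 1, 1);
        List<Integer> mapper2 = Arrays.asList(1, 1);

        // Without a Combiner: the reducer sees every raw count.
        int direct = sum(mapper1) + sum(mapper2);

        // With the reducer reused as a Combiner: each mapper pre-sums its
        // counts locally, and the reducer sums only the partial results.
        int combined = sum(Arrays.asList(sum(mapper1), sum(mapper2)));

        System.out.println(direct + " == " + combined); // prints "5 == 5"
    }
}
```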
Usage
Use this reducer as part of a word count MapReduce job. It is registered with the job via job.setReducerClass(WordCountReducer.class) during job assembly. It can also be set as the Combiner via job.setCombinerClass(WordCountReducer.class).
Code Reference
Source Location
- Repository: BigData-Notes
- File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/component/WordCountReducer.java
- Lines: L12-22
Signature
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
@Override
protected void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException
}
Import
import com.heibaiying.component.WordCountReducer;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| key | Text | Yes | A unique word from the intermediate map output |
| values | Iterable<IntWritable> | Yes | An iterable of integer counts (each typically 1, or pre-aggregated by a Combiner) |
| context | Context | Yes | The Reducer context used to emit the final output key-value pair |
Outputs
| Name | Type | Description |
|---|---|---|
| key | Text | The word being counted |
| value | IntWritable | The total number of occurrences of the word across all input data |
Usage Examples
Basic Usage
import com.heibaiying.component.WordCountReducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

// Register the reducer with a MapReduce job
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCount");
job.setReducerClass(WordCountReducer.class);
// Optional: reuse the reducer as a Combiner for local pre-aggregation
job.setCombinerClass(WordCountReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
Internal Logic
// For key "Hadoop" with values [1, 1, 1, 1, 1]:
// The reduce method computes:
// sum = 0
// sum += 1 -> sum = 1
// sum += 1 -> sum = 2
// sum += 1 -> sum = 3
// sum += 1 -> sum = 4
// sum += 1 -> sum = 5
// Emits: ("Hadoop", 5)