
Implementation:Heibaiying BigData Notes WordCountReducer Reduce

From Leeroopedia


Knowledge Sources

  • Domains: Distributed_Computing, Big_Data
  • Last Updated: 2026-02-10 10:00 GMT

Overview

A concrete implementation from the BigData-Notes repository that aggregates per-word occurrence counts into total word frequencies.

Description

The WordCountReducer class extends Hadoop's Reducer<Text, IntWritable, Text, IntWritable> and implements the reduce phase of the word count pipeline. For each unique word (key), the reduce() method iterates over all associated count values and computes their sum. It then emits a single (word, totalCount) pair representing the total number of occurrences of that word across the entire input dataset.

Because the summation operation is both associative and commutative, this reducer can also be reused as a Combiner for local pre-aggregation on the mapper side.
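Because summation is order-insensitive, summing partial sums yields the same total as summing the raw counts. The following plain-Java sketch illustrates why the reducer is safe to reuse as a Combiner; it substitutes Integer for Hadoop's IntWritable so it runs without a Hadoop dependency, and the class name is chosen here for illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {

    // Same summation the reducer performs for one word key.
    public static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two map tasks each emit raw counts of 1 for the word "Hadoop".
        List<Integer> mapTask1 = Arrays.asList(1, 1, 1);
        List<Integer> mapTask2 = Arrays.asList(1, 1);

        // A Combiner pre-aggregates locally on each mapper.
        int partial1 = sum(mapTask1); // 3
        int partial2 = sum(mapTask2); // 2

        // The reducer then sums the partial sums...
        int withCombiner = sum(Arrays.asList(partial1, partial2));
        // ...which equals summing all raw counts directly.
        int withoutCombiner = sum(Arrays.asList(1, 1, 1, 1, 1));

        System.out.println(withCombiner + " == " + withoutCombiner); // prints "5 == 5"
    }
}
```

The practical benefit is reduced shuffle traffic: each mapper ships one partial sum per word instead of one record per occurrence.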

Usage

Use this reducer as part of a word count MapReduce job. It is registered with the job via job.setReducerClass(WordCountReducer.class) during job assembly. It can also be set as the Combiner via job.setCombinerClass(WordCountReducer.class).

Code Reference

Source Location

  • Repository: BigData-Notes
  • File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/component/WordCountReducer.java
  • Lines: L12-22

Signature

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException
}

Import

import com.heibaiying.component.WordCountReducer;

I/O Contract

Inputs

Name    | Type                  | Required | Description
key     | Text                  | Yes      | A unique word from the intermediate map output
values  | Iterable<IntWritable> | Yes      | An iterable of integer counts (each typically 1, or pre-aggregated by a Combiner)
context | Context               | Yes      | The Reducer context used to emit the final output key-value pair

Outputs

Name  | Type        | Description
key   | Text        | The word being counted
value | IntWritable | The total number of occurrences of the word across all input data

Usage Examples

Basic Usage

import com.heibaiying.component.WordCountReducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

// Register the reducer with a MapReduce job
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCount");
job.setReducerClass(WordCountReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

Internal Logic

// For key "Hadoop" with values [1, 1, 1, 1, 1]:
// The reduce method computes:
//   sum = 0
//   sum += 1  -> sum = 1
//   sum += 1  -> sum = 2
//   sum += 1  -> sum = 3
//   sum += 1  -> sum = 4
//   sum += 1  -> sum = 5
// Emits: ("Hadoop", 5)
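The trace above can be reproduced with a plain-Java sketch of the same summation loop; Hadoop's Text/IntWritable types are replaced with String/Integer so the snippet runs standalone, and the class name is illustrative, not part of the repository.

```java
import java.util.Arrays;
import java.util.List;

public class ReduceSketch {

    // Mirrors the body of WordCountReducer.reduce(): sum the counts for one key.
    public static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 1, 1, 1, 1);
        // Stand-in for context.write(key, new IntWritable(sum))
        System.out.println("(\"Hadoop\", " + reduce(counts) + ")"); // prints ("Hadoop", 5)
    }
}
```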

Related Pages

Implements Principle

Requires Environment
