Implementation: Heibaiying BigData-Notes Job.setCombinerClass
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Computing, Big_Data |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
A concrete configuration hook, provided by the Hadoop MapReduce framework, for registering a local Combiner on a MapReduce job so that pre-aggregation happens before the shuffle phase.
Description
The job.setCombinerClass() method registers a Combiner class with the MapReduce job. In the BigData-Notes word count example, the WordCountReducer is reused as the Combiner because the summation operation it performs is both associative and commutative. This is a common pattern in MapReduce applications where the reduce logic can be safely applied as a local optimization.
When set, Hadoop may invoke the Combiner zero or more times on each mapper's output (typically as map spill files are written and merged), before the data is shuffled to the reducers; job correctness must therefore never depend on the Combiner actually running. This entry is a thin wrapper around the Hadoop Job.setCombinerClass() API that demonstrates how to wire an existing Reducer into the Combiner role.
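The data flow can be sketched in plain Java with no Hadoop dependency (class and variable names here are illustrative, not from the repository): each mapper's raw (word, 1) pairs are collapsed locally by the same summation logic the Reducer uses, and only the collapsed pairs are shuffled.

```java
import java.util.*;

public class CombinerFlowSketch {
    // Local pre-aggregation: the same summation logic as the Reducer,
    // applied to a single mapper's output before the shuffle.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> e : mapOutput) {
            out.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // One mapper's raw output: one (word, 1) pair per occurrence.
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String w : "hadoop spark hadoop flink hadoop spark".split(" ")) {
            mapOutput.add(new AbstractMap.SimpleEntry<>(w, 1));
        }
        System.out.println("pairs before combine: " + mapOutput.size()); // 6
        Map<String, Integer> shuffled = combine(mapOutput);
        System.out.println("pairs after combine:  " + shuffled.size());  // 3
        System.out.println(shuffled); // {flink=1, hadoop=3, spark=2}
    }
}
```

Only the three combined pairs cross the network instead of six raw pairs; the reducers then merge the partial sums exactly as they would have merged the raw ones.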
Usage
Use this configuration when your reduce function is associative and commutative (such as summation) and you want to cut the volume of data shuffled across the network. Note also that the Combiner's input and output key/value types must both match the map output types, which is a further reason reusing the Reducer works in this example. It is a single line of configuration during job assembly.
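To see why the associative/commutative condition matters, here is a small plain-Java sketch (no Hadoop dependency; class and method names are illustrative): summation gives the same answer whether or not it is applied locally first, while a naive average does not, which is why an averaging Reducer must not be reused as a Combiner.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSafety {
    // Word-count reduce logic: sum of partial counts. Safe as a Combiner.
    static int sum(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // A naive "average" reduce: NOT safe as a Combiner, because an
    // average of partial averages is not the overall average.
    static double mean(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Two mappers emit partial value lists for the same key.
        List<Integer> mapper1 = Arrays.asList(1, 1, 1);
        List<Integer> mapper2 = Arrays.asList(1);

        // Sum: combining locally first gives the same result.
        int direct = sum(Arrays.asList(1, 1, 1, 1));                   // 4
        int combined = sum(Arrays.asList(sum(mapper1), sum(mapper2))); // 4
        System.out.println(direct == combined);                        // true

        // Mean: combining locally first changes the result.
        double directMean = mean(Arrays.asList(2.0, 4.0, 6.0, 8.0));   // 5.0
        double combinedMean = mean(Arrays.asList(
                mean(Arrays.asList(2.0, 4.0, 6.0)),                    // 4.0
                mean(Arrays.asList(8.0))));                            // 8.0
        System.out.println(directMean);   // 5.0
        System.out.println(combinedMean); // 6.0
    }
}
```

For averages the standard workaround is to have the Combiner emit (sum, count) pairs and compute the quotient only in the final Reducer.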
Code Reference
Source Location
- Repository: BigData-Notes
- File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/WordCountCombinerApp.java
- Lines: L55
Signature
// From org.apache.hadoop.mapreduce.Job:
public void setCombinerClass(Class<? extends Reducer> cls) throws IllegalStateException
// Usage in WordCountCombinerApp:
job.setCombinerClass(WordCountReducer.class);
Import
import org.apache.hadoop.mapreduce.Job;
import com.heibaiying.component.WordCountReducer;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cls | Class<? extends Reducer> | Yes | The Reducer class to be used as the Combiner (e.g., WordCountReducer.class) |
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | void | Configures the job in place; no return value |
Usage Examples
Basic Usage
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import com.heibaiying.component.WordCountMapper;
import com.heibaiying.component.WordCountReducer;

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCountWithCombiner");
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordCountReducer.class);
// Reuse the Reducer as the Combiner for local pre-aggregation
job.setCombinerClass(WordCountReducer.class);
Verifying Combiner Effect
// Without Combiner:
// Mapper emits 6000 pairs (1000 lines x 6 words)
// All 6000 pairs are shuffled to reducers
// With Combiner:
// Mapper emits 6000 pairs (1000 lines x 6 words)
// Combiner reduces to 6 pairs per mapper (one per unique word)
// Only 6 pairs per mapper are shuffled to reducers
// With a single mapper, the Combiner counters in the job output will show:
// Combine input records: 6000
// Combine output records: 6
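The counter arithmetic above can be reproduced with a quick standalone check (a sketch assuming a single mapper; the six concrete words are placeholders, not taken from the repository's test input):

```java
import java.util.*;

public class CombinerCounterCheck {
    // The Combiner's effect on this job: one output record per unique word.
    static Map<String, Integer> combine(List<String> mapOutputWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : mapOutputWords) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        // 1000 lines, each containing the same 6 words (illustrative input).
        String[] words = {"hadoop", "spark", "flink", "hive", "hbase", "kafka"};
        List<String> mapOutput = new ArrayList<>();
        for (int i = 0; i < 1000; i++) mapOutput.addAll(Arrays.asList(words));

        Map<String, Integer> combined = combine(mapOutput);
        System.out.println("Combine input records: " + mapOutput.size()); // 6000
        System.out.println("Combine output records: " + combined.size()); // 6
        // Each surviving pair carries the local count for its word.
        System.out.println(combined.get("hadoop"));                       // 1000
    }
}
```

The shuffle volume drops by a factor of 1000 here because every duplicate key on the mapper collapses into a single pre-summed record.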