
Implementation:Heibaiying BigData Notes Job SetCombinerClass

From Leeroopedia


Knowledge Sources
Domains Distributed_Computing, Big_Data
Last Updated 2026-02-10 10:00 GMT

Overview

A concrete tool, provided by the Hadoop MapReduce framework, for configuring a local Combiner on a MapReduce job so that pre-aggregation happens before the shuffle phase.

Description

The job.setCombinerClass() method registers a Combiner class with the MapReduce job. In the BigData-Notes word count example, the WordCountReducer is reused as the Combiner because the summation operation it performs is both associative and commutative. This is a common pattern in MapReduce applications where the reduce logic can be safely applied as a local optimization.
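Why this reuse is safe can be shown without Hadoop at all. The following plain-Java sketch (class name and sample data are illustrative, not from the BigData-Notes repository) simulates two mappers and checks that applying the sum-based reduce locally on each mapper's output, then reducing the combined partial sums, gives the same totals as reducing everything directly:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Demonstrates that a sum-based reduce is combiner-safe: because
// summation is associative and commutative, local pre-aggregation
// followed by a final reduce equals a single direct reduce.
public class CombinerSafetyDemo {
    // Sum counts per word, the same logic a word-count reducer applies.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> out = new HashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            out.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two "mappers", each emitting (word, 1) pairs.
        List<Map.Entry<String, Integer>> mapper1 = List.of(
            Map.entry("hadoop", 1), Map.entry("spark", 1), Map.entry("hadoop", 1));
        List<Map.Entry<String, Integer>> mapper2 = List.of(
            Map.entry("spark", 1), Map.entry("hadoop", 1));

        // Path A: shuffle all pairs and reduce once (no Combiner).
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        all.addAll(mapper1);
        all.addAll(mapper2);
        Map<String, Integer> direct = reduce(all);

        // Path B: run the same reduce locally on each mapper's output
        // (the Combiner role), then reduce the partial sums.
        List<Map.Entry<String, Integer>> partials = new ArrayList<>();
        reduce(mapper1).forEach((k, v) -> partials.add(Map.entry(k, v)));
        reduce(mapper2).forEach((k, v) -> partials.add(Map.entry(k, v)));
        Map<String, Integer> combined = reduce(partials);

        System.out.println(direct.equals(combined)); // true: sum is combiner-safe
    }
}
```

A non-associative operation such as averaging would fail this check (the average of per-mapper averages is not the global average), which is why only reduce logic like summation can be reused as a Combiner.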

When set, the Combiner runs on each mapper node after map output is produced but before it is shuffled to the reducers. Hadoop treats the Combiner as an optional optimization and may invoke it zero, one, or more times per key, which is why the logic must be safe to apply repeatedly. This page documents direct use of the Hadoop Job.setCombinerClass() API and demonstrates how to wire an existing Reducer into the Combiner role.

Usage

Use this configuration when your reduce function is associative and commutative (such as summation) and you want to reduce the volume of data shuffled across the network. It is a single line of configuration during job assembly.

Code Reference

Source Location

  • Repository: BigData-Notes
  • File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/WordCountCombinerApp.java
  • Lines: L55

Signature

// From org.apache.hadoop.mapreduce.Job:
public void setCombinerClass(Class<? extends Reducer> cls) throws IllegalStateException

// Usage in WordCountCombinerApp:
job.setCombinerClass(WordCountReducer.class);

Import

import org.apache.hadoop.mapreduce.Job;
import com.heibaiying.component.WordCountReducer;

I/O Contract

Inputs

  • cls (Class<? extends Reducer>, required): the Reducer class to be used as the Combiner (e.g., WordCountReducer.class)

Outputs

  • return value (void): configures the job internally; no return value

Usage Examples

Basic Usage

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import com.heibaiying.component.WordCountMapper;
import com.heibaiying.component.WordCountReducer;

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCountWithCombiner");
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordCountReducer.class);

// Reuse the Reducer as the Combiner for local pre-aggregation
job.setCombinerClass(WordCountReducer.class);

Verifying Combiner Effect

// Without Combiner:
//   Mapper emits 6000 pairs (1000 lines x 6 words)
//   All 6000 pairs are shuffled to reducers

// With Combiner:
//   Mapper emits 6000 pairs (1000 lines x 6 words)
//   Combiner reduces to 6 pairs per mapper (one per unique word)
//   Only 6 pairs per mapper are shuffled to reducers

// The Combiner counters in the job output will show:
//   Combine input records: 6000
//   Combine output records: 6
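The expected reduction can be sanity-checked with simple arithmetic. A plain-Java sketch using the example's assumed figures (1000 lines, 6 words per line, 6 unique words; these are the example's assumptions, not measurements):

```java
// Sanity-check of the shuffle-volume figures above.
public class ShuffleMath {
    public static void main(String[] args) {
        int lines = 1000;
        int wordsPerLine = 6;
        int uniqueWords = 6;

        int combineInputRecords = lines * wordsPerLine;  // pairs the mapper emits
        int combineOutputRecords = uniqueWords;          // one partial sum per unique word
        int reductionFactor = combineInputRecords / combineOutputRecords;

        System.out.println(combineInputRecords);  // 6000
        System.out.println(combineOutputRecords); // 6
        System.out.println(reductionFactor);      // 1000
    }
}
```

In a real run, the first two figures correspond to the Combine input/output record counters printed in the job's console output.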

Related Pages

Implements Principle

Requires Environment
