Implementation:Heibaiying BigData Notes WordCountMapper Map

From Leeroopedia


Knowledge Sources
Domains Distributed_Computing, Big_Data
Last Updated 2026-02-10 10:00 GMT

Overview

Concrete tool, provided by the BigData-Notes repository, for tokenizing input text lines into individual words and emitting a count of 1 for each.

Description

The WordCountMapper class extends Hadoop's Mapper<LongWritable, Text, Text, IntWritable> and implements the map phase of the word count pipeline. For each input line, the map() method splits the text by tab characters and emits a (word, 1) key-value pair for every token found.

The input key is a LongWritable holding the byte offset at which the line starts in the input file (supplied by the framework and typically unused by the mapper logic). The input value is a Text object containing the line content. The output key is a Text object containing an individual word, and the output value is an IntWritable with the value 1.
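The split-and-emit behavior described above can be tried without a Hadoop installation. The following is a minimal sketch, not the repository's code: the class MapLogicSketch and its mapLine method are hypothetical names, and a plain list of entries stands in for the Context that the real map() writes to.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free stand-in for the map step; not the repository's class.
class MapLogicSketch {

    // Splits one line on tab characters and returns a (word, 1) pair per token,
    // mirroring what map() is described as writing to the context.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String word : line.split("\t")) {
            emitted.add(new SimpleEntry<>(word, 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Print each emitted pair for a short sample line.
        for (Map.Entry<String, Integer> pair : mapLine("Spark\tHadoop\tHBase")) {
            System.out.println("(" + pair.getKey() + ", " + pair.getValue() + ")");
        }
    }
}
```

In the actual mapper the pairs would be Text and IntWritable objects passed to context.write rather than entries collected in a list.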

Usage

Use this mapper as part of a word count MapReduce job. It is registered with the job via job.setMapperClass(WordCountMapper.class) during job assembly.

Code Reference

Source Location

  • Repository: BigData-Notes
  • File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/component/WordCountMapper.java
  • Lines: L13-23

Signature

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException
}

Import

import com.heibaiying.component.WordCountMapper;

I/O Contract

Inputs

Name | Type | Required | Description
key | LongWritable | Yes | Byte offset of the input line (provided by the framework)
value | Text | Yes | A single line of text from the input file (tab-delimited words)
context | Context | Yes | The Mapper context used to emit output key-value pairs

Outputs

Name | Type | Description
key | Text | An individual word extracted from the input line
value | IntWritable | The integer constant 1, representing a single occurrence of the word

Usage Examples

Basic Usage

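import com.heibaiying.component.WordCountMapper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

// Inside the driver's main method (Job.getInstance can throw IOException)
Configuration conf = new Configuration();

// Register the mapper with a MapReduce job and declare its output types
Job job = Job.getInstance(conf, "WordCount");
job.setMapperClass(WordCountMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);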

Internal Logic

// For an input line: "Spark\tHadoop\tHBase\tStorm\tFlink\tHive"
// The map method splits by "\t" and emits:
//   ("Spark", 1)
//   ("Hadoop", 1)
//   ("HBase", 1)
//   ("Storm", 1)
//   ("Flink", 1)
//   ("Hive", 1)
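
Splitting on "\t" has a few edge cases worth knowing when input lines are not perfectly formed. The sketch below only exercises the JDK's String.split. It assumes the mapper splits with split("\t") and does not filter the resulting tokens, which this page does not confirm. The class SplitEdgeCases and its tokens method are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class; only String.split from the JDK is exercised here.
class SplitEdgeCases {

    // Returns the tokens a tab split would produce for the given line.
    static String[] tokens(String line) {
        return line.split("\t");
    }

    public static void main(String[] args) {
        // Consecutive tabs yield an empty token between the two words.
        System.out.println(Arrays.toString(tokens("Spark\t\tHadoop")));
        // Trailing tabs are dropped because split uses a default limit of 0.
        System.out.println(Arrays.toString(tokens("Spark\tHadoop\t\t")));
        // An empty line still yields a single empty token.
        System.out.println(tokens("").length);
    }
}
```

Under that assumption, blank lines and doubled tabs would each contribute an ("", 1) pair to the map output, so skipping empty tokens before emitting is a reasonable safeguard when the input is not guaranteed to be clean.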

Related Pages

Implements Principle

Requires Environment
