Implementation:Heibaiying BigData Notes WordCountMapper Map

Knowledge Sources	BigData-Notes Hadoop API
Domains	Distributed_Computing, Big_Data
Last Updated	2026-02-10 10:00 GMT

Overview

Concrete tool, provided by the BigData-Notes repository, for tokenizing input text lines into individual words and emitting a (word, 1) pair for each one.

Description

The WordCountMapper class extends Hadoop's Mapper<LongWritable, Text, Text, IntWritable> and implements the map phase of the word count pipeline. For each input line, the map() method splits the text by tab characters and emits a (word, 1) key-value pair for every token found.
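The split-and-emit behavior described above can be illustrated without the Hadoop runtime. The following is a minimal sketch, not the repository's code: the class name MapLogicSketch and the method mapLine are hypothetical, and a plain list of pairs stands in for the Context that the real mapper writes to.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Framework-free sketch of the map step.
// MapLogicSketch and mapLine are hypothetical names used for illustration only.
class MapLogicSketch {

    // Splits one line on tab characters and returns a (word, 1) pair per token,
    // standing in for the context.write(word, 1) calls of a real mapper.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String word : line.split("\t")) {
            emitted.add(new SimpleEntry<>(word, 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // A repeated word is emitted twice; no summing happens in the map step.
        for (Map.Entry<String, Integer> pair : mapLine("Spark\tHadoop\tSpark")) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

In this sketch a repeated word produces one pair per occurrence, so the line above yields three pairs rather than two. Aggregating the counts is left to a later stage of the pipeline.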

The input key is a LongWritable holding the byte offset at which the line starts in the input file (supplied by the framework and typically unused by the mapper logic). The input value is a Text object containing the line content. The output key is a Text object containing an individual word, and the output value is an IntWritable with the value 1.
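To make the meaning of the input key concrete, the sketch below computes the byte offset at which each line starts in a small piece of UTF-8 text. It is an illustration under stated assumptions: the class name LineOffsetSketch and the method lineOffsets are hypothetical, lines are assumed to be separated by "\n", and the real offsets are produced by the framework's record reader, not by user code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing what a line's byte offset means.
class LineOffsetSketch {

    // Returns the starting byte offset of every line in the UTF-8 encoding of content.
    static List<Long> lineOffsets(String content) {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        List<Long> offsets = new ArrayList<>();
        long start = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                offsets.add(start);   // the line that ends here began at 'start'
                start = i + 1;        // the next line begins after the newline
            }
        }
        if (start < bytes.length) {
            offsets.add(start);       // final line without a trailing newline
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "Spark\tHive\n" occupies 11 bytes, so the second line starts at offset 11.
        System.out.println(lineOffsets("Spark\tHive\nFlink\n")); // prints [0, 11]
    }
}
```

Because the offsets carry no information about the words themselves, a word count mapper can ignore the key entirely.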

Usage

Use this mapper as part of a word count MapReduce job. It is registered with the job via job.setMapperClass(WordCountMapper.class) during job assembly.

Code Reference

Source Location

Repository: BigData-Notes
File: code/Hadoop/hadoop-word-count/src/main/java/com/heibaiying/component/WordCountMapper.java
Lines: L13-23

Signature

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException
}

Import

import com.heibaiying.component.WordCountMapper;

I/O Contract

Inputs

Name	Type	Required	Description
key	LongWritable	Yes	Byte offset of the input line (provided by the framework)
value	Text	Yes	A single line of text from the input file (tab-delimited words)
context	Context	Yes	The Mapper context used to emit output key-value pairs

Outputs

Name	Type	Description
key	Text	An individual word extracted from the input line
value	IntWritable	The integer constant 1, representing a single occurrence of the word

Usage Examples

Basic Usage

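import com.heibaiying.component.WordCountMapper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

// Register the mapper with a MapReduce job
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCount");
job.setMapperClass(WordCountMapper.class);
// Declare the mapper's output types: (Text word, IntWritable count)
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);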

Internal Logic

// For an input line: "Spark\tHadoop\tHBase\tStorm\tFlink\tHive"
// The map method splits by "\t" and emits:
//   ("Spark", 1)
//   ("Hadoop", 1)
//   ("HBase", 1)
//   ("Storm", 1)
//   ("Flink", 1)
//   ("Hive", 1)

Related Pages

Implements Principle

Principle:Heibaiying_BigData_Notes_MapReduce_Map_Phase

Requires Environment
