Implementation: scikit-learn LabelEncoder
| Knowledge Sources | |
|---|---|
| Domains | Data Preprocessing, Label Encoding |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
A concrete scikit-learn tool for encoding target labels as integers between 0 and n_classes-1.
Description
LabelEncoder normalizes labels by mapping each unique label to an integer between 0 and n_classes-1, with integers assigned in the sorted order of the unique labels. This transformer should be used to encode target values (y), not the input features (X). It can handle both numerical and non-numerical labels, as long as they are hashable and comparable. The same module, sklearn.preprocessing, also provides LabelBinarizer and MultiLabelBinarizer for one-hot style label encoding.
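As a minimal sketch of the behavior described above, the following compares LabelEncoder on numeric labels with LabelBinarizer on string labels. The sample labels are made up for illustration and are not from the scikit-learn documentation.

```python
from sklearn.preprocessing import LabelBinarizer, LabelEncoder

# Numeric labels are encoded too: the unique values are sorted,
# and each label is replaced by its position in that sorted order.
le = LabelEncoder()
codes = le.fit_transform([10, 2, 2, 7])  # illustrative labels
print(le.classes_)  # sorted unique labels
print(codes)        # position of each label in classes_

# LabelBinarizer produces one indicator column per class
# when there are more than two classes.
lb = LabelBinarizer()
onehot = lb.fit_transform(["cat", "dog", "bird", "dog"])  # illustrative labels
print(lb.classes_)
print(onehot)
```

Because the integer codes follow sorted order, they imply no meaningful ranking between classes; they are identifiers only.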
Usage
Use LabelEncoder when you need to convert categorical target labels into numeric form for use with classifiers that require numeric labels. It is particularly useful for transforming string labels into integers before model training and for converting predictions back to the original labels with inverse_transform.
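The round trip described above could look like the sketch below. The toy data, the choice of DecisionTreeClassifier, and the variable names are assumptions made for illustration; any classifier that accepts integer targets would work the same way.

```python
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy one-feature dataset with string targets (illustrative only).
X = [[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]]
y = ["low", "low", "mid", "mid", "high", "high"]

# Encode the string targets as integers before training.
le = LabelEncoder()
y_encoded = le.fit_transform(y)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y_encoded)

# The model predicts integer codes; map them back to the original labels.
X_new = [[0.05], [2.05]]
predicted_codes = clf.predict(X_new)
predicted_labels = le.inverse_transform(predicted_codes)
print(predicted_labels)
```

Keeping the fitted encoder alongside the model matters here: the same `le` instance is needed later to decode predictions.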
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/preprocessing/_label.py
Signature
class LabelEncoder(TransformerMixin, BaseEstimator, auto_wrap_output_keys=None):
    """Encode target labels with value between 0 and n_classes-1."""
Import
from sklearn.preprocessing import LabelEncoder
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| y | array-like of shape (n_samples,) | Yes | Target values to encode. Can be numeric or string labels. |
Outputs
| Name | Type | Description |
|---|---|---|
| y_encoded | ndarray of shape (n_samples,) | Encoded labels as integers from 0 to n_classes-1. |
| classes_ | ndarray of shape (n_classes,) | Holds the unique label for each class, learned during fit. |
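The relationship between the two outputs in the table can be checked with a short sketch. The input array below is illustrative; the point is that indexing classes_ with the encoded values recovers the original labels.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

y = np.array(["b", "a", "c", "a"])  # illustrative labels

le = LabelEncoder()
y_encoded = le.fit_transform(y)  # fit and transform in one step

# y_encoded has the same shape as y and an integer dtype.
print(y_encoded.shape, y_encoded.dtype)

# Each encoded value is an index into classes_.
reconstructed = le.classes_[y_encoded]
print(np.array_equal(reconstructed, y))
```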
Usage Examples
Basic Usage
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
# Learn the sorted set of unique labels.
le.fit(["paris", "paris", "tokyo", "amsterdam"])
print(le.classes_)
# ['amsterdam' 'paris' 'tokyo']
# Map labels to their integer codes.
encoded = le.transform(["tokyo", "tokyo", "paris"])
print(encoded)
# [2 2 1]
# Map integer codes back to the original labels.
original = le.inverse_transform([2, 2, 1])
print(original)
# ['tokyo' 'tokyo' 'paris']
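One case the basic example does not cover is a label that was not present during fit. The sketch below, using the same city labels plus a made-up unseen one, shows that transform raises a ValueError in that situation, so callers may want to handle it explicitly.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["paris", "tokyo", "amsterdam"])

# "berlin" was never seen during fit, so it has no integer code.
raised = False
try:
    le.transform(["berlin"])
except ValueError as exc:
    raised = True
    print(f"transform failed: {exc}")
```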