Implementation:PacktPublishing LLM Engineers Handbook HuggingFaceProcessor Run


Overview

HuggingFaceProcessor Run implements the Principle:PacktPublishing_LLM_Engineers_Handbook_SageMaker_Evaluation_Orchestration principle by wrapping the SageMaker HuggingFaceProcessor API to launch a GPU-backed processing job that executes the evaluation script remotely.

Aspect | Detail
Implementation Name | HuggingFaceProcessor Run
Workflow | Model_Evaluation
Type | Wrapper Doc (SageMaker)
Source File | llm_engineering/model/evaluation/sagemaker.py (Lines 17–57)
Implements | Principle:PacktPublishing_LLM_Engineers_Handbook_SageMaker_Evaluation_Orchestration

API Signature

def run_evaluation_on_sagemaker(is_dummy: bool = True) -> None

Internally creates and invokes:

HuggingFaceProcessor(role, instance_count, instance_type, ...).run(code, source_dir)

Key Code

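def run_evaluation_on_sagemaker(is_dummy: bool = True) -> None:
    # `settings` and `current_dir` are not defined in this excerpt; they are
    # presumably module-level names in sagemaker.py (the project settings and
    # the directory that holds evaluate.py).
    hfp = HuggingFaceProcessor(
        role=settings.AWS_ARN_ROLE,  # IAM role the processing job assumes
        instance_count=1,
        instance_type="ml.g5.2xlarge",
        # The framework versions select the Hugging Face container image.
        transformers_version="4.36",
        pytorch_version="2.1",
        py_version="py310",
        base_job_name="llm-twin-evaluation",
        env={
            "HUGGING_FACE_HUB_TOKEN": settings.HUGGING_FACE_HUB_TOKEN,
            "OPENAI_API_KEY": settings.OPENAI_API_KEY,
            "MODEL_HUGGINGFACE_WORKSPACE": settings.MODEL_HUGGINGFACE_WORKSPACE,
            # Environment values must be strings, hence str(is_dummy).
            "IS_DUMMY": str(is_dummy),
        },
    )

    # Upload source_dir and execute evaluate.py as the job entry point.
    hfp.run(
        code="evaluate.py",
        source_dir=str(current_dir),
    )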

Imports

from sagemaker.huggingface import HuggingFaceProcessor

Inputs

Parameter | Type | Description
is_dummy | bool | When True, runs evaluation in dummy/lightweight mode for testing. Defaults to True.
AWS credentials | Environment | IAM role ARN from settings.AWS_ARN_ROLE
HuggingFace token | Environment | Hub access token from settings.HUGGING_FACE_HUB_TOKEN
OpenAI API key | Environment | API key for LLM-as-Judge scoring from settings.OPENAI_API_KEY
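
The three settings-derived inputs are all needed for the job to start and complete, so it can help to check them before any AWS call is made. The following is a minimal sketch of such a check, not code from the original module: `missing_settings`, `REQUIRED_SETTINGS`, and the stand-in `example_settings` object are hypothetical names introduced for illustration.

```python
from types import SimpleNamespace

# Setting names taken from the Inputs table above.
REQUIRED_SETTINGS = ("AWS_ARN_ROLE", "HUGGING_FACE_HUB_TOKEN", "OPENAI_API_KEY")


def missing_settings(settings, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not getattr(settings, name, None)]


# Hypothetical stand-in for the project's settings object.
example_settings = SimpleNamespace(
    AWS_ARN_ROLE="arn:aws:iam::123456789012:role/example-role",
    HUGGING_FACE_HUB_TOKEN="",
    OPENAI_API_KEY=None,
)

missing = missing_settings(example_settings)
if missing:
    print(f"Cannot launch evaluation, missing settings: {', '.join(missing)}")
```

Failing locally in this way is cheaper than discovering a missing token after the GPU instance has already been provisioned.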

Outputs

A SageMaker processing job is launched that:

  • Provisions an ml.g5.2xlarge GPU instance
  • Pulls the HuggingFace Deep Learning Container (DLC) image with Transformers 4.36 and PyTorch 2.1 preinstalled
  • Uploads and executes evaluate.py from the local source directory
  • Passes the variables defined in env (Hub token, OpenAI API key, workspace identifier, IS_DUMMY) into the container
  • Runs the full evaluation pipeline (inference + scoring + aggregation) on the remote instance

The function returns None; results are persisted to HuggingFace Hub by the evaluation script itself.
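
Because the flag is sent as `str(is_dummy)`, the container sees the text "True" or "False" rather than a boolean, and a naive `bool(...)` on that text is always true. How evaluate.py actually reads the variable is not shown here; the sketch below is one possible way to parse it, and `env_flag` is a hypothetical helper.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Parse a boolean environment variable that was written with str(bool)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes"}


# Simulate what the container would see after env={"IS_DUMMY": str(False)}.
os.environ["IS_DUMMY"] = str(False)

naive = bool(os.environ["IS_DUMMY"])  # any non-empty string is truthy
parsed = env_flag("IS_DUMMY")

print(f"naive={naive} parsed={parsed}")
```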

Configuration Details

Setting | Value | Purpose
instance_type | ml.g5.2xlarge | NVIDIA A10G GPU with 24 GB VRAM, sufficient for 7B parameter models
instance_count | 1 | Single instance; evaluation is not distributed
transformers_version | 4.36 | Matches the version used during training
pytorch_version | 2.1 | Matches the version used during training
py_version | py310 | Python 3.10 runtime
base_job_name | llm-twin-evaluation | Prefix for the SageMaker job name in the AWS console
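
The claim that 24 GB is sufficient for a 7B-parameter model can be sanity-checked with back-of-the-envelope arithmetic. The sketch below counts model weights only, under the assumption that the model is loaded in 16-bit precision (the precision used by the evaluation script is not stated here); activations, the KV cache, and framework overhead are ignored, and the quoted 24 GB is treated as 24 GiB for simplicity.

```python
GIB = 1024 ** 3


def weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


A10G_VRAM_GIB = 24  # figure quoted in the table above

for label, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2)]:
    need = weights_gib(7e9, bytes_per_param)
    verdict = "fits" if need < A10G_VRAM_GIB else "does not fit"
    print(f"7B weights in {label}: {need:.1f} GiB ({verdict} in {A10G_VRAM_GIB} GiB)")
```

Under these assumptions the 16-bit weights leave roughly 10 GiB of headroom for inference, while full 32-bit weights would not fit on a single A10G.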

External Dependencies

Dependency | Purpose
sagemaker | AWS SageMaker Python SDK for creating and managing processing jobs
huggingface_hub | Accessed within the remote evaluation script for model/dataset operations
loguru | Structured logging within the orchestration function

How It Works

  1. The function is called with an is_dummy flag indicating whether to run a lightweight evaluation
  2. A HuggingFaceProcessor instance is created with the specified IAM role, instance configuration, and framework versions
  3. All required environment variables (API keys, workspace identifier, dummy flag) are injected via the env parameter
  4. The .run() method uploads the entire source_dir (containing evaluate.py and supporting modules) to S3
  5. SageMaker provisions the specified GPU instance, pulls the HuggingFace DLC (Deep Learning Container), and executes evaluate.py
  6. The evaluation script runs the full pipeline: model validation, batch inference, LLM-as-Judge scoring, and results aggregation
  7. Upon completion, the instance is automatically terminated
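
Steps 4 to 7 all happen inside `.run()`, which blocks until the job finishes unless told otherwise. The sketch below shows one way to submit the job without blocking and keep its name for later lookup. It assumes the `wait` and `logs` parameters of `run()` and the `latest_job` attribute behave as in the SageMaker Python SDK processor classes, which should be verified against the installed SDK version. `launch_evaluation` and `FakeProcessor` are hypothetical names; the fake exists only so the example runs without AWS access.

```python
from types import SimpleNamespace


def launch_evaluation(processor, source_dir: str, wait: bool = True) -> str:
    """Start the processing job and return its name.

    `processor` is expected to behave like a SageMaker HuggingFaceProcessor:
    a run() method plus a latest_job attribute set once the job is created.
    """
    # Log streaming is only possible while waiting, so tie logs to wait.
    processor.run(code="evaluate.py", source_dir=source_dir, wait=wait, logs=wait)
    return processor.latest_job.job_name


class FakeProcessor:
    """Hypothetical stand-in so the sketch runs without AWS access."""

    def __init__(self):
        self.latest_job = None
        self.calls = []

    def run(self, **kwargs):
        self.calls.append(kwargs)
        self.latest_job = SimpleNamespace(job_name="llm-twin-evaluation-example")


fake = FakeProcessor()
job_name = launch_evaluation(fake, source_dir="path/to/evaluation", wait=False)
print(job_name)
```

With a real processor and `wait=False`, the call would return once the job is submitted, and the returned name could be used to find the job in the AWS console.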
