
Implementation:OpenBMB UltraFeedback Inference Environment Setup

From Leeroopedia


Knowledge Sources
Domains DevOps, ML_Infrastructure
Last Updated 2023-10-02 00:00 GMT

Overview

Concrete tool for setting up the inference environment via shell launcher scripts for the HuggingFace and vLLM backends.

Description

This is an External Tool Doc documenting shell scripts rather than Python APIs. Two launcher scripts configure the environment and invoke the generation pipeline:

run.sh (HuggingFace backend): Installs pinned dependency versions and launches main.py with model_type and shard ID arguments.

run_vllm.sh (vLLM backend): Sets NCCL and Ray environment variables, installs latest package versions, and launches main_vllm_batch.py with model_type argument. Note: the script references main_vllm_batch.py but the repository contains main_vllm.py — this may be a filename discrepancy.
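The filename discrepancy noted above can be guarded against at launch time. A minimal sketch follows; the `resolve_vllm_entry` helper is hypothetical and not part of the repository:

```shell
# Hypothetical helper (not in the repository): resolve whichever vLLM
# entry point actually exists before launching.
resolve_vllm_entry() {
  if [ -f main_vllm_batch.py ]; then
    echo main_vllm_batch.py
  elif [ -f main_vllm.py ]; then
    echo main_vllm.py
  else
    echo "no vLLM entry point found" >&2
    return 1
  fi
}
```

The launcher could then invoke `python "$(resolve_vllm_entry)" --model_type ${1}` instead of hard-coding the filename.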

Usage

Run from the src/comparison_data_generation/ directory:

  • HF backend: bash run.sh {model_type} {shard_id}
  • vLLM backend: bash run_vllm.sh {model_type}

Code Reference

Source Location

  • Repository: UltraFeedback
  • File: src/comparison_data_generation/run.sh (Lines 1-7)
  • File: src/comparison_data_generation/run_vllm.sh (Lines 1-15)

Signature

# run.sh — HuggingFace backend launcher
pip install transformers==4.31.0
pip install tokenizers==0.13.3
pip install deepspeed==0.10.0
pip install accelerate -U

python main.py --model_type ${1} --id ${2}

# run_vllm.sh — vLLM backend launcher
export NCCL_IGNORE_DISABLED_P2P=1

pip install transformers -U
pip install tokenizers -U
pip install deepspeed -U
pip install accelerate -U
pip install vllm -U

echo $1

export NCCL_IGNORE_DISABLED_P2P=1
export RAY_memory_monitor_refresh_ms=0
CUDA_LAUNCH_BLOCKING=1 python main_vllm_batch.py --model_type ${1}

Import

# Shell scripts — no Python imports
# Usage: bash run.sh ultralm-13b 0
# Usage: bash run_vllm.sh ultralm-13b

I/O Contract

Inputs

Name Type Required Description
$1 (model_type) str Yes Model identifier (e.g., "ultralm-13b", "alpaca-7b")
$2 (shard_id) int HF only Shard ID for parallel processing (0, 1, 2, ...)
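The contract above can be checked before dispatching to either launcher. A sketch, assuming the `check_args` helper is added by the caller (it is not part of either script):

```shell
# Hypothetical pre-flight check (not in the repository): validate the
# positional arguments described in the I/O contract.
check_args() {
  model_type="$1"
  shard_id="$2"   # required for the HF backend only
  [ -n "$model_type" ] || { echo "model_type is required" >&2; return 1; }
  if [ -n "$shard_id" ]; then
    case "$shard_id" in
      *[!0-9]*) echo "shard_id must be a non-negative integer" >&2; return 1 ;;
    esac
  fi
}
```

A vLLM invocation would pass only the model type; the shard ID check is skipped when the second argument is absent.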

Outputs

Name Type Description
Installed environment System Python packages installed at pinned versions (HF backend) or latest versions (vLLM backend)
Pipeline execution Process Launches main.py or main_vllm_batch.py with arguments

Usage Examples

HuggingFace Backend

# Generate completions for ultralm-13b, shard 0
cd src/comparison_data_generation/
bash run.sh ultralm-13b 0

# Generate completions for alpaca-7b, shard 1
bash run.sh alpaca-7b 1
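Since the HF launcher handles one shard per invocation, all shards can be fanned out in parallel. A sketch, in which the `launch_shards` wrapper and its `launcher` parameter are illustrative assumptions (the parameter exists so the command can be stubbed):

```shell
# Hypothetical fan-out (not in the repository): run one launcher process
# per shard in the background, then wait for all of them to finish.
launch_shards() {
  model="$1"
  num_shards="$2"
  launcher="${3:-bash run.sh}"   # swap in a stub when testing
  shard=0
  while [ "$shard" -lt "$num_shards" ]; do
    $launcher "$model" "$shard" &
    shard=$((shard + 1))
  done
  wait
}
```

For example, `launch_shards ultralm-13b 4` would run shards 0 through 3 concurrently.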

vLLM Backend

# Generate completions for ultralm-13b with vLLM (all shards at once)
cd src/comparison_data_generation/
bash run_vllm.sh ultralm-13b

# Generate completions for vicuna-33b
bash run_vllm.sh vicuna-33b
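Several models can also be processed back to back. A sketch, assuming a hypothetical `run_models` wrapper (the launcher command is passed in so it can be stubbed; it is not part of the repository):

```shell
# Hypothetical wrapper (not in the repository): invoke the given launcher
# once per model, stopping at the first failure.
run_models() {
  launcher="$1"
  shift
  for model in "$@"; do
    $launcher "$model" || return 1
  done
}
```

Usage would look like `run_models "bash run_vllm.sh" ultralm-13b vicuna-33b`.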

Related Pages

Implements Principle

Requires Environment
