Environment:Mlfoundations Open flamingo HuggingFace Open CLIP Dependencies

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Deep_Learning, Computer_Vision
Last Updated 2026-02-08 03:30 GMT

Overview

Python 3.9 environment with HuggingFace Transformers >= 4.28.1, OpenCLIP >= 2.16.0, and supporting libraries for the OpenFlamingo model.

Description

This environment provides the core model dependencies for OpenFlamingo. It combines HuggingFace Transformers (for language models like OPT, MPT, Pythia, LLaMA) with OpenCLIP (for vision encoders like ViT-L-14). Additional packages include einops for tensor manipulation, sentencepiece for tokenization, and Pillow for image handling. These are the base requirements needed for model creation, inference, and weight loading.

Usage

Use this environment for Model Creation and Inference workflows. It is the mandatory prerequisite for initializing a Flamingo model via `create_model_and_transforms()`, loading pretrained weights, and running text generation.
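As a rough illustration of the initialization step, the sketch below wraps the call in a small helper. The keyword names `clip_vision_encoder_path`, `clip_vision_encoder_pretrained`, `lang_encoder_path`, `use_local_files`, and `cache_dir` appear in the `factory.py` excerpts on this page; `tokenizer_path` and `cross_attn_every_n_layers` are assumed from the project's published usage, and the model identifiers are example values only. The helper names `flamingo_init_kwargs` and `build_flamingo` are hypothetical.

```python
from typing import Any, Dict, Optional


def flamingo_init_kwargs(
    cache_dir: Optional[str] = None, offline: bool = False
) -> Dict[str, Any]:
    """Collect keyword arguments for create_model_and_transforms().

    The model identifiers below are example values (assumptions),
    not defaults mandated by this environment.
    """
    return {
        "clip_vision_encoder_path": "ViT-L-14",
        "clip_vision_encoder_pretrained": "openai",
        "lang_encoder_path": "anas-awadalla/mpt-1b-redpajama-200b",
        "tokenizer_path": "anas-awadalla/mpt-1b-redpajama-200b",
        "cross_attn_every_n_layers": 1,
        "use_local_files": offline,
        "cache_dir": cache_dir,
    }


def build_flamingo(**overrides: Any):
    """Create the model, image processor, and tokenizer.

    Calling this downloads (or loads cached) weights, so it is slow.
    """
    # Imported lazily so this module still loads if open_flamingo is absent.
    from open_flamingo import create_model_and_transforms

    kwargs = flamingo_init_kwargs()
    kwargs.update(overrides)
    return create_model_and_transforms(**kwargs)
```

Under these assumptions, `model, image_processor, tokenizer = build_flamingo(cache_dir="/path/to/cache")` would perform the actual initialization; keeping the arguments in a plain dictionary makes it easy to swap in a different language model or vision encoder.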

System Requirements

| Category | Requirement | Notes |
|----------|-------------|-------|
| Python | 3.9 | Specified in `environment.yml` and `setup.py` classifiers |
| Java | OpenJDK | Required via `conda-forge::openjdk` in `environment.yml` |

Dependencies

Python Packages

  • `transformers` >= 4.28.1
  • `open_clip_torch` >= 2.16.0
  • `torch` == 2.0.1
  • `einops`
  • `einops-exts`
  • `pillow`
  • `sentencepiece`
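To check an existing environment against the list above, something like the following could be used. It is a minimal sketch relying only on the standard library; the helper names are hypothetical, and the version comparison deliberately ignores pre-release and local suffixes (so `2.0.1+cu117` counts as `2.0.1`), which is a simplification rather than a full PEP 440 implementation.

```python
import re
from importlib import metadata
from typing import Dict, List, Optional, Tuple

# Mirrors the package list above: (operator, version), or None if unpinned.
REQUIRED: Dict[str, Optional[Tuple[str, str]]] = {
    "transformers": (">=", "4.28.1"),
    "open_clip_torch": (">=", "2.16.0"),
    "torch": ("==", "2.0.1"),
    "einops": None,
    "einops-exts": None,
    "pillow": None,
    "sentencepiece": None,
}


def release_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release, e.g. '2.0.1+cu117' -> (2, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies(installed: str, op: str, required: str) -> bool:
    """Compare release tuples for the two operators used above."""
    have, want = release_tuple(installed), release_tuple(required)
    # Pad with zeros so that (2, 0) and (2, 0, 0) compare as equal.
    width = max(len(have), len(want))
    have = have + (0,) * (width - len(have))
    want = want + (0,) * (width - len(want))
    if op == ">=":
        return have >= want
    if op == "==":
        return have == want
    raise ValueError(f"unsupported operator: {op!r}")


def check_environment() -> List[str]:
    """Return a list of human-readable problems; empty means all good."""
    problems: List[str] = []
    for name, constraint in REQUIRED.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if constraint and not satisfies(installed, *constraint):
            op, wanted = constraint
            problems.append(f"{name}: found {installed}, need {op}{wanted}")
    return problems
```

Calling `check_environment()` after installation returns one message per missing or mismatched package, which can be printed or turned into an early failure before any model weights are downloaded.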

Credentials

No credentials are required for this base environment. Downloading gated models from the HuggingFace Hub additionally requires:

  • `HF_TOKEN`: HuggingFace API token
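A pre-flight check for the token might look like the sketch below. The helper name `resolve_hf_token` is hypothetical; the code only reads the `HF_TOKEN` variable named above and does not contact the Hub.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the HF_TOKEN value, or None if it is unset or blank."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None
```

A caller could test `resolve_hf_token() is None` before attempting to load a gated checkpoint and report a clear message instead of failing partway through a download.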

Quick Install

# Install from setup.py (recommended)
pip install -e .

# Or install manually
# (quote the version specifiers so the shell does not treat ">" as a redirect)
pip install "torch==2.0.1" "transformers>=4.28.1" "open_clip_torch>=2.16.0" einops einops-exts pillow sentencepiece

# With conda (full environment)
conda env create -f environment.yml

Code Evidence

Package requirements from `setup.py:9-17`:

REQUIREMENTS = [
    "einops",
    "einops-exts",
    "transformers>=4.28.1",
    "torch==2.0.1",
    "pillow",
    "open_clip_torch>=2.16.0",
    "sentencepiece",
]

OpenCLIP usage for vision encoder in `open_flamingo/src/factory.py:42-46`:

vision_encoder, _, image_processor = open_clip.create_model_and_transforms(
    clip_vision_encoder_path,
    pretrained=clip_vision_encoder_pretrained,
    cache_dir=cache_dir,
)

HuggingFace Transformers usage for language model in `open_flamingo/src/factory.py:65-70`:

lang_encoder = AutoModelForCausalLM.from_pretrained(
    lang_encoder_path,
    local_files_only=use_local_files,
    trust_remote_code=True,
    cache_dir=cache_dir,
)

Supported LM decoder layer names from `open_flamingo/src/factory.py:132-141`:

__KNOWN_DECODER_LAYERS_ATTR_NAMES = {
    "opt": "model.decoder.layers",
    "gptj": "transformer.h",
    "gpt-j": "transformer.h",
    "pythia": "gpt_neox.layers",
    "llama": "model.layers",
    "gptneoxforcausallm": "gpt_neox.layers",
    "mpt": "transformer.blocks",
    "mosaicgpt": "transformer.blocks",
}
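The lookup that this mapping supports can be sketched as follows. This is a minimal sketch, assuming the model's class name is matched case-insensitively against the dictionary keys; the function name `infer_decoder_layers_attr_name` and the un-mangled dictionary name are illustrative and not quoted from `factory.py`.

```python
from typing import Dict

# Copied from the mapping above (renamed without the leading underscores).
KNOWN_DECODER_LAYERS_ATTR_NAMES: Dict[str, str] = {
    "opt": "model.decoder.layers",
    "gptj": "transformer.h",
    "gpt-j": "transformer.h",
    "pythia": "gpt_neox.layers",
    "llama": "model.layers",
    "gptneoxforcausallm": "gpt_neox.layers",
    "mpt": "transformer.blocks",
    "mosaicgpt": "transformer.blocks",
}


def infer_decoder_layers_attr_name(model: object) -> str:
    """Guess the decoder-layer attribute path from the model's class name."""
    class_name = type(model).__name__.lower()
    for key, attr_path in KNOWN_DECODER_LAYERS_ATTR_NAMES.items():
        # Substring match: "llama" matches a class named LlamaForCausalLM.
        if key in class_name:
            return attr_path
    raise ValueError(
        f"no known decoder layer attribute for {type(model).__name__}; "
        "pass decoder_layers_attr_name explicitly"
    )
```

With this approach, an architecture whose class name contains none of the keys falls through to the `ValueError`, which is the situation where the attribute path has to be supplied by hand.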

Common Errors

| Error Message | Cause | Solution |
|---------------|-------|----------|
| `ValueError: We require the attribute name for the nn.ModuleList` | Unsupported language model architecture | Pass `--decoder_layers_attr_name` manually |
| `trust_remote_code=True` warning | Using custom model architectures (MPT) | Expected; the flag is set automatically in factory.py |
| MPT missing `get_input_embeddings` | MPT-1B model lacks the standard HF method | Handled by the `EmbeddingFnMixin` hack in factory.py:73-82 |
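When the attribute name has to be supplied manually, it is a dotted path such as `model.decoder.layers`. The sketch below shows one way to check that a candidate path resolves on a loaded model before passing it on; `resolve_attr_path` and `path_resolves` are hypothetical helper names, and the stand-in object only imitates the shape of a real model.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_attr_path(obj: object, dotted_path: str) -> object:
    """Follow a dotted attribute path such as 'model.decoder.layers'."""
    return reduce(getattr, dotted_path.split("."), obj)


def path_resolves(obj: object, dotted_path: str) -> bool:
    """Return True if every attribute along the path exists."""
    try:
        resolve_attr_path(obj, dotted_path)
    except AttributeError:
        return False
    return True


# Stand-in for a language model with an OPT-like layout (illustration only).
fake_model = SimpleNamespace(
    model=SimpleNamespace(decoder=SimpleNamespace(layers=["block0", "block1"]))
)
candidates = ["transformer.h", "model.decoder.layers"]
usable = [path for path in candidates if path_resolves(fake_model, path)]
```

On a real model, the same check could be run over a few plausible paths to find the one that points at the list of transformer blocks.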

Compatibility Notes

  • Supported LMs: OPT, GPT-J, Pythia, GPT-NeoX, LLaMA, MPT, MosaicGPT. Custom models require specifying `decoder_layers_attr_name`.
  • MPT-1B: Requires a runtime monkey-patch to add `get_input_embeddings` / `set_input_embeddings` methods (factory.py:73-82).
  • Offline mode: Set `--offline` to use `local_files_only=True` when loading HuggingFace Transformers components via `from_pretrained`.
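The MPT-1B monkey-patch mentioned above can be pictured with the following sketch. The mixin name `EmbeddingFnMixin` comes from this page; the helper `extend_instance` and the assumption that the token embedding lives at `self.transformer.wte` are illustrative and should be checked against `factory.py` before relying on them.

```python
class EmbeddingFnMixin:
    """Adds the embedding accessors that the model class lacks.

    Assumption: the token embedding is stored at `self.transformer.wte`.
    """

    def get_input_embeddings(self):
        return self.transformer.wte

    def set_input_embeddings(self, new_embeddings):
        self.transformer.wte = new_embeddings


def extend_instance(obj: object, mixin: type) -> None:
    """Patch a single instance by swapping in a subclass with `mixin` first.

    Other instances of the original class are left untouched.
    """
    base_cls = type(obj)
    obj.__class__ = type(base_cls.__name__, (mixin, base_cls), {})
```

Patching the instance rather than the class keeps the change local to the one loaded model, so other code that imports the same architecture is unaffected.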
