Environment:Ucbepic Docetl LLM API Keys

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, LLM_Pipelines, Security
Last Updated: 2026-02-08 01:00 GMT

Overview

API key and credential environment variables required for LLM providers, Azure Document Intelligence, and AWS Bedrock in DocETL pipelines.

Description

DocETL uses LiteLLM as a unified LLM gateway. LiteLLM supports 100+ providers and reads each provider's API key from an environment variable. The primary key is `OPENAI_API_KEY`, but a pipeline can target any LiteLLM-supported provider once that provider's variable is set. Separate credentials are needed for Azure Document Intelligence (PDF parsing) and AWS Bedrock. All keys are loaded from a `.env` file at startup via `python-dotenv`.

DocETL also supports encrypted API keys stored directly in pipeline YAML configs, decrypted at runtime using `DOCETL_ENCRYPTION_KEY`.

Usage

Use this environment for any pipeline that calls an LLM (map, reduce, filter, resolve, equijoin, rank, extract, topk operations). Azure credentials are only needed for Azure Document Intelligence PDF parsing. AWS credentials are only needed when using Bedrock models.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| Network | Internet access | Required for LLM API calls |
| Storage | `.env` file | Place in project root or Docker volume |

Dependencies

Python Packages

  • `python-dotenv` >= 1.0.1 (loaded via `load_dotenv()` in multiple entry points)

Credentials

Core LLM Access:

  • `OPENAI_API_KEY`: API key for OpenAI models. Required for the default model (`gpt-4o-mini`). Also used as a fallback for other providers via LiteLLM.

Alternative LLM Providers (set one or more):

  • `ANTHROPIC_API_KEY`: For Claude models (format: `sk-ant-...`)
  • `GEMINI_API_KEY`: For Google Gemini models
  • `COHERE_API_KEY`: For Cohere models

Azure Document Intelligence (for PDF parsing):

  • `DOCUMENTINTELLIGENCE_API_KEY`: Azure Document Intelligence API key
  • `DOCUMENTINTELLIGENCE_ENDPOINT`: Azure Document Intelligence endpoint URL
  • Alternative names also supported: `AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT`, `AZURE_DOCUMENT_INTELLIGENCE_KEY`

AWS Bedrock (optional):

  • `AWS_PROFILE`: AWS profile name (default: `default`)
  • `AWS_REGION`: AWS region (default: `us-west-2`)

Encryption:

  • `DOCETL_ENCRYPTION_KEY`: Decryption key for encrypted API keys stored in pipeline YAML config

Ollama (local models):

Data Storage:

  • `DOCETL_HOME_DIR`: Override for cache and data directory (default: `~`)
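
A pre-flight check can report which of the variables listed above are missing before a pipeline starts. The following is a minimal sketch, not part of DocETL: the feature labels (`llm`, `azure_di`) and the helper `missing_credentials` are hypothetical names, and the `llm` entry assumes the default OpenAI model is in use.

```python
import os
from typing import Iterable, List, Mapping

# Variables each feature needs, taken from the credential list above.
# The feature labels are invented for this sketch; "llm" assumes the
# default OpenAI model rather than an alternative provider.
REQUIRED_VARS = {
    "llm": ["OPENAI_API_KEY"],
    "azure_di": ["DOCUMENTINTELLIGENCE_API_KEY", "DOCUMENTINTELLIGENCE_ENDPOINT"],
}


def missing_credentials(
    features: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return required variables that are unset or empty for the given features."""
    missing = []
    for feature in features:
        for name in REQUIRED_VARS[feature]:
            if not env.get(name):  # treats "" the same as unset
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Prints the names still to be added to .env, if any.
    print(missing_credentials(["llm", "azure_di"]))
```

Passing `env` explicitly makes the check easy to exercise without touching the real process environment.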

Quick Install

# Create .env file with your API key
echo "OPENAI_API_KEY=sk-..." > .env

# For Azure Document Intelligence PDF parsing
echo "DOCUMENTINTELLIGENCE_API_KEY=your-key" >> .env
echo "DOCUMENTINTELLIGENCE_ENDPOINT=https://your-instance.cognitiveservices.azure.com/" >> .env

Code Evidence

API key loading via dotenv from `docetl/runner.py:52`:

load_dotenv()
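
By default `load_dotenv()` does not overwrite variables that are already set in the process environment, so a key exported in the shell takes precedence over the same key in `.env`. The sketch below imitates that precedence rule with the standard library only. `apply_dotenv` is a hypothetical, simplified stand-in and not python-dotenv's parser: it ignores quoting, escapes, and variable interpolation.

```python
from typing import Iterable, MutableMapping


def apply_dotenv(
    lines: Iterable[str], env: MutableMapping[str, str], override: bool = False
) -> MutableMapping[str, str]:
    """Copy KEY=VALUE pairs into env, keeping existing values unless override=True."""
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if override or key not in env:
            env[key] = value.strip()
    return env


if __name__ == "__main__":
    env = {"OPENAI_API_KEY": "from-shell"}
    apply_dotenv(["OPENAI_API_KEY=from-file", "GEMINI_API_KEY=g"], env)
    print(env["OPENAI_API_KEY"])  # the shell value is kept
```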

Encrypted key decryption from `docetl/config_wrapper.py:56-63`:

encrypted_llm_api_keys = self.config.get("llm_api_keys", {})
if encrypted_llm_api_keys:
    self.llm_api_keys = {
        key: decrypt(value, os.environ.get("DOCETL_ENCRYPTION_KEY", ""))
        for key, value in encrypted_llm_api_keys.items()
    }
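
Note that the excerpt above falls back to an empty string when `DOCETL_ENCRYPTION_KEY` is unset. A stricter variant could fail fast instead. The sketch below illustrates that idea only: `decrypt_llm_api_keys` is a hypothetical helper, and the `decrypt` callable is injected by the caller because DocETL's actual decryption routine is not reproduced here.

```python
from typing import Callable, Dict, Mapping


def decrypt_llm_api_keys(
    config: Mapping,
    env: Mapping[str, str],
    decrypt: Callable[[str, str], str],
) -> Dict[str, str]:
    """Decrypt every entry under llm_api_keys, refusing to run without a passphrase."""
    encrypted = config.get("llm_api_keys", {})
    if not encrypted:
        return {}
    passphrase = env.get("DOCETL_ENCRYPTION_KEY", "")
    if not passphrase:
        # Assumption of this sketch: a missing passphrase is treated as an error.
        raise ValueError(
            "DOCETL_ENCRYPTION_KEY is not set but the config contains llm_api_keys"
        )
    return {name: decrypt(value, passphrase) for name, value in encrypted.items()}
```

Raising early surfaces a misconfigured environment at load time rather than as an authentication failure on the first LLM call.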

Azure Document Intelligence credential check from `docetl/parsing_tools.py:282-290`:

key = os.getenv("DOCUMENTINTELLIGENCE_API_KEY")
endpoint = os.getenv("DOCUMENTINTELLIGENCE_ENDPOINT")

if key is None:
    raise ValueError("DOCUMENTINTELLIGENCE_API_KEY environment variable is not set")
if endpoint is None:
    raise ValueError("DOCUMENTINTELLIGENCE_ENDPOINT environment variable is not set")
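
The excerpt above reads only the primary variable names. Since the credential list also mentions alternative names, a lookup that tries both could look like the sketch below. The helper names and the order of preference (primary name first) are assumptions of this sketch, not confirmed DocETL behavior.

```python
import os
from typing import Mapping, Optional, Sequence, Tuple

# Primary name first, then the alternative name from the credential list.
KEY_NAMES = ("DOCUMENTINTELLIGENCE_API_KEY", "AZURE_DOCUMENT_INTELLIGENCE_KEY")
ENDPOINT_NAMES = (
    "DOCUMENTINTELLIGENCE_ENDPOINT",
    "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
)


def first_set(names: Sequence[str], env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def resolve_azure_di_credentials(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (key, endpoint) or raise ValueError naming the missing variable."""
    key = first_set(KEY_NAMES, env)
    endpoint = first_set(ENDPOINT_NAMES, env)
    if key is None:
        raise ValueError(f"{KEY_NAMES[0]} environment variable is not set")
    if endpoint is None:
        raise ValueError(f"{ENDPOINT_NAMES[0]} environment variable is not set")
    return key, endpoint
```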

Backend server env vars from `server/app/main.py:10-15`:

host = os.getenv("BACKEND_HOST", "127.0.0.1")
port = int(os.getenv("BACKEND_PORT", 8000))
reload = os.getenv("BACKEND_RELOAD", "False").lower() == "true"
allow_origins = os.getenv("BACKEND_ALLOW_ORIGINS", "http://localhost:3000").split(",")
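
The same parsing can be wrapped in a function that takes the environment as an argument, which makes the defaults easy to test. This is a sketch under that assumption: `load_backend_settings` is a hypothetical name, and unlike the excerpt it also strips whitespace around each origin so that a value such as `http://a.test, http://b.test` parses cleanly.

```python
from typing import Any, Dict, Mapping


def load_backend_settings(env: Mapping[str, str]) -> Dict[str, Any]:
    """Parse the backend variables with the same defaults as the excerpt above."""
    origins = env.get("BACKEND_ALLOW_ORIGINS", "http://localhost:3000")
    return {
        "host": env.get("BACKEND_HOST", "127.0.0.1"),
        "port": int(env.get("BACKEND_PORT", "8000")),
        # Only the string "true" (any casing) enables reload.
        "reload": env.get("BACKEND_RELOAD", "False").lower() == "true",
        # Extra step in this sketch: trim spaces and drop empty entries.
        "allow_origins": [o.strip() for o in origins.split(",") if o.strip()],
    }
```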

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `AuthenticationError` from LiteLLM | `OPENAI_API_KEY` not set or invalid | Set valid API key in `.env` file |
| `ValueError: DOCUMENTINTELLIGENCE_API_KEY environment variable is not set` | Azure DI key missing | Set `DOCUMENTINTELLIGENCE_API_KEY` in `.env` |
| `ValueError: DOCUMENTINTELLIGENCE_ENDPOINT environment variable is not set` | Azure DI endpoint missing | Set `DOCUMENTINTELLIGENCE_ENDPOINT` in `.env` |
| `RateLimitError` | API quota exceeded | Wait for quota reset or upgrade API plan |
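
For transient rate limits, waiting can be automated with exponential backoff. The helper below is a generic sketch and not something DocETL or LiteLLM provides: `call_with_backoff` and its parameters are hypothetical, and the caller supplies the exception type to retry on (for example LiteLLM's `RateLimitError`, assuming it is importable in your environment).

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on retry_on with delays of base_delay * 2**attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Backoff helps only with short-lived throttling. An exhausted quota still requires waiting for the reset or upgrading the plan, as the table notes.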

Compatibility Notes

  • LiteLLM Providers: Any provider supported by LiteLLM can be used. Set the appropriate environment variable (e.g., `ANTHROPIC_API_KEY` for Claude, `GEMINI_API_KEY` for Gemini).
  • Docker: API keys are passed to containers through the `environment` section of `docker-compose.yml`. Never bake keys into Docker images.
  • Encrypted Keys: Pipeline YAML configs can store encrypted API keys directly, decrypted at runtime via `DOCETL_ENCRYPTION_KEY`.
  • Azure Page Limit: Azure Document Intelligence has a hard limit of 200 pages per PDF (`MAX_AZURE_PAGE_LIMIT = 200` in `server/app/routes/convert.py:34`).
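
One way to work within the 200-page limit is to plan page ranges before sending a document for conversion. The sketch below only computes the ranges. Splitting the PDF itself and submitting the parts separately is an assumed workaround, not something the source describes, and `page_batches` is a hypothetical helper.

```python
from typing import List, Tuple

MAX_AZURE_PAGE_LIMIT = 200  # value quoted from server/app/routes/convert.py


def page_batches(
    total_pages: int, limit: int = MAX_AZURE_PAGE_LIMIT
) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into inclusive (start, end) ranges of at most `limit` pages."""
    if total_pages < 0 or limit < 1:
        raise ValueError("total_pages must be >= 0 and limit must be >= 1")
    return [
        (start, min(start + limit - 1, total_pages))
        for start in range(1, total_pages + 1, limit)
    ]


if __name__ == "__main__":
    print(page_batches(450))  # [(1, 200), (201, 400), (401, 450)]
```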
