Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Allenai Open instruct Reward Model Training
- Workflow:Infiniflow Ragflow Knowledge Base Document Ingestion
- Workflow:Dagster io Dagster ETL Pipeline
- Workflow:Mage ai Mage ai Destination Data Loading
- Workflow:ArroyoSystems Arroyo Connection Setup
- Workflow:Mage ai Mage ai Building a New Destination Connector
- Workflow:Volcengine Verl Data Preprocessing For RL
- Workflow:Apache Airflow Core Release Process
- Workflow:Junyanz Pytorch CycleGAN and pix2pix Pix2pix Training
- Workflow:Google deepmind Dm control Manipulation Task Setup
Principles
- Principle:ARISE Initiative Robomimic Training Loop Execution
- Principle:Pytorch Serve Distributed Worker
- Principle:FlagOpen FlagEmbedding Multimodal Retrieval
- Principle:Sgl project Sglang Frontend Backend Initialization
- Principle:Mlflow Mlflow Run Management
- Principle:CrewAIInc CrewAI Crew Integration In Flow
- Principle:BerriAI Litellm Server Startup
- Principle:Junyanz Pytorch CycleGAN and pix2pix Training Options Configuration
- Principle:NVIDIA NeMo Aligner DPO Reference Policy Management
- Principle:Fastai Fastbook Stochastic Gradient Descent
Implementations
- Implementation:Rapidsai Cuml Kernel SHAP
- Implementation:Heibaiying BigData Notes ConnectionFactory CreateConnection
- Implementation:Microsoft Playwright TraceViewer Server
- Implementation:Apache Druid ExecutionProgressBarPane
- Implementation:Open compass VLMEvalKit FlashVL
- Implementation:Astronomer Astronomer cosmos Get Dataset Alias Name
- Implementation:Heibaiying BigData Notes Job Assembly and Submission
- Implementation:ArroyoSystems Arroyo Iceberg Schema
- Implementation:ARISE Initiative Robosuite CheckCustomRobotModel
- Implementation:Mlc ai Mlc llm Mistral Model
Heuristics
- Heuristic:Farama Foundation Gymnasium Seeding Determinism Best Practices
- Heuristic:Puppeteer Puppeteer Chrome Default Launch Arguments
- Heuristic:Mlfoundations Open flamingo FSDP Manual Wrapping For Mixed Parameters
- Heuristic:Langgenius Dify Credential Sanitization In API Responses
- Heuristic:DistrictDataLabs Yellowbrick Model Fitted State Detection
- Heuristic:Deepset ai Haystack Pipeline Max Runs Safety Limit
- Heuristic:Ggml org Ggml Gradient Accumulation Batch Sizing
- Heuristic:Fede1024 Rust rdkafka Cooperative Rebalance Protocol
- Heuristic:Kornia Kornia Lazy Loading Optional Deps
- Heuristic:Intel Ipex llm QLoRA Training Hyperparameters
Environments
- Environment:Langfuse Langfuse Node 24 Runtime
- Environment:TA Lib Ta lib python Python Build Environment
- Environment:TobikoData Sqlmesh GitHub CICD Runner
- Environment:Lance format Lance Rust Toolchain
- Environment:Apache Dolphinscheduler Node Pnpm Runtime
- Environment:Marker Inc Korea AutoRAG Japanese NLP Dependencies
- Environment:Sgl project Sglang CUDA
- Environment:Bentoml BentoML Triton Inference Server
- Environment:Dotnet Machinelearning OneDal Acceleration
- Environment:PacktPublishing LLM Engineers Handbook API Credentials