Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: an experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Liu00222 Open Prompt Injection DataSentinel Detection
- Workflow:Vllm project Vllm Multi LoRA Serving
- Workflow:Huggingface Optimum FX Graph Optimization
- Workflow:Volcengine Verl Data Preprocessing For RL
- Workflow:Protectai Llm guard API Server Deployment
- Workflow:Google research Deduplicate text datasets Cross dataset deduplication
- Workflow:Apache Paimon Data Ingestion With Ray Sink
- Workflow:PeterL1n BackgroundMattingV2 Image matting inference
- Workflow:Truera Trulens Snowflake Observability Pipeline
- Workflow:Microsoft Onnxruntime Distributed Model Training
Principles
- Principle:Mlc ai Web llm Grammar Constrained Decoding
- Principle:Ggml org Llama cpp Model Serialization
- Principle:Huggingface Datatrove Random Sampling
- Principle:Apache Paimon Vector Search Query Construction
- Principle:BerriAI Litellm Cache Backend Selection
- Principle:Pytorch Serve Instance Segmentation
- Principle:Langfuse Langfuse Dataset Run Item Upsert
- Principle:MaterializeInc Materialize Platform Check Pattern
- Principle:Huggingface Datatrove Data Reading Framework
- Principle:Huggingface Datasets Dataset Object Construction
Implementations
- Implementation:Lance format Lance Text Column Schema
- Implementation:Speechbrain Speechbrain Prepare Voicebank MTL
- Implementation:FlagOpen FlagEmbedding Reinforced IR Retriever Modeling
- Implementation:Predibase Lorax Flash Phi Modeling
- Implementation:Hiyouga LLaMA Factory LongLoRA
- Implementation:Mlc ai Mlc llm Attach Softmax Temperature Pass
- Implementation:Treeverse LakeFS Java SDK Model IcebergPullRequest
- Implementation:Speechbrain Speechbrain AMI Splits
- Implementation:Rapidsai Cuml Coordinate Descent MG
- Implementation:FlowiseAI Flowise DocumentStoreTable
Heuristics
- Heuristic:Princeton nlp Tree of thought llm Ad Hoc Value Map Scoring
- Heuristic:PrefectHQ Prefect SQLite Performance Tuning
- Heuristic:Promptfoo Promptfoo Cache Configuration Tips
- Heuristic:Dotnet Machinelearning Text File Sampling Strategy
- Heuristic:Neuml Txtai LLM Context Window Fallback
- Heuristic:EvolvingLMMs Lab Lmms eval Truncated Image Handling
- Heuristic:FlowiseAI Flowise Tool Ordering Convention
- Heuristic:MaterializeInc Materialize CI Agent Prioritization
- Heuristic:Diagram of thought Diagram of thought Strict Vs Flexible Critic Rigor
- Heuristic:Togethercomputer Together python Repetition Penalty Conflict
Environments
- Environment:Rapidsai Cuml CUDA GPU
- Environment:Junyanz Pytorch CycleGAN and pix2pix DDP Multi GPU
- Environment:Elevenlabs Elevenlabs python PyAudio
- Environment:Mlflow Mlflow GPU System Metrics Environment
- Environment:Sgl project Sglang Python Dependencies
- Environment:Lm sys FastChat LoRA QLoRA Training Environment
- Environment:Huggingface Optimum Python Core Dependencies
- Environment:Speechbrain Speechbrain PyTorch CUDA Runtime
- Environment:Lucidrains X transformers Python Environment
- Environment:ArroyoSystems Arroyo PostgreSQL Database