Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: an experimentation platform for building AI/ML software autonomously
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Princeton nlp SimPO SimPO Training
- Workflow:Cypress io Cypress Project Setup and Configuration
- Workflow:Open compass VLMEvalKit Adding Custom Benchmark
- Workflow:TobikoData Sqlmesh Plan and apply deployment
- Workflow:Lance format Lance Vector Search Pipeline
- Workflow:Marker Inc Korea AutoRAG Data Creation Pipeline
- Workflow:Microsoft DeepSpeedExamples CIFAR10 Getting Started
- Workflow:Huggingface Peft DreamBooth LoRA Diffusion
- Workflow:Langchain ai Langchain Adding Partner Integration
- Workflow:Dotnet Machinelearning AutoML Experiment
Principles
- Principle:Apache Hudi Query Type Definition
- Principle:Facebookresearch Audiocraft Sample Management
- Principle:Pyro ppl Pyro Hamiltonian Monte Carlo
- Principle:Lance format Lance Compaction Commit
- Principle:Datahub project Datahub OpenLineage Conversion
- Principle:Pytorch Serve Tensor Parallel LLM Architecture
- Principle:Microsoft Autogen Handoff Termination
- Principle:LaurentMazare Tch rs Tensor Pretty Printing
- Principle:Eventual Inc Daft Data Ingestion HuggingFace
- Principle:Speechbrain Speechbrain Permutation Invariant Training
Implementations
- Implementation:Datajuicer Data juicer GeneralFieldFilter
- Implementation:Apache Kafka SVN Commit Artifacts
- Implementation:Apache Shardingsphere DatabaseRuleConfigurationManager Refresh
- Implementation:Sgl project Sglang Flash Attention Interface
- Implementation:Apache Airflow DagFileProcessorManager Discovery
- Implementation:Pytorch Serve Spm Dataset
- Implementation:Mlc ai Mlc llm Logit Processor
- Implementation:Ucbepic Docetl MarkdownCell
- Implementation:Microsoft LoRA NLU Environment Setup Script
- Implementation:Huggingface Diffusers VideoProcessor
Heuristics
- Heuristic:Google deepmind Dm control Physics Timestep Configuration
- Heuristic:Microsoft DeepSpeedExamples RLHF Stability Constraints
- Heuristic:Huggingface Peft Gradient Checkpointing With Quantization
- Heuristic:Google deepmind Dm control Rendering Backend Selection Tips
- Heuristic:Ggml org Ggml Thread Count Selection
- Heuristic:VainF Torch Pruning Over Pruning Prevention
- Heuristic:SqueezeAILab ETS Max Depth And Token Guards
- Heuristic:Ucbepic Docetl Optimizer Sample Sizes
- Heuristic:Romsto Speculative Decoding Shared Tokenizer Requirement
- Heuristic:CrewAIInc CrewAI Context Window Management
Environments
- Environment:Mit han lab Llm awq VILA Multimodal Environment
- Environment:DistrictDataLabs Yellowbrick Python Scikit Learn Environment
- Environment:Intel Ipex llm Pipeline Parallel Environment
- Environment:Vibrantlabsai Ragas Optional NLP Metrics Environment
- Environment:Unstructured IO Unstructured Profiling Tools
- Environment:Dagster io Dagster GRPC Communication
- Environment:Speechbrain Speechbrain Speech Enhancement Dependencies
- Environment:Lakeraai Pint benchmark Python 310 With Pandas
- Environment:Lm sys FastChat SFT Training Environment
- Environment:Mlc ai Mlc llm WebGPU Browser Environment