Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent and let it build robust AI/ML systems autonomously:
- SuperML plugin — converts your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP — search ML/AI best practices and skills
- Kapso — experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Volcengine Verl Data Preprocessing For RL
- Workflow:Google deepmind Mujoco Simulation benchmarking
- Workflow:Apache Druid Batch Data Ingestion
- Workflow:Speechbrain Speechbrain Speech Separation Training
- Workflow:Huggingface Trl Direct Preference Optimization
- Workflow:Marker Inc Korea AutoRAG RAG Pipeline Optimization
- Workflow:Datajuicer Data juicer Text Data Processing Pipeline
- Workflow:Trailofbits Fickling Pickle Safety Analysis
- Workflow:Apache Airflow DAG Authoring and Deployment
- Workflow:NVIDIA NeMo Aligner REINFORCE Training
Principles
- Principle:Facebookresearch Audiocraft Multi Scale Discrimination
- Principle:Wandb Weave Prompt Publishing
- Principle:Princeton nlp Tree of thought llm Naive Baseline Sampling
- Principle:Kserve Kserve Pipeline Validation
- Principle:Ray project Ray Application Deployment
- Principle:Kubeflow Pipelines Reusable Component Loading
- Principle:Cleanlab Cleanlab Multilabel Quality Scoring
- Principle:Alibaba ROLL RLVR Dataset Preparation
- Principle:AUTOMATIC1111 Stable diffusion webui Inpainting Pipeline
- Principle:SqueezeAILab ETS Result Collection
Implementations
- Implementation:Hpcaitech ColossalAI Math Competition Reward
- Implementation:Huggingface Datasets Dataset From Dict
- Implementation:Vibrantlabsai Ragas QuotedSpans
- Implementation:Protectai Llm guard Vault
- Implementation:ArroyoSystems Arroyo Kafka Source Tests
- Implementation:Online ml River Linear Model Perceptron
- Implementation:Microsoft DeepSpeedExamples BingBert Turing FocalLoss
- Implementation:Axolotl ai cloud Axolotl RunPod Config Template
- Implementation:Microsoft Semantic kernel ProcessFunctionTargetBuilder
- Implementation:Microsoft Playwright ChannelOwner
Heuristics
- Heuristic:TA Lib Ta lib python Compatibility Mode Switching
- Heuristic:Apache Paimon File Sizing and Split Planning
- Heuristic:Marker Inc Korea AutoRAG Deterministic Evaluation Generation
- Heuristic:Apache Dolphinscheduler Gzip Compression Threshold
- Heuristic:Duckdb Duckdb Sanitizer Configuration
- Heuristic:Teamcapybara Capybara Frozen Time Detection
- Heuristic:Microsoft Onnxruntime Memory Recomputation Optimization
- Heuristic:Princeton nlp Tree of thought llm Duplicate Candidate Zeroing
- Heuristic:Nautechsystems Nautilus trader Strategy On Start Initialization
- Heuristic:Speechbrain Speechbrain Nonfinite Loss Handling
Environments
- Environment:Diagram of thought Diagram of thought Graphviz
- Environment:Mbzuai oryx Awesome LLM Post training Git CLI
- Environment:Cohere ai Cohere python Cohere API Credentials
- Environment:Huggingface Trl PEFT LoRA Environment
- Environment:Iterative Dvc Remote Storage Backends
- Environment:Huggingface Alignment handbook Python Transformers
- Environment:Huggingface Alignment handbook Python Datasets
- Environment:DataExpert io Data engineer handbook Statsig API Environment
- Environment:Kubeflow Pipelines Kubernetes Cluster
- Environment:Obss Sahi Python Detection Frameworks