Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Leeroopedia collects best practices and expert-level knowledge for machine learning and data engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search across ML/AI best practices and skills
- Kapso: an experimentation platform for building AI/ML software autonomously
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Allenai Open instruct DPO Preference Tuning
- Workflow:NVIDIA NeMo Aligner Reward Model Training
- Workflow:Unstructured IO Unstructured Document Partitioning
- Workflow:Huggingface Alignment handbook Multi Stage Post Training
- Workflow:Dotnet Machinelearning Binary Classification Pipeline
- Workflow:LLMBook zh LLMBook zh github io LoRA Finetuning
- Workflow:OWASP Www project top 10 for large language model applications Vulnerability Entry Development
- Workflow:Intel Ipex llm QLoRA Finetuning
- Workflow:Datajuicer Data juicer Custom Operator Development
- Workflow:Hpcaitech ColossalAI DPO Alignment
Principles
- Principle:OpenGVLab InternVL Batch VQA Inference
- Principle:Allenai Open instruct Tokenizer Configuration
- Principle:Datajuicer Data juicer Data Grouping
- Principle:Huggingface Optimum Model Symbolic Tracing
- Principle:Apache Kafka Connect Standalone Invocation
- Principle:Axolotl ai cloud Axolotl Package Build Configuration
- Principle:Datahub project Datahub Deployment Verification
- Principle:Cohere ai Cohere python Semantic Reranking
- Principle:Googleapis Python genai Live Music Generation
- Principle:Vllm project Vllm Constrained Generation
Implementations
- Implementation:Deepspeedai DeepSpeed CPU Adam Impl
- Implementation:Datajuicer Data juicer VideoMotionScoreFilter
- Implementation:Huggingface Optimum ReversibleTransformation Reverse
- Implementation:Facebookresearch Audiocraft Audiocraft Installation
- Implementation:Langchain ai Langchain AnthropicLLM
- Implementation:Speechbrain Speechbrain Hparams Switchboard Transformer
- Implementation:Haosulab ManiSkill PickClutterYCB
- Implementation:Duckdb Duckdb Mbedtls PK
- Implementation:Arize ai Phoenix Legacy OpenAIModel
- Implementation:Openclaw Openclaw RunCli
Heuristics
- Heuristic:Snorkel team Snorkel NLP Preprocessor Memoization
- Heuristic:Truera Trulens Temperature Zero For Deterministic Scoring
- Heuristic:Neuml Txtai Batch Size And Sorting Tip
- Heuristic:Scikit learn Scikit learn Random State Management
- Heuristic:Sdv dev SDV Sampling Retry Tuning
- Heuristic:CrewAIInc CrewAI Rate Limiting Strategy
- Heuristic:HKUDS AI Trader Linear Retry Backoff
- Heuristic:Eventual Inc Daft Delta Lake S3 Locking
- Heuristic:Groq Groq python Retry Backoff Strategy
- Heuristic:Trailofbits Fickling Allowlist Maintenance
Environments
- Environment:Apache Spark Python Environment
- Environment:Hiyouga LLaMA Factory Optional Inference Backends
- Environment:PacktPublishing LLM Engineers Handbook Docker MongoDB Qdrant Infrastructure
- Environment:Infiniflow Ragflow Docker Infrastructure
- Environment:Risingwavelabs Risingwave Dashboard Node Environment
- Environment:Apache Shardingsphere Calcite Federation Engine
- Environment:HKUDS AI Trader API Credentials
- Environment:Langfuse Langfuse ClickHouse Analytics
- Environment:Gretelai Gretel synthetics Python Base Environment
- Environment:Vllm project Vllm CUDA Hopper