Main Page
Welcome to Leeroopedia
Your ML & Data Knowledge Wiki. Best practices and expert-level knowledge for Machine Learning and Data Engineering, covering 1000+ frameworks and libraries from training to deployment.
Browse implementation patterns, configuration guides, debugging heuristics, and battle-tested defaults for frameworks like vLLM, DeepSpeed, Megatron-LM, FlashAttention, Triton, Unsloth, LangChain, and many more. Every page is structured so both humans and AI agents can find what they need fast.
Connect your AI coding agent. Plug Leeroopedia into your favorite coding agent, and let it build robust AI/ML systems autonomously:
- SuperML plugin: turns your AI coding agent into an expert ML engineer with agentic memory
- Leeroopedia MCP: search ML/AI best practices and skills
- Kapso: an experimentation platform for autonomous AI/ML software building
Browse by Category
| Category | Description | Browse |
|---|---|---|
| Workflows | Step-by-step processes and procedures | Browse All |
| Principles | Core ideas and foundational knowledge | Browse All |
| Implementations | Code-level details and modules | Browse All |
| Heuristics | Best practices and guidelines | Browse All |
| Environments | Setup and configuration guides | Browse All |
Explore Pages
Workflows
- Workflow:Openai Openai python Chat Completion
- Workflow:Mbzuai oryx Awesome LLM Post training Awesome List Curation
- Workflow:NVIDIA NeMo Aligner RLHF PPO Training
- Workflow:Huggingface Alignment handbook QLoRA Single GPU Finetuning
- Workflow:Datahub project Datahub Metadata Actions Pipeline
- Workflow:Helicone Helicone Cost Calculation Pipeline
- Workflow:Huggingface Diffusers DreamBooth Personalization
- Workflow:Truera Trulens RAG Evaluation With LangChain
- Workflow:Kubeflow Pipelines XGBoost Training Pipeline
- Workflow:Haosulab ManiSkill Motion Planning Demo Generation
Principles
- Principle:Recommenders team Recommenders Data Loading MovieLens Pandas
- Principle:Liu00222 Open Prompt Injection Conditional Probability Computation
- Principle:Eric mitchell Direct preference optimization Training Loop
- Principle:Infiniflow Ragflow Agent Execution
- Principle:CARLA simulator Carla Autopilot Mode
- Principle:Spotify Luigi Dependency Analysis
- Principle:EvolvingLMMs Lab Lmms eval Server Launch
- Principle:Ucbepic Docetl Chunk Result Reduction
- Principle:Google deepmind Mujoco Pipeline Architecture
- Principle:Risingwavelabs Risingwave Sink Connector Framework
Implementations
- Implementation:Lance format Lance SessionCaches
- Implementation:Vllm project Vllm RequestOutput LoRA Access
- Implementation:Astronomer Astronomer cosmos Cosmos Plugin
- Implementation:DataTalksClub Data engineering zoomcamp Confluent CSV Producer
- Implementation:Apache Shardingsphere ShadowTableHintDataSourceMappingsRetriever Retrieve
- Implementation:Scikit learn Scikit learn LabelEncoder
- Implementation:Pyro ppl Pyro SearchInference
- Implementation:Eventual Inc Daft DataFrame Write Deltalake
- Implementation:Vibrantlabsai Ragas FewShotPydanticPrompt
- Implementation:Interpretml Interpret Harmonize Tensor
Heuristics
- Heuristic:Evidentlyai Evidently Statistical Test Auto Selection
- Heuristic:PacktPublishing LLM Engineers Handbook LoRA Finetuning Parameters
- Heuristic:EvolvingLMMs Lab Lmms eval Distributed Padding Strategy
- Heuristic:Farama Foundation Gymnasium Action Space Normalization Tip
- Heuristic:Unstructured IO Unstructured Multi Python Matrix
- Heuristic:Axolotl ai cloud Axolotl Gradient Checkpointing Reentrant Rules
- Heuristic:Eventual Inc Daft Execution Config Tuning
- Heuristic:Alibaba ROLL KL Coefficient Tuning
- Heuristic:Elevenlabs Elevenlabs python Text Chunking Splitter Characters
- Heuristic:NVIDIA NeMo Curator GPU Memory Resource Allocation
Environments
- Environment:ChenghaoMou Text dedup Python 3 12 Environment
- Environment:Anthropics Anthropic sdk python AWS Bedrock Environment
- Environment:Duckdb Duckdb CMake Build Toolchain
- Environment:Mit han lab Llm awq CUDA Build Environment
- Environment:Volcengine Verl CUDA GPU Environment
- Environment:Truera Trulens OpenAI Provider Environment
- Environment:Dagster io Dagster DAGSTER HOME Configuration
- Environment:Langchain ai Langgraph Postgres Checkpoint Environment
- Environment:DataExpert io Data engineer handbook Flink Kafka Docker Environment
- Environment:Intel Ipex llm RAG LangChain Environment