llm · experimental
This skill is experimental. Recipes cover the LLM engineering stack but assume familiarity with Python packaging, TypeScript, and basic LLM concepts.
Context skill for LLM engineering: agent frameworks, document/image processing for LLMs, open-weight model serving, retrieval-augmented generation, and evaluation tooling.
Requirements
- Python 3.11+
- Node.js 20+ and pnpm — for TypeScript agent recipes
- uv — Python package and project manager
- Cloud provider CLI (AWS CLI or gcloud) — for serving and cloud RAG recipes
Philosophy
LLM engineering is systems engineering with a probabilistic core. The model is just one component in a pipeline that includes document ingestion, embedding, retrieval, prompt construction, tool execution, and output evaluation. These recipes treat each component as an independently testable, observable subsystem.
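One way to picture "independently testable, observable subsystem" is a pipeline of named stages that records each stage's output. The sketch below is illustrative only: `Stage`, `Pipeline`, `retrieve`, `build_prompt`, and `fake_model` are hypothetical names, not taken from any recipe, and the model is a deterministic stub so the deterministic stages can be tested apart from the probabilistic core.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Stage:
    """A named pipeline step wrapping a plain function (hypothetical helper)."""
    name: str
    fn: Callable[[Any], Any]


@dataclass
class Pipeline:
    """Runs stages in order and keeps a trace of every intermediate value."""
    stages: list[Stage]
    trace: list[tuple[str, Any]] = field(default_factory=list)

    def run(self, value: Any) -> Any:
        self.trace.clear()
        for stage in self.stages:
            value = stage.fn(value)
            # Observability: record what each stage produced.
            self.trace.append((stage.name, value))
        return value


def retrieve(query: str) -> dict:
    # Stand-in for a real retriever: keyword lookup over a tiny in-memory corpus.
    corpus = {"uv": "uv is a Python package and project manager."}
    hits = [text for key, text in corpus.items() if key in query.lower()]
    return {"query": query, "context": hits}


def build_prompt(state: dict) -> str:
    # Prompt construction is pure string logic, so it can be unit tested directly.
    context = "\n".join(state["context"]) or "(no context)"
    return f"Context:\n{context}\n\nQuestion: {state['query']}"


def fake_model(prompt: str) -> str:
    # Deterministic stub standing in for an LLM call.
    return f"[stub answer for {len(prompt)} chars of prompt]"


pipeline = Pipeline([
    Stage("retrieve", retrieve),
    Stage("prompt", build_prompt),
    Stage("model", fake_model),
])
answer = pipeline.run("What is uv?")
```

Because each stage is an ordinary function, `retrieve` and `build_prompt` can be asserted on in isolation, and `pipeline.trace` shows where a bad answer originated.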
Recipes
- Agents with Vercel AI SDK / Anthropic SDK — agent loop, tool calling, streaming, multi-turn, human-in-the-loop
- Agents with LangChain/LangGraph — chains, state graphs, conditional edges, memory, checkpointing
- Tool Calling — defining schemas, parallel tool use, error recovery, tool result handling
- PDF and Document Ingestion — text extraction, OCR, chunking strategies (recursive, semantic)
- Image Processing for Multimodal LLMs — PDF rasterization, tiling large images for Vision APIs, canvas/WebGL rendering
- Serving on AWS — EC2 with vLLM, SageMaker endpoints, spot instances, quantization
- Serving on GCP — Vertex AI Model Garden, Cloud Run with vLLM, Compute Engine GPU
- RAG on AWS — OpenSearch vector store, Bedrock embeddings, S3 + Lambda pipeline
- RAG on GCP — Vertex AI Search, BigQuery vector search, RAG Engine
- RAG with OSS — Qdrant, Chroma, hybrid search (BM25 + vector), RAGAS evaluation
- LLM Eval Tooling — RAGAS, LangSmith tracing, custom eval harnesses, regression tracking
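As an illustration of the hybrid search item in the OSS RAG recipe, the sketch below merges a BM25 ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is an assumption here: the recipe list names hybrid search but not a fusion method. The function name, the document IDs, and the constant `k = 60` (a commonly used default) are illustrative, and both rankings are hard-coded rather than produced by Qdrant or Chroma.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in (rank starts
    at 1); the scores are summed across lists.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a keyword (BM25) search and a vector search.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores against cosine similarities, which live on different scales. Documents found by both retrievers (`doc_a`, `doc_b`) rise above those found by only one.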