hpc

Experimental

This skill is experimental. Recipes and structure may change.

Context skill for high-performance and distributed computing in Python and C++.

Requirements

  • C++17 compiler (GCC 10+, Clang 14+, or MSVC 19.30+)
  • CUDA Toolkit (optional, for GPU compute recipes)
  • Python 3.11+
  • uv — Python package and project manager
  • MPI implementation (optional, for cluster recipes — OpenMPI or MPICH)

Philosophy

Performance work is measurement-driven. Every recipe follows the same pattern: establish a baseline, apply the technique, then measure the improvement. If a technique's effect can't be measured, it shouldn't be applied. Recipes cover threading, SIMD, GPU compute, MPI clustering, and distributed Python with Ray, scaling from a laptop to a multi-node cluster.
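The baseline, apply, measure pattern can be sketched with nothing but the Python standard library. This is a minimal illustration, not one of the skill's recipes: the function names (`baseline_sum_squares`, `optimized_sum_squares`, `median_seconds`, `measure`) are hypothetical, and the closed-form rewrite stands in for whichever technique a recipe would actually apply.

```python
import statistics
import timeit


def baseline_sum_squares(n: int) -> int:
    """Baseline: sum of i*i for i in [0, n) with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def optimized_sum_squares(n: int) -> int:
    """Candidate technique (illustrative stand-in): closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def median_seconds(func, *args, repeat: int = 5, number: int = 20) -> float:
    """Median wall-clock time of one call, taken over several timing runs."""
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return statistics.median(runs) / number


def measure(baseline, candidate, *args):
    """Return (baseline time, candidate time, speedup) for the same input."""
    # A faster but wrong result is not an improvement, so check correctness first.
    if baseline(*args) != candidate(*args):
        raise ValueError("candidate does not reproduce the baseline result")
    t_base = median_seconds(baseline, *args)
    t_cand = median_seconds(candidate, *args)
    return t_base, t_cand, t_base / t_cand


if __name__ == "__main__":
    t_base, t_cand, speedup = measure(
        baseline_sum_squares, optimized_sum_squares, 100_000
    )
    print(f"baseline:  {t_base * 1e6:.1f} us per call")
    print(f"candidate: {t_cand * 1e6:.1f} us per call")
    print(f"speedup:   {speedup:.1f}x")
```

Taking the median over repeated runs is one reasonable way to dampen noise from other processes; the exact numbers will vary by machine, so no expected output is shown.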

Recipes

References

Released under the MIT License.