LMCache/LMCache
Summary
LMCache is a KV cache management layer for LLM inference. It treats the KV cache not as transient per-request state but as reusable, AI-native knowledge that can be stored persistently, reused across multiple serving engines, observed with a comprehensive metrics stack, and transformed for better generation quality. The project emphasizes vendor neutrality, pluggable backends, non-prefix KV reuse, and transport of cache data across workers, and it is supported by an active ecosystem and documentation.
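
To make the idea of reusable KV cache with a pluggable backend concrete, here is a minimal sketch. It is not LMCache's API: the names `Backend`, `InMemoryBackend`, `chunk_keys`, `store`, and `lookup` are hypothetical, the KV data is represented by opaque bytes, and the chunk size is an arbitrary choice. The sketch covers only the simpler prefix-based case, in which each chunk's key hashes every token up to the end of that chunk. It does not attempt non-prefix reuse.

```python
import hashlib
from typing import Dict, List, Optional, Protocol


class Backend(Protocol):
    """Hypothetical storage interface: anything with get/put can be plugged in."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, value: bytes) -> None: ...


class InMemoryBackend:
    """Simplest backend: a dict. A disk or remote store could take its place."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


def chunk_keys(tokens: List[int], chunk_size: int) -> List[str]:
    """Return one key per full chunk; each key covers all tokens up to the chunk end."""
    keys: List[str] = []
    running = hashlib.sha256()
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        for token in tokens[start:start + chunk_size]:
            # Assumes token ids are non-negative and fit in 32 bits.
            running.update(token.to_bytes(4, "little"))
        keys.append(running.hexdigest())
    return keys


def store(backend: Backend, tokens: List[int], kv_chunks: List[bytes],
          chunk_size: int) -> None:
    """Save one opaque KV blob per full chunk of the token sequence."""
    for key, kv in zip(chunk_keys(tokens, chunk_size), kv_chunks):
        backend.put(key, kv)


def lookup(backend: Backend, tokens: List[int], chunk_size: int) -> int:
    """Return how many leading tokens already have cached KV data."""
    hit = 0
    for key in chunk_keys(tokens, chunk_size):
        if backend.get(key) is None:
            break
        hit += chunk_size
    return hit
```

Because `store` and `lookup` depend only on the `get`/`put` interface, two serving engines that share one backend instance (or one remote store) would see each other's cached chunks. That is the kind of cross-engine reuse the summary describes, shown here under the simplifying assumptions above.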