Code2LoRA: injecting repository knowledge into code LLMs via adapters that keep up with evolving code
Code LLMs need repository-level context to resolve imports, APIs, and conventions. Code2LoRA is a hypernetwork that generates repository-specific LoRA adapters, injecting that knowledge with zero inference-time token overhead. It offers a Static mode (snapshot → adapter) and an Evo mode updated per code diff.
Paper overview (our summary)
- Field (arXiv category): cs.SE (+2)
- Authors: Liliana Hotsko, Yinxi Li, Yuntian Deng, et al. (4)
- Submitted: 2026-06-04
- arXiv ID: 2606.06492v1
Key points
- Injects repository knowledge into code LLMs as LoRA adapters (zero inference-time token overhead)
- A hypernetwork generates repository-specific adapters
- Static (snapshot) and Evo (updated per diff via a GRU state) modes
- Static matches the per-repo LoRA upper bound; Evo beats a single shared LoRA by 5.2 percentage points in cross-repo exact match
- Introduces and releases RepoPeftBench (604 repositories)
This work, Code2LoRA, injects repository-specific knowledge into code large language models (LLMs) without lengthening the prompt.
Code LLMs need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this context either (1) as long inputs retrieved via RAG or dependency analysis, or (2) through per-repository fine-tuning or LoRA. Both are costly at repository scale and brittle as codebases evolve.
Code2LoRA, a hypernetwork framework, generates repository-specific LoRA adapters, injecting repository knowledge with zero inference-time token overhead. It supports two scenarios: Code2LoRA-Static converts a single repository snapshot into an adapter, suited to comprehension of stable codebases; Code2LoRA-Evo maintains an adapter backed by a GRU hidden state updated per code diff, suited to active development of evolving codebases.
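The two modes can be pictured with a toy NumPy sketch. This is not the authors' implementation: the layer sizes, the random vectors standing in for encoded snapshots and diffs, and the names `AdapterHypernet`, `GRUCell`, and `lora_forward` are all assumptions made for illustration. It shows a hypernetwork emitting low-rank factors for one frozen layer (Static), a textbook GRU cell folding per-diff vectors into a state that drives the adapter (Evo), and why the adapter adds no prompt tokens: it can be folded into the weight.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_OUT, RANK = 8, 8, 2             # toy sizes for one frozen linear layer
D_REPO, D_DIFF, D_STATE = 16, 16, 16    # hypothetical embedding / state sizes


class AdapterHypernet:
    """Maps a conditioning vector to LoRA factors (A, B) for one layer."""

    def __init__(self, d_cond):
        self.to_a = rng.normal(0, 0.1, (RANK * D_IN, d_cond))
        self.to_b = rng.normal(0, 0.1, (D_OUT * RANK, d_cond))

    def __call__(self, cond):
        a = (self.to_a @ cond).reshape(RANK, D_IN)
        b = (self.to_b @ cond).reshape(D_OUT, RANK)
        return a, b


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """Textbook GRU cell (biases omitted) that absorbs one diff vector."""

    def __init__(self, d_x, d_h):
        def init(rows, cols):
            return rng.normal(0, 0.1, (rows, cols))
        self.wz, self.uz = init(d_h, d_x), init(d_h, d_h)
        self.wr, self.ur = init(d_h, d_x), init(d_h, d_h)
        self.wn, self.un = init(d_h, d_x), init(d_h, d_h)

    def __call__(self, x, h):
        z = sigmoid(self.wz @ x + self.uz @ h)        # update gate
        r = sigmoid(self.wr @ x + self.ur @ h)        # reset gate
        n = np.tanh(self.wn @ x + self.un @ (r * h))  # candidate state
        return (1 - z) * n + z * h


def lora_forward(w, a, b, x):
    """Frozen weight plus low-rank update: y = W x + B (A x)."""
    return w @ x + b @ (a @ x)


w_frozen = rng.normal(size=(D_OUT, D_IN))
x = rng.normal(size=D_IN)

# Static mode: one snapshot vector in, one adapter out.
static_net = AdapterHypernet(D_REPO)
repo_vec = rng.normal(size=D_REPO)      # stand-in for an encoded snapshot
a, b = static_net(repo_vec)
y_static = lora_forward(w_frozen, a, b, x)

# Folding the adapter into the weight gives the same output,
# so the input sequence itself never grows.
y_merged = (w_frozen + b @ a) @ x

# Evo mode: update the state once per diff, then regenerate the adapter.
gru = GRUCell(D_DIFF, D_STATE)
evo_net = AdapterHypernet(D_STATE)
state = np.zeros(D_STATE)
for diff_vec in [rng.normal(size=D_DIFF) for _ in range(3)]:
    state = gru(diff_vec, state)
a_evo, b_evo = evo_net(state)
y_evo = lora_forward(w_frozen, a_evo, b_evo, x)
```

In this sketch only the small recurrent state has to be carried between commits, which is one plausible reading of why a per-diff update is cheaper than re-encoding the whole repository.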
To evaluate against parameter-efficient fine-tuning baselines, the authors built RepoPeftBench — 604 Python repositories with a static track (40K training, 12K test assertion-completion tasks) and an evolution track (215K commit-derived training, 87K test tasks). On the static track, Code2LoRA-Static achieves 63.8% cross-repo and 66.2% in-repo exact match, matching the per-repository LoRA upper bound; on the evolution track, Code2LoRA-Evo achieves 60.3% cross-repo exact match (+5.2 pp over a single shared LoRA).
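For readers unfamiliar with the metric, here is a minimal sketch of exact match on assertion completion. The whitespace trimming is an assumption (the summary does not say how the benchmark normalizes strings), and the example assertions are invented. The last lines work out the shared-LoRA score implied by the two reported evolution-track numbers; that baseline figure is derived here, not quoted from the paper.

```python
def exact_match(predictions, references):
    """Percentage of predictions identical to their reference after trimming."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("need at least one example")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Hypothetical completions; the last one differs in a single literal.
preds = [
    "assert add(1, 2) == 3",
    "assert parse('') is None ",
    "assert cfg.debug is False",
    "assert len(items) == 2",
]
refs = [
    "assert add(1, 2) == 3",
    "assert parse('') is None",
    "assert cfg.debug is False",
    "assert len(items) == 3",
]
score = exact_match(preds, refs)

# Reported evolution-track numbers from the summary above.
evo_cross_repo = 60.3   # Code2LoRA-Evo, cross-repo exact match (%)
gain_pp = 5.2           # gain over a single shared LoRA, in percentage points
implied_shared_lora = round(evo_cross_repo - gain_pp, 1)
```

Exact match is strict: a completion that is off by one token scores zero for that task, so a gain of a few percentage points reflects many more fully correct assertions.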
Why it matters
This work is a case study in making repository adaptation for code LLMs efficient. It is a useful read for anyone tracking RAG-free code comprehension, adaptation to evolving codebases, or parameter-efficient fine-tuning (PEFT).
FAQ
What are LoRA / adapters?
Why does "zero token overhead" matter?
Sources (primary)
Source: arXiv (descriptive metadata is CC0 public domain). Summaries are our own; see arXiv for the original text and PDF.
- arXiv abstract page (original, official)
- PDF (arXiv)
- arXiv ID: 2606.06492