cs.SE
2026/06
Code LLMs need repository-level context to resolve imports, APIs, and conventions. Code2LoRA is a hypernetwork that generates repository-specific LoRA adapters, injecting that knowledge with zero inference-time token overhead. It offers a Static mode (snapshot → adapter) and an Evo mode updated per code diff.
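The "zero inference-time token overhead" claim follows from how LoRA works: the adapter is a low-rank weight update that can be folded into the base weights, so the prompt does not grow. The sketch below illustrates only that merge step, in plain Python. The matrices `B` and `A` are hand-written stand-ins for what a hypernetwork would generate, and the function names are hypothetical rather than taken from Code2LoRA.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a @ b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w: Matrix, b: Matrix, a: Matrix, scale: float = 1.0) -> Matrix:
    """Return W + scale * (B @ A), the standard LoRA weight merge."""
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def apply(w: Matrix, x: List[float]) -> List[float]:
    """Apply a weight matrix to an input vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


# Base weights (2x2) and a rank-1 adapter; in the setting described above,
# B and A would come from the hypernetwork (assumed, not shown here).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # 2x1
A = [[0.0, 2.0]]     # 1x2

W_repo = merge_lora(W, B, A)
x = [1.0, 1.0]
y = apply(W_repo, x)  # same input length as before; only the weights changed
```

The input `x` is identical with and without the adapter, which is the sense in which repository knowledge costs no extra tokens at inference time.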
cs.RO
2026/06
Manipulation alternates between low-risk transit (fast) and high-risk contact (slow, precise), yet existing Vision-Language-Action models (VLAs) inherit a single fixed speed from demonstrations. TempoVLA notes that the magnitude of each predicted action already governs speed, and controls execution speed via an explicit condition — combining a data-side variable-speed trajectory augmentation (VSTA) with model-side speed conditioning to control both acceleration and deceleration.
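If actions are per-step displacements, a demonstration can be retimed by changing the step magnitudes while keeping the path. The following is a minimal sketch of that idea, assuming delta actions and integer speed factors; `speed_up`, `slow_down`, and `augment` are hypothetical names, and the actual VSTA procedure may differ.

```python
from typing import Dict, List, Tuple

Action = Tuple[float, ...]


def speed_up(actions: List[Action], k: int) -> List[Action]:
    """Merge every k consecutive deltas into one larger step (k times faster)."""
    merged = []
    for start in range(0, len(actions), k):
        group = actions[start:start + k]
        merged.append(tuple(sum(dim) for dim in zip(*group)))
    return merged


def slow_down(actions: List[Action], k: int) -> List[Action]:
    """Split every delta into k equal smaller steps (k times slower)."""
    return [tuple(d / k for d in action) for action in actions for _ in range(k)]


def total_displacement(actions: List[Action]) -> Action:
    """Sum of all deltas; retiming should leave this unchanged."""
    return tuple(sum(dim) for dim in zip(*actions))


def augment(actions: List[Action], k: int, faster: bool) -> Dict[str, object]:
    """Pair a retimed trajectory with the speed label used as the condition."""
    speed = float(k) if faster else 1.0 / k
    retimed = speed_up(actions, k) if faster else slow_down(actions, k)
    return {"speed": speed, "actions": retimed}


demo = [(1.0, 0.0), (1.0, 2.0), (0.0, 2.0), (2.0, 0.0)]
fast = augment(demo, 2, faster=True)    # 2 steps instead of 4
slow = augment(demo, 2, faster=False)   # 8 steps instead of 4
```

Each augmented sample carries its speed label, so a model trained on such data could be conditioned on the desired speed at execution time.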
cs.CL
2026/06
As AI writing assistants spread, documents are increasingly the product of progressive human–AI co-editing rather than purely human-written or purely AI-generated text. OpAI-Bench studies human-to-AI transformation at document, sentence, token, and span granularities. It finds that mixed-authorship "intermediate" versions are often harder to detect than fully human or heavily AI-edited endpoints (a non-monotonic pattern).
cs.LG
2026/06
Standard RNN training (BPTT) is sequential in time, hard to parallelize, and suffers vanishing/exploding gradients on long ranges. SMT reduces RNN training to supervised learning on one-step memory-transition labels (m_t, x_{t+1})→m_{t+1}, sidestepping recurrent credit propagation entirely — enabling time-parallel training with an O(1) gradient path between any two tokens. It beats BPTT on language and pixel-sequence modeling (MIT, Isola lab).
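The reduction to one-step labels can be pictured as follows: once a target memory exists for every position, each (m_t, x_{t+1})→m_{t+1} pair is an independent supervised example, so the loss needs no unrolling through time. This is only a toy sketch. How SMT obtains the memory labels is not stated above, so they are given as plain inputs here, and the scalar linear `cell` is a hypothetical stand-in for an RNN cell.

```python
from typing import List, Tuple

Example = Tuple[Tuple[float, float], float]


def transition_examples(memories: List[float], inputs: List[float]) -> List[Example]:
    """Build ((m_t, x_{t+1}), m_{t+1}) pairs from m_0..m_T and x_1..x_T."""
    assert len(memories) == len(inputs) + 1
    return [
        ((memories[t], inputs[t]), memories[t + 1])
        for t in range(len(inputs))
    ]


def cell(m: float, x: float, a: float, b: float) -> float:
    """Toy one-step memory update with parameters a and b."""
    return a * m + b * x


def batch_loss(examples: List[Example], a: float, b: float) -> float:
    """Mean squared error over one-step transitions.

    Every term depends only on its own (m_t, x_{t+1}) pair, never on the
    cell's prediction at another time step, so terms can be computed in
    any order or in parallel.
    """
    errors = [(cell(m, x, a, b) - target) ** 2 for (m, x), target in examples]
    return sum(errors) / len(errors)


memories = [0.0, 1.0, 3.0]   # assumed memory labels m_0, m_1, m_2
inputs = [1.0, 2.0]          # x_1, x_2
examples = transition_examples(memories, inputs)
loss_good = batch_loss(examples, a=1.0, b=1.0)
loss_bad = batch_loss(examples, a=0.0, b=1.0)
```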
cs.CV
2026/06
Vision-Language Models (VLMs) tend to confine reasoning to observed images and text, struggling with unobserved layouts and alternative viewpoints. Astra is a "thinking with imagination" framework where a VLM actively queries a world simulator for imagined novel-view evidence during reasoning. It couples an RL-trained policy with a Bagel-based world model and improves spatial-reasoning benchmarks.
cs.LG
2026/06
RL fine-tuning of reasoning models (e.g., GRPO) can only verify and reward the final answer after the chain-of-thought (CoT) is complete — a delayed-reward problem that is Monte-Carlo-like and high-variance. RREDCoT redistributes reward (credit assignment) to the CoT segments that mattered, approximating the optimal redistribution using the model itself without extra generation (from LSTM creator Hochreiter's group, JKU).
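One generic way to turn a single delayed reward into per-segment credit is to take differences of a running estimate of the final outcome, in the spirit of return-decomposition methods such as RUDDER. The sketch below shows that generic scheme only; it is not RREDCoT's estimator. The `values` list (an estimate of expected final reward after each CoT prefix) is an assumed input, and `redistribute` is a hypothetical name.

```python
from typing import List


def redistribute(values: List[float], final_reward: float) -> List[float]:
    """Spread a terminal reward over segments via value differences.

    values[0] is the estimate before any segment; values[t] is the estimate
    after segment t. Segment t is credited with values[t] - values[t - 1],
    and the last segment also absorbs the gap between the final estimate
    and the true reward, so the credits sum to final_reward - values[0].
    """
    assert len(values) >= 2
    credits = [values[t] - values[t - 1] for t in range(1, len(values))]
    credits[-1] += final_reward - values[-1]
    return credits


# Three CoT segments; the second one raises the estimated chance of success.
values = [0.25, 0.25, 0.75, 0.5]
credits = redistribute(values, final_reward=1.0)
```

Here the first segment receives no credit, the second receives most of it, and the total still matches the verified outcome relative to the initial estimate.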
cs.AI
2026/06
LLM agents tackling long-horizon tasks such as ML engineering are hampered by inter-branch information isolation, memoryless search, and a lack of hierarchical control. MLEvolve is a self-evolving multi-agent framework that enables cross-branch information flow and reuses accumulated experience. It reaches SOTA on MLE-Bench in half the time budget and beats AlphaEvolve on math optimization.
cs.AI
2026/06
An agentic framework for formal theorem proving in Lean 4 that generates and refines a "blueprint" — a dependency graph of definitions and lemmas. A tool-equipped Lean prover closes each lemma node in parallel, and failures drive blueprint refinement, avoiding the dead-end loops of recursive decomposition. On an open-weight backbone it reaches 99.2% on MiniF2F and 75.6% on PutnamBench (88.8% with a natural-language proof) — SOTA-class for an open-source pipeline.
cs.CL
2026/06
Long-context inference is bottlenecked by decoding efficiency, especially for reasoning models that emit long chains of thought. Existing sparse attention faces an efficiency-quality trade-off. CLSA, built on KV-sharing (YOCO), shares not just the KV cache but the routing index across layers — computing top-k selection once and reusing it. At 128K context it reaches up to 7.6x decoding speedup and 17.1x overall throughput.
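The saving from sharing the routing index can be seen in a toy decode step: the top-k token selection is computed once and every layer then attends only to those cached positions. The sketch below assumes a single shared KV cache, dot-product scoring, and selection driven by the first layer's query; all function names are hypothetical and none of this is taken from the CLSA implementation.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def top_k_indices(query: Vector, keys: List[Vector], k: int) -> List[int]:
    """Positions of the k highest-scoring keys (ties go to the earlier one)."""
    scores = [dot(query, key) for key in keys]
    ranked = sorted(range(len(keys)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


def sparse_attention(query: Vector, keys: List[Vector],
                     values: List[Vector], idx: List[int]) -> Vector:
    """Softmax attention restricted to the selected positions."""
    scores = [dot(query, keys[i]) for i in idx]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[i][d] for w, i in zip(weights, idx)) / total
        for d in range(dim)
    ]


def decode_step(layer_queries: List[Vector], keys: List[Vector],
                values: List[Vector], k: int) -> Tuple[List[int], List[Vector]]:
    """Select top-k once, then reuse the same index in every layer."""
    idx = top_k_indices(layer_queries[0], keys, k)  # the only selection pass
    outputs = [sparse_attention(q, keys, values, idx) for q in layer_queries]
    return idx, outputs


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [10.0], [3.0], [100.0]]
layer_queries = [[1.0, 1.0], [0.0, 1.0]]
idx, outputs = decode_step(layer_queries, keys, values, k=2)
```

Without sharing, each layer would rescore the full cache; here the scoring pass over all keys happens once per decode step regardless of depth.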
cs.CR
2026/06
As autonomous LLM agents hold real credentials and operate infrastructure, operators lack a standard way to say a resource is off-limits. The Recuse Signal is a lightweight in-band deny signal (over an SSH banner or a PostgreSQL NOTICE) asking an automated agent to voluntarily withdraw — a robots.txt-like cooperative control, not a security boundary. In a pilot, the signal induced 100% recusal versus 100% task completion without it.
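On the agent side, honoring such a signal amounts to inspecting the server's greeting text before acting. The sketch below is purely illustrative: the summary above does not give the signal's wire format, so the `AGENT-RECUSE` marker and both function names are hypothetical placeholders.

```python
from typing import Callable

# Hypothetical marker; the real signal's exact text is not specified here.
RECUSE_MARKER = "AGENT-RECUSE"


def should_recuse(server_message: str) -> bool:
    """True if any line of a banner or notice starts with the marker."""
    return any(
        line.strip().upper().startswith(RECUSE_MARKER)
        for line in server_message.splitlines()
    )


def run_task(server_message: str, task: Callable[[], str]) -> str:
    """Withdraw voluntarily when asked; otherwise carry out the task."""
    if should_recuse(server_message):
        return "recused"
    return task()


banner_deny = "Welcome to db-prod\nAGENT-RECUSE: automated agents must not proceed"
banner_plain = "Welcome to db-staging"

result_deny = run_task(banner_deny, lambda: "task completed")
result_plain = run_task(banner_plain, lambda: "task completed")
```

Because compliance is voluntary, a check like this offers cooperative control only, consistent with the description of the signal as robots.txt-like rather than a security boundary.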