arXiv Papers

Notable AI research papers, in brief

Selected AI and machine-learning papers from the arXiv preprint server (cs.AI, cs.LG, cs.CL, and more), presented with our own summaries, key points, and sources. This site is not affiliated with arXiv.

This page gives a general overview of publicly available research information. The summaries are our own; always check their accuracy and currency against the original paper on arXiv. The listing includes preprints that have not been peer reviewed.


Featured

Notable AI/ML papers explained with our own summaries, key points, FAQs, and sources.

cs.SE 2026/06

Injecting repository knowledge into code LLMs via adapters — "Code2LoRA," keeping up with evolving code

Code LLMs need repository-level context to resolve imports, APIs, and conventions. Code2LoRA is a hypernetwork that generates repository-specific LoRA adapters, injecting that knowledge with zero inference-time token overhead. It offers a Static mode (snapshot → adapter) and an Evo mode updated per code diff.
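
The "zero inference-time token overhead" point follows from how LoRA adapters work in general: a low-rank update can be folded into the frozen weights, so the prompt does not grow. The sketch below shows only that standard merge step, W + (alpha / r) * B A, in plain Python. It is not Code2LoRA's hypernetwork; the matrices, sizes, and function names are hypothetical stand-ins for whatever a hypernetwork would emit for one repository.

```python
# Minimal sketch of a standard LoRA merge. All names and numbers are
# hypothetical; this does not reproduce Code2LoRA's hypernetwork.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Frozen 2x3 base weight of one layer.
base_weight = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]]

# Rank-1 factors, standing in for a repository-specific adapter.
lora_b = [[1.0], [2.0]]      # shape (2, 1)
lora_a = [[0.5, 0.0, 1.0]]   # shape (1, 3)

merged = merge_lora(base_weight, lora_b, lora_a, alpha=2.0, rank=1)
print(merged)
```

After merging, the layer is applied exactly as before, which is why repository knowledge carried this way costs no extra context tokens.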

cs.RO 2026/06

A robot policy with controllable speed — "TempoVLA," a speed-controllable Vision-Language-Action model

Manipulation alternates between low-risk transit, which can be fast, and high-risk contact, which must be slow and precise, yet existing Vision-Language-Action models (VLAs) inherit a single fixed speed from their demonstrations. TempoVLA observes that the magnitude of each predicted action already governs speed, and it controls execution speed through an explicit condition. It combines data-side variable-speed trajectory augmentation (VSTA) with model-side speed conditioning to control both acceleration and deceleration.
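
To make the "action magnitude governs speed" observation concrete, here is a small sketch of one way a demonstration could be re-timed and labeled with a speed condition. It is our own illustration, not the paper's VSTA procedure: the linear resampling, the 1-D trajectory, and the names `resample` and `actions_from` are all assumptions.

```python
# Illustrative only: re-time a 1-D demonstration and read off per-step actions.
# The resampling scheme and all names are assumptions, not the paper's VSTA.

def resample(positions, speed):
    """Linearly resample a position trajectory to play back about `speed` times faster."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    last = len(positions) - 1
    steps = max(1, round(last / speed))
    resampled = []
    for i in range(steps + 1):
        t = i * last / steps            # fractional index into the original
        lo = min(int(t), last)
        hi = min(lo + 1, last)
        frac = t - lo
        resampled.append(positions[lo] * (1 - frac) + positions[hi] * frac)
    return resampled

def actions_from(positions):
    """Per-step displacement; its magnitude is the step's speed."""
    return [b - a for a, b in zip(positions, positions[1:])]

demo = [0.0, 1.0, 2.0, 3.0, 4.0]

# Map each speed condition to the action sequence a policy would be trained on.
augmented = {s: actions_from(resample(demo, s)) for s in (0.5, 1.0, 2.0)}
for speed, actions in augmented.items():
    print(speed, actions)
```

The same path is covered in every case; only the number of steps and the size of each action change, which is what lets a speed label act as a control input.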

cs.CL 2026/06

AI-text detection gets harder under human–AI co-editing — the "OpAI-Bench" progressive-editing benchmark

As AI writing assistants spread, documents are increasingly the product of progressive human–AI co-editing rather than purely human-written or purely AI-generated text. OpAI-Bench studies human-to-AI transformation at document, sentence, token, and span granularities. It finds a non-monotonic pattern: mixed-authorship "intermediate" versions are often harder to detect than the fully human or heavily AI-edited endpoints.

cs.LG 2026/06

Training RNNs without recurrence — "Supervised Memory Training (SMT)," parallelizable across time

Standard RNN training with backpropagation through time (BPTT) is sequential in time, hard to parallelize, and suffers from vanishing or exploding gradients over long ranges. SMT reduces RNN training to supervised learning on one-step memory-transition labels, (m_t, x_{t+1}) → m_{t+1}, sidestepping recurrent credit propagation entirely. This enables time-parallel training with an O(1) gradient path between any two tokens. It beats BPTT on language and pixel-sequence modeling (MIT, Isola lab).
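
The one-step formulation can be illustrated with a toy scalar cell. In this sketch the memory labels m_0..m_T are simply assumed to be given (this summary does not cover how SMT obtains them), and the cell `step` with parameters `(a, b)` is hypothetical. The point is only that each loss term reads a label m_t rather than the model's own previous output, so the terms are independent across time.

```python
# Toy illustration of one-step memory-transition supervision.
# The cell, its parameters, and the memory labels are all hypothetical.

def step(params, memory, token):
    """A scalar 'RNN cell': next memory = a * memory + b * token."""
    a, b = params
    return a * memory + b * token

def one_step_losses(params, memories, tokens):
    """Squared error of each (m_t, x_{t+1}) -> m_{t+1} prediction.

    memories holds labels m_0..m_T and tokens holds x_1..x_T. No term
    depends on another, so they could all be evaluated in parallel.
    """
    return [(step(params, m_t, x_next) - m_next) ** 2
            for m_t, x_next, m_next in zip(memories, tokens, memories[1:])]

tokens = [1.0, 2.0, 3.0]
memories = [0.0, 1.0, 3.0, 6.0]   # labels from a running sum, for illustration

perfect = one_step_losses((1.0, 1.0), memories, tokens)
imperfect = one_step_losses((0.5, 1.0), memories, tokens)
print(perfect, imperfect)
```

In a BPTT rollout the second step would consume the model's first output instead of the label, which is what forces sequential computation and long gradient paths.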

cs.CV 2026/06

AI that "imagines" unseen space to reason — "Astra," a spatial-reasoning agent coupled with a world simulator

Vision-Language Models (VLMs) tend to confine reasoning to observed images and text, struggling with unobserved layouts and alternative viewpoints. Astra is a "thinking with imagination" framework in which a VLM actively queries a world simulator for imagined novel-view evidence during reasoning. It couples an RL-trained policy with a Bagel-based world model and improves results on spatial-reasoning benchmarks.

cs.LG 2026/06

Redistributing reward to the reasoning steps that mattered — "RREDCoT" for chain-of-thought

RL fine-tuning of reasoning models (e.g., GRPO) can only verify and reward the final answer after the chain-of-thought (CoT) is complete. This is a delayed-reward problem: the learning signal is Monte-Carlo-like and high-variance. RREDCoT redistributes reward (credit assignment) to the CoT segments that mattered, approximating the optimal redistribution using the model itself without extra generation (from LSTM creator Hochreiter's group, JKU).
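
As a rough picture of what "redistributing" a terminal reward means, the sketch below uses a generic difference-based scheme: each segment is credited with the change in an estimate of eventual success. Whether RREDCoT computes its redistribution this way is not stated in this summary, so treat the scheme, the function name, and the numbers as assumptions.

```python
# Generic difference-based reward redistribution (an assumption for
# illustration, not necessarily the method used by RREDCoT).

def redistribute(estimates, final_reward):
    """Credit each segment with the change in estimated success.

    estimates[k] is a hypothetical estimate of final success after the
    first k segments (k = 0 means before any reasoning). The value after
    the last segment is the verified final reward itself.
    """
    values = list(estimates) + [final_reward]
    return [after - before for before, after in zip(values, values[1:])]

# Three CoT segments; the second one is where the estimate jumps.
estimates = [0.25, 0.25, 0.75]
segment_rewards = redistribute(estimates, final_reward=1.0)
print(segment_rewards)
```

The per-segment rewards telescope, so they sum to the final reward minus the initial estimate, and the segment that moved the estimate most receives the most credit.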

cs.AI 2026/06

AI that "self-evolves" to discover machine-learning algorithms — "MLEvolve"

When LLM agents tackle long-horizon tasks such as ML engineering, three problems hamper optimization: information isolation between search branches, memoryless search, and a lack of hierarchical control. MLEvolve is a self-evolving multi-agent framework that enables cross-branch information flow and reuses accumulated experience. It reaches SOTA on MLE-Bench in half the time budget and beats AlphaEvolve on math optimization.

cs.AI 2026/06

AI that proves theorems from a "blueprint" — "Goedel-Architect" for formal theorem proving in Lean 4

An agentic framework for formal theorem proving in Lean 4 that generates and refines a "blueprint" — a dependency graph of definitions and lemmas. A tool-equipped Lean prover closes each lemma node in parallel, and failures drive blueprint refinement, avoiding the dead-end loops of recursive decomposition. On an open-weight backbone it reaches 99.2% on MiniF2F and 75.6% on PutnamBench (88.8% with a natural-language proof) — SOTA-class for an open-source pipeline.

cs.CL 2026/06

Speeding up long-context LLMs by indexing once — "CLSA" cross-layer sparse attention

Long-context inference is bottlenecked by decoding efficiency, especially for reasoning models that emit long chains of thought. Existing sparse attention faces an efficiency-quality trade-off. CLSA, built on KV-sharing (YOCO), shares not just the KV cache but the routing index across layers — computing top-k selection once and reusing it. At 128K context it reaches up to 7.6x decoding speedup and 17.1x overall throughput.
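
The "index once, reuse across layers" idea can be sketched in a few lines. This is a toy with scalar values and made-up routing scores, not CLSA's actual routing or kernel design: the top-k selection runs a single time, and every layer then attends only to the selected positions of a shared KV cache.

```python
import math

# Toy sketch of shared top-k routing across layers. Scores, shapes, and
# names are hypothetical and do not reproduce CLSA's implementation.

def top_k_indices(scores, k):
    """Indices of the k largest scores, returned in ascending position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def sparse_attention(query, keys, values, indices):
    """Softmax attention restricted to the selected positions (scalar values)."""
    logits = [sum(q * k for q, k in zip(query, keys[i])) for i in indices]
    peak = max(logits)
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return sum(w / total * values[i] for w, i in zip(weights, indices))

# One shared KV cache and one set of routing scores.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
values = [10.0, 20.0, 30.0, 40.0]
routing_scores = [0.1, 0.9, 0.3, 0.8]

# Selection is computed once...
shared_indices = top_k_indices(routing_scores, k=2)

# ...and reused by every layer's query.
layer_queries = [[0.0, 0.0], [2.0, 0.0]]
outputs = [sparse_attention(q, keys, values, shared_indices) for q in layer_queries]
print(shared_indices, outputs)
```

Each layer still computes its own attention weights, but none of them repeats the scoring and sorting over the full context, which is the cost the shared index removes.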

cs.CR 2026/06

Will an AI agent recuse itself? Measuring compliance with a "Recuse Signal"

As autonomous LLM agents hold real credentials and operate infrastructure, operators lack a standard way to say a resource is off-limits. The Recuse Signal is a lightweight in-band deny signal (over an SSH banner or a PostgreSQL NOTICE) asking an automated agent to voluntarily withdraw. It is a robots.txt-like cooperative control, not a security boundary. In a pilot, agents recused themselves 100% of the time when the signal was present and completed the task 100% of the time when it was absent.
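
On the agent side, honoring such a signal could look roughly like the sketch below. The marker string `AGENT-RECUSE` and both function names are hypothetical; the actual wording and format of the Recuse Signal are not given in this summary.

```python
# Hypothetical agent-side check for an in-band deny signal.
# The marker text is invented for illustration, not the paper's format.

RECUSE_MARKER = "AGENT-RECUSE"

def should_recuse(banner_text):
    """True if any line of the banner or notice carries the deny marker."""
    return any(RECUSE_MARKER in line.upper() for line in banner_text.splitlines())

def run_task(banner_text, task):
    """Withdraw voluntarily when asked; otherwise run the task."""
    if should_recuse(banner_text):
        return "recused"
    return task()

flagged_banner = "Welcome to db-prod\nAGENT-RECUSE: automated agents must not proceed"
plain_banner = "Welcome to db-staging"

print(run_task(flagged_banner, lambda: "task completed"))
print(run_task(plain_banner, lambda: "task completed"))
```

Nothing here stops an agent that ignores the banner, which matches the summary's framing of the signal as cooperative rather than as access control.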

Latest feed

Recent AI/ML papers — latest 40

New papers in AI-related categories, newest submission first. Each links to the original page on arXiv.

1. TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning
   cs.LG, 2026/06/04
2. HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers
   cs.RO ・ cs.AI ・ cs.LG, 2026/06/04
3. Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution
   cs.SE ・ cs.AI ・ cs.CL, 2026/06/04
4. TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies
   cs.RO ・ cs.AI, 2026/06/04
5. Regret Minimization with Adaptive Opponents in Repeated Games
   cs.LG ・ cs.AI ・ cs.GT, 2026/06/04
6. PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding
   cs.CV, 2026/06/04
7. Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection
   cs.CL ・ cs.AI ・ cs.LG, 2026/06/04
8. DNQ: Deep Nash Q-Network for Partially Observable n-Player Games
   cs.GT ・ cs.LG, 2026/06/04
9. Pretraining Recurrent Networks without Recurrence
   cs.LG ・ cs.AI, 2026/06/04
10. Complexity-Balanced Diffusion Splitting
    cs.CV, 2026/06/04
11. Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators
    cs.CV, 2026/06/04
12. RREDCoT: Segment-Level Reward Redistribution for Reasoning Models
    cs.LG ・ cs.AI, 2026/06/04
13. Self-Augmenting Retrieval for Diffusion Language Models
    cs.CL ・ cs.AI ・ cs.LG, 2026/06/04
14. MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
    cs.AI ・ cs.CL, 2026/06/04
15. PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training
    cs.LG ・ cs.AI, 2026/06/04
16. How abundant are good interpolators?
    math.ST ・ cs.LG ・ math.PR, 2026/06/04
17. Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement
    cs.AI, 2026/06/04
18. You Only Index Once: Cross-Layer Sparse Attention with Shared Routing
    cs.CL ・ cs.AI ・ cs.LG, 2026/06/04
19. Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?
    cs.CL, 2026/06/04
20. Benchmark Everything Everywhere All at Once
    cs.AI, 2026/06/04
21. Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
    cs.CR ・ cs.AI, 2026/06/04
22. Event Detection for Parameter-to-KPI Dependency Learning for AI-RAN
    cs.LG, 2026/06/04
23. In-Context Multiple Instance Learning
    cs.LG ・ cs.AI ・ cs.CV, 2026/06/04
24. Scaffold, Not Vocabulary? A Controlled, Two-Tier, Pre-Registered Study of a Popperian Code-Generation Skill
    cs.SE ・ cs.CL, 2026/06/04
25. Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents
    cs.AI, 2026/06/04
26. Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads
    cs.AI, 2026/06/04
27. Latent Reasoning with Normalizing Flows
    cs.CL ・ cs.LG, 2026/06/04
28. USAD 2.0: Scaling Representation Distillation for Universal Audio Understanding
    eess.AS ・ cs.CL ・ cs.SD, 2026/06/04
29. Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions
    cs.CL ・ cs.MM ・ cs.SI, 2026/06/04
30. Causal Atlases from Entropic Inference: Bayesian Networks beyond Optimal DAGs
    cs.LG ・ stat.ML, 2026/06/04
31. Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation
    cs.CL, 2026/06/04
32. RiskFlow: Fast and Faithful Safety-Critical Traffic Scenario Generation
    cs.RO ・ cs.AI, 2026/06/04
33. A Komi-Yazva–Russian Parallel Corpus and Evaluation Protocol for Zero- and Few-Shot LLM Translation
    cs.CL, 2026/06/04
34. Double Preconditioning (DoPr): Optimization for Test-Time Performance, not Validation Loss
    cs.LG ・ cs.AI ・ eess.SY, 2026/06/04
35. Unsupervised Skill Discovery for Agentic Data Analysis
    cs.AI ・ cs.CL ・ cs.LG, 2026/06/04
36. Nonreversible Gauge Fields in Fokker–Planck Dynamics: Supersymmetric Hamiltonians and Learned Finite Forces
    cond-mat.dis-nn ・ quant-ph ・ stat.ML, 2026/06/04
37. A Vision-language Framework for Comparative Reasoning in Radiology
    cs.CV ・ cs.IR ・ cs.LG, 2026/06/04
38. CollabSim: A CSCW-Grounded Methodology for Investigating Collaborative Competence of LLM Agents through Controlled Multi-Agent Experiments
    cs.CL, 2026/06/04
39. The Post-GCN Decade Revisited: Curvature-Stratified Evaluation of Relational Learning
    cs.LG, 2026/06/04
40. Risk Assessment of Autonomous Driving: Integrating Technical Failures, Ethical Dilemmas, and Policy Frameworks
    cs.AI, 2026/06/04

Source: arXiv (descriptive metadata is CC0 public domain). Summaries are our own; see arXiv for the original text and PDF.