AI that "self-evolves" to discover machine-learning algorithms — "MLEvolve"
LLM agents working on long-horizon tasks such as ML engineering are held back by inter-branch information isolation, memoryless search, and a lack of hierarchical control. MLEvolve is a self-evolving multi-agent framework that enables cross-branch information flow and reuses accumulated experience. It reaches state-of-the-art results on MLE-Bench under a 12-hour budget (half the standard runtime) and outperforms AlphaEvolve on math-algorithm optimization tasks.
Paper overview (our summary)
- Field (arXiv category): cs.AI (+1)
- Authors: Shangheng Du, Xiangchao Yan, Jinxin Shi, et al. (14)
- Submitted: 2026-06-04
- arXiv ID: 2606.06473v1
Key points
- An LLM-agent framework that discovers ML algorithms through self-evolution
- Extends tree search to Progressive MCGS: cross-branch sharing via graph-based reference edges, plus a gradual shift from broad exploration to focused exploitation
- Retrospective Memory (knowledge base + dynamic global memory) accumulates and reuses experience
- Decouples planning from code generation for stable long-horizon iteration
- State of the art on MLE-Bench under a 12-hour budget (half the standard runtime); outperforms AlphaEvolve on math-algorithm optimization
This work (MLEvolve) lets AI discover machine-learning (ML) algorithms in a self-evolving way.
LLM agents are increasingly applied to long-horizon tasks such as scientific discovery and machine-learning engineering (MLE), where sustained self-evolution is a key capability. But existing MLE agents suffer from inter-branch information isolation, memoryless search, and a lack of hierarchical control — together hindering long-horizon optimization.
MLEvolve is an LLM-based self-evolving multi-agent framework for end-to-end ML algorithm discovery. By extending tree search to Progressive MCGS, it enables cross-branch information flow through graph-based reference edges and gradually shifts the search from broad exploration to focused exploitation with an entropy-inspired progressive schedule. To let the agent evolve with accumulated experience, it introduces Retrospective Memory, combining a cold-start domain knowledge base with a dynamic global memory for task-specific experience retrieval and reuse. For stable long-horizon iteration, it further decouples strategic planning from code generation with adaptive coding modes.
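To make the search idea above more concrete, here is a minimal Python sketch of a search graph with cross-branch reference edges and an exploration weight that shrinks as the run progresses. It is an illustration under our own assumptions, not the paper's implementation: the `Node` class, the cosine decay in `exploration_weight`, the UCB-style score, and every constant are hypothetical stand-ins for the "entropy-inspired progressive schedule" that the abstract mentions without specifying.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate solution in the search graph (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: list["Node"] = field(default_factory=list)
    # Reference edges: nodes in OTHER branches whose results this node may reuse.
    references: list["Node"] = field(default_factory=list)

    @property
    def mean_reward(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def exploration_weight(step: int, total_steps: int,
                       c_start: float = 1.4, c_end: float = 0.1) -> float:
    """Cosine decay from c_start to c_end.

    Assumed schedule for illustration only; requires total_steps > 0.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return c_end + 0.5 * (c_start - c_end) * (1.0 + math.cos(math.pi * progress))


def ucb_score(parent: Node, child: Node, c: float) -> float:
    """Standard UCB1-style score: mean reward plus a visit-count bonus."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    bonus = c * math.sqrt(math.log(max(parent.visits, 1)) / child.visits)
    return child.mean_reward + bonus


def select_child(parent: Node, step: int, total_steps: int) -> Node:
    """Pick the child with the best score under the current exploration weight."""
    c = exploration_weight(step, total_steps)
    return max(parent.children, key=lambda ch: ucb_score(parent, ch, c))


def reference_context(node: Node) -> list[str]:
    """Names of cross-branch nodes whose results would be shown to the agent."""
    return [ref.name for ref in node.references]


if __name__ == "__main__":
    root = Node("root", visits=20)
    a = Node("A", visits=15, total_reward=9.0)  # well explored, mean 0.6
    b = Node("B", visits=5, total_reward=2.5)   # less explored, mean 0.5
    root.children = [a, b]
    b.references.append(a)  # B can borrow ideas from branch A

    print(select_child(root, step=0, total_steps=100).name)
    print(select_child(root, step=100, total_steps=100).name)
    print(reference_context(b))
```

With these made-up numbers, the first step favors the less-visited branch B because the exploration bonus dominates, while the final step favors A because the bonus has nearly vanished and the higher mean reward wins. The reference edge from B to A is what distinguishes a graph from a plain tree here: it lets one branch see another's results without being its descendant.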
On MLE-Bench, MLEvolve achieves state-of-the-art performance across multiple dimensions including average medal rate and valid submission rate under a 12-hour budget (half the standard runtime). It also outperforms specialized algorithm-discovery methods including AlphaEvolve on math-algorithm optimization tasks, showing strong cross-domain generalization.
Why it matters
A useful reference point for AI-agent-driven AutoML and autonomous algorithm discovery. Worth reading if you track how LLM agents handle long-horizon tasks (memory, search, self-improvement) or follow AI for Science.
FAQ
What does "self-evolving" mean here?
What is AlphaEvolve?
Sources (primary)
Source: arXiv (descriptive metadata is CC0 public domain). Summaries are our own; see arXiv for the original text and PDF.
- arXiv abstract page (original, official)
- PDF (arXiv)
- arXiv ID: 2606.06473