HyperAgents unify task agents and meta agents into one editable codebase, enabling metacognitive self-modification that improves not just task performance, but the improvement process itself.
HyperAgents represent a paradigm shift in AI agent research, introduced by researchers at Meta in March 2026. The core idea is to merge the task agent (the program that solves problems) and the meta agent (the mechanism that improves the task agent) into a single, self-modifiable codebase. This architectural choice enables what the authors call metacognitive self-modification: the system can improve its own improvement process, avoiding the infinite regress problem that plagues layered meta-architectures.
The framework is formalized as DGM-H (Darwin Gödel Machine with Hyperagents), extending the earlier Darwin Gödel Machine. DGM-H retains the open-ended archive-based exploration mechanism—preserving historically successful agent variants as "stepping stones"—while allowing the system to rewrite its own improvement rules. This approach has demonstrated continuous performance gains across multiple non-code domains, including academic paper review, robotics reward function design, and olympiad-level math grading, and has shown measurable cross-domain transfer of improvement capability.
In empirical evaluations, DGM-H achieved remarkable results: a transferred HyperAgent meta-mechanism reached an improvement@50 score of 0.630 on olympiad math grading (IMO-GradingBench), while a hand-crafted DGM baseline scored 0.0 on the same transfer task. The system also autonomously developed emergent engineering capabilities such as persistent memory, performance tracking, and computational resource planning—written directly into its own codebase during the self-improvement loop.
Traditional self-improving AI systems face a fundamental challenge: if a task agent needs a meta agent to improve it, who improves the meta agent? Stacking additional meta-layers simply shifts the problem without resolving it.
HyperAgents resolve this by unifying both mechanisms into one dynamically editable program. Instead of a fixed meta-layer generating improvements for a separate task-layer, the entire system—including the logic that generates improvements—resides in a single mutable codebase. This means the system can rewrite its own improvement rules without requiring a separate, higher-order optimizer.
The practical consequence is significant: DGM-H eliminates the dependency on hand-designed meta mechanisms and removes the assumption that task-solving capability and self-improvement capability must be aligned within the same domain. A hyperagent that learned to improve itself in the paper review domain can transfer that improvement ability to robotics reward design—a fundamentally different task.
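The unification can be illustrated with a minimal sketch. This is not the repository's actual code: the class, its attribute names, and the toy policies are all hypothetical stand-ins chosen to show the one structural point that matters, namely that the improvement rule lives in the same mutable structure it edits, so it can rewrite the task logic or itself.

```python
class HyperAgent:
    """Toy illustration of a unified task + meta codebase.

    Both the task policy and the improvement policy are entries in one
    mutable "codebase" dict, standing in for editable source files
    (task_agent.py and meta_agent.py in the real repository).
    """

    def __init__(self):
        self.codebase = {
            "task_policy": lambda x: x,             # stand-in for task_agent.py
            "improve_policy": self.default_improve,  # stand-in for meta_agent.py
        }

    def solve(self, x):
        return self.codebase["task_policy"](x)

    def default_improve(self, score):
        # The improvement rule can rewrite the task policy...
        if score < 0.5:
            self.codebase["task_policy"] = lambda x: x * 2
        else:
            # ...or, metacognitively, replace itself with a new rule.
            self.codebase["improve_policy"] = lambda s: None

    def improve(self, score):
        self.codebase["improve_policy"](score)
```

Because `improve_policy` is just another editable entry, no separate higher-order optimizer is needed to change how improvement itself works.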
HyperAgents can be understood through three nested feedback loops, each operating at increasing levels of autonomy. The DGM-H architecture adds a critical third loop that distinguishes it from all prior self-improving agent systems.
- **Loop 1 (task execution):** The agent perceives inputs, reasons via foundation model calls, invokes external tools (simulators, compilers, web), and produces outputs. Comparable to standard ReAct-style tool-using LLM agents.
- **Loop 2 (evaluation):** Executable evaluation functions (unit tests, simulation scores, human-label alignment) produce measurable feedback signals. This loop enables systematic assessment of agent variants, analogous to SWE-bench's test-based pass/fail.
- **Loop 3 (metacognitive self-modification):** The system modifies its own task logic and its own improvement logic based on evaluation feedback, using an archive of historically successful variants as stepping stones. This loop is unique to HyperAgents.
```
Archive (population of historically successful agent variants)
        │
        ▼
Select Parent → Hyperagent Codebase (task + meta unified)
        │
        ▼
Metacognitive Self-Modify
  ┌────────────────────────────────┐
  │ Edit task_agent.py             │
  │ Edit meta_agent.py             │
  │ Edit evaluation logic          │
  │ Edit memory & tracking tools   │
  └────────────────────────────────┘
        │
        ▼
Validate on Domain(s)
(Polyglot, Paper Review, Robotics, IMO)
        │
        ▼
score + logs → Archive (if improved, add to population)
```
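The archive-based loop above can be sketched in a few lines of Python. This is a simplified illustration, not the repository's API: `evaluate` and `self_modify` are hypothetical stand-ins for domain validation and the metacognitive self-modification step, and parent selection is reduced to uniform sampling (the real system uses more structured selection strategies).

```python
import random

def dgm_h_loop(initial_agent, evaluate, self_modify, generations=50):
    """Simplified DGM-H outer loop (illustrative, not the official code).

    evaluate(agent) -> float score on the validation domain(s).
    self_modify(agent) -> child agent whose task *and* meta logic may differ.
    Improved children are kept in the archive as stepping stones.
    """
    archive = [(initial_agent, evaluate(initial_agent))]
    for _ in range(generations):
        # Parent selection: uniform over the archive for simplicity.
        parent, parent_score = random.choice(archive)
        child = self_modify(parent)        # metacognitive self-modification
        child_score = evaluate(child)      # validate on domain tasks
        if child_score > parent_score:     # add stepping stone if improved
            archive.append((child, child_score))
    return max(archive, key=lambda pair: pair[1])
```

Keeping the whole population, rather than only the current best, is what lets later generations build on variants that were not locally optimal when first discovered.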
The official facebookresearch/HyperAgents repository provides a runnable Python implementation with the following structure:
```
# HyperAgents Repository Structure
agent/                  # Foundation model interfaces (LLM, tools)
  base_agent.py         # AgentSystem base class
  llm.py                # LLM API abstractions
  llm_withtools.py      # Tool-augmented chat interface
  tools/                # Available tool implementations
domains/                # Multi-domain task implementations
  polyglot/             # Code patching (Exercism repos, 6 languages)
  paper_review/         # Academic paper accept/reject prediction
  genesis/              # Robotics reward design (Genesis simulator)
  imo/                  # Olympiad math grading (IMO-GradingBench)
  balrog/               # Game-playing evaluation domain
  search_arena/         # Search task domain
analysis/               # Plotting and analysis scripts
baselines/              # Baseline comparison implementations
utils/                  # Shared utilities (docker, git, logging)
meta_agent.py           # Meta agent: recursively self-improves the codebase
task_agent.py           # Task agent: solves a given task and returns predictions
generate_loop.py        # Main entry point for the DGM-H algorithm
ensemble.py             # Ensemble logic across archive members
select_next_parent.py   # Parent selection from archive
```
HyperAgents were tested across four fundamentally different domains to verify that the self-improvement mechanism generalizes beyond a single task type. Each domain uses distinct input/output formats, evaluation metrics, and train/validation/test splits.
| Domain | Task | Metric | Scale | Key Finding |
|---|---|---|---|---|
| Polyglot Programming | Repository + instruction → code patch | pass@1 | Train 60 / Test 165 | Strict evaluation with no ground-truth test feedback visible to the agent |
| Paper Review | Full paper → Accept / Reject | accuracy | Train 100 / Test 100 | From 0% (format parse failure) to significant accuracy via self-developed decision pipelines |
| Robotics Reward Design | Task description → Python reward function | task score | 6 repeats each | Zero-shot transfer to novel embodied tasks (Genesis + PPO training) |
| IMO Grading | Problem + solution + rubric → discrete score | accuracy | Train 100 / Test 100 | Transfer target: imp@50 = 0.630 from cross-domain meta-mechanism |
The paper review labels are sourced from real ICLR 2024/2025 and NeurIPS 2023/2024 acceptance decisions. The robotics domain uses the Genesis physics simulator to train quadruped robot policies via PPO, with the test requiring reward functions for entirely new locomotion objectives (e.g., maximizing torso height, a task unseen during training).
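In the robotics domain the agent's output is literally a Python reward function. The sketch below shows what such a function might look like for the unseen torso-height objective; the observation keys and weights are purely illustrative and do not reflect the Genesis simulator's actual interface.

```python
def torso_height_reward(obs: dict) -> float:
    """Illustrative reward for "maximize torso height" (hypothetical keys).

    Rewards torso height while lightly penalizing tilt and actuation
    effort, so PPO does not learn unstable or wasteful gaits.
    """
    height_term = obs["torso_height"]                  # primary objective
    upright_term = -abs(obs.get("torso_pitch", 0.0))   # stay level
    effort_term = -0.01 * sum(t * t for t in obs.get("joint_torques", []))
    return height_term + 0.1 * upright_term + effort_term
```

The zero-shot transfer test asks the agent to produce a function of this shape for objectives it never saw during training.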
One of the most striking observations from HyperAgents experiments is that the system does not merely adjust prompts or tune parameters. During the self-improvement loop, HyperAgents autonomously write new engineering infrastructure into their own codebase:
- **Persistent memory:** Agents develop storage mechanisms for consolidated insights, causal hypotheses, and forward-looking plans—written as code files that are actively retrieved and utilized in subsequent self-modification steps.
- **Performance tracking:** Cross-generational analysis modules emerge spontaneously, enabling the system to identify which modifications were effective and to refine its improvement strategy accordingly.
- **Resource planning:** The system develops mechanisms to allocate compute budgets across evaluation tasks and self-modification iterations, optimizing the use of limited inference resources.
- **Structured decision pipelines:** In the paper review domain, the agent evolved from zero performance (due to output format failures) to a sophisticated review pipeline with structured parsing, multi-criteria scoring, and decision logic.
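To make the persistent-memory idea concrete, here is a minimal sketch of the kind of module an agent might write into its own codebase. The class name, file name, and JSON schema are hypothetical; the emergent code in the actual experiments is model-generated and varies across runs.

```python
import json
from pathlib import Path

class InsightMemory:
    """Append-only store for insights that survive across generations.

    A self-improving agent could write a file like this into its own
    codebase, then retrieve past insights in later modification steps.
    """

    def __init__(self, path="insights.json"):
        self.path = Path(path)

    def record(self, generation: int, insight: str) -> None:
        entries = self.load()
        entries.append({"generation": generation, "insight": insight})
        self.path.write_text(json.dumps(entries, indent=2))

    def load(self) -> list:
        if self.path.exists():
            return json.loads(self.path.read_text())
        return []
```

The key property is persistence outside the model's context window: insights recorded in one generation remain available to every descendant that inherits the file.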
The HyperAgents paper provides a deliberately broad definition of "agent" as any computable program, optionally including foundation model calls, external tool usage, or learning components. Within this framework, the following taxonomy distinguishes hyperagents from related but distinct agent paradigms:
HyperAgents sit at the convergence of three research threads: self-modification theory, open-ended / quality-diversity exploration, and LLM-based agent engineering.
| Year | Milestone | Significance |
|---|---|---|
| 2003 | Gödel Machine (Schmidhuber) & self-modifying agent theory (Hutter) | Formal foundations for agents that rewrite themselves when provably beneficial |
| 2022 | ReAct: Reason + Act paradigm | Established alternating reasoning traces and actions for interactive LLM agents |
| 2023 | Toolformer; AgentBench | Self-supervised tool learning; systematic evaluation of LLM-as-agent capability |
| 2024 | GAIA (ICLR), OSWorld, WebArena, Mind2Web | Interactive evaluation benchmarks push agents toward real-world environments |
| 2025 | Darwin Gödel Machine (DGM); AI Scientist-v2 | Open-ended self-improvement in code domains (SWE-bench: 20% → 50%); end-to-end automated research |
| 2026 | HyperAgents (DGM-H) | Meta-mechanism becomes editable; cross-domain transfer of improvement capability demonstrated |
The Darwin Gödel Machine (Zhang et al., 2025) demonstrated open-ended self-improvement in code domains, achieving 50.0% on SWE-bench (up from 20.0%) and 30.7% on Polyglot (up from 14.2%). However, DGM relied on a hand-crafted, non-modifiable meta mechanism—the process for generating improvements was fixed by human designers. HyperAgents identified this as the key bottleneck: whenever DGM needed to operate in a new domain, the meta mechanism required manual re-engineering.
DGM-H removes this bottleneck by placing the meta mechanism inside the editable codebase. The result is a system that can transfer its ability to improve across fundamentally different domains, not just transfer task-specific knowledge.
The following comparison places HyperAgents alongside widely used agent frameworks. The first two rows are self-improving systems; the remaining rows are agent construction/orchestration frameworks that could, in principle, implement hyperagent capabilities with significant additional engineering.
| Framework | Positioning | License | HyperAgent Compatibility |
|---|---|---|---|
| facebookresearch/HyperAgents | Official DGM-H implementation; task + meta unified, self-modifiable | CC BY-NC-SA 4.0 | Native support: archive exploration, cross-domain evaluation, logging |
| jennyzzt/dgm | DGM predecessor: open-ended self-improvement, primarily code domains | Apache-2.0 | Partial: provides self-improvement + archive skeleton, but meta is fixed |
| SakanaAI/AI-Scientist-v2 | End-to-end research workflow; reviewer agent used as HyperAgents baseline | Open Source | Domain/baseline component; does not self-modify improvement mechanism |
| microsoft/autogen | Multi-agent collaborative programming framework | MIT | Requires building self-modification, evaluation, and archive selection |
| langchain / LangGraph | Agent engineering platform with durable execution | MIT | Viable execution/orchestration substrate; core self-modification not provided |
| crewAI | Role-playing multi-agent orchestration | MIT | Good for role-based collaboration; self-modification loop must be built separately |
To measure whether improvement capability itself transfers across domains, the HyperAgents paper introduces improvement@k (imp@k): the performance gain achieved by a fixed meta-agent after exactly k modification steps, relative to the initial agent. This metric separates the question "how good is the agent at the task?" from the more fundamental question "how good is the agent at getting better?"
In the paper's transfer experiment, a DGM-H meta-mechanism trained on paper review and robotics was frozen and applied to olympiad math grading (IMO-GradingBench). The hand-crafted DGM meta-mechanism achieved imp@50 = 0.0 on this new domain (zero improvement after 50 modification steps), while the transferred DGM-H hyperagent meta-mechanism achieved imp@50 = 0.630—demonstrating that the learned improvement strategy generalized to an entirely unseen task type.
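Under the definition above, imp@k reduces to a simple delta. The sketch below is one plausible reading (the paper may normalize or aggregate differently); `evaluate` and `modify` are hypothetical stand-ins for domain scoring and one self-modification step.

```python
def improvement_at_k(evaluate, modify, initial_agent, k: int) -> float:
    """imp@k: performance gain after exactly k self-modification steps,
    relative to the initial agent (illustrative reading of the metric).

    evaluate(agent) -> float score; modify(agent) -> next agent.
    """
    baseline = evaluate(initial_agent)
    agent = initial_agent
    for _ in range(k):
        agent = modify(agent)          # apply the frozen meta-mechanism
    return evaluate(agent) - baseline  # gain attributable to improvement
```

Note what this isolates: a meta-mechanism that scores imp@50 = 0.0 leaves the agent exactly where it started after 50 steps, regardless of how strong the initial agent already was.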
> "This repository involves executing untrusted, model-generated code. While it is highly unlikely that such code will perform overtly malicious actions under our current settings and with the models we use, it may still behave destructively due to limitations in model capability or alignment."
>
> — HyperAgents README, facebookresearch/HyperAgents
Self-modifying agent systems expand the attack surface compared to static agent deployments. The HyperAgents paper and repository explicitly acknowledge several risk categories:
- **Untrusted code execution:** Model-generated code may cause destructive side effects: file deletion, resource exhaustion, unauthorized access. Docker-based sandboxing and minimal privilege principles are essential.
- **Meta-level poisoning:** Unlike single-action prompt injection, an attacker could embed malicious logic into the meta-improvement mechanism, causing persistent contamination across future agent generations.
- **Evaluation overfitting:** Continuous self-improvement against fixed evaluation scripts may optimize for benchmark-specific exploits rather than genuine capability. Open-ended systems amplify this known benchmark gaming risk.
- **Output format fragility:** Initial agents in paper review scored 0% due to output parsing failures. Self-improving systems are highly sensitive to structured output protocols and robust parsing.
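As a concrete starting point for the sandboxing control, the helper below builds a locked-down `docker run` invocation for executing model-generated code. The image name, mount path, and entry point are placeholders; the Docker flags themselves (`--network none`, `--memory`, `--read-only`, `--user`) are standard isolation options, not anything specific to the HyperAgents repository.

```python
def sandboxed_run_command(code_dir: str, image: str = "agent-sandbox") -> list:
    """Build a restricted `docker run` command for untrusted agent code.

    Mitigations: no network, capped memory/CPU, read-only root
    filesystem, non-root user, code mounted read-only. The caller
    executes it, e.g. subprocess.run(cmd, timeout=600).
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",           # no outbound access
        "--memory", "2g",              # cap resource exhaustion
        "--cpus", "2",
        "--read-only",                 # immutable root filesystem
        "--user", "1000:1000",         # drop root inside the container
        "-v", f"{code_dir}:/work:ro",  # agent code mounted read-only
        image, "python", "/work/main.py",
    ]
```

A wall-clock timeout on the calling side is still required: resource caps bound memory and CPU share, not runtime.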
The research community and governance frameworks (NIST AI RMF 1.0, OWASP LLM Top 10, EU AI Act) converge on several essential controls for self-modifying agent systems: sandboxed execution, audit trails for self-modifications, and human review gates before promoting new agent generations.
Explore in-depth guides across every dimension of the HyperAgents ecosystem. Each topic provides focused coverage of a specific aspect of self-referential self-improving agent research and engineering.
- Deep dive into the Darwin Gödel Machine with Hyperagents: archive-based exploration, metacognitive self-modification loop, and the three-layer feedback system.
- How the meta agent rewrites itself: code-level modification strategies, the MetaAgent class implementation, and the elimination of hand-crafted improvement rules.
- The TaskAgent class, foundation model tool calling, JSON schema output parsing, and how task agents evolve within the self-improvement loop.
- Code patching across 6 programming languages using Exercism repositories. Evaluation via pass@1 without ground-truth feedback exposure.
- Predicting accept/reject decisions using ICLR and NeurIPS review data. From zero-score format failures to structured review pipelines via self-improvement.
- Generating Python reward functions for the Genesis physics simulator. PPO-trained quadruped policies and zero-shot transfer to novel locomotion objectives.
- Scoring olympiad-level mathematical solutions using IMO-GradingBench. The primary transfer target demonstrating imp@50 = 0.630 cross-domain capability.
- The improvement@k metric, transfer experiments from paper review + robotics to math grading, and why transferable meta-mechanisms matter for general AI.
- Population-based stepping stones, parent selection strategies, quality-diversity search, and how the archive drives cumulative self-improvement.
- Detailed comparison of DGM and DGM-H: fixed vs. editable meta mechanisms, code-domain vs. multi-domain, and SWE-bench / Polyglot performance trajectories.
- How self-improving agents autonomously develop persistent memory, performance tracking, resource planning, and structured decision pipelines.
- Sandboxing, audit trails, meta-level poisoning risks, evaluation overfitting, and alignment with NIST AI RMF, OWASP LLM Top 10, and EU AI Act.
- Classification of agent paradigms: tool-using LLM agents, multi-agent systems, meta-agents, and hyperagents. Where self-referential self-modification fits.
- SWE-bench, GAIA, WebArena, Mind2Web, OSWorld, AgentBench, Agent-SafetyBench, and IMO-GradingBench — comprehensive evaluation landscape.
- Environment setup, API key configuration (OpenAI, Anthropic, Google), Docker image building, initial agent bootstrapping, and running generate_loop.py.
If you use HyperAgents in your research, please cite the original paper:
```bibtex
@misc{zhang2026hyperagents,
  title={Hyperagents},
  author={Jenny Zhang and Bingchen Zhao and Wannan Yang and Jakob Foerster
          and Jeff Clune and Minqi Jiang and Sam Devlin and Tatiana Shavrina},
  year={2026},
  eprint={2603.19461},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2603.19461},
}
```