Qwen-AgentWorld: Language World Models for General Agents

This paper introduces Qwen-AgentWorld, a family of language world models for developing general AI agents. The models simulate agentic environments by predicting how an environment responds to an agent's actions, with the aim of improving agent reasoning and planning.

Foundation Models: The core contribution includes Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B, described as the first language world models capable of simulating agentic environments across seven distinct domains using long chain-of-thought reasoning.
Data and Training: These models were trained on over 10 million environment interaction trajectories from real-world scenarios, using a three-stage pipeline: continued pre-training (CPT) for general-purpose world modeling, supervised fine-tuning (SFT) for next-state-prediction reasoning, and reinforcement learning (RL) with hybrid rubric-and-rule rewards for simulation fidelity.
Evaluation Benchmark: To rigorously assess language world models, the authors introduce AgentWorldBench, a comprehensive benchmark derived from real-world interactions of five frontier models across nine established benchmarks.
Performance: Empirical results show that Qwen-AgentWorld significantly outperforms existing frontier models on AgentWorldBench.
Dual Paradigm Enhancement: The research explores two ways world modeling improves general agents. As a decoupled environment simulator, it enables scalable and controllable simulation for agentic RL, with gains surpassing those of real-environment training. As a unified agent foundation model, world-model training acts as an effective warm-up that boosts downstream performance across seven agentic benchmarks.
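
To make the "decoupled environment simulator" idea and the "hybrid rubric-and-rule rewards" more concrete, here is a minimal Python sketch. It is not the authors' implementation. Every name in it (WorldModelEnv, toy_predictor, rule_reward, hybrid_reward) is hypothetical, the toy predictor is a hand-written stand-in for a real language world model, and the weighted-sum combination of rule and rubric scores is an assumption, since the summary does not say how the two signals are combined.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# One (action, predicted observation) pair per past step.
History = List[Tuple[str, str]]


@dataclass
class WorldModelEnv:
    """Wraps a next-state predictor so an agent can step through it like an environment.

    Hypothetical interface: predict_next_state(initial_state, history, action) -> observation.
    """
    predict_next_state: Callable[[str, History, str], str]
    initial_state: str
    history: History = field(default_factory=list)

    def reset(self) -> str:
        self.history = []
        return self.initial_state

    def step(self, action: str) -> str:
        # The world model, not a real environment, produces the next observation.
        observation = self.predict_next_state(self.initial_state, self.history, action)
        self.history.append((action, observation))
        return observation


def toy_predictor(initial_state: str, history: History, action: str) -> str:
    """Stand-in for a language world model: simulates a tiny shell with `touch` and `ls`."""
    files = set(initial_state.split())
    for past_action, _ in history:
        if past_action.startswith("touch "):
            files.add(past_action.split(" ", 1)[1])
    if action.startswith("touch "):
        return "exit_code=0 stdout="
    if action == "ls":
        return "exit_code=0 stdout=" + " ".join(sorted(files))
    return "exit_code=127 stdout="  # unknown command


def rule_reward(predicted: str, required_markers: List[str]) -> float:
    """Rule part: 1.0 only if every required marker appears in the prediction."""
    return 1.0 if all(marker in predicted for marker in required_markers) else 0.0


def hybrid_reward(predicted: str, required_markers: List[str],
                  rubric_score: float, rule_weight: float = 0.5) -> float:
    """Assumed combination: weighted sum of a rule check and a rubric score in [0, 1].

    In practice the rubric score would come from a grader; here it is passed in directly.
    """
    if not 0.0 <= rubric_score <= 1.0:
        raise ValueError("rubric_score must be in [0, 1]")
    return rule_weight * rule_reward(predicted, required_markers) + (1.0 - rule_weight) * rubric_score


if __name__ == "__main__":
    env = WorldModelEnv(toy_predictor, initial_state="a.txt")
    env.reset()
    env.step("touch b.txt")
    prediction = env.step("ls")
    print(prediction)
    print(hybrid_reward(prediction, ["exit_code=", "stdout="], rubric_score=0.8))
```

In this sketch the agent only ever talks to WorldModelEnv, so swapping the toy predictor for a call to a trained world model would not change the rollout loop. That separation is what the word "decoupled" is taken to mean here.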

In essence, Qwen-AgentWorld is presented as a framework for building more capable and adaptable AI agents by giving them an explicit model of how their operational environments respond to actions.

The Lowdown