Researchers from Tencent introduced R‑Zero, a framework that trains large language models without any external data. Existing self‑evolving methods still rely on human‑curated tasks and labels, which limits their scalability. R‑Zero instead starts from a single base model and assigns it two independent roles: a Challenger, which proposes tasks at the edge of the model’s capabilities, and a Solver, which must solve those tasks. The two roles co‑evolve: as the Solver improves, the Challenger proposes harder problems, producing a self‑improving curriculum from scratch. Empirically, the framework raised the Qwen3‑4B‑Base model’s average score by 6.49 points on math reasoning benchmarks and by 7.54 points on general‑domain reasoning benchmarks. The authors argue that such self‑evolving LLMs offer a scalable pathway toward models capable of reasoning beyond human‑curated datasets.
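To make the Challenger and Solver dynamic concrete, here is a toy sketch of the co‑evolution loop. It is not R‑Zero's actual training code: the numeric "skill" and "difficulty" values, the logistic success model, the `edge_reward` shape (highest when the Solver succeeds about half the time), and every function name are illustrative assumptions standing in for real model training.

```python
import math


def solve_probability(skill: float, difficulty: float) -> float:
    """Toy stand-in for the Solver: chance of solving a task of this difficulty."""
    return 1.0 / (1.0 + math.exp(difficulty - skill))


def edge_reward(success_rate: float) -> float:
    """Hypothetical Challenger reward: 1.0 at a 50% success rate, 0.0 at 0% or 100%."""
    return 1.0 - 2.0 * abs(success_rate - 0.5)


def challenger_propose(skill: float, candidates: list[float]) -> float:
    """Pick the candidate difficulty that sits closest to the edge of the Solver's ability."""
    return max(candidates, key=lambda d: edge_reward(solve_probability(skill, d)))


def co_evolve(rounds: int = 5, skill: float = 0.0, step: float = 0.5) -> list[tuple[float, float]]:
    """Alternate Challenger proposals and Solver updates; return (difficulty, skill) per round."""
    candidates = [i * 0.5 for i in range(21)]  # assumed difficulty grid: 0.0, 0.5, ..., 10.0
    history = []
    for _ in range(rounds):
        difficulty = challenger_propose(skill, candidates)
        p = solve_probability(skill, difficulty)
        # Assumption: the Solver learns most from tasks near the edge of its ability.
        skill += step * edge_reward(p)
        history.append((difficulty, skill))
    return history


if __name__ == "__main__":
    for difficulty, skill in co_evolve():
        print(f"proposed difficulty={difficulty:.1f}  solver skill={skill:.1f}")
```

In this toy setting the proposed difficulty climbs in step with the Solver's skill, which is the curriculum effect the paragraph above describes; a real system would replace the scalar skill with model updates and estimate the success rate from sampled answers.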
These four stories showcase different facets of the rapidly evolving AI landscape. Nano Banana demonstrates how new image models can capture public imagination before official announcements; GPT‑5 highlights both progress and community pushback; Anthropic’s Claude for Chrome signals a move toward AI‑driven web browsers with deep safety considerations; and R‑Zero points to novel training paradigms that could reduce dependence on human‑labeled data.