
Open-Source GLM 5.2 Outperforms Anthropic's Claude on Security Tasks
New benchmark results suggest Chinese-developed LLM excels at code vulnerability detection, challenging established players in specialized AI domains.
Papers, breakthroughs, benchmarks, and the long-arc trends shaping artificial intelligence. arXiv highlights and lab announcements, distilled.

New benchmark results suggest Chinese-developed LLM excels at code vulnerability detection, challenging established players in specialized AI domains.

ConlangCrafter demonstrates that machine learning can create novel language systems with internal coherence, opening new pathways for NLP research.

Researchers achieve state-of-the-art video matting results by decoupling tracking from fine-detail extraction, without requiring specialized video training data.

New study exposes how resource constraints and data distribution shifts impact the performance of advanced entity matching frameworks in real-world applications.

Researchers introduce geometric encoding method that improves consistency and camera control in AI-generated video sequences.

Researchers develop multilingual language model system to automatically extract political relationships from news archives at continental scale.

New approach trains AI models to self-correct prediction errors without expensive optimization, showing 10x accuracy gains on complex fluid dynamics problems.

Researchers replace flow-based models with autoregressive architecture, cutting energy prediction errors by 60% on protein structures.

New AI system simulates realistic physical behavior for rigid and elastic objects without explicit constraints, advancing world models for robotics and design.

A new self-guidance technique helps flow models produce varied outputs without slowing down inference or requiring external systems.

A new attention mechanism reduces noise in transformer models, boosting image recognition accuracy and expanding benefits across video and multimodal tasks.

A rigorous study reveals fundamental limits to improving LLM answers through decoding optimization.

Researchers tackle a fundamental flaw where multimodal AI systems rely too heavily on language patterns instead of actually analyzing visual content.

Researchers show language models can improve at programming tasks by learning from imperfect feedback, not just correct solutions.

New technique lets robots retain learned skills while adapting to new tasks, eliminating the need to store original training data.

Researchers demonstrate self-improving multimodal systems that enhance both image understanding and generation using only unlabeled data.

Researchers develop technique to unify text-to-image generation with editing capabilities without sacrificing quality.

Researchers solve a major challenge in deploying machine learning across industrial processes with fundamentally different physics.