
Voice AI Systems Ignore Emotion Despite Detecting It
New research reveals leading real-time voice assistants fail to act on vocal cues they can perceive, creating risks in high-stakes interactions.
Papers, breakthroughs, benchmarks, and the long-arc trends shaping artificial intelligence. arXiv highlights and lab announcements, distilled.

New research reveals leading real-time voice assistants fail to act on vocal cues they can perceive, creating risks in high-stakes interactions.

Researchers develop geometric supervision technique that enhances consistency in AI-generated videos from single camera angles.

A popular technique for improving language model accuracy may inadvertently reduce output variety and hurt performance on unfamiliar tasks.

New framework lets users explore garments from any angle, advancing e-commerce and fashion technology.

New benchmark reveals how well LLMs can reconstruct hidden decision-making policies through behavioral observation and controlled experiments.

Researchers propose pretrained action modules to accelerate robot learning across different body types and real-world tasks.

Large-scale study reveals state-of-the-art tumor detection systems fail for underrepresented demographic groups, raising concerns about equitable medical AI deployment.

The fast, efficient AI model can now interact with computer interfaces, expanding automation possibilities for enterprise and consumer applications.

Researchers introduce automated method for matching training objectives to specific AI tasks, dramatically improving encoder-decoder models.

Researchers develop a framework enabling robots to autonomously acquire manipulation abilities by decomposing tasks into learnable primitives.

A technical guide to selecting the right embeddings index for retrieval-augmented generation systems.

Researchers show how to flatten spherical imagery for depth estimation, enabling real-time 3D perception in omnidirectional systems.

New study exposes flaws in evaluating AI-enhanced communication devices for people with speech disabilities.

New benchmark reveals that methods improving class-conditional generation don't necessarily excel at text-to-image tasks, forcing a rethink of how diffusion transformers are evaluated.

Researchers demonstrate how reinforcement learning can teach multimodal AI systems to solve complex numerical problems through code generation.

Randomized YaRN technique enables language models to reason effectively across context windows 16 times larger than training data.

Researchers introduce Lift4D, a technique that combines 3D modeling with AI priors to capture dynamic deformations and hidden surfaces from monocular footage.

New training method enables continuous dexterous manipulation during locomotion, moving beyond stop-and-grasp limitations.