Latest Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Yudong Jin, Sida Peng, Xuan Wang, et al.
a month ago

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner
Zhouqi Hua, Wenwei Zhang, Chengqi Lyu, et al.
a month ago

π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Yifan Wang, Jianjun Zhou, Haoyi Zhu, et al.
a month ago

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
Senqiao Yang, Junyi Li, Xin Lai, et al.
a month ago

A Survey of Context Engineering for Large Language Models
Lingrui Mei, Jiayu Yao, Yuyao Ge, et al.
a month ago

Assessing adaptive world models in machines with novel games
Lance Ying, Katherine M. Collins, Prafull Sharma, et al.
a month ago

Emotional Support with LLM-based Empathetic Dialogue Generation
Shiquan Wang, Ruiyu Fang, Zhongjiang He, et al.
a month ago

DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering
Yinsheng Li, Zhen Dong, Yi Shao
a month ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
Xinyi He, Qian Liu, Mingzhe Du, et al.
a month ago

MOSPA: Human Motion Generation Driven by Spatial Audio
Shuyang Xu, Zhiyang Dou, Mingyi Shi, et al.
a month ago

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding
Renjie Li, Ruijie Ye, Mingyang Wu, et al.
a month ago

PhysX: Physical-Grounded 3D Asset Generation
Ziang Cao, Zhaoxi Chen, Liang Pan, et al.
a month ago

Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
Yangning Li, Weizhi Zhang, Yuyao Yang, et al.
a month ago

La-Proteina: Atomistic Protein Generation via Partially Latent Flow Matching
Tomas Geffner, Kieran Didi, Zhonglin Cao, et al.
a month ago

SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics
Qingtian Zhu, Yumin Zheng, Yuling Sang, et al.
a month ago

XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge
Wuxin Wang, Weicheng Ni, Lilan Huang, et al.
a month ago

AgentsNet: Coordination and Collaborative Reasoning in Multi-Agent LLMs
Florian Grötschla, Luis Müller, Jan Tönshoff, et al.
a month ago

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
Yilun Zhao, Chengye Wang, Chuhan Li, et al.
a month ago

Scaling Laws for Optimal Data Mixtures
Mustafa Shukor, Louis Bethune, Dan Busbridge, et al.
a month ago

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes
LG AI Research, Kyunghoon Bae, Eunbi Choi, et al.
a month ago

Subject-Consistent and Pose-Diverse Text-to-Image Generation
Zhanxin Gao, Beier Zhu, Liang Yao, et al.
a month ago

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models
Tiezheng Zhang, Yitong Li, Yu-cheng Chou, et al.
a month ago

DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion
Jin Li, Zezhong Ding, Xike Xie
a month ago

CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking
Yuehao Huang, Liang Liu, Shuangming Lei, et al.
a month ago

LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Jingze Zhu, Yongliang Wu, Wenbo Zhu, et al.
a month ago

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Sangmin Bae, Yujin Kim, Reza Bayat, et al.
a month ago

REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once
Zhuoshi Pan, Qizhi Pei, Yu Li, et al.
a month ago

EmbRACE-3K: Embodied Reasoning and Action in Complex Environments
Mingxian Lin, Wei Huang, Yitang Li, et al.
a month ago

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
Mingqi Wu, Zhihao Zhang, Qiaole Dong, et al.
a month ago

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation
Youliang Zhang, Zhaoyang Li, Duomin Wang, et al.
a month ago