Yang Zhao

Frank

Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation

Jul 10, 2025

Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search

Jul 03, 2025

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Jun 23, 2025

Leveraging Reference Documents for Zero-Shot Ranking via Large Language Models

Jun 13, 2025

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Jun 11, 2025

SSS: Semi-Supervised SAM-2 with Efficient Prompting for Medical Imaging Segmentation

Jun 10, 2025

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Jun 10, 2025

SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Jun 05, 2025

CodeV-R1: Reasoning-Enhanced Verilog Generation

May 30, 2025

IKMo: Image-Keyframed Motion Generation with Trajectory-Pose Conditioned Motion Diffusion Model

May 27, 2025