Papers
-
DISPLAY: Directable Human-Object Interaction Video Generation via Sparse Motion Guidance and Multi-Task Auxiliary
-
GarmentPainter: Efficient 3D Garment Texture Synthesis with Character-Guided Diffusion Model
-
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
Baidu / Chinese Academy of Sciences, Peking University, Sun Yat-sen University, University of Chinese Academy of Sciences
-
GenHOI: Towards Object-Consistent Hand-Object Interaction with Temporally Balanced and Spatially Selective Object Injection
-
PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
