Tencent
Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.
Shenzhen, China
🇨🇳
AI Native: No
Number of tools: 1
Number of employees: 110k
Profitable: Yes
Valuation: $591B
Most popular AI tool
Models
-
Penguin-VL-2B is a compact vision-language model that uses an LLM-based vision encoder to push efficiency limits in multimodal reasoning. (New; Multimodal; released 12d ago)
-
Penguin-VL-2B is a compact vision-language model that uses an LLM-based vision encoder to push efficiency limits in multimodal reasoning. (New; Multimodal; released 17d ago)
-
HunyuanImage-3.0-Instruct is Tencent’s 80B MoE image-editing model that first reasons over an input image and text, then performs stable one-line edits and multi-image fusion for layout-safe poster, logo and product updates. (New; Image; released 1mo ago)
-
Tencent HY-Motion 1.0 is an open text-to-3D human motion generator using billion-scale Diffusion Transformer and flow matching, turning prompts into skeleton-based 3D animations ready for game and character pipelines. (New; Image; released 2mo ago)
-
Tencent-HY-MT1.5 is a multilingual machine-translation family with 1.8B and 7B models, supporting 33 languages plus dialects and advanced features like terminology control, context-aware and format-preserving translation. (New; Text; released 2mo ago)
-
Tencent Hunyuan 3D, powered by the Hunyuan 3D Generate Large Model 2.5, is a 3D AI engine that turns text, images and sketches into low-poly 3D assets and scenes, supporting text-to-3D, image-to-3D, animation and texture generation in the browser. (Image; released 3mo ago)
-
HunyuanOCR is Tencent Hunyuan’s 1B parameter end-to-end OCR expert VLM. It reads documents, screenshots, and video frames, handling text detection, recognition, layout parsing, information extraction, subtitles, and photo translation in one shot, with strong multilingual support and state-of-the-art accuracy. (Text; released 3mo ago)
-
HunyuanVideo-1.5 is Tencent's 8.3B-parameter open-source video diffusion model for text-to-video and image-to-video generation, delivering high-quality, stable motion clips while running efficiently on consumer-grade GPUs. (Video; released 4mo ago)
-
HunyuanWorld Mirror is a scene-reconstruction and world-modeling system. It turns photos and videos into a consistent digital twin that you can explore, edit, and render, with export to common 3D formats for simulation, virtual production, and design. (Image; released 5mo ago)
-
HunyuanImage 3.0 is Tencent’s next-gen text-to-image model. It delivers sharper detail, stronger style and identity consistency, improved typography, and precise, in-place editing, built for fast iteration from concept to production-ready visuals. (Image; released 5mo ago)
-
(3D; released 5mo ago)
-
Hunyuan-MT-7B is Tencent’s compact 7B multilingual model tuned for fast, accurate translation and cross-lingual tasks. It supports long-context inputs, preserves formatting and terminology, and outputs structured JSON for enterprise pipelines. (Text; released 6mo ago)
Papers
-
Evaluating Generative Models via One-Dimensional Code Distributions (Fudan University, Institute for Artificial Intelligence, Peking University; published 2026-03-09; 5 authors)
-
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion (Chinese Academy of Sciences Institute of Automation, Nanjing University; published 2026-03-06; 9 authors)
-
OneRanker: Unified Generation and Ranking with One Model in Industrial Advertising Recommendation (published 2026-03-03; 1 author)
-
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing (published 2026-03-03; 1 author)
-
RubricBench: Aligning Model-Generated Rubrics with Human Standards (University of Illinois Springfield; published 2026-03-02; 1 author)
-
WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories (Zhejiang University; published 2026-03-02; 1 author)
-
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression (published 2026-02-26; 1 author)
-
The Art of Efficient Reasoning: Data, Reward, and Optimization (The University of Hong Kong; published 2026-02-25; 1 author)
-
Haitao Lin (Fudan University; published 2026-02-23; 1 author)
-
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation (Wuhan University; published 2026-02-20; 1 author)
-
GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training (Peking University; published 2026-02-15; 1 author)
-
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention (published 2026-02-15; 1 author)
-
Gradients Must Earn Their Influence: Unifying SFT with Generalized Entropic Objectives (Harbin Institute of Technology; published 2026-02-11; 1 author)
-
MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE (Monash University; published 2026-02-09; 1 author)
-
RISE-Video: Can Video Generators Decode Implicit World Rules? (Shanghai Jiao Tong University; published 2026-02-05; 1 author)
-
BlossomRec: Block-level Fused Sparse Attention Mechanism for Sequential Recommendations (City University of Hong Kong; published 2026-02-03; 1 author)
-
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution (Shanghai Jiao Tong University; published 2026-02-03; 1 author)
-
HY3D-Bench: Generation of 3D Assets (published 2026-02-03; 1 author)
-
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models (Tsinghua University; published 2026-02-02; 11 authors)
-
HunyuanImage 3.0 Technical Report (published 2026-02-02; 1 author)
-
MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models (published 2026-02-02; 1 author)
-
AlignGemini: Generalizable AI-Generated Image Detection Through Task-Model Alignment (published 2026-01-30; 1 author)
-
PI-Light: Physics-Inspired Diffusion for Full-Image Relighting (Nanyang Technological University; published 2026-01-29; 1 author)
-
Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding (published 2026-01-28; 1 author)
-
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision (published 2026-01-27; 1 author)
-
RISER: Orchestrating Latent Reasoning Skills for Adaptive Activation Steering (Tongji University; published 2026-01-19; 1 author)
-
FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments (Singapore University of Technology and Design; published 2026-01-09; 1 author)
-
UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos (Shanghai University of Finance and Economics; published 2026-01-09; 1 author)
-
Rotate Your Character: Revisiting Video Diffusion Models for High-Quality 3D Character Generation (The University of Hong Kong; published 2026-01-09; 1 author)
-
One Language-Free Foundation Model Is Enough for Universal Vision Anomaly Detection (published 2026-01-09; 1 author)
-
DocDancer: Towards Agentic Document-Grounded Information Seeking (Peking University; published 2026-01-08; 1 author)
-
Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing (Institute of Information Engineering; published 2026-01-08; 1 author)
-
FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning (The Hong Kong Polytechnic University; published 2026-01-07; 1 author)
-
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents (published 2026-01-06; 1 author)
-
A Versatile Multimodal Agent for Multimedia Content Generation (University of Rochester; published 2026-01-06; 1 author)
-
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models (published 2026-01-05; 1 author)
-
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection (Singapore Management University; published 2025-12-30; 1 author)
-
HY-MT1.5 Technical Report (published 2025-12-30; 1 author)
-
D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning (Shanghai Jiao Tong University; published 2025-12-26; 1 author)
-
Streaming Video Instruction Tuning (Hong Kong Baptist University; published 2025-12-24; 1 author)
-
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing (The Chinese University of Hong Kong; published 2025-12-18; 1 author)
-
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models (Hong Kong University of Science and Technology; published 2025-12-18; 1 author)
-
AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path (Australian National University; published 2025-12-15; 1 author)
-
Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation (published 2025-12-15; 1 author)
-
Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10 (published 2025-12-15; 1 author)
-
GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training (Tsinghua University; published 2025-12-15; 1 author)
-
Distribution Matching Variational AutoEncoder (Peking University; published 2025-12-08; 1 author)
-
HunyuanVideo 1.5 Technical Report (published 2025-10-25; 1 author)
-
Training-Free Group Relative Policy Optimization (Fudan University, Xiamen University; published 2025-10-09; 13 authors)
Repositories
No repositories yet.
