
TensorRT Edge-LLM

NVIDIA / TensorRT-Edge-LLM

High-performance, light-weight C++ LLM and VLM Inference Software for Physical AI

452 80 Language: Python License: Apache-2.0 Updated: 17h ago

README

<div align="center">

TensorRT Edge-LLM

High-Performance Large Language Model Inference Framework for NVIDIA Edge Platforms


Overview | Examples | Documentation | Roadmap


<div align="left">

Overview

TensorRT Edge-LLM is NVIDIA's high-performance C++ inference runtime for Large Language Models (LLMs) and Vision-Language Models (VLMs) on embedded platforms. It enables efficient deployment of state-of-the-art language models on resource-constrained devices such as NVIDIA Jetson and NVIDIA DRIVE platforms. TensorRT Edge-LLM provides Python scripts that convert Hugging Face checkpoints to ONNX. Engine build and end-to-end inference run entirely on edge platforms.
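
The workflow described above can be pictured as three stages. The following is only an illustrative sketch, not part of the TensorRT Edge-LLM API: the `Stage` class, the `PIPELINE` tuple, the `stages_on` helper, and the stage names are all hypothetical. The README states that engine build and inference run on the edge device; the `"host"` label for the export step is an assumption, since the README does not say where the conversion scripts run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the deployment workflow (hypothetical model, not a real API)."""
    name: str
    runs_on: str   # "host" or "edge"; "host" for export is an assumption
    produces: str


# The three steps named in the overview, in order.
PIPELINE = (
    Stage("export", "host", "ONNX model"),       # checkpoint -> ONNX via Python scripts
    Stage("build", "edge", "TensorRT engine"),   # engine build on the target device
    Stage("infer", "edge", "generated text"),    # end-to-end inference on the device
)


def stages_on(location: str) -> list[str]:
    """Return the names of the stages that run at the given location."""
    return [stage.name for stage in PIPELINE if stage.runs_on == location]


if __name__ == "__main__":
    for stage in PIPELINE:
        print(f"{stage.name:>6} on {stage.runs_on}: produces {stage.produces}")
```

For the actual scripts, commands, and supported options, refer to the Quick Start Guide.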


Getting Started

For the supported platforms, models, and precisions, see the Overview. Get started with TensorRT Edge-LLM in under 15 minutes. For complete installation and usage instructions, see the Quick Start Guide.


Documentation

Introduction

User Guide

Developer Guide

Software Design

Advanced Topics


Use Cases

🚗 Automotive

  • In-vehicle AI assistants
  • Voice-controlled interfaces
  • Scene understanding
  • Driver assistance systems

🤖 Robotics

  • Natural language interaction
  • Task planning and reasoning
  • Visual question answering
  • Human-robot collaboration

๐Ÿญ Industrial IoT

  • Equipment monitoring with NLP
  • Automated inspection
  • Predictive maintenance
  • Voice-controlled machinery

📱 Edge Devices

  • On-device chatbots
  • Offline language processing
  • Privacy-preserving AI
  • Low-latency inference

Tech Blogs

Coming soon

Stay tuned for technical deep-dives, optimization guides, and deployment best practices.


Latest News

  • [01/05] 🚀 Accelerate AI Inference for Edge and Robotics with NVIDIA Jetson T4000 and NVIDIA JetPack 7.1 ✨ ➡️ link
  • [01/05] 🚀 Accelerating LLM and VLM Inference for Automotive and Robotics with NVIDIA TensorRT Edge-LLM ✨ ➡️ link

Follow our GitHub repository for the latest updates, releases, and announcements.


Support


License

Apache License 2.0


Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

