
HKUDS / DiffGraph

[WSDM'2025] "DiffGraph: Heterogeneous Graph Diffusion Model"

68 9 Language: Python Updated: 12d ago

README

# 🌌 DiffGraph: Heterogeneous Graph Diffusion Model

WSDM 2025 | PyTorch | DGL | Python

✨ 🔥 Heterogeneous Graph Intelligence | ⚡ Latent Diffusion | 🌊 Noise Denoising 🌊 ✨

**🌟 Advancing Heterogeneous Graph Intelligence through Novel Latent Diffusion Strategies**

[![arXiv](https://img.shields.io/badge/arXiv-2501.02313-B31B1B?style=flat-square&logo=arxiv)](https://arxiv.org/abs/2501.02313)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-181717?style=flat-square&logo=github)](https://github.com/HKUDS/DiffGraph)
[![License](https://img.shields.io/badge/License-MIT-yellow?style=flat-square)](LICENSE)
[![Stars](https://img.shields.io/github/stars/HKUDS/DiffGraph?style=flat-square&color=FFD700)](https://github.com/HKUDS/DiffGraph/stargazers)

---

## 🎯 **Mission Statement**

> *"In the labyrinth of heterogeneous data, where noise corrupts truth and complexity obscures patterns, DiffGraph emerges as the quantum leap in graph intelligence - wielding the power of latent diffusion to transform chaos into clarity."*

## 🧠 Neural Architecture Overview

[Figure: DiffGraph Architecture]

🔬 *The Heterogeneous Graph Diffusion Pipeline: From Noisy Reality to Pure Intelligence*

### 🌟 Core Innovation Matrix

| 🔥 Component | 🎮 Technology | 🎯 Breakthrough |
|--------------|---------------|-----------------|
| Latent Diffusion Engine | Gaussian Noise Injection + Progressive Denoising | Eliminates heterogeneous noise while preserving semantic integrity |
| Cross-View Semantic Fusion | Auxiliary-to-Target Graph Transformation | Maximizes mutual information across graph modalities |
| Quantum GCN Layers | Multi-relational Message Passing | Captures complex heterogeneous transitions |
| Neural Denoising Network | Time-Conditioned MLP Architecture | Reconstructs pure graph representations |

## 🚀 Performance Overview

### 📊 **Main Results Summary**

| **Task** | **Dataset** | **Best Baseline** | **DiffGraph** | **Improvement** |
|----------|-------------|-------------------|---------------|-----------------|
| **Link Prediction** | Tmall | 0.0463 (R@20) | **0.0589** | +27.21% ⚡ |
| | Retail Rocket | 0.0524 (R@20) | **0.0620** | +18.32% 🚀 |
| | IJCAI | 0.0136 (R@20) | **0.0171** | +25.74% 💎 |
| **Node Classification** | DBLP | 91.97% (Micro-F1) | **93.81%** | +2.00% 📈 |
| | AMiner | 82.46% (Micro-F1) | **83.29%** | +1.01% 🎯 |
| | Industry | 79.82% (AUC) | **80.25%** | +0.54% 💪 |
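
The Improvement column is consistent with a relative gain over the best baseline, (DiffGraph - baseline) / baseline. The snippet below is a small check under that assumption; the helper name `relative_improvement` is illustrative and not part of the repository.

```python
def relative_improvement(baseline: float, ours: float) -> float:
    """Relative gain of `ours` over `baseline`, in percent."""
    return (ours - baseline) / baseline * 100.0


# (best baseline, DiffGraph) pairs copied from the summary table.
results = {
    "Tmall R@20": (0.0463, 0.0589),
    "Retail Rocket R@20": (0.0524, 0.0620),
    "IJCAI R@20": (0.0136, 0.0171),
    "DBLP Micro-F1": (91.97, 93.81),
    "AMiner Micro-F1": (82.46, 83.29),
    "Industry AUC": (79.82, 80.25),
}

for name, (baseline, ours) in results.items():
    print(f"{name}: +{relative_improvement(baseline, ours):.2f}%")
```

Note that the gains are relative, not absolute: the DBLP row moves 1.84 points of Micro-F1, which is about 2% of the baseline score.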

### 📈 Detailed Experimental Analysis

<details>
<summary><b>🔍 Click to expand detailed results</b></summary>

| **Dataset** | **Metric** | **MATN** | **HGT** | **MBGCN** | **DiffGraph** | **Gain** |
|-------------|------------|----------|---------|-----------|---------------|----------|
| **Tmall** | Recall@20 | 0.0463 | 0.0431 | 0.0419 | **0.0589** | +27.21% |
| | NDCG@20 | 0.0197 | 0.0192 | 0.0179 | **0.0274** | +39.09% |
| **Retail Rocket** | Recall@20 | 0.0524 | 0.0413 | 0.0492 | **0.0620** | +18.32% |
| | NDCG@20 | 0.0302 | 0.0250 | 0.0258 | **0.0367** | +21.52% |
| **IJCAI** | Recall@20 | 0.0136 | 0.0126 | 0.0112 | **0.0171** | +25.74% |
| | NDCG@20 | 0.0054 | 0.0051 | 0.0045 | **0.0063** | +16.67% |
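
For readers unfamiliar with the two ranking metrics, here is a minimal sketch of Recall@K and NDCG@K for a single user with binary relevance. These are the common textbook definitions; the repository's evaluation code may differ in details (for example, some implementations normalize recall by `min(K, number of relevant items)`), so treat the function names and conventions here as assumptions.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked_items: Sequence[int], relevant: Set[int], k: int = 20) -> float:
    """Fraction of the user's held-out items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items: Sequence[int], relevant: Set[int], k: int = 20) -> float:
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    if not relevant:
        return 0.0
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


# Toy example: items 2 and 7 are held out, and only item 2 is retrieved.
ranking = [5, 2, 9, 1]
held_out = {2, 7}
print(recall_at_k(ranking, held_out, k=3))
print(ndcg_at_k(ranking, held_out, k=3))
```

Dataset-level scores such as those in the table are typically the mean of these per-user values over all test users.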

#### 🎯 Node Classification - Best Results

| **Dataset** | **Setting** | **Best Baseline** | **DiffGraph** | **Metric** |
|-------------|-------------|-------------------|---------------|------------|
| **DBLP** | 60 per class | HeCo: 91.59±0.2 | **93.81±0.3** | Micro-F1 |
| | 60 per class | HeCo: 98.59±0.1 | **99.21±0.1** | AUC |
| **AMiner** | 40 per class | HeCo: 80.53±0.7 | **83.29±1.3** | Micro-F1 |
| | 40 per class | HeCo: 92.11±0.6 | **94.41±0.8** | AUC |
| **Industry** | Full dataset | HGT: 0.7982 | **0.8025** | AUC |

</details>


## 🏗️ System Architecture

🌌 DiffGraph Neural Framework
├── 🔥 DiffGraph-Rec/               # Link Prediction Engine
│   ├── 🧠 Model.py                 # Core HGDM Implementation
│   ├── 📊 DataHandler.py           # Multi-behavior Data Processing
│   ├── ⚙️ main.py                  # Training & Evaluation Pipeline
│   ├── 🎛️ params.py                # Hyperparameter Configuration
│   ├── 🗂️ data/                    # Heterogeneous Datasets
│   │   ├── tmall/                  # E-commerce Multi-behavior
│   │   ├── retail_rocket/          # Transaction Networks
│   │   └── ijcai_15/               # Competition Benchmark
│   └── 🛠️ Utils/                   # Neural Utilities
├── 🎯 DiffGraph_NC/                # Node Classification Engine
│   ├── 🧠 Model.py                 # Academic Network HGDM
│   ├── 📊 DataHandler.py           # Citation Network Processing
│   ├── ⚙️ main.py                  # Classification Pipeline
│   ├── 🎛️ params.py                # Configuration Matrix
│   ├── 🗂️ data/                    # Academic Datasets
│   │   ├── dblp/                   # Database & AI Publications
│   │   └── aminer/                 # Research Network
│   └── 🛠️ Utils/                   # Classification Tools
└── 📖 README.md                    # This Neural Manual

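## 🔬 Scientific Foundation

### 📜 Mathematical Formulation

Latent Heterogeneous Graph Diffusion Process:

$$
\mathcal{G}_s^{*} \overset{\pi}{\leftrightsquigarrow} \mathbf{E}_s^{*} \xrightarrow{\phi} \tilde{\mathbf{E}}_s^{*} \xrightarrow{\phi'} \tilde{\mathbf{E}}_s^{*} \overset{\pi'}{\leftrightsquigarrow} \tilde{\mathcal{G}}_s^{*}
$$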

Forward Diffusion Trajectory:

$$
q(\mathcal{H}_t \mid \mathcal{H}_{t-1}) = \mathcal{N}\left(\mathcal{H}_t;\ \sqrt{1-\beta_t}\,\mathcal{H}_{t-1},\ \beta_t \mathbf{I}\right)
$$
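
To make the forward trajectory concrete, the sketch below applies the Gaussian step to a plain list of floats using only the standard library. The linear schedule, its default endpoints, and all function names are illustrative assumptions; the repository works on PyTorch tensors and its actual schedule is set in `params.py`. The closed-form shortcut, which samples a noisy state directly from the clean one with the cumulative product of (1 - beta), is a standard property of this Gaussian chain.

```python
import math
import random
from typing import List


def linear_beta_schedule(num_steps: int, beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> List[float]:
    """Hypothetical linear noise schedule (assumed, not taken from the repo)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_step(h_prev: List[float], beta_t: float, rng: random.Random) -> List[float]:
    """One draw from q(H_t | H_{t-1}) = N(sqrt(1 - beta_t) * H_{t-1}, beta_t * I)."""
    scale = math.sqrt(1.0 - beta_t)
    sigma = math.sqrt(beta_t)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in h_prev]


def alpha_bar(betas: List[float], t: int) -> float:
    """Cumulative product of (1 - beta_s) for s = 1..t (t is 1-indexed)."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def forward_closed_form(h0: List[float], betas: List[float], t: int,
                        rng: random.Random) -> List[float]:
    """Sample H_t directly from H_0: N(sqrt(abar_t) * H_0, (1 - abar_t) * I)."""
    abar = alpha_bar(betas, t)
    return [math.sqrt(abar) * x + math.sqrt(1.0 - abar) * rng.gauss(0.0, 1.0)
            for x in h0]


# Noise a toy 3-dimensional embedding over 5 steps.
rng = random.Random(0)
betas = linear_beta_schedule(5)
h = [1.0, -0.5, 0.25]
for beta_t in betas:
    h = forward_step(h, beta_t, rng)
print(h)
```

Iterating `forward_step` and calling `forward_closed_form` draw from the same distribution; the closed form is what training loops usually use, since it avoids simulating every intermediate step.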

Reverse Denoising Process:

$$
p(\mathcal{H}_{t-1} \mid \mathcal{H}_t) = \mathcal{N}\left(\mathcal{H}_{t-1};\ \mu_\theta(\mathcal{H}_t, t),\ \Sigma_\theta(\mathcal{H}_t, t)\right)
$$
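
The reverse process can be sketched as a loop that repeatedly samples from the predicted Gaussian. In the sketch below, `mean_fn` and `var_fn` are hypothetical stand-ins for the learned mean and (diagonal) covariance; in DiffGraph the mean would come from the trained denoising network, which is not reproduced here. Skipping the noise on the final step is a common convention in diffusion samplers and is assumed rather than taken from the repository.

```python
import math
import random
from typing import Callable, List

Vector = List[float]
ParamFn = Callable[[Vector, int], Vector]


def reverse_step(h_t: Vector, t: int, mean_fn: ParamFn, var_fn: ParamFn,
                 rng: random.Random) -> Vector:
    """Draw H_{t-1} ~ N(mean_fn(h_t, t), diag(var_fn(h_t, t)))."""
    mean = mean_fn(h_t, t)
    var = var_fn(h_t, t)
    if t == 1:
        # Assumed convention: return the mean on the last step (no added noise).
        return list(mean)
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]


def denoise(h_T: Vector, num_steps: int, mean_fn: ParamFn, var_fn: ParamFn,
            rng: random.Random) -> Vector:
    """Run the reverse chain from t = num_steps down to t = 1."""
    h = list(h_T)
    for t in range(num_steps, 0, -1):
        h = reverse_step(h, t, mean_fn, var_fn, rng)
    return h


# Toy stand-ins for the learned networks: shrink towards zero, small fixed variance.
toy_mean = lambda h, t: [0.9 * x for x in h]
toy_var = lambda h, t: [0.01 for _ in h]

print(denoise([2.0, -1.0], num_steps=4, mean_fn=toy_mean, var_fn=toy_var,
              rng=random.Random(0)))
```

Many implementations fix the covariance to a schedule-dependent constant and only learn the mean, which is what the constant `toy_var` imitates.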

### 🎯 Core Contributions

  1. 🌟 **Latent Space Revolution**: First heterogeneous graph diffusion in latent space, solving discrete graph generation challenges
  2. 🔄 **Cross-View Intelligence**: Novel auxiliary-to-target semantic transformation mechanism
  3. 🛡️ **Noise Resilience**: Superior robustness against heterogeneous data corruption
  4. ⚡ **Scalable Architecture**: Linear complexity with heterogeneous relation types
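
To illustrate why relation-wise message passing can scale linearly, the sketch below runs one propagation round over a toy heterogeneous graph: messages are averaged within each relation type and then summed across relation types, so every edge is visited exactly once. This is a generic scheme written for illustration, not the layer implemented in `Model.py`; the function name, the mean-then-sum aggregation, and the toy relation names are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Embedding = List[float]
Edges = List[Tuple[int, int]]  # (source node, target node)


def propagate(embeddings: Dict[int, Embedding],
              relations: Dict[str, Edges]) -> Dict[int, Embedding]:
    """One round: mean over neighbours per relation, summed across relations."""
    dim = len(next(iter(embeddings.values())))
    out = {node: [0.0] * dim for node in embeddings}
    for edges in relations.values():
        sums = defaultdict(lambda: [0.0] * dim)
        counts = defaultdict(int)
        for src, dst in edges:  # each edge is touched once
            counts[dst] += 1
            for i, value in enumerate(embeddings[src]):
                sums[dst][i] += value
        for dst, total in sums.items():
            for i in range(dim):
                out[dst][i] += total[i] / counts[dst]
    return out


# Toy graph: node 2 receives two "view" edges and one "buy" edge.
node_embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 0.0]}
typed_edges = {"view": [(0, 2), (1, 2)], "buy": [(0, 2)]}
print(propagate(node_embeddings, typed_edges)[2])
```

The total work is proportional to the number of edges summed over all relation types times the embedding dimension, which is one way to read the "linear complexity" claim.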

## 📊 Datasets & Benchmarks

| **Task** | **Dataset** | **Scale** | **Domain** |
|----------|-------------|-----------|------------|
| **Link Prediction** | Tmall | 31K users, 31K items | E-commerce Multi-behavior |
| | Retail Rocket | 2K users, 30K items | Transaction Networks |
| | IJCAI-15 | 17K users, 36K items | Competition Benchmark |
| **Node Classification** | DBLP | 26K nodes, 4 classes | Academic Publications |
| | AMiner | 56K nodes, 4 classes | Research Networks |
| | Industry | 2M+ users | Gaming Platform |

*Complete dataset details available in paper appendix*

## 🔬 Component Analysis

| **Analysis Type** | **Key Finding** | **Performance Impact** |
|-------------------|-----------------|------------------------|
| **🧩 Ablation Study** | Diffusion module crucial | -11.0% without diffusion |
| **⚙️ Hyperparameters** | Optimal: 64-dim, 3-layers | Best at moderate complexity |
| **🛡️ Noise Robustness** | Superior resilience | 50% less degradation vs baselines |
| **⚡ Efficiency** | 2.6x faster training | Computational advantage |
| **📊 Data Sparsity** | Consistent gains | +31.4% on sparse data |

<details>
<summary><b>📊 Click to view detailed analysis</b></summary>

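#### 🧩 Ablation Study

| Variant | Description | Tmall R@20 | Change |
|---------|-------------|------------|--------|
| DiffGraph | Full model | 0.0589 | - |
| -D | Remove diffusion | 0.0524 | -11.0% |
| -H | Remove heterogeneous | 0.0463 | -21.4% |
| DAE | Replace w/ autoencoder | 0.0531 | -9.8% |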

#### 🛡️ Noise Robustness (50% Noise)

| Behavior | DiffGraph Retention | HGT Retention |
|----------|---------------------|---------------|
| Page View | 97.42% | 95.59% |
| Favorite | 98.62% | 97.22% |
| Cart | 96.73% | 95.82% |

#### 📊 Data Sparsity Impact

  • Sparse Users (< 8 interactions): +31.4% improvement
  • Medium Users (< 35 interactions): +25.1% improvement
  • Active Users (< 120 interactions): +19.4% improvement

</details>


## 🏆 Competitive Analysis

### 🎯 **Performance Advantage**

| **Category** | **Baseline Methods** | **DiffGraph Improvement** |
|--------------|---------------------|---------------------------|
| **📊 Link Prediction** | MATN, HGT, MBGCN | +15-40% Recall@20 |
| **🎯 Node Classification** | HeCo, HAN, HGT | +1-2% Micro-F1 |
| **🛡️ Noise Robustness** | All baselines | 50% less degradation |
| **⚡ Training Efficiency** | HGT, MBGCN | 2.6x faster convergence |

*Comprehensive comparison with 15+ SOTA methods*

## 📚 Citation & Recognition

@inproceedings{li2025diffgraph,
  title={DiffGraph: Heterogeneous Graph Diffusion Model},
  author={Li, Zongwei and Xia, Lianghao and Hua, Hua and Zhang, Shijie and Wang, Shuangyang and Huang, Chao},
  booktitle={Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining},
  pages={--},
  year={2025},
  organization={ACM}
}

## 🤝 Neural Network Contributors

**🎯 Principal Investigators**

- **Zongwei Li** - *University of Hong Kong* 🇭🇰
- **Lianghao Xia** - *University of Hong Kong* 🇭🇰
- **Chao Huang** - *University of Hong Kong* 🇭🇰

**🚀 Industry Partners**

- **Hua Hua** - *Tencent Research*
- **Shuangyang Wang** - *Tencent AI Lab*
- **Shijie Zhang** - *Social Computing Center*

## 🛡️ License & Ethics

[![MIT License](https://img.shields.io/badge/License-MIT-green.svg?style=for-the-badge)](https://choosealicense.com/licenses/mit/)

**🔒 Responsible AI Development**

- ✅ Privacy-preserving implementations
- ✅ Bias-aware model design
- ✅ Transparent algorithmic decisions
- ✅ Reproducible research standards

## 🌟 **Join the Graph Revolution**

```
🚀 Star this repository if DiffGraph powers your research!
🔬 Open issues for scientific discussions and improvements
🤝 Contribute to the future of heterogeneous graph AI
```

**Made with 🧠 AI and ❤️ Science**

*"The future belongs to those who understand that in the complexity of heterogeneous graphs lies the key to artificial general intelligence."*

---

⭐ **Star us on GitHub** | 📧 **Contact**: [email protected] | **Lab**: [HKU Data Science](https://www.cs.hku.hk/)