microsoft / YOLaT-VectorGraphicsRecognition
Source Code of NeurIPS21 and T-PAMI24 paper: Recognizing Vector Graphics without Rasterization
:fire: [NIPS2021, TPAMI2024] YOLaT & YOLaT++: Powerful and Efficient Graph Models for Vector Graphics Recognition
:scroll: Introduction
This repository is the official PyTorch implementation of our two powerful vector graphics recognition models.
NeurIPS-2021 paper: Recognizing Vector Graphics without Rasterization.
TPAMI-2024 paper: Hierarchically Recognizing Vector Graphics and A New Chart-Based Vector Graphics Dataset
Rendering vector graphics into pixel arrays can incur significant memory costs or lose information, as shown in Figure 1. Rasterization also discards the high-level structural information within the primitives, such as corners and contours, which is critical for recognition tasks.
To summarize, we propose the You Only Look at Text (YOLaT) series, YOLaT and YOLaT++, which addresses these issues with raster graphics by taking the textual documents of vector graphics as input.
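To make "textual documents as input" concrete, the following is a minimal sketch, not the repository's own parser. It reads a tiny hypothetical SVG document with Python's standard library, extracts its `<line>` primitives, and finds endpoints shared by more than one primitive as candidate corners. The names `SVG_TEXT`, `read_line_primitives`, and `shared_endpoints` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# A tiny hypothetical SVG document: two line segments that meet at a corner.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <line x1="10" y1="10" x2="90" y2="10"/>
  <line x1="90" y1="10" x2="90" y2="90"/>
</svg>"""


def read_line_primitives(svg_text):
    """Return each <line> element as ((x1, y1), (x2, y2)) in document order."""
    root = ET.fromstring(svg_text)
    primitives = []
    for elem in root.iter(SVG_NS + "line"):
        start = (float(elem.get("x1")), float(elem.get("y1")))
        end = (float(elem.get("x2")), float(elem.get("y2")))
        primitives.append((start, end))
    return primitives


def shared_endpoints(primitives):
    """Return points that are endpoints of more than one primitive."""
    counts = {}
    for start, end in primitives:
        # A set keeps a degenerate zero-length line from counting twice.
        for point in {start, end}:
            counts[point] = counts.get(point, 0) + 1
    return sorted(p for p, n in counts.items() if n > 1)


primitives = read_line_primitives(SVG_TEXT)
corners = shared_endpoints(primitives)
print(len(primitives), corners)  # 2 [(90.0, 10.0)]
```

The corner is recovered exactly from eight numbers in the text. A raster rendering of the same drawing would need the full pixel grid and would still only approximate it.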
Environments
conda create -n your_env_name python=3.8
conda activate your_env_name
sh deepgcn_env_install.sh
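After installation, a quick sanity check can confirm that the active environment matches what this README states: Python 3.8 and PyTorch. The sketch below is an assumption-laden helper, not part of the repository. `check_environment` is a hypothetical name, and it checks only for the `torch` module because the contents of `deepgcn_env_install.sh` are not listed here.

```python
import importlib.util
import sys


def check_environment(required_modules=("torch",), min_python=(3, 8)):
    """Report whether the interpreter and the listed modules look usable."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

Pass additional module names through `required_modules` to check other packages that the install script is expected to provide.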
YOLaT
1. Data Preparation
Floorplans
a) Download and unzip the Floorplans dataset to the dataset folder: data/FloorPlansGraph5_iter
b) Run the following commands to prepare the dataset for training/inference.
cd utils
python svg_utils/build_graph_bbox.py
Diagrams
a) Download and unzip the Diagrams dataset to the dataset folder: data/diagrams
b) Run the following commands to prepare the dataset for training/inference.
cd utils
python svg_utils/build_graph_bbox_diagram.py
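Before running the preparation scripts, it can help to confirm that each archive was unzipped into the folder named above. The following sketch is a hypothetical helper, not part of the repository. It counts files by extension under each dataset folder and makes no assumption about which file types the datasets contain. Run it from the directory that holds `data/`.

```python
from collections import Counter
from pathlib import Path

# Dataset folders named in the steps above.
DATASET_DIRS = ("data/FloorPlansGraph5_iter", "data/diagrams")


def summarize_dataset(root):
    """Count files under `root` by extension; return None if the folder is missing."""
    root = Path(root)
    if not root.is_dir():
        return None
    return Counter(
        p.suffix.lower() or "<none>" for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    for folder in DATASET_DIRS:
        summary = summarize_dataset(folder)
        if summary is None:
            print(f"{folder}: not found")
        else:
            print(f"{folder}: {dict(summary)}")
```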
2. Training & Inference
Floorplans
cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/FloorPlansGraph5_iter --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2 --graph bezier_cc_bb_iter --data_aug true --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 10
Diagrams
cd cad_recognition
CUDA_VISIBLE_DEVICES=0 python -u train.py --batch_size 4 --data_dir data/diagrams --phase train --lr 2.5e-4 --lr_adjust_freq 9999999999999999999999999999999999999 --in_channels 5 --n_blocks 2 --n_blocks_out 2 --arch centernet3cc_rpn_gp_iter2 --graph bezier_cc_bb_iter --data_aug true --weight_decay 1e-5 --postname run182_2 --dropout 0.0 --do_mixup 0 --bbox_sampling_step 5
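The two training commands above differ only in `--data_dir` and `--bbox_sampling_step`. The sketch below makes that explicit by building both argument lists from one shared set of flags. Every flag and value is copied from the commands above. The helper `build_train_command` and the dictionary names are hypothetical, and the flags are emitted in a different order than in the original commands.

```python
import shlex

# Flags shared by both training commands above.
COMMON_ARGS = {
    "batch_size": "4",
    "phase": "train",
    "lr": "2.5e-4",
    "lr_adjust_freq": "9999999999999999999999999999999999999",
    "in_channels": "5",
    "n_blocks": "2",
    "n_blocks_out": "2",
    "arch": "centernet3cc_rpn_gp_iter2",
    "graph": "bezier_cc_bb_iter",
    "data_aug": "true",
    "weight_decay": "1e-5",
    "postname": "run182_2",
    "dropout": "0.0",
    "do_mixup": "0",
}

# The only per-dataset differences.
DATASET_ARGS = {
    "floorplans": {"data_dir": "data/FloorPlansGraph5_iter", "bbox_sampling_step": "10"},
    "diagrams": {"data_dir": "data/diagrams", "bbox_sampling_step": "5"},
}


def build_train_command(dataset):
    """Return the train.py invocation for `dataset` as an argument list."""
    args = {**COMMON_ARGS, **DATASET_ARGS[dataset]}
    cmd = ["python", "-u", "train.py"]
    for name, value in args.items():
        cmd += [f"--{name}", value]
    return cmd


for dataset in DATASET_ARGS:
    print(shlex.join(build_train_command(dataset)))
```

To launch a run from Python instead of the shell, the list can be passed to `subprocess.run(cmd, cwd="cad_recognition", env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"}, check=True)`, which mirrors the `cd` and environment variable used above.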
YOLaT++
YOLaT++ uses a hierarchical structure designed for vector graphics (VGs) that spans three levels: Primitive, Curve, and Point. It also employs a position-aware enhancement strategy to differentiate similar primitives effectively.
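One way to picture the three levels is as nested containers: a primitive holds curves, and a curve holds points. The sketch below is only an illustrative data structure under that assumption, not the model's actual representation. The class names, fields, and `bbox` method are hypothetical. The bounding box is a simple example of positional information that could tell apart two primitives of identical shape, and it is not the paper's enhancement strategy.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Curve:
    """One curve, stored here as a list of control points (Point level)."""
    control_points: List[Point]


@dataclass
class Primitive:
    """One drawing primitive made of one or more curves (Curve level)."""
    kind: str
    curves: List[Curve] = field(default_factory=list)

    def points(self) -> List[Point]:
        """Flatten all control points of all curves, in order."""
        return [p for c in self.curves for p in c.control_points]

    def bbox(self) -> Tuple[float, float, float, float]:
        """Axis-aligned bounds (min_x, min_y, max_x, max_y) of the control points."""
        xs, ys = zip(*self.points())
        return min(xs), min(ys), max(xs), max(ys)


# A rectangle described as four straight segments, each with two control points.
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
rect = Primitive("rect", [Curve([corners[i], corners[(i + 1) % 4]]) for i in range(4)])
print(len(rect.curves), len(rect.points()), rect.bbox())  # 4 8 (0.0, 0.0, 4.0, 3.0)
```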
Citation
BibTeX:
@inproceedings{jiang2021recognizing,
title={{Recognizing Vector Graphics without Rasterization}},
author={Jiang, Xinyang and Liu, Lu and Shan, Caihua and Shen, Yifei and Dong, Xuanyi and Li, Dongsheng},
booktitle={Proceedings of Advances in Neural Information Processing Systems (NIPS)},
volume={34},
number={},
pages={},
year={2021}}
@article{journals/pami/DouJLYSSDWLZ24,
author = {Shuguang Dou and Xinyang Jiang and Lu Liu and Lu Ying and Caihua Shan and Yifei Shen and Xuanyi Dong and Yun Wang and Dongsheng Li and Cairong Zhao},
title = {Hierarchically Recognizing Vector Graphics and {A} New Chart-Based
Vector Graphics Dataset},
journal = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
volume = {46},
number = {12},
pages = {7556--7573},
year = {2024},
doi = {10.1109/TPAMI.2024.3394298},
}
If you find this repository helpful, please consider giving it a :star2: star and sharing it with your community!
