Article overview
Vision GNN: An Image is Worth Graph of Nodes
Authors: Kai Han; Yunhe Wang; Jianyuan Guo; Yehui Tang; Enhua Wu
Date: 1 Jun 2022
Abstract: Network architecture plays a key role in the deep learning-based computer
vision system. The widely-used convolutional neural network and transformer
treat the image as a grid or sequence structure, which is not flexible to
capture irregular and complex objects. In this paper, we propose to represent
the image as a graph structure and introduce a new Vision GNN (ViG)
architecture to extract graph-level features for visual tasks. We first split
the image into a number of patches which are viewed as nodes, and construct a
graph by connecting the nearest neighbors. Based on the graph representation of
images, we build our ViG model to transform and exchange information among all
the nodes. ViG consists of two basic modules: Grapher module with graph
convolution for aggregating and updating graph information, and FFN module with
two linear layers for node feature transformation. Both isotropic and pyramid
architectures of ViG are built with different model sizes. Extensive
experiments on image recognition and object detection tasks demonstrate the
superiority of our ViG architecture. We hope this pioneering study of GNN on
general visual tasks will provide useful inspiration and experience for future
research. The PyTorch code will be available at
this https URL and the MindSpore code will be
available at this https URL
Source: arXiv, 2206.00272
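
The pipeline the abstract describes (patches as nodes, a nearest-neighbor graph, a Grapher step, then a two-layer FFN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the mean aggregation, the ReLU activation, the residual connections, and the random weights are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def image_to_patches(image, patch):
    """Split an (H, W, C) image into non-overlapping patches, one flattened row per patch."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, patch * patch * c)


def knn_graph(nodes, k):
    """Return an (N, k) index array of each node's k nearest neighbors (self excluded)."""
    sq = (nodes ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * nodes @ nodes.T  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def grapher(nodes, neighbors, w_update):
    """Hypothetical Grapher step: aggregate neighbor features, then update each node.

    Assumption: mean aggregation and a residual connection.
    """
    agg = nodes[neighbors].mean(axis=1)  # (N, D) summary of each node's neighborhood
    return nodes + np.concatenate([nodes, agg], axis=1) @ w_update


def ffn(nodes, w1, w2):
    """Two linear layers applied per node; the ReLU and residual are assumptions."""
    return nodes + np.maximum(nodes @ w1, 0.0) @ w2


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))                # stand-in for a real image
nodes = image_to_patches(image, patch=8)       # 16 nodes with 192 features each
neighbors = knn_graph(nodes, k=4)              # 4 nearest neighbors per node

d = nodes.shape[1]
w_update = rng.normal(scale=0.02, size=(2 * d, d))
w1 = rng.normal(scale=0.02, size=(d, 4 * d))
w2 = rng.normal(scale=0.02, size=(4 * d, d))

out = ffn(grapher(nodes, neighbors, w_update), w1, w2)
print(nodes.shape, neighbors.shape, out.shape)  # (16, 192) (16, 4) (16, 192)
```

In this sketch the graph is built once from raw patch features. A real model would learn the weights and could rebuild the graph from updated features at each layer, but the abstract does not say which variant is used.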