Article overview
Title: Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
Authors: Di Hu; Rui Qian; Minyue Jiang; Xiao Tan; Shilei Wen; Errui Ding; Weiyao Lin; Dejing Dou
Date: 12 Oct 2020
Abstract: Discriminatively localizing sounding objects in cocktail-party, i.e., mixed
sound scenes, is commonplace for humans, but still challenging for machines. In
this paper, we propose a two-stage learning framework to perform
self-supervised class-aware sounding object localization. First, we propose to
learn robust object representations by aggregating the candidate sound
localization results in the single source scenes. Then, class-aware object
localization maps are generated in the cocktail-party scenarios by referring
to the pre-learned object knowledge, and the sounding objects are accordingly
selected by matching audio and visual object category distributions, where the
audiovisual consistency is viewed as the self-supervised signal. Experimental
results in both realistic and synthesized cocktail-party videos demonstrate
that our model is superior in filtering out silent objects and pointing out the
location of sounding objects of different classes. Code is available at
this https URL.
Source: arXiv, 2010.05466
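The matching step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the distributions are built or compared, so the global average pooling, the softmax normalization, the KL divergence, and every function name below (`visual_category_distribution`, `matching_loss`, and so on) are assumptions chosen only to make the idea concrete.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def visual_category_distribution(class_maps):
    """class_maps: one 2D localization map (list of rows) per object category.

    Assumption: each map is reduced to a single score by global average
    pooling, and the scores are normalized with a softmax.
    """
    pooled = [
        sum(sum(row) for row in m) / sum(len(row) for row in m)
        for m in class_maps
    ]
    return softmax(pooled)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def matching_loss(class_maps, audio_logits):
    """Hypothetical consistency loss: how far the visual category
    distribution is from the audio category distribution."""
    p_visual = visual_category_distribution(class_maps)
    p_audio = softmax(audio_logits)
    return kl_divergence(p_audio, p_visual)
```

Under these assumptions the loss is close to zero when the audio and the localization maps agree on which categories are present, and it grows when the maps highlight a category that the audio does not support, which is the sense in which audiovisual consistency can serve as a self-supervised signal.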