Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes
The authors have developed an enhanced audio-visual Sound Event Localization and Detection (SELD) network, improving on the audio-only SELDnet23 model by integrating audio and video information. The system uses YOLO…