Dataset for audio captioning of everyday soundscapes

We have published a dataset of captions for a subset of TAU Urban Acoustic Scenes 2018. The captioned soundscapes include airport, street, and park scenes, and each file has five captions. A paper analyzing the dataset's content has been submitted to the DCASE 2021 Workshop.

The dataset is called MACS (Multi-Annotator Captioned Soundscapes) and is available on zenodo.org. See the “Datasets” section for the link.