The IDMT-DESED-FL and IDMT-URBAN-FL datasets enable research on sound event detection (SED) in a federated learning (FL) context. They consist of sound events sourced from the well-known DESED and URBAN-8K datasets. Each source dataset contributes sound events from ten classes, covering SED in domestic and urban environments, respectively. To simulate an FL scenario, the source events are mixed with background noise to generate 30,000 ten-second soundscapes, which are partitioned across 100 edge devices. Each soundscape is generated by mixing up to five sound events (possibly overlapping) with background noise. Both datasets are provided in an independent and identically distributed (IID) version and a non-IID version; the latter provides a more realistic distribution of event classes across devices.
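To make the difference between the IID and non-IID versions concrete, the following is a minimal sketch of how soundscapes might be assigned to 100 devices. It is not the partitioning used by the dataset scripts: the function names are hypothetical, each soundscape is assumed to carry a single class label (real soundscapes can contain several events), and the non-IID split uses the common label-sorted shard scheme as a stand-in.

```python
import random

def make_toy_labels(n, num_classes, seed=0):
    """Hypothetical stand-in: one random class label per soundscape."""
    rng = random.Random(seed)
    return [rng.randrange(num_classes) for _ in range(n)]

def partition_iid(labels, num_clients, seed=0):
    """Shuffle all indices and deal them out evenly to the clients."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[c::num_clients] for c in range(num_clients)]

def partition_non_iid(labels, num_clients, shards_per_client=2, seed=0):
    """Label-sorted shards: each client only sees a few classes.

    Assumes len(labels) is divisible by num_clients * shards_per_client;
    any remainder would be dropped.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(order) // num_shards
    shards = [order[s * shard_size:(s + 1) * shard_size]
              for s in range(num_shards)]
    rng.shuffle(shards)
    return [
        [i for shard in shards[c * shards_per_client:(c + 1) * shards_per_client]
         for i in shard]
        for c in range(num_clients)
    ]

if __name__ == "__main__":
    labels = make_toy_labels(30_000, 10)
    for name, split in [("IID", partition_iid(labels, 100)),
                        ("non-IID", partition_non_iid(labels, 100))]:
        classes_on_first = len({labels[i] for i in split[0]})
        print(name, "classes on device 0:", classes_on_first)
```

In this sketch an IID device sees roughly the global class mix, while a non-IID device sees only the few classes covered by its shards.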
- IDMT-DESED-FL sound event classes include alarm/bell/ringing, blender, cat, dog, dishes, electric shaver/toothbrush, frying, running water, speech, and vacuum cleaner. The background classes include apartment room, computer interior, computer lab, emergency staircase, and library.
- IDMT-URBAN-FL sound event classes include air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music. Background classes for IDMT-URBAN-FL are sourced from the Isolated Urban Sound Database (IUSB), and include birds, crowd, fountain, rain, and traffic.
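As an illustration of the soundscape composition described above (up to five possibly overlapping events on top of one background), the sketch below draws a random annotation for a single ten-second IDMT-URBAN-FL soundscape. It only produces labels and timings, not audio. The function name, the event length range, and the choice of at least one event per soundscape are assumptions made for this example, not details taken from the dataset scripts.

```python
import random

EVENT_CLASSES = [
    "air conditioner", "car horn", "children playing", "dog bark", "drilling",
    "engine idling", "gun shot", "jackhammer", "siren", "street music",
]
BACKGROUND_CLASSES = ["birds", "crowd", "fountain", "rain", "traffic"]
DURATION = 10.0   # seconds per soundscape
MAX_EVENTS = 5    # "up to five" events per soundscape

def make_soundscape_spec(rng, min_len=0.5, max_len=4.0):
    """Draw one hypothetical soundscape annotation.

    Events are placed independently, so they may overlap in time.
    min_len and max_len are illustrative values, not dataset settings.
    """
    n_events = rng.randint(1, MAX_EVENTS)  # assumption: at least one event
    events = []
    for _ in range(n_events):
        length = rng.uniform(min_len, max_len)
        onset = rng.uniform(0.0, DURATION - length)
        events.append({
            "label": rng.choice(EVENT_CLASSES),
            "onset": round(onset, 2),
            "offset": round(onset + length, 2),
        })
    events.sort(key=lambda e: e["onset"])
    return {
        "background": rng.choice(BACKGROUND_CLASSES),
        "duration": DURATION,
        "events": events,
    }

if __name__ == "__main__":
    spec = make_soundscape_spec(random.Random(0))
    print("background:", spec["background"])
    for event in spec["events"]:
        print(event["label"], event["onset"], event["offset"])
```

The same structure would apply to IDMT-DESED-FL by swapping in its event and background class lists.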
Due to the size of the datasets, this download provides the scripts and details necessary to generate the FL datasets from the DESED, URBAN-8K, and IUSB source material.
See the above-referenced paper and the README contained in the data folder for further details.