updated README, config, .gitignore
This commit is contained in:
Родитель
ce8905bdfb
Коммит
95382bd2b9
|
@ -6,3 +6,4 @@ training_set5/
|
||||||
logs/
|
logs/
|
||||||
test_set2/
|
test_set2/
|
||||||
training_set_sept11/
|
training_set_sept11/
|
||||||
|
training_set_sept12/
|
||||||
|
|
15
README.md
15
README.md
|
@ -1,6 +1,6 @@
|
||||||
# Deep Noise Suppression (DNS) Challenge - Interspeech 2020
|
# Deep Noise Suppression (DNS) Challenge - Interspeech 2020
|
||||||
|
|
||||||
This repository contains the datasets and scripts required for the DNS challenge. For more details about the challenge, please visit https://dns-challenge.azurewebsites.net/ and refer to our [paper](https://arxiv.org/ftp/arxiv/papers/2001/2001.08662.pdf).
|
This repository contains the datasets and scripts required for the DNS challenge. For more details about the challenge, please visit https://dns-challenge.azurewebsites.net/.
|
||||||
|
|
||||||
## Repo details:
|
## Repo details:
|
||||||
* The **datasets** directory contains the clean speech and noise clips.
|
* The **datasets** directory contains the clean speech and noise clips.
|
||||||
|
@ -101,11 +101,24 @@ The datasets used in this project are licensed as follows:
|
||||||
* https://librivox.org/; License: https://librivox.org/pages/public-domain/
|
* https://librivox.org/; License: https://librivox.org/pages/public-domain/
|
||||||
* PTDB-TUG: Pitch Tracking Database from Graz University of Technology https://www.spsc.tugraz.at/databases-and-tools/ptdb-tug-pitch-tracking-database-from-graz-university-of-technology.html; License: http://opendatacommons.org/licenses/odbl/1.0/
|
* PTDB-TUG: Pitch Tracking Database from Graz University of Technology https://www.spsc.tugraz.at/databases-and-tools/ptdb-tug-pitch-tracking-database-from-graz-university-of-technology.html; License: http://opendatacommons.org/licenses/odbl/1.0/
|
||||||
* Edinburgh 56 speaker dataset: https://datashare.is.ed.ac.uk/handle/10283/2791; License: https://datashare.is.ed.ac.uk/bitstream/handle/10283/2791/license_text?sequence=11&isAllowed=y
|
* Edinburgh 56 speaker dataset: https://datashare.is.ed.ac.uk/handle/10283/2791; License: https://datashare.is.ed.ac.uk/bitstream/handle/10283/2791/license_text?sequence=11&isAllowed=y
|
||||||
|
* VocalSet: A Singing Voice Dataset https://zenodo.org/record/1193957#.X1hkxYtlCHs; License: Creative Commons Attribution 4.0 International
|
||||||
|
* Emotion data corpus: CREMA-D (Crowd-sourced Emotional Multimodal Actors Dataset)
|
||||||
|
https://github.com/CheyneyComputerScience/CREMA-D; License: http://opendatacommons.org/licenses/dbcl/1.0/
|
||||||
|
* The VoxCeleb2 Dataset http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox2.html; License: http://www.robots.ox.ac.uk/~vgg/data/voxceleb/
|
||||||
|
The VoxCeleb dataset is available to download for commercial/research purposes under a Creative Commons Attribution 4.0 International License. The copyright remains with the original owners of the video. A complete version of the license can be found here.
|
||||||
|
* VCTK Dataset: https://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html; License: This corpus is licensed under Open Data Commons Attribution License (ODC-By) v1.0.
|
||||||
|
http://opendatacommons.org/licenses/by/1.0/
|
||||||
|
|
||||||
2. Noise:
|
2. Noise:
|
||||||
* Audioset: https://research.google.com/audioset/index.html; License: https://creativecommons.org/licenses/by/4.0/
|
* Audioset: https://research.google.com/audioset/index.html; License: https://creativecommons.org/licenses/by/4.0/
|
||||||
* Freesound: https://freesound.org/ Only files with CC0 licenses were selected; License: https://creativecommons.org/publicdomain/zero/1.0/
|
* Freesound: https://freesound.org/ Only files with CC0 licenses were selected; License: https://creativecommons.org/publicdomain/zero/1.0/
|
||||||
* Demand: https://zenodo.org/record/1227121#.XRKKxYhKiUk; License: https://creativecommons.org/licenses/by-sa/3.0/deed.en_CA
|
* Demand: https://zenodo.org/record/1227121#.XRKKxYhKiUk; License: https://creativecommons.org/licenses/by-sa/3.0/deed.en_CA
|
||||||
|
|
||||||
|
3. RIR datasets: OpenSLR26 and OpenSLR28:
|
||||||
|
* http://www.openslr.org/26/
|
||||||
|
* http://www.openslr.org/28/
|
||||||
|
* License: Apache 2.0
|
||||||
|
|
||||||
## Code license
|
## Code license
|
||||||
MIT License
|
MIT License
|
||||||
|
|
||||||
|
|
|
@ -52,9 +52,9 @@ noise_dir: datasets\noise
|
||||||
speech_dir: datasets\clean\read_speech
|
speech_dir: datasets\clean\read_speech
|
||||||
noise_types_excluded: None
|
noise_types_excluded: None
|
||||||
|
|
||||||
noisy_destination: datasets\training_set_sept11\noisy
|
noisy_destination: datasets\training_set_sept12\noisy
|
||||||
clean_destination: datasets\training_set_sept11\clean
|
clean_destination: datasets\training_set_sept12\clean
|
||||||
noise_destination: datasets\training_set_sept11\noise
|
noise_destination: datasets\training_set_sept12\noise
|
||||||
log_dir: logs
|
log_dir: logs
|
||||||
|
|
||||||
# Config: add singing voice to clean speech
|
# Config: add singing voice to clean speech
|
||||||
|
@ -76,7 +76,7 @@ use_mandarin_data=1
|
||||||
clean_mandarin: datasets\clean\mandarin_speech
|
clean_mandarin: datasets\clean\mandarin_speech
|
||||||
|
|
||||||
# Config: add reverb to clean speech
|
# Config: add reverb to clean speech
|
||||||
rir_choice: 1
|
rir_choice: 3
|
||||||
# 1 for only real rir, 2 for only synthetic rir, 3 (default) use both real and synthetic
|
# 1 for only real rir, 2 for only synthetic rir, 3 (default) use both real and synthetic
|
||||||
lower_t60: 0.3
|
lower_t60: 0.3
|
||||||
# lower bound of t60 range in seconds
|
# lower bound of t60 range in seconds
|
||||||
|
|
Загрузка…
Ссылка в новой задаче