Download

You can download the segmented and aligned data for each language from the corresponding link in the table below.
If you use any part of the corpus in your work, please cite the following paper:

A. Stan, O. Watts, Y. Mamiya, M. Giurgiu, R. A. J. Clark, J. Yamagishi, S. King, "TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision," in Proc. Interspeech, Lyon, France, August 2013.

Please read the README file before downloading the corpus. It contains a detailed description of the archives, as well as the licence information.

You can also download the tools used to create this corpus from HERE

| Language | Code | Title | Author | Segmented audio and text | Chapter-level annotation |
|---|---|---|---|---|---|
| Bulgarian | BG | Zhetvariat | Yordan Yovkov | DOWNLOAD [969MB] | DOWNLOAD |
| Danish | DA | Grimms eventyr I udvalg | Grimm Brothers | DOWNLOAD [194MB] | DOWNLOAD |
| Dutch | NL | Anna Karenina | Leo Tolstoy | DOWNLOAD [1.1GB] | DOWNLOAD |
| English | EN | Living Alone | Stella Benson | DOWNLOAD [562MB] | DOWNLOAD |
| Finnish | FI | Rautatie | Juhani Aho | DOWNLOAD [606MB] | DOWNLOAD |
| French | FR | Candide | Voltaire | DOWNLOAD [520MB] | DOWNLOAD |
| German | DE | Das Bildnis des Dorian Gray | Oscar Wilde | DOWNLOAD [953MB] | DOWNLOAD |
| Hungarian | HU | Egri csillagok | Geza Gardonyi | DOWNLOAD [1.2GB] | DOWNLOAD |
| Italian | IT | Galatea | Anton Giulio Barrili | DOWNLOAD [668MB] | DOWNLOAD |
| Polish | PL | Siedem wybranych opowiadan | Wladyslaw Orkan | DOWNLOAD [629MB] | DOWNLOAD |
| Portuguese | PT | Senhora | Jose de Alencar | DOWNLOAD [1.2GB] | DOWNLOAD |
| Romanian* | RM | Mara | Ioan Slavici | DOWNLOAD [1.5GB] | DOWNLOAD |
| Russian | RU | Ucheniye Khrista | Leo Tolstoy | DOWNLOAD [358MB] | DOWNLOAD |
| Spanish** | ES | Don Quijote de la Mancha | Miguel de Cervantes | - | DOWNLOAD |

*The Romanian data can only be used for non-commercial purposes.
**Only the first 35 chapters from the first part were used for alignment. The data cannot be redistributed, so please download the files from the original source found on the About page.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License. The underlying audio and text are subject to their source licenses, so please check the links before using the data.