| Name | Files | Added | Size | DLs |
|---|---|---|---|---|
| The New York Times Annotated Corpus | 1 | 2022-07-01 | 3.23GB | 3,17911+ 0 |
| OntoNotes 5.0 Annotated Text Corpus LDC2013T19 | 1 | 2022-07-02 | 839.11MB | 4564+ 0 |
| DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1 | 1 | 2022-07-02 | 385.52MB | 2,7345+ 0 |
| TAC KBP Entity Discovery and Linking - Comprehensive Training and Evaluation Data LDC2019T02 | 1 | 2022-07-05 | 20.36GB | 1022 0 |
| LORELEI Tigrinya Language Pack NLP LDC2020T22 | 1 | 2022-07-11 | 122.79MB | 562 0 |
| BOLT Egyptian Arabic Treebank Conversational Telephone Speech NLP LDC2021T12 | 1 | 2022-07-11 | 19.42MB | 2324 0 |
| Abstract Meaning Representation AMR Annotation Release 3.0 LDC2017T10 | 1 | 2022-07-11 | 38.82MB | 1796 0 |
| English Gigaword 5th edition LDC2011T07 | 1 | 2022-07-12 | 9.76GB | 8743+ 0 |
| Chinese Gigaword 5th edition LDC2011T13 | 1 | 2022-07-12 | 4.39GB | 1153+ 0 |
| French Gigaword 3rd edition LDC2011T10 | 1 | 2022-07-12 | 2.08GB | 1,9554+ 0 |
| Spanish Gigaword 3rd edition LDC2011T12 | 1 | 2022-07-12 | 2.78GB | 1042+ 0 |
| TAC KBP Comprehensive English Source Corpora LDC2018T03 | 1 | 2022-07-12 | 6.80GB | 1003 0 |
| Penn Treebank Revised: English News Text Treebank LDC2015T13 | 1 | 2022-08-14 | 6.86MB | 1171+ 0 |
| CCGBank: CCG Combinatory Categorical Grammar for Penn Treebank 2 - LDC2005T13 | 1 | 2022-08-14 | 27.90MB | 833+ 0 |
| Penn Treebank II 2 - LDC95T7 | 1 | 2022-08-14 | 134.99MB | 1,00410+ 0 |
| RST Discourse Treebank LDC2002T07 | 1 | 2022-08-15 | 2.71MB | 8111+ 0 |
| Penn Treebank III 3 LDC99T42 | 1 | 2022-08-15 | 29.83MB | 1491+ 3 |
| Penn Treebank 1 - ACL / DCI - LDC93T1 | 1 | 2022-08-15 | 139.95MB | 2933+ 0 |
| TIMIT Acoustic-Phonetic Continuous Speech Corpus - LDC93S1 | 1 | 2023-05-06 | 385.20MB | 1357+ 0 |
| BOLT Egyptian Arabic SMS/Chat Parallel Training Data - LDC2021T15 | 1 | 2023-05-06 | 10.21MB | 634+ 0 |
| BOLT Chinese SMS/Chat Parallel Training Data - LDC2021T11 | 1 | 2023-05-06 | 14.43MB | 305+ 0 |
| BOLT Egyptian Arabic SMS/Chat and Transliteration - LDC2017T07 | 1 | 2023-05-06 | 8.91MB | 2068+ 0 |
| BOLT Egyptian Arabic Treebank - Discussion Forum - LDC2018T23 | 1 | 2023-05-06 | 60.02MB | 707+ 0 |
| BOLT Chinese SMS/Chat - LDC2018T15 | 1 | 2023-05-06 | 9.83MB | 795+ 0 |


