Natural Language Processing

Stanford Text2Scene Spatial Learning Dataset/Scenes and Descriptions for Text to Scene Generation https://nlp.stanford.edu/data/text2scene.shtml
MS MARCO: Microsoft Machine Reading Comprehension Dataset http://www.msmarco.org/
NewsQA Dataset https://github.com/Maluuba/newsqa
WikiQA Corpus https://www.microsoft.com/en-us/download/details.aspx?id=52419&from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fdownloads%2F4495da01-db8c-4041-a7f6-7984a4f6a905%2Fdefault.aspx
The Blog Authorship Corpus http://u.cs.biu.ac.il/%7Ekoppel/BlogCorpus.htm
Amazon Fine Food Reviews https://www.kaggle.com/snap/amazon-fine-food-reviews
ClueWeb09 FACC Dataset http://lemurproject.org/clueweb09/FACC1/
Google Books Ngram Viewer Dataset http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
Reuters Corpora (RCV1, RCV2, TRC2) http://trec.nist.gov/data/reuters/reuters.html
SouthParkData Dataset https://github.com/BobAdamsEE/SouthParkData
DBpedia Dataset http://wiki.dbpedia.org/Datasets/NLP
i2b2 NLP Research Data Sets https://www.i2b2.org/NLP/DataSets/
Lexical Inference Datasets http://u.cs.biu.ac.il/~nlp/resources/downloads/lexical-inference-datasets/
DeepDive Open Datasets http://deepdive.stanford.edu/opendata/
Stanford Datasets from arXiv http://snap.stanford.edu/data/index.html#citnets
CrisisNLP Dataset http://crisisnlp.qcri.org/
Enron Email Dataset https://www.cs.cmu.edu/~./enron/
Marcusyyy/NewYorkTimes_word2vec Dataset https://data.world/marcusyyy/newyorktimes-word-2-vec
Crowdflower/Airline Twitter Sentiment Dataset https://data.world/crowdflower/airline-twitter-sentiment
Fivethirtyeight/Presidential Commencement Speeches Dataset https://github.com/fivethirtyeight/data
Annotated Datasets http://clair.si.umich.edu/iopener/dataset.html
Congressional Speech Dataset http://www.cs.cornell.edu/home/llee/data/convote.html
MIMIC-III Dataset https://mimic.physionet.org/
CLEF eHealth Dataset https://sites.google.com/site/clefehealth/
MedNLPDoc Dataset https://sites.google.com/site/mednlpdoc/
ITU Turkish Natural Language Processing Datasets http://tools.nlp.itu.edu.tr/Datasets
Clinical Natural Language Processing Dataset http://faculty.washington.edu/melihay/LING575/Ling575_ClinicalNLP.html
Niderhoff/nlp-datasets https://libraries.io/github/niderhoff/nlp-datasets