Dataset

Disease (Image Processing)
Thyroid Disease http://archive.ics.uci.edu/ml/datasets/Thyroid+Disease
Statlog (Heart) Data Set http://archive.ics.uci.edu/ml/datasets/Statlog+(Heart)
Parkinsons Telemonitoring Data Set http://archive.ics.uci.edu/ml/datasets/Parkinsons+Telemonitoring
Medical Pictures / Disease Pictures Dataset http://hardinmd.lib.uiowa.edu/pictures.html
Heart-Disease-Cleveland Dataset http://mlcomp.org/datasets/223
Liver Disorders Data Set https://archive.ics.uci.edu/ml/datasets/Liver+Disorders
KEGG DISEASE Dataset http://www.genome.jp/kegg/disease/
DisGeNET Dataset http://www.disgenet.org/web/DisGeNET/menu/downloads
USA Contagious Disease Data (Note: Download Requires Login) https://cloud.google.com/bigquery/public-data/usa-disease
Tuberculosis (TB) Dataset http://www.who.int/tb/country/data/download/en/
Pulmonary Hypertension https://www.ncbi.nlm.nih.gov/gds?term=100:500[Number+of+Samples]
South African Heart Disease https://statweb.stanford.edu/~tibs/ElemStatLearn/datasets/SAheart.data
Orphanet: A Database Dedicated to Information on Rare Diseases and Orphan Drugs http://download.bio2rdf.org/release/3/release.html
Congenital Heart Disease https://data.england.nhs.uk/dataset/congenitalheartdisease
Blood from Septic Patients Dataset https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4971
Leaf Disease Dataset https://www.uvm.edu/femc/data/archive/project/leaf-twig-survey-leaf-twig-damage/dataset/leaf-and-twig-disease-description
PharmGKB Data https://www.pharmgkb.org/downloads/
Plant Leaf Disease Datasets https://www.reddit.com/r/datasets/comments/5uljlp/plant_leaf_disease_datasets/
The Human Disease Methylation Database http://202.97.205.78/diseasemeth/download.html
R Datasets for Disease Collection https://vincentarelbundock.github.io/Rdatasets/datasets.html
Heart Dataset http://eric.univ-lyon2.fr/~ricco/tanagra/fichiers/heart_disease_male.xls
Alzheimer’s Disease Neuroimaging Initiative (ADNI) http://adni.loni.usc.edu/data-samples/access-data/
Disease and Patient-Level Statistics Dataset http://blog.wolframalpha.com/2010/06/29/disease-and-patient-level-statistics-with-wolframalpha/
Acute Inflammations Data Set http://archive.ics.uci.edu/ml/datasets/Acute+Inflammations
Parkinsons Data Set http://archive.ics.uci.edu/ml/datasets/Parkinsons
Heart Disease Data Set http://archive.ics.uci.edu/ml/datasets/Heart+Disease
Web Mining
MSNBC.com Anonymous Web Data Data Set http://archive.ics.uci.edu/ml/datasets/msnbc.com+anonymous+web+data
Web Mining-Social Networks Security (WmSnSec) Datasets http://www.marcovanetti.com/pages/wmsnsec/
ICML-09 Data Set http://www.sysnet.ucsd.edu/projects/url/#datasets
Public Datasets http://www.scaleunlimited.com/datasets/public-datasets/
Webkb-Data Dataset http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/webkb-data.gtar.gz
ILP Dataset http://www.cs.cmu.edu/~WebKB/ILP-data.html
Web Mining Data – UW-CAN-DATASET http://pami.uwaterloo.ca/~hammouda/webdata/
CTI Web Usage Data Set http://facweb.cs.depaul.edu/mobasher/classes/ect584/resource.html
DMW Dataset http://www.cs.ccsu.edu/~markov/dmwdata.zip
ICPSR Dataset http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/20240
Kaggle Datasets (Download Data: Login Required) https://www.kaggle.com/datasets
Big Data Set – 3.5 Billion Web Pages http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us
The ClueWeb09 Dataset http://lemurproject.org/clueweb09/
Web-Mining Social Dataset https://www.rforge.net/affinity/files/
Web 1T 5-gram Dataset (Dataset Not Free: Pay to Download) https://catalog.ldc.upenn.edu/LDC2006T13
Web Track Dataset http://trec.nist.gov/data/webmain.html
14M Weblog Dataset http://www.icwsm.org/data.html
PageRank Datasets http://langvillea.people.cofc.edu/PRDataCode/index.html
PageRank on Twitter Memes Dataset https://github.com/mongodb-labs/big-data-exploration/wiki/PageRank-on-Twitter-Memes-Dataset
Datasets – Web Analytics http://us-city.census.okfn.org/dataset/web-analytics
Text Mining

Reuters-21578 Text Categorization http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html
Reuters Transcribed Subset Data Set http://archive.ics.uci.edu/ml/datasets/reuters+transcribed+subset
NYSK Data Set https://archive.ics.uci.edu/ml/datasets/NYSK
SMS Spam Collection Data Set https://archive.ics.uci.edu/ml/datasets/sms+spam+collection
Text Classification Data Sets http://sci2s.ugr.es/keel/textClassification.php#sub2
Hansards Dataset http://www.isi.edu/natural-language/download/hansard/
Webkb Dataset http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/webkb-data.gtar.gz
Twenty Newsgroups Data Set https://archive.ics.uci.edu/ml/datasets/Twenty+Newsgroups
Movie Review Data Set http://www.cs.cornell.edu/People/pabo/movie-review-data/
Multi-Domain Sentiment Dataset http://www.cs.jhu.edu/~mdredze/datasets/sentiment/
Latent Aspect Rating Analysis/Online Forum Mining and Analysis Datasets http://sifaka.cs.uiuc.edu/~wang296/Data/index.html
Opinosis Dataset http://kavita-ganesan.com/opinosis-opinion-dataset
OpinRank Dataset http://kavita-ganesan.com/entity-ranking-data
Restaurant Reviews Dataset http://www.cs.cmu.edu/~mehrbod/RR/
MovieLens Dataset https://grouplens.org/datasets/movielens/
Micropinion Generation Dataset http://kavita-ganesan.com/content/micropinion-generation-dataset
Corpora Dataset http://www.mad.disco.unimib.it/doku.php/research/corpora
Wikipedia XML Corpus Dataset (Download Data: Login Required) http://www-connex.lip6.fr/~denoyer/wikipediaXML/
Extended Epinions Dataset http://www.trustlet.org/datasets/extended_epinions/
Text Categorization Corpora http://disi.unitn.it/moschitti/corpora.htm
MLComp Dataset http://scikit-learn.org/stable/auto_examples/text/mlcomp_sparse_document_classification.html
TechTC – Technion Repository of Text Categorization Datasets http://techtc.cs.technion.ac.il/
Weka Collections of Datasets http://www.cs.waikato.ac.nz/ml/weka/datasets.html
Text classification Datasets #35 https://drive.google.com/drive/u/0/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M
20ng-Dataset http://ana.cachopo.org/datasets-for-single-label-text-categorization
COCO-Text: Dataset for Text Detection and Recognition https://vision.cornell.edu/se3/coco-text-2/
Image Mining
Image Segmentation Data Set https://archive.ics.uci.edu/ml/datasets/Image+Segmentation
Volcanoes on Venus – JARtool Experiment Data Set https://archive.ics.uci.edu/ml/datasets/Volcanoes+on+Venus+-+JARtool+experiment
The MIRFLICKR Retrieval Evaluation Dataset http://press.liacs.nl/mirflickr/
Open Images Dataset https://github.com/openimages/dataset
INRIA Holidays Dataset http://lear.inrialpes.fr/people/jegou/data.php#holidays
Flower Datasets http://www.robots.ox.ac.uk/%7Evgg/data/flowers/index.html
Columbia University Image Library Mining Dataset http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php
The Oxford Buildings Dataset http://www.robots.ox.ac.uk/%7Evgg/data/oxbuildings/index.html
Photo Tourism Dataset http://phototour.cs.washington.edu/patches/default.htm
NUS-WIDE Dataset http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm
Caltech-UCSD Birds 200 Datasets http://www.vision.caltech.edu/visipedia/CUB-200.html
CIFAR-10 Dataset https://www.kaggle.com/c/cifar-10
Data Sets & Images http://lear.inrialpes.fr/data
STL-10 Dataset https://cs.stanford.edu/~acoates/stl10/
CMU Face Images Data Set https://archive.ics.uci.edu/ml/datasets/CMU+Face+Images
IAPR TC-12 Dataset http://www.imageclef.org/SIAPRdata
The INDECS Database http://www.cas.kth.se/INDECS/
European Cities 1M Dataset http://image.ntua.gr/iva/datasets/ec1m/
ADE20K Dataset http://groups.csail.mit.edu/vision/datasets/ADE20K/
Image Classification Dataset http://www.di.ens.fr/willow/events/cvml2011/materials/practical-classification/
Cloud Computing
Cloud Computing Services Dataset https://data.europa.eu/euodp/en/data/dataset/yUBHDpCh8MDqL9Gub8qMQ
Cloud Data Set https://archive.ics.uci.edu/ml/datasets/Cloud
Genomics in the Cloud Dataset (Download Data: Amazon Login Required) https://aws.amazon.com/public-datasets/
ClusterData2011_2 traces https://github.com/google/cluster-data/blob/master/ClusterData2011_2.md
Job Shop Scheduling Dataset http://people.brunel.ac.uk/~mastjjb/jeb/orlib/jobshopinfo.html
Google/cluster-Data Set https://github.com/google/cluster-data
Facebook SWIM Datasets https://github.com/SWIMProjectUCB/SWIM/wiki/Workloads-repository
HLA Datasets http://ftp.pdl.cmu.edu/pub/datasets/hla/
Data-Centers and Cloud Computing Datasets http://pages.cs.wisc.edu/~akella/projects/dc.html
Cloud VM Workload Dataset https://www.cs.ucsb.edu/~rich/workload/
Computing Systems Data Set https://webscope.sandbox.yahoo.com/catalog.php?datatype=s
Point Cloud Data Sets http://www.pointclouds.org/news/2013/01/07/point-cloud-data-sets/
KITTI Raw Dataset http://www.cvlibs.net/datasets/kitti/raw_data.php
Cloud_cci v2.0 Datasets http://www.esa-cloud-cci.org/?q=data_download
Public Datasets in the Cloud https://archive.org/details/PublicDatasetsInTheCloud-RosalynMetzAndMichaelB.Klein
Cloud Armor Dataset http://cs.adelaide.edu.au/~cloudarmor/ds.html
NEC Personal Cloud Trace Dataset http://cloudspaces.eu/results/datasets
Camelyon 16 Dataset https://camelyon16.grand-challenge.org/download/
Oracle Labs Downloads – Datasets http://www.oracle.com/technetwork/oracle-labs/datasets/downloads/index.html
Security Dataset http://www.secrepo.com/
Statlog (Shuttle) Data Set https://archive.ics.uci.edu/ml/datasets/Statlog+(Shuttle)
ISOLET Data Set https://archive.ics.uci.edu/ml/datasets/isolet
Particle Physics Data Set (Download Data: Login Required) http://osmot.cs.cornell.edu/kddcup/datasets.html
Datasets for Data Mining http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html
Weka Data Sets http://storm.cis.fordham.edu/~gweiss/data-mining/datasets.html
BCI Competition III Dataset http://www.bbci.de/competition/iii/
The 4 Universities Data Set (WebKB) http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
The Insurance Company (TIC) Benchmark Dataset http://liacs.leidenuniv.nl/~puttenpwhvander/library/cc2000/
Airport, Airline and Route Data https://openflights.org/data.html
Frequent Itemset Mining Dataset Repository http://fimi.ua.ac.be/data/
UCI Knowledge Discovery in Databases http://kdd.ics.uci.edu/
StatLib—Datasets http://lib.stat.cmu.edu/datasets/
GIScience at ASU Datasets https://geoplan.asu.edu/geodacenter-redirect
EconData Datasets http://inforumweb.umd.edu/econdata/econdataarchives.html
DBLP Dataset http://dblp.org/xml/
Public Datasets from Government http://data.gov.au/
Five Thirty Eight Datasets https://github.com/fivethirtyeight/data
YouTube Labeled Video Dataset https://research.google.com/youtube8m/index.html
Kaggle Datasets https://www.kaggle.com/datasets
The MNIST Database http://yann.lecun.com/exdb/mnist/
The Chars74K Dataset http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
Frontal Face Images Dataset http://vasc.ri.cmu.edu//idb/html/face/frontal_images/index.html
SMS Spam Corpus v.0.1 Dataset http://www.esp.uem.es/jmgomez/smsspamcorpus/
Twitter Sentiment Analysis Training Corpus (Dataset) http://thinknook.com/twitter-sentiment-analysis-training-corpus-dataset-2012-09-22/
Jester Collaborative Filtering Dataset http://www.ieor.berkeley.edu/~goldberg/jester-data/
Bruteforce-Database https://github.com/duyetdev/bruteforce-database
Yelp Dataset https://www.yelp.com/dataset_challenge
Lending Club Dataset https://www.lendingclub.com/info/download-data.action
Walmart Recruiting – Store Sales Forecasting Dataset https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data
KEEL-Dataset Repository http://sci2s.ugr.es/keel/datasets.php#sub1
Wikipedia: Database Download https://en.wikipedia.org/wiki/Wikipedia:Database_download
Stanford Large Network Dataset Collection http://snap.stanford.edu/data/
KONECT Datasets http://konect.uni-koblenz.de/downloads/
Delve Datasets http://www.cs.toronto.edu/~delve/data/datasets.html
Yahoo Webscope Dataset https://webscope.sandbox.yahoo.com/
Spam Email Datasets http://csmining.org/index.php/spam-email-datasets-.html
Cancer Program Datasets http://portals.broadinstitute.org/cgi-bin/cancer/datasets.cgi
Freebase Data Dumps Dataset https://developers.google.com/freebase/
Wireless Sensor Networks (WSN)
Wsn-Indfeat-Dataset https://github.com/apanouso/wsn-indfeat-dataset
Activities of Daily Living (ADLs) Recognition Using Binary Sensors Data Set https://archive.ics.uci.edu/ml/datasets/Activities+of+Daily+Living+(ADLs)+Recognition+Using+Binary+Sensors
Benchmark Datasets (Human Activity Recognition from Wireless Sensor Network Data) https://sites.google.com/site/tim0306/datasets
General Collection of Datasets (Download Requires Login) https://data.4tu.nl/repository/uuid:bfbd480d-1b49-4494-ad2c-0a5caa383354
Crowd_Temperature Dataset http://crawdad.org/queensu/crowd_temperature/20151120/
Packet-Delivery Dataset http://crawdad.org/due/packet-delivery/20150401/
I-LENSE Dataset http://www.isi.edu/ilense/software/
Intel Lab Data http://db.csail.mit.edu/labdata/labdata.html
Sensorscope Datasets http://lcav.epfl.ch/page-86035-en.html
Dataset for Distinct https://eprints.soton.ac.uk/407610/
NetworksDB Dataset http://wislab.cz/our-work/wireless-sensor-network-simulation-tutorial-for-matlab
Activity Recognition System Based on Multisensor Data Fusion (AReM) Data Set https://archive.ics.uci.edu/ml/datasets/Activity+Recognition+system+based+on+Multisensor+data+fusion+%28AReM%29
Indoor User Movement Prediction from RSS Data Data Set https://archive.ics.uci.edu/ml/datasets/Indoor+User+Movement+Prediction+from+RSS+data
Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set https://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions
Heterogeneity Activity Recognition Data Set https://archive.ics.uci.edu/ml/datasets/Heterogeneity+Activity+Recognition
Wireless Sensor Network Ontology Dataset https://datahub.io/dataset/wireless-sensor-network-ontology
Synthetic Data Sets for WSN Optimization http://arco.unex.es/wsnopt/
Waternet Soil Moisture and LST Observation Dataset http://card.westgis.ac.cn/data/b7beb8bf-58d9-4e58-a945-7b6e1dc7705f
WiseML Datasets http://albcom.lsi.upc.edu/wisebed/?cmd=download
ZigBee-Wsn Dataset https://catalog.data.gov/dataset/a-zigbee-based-wireless-sensor-network-for-continuous-sound-and-noise-level-monitoring-on--eacc0
Cyber Security Datasets for Wireless Sensors Network http://fitnesslab.altervista.org/index.php/it/?option=com_content&view=article&id=71
RSSI Data Sets for WSN http://dgt.dei.unipd.it/pages/read/59/
Berkeley Sensor Dataset http://www.select.cs.cmu.edu/data/index.html#labapp3
DRED Dataset http://www.st.ewi.tudelft.nl/~akshay/dred/
Network Security
ICML-09 Data Set http://www.sysnet.ucsd.edu/projects/url/#datasets
DARPA Intrusion Detection Data Sets https://www.ll.mit.edu/ideval/data/
NSL_KDD Dataset https://github.com/defcom17/NSL_KDD
User-Computer Authentication Associations Datasets http://csr.lanl.gov/data/auth/
Multi-Source Cyber-Security Datasets http://csr.lanl.gov/data/cyber1/
The Drebin Dataset https://www.sec.cs.tu-bs.de/~danarp/drebin/download.html
KDD Cup by Tencent 2012 https://www.kaggle.com/c/kddcup2012-track1/data
CAIDA Data http://www.caida.org/data/overview/
2009-M57-Patents Enterprise Network Traffic Dataset http://digitalcorpora.org/corpora/scenarios/m57-patents-scenario
MAWI data set/NLANR (AMP) data set/NIMS1 data set/NIMS2 Datasets https://web.cs.dal.ca/~riyad/Site/Download.html
Honeypots Datasets http://www.takakura.com/Kyoto_data/
Secure Water Treatment (SWaT) Dataset https://itrust.sutd.edu.sg/dataset/
Ben-Gurion University of the Negev Dataset Collection https://snap.stanford.edu/data/links.html
UNSW-NB15 Data Set https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-NB15-Datasets/
Social Networks Security:

Any Beat Dataset http://proj.ise.bgu.ac.il/sns/anybeat.html
Academia Dataset http://proj.ise.bgu.ac.il/sns/academia.html
Google Plus Dataset http://proj.ise.bgu.ac.il/sns/googlep.html
Facebook Applications Data Set http://proj.ise.bgu.ac.il/sns/facebook_applications.html
A Facebook Group of Coworkers http://proj.ise.bgu.ac.il/sns/Link_Prediction.html
Cybersecurity:

Stratosphere IPS Data Sets https://stratosphereips.org/category/dataset.html
ADFA Intrusion Detection Datasets https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-IDS-Datasets/
Malicious URLs Data Sets http://www.sysnet.ucsd.edu/projects/url/
Web Attack Payloads https://github.com/foospidy/payloads
Aktaion Data Sets https://github.com/jzadeh/Aktaion/tree/master/data
CRIME Database from DeepEnd Research https://www.dropbox.com/sh/7fo4efxhpenexqp/AADHnRKtL6qdzCdRlPmJpS8Aa/CRIME?dl=0
Publicly Available PCAP Files http://www.netresec.com/?page=PcapFiles