MSNBC.com Anonymous Web Data Data Set |
http://archive.ics.uci.edu/ml/datasets/msnbc.com+anonymous+web+data |
Web Mining-Social Networks Security (WmSnSec) Datasets |
http://www.marcovanetti.com/pages/wmsnsec/ |
ICML-09 Data Set |
http://www.sysnet.ucsd.edu/projects/url/#datasets |
Public Datasets |
http://www.scaleunlimited.com/datasets/public-datasets/ |
WebKB-Data Dataset |
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/webkb-data.gtar.gz |
ILP Dataset |
http://www.cs.cmu.edu/~WebKB/ILP-data.html |
Web Mining Data – UW-CAN-DATASET |
http://pami.uwaterloo.ca/~hammouda/webdata/ |
CTI Web Usage Data Set |
http://facweb.cs.depaul.edu/mobasher/classes/ect584/resource.html |
DMW Dataset |
http://www.cs.ccsu.edu/~markov/dmwdata.zip |
ICPSR Dataset |
http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/20240 |
Kaggle Datasets (Download Data: Login Required) |
https://www.kaggle.com/datasets |
Big Data Set – 3.5 Billion Web Pages |
http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us |
The ClueWeb09 Dataset |
http://lemurproject.org/clueweb09/ |
Web-Mining Social Dataset |
https://www.rforge.net/affinity/files/ |
Web 1T 5-gram Dataset (Not Free: Pay and Download) |
https://catalog.ldc.upenn.edu/LDC2006T13 |
Web Track Dataset |
http://trec.nist.gov/data/webmain.html |
14M Weblog Dataset |
http://www.icwsm.org/data.html |
PageRank Datasets |
http://langvillea.people.cofc.edu/PRDataCode/index.html |
PageRank on Twitter Memes Dataset |
https://github.com/mongodb-labs/big-data-exploration/wiki/PageRank-on-Twitter-Memes-Dataset |
Datasets – Web Analytics |
http://us-city.census.okfn.org/dataset/web-analytics |