

The representativeness of automated Web crawls as a surrogate for human browsing: companion repository

This repository contains or links to all assets relevant to the WWW'20 paper: The representativeness of automated Web crawls as a surrogate for human browsing. All listed assets will be made publicly available pending the internal privacy/trust audit processes required prior to data release. For specific inquiries about data access or collaborations on privacy-enhancing technologies research, please reach out to the corresponding authors listed on the manuscript.

If you find any of the resources contained in this repository valuable for your research, please cite the original manuscript for which this work was produced:

@inproceedings{10.1145/3366423.3380104,
author = {Zeber, David and Bird, Sarah and Oliveira, Camila and Rudametkin, Walter and Segall, Ilana and Wolls\'{e}n, Fredrik and Lopatka, Martin},
title = {The Representativeness of Automated Web Crawls as a Surrogate for Human Browsing},
year = {2020},
isbn = {9781450370233},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3366423.3380104},
doi = {10.1145/3366423.3380104},
booktitle = {Proceedings of The Web Conference 2020},
pages = {167–178},
numpages = {12},
keywords = {Web Crawling, Online Privacy, Tracking, Browser Fingerprinting, World Wide Web},
location = {Taipei, Taiwan},
series = {WWW '20}
}