48 |
How open data can promote participatory democracy |
Hector Dominguez |
2:30 PM |
May 9 2019 |
Daisy Bingham Room |
https://csvconf.com/speakers/#hector-dominguez |
2019-05-09T14:30:00 |
In this discussion, I will explore the nuances of building an open data program as a step towards participatory democracy and the challenges of creating trust with local communities. |
https://csvconf.com/img/speakers-2019/hdominguez.jpg |
46 |
Beyond the WARC: Making Web Archives More Useful and User-friendly |
Ilya Kreymer |
2:30 PM |
May 9 2019 |
Main Sanctuary |
https://csvconf.com/speakers/#ilya-kreymer |
2019-05-09T14:30:00 |
Archives of the web contain not only web pages but any type of data.
The only standard in web archiving is the ISO WARC file format, which specifies raw data captured from the web. However, WARC files often lack context or metadata about how the data was captured. This talk will briefly cover the basics of the WARC format, offer ideas for making web archive data more user-friendly, present existing tools, and suggest interoperable ways to describe collections and make sense of growing web archive data beyond the WARC format. |
https://csvconf.com/img/speakers-2019/ikreymer.jpg |
47 |
How a File Format Led to a Crossword Scandal |
Saul Pwanson |
2:30 PM |
May 9 2019 |
Fuller Hall |
https://csvconf.com/speakers/#saul-pwanson |
2019-05-09T14:30:00 |
In 2016 I designed a plain-text file format for crossword puzzle data, and then spent a couple of months building a micro data pipeline, scraping tens of thousands of crosswords from various sources. Then, having all those crosswords in a simple format, I wanted to see if there were any common grid patterns, and discovered egregious plagiarism by a major crossword editor that had gone on for years. This talk will cover the file format, data pipeline, and the design choices that aided rapid exploration; the evidence for the scandal, from the initial anomalies to the final damning visualization; and what it's like for a data project to get 15 minutes of fame. |
https://csvconf.com/img/speakers-2019/spwanson.jpg |