{"rowid": 20, "title": "The Streets of Women. An Analysis of Street Nomenclature Data in Latin America and Spain through OpenStreetMap and Wikipedia", "speaker": "Selene Yang", "time": "2:30 PM", "day": "May 8 2019", "room": "Main Sanctuary", "url": "https://csvconf.com/speakers/#selene-yang", "datetime": "2019-05-08T14:30:00", "abstract": "This is a collaborative project of Geochicas to produce a map of the streets named after women in Latin America and Spain. This project seeks to link and generate content in OSM and Wikipedia about prominent women. It aims to survey the streets, avenues, passages, and roads named after women, along with their respective biographies in Wikipedia.", "image": "https://csvconf.com/img/speakers-2019/syang.jpg"} {"rowid": 21, "title": "Using Research and Technology to Tackle Gender Bias", "speaker": "Mollie Marr", "time": "2:30 PM", "day": "May 8 2019", "room": "Fuller Hall", "url": "https://csvconf.com/speakers/#mollie-marr", "datetime": "2019-05-08T14:30:00", "abstract": "Qualitative and mixed methods research studies have provided insight into the language and patterns associated with unconscious bias across multiple fields. These patterns can be converted to rules for NLP and other text analysis programs making it possible to identify bias within a written document. This talk will explore one approach using qualitative research in gender bias and letters of recommendation and evaluation to define rules for a web-based automated text analysis program using NLP. 
The role of research and technology in addressing structural issues such as bias will be discussed and participants will be encouraged to think about ways in which existing research might be used to inspire new solutions to social problems.", "image": "https://csvconf.com/img/speakers-2019/mmarr.jpg"} {"rowid": 22, "title": "US Energy Data Liberation", "speaker": "Zane Selvans", "time": "2:30 PM", "day": "May 8 2019", "room": "Daisy Bingham Room", "url": "https://csvconf.com/speakers/#zane-selvans", "datetime": "2019-05-08T14:30:00", "abstract": "An alphabet soup of government agencies like FERC, EPA, EIA, PHMSA, MSHA and the ISOs and RTOs collect and publish terabytes of data about the US energy system. It includes operating costs and fuel consumption, hourly power output and GHG emissions, the age and length of natural gas pipelines, the price of electricity every 5 minutes at thousands of nodes in the grid, coal production numbers and much, much more. In theory all this data is public and freely available, but in practice it takes a lot of wrangling to make it usable for analysis. The result: it's packaged up by one or two platform monopolies that charge tens of thousands of dollars a year for easy access, excluding most non-corporate users. But for anyone interested in the ongoing transformation of our energy system and its climate impacts, this data is a treasure trove worth excavating. The Public Utility Data Liberation project (https://github.com/catalyst-cooperative/pudl) has been working for the last 2.5 years to liberate this data and make it freely accessible to activists, data journalists, and researchers working on US climate and energy policy. 
This talk will take a look at what the data is, where it comes from, why it's interesting, how we're processing it and making it available, and some of the challenges we're facing and opportunities we see ahead.", "image": "https://csvconf.com/img/speakers-2019/zselvans.jpg"} {"rowid": 46, "title": "Beyond the WARC: Making Web Archives More Useful and User-friendly", "speaker": "Ilya Kreymer", "time": "2:30 PM", "day": "May 9 2019", "room": "Main Sanctuary", "url": "https://csvconf.com/speakers/#ilya-kreymer", "datetime": "2019-05-09T14:30:00", "abstract": "Archives of the web contain not only web pages but any type of data.\nThe only standard in web archiving is the ISO WARC file format, which specifies raw data captured from the web. However, the WARC files often lack any context or metadata about how this data was captured. The talk will briefly cover the basics of the WARC format, and also provide possible ideas for making web archiving data more user-friendly, present existing tools and suggest ideas for interoperable ways to describe collections and make sense of growing web archive data beyond the WARC format.", "image": "https://csvconf.com/img/speakers-2019/ikreymer.jpg"} {"rowid": 47, "title": "How a File Format Led to a Crossword Scandal", "speaker": "Saul Pwanson", "time": "2:30 PM", "day": "May 9 2019", "room": "Fuller Hall", "url": "https://csvconf.com/speakers/#saul-pwanson", "datetime": "2019-05-09T14:30:00", "abstract": "In 2016 I designed a plain-text file format for crossword puzzle data, and then spent a couple of months building a micro-data-pipeline, scraping tens of thousands of crosswords from various sources. Then, having all those crosswords in a simple format, I wanted to see if there were any common grid patterns--and discovered egregious plagiarism by a major crossword editor that had gone on for years. 
This talk covers the file format, data pipeline, and the design choices that aided rapid exploration; the evidence for the scandal, from the initial anomalies to the final damning visualization; and what it's like for a data project to get 15 minutes of fame.", "image": "https://csvconf.com/img/speakers-2019/spwanson.jpg"} {"rowid": 48, "title": "How open data can promote participatory democracy", "speaker": "Hector Dominguez", "time": "2:30 PM", "day": "May 9 2019", "room": "Daisy Bingham Room", "url": "https://csvconf.com/speakers/#hector-dominguez", "datetime": "2019-05-09T14:30:00", "abstract": "In this discussion, I will explore the nuances of building an open data program as a step towards participatory democracy and the challenges of creating trust with local communities.", "image": "https://csvconf.com/img/speakers-2019/hdominguez.jpg"}