27 Missing Data for Data - Our Quest to Clean Up Institutional Affiliations in Dryad Daniella Lowenberg, Ted Habermann 4:00 PM May 8 2019 Main Sanctuary https://csvconf.com/speakers/#daniella-lowenberg-ted-habermann 2019-05-08T16:00:00 Data publications and other scholarly outputs do not have clean information on institutional affiliations for researchers. This is caused by a mix of researchers not being asked for this information up front and incomplete metadata being submitted by repositories to DataCite and (for publications) to Crossref. Without this standardized information, we can't properly report on or provide statistics on deposits, usage metrics, or reach by institution. Join us for a session about our work using OpenRefine, organizational identifiers (ROR), and some manual sleuthing to update and improve Dryad institutional metadata for 25,000 data publications. https://csvconf.com/img/speakers-2019/dlowenberg_thabermann.jpg
53 Spanking and Spreadsheets: Data-driven Sex Journalism Jacqueline Nolis & Heather Nolis 4:00 PM May 9 2019 Main Sanctuary https://csvconf.com/speakers/#jacqueline-nolis-heather-nolis 2019-05-09T16:00:00 When we saw that the Stranger, Seattle’s alternative newspaper, was running a survey on kinks and sexual preferences, we knew we had to get our hands on the data. We convinced the paper that using machine learning methods on the responses would be a good idea, and then we quickly set out to analyze them. But we had never written an article for a newspaper before, nor had we worked with data even remotely as dirty. It turns out that writing a good blog post or technical journal article is very different from writing for print, especially on such a sensitive topic. In this talk we will cover how we made sense of the lewd data, the statistical methods we used (and the failures we produced), and the final results that ended up in our feature article: “There Are Four Kinds of Sex Partners (which one are you).” https://csvconf.com/img/speakers-2019/jnolis_hnolis.jpg
