talks

Data source: https://csvconf.com/

6 rows where time = "4:00 PM" sorted by datetime

rowid title speaker time day room url datetime abstract image
27 Missing Data for Data - Our Quest to Clean Up Institutional Affiliations in Dryad Daniella Lowenberg, Ted Habermann 4:00 PM May 8 2019 Main Sanctuary https://csvconf.com/speakers/#daniella-lowenberg-ted-habermann 2019-05-08T16:00:00 Data publications and other scholarly outputs do not have clean information on institutional affiliations for researchers. This is caused by a mix of not asking researchers for this information up front, as well as incomplete metadata being submitted by repositories to DataCite and (publications to) Crossref. Without this standardized information we can't properly report on or provide statistics on deposits, usage metrics, or reach by institution. Join us for a session about our work using OpenRefine, organizational identifiers (ROR), and some manual sleuthing to update and improve Dryad institutional metadata for 25,000 data publications. https://csvconf.com/img/speakers-2019/dlowenberg_thabermann.jpg
28 Where Has Your Data Come From? Data Ancestry and Other Tales Dr. Tania Allard 4:00 PM May 8 2019 Fuller Hall https://csvconf.com/speakers/#dr-tania-allard 2019-05-08T16:00:00 Over the last few years, great improvements have been made around the areas of reproducible scientific computing research and FAIR (findable, accessible, interoperable and reusable) data. As a consequence, data scientists and researchers alike have started to incorporate modern software development practices in their workflows (e.g. version control, testing). More and more emphasis has been placed on the need to look after the quality and validity of the software developed. But what about the data? Data validation and integrity is just as important as the adequacy of the code ingesting and processing the datasets. In this talk, I will take a high-level look at concepts such as data lineage, provenance, continuous data validation and present real-world examples in which these concepts have been applied to different real-world data pipelines increasing not only the confidence of the results obtained but also the efficiency and integrity of the workflows themselves. https://csvconf.com/img/speakers-2019/tallard.jpg
29 How to Feed Your Robot: Building and Maintaining Open Machine Learning Datasets Evan Tachovsky 4:00 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#evan-tachovsky 2019-05-08T16:00:00 While algorithms and computing power get all the press, the special sauce behind many recent machine learning breakthroughs is meticulously labeled training data. Developing and maintaining these data sets as public goods is both an art and a science. In this talk I'll present a new set of best practices gleaned from interviews with ~20 data set builders, maintainers, and funders. Topics include: encouraging collaboration between rival data teams; finding and addressing ethical issues with crowd labeling; launching competitions to spur data set use; and revenue generation models for sustainability. https://csvconf.com/img/speakers-2019/etachovsky.jpg
53 Spanking and Spreadsheets: Data-driven Sex Journalism Jacqueline Nolis & Heather Nolis 4:00 PM May 9 2019 Main Sanctuary https://csvconf.com/speakers/#jacqueline-nolis-heather-nolis 2019-05-09T16:00:00 When we saw that the Stranger, Seattle’s alternative newspaper, was running a survey on kinks and sexual preferences, we knew we had to get our hands on the data. We convinced them that using machine learning methods on the responses would be a good idea, and then we quickly set out to analyze them. But we had never written an article for a newspaper before, nor had we worked with data even remotely as dirty. It turns out what makes for a good blog post or technical journal is very different than writing for print, especially for such a sensitive topic. In this talk we will cover how we made sense of the lewd data, the statistical methods we used (and failures we produced), as well as the final results that ended up in our feature article: “There Are Four Kinds of Sex Partners (which one are you).” https://csvconf.com/img/speakers-2019/jnolis_hnolis.jpg
54 Annotations in the Classroom; The Classroom in Annotations Asura Enkhbayar 4:00 PM May 9 2019 Fuller Hall https://csvconf.com/speakers/#asura-enkhbayar 2019-05-09T16:00:00 In this talk I want to explore the impact of using Hypothesis in the classroom. What does it mean to read, think, and annotate publicly? How does it change your learning experience as a student? How do you evaluate and assess different annotation styles as a teacher? As a student I can share my own experience of this new mode of teaching and learning. As a data scientist, I want to give a taste of possible new metrics and measurements based on annotation data. Finally, as a critical scholar I am hoping to explore how this new metrification and monitoring of reading might affect education. The talk will rely on data outlined in this essay: https://course-journals.lib.sfu.ca/index.php/pdc2018/article/view/240/213 https://csvconf.com/img/speakers-2019/aenkhbayar.jpg
55 Data Scavenger Hunts: Learning about Data Together Ted Laderas 4:00 PM May 9 2019 Daisy Bingham Room https://csvconf.com/speakers/#ted-laderas 2019-05-09T16:00:00 Data exploration and visualization are a highly accessible gateway activity to learning data science. In this talk, we discuss our experience with "Data Scavenger Hunts" using web apps to democratize data science and make it accessible to a wide variety of audiences. In order to achieve this, we have developed an R package called `burro` that can enable public datasets to be explored together via a sharable web app. In this talk, we talk about our experience with using data scavenger hunts to teach each other interesting things about data. In particular, we share our experiences with exploring the NHANES (National Health and Nutrition Examination Survey) data and the insights we have taught each other. We show that this guided and communal data exploration leads to increased confidence and curiosity about data science in Biodata-Club, our learning community. `burro` apps can be deployed by anyone to start conversations about data. https://csvconf.com/img/speakers-2019/tladeras.jpg

CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
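
The listing above (rows where time = "4:00 PM", sorted by datetime) can be reproduced against this schema with a short script. What follows is a minimal sketch, not the query Datasette itself runs: it assumes Python's standard `sqlite3` module and an in-memory database, the helper names `build_demo_db` and `talks_at` are hypothetical, and the sample data is two rows from the listing (a subset of columns) plus one invented 9:00 AM row added only so the filter has something to exclude.

```python
import sqlite3

# Schema copied from the page.
SCHEMA = """
CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
"""

# Columns: title, speaker, time, day, room, datetime.
# The first two rows come from the listing; the third is a made-up
# placeholder (assumption) so that the WHERE clause filters something out.
SAMPLE_ROWS = [
    ("Data Scavenger Hunts: Learning about Data Together", "Ted Laderas",
     "4:00 PM", "May 9 2019", "Daisy Bingham Room", "2019-05-09T16:00:00"),
    ("Where Has Your Data Come From? Data Ancestry and Other Tales",
     "Dr. Tania Allard", "4:00 PM", "May 8 2019", "Fuller Hall",
     "2019-05-08T16:00:00"),
    ("Placeholder talk (hypothetical)", "Nobody",
     "9:00 AM", "May 8 2019", "Main Sanctuary", "2019-05-08T09:00:00"),
]


def build_demo_db():
    """Create an in-memory database with the talks schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO [talks] ([title], [speaker], [time], [day], [room], [datetime]) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    return conn


def talks_at(conn, time):
    """Return (rowid, title, room, datetime) for talks at `time`, oldest first.

    [datetime] is stored as TEXT, but ISO 8601 strings sort chronologically
    when compared as text, so ORDER BY works without any date parsing.
    """
    cur = conn.execute(
        "SELECT rowid, [title], [room], [datetime] FROM [talks] "
        "WHERE [time] = ? ORDER BY [datetime]",
        (time,),
    )
    return cur.fetchall()


conn = build_demo_db()
rows = talks_at(conn, "4:00 PM")
for rowid, title, room, dt in rows:
    print(rowid, dt, room, title)
```

Passing the time as a bound parameter (`?`) rather than formatting it into the SQL string avoids quoting problems with values such as "4:00 PM". The rowid values in this demo reflect insertion order in the sample data and will not match the rowids shown in the listing.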