csvconf: talks: 6 rows where time = "4:00 PM" sorted by rowid

talks

6 rows where time = "4:00 PM" sorted by rowid

Link	rowid	title	speaker	time	day	room	url	datetime	abstract	image
27	27	Missing Data for Data - Our Quest to Clean Up Institutional Affiliations in Dryad	Daniella Lowenberg, Ted Habermann	4:00 PM	May 8 2019	Main Sanctuary	https://csvconf.com/speakers/#daniella-lowenberg-ted-habermann	2019-05-08T16:00:00	Data publications and other scholarly outputs do not have clean information on institutional affiliations for researchers. This is caused by a mix of not asking researchers for this information up front, as well as incomplete metadata being submitted by repositories to DataCite and (publications to) Crossref. Without this standardized information we can't properly report on or provide statistics on deposits, usage metrics, or reach by institution. Join us for a session about our work using OpenRefine, organizational identifiers (ROR), and some manual sleuthing to update and improve Dryad institutional metadata for 25,000 data publications.	https://csvconf.com/img/speakers-2019/dlowenberg_thabermann.jpg
28	28	Where Has Your Data Come From? Data Ancestry and Other Tales	Dr. Tania Allard	4:00 PM	May 8 2019	Fuller Hall	https://csvconf.com/speakers/#dr-tania-allard	2019-05-08T16:00:00	Over the last few years, great improvements have been made in the areas of reproducible scientific computing research and FAIR (findable, accessible, interoperable and reusable) data. As a consequence, data scientists and researchers alike have started to incorporate modern software development practices in their workflows (e.g. version control, testing). More and more emphasis has been placed on the need to look after the quality and validity of the software developed. But what about the data? Data validation and integrity are just as important as the adequacy of the code ingesting and processing the datasets. In this talk, I will take a high-level look at concepts such as data lineage, provenance, and continuous data validation, and present real-world examples in which these concepts have been applied to different data pipelines, increasing not only the confidence in the results obtained but also the efficiency and integrity of the workflows themselves.	https://csvconf.com/img/speakers-2019/tallard.jpg
29	29	How to Feed Your Robot: Building and Maintaining Open Machine Learning Datasets	Evan Tachovsky	4:00 PM	May 8 2019	Daisy Bingham Room	https://csvconf.com/speakers/#evan-tachovsky	2019-05-08T16:00:00	While algorithms and computing power get all the press, the special sauce behind many recent machine learning breakthroughs is meticulously labeled training data. Developing and maintaining these data sets as public goods is both an art and a science. In this talk I'll present a new set of best practices gleaned from interviews with ~20 data set builders, maintainers, and funders. Topics include: encouraging collaboration between rival data teams; finding and addressing ethical issues with crowd labeling; launching competitions to spur data set use; and revenue generation models for sustainability.	https://csvconf.com/img/speakers-2019/etachovsky.jpg
53	53	Spanking and Spreadsheets: Data-driven Sex Journalism	Jacqueline Nolis & Heather Nolis	4:00 PM	May 9 2019	Main Sanctuary	https://csvconf.com/speakers/#jacqueline-nolis-heather-nolis	2019-05-09T16:00:00	When we saw that the Stranger, Seattle’s alternative newspaper, was running a survey on kinks and sexual preferences, we knew we had to get our hands on the data. We convinced them that using machine learning methods on the responses would be a good idea, and then we quickly set out to analyze them. But we had never written an article for a newspaper before, nor had we worked with data even remotely as dirty. It turns out that writing a good blog post or technical journal article is very different from writing for print, especially on such a sensitive topic. In this talk we will cover how we made sense of the lewd data, the statistical methods we used (and failures we produced), as well as the final results that ended up in our feature article: “There Are Four Kinds of Sex Partners (which one are you).”	https://csvconf.com/img/speakers-2019/jnolis_hnolis.jpg
54	54	Annotations in the Classroom; The Classroom in Annotations	Asura Enkhbayar	4:00 PM	May 9 2019	Fuller Hall	https://csvconf.com/speakers/#asura-enkhbayar	2019-05-09T16:00:00	In this talk I want to explore the impact of using Hypothesis in the classroom. What does it mean to read, think, and annotate publicly? How does it change your learning experience as a student? How do you evaluate and assess different annotation styles as a teacher? As a student I can share my own experience of this new mode of teaching and learning. As a data scientist, I want to give a taste of possible new metrics and measurements based on annotation data. Finally, as a critical scholar I am hoping to explore how this new metrification and monitoring of reading might affect education. The talk will rely on data outlined in this essay: https://course-journals.lib.sfu.ca/index.php/pdc2018/article/view/240/213	https://csvconf.com/img/speakers-2019/aenkhbayar.jpg
55	55	Data Scavenger Hunts: Learning about Data Together	Ted Laderas	4:00 PM	May 9 2019	Daisy Bingham Room	https://csvconf.com/speakers/#ted-laderas	2019-05-09T16:00:00	Data exploration and visualization are a highly accessible gateway activity to learning data science. In this talk, we discuss our experience with "Data Scavenger Hunts" using web apps to democratize data science and make it accessible to a wide variety of audiences. In order to achieve this, we have developed an R package called `burro` that can enable public datasets to be explored together via a shareable web app. We describe our experience with using data scavenger hunts to teach each other interesting things about data. In particular, we share our experiences with exploring the NHANES (National Health and Nutrition Examination Survey) data and the insights we have taught each other. We show that this guided and communal data exploration leads to increased confidence and curiosity about data science in Biodata-Club, our learning community. `burro` apps can be deployed by anyone to start conversations about data.	https://csvconf.com/img/speakers-2019/tladeras.jpg



CREATE TABLE [talks] ( [title] TEXT, [speaker] TEXT, [time] TEXT, [day] TEXT, [room] TEXT, [url] TEXT, [datetime] TEXT, [abstract] TEXT, [image] TEXT )
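The filter in the page heading (time = "4:00 PM", sorted by rowid) can be reproduced against this schema with Python's built-in `sqlite3` module. This is only a minimal sketch: the in-memory database, the `talks_at` helper, and the second sample row are assumptions made for illustration, and the rowids it produces (1, 2) will not match the 27-55 values in the real table.

```python
import sqlite3

# Schema copied from the page above.
SCHEMA = """
CREATE TABLE [talks] (
    [title] TEXT, [speaker] TEXT, [time] TEXT, [day] TEXT, [room] TEXT,
    [url] TEXT, [datetime] TEXT, [abstract] TEXT, [image] TEXT
)
"""


def talks_at(conn, time):
    """Return (rowid, title, room) for talks at `time`, sorted by rowid.

    Hypothetical helper, not part of Datasette or the csvconf data.
    """
    return conn.execute(
        "SELECT rowid, title, room FROM talks WHERE time = ? ORDER BY rowid",
        (time,),
    ).fetchall()


# In-memory database standing in for the real csvconf database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# One row taken from the table above and one made-up placeholder row;
# columns not listed here are left NULL.
conn.executemany(
    "INSERT INTO talks (title, speaker, time, day, room) VALUES (?, ?, ?, ?, ?)",
    [
        ("Data Scavenger Hunts: Learning about Data Together",
         "Ted Laderas", "4:00 PM", "May 9 2019", "Daisy Bingham Room"),
        ("Placeholder morning talk",
         "Placeholder Speaker", "9:00 AM", "May 9 2019", "Fuller Hall"),
    ],
)

for row in talks_at(conn, "4:00 PM"):
    print(row)
```

The query passes the time as a bound parameter rather than formatting it into the SQL string, so the double quotes inside a value such as "4:00 PM" need no escaping.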