talks

Data source: https://csvconf.com/

8 rows where day = "May 8 2019" and room = "Daisy Bingham Room" sorted by rowid


rowid title speaker time day room url datetime abstract image
5 Frictionless Data Processing in the Wild Amber D. York 10:30 AM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#amber-d-york 2019-05-08T10:30:00 Frictionless Data (FD) initiatives out of Open Knowledge International provide attractive informatics and processing capabilities. The BCO-DMO data repository used FD tools on real-world datasets, and we have some lessons learned to share. By building upon existing FD tools, we found ways to reduce the amount of time data managers spend generating metadata, and writing custom scripts. We are also developing ways for data managers with varying levels of scripting ability to make use of Frictionless Data tools. https://csvconf.com/img/speakers-2019/adyork.jpg
8 What's Next after Notebooks? Alexander Morley 11:00 AM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#alexander-morley 2019-05-08T11:00:00 Jane is a data scientist. Jane uses Jupyter notebooks as her working environment, and her presentation environment. These “computational essays” allow Jane to present her methods and her results to her colleagues at the same time. Jane is happy with this. But sometimes it’s difficult for Jane to share notebooks with her colleagues, and even harder for them to re-mix or re-use parts of the notebook, or to share their changes back to Jane. And sometimes Jane finds it hard to explain the flow of a particular notebook, or how different notebooks are tied together. There’s no provision for keeping things modular. First, I will discuss a few up-and-coming projects that are leveraging the power of new web technologies and faster browsers to solve all of fictional Jane’s problems, and more. Second, I will present a prototype for my own solution that is also web-based, and draws inspiration from some now-uncool graphical programming languages. https://csvconf.com/img/speakers-2019/amorley.jpg
11 Bash <3's CSVs: Data Analysis on the cmdline Nicholas Canzoneri 11:30 AM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#nicholas-canzoneri 2019-05-08T11:30:00 Your bash shell has a _lot_ of utilities that can be used to help you analyze your data, often more easily and quickly than trying to import your data into an external tool. But these utilities can be hard to find, and it can be even harder to figure out the right options. I'll walk through a data set and show examples of the best utility to use in different situations. I'll go over common commands like `grep` and `cut`, more exotic commands like `comm` and `tr`, and dig up very useful options to a command you might have overlooked, like `sort -k`. https://csvconf.com/img/speakers-2019/ncanzoneri.jpg
16 Measurement Lab - Open Data on Global Internet Health Chris Ritzo 1:30 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#chris-ritzo 2019-05-08T13:30:00 Measurement Lab (M-Lab) is the largest open internet measurement platform in the world, hosting internet-scale measurement experiments and releasing all data into the public domain (CC0). We are an open source project with contributors from civil society organizations, educational institutions, and private sector companies, and are a fiscally sponsored project of Code for Science & Society. Our mission is to measure the Internet, save the data, and make it universally accessible and useful. M-Lab works to advance network research and empowers the public with useful information about broadband and mobile connections by maintaining a scalable, global platform for conducting internet measurements, and by supporting an ecosystem of external partners and users around the world interested in using the resulting open data. Our users are researchers, activists, analysts, journalists, experiment developers, hosting providers, regulators, municipalities, and everyday consumers. M-Lab works to enhance internet transparency, and helps to promote and sustain a healthy, innovative internet by supporting our users in their research and data analyses, developing and publicizing new use cases for our datasets, forming collaborative partnerships, and building open source measurement tools. In this talk we will introduce the M-Lab platform to the csvconf audience, share how our open data and open source tools are being used by communities around the world, and provide resources on how attendees might use them as well. https://csvconf.com/img/speakers-2019/critzo.jpg
19 Hacking Open Data in Africa Soila Kenya 2:00 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#soila-kenya 2019-05-08T14:00:00 This talk will cover the tips & tricks of community-sourcing for openAFRICA.net - the largest independent repository of open data on the African continent - used in order to digitise deadwood to give citizens actionable information. Data availability in many African countries is dismal. Files upon files of important government information lie gathering dust in abandoned storage rooms. On the other hand, journalists and citizens need this information to keep governments in check and ensure they are receiving the right services. So how do you turn paper-based government archives into machine-readable & API-accessible digital files? https://csvconf.com/img/speakers-2019/skenya.jpg
22 US Energy Data Liberation Zane Selvans 2:30 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#zane-selvans 2019-05-08T14:30:00 An alphabet soup of government agencies like FERC, EPA, EIA, PHMSA, MSHA and the ISOs and RTOs collect and publish terabytes of data about the US energy system. It includes operating costs and fuel consumption, hourly power output and GHG emissions, and the age and length of natural gas pipelines, the price of electricity every 5 minutes at thousands of nodes in the grid, coal production numbers and much much more. In theory all this data is public and freely available, but in practice it takes a lot of wrangling to make it usable for analysis. The result: it's packaged up by one or two platform monopolies that charge tens of thousands of dollars a year for easy access, excluding most non-corporate users. But for anyone interested in the ongoing transformation of our energy system and its climate impacts, this data is a treasure trove worth excavating. The Public Utility Data Liberation project (https://github.com/catalyst-cooperative/pudl) has been working for the last 2.5 years to liberate this data and make it freely accessible to activists, data journalists, and researchers working on US climate and energy policy. This talk will take a look at what the data is, where it comes from, why it's interesting, how we're processing it and making it available, and some of the challenges we're facing and opportunities we see ahead. https://csvconf.com/img/speakers-2019/zselvans.jpg
26 Fundamentals of Research Software Sustainability Daniel S. Katz 3:30 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#daniel-s-katz 2019-05-08T15:30:00 Software sustainability means different things to different groups of people, including the persistence of working software, and the persistence of people, or funding. While we can generally define sustainability as the inflow of resources being sufficient to do the needed work, where those resources both include and are somewhat transferable into human effort, users, funders, managers, and developers (or maintainers) all mean somewhat different things when they use "sustainable" in the context of research software. This talk will illustrate some of these different views, and their corresponding aims. It will also provide some guidance on quantifying research software sustainability from some of these views. https://csvconf.com/img/speakers-2019/dskatz.jpg
29 How to Feed Your Robot: Building and Maintaining Open Machine Learning Datasets Evan Tachovsky 4:00 PM May 8 2019 Daisy Bingham Room https://csvconf.com/speakers/#evan-tachovsky 2019-05-08T16:00:00 While algorithms and computing power get all the press, the special sauce behind many recent machine learning breakthroughs is meticulously labeled training data. Developing and maintaining these data sets as public goods is both an art and a science. In this talk I'll present a new set of best practices gleaned from interviews with ~20 data set builders, maintainers, and funders. Topics include: encouraging collaboration between rival data teams; finding and addressing ethical issues with crowd labeling; launching competitions to spur data set use; and revenue generation models for sustainability. https://csvconf.com/img/speakers-2019/etachovsky.jpg

CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
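
The filter described at the top of this page (day = "May 8 2019", room = "Daisy Bingham Room", sorted by rowid) can be reproduced against the schema above. What follows is a minimal sketch using Python's standard sqlite3 module and an in-memory database, not the query the site itself runs. The helper name talks_in_room and the third sample row are hypothetical, added only to show the filter excluding something. Because the table is rebuilt from scratch, the rowids here (1, 2, 3) will not match the ones listed on the page.

```python
import sqlite3

# Schema copied from the page.
SCHEMA = """
CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
"""

# The first two rows come from the table on this page (subset of columns).
# The third row is a made-up placeholder in a different room.
SAMPLE_ROWS = [
    ("Frictionless Data Processing in the Wild", "Amber D. York",
     "10:30 AM", "May 8 2019", "Daisy Bingham Room", "2019-05-08T10:30:00"),
    ("What's Next after Notebooks?", "Alexander Morley",
     "11:00 AM", "May 8 2019", "Daisy Bingham Room", "2019-05-08T11:00:00"),
    ("Placeholder talk (hypothetical)", "Placeholder speaker",
     "10:30 AM", "May 8 2019", "Placeholder room (hypothetical)",
     "2019-05-08T10:30:00"),
]


def talks_in_room(conn, day, room):
    """Return (rowid, title, speaker, time) for one day and room, by rowid."""
    cursor = conn.execute(
        "SELECT rowid, title, speaker, time FROM talks "
        "WHERE day = :day AND room = :room ORDER BY rowid",
        {"day": day, "room": room},
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO talks (title, speaker, time, day, room, datetime) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    SAMPLE_ROWS,
)

result = talks_in_room(conn, "May 8 2019", "Daisy Bingham Room")
for row in result:
    print(row)
```

Named parameters keep the day and room values out of the SQL string, so the same helper works for any other day and room combination in the table.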
Powered by Datasette