talks

Data source: https://csvconf.com/

1 row where day = "May 9 2019" and url = "https://csvconf.com/speakers/#saul-pwanson" sorted by time

rowid: 47
title: How a File Format Led to a Crossword Scandal
speaker: Saul Pwanson
time: 2:30 PM
day: May 9 2019
room: Fuller Hall
url: https://csvconf.com/speakers/#saul-pwanson
datetime: 2019-05-09T14:30:00
abstract: In 2016 I designed a plain-text file format for crossword puzzle data, and then spent a couple of months building a micro-data-pipeline, scraping tens of thousands of crosswords from various sources. Then, having all those crosswords in a simple format, I wanted to see if there were any common grid patterns--and discovered egregious plagiarism by a major crossword editor that had gone on for years. This talk would cover the file format, data pipeline, and the design choices that aided rapid exploration; the evidence for the scandal, from the initial anomalies to the final damning visualization; and what it's like for a data project to get 15 minutes of fame.
image: https://csvconf.com/img/speakers-2019/spwanson.jpg

CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
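
The filter described above (day = "May 9 2019" and a specific speaker url, sorted by time) can be reproduced against this schema. The following is a minimal sketch using Python's standard sqlite3 module. The in-memory database, the helper name find_talks, and the choice to load only one row with its abstract and image left out are assumptions made for illustration, not part of the original page.

```python
import sqlite3

# Schema copied from the page above.
SCHEMA = """
CREATE TABLE [talks] (
   [title] TEXT,
   [speaker] TEXT,
   [time] TEXT,
   [day] TEXT,
   [room] TEXT,
   [url] TEXT,
   [datetime] TEXT,
   [abstract] TEXT,
   [image] TEXT
)
"""


def find_talks(conn, day, url):
    """Hypothetical helper: rows matching the page's filter, sorted by time."""
    sql = (
        "select rowid, title, speaker, time, room, datetime "
        "from talks where day = :day and url = :url order by time"
    )
    return conn.execute(sql, {"day": day, "url": url}).fetchall()


# Assumption: an in-memory database stands in for the real csvconf database.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Load the single row shown on the page (abstract and image omitted here).
conn.execute(
    "insert into talks (rowid, title, speaker, time, day, room, url, datetime) "
    "values (?, ?, ?, ?, ?, ?, ?, ?)",
    (
        47,
        "How a File Format Led to a Crossword Scandal",
        "Saul Pwanson",
        "2:30 PM",
        "May 9 2019",
        "Fuller Hall",
        "https://csvconf.com/speakers/#saul-pwanson",
        "2019-05-09T14:30:00",
    ),
)

rows = find_talks(conn, "May 9 2019", "https://csvconf.com/speakers/#saul-pwanson")
for row in rows:
    print(row)
```

Because time is stored as TEXT in a form like "2:30 PM", ordering by it compares strings rather than clock times. With more than one matching row, ordering by the ISO 8601 datetime column would likely give a chronological result instead.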