Projects

Here are some projects I have worked on for school, work, or fun. You can find the code for most of them on GitHub.

Falcon

Falcon is an interactive system for real-time brushing and linking interactions among multiple visualizations of billion-record datasets.

Dense Lines

Danyel Fisher and I developed a visualization technique to show millions of time series at once.

Pangloss

Pangloss is a big-data visualization system that uses approximation to answer queries quickly. I developed Pangloss with Danyel Fisher at MSR.

VSUPs

To visualize values and uncertainty together, Michael Correll and I created Value-Suppressing Uncertainty Palettes (VSUPs for short).

Draco

I created Draco, a formal model of visualization design with shareable design guidelines, formal reasoning over the design space, and visualization recommendation.

Vega-Lite

I am a co-author of Vega-Lite, a high-level visualization grammar for rapid specification of interactive, multi-view graphics.

Vega

I contribute to Vega, a declarative visualization grammar that introduces novel primitives for interactive visualization design.

Voyager

I co-created Voyager. The system accelerates exploratory data analysis by augmenting visual analysis tools with visualization recommendations.

CompassQL

Query language and engine that unifies visualization specifications and recommendations.

Polestar

As a control condition for Voyager, I co-created Polestar, a Tableau-like data exploration tool.

Vega-Editor

I maintain the Vega and Vega-Lite online editor.

Vega Embed

I maintain Vega-Embed, a library to embed Vega and Vega-Lite visualizations on the web.

JSON Schema Generator

I maintain a tool to generate JSON schema from TypeScript code. The tool uses the TypeScript compiler to parse the code into an AST and then generates an equivalent JSON schema. We use this tool to generate the schema for Vega-Lite.

Vega Tooltip

I maintain Vega Tooltip, a library that adds tooltips to Vega and Vega-Lite charts.

Vega Themes

I maintain a collection of reusable themes for Vega and Vega-Lite.

IPython Vega

I maintain the Jupyter Notebook extension for Vega and Vega-Lite.

JupyterLab

I contribute code to support Vega and Vega-Lite in JupyterLab.

Altair

I contribute to Altair, a popular Python wrapper for Vega-Lite.

Myria

I developed new operators and debugging tools for Myria, a distributed database system from the UW database group.

Perfopticon

I led the development of an interactive tool to investigate query execution in distributed database systems.

Google Dataset Search

During my internship with Google Research in Mountain View, I worked on the UX for Goods. The project later became Google Dataset Search.

Google Monarch

During my internship at Google in NYC in the summer of 2014, I implemented new out-of-core join and aggregation operators for a large-scale time series database inside Monarch. Monarch processes production monitoring data from various systems at Google.

Not much about this system is public, but there is a talk by John Banning about it.

CKAN

CKAN is the world’s leading open-source data portal platform used by data.gov.uk, data.gov and publicdata.eu among many others. I developed the new data store, a data ingestion service, and a preview extension API.

Text detection with NN

For a class project, I extracted 500k labeled images of figures from roughly 1M papers. For each image I created a mask that shows where the text in the image is. I then used the images to train a neural network that would find text in images.

Himawari 8

A web extension that shows the latest image from the Himawari 8 weather satellite when you open a new tab. You can install the extension from the Chrome web store and the Firefox add-on gallery.

Game of Life

I implemented Conway’s Game of Life in Python, Go, Rust, and C#. All projects are on GitHub.
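For readers unfamiliar with the rules, here is a minimal Python sketch of one generation step. It is not taken from any of the repositories mentioned above; the function name `step` and the choice to represent the board as a set of live `(x, y)` cells are assumptions made for this illustration.

```python
from collections import Counter


def step(live):
    """Return the next generation for a set of live (x, y) cells."""
    # Count, for every cell next to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly 3 live
    # neighbors, or exactly 2 and is already alive.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A "blinker" alternates between a horizontal and a vertical bar.
blinker = {(0, 1), (1, 1), (2, 1)}
next_generation = step(blinker)
```

Storing only the live cells keeps the board unbounded and makes each step proportional to the number of live cells rather than to the grid size.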

LocateControl

A simple Leaflet control to find your current location on a Leaflet map. It is very popular and is used on the OpenStreetMap home page.

MaskCanvas

Visualize coverage on a Leaflet map. The data is stored in a quadtree to make queries to the data super fast.
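To illustrate why a quadtree speeds up spatial queries, here is a small Python sketch of a point quadtree with insertion and rectangular range queries. It is not the MaskCanvas code; the class name `QuadTree` and the `capacity` and `max_depth` parameters are hypothetical choices for this example.

```python
class QuadTree:
    """Point quadtree over the square [x, x + size) x [y, y + size)."""

    def __init__(self, x, y, size, capacity=4, depth=0, max_depth=16):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.depth, self.max_depth = depth, max_depth
        self.points = []
        self.children = None  # four sub-quadrants once this node splits

    def _contains(self, px, py):
        return (self.x <= px < self.x + self.size
                and self.y <= py < self.y + self.size)

    def insert(self, px, py):
        """Insert a point; return False if it lies outside this node."""
        if not self._contains(px, py):
            return False
        if self.children is None:
            # Leaves at max_depth keep growing so duplicates cannot
            # trigger endless splitting.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((px, py))
                return True
            self._split()
        # any() stops at the first child that accepts the point.
        return any(child.insert(px, py) for child in self.children)

    def _split(self):
        half = self.size / 2
        self.children = [
            QuadTree(self.x + dx, self.y + dy, half,
                     self.capacity, self.depth + 1, self.max_depth)
            for dx in (0, half)
            for dy in (0, half)
        ]
        # Move the points stored here down into the new children.
        old_points, self.points = self.points, []
        for px, py in old_points:
            self.insert(px, py)

    def query(self, x0, y0, x1, y1):
        """Return all points with x0 <= x <= x1 and y0 <= y <= y1."""
        # Prune subtrees whose region does not overlap the query rectangle.
        if (x1 < self.x or x0 >= self.x + self.size
                or y1 < self.y or y0 >= self.y + self.size):
            return []
        if self.children is None:
            return [(px, py) for px, py in self.points
                    if x0 <= px <= x1 and y0 <= py <= y1]
        return [p for child in self.children
                for p in child.query(x0, y0, x1, y1)]


tree = QuadTree(0, 0, 100)
for i in range(0, 100, 10):
    tree.insert(i, i)
hits = sorted(tree.query(15, 15, 45, 45))
```

The query only descends into quadrants that overlap the requested rectangle, so most of the data is never touched when the rectangle is small.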

Heatmap layer

For this project I combined the MaskCanvas layer and heatmap.js from my friend Patrick.

Space Clean Up

A Bomberman clone written in Squeak Smalltalk.

MapLink

For Patrick Baudisch’s HCI class at HPI, we had to implement an application that usually requires large screens on an iPod nano (enough for 4 buttons). We implemented a GIS application to geo-reference images by simply aligning points. We introduced the concept of an X-Ray layer to align points on small devices.

Singing VHDL board

Together with another student at HPI, I built a music player in VHDL, a hardware description language.

SoSat

A SAT solver that uses different statistical optimization algorithms to solve SAT problems encoded in the DIMACS format. This solver is written in Python and uses NumPy to speed up calculations. The two main algorithms in this solver are an ant colony optimization algorithm and a genetic algorithm. To support these algorithms, there are some pre-processing algorithms.
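As a rough illustration of the input format and of the kind of score an optimization-based solver can maximize, here is a short sketch that parses DIMACS CNF text and counts how many clauses an assignment satisfies. This is not SoSat's code: the function names `parse_dimacs` and `satisfied_clauses` are hypothetical, and the sketch uses plain Python rather than NumPy to stay self-contained.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into (num_vars, clauses)."""
    num_vars, clauses, current = 0, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue  # blank line or comment
        if line.startswith("p"):
            # Problem line: "p cnf <num_vars> <num_clauses>"
            num_vars = int(line.split()[2])
            continue
        for token in line.split():
            literal = int(token)
            if literal == 0:  # a 0 terminates the current clause
                clauses.append(current)
                current = []
            else:
                current.append(literal)
    return num_vars, clauses


def satisfied_clauses(clauses, assignment):
    """Count satisfied clauses; assignment[i] is the value of variable i + 1."""
    return sum(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


example = """c tiny example
p cnf 3 2
1 -3 0
2 3 -1 0
"""
num_vars, clauses = parse_dimacs(example)
score = satisfied_clauses(clauses, [True, False, False])
```

An assignment solves the problem when the count equals the number of clauses, so a genetic algorithm or ant colony optimizer could use this count as its fitness value.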

Tagshot

Tagshot is a photo management tool in the browser. We designed and developed it as a class project. Our goal was to create a tool that lets users efficiently manage large numbers of photos and, in particular, add tags.

Optimal ATM placement

Our contribution to the Informaticup 2011. The algorithms we developed compute optimal placements of ATMs in a city. We won the competition and were invited to present our work in Bonn.

Shopping tour optimizer

Our contribution to the Informaticup 2012. We developed algorithms to find the optimal tour to buy items at stores.