Here are some projects I have worked on for school, work, or fun that you might find interesting. You can find the code for most of them on GitHub.
- Side projects
An interactive crossfilter application for big data. Check it out at github.com/uwdata/falcon.
With the Interactive Data Lab, we created the Vega stack and a number of tools for data exploration based on it.
I am a co-author of Vega-Lite, a high-level visualization grammar. It provides a concise JSON syntax for rapidly generating visualizations to support analysis. Vega-Lite specifications can be compiled to Vega specifications. Read more in our blog post.
Polestar and Voyager
IPython/Jupyter notebook module for Vega and Vega-Lite. The code is on GitHub.
Data Search at Google Research
During my internship with Google Research in Mountain View, I worked on the UX for Goods. The system is described in this paper.
Production System Monitoring at Google
During my internship at Google in NYC in the summer of 2014, I implemented new out-of-core join and aggregation operators for a large-scale time series database. The system processes production monitoring data from various systems at Google.
Not much about this system is public, but there is a talk by John Banning about it.
CKAN at the Open Knowledge Foundation
CKAN is a powerful data management system that makes data accessible by providing tools to streamline publishing, sharing, finding, and using data. It is aimed at data publishers (national and regional governments, companies, and organizations) that want to make their data open and available. CKAN is mainly developed by the Open Knowledge Foundation.
Development happens on GitHub.
Text detection with neural networks
For a class project, I extracted 500k labeled images of figures from roughly 1M papers. For each image, I created a mask that shows where the text in the image is. I then used the images and masks to train a neural network that finds text in images. The project is on GitHub.
Space Clean Up
A Bomberman clone written in Squeak Smalltalk. The code is on GitHub.
For Patrick Baudisch’s HCI class at HPI, we had to implement an application that usually requires a large screen on an iPod nano (enough for 4 buttons). We implemented a GIS application that allows users to geo-reference images by simply aligning points. We introduced the concept of an X-Ray layer to enable users on small devices to align points.
You can find more details in the paper.
Singing VHDL board
The glass is half full
In this class project, I wrote about how we can use an optimistic approach to concurrency control. You can find the paper here.
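As a rough illustration of the general idea (not taken from the paper), optimistic concurrency control lets a writer proceed without locks and only checks at commit time whether someone else changed the data in the meantime. The `VersionedStore` class and its method names below are hypothetical and exist only for this sketch; it is single-threaded and leaves out everything a real system needs, such as atomic commits.

```python
class VersionedStore:
    """Toy key-value store with version-checked commits (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Return (version, value); unknown keys start at version 0.
        return self._data.get(key, (0, None))

    def commit(self, key, expected_version, new_value):
        # Validate: succeed only if nobody committed since we read.
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            return False  # conflict: the caller should re-read and retry
        self._data[key] = (current_version + 1, new_value)
        return True


store = VersionedStore()
version, _ = store.read("x")                   # two writers read the same version
first_ok = store.commit("x", version, "a")     # the first commit wins
second_ok = store.commit("x", version, "b")    # the second sees a stale version
```

The second writer is expected to read again and retry, which is cheap when conflicts are rare.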
A SAT solver that uses different statistical optimization algorithms to solve SAT problems encoded in the DIMACS format. The solver is written in Python and uses NumPy to speed up calculations. Its two main algorithms are an ant colony optimization algorithm and a genetic algorithm. Some pre-processing algorithms support them.
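For readers unfamiliar with DIMACS, here is a minimal sketch of reading a CNF file and scoring a candidate assignment by the number of satisfied clauses. This is not the solver's actual code: the function names are made up for this example, and using the satisfied-clause count as the objective is an assumption about what a genetic or ant colony search would typically optimize.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into (num_vars, clauses)."""
    num_vars = 0
    clauses = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue  # blank line or comment
        if line.startswith("p"):
            # Problem line: p cnf <num_vars> <num_clauses>
            num_vars = int(line.split()[2])
            continue
        for token in line.split():
            literal = int(token)
            if literal == 0:  # 0 terminates a clause
                clauses.append(current)
                current = []
            else:
                current.append(literal)
    return num_vars, clauses


def count_satisfied(clauses, assignment):
    """Count clauses with at least one true literal.

    `assignment` maps a variable number to True or False.
    """
    return sum(
        any(assignment[abs(literal)] == (literal > 0) for literal in clause)
        for clause in clauses
    )


EXAMPLE = """c tiny example
p cnf 3 2
1 -3 0
2 3 -1 0
"""

num_vars, clauses = parse_dimacs(EXAMPLE)
score = count_satisfied(clauses, {1: True, 2: True, 3: False})
```

A formula is satisfied when the score equals the number of clauses.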
Tagshot is a photo management tool in the browser. We designed and developed it as a class project. Our goal was to create a tool that lets users efficiently manage large numbers of photos and, in particular, add tags. The code is on GitHub and I wrote a blog post about it.
A WebExtension that shows the latest image from the Himawari 8 weather satellite when you open a new tab. The code is on GitHub and you can install the extension from the Chrome Web Store and the Firefox add-on gallery.
Game of Life
I implemented Conway’s Game of Life in Python, Go, Rust, and C#. All projects are on GitHub.
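For reference, one generation of the Game of Life fits in a few lines of Python. This is a generic sketch over a set of live cells on an unbounded grid, not necessarily how any of the linked implementations are written.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) cells."""
    # Count how many live neighbors each cell has.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next if it has 3 neighbors, or 2 and is already alive.
    return {
        cell
        for cell, count in neighbor_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


blinker = {(0, 1), (1, 1), (2, 1)}  # horizontal bar of three cells
next_generation = step(blinker)     # becomes a vertical bar
```

Applying `step` twice to the blinker returns the original pattern.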
A simple control that finds your current location on a Leaflet map. It is very popular and is used on the OpenStreetMap home page.
The code is at github.com/domoritz/leaflet-locatecontrol.
Visualize coverage on a Leaflet map. The data is stored in a quadtree to make queries super fast.
For this project I combined the MaskCanvas layer and heatmap.js from my friend Patrick.
You can learn more about it at patrick-wied.at/static/heatmapjs/example-heatmap-leaflet.html
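To illustrate why a quadtree makes such lookups fast, here is a minimal point quadtree in Python. It is a sketch of the data structure only, not the code used in the project (which runs in the browser). The class name, the `capacity` and `max_depth` parameters, and the example coordinates are all assumptions for this example.

```python
class QuadTree:
    """Point quadtree over the square [x, x + size) x [y, y + size)."""

    def __init__(self, x, y, size, capacity=4, depth=0, max_depth=16):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.depth, self.max_depth = depth, max_depth
        self.points = []
        self.children = None  # four sub-quadrants once this node splits

    def insert(self, px, py):
        """Insert a point; callers must keep it inside the root square."""
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((px, py))
                return
            self._split()
        self._child_for(px, py).insert(px, py)

    def _child_for(self, px, py):
        half = self.size / 2
        index = (2 if px >= self.x + half else 0) + (1 if py >= self.y + half else 0)
        return self.children[index]

    def _split(self):
        half = self.size / 2
        self.children = [
            QuadTree(self.x + dx, self.y + dy, half,
                     self.capacity, self.depth + 1, self.max_depth)
            for dx in (0, half)
            for dy in (0, half)
        ]
        # Push the points stored here down into the new quadrants.
        points, self.points = self.points, []
        for px, py in points:
            self._child_for(px, py).insert(px, py)

    def query(self, x0, y0, x1, y1):
        """Return points with x0 <= px < x1 and y0 <= py < y1."""
        # Skip this whole subtree if it does not overlap the query box.
        if (x1 <= self.x or x0 >= self.x + self.size
                or y1 <= self.y or y0 >= self.y + self.size):
            return []
        found = [(px, py) for px, py in self.points
                 if x0 <= px < x1 and y0 <= py < y1]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(x0, y0, x1, y1))
        return found


tree = QuadTree(0, 0, 256, capacity=2)
for point in [(10, 10), (200, 30), (40, 220), (130, 130), (12, 14), (250, 250)]:
    tree.insert(*point)
nearby = sorted(tree.query(0, 0, 64, 64))
```

A query only descends into quadrants that overlap the requested box, so most of the data is never touched when the box is small.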