This page was last updated in March 2015.

Published

The central 100 × 100 pixel region of this image was deleted and then inpainted using a diffusion-process-based probabilistic model of natural images.


Defining probabilistic models using diffusion processes

  • Recent work in both nonequilibrium statistical mechanics and sequential Monte Carlo characterizes properties of an intractable distribution in terms of an expectation over a trajectory, typically initialized at a tractable distribution and terminated at the intractable target. Here we turn this idea on its head: rather than using a trajectory to evaluate a distribution that has been defined some other way, we define the target distribution as the endpoint of a diffusion trajectory. The resulting model can be sampled from exactly, evaluated easily, and trained more easily (a toy sketch of the forward diffusion appears below).
  • In collaboration with Eric Weiss, Niru Maheswaranathan, and Surya Ganguli.
  • See the ICML paper, and the released code.
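
A minimal sketch of the forward half of this construction, assuming a simple Gaussian diffusion (toy code, not the released implementation): data is incrementally noised until all structure is destroyed, and the generative model is trained to reverse each of these small steps.

```python
import numpy as np

def forward_diffusion(x0, betas, rng):
    """Run the fixed forward (noising) trajectory x_0 -> x_T.

    Each step applies x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps,
    so the trajectory ends near an isotropic Gaussian regardless of x_0.
    The generative model is the learned reversal of this trajectory.
    """
    x = x0.copy()
    trajectory = [x.copy()]
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        trajectory.append(x.copy())
    return trajectory

rng = np.random.default_rng(0)
x0 = rng.choice([-1.0, 1.0], size=100)     # stand-in for a data vector
betas = np.linspace(1e-4, 0.05, 1000)      # noise schedule
traj = forward_diffusion(x0, betas, rng)
print(np.corrcoef(traj[0], traj[-1])[0, 1])  # ~0: the endpoint forgets the data
```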
 
Dynamics of minimum probability flow learning.


Minimum Probability Flow (MPF) for parameter estimation in intractable probabilistic models

  • MPF is a technique for parameter estimation in un-normalized probabilistic models. You can usefully think of it as similar to Contrastive Divergence (CD), except that it is a consistent estimator with an associated objective function, rather than only a heuristic for parameter updates. It proves to be an order of magnitude faster than competing techniques for the Ising model, and an effective tool (strictly never slower than CD) for learning parameters in any un-normalized distribution (a toy version of the objective appears below).
  • In collaboration with Peter Battaglino and Michael R. DeWeese.
  • See the ICML paper, the PRL paper, and the released code. If you are interested in using MPF in a continuous state space, you should use the method described in the Persistent MPF note.
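
As a concrete toy example (an illustrative sketch, not the released code), here is the MPF objective for an Ising model with single-spin-flip connectivity; the estimator minimizes this objective with respect to the couplings J:

```python
import numpy as np

def mpf_objective(J, X):
    """MPF objective for an Ising model with +/-1 spins, E(x) = -0.5 x^T J x.

    Each data state is connected to its single-spin-flip neighbors:
        K = mean over data x and spins i of exp((E(x) - E(x_flip_i)) / 2).
    K never requires the partition function, and for this model it is smooth
    and convex in J, so it can be handed to any gradient-based optimizer.
    """
    # E(x_flip_i) - E(x) = 2 * x_i * (J x)_i  for symmetric J, zero diagonal
    dE = 2.0 * X * (X @ J)                   # shape (n_samples, n_spins)
    return np.mean(np.exp(-dE / 2.0))

rng = np.random.default_rng(1)
n = 8
J = 0.3 * rng.standard_normal((n, n))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
X = rng.choice([-1.0, 1.0], size=(500, n))  # stand-in for observed spin data
print(mpf_objective(J, X))
```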
 

Quasi-Newton minibatch optimization without hyperparameters

  • We combine the benefits of stochastic gradient and quasi-Newton optimization methods by maintaining an online approximation to the Hessian of each minibatch. The method remains tractable even for high dimensional problems and large numbers of minibatches because the Hessians are stored, and all computation performed, in a shared low dimensional subspace determined by the recent history of gradients and iterates. This leads to faster optimization. More importantly, it saves your sanity from days spent tweaking hyperparameters (a usage sketch appears below the figure).
  • In collaboration with Ben Poole and Surya Ganguli.
  • See the ICML paper, and the released code.
Convergence trace for our technique (black) and competing optimizers.

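A hedged usage sketch of the released code (SFO, the Sum of Functions Optimizer): the interface below follows my reading of the project README and may differ from the current repository, and the least-squares problem is purely illustrative.

```python
import numpy as np
from sfo import SFO   # the released Sum of Functions Optimizer

# A toy least-squares problem split into minibatches ("subfunctions").
rng = np.random.default_rng(2)
A, b = rng.standard_normal((1000, 20)), rng.standard_normal(1000)
minibatches = [(A[i::10], b[i::10]) for i in range(10)]

def f_df(theta, batch):
    """Return the objective value and gradient for one minibatch."""
    Ab, bb = batch
    resid = Ab @ theta - bb
    return 0.5 * np.sum(resid ** 2), Ab.T @ resid

theta0 = np.zeros(20)
optimizer = SFO(f_df, theta0, minibatches)  # note: no learning rate to tune
theta = optimizer.optimize(num_passes=20)   # ~20 effective passes through data
```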

 

Characterizing statistical structure in neural spike patterns

  • By training Ising, RBM, and sRBM models on patterns of neural activity, we are able to answer questions about functional connectivity and how it changes under different interventions; about the importance of higher order correlations in the brain; and about the entropy of the neural spike code (a preprocessing sketch appears below the figure).
  • In collaboration with Liberty S. Hamilton, Urs Köster, Alexander G. Huth, Vanessa M. Carels, Karl Deisseroth, Shaowen Bao, Charles M. Gray, and Bruno A. Olshausen.
  • See the Neuron paper, the PLOS Computational Biology paper, and Liberty's Github repository.
Ising model couplings in the presence and absence of an optogenetic intervention.

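The common first step in these analyses is converting spike trains into binary population "words"; a hypothetical preprocessing sketch (all names and the bin size here are illustrative):

```python
import numpy as np

def spikes_to_words(spike_times, n_cells, t_stop, bin_s=0.01):
    """Bin each cell's spike times (in seconds) into binary activity words.

    Returns an array of shape (n_bins, n_cells) with entries in {0, 1}; each
    row is a population word, the data format fit by Ising and RBM models.
    """
    n_bins = int(np.ceil(t_stop / bin_s))
    words = np.zeros((n_bins, n_cells), dtype=np.uint8)
    for cell, times in enumerate(spike_times):
        idx = (np.asarray(times) / bin_s).astype(int)
        words[idx[idx < n_bins], cell] = 1
    return words

# toy example: 3 cells recorded for 1 second
spike_times = [[0.011, 0.52], [0.015, 0.50, 0.90], [0.30]]
words = spikes_to_words(spike_times, n_cells=3, t_stop=1.0)
print(words.sum(axis=0))   # spike count per cell
```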

 

A device for human echolocation

  • We have built a device which emits bat-like ultrasonic chirps, records the echoes in stereo using artificial pinnae, and then time-stretches them into the human auditory range. By performing only minimal signal processing, we present the human brain with a naturalistic, information-rich signal which it can rapidly learn to interpret. The hope is that this will be useful as an assistive device for visually impaired people (see the time-stretching sketch below).
  • In collaboration with Nicol Harper, Santani Teng, and Benjamin M. Gaub.
  • See the TBME paper and the project website. Also check out the first-person video of its use to the right.
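
A toy sketch of the core signal chain (illustrative values only; the device's actual chirp parameters and rates are in the paper): slowing playback by a constant factor divides every frequency by that factor, bringing ultrasonic echoes into the audible range while stretching the echo delays by the same factor.

```python
import numpy as np

fs = 192_000                                # assumed ultrasonic sampling rate
T = 0.005                                   # 5 ms chirp
t = np.arange(int(T * fs)) / fs
f0, f1 = 25_000.0, 50_000.0                 # sweep 25 kHz -> 50 kHz (inaudible)
chirp = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / T * t ** 2))

recording = np.zeros(int(0.05 * fs))
recording[:chirp.size] += chirp             # outgoing pulse
delay = int(0.010 * fs)                     # echo from a surface ~1.7 m away
recording[delay:delay + chirp.size] += 0.3 * chirp

stretch = 20
fs_play = fs // stretch
# Replaying the same samples at fs/20 maps 25-50 kHz down to 1.25-2.5 kHz,
# well inside human hearing, and stretches the 10 ms echo delay to 200 ms.
print(f"write the recording out at {fs_play} Hz for playback")
```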
 

Hamiltonian Monte Carlo (HMC) without detailed balance

  • HMC can be viewed in terms of operators acting on a discrete state space. Using this perspective, it is possible to derive HMC sampling algorithms which do not rely on detailed balance in order to sample from the correct distribution. Detailed balance causes sampling algorithms to explore the state space via a random walk, so they traverse a distance proportional only to the square root of the number of sampling steps. These methods therefore allow much more rapid mixing (a simplified sketch appears below the figure).
  • In collaboration with Mayur Mudigonda and Michael R. DeWeese.
  • See the ICML paper, a brief note outlining a closely related scheme, and the released code.
The operators and discrete state space of Hamiltonian Monte Carlo.

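A simplified sketch of one ingredient of this operator view (not the full look-ahead algorithm from the paper): the leapfrog operator L moves the augmented state (x, v) deterministically, and the momentum flip operator F is applied only on rejection, with a partial momentum refresh in place of full resampling. The paper's method goes further, replacing the rejection flip with additional applications of L.

```python
import numpy as np

def leapfrog(x, v, grad_logp, eps=0.1, n_steps=10):
    """The L operator: a deterministic, volume-preserving leapfrog trajectory."""
    v = v + 0.5 * eps * grad_logp(x)
    for _ in range(n_steps - 1):
        x = x + eps * v
        v = v + eps * grad_logp(x)
    x = x + eps * v
    v = v + 0.5 * eps * grad_logp(x)
    return x, v

def hmc_step(x, v, logp, grad_logp, rng, beta=0.2):
    """One persistent-momentum HMC step on the augmented state (x, v)."""
    log_H = lambda q, p: logp(q) - 0.5 * p @ p   # log joint, up to a constant
    x_new, v_new = leapfrog(x, v, grad_logp)
    if np.log(rng.random()) < log_H(x_new, v_new) - log_H(x, v):
        x, v = x_new, v_new      # accept: keep moving in the same direction
    else:
        v = -v                   # the F operator, applied only on rejection
    # partial momentum refresh (instead of full resampling) keeps the chain
    # ergodic while preserving most of the direction of motion
    v = np.sqrt(1 - beta) * v + np.sqrt(beta) * rng.standard_normal(v.shape)
    return x, v

# smoke test: sample a 2D standard Gaussian
rng = np.random.default_rng(3)
logp = lambda x: -0.5 * x @ x
grad_logp = lambda x: -x
x, v = np.ones(2), rng.standard_normal(2)
samples = []
for _ in range(2000):
    x, v = hmc_step(x, v, logp, grad_logp, rng)
    samples.append(x)
print(np.cov(np.array(samples).T))   # should approach the identity
```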

 

Hamiltonian Annealed Importance Sampling (HAIS)

  • HAIS is a method for combining Hamiltonian Monte Carlo (HMC) and Annealed Importance Sampling (AIS), so that a single Hamiltonian trajectory can stretch over many AIS intermediate distributions. This allows much more efficient estimation of importance weights, and thus of partition functions and log likelihoods, for intractable probabilistic models (a simplified sketch appears below the figure).
  • In collaboration with Jack Culpepper.
  • See the tech report, and the released code.
The convergence of estimated log likelihood vs. the number of intermediate AIS distributions.

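A simplified, self-checking sketch of the idea (not the released code): anneal from a tractable Gaussian to an unnormalized target while accumulating AIS importance weights, with a leapfrog trajectory whose momentum persists across the intermediate distributions. The target here is chosen so the true log partition function ratio is known.

```python
import numpy as np

rng = np.random.default_rng(4)
D, n_chains, n_dists, eps = 2, 200, 500, 0.2

sigma2 = 4.0   # target: unnormalized Gaussian with known log Z ratio
log_f0 = lambda x: -0.5 * np.sum(x ** 2, -1)            # tractable start
log_f1 = lambda x: -0.5 * np.sum(x ** 2, -1) / sigma2   # intractable stand-in
log_fb = lambda x, b: (1 - b) * log_f0(x) + b * log_f1(x)
grad = lambda x, b: -x * ((1 - b) + b / sigma2)

x = rng.standard_normal((n_chains, D))     # exact samples from p_0
v = rng.standard_normal((n_chains, D))
log_w = np.zeros(n_chains)
betas = np.linspace(0.0, 1.0, n_dists + 1)
for b0, b1 in zip(betas[:-1], betas[1:]):
    log_w += log_fb(x, b1) - log_fb(x, b0)         # AIS weight update
    # one leapfrog step targeting the new intermediate distribution
    v_half = v + 0.5 * eps * grad(x, b1)
    x_new = x + eps * v_half
    v_new = v_half + 0.5 * eps * grad(x_new, b1)
    dH = (log_fb(x_new, b1) - 0.5 * np.sum(v_new ** 2, -1)
          - log_fb(x, b1) + 0.5 * np.sum(v ** 2, -1))
    acc = np.log(rng.random(n_chains)) < dH
    x[acc], v[acc] = x_new[acc], v_new[acc]
    v[~acc] = -v[~acc]                             # flip momentum on rejection
    v = np.sqrt(0.9) * v + np.sqrt(0.1) * rng.standard_normal(v.shape)

print(np.log(np.mean(np.exp(log_w))))   # estimate of log(Z1 / Z0)
print(0.5 * D * np.log(sigma2))         # exact answer for this toy pair
```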

 

Lie group models for transformations in natural video

  • We develop techniques to make training and inference tractable for high dimensional Lie group models, use them to learn Lie group models of the inter-frame transformations which occur in natural video, and apply them to video compression (a toy example of a Lie group operator appears below the figure).
  • A collaboration with Jimmy Wang and Bruno Olshausen.
  • See the core algorithm paper (currently under revision), and also the DCC paper focusing on its use for video compression.
A learned Lie group operator which performs full field rotation.

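The basic object in these models is an infinitesimal generator A whose matrix exponential produces a continuous one-parameter family of transformations, x(s) = expm(sA) x(0). In the model, A is learned from pairs of video frames and acts on pixel intensities; the toy example below uses the known generator of 2D rotation instead.

```python
import numpy as np
from scipy.linalg import expm

# Generator of 2D rotation; in the full model A is a learned matrix acting
# on image pixels, and can represent rotation, translation, and much more.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

x0 = np.array([1.0, 0.0])
for s in [0.0, np.pi / 4, np.pi / 2]:   # continuous transformation amount
    print(s, expm(s * A) @ x0)          # the point transported along its orbit
```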

 

Bilinear generative models for natural images

  • We developed a generative (directed) bilinear model and applied it to natural images. The bilinear form is well matched to many aspects of natural images: for example, combining surface reflectance and illumination, or factoring apart the overall activity of a group of similar features from the specific feature location or phase within the group (a toy version of the generative form appears below the figure).
  • A collaboration with Jack Culpepper and Bruno Olshausen.
  • See the ICCV paper.
The graphical structure of the directed bilinear model.

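A toy sketch of the generic bilinear generative form (illustrative, not the exact model in the paper): each image depends multiplicatively on two latent factor vectors, so scaling one factor, e.g. illumination, rescales the image while the other, e.g. reflectance or form, is held fixed.

```python
import numpy as np

rng = np.random.default_rng(5)
n_pix, n_a, n_b = 64, 5, 4

# x = sum_{i,j} a_i * b_j * W[:, i, j] + noise: bilinear in the latents (a, b)
W = 0.1 * rng.standard_normal((n_pix, n_a, n_b))

def sample_image(a, b, noise_std=0.01):
    mean = np.einsum('pij,i,j->p', W, a, b)
    return mean + noise_std * rng.standard_normal(n_pix)

a = rng.standard_normal(n_a)            # e.g. illumination coefficients
b = rng.standard_normal(n_b)            # e.g. reflectance/form coefficients
x = sample_image(a, b)
# doubling one factor doubles the (noise-free) image: the bilinear property
print(np.allclose(np.einsum('pij,i,j->p', W, 2 * a, b),
                  2 * np.einsum('pij,i,j->p', W, a, b)))
```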

 

Mars Exploration Rover (MER) mission

  • Prior to graduate school, I worked on the multispectral camera (PANCAM) team for the MER mission. This involved analyzing images from the spacecraft, developing super-resolution code for the camera system, and building physical models of Martian dust in order to remove its effects from Martian spectra.
  • See the many papers (including two Science papers) on the Publications page.
A panorama looking back at the lander that carried the Spirit rover to the Martian surface. 


 

Analyzing noise in autoencoders and deep networks

  • We explore, both analytically and experimentally, the effects of injecting different types of noise at different locations in an autoencoder. We find that injecting noise generally improves performance, and that dropout is not the most effective type of noise (a forward-pass sketch appears below the figure).
  • In collaboration with Ben Poole and Surya Ganguli.
  • See the NIPS 2013 deep learning workshop paper.
The effect of input and hidden activation noise on learned filters for a denoising autoencoder trained on MNIST.

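A toy forward pass showing the two injection sites we compare (a sketch under assumed noise types; the paper evaluates several more): noise can corrupt the input, as in a denoising autoencoder, or the hidden activations, as in dropout.

```python
import numpy as np

def noisy(h, kind, level, rng):
    """Inject one of two noise types at a given site of the network."""
    if kind == "gaussian":
        return h + level * rng.standard_normal(h.shape)
    if kind == "dropout":
        return h * (rng.random(h.shape) > level) / (1.0 - level)
    return h

def autoencoder_forward(x, W1, W2, input_noise, hidden_noise, rng):
    """Forward pass with (kind, level) noise at the input and hidden layer."""
    x_tilde = noisy(x, *input_noise, rng)   # input corruption (denoising AE)
    h = np.tanh(x_tilde @ W1)
    h = noisy(h, *hidden_noise, rng)        # hidden corruption (e.g. dropout)
    return h @ W2                           # linear reconstruction

rng = np.random.default_rng(6)
d, k = 20, 10
W1, W2 = 0.1 * rng.standard_normal((d, k)), 0.1 * rng.standard_normal((k, d))
x = rng.standard_normal(d)
x_hat = autoencoder_forward(x, W1, W2,
                            input_noise=("gaussian", 0.3),
                            hidden_noise=("dropout", 0.5), rng=rng)
print(np.mean((x - x_hat) ** 2))   # reconstruction error for one sample
```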

 

Other projects with publications (see Publications page for details)

  • Optimal and robust storage of patterns in Hopfield networks
  • Statistical analysis of MRI and CT breast images
  • Connections between the natural gradient and signal whitening
  • A fast blocked Gibbs sampler for sparse overcomplete linear models
 

In progress

An ROC curve showing performance at predicting student correctness on new exercises.


Temporal Multi-dimensional Item Response Theory (TMIRT)

  • We are building a temporal probabilistic model of the knowledge state of students as they move through educational content on Khan Academy (a toy generative sketch appears below).
  • In collaboration with Eliana Feasley, Jace Kohlmeier, and Surya Ganguli.
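
A toy generative sketch of this class of model (all functional forms and constants here are illustrative assumptions, since the project is still in progress): a multidimensional ability vector drifts and grows over time, and the probability of a correct answer is a logistic function of ability.

```python
import numpy as np

rng = np.random.default_rng(7)
n_skills, n_steps = 3, 50
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

theta = np.zeros(n_skills)                    # latent knowledge state
for t in range(n_steps):
    a = 0.5 * rng.standard_normal(n_skills)   # exercise loading on each skill
    b = 0.5 * rng.standard_normal()           # exercise difficulty
    p_correct = sigmoid(a @ theta - b)        # multidimensional IRT response
    correct = rng.random() < p_correct        # simulated student answer
    theta += 0.05 * rng.standard_normal(n_skills)  # temporal drift...
    theta += 0.10 * np.abs(a) * correct            # ...plus learning gains
print(theta)
```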
 
A two-dimensional slice through function space for a neural network with two hidden layers.


Visualizing deep networks in function space

  • Although deep networks are remarkably effective at many machine learning problems, we have only a limited understanding of the properties of the functions they can compute. I have developed a technique for taking low dimensional slices through the Hilbert space of all possible functions, and I am using these slices to both visualize and quantify the ways in which multi-layer networks fill function space (a toy sketch appears below).
  • In collaboration with Surya Ganguli.
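
A toy sketch of the general idea, under the simplifying assumption that a network's function can be represented by its outputs on a fixed set of probe inputs: sweep a two-dimensional slice through parameter space and record the corresponding sheet of functions, which can then be visualized or measured.

```python
import numpy as np

rng = np.random.default_rng(8)

def net(params, x):
    """A tiny network with 2 hidden layers mapping scalars to scalars."""
    W1, W2, W3 = params
    return np.tanh(np.tanh(x @ W1) @ W2) @ W3

def as_function(params, probes):
    """Embed the network in function space as its outputs on probe inputs."""
    return net(params, probes).ravel()

shapes = [(1, 16), (16, 16), (16, 1)]
base = [0.5 * rng.standard_normal(s) for s in shapes]
d1 = [0.5 * rng.standard_normal(s) for s in shapes]   # two directions that
d2 = [0.5 * rng.standard_normal(s) for s in shapes]   # span the 2D slice

probes = np.linspace(-3, 3, 50)[:, None]
alphas = np.linspace(-1, 1, 21)
# Each grid point is one function; the resulting sheet, embedded in function
# space, can be plotted or measured to see how the family fills the space.
sheet = np.array([[as_function([w + a1 * u1 + a2 * u2
                                for w, u1, u2 in zip(base, d1, d2)], probes)
                   for a2 in alphas] for a1 in alphas])
print(sheet.shape)   # (21, 21, 50): a 2D slice rendered as functions
```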