about me


I am an Assistant Professor in the Department of Biomedical Engineering and the Institute for Computational Medicine at Johns Hopkins University. I sit at the Center for Imaging Science in Clark Hall at the Homewood campus. I also have an appointment in the Institute for Data Intensive Engineering and Sciences. My work focuses largely on big and wide data, especially in neuroscience, with an emphasis on the statistics of brain graphs (connectomes). I co-founded the Open Connectome Project with my brother R. Jacob Vogelstein and Randal Burns, Associate Professor in the Department of Computer Science at Johns Hopkins University. We run a very vertical group, with people working at all levels, ranging from data collection to analysis and interpretation. We are always looking for new collaborators and team members. Please inquire if you are interested.


This world is already an incredibly beautiful and lovely place to live (at least for me, for now). We are motivated by a desire to make it even better for all of us, as well as our descendants and our fellow inhabitants. We do so by searching for patterns. Specifically, we seek patterns in our physical worlds (e.g., our bodies, our brains) as well as our mental worlds (e.g., our perceptions, experiences, memories, thoughts, emotions, psychiatric conditions). More importantly, we seek to understand patterns in our mental worlds in terms of our physical worlds. Our hope and belief is that by developing a deeper understanding of the links between these worlds, we will be able to bring them into greater alignment. A primary motivating factor is that all humans and animals have brains, and therefore such ideas could directly benefit all of animalkind. Thus, all of our research products are freely available to all. For more details see my cv.


My research passion lies in the development of inference techniques for scientific discovery from large and complex datasets, typically relating measurements of brain properties (e.g., brain imaging) to mental properties (e.g., aptitude, cognition, perception, memory, etc.). Of primary consideration for our group is that these techniques are useful in solving important scientific questions and social problems. Our unique contribution follows from the juxtaposition of our collective domain knowledge (enabling important applied questions to be asked), computational aptitude (allowing us to build the tools needed to obtain answers from terabytes of data), and statistical insight (clarifying the extent to which to trust the answers). All projects are motivated by scientific questions, and result in open source code available to the greater scientific community, as well as applications of the methods to state-of-the-art neuroscientific datasets. For all projects, we primarily search for students who are excited and fun to work with. Secondarily, it would be helpful if you had some math/stat skills, some programming skills, and some interest in solving real problems. Below are a few example projects.

  • human connectomics: Even the simplest questions about human connectomes (brain-graphs) - such as what is the mean connectome, and are these two populations of connectomes different - remain unanswered. In part, this is due to a lack of theoretically justified tools, scalable code implementing them, and appropriate data to answer them. We have developed fundamental theory (e.g., Tang14a) and scalable code (FlashGraph) that will now enable us to rigorously answer these questions. We are looking for students who would like to extend and apply these methods to existing collected and pre-processed human connectomes (available here), to get some quantitative answers. A video describing some of the related data can be found here.
  • synaptic diversity: All brains are composed of neurons and the connections between them, called synapses. All brain function is therefore at least partially determined by these synapses, and yet we are woefully clueless with regard to their diversity and functional properties. We do know that each synapse is a complex composition of hundreds of proteins, all acting in concert (read O'Rourke12a for a nice review). We have the data, as well as the methodological tools, to explore this diversity. We are looking for students who would like to extend and apply these methods to the existing collected and pre-processed data sets (available here), to reveal previously unknown but fundamental principles of neural computation. Videos describing these datasets can be found here.
  • electron microscopy connectomics: Microcircuits of the brain enable computational capabilities far exceeding our greatest modern supercomputers, yet they fit in little skulls weighing less than a grapefruit. It is widely believed that cracking the neural circuitry code will reveal mysteries of the brain. To do so, however, we need petascale computer vision methods, such as three-dimensional scene parsing, as well as statistical tools to analyze these large and complex datasets. We are looking for students who would like to extend and apply these methods to the existing collected and pre-processed data sets (available here). Videos describing these datasets can be found here and here.
  • connectome coding: Understanding the relationship between brain properties (for example, connectivity) and mental properties (for example, memories) remains one of the greatest challenges of the 21st century. It is also an example of a class of cutting-edge machine learning problems, sometimes called multi-modal learning. We have already developed some tools (e.g., JOFC) and have pre-processed some datasets (e.g., Adelstein et al., 2001). We are looking for students to extend and apply these methods to uncover, for example, relationships between brain networks and personality types.
  • Your project here!

For more information about any of these projects, or to tell me about your awesome new project, email me, or swing by my office sometime!


Essentially every project I embark on includes writing some code, if only to buttress the theoretical results with numerical examples. Moreover, as an adherent to the philosophy of open science, the code I write is always open source by the time of submission, and often sooner. Thus, searching my cv is the most reliable method for discovering all of the code bases associated with any project I am a part of. You can also check the various github accounts that I regularly contribute to, including my own and that of the Open Connectome Project, which we started. We also develop FlashGraph these days. Nevertheless, here I will try to post the code from various projects that I think is potentially useful for other applications.

time series analysis

  1. Fast Nonnegative Deconvolution for Spike Train Inference from Population Calcium Imaging. manuscript, code, and repo.
  2. Spike inference from calcium imaging using sequential Monte Carlo methods. article, code.
  3. Real-Time Inference for a Gamma Process Model of Neural Spiking. manuscript, code.

graph analysis

  1. Graph Classification using Signal Subgraphs: Applications in Statistical Connectomics. arxiv, code, repo, article.
  2. Fast Approximate Quadratic Programming for Large (Brain) Graph Matching. arxiv, code.
  3. Seeded Graph Matching Via Joint Optimization of Fidelity and Commensurability. arxiv, code.
  4. Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching. nips, arxiv, code.
  5. DELTACON: Measuring Connectivity Differences in Large Networks. article, arxiv, code.
  6. MR Connectome Automated Pipeline. arxiv, code, abstract, pdf.



jovo [at symbol] jhu [dot] edu

Joshua T. Vogelstein
Center for Imaging Science
Clark Hall, Rm 317C
3400 N. Charles St
Baltimore, MD 21218