about me


I am an Assistant Professor in the Department of Biomedical Engineering and the Institute for Computational Medicine at Johns Hopkins University. I sit at the Center for Imaging Science in Clark Hall on the Homewood campus. I also have an appointment in the Institute for Data Intensive Engineering and Science. My work focuses largely on big and wide data, especially in neuroscience, with an emphasis on the statistics of brain graphs (connectomes). I co-founded the Open Connectome Project (OCP) with my brother R. Jacob Vogelstein and Randal Burns, Associate Professor in the Department of Computer Science at Johns Hopkins University. Recently it has spawned NeuroData, which is the mother of several complementary projects, including OCP, the Open Synaptome Project, and more. We run a very vertical group, with people working at all levels, ranging from data collection to analysis and interpretation. We are always looking for new collaborators and team members. Please inquire if you are interested.


We seek patterns in our physical worlds (e.g., our bodies, our brains) as well as our mental worlds (e.g., our perceptions, experiences, memories, thoughts, emotions, psychiatric conditions). More importantly, we seek to understand patterns in our mental worlds in terms of our physical worlds. Our hope and belief is that by developing a deeper understanding of the links between these worlds, we will be able to bring them into greater alignment. A primary motivating factor is that all humans and animals have brains, and therefore such ideas could directly benefit all of animalkind. Thus, all of our research products are freely available to all. For more details see my cv.


The fundamental driving force of science is the discovery of latent structure that converts myriad disparate data into understanding. In the 21st century, the growth of data acquisition has greatly outpaced the growth of data analysis, rendering current computational and statistical tools insufficient for extracting meaning from large datasets. Neuroscience is particularly susceptible to these big data challenges, as neuroexperimentalists have devised methods of collecting terabytes of data per hour. Without statistical and computational frameworks for analysis and organization of these data, the field will be unable to fully reap the benefits of this enormous potential. Our expertise in (i) computer science, (ii) data curation, (iii) statistical science, and (iv) neuroscience enables us to fill this gap. Below we elaborate on these four threads.

statistical science

Statistical sciences have developed a beautiful corpus of knowledge on inference in various settings over the last 100 years. However, in recent years, new experimental and data collection methods have yielded datasets that no longer fit within the assumptions upon which that body of literature rests. We therefore focus on building statistical methods for 21st century data, including:

computer science

Statistical methods require computational implementations. Moreover, the raw data of the 21st century is not typically in a form immediately amenable to statistical analysis; rather, various data wrangling/munging/pre-processing is typically beneficial. We therefore develop computational tools to enable principled statistical inference on such data. This includes:

  • petascale data management, including spatial databases for 3D+ (multispectral or time-series) image data with voxel-wise metadata, and grute databases for populations of massive graphs with rich attributes, as well as MATLAB and Python APIs, a content distribution system, LIMS, and hardware configuration;
  • interactive visualizations, including for multispectral annotated 3D+ big images, graphs, and vectors;
  • streaming image processing, including both 2D stitching artifact correction and 3D histogram normalization;
  • distributed computer vision, including graph inference from serial electron microscopy, calcium imaging, functional magnetic resonance imaging, and diffusion tensor imaging, and object detection in both MATLAB and Python;
  • scalable machine learning libraries (FlashX), including FlashGraph for billion-vertex graph analytics using various graph traversal algorithms, FlashMatrix for matrix applications such as SVD, PCA, k-means, and linear classification, and FlashR for usability.

big data curation

Answering any of the questions below, using any of the statistical methods and computational tools above, fundamentally requires data. Just as all of our code is open source, all of our data is open access. Moreover, we provide it in a form immediately amenable to analysis by everyone, including the data types below:

  • images, both anatomical (including electron microscopy, array tomography, expansion microscopy, X-ray microscopy, CLARITY, and histology) and physiological (such as calcium imaging, massively parallel electrophysiology, and multimodal MRI);
  • annotations including volumetric annotations, skeletonizations, and ROIs;
  • graphs with rich attributes ("grutes") derived from a wide variety of experimental techniques and scales.


The tools described above, both statistical and computational, are designed and built in the service of answering fundamental questions in neuroscience, both basic and clinical. Below are the projects we have worked on and continue to develop:

  • anatomy, including EM connectomics, synaptomics, and mesoscale neuro-architecture;
  • physiology, including analysis of correlations from o-phys data;
  • systems, including biomarker discovery from multimodal MRI data, and behavioromics;
  • connectome coding, which is homologous to neural coding: we learn the rules by which memories & experiences are stored in patterns of connectivity, rather than patterns of activity.

For more information about any of these projects, or to tell me about your awesome new project, email me, or swing by my office sometime!


Essentially every project I embark on includes writing some code, if only to buttress the theoretical results with numerical examples. Moreover, as an adherent of the philosophy of open science, I always make my code open source by the time of submission, and often sooner. Thus, my personal GitHub account contains many repos, including code from publications prior to launching the Open Connectome Project. We have subsequently migrated everything we are still using to our new organization, neurodata. Finally, we also created and continue to develop FlashX.



jovo [at symbol] jhu [dot] edu

Joshua T. Vogelstein
Center for Imaging Science
Clark Hall, Rm 317C
3400 N. Charles St
Baltimore, MD 21218


NeuroData is always hiring exceptional individuals at all levels! In fact, even if we are not an optimal choice for your future development as a scientist, we endeavor to assist/support your growth in various ways, including short/extended visits to work with us. Unfortunately, because our time and funding are limited, we cannot work with everyone we would love to; nonetheless, please don't hesitate to contact us in case we can provide some guidance :) We are currently particularly looking for people with their own ideas that they are super passionate about and that we might be able to contribute to, as well as people interested in any of the projects listed above.

Please email your cv (and transcript if appropriate) to jovo at jhu dot edu. See below for more details:

  • summer intern: these are really difficult for us to support for anybody not local to JHU. I recommend trying as much as you can to get research experience at your local university.
  • research associate: requires an undergraduate degree in a related field, e.g., biomedical engineering, neuroscience, or computer science.
  • current JHU undergrads: I advise many undergraduates for credit. please email me your interests and your transcript, so I can find a topic of mutual interest or suggest a different faculty member more suited for you.
  • current JHU masters students: please contact me as soon as you can, so we can get started right away!
  • current JHU PhD students: great! email me your interests and your transcript, and we'll find a project for you, or we'll find a faculty member better suited for you.
  • potential JHU graduate students: we are always accepting new students. the best plan is to apply to the BME PhD program; applications are typically due in late november/early december. I can also advise students in other departments, including biostats, neuroscience, and computer science, so I recommend applying widely to increase the chances that we can work together.
  • postdocs: we are always hiring postdocs! please email me your particular interest in our group and attach your cv.

This work is graciously supported by the Defense Advanced Research Projects Agency (DARPA) SIMPLEX program through SPAWAR contract N66001-15-C-4041, DARPA GRAPHS N66001-14-1-4028, and NIH-TRA 1R01NS092474, "Synaptomes of Mouse and Man".