Current projects & collaborations...

GO: Gene Ontology

The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases. The GO collaborators are developing three structured, controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner. There are three separate aspects to this effort: first, we write and maintain the ontologies themselves; second, we make cross-links between the ontologies and the genes and gene products in the collaborating databases; and third, we develop tools that facilitate the creation, maintenance and use of ontologies.

The use of GO terms by several collaborating databases facilitates uniform queries across them. The controlled vocabularies are structured so that you can query them at different levels: for example, you can use GO to find all the gene products in the mouse genome that are involved in signal transduction, or you can zoom in on all the receptor tyrosine kinases. This structure also allows annotators to assign properties to gene products at different levels, depending on how much is known about a gene product.
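The multi-level querying described above can be sketched in a few lines of Python. This is only an illustration, not how AmiGO or the GO database implement it: the term identifiers (TOY:1 and so on), the term names, the gene product names, and the helper functions are all hypothetical. The idea is that a query for a broad term also collects gene products annotated to any of its more specific descendants.

```python
from collections import defaultdict

# Hypothetical is_a edges, child -> parents. These are toy identifiers,
# not real GO accessions.
IS_A = {
    "TOY:2": ["TOY:1"],  # receptor tyrosine kinase signaling is_a signal transduction
    "TOY:3": ["TOY:2"],  # a more specific kind of receptor tyrosine kinase signaling
    "TOY:4": ["TOY:1"],  # some other kind of signal transduction
}

# Hypothetical annotations: gene product -> terms it is annotated to.
ANNOTATIONS = {
    "geneA": ["TOY:3"],
    "geneB": ["TOY:4"],
    "geneC": ["TOY:2"],
}


def descendants(term):
    """Return the term itself plus every term reachable below it via is_a."""
    children = defaultdict(set)
    for child, parents in IS_A.items():
        for parent in parents:
            children[parent].add(child)
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def gene_products_for(term):
    """Gene products annotated to the term or to any of its descendants."""
    terms = descendants(term)
    return sorted(
        gene
        for gene, annotated in ANNOTATIONS.items()
        if any(t in terms for t in annotated)
    )
```

Querying the broad toy term (TOY:1) returns all three gene products, while querying the narrower TOY:2 returns only those annotated at or below that level. The same structure is what lets an annotator pick a general term when little is known about a gene product and a specific one when more is known.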

BBOP has been a member of the GO consortium since its inception in 1999; Suzanna Lewis was a founding member (with Michael Ashburner, Judy Blake, and Mike Cherry). BBOP currently contributes to the GO consortium in several ways: project vision, ontology development, and the development of software tools to view and edit GO ontologies (OBO-Edit) and their annotations (AmiGO).

Software & Resources:

  • AmiGO: A search tool for Gene Ontology terms and their annotations
  • GOOSE: GO Online SQL Environment
  • go-perl: the go-perl page on CPAN
  • OBO-Edit: The OBO Ontology Format Editing Tool

  • GO database: A relational database housing the Gene Ontology and the annotations of genes and gene products to terms in the GO.
  • Sequence Ontology (SO): A controlled vocabulary of terms to describe sequence annotations. The SO is used to catalog the features and properties of biological sequence, and how these features relate to each other. The SO is also used as a standard in GFF3 format.

OBOF: The Open Biomedical Ontologies Foundry

The Open Biomedical Ontologies (OBO) Foundry is a collaborative experiment to produce well-structured vocabularies for shared use across different biological and medical domains. The OBO Foundry introduces a new paradigm for biomedical ontology development by establishing gold standard reference ontologies for individual domains of inquiry.

Back in 2002 Michael Ashburner and Suzanna Lewis initiated this grass-roots effort. The goal of the OBO Foundry is to work with the community to create a suite of interoperable reference ontologies in the biomedical domain. We organize small content-development workshops to assist groups who need an ontology to support their research, in domains such as anatomy, cell types, and environments. From August 2005 to August 2008 these workshops were supported through our 3-year collaboration with the National Center for Biomedical Ontology (NCBO).

BBOP currently hosts the OBO Foundry website and associated tools, such as nightly production of alternate ontology formats, as a free service.

We are also involved in a number of ontology development efforts, such as:

  • CARO: Common Anatomy Reference Ontology
  • EnvO: Environmental Conditions Ontology
  • GO: The Gene Ontology (see above)
  • OBI: Ontology for Biomedical Investigations
  • PATO: OBO Phenotype Quality Ontology
  • OBO_REL: OBO Relations Ontology

modENCODE: The model organism Encyclopedia of DNA Elements

We are also working on the modENCODE project. We compile, track and collate the data being generated by different modENCODE research projects and are implementing the quality assurance methods to ensure that data meet the relevant data standards. Here again we are using ontologies to help manage the data, using them to describe the biological samples (e.g., anatomical part, developmental stage), the laboratory techniques that are used, and the genomic features that are described. We are also developing a web-based submission and tracking system for the project.
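As a rough illustration of how controlled vocabularies can support this kind of quality assurance, the sketch below checks a submission's metadata fields against a set of allowed terms. It is not the modENCODE pipeline: the field names, the term lists, and the `validate_submission` function are all hypothetical, and a real system would draw the allowed terms from the relevant ontologies rather than from hard-coded sets.

```python
# Hypothetical controlled vocabularies, one per metadata field.
# A real system would load these from ontologies instead.
ALLOWED_TERMS = {
    "anatomical_part": {"embryo", "wing disc", "whole organism"},
    "developmental_stage": {"embryonic", "larval", "adult"},
    "technique": {"ChIP-chip", "ChIP-Seq", "RACE"},
}


def validate_submission(record):
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    for field, allowed in ALLOWED_TERMS.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif value not in allowed:
            problems.append(f"{field}: '{value}' is not a recognized term")
    return problems
```

A record whose fields all use recognized terms yields an empty list, while a record with a missing or free-text value yields one message per problem, which a submission and tracking system could report back to the data provider.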

Software & Resources:

  • BIR-TAB: an extension to the MAGE-TAB metadata specification that allows for reporting of data other than microarrays. We are using this format for the submission of RACE, ChIP-chip, ChIP-Seq, sequencing, gene models, and various other data types to the modENCODE Data Coordination Center. This is highly "typed" in that it relies heavily on ontologies for describing experiments. It is intended to be a "superset" of MAGE-TAB.

GMOD: The Generic Model Organism Database Toolkit

GMOD is the Generic Model Organism Database Toolkit, a collection of software tools for creating and managing genome-scale biological databases. You can use it to create a small laboratory database of genome annotations, or a large web-accessible community database. GMOD tools are in use at FlyBase, WormBase, SGD, BeeBase and many other large and small community databases.

Software & Resources:

  • Apollo: A Java-based genome annotation browser and editor
  • Chado: A relational schema that underlies many GMOD installations
  • Obol: A system for deriving logical definitions for OBO ontologies
  • Blipkit: Biomedical Logic Programming Knowledge Integration Kit (aka BioProlog)

PANTHER: Protein ANalysis THrough Evolutionary Relationships

One natural outgrowth of the GO project was the need to move from annotating the function of individual gene products to annotating protein families. For the PANTHER project we are working with SRI to develop a user interface that will allow experts to use experimental evidence, in the context of evolutionary relationships, to describe the GO molecular functions of these families.


BIRN: Biomedical Informatics Research Network

We are collaborating with the BIRN project at UCSD to develop tools that use ontologies to query across distributed relational databases of images, taken at multiple scales, that are used for the study of neurological disease. Our two contributions are Phenote, a user interface that facilitates the expert annotation of biological data (in this case images of mouse brains) using ontologies, and the Open Biomedical Database, which is a database schema and a software package designed to store, access, and present annotations created with ontologies.


Former projects ...

NCBO: The National Center for Biomedical Ontology

This collaboration, which ended in August 2008, involved several groups based at Stanford University and Lawrence Berkeley National Laboratory. BBOP represented the Bioinformatics Core of the NCBO and contributed the analysis of data being annotated with the OBO ontologies. BBOP developed two of the main software tools in the NCBO project.

Software & Resources:

  • Phenote: A phenotype annotation editor using ontologies
  • OBD: The database for OBO. It uses a generic metamodel combined with OBO ontologies and additional utility ontologies to represent a wide range of data across the biomedical spectrum.