The overall goal of the NIH Common Fund's Bridge to Artificial Intelligence (Bridge2AI) program is to generate flagship datasets and best practices for the collection and preparation of data to address biomedical research grand challenges that require artificial intelligence and machine-learning (AI/ML) analysis.

Our Role in Bridge2AI

Our team at Lawrence Berkeley National Laboratory (LBNL) participates in the Bridge2AI consortium's Bridge Center as a key contributor to the Standards, Practices, and Quality Assessment working group. The effort is a collaboration among researchers at multiple institutions, including the University of Colorado Anschutz Medical Campus, University of Virginia, Nationwide Children's Hospital, Johns Hopkins University, Sage Bionetworks, and Boston Children's Hospital.

The Standards working group's mission is to develop software and standards that unify data attributes across multiple Bridge2AI Data Generation Projects and a wide range of data types. We engage with the community to extend existing standards and design new ones in the areas of data modeling, ontologies, data exchange formats, data quality evaluation, data mapping and harmonization, knowledge graphs, standards-driven tools (including tools that facilitate graph-based machine learning), and provenance and attribution. Whenever possible, we encode standards using our LinkML modeling framework, which provides easy-to-use structures for data scientists and data engineers while retaining semantic linkages.

Major Projects

Standards Explorer

The Standards Explorer is a web application that enables users to explore schemas and data standards relevant to the preparation and application of AI-ready data in biomedical research. The Explorer provides an intuitive interface for browsing, searching, and understanding the various standards being developed and adopted across the consortium and in the broader research community.

Key Resources:

Datasheets for Datasets (D4D) Schema

The Datasheets for Datasets schema provides a structured approach to documenting datasets, promoting transparency and accountability in dataset creation and use. This work adapts and extends the concept of datasheets to better serve the biomedical AI/ML community.
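As a rough illustration of what structured dataset documentation can look like in practice, the sketch below models a datasheet as a small Python record and reports which sections are still empty. The class name `Datasheet`, its field names, and the helper `missing_sections` are hypothetical and chosen only for this example; they are not taken from the D4D schema, which is defined in the repository linked below.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Datasheet:
    """Hypothetical, simplified datasheet record (field names are illustrative only)."""
    title: str
    motivation: str = ""
    composition: str = ""
    collection_process: str = ""
    intended_uses: List[str] = field(default_factory=list)
    known_limitations: List[str] = field(default_factory=list)


def missing_sections(sheet: Datasheet) -> List[str]:
    """Return the names of fields that are still empty (blank string or empty list)."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name)]


# Example entry with made-up values, used only to exercise the check.
sheet = Datasheet(
    title="Example cohort dataset",
    motivation="Illustrative entry only.",
    intended_uses=["method benchmarking"],
)
print(missing_sections(sheet))
```

A completeness check of this kind is one simple way a structured datasheet can support transparency: gaps in the documentation become visible before a dataset is shared.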

GitHub: https://github.com/bridge2ai/data-sheets-schema

Model Cards Schema

The Model Cards schema establishes a standard framework for documenting AI/ML models, including their intended use, performance characteristics, limitations, and ethical considerations. This enables better model understanding and responsible deployment.

GitHub: https://github.com/bridge2ai/model-card-schema

Publications

Our work has resulted in several key manuscripts that describe our approaches and findings:
