NSF AI grant $9M: "FabAID," a data fabric for AI-driven science (Morgridge Institute)
The NSF awarded about $9M to "FabAID (Fabric for AI-Driven Science)," a national-scale data fabric that supports the lifecycle of scientific datasets. It weaves together data repositories, producers/curators, and processing capacity to give researchers the data-centric workflows that underpin AI-driven research, integrating with NSF cyberinfrastructure and NAIRR.
Grant overview (primary data)
- Award amount: $9,000,000 / Est. total: $24,500,000
- Recipient: Morgridge Institute for Research, Inc. (WI)
- Program: Data Cyberinfrastructure
- Period: 2026-05-15 to 2031-04-30
- Funder: U.S. National Science Foundation (NSF)
Key points
- A national-scale data fabric (FabAID) supporting the lifecycle of scientific datasets
- Provides the data-centric workflows that underpin AI-driven research
- Operates the Open Science Data Fabric (OSDF), built on the open-source Pelican Platform and HTCondor software suite
- Integrates with NSF cyberinfrastructure and NAIRR
- About $9M, led by the Morgridge Institute, 2026–2031
The NSF awarded about $9,000,000 to FabAID (Fabric for AI-Driven Science), led by the Morgridge Institute for Research (NSF Award 2609485; program: Data Cyberinfrastructure; May 2026 – April 2031).
Per the abstract, FabAID weaves services and technologies to support the lifecycle of scientific datasets at national scale, so researchers can manage and leverage the data-centric workflows that are a foundation of AI-driven research. By integrating scientific data repositories, dataset producers and curators, and processing capacity, the fabric presents the nation's educators and researchers with a coherent set of powerful, dependable capabilities — powered by the integrated NSF cyberinfrastructure ecosystem and initiatives such as the National AI Research Resource (NAIRR).
Technically, FabAID operates the Open Science Data Fabric (OSDF) to deliver data to and from compute, expand dataset discovery, support active metadata extraction, and integrate archive services. Its data-centric workload services integrate with distributed computing infrastructures so researchers can transform or simulate billions of data points and curate datasets for AI-driven workloads. The services are built on the Pelican Platform and the HTCondor software suite, distributed and supported under open-source licenses so they can be used across autonomous entities. Facilitation activities provide training and collaboration so user communities can effectively use the national integrated data services.
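To make the "deliver data to compute" idea concrete, here is a minimal sketch (not taken from the award abstract) that renders an HTCondor submit description whose input files are `osdf://` URLs. It assumes an access point where HTCondor is configured to fetch `osdf://` URLs during file transfer. The script name `curate.sh`, the namespace path `/example-namespace/...`, the resource requests, and the helper `render_submit_description` are all hypothetical.

```python
def render_submit_description(executable, osdf_inputs, n_jobs=1):
    """Render a minimal HTCondor submit description as text.

    Inputs are expected to be osdf:/// URLs, which HTCondor can list in
    transfer_input_files when the access point supports that URL scheme
    (an assumption of this sketch).
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    for url in osdf_inputs:
        if not url.startswith("osdf:///"):
            raise ValueError(f"expected an osdf:/// URL, got {url!r}")

    lines = [
        f"executable = {executable}",
        # Comma-separated list of inputs fetched before the job starts.
        f"transfer_input_files = {', '.join(osdf_inputs)}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # $(Cluster) and $(Process) are expanded by HTCondor per job.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        # Hypothetical resource requests; real values depend on the workload.
        "request_cpus = 1",
        "request_memory = 2GB",
        "request_disk = 4GB",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset path; real namespaces depend on the data origin.
    print(render_submit_description(
        "curate.sh",
        ["osdf:///example-namespace/raw/part-0001.parquet"],
        n_jobs=3,
    ))
```

The rendered text could be saved to a file and passed to `condor_submit`. Which namespaces are readable, and whether authentication is required, depends on the data origin and the site.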
Why it matters
FabAID is a case study in building the "data supply chain" for AI research at national scale. For those working in AI for Science, data infrastructure, or ML data, it is a useful look at how the U.S. is building open-source, AI-ready data infrastructure.
Sources (primary)
Source: NSF Award Search (U.S. National Science Foundation, public domain). Amounts shown are obligated amounts. For privacy, principal investigator names are omitted.
- NSF Award (original, official)
- NSF Award ID: 2609485