IBM and NASA’s Marshall Space Flight Center announced a collaboration to use IBM’s artificial intelligence (AI) technology to discover new insights in NASA’s massive trove of Earth and geospatial science data. The joint work will apply AI foundation model technology to NASA’s Earth-observing satellite data for the first time.
AI foundation models are versatile, multi-purpose models trained on vast amounts of unstructured data. These models have greatly advanced natural language processing (NLP) in recent years, and IBM is at the forefront of applying them to tasks beyond language.
Earth observation data are being gathered at unprecedented rates and volumes, allowing scientists to study and monitor our world. Extracting knowledge from these enormous data resources requires new approaches. The objective of this effort is to make it simpler for researchers to examine these huge datasets and draw insights from them. By making it faster to find and analyze these data, IBM’s foundation model technology could help advance scientific understanding of Earth and its response to climate-related issues.
IBM and NASA plan to develop several new technologies to extract insights from Earth observations. One project will train an IBM geospatial intelligence foundation model on NASA’s Harmonized Landsat Sentinel-2 (HLS) dataset, a record of land cover and land use changes captured by Earth-orbiting satellites. By analyzing petabytes of satellite data to identify changes in the geographic footprint of phenomena such as natural disasters, cyclical crop yields, and wildlife habitats, this foundation model technology will help researchers provide critical analysis of our planet’s environmental systems.
A key anticipated outcome of this collaboration is a searchable database of Earth science literature. To simplify the discovery of new knowledge in this field, IBM has created an NLP model trained on nearly 300,000 Earth science journal articles. This model represents one of the biggest AI workloads trained on Red Hat’s OpenShift software, and it uses PrimeQA, IBM’s open-source multilingual question-answering system. The new Earth science language model will not only be a valuable resource for researchers but could also be integrated into NASA’s scientific data management processes.
Rahul Ramachandran, senior research scientist at NASA’s Marshall Space Flight Center in Huntsville, Alabama, said, “The beauty of foundation models is they can potentially be used for many downstream applications.”
“Building these foundation models cannot be tackled by small teams,” Ramachandran added. “You need teams across different organizations to bring their different perspectives, resources, and skill sets.”
Raghu Ganti, principal researcher at IBM, said, “Foundation models have proven successful in natural language processing, and it’s time to expand that to new domains and modalities important for business and society.”
Ganti added, “Applying foundation models to geospatial, event-sequence, time-series, and other non-language factors within Earth science data could make enormously valuable insights and information suddenly available to a much wider group of researchers, businesses, and citizens. Ultimately, it could facilitate a larger number of people working on some of our most pressing climate issues.”
Other potential joint projects between IBM and NASA under this agreement include developing a foundation model for weather and climate forecasting using the MERRA-2 dataset of atmospheric observations. This partnership is part of NASA’s Open-Source Science Initiative, which is dedicated to building a diverse and collaborative open science community over the next ten years.