Collaboration with Amazon Web Services enhances data access
NOAA generates thousands of datasets as part of its mission to collect information on environments that span from the surface of the sun to the depths of the ocean floor. NOAA makes more than 68,000 datasets publicly available, and the National Centers for Environmental Information (NCEI) provides access to more than half of them. With such vast holdings, NOAA faces the challenge of providing access to the data in an efficient and scalable manner.
NOAA’s Big Data Project (BDP) helps address this challenge by collaborating with Amazon Web Services (AWS), Google Cloud Platform, IBM, Microsoft, and the Open Commons Consortium to host NOAA’s public data on their cloud platforms. By providing access to the data in this manner, the BDP aims to enhance data discoverability and accessibility, improve efficiency for NOAA and users, and spur innovation and economic growth. AWS helps the BDP achieve these goals by hosting the data on its Amazon Simple Storage Service (Amazon S3) cloud platform as part of its Public Dataset Program.
NOAA’s Big Data Project
The BDP began in 2015 as a four-year business experiment to understand whether the inherent value in NOAA’s data could be more fully realized by making the data available in the cloud. Some of NOAA’s datasets are extremely large and cumbersome to move with traditional transfer methods, which require substantial bandwidth and time. In some cases, acquiring the data also carries costs, such as when a satellite receiver is required for satellite data.
Benefits of Amazon Web Services Collaboration
The bandwidth, time, and potential cost required to access the data can be a barrier for some users. Providing NOAA data in the cloud for free significantly reduces the time needed to access and process the data, opening up possibilities for emerging areas of study and for economic growth through new business generation.
AWS currently hosts a variety of NOAA datasets, including both atmospheric and oceanic data. Available datasets can be discovered through the Registry of Open Data on AWS.
One of the first datasets to move to AWS through the BDP was the historical archive of Level-II Next Generation Weather Radar (NEXRAD) data, available through NCEI. The NEXRAD Level-II archive was among the top initial datasets of interest because it already had an active user community and has qualities that are of value to research, commercial, and federal users.
The NEXRAD Level-II archive consists of weather radar data collected by detecting precipitation and atmospheric movement. The radars detect the basic size, shape, and motion of raindrops, hail, or debris in the atmosphere. These data are then used to estimate precipitation and wind, allowing for estimates and early detection of storm motion, tornadoes, hail, ice, snow, and flash floods. NEXRAD data are used for a variety of applications, including weather forecasts and warnings, air traffic control, water and land management, and wildlife detection.
Moving the data to the cloud was beneficial for NCEI and AWS, as this was the first time that the full NEXRAD Level-II archive was available on demand to users. In July of 2016, eight months after the data became available on AWS, access to NEXRAD Level-II data was 2.3 times higher than historical monthly rates for the same time period from NCEI. The rise suggests that providing access to the data in this manner can increase data discoverability and accessibility, maximizing ease of use and reducing the time needed to access and process the data. New research and work have resulted from NEXRAD Level-II being available in the cloud, and examples can be found on the AWS NEXRAD Level-II page on the Registry of Open Data.
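As a rough illustration of what on-demand access to the archive can look like, the Python sketch below builds the Amazon S3 key prefix for one radar site on one day, along with a URL that would list the matching files without AWS credentials. The bucket name (noaa-nexrad-level2), the YYYY/MM/DD/SITE/ key layout, and the helper names are assumptions for illustration and are not given in this article; the Registry of Open Data entry for NEXRAD is the place to confirm current details.

```python
from datetime import date
from urllib.parse import urlencode

# Assumption: bucket name and key layout are not stated in the article.
# Check the Registry of Open Data entry for NEXRAD before relying on them.
BUCKET = "noaa-nexrad-level2"


def nexrad_prefix(day: date, site: str) -> str:
    """Return the assumed key prefix for one radar site on one day.

    Assumed layout: YYYY/MM/DD/SITE/ (for example 2016/07/01/KTLX/).
    """
    return f"{day:%Y/%m/%d}/{site.upper()}/"


def list_url(day: date, site: str, bucket: str = BUCKET) -> str:
    """Return an S3 ListObjectsV2 URL for the prefix.

    The URL can be fetched anonymously only if the bucket allows public
    listing; this function just builds the string and makes no request.
    """
    query = urlencode({"list-type": "2", "prefix": nexrad_prefix(day, site)})
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


if __name__ == "__main__":
    # KTLX is used here only as an example four-letter radar site ID.
    print(nexrad_prefix(date(2016, 7, 1), "KTLX"))
    print(list_url(date(2016, 7, 1), "KTLX"))
```

Fetching the printed URL with any HTTP client should return an XML listing of that day's files for the site, provided the assumptions above hold.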
In addition to NEXRAD Level-II, AWS also hosts datasets from sources such as the Geostationary Operational Environmental Satellite (GOES) 16 from NOAA National Environmental Satellite, Data, and Information Service (NESDIS), the Operational Forecast System from NOAA National Ocean Service (NOS), and the National Water Model (NWM) from NOAA National Weather Service (NWS), to name a few. These datasets highlight the variety of hosted data, which ranges from satellite weather data to ocean data to hydrologic data. When new datasets become available on the platform, NOAA subject matter experts and AWS host informational sessions to give users an opportunity to learn more about the data.
Currently, the BDP is in its last year of the experimental phase and is working with AWS to continue hosting additional datasets. The BDP is also gathering feedback from AWS, the other collaborators, and end users that will help develop a sustainable operational phase of the project.