Data Assimilation Algorithms for U.S. West and East Coast OFS
Project Lead: John Wilkin, Rutgers, The State University of New Jersey
Co-PIs: Hernan Arango (Rutgers), Andrew Moore (UC Santa Cruz), Christopher Edwards (UC Santa Cruz)
Federal partners: NOAA Coast Survey Development Laboratory (CSDL)
IOOS Regional Association partners: MARACOOS, CeNCOOS, SCCOOS, and NANOOS
Project Overview and Results
This project is directed at enabling the NOAA Center for Operational Oceanographic Products and Services (CO-OPS) and IOOS RAs to deliver more accurate and more highly resolved forecasts of water level, velocity, temperature and salinity to key stakeholders concerned with fisheries, ecosystem health, navigation, maritime safety, response to marine environmental hazards, and a sustainable blue economy.
NOAA’s West Coast Ocean Forecasting System (WCOFS) and numerous analysis and forecast systems within IOOS RAs are founded on the Regional Ocean Modeling System (ROMS) and its supporting 4-Dimensional Variational (4D-Var) data assimilation (DA) tools. This project will improve, in multiple respects, the performance of these DA systems for operational applications.
Specific objectives of this project are:
- Achieve faster ROMS DA execution time, thereby making higher-resolution analyses and forecasts practical
- Expand capabilities for better utilizing the information content of high-resolution observations used in existing operational systems
- Create an infrastructure to run coupled ROMS-biogeochemical models that exploit the enhanced ocean physics state estimates delivered by operational DA systems
- Develop IOOS Cloud Sandbox instances of DA systems, including WCOFS, to facilitate hands-on user training in running advanced DA systems and enable the user community to experiment with system performance, ecosystem forecasting, coupling, and observing system design
- Solicit stakeholder priorities and requirements for the next generation of analysis and near real-time forecasts
Anticipated direct project benefits are (a) the delivery of ROMS data assimilative forecasts at higher resolution than at present by enabling execution of the nonlinear model on a different grid than the 4D-Var iterations, (b) reduced computational footprint of existing operational systems through mixed-precision split executables, (c) greater use of the information content of observations by developing operators that average high-frequency observations, (d) a biogeochemical model coupled to WCOFS output and run operationally within IOOS RAs, (e) archived model configurations and tools in a cloud sandbox to stimulate experimentation and prototyping of system improvements, and (f) communication with stakeholders to obtain feedback on prioritized information needs.
Indirect project benefits of higher resolution and accuracy will cascade to existing downstream products that use operational analysis and forecast systems, including several statistical ecological products that take physical state estimates as inputs, for example for HAB prediction and fisheries management.
The modeling efforts here use the Regional Ocean Modeling System (ROMS; www.myroms.org), its 4D-Var assimilation system, and the coupled NEMURO biogeochemical and ecosystem model.
4D-Var is an iterative data assimilation algorithm for identifying the most likely ocean state given all available prior information and observations (Moore et al. 2011a). The components of ROMS 4D-Var are depicted schematically in the figure below.
The most likely state estimate is identified by minimizing a nonlinear cost function, which is challenging to do directly. One approach is to linearize the problem using the tangent linear (TL) and adjoint (AD) versions of ROMS and to minimize a sequence of linearized cost functions during the so-called inner loops of the algorithm. The ocean state estimate about which each inner loop is linearized is periodically updated in an outer loop, which is a run of the full nonlinear model with all its physics and forcing. At the conclusion, the final outer-loop integration yields the most likely ocean state.
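For orientation, the linearized problem minimized in the inner loops is often written in the incremental form sketched below, shown for the first outer loop. The notation is generic and is an assumption introduced here, not taken from this project: \delta\mathbf{x} is the increment to the background state \mathbf{x}_b, \mathbf{B} and \mathbf{R} are the background and observation error covariances, \mathbf{y} is the observation vector, H is the nonlinear model sampled at the observation points, and \mathbf{G} is its tangent linearization. The ROMS formulation in Moore et al. uses its own notation and a larger control vector.

```latex
J(\delta\mathbf{x}) =
  \tfrac{1}{2}\,\delta\mathbf{x}^{\mathsf T}\mathbf{B}^{-1}\delta\mathbf{x}
  + \tfrac{1}{2}\,(\mathbf{G}\,\delta\mathbf{x}-\mathbf{d})^{\mathsf T}
    \mathbf{R}^{-1}(\mathbf{G}\,\delta\mathbf{x}-\mathbf{d}),
\qquad
\mathbf{d} = \mathbf{y} - H(\mathbf{x}_b)

\nabla J(\delta\mathbf{x}) =
  \mathbf{B}^{-1}\delta\mathbf{x}
  + \mathbf{G}^{\mathsf T}\mathbf{R}^{-1}(\mathbf{G}\,\delta\mathbf{x}-\mathbf{d})
```

In this form, each evaluation of the gradient needs one application of \mathbf{G} (a TL integration) and one of \mathbf{G}^{\mathsf T} (an AD integration), which is why the inner loops pair the two models.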
A 4D-Var inner loop comprises one integration each of TL and AD ROMS, and each requires ~50% more CPU time than a nonlinear (NL) ROMS calculation spanning the same time interval. A single inner loop is therefore equivalent to ~3 NL model runs (1.5 + 1.5). The current CeNCOOS and MARACOOS forecast systems require ~50 times the computational cost of running the NL model to compute a single 4D-Var analysis, which places significant constraints on feasible model resolutions. This project will develop multi-resolution and mixed-precision computation capabilities that will improve the efficiency of the ROMS 4D-Var system.
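The cost arithmetic above can be sketched in a few lines of Python. The function names and the loop counts in the example are hypothetical, chosen only to show how a total near 50 NL-equivalents can arise; they are not the configured values of any of the systems named here.

```python
# Relative costs, in units of one nonlinear (NL) run over the assimilation window.
NL_COST = 1.0
TL_COST = 1.5 * NL_COST  # tangent linear: ~50% more than NL
AD_COST = 1.5 * NL_COST  # adjoint: ~50% more than NL


def inner_loop_cost():
    """One inner loop is one TL integration plus one AD integration."""
    return TL_COST + AD_COST


def analysis_cost(n_outer, n_inner):
    """Cost of one analysis in NL-equivalents (illustrative accounting).

    Assumes each outer loop runs the NL model once and is followed by
    n_inner inner loops, with one final NL run to produce the analysis.
    """
    return n_outer * (NL_COST + n_inner * inner_loop_cost()) + NL_COST


# Hypothetical loop counts, for illustration only.
example_cost = analysis_cost(n_outer=2, n_inner=8)
print(inner_loop_cost(), example_cost)
```

Under this accounting the TL and AD integrations dominate the total, which is why the inner loops are the target of the efficiency work described next.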
Biogeochemical (BGC) ocean state estimates have been a component of the CeNCOOS near-real-time (NRT) ROMS system since 2015. Based on the NEMURO model (Kishi et al. 2007), the model estimates 11 ecosystem components (inorganic nitrogen and silicon nutrients, 2 phytoplankton classes, and 3 zooplankton). The BGC model is incorporated into UCSC ROMS 4D-Var, assimilating satellite surface chlorophyll data. Though not in the NRT product, recent model enhancements add oxygen and carbonate chemistry for the prediction of pH and aragonite saturation – key indicators of ocean acidification.
Multi-resolution ROMS 4D-Var
We propose a multi-resolution approach for ROMS 4D-Var, facilitated by a new feature in ROMS that allows the NL, TL and AD components of 4D-Var to run as separate executables at different grid resolutions. This “split” capability was introduced as part of the integration of ROMS into the Joint Effort for Data Assimilation Integration (JEDI).
Model-data misfits, the so-called “innovations” in DA parlance, are computed from a high-resolution NL model forecast with the most accurate possible physics, but the AD and TL iterations, which collectively comprise the bulk of the compute effort, run at a lower resolution. The low-resolution increments are then interpolated back to the fine-resolution model to execute the forecast.
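The final step, mapping a coarse-grid increment onto the fine grid, can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration: the grids, the made-up increment, the function name prolong_increment, and the use of linear interpolation via NumPy. ROMS works on curvilinear multi-dimensional grids, and the interpolation scheme used in the split system is not described in this summary.

```python
import numpy as np


def prolong_increment(x_coarse, dx_coarse, x_fine):
    """Linearly interpolate a coarse-grid increment onto fine-grid points."""
    return np.interp(x_fine, x_coarse, dx_coarse)


# Hypothetical 1-D grids along a 100 km section.
x_coarse = np.linspace(0.0, 100.0, 11)  # 10 km spacing (TL/AD grid)
x_fine = np.linspace(0.0, 100.0, 41)    # 2.5 km spacing (NL grid)

# Made-up increment, standing in for the output of the coarse inner loops.
dx_coarse = 0.5 * np.sin(2.0 * np.pi * x_coarse / 100.0)

# Made-up fine-grid background field, e.g. temperature in degrees C.
background_fine = 15.0 + 0.01 * x_fine

# The fine-resolution analysis is the background plus the interpolated increment.
analysis_fine = background_fine + prolong_increment(x_coarse, dx_coarse, x_fine)
```

The fine grid keeps its own small-scale structure from the NL forecast; only the correction is computed at coarse resolution.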
Mixed-precision 4D-Var in the multi-resolution system
Most earth system models are written in double precision, yet research in Numerical Weather Prediction (NWP) suggests that many floating-point operations can be performed at lower precision without an appreciable degradation in the final product. In applications where performance is limited by memory and/or I/O, such as 4D-Var, the savings for large ocean grids may be substantial.
In mixed-precision ROMS, the outer loops are computed at double precision for accuracy in the ocean dynamics, but the inner loops that inform the iterative search directions of the cost function minimization run at single precision. This will be tested in the MARACOOS and CeNCOOS systems and in a WCOFS re-run at UC Santa Cruz. In combination with the multi-resolution 4D-Var approach described above, the use of mixed precision could potentially yield an order-of-magnitude reduction in the computational effort required to perform a 4D-Var analysis.
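The division of labor between double-precision outer loops and single-precision inner loops has a familiar analogue in mixed-precision iterative refinement for a linear system. The sketch below uses that analogue on a small made-up symmetric positive-definite system; it is not ROMS code, and the matrix, the function names, and the loop counts are all assumptions for illustration.

```python
import numpy as np


def cg(A, b, n_iter):
    """Plain conjugate gradient, run in whatever dtype A and b carry."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        if rs == 0:
            break  # residual is exactly zero, nothing left to do
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def mixed_precision_solve(A, b, n_outer=3, n_inner=10):
    """Double-precision residuals (outer), single-precision corrections (inner)."""
    A32 = A.astype(np.float32)
    x = np.zeros_like(b)  # float64 state
    for _ in range(n_outer):
        residual = b - A @ x                                # outer: float64
        dx = cg(A32, residual.astype(np.float32), n_inner)  # inner: float32
        x = x + dx.astype(np.float64)
    return x


# Small made-up, well-conditioned SPD test problem.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + 20.0 * np.eye(20)
b = rng.standard_normal(20)
x = mixed_precision_solve(A, b)
```

Because each outer pass recomputes the residual in double precision, rounding errors made in the single-precision inner solves are corrected on the next pass rather than accumulated.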
Time- and space-averaged observations
Altimetry, gliders, and coastal HF radars provide high temporal resolution observations to IOOS forecast systems. Direct assimilation of these raw data streams, however, can degrade the performance of 4D-Var if the cost function is dominated by high-frequency signals that are not well resolved by the model. An alternative approach is to assimilate the time-average of the observations or, in the case of altimetry, observations that are pre-processed to best represent slowly evolving geophysical signals. The existing ROMS observation operators are not configured to accept time-averaged data, so we will develop a more general operator in the TL model forcing to allow the assimilation of such data.
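A time-averaging observation operator can be sketched for a single location as a vector of weights applied to the model time series, with the transpose of the same weights spreading the misfit back over the averaging window. This is a minimal illustration under assumed names and made-up data (hourly model output, a 12-hour oscillation standing in for a poorly resolved high-frequency signal); the operator to be developed in ROMS is not specified in this summary.

```python
import numpy as np


def time_average_weights(model_times, window_start, window_end):
    """Weights w such that w @ series is the mean over [window_start, window_end]."""
    mask = (model_times >= window_start) & (model_times <= window_end)
    if not mask.any():
        raise ValueError("averaging window contains no model time steps")
    return mask / mask.sum()


# Hypothetical hourly model output at one observation location (time in hours).
model_times = np.arange(0.0, 24.0)
# Made-up velocity: slow 0.2 m/s signal plus a 12-hour oscillation.
u_model = 0.2 + 0.1 * np.sin(2.0 * np.pi * model_times / 12.0)

# Forward operator: model equivalent of a 12-hour averaged observation.
weights = time_average_weights(model_times, 0.0, 11.0)
u_averaged = weights @ u_model

# Transpose of the operator: a scalar misfit becomes a forcing
# distributed evenly over the time steps inside the window.
misfit = 0.05  # made-up observation-minus-model value
adjoint_forcing = weights * misfit
```

Averaging over one full period removes the oscillation and leaves the slow signal, so the misfit that enters the cost function reflects only variability the model is expected to resolve.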
Ecosystem/BGC modeling with WCOFS
One impediment to the development of ecosystem forecast models is the time and expertise required to configure a skillful underlying physical model, especially with DA. The ROMS 4D-Var systems in the RAs and at CSDL each took years to develop. We will create an infrastructure to quickly and easily re-run WCOFS analyses and forecasts locally, and with this explore how to leverage efforts in DA model development in support of ecosystem modeling.
ROMS 4D-Var sandbox
Creating a cloud-based sandbox for experimentation with ROMS 4D-Var DA is central to our project transition activities. The IOOS Cloud Sandbox (ICS) (github.com/ioos/Cloud-Sandbox) presently enables re-running selected instances of NOAA operational forecast systems and RA systems.
In collaboration with RPS, we will add to the ICS archival instances (not NRT sustained data) of MARACOOS and WCOFS 4D-Var encompassing all necessary code and supporting input files. This will enable researchers to explore the sensitivity of WCOFS (and MARACOOS) 4D-Var to changes in the 4D-Var configuration, but more importantly to changes in the assimilated data. This might include the impact of new data streams that are not yet operational or, via Observing System Simulation Experiments (OSSE), the potential value of novel observing platforms or redesign of deployment strategies.
Kishi, M.J., Kashiwai, M., Ware, D.M., Megrey, B.A., Eslinger, D.L., Werner, F.E., Noguchi-Aita, M., Azumaya, T., Fujii, M., Hashimoto, S. and Huang, D., 2007. NEMURO—a lower trophic level model for the North Pacific marine ecosystem. Ecological Modelling, 202(1-2), pp.12-25.
Moore, A.M., Arango, H.G., Broquet, G., Edwards, C.A., Veneziani, M., Foley, B.P.D., Doyle, J., Costa, D. and Robinson, P., 2011a. The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems. Part I: System overview and formulation. Progress in Oceanography, 91, pp.34-49.