Station Papa oxygen - differences between ERDDAP and OOINet data

I am trying to understand the differences between the oxygen data on OOINet and ERDDAP for Station Papa moorings A and B. These data are not currently available through the Data Explorer.

Here is the OOINet link:
I was only able to access oxygen data for Mooring B. For mooring A, there is no Dissolved Oxygen instrument listed in the Data Catalog.
https://ooinet.oceanobservatories.org/data_access/?search=GP03FLMB#GP03FLMA-RIS01-03-DOSTAD000

Here is the ERDDAP link:
I can access both Mooring A and mooring B.
https://erddap.dataexplorer.oceanobservatories.org/erddap/search/index.html?page=1&itemsPerPage=1000&searchFor=oxygen+papa

For OOINet, the data for mooring B ends in early 2019. The ERDDAP data for both moorings ends in early 2020 and there are fewer data gaps.

What units are being used for oxygen in the OOINet and ERDDAP datasets (umol/kg or umol/L)? ERDDAP has the variable name ‘mole_concentration_of_dissolved_molecular_oxygen_in_sea_water’ and OOINet has ‘dissolved_oxygen’.

For most of the time-series, the OOINet oxygen is about 25% lower (70 umol/kg or umol/L lower) than the ERDDAP data. What is causing this large offset? Is it because in the OOINet oxygen data, the solubility is calculated assuming a constant salinity and in the ERDDAP oxygen data, the solubility is corrected to the measured salinity?

The O2 data from ERDDAP is showing strange behaviour for both moorings in late 2019 - values are oscillating between high and low oxygen values. For mooring A the difference between the high and low values is around 20 and for mooring B it is around 250. I was hoping to go back to the OOINet raw optode data and see if the issue is present there, but can’t access the data from late 2019 through OOINet.

I’ve attached a figure showing the whole time-series as well as the data from late 2019.

Thanks for your time and assistance.

@cara.manning,

Thank you for reaching out. I see DOSTA data for both GP03FLMA and GP03FLMB in OOINet. If you search for “GP03FLM DOSTA” in the Data Catalog, you should see both listed, and data appears to be available for both.

In OOINet there are three dissolved oxygen data products:

  • DO (umol L-1)
  • DO - Pressure Temp Sal Corrected (umol kg-1)
  • DO - Temp Corrected (umol L-1)
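
The two unit conventions in the list above (umol L-1 and umol kg-1) are related through seawater density. Here is a minimal Python sketch of that conversion, assuming a density value is available; the function name and the 1025 kg m-3 figure are illustrative and are not taken from OOI code:

```python
def umol_per_l_to_umol_per_kg(conc_umol_per_l, density_kg_per_m3):
    """Convert an oxygen concentration from umol/L to umol/kg.

    One litre of seawater has a mass of density / 1000 kg, so dividing
    the per-litre value by that mass gives the per-kilogram value.
    """
    if density_kg_per_m3 <= 0:
        raise ValueError("density must be positive")
    return conc_umol_per_l * 1000.0 / density_kg_per_m3


# Illustrative numbers only: 300 umol/L at an assumed density of 1025 kg/m3.
print(round(umol_per_l_to_umol_per_kg(300.0, 1025.0), 2))
```

Because seawater density is only a few percent above 1000 kg m-3, this conversion changes values by a few percent, so a unit mismatch alone would not explain a 25% offset.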

The DOSTA instruments in these locations do not have any bio-fouling mitigation, so bio-fouling may be affecting the data over time (that has been noted in the annotations). We will have to look in more detail at the offsets and data gaps you indicated.

Also, the Station Papa moorings were not turned last year, due to the COVID-19 pandemic. The Station Papa cruise is occurring right now, so recovered data from mid-2019 to mid-2021 will be available soon.

Thanks and stay safe,
Sheri N. White
OOI/CGSN Data Team

Hi Sheri, Thanks very much for your response and for providing the direct link to the Papa mooring A oxygen data stream.

With respect to the three dissolved oxygen products on OOINet, can you please tell me what .nc file variable names correspond to:

  • DO (umol L-1)
  • DO - Pressure Temp Sal Corrected (umol kg-1)
  • DO - Temp Corrected (umol L-1)

The recovered-inst .nc file contains the following variables:

  • dissolved_oxygen
  • dosta_abcdjm_cspp_tc_oxygen
  • estimated_oxygen_concentration

The dosta_abcdjm_cspp_tc_oxygen and estimated_oxygen_concentration have identical values. None of them match the ERDDAP oxygen data (variable name: mole_concentration_of_dissolved_molecular_oxygen_in_sea_water). See figure below.

Hi @cara.manning,
I can add a few details to what Sheri mentioned above.

For OOINet, the telemetered and recovered data are not merged; you download either the telemetered data (a decimated dataset, because of size) or the recovered data. In comparison, the Data Explorer ERDDAP stitches together all available sources of data to make as complete a record as possible. So the ERDDAP data has merged the telemetered and recovered data sources.
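
To illustrate what merging the two sources can look like, here is a rough Python sketch. It is not the Data Explorer's actual procedure: the function name, the rule that recovered data wins on overlapping timestamps, and all values are assumptions made for the example.

```python
def merge_streams(telemetered, recovered):
    """Merge two {timestamp: value} records into one time-sorted list.

    Where both streams report the same timestamp, the recovered value is
    kept (an assumed preference; the actual Data Explorer rule may differ).
    """
    merged = dict(telemetered)
    merged.update(recovered)  # recovered overwrites telemetered on overlap
    return sorted(merged.items())


# Made-up values for illustration; timestamps are ISO 8601 strings,
# which sort chronologically when they all share one format.
telemetered = {"2019-01-01T00:00:00Z": 290.1, "2019-01-01T06:00:00Z": 289.7}
recovered = {"2019-01-01T00:00:00Z": 290.3, "2019-01-01T00:15:00Z": 290.2}
for timestamp, value in merge_streams(telemetered, recovered):
    print(timestamp, value)
```

A merged record built this way covers every timestamp present in either source, which is one reason the ERDDAP series can extend further and show fewer gaps than a single OOINet download.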

If you download the netCDF files, the units for each variable are in the “units” attribute. For dissolved_oxygen, the units are umol/kg, and for mole_concentration_of_dissolved_molecular_oxygen_in_sea_water they are umol/L. If you want umol/kg from ERDDAP, use the variable moles_of_oxygen_per_unit_mass_in_sea_water. The ERDDAP variable names come from the CF conventions (and should mostly match the standard_name attribute of the corresponding variable in the OOINet datasets).

As for the weird behavior in 2019, I think you are looking at the umol/L concentrations which show this:

But when I plot the umol/kg concentrations (moles_of_oxygen_per_unit_mass_in_sea_water), I get the expected behavior:

As for the very low FLMB oxygen, that still needs some investigation.

Hi Andrew
Thanks a lot for the very helpful message.

Thanks for the info on how to find the units. For the ERDDAP file the units are available (e.g. umol.kg-1 for moles_of_oxygen_per_unit_mass_in_sea_water). For the OOINet files I have downloaded, the units are listed as UNSUPPORTED DATATYPE.

In Matlab/Octave:
Input:
fn = 'deployment0001_GP03FLMB-RIS01-03-DOSTAD000-recovered_host-dosta_abcdjm_sio_instrument_recovered_20130724T064501-20130811T234501.nc';
ncdisp(fn, 'dissolved_oxygen')

Output:
dissolved_oxygen
Size: 1797x1
Dimensions: obs
Datatype: double
Attributes:
_FillValue = -9999999
comment = 'Dissolved Oxygen Concentration from the Stable Response Dissolved Oxygen Instrument is a measure of the concentration of gaseous oxygen mixed in seawater. This data product is corrected for salinity, temperature, and depth.'
long_name = 'DO - Pressure Temp Sal Corrected'
precision = 4
coordinates = 'time lat lon pressure'
data_product_identifier = 'DOXYGEN_L2'
standard_name = 'moles_of_oxygen_per_unit_mass_in_sea_water'
units = UNSUPPORTED DATATYPE
ancillary_variables = 'estimated_oxygen_concentration practical_salinity pressure ctdmo_seawater_temperature'

I tried comparing the OOINet dissolved_oxygen (standard_name = ‘moles_of_oxygen_per_unit_mass_in_sea_water’) with the ERDDAP variable moles_of_oxygen_per_unit_mass_in_sea_water, and they are sometimes the same and sometimes offset. Attached is an example. At the start of the time-series they look the same, and then in 2018 there is an offset. Do you know why this is?

Finally, we (OOI BGC sensor working group) were wondering what causes the differences in coverage between the different oxygen data streams for instruments that have already been recovered?

e.g., in late 2013 to mid 2014 in the attached figure, there is data on ERDDAP moles_of_oxygen_per_unit_mass_in_sea_water and from the OOI dosta_abcdjm and estimated_oxygen_concentration streams but not the OOINet dissolved_oxygen stream

e.g., in Aug 2017 there is data from OOINet (all 3 oxygen streams) but not the ERDDAP moles_of_oxygen_per_unit_mass_in_sea_water

Thanks again!

Cara,

Although I can’t address the discrepancy between what ERDDAP and OOINet are providing, I can address the UNSUPPORTED DATATYPE issue. The units in the OOINet file use the Greek character “μ” in the units name. This is a non-ASCII character, and the NetCDF4 library that Matlab uses does not recognize it. However, more recent versions of Matlab also provide the newer HDF5 library and tools. I have encountered several issues working with the OOI-produced NetCDF files over the years, and have found that using the newer HDF5 tools addresses most of them. In the example you noted above, instead of using ncdisp, you can use h5disp:

>> fn = 'deployment0001_GP03FLMB-RIS01-03-DOSTAD000-recovered_host-dosta_abcdjm_sio_instrument_recovered_20130724T064501-20140618T134501.nc';
>> ncdisp(fn, 'dissolved_oxygen')
Source:
           /data/testing/deployment0001_GP03FLMB-RIS01-03-DOSTAD000-recovered_host-dosta_abcdjm_sio_instrument_recovered_20130724T064501-20140618T134501.nc
Format:
           netcdf4
Dimensions:
           obs = 31613 (UNLIMITED)
Variables:
    dissolved_oxygen
           Size:       31613x1
           Dimensions: obs
           Datatype:   double
           Attributes:
                       _FillValue              = -9999999
                       comment                 = 'Dissolved Oxygen Concentration from the Stable Response Dissolved Oxygen Instrument is a measure of the concentration of gaseous oxygen mixed in seawater. This data product is corrected for salinity, temperature, and depth.'
                       long_name               = 'DO - Pressure Temp Sal Corrected'
                       precision               = 4
                       coordinates             = 'time lat lon int_ctd_pressure'
                       data_product_identifier = 'DOXYGEN_L2'
                       standard_name           = 'moles_of_oxygen_per_unit_mass_in_sea_water'
                       units                   = UNSUPPORTED DATATYPE
                       ancillary_variables     = 'estimated_oxygen_concentration practical_salinity ctdmo_seawater_temperature'
>> h5disp(fn, '/dissolved_oxygen')
HDF5 deployment0001_GP03FLMB-RIS01-03-DOSTAD000-recovered_host-dosta_abcdjm_sio_instrument_recovered_20130724T064501-20140618T134501.nc 
Dataset 'dissolved_oxygen' 
    Size:  31613
    MaxSize:  Inf
    Datatype:   H5T_IEEE_F64LE (double)
    ChunkSize:  10000
    Filters:  shuffle, deflate(1)
    FillValue:  -9999999.000000
    Attributes:
        '_FillValue':  -9999999.000000 
        'comment':  'Dissolved Oxygen Concentration from the Stable Response Dissolved Oxygen Instrument is a measure of the concentration of gaseous oxygen mixed in seawater. This data product is corrected for salinity, temperature, and depth.'
        'long_name':  'DO - Pressure Temp Sal Corrected'
        'precision':  4 
        'coordinates':  'time lat lon int_ctd_pressure'
        'data_product_identifier':  'DOXYGEN_L2'
        'standard_name':  'moles_of_oxygen_per_unit_mass_in_sea_water'
        'units':  'µmol kg-1'
        'ancillary_variables':  'estimated_oxygen_concentration practical_salinity ctdmo_seawater_temperature'
        'DIMENSION_LIST':  H5T_VLEN
>> 

The HDF5 library supports the use of groups in the NetCDF files (currently OOI does not use groups), which are indicated through the use of a “directory” path. If there are no groups, the variables are at the root level. That is why the h5disp command above uses the string '/dissolved_oxygen'.
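
For anyone handling these units strings outside Matlab, the following Python sketch shows why the string is not plain ASCII and one way to produce an ASCII-only version. The thread does not settle whether the files store the micro sign (U+00B5) or the Greek letter mu (U+03BC), so the hypothetical helper below handles both; the sample string is copied from the h5disp output above and is assumed here to use the micro sign.

```python
import unicodedata

units = "\u00b5mol kg-1"  # micro sign (U+00B5) followed by "mol kg-1"


def is_ascii(text):
    """Return True if every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


def ascii_units(text):
    """Replace the micro sign and the Greek letter mu with a plain 'u'."""
    return text.replace("\u00b5", "u").replace("\u03bc", "u")


print(is_ascii(units))             # False
print(unicodedata.name(units[0]))  # MICRO SIGN
print(ascii_units(units))          # umol kg-1
```

An ASCII form such as "umol kg-1" also matches the spelling used elsewhere in this thread, which makes units easier to compare across the two data sources.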

With regard to the discrepancies between what is available from ERDDAP versus OOINet, I’m tagging Jeff Glatstein (@jglatstein) and Jim Case (@oceanzus) so they are aware of this conversation. We’ll need to get developers involved in understanding and addressing the differences. Note that there was a bug in how the dissolved oxygen concentrations were being calculated, which was corrected on September 3, 2020. The files used by the ERDDAP system look like they were generated on September 1, 2020, so they may need to be regenerated. If you pull data from OOINet now, the fix will apply, but those older files may be incorrect.
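
The date reasoning above can be applied to any downloaded file. This is a small Python sketch, assuming you have already found the file's generation date (for example from its metadata; where that date is stored is not specified in this thread). The function name is hypothetical.

```python
from datetime import date

# Date on which the calculation fix was made, as stated in the post above.
FIX_DATE = date(2020, 9, 3)


def predates_fix(file_created):
    """Return True if a file was generated before the date of the fix."""
    return file_created < FIX_DATE


# The ERDDAP files discussed above appear to date from September 1, 2020.
print(predates_fix(date(2020, 9, 1)))  # True
```

Files flagged this way are candidates for a fresh download rather than proof of bad data, since the thread does not say which deployments the bug affected.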

Hi Chris, thanks for the tip to use the h5disp function in Matlab, that is very helpful.

@cara.manning,

We were finally able to track down the issue with the low oxygen data for GP03FLMB Deployment 7. There was an error in one of the calibration coefficients that caused the calculated value to be too low by a factor of ten. This calibration error also affected the DOSTA at 130 m on GI01SUMO Deployment 4. Both of those datasets have now been corrected in the system, and correct data are available through both OOINet and OOI Data Explorer.
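
A scale error of this kind shows up as a near-constant ratio between the affected series and a trusted one. The Python sketch below is one way a user might screen for it, not the procedure the data team used; the function name and all values are made up for illustration, and the two series are assumed to be aligned in time already.

```python
from statistics import median


def median_ratio(suspect, reference):
    """Median of suspect/reference over paired samples.

    Pairs whose reference value is zero are skipped to avoid division
    by zero. Both inputs are assumed to be aligned in time already.
    """
    ratios = [s / r for s, r in zip(suspect, reference) if r != 0]
    if not ratios:
        raise ValueError("no usable pairs")
    return median(ratios)


# Made-up values: the suspect series is roughly one tenth of the reference.
reference = [290.0, 288.5, 291.2, 289.9]
suspect = [29.1, 28.8, 29.0, 29.05]
print(round(median_ratio(suspect, reference), 2))
```

A median ratio close to 0.1 or 10 points to a multiplicative error such as a misplaced decimal in a coefficient, whereas a roughly constant difference would suggest an additive offset instead.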

The Metadata Changes Search widget (https://oceanobservatories.org/metadatasearch/) shows if updates/corrections have been made to datasets, so that updated datasets can be downloaded.

Thanks,
Sheri N. White
OOI/CGSN Data Team