Hi, when I download different data streams from the Data Explorer, binned to 1-minute resolution, the netCDF files also contain a quality flag variable, which consists solely of NaN values for every data stream I have downloaded.
Is this because the bad-quality data has already been filtered out, or will I have to go to the full-resolution data and/or the annotations on the Data Explorer to figure out which data to potentially discard or avoid?
In order to answer your question accurately, I wanted to ask for a few more details. Are you downloading the netCDF (.nc) files from the Data Explorer ERDDAP server? If so, blanks in the qc_agg parameter associated with a given variable will appear as NaNs in the .nc file. Additionally, we do not perform QARTOD tests on every parameter, and we are currently in the process of rolling out QARTOD tests for the various instruments. You can find more information on the QARTOD tests and our implementation progress here: Quality Control - Ocean Observatories Initiative
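If it helps, here is a minimal sketch of how you might confirm that a qc_agg variable in your downloaded file is entirely NaN (i.e. no QARTOD results yet). The variable name `air_temperature_qc_agg` below is just an illustrative example, not a guaranteed name in your files:

```python
import numpy as np

def qc_not_yet_run(qc_values):
    """Return True when a qc_agg array is entirely NaN,
    meaning QARTOD tests have not produced flags for this stream."""
    qc = np.asarray(qc_values, dtype=float)
    return bool(np.isnan(qc).all())

# e.g. qc = dataset["air_temperature_qc_agg"] after opening the .nc file
qc_not_yet_run([np.nan, np.nan, np.nan])  # True: no flags available
qc_not_yet_run([1.0, np.nan, 3.0])        # False: some flags present
```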
As for “bad quality” data, OOI is committed to never removing data, and will deliver all available data as requested, including “bad” data. I would also caution you that parameters with a QARTOD flag of “3” (“suspect”) may still be of interest and of good quality, and should not be discarded without further assessment.
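To make that caution concrete, a sketch of a conservative filtering approach under the standard QARTOD aggregate flag convention (1 = pass, 2 = not evaluated, 3 = suspect, 4 = fail, 9 = missing) would mask only the “fail” points and keep “suspect” points for manual review:

```python
import numpy as np

def mask_failed_only(values, qc_agg):
    """Set only QARTOD 'fail' (flag 4) points to NaN.
    'Suspect' (flag 3) points are kept so they can be
    assessed rather than discarded automatically."""
    values = np.asarray(values, dtype=float)
    qc = np.asarray(qc_agg, dtype=float)
    out = values.copy()
    out[qc == 4] = np.nan
    return out

temps = [10.2, 10.4, 45.0, 10.3]
flags = [1, 3, 4, 1]
mask_failed_only(temps, flags)  # the 45.0 (flag 4) becomes NaN; flag-3 point survives
```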
Thanks for reaching out and please let us know if you have any more questions.
Thank you for your reply. I have downloaded the netCDF files from the Data Explorer ERDDAP. I have downloaded several different data streams (ERDDAP netCDF files): sea surface temperature, air temperature, northward and eastward winds, relative humidity, and more. So is it correct that all of the qc_agg parameters associated with a given variable contain only NaN values, i.e. no indication of the data quality at all?
I can see that most of these variables are listed as “In QARTOD development”, which might already answer my question above, but I thought I would ask for clarification!
I am planning to use the data in my Master's research. I have gone through most of the annotations for the data I am using, as I saw they had some notes about data quality. Do you have any other recommendations for assessing the quality of data that does not have a quality flag, other than the annotations and/or analysing the data myself? It isn't always apparent when something should be considered “suspect”.
Correct, the met data parameters currently do not have QARTOD tests running on their data streams, so the associated qc parameters will contain NaNs. However, these tests should be going live relatively soon, so keep an eye out for QARTOD tests on met data to start appearing in the near future! I will also note that we will not be testing any of the flux or derived parameters.
If you want to run some simple threshold tests, you can start by looking at the instrument page (METBKA - Ocean Observatories Initiative) and going to the manufacturer for basic instrument ranges. Alternatively, you can look at our qc-lookup table for the METBK instrument (https://github.com/oceanobservatories/qc-lookup/blob/master/qartod/metbk/metbk_qartod_gross_range_test_values.csv) and try to match the parameters. I would use the ‘fail_span’ as a starting point, since these are the manufacturers’ operational ranges for the instruments. These tables work on the backend of our system, so they don’t map 1:1 to the parameters you get from the Data Explorer ERDDAP; you’ll need to match the reference designator and parameter carefully.
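As a rough illustration, a do-it-yourself gross-range check using a fail_span works like this. The numeric range below is hypothetical, not taken from the actual METBK lookup table, so substitute the real fail_span for your reference designator and parameter:

```python
import numpy as np

def gross_range_flags(values, fail_span):
    """Flag values outside an operational range (fail_span) as
    QARTOD 4 (fail); in-range values get 1 (pass); NaNs get 9 (missing)."""
    values = np.asarray(values, dtype=float)
    lo, hi = fail_span
    flags = np.where((values < lo) | (values > hi), 4, 1)
    flags = np.where(np.isnan(values), 9, flags)
    return flags

# Hypothetical fail_span for an air temperature parameter
air_temp = [12.4, -75.0, 18.9, float('nan')]
gross_range_flags(air_temp, fail_span=(-60.0, 60.0))  # -> [1, 4, 1, 9]
```

This reproduces only the gross-range part of QARTOD; the full test suite also includes checks (e.g. climatology) that a fixed span cannot capture.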
Hope that helps.
Many thanks. I will have a look at that!