Difficulty downloading data from ERDDAP Tabledap

I am using this R script to try to download glider files from ERDDAP Tabledap, but it keeps throwing errors. The script successfully downloads anywhere from one to several files before it gives the error. It never fails on the same file, nor after the same number of files. Is there something wrong on the server end?

Here is the error*********************
Error in download.file(file_url, destination, mode = "wb") :
  cannot open URL 'https://gliders.ioos.us/erddap/tabledap/cp_374-20140416T1634-delayed.nc?'
In addition: Warning message:
In download.file(file_url, destination, mode = "wb") :
  URL 'https://gliders.ioos.us/erddap/tabledap/cp_374-20140416T1634-delayed.nc': status was 'Failure when receiving data from the peer'
******End of Error message

Here is the R script, kindly provided by Ian Black*******

# Install required packages.

pkgs <- c('ncdf4','rerddap')
install.packages(pkgs)

library('rerddap')

save_directory = 'C:/Users/rvaillancourt/OneDrive - Millersville University/Documents/AA_Pioneer Array/Glider Data by Year and Month/2014-2022_May2024/' # Need a / on the end to signify a directory.

download_gliderdac_file <- function(dataset_id, save_dir, timeout = 60*1000000){
  options(timeout = timeout) # Client-side timeout, in seconds.
  base = 'https://gliders.ioos.us/erddap/tabledap/' # Same base URL that appears in the error message.
  file_url = paste(base, dataset_id, '.nc?', sep = '')
  destination = paste(save_dir, dataset_id, '.nc', sep = '')
  download.file(file_url, destination, mode = 'wb') # 'wb' keeps the NetCDF file binary-safe on Windows.
}

server <- 'https://gliders.ioos.us/erddap/' # Assumed from the tabledap URL in the error message.
all_deployments = global_search(query = 'size', server, 'tabledap')
all_datasets = unlist(all_deployments[2], use.names = FALSE) # Get all deployments available through the GliderDAC ERDDAP.
pioneer_datasets = c() # Holder for PA gliders.
for (i in seq_along(all_datasets)){
  if (grepl('cp_', all_datasets[i])){ # Only keep PA glider deployments.
    pioneer_datasets = c(pioneer_datasets, all_datasets[i])
  }
}

timeout = 60*1000000
# Skip the first three deployments; tail() avoids indexing past the end of the vector.
for (dataset_id in tail(pioneer_datasets, -3)){
  download_gliderdac_file(dataset_id, save_directory, timeout)
}

End of script**********

Hi Bob, I asked some colleagues in my group, and they think it's probably a timeout issue on the server. Your script sets a large timeout on your end, but it's most likely the server that is struggling to compile and cache the datasets within the time allotted. Sometimes the request will work on the 2nd or 3rd try, because the server can reuse the work it cached during the earlier attempts. But that cache eventually expires, which is why the same request might fail again in the future. Annoying, I know.
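Since a failed request often succeeds on a second or third attempt, one option is to wrap download.file in a retry loop. This is only a rough sketch: download_with_retry, max_tries, wait_seconds and the downloader argument are names made up for illustration and are not part of the original script or of rerddap.

```r
# Hypothetical helper: try a download several times, waiting longer after each failure.
# The `downloader` argument exists only so the function can be exercised without a network.
download_with_retry <- function(file_url, destination, max_tries = 3,
                                wait_seconds = 30,
                                downloader = utils::download.file) {
  for (attempt in seq_len(max_tries)) {
    ok <- tryCatch({
      status <- downloader(file_url, destination, mode = "wb")
      identical(as.integer(status), 0L) # download.file returns 0 on success.
    }, error = function(e) {
      message("Attempt ", attempt, " failed: ", conditionMessage(e))
      FALSE
    })
    if (ok) return(TRUE)
    # Remove any partial file so it is not mistaken for a finished download.
    if (file.exists(destination)) file.remove(destination)
    # Give the server time to finish building its cache before asking again.
    if (attempt < max_tries) Sys.sleep(wait_seconds * attempt)
  }
  FALSE
}
```

Inside download_gliderdac_file, the download.file(file_url, destination, mode = 'wb') call could then be swapped for download_with_retry(file_url, destination). The wait times are guesses; I don't know how long the server keeps its cache.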

I don't know how the IOOS Glider ERDDAP server is set up. They might be able to extend the timeout on their end, but they might already have a large value set.

ERDDAP is a great tool, but I've heard it can be tricky to get it to work well for every dataset and request. This is especially true when requesting multiple glider deployments, each of which can require combining thousands of files because of the way the IOOS community archives individual profiles.

My only suggestion is to request deployments slowly :wink:
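In script form, "slowly" could mean pausing between deployments and skipping files that are already on disk, so that a rerun after a failure does not ask the server for the same data twice. A minimal sketch, assuming the same URL pattern as the script above; download_slowly, pause_seconds and the downloader argument are hypothetical names, and the 60 second pause is an arbitrary starting point.

```r
# Hypothetical helper: download deployments one at a time with a pause in between.
# save_dir is expected to end in "/", as in the original script.
download_slowly <- function(dataset_ids, save_dir, base_url,
                            pause_seconds = 60,
                            downloader = utils::download.file) {
  downloaded <- character(0)
  for (dataset_id in dataset_ids) {
    destination <- paste0(save_dir, dataset_id, ".nc")
    if (file.exists(destination)) next # Already downloaded on an earlier run.
    file_url <- paste0(base_url, dataset_id, ".nc?")
    ok <- tryCatch({
      downloader(file_url, destination, mode = "wb")
      TRUE
    }, error = function(e) {
      message("Skipping ", dataset_id, " for now: ", conditionMessage(e))
      FALSE
    })
    if (ok) {
      downloaded <- c(downloaded, dataset_id)
    } else if (file.exists(destination)) {
      file.remove(destination) # Drop partial files so the next run retries them.
    }
    Sys.sleep(pause_seconds) # Give the server a rest before the next request.
  }
  downloaded
}
```

Calling it again with the same arguments only requests the deployments that are still missing, so it can be rerun until everything has arrived.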

Hi Sage,

I was thinking the same thing, that it was a timeout issue on ERDDAP's end. I think the easiest approach is to download the files individually, directly from ERDDAP. It will take a while, but since the bottleneck is at their end, users have no option other than to work with their download utility.

Thanks Sage.

Bob