Download Multiple Data Files from PODAAC Drive Using wget

PostPosted: Thu Dec 01, 2016 10:30 am
by yiboj
This data recipe shows how to download multiple data files from PODAAC using the GNU wget command-line utility. GNU Wget is a free utility for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. It is a Unix-based command-line tool, but it is also available for other operating systems, such as Windows and Mac OS X.

[b][color=#FF0000]1. wget Command Options[/color][/b]

Here are a few frequently used key options:

[b]-nd[/b]
--no-directories
Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering (if a name shows up more than once, the filenames will get extensions '.n').


[b]-x[/b]
--force-directories
The opposite of '-nd': create a hierarchy of directories, even if one would not have been created otherwise. E.g. "wget -x http://podaac.jpl.nasa.gov/robots.txt" will save the downloaded file to podaac.jpl.nasa.gov/robots.txt.

[b]-nH[/b]
--no-host-directories
Disable generation of host-prefixed directories. By default, invoking Wget with "-r http://podaac.jpl.nasa.gov/" will create a structure of directories beginning with podaac.jpl.nasa.gov/. This option disables such behavior.

[b]-r[/b]
--recursive
Turn on recursive retrieving. The default maximum depth is 5.

[b]-l depth[/b]
--level=depth
Specify the maximum recursion depth, where depth is the number of levels.

[i]Try to specify the criteria that match the kind of download you are trying to achieve. If you want to download only one page, use '--page-requisites' without any additional recursion. If you want to download things under one directory, use '-np' to avoid downloading things from other directories. If you want to download all the files from one directory, use '-l 1' to make sure the recursion depth never exceeds one.[/i]
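
As a rough illustration of how the directory options above change where a downloaded file lands, the sketch below mimics wget's naming rules in plain shell. The function name local_path and the example.org URL are hypothetical, and --cut-dirs is a wget option that is not covered in the list above. The real behavior is defined by wget itself, so treat this only as a way to reason about the options.

```shell
#!/bin/sh
# Sketch: predict the local path wget would use for a recursively
# retrieved URL under different directory options.
# usage: local_path URL MODE [CUT]
#   MODE: default | nH (--no-host-directories) | nd (--no-directories)
#   CUT:  number of leading directories to drop (--cut-dirs=CUT)
# Assumes the URL has at least one "/" after the host name.
local_path() {
    url=$1; mode=$2; cut=${3:-0}
    rest=${url#*://}          # strip the scheme, e.g. "https://"
    host=${rest%%/*}          # host name
    path=${rest#*/}           # everything after the host
    file=${path##*/}          # final component (the file name)
    case $path in
        */*) dirs=${path%/*} ;;   # directory part of the path
        *)   dirs= ;;             # file sits directly under the host
    esac
    # Drop CUT leading directory components.
    i=0
    while [ "$i" -lt "$cut" ] && [ -n "$dirs" ]; do
        case $dirs in
            */*) dirs=${dirs#*/} ;;
            *)   dirs= ;;
        esac
        i=$((i + 1))
    done
    case $mode in
        nd) echo "$file" ;;
        nH) echo "${dirs:+$dirs/}$file" ;;
        *)  echo "$host/${dirs:+$dirs/}$file" ;;
    esac
}

local_path "https://example.org/a/b/c/file.nc" default   # example.org/a/b/c/file.nc
local_path "https://example.org/a/b/c/file.nc" nH        # a/b/c/file.nc
local_path "https://example.org/a/b/c/file.nc" nH 2      # c/file.nc
local_path "https://example.org/a/b/c/file.nc" nd        # file.nc
```

In other words, -nH removes only the host directory, --cut-dirs trims leading path components, and -nd discards the directory structure entirely.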

[b][color=#FF0000]2. Download multiple files from PODAAC FTP site[/color][/b]

Let's take the GHRSST SST Level 2 dataset from REMSS as an example. The dataset landing page is [url=https://podaac.jpl.nasa.gov/dataset/AMSRE-REMSS-L2P-v7a]https://podaac.jpl.nasa.gov/dataset/AMSRE-REMSS-L2P-v7a[/url]. The FTP link for this dataset is indicated by the red circle in Figure 1.

[attachment=1]amsr-e_ftp.png[/attachment]

* To download one day of data files
[code]
% wget -r -nc -np -nH -nd -A "*.nc" "ftp://podaac-ftp.jpl.nasa.gov/allData/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/001"
[/code]

* To download one year of data files and keep the sub-directories (omitting -nd preserves the directory tree; -d only turns on debug output)

[code]
% wget -r -nc -np -nH -d -A "*.nc" "ftp://podaac-ftp.jpl.nasa.gov/allData/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/"
[/code]

[b][color=#FF0000]3. Download multiple files from PODAAC Drive[/color][/b]

To access PODAAC Drive, all users must be registered with the NASA Earthdata system. Users can log in to PODAAC Drive at [url=https://podaac-tools.jpl.nasa.gov/drive/]https://podaac-tools.jpl.nasa.gov/drive/[/url]. Figure 2 shows the WebDAV/Programmatic API credentials, which are used later to access the files through the wget command. Please note that the password is encrypted and is different from the Earthdata Login password.

[attachment=0]podaac_drive.png[/attachment]
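
Passing --password on the command line leaves the credential in the shell history and the process list. One possible alternative, sketched below, is to store the Drive API credentials in a netrc file, which wget consults when no --user/--password is given. The helper name write_netrc_entry is hypothetical, and whether PODAAC Drive accepts credentials supplied this way is an assumption that should be checked with a small download first.

```shell
#!/bin/sh
# Sketch: append a "machine ... login ... password ..." entry to a netrc file.
# usage: write_netrc_entry FILE HOST LOGIN PASSWORD
write_netrc_entry() {
    file=$1
    # Create the file if needed and restrict it to the owner
    # before any secret is written to it.
    touch "$file"
    chmod 600 "$file"
    printf 'machine %s login %s password %s\n' "$2" "$3" "$4" >> "$file"
}

# Example (not run here): wget reads $HOME/.netrc, so the entry goes there,
# and the one-day download can then omit --user and --password.
# LOGIN and PASSWORD stand for the Drive API credentials.
# write_netrc_entry "$HOME/.netrc" podaac-tools.jpl.nasa.gov LOGIN PASSWORD
# wget -r -nc -np -nH -nd -A "*.nc" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/001/"
```

wget also has an --ask-password option, which prompts for the password interactively instead of reading it from the command line.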

Again, we take the GHRSST SST Level 2 dataset from REMSS as an example.

* To download one day of data files
[code]
% wget --user=LOGIN --password=PASSWORD -r -nc -np -nH -nd -A "*.nc" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/001/"
[/code]

* To download one year of data files and keep the sub-directories (omitting -nd preserves the directory tree; -d only turns on debug output)
[code]
% wget --user=LOGIN --password=PASSWORD -r -nc -np -nH -d -A "*.nc" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/"
[/code]
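
When a whole year is too much for one run, one alternative is to loop over the day-of-year directories and call the one-day command for each of them. The sketch below only prints the URLs. The helper name doy_urls is hypothetical, and it assumes the dataset uses three-digit day-of-year sub-directories, as in the 2011/001 example above.

```shell
#!/bin/sh
# Sketch: list the day-of-year directory URLs for a range of days.
# usage: doy_urls BASE_URL YEAR FIRST_DAY LAST_DAY
# Pass the days as plain integers (1, 45), not zero-padded (001, 045).
doy_urls() {
    base=${1%/}     # drop a trailing slash so the output has no "//"
    year=$2
    day=$3
    while [ "$day" -le "$4" ]; do
        # %03d zero-pads the day: 1 -> 001, 45 -> 045
        printf '%s/%s/%03d/\n' "$base" "$year" "$day"
        day=$((day + 1))
    done
}

BASE="https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7"

# Print the URLs for the first three days of 2011.
doy_urls "$BASE" 2011 1 3

# To download instead of print, feed each URL to the one-day command
# (LOGIN and PASSWORD are the Drive API credentials):
# doy_urls "$BASE" 2011 1 31 | while read -r url; do
#     wget --user=LOGIN --password=PASSWORD -r -nc -np -nH -nd -A "*.nc" "$url"
# done
```

Because the commented-out loop uses -nc, it can be re-run after an interruption without fetching files that are already on disk.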

Please refer to the following links for more detailed information:
[url=https://www.gnu.org/software/wget/]Download and Install wget[/url]
[url=https://www.gnu.org/software/wget/manual/wget.pdf]wget Manual in PDF Format[/url]

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Wed Jan 11, 2017 7:21 am
by mgangl
Another option people may be interested in is the -N option for wget:

[code]
-N,  --timestamping              don't re-retrieve files unless newer than local.
[/code]


With this, you can run the same command over and over on a top-level directory (say a year or the entire dataset top-level directory) and only download the newest files. This is a common case for many users, and we have other ways of addressing this same use case (using rsync and WebDAV).

So a quick change to the command may look like this (I'm using ASCAT data in this example):

[code]
wget --user=USER --password=PASSWORD -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive/files/allData/ascat/preview/L2/metop_a/coastal_opt/2017/011/
[/code]


This downloads a bunch of files in the 2017/011 directory. Keep running the command and you won't get any new files. But if we 'fake out' the server by setting the time of one of the downloaded files to a time before the file was created on the server, we can show how wget will download the newer data:

[code]
touch -t 201501010000 ascat_20170111_110000_metopa_53088_eps_o_coa_2401_ovw.l2.nc.gz
[/code]


The above command sets the timestamp of the given file to January 1st, 2015.

When we run the wget command again, you can see that it downloads the newer files from the server, but not the existing, matching files.
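
The same "newer than local" comparison can be tried without touching the server. The sketch below back-dates one scratch file with touch -t and uses the shell's -nt test to show which copy counts as newer. The file names are placeholders, and wget -N also re-downloads when the file sizes differ, which this sketch ignores.

```shell
#!/bin/sh
# Sketch: show that a back-dated local file compares as "older".
workdir=$(mktemp -d)
remote="$workdir/remote_copy.nc.gz"     # stands in for the file on the server
local_copy="$workdir/local_copy.nc.gz"  # stands in for the downloaded file

: > "$remote"                           # created now
: > "$local_copy"
touch -t 201501010000 "$local_copy"     # pretend it dates from 1 Jan 2015

# -nt is true when the first file has a later modification time.
if [ "$remote" -nt "$local_copy" ]; then
    verdict="remote is newer: wget -N would fetch it again"
else
    verdict="local is up to date: wget -N would skip it"
fi
echo "$verdict"
rm -rf "$workdir"
```

Note that -N takes the place of -nc in the command above; wget refuses to combine the two options.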

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Thu Dec 26, 2019 6:12 am
by bknd
wget --user=***** --password=***** -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive ... 1/2019/060

wget --user=bknd --password=*******i -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive ... /2017/011/

I used this command line to download the data, but it reports the following error:

Failed to authenticate by username and password

Can you help me?

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Fri Jan 03, 2020 12:32 pm
by yiboj
Hi,
Thanks for the inquiry.
The user name and password have to be the PO.DAAC Drive API Credentials (WebDAV) as shown in the attached screenshot. Please check and let us know if this works for you.
Regards,

PODAAC DE

Attachment: temp.png

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Wed Jan 22, 2020 1:17 am
by fionasupernova
Hi,

I am trying to download files by year (since I can't download all the files at once). I have the correct username and password, but I still can't download the files. Please help.

[code]
$ wget --user=USER --password=PASS -r -nc -np -nH -d -A "*.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/L4/GLOB/UKMO/OSTIA/2006/"
[/code]


Thanks,

Fiona

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Wed Jan 22, 2020 9:14 am
by yiboj
Hi,
Thanks for your inquiry. Please try the following command and let us know if this works for you.
[code]
$ wget --user=USER --password=PASS -r -nc -np -nH "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/L4/GLOB/UKMO/OSTIA/2006/"
[/code]

PODAAC DE

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Sat Feb 08, 2020 7:53 pm
by fionasupernova
This worked for me. Thanks! But how do I edit it so that I don't get all the sub-directories? Do I just use the "-nd" flag? (:

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Mon Feb 10, 2020 12:09 pm
by yiboj
Hi,
Thanks for your inquiry. The -nd switch is short for --no-directories, and it should work for this case.
Regards,
-PODAAC DE

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Thu Apr 30, 2020 10:39 am
by cmamos
Hi,

I'm trying to download all files from a given year. But the following code is only downloading, then removing, 'index.html.tmp'.

[code]
wget --user=USER --password=PASSWORD -r -nc -np -nH -nd -A "fv02_0-AVHRR_AMSR_OI.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/"
[/code]


The code will work if I download one day within the year using:

[code]
wget --user=USER --password=PASSWORD -r -nc -np -nH -nd -A "fv02_0-AVHRR_AMSR_OI.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/152/"
[/code]


but does not work to retrieve all daily files in one year.

Please help, thanks!

Re: Download Multiple Data Files from PODAAC Drive Using wge

PostPosted: Thu Apr 30, 2020 1:30 pm
by yiboj
Hi,
Thanks for your inquiry. Please try the following command and let us know if this works for you.

[code]
$ wget --user=USER --password=PASS -r -nc -np -nH "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/"
[/code]
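
Since the command above drops both -A and -nd, it downloads the full directory tree. If a flat directory containing only the matching files is still wanted, one possible follow-up is to collect them locally afterwards. This is only a sketch: the function name collect_matches and the directory name flat are hypothetical, and it assumes that file names are unique across the day directories and that the destination lies outside the source tree.

```shell
#!/bin/sh
# Sketch: move files matching a name pattern out of a downloaded tree
# into one flat directory.
# usage: collect_matches SRC_DIR DEST_DIR 'PATTERN'
collect_matches() {
    mkdir -p "$2"
    # -type f skips directories; quote the pattern so that find,
    # not the shell, expands the wildcard.
    find "$1" -type f -name "$3" -exec mv {} "$2"/ \;
}

# Example (not run here): with -nH, wget stores the tree under drive/files/...
# collect_matches drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002 flat '*fv02_0-AVHRR_AMSR_OI.nc.bz2'
```

Files that do not match the pattern, such as leftover index pages, stay in the source tree.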


Regards,

PODAAC DE