Download Multiple Data Files from PODAAC Drive Using wget

Post by yiboj » Thu Dec 01, 2016 10:30 am

This data recipe shows how to download multiple data files from PODAAC using the GNU Wget command-line utility. GNU Wget is a free utility for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. It is a Unix-based command-line tool, but is also available for other operating systems, such as Windows and Mac OS X.

[b][color=#FF0000]1. wget Command Options[/color][/b]

Here are a few frequently used key options:

[b]-nd[/b]
--no-directories
Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering (if a name shows up more than once, the filenames will get extensions '.n').


[b]-x[/b]
--force-directories
The opposite of '-nd': create a hierarchy of directories, even if one would not have been created otherwise. E.g. "wget -x http://podaac.jpl.nasa.gov/robots.txt" will save the downloaded file to podaac.jpl.nasa.gov/robots.txt.

[b]-nH[/b]
--no-host-directories
Disable generation of host-prefixed directories. By default, invoking Wget with "-r http://podaac.jpl.nasa.gov/" will create a structure of directories beginning with podaac.jpl.nasa.gov/. This option disables such behavior.

[b]-r[/b]
--recursive
Turn on recursive retrieving. The default maximum depth is 5.

[b]-l depth[/b]
--level=depth
Specify recursion maximum depth level depth.

[i]Try to specify the criteria that match the kind of download you are trying to achieve. If you want to download only one page, use '--page-requisites' without any additional recursion. If you want to download things under one directory, use '-np' to avoid downloading things from other directories. If you want to download all the files from one directory, use '-l 1' to make sure the recursion depth never exceeds one.[/i]
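
To make the effect of the directory options concrete, the following sketch mimics, in plain shell, where a recursively retrieved file would be saved by default, with '-nH', and with '-nd'. It does not call wget. The helper name local_path and the sample URL are made up for illustration, and details such as query strings and duplicate file names are ignored.

```shell
# Illustration only: approximate where "wget -r" would save a file.
# local_path is a hypothetical helper, not part of wget.
local_path() {
    _rest=${2#*://}                                # drop the scheme, e.g. "http://"
    case $1 in
        default) printf '%s\n' "$_rest" ;;         # host/dir/.../file
        nH)      printf '%s\n' "${_rest#*/}" ;;    # dir/.../file (no host directory)
        nd)      printf '%s\n' "${_rest##*/}" ;;   # file only (no directories)
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

url="http://podaac.jpl.nasa.gov/some/dir/file.nc"   # made-up example path
for mode in default nH nd; do
    printf '%-8s %s\n' "$mode" "$(local_path "$mode" "$url")"
done
```

Running it prints one line per mode, which shows how '-nH' removes only the leading host directory while '-nd' flattens everything into the current directory.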

[b][color=#FF0000]2. Download multiple files from PODAAC FTP site[/color][/b]

Let's take the GHRSST SST Level 2 dataset from REMSS as an example. The dataset landing page is [url=https://podaac.jpl.nasa.gov/dataset/AMSRE-REMSS-L2P-v7a]https://podaac.jpl.nasa.gov/dataset/AMSRE-REMSS-L2P-v7a[/url]. The FTP link for this dataset is indicated by the red circle in Figure 1.

[attachment=1]amsr-e_ftp.png[/attachment]

* To download the data files for one day
[code]
% wget -r -nc -np -nH -nd -A "*.nc" "ftp://podaac-ftp.jpl.nasa.gov/allData/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/001"
[/code]

* To download the data files for one year and create sub-directories

[code]
% wget -r -nc -np -nH -x -A "*.nc" "ftp://podaac-ftp.jpl.nasa.gov/allData/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/"
[/code]

[b][color=#FF0000]3. Download multiple files from PODAAC Drive[/color][/b]

To access PODAAC Drive, all users must be registered with the NASA Earthdata system. Users can log in to PODAAC Drive at [url=https://podaac-tools.jpl.nasa.gov/drive/]https://podaac-tools.jpl.nasa.gov/drive/[/url]. Figure 2 shows the WebDAV/Programmatic API credentials, which are used later to access the files through the wget command. Please note that this password is encrypted and is different from the Earthdata Login password.

[attachment=0]podaac_drive.png[/attachment]

Again, we take the GHRSST SST Level 2 dataset from REMSS as an example.

* To download the data files for one day
[code]
% wget --user=LOGIN --password=PASSWORD -r -nc -np -nH -nd -A "*.nc" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/001/"
[/code]

* To download the data files for one year and create sub-directories
[code]
% wget --user=LOGIN --password=PASSWORD -r -nc -np -nH -x -A "*.nc" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/GDS2/L2P/AMSRE/REMSS/v7/2011/"
[/code]
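
Passing --password on the command line leaves the credential visible in the shell history and in the process list. The wget manual mentions a .netrc file as one way to avoid this. The sketch below only formats such an entry. The helper name netrc_entry is hypothetical, LOGIN and PASSWORD are placeholders for the WebDAV/Programmatic API credentials, and the password is assumed to contain no whitespace.

```shell
# netrc_entry HOST LOGIN PASSWORD: print one .netrc line (hypothetical helper).
netrc_entry() {
    printf 'machine %s login %s password %s\n' "$1" "$2" "$3"
}

# Preview the entry; LOGIN and PASSWORD are placeholders.
netrc_entry podaac-tools.jpl.nasa.gov LOGIN PASSWORD

# To use it, append the line to ~/.netrc and restrict the file permissions:
#   netrc_entry podaac-tools.jpl.nasa.gov LOGIN PASSWORD >> ~/.netrc
#   chmod 600 ~/.netrc
```

With such an entry in place, it should be possible to drop --user and --password from the wget commands above, since wget looks up the host in ~/.netrc unless told otherwise.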

Please refer to the following links for more detailed information:
[url=https://www.gnu.org/software/wget/]Download and Install wget[/url]
[url=https://www.gnu.org/software/wget/manual/wget.pdf]wget Manual in PDF Format[/url]
Attachments
Figure 1: FTP Link of Dataset (amsr-e_ftp.png)
Figure 2: PODAAC Drive Login Credential Screen (podaac_drive.png)
yiboj
 
Posts: 115
Joined: Mon Mar 30, 2015 11:22 am

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by mgangl » Wed Jan 11, 2017 7:21 am

Another option people may be interested in is the -N option for wget:

[code]
-N,  --timestamping              don't re-retrieve files unless newer than local.
[/code]


With this, you can run the same command over and over on a top level directory (say a year or the entire dataset top level directory) and only download the newest files. This is a common case for many users and we have other ways of addressing this same use case (using rsync and WebDAV).

So a quick change to the command may look like this (I'm using ASCAT data in this example):

[code]
wget --user=USER --password=PASSWORD -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive/files/allData/ascat/preview/L2/metop_a/coastal_opt/2017/011/
[/code]


This downloads a bunch of files in the 2017/011 directory. Keep running the command and you won't get any new files. But if we 'fake out' the server by setting the time of one of the downloaded files to a time before the file was created on the server, we can show how wget will download newer data:

[code]
touch -t 201501010000 ascat_20170111_110000_metopa_53088_eps_o_coa_2401_ovw.l2.nc.gz
[/code]


The above command sets the timestamp of the given file to January 1st, 2015.

When we run the wget command again, you can see that it downloads the newer files from the server, but not the existing, matching files.
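
The experiment above can also be tried locally without contacting the server. This sketch backdates a file with touch -t and then applies a simplified form of the check behind -N: fetch when the local copy is missing or older. The helper needs_refresh and the temporary files are illustrative only. wget itself compares against the timestamp reported by the server and also re-downloads when the file sizes differ.

```shell
# needs_refresh LOCAL REFERENCE: succeed if LOCAL is missing or older than REFERENCE.
# Hypothetical helper; a simplified stand-in for the decision wget -N makes.
needs_refresh() {
    [ ! -e "$1" ] || [ "$1" -ot "$2" ]
}

workdir=$(mktemp -d)
remote="$workdir/remote.nc.gz"       # stands in for the copy on the server
local_copy="$workdir/local.nc.gz"    # stands in for the downloaded copy

touch "$remote" "$local_copy"
touch -t 201501010000 "$local_copy"  # backdate the local copy to 1 January 2015

if needs_refresh "$local_copy" "$remote"; then
    echo "local copy is older: it would be downloaded again"
else
    echo "local copy is up to date: it would be skipped"
fi

rm -rf "$workdir"
```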
mgangl
 
Posts: 12
Joined: Wed Apr 27, 2016 1:31 pm

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by bknd » Thu Dec 26, 2019 6:12 am

wget --user=***** --password=***** -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive ... 1/2019/060

wget --user=bknd --password=*******i -r -N -np -nH -d -A "*.nc.gz" https://podaac-tools.jpl.nasa.gov/drive ... /2017/011/

I used this command line to download the data, but it reports the following error:

Échec d’authentification par identifiant et mot de passe (Failed to authenticate by username and password)

Can you help me?
bknd
 
Posts: 1
Joined: Thu Dec 26, 2019 5:58 am

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by yiboj » Fri Jan 03, 2020 12:32 pm

Hi,
Thanks for the inquiry.
The user name and password have to be the PO.DAAC Drive API Credentials (WebDAV) as shown in the attached screenshot. Please check and let us know if this works for you.
Regards,

PODAAC DE

[Attachment: temp.png]

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by fionasupernova » Wed Jan 22, 2020 1:17 am

Hi,

I am trying to download files by year (since I can't download all the files at once). I have the correct username and password, but I still can't download the files. Please help.

[code]
$ wget --user=USER --password=PASS -r -nc -np -nH -d -A "*.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/L4/GLOB/UKMO/OSTIA/2006/"
[/code]


Thanks,

Fiona
fionasupernova
 
Posts: 3
Joined: Wed Jan 22, 2020 1:10 am

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by yiboj » Wed Jan 22, 2020 9:14 am

Hi,
Thanks for your inquiry. Please try the following command and let us know if this works for you.
[code]
$ wget --user=USER --password=PASS -r -nc -np -nH "https://podaac-tools.jpl.nasa.gov/drive/files/OceanTemperature/ghrsst/data/L4/GLOB/UKMO/OSTIA/2006/"
[/code]

PODAAC DE

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by fionasupernova » Sat Feb 08, 2020 7:53 pm

This worked for me. Thanks! But how do I change it to download without all the sub-directories? Do I just use the "-nd" flag? (:

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by yiboj » Mon Feb 10, 2020 12:09 pm

Hi,
Thanks for your inquiry. The -nd switch is short for --no-directories, and it should work in this case.
Regards,
-PODAAC DE

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by cmamos » Thu Apr 30, 2020 10:39 am

Hi,

I'm trying to download all files from a given year. But the following code is only downloading, then removing, 'index.html.tmp'.

[code]
wget --user=USER --password=PASSWORD -r -nc -np -nH -nd -A "fv02_0-AVHRR_AMSR_OI.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/"
[/code]


The code will work if I download one day within the year using:

[code]
wget --user=USER --password=PASSWORD -r -nc -np -nH -nd -A "fv02_0-AVHRR_AMSR_OI.nc.bz2" "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/152/"
[/code]


but does not work to retrieve all daily files in one year.

Please help, thanks!
cmamos
 
Posts: 4
Joined: Thu Apr 30, 2020 10:32 am

Re: Download Multiple Data Files from PODAAC Drive Using wget

Post by yiboj » Thu Apr 30, 2020 1:30 pm

Hi,
Thanks for your inquiry. Please try the following command and let us know if this works for you.

[code]
$ wget --user=USER --password=PASS -r -nc -np -nH "https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002/"
[/code]
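
Since the single-day command is reported to work, another possible approach (an assumption, not something confirmed in this thread) is to loop over the day-of-year directories and run that command once per day. The sketch below only prints the commands. The helper name day_urls is hypothetical, the day numbers must be given as plain integers without leading zeros, and USER and PASSWORD are placeholders.

```shell
# day_urls BASE FIRST LAST: print one URL per zero-padded day-of-year directory.
# Hypothetical helper; FIRST and LAST are plain integers (1, not 001).
day_urls() {
    _base=${1%/}                      # tolerate a trailing slash on BASE
    _day=$2
    while [ "$_day" -le "$3" ]; do
        printf '%s/%03d/\n' "$_base" "$_day"
        _day=$((_day + 1))
    done
}

base="https://podaac-tools.jpl.nasa.gov/drive/files/allData/ghrsst/data/L4/GLOB/NCDC/AVHRR_AMSR_OI/2002"

# Dry run: print the command for days 152 to 154.
# Remove "echo" to actually download.
day_urls "$base" 152 154 | while read -r url; do
    echo wget --user=USER --password=PASSWORD -r -nc -np -nH -nd \
        -A "fv02_0-AVHRR_AMSR_OI.nc.bz2" "$url"
done
```

Days for which no directory exists on the server would be expected to produce an error from wget for that URL, while the loop carries on with the next day.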


Regards,

PODAAC DE