PO.DAAC Drive Data Recipes

Postby mgangl » Tue Dec 20, 2016 10:09 am

PO.DAAC Drive is an Earthdata Login enabled, FTP-like alternative for accessing data at PO.DAAC. Because our users are so accustomed to FTP access, we are putting together the following list of data recipes, functions, and tutorials on how to best interact with Drive and how to use Drive in your current workflows. This list will grow over time.


Last edited by mgangl on Tue Dec 20, 2016 10:51 am, edited 4 times in total.
mgangl
 
Posts: 12
Joined: Wed Apr 27, 2016 1:31 pm

Re: PO.DAAC Drive Data Recipes

Postby mgangl » Tue Dec 20, 2016 10:11 am

Finding your username and password for use in automated scripts or WebDAV.

In order to access PO.DAAC Drive, all users are required to be registered with the NASA Earthdata system. Users can log in to PO.DAAC Drive using the following link: https://podaac-uat.jpl.nasa.gov/drive/. The figure below shows the WebDAV/Programmatic API credentials, which will be used later to access files through the wget command. Please note that this password is an encrypted token; it is different from your Earthdata (URS) password.

[Image: PO.DAAC Drive WebDAV/Programmatic API credentials]

The username and password shown above allow users to connect to PO.DAAC Drive without using their Earthdata Login password. This is particularly helpful when embedding your credentials in command-line or automated processes. We use this username/password token in the examples below for automated processing and WebDAV setup.
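For example, here is a minimal wget sketch using those Drive credentials. Treat it as an assumption-laden illustration: the placeholder username, password token, and file path (including the /drive/files/ portion appended to the UAT URL linked above) all need to be replaced with your own values.

Code:
# Minimal sketch (placeholder URL path and credentials): fetch a single file from
# PO.DAAC Drive using the WebDAV/Programmatic API username and encrypted password token.
wget --user=YOUR_DRIVE_USERNAME --password=YOUR_DRIVE_PASSWORD \
     "https://podaac-uat.jpl.nasa.gov/drive/files/PATH/TO/FILE.nc"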
Last edited by mgangl on Tue Dec 20, 2016 10:18 am, edited 1 time in total.
mgangl
 
Posts: 12
Joined: Wed Apr 27, 2016 1:31 pm

Re: PO.DAAC Drive Data Recipes

Postby mgangl » Tue Dec 20, 2016 10:13 am

Copying 'latest' files from PO.DAAC via WebDAV

This post assumes you have already set up WebDAV with PO.DAAC Drive. For more information on that, please see the Drive help pages.

We are assuming we've mounted the PO.DAAC Drive WebDAV share at '/Volumes/files' in this instance.


A common use case is to pull data from PO.DAAC to some local disk for further processing. Users don't want to download all of the data each time they make a request; they only want to fetch data that's new or has changed. Using the command line tool 'rsync', we can accomplish this.

Once Drive is mounted via WebDAV, we use the rsync command to copy new or updated files to our local disk:

Code:
# General form: options, then source directory, then destination directory
rsync -avzh SOURCE_DIR DESTINATION_DIR

# Copy one cycle subdirectory (c311) from the Drive WebDAV mount to local disk
rsync -avzh /Volumes/files/allData/ostm/preview/L2/GPS-OGDR/c311 /data/archive/ostm/preview/L2/GPS-OGDR/


The above code will copy the files from the source directory (/Volumes/files/allData/ostm/preview/L2/GPS-OGDR/c311) to the destination directory (/data/archive/...). This example uses just one subdirectory, c311. The more of the path we truncate from the source (i.e., the higher up the directory tree we sync from), the larger our rsync job will become. For example, if we ran the following:

Code:
# Copy every cycle subdirectory under GPS-OGDR
rsync -avzh /Volumes/files/allData/ostm/preview/L2/GPS-OGDR/ /data/archive/ostm/preview/L2/GPS-OGDR


We would rsync ALL the cycle subdirectories. This would, in effect, copy all existing mission data to your destination directory.

So we've now copied all existing data for a cycle (or for the whole dataset, if you executed the second rsync). But this dataset is still ongoing. How do we capture newly created data hours, days, or weeks later? Simple: we run the exact same command again!

Code:
rsync -avzh /Volumes/files/allData/ostm/preview/L2/GPS-OGDR/ /data/archive/ostm/preview/L2/GPS-OGDR


Running the command again will check the source directory against the local directory and download any new or changed files. To automate this entire process, we can create a cron job that runs on whatever schedule we'd like. For this example, we will run the rsync job every night at 4 am:

Code:
# Open the crontab editor with 'crontab -e', add the following line, then save and exit
crontab -e
0 4 * * * rsync -avzh /Volumes/files/allData/ostm/preview/L2/GPS-OGDR/c311 /data/archive/ostm/preview/L2/GPS-OGDR/ > /data/archive/rsync.log
crontab: installing new crontab



This tells cron to run on the 0th minute of the 4th hour, every day of every month, on every day of the week. We also write the rsync output to a log file (rsync.log) so we can investigate any issues that may arise.
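If you prefer, the cron entry can call a small wrapper script instead of a long one-line command. The sketch below is only an illustration under assumed names (the script name, log path, and the choice to append timestamped output are not part of the recipe above):

Code:
#!/bin/bash
# sync_gps_ogdr.sh -- hypothetical wrapper for the nightly rsync job.
# Appends timestamped output to the log so earlier runs are not overwritten.
LOG=/data/archive/rsync.log
echo "=== rsync started $(date) ===" >> "$LOG"
rsync -avzh /Volumes/files/allData/ostm/preview/L2/GPS-OGDR/ \
      /data/archive/ostm/preview/L2/GPS-OGDR/ >> "$LOG" 2>&1
echo "=== rsync finished $(date) ===" >> "$LOG"

The cron line would then simply call the script, e.g. 0 4 * * * /path/to/sync_gps_ogdr.sh.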
mgangl
 
Posts: 12
Joined: Wed Apr 27, 2016 1:31 pm

Re: PO.DAAC Drive Data Recipes

Postby mariakatosvich » Tue Mar 07, 2017 7:36 am

Using the command line tool 'rsync', can we fetch data that's old or hasn't changed?
mariakatosvich
 
Posts: 3
Joined: Sun Jun 26, 2016 9:37 pm

Re: PO.DAAC Drive Data Recipes

Postby mgangl » Tue Mar 07, 2017 7:44 am

mariakatosvich wrote:Using the command line tool 'rsync', can we fetch data that's old or hasn't changed?


The first time you run the rsync command it will grab all the existing data it finds. If you run it again, it should grab only new or changed data. So yes, it will sync the old data. The wget recipes are also very good at grabbing a bunch of historical data, if you're interested in those as well.
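As a rough illustration of that wget approach, a recursive call along the following lines can pull down a whole collection over HTTPS. This is only a sketch: the /drive/files/ portion of the UAT URL, the --cut-dirs depth, and the placeholder credentials are assumptions to adjust for your own account and target directory.

Code:
# Hypothetical recursive download of a collection with wget
# (-r recursive, -np don't ascend to the parent directory, -nH drop the hostname directory,
#  --cut-dirs trims leading path components from the saved directory structure)
wget -r -np -nH --cut-dirs=3 \
     --user=YOUR_DRIVE_USERNAME --password=YOUR_DRIVE_PASSWORD \
     "https://podaac-uat.jpl.nasa.gov/drive/files/allData/ostm/preview/L2/GPS-OGDR/"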
mgangl
 
Posts: 12
Joined: Wed Apr 27, 2016 1:31 pm

