Multiple File Download from PO.DAAC Drive Using Python

Re: Multiple File Download from PO.DAAC Drive Using Python

Postby rattana06 » Wed Aug 05, 2020 11:56 pm

Dear PODAAC database experts,

I have used the python program of the discussion above to download SST data from GHRSST.
I ran the following command in Anaconda on Windows 10:

"python drive_download.py -u Username:Password -s 20020601 -f 20200804 -x MUR-JPL-L4-GLOB-v4.1 -t wget"


Note that the username and password I used are the ones obtained from
"Access PO.DAAC Drive API Credentials"

After running the code I got the following errors:
"
Please wait while program searching for the granules ...

OK to download? [yes or no]: yes
--2020-08-06 13:30:51-- https://podaac-tools.jpl.nasa.gov/drive ... -fv04.1.nc
Resolving podaac-tools.jpl.nasa.gov (podaac-tools.jpl.nasa.gov)... 137.78.248.120
Connecting to podaac-tools.jpl.nasa.gov (podaac-tools.jpl.nasa.gov)|137.78.248.120|:443... connected.
ERROR: The certificate of 'podaac-tools.jpl.nasa.gov' is not trusted.
ERROR: The certificate of 'podaac-tools.jpl.nasa.gov' hasn't got a known issuer.

"

It warns about an untrusted certificate. Can you explain what I did wrong here?
Thank you in advance.
rattana06
 
Posts: 1
Joined: Wed Aug 05, 2020 11:30 pm

Re: Multiple File Download from PO.DAAC Drive Using Python

Postby yiboj » Thu Aug 06, 2020 10:11 am

Hi,
Thanks for your inquiry and support.

First, you need to install the Cygwin package ca-certificates via Cygwin's setup.exe to get the certificates.
Second, you need to tell wget where your certificates are, since it doesn't pick them up by default in a Cygwin environment. You can do that either with the command-line parameter --ca-directory=/usr/ssl/certs (best for shell scripts) or by adding ca_directory = /usr/ssl/certs to your ~/.wgetrc file.

You can also disable SSL certificate checking by modifying the wget call in the Python script:

wget --no-check-certificate
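
As a rough sketch of the two options above, the wget call could be assembled along these lines. The helper names build_wget_command and download, and the use of --user/--password to pass the credentials, are assumptions for illustration and are not taken from drive_download.py; --ca-directory and --no-check-certificate are the wget options mentioned above.

```python
import subprocess


def build_wget_command(url, username, password,
                       ca_directory=None, check_certificate=True):
    """Return the wget argument list for one file (hypothetical helper)."""
    cmd = ["wget", f"--user={username}", f"--password={password}"]
    if ca_directory is not None:
        # Option 1: point wget at the installed CA certificates.
        cmd.append(f"--ca-directory={ca_directory}")
    if not check_certificate:
        # Option 2: skip certificate verification altogether.
        cmd.append("--no-check-certificate")
    cmd.append(url)
    return cmd


def download(url, username, password, **options):
    """Run wget for one URL and raise if wget exits with an error."""
    subprocess.run(build_wget_command(url, username, password, **options),
                   check=True)
```

Calling download(url, user, password, ca_directory="/usr/ssl/certs") keeps certificate verification on, so it is the safer of the two choices; check_certificate=False should be a last resort.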

Hope this helps and please let us know.

Regards,

PODAAC DE
yiboj
 
Posts: 130
Joined: Mon Mar 30, 2015 11:22 am

Re: Multiple File Download from PO.DAAC Drive Using Python

Postby arundeep » Fri Sep 24, 2021 3:01 am

tanyapak wrote:Hi!

I don't understand where I'm supposed to add the username and password, and specify the url to datasets. Could you please give an example?



Hi,

I am hoping that you had the same issue I faced recently; I did not find an answer in the forum here. Just to be clear, my scenario is as follows.
I have a direct URL to a file on PO.DAAC Drive, given on a page from NASA. If I click on the Get data button, it gives me the URL.

The script here seems to be meant for a more complex scenario and did not help me.
I found a link about accessing EarthData login supported files here. There are 3 methods given.
I tried all 3, and on Python 3.8 the 2nd approach worked for me: the one that overrides the requests.Session class. As I did not need to save the file to disk, I had to modify the code a little bit.
Code: Select all
#response = session.get(url, stream=True)
response = session.get(url)


I removed the code below:
Code: Select all
    with open(filename, 'wb') as fd:
        for chunk in response.iter_content(chunk_size=1024*1024):
            fd.write(chunk)


and used response.text directly, parsing it into a DataFrame through io.StringIO:

Code: Select all
pd.read_table(io.StringIO(response.text), skiprows=48, delim_whitespace=True, header=None, index_col=False)
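
For anyone who wants to try the parsing step without a download, here is a small self-contained sketch. The function name parse_whitespace_table and the sample text are made up for illustration; the 48 skipped rows above are specific to my file, so the number of header lines is a parameter here. It uses sep=r"\s+" in place of delim_whitespace=True, which recent pandas releases (2.2 and later) mark as deprecated.

```python
import io

import pandas as pd


def parse_whitespace_table(text, header_lines):
    """Parse whitespace-delimited text into a DataFrame (hypothetical helper).

    header_lines is the number of leading lines to skip before the data.
    """
    return pd.read_table(io.StringIO(text), skiprows=header_lines,
                         sep=r"\s+", header=None, index_col=False)


# Made-up sample: two header lines followed by two data rows.
sample = "HDR line one\nHDR line two\n1993.0 1.5 2\n1993.1 2.5 3\n"
df = parse_whitespace_table(sample, header_lines=2)
print(df.shape)  # (2, 3)
```

With a real response, the call would be parse_whitespace_table(response.text, header_lines=48) for a file laid out like mine.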


I hope it helps someone who comes to this page with as little understanding of PO.DAAC structures and Python as I had.

EDIT: The 3rd option also worked after I figured out the issue.

I had to change the code:
Code: Select all
   #session.auth = (username, password)
   #r1 = session.request('get', url)
       
   r = session.get(url, auth=(username, password))



NOTE: The password here is the password for the web drive, not the default login password.
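
Putting the 3rd option together, a minimal sketch could look like this. The function name fetch_text and the optional session argument are my own additions for illustration; the only part taken from the change above is passing auth=(username, password) directly to session.get.

```python
import requests


def fetch_text(url, username, password, session=None):
    """Fetch a protected file and return its body as text (hypothetical helper).

    The credentials are passed on the request itself rather than being
    set on the session beforehand.
    """
    if session is None:
        session = requests.Session()
    response = session.get(url, auth=(username, password))
    # Raise an exception for 4xx/5xx answers instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

The returned text can then go straight into io.StringIO for pandas, without writing anything to disk.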
arundeep
 
Posts: 1
Joined: Fri Sep 24, 2021 2:31 am

Re: Multiple File Download from PO.DAAC Drive Using Python

Postby sjainnahta » Sun Oct 24, 2021 11:39 pm

Hi,

Let me add a few remarks:

getSolution() is more powerful than getSlack(); while both can be called with (numerical) indices, constraint/variable objects, and constraint/variable names, both alone or in lists/arrays, getSolution() allows for more complex parameters such as expressions or lists thereof: one can call getSolution(x[1] + x[2]) or even getSolution([[x[0]**1, x[1]**2], [x[2]**3, x[3]**4]]), while the same can't happen for getSlack();

the above difference stems from the more general function xpress.evaluate(), which evaluates any expression (e.g. a constraint's lhs or part of the objective) using a variable assignment that can be different from the current solution and/or a different problem object.

getSolution() is little more than a wrapper for xpress.evaluate();
other functions behave like getSlack(): these are getDual() and getRCost();
efficiency won't be much affected by calling getSlack() or getSolution() with a small subset: the Python interface uses the C API functions for retrieving slack and solution, which however return the whole solution vector requested. The Python interface then takes care of selecting the right subset.

As stated in the Modeling chapter of the Python interface's reference manual, calling getSolution(x[i]) or getSlack(constrs[i]) several times will result in as many C API calls that return the whole vector. It is advisable instead to call these functions once.
sjainnahta
 
Posts: 1
Joined: Sun Oct 24, 2021 11:31 pm
