Using the Pydap Package to Access PO.DAAC OPeNDAP Datasets

Post by yiboj » Thu Sep 14, 2017 1:27 pm

Pydap makes it easy to access data from any OPeNDAP server. A remote dataset can be introspected and manipulated as if it were stored locally, and no data is downloaded until it is actually requested.

In this recipe, we use the PO.DAAC OPeNDAP server and the AVHRR_OI NCEI L4 dataset as an example. First, go to the OPeNDAP server at https://podaac-opendap.jpl.nasa.gov/ope ... HRR_OI/v2/ and pick a netCDF file from the NCEI L4 dataset, then copy the contents of its Data URL box.

Code:
>>> from pydap.client import open_url
>>> dataset = open_url('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2017/001/20170101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc')
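
The path of the file above appears to encode the date twice: once as a year/day-of-year directory (2017/001) and once as a timestamp at the start of the file name (20170101120000). As a rough sketch, and assuming every daily file in this collection follows the same layout as this single example, a hypothetical helper build_avhrr_oi_url could assemble the Data URL for another date. The helper name and the assumed layout are illustrations only, not part of pydap or PO.DAAC, so check the result against the server's directory listing before relying on it.

```python
from datetime import date

# Base path copied from the Data URL used in this recipe.
BASE = ("https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/"
        "GDS2/L4/GLOB/NCEI/AVHRR_OI/v2")
# File-name suffix copied from the same URL.
SUFFIX = "120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc"


def build_avhrr_oi_url(day: date) -> str:
    """Hypothetical helper: assumes a <year>/<day-of-year>/<file> layout."""
    doy = day.timetuple().tm_yday  # 1 January -> 1
    return f"{BASE}/{day.year}/{doy:03d}/{day:%Y%m%d}{SUFFIX}"


# Reproduces the URL opened above.
url = build_avhrr_oi_url(date(2017, 1, 1))
print(url)
```

The resulting string can be passed to open_url in the same way as the URL copied from the Data URL box.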


Here we use the pydap.client.open_url function to open a URL that specifies the location of the dataset. We can now list all the variables available in the data file and inspect one of them using the following commands:

Code:
>>> print(list(dataset.keys()))
['analysed_sst', 'analysis_error', 'lat', 'lon', 'mask', 'sea_ice_fraction', 'time', 'lat_bnds', 'lon_bnds', 'time_bnds']
>>> lat = dataset['lat']
>>> lon = dataset['lon']
>>> sst = dataset['analysed_sst']
>>> type(sst)
<class 'pydap.model.GridType'>
>>> sst.array.shape
(1, 720, 1440)
>>> sst.dimensions
('time', 'lat', 'lon')
>>> sst.maps
OrderedDict([('time', <BaseType with data BaseProxy('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2017/001/20170101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc', 'analysed_sst.time', dtype('>i4'), (1,), (slice(None, None, None),))>), ('lat', <BaseType with data BaseProxy('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2017/001/20170101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc', 'analysed_sst.lat', dtype('>f4'), (720,), (slice(None, None, None),))>), ('lon', <BaseType with data BaseProxy('https://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2017/001/20170101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc', 'analysed_sst.lon', dtype('>f4'), (1440,), (slice(None, None, None),))>)])
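
Everything shown so far comes from metadata; the variables are still proxies for data on the server. When part of an array is eventually requested, an OPeNDAP (DAP2) client describes it with a hyperslab of the form [start:stride:stop] per dimension, where stop is inclusive. The sketch below is only an illustration of that translation, using a hypothetical function to_hyperslab. It assumes non-negative integer indices and non-empty slices with positive steps, and the exact request string that pydap sends may differ.

```python
def to_hyperslab(index, shape):
    """Translate NumPy-style indices into a DAP2 hyperslab string.

    Illustration only: assumes non-negative integers and non-empty
    slices with a positive step.
    """
    parts = []
    for idx, size in zip(index, shape):
        if isinstance(idx, int):
            # A single index selects one element along that axis.
            start, stop, step = idx, idx + 1, 1
        else:
            # slice.indices() fills in defaults and clips to the axis length.
            start, stop, step = idx.indices(size)
        # Python's stop is exclusive; the DAP2 stop is inclusive.
        parts.append(f"[{start}:{step}:{stop - 1}]")
    return "".join(parts)


shape = (1, 720, 1440)  # sst.array.shape from the session above
index = (0, slice(10, 100), slice(10, 100))
print("analysed_sst" + to_hyperslab(index, shape))
# prints: analysed_sst[0:1:0][10:1:99][10:1:99]
```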


Finally, we can download or subset some data. To download data, we simply index the variable as we would a NumPy array, and the data for the corresponding subset is downloaded on the fly from the server:

Code:
>>> sst_subset = sst[0,10:100,10:100]  # this will download data from the server
>>> lat_subset = lat[10:100]
>>> lon_subset = lon[10:100]
>>> print(lat_subset)
[-87.375 -87.125 -86.875 -86.625 -86.375 -86.125 -85.875 -85.625 -85.375
 -85.125 -84.875 -84.625 -84.375 -84.125 -83.875 -83.625 -83.375 -83.125
 -82.875 -82.625 -82.375 -82.125 -81.875 -81.625 -81.375 -81.125 -80.875
 -80.625 -80.375 -80.125 -79.875 -79.625 -79.375 -79.125 -78.875 -78.625
 -78.375 -78.125 -77.875 -77.625 -77.375 -77.125 -76.875 -76.625 -76.375
 -76.125 -75.875 -75.625 -75.375 -75.125 -74.875 -74.625 -74.375 -74.125
 -73.875 -73.625 -73.375 -73.125 -72.875 -72.625 -72.375 -72.125 -71.875
 -71.625 -71.375 -71.125 -70.875 -70.625 -70.375 -70.125 -69.875 -69.625
 -69.375 -69.125 -68.875 -68.625 -68.375 -68.125 -67.875 -67.625 -67.375
 -67.125 -66.875 -66.625 -66.375 -66.125 -65.875 -65.625 -65.375 -65.125]
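
The printed latitudes are spaced 0.25 degrees apart, and index 10 corresponds to -87.375, which suggests that row 0 sits at -89.875. Under that assumption (a regular grid of 720 rows, inferred from the output above rather than read from the file's metadata), slice bounds for a latitude range can be computed instead of guessed. The function names below are hypothetical and this is only a sketch; the same idea should carry over to longitude once its first value has been checked.

```python
LAT0 = -89.875  # assumed centre of row 0, extrapolated from the output above
STEP = 0.25     # spacing read off the printed values
NLAT = 720      # length of the lat axis, from sst.array.shape


def lat_to_index(lat):
    """Index of the grid row whose centre is nearest to `lat`."""
    i = int(round((lat - LAT0) / STEP))
    return max(0, min(NLAT - 1, i))  # clamp to the valid range


def lat_slice(south, north):
    """Slice covering the rows from `south` to `north`, inclusive."""
    return slice(lat_to_index(south), lat_to_index(north) + 1)


print(lat_slice(-87.375, -65.125))
# prints: slice(10, 100, None)
```

The slice returned for -87.375 to -65.125 matches the lat[10:100] range used above, so the same object can be used to index lat and the latitude axis of sst.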