OPeNDAP (Open-source Project for a Network Data Access Protocol) is a standard interface for transferring, transforming and subsetting Earth science data. It’s capable of subsetting by time, space and variable, and is a great tool for those looking to reduce the amount of gridded data required for research or analysis to a specific region, time, or set of variables. For more information, please visit Earthdata’s Developer Portal.
“OPeNDAP in the cloud” is a hosted Earthdata enterprise service that provides a single interface to all Earthdata cloud data enabled for OPeNDAP. Because the service is now hosted, some of the ways users interact with the tool have changed.
OPeNDAP-in-the-cloud works by file (that is, by granule). In its current form, it expects each request to target a specific data file. As a result, access is driven by search and metadata rather than by browsing an entire collection of files. Search and metadata are provided by the Earthdata Common Metadata Repository (CMR).
Existing (non-cloud) OPeNDAP traditionally worked on top of a file system, so it mirrored the PO.DAAC directory structure. If you’re familiar with PO.DAAC Drive, the OPeNDAP layout looks exactly the same. The move to the cloud removes the fundamental idea of a ‘directory structure’, which means that the familiar interactive (click-through) OPeNDAP interface is no longer available. The alternatives available to users are described below.
A few notes:
An “Earthdata Login” is required to access data files from within OPeNDAP-in-the-cloud. This service, provided by the EOSDIS program, is openly available to all free of charge, except where governed by internal agreements. If you access OPeNDAP without being logged in, you will be prompted for your Earthdata username and password.
Supported Formats: OPeNDAP currently works with HDF5 and netCDF-4 granules. If the granule’s native format is not one of these (e.g., Zarr, HDF4, netCDF-3, ASCII, txt), then the OPeNDAP service is not available for that granule.
OPeNDAP Download Format Options: OPeNDAP provides download options in CSV, DAP4 Binary, netCDF-3, and netCDF-4. Of these, netCDF-4 is recommended, because netCDF-3 supports a more limited set of variable types.
It is recommended that you use variable subsetting in OPeNDAP data requests. By design, OPeNDAP’s dmr++ content directly accesses the data values held in the source granule file, without having to retrieve the entire file and work on it locally, even when the file is stored in a web object store such as S3 (see more information here). This is beneficial when granules are large: a full granule downloaded without subsetting may exceed your local machine’s storage capacity.
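As a rough illustration of variable subsetting, the sketch below builds a DAP4 data request URL that asks for only selected variables in one of the download formats listed above. Several details are assumptions rather than statements from this page: the base URL `https://opendap.earthdata.nasa.gov`, the `/collections/<concept id>/granules/<granule name>` layout, the response suffixes (`.dap.nc4`, `.dap.nc`, `.dap.csv`, `.dap`), and the `dap4.ce` query parameter. The concept ID, granule name, and variable names are hypothetical placeholders. Compare the result with the link the OPeNDAP form generates for your own granule before relying on it.

```python
from urllib.parse import quote

# Assumed base URL of the hosted OPeNDAP service; verify against your granule's OPeNDAP link.
OPENDAP_BASE = "https://opendap.earthdata.nasa.gov"

# Assumed DAP4 response suffixes for the download formats listed above.
FORMAT_SUFFIX = {
    "netcdf4": ".dap.nc4",
    "netcdf3": ".dap.nc",
    "csv": ".dap.csv",
    "dap4": ".dap",
}


def build_subset_url(concept_id, granule_name, variables, fmt="netcdf4"):
    """Return a DAP4 data request URL limited to the given variables."""
    if not variables:
        raise ValueError("list at least one variable to subset")
    suffix = FORMAT_SUFFIX[fmt]
    # DAP4 constraint expression: fully qualified variable names joined by ';'.
    ce = ";".join("/" + name.lstrip("/") for name in variables)
    granule_url = (
        f"{OPENDAP_BASE}/collections/{concept_id}/granules/{quote(granule_name)}"
    )
    return f"{granule_url}{suffix}?dap4.ce={quote(ce, safe='/;')}"


# Hypothetical identifiers, for illustration only.
print(build_subset_url("C0000000000-POCLOUD", "example_granule", ["analysed_sst", "mask"]))
```

Requesting netCDF-4 by default follows the recommendation above; pass `fmt="csv"` or another key of `FORMAT_SUFFIX` to ask for a different format.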
Prerequisites: to use OPeNDAP-in-the-cloud, you need to know the collection for which you want to subset data. The PO.DAAC Cloud Data Listing is a great place to find collections of interest. Below are various methods to access OPeNDAP.
Method 1: Virtual Browse User Interface Access
For users who are used to “drilling down” through a directory for OPeNDAP links, the CMR Virtual Browse Interface provides this capability. For PO.DAAC Cloud specifically, the POCLOUD Browse Granule Listing Interface lets you find data through a directory-like interface. This interface is accessible through the link to all POCLOUD collections, or is linked directly from the Cloud Dataset listing page:
From this link, or after choosing a collection from the virtual interface listing, one can step through the collection’s granules by Year > Month > Day or by Cycle and Pass depending on the collection. Once you reach the granule files, you’ll see a link to the full data file download, and, if available, the OPeNDAP link:
The OPeNDAP link will then take you to the familiar OPeNDAP user interface (i.e., the OPeNDAP “form”).
By default, this OPeNDAP form loads as a DAP4 data request form. This includes the option to transform the data into the following formats: CSV, netCDF-3, netCDF-4, DAP4 Binary, and DAP2 Binary. To return to the more traditional look of the OPeNDAP form, remove the “.dmr” string from the URL and reload the page. Once reloaded, you will see the more traditional-looking DAP2 data request form, which offers additional download options beyond the standard netCDF-3 and netCDF-4, such as ASCII, CoverageJSON, and Binary.
Method 2: Earthdata Search Access Direct Download
Earthdata Search is a map-based search and access tool for NASA Earthdata. PO.DAAC has some existing tutorials on using Earthdata Search for downloading data, and using it with OPeNDAP is a similar process. After setting a search area and/or time bounds, you can add the resulting granules to your project, either with the “Download All” button or by adding files individually. See the figure below for examples of search filters (blue circles) and adding files to download (red circles).
Once you’re ready to download your data, click the big green button at the bottom of the screen. It will tell you how many granules you will be accessing.
You’ll have access to the collection options for download on the resulting screen. Make sure you choose ‘Customize’ OPeNDAP as your output option. If it is not available, this collection is still being added to OPeNDAP.
You’ll also be able to select which variables you want to download. For now, do not select the coordinate variables for OPeNDAP subsetting, as this may prevent you from getting the _actual_ subsets you’re interested in. Just select the science variables or quality flags, and the coordinate and time variables will be included for you.
After finishing the customization options, click ‘Download Data’. The resulting screen will give you the OPeNDAP links with their subset parameters filled in. If you did not select a spatial bound or limit the variables, you’ll simply get the entire file back.
The resulting links can be downloaded using wget or curl, or fed into another program or code snippet of your choice.
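As one example of feeding the links into a code snippet, the sketch below downloads a list of OPeNDAP links using only the Python standard library. It assumes that Earthdata Login is served from `https://urs.earthdata.nasa.gov` and that answering its login challenge with a username and password, while keeping session cookies, is sufficient; depending on how your account and the service are configured, you may need a different mechanism. The function names are illustrative, not part of any Earthdata tool.

```python
import os
import shutil
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlsplit

# Assumed Earthdata Login endpoint.
URS_HOST = "https://urs.earthdata.nasa.gov"


def make_opener(username, password):
    """Build an opener that answers Earthdata Login challenges and keeps cookies."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, URS_HOST, username, password)
    return urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(manager),
        urllib.request.HTTPCookieProcessor(CookieJar()),
    )


def output_name(url):
    """Derive a local file name from the path of an OPeNDAP link (query string dropped)."""
    return os.path.basename(urlsplit(url).path)


def download_all(urls, opener, dest_dir="."):
    """Fetch each link and write the response into dest_dir; return the paths written."""
    written = []
    for url in urls:
        path = os.path.join(dest_dir, output_name(url))
        # Stream the response to disk rather than holding it in memory.
        with opener.open(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
        written.append(path)
    return written
```

A typical call would be `download_all(links, make_opener(username, password))`, where `links` holds the URLs copied from the Earthdata Search results screen. Note that links differing only in their query string map to the same local file name, so rename the outputs if you request several subsets of one granule.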
Method 3: Earthdata Search Access Variable Subset Direct Download
After using Earthdata Search to find the collection and select a granule, you can open the granule’s OPeNDAP URL directly and select variables there. In step (1), go to https://search.earthdata.nasa.gov/search and search for the collection’s short name. In step (2), select the granule and click on its ‘kebab’ menu (three vertical dots). This opens the granule’s metadata, which includes its OPeNDAP URL, outlined in step (3). Select a variable by setting its parameters (4), then click “Get Data” as netCDF-4 (5) (6). After you make the selection, a subsetted netCDF-4 file will be downloaded to your local machine.
Method 4: API Access
CMR can be searched through its API from any number of programming languages. There are PO.DAAC tutorials for using OPeNDAP in the cloud through Python (via a Jupyter notebook), but the techniques apply to numerous other languages.
The main flow is to search CMR for granules by collection, space, and time. The resulting granules will include a link to OPeNDAP like the following:
In the above link, there is the notion of a ‘collection concept id’, which is how CMR identifies a collection. To find the concept ID of your collection of interest, you can use, once again, the cloud dataset listing page:
OPeNDAP link in granule metadata:
"Type": "USE SERVICE API",
"Subtype": "OPENDAP DATA",
"Description": "OPeNDAP request URL"
The above link, if put into a browser, will take you to the OPeNDAP UI. By using this link in a piece of code, as in the Jupyter tutorial, you can request the subsetting directly and download the resulting data product or use it in your own processes.
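The search flow described above can be sketched in Python as follows. This is a minimal illustration, not the code from the PO.DAAC tutorial. It assumes the CMR granule search endpoint `https://cmr.earthdata.nasa.gov/search/granules.umm_json` with the `collection_concept_id`, `temporal`, `bounding_box`, and `page_size` parameters, and it assumes that each returned item carries its links under `umm` > `RelatedUrls` with a `URL` field alongside the `Type` and `Subtype` values shown above. The function names are illustrative.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed CMR granule search endpoint returning UMM JSON.
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.umm_json"


def granule_search_url(concept_id, temporal=None, bounding_box=None, page_size=10):
    """Build a CMR granule search URL for one collection."""
    params = {"collection_concept_id": concept_id, "page_size": page_size}
    if temporal:
        params["temporal"] = temporal  # "start,end" as ISO 8601 timestamps
    if bounding_box:
        params["bounding_box"] = bounding_box  # "west,south,east,north" in degrees
    return CMR_GRANULE_SEARCH + "?" + urlencode(params)


def search_granules(url):
    """Run the search and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def opendap_urls(response):
    """Collect the OPeNDAP links from a decoded UMM JSON granule search response."""
    urls = []
    for item in response.get("items", []):
        for related in item.get("umm", {}).get("RelatedUrls", []):
            if (
                related.get("Type") == "USE SERVICE API"
                and related.get("Subtype") == "OPENDAP DATA"
            ):
                urls.append(related["URL"])
    return urls
```

A typical call chains the three functions: `opendap_urls(search_granules(granule_search_url(concept_id, temporal=..., bounding_box=...)))`. Filtering on both `Type` and `Subtype` keeps direct download links and other related URLs out of the result.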