How to Subset/Download Dataset from OceanColor Using OPeNDAP

How to Subset/Download Dataset from OceanColor Using OPeNDAP

Postby yiboj » Wed Oct 11, 2017 1:10 pm

PODAAC hosts a few remote datasets from Ocean Biology Processing Group (NASA/GSFC/OBPG) such as MODIS Aqua and Terra L3 Chlorophyll (eg. https://podaac.jpl.nasa.gov/dataset/MODIS_Aqua_L3_CHLA_Daily_4km_V2014.0_R). OBPG provides the tools to download the data files. Here we provide the simple python script to subset and download the data files using OBPG OPeNDAP server.

The following steps show how to find the dataset from OBPG site, then execute the script to subset and download daily chlorophyll dataset for year 2016:

Step 1: Goto OBPG OPeNDAP site (https://oceandata.sci.gsfc.nasa.gov/opendap)
opendap_1.png
opendap_1.png (55.72 KiB) Viewed 16381 times


Step 2: Navigate to the dataset and directory you want (eg. https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/contents.html)
opendap_2.png
opendap_2.png (56.01 KiB) Viewed 16381 times


Step 3: Copy and execute the following script by modifying the dataset information:

Code: Select all
import sys, os
from math import floor,ceil
import numpy as np

from pydap.client import open_url
from netCDF4 import Dataset

#** Index calculation
def boundingindex(dmin,dint,length,boundary0,boundary1):
  inx0=max(int(floor((boundary0-dmin)/dint)),0)
  inx1=max(int(floor((boundary1-dmin)/dint)),0)
  if (inx0 > inx1):
    atemp = inx0
    inx0 = inx1
    inx1 = atemp
  return [inx0,inx1]

#** Input parameters

ayear = '2016'
lon_box = [-132, -94]
lat_box = [4, 30]

#** Check sample file to get parameters
ncin = open_url('https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/2017/001/A2017001.L3m_DAY_CHL_chlor_a_4km.nc')

chla = ncin['chlor_a']
lon = ncin['lon']
lat = ncin['lat']

lon_step = np.mean(np.diff(lon))
lat_step = np.mean(np.diff(lat))

[i0,i1]=boundingindex(lon[0],lon_step,len(lon),lon_box[0],lon_box[1])
[j0,j1]=boundingindex(lat[0],lat_step,len(lat),lat_box[0],lat_box[1])

opendap_dir = 'https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/'+ayear+'/'

for i in range(1, 366):
   filename = opendap_dir + "{0:0>3}".format(i)+'/'+'A'+ayear+"{0:0>3}".format(i)+'.L3m_DAY_CHL_chlor_a_4km.nc.nc'
   filenameout = 'A'+ayear+"{0:0>3}".format(i)+'.L3m_DAY_CHL_chlor_a_4km.nc'
   cmd='wget "' + filename + '?chlor_a['+str(j0)+':'+str(j1)+']['+str(i0)+':'+str(i1)+']"' + ' -O ' + filenameout
   os.system( cmd )
yiboj
 
Posts: 130
Joined: Mon Mar 30, 2015 11:22 am

Re: How to Subset/Download Dataset from OceanColor Using OPe

Postby batsc » Tue Dec 05, 2017 12:14 am

Here is how to do the same using the Python library 'Iris'. This loads only the requested data straight into your Python session, and provides easy methods for plotting maps, time series, calculating statistics, regridding, comparing with other datasets, etc. See http://scitools.org.uk/iris/docs/latest/index.html for more.

Code: Select all
import iris
from datetime import datetime, timedelta

# Grab 3 days
start_date = datetime(2010, 10, 1)
end_date = start_date + timedelta(days=2)

# Define latitude/longitude bounds
lat_min, lat_max = -20, 10
lon_min, lon_max = -50, -20

# Filename information
file_suffix = '.L3m_DAY_CHL_chlor_a_4km.nc'
opendap_dir = 'https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI'

# Loop over each day to build up a list of required URLs
urls = []
date = start_date
while date <= end_date:
    urls.append('/'.join([opendap_dir,
                          date.strftime('%Y'),
                          date.strftime('%j'),
                          'A' + date.strftime('%Y%j') + file_suffix]))
    date += timedelta(days=1)

# Define the coordinate constraints
latitudes = iris.Constraint(latitude=lambda lat: lat_min < lat < lat_max)
longitudes = iris.Constraint(longitude=lambda lon: lon_min < lon < lon_max)

# Now define the dataset. Note the data is only downloaded once
# you perform a calculation/plot with it.
cubes = iris.load(urls, latitudes & longitudes)

# cubes is an iris.CubeList of individual 2D (lat/lon) cubes. If each cube
# had a time value associated with it (in CF Conventions), cubes would be
# a 3D (time/lat/lon) cube, from which it is easy to plot time series,
# calculate statistics, regrid, compare with other datasets, etc.
# See http://scitools.org.uk/iris/docs/latest/index.html for more.
print(cubes[0])

batsc
 
Posts: 3
Joined: Sat Jun 10, 2017 3:51 am


Return to Data Recipes

cron