Page 1 of 1

How to Subset/Download Dataset from OceanColor Using OPeNDAP

PostPosted: Wed Oct 11, 2017 1:10 pm
by yiboj
PODAAC hosts a few remote datasets from Ocean Biology Processing Group (NASA/GSFC/OBPG) such as MODIS Aqua and Terra L3 Chlorophyll (eg. https://podaac.jpl.nasa.gov/dataset/MODIS_Aqua_L3_CHLA_Daily_4km_V2014.0_R). OBPG provides the tools to download the data files. Here we provide the simple python script to subset and download the data files using OBPG OPeNDAP server.

The following steps show how to find the dataset from OBPG site, then execute the script to subset and download daily chlorophyll dataset for year 2016:

Step 1: Goto OBPG OPeNDAP site (https://oceandata.sci.gsfc.nasa.gov/opendap)
opendap_1.png
opendap_1.png (55.72 KiB) Viewed 16488 times


Step 2: Navigate to the dataset and directory you want (eg. https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/contents.html)
opendap_2.png
opendap_2.png (56.01 KiB) Viewed 16488 times


Step 3: Copy and execute the following script by modifying the dataset information:

Code: Select all
import sys, os
from math import floor,ceil
import numpy as np

from pydap.client import open_url
from netCDF4 import Dataset

#** Index calculation
def boundingindex(dmin,dint,length,boundary0,boundary1):
  inx0=max(int(floor((boundary0-dmin)/dint)),0)
  inx1=max(int(floor((boundary1-dmin)/dint)),0)
  if (inx0 > inx1):
    atemp = inx0
    inx0 = inx1
    inx1 = atemp
  return [inx0,inx1]

#** Input parameters

ayear = '2016'
lon_box = [-132, -94]
lat_box = [4, 30]

#** Check sample file to get parameters
ncin = open_url('https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/2017/001/A2017001.L3m_DAY_CHL_chlor_a_4km.nc')

chla = ncin['chlor_a']
lon = ncin['lon']
lat = ncin['lat']

lon_step = np.mean(np.diff(lon))
lat_step = np.mean(np.diff(lat))

[i0,i1]=boundingindex(lon[0],lon_step,len(lon),lon_box[0],lon_box[1])
[j0,j1]=boundingindex(lat[0],lat_step,len(lat),lat_box[0],lat_box[1])

opendap_dir = 'https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/'+ayear+'/'

for i in range(1, 366):
   filename = opendap_dir + "{0:0>3}".format(i)+'/'+'A'+ayear+"{0:0>3}".format(i)+'.L3m_DAY_CHL_chlor_a_4km.nc.nc'
   filenameout = 'A'+ayear+"{0:0>3}".format(i)+'.L3m_DAY_CHL_chlor_a_4km.nc'
   cmd='wget "' + filename + '?chlor_a['+str(j0)+':'+str(j1)+']['+str(i0)+':'+str(i1)+']"' + ' -O ' + filenameout
   os.system( cmd )

Re: How to Subset/Download Dataset from OceanColor Using OPe

PostPosted: Tue Dec 05, 2017 12:14 am
by batsc
Here is how to do the same using the Python library 'Iris'. This loads only the requested data straight into your Python session, and provides easy methods for plotting maps, time series, calculating statistics, regridding, comparing with other datasets, etc. See http://scitools.org.uk/iris/docs/latest/index.html for more.

Code: Select all
import iris
from datetime import datetime, timedelta

# Grab 3 days
start_date = datetime(2010, 10, 1)
end_date = start_date + timedelta(days=2)

# Define latitude/longitude bounds
lat_min, lat_max = -20, 10
lon_min, lon_max = -50, -20

# Filename information
file_suffix = '.L3m_DAY_CHL_chlor_a_4km.nc'
opendap_dir = 'https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI'

# Loop over each day to build up a list of required URLs
urls = []
date = start_date
while date <= end_date:
    urls.append('/'.join([opendap_dir,
                          date.strftime('%Y'),
                          date.strftime('%j'),
                          'A' + date.strftime('%Y%j') + file_suffix]))
    date += timedelta(days=1)

# Define the coordinate constraints
latitudes = iris.Constraint(latitude=lambda lat: lat_min < lat < lat_max)
longitudes = iris.Constraint(longitude=lambda lon: lon_min < lon < lon_max)

# Now define the dataset. Note the data is only downloaded once
# you perform a calculation/plot with it.
cubes = iris.load(urls, latitudes & longitudes)

# cubes is an iris.CubeList of individual 2D (lat/lon) cubes. If each cube
# had a time value associated with it (in CF Conventions), cubes would be
# a 3D (time/lat/lon) cube, from which it is easy to plot time series,
# calculate statistics, regrid, compare with other datasets, etc.
# See http://scitools.org.uk/iris/docs/latest/index.html for more.
print(cubes[0])