Using Matlab and OPeNDAP to Subset Level 2 Datasets

Open-source Project for a Network Data Access Protocol

Using Matlab and OPeNDAP to Subset Level 2 Datasets

Postby yiboj » Wed Jun 22, 2016 10:10 am

Dealing with large datasets is challenging from the perspective of storage and access. Instead of downloading and storing large datasets locally it is often preferable to spatially subset them on a server side and store smaller pieces, or even use the pieces dynamically for an analysis and discard them. OPeNDAP is a popular service to extract subsets from data in netCDF and HDF.

Dr. John Wilkin (Rutgers University) has kindly provided the matlab codes here to download and subset level 2 datasets. The subroutines and main file are listed below. The AMSR2 dataset is used here as an example and please feel free to modify the code.

get_amsr2sst
Code: Select all
function [data,geo] = get_amsr2sst(dataurl,latmin,latmax,lonmin,lonmax,varname)
% Read AMSR2 data from PO-DAAC opendap
%
% See get_amsr2sst_filelist.m which returns information required to
% build a set of data urls for any given day
%
%
% Optional inputs:
%     POLYBOX:  Matrix with columns of lon,lat defining a box
%     in which to extrct the AMSR2 data (e.g. created with
%     ginput)
%     VARNAME variable to extract - default sea_surface_temperature
%
% John Wilkin
% Feb 13, 2015

if nargin < 6
  varname = 'sea_surface_temperature';
end

%lon = nc_varget(dataurl,'lon');
lon = ncread(dataurl,'lon');
lat = ncread(dataurl,'lat');
if nargin > 4
  in = find((lon >= lonmin & lon <= lonmax) & (lat >= latmin & lat <= latmax));
else
  in = ~isnan(lon);
end

if ~isempty(in)
  lon = lon(in);
  lat = lat(in);
  data = ncread(dataurl,varname);
  data = data(in);
  time = ncread(dataurl,'time')/86400+datenum(1981,1,1);
  geo.lon = lon;
  geo.lat = lat;
  geo.time = time;
else
  warning('No data found in the granule')
  data = [];
  geo = [];
end


get_amsr2sst_filelist
Code: Select all
function [opendap,flist] = get_amsr2sst_filelist(day,shortName, version)
% [opendap,flist] = get_amsr2sst_filelist(day,version)
% Get list of Level-2 pass files (FLIST) for AMSR2 microwave SST data
% and the path to the files in the PO-DAAC opendap catalog (OPENDAP)
%
% The data url for an opendap query is the string [opendap flist(k,:)]
%
% Optional input VERSION ('final' or 'realtime','nearrealtime','nrt')
%
% See also get_amsr2sst.m
%
% John Wilkin
% Feb 13, 2015

if nargin<2
  version = 'final';
end
switch version
  case 'final'
    grepstr = ' grep FTP | grep v07.2 | cut -c100-180 ';
  case {'realtime','nearrealtime','nrt'}
    grepstr = ' grep FTP | grep rt | cut -c100-177 ';
end

%
[year,~] = datevec(day);
yday = floor(day-datenum(year,1,1))+1;
startTime = datestr(floor(day),30);
endTime = datestr(1+floor(day),30);

% query the PO-DAAC granule data base for names of the L2 files
query = ['curl "http://podaac.jpl.nasa.gov/ws/search/granule',...
  '/?shortName=' shortName,...
  '&startTime=' startTime,...
  '&endTime=' endTime,...
  '&itemsPerPage=100" | ' grepstr ' | sort'];
[~,flist] = unix(query);

% parse the result of the curl query by finding the file names (which all
% start with the year string)
k = strfind(flist,int2str(year));
if isempty(k)
  disp(['No ' upper(version) ' files found for day ' startTime])
  opendap = [];
  flist = [];
  return
end

% trim off the preamble text not excluded by grep
flist = flist(k(1):end);

% reshape
N = length(k);
flist = reshape(flist,[length(flist)/N N])';

% remove the trailing linefeed
flist(:,end) = [];

% PO-DAAC acknowledge that for some reason the database gives one file at
% the beginning of the list that belongs in the preceding day -
% remove it. I'm not if this means we miss a pass in the successive days.
flist(1,:) = [];

% String giving the path in the opendap catalog to files for this
% day to be used to build the fulll data url in function get_amsr_sst.m
opendap = ['http://podaac-opendap.jpl.nasa.gov:80/opendap/allData/',...
  'ghrsst/data/GDS2/L2P/AMSR2/REMSS/v7.2/',...
  int2str(year) '/' sprintf('%03d',yday) '/'];


Main Routine
Code: Select all
clear
format compact
format long

%*** Input Parametr

shortName='AMSR2-REMSS-L2P-v7.2';

%Read/Write data****************************************************************
datanum = datenum(2014,10,1);
[opendap,flist] = get_amsr2sst_filelist(datanum, shortName, 'final');
[nfile lz]=size(flist);
for i=1:nfile
 [data,geo] = get_amsr2sst([opendap flist(i,:)], -20, 20, -60, 60);
 if (length(data) > 1)
   disp(sprintf('\n*** Writing File %s ***', flist(i,:)));
   nccreate( flist(i,:), 'sea_surface_temperature', 'Dimensions',{'r' length(data)}, 'Datatype','double');
   ncwrite ( flist(i,:), 'sea_surface_temperature', data);
   nccreate ( flist(i,:), 'time');
   ncwrite ( flist(i,:), 'time', geo.time);
   nccreate ( flist(i,:), 'lon', 'Dimensions',{'r' length(geo.lon)}, 'Datatype','double');
   ncwrite ( flist(i,:), 'lon', geo.lon);
   nccreate ( flist(i,:), 'lat', 'Dimensions',{'r' length(geo.lat)}, 'Datatype','double');
   ncwrite ( flist(i,:), 'lat', geo.lat);
 end
end
%End****************************************************************
yiboj
 
Posts: 80
Joined: Mon Mar 30, 2015 11:22 am

Return to PO.DAAC OPeNDAP

cron