More Examples of Using the NCO Toolkit for Oceans Data

Postby yiboj » Sat Sep 16, 2017 7:02 pm

The netCDF Operators (NCO) toolkit manipulates and analyzes data stored in netCDF format and can access most of the powerful mathematical and statistical algorithms of GSL (the GNU Scientific Library). It is fast, powerful, and easy to use. Here we present some examples of applying common NCO commands to manipulate oceanographic data.

*** Concatenators ncrcat and ncecat:

ncecat creates a new record dimension (named record by default) along which it glues the individual files together into a single ensemble file. Because of this extra dimension, the resulting file sometimes cannot be processed in the usual way. ncrcat, on the other hand, concatenates along the existing record dimension (here, time), so the output file is clean and easy to read.
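
To make the difference concrete in terms of array shapes, here is a small Python sketch. It is an illustration only and does not call NCO: each input variable is modelled as a shape tuple, the helper names ensemble_concat and record_concat are hypothetical, and the shape (1, 1, 89, 180) is taken from sst(time, lev, lat, lon) in the sample files.

```python
# Illustrative sketch only: variables are modelled as shape tuples,
# this does not read or write real netCDF files.

def ensemble_concat(shapes):
    """ncecat-like: stack identical shapes under a new leading 'record' axis."""
    first = shapes[0]
    if any(s != first for s in shapes):
        raise ValueError("all inputs must have the same shape")
    return (len(shapes),) + first


def record_concat(shapes):
    """ncrcat-like: join along the existing leading record axis (time)."""
    tail = shapes[0][1:]
    if any(s[1:] != tail for s in shapes):
        raise ValueError("non-record dimensions must match")
    return (sum(s[0] for s in shapes),) + tail


# Shape of sst(time, lev, lat, lon) in each of the six input files
sst_shapes = [(1, 1, 89, 180)] * 6

print(ensemble_concat(sst_shapes))  # (6, 1, 1, 89, 180)
print(record_concat(sst_shapes))    # (6, 1, 89, 180)
```

The first result has five axes (record, time, lev, lat, lon), while the second keeps the original four and simply grows time from 1 to 6.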

The examples below concatenate file_1.nc, file_2.nc, file_3.nc, file_4.nc, file_5.nc and file_6.nc. These files all have the same structure, shown here for file_1.nc:

Code: Select all
>>> ncdump -h file_1.nc
netcdf file_1 {
dimensions:
   lat = 89 ;
   lev = 1 ;
   lon = 180 ;
   time = UNLIMITED ; // (1 currently)
variables:
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;


Concatenating the files with ncecat produces the output file file_out_ncecat.nc, shown below:
Code: Select all
>>> ncecat file_1.nc file_2.nc file_3.nc file_4.nc file_5.nc file_6.nc file_out_ncecat.nc
>>> ncdump -h file_out_ncecat.nc
netcdf file_out_ncecat {
dimensions:
   record = UNLIMITED ; // (6 currently)
   lat = 89 ;
   lev = 1 ;
   lon = 180 ;
   time = 1 ;
variables:
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(record, time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;

The file_out_ncecat.nc file has a new dimension "record", which is redundant for most use cases.

ncrcat, on the other hand, concatenates the files along the time dimension without creating a record dimension, as the output file file_out_ncrcat.nc shows:
Code: Select all
>>> ncrcat file_1.nc file_2.nc file_3.nc file_4.nc file_5.nc file_6.nc file_out_ncrcat.nc
>>> ncdump -h file_out_ncrcat.nc
netcdf file_out_ncrcat {
dimensions:
   lat = 89 ;
   lev = 1 ;
   lon = 180 ;
   time = UNLIMITED ; // (6 currently)
variables:
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;


*** ncap2 (netCDF Arithmetic Processor) is one of the most powerful commands in the NCO toolkit, and it opens up endless possibilities for users to explore. Unfortunately, the ncap2 documentation is incomplete and hard to follow. In this article, we explain the confusing parts of ncap2 step by step, using real netCDF data files from PODAAC.

1. Syntax
Code: Select all
>>> ncap2 other_options -s "inline Script" -S script_filename.nco file_in.nc file_out.nc


2. Predefined Variables

All variables in the input netCDF file are predefined variables, such as lat, lon, time and sst in file_1.nc. An inline script or a .nco script can access these predefined variables directly, as shown below using file_1.nc as an example:

Code: Select all
>>> ncdump -h file_1.nc
netcdf file_1 {
dimensions:
   lat = 89 ;
   lev = 1 ;
   lon = 180 ;
   time = UNLIMITED ; // (1 currently)
variables:
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;
>>> ncap2 -O -s 'print(lat)' file_1.nc file_out.nc
lat[0]=-88
lat[1]=-86
lat[2]=-84
lat[3]=-82
lat[4]=-80
lat[5]=-78
lat[6]=-76
lat[7]=-74
lat[8]=-72
lat[9]=-70
lat[10]=-68


3. Create New Variables

A new variable can be a scalar, a vector or a matrix. A scalar can be defined directly. A vector or matrix can be created directly if it uses existing dimensions; otherwise the dimensions have to be defined first. This is shown in the following code:

Code: Select all
// Create scalar variable
aScalar = 0.0;

// Create vector using existing dimension
aVec[$lat] = 0.0;

// Create matrix with new dimensions
defdim("i1",10);
defdim("i2",20);

aMat[$i1, $i2] = 0.0;


Now let's run these statements inline and confirm the result in the output file file_1_out.nc:
Code: Select all
>>> ncap2 -O -s 'aScalar=0.0; aVec[$lat]=0.0; defdim("i1",10); defdim("i2",20); aMat[$i1, $i2]=0.0' file_1.nc file_1_out.nc
>>> ncdump -h file_1_out.nc
netcdf file_1_out {
dimensions:
   lat = 89 ;
   i1 = 10 ;
   i2 = 20 ;
   lev = 1 ;
   lon = 180 ;
   time = UNLIMITED ; // (1 currently)
variables:
   double aScalar ;
   double aVec(lat) ;
   double aMat(i1, i2) ;
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;


4. Difference between [] and () for a vector or matrix
Code: Select all
// Create matrix with new dimensions
defdim("i1",10);
defdim("i2",20);

// Define matrix variable aVar with size 10x20 and set all of its values to 0.0
aVar[$i1, $i2] = 0.0;

// Assign 10.0 to the single element of aVar at index (6, 7)
aVar(6, 7) = 10.0;
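
In the snippet above, square brackets with $-prefixed dimension names define a variable over those dimensions, while parentheses index into a variable that already exists (indices start at 0). As a rough analogy only, the same two steps might look like this in plain Python; the names N_I1, N_I2 and a_var are hypothetical and this is not ncap2 syntax.

```python
# Python analogy only; ncap2 syntax differs.
N_I1, N_I2 = 10, 20  # plays the role of defdim("i1",10); defdim("i2",20);

# Plays the role of aVar[$i1, $i2] = 0.0;  (define the variable and fill it)
a_var = [[0.0] * N_I2 for _ in range(N_I1)]

# Plays the role of aVar(6, 7) = 10.0;     (assign into one existing element)
a_var[6][7] = 10.0

print(len(a_var), len(a_var[0]))  # 10 20
print(a_var[6][7], a_var[0][0])   # 10.0 0.0
```

Only one element changes in the second step; every other element keeps the value given when the variable was defined.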


5. inline vs. .nco script

An inline script suits simple code or logic, while a .nco script file is the better choice for more involved data processing. We can put the previous code into the script file mytest.nco, run it as follows, and write the output to file_1_script_out.nc:
Code: Select all
>>> vi mytest.nco
// Create matrix with new dimensions
defdim("i1",10);
defdim("i2",20);

// Define matrix variable aVar with size 10x20 and set all of its values to 0.0
aVar[$i1, $i2] = 0.0;

// Assign 10.0 to the single element of aVar at index (6, 7)
aVar(6, 7) = 10.0;
>>> ncap2 -O -S mytest.nco file_1.nc file_1_script_out.nc
>>> ncdump -h file_1_script_out.nc
netcdf file_1_script_out {
dimensions:
   i1 = 10 ;
   i2 = 20 ;
   lat = 89 ;
   lev = 1 ;
   lon = 180 ;
   time = UNLIMITED ; // (1 currently)
variables:
   double aVar(i1, i2) ;
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev(lev) ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lev, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;


6. Access GSL Functions

ncap2 can access most GSL special functions, as well as GSL fitting and statistics routines such as the ones used here. We show how to run a linear regression over time at each grid point of the input netCDF file sst_in.nc. The .nco script file sst_fit.nco is listed below:
Code: Select all
// Linear Regression

// Declare variables
c0[$lat, $lon]=0.;        // Intercept
c1[$lat, $lon]=0.;        // Slope
sdv[$lat, $lon]=0.;       // Standard deviation
covxy[$lat, $lon]=0.;     // Covariance

for (i=0;i<$lat.size;i++)   // Loop over lat
{
  for (j=0;j<$lon.size;j++)   // Loop over lon
  {
        gsl_fit_linear(time,1,sst(:, i, j),1, $time.size, &tc0, &tc1, &cov00, &cov01,&cov11,&sumsq); // Linear regression function
        c0(i,j) = tc0;    // Output results
        c1(i,j) = tc1;    // Output results
        covxy(i,j) = gsl_stats_covariance(time,1,$time.size,double(sst(:,i,j)),1,$time.size); // Covariance function
        sdv(i,j) = gsl_stats_sd(sst(:,i,j), 1, $time.size);   // Standard deviation function
  }
}
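
For readers who want to check what the script computes at a single grid point, here is a minimal pure-Python sketch of an ordinary least-squares fit, a sample covariance and a sample standard deviation. It does not call GSL; the function names are hypothetical, the time and sst values are made up for illustration, and the n - 1 normalisation is an assumption about the convention the GSL statistics routines use.

```python
import math


def fit_linear(x, y):
    """Ordinary least squares for y = c0 + c1 * x; returns (c0, c1)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    c1 = sxy / sxx             # slope
    c0 = mean_y - c1 * mean_x  # intercept
    return c0, c1


def sample_covariance(x, y):
    """Sample covariance, normalised by n - 1 (assumed convention)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


def sample_sd(y):
    """Sample standard deviation, normalised by n - 1 (assumed convention)."""
    return math.sqrt(sample_covariance(y, y))


# Made-up series for one grid point: six time steps, SST rising 0.5 per step
time = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
sst = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]

c0, c1 = fit_linear(time, sst)
print(c0, c1)                        # 20.0 0.5
print(sample_covariance(time, sst))  # 1.75
print(sample_sd(sst))
```

The script above repeats this kind of calculation for every (lat, lon) pair and stores the results in c0, c1, covxy and sdv.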


Here is the command and output file sst_out.nc from the linear fit:
Code: Select all
>>> ncap2 -O -S sst_fit.nco sst_in.nc sst_out.nc
>>> ncdump -h sst_out.nc
netcdf sst_out {
dimensions:
   lat = 89 ;
   lon = 180 ;
   time = UNLIMITED ; // (6 currently)
variables:
   double c0(lat, lon) ;
   double c1(lat, lon) ;
   double covxy(lat, lon) ;
   double sdv(lat, lon) ;
   int i ;
   int j ;
   double tc0 ;
   double tc1 ;
   double cov00 ;
   double cov01 ;
   double cov11 ;
   double sumsq ;
   double lat(lat) ;
      lat:units = "degrees_north" ;
      lat:long_name = "Latitude" ;
      lat:standard_name = "latitude" ;
      lat:axis = "Y" ;
      lat:bounds = "lat_bnds" ;
      lat:grids = "Uniform grid from -88 to 88 by 2" ;
   double lev ;
      lev:units = "meters" ;
      lev:long_name = "Depth of sea surface temperature measurements" ;
      lev:standard_name = "depth" ;
      lev:axis = "Z" ;
      lev:positive = "down" ;
      lev:_CoordinateAxisType = "Height" ;
      lev:comment = "Measurement depth of in situ sea surface temperature varies" ;
      lev:cell_methods = "lev: mean" ;
   double lon(lon) ;
      lon:units = "degrees_east" ;
      lon:long_name = "Longitude" ;
      lon:standard_name = "longitude" ;
      lon:axis = "X" ;
      lon:bounds = "lon_bnds" ;
      lon:grids = "Uniform grid from 0 to 358 by 2" ;
   float sst(time, lat, lon) ;
      sst:_FillValue = -999.f ;
      sst:long_name = "Extended reconstructed sea surface temperature" ;
      sst:standard_name = "SST" ;
      sst:units = "degree_C" ;
      sst:valid_min = -3.f ;
      sst:valid_max = 45.f ;
      sst:cell_methods = "lev: mean" ;
   double time(time) ;
      time:long_name = "Center time of the day" ;
      time:standard_name = "time" ;
      time:axis = "T" ;
      time:delta_t = "0000-01-00" ;
      time:avg_period = "0000-01-00" ;
      time:units = "minutes since 2010-01-01 00:00" ;


And here is the contour plot of slope variable c1 in the sst_out.nc file:
[Figure: c1_sst_out.png, Contour Plot of Slope c1]


7. Output Variables

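By default, all variables created inline or in the .nco script file are automatically written to the output netCDF file. This is why the loop counters i and j and the temporaries tc0, tc1, cov00, cov01, cov11 and sumsq appear in sst_out.nc above.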

Re: More Examples of Using the NCO Toolkit for Oceans Data

Postby fionasupernova » Sun Mar 01, 2020 7:11 am

Hello,

I am trying to subset multiple global datasets to the Philippine domain, and I couldn't even get my day1 code right. I read somewhere that this is the code:

Code: Select all
ncea -d lat,-5.0,30.0 -d lon,95.0,140.0 152.nc ph_152.nc


But it did not work. I used the code before that worked on my atmospheric data previously,

Code: Select all
ncks -d lat,-5.0,30.0 -d lon,95.0,140.0 152.nc -O ph_152.nc


but it still won't work. What could possibly be the problem? Here is the error code:

ncea: ERROR nco_lmt_evl_dmn_crd() unable to read user-specified coordinate lat. Ensure this coordinate variable is in file and is a 1-D array.
nco_err_exit(): ERROR Short NCO-generated message (usually name of function that triggered error): nc_get_vara_double()
nco_err_exit(): ERROR Error code is -101. Translation into English with nc_strerror(-101) is "NetCDF: HDF error"
nco_err_exit(): ERROR NCO will now exit with system call exit(EXIT_FAILURE)



Thanks in advance!


Sincerely,

Sam

Re: More Examples of Using the NCO Toolkit for Oceans Data

Postby yiboj » Mon Mar 16, 2020 11:06 am

Hi Sam,
Thanks for your inquiry.
It looks like your input data file 152.nc either does not have a variable lat, or lat is not a 1-D array. Please upload your data file here so that we can check the issue.
Regards,

-PODAAC DE

Re: More Examples of Using the NCO Toolkit for Oceans Data

Postby zackchang » Sun May 31, 2020 8:49 am

Dear yiboj,

I have tried running the sst_fit.nco using ncap2 by altering the dimensions to my netcdf file.
However, I received an error: "ncap2: ERROR ncap_lmt_evl(): Lower limit 160 for dim Longitude is outside range 0-159".
Would you help assist me on this?
Thank you.

Regards,
Zack

Re: More Examples of Using the NCO Toolkit for Oceans Data

Postby yiboj » Fri Jun 05, 2020 1:35 pm

Hi Zack,

Thanks for your inquiry and support. Could you please attach your nc file here so that we can check the problem with the script?
Best Regards,

PODAAC DE

Re: More Examples of Using the NCO Toolkit for Oceans Data

Postby zackchang » Fri Jun 05, 2020 11:03 pm

Dear yiboj,

Thank you for your assistance. The link is as follows: 'https://liveuclac-my.sharepoint.com/:f:/g/personal/uceszc8_ucl_ac_uk/EkW_LdyxY-RHgbC-QThqCEsBZfdPsamXzDaD_4DJfLtAdQ?e=1YoTlj'
I have tried to resample the data (via Python) into monthly sums in the hope of getting the monthly trend, and the annual trend if everything else works. However, I encountered a new error from ncap2: ERROR gsl_stats_covariance(): The data1 type and data2 type must be the same. In your argument, data1 is type NC_INT64 and data2 is type NC_INT. Hence, I manually converted the variables to NC_INT, but the nco script still doesn't work.
Could you please assist me to get annual trend as well??
Appreciated much. Thank you.

Regards,
Zack

