Sea Surface Temperature Analysis Using NCO Utilities

Postby yiboj » Wed Oct 25, 2017 10:11 am

The NCO utilities are versatile tools for simple and quick manipulation of netCDF data files, including numerical analyses using the mathematical and statistical algorithms of the GSL (GNU Scientific Library) package. Users can consult the online NCO manual (http://nco.sourceforge.net/) for details. Other NCO scripting tricks can be found in another post on this forum, "More Examples of Using the NCO Toolkit for Oceans Data".
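
As a quick flavor of such one-line manipulations, here is a minimal sketch (the file names are hypothetical):

Code: Select all
# extract only the 'sst' variable from a netCDF file
ncks -v sst input.nc sst_only.nc

# average several files along the record dimension into a single mean
ncra 198601.nc 198602.nc 198603.nc mean.nc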

Inspired by a recent climate change report, the Climate Science Special Report of the Fourth National Climate Assessment, we have developed a data recipe for analyzing global Sea Surface Temperature (SST) differences and long-term trends using the NCO utilities. This recipe demonstrates the power and elegance of the toolkit for processing satellite data. NCO always produces netCDF output files, which can be plotted with a popular program such as Panoply.

In the first part of this recipe we calculate a linear trend of the global SST anomaly between 1986 and 2015 using a long-term SST record, the Smith and Reynolds NCDC Level 4 Historical Reconstructed SST Monthly Version 3b netCDF dataset (ERSST). The script optionally allows the user to choose a satellite-only SST dataset instead, the GHRSST Level 4 AVHRR_OI Global Blended Sea Surface Temperature Analysis (GDS version 2) from NCEI (AVHRR_OI). Both datasets are hosted by PO.DAAC and ready for download. A linear fit is made on a pixel-by-pixel basis to the annual anomaly SST of ERSST from 1986 to 2015. ERSST is also used to calculate the climatology, with a base period of 1901 to 1960.

Inside the first script the following steps occur:

Step 0: PO.DAAC web services retrieve the ERSST filenames necessary for the time periods of interest
Step 1: NCO creates a monthly climatology
Step 2: NCO computes monthly averages in individual years from 1986 to 2015 and then calculates the monthly anomaly using the climatology from step 1
Step 3: NCO performs the linear regression across the anomaly time series on a pixel-by-pixel basis
Step 4: NCO finds (and appends) the regression end points in time to determine long-term SST changes

The ersst_climate_linear_trend.csh script is shown below:

Code: Select all
#! /bin/tcsh

# ersst_climate_linear_trend.csh
# 20 Sept 2017
# Ed Armstrong NASA/JPL/CalTech

# Calculate the difference in global anomaly temperature using NCO between 1986 and 2015 based on
# different input SST datasets.  A linear fit is made on a pixel-by-pixel basis for
# the annual anomaly SST from 1986 to 2015. ERSST is used to calculate the climatology
# between 1901 and 1960

# SST sources can be either NOAA ERSST or NOAA AVHRR_OI

# Dependencies
# shell script derange
# NCO utilities
# NCO script file sst_fit.nco

# Inspired by the oceans part of global map found on page 13 of this report:
# https://www.nytimes.com/interactive/2017/08/07/climate/document-Draft-of-the-Climate-Science-Special-Report.html?mcubz=3


set path = ($path /opt/local/bin)

# ~~~~~~~~~~~ set input variables ~~~~~~~~~~~~~~~~~~~~~
# set the bounds of the climatology
set min_year = '1901'
set max_year = '1960'

# create lists of years and months to loop over
set years=`derange $min_year-$max_year`
set months='01 02 03 04 05 06 07 08 09 10 11 12'

# set a PO.DAAC product ID. This PO.DAAC dataset
# will be used to determine the temperature differences
# between $min_year and $max_year
# ERSST PO.DAAC id:    PODAAC-REY3B-ERMON, the Extended Reconstruction SST dataset
# [ see: https://podaac.jpl.nasa.gov/dataset/REYNOLDS_NCDC_L4_SST_HIST_RECON_MONTHLY_V3B_NETCDF ]
# AVHRR_OI PO.DAAC id: PODAAC-GHAAO-4BC02, the GHRSST L4 AVHRR_OI dataset
# [ see: https://podaac.jpl.nasa.gov/dataset/AVHRR_OI-NCEI-L4-GLOB-v2.0 ]

set dataset = 'AVHRR_OI'
#set dataset = 'ERSST'

if ( $dataset == 'AVHRR_OI' ) then
     set dataset_id = 'PODAAC-GHAAO-4BC02'
else if ( $dataset == 'ERSST' ) then
     set dataset_id    = 'PODAAC-REY3B-ERMON'
endif

echo " creating anomalies for dataset: $dataset"

# Use PO.DAAC web services to extract ERSST OPeNDAP URL endpoints (need two calls because of return size limits)
# curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=PODAAC-REY3B-ERMON&startTime=19010101T12:00:00Z&endTime=19300101T12:59:50Z&itemsPerPage=100000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc' | awk -F '"' '{print $2}' | sed 's/\.html//' > ERSST.list
# curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=PODAAC-REY3B-ERMON&startTime=19310101T12:00:00Z&endTime=19610101T12:59:50Z&itemsPerPage=100000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc'  | awk -F '"' '{print $2}' | sed 's/\.html//'  >>  ERSST.list

# if cleanup of interim files desired set this variable
set cleanup


# ~~~~~~~~~~~ done ~~~~~~~~~~~~~~~~~~~~~
 

# Main Program....

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Step 1) Make the (monthly) climatology...
# create a list of OPeNDAP URLs for each month and run NCO ncea on those lists
# to calculate monthly climatological averages
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
echo "Creating monthly lists of ERSST files....."
foreach month ( $months )
   rm -f $month.list
   foreach year ( $years )
       grep "ersst\.$year$month\.nc" ERSST.list >> $month.list
   end

   # call the NCO nces utility to average the monthly input files for variable 'sst' only
   # (alternatively use ncclimo)
   echo ""
   echo " .... averaging ERSST files between years $min_year and $max_year for month: $month"
   nces -O -v sst `cat $month.list` ERSST.$min_year-$max_year.$month.nc
end


# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Step 2)  Compute monthly averages in individual years of 1986 to 2015 and calculate an anomaly
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
set years = `derange 1986-2015`
foreach year ( $years )
    set month_end = ( 31 28 31 30 31 30 31 31 30 31 30 31 )   # note: Feb 29 in leap years is not covered by the search window
    foreach month ( $months )

    set start_monthday = "-$month-01"
    set end_monthday = "-$month-$month_end[1]"
    shift month_end

    # check the output of the PO.DAAC web services call to create OPeNDAP URLs for input granules
    # curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=${dataset_id}&startTime=${year}${start_monthday}T12:00:00Z&endTime=${year}${end_monthday}T23:50:50Z&itemsPerPage=10000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc' | awk -F '"' '{print $2}' | sed 's/\.html//'

    # NCO ncea averaging command
    echo ""
    echo " .... averaging $dataset files for year/month: $year/$month"
    time nces -O -v '.*sst$' `curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=${dataset_id}&startTime=${year}${start_monthday}T12:00:00Z&endTime=${year}${end_monthday}T23:50:50Z&itemsPerPage=10000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc' | awk -F '"' '{print $2}' | sed 's/\.html//' ` $year$month.$dataset.nc

    if ( $dataset == 'AVHRR_OI' ) then
        # regrid to 2x2 ERSST grid
        echo " Regrid file"
        ncremap -a stod -i $year$month.$dataset.nc -d ERSST.1901-1960.10.nc  -o $year$month.$dataset.regrid.nc

        # rename 'analysed_sst' variable to 'sst'
        ncrename -v analysed_sst,sst $year$month.$dataset.regrid.nc

        # convert variable 'sst' from Kelvin to Celsius
        ncap2 -s 'sst=sst - 273.15' $year$month.$dataset.regrid.nc $year$month.$dataset.degC.nc

        if( $?cleanup ) then
            rm -f $year$month.$dataset.regrid.nc 
        endif

        mv -f $year$month.$dataset.degC.nc $year$month.$dataset.nc

    endif

    # create the monthly SST anomaly
    echo " .....creating monthly SST anomaly file"
    echo ""
    ncbo -O --op_typ=subtract $year$month.$dataset.nc ERSST.1901-1960.$month.nc $year$month.$dataset.anomaly.nc

    if( $?cleanup ) then
        rm $year$month.$dataset.nc
    endif

    end

   # average monthly SST anomalies into yearly
   nces -O `ls  $year*$dataset.anomaly.nc` $year.$dataset.anomaly_avg.nc

   # make 'time' the record (UNLIMITED) dimension so that ncrcat in Step 3 can
   # concatenate along it
   set str1 = 'time = 1 ;'
   set str2 = 'time = UNLIMITED ; // (1 currently)'
   ncdump $year.$dataset.anomaly_avg.nc | sed -e "s#^.$str1# $str2#" | ncgen -o $year.$dataset.anomaly_avg.time_unlimited.nc
   mv -f $year.$dataset.anomaly_avg.time_unlimited.nc  $year.$dataset.anomaly_avg.nc
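   # alternative (an untested sketch, assuming a recent NCO build): ncks can convert
   # the record dimension directly:
   # ncks -O --mk_rec_dmn time $year.$dataset.anomaly_avg.nc $year.$dataset.anomaly_avg.nc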

   # remove dimension lev
   ncwa -a lev  -O  $year.$dataset.anomaly_avg.nc $year.$dataset.anomaly_avg.no_lev.nc
   mv -f $year.$dataset.anomaly_avg.no_lev.nc $year.$dataset.anomaly_avg.nc

   if( $?cleanup ) then
        rm `ls  $year*$dataset.anomaly.nc` 
   endif

end

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Step 3)  Perform the linear regression across anomaly time series  on a pixel-by-pixel basis
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# First perform record concatenation along the time dimension in preparation for the linear regression
ncrcat  -O `ls *$dataset.anomaly_avg.nc`  $dataset.anomaly_avg_ensemble.nc

# the AVHRR_OI ensemble output needs the 'time' variable converted to a double;
# do it for all datasets for consistency
ncap2 -s 'time=double(time)' $dataset.anomaly_avg_ensemble.nc $dataset.anomaly_avg_ensemble.time.nc
mv -f  $dataset.anomaly_avg_ensemble.time.nc $dataset.anomaly_avg_ensemble.nc


echo ""
echo "  .....do linear regression "
# ....do the linear regression by calling the NCO script 'sst_fit.nco'
ncap2 -O -S sst_fit.nco $dataset.anomaly_avg_ensemble.nc $dataset.anomaly_regression.nc

# add slope/intercept metadata (_FillValue and long_name) to the output regression file
ncatted -O -a _FillValue,c0,c,d,-999  -a long_name,c0,c,c,'intercept' -a _FillValue,c1,c,d,-999 -a long_name,c1,c,c,'slope'  $dataset.anomaly_regression.nc  $dataset.anomaly_regression.more_metadata.nc
mv -f  $dataset.anomaly_regression.more_metadata.nc $dataset.anomaly_regression.nc

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Step 4)  Find (and append) the regression end points to determine the long-term SST change (variable sst_diff)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
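# time(0) and time(29) are the first (1986) and last (2015) of the 30 annual time steps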
ncap2 -A -v -s 'sst_diff = (time(29)*c1 + c0) - (time(0)*c1 + c0)' $dataset.anomaly_regression.nc

# append the long-term SST change in degF; a temperature difference converts with the 1.8 scale factor alone (no +32 offset)
ncap2 -A -s 'sst_diff_degF=sst_diff*1.8' $dataset.anomaly_regression.nc

# convert the slope from degC/minute to degC/decade
# 5256000 minutes per decade ( 60 * 24 * 365 * 10 )
ncap2 -A -s 'c1 = c1 * 5256000' $dataset.anomaly_regression.nc

# now plot global end point differences ...........or use application Panoply to plot!

echo "Done"


The NCO script file sst_fit.nco, which calls the GSL fitting routines, is shown below:
Code: Select all
// Linear Regression

// Declare variables
c0[$lat, $lon]=0.;        // Intercept
c1[$lat, $lon]=0.;        // Slope
sdv[$lat, $lon]=0.;       // Standard deviation
covxy[$lat, $lon]=0.;     // Covariance

for (i=0;i<$lat.size;i++)   // Loop over lat
{
  for (j=0;j<$lon.size;j++)   // Loop over lon
  {
        gsl_fit_linear(time,1,sst(:, i, j),1, $time.size, &tc0, &tc1, &cov00, &cov01,&cov11,&sumsq); // Linear regression function
        c0(i,j) = tc0;    // Output results
        c1(i,j) = tc1;    // Output results
        covxy(i,j) = gsl_stats_covariance(time,1,$time.size,double(sst(:,i,j)),1,$time.size); // Covariance function
        sdv(i,j) = gsl_stats_sd(sst(:,i,j), 1, $time.size);   // Standard deviation function
  }
}
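
Note that only the intercept (c0) and slope (c1) are used downstream in Step 4 of the main script; the standard deviation (sdv) and covariance (covxy) fields are computed as additional diagnostics.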


Here is the SST trend contour plot from the output:
[Figure: ersst_trend.png (SST trend)]


In the second part of this recipe we reproduce Figures 1 and 1.3 from the climate change report, again using the ERSST dataset. In this case we show the global difference in SST between the average of a recent period (1986 to 2015) and the average of an earlier period (1901 to 1960). The ERSST dataset we use is nearly identical to the data used in the report, but we show only the ocean component of the change.

Code: Select all
# 20 Sept 2017
# Ed Armstrong NASA/JPL/CalTech

# Calculate the difference in global sea surface temperature using NCO between two periods:
# 1986 to 2015 vs. 1901 to 1960.
# ERSST is used as the data source

# Dependencies:
# NCO utilities

# Reproduces the oceans part of the global map found on pages 13 and 69 of this report:
# https://www.nytimes.com/interactive/2017/08/07/climate/document-Draft-of-the-Climate-Science-Special-Report.html?mcubz=3


set path = ($path /opt/local/bin)

# ~~~~~~~~~~~ set input variables for period 1  ~~~~~~~~~~~~~~~~~~~~~
# set the bounds of the first period
set min_year = '1901'
set max_year = '1960'

# set a PO.DAAC product ID. This PO.DAAC dataset
# will be used to determine the temperature differences
# between $min_year and $max_year
# ERSST PO.DAAC id:    PODAAC-REY3B-ERMON, the Extended Reconstruction SST dataset
# [ see: https://podaac.jpl.nasa.gov/dataset/REYNOLDS_NCDC_L4_SST_HIST_RECON_MONTHLY_V3B_NETCDF ]
# AVHRR_OI PO.DAAC id: PODAAC-GHAAO-4BC02, the GHRSST L4 AVHRR_OI dataset
# [ see: https://podaac.jpl.nasa.gov/dataset/AVHRR_OI-NCEI-L4-GLOB-v2.0 ]

set dataset = 'ERSST'

if ( $dataset == 'AVHRR_OI' ) then
     set dataset_id = 'PODAAC-GHAAO-4BC02'
else if ( $dataset == 'ERSST' ) then
     set dataset_id    = 'PODAAC-REY3B-ERMON'
endif

echo " creating SST average for period 1  for dataset: $dataset"

# Use PO.DAAC web services to extract ERSST OPeNDAP URL endpoints (need two calls because of return size limits)
 curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=PODAAC-REY3B-ERMON&startTime=19010101T12:00:00Z&endTime=19300101T12:59:50Z&itemsPerPage=100000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc' | awk -F '"' '{print $2}' | sed 's/\.html//' > ERSST.$min_year-$max_year.list
 curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=PODAAC-REY3B-ERMON&startTime=19310101T12:00:00Z&endTime=19601231T12:59:50Z&itemsPerPage=100000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc'  | awk -F '"' '{print $2}' | sed 's/\.html//'  >> ERSST.$min_year-$max_year.list


# Calculate longterm average of period 1
echo " .... averaging ERSST files between years $min_year and $max_year "
nces -O -v sst  `cat  ERSST.$min_year-$max_year.list | sort`  ERSST.$min_year-$max_year.average.nc
cp -f ERSST.$min_year-$max_year.average.nc period_1_average.nc


# ~~~~~~~~~~~ set input variables for period 2  ~~~~~~~~~~~~~~~~~~~~~
# set the bounds of the second period
set min_year = '1986'
set max_year = '2015'

echo " creating SST average for period 2  for dataset: $dataset"

# Use PO.DAAC web services to extract ERSST OPeNDAP URL endpoints (need two calls because of return size limits)
 curl "https://podaac.jpl.nasa.gov/ws/search/granule/?datasetId=PODAAC-REY3B-ERMON&startTime=19860101T12:00:00Z&endTime=20151231T12:59:50Z&itemsPerPage=100000" | grep 'https://podaac-opendap.jpl.nasa.gov/.*\.nc' | awk -F '"' '{print $2}' | sed 's/\.html//' > ERSST.$min_year-$max_year.list

# Calculate longterm average of period 2
echo " .... averaging ERSST files between years $min_year and $max_year "
nces -O -v sst  `cat  ERSST.$min_year-$max_year.list`  ERSST.$min_year-$max_year.average.nc
cp -f ERSST.$min_year-$max_year.average.nc period_2_average.nc


# compute the difference between period 1 and period 2   
echo " ....calculate difference between two periods"
echo ""
ncbo -O --op_typ=subtract period_2_average.nc  period_1_average.nc ERSST.diff.nc

# append the SST change in degF; a temperature difference converts with the 1.8 scale factor alone (no +32 offset)
ncap2 -A -s 'sst_diff_degF=sst*1.8' ERSST.diff.nc

echo "Done"


Here is the SST difference plot between the average of year 1986 to 2015 and the average of year 1901 to 1960:
[Figure: ersst_diff.png (SST difference in Fahrenheit)]

Re: Sea Surface Temperature Analysis Using NCO Utilities

Postby edward.m.armstrong » Mon May 06, 2019 3:45 pm

The shell script 'derange' necessary to run the main program is shown below:

Code: Select all
#!/bin/sh

# File: derange
# taken from Power Tools CD. Uses awk to create a sequential range of
# numbers that increment by +/- 1

case $# in
0) echo "usage: `basename $0` range [ranges ...]" 1>&2; exit 1 ;;
esac

# make awk input field separator a dash and the input record separator a space;
# unfortunately, changing the output record separator to space has problems

#echo "$@" | nawk '
echo "$@" | awk '
BEGIN { FS = "-"; RS = " " }
{
# if printing first range, no leading blank. otherwise, do:
if ( notfirst != "y" ) {
printf "%d", $1
notfirst="y"
}
else
printf " %d", $1

# print numbers with leading blank but no newlines:
if ( $1 < $2 ) { # increasing range
for ( i=$1+1; i <= $2; i++ )
printf " %d", i
}
else if ( $1 > $2 ) { # decreasing range
for ( i=$1-1; i >= $2; i-- )
printf " %d", i
}
}
END { printf "\n" }' # print the newline
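
For example, "derange 1901-1905" prints "1901 1902 1903 1904 1905" on one line, and a decreasing range such as "derange 5-1" counts down. On systems with GNU coreutils installed, seq -s ' ' 1901 1905 produces similar output and may serve as a substitute for increasing ranges.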

