3. TOOLKIT DEMONSTRATION

Multi-Channel Singular-Spectrum/Principal Component Analysis

As an example, we apply M-SSA to the near-global data set of monthly sea-surface temperatures (SSTs) anomalies data set for 1950-2004, from 30$^{\circ}$S to 60$^{\circ}$N, on a 10$^{\circ}$-latitude by 10$^{\circ}$-longitude grid ( (International Research Institute for Climate and Society (IRI), see sst.tkt project in Examples/Oceanography folder)

This translates into a dataset with 360 channels (36 longitude x10 latitude), 648 months long. The channels with NaN values in all rows indicate land mask. To read the sst data we use the Matrix function from the Tools menu on the main panel and ASCII format, which places the data into a matrix with a default name "sst" of 648 rows (time) and 360 columns (space). To setup a square grid for display of the space coordinate, we set col1 and col2 values for sst matrix as 36 and 10, respectively. Then, all relevant MSSA (PCA) results for this matrix will be obtained and plotted using such spatical grid.

Selecting the `MSSA/PCA' option from the Tools menu on the main panel launches the following window:

PCA

Having selected the data to be analyzed (here `sst') and the sampling interval, user can choose analysis type, i.e. PCA, MSSA or PCA->MSSA, which is MSSA done on data compressed by PCA.

Main options for MSSA are temporal Window Length, type of Significance Test, Covariance method, and number of spatial EOFs channels (for PCA->MSSA). The number of MSSA Components specifies how many components will be retained for Reconstruction/Prediction. Results of PCA/MSSA are stored in matrix with a name specified in Spectrum field. In addition, T-EOFs, ST-EOFs and T-PCs (see below) are stored in matrices with names obtained by prefixing "eof_", "steof_" and "pc_" to a Spectrum name, and can be accessed in Data I/O tool. If results from several analyses have been stored in different matrices, the parameters used in a particular computation will be restored in GUI by simply selecting correspondent matrix from a Spectrum pop-up list. Get Default Values button is provided as a guide.

PCA

We set the analysis switch to PCA, No. of EOFs to 10 and hit Compute to compress data with standard PCA (Preisendorfer-1988) to retain the 10 leading spatial EOFs.

Principal Component Analysis

The PCA spectrum is stored in matrix specified in Spectrum field, and can be plotted with Plot button:

PCA spectrum The leading spatial EOFs describe 55.2% of the variance, as we can see from the Log:


This favors the association of larger decorrelation times with larger spatial scales, as expected for climatic (Fraedrich and Boettger-1978) and other geophysical fields, and the channels are uncorrelated at zero lag. After PCA the principal components (PCs) are stored in a matrix with a name obtained by prefixing "spc_" to a Data name; correspondent spatial EOFs are stored in a matrix a name obtained by prefixing "seof_" to a Data name. Both matrices can be accessed in Data I/O tool.

In Advanced options of MSSA/PCA panel, user can plot selected spatial EOFs and temporal PCs in MSSA/PCA components table, as well as reconstruct their contribution ((available in licensed copy only, see below) in the dataset by using Reconstruction/Prediction option .

We will select PCA->MSSA to perform MSSA on the number of leading PCs (specified in EOFs filed) retained after PCA .

Window length

For comparison with previous ENSO studies and to demonstrate the usage of N'-windows vs. M-windows [see Eq. (11) and Fig.1, for the definition of ``complementary windows''], we set M=480, which yields an N'-window of 5 years, and M=270, which yields the M- and the N'-windows equal both to N/2.

Note!: The value in the Window Length is taken to equal to N' if `Reduced' option is chosen for Covariance, and M for other Covariance choices.

Covariance Estimation

The method for estimating the lag-covariance matrix that is decomposed (diagonalized) by M-SSA is chosen by selecting either `Reduced' , `VG' (Vautard-Ghil) , or `BK ' ( Broomhead &King) from the `Covariance' menu on the main M-SSA control panel (Fig. 11).
We choose Reduced for the Covariance estimator, since in all cases ML > N' so that it is more efficient to diagonalize the reduced ( $N'\times N'$) covariance matrix with elements given by Eq. (11), rather than the ( $ML \times ML$) matrix whose elements are given by Eq. (6) (` Broomhead &King), or Eq. (1) `Vautard-Ghil'.

We set the Window to 60 and 270 (N' for such option, see above).

Significance test

There are three choices for Significance test

described below. If None is selected, the eigenspectrum is displayed in order of eigenvalue rank. The leading oscillatory pair over the entire domain has a quasi-quadrennial period for all values of M. The quasi-quadrennial pair (modes 2 and 3 in this case ), accounts for less then 30% of ariance for N'=60, as we can see by examining the Log File shown above. The smaller one of M and N' determines the approximate spectral resolution 1/N' or 1/M. Choosing M=N'=270 yields the maximum spectral resolution.

The dominant frequencies of MSSA modes are computed only in Monte-Carlo or apporximate, but much faster Chi-squared MSSA test. Choosing Chi-squared in the Significance tests menu, and M=N'=270 we obtain the following plot:

Here the red-noise surrogate projections are plotted against the dominant frequencies associated with each MSSA mode. The frequencies and variances captured by each mode are displayed in a components table of Advanced options window:
Using M=N'=270 captures less variance - 12.1% - for the significant oscillatory quasi-quadriennial pair formed in this case by modes 3 and 4, as seen in the table above.

Monte-Carlo Test Options

The quasi-biennial pair (modes 14 and 15) does not pass the significance test when data eigenmodes are used as a basis for projections. However since the Monte-Carlo test is biased if we project onto the data eigenmodes, we project onto the eigenmodes provided by the covariance matrix of the AR(1) noise. So, we select AR(1) instead of Data option in Test Options pull-down menu of MSSA-Options window:

to obtain the following result:

Here, the projections are plotted against the dominant frequencies associated with each noise eigenvector, and we have zoomed in on the 0-0.05 cy/month frequency interval of interest using the xmgr plotting tool. Since the latter are near-sinusoidal in this case, the resulting spectrum is closely related to a traditional Fourier power spectrum. Both the quasi-quadrennial(~0.023 cycle/month) and the quasi-biennial(~0.038 cycle/month) modes pass the test at the 95% level (as specified in Test Options). They are well separated in frequency by about 1/(20 months), which far exceeds the spectral resolution of 1/M = 1/N' = $1/(270\;\;{\rm months}) \approx 1/(22\;\; {\rm years})$. The two modes are thus significantly distinct from each other spectrally, in agreement with the univariate SOI results using MEM and MTM, respectively.

We leave to a user to verify the above results with the Monte-Carlo test with 100 surrogates for M=N'=270, the maximum effective resolution. Computation make take a while, so be patient!

Plot Options

We can check that ST-EOFs of oscillatory MSSA pairs are indeed in phase quadrature in a particular channel, by selecting components from MSSA components table, and setting a channel number in a Result box at the bottom of Advanced MSSA options and hitting a Plot EOFs button. The following plot is for a quasi-quadriennial pair 3 and 4 and channel 1:

We leave as an exersise to check that a phase quadrature for a quasi-biennial pair (modes 14 and 15) is a mostly prominent for channel 2.

Reconstruction and Prediction

(Note: MSSA Reconstruction/Prediction is available only in a licensed copy or for data of example projects! This feature is enabled after activation with a purchased Serial No.)

We can reconstruct contributions from selected MSSA components in Components table of Advanced panel in original gridded or compressed by PCA data by choosing Grid or PCA space options. For prediction, user has to specify in Options panel lead time (Lead field), and order of AR model to advance in time selected MSSA components; see MSSA prediction for more details. The name of matrix with reconstruction is specified by the user in Result field. By clicking Plot in Result box at the bottom of Advanced options, user can compare the specified column (spatial channel) or row (time channel) of reconstruction vs. original data. Checking "fliter out" box in Options of Reconstruction will filter out the selected components from the orginal data; this is quite useful for detrending, for example. If results from several reconstructions have been stored in different matrices, the parameters for a particular reconstruction will be restored in GUI by simply selecting correspondent matrix from a Result pop-up list. Here we show reconstruction in PCA space of quasi-quadriennial pair for spatial channel 1 (set Column 1 in Result Box):