Title: A structural equation formulation for general quasi-periodic Gaussian processes

URL Source: https://arxiv.org/html/2511.01151


License: arXiv.org perpetual non-exclusive license. arXiv:2511.01151v1 [stat.ME] 03 Nov 2025

A structural equation formulation for general quasi-periodic Gaussian processes

Unnati Nigam, Radhendushka Srivastava, Faezeh Marzbanrad, Michael Burke

Unnati Nigam is a Ph.D. student at IITB-Monash Research Academy, IIT Bombay, Mumbai, India. Radhendushka Srivastava is with the Department of Mathematics, IIT Bombay, Mumbai, India. Faezeh Marzbanrad and Michael Burke are with the Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Melbourne, Australia.

Abstract

This paper introduces a structural equation formulation that gives rise to a new family of quasi-periodic Gaussian processes, useful for processing a broad class of natural and physiological signals. The proposed formulation simplifies generation and forecasting, and provides hyperparameter estimates, which we exploit in a convergent and consistent iterative estimation algorithm. A bootstrap approach for standard error estimation and confidence intervals is also provided. We demonstrate the computational and scaling benefits of the proposed approach on a broad class of problems, including water level tidal analysis, CO2 emission data, and sunspot numbers data. By leveraging the structural equations, our method reduces the cost of likelihood evaluations and predictions from $\mathcal{O}(k^2 p^2)$ to $\mathcal{O}(p^2)$, significantly improving scalability.

I. Introduction

Periodic signals are prevalent in fields like robotics, physiology, astronomy, and communication systems. However, random noise and unmodeled disturbances often disrupt these regular periodic patterns, leading to signals that display quasi-periodic or pseudo-periodic characteristics. The task of identifying the periodic components and reconstructing the original signal from such quasi-periodic data is a well-established challenge in the field of signal processing.

A range of methods have been proposed for analyzing strictly periodic signals. Among the most commonly used techniques are correlation-based approaches, such as those described in [1, 2] and [3]. Regression analysis is another widely applied tool for modeling periodic signals, as noted in [4]. To improve computational efficiency, [5] introduced a non-linear regression approach for modeling periodic signals. The maximum likelihood estimation technique has also been extensively used to estimate periodic structures, with key contributions in [6] and [7]. Furthermore, Bayesian methods have been employed to model quasi-periodic signals, with [8] using MacKay’s kernel and prior covariance information to enhance signal modeling.

The periodic Gaussian process, as introduced in [8], is a widely adopted noise model for quasi-periodic signals in various applications. It has been used in scenarios such as fault vibration detection in mechanical systems [9], pitch estimation in speech signals [9], analysis of climatological data like rainfall and famine trends [10], modeling joint angles of rotating robotic arms [10], and traffic pollution analysis [11]. Simulation results from [9] demonstrate that periodic Gaussian processes excel in period estimation, particularly under low signal-to-noise ratios, outperforming traditional methods. Periodic Gaussian processes have also been used to detect the multiple unknown seasonal components as well as estimation of respective periods sequentially [12]. However, while periodic Gaussian processes effectively capture correlations within periods, they do not model dependencies across different periods, as highlighted in [13].

In addition to periodic Gaussian processes, Gaussian processes with covariance functions formed by the product of an exponential kernel and MacKay’s periodic kernel have also been explored to model quasi-periodic signals (see [14, 15, 16, 11]). More recently, [13] introduced the Quasi-Periodic Gaussian Process (QPGP), with a covariance function defined as the product of a geometrically decaying kernel and MacKay’s periodic covariance kernel (see (5)), specifically designed to model quasi-periodic signals. This approach effectively captures both intra-period (within-period) and inter-period (between-period) correlations, and allows a maximum likelihood estimation algorithm for model parameters by utilizing the structural properties of MacKay’s kernel to enhance computational efficiency. Simulation studies have demonstrated the improved performance of the QPGP compared to alternative models. While standard QPGPs often require computationally intensive likelihood evaluations, the algorithm presented in [13] offers a significant reduction in complexity. However, this method is limited to MacKay’s kernel and cannot be extended to other commonly used periodic covariance kernels. In another approach, [17] considered the seasonal Gaussian process which is derived through stochastic differential equations and used B-spline approximations for scalable modelling of large irregular quasi-periodic signals.

In this article, we propose a new dynamical equation system that gives rise to a broad family of QPGPs (Section III). This new family of QPGPs allows an extensive selection of periodic kernels to model the variation within periodic blocks, along with diminishing variation between the elements of different blocks. In Section IV, a computationally inexpensive likelihood-based algorithm is presented for the estimation of model parameters and one-step prediction. The dynamical equations lead to the rapid generation of bootstrap resamples. In light of this, we present a bootstrap procedure to estimate the standard errors of the parameter estimates along with 95% bootstrap confidence intervals. The finite-sample performance of the proposed estimation methodology is illustrated in Section V. We apply the proposed QPGP model to quasi-periodic signals in carbon dioxide data, sunspot number data, and water level tidal signals in Section VI. We make concluding remarks in Section VII. Proofs of theoretical results are given in the supplementary material. MATLAB codes for parameter estimation, along with bootstrap standard errors and their confidence intervals, are available on GitHub.

In summary, the core contributions of this work are a structural equation formulation that gives rise to a broad class of quasi-periodic Gaussian processes, without requiring strict assumptions on the within-period covariance structure, a computationally effective generation and likelihood evaluation technique, and a rapid approach to hyperparameter estimation that facilitates rapid standard error estimation via a bootstrap procedure. We now review the literature on quasi-periodic Gaussian processes.

II. Quasi-Periodic Gaussian Processes

A zero-mean stationary Gaussian process $\{X(t), t\in\mathbb{Z}\}$ is completely specified by its covariance kernel, $\kappa(t) \triangleq E(X(0)X(t))$ for all $t\in\mathbb{Z}$. A Gaussian process $\{X(t), t\in\mathbb{Z}\}$ is referred to as a Periodic Gaussian Process with period $p$ if its covariance kernel $\kappa_p$ is a periodic function with period $p$ [14], i.e.,

$$\kappa_p(t+p) = \kappa_p(t) \quad \forall\, t\in\mathbb{Z}. \tag{1}$$

Using the fact that the covariance kernel of a stationary process is an even function, i.e., $\kappa_p(-t) = \kappa_p(t)$, together with (1), specifying $\kappa_p(t)$ for $t = 0, 1, \dots, T$, where $T$ is the maximum lag needed to identify $\kappa_p$ (with $T = p/2$ if $p$ is even and $T = (p-1)/2$ if $p$ is odd), completely determines the periodic covariance kernel $\kappa_p$.

We list below some of the popular periodic families of covariance kernels with period $p$ used in various applications for modelling periodic signals.

1. MacKay's Kernel [8]: Given $\theta > 0$, $\sigma^2 > 0$,

$$\kappa_p(t) = \sigma^2\exp\left(-\theta^2\sin^2(\pi|t|/p)\right). \tag{2}$$

Here, the parameter $\theta$ represents the inverse of the characteristic length-scale (see p. 14 in [14]).

2. Periodic Matérn Kernel: Given $\nu > 0$, $\theta > 0$, $\sigma^2 > 0$,

$$\kappa_p(t) = \sigma^2\,\frac{2^{1-\nu}}{\Gamma(\nu)}\,(\phi(|t|))^{\nu}\,K_\nu(\phi(|t|)), \tag{3}$$

where $\phi(t) = \frac{2\sqrt{2\nu}}{\theta}\sin^2(\pi t/p)$ and $K_\nu(\cdot)$ is a modified Bessel function of the second kind [19]. This periodic covariance function is formed by warping the Matérn kernel [20]; warping an aperiodic kernel is a general technique for constructing a periodic kernel. The parameters $\nu$ and $\theta$ represent the degree of smoothness and the characteristic length-scale, respectively. In particular, if $\nu = 1.5$, then the corresponding periodic Gaussian process is differentiable in the mean-square sense (see pp. 81–84 in [14]).

3. Cosine Kernel [21]: Given $\sigma^2 > 0$ and $\iota\in\mathbb{Z}^+$,

$$\kappa_p(t) = \sigma^2\cos(2\pi\iota|t|/p). \tag{4}$$

The family of cosine kernels constitutes a basis for periodic covariance kernels: any periodic covariance kernel $\kappa_p$ can be pointwise approximated by a linear combination of elements of the cosine family.

Moreover, note that the parameter $\sigma^2$ ($= \kappa_p(0)$) for the above-listed kernels represents the variance of the periodic Gaussian process.

The periodic behavior of the covariance kernel $\kappa_p$ leads only to periodic sample paths from the Gaussian process. This is a major limitation of periodic Gaussian processes in modelling a quasi-periodic signal. To better reflect quasi-periodicity, the periodic pattern of the covariance kernel of a periodic Gaussian process can be adjusted using non-periodic covariance kernels to form quasi-periodic kernels. For example, [14] studied the stationary quasi-periodic covariance kernel

$$\kappa(t) = \sigma^2 e^{-t^2}\exp\left(-\theta^2\sin^2(\pi t/p)\right),$$

where $\theta > 0$. This kernel is the product of a squared-exponential covariance kernel and the periodic MacKay kernel. It has been used to model quasi-periodic Gaussian noise in ECG and pulsatile physiological signals, crop biomass data [15], and the stellar activity of stars [16].

In a study of astrophysical phenomena, [21] used a different stationary quasi-periodic covariance kernel

$$\kappa(t) = \sigma^2 e^{-t^2}\left(\cos(2\pi t/p) + \exp\left(-\theta^2\sin^2(\pi t/p)\right)\right).$$

For carbon dioxide concentration data, [14] used a quasi-periodic kernel that is a linear combination of different non-periodic covariance kernels and a periodic MacKay kernel. [22] considered a quasi-periodic kernel (termed a Simple Harmonic Oscillator kernel) for astronomical quasi-periodic data.

These stationary quasi-periodic covariance kernels have been used to model quasi-periodic signals in various applications. However, they do not explicitly model the correlation between the blocks of successive periodic patterns in the process. To address the limitations of standard periodic Gaussian processes, [13] explored a non-stationary covariance kernel that models both the between- and within-period correlation of the quasi-periodic signal. They referred to a Gaussian process $\{X_t, t\in\mathbb{Z}\}$ as a Quasi-Periodic Gaussian Process with period $p$ if the covariance between $X_t$ and $X_s$ for $t, s\in\mathbb{Z}$ is given by

$$\mathrm{Cov}(X_t, X_s) = \omega^{|\lceil t/p\rceil - \lceil s/p\rceil|}\,\sigma^2\exp\left(-\theta^2\sin^2(\pi(t-s)/p)\right), \tag{5}$$

where $\lceil\cdot\rceil$ denotes the ceiling function, $\sigma^2 > 0$ is the variance, $\omega\in(-1, 1)$ denotes the between-period correlation, and $\theta > 0$ governs the within-period correlation. The second term on the RHS of (5) corresponds to MacKay's covariance kernel, which determines the periodic pattern of the process, while the first term is a geometrically decaying function, i.e. the covariance decreases as the difference between $\lceil t/p\rceil$ and $\lceil s/p\rceil$ increases. The sample paths of a QPGP with covariance as in (5) exhibit quasi-periodic patterns. Note that, for $t, s\in\mathbb{Z}$ such that $|\lceil t/p\rceil - \lceil s/p\rceil| = 0$, $X_t$ and $X_s$ are in the same periodic block of the QPGP and the covariance between them coincides with MacKay's kernel. Further, when $t, s\in\mathbb{Z}$ are such that $|\lceil t/p\rceil - \lceil s/p\rceil| = k$, then $X_t$ and $X_s$ belong to periodic blocks that are $k$ apart, and the covariance between them decays as $\omega^k$. Although the covariance structure of the QPGP given in (5) is non-stationary, it models the correlation between and within block elements explicitly.

Given $n$ samples $X_1, X_2, \dots, X_n$ of the QPGP, the likelihood function involves the computation of the determinant and inverse of a covariance matrix of order $n$ with elements as in (5), which is computationally expensive. [13] expressed this covariance matrix as $\sigma^2$ times a Kronecker product of a Kac-Murdock-Szegő (KMS) matrix [23] and a symmetric-circulant matrix, and developed an efficient algorithm for fast likelihood computation by exploiting the structure of these special matrices. Specifically, the method leverages properties of the Kronecker product inverse and utilizes the factorization of the KMS matrix alongside a symmetric circulant matrix, enabling the use of the Fast Fourier Transform (FFT). However, such a computation restricts the scope of generalization of the within- and between-period correlation structures of the covariance function. Below, we introduce a more general QPGP formulation.

III. Dynamical Equation Model for QPGP

The covariance structure of the proposed QPGP mainly consists of two parts: (a) a component that models the covariance between the elements of different periodic blocks, and (b) a component that models the covariance between elements within the same periodic block. Our proposed QPGP is based on a dynamical equation system that permits the modeling of the within-period correlations for arbitrary choices of the periodic covariance kernel, along with flexibility to adjust the correlation between elements of successive periods. We now formally define this new family of Quasi-Periodic Gaussian Processes.

Definition 1 (QPGP).

A Gaussian process $\{Y_t, t\in\mathbb{Z}^+\}$ is said to be a zero-mean Quasi-Periodic Gaussian process with period $p$, periodic covariance kernel $\kappa_p$, and between-period correlation $\omega$ if

$$E(Y_t) = 0, \text{ for all } t\in\mathbb{Z}^+, \tag{6}$$

and the $p$-dimensional random vectors

$$\mathbf{Y}_{i+1} \triangleq [Y_{ip+1}, Y_{ip+2}, \dots, Y_{ip+p}]^\top, \text{ for } i = 0, 1, 2, \dots,$$

satisfy the recursion

$$\mathbf{Y}_{i+1} = \omega\,\mathbf{Y}_i + \mathbf{Z}_{i+1}, \text{ for } i \ge 1, \tag{7}$$

where $\{\mathbf{Z}_{i+1}\}_{i\ge 1}$ is a sequence of independently and identically distributed $p$-dimensional zero-mean Gaussian random vectors with covariance matrix

$$\mathcal{K} \triangleq (\kappa_p(i-j))_{1\le i, j\le p}, \tag{8}$$

independent of the initial vector $\mathbf{Y}_1$. ∎

Note that the random vectors $\mathbf{Y}_i$ represent the periodic blocks of the QPGP. The recursion in (7) shows that the correlation between successive periods of the QPGP is governed by the parameter $\omega$. Since the random vectors $\mathbf{Z}_i$ are independent copies of a zero-mean periodic Gaussian process with covariance kernel $\kappa_p$, the recursion also shows that the within-period correlation of the QPGP is governed by $\kappa_p$. We refer to the random vectors $\mathbf{Z}_i$ as the periodic building-blocks of the QPGP. The rapid generation of periodic building-blocks together with the dynamical equation (7) leads to the rapid generation of sample paths from a QPGP. Theorem 1, given below, provides an expression for the covariance structure of the QPGP.
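The generative recursion (7) makes sampling straightforward. Below is a minimal numpy sketch (our own illustration; MacKay's kernel (2) and all parameter values are arbitrary choices) that draws a sample path, initializing $\mathbf{Y}_1$ with the inflated covariance $\frac{1}{1-\omega^2}\mathcal{K}$ used for the standard QPGP in Definition 2:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k_blocks, omega, theta, sigma2 = 12, 200, 0.8, 1.5, 1.0

# covariance matrix (8) of the periodic building-blocks, with MacKay's kernel
lags = np.arange(p)
kappa = sigma2 * np.exp(-theta**2 * np.sin(np.pi * lags / p) ** 2)
K = kappa[np.abs(np.subtract.outer(lags, lags))]
L = np.linalg.cholesky(K + 1e-10 * np.eye(p))  # jitter for numerical safety

Y = np.empty((k_blocks, p))
Y[0] = (L @ rng.standard_normal(p)) / np.sqrt(1 - omega**2)  # initial block
for i in range(1, k_blocks):
    Y[i] = omega * Y[i - 1] + L @ rng.standard_normal(p)     # recursion (7)

y = Y.ravel()  # quasi-periodic sample path of length k_blocks * p
```

Each block costs one triangular multiply, so a path of $k$ blocks is generated in $\mathcal{O}(kp^2)$ after a single $\mathcal{O}(p^3)$ Cholesky factorization.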

Theorem 1.

Let $\{Y_t, t\in\mathbb{Z}^+\}$ be a zero-mean Quasi-Periodic Gaussian Process with parameters $p$, $\kappa_p$ and $\omega$. Then, for $s \le t \in \mathbb{Z}^+$, we have

$$\mathrm{Cov}(Y_t, Y_s) = \omega^{|\lceil t/p\rceil - \lceil s/p\rceil|}\left(\kappa_p(t-s)\left[\frac{1-\omega^{2\lfloor s/p\rfloor}}{1-\omega^2}\right] + \omega^{2\lfloor s/p\rfloor}\,\mathrm{Cov}(Y_{l(t)}, Y_{l(s)})\right), \tag{9}$$

where $l(t) \triangleq t - \lfloor t/p\rfloor p$ and $l(s) \triangleq s - \lfloor s/p\rfloor p$, with $l(t), l(s)\in\{1, 2, \dots, p\}$, and $\lfloor\cdot\rfloor$ denotes the floor function. ∎

The second term on the RHS of (9) represents the effect of the initial vector $\mathbf{Y}_1$ on the covariance of the QPGP, which diminishes for large $s$. Therefore, if one discards (burns in) a sufficiently large number of initial observations of the QPGP, then (9) can be approximated as

$$\mathrm{Cov}(Y_t, Y_s) \approx \frac{\omega^{|\lceil t/p\rceil - \lceil s/p\rceil|}}{1-\omega^2}\,\kappa_p(t-s). \tag{10}$$

Proposition 1, given below, shows that a special choice of distribution for the initial vector $\mathbf{Y}_1$ also reduces (9) to (10) exactly.

Proposition 1.

Let $\{Y_t, t\in\mathbb{Z}^+\}$ be a zero-mean Quasi-Periodic Gaussian Process with parameters $p$, $\kappa_p$ and $\omega$. Let the initial vector $\mathbf{Y}_1$ be a zero-mean Gaussian vector with covariance matrix $\frac{1}{1-\omega^2}\mathcal{K}$. Then, for $s, t\in\mathbb{Z}^+$, we have

$$\mathrm{Cov}(Y_t, Y_s) = \frac{\omega^{|\lceil t/p\rceil - \lceil s/p\rceil|}}{1-\omega^2}\,\kappa_p(t-s). \tag{11}$$

∎
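Proposition 1 can be checked directly by propagating the recursion (7) one step: with $\mathrm{Var}(\mathbf{Y}_1) = \frac{1}{1-\omega^2}\mathcal{K}$, we get $\mathrm{Var}(\mathbf{Y}_2) = \omega^2\,\mathrm{Var}(\mathbf{Y}_1) + \mathcal{K} = \frac{1}{1-\omega^2}\mathcal{K}$ and $\mathrm{Cov}(\mathbf{Y}_2, \mathbf{Y}_1) = \frac{\omega}{1-\omega^2}\mathcal{K}$, matching (11) at block distances 0 and 1. A short sketch (our own, using MacKay's kernel (2)):

```python
import numpy as np

p, omega, theta, sigma2 = 8, 0.6, 1.2, 1.5
lags = np.arange(p)
kappa = sigma2 * np.exp(-theta**2 * np.sin(np.pi * lags / p) ** 2)  # (2)
K = kappa[np.abs(np.subtract.outer(lags, lags))]                    # (8)

var_Y1 = K / (1 - omega**2)          # stationary initialization
var_Y2 = omega**2 * var_Y1 + K       # one step of (7): variance is preserved
cov_Y2_Y1 = omega * var_Y1           # Cov(omega*Y1 + Z2, Y1)
```

The recursion therefore leaves the block distribution invariant, which is exactly why the standard QPGP has the stationary covariance (11).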

We now define a Standard Quasi-Periodic Gaussian Process.

Definition 2 (Standard QPGP).

A QPGP $\{Y_t, t\in\mathbb{Z}^+\}$ is said to be a Standard QPGP if the initial vector $\mathbf{Y}_1$ is a Gaussian vector with mean $\mathbf{0}$ and covariance matrix $\frac{1}{1-\omega^2}\mathcal{K}$. ∎

The covariance structure of the QPGP given by [13] (see (5)) coincides with that of the proposed standard QPGP when $\kappa_p$ is chosen as MacKay's kernel (2) with a scale factor of $\frac{1}{1-\omega^2}$. The proposed formulation enables the construction of a new family of QPGPs with arbitrary periodic covariance kernels for modeling within-period correlation, providing far greater flexibility than the covariance function given by (5).

IV. Estimation strategy

Given the $n$-dimensional data vector $\mathbf{y} = [y_1, y_2, \dots, y_n]^\top$ from the standard QPGP with period $p$, periodic covariance kernel $\kappa_p$, and between-period correlation $\omega$, we consider the likelihood approach for estimation of the parameters. We begin with $n = kp$ for some $k\in\mathbb{Z}^+$ for simplicity of the likelihood expression. When $n \ne kp$, the amended estimation methodology is provided in Appendix A-A. The negative logarithm of the likelihood function of the data vector $\mathbf{y}$ is given as

$$\ell_n(\omega, \kappa_p) = \frac{1}{2}\log(|\Sigma_n|) + \frac{1}{2}\mathbf{y}^\top\Sigma_n^{-1}\mathbf{y} + \frac{n}{2}\log(2\pi), \tag{12}$$

where

$$(1-\omega^2)\,\Sigma_n \triangleq \left(\omega^{|\lceil i/p\rceil - \lceil j/p\rceil|}\,\kappa_p(i-j)\right)_{1\le i, j\le n}. \tag{13}$$

The evaluation of (12) involves the computationally expensive determinant and inverse of the covariance matrix $\Sigma_n$. When $\kappa_p$ is chosen to be MacKay's kernel, the fast algorithm developed in [13] for computing the determinant and inverse of $(1-\omega^2)\Sigma_n$ can be used for the likelihood evaluation.

By using (7) and the conditional distribution of the periodic blocks of the standard QPGP, the negative logarithm of the likelihood function can be expressed as

$$\ell_n(\omega, \kappa_p) = \underbrace{\frac{k-1}{2}\log(|\mathcal{K}|) + \frac{1}{2}\sum_{i=1}^{k-1}(\mathbf{y}_{k-i+1} - \omega\mathbf{y}_{k-i})^\top\mathcal{K}^{-1}(\mathbf{y}_{k-i+1} - \omega\mathbf{y}_{k-i})}_{\text{contribution of periodic blocks } \mathbf{Y}_k, \dots, \mathbf{Y}_2} + \underbrace{\frac{1}{2}\log\left(\left|\tfrac{1}{1-\omega^2}\mathcal{K}\right|\right) + \frac{1}{2}\mathbf{y}_1^\top\left(\tfrac{1}{1-\omega^2}\mathcal{K}\right)^{-1}\mathbf{y}_1}_{\text{marginal contribution of periodic block } \mathbf{Y}_1} + c, \tag{14}$$

where $\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_k$ are the observed periodic blocks of the QPGP data vector $\mathbf{y}$ and $c = \frac{n}{2}\log(2\pi)$. The evaluation of (14) requires only the inverse and determinant of the $p$-dimensional periodic covariance matrix $\mathcal{K}$. This simplification reduces the computational cost of likelihood evaluation immensely. In subsection V-A, we illustrate numerically that evaluating the simplified likelihood expression (14) is computationally faster than using the expression in (12) (see Table I). This improvement arises because the proposed method reduces the matrix computation complexity from $\mathcal{O}(k^2 p^2)$ to $\mathcal{O}(p^2)$.
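The factorized likelihood (14) can be checked against the direct form (12)-(13) numerically. The sketch below (our own illustration, using MacKay's kernel; function names are ours) evaluates both on a small example; only the $p\times p$ matrix $\mathcal{K}$ is factorized in the fast version:

```python
import numpy as np

def qpgp_neg_loglik_fast(y, p, omega, K):
    # factorized negative log-likelihood (14): p x p linear algebra only
    blocks = y.reshape(-1, p)                    # rows y_1, ..., y_k
    k = blocks.shape[0]
    logdetK = np.linalg.slogdet(K)[1]
    Kinv = np.linalg.inv(K)
    resid = blocks[1:] - omega * blocks[:-1]     # y_{i+1} - omega * y_i
    nll = 0.5 * (k - 1) * logdetK
    nll += 0.5 * np.einsum('ij,jl,il->', resid, Kinv, resid)
    # marginal contribution of the initial block, Var(Y_1) = K/(1-omega^2)
    nll += 0.5 * (logdetK - p * np.log(1 - omega**2))
    nll += 0.5 * (1 - omega**2) * blocks[0] @ Kinv @ blocks[0]
    return nll + 0.5 * y.size * np.log(2 * np.pi)

# direct evaluation via the full n x n covariance implied by (13)
rng = np.random.default_rng(1)
p, k, omega, theta = 6, 5, 0.7, 1.0
n = k * p
lags = np.arange(p)
kappa = np.exp(-theta**2 * np.sin(np.pi * lags / p) ** 2)
K = kappa[np.abs(np.subtract.outer(lags, lags))]
idx = np.arange(1, n + 1)
cb = np.ceil(idx / p)
Sigma = (omega ** np.abs(cb[:, None] - cb[None, :])
         * kappa[np.abs(np.subtract.outer(idx, idx)) % p]) / (1 - omega**2)

y = rng.standard_normal(n)  # any vector: (12) and (14) agree identically
nll_fast = qpgp_neg_loglik_fast(y, p, omega, K)
nll_direct = (0.5 * np.linalg.slogdet(Sigma)[1]
              + 0.5 * y @ np.linalg.solve(Sigma, y)
              + 0.5 * n * np.log(2 * np.pi))
```

The two values agree to machine precision because (14) is the exact factorization of the standard QPGP density, not an approximation.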

We now formally describe the parameter space of the QPGP parameters $\omega$ and $\kappa_p$. Since $\omega$ represents the correlation between the elements of successive periodic blocks, we set $\omega\in[-1, 1]$. Similarly, as $\kappa_p$ represents the covariance kernel of the periodic building-blocks, we set $\kappa_p\in\mathbb{K}_p$, where $\mathbb{K}_p$ denotes the set of all periodic covariance kernels of order $p$. The maximum likelihood estimator of $(\omega, \kappa_p)$ is obtained by minimizing $\ell_n(\omega, \kappa_p)$, as given in (14), over $\omega\in[-1, 1]$ and $\kappa_p\in\mathbb{K}_p$.

The non-convexity of $\ell_n(\omega, \kappa_p)$ over $\kappa_p\in\mathbb{K}_p$ poses a major challenge for maximum likelihood estimation (see [24]). However, for a specific $\kappa_p$ indexed by parameters $(\boldsymbol\theta, \sigma^2)$, the likelihood function $\ell_n(\omega, \kappa_p(\boldsymbol\theta, \sigma^2))$ may be a convex function of $(\omega, \boldsymbol\theta, \sigma^2)$ over the reduced parameter space. In such scenarios, the maximum likelihood estimates of $\omega$, $\boldsymbol\theta$, and $\sigma^2$ can be obtained analytically or by numerical methods such as grid search. In particular, [13] considered $\kappa_p$ to be MacKay's kernel, and maximum likelihood estimates were obtained using the grid search method. The computational efficiency and accuracy of estimates based on the grid search algorithm depend on the size of the grid.

By utilizing the structural equations approach, we now develop a fast algorithm for estimating the QPGP parameters $\omega$ and $\kappa_p$ under a general setup. Note that the likelihood of the proposed QPGP differs from that of the standard QPGP through the marginal contribution of the initial periodic block $\mathbf{Y}_1$ (see (14)), so the maximum likelihood estimates depend on the marginal distribution of $\mathbf{Y}_1$. Given this, we do not include the marginal contribution of $\mathbf{Y}_1$ in the likelihood function in our estimation approach. This loss of information provides flexibility in applying our estimation approach to a general QPGP. Thus, in the next subsection we base our estimation on the function $\tilde\ell_n$, referred to as the scaled negative logarithm of the reduced likelihood function:

$$\tilde\ell_n(\omega, \mathcal{K}) = \log(|\mathcal{K}|) + \frac{1}{k-1}\sum_{i=1}^{k-1}(\mathbf{y}_{k-i+1} - \omega\mathbf{y}_{k-i})^\top\mathcal{K}^{-1}(\mathbf{y}_{k-i+1} - \omega\mathbf{y}_{k-i}). \tag{15}$$

IV-A Two-stage fast estimation algorithm

We propose a two-stage algorithm to estimate the parameter $\omega$ and the periodic covariance kernel $\kappa_p$ based on the reduced likelihood function $\tilde\ell_n$ given in (15). Note that $\tilde\ell_n(\omega, \mathcal{K})$ is a twice differentiable function of $\omega\in[-1, 1]$ and $\mathcal{K}\in\mathfrak{K}$, where $\mathfrak{K}$ is the set of all real-valued $p$-dimensional invertible matrices with bounded entries. In Stage I of the algorithm, we estimate $\omega$ and $\mathcal{K}$ by minimizing $\tilde\ell_n$ over $\omega\in[-1, 1]$ and $\mathcal{K}\in\mathfrak{K}$. By matrix differentiation (see pp. 9–10 of [25]), we have

$$\frac{\partial\tilde\ell_n}{\partial\omega} = -\frac{1}{k-1}\sum_{i=1}^{k-1}\mathbf{y}_i^\top\mathcal{K}^{-1}\mathbf{y}_{i+1} + \frac{\omega}{k-1}\sum_{i=1}^{k-1}\mathbf{y}_i^\top\mathcal{K}^{-1}\mathbf{y}_i, \tag{16}$$

$$\frac{\partial\tilde\ell_n}{\partial\mathcal{K}} = -\mathcal{K} + \frac{1}{k-1}\sum_{i=1}^{k-1}(\mathbf{y}_{i+1} - \omega\mathbf{y}_i)(\mathbf{y}_{i+1} - \omega\mathbf{y}_i)^\top. \tag{17}$$

By equating the right-hand sides of (16) and (17) to zero, the stationary points of $\tilde\ell_n$ satisfy the relations

$$\omega = \frac{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\mathcal{K}^{-1}\mathbf{y}_{i+1}}{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\mathcal{K}^{-1}\mathbf{y}_i}, \tag{18}$$

$$\mathcal{K} = \frac{1}{k-1}\sum_{i=1}^{k-1}(\mathbf{y}_{i+1} - \omega\mathbf{y}_i)(\mathbf{y}_{i+1} - \omega\mathbf{y}_i)^\top. \tag{19}$$

No explicit solution of the nonlinear equations (18) and (19) exists. We therefore minimize $\tilde\ell_n$ over $\omega$ and $\mathcal{K}$ alternately and iteratively to obtain a solution, denoted $\tilde\omega_n$ and $\tilde{\mathcal{K}}_n$. Note that, for given $\omega$, the solution $\mathcal{K}$ in (19) is the unique minimizer of $\tilde\ell_n$; similarly, for given $\mathcal{K}$, the solution $\omega$ in (18) is the unique minimizer of $\tilde\ell_n$. Therefore, by Proposition 2.7.1 of [26], the iterative alternating minimization sequence $\{(\tilde\omega^{(m)}, \tilde{\mathcal{K}}^{(m)})\}$ in Stage I of Algorithm 1 converges to a stationary point of $\tilde\ell_n$.
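Stage I is simple to implement because both updates are closed-form. A sketch follows (our own code; the paper's stopping rule uses the gradients (16)-(17), whereas this sketch stops on parameter change for brevity):

```python
import numpy as np

def stage_one(y, p, delta=1e-10, max_iter=500):
    # alternating minimization with closed-form updates (18) and (19)
    blocks = y.reshape(-1, p)
    k = blocks.shape[0]
    K, omega = np.eye(p), 0.0            # initialization K^(0) = I_p
    for _ in range(max_iter):
        Kinv = np.linalg.inv(K)
        num = np.einsum('ij,jl,il->', blocks[:-1], Kinv, blocks[1:])
        den = np.einsum('ij,jl,il->', blocks[:-1], Kinv, blocks[:-1])
        omega_new = num / den                                  # update (18)
        resid = blocks[1:] - omega_new * blocks[:-1]
        K_new = resid.T @ resid / (k - 1)                      # update (19)
        done = (abs(omega_new - omega) < delta
                and np.max(np.abs(K_new - K)) < delta)
        omega, K = omega_new, K_new
        if done:
            break
    return omega, K

# recover omega from data simulated via the recursion (7)
rng = np.random.default_rng(2)
p, kb, omega0, theta = 8, 400, 0.7, 1.0
kappa = np.exp(-theta**2 * np.sin(np.pi * np.arange(p) / p) ** 2)
K0 = kappa[np.abs(np.subtract.outer(np.arange(p), np.arange(p)))]
L = np.linalg.cholesky(K0 + 1e-10 * np.eye(p))
Y = np.empty((kb, p))
Y[0] = L @ rng.standard_normal(p) / np.sqrt(1 - omega0**2)
for i in range(1, kb):
    Y[i] = omega0 * Y[i - 1] + L @ rng.standard_normal(p)
omega_hat, K_hat = stage_one(Y.ravel(), p)
```

With a few hundred blocks, the Stage I estimate of $\omega$ is already close to the true value, reflecting the consistency result of Theorem 2.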

Note that the solution $\tilde{\mathcal{K}}_n$ is a non-negative definite matrix, but it is not guaranteed to be a covariance matrix corresponding to a periodic covariance kernel. Therefore, in Stage II of Algorithm 1, we constrain $\tilde{\mathcal{K}}_n$ to the set $\mathbb{K}_p$. For this purpose, we minimize the Frobenius norm of $(\tilde{\mathcal{K}}_n - \mathcal{K})$ over the periodic covariance kernels to estimate $\kappa_p$, i.e.,

$$\underset{\mathcal{K} = (\kappa_p(i-j))_{1\le i, j\le p}}{\text{minimize}}\;\|\tilde{\mathcal{K}}_n - \mathcal{K}\|_F. \tag{20}$$

The minimizer of (20) is given by

$$\tilde\kappa_p(t) = \frac{1}{p-|t|}\sum_{j=1}^{p-|t|}\tilde{\mathcal{K}}_n(j, j+|t|), \text{ for } |t| < p. \tag{21}$$

To ensure the positive definiteness of the covariance kernel estimate, we use the technique developed in [27]. We restrict the spectrum of $\tilde\kappa_p$ to be non-negative via truncation, and its inverse Fourier transform is our proposed estimate of $\kappa_p$, i.e.,

$$\hat\kappa_p(t) = \int_{-\pi}^{\pi} e^{it\lambda}\,\max(\tilde f(\lambda), 0)\,d\lambda, \text{ for } |t| < p, \tag{22}$$

where

$$\tilde f(\lambda) = \frac{1}{2\pi}\sum_{|t| < p}\tilde\kappa_p(t)\,e^{-it\lambda}, \text{ for all } \lambda\in[-\pi, \pi]. \tag{23}$$
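Stage II, i.e., diagonal averaging (21) followed by spectral truncation (22)-(23), can be sketched as follows (our own discretized implementation; the $\lambda$-integral is approximated on a uniform grid):

```python
import numpy as np

def project_to_periodic_kernel(K_tilde, n_grid=4096):
    # (21): average the diagonals of K_tilde to get a Toeplitz kernel
    p = K_tilde.shape[0]
    kappa_tilde = np.array([np.diagonal(K_tilde, t).mean() for t in range(p)])
    # (23): spectrum of the (even) kernel on a lambda-grid
    lam = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    t_all = np.arange(-(p - 1), p)
    f = kappa_tilde[np.abs(t_all)] @ np.cos(np.outer(t_all, lam)) / (2 * np.pi)
    f_plus = np.maximum(f, 0.0)                    # truncate the negative part
    dlam = lam[1] - lam[0]
    # (22): inverse transform of the truncated spectrum
    kappa_hat = np.array([(np.cos(t * lam) * f_plus).sum() * dlam
                          for t in range(p)])
    return kappa_hat

# demo: a Stage-I-type estimate = true MacKay matrix plus symmetric noise
rng = np.random.default_rng(4)
p, theta = 10, 1.3
kappa0 = np.exp(-theta**2 * np.sin(np.pi * np.arange(p) / p) ** 2)
K_true = kappa0[np.abs(np.subtract.outer(np.arange(p), np.arange(p)))]
noise = 0.05 * rng.standard_normal((p, p))
kappa_hat = project_to_periodic_kernel(K_true + (noise + noise.T) / 2)
K_hat = kappa_hat[np.abs(np.subtract.outer(np.arange(p), np.arange(p)))]
```

By construction $v^\top\hat{\mathcal{K}}v = \sum_\lambda \max(\tilde f(\lambda), 0)\,\big|\sum_j v_j e^{ij\lambda}\big|^2\,\Delta\lambda \ge 0$, so the truncation yields a valid (non-negative definite) kernel up to floating-point error.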

Our proposed estimate of $\omega$ is then

$$\hat\omega = \frac{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\hat{\mathcal{K}}^{-1}\mathbf{y}_{i+1}}{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\hat{\mathcal{K}}^{-1}\mathbf{y}_i}, \tag{24}$$

where $\hat{\mathcal{K}} \triangleq (\hat\kappa_p(i-j))_{1\le i, j\le p}$. The proposed estimation algorithm is summarized in Alg. 1.

Suppose the periodic covariance kernel $\kappa_p$ of the QPGP is known to belong to a parametric family of covariance kernels $\mathbb{K}_{(\boldsymbol\theta, \sigma^2)}$ with hyper-parameters $(\boldsymbol\theta, \sigma^2)\in\Theta\times(0, \infty)$ (see the examples listed in Section II). In such a scenario, Stage II of Algorithm 1 simplifies significantly. In particular, we restrict $\tilde{\mathcal{K}}_n$, obtained from Stage I, to $\mathbb{K}_{(\boldsymbol\theta, \sigma^2)}$ by minimizing the Frobenius norm of $(\tilde{\mathcal{K}}_n - \mathcal{K}(\boldsymbol\theta, \sigma^2))$ over $\mathcal{K}(\boldsymbol\theta, \sigma^2)\in\mathbb{K}_{(\boldsymbol\theta, \sigma^2)}$, i.e.,

$$(\hat{\boldsymbol\theta}, \hat\sigma^2) = \underset{(\boldsymbol\theta, \sigma^2)\in\Theta\times(0, \infty)}{\arg\min}\;\|\tilde{\mathcal{K}}_n - \mathcal{K}(\boldsymbol\theta, \sigma^2)\|_F. \tag{25}$$

We then replace $\hat{\mathcal{K}}$ by $\mathcal{K}(\hat{\boldsymbol\theta}, \hat\sigma^2)$ in (24) to get the Stage II estimate of $\omega$. In the absence of an analytical expression for $(\hat{\boldsymbol\theta}, \hat\sigma^2)$, a grid search minimization can be implemented for (25).
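When $\kappa_p$ is MacKay's kernel, the grid search for (25) is a direct two-parameter scan. An illustrative sketch (our own code; the grids and parameter values are arbitrary choices):

```python
import numpy as np

def fit_mackay(K_tilde, p, thetas, sigma2s):
    # grid-search minimizer of ||K_tilde - K(theta, sigma2)||_F, as in (25)
    lags = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    best_theta, best_s2, best_err = None, None, np.inf
    for theta in thetas:
        base = np.exp(-theta**2 * np.sin(np.pi * lags / p) ** 2)  # kernel (2)
        for s2 in sigma2s:
            err = np.linalg.norm(K_tilde - s2 * base, 'fro')
            if err < best_err:
                best_theta, best_s2, best_err = theta, s2, err
    return best_theta, best_s2

# sanity check: recover known hyper-parameters from a noiseless K
p, theta0, s20 = 10, 1.4, 2.0
lags = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
K0 = s20 * np.exp(-theta0**2 * np.sin(np.pi * lags / p) ** 2)
theta_hat, s2_hat = fit_mackay(K0, p,
                               np.linspace(0.5, 2.5, 21),   # step 0.1
                               np.linspace(0.5, 3.5, 31))   # step 0.1
```

For a fixed $\theta$, the optimal $\sigma^2$ actually has the closed form $\langle\tilde{\mathcal{K}}_n, B\rangle/\langle B, B\rangle$, with $B$ the unit-variance MacKay matrix, which would reduce the scan to one dimension; the plain double loop above mirrors the grid search described in the text.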

Theorem 2, given below, establishes the consistency of the proposed estimators $\hat\omega$ and $\hat\kappa_p$.

Theorem 2.

Let $\mathbf{y} = [y_1, y_2, \dots, y_n]^\top$ be an $n$-dimensional sample path of a QPGP with period $p$ and parameters $\omega_0$ and $\kappa_{0p}$. Then, we have the following convergence results.

1. The reduced likelihood function $\tilde\ell_n(\omega, \mathcal{K}) \xrightarrow{P} \tilde\ell(\omega, \mathcal{K})$ continuously as $n\to\infty$, where

$$\tilde\ell(\omega, \mathcal{K}) = \log(|\mathcal{K}|) + \mathrm{tr}(\mathcal{K}^{-1}\mathcal{K}_0)\left(1 + \frac{(\omega - \omega_0)^2}{1 - \omega_0^2}\right). \tag{26}$$

2. The limiting function $\tilde\ell(\omega, \mathcal{K})$, defined over $\omega\in[-1, 1]$ and $\mathcal{K}\in\mathfrak{K}$, is twice differentiable with a minimum at $(\omega_0, \mathcal{K}_0)$.

3. If $\tilde\ell(\omega, \mathcal{K})$ has a unique minimum at $(\omega_0, \mathcal{K}_0)$, then the estimators $\hat\omega$ and $\hat{\mathcal{K}}$, obtained from Stage II of Algorithm 1, converge as follows:

- $\hat\omega \xrightarrow{P} \omega_0$ as $n\to\infty$;
- $\hat{\mathcal{K}} \xrightarrow{P} \mathcal{K}_0 \triangleq (\kappa_{0p}(i-j))_{1\le i, j\le p}$ as $n\to\infty$.

Here, $\xrightarrow{P}$ denotes convergence in probability. ∎

Algorithm 1: Estimation of QPGP parameters $\omega$ and $\kappa_p$

1: Input: $\mathbf{y} = [y_1, \dots, y_n]^\top$ with $n = kp$; period $p$; threshold $\delta$
2: Stage I:
3: Initialize $\tilde{\mathcal{K}}^{(0)} = \mathbf{I}_p$
4: for $m = 1, 2, \dots$ do
5:   $\tilde\omega^{(m)} = \dfrac{\sum_{i=1}^{k-1}\mathbf{y}_i^\top(\tilde{\mathcal{K}}^{(m-1)})^{-1}\mathbf{y}_{i+1}}{\sum_{i=1}^{k-1}\mathbf{y}_i^\top(\tilde{\mathcal{K}}^{(m-1)})^{-1}\mathbf{y}_i}$
6:   $\tilde{\mathcal{K}}^{(m)} = \frac{1}{k-1}\sum_{i=1}^{k-1}(\mathbf{y}_{i+1} - \tilde\omega^{(m)}\mathbf{y}_i)(\mathbf{y}_{i+1} - \tilde\omega^{(m)}\mathbf{y}_i)^\top$
7:   if $\max\left(\left|\frac{\partial\tilde\ell_n}{\partial\omega}\right|, \left|\frac{\partial\tilde\ell_n}{\partial\mathcal{K}}\right|_\infty\right)\Big|_{(\tilde\omega^{(m)}, \tilde{\mathcal{K}}^{(m)})} < \delta$ then
8:     Set $\tilde\omega_n = \tilde\omega^{(m)}$, $\tilde{\mathcal{K}}_n = \tilde{\mathcal{K}}^{(m)}$
9:     break
10:  end if
11: end for
12: Stage II:
13: for $|t| < p$ do
14:   $\tilde\kappa_p(t) = \frac{1}{p - |t|}\sum_{j=1}^{p - |t|}\tilde{\mathcal{K}}_n(j, j + |t|)$
15: end for
16: Compute the spectrum of $\tilde\kappa_p$:
17:   $\tilde f(\lambda) = \frac{1}{2\pi}\sum_{|t| < p}\tilde\kappa_p(t)e^{-it\lambda}$, $\forall\lambda\in[-\pi, \pi]$
18: Estimates of $\kappa_p$ and $\omega$:
19:   $\hat\kappa_p(t) = \int_{-\pi}^{\pi} e^{it\lambda}\max(\tilde f(\lambda), 0)\,d\lambda$, $|t| < p$
20:   $\hat\omega = \dfrac{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\hat{\mathcal{K}}^{-1}\mathbf{y}_{i+1}}{\sum_{i=1}^{k-1}\mathbf{y}_i^\top\hat{\mathcal{K}}^{-1}\mathbf{y}_i}$, with $\hat{\mathcal{K}} \triangleq (\hat\kappa_p(i-j))_{1\le i, j\le p}$
21: Output: $\hat\omega$ and $\hat\kappa_p(t)$ for $t = 0, 1, \dots, p-1$

IV-B Prediction of QPGP

In this subsection, for a standard QPGP vector $\mathbf{Y}_t = [Y_1, \dots, Y_t]^\top$, we first obtain the best linear predictor of $Y_t$ in terms of $\mathbf{Y}_{t-1}$ and subsequently describe a measure of the goodness of fit of the QPGP. For a Gaussian vector $\mathbf{Y}_t$, the best linear predictor $\hat Y_t$ of $Y_t$ given $\mathbf{Y}_{t-1}$ is the conditional expectation (see Definition 2.7.4 for the conditional mean of jointly Gaussian vectors, p. 64 of [28]):

$$\hat Y_t = E(Y_t \mid Y_{t-1}, Y_{t-2}, \dots, Y_1) = \Sigma_{t-1,1}\,\Sigma_{t-1}^{-1}\,\mathbf{Y}_{t-1}, \text{ for } t > 1, \tag{27}$$

where $\Sigma_t = \mathrm{Var}(\mathbf{Y}_t)$, as defined in (13), is partitioned as

$$\Sigma_t = \begin{bmatrix}\Sigma_{t-1} & \Sigma_{t-1,1}\\ \Sigma_{1,t-1} & \frac{\kappa_p(0)}{1-\omega^2}\end{bmatrix}.$$

Further, $\mathrm{Var}(\hat Y_t) = \Sigma_{t-1,1}\,\Sigma_{t-1}^{-1}\,\Sigma_{1,t-1}$. As with the likelihood function, the evaluation of $\hat Y_t$ requires the expensive computation of the inverse of $\Sigma_{t-1}$ for large $t$. Theorem 3, given below, shows that the proposed structural equation-based QPGP yields a computationally efficient formula for the best linear predictor of $Y_t$.

Theorem 3.

Let $\boldsymbol{Y}_t = [Y_1, Y_2, \dots, Y_t]^\top$ be a standard QPGP vector with parameters $p$, $\omega$ and $\kappa_p$. Let $i(t) \triangleq \lfloor t/p \rfloor$, $l(t) \triangleq t - i(t)\,p$ and $\mathsf{Y}_{j+1}^{(l)} \triangleq [Y_{jp+1}, \dots, Y_{jp+l}]^\top$ for $j = 0, \dots, k-1$. Then the best linear predictor $\hat{Y}_t$ of $Y_t$ given $\boldsymbol{Y}_{t-1}$ is

$$\hat{Y}_t = \begin{cases} \mathcal{K}_{1,l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathsf{Y}_1^{(t-1)} & \text{if } 1 < t \le p,\\[6pt] \omega\,Y_{t-p} + \mathcal{K}_{1,l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\bigl(\mathsf{Y}_{i(t)+1}^{l(t)-1} - \omega\,\mathsf{Y}_{i(t)}^{l(t)-1}\bigr) & \text{if } t > p, \end{cases} \tag{28}$$

where π“š 𝑙 ​ ( 𝑑 ) β‰œ ( πœ… 𝑝 ​ ( 𝑖 βˆ’ 𝑗 ) ) 1 ≀ 𝑖 , 𝑗 ≀ 𝑙 ​ ( 𝑑 ) and partitioned as

π“š 𝑙 ​ ( 𝑑 )

[ π“š 𝑙 ​ ( 𝑑 ) βˆ’ 1

π“š 𝑙 ​ ( 𝑑 ) βˆ’ 1 , 1

π“š 1 , 𝑙 ​ ( 𝑑 ) βˆ’ 1

πœ… 𝑝 ​ ( 0 ) ] .

(29)

Further,

$$\mathrm{Var}(\hat{Y}_t) = \begin{cases} \dfrac{1}{1-\omega^2}\,\mathcal{K}_{1,l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathcal{K}_{l(t)-1,1} & \text{if } 1 < t \le p,\\[6pt] \dfrac{\omega^2\,\kappa_p(0)}{1-\omega^2} + \mathcal{K}_{1,l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathcal{K}_{l(t)-1,1} & \text{if } t > p. \end{cases} \tag{30}$$

∎

Theorem 3 enables the fast evaluation of $\hat{Y}_t$, since it requires only the inverse of $\mathcal{K}_{l(t)-1}$, where $l(t) \in \{1, 2, \dots, p\}$, with computational complexity $\mathcal{O}(l(t)^2) \le \mathcal{O}(p^2)$. In contrast, computing $\hat{Y}_t$ via expression (27) costs $\mathcal{O}(t^2)$. We illustrate numerically in subsection V-A that computing $\hat{Y}_t$ based on (28) is faster than computing it based on (27) (see Table II).
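As a minimal NumPy sketch (not the authors' implementation), the case split of (28) can be coded so that only within-period matrices of size at most $p-1$ are ever inverted; the callable `kappa` is a user-supplied stand-in for the periodic kernel $\kappa_p$:

```python
import numpy as np

def predict_qpgp(y, omega, kappa, p):
    """One-step best linear predictor of a standard QPGP via (28):
    each step inverts at most a (p-1) x (p-1) within-period matrix,
    never the full (t-1) x (t-1) covariance of (27)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    yhat = np.full(n, np.nan)                 # yhat[t-1] predicts y at time t
    K = np.array([[kappa(abs(i - j)) for j in range(p)] for i in range(p)])
    for t in range(2, n + 1):                 # 1-indexed time, as in the paper
        if t <= p:                            # first period: regress on Y_1^(t-1)
            yhat[t - 1] = K[t - 1, :t - 1] @ np.linalg.solve(K[:t - 1, :t - 1], y[:t - 1])
        else:
            l = t % p or p                    # position l(t) within the period
            i = (t - l) // p                  # block index i(t)
            pred = omega * y[t - 1 - p]       # between-period term omega * Y_{t-p}
            if l > 1:                         # within-period correction term
                cur = y[i * p : i * p + l - 1]
                prev = y[(i - 1) * p : (i - 1) * p + l - 1]
                pred += K[l - 1, :l - 1] @ np.linalg.solve(K[:l - 1, :l - 1], cur - omega * prev)
            yhat[t - 1] = pred
    return yhat
```

For $t = ip + 1$ (the start of a block) the correction term is empty and the predictor reduces to $\omega Y_{t-p}$, as (28) implies.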

We now describe a measure of goodness of fit of a standard QPGP. We first estimate the best linear predictor of $Y_t$ by using Theorem 3. Given an $n$-dimensional data vector $\boldsymbol{y} = [y_1, y_2, \dots, y_n]^\top$ of the standard QPGP with period $p$, periodic covariance kernel $\kappa_p$, and between-period correlation $\omega$, we estimate the best linear predictor of $y_t$ using the plug-in estimator

$$\hat{y}_t = \begin{cases} \hat{\mathcal{K}}_{1,l(t)-1}\,\hat{\mathcal{K}}_{l(t)-1}^{-1}\,\mathsf{y}_1^{(t-1)} & \text{if } 1 < t \le p,\\[6pt] \hat{\omega}\,y_{t-p} + \hat{\mathcal{K}}_{1,l(t)-1}\,\hat{\mathcal{K}}_{l(t)-1}^{-1}\bigl(\mathsf{y}_{i(t)+1}^{l(t)-1} - \hat{\omega}\,\mathsf{y}_{i(t)}^{l(t)-1}\bigr) & \text{if } t > p, \end{cases} \tag{31}$$

where $\hat{\omega}$ and $\hat{\mathcal{K}}$ are the proposed estimators obtained from Algorithm 1, and $\hat{\mathcal{K}}_{1,l(t)-1}$ and $\hat{\mathcal{K}}_{l(t)-1}$ are obtained from $\hat{\mathcal{K}}$ by using (29). The estimated predicted value $\hat{y}_t$ given by (31) is also referred to as the fitted QPGP at time $t$. We choose the empirical integrated prediction squared error (EIPSE), defined below, as a measure of goodness of fit:

$$\mathrm{EIPSE} = \frac{1}{n}\sum_{t=2}^{n} (y_t - \hat{y}_t)^2. \tag{32}$$

Note that EIPSE measures the scaled squared Euclidean distance between the observation vector $\boldsymbol{y}_n$ and its best linear predictor vector $\hat{\boldsymbol{y}}_n$. EIPSE can be used to select the covariance kernel $\kappa_p$ from a class of candidate parametric families of covariance kernels (see examples in section II) in a particular application. In such a scenario, we recommend choosing the $\kappa_p$ that yields the smallest EIPSE. For a general QPGP, we discard the initial periodic block $\mathsf{Y}_1$ in the computation of EIPSE.
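The EIPSE criterion (32) and the kernel-selection rule above can be sketched in a few lines; `fitted_by_kernel` below is a hypothetical mapping from kernel names to their fitted values $\hat{y}_t$:

```python
import numpy as np

def eipse(y, yhat):
    """Empirical integrated prediction squared error (32): squared gaps
    between observations and one-step predictions, summed from t = 2
    and scaled by 1/n."""
    y, yhat = np.asarray(y, dtype=float), np.asarray(yhat, dtype=float)
    return float(np.sum((y[1:] - yhat[1:]) ** 2) / len(y))

def select_kernel(y, fitted_by_kernel):
    """Pick the periodic kernel whose fitted values minimise EIPSE."""
    return min(fitted_by_kernel, key=lambda name: eipse(y, fitted_by_kernel[name]))
```

This is exactly the comparison performed over the candidate kernels in Table IV.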

IV-C Uncertainty quantification

In this subsection, we present a model-based bootstrap approach to quantify the uncertainty of the estimators $\hat{\omega}$ and $\hat{\kappa}_p$. The rapid generation of the proposed QPGP via the structural equations (7) makes the resampling procedure used to generate the bootstrap samples inexpensive.

Given the QPGP data vector $\boldsymbol{y} = [y_1, y_2, \dots, y_n]^\top$ and the estimates $\hat{\omega}$ and $\hat{\kappa}_p$ obtained from Algorithm 1, we use the following resampling steps to obtain bootstrap estimates of the parameters $\omega$, $\mathcal{K}$ and the best linear prediction of $y_t$ for $t = 1, 2, \dots, n$.

1. Residuals: Compute the residuals $\hat{\mathsf{z}}_i = \mathsf{y}_i - \hat{\omega}\,\mathsf{y}_{i-1}$ for $i = 2, 3, \dots, k$.
2. Resampled periodic building blocks: Generate $\mathsf{z}_i^*$ for $i = 2, \dots, k$ by simple random sampling with replacement from $\{\hat{\mathsf{z}}_2, \hat{\mathsf{z}}_3, \dots, \hat{\mathsf{z}}_k\}$.
3. Initial periodic block: Set $\mathsf{y}_1^* = \mathsf{y}_1$.
4. Resampled QPGP: Generate $\boldsymbol{y}^* = [y_1^*, y_2^*, \dots, y_n^*]^\top$ by using (7) and $\mathsf{y}_1^*, \mathsf{z}_2^*, \dots, \mathsf{z}_k^*$.
5. Bootstrap estimates: Compute $\hat{\omega}^*$ and $\hat{\kappa}_p^*$ by applying Algorithm 1 to $\boldsymbol{y}^*$.

In step (3), the choice of the initial building block for the resampling scheme follows [29]. We repeat the resampling and bootstrap estimation steps (1) to (5) a large number of times (say, $M$). The bootstrap standard errors of $\hat{\omega}$ and $\hat{\kappa}_p$ are given by the standard deviations of the $M$ bootstrap estimates $\hat{\omega}^*$ and $\hat{\kappa}_p^*$, respectively. A $100(1-\alpha)\%$ bootstrap confidence interval for $\omega$ is constructed from the empirical $\alpha/2$ and $(1-\alpha/2)$ quantiles of the $M$ bootstrap estimates $\hat{\omega}^*$. Similarly, a pointwise confidence interval for $\kappa_p(\cdot)$ is constructed from the empirical quantiles of the $M$ bootstrap estimates $\hat{\kappa}_p^*(\cdot)$.
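Steps (1)-(5) can be sketched as follows; Algorithm 1 is abstracted behind a hypothetical `estimate` callable, since its internals are not reproduced here:

```python
import numpy as np

def qpgp_bootstrap(y, p, omega_hat, estimate, M=1000, seed=0):
    """Model-based bootstrap following steps (1)-(5). `estimate` is a
    stand-in for Algorithm 1: any callable mapping a resampled series
    to a parameter estimate."""
    rng = np.random.default_rng(seed)
    k = len(y) // p
    blocks = np.asarray(y, dtype=float)[:k * p].reshape(k, p)
    resid = blocks[1:] - omega_hat * blocks[:-1]        # step 1: residual blocks
    estimates = []
    for _ in range(M):
        idx = rng.integers(0, k - 1, size=k - 1)        # step 2: resample with replacement
        ystar = [blocks[0]]                             # step 3: keep the initial block
        for z in resid[idx]:                            # step 4: rebuild via y_i = w y_{i-1} + z_i
            ystar.append(omega_hat * ystar[-1] + z)
        estimates.append(estimate(np.concatenate(ystar)))   # step 5: re-estimate
    estimates = np.asarray(estimates)
    se = estimates.std(axis=0, ddof=1)                  # bootstrap standard error
    ci = np.quantile(estimates, [0.025, 0.975], axis=0) # 95% percentile interval
    return se, ci
```

Because step 4 only involves scaled additions of $p$-vectors, each bootstrap replicate is generated in $\mathcal{O}(n)$ time, which is what makes $M = 1000$ replicates affordable even for the $n = 14400$ water level series.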

Suppose $\kappa_p$ is known to belong to a parametric family of periodic covariance kernels $\mathbb{K}(\boldsymbol{\theta}, \sigma^2)$ with hyper-parameters $(\boldsymbol{\theta}, \sigma^2)$. Given an estimate $\hat{\omega}$, we modify step (5) of the bootstrap procedure in this scenario: the bootstrap estimates $\hat{\omega}^*$ and $(\hat{\theta}^*, \hat{\sigma}^{2*})$, based on the resampled QPGP $\boldsymbol{y}^*$, are obtained by using the estimation strategy described in (25). As discussed above, the bootstrap standard errors and confidence intervals of the QPGP parameters $(\omega, \theta, \sigma^2)$ are obtained empirically from these bootstrap estimates.

V Simulation Study

In this section, we illustrate the numerical performance of the proposed estimation algorithm discussed in section IV. We also examine the performance of the bootstrap standard error estimates of the proposed estimator of the QPGP parameters, and demonstrate the faster evaluation of the likelihood and the prediction for the proposed QPGP data vector. We choose a standard QPGP with period $p = 10$, between-period correlation $\omega = 0.5$, and MacKay's periodic covariance kernel $\kappa_p$ as in (2) with $\theta = 1$ and $\sigma^2 = 1$. We compare the performances for sample sizes $n = 600$, $3000$, and $10000$; these are similar to the sizes of the real datasets analyzed in Section VI. An additional simulation study illustrating the performance of the proposed estimation methodology, under an identical experimental setup, for a larger periodicity $p = 100$ (similar to that of one of the chosen real datasets) is reported in the supplementary material. All experiments are performed on a standard desktop with an Intel Core i7-12700 CPU, 16 GB DDR4 RAM, and 1 TB SSD.

V-A Faster likelihood and prediction evaluation

We first compare the computational time taken to evaluate the likelihood using the expressions given in (12) and (14). We implement the fast algorithm proposed in [13] for computing the inverse and determinant of $\boldsymbol{\Sigma}_n$ given in (13). In Table I, we report the value of $\ell_n$ and the computation time in milliseconds. We observe that the value of $\ell_n$ coincides for both expressions. However, the computational time of the likelihood evaluation using expression (14) is significantly smaller than that using expression (12) for larger sample sizes.
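Expression (14) itself is not reproduced in this excerpt, but for $n = kp$ its block structure can be read off from (33) in the appendix by dropping the partial-block term. A minimal sketch under that assumption, which only ever factorises the $p \times p$ matrix $\mathcal{K}$:

```python
import numpy as np

def neg_loglik_qpgp(y, p, omega, K):
    """Negative log-likelihood of a standard QPGP with n = k*p, in the
    block form implied by the structural equations (cf. (33) with no
    partial block): an O(p^2)-per-block cost instead of O(n^2)."""
    k = len(y) // p
    blocks = np.asarray(y, dtype=float)[:k * p].reshape(k, p)
    logdetK = np.linalg.slogdet(K)[1]
    Kinv = np.linalg.inv(K)
    # marginal contribution of the initial block: Var(Y_1) = K / (1 - omega^2)
    nll = 0.5 * (logdetK - p * np.log(1.0 - omega ** 2))
    nll += 0.5 * (1.0 - omega ** 2) * blocks[0] @ Kinv @ blocks[0]
    # conditional contributions of blocks 2..k: Y_i | Y_{i-1} ~ N(omega * Y_{i-1}, K)
    z = blocks[1:] - omega * blocks[:-1]
    nll += 0.5 * (k - 1) * logdetK + 0.5 * np.sum((z @ Kinv) * z)
    return float(nll + 0.5 * k * p * np.log(2.0 * np.pi))
```

By the factorisation of the joint Gaussian density, this agrees exactly with the dense evaluation built from $\boldsymbol{\Sigma}_n$, mirroring the coincidence of $\ell_n$ values reported in Table I.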

TABLE I: Computational time in milliseconds for likelihood evaluation

| $n$ | $\ell_n$ (12) | Time (12) | $\ell_n$ (14) | Time (14) |
|---|---|---|---|---|
| 600 | $-9.61 \times 10^{1}$ | 0.04 | $-9.61 \times 10^{1}$ | 0.06 |
| 3000 | $-5.32 \times 10^{2}$ | 0.63 | $-5.32 \times 10^{2}$ | 0.09 |
| 10000 | $-1.79 \times 10^{3}$ | 4.22 | $-1.79 \times 10^{3}$ | 0.18 |

We now compare the time taken to compute the prediction using the expressions given in (27) and (28). In Table II, we report the integrated prediction squared error (IPSE), defined as $\mathrm{IPSE} \triangleq \frac{1}{n}\sum_{t=2}^{n}(\hat{Y}_t - Y_t)^2$, and the time (in milliseconds) taken to compute it using expressions (27) and (28). As with the likelihood evaluations, we observe that the IPSE values obtained from (27) and (28) match. However, the computational time of IPSE based on (27) is significantly larger than that based on (28). The substantial computational cost of IPSE using (27) for $n = 10000$ is due to the repeated evaluation of the computationally expensive inverse of $\boldsymbol{\Sigma}_{t-1}$ for large $t$. These experiments demonstrate the computational advantages of the proposed structural equation-based QPGP.

TABLE II: Computational time in milliseconds for IPSE

| $n$ | IPSE (27) | Time (27) | IPSE (28) | Time (28) |
|---|---|---|---|---|
| 600 | $1.45 \times 10^{-1}$ | $1.37 \times 10^{2}$ | $1.45 \times 10^{-1}$ | 0.54 |
| 3000 | $1.48 \times 10^{-1}$ | $2.14 \times 10^{4}$ | $1.48 \times 10^{-1}$ | 1.91 |
| 10000 | $1.50 \times 10^{-1}$ | $4.78 \times 10^{7}$ | $1.50 \times 10^{-1}$ | 24.83 |

V-B Finite sample performance of proposed estimator

We now present the finite sample performance of the proposed estimation methodology in terms of root mean squared error (RMSE) based on 1000 simulation runs, and compare it with the maximum likelihood estimates (MLE) obtained by minimizing the negative log-likelihood given in (14) via a grid search. For the MLE, we choose grids $\omega \in [0, 0.99]$, $\theta \in [0.5, 1.5]$ and $\sigma^2 \in [0.5, 1.5]$ with step size 0.01. Table III shows the RMSE of the proposed estimator and of the MLE of the QPGP parameters, along with the computational time per run in milliseconds for both. We observe that the RMSE of the proposed estimator is slightly larger than that of the MLE, while the computational time of the MLE is dramatically larger than that of the proposed estimator. We also observed that a coarser grid reduces the computational cost of the MLE, but leads to a larger RMSE than the proposed methodology. This indicates that the proposed estimation strategy has a sizable computational advantage over the grid search-based MLE while exhibiting comparable accuracy.
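The grid-search baseline can be sketched as follows, with the negative log-likelihood (14) abstracted behind a callable `nll` and the grid ranges as stated above:

```python
import numpy as np
from itertools import product

def grid_search_mle(y, nll, step=0.01):
    """Grid-search MLE baseline: exhaustively evaluate a negative
    log-likelihood callable (a stand-in for (14)) over the grids
    omega in [0, 0.99], theta in [0.5, 1.5], sigma2 in [0.5, 1.5]."""
    omegas = np.arange(0.0, 0.99 + 1e-9, step)
    thetas = np.arange(0.5, 1.5 + 1e-9, step)
    sigma2s = np.arange(0.5, 1.5 + 1e-9, step)
    best_val, best_params = np.inf, None
    for w, th, s2 in product(omegas, thetas, sigma2s):
        val = nll(y, w, th, s2)
        if val < best_val:
            best_val, best_params = val, (w, th, s2)
    return best_params, best_val
```

With step 0.01 this amounts to roughly $10^6$ likelihood evaluations per run, which is the source of the large MLE times in Table III.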

TABLE III: RMSE of the proposed estimator and the MLE based on 1000 runs, along with the respective computational cost (time per run in milliseconds)

| $n$ | $\hat{\omega}$ | $\omega_{mle}$ | $\hat{\theta}$ | $\theta_{mle}$ | $\hat{\sigma}^2$ | $\sigma^2_{mle}$ | Proposed (ms) | MLE (ms) |
|---|---|---|---|---|---|---|---|---|
| 600 | 0.0639 | 0.0366 | 0.1185 | 0.0192 | 0.1251 | 0.0961 | 2.04 | 6183.23 |
| 3000 | 0.0276 | 0.0161 | 0.0511 | 0.0089 | 0.0551 | 0.0441 | 4.08 | 39504.15 |
| 10000 | 0.0148 | 0.0088 | 0.0274 | 0.0056 | 0.0313 | 0.0267 | 5.06 | 53721.56 |

V-C Finite sample performance of bootstrap standard errors

We estimate the bootstrap standard errors of the proposed estimators $\hat{\omega}$, $\hat{\theta}$ and $\hat{\sigma}^2$ based on $M = 1000$ bootstrap samples for each run, using the resampling steps outlined in subsection IV-C. We also compute the standard errors of these estimators across 1000 independent runs.

The left column of Figure 1 shows the boxplots of bootstrap standard errors of $\hat{\omega}$ based on 1000 independent runs for sample sizes $n = 600$, $3000$ and $10000$. The decreasing spread of the boxplots as $n$ increases indicates a reduction in variability, consistent with a smaller standard deviation for larger sample sizes. The red dashed line corresponds to the standard error of $\hat{\omega}$ across simulation runs. The gap between the median bootstrap standard error and the across-run standard error shrinks as the sample size increases, and the width of the boxes also shrinks. The center and right columns of Figure 1 show the boxplots of bootstrap standard errors of $\hat{\theta}$ and $\hat{\sigma}^2$, where we observe similar patterns. This indicates that the bootstrap standard errors of the proposed estimators approximate the true standard errors reasonably well.

Figure 1: Box plots of bootstrap standard errors (computed from $M = 1000$ bootstrap samples) of $\hat{\omega}$, $\hat{\theta}$ and $\hat{\sigma}^2$, based on 1000 simulation runs of a standard QPGP with period $p = 10$, $\omega = 0.5$ and MacKay's periodic kernel with $(\theta = 1, \sigma^2 = 1)$, shown in the left, center and right panels, respectively. Each panel consists of three box plots corresponding to sample sizes $n = 600$, $3000$, and $10000$. The empirical standard error of the estimators across simulation runs is shown as a dashed horizontal red line.

VI Case Studies

In this section, we fit the proposed QPGP model to three distinct real datasets that are known to be quasi-periodic signals. Suppose the exact periodicity of a quasi-periodic signal is not known but is believed to belong to an integer set $\mathcal{P}$. In that case, we fit a standard QPGP with a general periodic kernel for each $p \in \mathcal{P}$ using Algorithm 1, and determine the periodicity $p$ that yields the smallest reduced negative log-likelihood $\tilde{\ell}_n$ given in (33) (if $n = kp$ for some positive integer $k$, then $\tilde{\ell}_n$ as given in (15) is used). To compare the performance of the fitted QPGP over different periodic covariance kernels, we consider the following choices of $\kappa_p$:

- General periodic kernel.
- MacKay's kernel given in (2).
- Periodic Matérn kernel given in (3) with $\nu = 1.5$.
- Cosine kernel given in (4) with $\iota = 1$.

We evaluate the EIPSE corresponding to all the chosen kernels and show it in Table IV. We report here the estimates of the QPGP parameters, along with their bootstrap standard errors, corresponding to the kernel that yields the smallest EIPSE among the chosen periodic kernels. The standard errors are computed based on $M = 1000$ resamples. The corresponding details of the QPGP parameter estimates for all the chosen periodic covariance kernels are reported in the supplementary material.
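The period-selection rule described at the start of this section reduces to an argmin over the candidate set $\mathcal{P}$; here `fit_reduced_nll` is a hypothetical stand-in for Algorithm 1 returning the reduced negative log-likelihood for a given $p$:

```python
def select_period(y, candidates, fit_reduced_nll):
    """Select the period p in the candidate set P that minimises the
    reduced negative log-likelihood, as done in each case study."""
    scores = {p: fit_reduced_nll(y, p) for p in candidates}
    p_best = min(scores, key=scores.get)
    return p_best, scores
```

This is the loop that recovers $p = 12$, $p = 11$, and $p = 148$ for the three datasets below.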

VI-A Carbon Dioxide Emission Signal

We consider the monthly carbon dioxide emission data, measured in ppm by the SIO (Scripps Institution of Oceanography, San Diego) air sampling network from 1958 to 2003 and publicly available at the DOE Data Explorer. The carbon dioxide emission signal exhibits an increasing trend over the years, along with quasi-periodic behavior [14]. We first adjust for the trend by fitting a quadratic regression over time using least squares, and proceed to fit the QPGP on the trend-adjusted CO2 emission signal. The dataset consists of $n = 612$ time instances with six missing entries, which we imputed by linear interpolation. The approximate periodicity of the data appears to be $p = 12$. As discussed, we fitted a general QPGP for $p \in \mathcal{P} = \{2, 3, \dots, 20\}$, and the smallest reduced negative log-likelihood $\tilde{\ell}_n$ corresponds to $p = 12$, in line with the approximate periodicity of the data.

TABLE IV: EIPSE values corresponding to various periodic covariance kernels $\kappa_p$

| Dataset | MacKay's (2) | Matérn (3) ($\nu = 1.5$) | Cosine (4) ($\iota = 1$) | General |
|---|---|---|---|---|
| CO2 Emission | 0.4759 | 0.5123 | 0.6997 | 0.4044 |
| Sunspot numbers | 37.2487 | 33.8736 | 40.6732 | 35.4992 |
| Water Level | 0.1222 | 0.0295 | 0.1156 | 0.0311 |

The top row of Table IV shows the EIPSE values for the CO2 emission data corresponding to the chosen periodic covariance kernels $\kappa_p$ with $p = 12$. The smallest EIPSE corresponds to the general covariance kernel. The estimate of $\omega$ for the general kernel is 0.9752, with bootstrap standard error 0.0085 and 95% confidence interval $(0.9705, 1.0038)$. Figure 2 plots the estimates of the general covariance kernel against lag as a solid black line, along with 95% confidence limits as dashed lines; we observe that the estimated $\kappa_p(\cdot)$ values lie within the 95% confidence limits. Figure 3 shows the detrended CO2 emission data as a black solid line and the fitted QPGP for the general kernel as a dashed red line. Using the Gaussianity of the predicted QPGP (see (28)), a 95% prediction interval for the fitted QPGP is given by $\bigl(\hat{Y}_t - 1.96\sqrt{\mathrm{Var}(\hat{Y}_t)},\ \hat{Y}_t + 1.96\sqrt{\mathrm{Var}(\hat{Y}_t)}\bigr)$. An estimate of the variance of the predicted QPGP is obtained by plugging the estimates $\hat{\omega}$ and $\hat{\kappa}_p(\cdot)$ into the RHS of (30). The grey-shaded region in Figure 3 represents the estimated 95% prediction interval obtained using the plug-in estimates of the parameters. We observe that the proposed QPGP fits the data reasonably well.
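The plug-in prediction band used in Figures 3-5 can be sketched in a few lines, following the Gaussian form quoted above:

```python
import numpy as np

def prediction_interval(yhat, var_yhat, z=1.96):
    """Pointwise 95% prediction band around the fitted QPGP:
    yhat_t +/- 1.96 * sqrt(Var(yhat_t)), with Var(yhat_t) estimated
    by plugging omega-hat and kappa-hat into (30)."""
    yhat = np.asarray(yhat, dtype=float)
    half = z * np.sqrt(np.asarray(var_yhat, dtype=float))
    return yhat - half, yhat + half
```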

Figure 2: Estimates of the general $\kappa_p(\cdot)$ against lag (black solid line) for the CO2 dataset, along with the 95% bootstrap confidence limits (grey dashed lines).

Figure 3: Detrended carbon dioxide emission levels vs. year (black solid line), together with the fitted standard QPGP with $p = 12$ and general kernel (dashed red line). The grey-shaded region corresponds to the estimated 95% prediction intervals using plug-in estimates.

VI-B Sunspot Numbers Data

Sunspot numbers observed over the years appear to exhibit a quasi-periodic pattern. The underlying solar magnetic dynamo, which involves nonlinear and chaotic processes, is reflected in the irregularities in the timing and intensity of sunspot cycles; turbulence in the solar plasma also perturbs the periodic pattern of the sunspot numbers ([30]). We consider the yearly sunspot numbers from 1703 to 2025, a dataset of $n = 322$ samples publicly available in the SILSO database. [13] fitted the QPGP model corresponding to MacKay's kernel to the sunspot data. It is well known that the approximate periodicity of sunspot numbers is 11 years [31]. As indicated earlier, we fitted a standard QPGP with the general covariance kernel to the sunspot data for $p \in \mathcal{P} = \{2, 3, \dots, 20\}$. The smallest reduced negative log-likelihood $\tilde{\ell}_n$ corresponds to $p = 11$, in tune with the well-known approximate periodicity of the sunspot numbers.

The middle row of Table IV shows the EIPSE values corresponding to the chosen periodic covariance kernels with $p = 11$. The smallest EIPSE corresponds to the periodic Matérn kernel, with the general kernel a close second. Table V shows the estimates of the fitted QPGP parameters (with $p = 11$ and $\kappa_p$ the periodic Matérn kernel), along with bootstrap standard errors and 95% confidence intervals. Note that the confidence intervals of the QPGP parameters do not include 0, indicating that the estimates are statistically significant. Figure 4 plots the sunspot numbers over the years as a black solid line, along with the fitted QPGP with $p = 11$ and Matérn kernel as a dashed red line. The grey-shaded region in Figure 4 represents the estimated 95% prediction interval obtained using the plug-in estimates of the parameters. We observe that the proposed QPGP fits the sunspot numbers well for the most part, except for a few years when the signal appears to be relatively weak.

TABLE V: Estimates of QPGP parameters corresponding to $p = 11$ and the periodic Matérn kernel ($\nu = 1.5$) for the sunspot data

| Parameter | Estimate | Standard Error | Confidence Interval |
|---|---|---|---|
| $\omega$ | 0.7228 | 0.06375 | (0.5861, 0.8429) |
| $\sigma^2$ | 2568.1523 | 563.3239 | (1439.5634, 3699.3789) |
| $\theta$ | 0.7599 | 0.0814 | (0.5436, 0.8539) |

Figure 4: Sunspot numbers vs. year (black solid line), together with the fitted standard QPGP with period $p = 11$ and Matérn kernel (dashed red line). The grey-shaded region corresponds to the estimated 95% prediction intervals using plug-in estimates.

VI-C Water Level Signal

The water level at a specific sea location depends on complex climate phenomena and appears quasi-periodic due to the tide. Since tides are driven by multiple natural cycles arising from the gravitational pull of the moon and the sun, these tidal cycles are a source of periodicity in the water level. Local weather conditions, sea level, and the bathymetry of the location also affect the tidal measurements, adding to the periodic variation in water levels (see [32]). We consider the water levels dataset, recorded at a uniform time interval by an automatic tide gauge on Mornington Island in Queensland and publicly available at the Queensland Government's open data portal. This dataset contains water level records measured from January 1, 2016 00:00 Hrs to April 9, 2016 23:50 Hrs at a uniform time interval of 10 minutes, giving $n = 14400$ observations. We choose this high-frequency quasi-periodic water level dataset to illustrate that the proposed QPGP model can be fitted efficiently to long quasi-periodic signals with large sample counts, while still providing uncertainty quantification for the estimated parameters.

The approximate periodicity of the water level appears to be 24 hours. We therefore fitted a standard QPGP with the general kernel for $p \in \mathcal{P} = \{132, 133, \dots, 156\}$ (between 22 and 26 hours) to determine $p$. The smallest reduced negative log-likelihood corresponds to $p = 148$ (24 hours and 40 minutes). As observed for the CO2 emission and sunspot numbers data, the described method of determining $p$ for the water level data is also in tune with conventional observations.

The bottom row of Table IV shows the EIPSE values for the water level data corresponding to the chosen periodic covariance kernels $\kappa_p$ with $p = 148$. The smallest EIPSE corresponds to the periodic Matérn kernel; however, the EIPSE value for the general kernel is very close to it.

Table VI shows the estimates of the fitted QPGP parameters (with $p = 148$ and $\kappa_p$ the periodic Matérn kernel), along with bootstrap standard errors and 95% confidence intervals. The confidence intervals of the QPGP parameters exclude 0, which shows that the estimates of the parameters are statistically significant. Figure 5 plots the water level as a black solid line, along with the fitted QPGP (with $p = 148$ and $\kappa_p$ the Matérn kernel) as a dashed red line. The grey-shaded region represents the estimated 95% prediction interval obtained using the plug-in estimates of the parameters. We observe that the fitted QPGP is very close to the data.

TABLE VI: Estimates of QPGP parameters corresponding to $p = 148$ and the periodic Matérn kernel ($\nu = 1.5$) for the water level data

| Parameter | Estimate | Standard Error | Confidence Interval |
|---|---|---|---|
| $\omega$ | 0.9673 | 0.0102 | (0.9432, 0.9824) |
| $\sigma^2$ | 0.0358 | 0.0872 | (0.0237, 0.3508) |
| $\theta$ | 0.8338 | 0.4892 | (0.5649, 2.1742) |

Figure 5: Water level against days (black solid line), together with the fitted standard QPGP (with $p = 148$ and $\kappa_p$ the periodic Matérn kernel) as a dashed red line. The grey-shaded region corresponds to the estimated 95% prediction intervals using plug-in estimates.

VII Conclusion

In this article, we develop a novel family of quasi-periodic Gaussian processes using a system of structural equations that provides a flexible framework for modeling the within-period correlation of the QPGP. We show that the structural equations simplify the likelihood function, making it more computationally efficient to evaluate than the prior work on rapid likelihood evaluation [13]. Importantly, the proposed approach generalises to a broad class of within-period kernels, both parametric and non-parametric.

Given a data vector of the proposed QPGP, maximum likelihood estimation of the QPGP parameters requires optimization of the likelihood function over the family of general periodic covariance kernels, which is a non-convex set. We address this issue by developing a two-stage fast estimation algorithm based on a reduced likelihood function. We establish that the proposed estimation strategy provides statistically consistent estimators, and show numerically that its accuracy is comparable to that of the maximum likelihood estimator while being computationally faster than prior work [13]. Further, the structural equations significantly reduce the computational cost of the best linear prediction of the QPGP. They also enable rapid generation from the QPGP, a fact we exploit to construct a bootstrap methodology for estimating the standard errors of the proposed estimators of the QPGP parameters. The technique of utilizing partial block information in the proposed estimation strategy, as discussed in the Appendix, can be extended to the analysis of missing data in the QPGP in future research.

The general choice of periodic kernel yields a reasonable QPGP fit for all the chosen datasets. The QPGP with a parametric periodic kernel, in particular the periodic Matérn kernel, competes with the general periodic kernel and exhibits superior performance on two datasets. This highlights an advantage of the proposed QPGP, which offers a flexible choice of periodic covariance kernels. The rapid generation of the proposed QPGP and the computationally efficient parameter estimation methodology enable uncertainty quantification of the estimates even for a long quasi-periodic signal such as the water level signal. The fitting of the proposed QPGP to these different types of quasi-periodic signals underscores the broad applicability and efficacy of our QPGP.

References [1] ↑ W. Fan, Y. Li, K. L. Tsui, and Q. Zhou, β€œA noise resistant correlation method for period detection of noisy signals,” IEEE Transactions on Signal Processing, vol. 66, no. 10, pp. 2700–2710, 2018. [2] ↑ Y. Li, H. Zhao, W. Fan, and C. Shen, β€œGeneralized autocorrelation method for fault detection under varying-speed working conditions,” IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1–11, 2021. [3] ↑ L. Rabiner, β€œOn the use of autocorrelation analysis for pitch detection,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, no. 1, pp. 24–33, 1977. [4] ↑ B. Quinn and P. Thomson, β€œEstimating the frequency of a periodic function,” Biometrika, vol. 78, no. 1, pp. 65–74, 1991. [5] ↑ J. K. Nielsen, T. L. Jensen, J. R. Jensen, M. G. Christensen, and S. H. Jensen, β€œFast fundamental frequency estimation: Making a statistically efficient estimator computationally efficient,” Signal Processing, vol. 135, pp. 188–197, 2017. [6] ↑ I. V. L. Clarkson, β€œApproximate maximum-likelihood period estimation from sparse, noisy timing data,” IEEE Transactions on Signal Processing, vol. 56, no. 5, pp. 1779–1787, 2008. [7] ↑ J. Wise, J. Caprio, and T. Parks, β€œMaximum likelihood pitch estimation,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, no. 5, pp. 418–423, 1976. [8] ↑ D. J. C. MacKay, β€œIntroduction to gaussian processes,” NATO ASI Series F Computer and Systems Sciences, vol. 168, pp. 133–166, 1998. [9] ↑ Y. Li, Y. Pu, C. Cheng, and Q. Xiao, β€œA scalable gaussian process for large-scale periodic data,” Technometrics, vol. 65, no. 3, pp. 363–374, 2023. [10] ↑ N. HajiGhassemi and M. Deisenroth, β€œAnalytic long-term forecasting with periodic gaussian processes,” in Artificial Intelligence and Statistics, pp. 303–311, PMLR, 2014. [11] ↑ H. Cai, Z. Cen, L. Leng, and R. Song, β€œPeriodic-gp: Learning periodic world with gaussian process bandits,” CoRR, vol. abs/2105.14422, 2021. 
[12] ↑ Y. Li, W. Zhang, M. H. Y. Tan, and P. Chien, β€œSequential decomposition of multiple seasonal components using spectrum-regularized periodic gaussian process,” IEEE Transactions on Signal Processing, vol. 73, pp. 1034–1047, 2025. [13] ↑ Y. Li, Y. Zhang, Q. Xiao, and J. Wu, β€œQuasi-periodic gaussian process modeling of pseudo-periodic signals,” IEEE Transactions on Signal Processing, vol. 71, pp. 3548–3561, 2023. [14] ↑ C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning.Cambridge, MA: The MIT Press, 2006. [15] ↑ V. Chandola and R. R. Vatsavai, β€œA gaussian process based online change detection algorithm for monitoring periodic time series,” in Proceedings of the SIAM International Conference on Data Mining, (Philadelphia, PA, USA), pp. 95–106, SIAM, 2011. [16] ↑ B. A. Nicholson and S. Aigrain, β€œQuasi-periodic Gaussian processes for stellar activity: From physical to kernel parameters,” Monthly Notices of the Royal Astronomical Society, vol. 515, pp. 5251–5266, 07 2022. [17] ↑ Z. Zhang, P. Brown, and J. Stafford, β€œEfficient modeling of quasi-periodic data with seasonal gaussian process,” Statistics and Computing, vol. 35, p. 32, 2025. [18] ↑ U. Nigam, R. Srivastava, M. Burke, and F. Marzbanrad, β€œA dynamical equation approach for quasi-periodic gaussian processes,” in 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5, 2025. [19] ↑ M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions.New York: Dover Publications, 1965. [20] ↑ M. L. Stein, Interpolation of Spatial Data: Some Theory for Kriging.New York: Springer, 1999. [21] ↑ M. Perger, G. Anglada-EscudΓ©, I. Ribas, A. Rosich, E. Herrero, and J. C. Morales, β€œAuto-correlation functions of astrophysical processes, and their relation to gaussian processes - application to radial velocities of different starspot configurations,” A&A, vol. 645, p. A58, 2021. [22] ↑ D. Foreman-Mackey, E. Agol, S. Ambikasaran, and R. 
Angus, β€œFast and scalable gaussian process modeling with applications to astronomical time series,” The Astronomical Journal, vol. 154, no. 6, p. 220, 2017. [23] ↑ M. Kac, W. Murdock, and G. SzegΓΆ, β€œOn the eigenvalues of certain hermitian forms,” Indiana University Mathematics Journal, vol. 2, no. 4, pp. 767–800, 1953. [24] ↑ A. Aubry, P. Babu, A. De Maio, and M. Rosamilia, β€œAdvanced methods for mle of toeplitz structured covariance matrices with applications to radar problems,” IEEE Transactions on Information Theory, vol. 70, no. 12, pp. 9277–9292, 2024. [25] ↑ K. B. Petersen and M. S. Pedersen, β€œThe matrix cookbook,” 2012.Version 20121115. [26] ↑ D. P. Bertsekas, Nonlinear programming.Athena Scientific Optimization and Computation Series, Athena Scientific, Belmont, MA, 2nd ed., 1999. [27] ↑ P. Hall, N. I. Fisher, and B. Hoffmann, β€œOn the nonparametric estimation of covariance functions,” Ann. Statist., vol. 22, no. 4, pp. 2115–2134, 1994. [28] ↑ P. J. Brockwell and R. A. Davis, Time Series: Theory and Methods.New York: Springer, 2nd ed., 2016. [29] ↑ L. A. Thombs and W. R. Schucany, β€œBootstrap prediction intervals for autoregression,” Journal of the American Statistical Association, vol. 85, no. 410, pp. 486–492, 1990. [30] ↑ P. Frick, D. Sokoloff, R. Stepanov, V. Pipin, and I. Usoskin, β€œSpectral characteristic of mid-term quasi-periodicities in sunspot data,” Monthly Notices of the Royal Astronomical Society, vol. 491, pp. 5572–5578, 12 2019. [31] ↑ A. Balogh, H. Hudson, K. Petrovay, et al., β€œIntroduction to the solar activity cycle: Overview of causes and consequences,” Space Science Reviews, vol. 186, pp. 1–15, Dec 2014. [32] ↑ I. D. Haigh, M. D. Pickering, J. A. M. Green, B. K. Arbic, A. Arns, S. Dangendorf, D. F. Hill, K. Horsburgh, T. Howard, D. Idier, D. A. Jay, L. JΓ€nicke, S. B. Lee, M. MΓΌller, M. Schindelegger, S. A. Talke, S. B. Wilmes, and P. L. 
Woodworth, β€œThe tides they are a-changin’: A comprehensive review of past and future nonastronomical changes in tides, their driving mechanisms, and future implications,” Reviews of Geophysics, vol. 58, no. 1, p. e2018RG000636, 2020. [33] ↑ S. R. Searle and M. H. J. Gruber, Linear Models. Wiley Series in Probability and Statistics, John Wiley & Sons, Inc., Hoboken, NJ, 2nd ed., 2017.

Appendix A: Updated estimation strategy in the presence of a partial periodic block in the data

In this appendix, we consider the QPGP data vector $\boldsymbol{y} = [y_1, y_2, \dots, y_n]^\top$ where $n = kp + l$ for some $l \in \{1, 2, \dots, p-1\}$, i.e., the data vector consists of complete observations of the periodic blocks $\mathsf{y}_i$ for $i = 1, 2, \dots, k$ and a partial observation of the $(k+1)$th block, $\mathsf{y}_{k+1}^{(l)}$. For the QPGP data vector $\boldsymbol{y}$, the negative logarithm of the likelihood function is given as follows:

$$\begin{aligned} \ell_n(\omega, \kappa_p) ={}& \underbrace{\frac{1}{2}\log|\mathcal{K}_l| + \frac{1}{2}\bigl(\mathsf{y}_{k+1}^{(l)} - \omega\,\mathsf{y}_{k}^{(l)}\bigr)^\top \mathcal{K}_l^{-1} \bigl(\mathsf{y}_{k+1}^{(l)} - \omega\,\mathsf{y}_{k}^{(l)}\bigr)}_{\text{contribution of partial block } \mathsf{Y}_{k+1}^{(l)}} \\ &+ \underbrace{\frac{k-1}{2}\log|\mathcal{K}| + \frac{1}{2}\sum_{i=1}^{k-1} \bigl(\mathsf{y}_{k-i+1} - \omega\,\mathsf{y}_{k-i}\bigr)^\top \mathcal{K}^{-1} \bigl(\mathsf{y}_{k-i+1} - \omega\,\mathsf{y}_{k-i}\bigr)}_{\text{contribution of blocks } \mathsf{Y}_k, \dots, \mathsf{Y}_1} \\ &+ \underbrace{\frac{1}{2}\log\Bigl|\tfrac{1}{1-\omega^2}\,\mathcal{K}\Bigr| + \frac{1}{2}\,\mathsf{y}_1^\top \Bigl(\tfrac{1}{1-\omega^2}\,\mathcal{K}\Bigr)^{-1} \mathsf{y}_1}_{\text{marginal contribution of block } \mathsf{Y}_1} + c, \end{aligned} \tag{33}$$

where $c = \frac{n}{2}\log(2\pi)$. As discussed in section IV, we ignore the marginal contribution of the initial periodic block $\mathsf{Y}_1$ in the likelihood (33) in our estimation strategy. Similar to (15), the scaled negative logarithm of the reduced likelihood function is updated as follows.

β„“ ~ 𝑛 ​ ( πœ” , πœ… 𝑝 )

1 π‘˜ βˆ’ 1 ​ log ⁑ | π“š 𝑙 |

+ 1 π‘˜ βˆ’ 1 ​ ( π˜† π‘˜ + 1 ( 𝑙 ) βˆ’ πœ” ​ π˜† π‘˜ ( 𝑙 ) ) ⊀ ​ π“š 𝑙 βˆ’ 1 ​ ( π˜† π‘˜ + 1 ( 𝑙 ) βˆ’ πœ” ​ π˜† π‘˜ ( 𝑙 ) )

+ log ⁑ ( | π“š | )

+ 1 π‘˜ βˆ’ 1 ​ βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 ( π˜† π‘˜ βˆ’ 𝑖 + 1 βˆ’ πœ” ​ π˜† π‘˜ βˆ’ 𝑖 ) ⊀ ​ π“š βˆ’ 1 ​ ( π˜† π‘˜ βˆ’ 𝑖 + 1 βˆ’ πœ” ​ π˜† π‘˜ βˆ’ 𝑖 ) .

(34)

Note that, by using the partition and transformation

π“š

β‰œ

[ π“š 𝑙

π“š 𝑙 , 𝑝 βˆ’ 𝑙

π“š 𝑝 βˆ’ 𝑙 , 𝑙

π“š 𝑝 βˆ’ 𝑙 ] ,

(37)

π“š ( 𝑝 βˆ’ 𝑙 ) . 𝑙

β‰œ

π“š 𝑝 βˆ’ 𝑙 βˆ’ π“š 𝑝 βˆ’ 𝑙 , 𝑙 ​ π“š 𝑙 βˆ’ 1 ​ π“š 𝑙 , 𝑝 βˆ’ 𝑙 ,

(38)

we have

| π“š |

| π“š 𝑙 | Γ— | π“š ( 𝑝 βˆ’ 𝑙 ) . 𝑙 | .

(39)
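The factorization (39) is the standard Schur-complement determinant identity, which is what lets the partial-block likelihood reuse the factors of $|\mathcal{K}|$. A minimal numerical check in Python (NumPy), with a hypothetical $4 \times 4$ covariance and split $l = 1$ (the matrix and split are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
p, l = 4, 1
A = rng.standard_normal((p, p))
K = A @ A.T + p * np.eye(p)          # hypothetical symmetric positive definite K

K_l = K[:l, :l]                      # top-left l x l block
K_l_pl = K[:l, l:]                   # K_{l, p-l}
K_pl_l = K[l:, :l]                   # K_{p-l, l}
K_pl = K[l:, l:]                     # K_{p-l}

# Schur complement K_{(p-l).l} from (38)
schur = K_pl - K_pl_l @ np.linalg.inv(K_l) @ K_l_pl

# factorization (39): |K| = |K_l| * |K_{(p-l).l}|
assert np.isclose(np.linalg.det(K), np.linalg.det(K_l) * np.linalg.det(schur))
```

The same identity underlies the block inversion used in (46), so both the determinant and the quadratic form in (34) can be evaluated from the partitioned factors.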

By using the transformation and partition

$$
\mathsf{Z}_i(\omega) = \mathsf{y}_i - \omega \mathsf{y}_{i-1} \quad \text{for } i = 2, \ldots, k,
\tag{40}
$$

$$
\mathsf{Z}_i(\omega) =
\begin{bmatrix}
\mathsf{Z}_i^{(l)}(\omega) \\
\mathsf{Z}_i^{(p-l)}(\omega)
\end{bmatrix}
=
\begin{bmatrix}
\mathsf{y}_i^{(l)} - \omega \mathsf{y}_{i-1}^{(l)} \\
\mathsf{y}_i^{(p-l)} - \omega \mathsf{y}_{i-1}^{(p-l)}
\end{bmatrix},
\tag{45}
$$

and some algebra, the quadratic summand of the fourth term on the RHS of (34) simplifies as follows:

$$
\begin{aligned}
\bigl(\mathsf{y}_{k-i+1} - \omega \mathsf{y}_{k-i}\bigr)^{\!\top}\mathcal{K}^{-1}\bigl(\mathsf{y}_{k-i+1} - \omega \mathsf{y}_{k-i}\bigr)
={}& \mathsf{Z}_{k-i+1}^{(l)}(\omega)^{\top}\mathcal{K}_l^{-1}\mathsf{Z}_{k-i+1}^{(l)}(\omega) \\
&+ \bigl(\mathsf{Z}_{k-i+1}^{(p-l)}(\omega) - \mathcal{K}_{p-l,l}\mathcal{K}_l^{-1}\mathsf{Z}_{k-i+1}^{(l)}(\omega)\bigr)^{\!\top}\mathcal{K}_{(p-l).l}^{-1} \\
&\quad\times \bigl(\mathsf{Z}_{k-i+1}^{(p-l)}(\omega) - \mathcal{K}_{p-l,l}\mathcal{K}_l^{-1}\mathsf{Z}_{k-i+1}^{(l)}(\omega)\bigr).
\end{aligned}
\tag{46}
$$

By using (39) and (46), $\tilde{\ell}_n$, given in (34), can be viewed as a twice differentiable function of $\omega$, $\mathcal{K}_l$, $\mathcal{K}_{p-l,l}\mathcal{K}_l^{-1}$ and $\mathcal{K}_{(p-l).l}$. Setting the first derivatives of $\tilde{\ell}_n$ with respect to $\omega$, $\mathcal{K}_l$, $\mathcal{K}_{p-l,l}\mathcal{K}_l^{-1}$ and $\mathcal{K}_{(p-l).l}$ equal to $0$, the stationary points of $\tilde{\ell}_n$ satisfy the following relations:

πœ”

βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 π˜† π‘˜ βˆ’ 𝑖 ⊀ ​ π“š βˆ’ 1 ​ π˜† π‘˜ βˆ’ 𝑖 + 1 + π˜† π‘˜ ( 𝑙 ) ⊀ ​ π“š 𝑙 βˆ’ 1 ​ π˜† π‘˜ + 1 ( 𝑙 ) βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 π˜† π‘˜ βˆ’ 𝑖 ⊀ ​ π“š βˆ’ 1 ​ π˜† π‘˜ βˆ’ 𝑖 + π˜† π‘˜ + 1 ( 𝑙 ) ⊀ ​ π“š 𝑙 βˆ’ 1 ​ π˜† π‘˜ + 1 ( 𝑙 )

(47)

π“š 𝑙

1 π‘˜ ​ βˆ‘ 𝑖

0 π‘˜ βˆ’ 1 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ​ 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ⊀

(48)

π“š 𝑝 βˆ’ 𝑙 , 𝑙

( 1 π‘˜ βˆ’ 1 ​ βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑝 βˆ’ 𝑙 ) ​ ( πœ” ) ​ 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ⊀ )

Γ— ( 1 π‘˜ βˆ’ 1 ​ βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ​ 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ⊀ ) βˆ’ 1 ​ π“š 𝑙

(49)

π“š ( 𝑝 βˆ’ 𝑙 ) . 𝑙

1 π‘˜ βˆ’ 1 ​ βˆ‘ 𝑖

1 π‘˜ βˆ’ 1 ( 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑝 βˆ’ 𝑙 ) ​ ( πœ” ) βˆ’ π“š 𝑝 βˆ’ 𝑙 , 𝑙 ​ π“š 𝑙 βˆ’ 1 ​ 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) )

Γ— ( 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑝 βˆ’ 𝑙 ) ​ ( πœ” ) βˆ’ π“š 𝑝 βˆ’ 𝑙 , 𝑙 ​ π“š 𝑙 βˆ’ 1 ​ 𝖅 π‘˜ βˆ’ 𝑖 + 1 ( 𝑙 ) ​ ( πœ” ) ) ⊀ .

(50)

Given three out of four variables among πœ” , π“š 𝑙 , π“š 𝑝 βˆ’ 𝑙 , 𝑙 ​ π“š 𝑙 βˆ’ 1 and π“š ( 𝑝 βˆ’ 𝑙 ) . 𝑙 , the second derivative of β„“ ~ 𝑛 is positive definite at the stationary points. Therefore, by using a similar argument as in subsection IV-A, iterative alternate minimization algorithm converges.

We now describe the modified iterative steps of stage I of Algorithm 1, which accommodate the observed partial block $\mathsf{y}_{k+1}^{(l)}$ and yield $\tilde{\omega}_n$ and $\tilde{\mathcal{K}}_n$. Given $\tilde{\mathcal{K}}^{(m-1)}$, with initial value $\tilde{\mathcal{K}}^{(0)} = \boldsymbol{I}_p$, for $m \geq 1$ we set

$$
\tilde{\omega}^{(m)} = \frac{\sum_{i=1}^{k-1} \mathsf{y}_{k-i}^{\top}\bigl(\tilde{\mathcal{K}}^{(m-1)}\bigr)^{-1}\mathsf{y}_{k-i+1} + \mathsf{y}_{k}^{(l)\top}\bigl(\tilde{\mathcal{K}}_l^{(m-1)}\bigr)^{-1}\mathsf{y}_{k+1}^{(l)}}{\sum_{i=1}^{k-1} \mathsf{y}_{k-i}^{\top}\bigl(\tilde{\mathcal{K}}^{(m-1)}\bigr)^{-1}\mathsf{y}_{k-i} + \mathsf{y}_{k}^{(l)\top}\bigl(\tilde{\mathcal{K}}_l^{(m-1)}\bigr)^{-1}\mathsf{y}_{k}^{(l)}}.
$$

Further, given $\tilde{\omega}^{(m)}$ for $m \geq 1$, we first replace $\omega$ by $\tilde{\omega}^{(m)}$ in equations (40) and (45) to get $\mathsf{Z}_{\cdot}^{(l)}(\tilde{\omega}^{(m)})$ and $\mathsf{Z}_{\cdot}^{(p-l)}(\tilde{\omega}^{(m)})$. Subsequently, we replace $\mathsf{Z}_{\cdot}^{(l)}(\omega)$ by $\mathsf{Z}_{\cdot}^{(l)}(\tilde{\omega}^{(m)})$ and $\mathsf{Z}_{\cdot}^{(p-l)}(\omega)$ by $\mathsf{Z}_{\cdot}^{(p-l)}(\tilde{\omega}^{(m)})$ on the RHS of (48), (49) and (50), and set the results as $\tilde{\mathcal{K}}_l^{(m)}$, $\tilde{\mathcal{K}}_{p-l,l}^{(m)}$ and $\tilde{\mathcal{K}}_{(p-l).l}^{(m)}$, respectively. Now, by using (38) and (37), we set

$$
\tilde{\mathcal{K}}_{p-l}^{(m)} \triangleq \tilde{\mathcal{K}}_{(p-l).l}^{(m)} + \tilde{\mathcal{K}}_{p-l,l}^{(m)}\bigl(\tilde{\mathcal{K}}_l^{(m)}\bigr)^{-1}\tilde{\mathcal{K}}_{l,p-l}^{(m)},
\qquad
\tilde{\mathcal{K}}^{(m)} \triangleq
\begin{bmatrix}
\tilde{\mathcal{K}}_l^{(m)} & \tilde{\mathcal{K}}_{l,p-l}^{(m)} \\
\tilde{\mathcal{K}}_{p-l,l}^{(m)} & \tilde{\mathcal{K}}_{p-l}^{(m)}
\end{bmatrix}.
$$

As described in stage I of Algorithm 1, we terminate these iterative steps when every component of the first derivative of $\tilde{\ell}_n$ evaluated at $(\tilde{\omega}^{(m)}, \tilde{\mathcal{K}}^{(m)})$ is simultaneously below a prespecified threshold $\delta$. Since the steps of stage II of Algorithm 1 are based only on $\tilde{\omega}_n$ and $\tilde{\mathcal{K}}_n$, stage II is implemented exactly as in Algorithm 1 to obtain $\hat{\omega}$ and $\hat{\kappa}_p$.

Since the contribution of the partial periodic block $\mathsf{Y}_{k+1}^{(l)}$ to the likelihood (33) is bounded, the consistency of $\hat{\omega}$ and $\hat{\kappa}_p$ derived in Theorem 2 continues to hold in the presence of partial periodic block data. If substantial partial-block information is available for inference, we recommend implementing the modified stage I of Algorithm 1 rather than ignoring this information.
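As a rough illustration of the stage-I alternation (our own sketch, not the authors' implementation), the following Python snippet alternates the $\omega$-step and $\mathcal{K}$-step on complete blocks only; the partial-block correction terms of (47)–(48) are omitted for brevity, and the kernel and all names are assumptions of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, omega0 = 5, 400, 0.5
K0 = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # assumed block kernel
L = np.linalg.cholesky(K0)

# generate complete periodic blocks via the structural equation Y_{i+1} = w Y_i + Z_{i+1}
Y = np.zeros((k, p))
Y[0] = L @ rng.standard_normal(p) / np.sqrt(1 - omega0**2)         # stationary start
for i in range(1, k):
    Y[i] = omega0 * Y[i - 1] + L @ rng.standard_normal(p)

# stage-I alternation: omega-step and K-step, complete blocks only
K = np.eye(p)
w = 0.0
for _ in range(25):
    Kinv = np.linalg.inv(K)
    num = sum(Y[i] @ Kinv @ Y[i + 1] for i in range(k - 1))
    den = sum(Y[i] @ Kinv @ Y[i] for i in range(k - 1))
    w = num / den                     # generalized least-squares update for omega
    Z = Y[1:] - w * Y[:-1]            # residual blocks Z_i(w)
    K = Z.T @ Z / (k - 1)             # sample covariance of the residual blocks
```

With $k$ large, `w` settles near the true $\omega_0$ and `K` near the block kernel; the full algorithm additionally folds in the partial-block terms exactly as written above.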

Supplementary Material

In this supplementary material, we provide the proofs of all theoretical results of the main manuscript in Section S1. In Section S2, we provide an additional simulation study illustrating the performance of the proposed estimation methodology for the QPGP parameters at a larger value of the periodicity, $p = 100$, as indicated in Section V of the main manuscript. Further, in Section S3, we provide the details of the estimates of the QPGP parameters for the different periodic covariance kernels listed in Section VI of the main manuscript.

Appendix S1: Proof of theoretical results

Proof of Theorem 1: For $t \in \mathbb{Z}^{+}$, define $i(t) \triangleq \lfloor t/p \rfloor$. Note that $i(t) \in \mathbb{Z}^{+} \cup \{0\}$. Since $t = i(t)p + l(t)$, $Y_t$ belongs to the $(i(t)+1)$th periodic block (i.e., $\mathsf{Y}_{i(t)+1}$) at its $l(t)$th coordinate. Similarly, $s = i(s)p + l(s)$ and $Y_s \in \mathsf{Y}_{i(s)+1}$.

For $s \leq t$, $i(t)$ and $i(s)$ must satisfy one of the following two scenarios: (a) $i(s) < i(t)$; (b) $i(s) = i(t)$ with $l(s) \leq l(t)$. Note that by using the recursive equation (7) repeatedly, we have

$$
Y_t =
\begin{cases}
\displaystyle\sum_{m=0}^{i(t)-i(s)-1} \omega^m \mathsf{Z}_{i(t)+1-m,\,l(t)} + \omega^{i(t)-i(s)}\, Y_{i(s)p+l(t)} & \text{if } i(t) > i(s), \\[2ex]
Y_{i(s)p+l(t)} & \text{if } i(t) = i(s),
\end{cases}
\tag{S.1}
$$

where $\mathsf{Z}_i \triangleq [\mathsf{Z}_{i,1}, \ldots, \mathsf{Z}_{i,p}]^{\top}$. By using the independence of the periodic block $\mathsf{Y}_{i(s)+1}$ and the periodic building blocks $(\mathsf{Z}_{i(s)+2}, \mathsf{Z}_{i(s)+3}, \ldots, \mathsf{Z}_{i(t)+1})$, together with (S.1), we have

$$
E(Y_t Y_s) =
\begin{cases}
\omega^{i(t)-i(s)}\, E\bigl(Y_{i(s)p+l(t)}\, Y_{i(s)p+l(s)}\bigr) & \text{if } i(t) > i(s), \\[1ex]
E\bigl(Y_{i(s)p+l(t)}\, Y_{i(s)p+l(s)}\bigr) & \text{if } i(t) = i(s).
\end{cases}
\tag{S.2}
$$

Again by using (7) repeatedly, for $l \in \{1, 2, \ldots, p\}$, we have

$$
Y_{i(s)p+l} =
\begin{cases}
\displaystyle\sum_{m=0}^{i(s)-1} \omega^m \mathsf{Z}_{i(s)+1-m,\,l} + \omega^{i(s)}\, Y_l & \text{if } i(s) \geq 1, \\[2ex]
Y_l & \text{if } i(s) = 0.
\end{cases}
\tag{S.3}
$$

Now, by using the independence of the initial periodic block $\mathsf{Y}_1$ and the building blocks $\{\mathsf{Z}_i\}_{i > 1}$, together with (S.3), we have

$$
\begin{aligned}
E\bigl(Y_{i(s)p+l(t)}\, Y_{i(s)p+l(s)}\bigr)
&= E\Bigl(\sum_{m=0}^{i(s)-1} \omega^m \mathsf{Z}_{i(s)+1-m,\,l(t)} \cdot \sum_{m=0}^{i(s)-1} \omega^m \mathsf{Z}_{i(s)+1-m,\,l(s)}\Bigr) + \omega^{2i(s)}\, E\bigl(Y_{l(t)} Y_{l(s)}\bigr) \\
&= \Bigl[\frac{1-\omega^{2i(s)}}{1-\omega^2}\Bigr]\kappa_p\bigl(l(t)-l(s)\bigr) + \omega^{2i(s)}\, E\bigl(Y_{l(t)} Y_{l(s)}\bigr).
\end{aligned}
\tag{S.4}
$$

The proof is completed by plugging the expression on the RHS of (S.4) into that of (S.2) and using the periodicity of the covariance kernel $\kappa_p$. ∎

Proof of Proposition 1: The proof follows by plugging the expression $E\bigl(Y_{l(t)} Y_{l(s)}\bigr) = \frac{\kappa_p(l(t)-l(s))}{1-\omega^2}$ into the RHS of (S.4). ∎

Proof of Theorem 2 (part (1)): Since $\boldsymbol{Y}_n = [Y_1, Y_2, \ldots, Y_n]$ is a QPGP vector with period $p$ and parameters $\omega_0$ and $\kappa_{0p}$, by using (7) and the identity

$$
\mathsf{Y}_{i+1} - \omega \mathsf{Y}_i = \mathsf{Z}_{i+1} - (\omega - \omega_0)\mathsf{Y}_i
$$

for all $i = 1, 2, \ldots, k-1$, the reduced likelihood function defined in (15) can be expressed as

$$
\begin{aligned}
\tilde{\ell}_n(\omega, \mathcal{K})
={}& \log|\mathcal{K}| + \frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{k-i+1}
 - \frac{2(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Y}_{k-i} \\
&+ \frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_{k-i}^{\top}\mathcal{K}^{-1}\mathsf{Y}_{k-i}.
\end{aligned}
\tag{S.5}
$$

Note that the first term on the RHS of (S.5) does not depend on $n$, whereas the other terms depend on $n$ via $k = n/p$. Since $p$ is fixed, the limit of $\tilde{\ell}_n$ as $n \to \infty$ is equivalent to the convergence of the RHS of (S.5) as $k \to \infty$. We establish the convergence in probability of the second, third and fourth terms on the RHS of (S.5) by showing the convergence of the means and variances of these terms continuously.

We begin with the second term on the RHS of (S.5). Note that

$$
E\Bigl[\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{k-i+1}\Bigr]
= \frac{1}{k-1}\sum_{i=1}^{k-1} E\bigl[\mathrm{tr}\bigl(\mathcal{K}^{-1}\mathsf{Z}_{k-i+1}\mathsf{Z}_{k-i+1}^{\top}\bigr)\bigr]
= \mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr).
\tag{S.6}
$$

Further, by using the independence of the building blocks $\{\mathsf{Z}_i,\, i \geq 1\}$ and the variance of a random quadratic form (see equation (50) on p. 77 of [33]), we have

$$
\mathrm{Var}\Bigl[\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{k-i+1}\Bigr]
= \frac{1}{(k-1)^2}\sum_{i=1}^{k-1} \mathrm{Var}\bigl(\mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{k-i+1}\bigr)
= \frac{2\,\mathrm{tr}\bigl((\mathcal{K}^{-1}\mathcal{K}_0)^2\bigr)}{k-1}.
\tag{S.7}
$$

By (S.6), the expectation of the second term on the RHS of (S.5) converges to $\mathrm{tr}(\mathcal{K}^{-1}\mathcal{K}_0)$ continuously over $\mathcal{K} \in \mathfrak{K}$. Similarly, by (S.7), the variance of the second term on the RHS of (S.5) converges to $0$ continuously over $\mathcal{K} \in \mathfrak{K}$ as $k \to \infty$. Therefore, the second term on the RHS of (S.5) converges continuously in probability, i.e.,

$$
\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{k-i+1} \xrightarrow{P} \mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr) \ \text{ as } k \to \infty.
\tag{S.8}
$$
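The moment identities behind (S.6) and (S.7), namely $E(z^{\top}Bz) = \mathrm{tr}(B\Sigma)$ and $\mathrm{Var}(z^{\top}Bz) = 2\,\mathrm{tr}((B\Sigma)^2)$ for $z \sim N(0, \Sigma)$ and symmetric $B$, are easy to check numerically; the matrices below are arbitrary stand-ins of ours for $\mathcal{K}^{-1}$ and $\mathcal{K}_0$:

```python
import numpy as np

rng = np.random.default_rng(3)
p = 3
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)                   # plays the role of K_0
B0 = rng.standard_normal((p, p))
B = B0 @ B0.T / p + np.eye(p)                     # plays the role of K^{-1} (symmetric PD)

L = np.linalg.cholesky(Sigma)
z = (L @ rng.standard_normal((p, 200000))).T      # rows z_i ~ N(0, Sigma)
q = np.einsum('ij,jk,ik->i', z, B, z)             # quadratic forms z_i^T B z_i

mean_theory = np.trace(B @ Sigma)                 # E(z^T B z) = tr(B Sigma)
var_theory = 2 * np.trace(B @ Sigma @ B @ Sigma)  # Var(z^T B z) = 2 tr((B Sigma)^2)
assert abs(q.mean() / mean_theory - 1) < 0.05
assert abs(q.var() / var_theory - 1) < 0.1
```

Averaging $k-1$ independent copies of such forms, as in (S.7), divides the variance by $k-1$, which is exactly what drives the law-of-large-numbers limit (S.8).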

By a similar line of argument, we show in Lemma 1 and Lemma 2 that the third and fourth terms on the RHS of (S.5) converge continuously over $\omega \in [-1, 1]$ and $\mathcal{K} \in \mathfrak{K}$. In particular, by Lemma 1 and Lemma 2, as $k \to \infty$ we have

$$
\frac{(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{k-i+1}^{\top}\mathcal{K}^{-1}\mathsf{Y}_{k-i} \xrightarrow{P} 0,
\tag{S.9}
$$

$$
\frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_{k-i}^{\top}\mathcal{K}^{-1}\mathsf{Y}_{k-i} \xrightarrow{P} \frac{(\omega-\omega_0)^2}{1-\omega_0^2}\,\mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr).
\tag{S.10}
$$

Now, by using (S.5), (S.8), (S.9) and (S.10), $\tilde{\ell}_n \xrightarrow{P} \tilde{\ell}$ as $n \to \infty$ continuously over $(\omega, \mathcal{K})$. This completes the proof of part (1).

Part (2): By using (26), the first-order derivatives of the limiting function $\tilde{\ell}$ with respect to $\omega$ and $\mathcal{K}$ are

$$
\frac{\partial \tilde{\ell}}{\partial \omega} = \frac{2(\omega-\omega_0)}{1-\omega_0^2}\,\mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr),
\tag{S.11}
$$

$$
\frac{\partial \tilde{\ell}}{\partial \mathcal{K}} = \mathcal{K}^{-1} - \Bigl(1 + \frac{(\omega-\omega_0)^2}{1-\omega_0^2}\Bigr)\mathcal{K}^{-1}\mathcal{K}_0\mathcal{K}^{-1}.
\tag{S.12}
$$

By (S.11) and (S.12), $(\omega_0, \mathcal{K}_0)$ is a stationary point of $\tilde{\ell}$. Now, the second derivatives of $\tilde{\ell}$ with respect to $\omega$ and $\mathcal{K}$ are

$$
\frac{\partial^2 \tilde{\ell}}{\partial \omega^2} = \frac{2\,\mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr)}{1-\omega_0^2},
$$

$$
\frac{\partial^2 \tilde{\ell}}{\partial \mathcal{K}^2} = -\mathcal{K}^{-1} \otimes \mathcal{K}^{-1} + \Bigl(1 + \frac{(\omega-\omega_0)^2}{1-\omega_0^2}\Bigr)\Bigl(\mathcal{K}^{-1} \otimes \bigl(\mathcal{K}^{-1}\mathcal{K}_0\mathcal{K}^{-1}\bigr) + \bigl(\mathcal{K}^{-1}\mathcal{K}_0\mathcal{K}^{-1}\bigr) \otimes \mathcal{K}^{-1}\Bigr),
$$

where $\otimes$ denotes the Kronecker product of matrices. We refer to [25] for the matrix differentiation formulae (in particular, equations (57), (60) and (61) on pp. 9–10 of [25]). Further, the Hessian matrix of $\tilde{\ell}$ at the stationary point $(\omega_0, \mathcal{K}_0)$ simplifies to

$$
H =
\begin{bmatrix}
\dfrac{2p}{1-\omega_0^2} & 0 \\
0 & \mathcal{K}_0^{-1} \otimes \mathcal{K}_0^{-1}
\end{bmatrix},
$$

which is positive definite. This completes the proof of part (2).

Part (3): Since the set $[-1, 1] \times \mathfrak{K}$ is a closed and bounded subset of $\mathbb{R}^{p^2+1}$, the sequence of continuous functions $\tilde{\ell}_n$ is uniformly continuous. Let $(\omega_n, \mathcal{K}_n)$ be a non-random sequence with $(\omega_n, \mathcal{K}_n) \to (\omega, \mathcal{K})$ as $n \to \infty$. Then, we have

$$
\bigl|\tilde{\ell}_n(\omega, \mathcal{K}) - \tilde{\ell}(\omega, \mathcal{K})\bigr| \leq
\bigl|\tilde{\ell}_n(\omega, \mathcal{K}) - \tilde{\ell}_n(\omega_n, \mathcal{K}_n)\bigr| + \bigl|\tilde{\ell}_n(\omega_n, \mathcal{K}_n) - \tilde{\ell}(\omega, \mathcal{K})\bigr|.
\tag{S.13}
$$

By the uniform continuity of $\tilde{\ell}_n$, we have

$$
\bigl|\tilde{\ell}_n(\omega, \mathcal{K}) - \tilde{\ell}_n(\omega_n, \mathcal{K}_n)\bigr|
\leq \sup_{(\omega, \mathcal{K}) \in [-1,1] \times \mathfrak{K}} \tilde{\ell}_n(\omega, \mathcal{K}) \times d\bigl((\omega_n, \mathcal{K}_n), (\omega, \mathcal{K})\bigr),
\tag{S.14}
$$

where $d(\cdot, \cdot)$ is the Euclidean distance between $(p^2+1)$-dimensional real vectors. By part (1), the continuity of $\tilde{\ell}$, and the compactness of the set $[-1, 1] \times \mathfrak{K}$, $\tilde{\ell}_n(\omega, \mathcal{K})$ is uniformly bounded in probability. By the convergence of $(\omega_n, \mathcal{K}_n)$ and the uniform boundedness of $\tilde{\ell}_n$, the first term on the RHS of (S.13) converges to $0$ with probability $1$, uniformly over $(\omega, \mathcal{K}) \in [-1, 1] \times \mathfrak{K}$. By part (1) of Theorem 2, the second term on the RHS of (S.13) converges to $0$ in probability as $n \to \infty$. Thus, we have

$$
\sup_{(\omega, \mathcal{K}) \in [-1,1] \times \mathfrak{K}} \bigl|\tilde{\ell}_n(\omega, \mathcal{K}) - \tilde{\ell}(\omega, \mathcal{K})\bigr| \xrightarrow{P} 0 \ \text{ as } n \to \infty.
\tag{S.15}
$$

Since the stage I estimator $(\tilde{\omega}_n, \tilde{\mathcal{K}}_n)$ is the minimizer of $\tilde{\ell}_n$, by part (1) of Theorem 2 we have

$$
\tilde{\ell}_n(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) \leq \tilde{\ell}_n(\omega_0, \mathcal{K}_0) = \tilde{\ell}(\omega_0, \mathcal{K}_0) + o_P(1).
\tag{S.16}
$$

As $(\omega_0, \mathcal{K}_0)$ is the unique minimum of $\tilde{\ell}$, by using (S.16) we have

$$
\begin{aligned}
0 &\leq \tilde{\ell}(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) - \tilde{\ell}(\omega_0, \mathcal{K}_0) \\
&\leq \tilde{\ell}(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) - \tilde{\ell}_n(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) + o_P(1) \\
&\leq \sup_{(\omega, \mathcal{K}) \in [-1,1] \times \mathfrak{K}} \bigl|\tilde{\ell}(\omega, \mathcal{K}) - \tilde{\ell}_n(\omega, \mathcal{K})\bigr| + o_P(1).
\end{aligned}
\tag{S.17}
$$

Now, by using (S.15) and (S.17), we have

$$
0 \leq \tilde{\ell}(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) - \tilde{\ell}(\omega_0, \mathcal{K}_0) \xrightarrow{P} 0 \ \text{ as } n \to \infty.
\tag{S.18}
$$

Now, suppose the stage I estimator $(\tilde{\omega}_n, \tilde{\mathcal{K}}_n)$ does not converge in probability to $(\omega_0, \mathcal{K}_0)$; then there exist $\epsilon > 0$ and $\delta > 0$ such that

$$
P\Bigl(d\bigl((\tilde{\omega}_n, \tilde{\mathcal{K}}_n), (\omega_0, \mathcal{K}_0)\bigr) \geq \epsilon\Bigr) > \delta \quad \forall\, n.
\tag{S.19}
$$

Since $(\omega_0, \mathcal{K}_0)$ is the unique minimum of $\tilde{\ell}$, for every $\epsilon > 0$ there exists $\eta > 0$ such that

$$
\inf_{\{(\omega, \mathcal{K})\,:\, d((\omega, \mathcal{K}), (\omega_0, \mathcal{K}_0)) \geq \epsilon\}} \tilde{\ell}(\omega, \mathcal{K}) > \tilde{\ell}(\omega_0, \mathcal{K}_0) + \eta.
\tag{S.20}
$$

Now, (S.20) and (S.19) contradict (S.18). Thus, we have

$$
(\tilde{\omega}_n, \tilde{\mathcal{K}}_n) \xrightarrow{P} (\omega_0, \mathcal{K}_0) \ \text{ as } n \to \infty.
\tag{S.21}
$$

We now turn to establishing the convergence of the stage II estimator $(\hat{\omega}, \hat{\mathcal{K}})$ of Algorithm 1. By using (21), (S.21) and (8), for fixed $p$ and $|d| < p$, we have

$$
\tilde{\kappa}_p(d) \xrightarrow{P} \frac{1}{p-|d|}\sum_{j=1}^{p-|d|} \mathcal{K}_0(j, j+|d|) = \kappa_{0p}(d) \ \text{ as } n \to \infty.
\tag{S.22}
$$

By using (S.22), for fixed $p$, the spectrum of $\tilde{\kappa}_p$, defined as $\tilde{f}(\lambda)$ for $\lambda \in [-\pi, \pi]$ in (23), converges as follows:

$$
\tilde{f}(\lambda) \xrightarrow{P} f_0(\lambda) \triangleq \frac{1}{2\pi}\sum_{|d| < p} \kappa_{0p}(d)\, e^{-i d \lambda} \ \text{ as } n \to \infty.
\tag{S.23}
$$

Since $f_0(\lambda) \geq 0$ for all $\lambda \in [-\pi, \pi]$, by using (S.23) we have

$$
\max\bigl(\tilde{f}(\lambda), 0\bigr) \xrightarrow{P} f_0(\lambda) \ \text{ as } n \to \infty.
\tag{S.24}
$$

Now, by using (22), (23), (S.23) and (S.24), for $|d| < p$ we have

$$
\hat{\kappa}_p(d) \xrightarrow{P} \int_{-\pi}^{\pi} e^{i d \lambda} f_0(\lambda)\, d\lambda = \kappa_{0p}(d) \ \text{ as } n \to \infty.
\tag{S.25}
$$
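The clipping step in (S.24) is what turns the raw lag averages into a valid covariance sequence. A sketch of this spectrum-clip-and-invert correction in Python (our own illustration; the stage-I estimate below is a hypothetical sequence of ours, and the integrals are approximated on a grid):

```python
import numpy as np

# hypothetical stage-I kernel estimate on lags 0, ..., p-1
p = 8
kappa_tilde = np.array([1.0, 0.6, 0.1, -0.3, -0.4, -0.3, 0.1, 0.6])

lams = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
d = np.arange(-(p - 1), p)                         # lags |d| < p
kap = kappa_tilde[np.abs(d)]                       # even extension kappa(-d) = kappa(d)

# spectrum as in (23); real because the kernel sequence is even
f = (kap[None, :] * np.cos(d[None, :] * lams[:, None])).sum(1) / (2 * np.pi)
f_plus = np.maximum(f, 0.0)                        # clip at zero, as in (S.24)

# corrected kernel: inverse transform of the clipped spectrum, as in (S.25)
dl = lams[1] - lams[0]
kappa_hat = np.array([(np.cos(dd * lams) * f_plus).sum() * dl for dd in range(p)])

# a valid covariance sequence satisfies |kappa(d)| <= kappa(0)
assert np.all(np.abs(kappa_hat) <= kappa_hat[0] + 1e-9)
```

Since `f_plus` is nonnegative by construction, `kappa_hat` has a nonnegative spectral density and is therefore positive semidefinite, which the stage-I averages alone do not guarantee.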

We now establish the convergence of $\hat{\omega}$ proposed in (24) to complete the proof. By using (7) and (24), we have

$$
\hat{\omega} = \omega_0 + \frac{\mathrm{tr}\Bigl(\hat{\mathcal{K}}^{-1}\,\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{i+1}\mathsf{Y}_i^{\top}\Bigr)}{\mathrm{tr}\Bigl(\hat{\mathcal{K}}^{-1}\,\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i\mathsf{Y}_i^{\top}\Bigr)}.
\tag{S.26}
$$

By a similar argument as in the proofs of Lemma 1 and Lemma 2, we have

$$
\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Z}_{i+1}\mathsf{Y}_i^{\top} \xrightarrow{P} \boldsymbol{0} \ \text{ as } n \to \infty,
\tag{S.27}
$$

$$
\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i\mathsf{Y}_i^{\top} \xrightarrow{P} \frac{1}{1-\omega_0^2}\,\mathcal{K}_0 \ \text{ as } n \to \infty.
\tag{S.28}
$$

By using (S.25), (S.26), (S.27) and (S.28), we have

$$
\hat{\omega} \xrightarrow{P} \omega_0 \ \text{ as } n \to \infty.
\tag{S.29}
$$

This completes the proof. ∎

Lemma 1. Let $\boldsymbol{Y}_n = [Y_1, \ldots, Y_n]^{\top}$ be a QPGP vector with period $p$ and parameters $\omega_0$ and $\kappa_{p0}$. Suppose $n = kp$ and $\mathcal{K} \in \mathfrak{K}$. Then, as $k \to \infty$, we have

$$
\frac{(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1} \xrightarrow{P} 0,
\tag{S.30}
$$

continuously over $\omega \in [-1, 1]$ and $\mathcal{K} \in \mathfrak{K}$.

Proof of Lemma 1: Note that

$$
E\Bigl(\frac{(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\Bigr)
= \frac{(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} E\bigl[E\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1} \mid \mathsf{Y}_i\bigr)\bigr] = 0.
\tag{S.31}
$$

Further,

$$
\begin{aligned}
\mathrm{Var}\Bigl(\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\Bigr)
={}& \frac{1}{(k-1)^2}\sum_{i=1}^{k-1} \mathrm{Var}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\bigr) \\
&+ \frac{2}{(k-1)^2}\sum_{i=2}^{k-1}\sum_{j=1}^{i-1} \mathrm{Cov}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1},\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{j+1}\bigr).
\end{aligned}
\tag{S.32}
$$

By using (S.31) and Definition 1, the summands of the first term on the RHS of (S.32) simplify as follows for $i = 1, 2, \ldots, k-1$:

$$
\begin{aligned}
\mathrm{Var}\bigl[\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\bigr]
&= E\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\bigr)^2
= E\bigl[E\bigl(\mathrm{tr}\bigl(\mathsf{Z}_{i+1}\mathsf{Z}_{i+1}^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\bigr) \mid \mathsf{Y}_i\bigr)\bigr] \\
&= E\bigl[\mathrm{tr}\bigl(\mathcal{K}_0\mathcal{K}^{-1}\mathsf{Y}_i\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\bigr)\bigr]
= \frac{1}{1-\omega_0^2}\,\mathrm{tr}\bigl((\mathcal{K}_0\mathcal{K}^{-1})^2\bigr).
\end{aligned}
\tag{S.33}
$$

Similarly, by using (7) and (S.31), the summands of the second term on the RHS of (S.32) simplify as follows for $j < i$:

$$
\begin{aligned}
\mathrm{Cov}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1},\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{j+1}\bigr)
&= E\Bigl(\Bigl(\omega_0^{\,i-j}\mathsf{Y}_j + \sum_{m=0}^{i-j-1}\omega_0^m \mathsf{Z}_{i-m}\Bigr)^{\!\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{j+1}\Bigr) \\
&= \omega_0^{\,i-j}\, E\bigl(\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{j+1}\bigr) + \sum_{m=0}^{i-j-1}\omega_0^m\, E\bigl(\mathsf{Z}_{i-m}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{j+1}\bigr).
\end{aligned}
\tag{S.34}
$$

Since $j < i$ for each summand of the second term on the RHS of (S.32), by using the independence between $\mathsf{Z}_{i+1}$ and $(\mathsf{Y}_j, \mathsf{Z}_{j+1}, \ldots, \mathsf{Z}_i)$ and conditional expectation arguments, both terms on the RHS of (S.34) vanish. Therefore, by using (S.32) and (S.33), we have

$$
\mathrm{Var}\Bigl(\frac{(\omega-\omega_0)}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i+1}\Bigr)
= \frac{(\omega-\omega_0)^2\, \mathrm{tr}\bigl((\mathcal{K}_0\mathcal{K}^{-1})^2\bigr)}{(k-1)(1-\omega_0^2)}.
\tag{S.35}
$$

Since the RHS of (S.35) converges to $0$ continuously over $\omega \in [-1, 1]$ and $\mathcal{K} \in \mathfrak{K}$ as $k \to \infty$, this completes the proof. ∎

Lemma 2. Let $\boldsymbol{Y}_n = [Y_1, \ldots, Y_n]^{\top}$ be a QPGP vector with period $p$ and parameters $\omega_0$ and $\kappa_{p0}$. Suppose $n = kp$ and $\mathcal{K} \in \mathfrak{K}$. Then, as $k \to \infty$, we have

$$
\frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i \xrightarrow{P} \frac{(\omega-\omega_0)^2\, \mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr)}{1-\omega_0^2},
\tag{S.36}
$$

continuously over $\omega \in [-1, 1]$ and $\mathcal{K} \in \mathfrak{K}$.

Proof of Lemma 2: By using Proposition 1, we have

$$
E\Bigl(\frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\Bigr)
= \frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathrm{tr}\bigl(\mathcal{K}^{-1} E(\mathsf{Y}_i\mathsf{Y}_i^{\top})\bigr)
= \frac{(\omega-\omega_0)^2\, \mathrm{tr}\bigl(\mathcal{K}^{-1}\mathcal{K}_0\bigr)}{1-\omega_0^2}.
\tag{S.37}
$$

Note that

$$
\begin{aligned}
\mathrm{Var}\Bigl(\frac{1}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\Bigr)
={}& \frac{1}{(k-1)^2}\sum_{i=1}^{k-1} \mathrm{Var}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\bigr) \\
&+ \frac{2}{(k-1)^2}\sum_{i=2}^{k-1}\sum_{j=1}^{i-1} \mathrm{Cov}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i,\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr).
\end{aligned}
\tag{S.38}
$$

By using Proposition 1 and a similar argument as used in (S.7), the variance of the summands of the first term on the RHS of (S.38) is given by

$$
\mathrm{Var}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\bigr) = \frac{2\,\mathrm{tr}\bigl((\mathcal{K}^{-1}\mathcal{K}_0)^2\bigr)}{(1-\omega_0^2)^2} \quad \text{for } 1 \leq i \leq k-1.
\tag{S.39}
$$

We now turn to simplifying the second term on the RHS of (S.38). By using Definition 1 and (7), we have

$$
\mathsf{Y}_i = \omega_0^{\,i-j}\mathsf{Y}_j + \sum_{m=0}^{i-j-1}\omega_0^m \mathsf{Z}_{i-m} \quad \text{for } i > j.
\tag{S.40}
$$

Therefore, by using (S.40), we have

$$
\begin{aligned}
\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i
={}& \omega_0^{\,2(i-j)}\,\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j + 2\sum_{m=0}^{i-j-1}\omega_0^{\,i-j+m}\,\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i-m} \\
&+ \sum_{m=0}^{i-j-1}\sum_{m'=0}^{i-j-1}\omega_0^{\,m+m'}\,\mathsf{Z}_{i-m}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i-m'}.
\end{aligned}
\tag{S.41}
$$

By using (S.41), the summand of the second term on the RHS of (S.38) simplifies as follows for $i > j$:

$$
\begin{aligned}
\mathrm{Cov}\bigl(\mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i,\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr)
={}& \omega_0^{\,2(i-j)}\,\mathrm{Var}\bigl(\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr) + 2\sum_{m=0}^{i-j-1}\omega_0^{\,i-j+m}\,\mathrm{Cov}\bigl(\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i-m},\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr) \\
&+ \sum_{m=0}^{i-j-1}\sum_{m'=0}^{i-j-1}\omega_0^{\,m+m'}\,\mathrm{Cov}\bigl(\mathsf{Z}_{i-m}^{\top}\mathcal{K}^{-1}\mathsf{Z}_{i-m'},\, \mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr).
\end{aligned}
\tag{S.42}
$$

By a similar argument as in (S.39), the first term on the RHS of (S.42) simplifies to

$$
\omega_0^{\,2(i-j)}\,\mathrm{Var}\bigl(\mathsf{Y}_j^{\top}\mathcal{K}^{-1}\mathsf{Y}_j\bigr) = \frac{2\,\omega_0^{\,2(i-j)}}{(1-\omega_0^2)^2}\,\mathrm{tr}\bigl((\mathcal{K}^{-1}\mathcal{K}_0)^2\bigr).
\tag{S.43}
$$

Since $\mathsf{Y}_j$ and $(\mathsf{Z}_{j+1}, \ldots, \mathsf{Z}_i)$ are independent for $j < i$, by conditional expectation arguments the second term on the RHS of (S.42) vanishes. By a similar argument and the independence between $\mathsf{Y}_j$ and $\mathsf{Z}_{i-m}$ for $j < i$ and $m = 0, 1, \ldots, i-j-1$, the third term on the RHS of (S.42) also vanishes. Therefore, by using (S.38), (S.39) and (S.43), we have

$$
\begin{aligned}
\mathrm{Var}\Bigl(\frac{(\omega-\omega_0)^2}{k-1}\sum_{i=1}^{k-1} \mathsf{Y}_i^{\top}\mathcal{K}^{-1}\mathsf{Y}_i\Bigr)
&= \frac{(\omega-\omega_0)^4}{k-1}\cdot\frac{2\,\mathrm{tr}\bigl((\mathcal{K}^{-1}\mathcal{K}_0)^2\bigr)}{(1-\omega_0^2)^2}\Bigl(1 + \frac{2}{k-1}\sum_{i=2}^{k-1}\sum_{j=1}^{i-1}\omega_0^{\,2(i-j)}\Bigr) \\
&= \frac{(\omega-\omega_0)^4}{k-1}\cdot\frac{2\,\mathrm{tr}\bigl((\mathcal{K}^{-1}\mathcal{K}_0)^2\bigr)}{(1-\omega_0^2)^2}\Bigl[1 + \frac{2\bigl[(k-2)\omega_0^2 - (k-1)\omega_0^4 + \omega_0^{2k}\bigr]}{(k-1)(1-\omega_0^2)^2}\Bigr].
\end{aligned}
\tag{S.44}
$$

Since the RHS of (S.37) equals $\frac{(\omega-\omega_0)^2\, \mathrm{tr}(\mathcal{K}^{-1}\mathcal{K}_0)}{1-\omega_0^2}$ and the RHS of (S.44) converges to $0$ as $k \to \infty$, continuously over $\omega \in [-1, 1]$ and $\mathcal{K} \in \mathfrak{K}$, this completes the proof. ∎

Proof of Theorem 3: If $t \leq p$, then $Y_t \in \mathsf{Y}_1$ and $\boldsymbol{Y}_t\ (= \mathsf{Y}_1^{(t)})$ is a zero-mean Gaussian vector with covariance matrix $\mathcal{K}_t \triangleq \frac{1}{1-\omega^2}\bigl(\kappa_p(i-j)\bigr)_{1 \leq i, j \leq t}$. Thus, by using the conditional expectation of jointly Gaussian vectors (see (27)), we have

$$
\hat{Y}_t = E\bigl(Y_t \mid \boldsymbol{Y}_{t-1}\bigr)
= \frac{1}{1-\omega^2}\,\mathcal{K}_{1, l(t)-1}\Bigl(\frac{1}{1-\omega^2}\,\mathcal{K}_{l(t)-1}\Bigr)^{-1}\mathsf{Y}_1^{(t-1)}
= \mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathsf{Y}_1^{(t-1)}.
\tag{S.45}
$$
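The last equality in (S.45) holds because the common scale factor $1/(1-\omega^2)$ appears in both the cross-covariance and the covariance matrix and therefore cancels. A minimal sketch of the Gaussian conditional-mean predictor, with an assumed stationary kernel of ours:

```python
import numpy as np

rng = np.random.default_rng(4)
t, w = 4, 0.5
kappa = lambda d: np.exp(-abs(d) / 2.0)          # assumed within-block kernel
K_t = np.array([[kappa(i - j) for j in range(t)] for i in range(t)])

y_past = rng.standard_normal(t - 1)              # observed Y_1, ..., Y_{t-1}
cross = K_t[t - 1, :t - 1]                       # cov(Y_t, Y_{1:t-1}) up to a common scale
y_hat = cross @ np.linalg.solve(K_t[:t - 1, :t - 1], y_past)

# the common factor 1/(1 - w^2) cancels, as in the last equality of (S.45)
c = 1.0 / (1.0 - w**2)
y_hat_scaled = (c * cross) @ np.linalg.solve(c * K_t[:t - 1, :t - 1], y_past)
assert np.isclose(y_hat, y_hat_scaled)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the standard numerically stable way to evaluate such predictors.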

If $t > p$, then $i(t) \geq 1$ and $Y_t = Y_{i(t)p + l(t)}$. Thus,

$$
\hat{Y}_t = E\bigl(Y_t \mid \boldsymbol{Y}_{t-1}\bigr)
= E\bigl[Y_{i(t)p+l(t)} \,\big|\, \mathsf{Y}_{i(t)+1}^{(l(t)-1)}, \mathsf{Y}_{i(t)}, \ldots, \mathsf{Y}_1\bigr].
\tag{S.46}
$$

Recalling the structural equations given in (7), we have

$$
\mathsf{Y}_{i(t)+1}^{(l(t))} = \omega\,\mathsf{Y}_{i(t)}^{(l(t))} + \mathsf{Z}_{i(t)+1}^{(l(t))}.
\tag{S.47}
$$

By using (S.47), we have

$$
E\bigl[Y_{i(t)p+l(t)} \,\big|\, \mathsf{Y}_{i(t)+1}^{(l(t)-1)}, \mathsf{Y}_{i(t)}, \ldots, \mathsf{Y}_1\bigr]
= \omega\, Y_{(i(t)-1)p + l(t)} + E\bigl[\mathsf{Z}_{i(t)+1,\, l(t)} \,\big|\, \mathsf{Y}_{i(t)+1}^{(l(t)-1)}, \mathsf{Y}_{i(t)}, \ldots, \mathsf{Y}_1\bigr].
\tag{S.48}
$$

By using (S.47) and the independence between $\mathsf{Z}_{i(t)+1}$ and $(\mathsf{Y}_{i(t)}, \ldots, \mathsf{Y}_1)$, we have

$$
\begin{aligned}
E\bigl[\mathsf{Z}_{i(t)+1,\, l(t)} \,\big|\, \mathsf{Y}_{i(t)+1}^{(l(t)-1)} = \mathsf{y}_{i(t)+1}^{(l(t)-1)},\, \mathsf{Y}_{i(t)} = \mathsf{y}_{i(t)}, \ldots, \mathsf{Y}_1 = \mathsf{y}_1\bigr]
&= E\bigl[\mathsf{Z}_{i(t)+1,\, l(t)} \,\big|\, \mathsf{Z}_{i(t)+1}^{(l(t)-1)} = \mathsf{y}_{i(t)+1}^{(l(t)-1)} - \omega\,\mathsf{y}_{i(t)}^{(l(t)-1)}\bigr] \\
&= \mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\bigl(\mathsf{Y}_{i(t)+1}^{(l(t)-1)} - \omega\,\mathsf{Y}_{i(t)}^{(l(t)-1)}\bigr).
\end{aligned}
\tag{S.49}
$$

By using (S.48) and (S.49), the proof of (28) is completed. We now turn to the variance of $\hat{Y}_t$. If $t \leq p$, then by using (28), we have

$$
\mathrm{Var}\bigl(\hat{Y}_t\bigr) = \frac{1}{1-\omega^2}\,\mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathcal{K}_{l(t)-1, 1}.
\tag{S.50}
$$

For $t > p$, by using (28), (S.47), and the independence between $\mathsf{Y}_{i(t)}$ and $\mathsf{Z}_{i(t)+1}$, we have

$$
\mathrm{Var}\bigl(\hat{Y}_t\bigr) = \omega^2\,\mathrm{Var}\bigl(Y_{t-p}\bigr) + \mathrm{Var}\Bigl(\mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\bigl(\mathsf{Y}_{i(t)+1}^{(l(t)-1)} - \omega\,\mathsf{Y}_{i(t)}^{(l(t)-1)}\bigr)\Bigr).
\tag{S.51}
$$

Now, by using (S.47), the second term on the RHS of (S.51) simplifies to

$$
\mathrm{Var}\Bigl(\mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\bigl(\mathsf{Y}_{i(t)+1}^{(l(t)-1)} - \omega\,\mathsf{Y}_{i(t)}^{(l(t)-1)}\bigr)\Bigr)
= \mathcal{K}_{1, l(t)-1}\,\mathcal{K}_{l(t)-1}^{-1}\,\mathcal{K}_{l(t)-1, 1}.
\tag{S.52}
$$

This completes the proof. ∎

Appendix S2: Additional simulation studies

We now provide an additional simulation study to illustrate the performance of the proposed estimation methodology under the same experimental setup as described in Section V of the main paper, except for the choice of $p$. Here, we consider $p = 100$; the rest of the experimental setup remains the same.

Similar to Table III of the main paper, Table S1 shows the RMSE of the proposed estimators and of the MLE based on grid search from 1000 simulation runs, along with the computational time per simulation run. The grid search space remains the same as given in Subsection V-B of the main paper. For $p = 100$, we observe findings similar to the case $p = 10$ in terms of both accuracy and computational cost.

Similar to Figure 1 of Subsection V-C of the main paper, Figure S1 shows the boxplots of bootstrap standard errors corresponding to the QPGP parameters. The boxes are wider for $p = 100$ than for $p = 10$; however, we observe a similar pattern in the boxplots as discussed in Subsection V-C of the main paper.

TABLE S1: RMSE of the proposed estimator and the MLE based on 1000 runs, along with the respective computational cost (time per run in milliseconds).

| $n$ | $\hat{\omega}$ | $\omega_{mle}$ | $\hat{\theta}$ | $\theta_{mle}$ | $\hat{\sigma}^2$ | $\sigma^2_{mle}$ | Proposed (ms) | MLE (ms) |
|---|---|---|---|---|---|---|---|---|
| 600 | 0.2267 | 0.2201 | 0.4989 | 0.0245 | 0.4481 | 0.2734 | 8.26 | 29880.29 |
| 3000 | 0.0882 | 0.0692 | 0.1860 | 0.0140 | 0.1992 | 0.1309 | 11.83 | 40576.02 |
| 10000 | 0.04574 | 0.02833 | 0.0943 | 0.0107 | 0.1057 | 0.0847 | 12.11 | 69650.91 |

Figure S1: Box plots of bootstrap standard errors (computed from $M = 1000$ bootstrap samples) of $\hat{\omega}$, $\hat{\theta}$ and $\hat{\sigma}^2$, based on 1000 simulation runs of the standard QPGP with period $p = 100$, $\omega = 0.5$, and MacKay's periodic kernel with $(\theta = 1, \sigma^2 = 1)$, shown in the left, center and right panels, respectively. Each panel consists of three box plots corresponding to sample sizes $n = 600$, $3000$ and $10000$. The empirical standard error of the estimators across simulation runs is shown as a dashed horizontal maroon line.

Appendix S3: Additional details of real data analysis

In this section, we provide the estimates of the QPGP parameters corresponding to the chosen periodic kernels for the real data case studies discussed in Section VI of the main paper. Recall that we chose the following periodic covariance kernels for fitting the QPGP: (a) the general kernel, (b) MacKay's kernel, (c) the periodic Matérn kernel with $\nu = 1.5$, and (d) the cosine kernel with $\iota = 1$.

S3-A Carbon Dioxide Emission Signal

Table S2 shows the estimates of the QPGP (with $p = 12$) parameters corresponding to the chosen periodic covariance kernels, along with their bootstrap standard errors and 95% confidence intervals. The left column of the top panel of Figure S2 shows the plot of the general $\kappa_p$ estimates against lag as a solid black line, along with 95% confidence limits as dashed black lines. The right column of the top panel of Figure S2 shows the plot of the bootstrap standard errors of the estimates of the general covariance kernel against lag.

TABLE S2: Estimates of the QPGP (with $p = 12$) parameters for the CO2 data for different $\kappa_p$.

| Kernel | Parameter | Estimate | Standard Error | Confidence Interval |
|---|---|---|---|---|
| General kernel | $\omega$ | 0.9752 | 0.0085 | (0.9705, 1.0038) |
| MacKay | $\omega$ | 0.9752 | 0.0085 | (0.9705, 1.0038) |
| | $\sigma^2$ | 0.2464 | 0.0437 | (0.1666, 0.3329) |
| | $\theta$ | 0.8188 | 0.1251 | (0.5975, 1.0798) |
| Matérn ($\nu = 1.5$) (3) | $\omega$ | 0.9752 | 0.0085 | (0.9705, 1.0038) |
| | $\sigma^2$ | 0.2610 | 0.0429 | (0.1826, 0.3464) |
| | $\theta$ | 1.9950 | 0.4053 | (1.3569, 2.9443) |
| Cosine (4) | $\omega$ | 0.9752 | 0.0085 | (0.9705, 1.0038) |
| | $\sigma^2$ | 0.0546 | 0.0086 | (0.0355, 0.0687) |

S3-B Sunspot Numbers Data

Table S3 shows the estimates of the QPGP (with $p = 11$) parameters corresponding to the chosen periodic covariance kernels, along with their bootstrap standard errors and 95% confidence intervals. The left column of the middle panel of Figure S2 shows the plot of the general $\kappa_p$ estimates against lag as a solid black line, along with 95% confidence limits as dashed black lines. The right column of the middle panel of Figure S2 shows the plot of the bootstrap standard errors of the estimates of the general covariance kernel against lag.

TABLE S3: Estimates of the QPGP (with $p = 11$) parameters for the Sunspot data for different $\kappa_p$.

| Kernel | Parameter | Estimate | Standard Error | Confidence Interval |
|---|---|---|---|---|
| General kernel | $\omega$ | 0.7228 | 0.06375 | (0.5861, 0.8429) |
| MacKay (2) | $\omega$ | 0.7228 | 0.06375 | (0.5861, 0.8429) |
| | $\sigma^2$ | 2335.9067 | 519.06165 | (1337.9731, 3422.8676) |
| | $\theta$ | 1.8401 | 0.2085 | (1.6809, 2.4982) |
| Matérn ($\nu = 1.5$) (3) | $\omega$ | 0.7228 | 0.06375 | (0.5861, 0.8429) |
| | $\sigma^2$ | 2568.1523 | 563.3239 | (1439.5634, 3699.3789) |
| | $\theta$ | 0.7599 | 0.0814 | (0.5436, 0.8539) |
| Cosine (4) | $\omega$ | 0.7228 | 0.06375 | (0.5861, 0.8429) |
| | $\sigma^2$ | 1254.7285 | 323.1657 | (709.9667, 2002.5801) |

S3-C Water Level Signal

Table S4 shows the estimates of the QPGP (with p = 148) parameters corresponding to the chosen periodic covariance kernels, together with their bootstrap standard errors and 95% confidence intervals. The left column of the bottom panel of Figure S2 plots the estimates of the general κ_p against lag (solid black line) with 95% confidence limits (dashed black lines); the right column plots the bootstrap standard errors of these estimates against lag.

TABLE S4: Estimates of QPGP (with p = 148) parameters for water level data for different κ_p

| Kernel | Parameter | Estimate | Standard Error | Confidence Interval |
| --- | --- | --- | --- | --- |
| General kernel | ω | 0.9673 | 0.0102 | (0.9432, 0.9824) |
| MacKay (2) | ω | 0.9673 | 0.0102 | (0.9432, 0.9824) |
| | σ² | 0.0334 | 0.0827 | (0.0217, 0.3315) |
| | θ | 1.7398 | 1.7659 | (0.8033, 2.3211) |
| Matérn (ν = 1.5) (3) | ω | 0.9673 | 0.0102 | (0.9432, 0.9824) |
| | σ² | 0.0358 | 0.0872 | (0.0237, 0.3508) |
| | θ | 0.8338 | 0.4892 | (0.5649, 2.1742) |
| Cosine (4) | ω | 0.9673 | 0.0102 | (0.9432, 0.9824) |
| | σ² | 0.0183 | 0.0376 | (0.0119, 0.1531) |

Figure S2: Estimates (left column) and bootstrap standard errors (right column) of the general κ_p against lag for the carbon dioxide signal (top), sunspot numbers (middle), and water level signal (bottom). Estimates are shown as solid black lines, with 95% confidence limits as dashed black lines.
