In statistics, signal processing, and time series analysis, a sinusoidal model is used to approximate a sequence ''Y<sub>i</sub>'' by a sine function:
:<math>Y_i = C + \alpha \sin(\omega T_i + \phi) + E_i</math>
where ''C'' is a constant defining a mean level, α is the amplitude of the sine, ω is the angular frequency, ''T<sub>i</sub>'' is a time variable, φ is the phase shift, and ''E<sub>i</sub>'' is the error sequence.
This sinusoidal model can be fit using nonlinear least squares; to obtain a good fit, routines may require good starting values for the unknown parameters.
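As a rough illustration of such a fit, the sketch below runs a plain Gauss–Newton iteration with step halving, using only the Python standard library. The function names (`fit_sinusoid`, `solve`), the synthetic data, and the starting values are assumptions made for this example. In practice a library routine for nonlinear least squares would normally be used instead.

```python
import math
import random


def model(t, C, a, w, p):
    """Sinusoidal model C + a*sin(w*t + p)."""
    return C + a * math.sin(w * t + p)


def sse(ts, ys, params):
    """Sum of squared residuals for the given parameters."""
    return sum((y - model(t, *params)) ** 2 for t, y in zip(ts, ys))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_sinusoid(ts, ys, start, iterations=50):
    """Gauss-Newton fit of (C, a, w, p); needs reasonable starting values."""
    params = list(start)
    for _ in range(iterations):
        C, a, w, p = params
        JtJ = [[0.0] * 4 for _ in range(4)]
        Jtr = [0.0] * 4
        for t, y in zip(ts, ys):
            s, c = math.sin(w * t + p), math.cos(w * t + p)
            row = [1.0, s, a * t * c, a * c]  # d(model)/d(C, a, w, p)
            r = y - (C + a * s)
            for j in range(4):
                Jtr[j] += row[j] * r
                for k in range(4):
                    JtJ[j][k] += row[j] * row[k]
        step = solve(JtJ, Jtr)
        current = sse(ts, ys, params)
        scale = 1.0
        for _ in range(20):  # halve the step until the fit does not get worse
            trial = [q + scale * d for q, d in zip(params, step)]
            if sse(ts, ys, trial) <= current:
                params = trial
                break
            scale /= 2.0
        else:
            break  # no acceptable step found: stop iterating
    return params


# Synthetic data (assumed values): C=2, amplitude=1.5, omega=3, phase=0.5
rng = random.Random(0)
ts = [0.1 * i for i in range(100)]
ys = [2.0 + 1.5 * math.sin(3.0 * t + 0.5) + rng.gauss(0.0, 0.1) for t in ts]
estimate = fit_sinusoid(ts, ys, start=(1.9, 1.4, 2.98, 0.6))
```

The starting values here are deliberately close to the true parameters. With a poor guess for the frequency in particular, an iteration of this kind can stall or settle on a different local minimum.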
Fitting a model with a single sinusoid is a special case of spectral density estimation and least-squares spectral analysis.
Good starting values
Good starting value for the mean
A good starting value for ''C'' can be obtained by calculating the mean of the data. If the data show a trend, i.e., the assumption of constant location is violated, one can replace ''C'' with a linear or quadratic least squares fit. That is, the model becomes
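:<math>Y_i = (B_0 + B_1 T_i) + \alpha \sin(\omega T_i + \phi) + E_i</math>
or
:<math>Y_i = (B_0 + B_1 T_i + B_2 T_i^2) + \alpha \sin(\omega T_i + \phi) + E_i</math>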
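One possible way to obtain starting values for a linear trend is an ordinary straight-line least squares fit to the raw data, sketched below. The helper name `linear_trend` and the synthetic series are assumptions for this example. Because the sinusoid rarely spans a whole number of cycles, the fitted line is slightly biased, so its coefficients serve only as starting values.

```python
import math


def linear_trend(ts, ys):
    """Return (b0, b1) of the least squares line y = b0 + b1 * t."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * t_bar
    return b0, b1


# Synthetic series (assumed values): slow linear drift plus a sinusoid
ts = [0.1 * i for i in range(200)]
ys = [2.0 + 0.05 * t + 1.5 * math.sin(3.0 * t + 0.5) for t in ts]

b0, b1 = linear_trend(ts, ys)
# Detrended data, usable for the frequency and amplitude starting values
detrended = [y - (b0 + b1 * t) for t, y in zip(ts, ys)]
```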
Good starting value for frequency
The starting value for the frequency can be obtained from the dominant frequency in a periodogram. A complex demodulation phase plot can be used to refine this initial estimate for the frequency.
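The periodogram step can be sketched with a direct discrete Fourier transform, as below. The function names, the equal sampling interval `dt`, and the synthetic data are assumptions for this example. The frequency grid has spacing 1/(''n''·`dt`), so the result is only a coarse starting value.

```python
import cmath
import math


def periodogram(ys, dt):
    """Return (frequencies, power) at the positive Fourier frequencies.

    Frequencies are in cycles per unit time; the mean is removed first.
    """
    n = len(ys)
    mean = sum(ys) / n
    centered = [y - mean for y in ys]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                   for i, v in enumerate(centered))
        freqs.append(k / (n * dt))
        power.append(abs(coef) ** 2 / n)
    return freqs, power


def dominant_angular_frequency(ys, dt):
    """Angular frequency (radians per unit time) of the largest peak."""
    freqs, power = periodogram(ys, dt)
    k = max(range(len(power)), key=power.__getitem__)
    return 2.0 * math.pi * freqs[k]


# Synthetic data (assumed values): true angular frequency is 3.0
dt = 0.1
ys = [2.0 + 1.5 * math.sin(3.0 * (i * dt) + 0.5) for i in range(100)]
omega0 = dominant_angular_frequency(ys, dt)
```

In this example the true angular frequency 3.0 falls between grid points, and the peak lands on the nearest grid frequency of 0.5 cycles per unit time, i.e. an angular frequency of π. An estimate this coarse is the kind that the refinement step is meant to improve.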
Good starting values for amplitude
The root mean square of the detrended data can be scaled by the square root of two to obtain an estimate of the sinusoid amplitude. A complex demodulation amplitude plot can be used to find a good starting value for the amplitude. In addition, this plot can indicate whether the amplitude is constant over the entire range of the data or varies. If the plot is essentially flat, i.e., zero slope, then it is reasonable to assume a constant amplitude in the non-linear model. However, if the slope varies over the range of the plot, one may need to adjust the model to be:
:<math>Y_i = C + (B_0 + B_1 T_i) \sin(\omega T_i + \phi) + E_i</math>
That is, one may replace α with a function of time. A linear fit is specified in the model above, but this can be replaced with a more elaborate function if needed.
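For the constant-amplitude case, the root-mean-square estimate described in this section can be sketched as follows. The scaling works because a pure sinusoid of amplitude α has a root mean square of α/√2. The function name `amplitude_start` and the synthetic data are assumptions, and "detrending" here means only subtracting the sample mean.

```python
import math


def amplitude_start(ys):
    """Starting amplitude: sqrt(2) times the RMS of the mean-removed data."""
    n = len(ys)
    mean = sum(ys) / n
    rms = math.sqrt(sum((y - mean) ** 2 for y in ys) / n)
    return math.sqrt(2.0) * rms


# Synthetic data (assumed values): mean level 2.0, amplitude 1.5
ts = [0.1 * i for i in range(100)]
ys = [2.0 + 1.5 * math.sin(3.0 * t + 0.5) for t in ts]

c0 = sum(ys) / len(ys)        # starting value for the mean level C
alpha0 = amplitude_start(ys)  # starting value for the amplitude
```

Both values come out close to, but not exactly at, the true parameters, because the series does not cover a whole number of cycles. Noise in the data would inflate the amplitude estimate further.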
Model validation
As with any statistical model, the fit should be subjected to graphical and quantitative techniques of model validation. For example, a run sequence plot can be used to check for significant shifts in location or scale, start-up effects, and outliers. A lag plot can be used to verify that the residuals are independent; outliers also appear in the lag plot. A histogram and a normal probability plot can be used to check for skewness or other non-normality in the residuals.
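A simple numerical companion to the lag plot is the lag-1 autocorrelation of the residuals, sketched below. This statistic is an addition for illustration rather than one of the techniques named above, and the function name and both example series are assumptions. A value near zero is consistent with independent residuals, while a value near one suggests leftover structure such as an unmodelled sinusoid.

```python
import math
import random


def lag1_autocorrelation(residuals):
    """Sample autocorrelation of the residuals at lag 1."""
    n = len(residuals)
    mean = sum(residuals) / n
    num = sum((residuals[i] - mean) * (residuals[i + 1] - mean)
              for i in range(n - 1))
    den = sum((r - mean) ** 2 for r in residuals)
    return num / den


# Two assumed residual series: independent noise and a leftover slow sinusoid
rng = random.Random(1)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
leftover = [math.sin(0.3 * i) for i in range(200)]

r_white = lag1_autocorrelation(white)
r_leftover = lag1_autocorrelation(leftover)
```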
Extensions
A different method transforms the non-linear regression into a linear regression by means of a suitable integral equation. No initial guess and no iterative process are then needed: the fit is obtained directly.
[The method is explained in the chapter "Generalized sinusoidal regression" pp.54-63 in the paper]
See also
*Pitch detection algorithm
* Spectral density estimation#Single tone
References
External links
Beam deflection case study