Previous statistical analysis efforts of DNA sequences revealed that noncoding regions exhibit long-range power law correlations, whereas coding regions behave like random sequences or sustain short-range correlations. A great deal of debate on the presence or absence of long-range correlations in nucleotide sequences, and more specifically in coding regions, has ensued. These results were obtained using signal processing techniques for stationary signals and statistical tools for signals with slowly varying trends superimposed on stationary signals. However, it can be verified using statistical tests that genomic sequences are nonstationary and the nature of their nonstationarity varies and is often much more complex than a simple trend. In this paper, we will bring to bear new tools to analyze nonstationary signals that have emerged in the statistical and signal processing community over the past few years. The emergence of these new methods will be used to shed new light and help resolve the issues of i) the existence of long-range correlations in DNA sequences and ii) whether they are present in both coding and noncoding segments or only in the latter. It turns out that the statistical differences between coding and noncoding segments are much more subtle than previously thought using stationary analysis. In particular, both coding and noncoding sequences exhibit long-range correlations, as asserted by a 1/fβ(n) evolutionary (i.e., time-dependent) spectrum. However, we will use an index of randomness, which we derive from the Hilbert transform, to demonstrate that coding segments, although not random as previously suspected, are often "closer" to random sequences than noncoding segments. Moreover, we analytically justify the use of the Hilbert spectrum by proving that narrowband nonstationary signals result in a small demodulation error using the Hilbert transform.
|Original language||English (US)|
|Number of pages||8|
|Journal||IEEE Journal on Selected Topics in Signal Processing|
|State||Published - Jun 2008|
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering