Stochastic process

In probability theory, a stochastic process /stoʊˈkæstɪk/, or often random process, is a collection of random variables, often used to represent the evolution of some random value, or system, over time. It is the probabilistic counterpart to a deterministic process (or deterministic system). Instead of describing a process which can evolve in only one way (as in the case, for example, of solutions of an ordinary differential equation), in a stochastic or random process there is some indeterminacy: even if the initial condition (or starting point) is known, there are several (often infinitely many) directions in which the process may evolve.

In the simple case of discrete time, a stochastic process amounts to a sequence of random variables known as a time series (for example, see Markov chain). Another basic type of stochastic process is a random field, whose domain is a region of space; in other words, a random function whose arguments are drawn from a range of continuously changing values. One approach to stochastic processes treats them as functions of one or several deterministic arguments (inputs, in most cases regarded as time) whose values (outputs) are random variables: non-deterministic quantities which have certain probability distributions. Random variables corresponding to various times (or points, in the case of random fields) may be completely different. The main requirement is that these different random quantities all have the same type, where "type" refers to the codomain of the function. Although the random values of a stochastic process at different times may be independent random variables, in most commonly considered situations they exhibit complicated statistical correlations.

Familiar examples of processes modeled as stochastic time series include stock market and exchange rate fluctuations, signals such as speech, audio and video, medical data such as a patient's EKG, EEG, blood pressure or temperature, and random movement such as Brownian motion or random walks. Examples of random fields include static images, random terrain (landscapes), wind waves or composition variations of a heterogeneous material.

Formal definition and basic properties

Definition

Given a probability space (\Omega, \mathcal{F}, P) and a measurable space (S,\Sigma), an S-valued stochastic process is a collection of S-valued random variables on \Omega, indexed by a totally ordered set T ("time"). That is, a stochastic process X is a collection

 \{ X_t : t \in T \}

where each X_t is an S-valued random variable on \Omega. The space S is then called the state space of the process.
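As an informal illustration of this definition (a minimal sketch, not part of the formal theory), the following Python code realizes one such collection: a simple random walk, with state space S = Z and index set T = \{0, 1, \ldots, n\}. Each run of the sampler produces one realization (X_0(\omega), \ldots, X_n(\omega)).

    import numpy as np

    rng = np.random.default_rng(seed=0)

    def random_walk(n_steps):
        """One sample path of a simple random walk: state space S = Z,
        index set T = {0, 1, ..., n_steps}."""
        steps = rng.choice([-1, 1], size=n_steps)       # i.i.d. +/-1 increments
        return np.concatenate(([0], np.cumsum(steps)))  # partial sums X_0..X_n

    path = random_walk(10)   # one realization of {X_t : t in T}
    print(path)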

Finite-dimensional distributions

Let X be an S-valued stochastic process. For every finite subset T'=\{ t_1, \ldots, t_k \} \subseteq T, the k-tuple X_{T'} = (X_{t_1}, X_{t_2},\ldots, X_{t_k}) is a random variable taking values in S^k. The distribution \mathbb{P}_{T'}(\cdot) = \mathbb{P} (X_{T'}^{-1}(\cdot)) of this random variable is a probability measure on S^k. This is called a finite-dimensional distribution of X.

Under suitable topological restrictions, a suitably "consistent" collection of finite-dimensional distributions can be used to define a stochastic process (see Kolmogorov extension in the next section).
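As a minimal sketch of the idea, reusing the random-walk example above (the sampler and the choice T' = \{2, 5\} are purely illustrative), the finite-dimensional distribution \mathbb{P}_{T'} can be estimated empirically by repeatedly sampling paths and tabulating the resulting k-tuples:

    from collections import Counter
    import numpy as np

    rng = np.random.default_rng(seed=1)

    def walk_at_times(times, n_steps=10):
        """Sample one random-walk path and read it off at the finite
        index set T' = times, giving one draw of X_{T'}."""
        steps = rng.choice([-1, 1], size=n_steps)
        path = np.concatenate(([0], np.cumsum(steps)))
        return tuple(int(path[t]) for t in times)

    # Empirical finite-dimensional distribution P_{T'} for T' = {2, 5}:
    n_samples = 100_000
    counts = Counter(walk_at_times((2, 5)) for _ in range(n_samples))
    for k_tuple, count in sorted(counts.items()):
        print(k_tuple, count / n_samples)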

Construction

In the ordinary axiomatization of probability theory by means of measure theory, the problem is to construct a sigma-algebra of measurable subsets of the space of all functions, and then put a finite measure on it. For this purpose one traditionally uses a method called Kolmogorov extension.[1]

There is at least one alternative axiomatization of probability theory by means of expectations on C*-algebras of random variables. In this case the method goes by the name of Gelfand–Naimark–Segal construction.

This is analogous to the two approaches to measure and integration, where one has the choice to construct measures of sets first and define integrals later, or construct integrals first and define set measures as integrals of characteristic functions.

Kolmogorov extension

The Kolmogorov extension proceeds along the following lines: assuming that a probability measure on the space of all functions f: X \to Y exists, it can be used to specify the joint probability distribution of finite-dimensional random variables f(x_1),\dots,f(x_n). Now, from this n-dimensional probability distribution we can deduce an (n − 1)-dimensional marginal probability distribution for f(x_1),\dots,f(x_{n-1}). Note that the obvious compatibility condition, namely that this marginal probability distribution be in the same class as the one derived from the full-blown stochastic process, is not a requirement. Such a condition only holds, for example, if the stochastic process is a Wiener process (in which case the marginals are all Gaussian distributions of the exponential class), but not in general for all stochastic processes. When this condition is expressed in terms of probability densities, the result is called the Chapman–Kolmogorov equation.

The Kolmogorov extension theorem guarantees the existence of a stochastic process with a given family of finite-dimensional probability distributions satisfying the Chapman–Kolmogorov compatibility condition.
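For a time-homogeneous Markov chain the compatibility condition takes a concrete matrix form: the (s + t)-step transition probabilities must factor through any intermediate time. A minimal numerical check in Python (the two-state transition matrix is an arbitrary example, not taken from the text):

    import numpy as np

    # An arbitrary two-state transition matrix (rows sum to 1).
    P = np.array([[0.9, 0.1],
                  [0.4, 0.6]])

    # Chapman-Kolmogorov in matrix form: the (s+t)-step transition
    # probabilities factor through any intermediate time,
    #   P^(s+t) = P^(s) @ P^(t).
    s, t = 3, 5
    lhs = np.linalg.matrix_power(P, s + t)
    rhs = np.linalg.matrix_power(P, s) @ np.linalg.matrix_power(P, t)
    assert np.allclose(lhs, rhs)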

Separability, or what the Kolmogorov extension does not provide

Recall that in the Kolmogorov axiomatization, measurable sets are the sets which have a probability or, in other words, the sets corresponding to yes/no questions that have a probabilistic answer.

The Kolmogorov extension starts by declaring to be measurable all sets of functions where finitely many coordinates [f(x_1), \dots , f(x_n)] are restricted to lie in measurable subsets of Y^n. In other words, if a yes/no question about f can be answered by looking at the values of at most finitely many coordinates, then it has a probabilistic answer.

In measure theory, if we have a countably infinite collection of measurable sets, then the union and intersection of all of them are measurable sets. For our purposes, this means that yes/no questions that depend on countably many coordinates have a probabilistic answer.

The good news is that the Kolmogorov extension makes it possible to construct stochastic processes with fairly arbitrary finite-dimensional distributions. Also, every question that one could ask about a sequence has a probabilistic answer when asked of a random sequence. The bad news is that certain questions about functions on a continuous domain don't have a probabilistic answer. One might hope that the questions that depend on uncountably many values of a function would be of little interest, but the really bad news is that virtually all concepts of calculus are of this sort. For example:

  1. boundedness
  2. continuity
  3. differentiability

all require knowledge of uncountably many values of the function.

One solution to this problem is to require that the stochastic process be separable. In other words, that there be some countable set of coordinates \{f(x_i)\} whose values determine the whole random function f.

The Kolmogorov continuity theorem guarantees that processes that satisfy certain constraints on the moments of their increments have continuous modifications and are therefore separable.
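In its standard form, which this article does not spell out, the continuity theorem asks for constants \alpha, \beta, C > 0 such that

 \mathbb{E}\left[ \, |X_t - X_s|^{\alpha} \, \right] \le C \, |t - s|^{1+\beta} \quad \text{for all } s, t \in T.

As a worked instance: for the Wiener process, W_t - W_s is Gaussian with mean 0 and variance |t - s|, so \mathbb{E}[|W_t - W_s|^4] = 3|t - s|^2, and the criterion holds with \alpha = 4, \beta = 1, C = 3, guaranteeing a continuous (hence separable) modification.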

Filtrations

Given a probability space (\Omega,\mathcal{F},P), a filtration is a weakly increasing collection of sigma-algebras on \Omega, \{\mathcal{F}_t, t\in T\}, indexed by some totally ordered set T, and bounded above by \mathcal{F}, i.e. for s,t  \in T with s < t,

\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}.

A stochastic process X on the same time set T is said to be adapted to the filtration if, for every t  \in T, X_t is \mathcal{F}_t-measurable.[2]

The natural filtration

Given a stochastic process X = \{X_t : t\in T\}, the natural filtration for (or induced by) this process is the filtration in which \mathcal{F}_t is generated by all values of X_s up to time s = t, i.e. \mathcal{F}_t = \sigma(\{X_s^{-1}(A) : s\leq t, A \in \Sigma\}).

A stochastic process is always adapted to its natural filtration.
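In discrete time the natural filtration can be made concrete by listing the atoms of each \mathcal{F}_t: the sets of outcomes that cannot be told apart after observing X_1, \ldots, X_t. A small Python sketch (the setting, three coin tosses, is chosen only for illustration):

    from itertools import product

    # Omega = all sequences of three coin tosses; X_t = outcome of toss t.
    omega = list(product("HT", repeat=3))

    def atoms(t):
        """Atoms of F_t: outcomes grouped by the observed prefix (X_1, ..., X_t)."""
        groups = {}
        for w in omega:
            groups.setdefault(w[:t], []).append(w)
        return list(groups.values())

    for t in range(4):
        # F_0 has one atom (no information); F_3 has eight (full information).
        print(f"F_{t} has {len(atoms(t))} atoms")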

Classification

Stochastic processes can be classified according to the cardinality of their index set (usually interpreted as time) and state space.

Discrete time and discrete states

If both t and X_t belong to N, the set of natural numbers, then we have models which lead to Markov chains. For example:

(a) If X_t denotes the bit (0 or 1) in position t of a sequence of transmitted bits, then X_t can be modeled as a Markov chain with two states. This leads to the error-correcting Viterbi algorithm in data transmission (a simulation sketch follows example (b) below).

(b) If X_t denotes the combined genotype of a breeding couple in the t-th generation in an inbreeding model, it can be shown that the proportion of heterozygous individuals in the population approaches zero as t goes to ∞.[3]
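A minimal simulation sketch of example (a) in Python (the transition probabilities are illustrative assumptions, not taken from any real channel model):

    import numpy as np

    rng = np.random.default_rng(seed=2)

    # P[i, j] = probability that the next bit is j given the current bit is i.
    P = np.array([[0.95, 0.05],
                  [0.10, 0.90]])

    def simulate_bits(n, start=0):
        """Simulate n bits from the two-state Markov chain."""
        bits = [start]
        for _ in range(n - 1):
            bits.append(int(rng.choice(2, p=P[bits[-1]])))
        return bits

    print(simulate_bits(20))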

Continuous time and continuous state space

The paradigmatic continuous stochastic process is the Wiener process. In its original form the problem was concerned with a particle floating on a liquid surface, receiving "kicks" from the molecules of the liquid. The particle is then viewed as being subject to a random force which, since the molecules are very small and very close together, is treated as being continuous and which, since the particle is constrained to the surface of the liquid by surface tension, is at each point in time a vector parallel to the surface. Thus the random force is described by a two-component stochastic process: two real-valued random variables are associated to each point in the index set, time (note that since the liquid is viewed as being homogeneous, the force is independent of the spatial coordinates), with the codomain of the two random variables being R, giving the x and y components of the force. A treatment of Brownian motion generally also includes the effect of viscosity, resulting in an equation of motion known as the Langevin equation.[4]
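A minimal Euler–Maruyama sketch of such a two-component Langevin dynamic in Python (the viscosity and noise parameters are arbitrary illustrative values):

    import numpy as np

    rng = np.random.default_rng(seed=3)

    # Euler-Maruyama discretization of the Langevin equation
    #   dv = -gamma * v * dt + sigma * dW
    # for the two-component (x, y) process described above.
    gamma, sigma = 1.0, 0.5        # illustrative viscosity and noise strengths
    dt, n_steps = 0.01, 1000

    v = np.zeros(2)
    trajectory = [v.copy()]
    for _ in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt), size=2)   # Brownian increments
        v = v - gamma * v * dt + sigma * dW
        trajectory.append(v.copy())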

Discrete time and continuous state space

If the index set of the process is N (the natural numbers), and the range is R (the real numbers), there are some natural questions to ask about the sample sequences of a process \{ X_i \}_{i \in \mathbb{N}}, where a sample sequence is \{ X_i(\omega) \}_{i \in \mathbb{N}}:

  1. What is the probability that a sample sequence is bounded?
  2. What is the probability that a sample sequence is monotonic?
  3. What is the probability that a sample sequence has a limit as the index approaches ∞?
  4. What is the probability that the series \sum_i X_i(\omega) obtained from a sample sequence converges?
  5. What is the probability distribution of the sum?

Main applications of discrete-time continuous-state stochastic models include Markov chain Monte Carlo (MCMC) and the analysis of time series.
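As a minimal sketch of the MCMC application (the target density and proposal scale are illustrative assumptions), the following random-walk Metropolis sampler is itself a discrete-time, continuous-state Markov chain whose stationary distribution is proportional to the target:

    import numpy as np

    rng = np.random.default_rng(seed=4)

    def target(x):
        """Unnormalized target density (a two-bump mixture, chosen for illustration)."""
        return np.exp(-x**2 / 2) + 0.5 * np.exp(-(x - 3.0)**2 / 2)

    # Random-walk Metropolis: each accept/reject step defines the transition
    # kernel of a discrete-time, continuous-state Markov chain.
    x, chain = 0.0, []
    for _ in range(10_000):
        proposal = x + rng.normal(scale=1.0)
        if rng.random() < min(1.0, target(proposal) / target(x)):
            x = proposal
        chain.append(x)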

Continuous time and discrete state space

Similarly, if the index space I is a finite or infinite interval, we can ask about the sample paths \{ X_t(\omega) \}_{t \in I} (a simulation sketch follows the list):

  1. What is the probability that it is bounded/integrable...?
  2. What is the probability that it has a limit at ∞?
  3. What is the probability distribution of the integral?
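A standard example of a continuous-time, discrete-state process (not named in the text above) is the Poisson counting process; the following Python sketch estimates the distribution of its path integral \int_0^T X_t \, dt by simulation, answering question 3 empirically:

    import numpy as np

    rng = np.random.default_rng(seed=5)

    def poisson_path_integral(rate, horizon):
        """Integral of one sample path of a Poisson counting process on
        [0, horizon], computed exactly from its exponential holding times."""
        t, count, integral = 0.0, 0, 0.0
        while True:
            wait = rng.exponential(1.0 / rate)
            if t + wait >= horizon:
                return integral + count * (horizon - t)
            integral += count * wait
            t += wait
            count += 1          # the path jumps to the next state

    samples = [poisson_path_integral(rate=2.0, horizon=5.0) for _ in range(10_000)]
    print(np.mean(samples), np.std(samples))   # mean should be near rate*T^2/2 = 25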

Dassund Analysis of Time Series

Forecasting a General Time Series
Stochastic time series have long been studied, yet the fundamental and simple mathematical rules which govern the measurements from series associated with chaotic or random processes, such as the stock market or games of chance, have rarely been shared outside a few select circles of mathematicians. Joseph Dassund was the first to share some of these insights, in his briefly circulated and unpublished papers from the 1920s and 1930s.

A Brief History of Joseph Dassund
Dassund had a rather interesting life, shortened by a stay as a boarder in a bed and breakfast on the southern tip of Berlin, where he contracted rabies after lodging in a bed where the innkeeper's German Shepherd had died a slow and torturous death a few years earlier. Dassund, hated for a precocious intellect in the body of an ethnic reject, spent most of his life on the run. He was a pariah par excellence. Hated as living proof of his German peers' inferiority, state-run hospitals tried euthanizing him at least three times during his thirties. It was truly a miracle of miracles that he survived each attempt.

The Method of Stochastic Forecasting Named After Dassund
The method which he recorded for stochastic forecasting is quite simple and easily understandable. He also stated that, given its simplicity, it is likely ancient in origin. He found that it applies best to stochastic information streams where each datum has four digits. His first step was permuting the digits of the data in each stream; for a single four-digit stream this produces twenty-three other complementary streams. He followed this by applying a "Dassund" transformation, for lack of a better name. This transformation is quite simple, and one can complete it with an Excel spreadsheet or pencil and paper. Columns B, C, D, E, and F comprise the Dassund transformation when column A contains the data points for a stochastic time series.

A Dassund Transformation

A (TIME SERIES) | B (RUNNING AVERAGE) | C (IDEAL AVERAGE) | D (NATURAL TENDENCY) | E (IDEAL TENDENCY) | F (DASSUND CURVE)
A1  | AVG(A$1:A1)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B1)  | SUM(C$1:C1)  | E1-D1
A2  | AVG(A$1:A2)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B2)  | SUM(C$1:C2)  | E2-D2
A3  | AVG(A$1:A3)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B3)  | SUM(C$1:C3)  | E3-D3
A4  | AVG(A$1:A4)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B4)  | SUM(C$1:C4)  | E4-D4
A5  | AVG(A$1:A5)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B5)  | SUM(C$1:C5)  | E5-D5
A6  | AVG(A$1:A6)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B6)  | SUM(C$1:C6)  | E6-D6
A7  | AVG(A$1:A7)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B7)  | SUM(C$1:C7)  | E7-D7
A8  | AVG(A$1:A8)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B8)  | SUM(C$1:C8)  | E8-D8
A9  | AVG(A$1:A9)  | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B9)  | SUM(C$1:C9)  | E9-D9
A10 | AVG(A$1:A10) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B10) | SUM(C$1:C10) | E10-D10
A11 | AVG(A$1:A11) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B11) | SUM(C$1:C11) | E11-D11
A12 | AVG(A$1:A12) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B12) | SUM(C$1:C12) | E12-D12
A13 | AVG(A$1:A13) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B13) | SUM(C$1:C13) | E13-D13
A14 | AVG(A$1:A14) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B14) | SUM(C$1:C14) | E14-D14
A15 | AVG(A$1:A15) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B15) | SUM(C$1:C15) | E15-D15
A16 | AVG(A$1:A16) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B16) | SUM(C$1:C16) | E16-D16
A17 | AVG(A$1:A17) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B17) | SUM(C$1:C17) | E17-D17
A18 | AVG(A$1:A18) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B18) | SUM(C$1:C18) | E18-D18
A19 | AVG(A$1:A19) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B19) | SUM(C$1:C19) | E19-D19
A20 | AVG(A$1:A20) | MAX(A1:A20)-MIN(A1:A20)/2 | SUM(B$1:B20) | SUM(C$1:C20) | E20-D20
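The transformation above is mechanical, so it can be transcribed directly into code. The following Python sketch reproduces the spreadsheet columns for a 20-point series; the formulas are copied as written in the table (note that, as written, the column C formula parses in a spreadsheet as MAX minus half of MIN; (MAX-MIN)/2 may have been intended, but the code follows the table).

    import numpy as np

    def dassund_transform(a):
        """Columns B-F of the table above for a 20-point series a (column A)."""
        a = np.asarray(a, dtype=float)
        n = len(a)
        b = np.array([a[:i + 1].mean() for i in range(n)])   # running average
        # Copied as written: MAX(A1:A20)-MIN(A1:A20)/2 parses as max - min/2.
        c = np.full(n, a.max() - a.min() / 2)                # "ideal average"
        d = np.cumsum(b)                                     # natural tendency
        e = np.cumsum(c)                                     # ideal tendency
        f = e - d                                            # Dassund curve
        return b, c, d, e, f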

The Nature of Dassund Curves and Combinatoric Filtering
Dassund curves become smoother and more forecastable with the addition of more data points; however, since the inverse Dassund transformation, necessarily linear in nature, requires inverting an averaging process, the final error in the forecasted point might be rather large if the predicted Dassund point varies drastically from the actual one. One technique for ensuring that the predicted Dassund point most closely approximates the actual one is to keep the mean squared error of the data points used for extrapolating this point at a minimum. Also, one usually uses only the last few points on the curve for finding the predicted Dassund point. Most Dassund curves become rather smooth within fifteen to twenty data points, and an ideal final error in the forecasted value of the time series would be less than fifty when the data are on the range [0000 .. 9999].
This liberal error is permissible since the final step of Dassund forecasting uses a combinatoric filter. If one constructs the process properly, there is only one point within the range [ A21 - 450, A21 + 450 ] which falls within a similar-sized neighborhood of the other forecasts from the other twenty-three data streams when A21 undergoes the same permutation of its digits.

Stock Markets and the Lottery
Dassund forecasting also works well with stochastic data from financial markets; however, in wide use it would likely nullify itself before creating a chaotic market. Fans of games of chance such as the lottery should consider grouping the digits in overlapping groups of four and connecting the final pair of digits with the first pair; this provides a reasonable amount of error correction when using Dassund forecasting. It is also recommended that one use paper and pencil when forecasting games of chance, since using a computer is considered a federal crime. Why? Because the public would then have an advantage over the gambling house, which hires doctorates in mathematics to construct games that doctorates in psychology craft based on studies of the human risk-reward cycle. In short, the house likes having the advantage. One would also do well to forecast any game of chance numerous data points into the future, since one can be certain that the lottery commission would keep cycling its machines until it had a few, and not a few hundred, jackpot winners.

Conclusion
J. Dassund, a man of faith, never had much interest in money and was quite worried when he came across this process, since he clearly remembered the proverb that every decision of each lot is from the Lord and only fully known in advance by Him. Yet, if one studies Dassund curves, one will easily see that the first few points are unknown and unforecastable and that, occasionally, so that man does not become "puffed up" in his simplistic and silly knowledge, God troubles the lot and the curve enters a temporarily erratic and unpredictable state before becoming a smooth curve tending around the x-axis. Since one can forecast all single-mode distributions with Dassund's method, one can only speculate that distributions with numerous modes would baffle such a form of forecasting. Yet it has long been believed that uniform distributions thwart prediction while those with one or more modes are highly forecastable. Dassund, a fan of chess, would always call this his mathematical castle. Someone said that he might have had the equivalent of an archaic mathematical zugzwang stored in his mental notes somewhere. Little is known about Joseph Dassund, but in a correspondence with another minor blip on the scientific radar at that time, Granville T. Woods, the father of mechanical hybridization for the modern locomotive, he claimed that he kept his best ideas solidly locked within his mind. This is likely why the state spent so much time doping him and psychologically tormenting and harassing him, hoping that he would share his most vital secrets. Rumor has it that some of his "forced secrets" were foundational in creating the Enigma.

References

  1. ^ Karlin, Samuel & Taylor, Howard M. (1998). An Introduction to Stochastic Modeling. Academic Press. ISBN 0-12-684887-4.
  2. ^ Durrett, Rick (2010). Probability: Theory and Examples (4th ed.). Cambridge: Cambridge University Press.
  3. ^ Allen, Linda J. S. (2010). An Introduction to Stochastic Processes with Applications to Biology (2nd ed.). Chapman and Hall. ISBN 1-4398-1882-7.
  4. ^ Gardiner, C. (2004). Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences (3rd ed.). Springer. ISBN 3-540-20882-8.
