“The stock market is like a game of chess, but with the added thrill of a roller coaster. And the roller coaster is on fire.”
In our last chapter, we discussed the random walk through a Markov Chain using Python. Now we shall dive deeper into Markov Chains and their relationship with linear algebra, from a more theoretical angle.
In tackling this model, we’ll employ linear algebra. Since we’re dealing with a directed graph, it can be conveniently represented using an adjacency matrix.
If you’re unfamiliar with the term, we previously encountered an adjacency matrix in our last discussion, where we built a Transition Matrix using Pandas.
So, the Markov Chain above will translate into –
An adjacency matrix is a square matrix used to represent a graph’s structure, with one row and one column for each vertex.
This matrix makes it easier to understand and work with Markov Chains.
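As a minimal sketch, the transition matrix for our three-state chain can be written as a NumPy array. The state ordering (‘Upside’, ‘Downside’, ‘Consolidation’) and the numeric entries are taken from the balance equations solved later in this chapter:

```python
import numpy as np

# Rows = current state, columns = next state, in the order
# ['Upside', 'Downside', 'Consolidation'].
A = np.array([
    [0.2, 0.6, 0.2],  # from 'Upside'
    [0.3, 0.0, 0.7],  # from 'Downside' (0.0: no edge Downside -> Downside)
    [0.5, 0.0, 0.5],  # from 'Consolidation'
])

# Each row is a probability distribution over next states, so rows sum to 1.
print(A.sum(axis=1))
```

The zero entry in the second row is exactly the “absence of transition” case discussed next: no edge connects ‘Downside’ back to itself.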
In our journey through understanding Markov Chains, we come across a crucial component known as the Transition Matrix. This matrix plays a fundamental role in encapsulating the dynamics of state transitions within the chain.
Let’s dive a little deeper to understand its structure and significance.
Matrix Elements Representing Transition Probabilities:
Each element of the matrix holds the probability of transitioning from one state (the row) to another state (the column) in a single time step.
Zero Elements Indicating Absence of Transition:
A zero value within the matrix isn’t arbitrary; it holds meaning. A zero element indicates that there’s no transition possible between the corresponding states. In graphical terms, it means there’s no edge connecting these two vertices (states) in the directed graph representing the Markov Chain.
Naming and Notation:
This significant matrix is termed the “Transition Matrix,” often denoted by the symbol ‘A’ for ease of reference in mathematical discussions and computations.
Transition Matrix as a Gateway to State Dynamics:
The Transition Matrix, in essence, forms the backbone of the Markov Chain, providing a structured, numerical way to explore, analyze, and predict state transitions within the chain. As we move forward, we’ll see how this matrix, denoted as ‘A’, drives the evolution of the state probabilities over time.
Remember that our goal is to find the long-run probability of being in each state.
A row vector is a type of matrix with a single row and multiple columns. Each element in this row corresponds to a value associated with a particular column. In mathematical notation, it’s often represented as a horizontal array of numbers.
So, with the chain starting in the ‘Downside’ state, π will look like – π_0 = [ 0 1 0 ]
Now, Let’s see what happens if we multiply our row vector with the transition matrix. When you multiply the row vector π_0 by the Transition Matrix A (i.e., π_0.A), you’re essentially calculating the probabilities of transitioning to each future state from the current ‘Downside’ state.
The Initial Transition Matrix
Future Probabilities Corresponding to the Downside State
So, π_0.A = [ 0.3 0 0.7 ]
The result of this multiplication is simply the second row of the Transition Matrix. (Do note that we are not diving deep into the mechanics of matrix multiplication here.)
This multiplication results in a new row vector which gives the probabilities of being in each state in the next time step (t=1), based on the current state being ‘Downside’.
This row corresponds to the ‘Downside’ state, and each element in this row now represents the probability of transitioning from the ‘Downside’ state to each of the three possible states (‘Upside’, ‘Downside’, ‘Consolidation’) in the next time step.
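This step can be sketched in a couple of lines of NumPy (using the same matrix entries as in the rest of this chapter):

```python
import numpy as np

A = np.array([[0.2, 0.6, 0.2],
              [0.3, 0.0, 0.7],
              [0.5, 0.0, 0.5]])

pi_0 = np.array([0.0, 1.0, 0.0])  # the chain starts in 'Downside'

# Row vector times matrix: picks out the second row of A,
# i.e. the transition probabilities out of 'Downside'.
pi_1 = pi_0 @ A
print(pi_1)
```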
This process can be continued to calculate state probabilities for further time steps by repeatedly multiplying the resulting row vector by the Transition Matrix. In other words, we iterate the Markov Chain forward in time by continually multiplying the current state probability vector (π) with the transition matrix (A), extending the chain step by step –
Initially, we have π_0 (the state probabilities at time t=0).
By multiplying π_0 with A, we obtain π_1, which represents the state probabilities at time t=1.
This process is repeated to move forward through time steps: π_1.A = π_2, π_2.A = π_3, and so on. Each multiplication gives us the state probabilities for the next time step.
This process of multiplying the state probability vector with the Transition Matrix to obtain future state probabilities is a core aspect of working with Markov Chains. It demonstrates how the chain evolves over time and how the probabilities of being in each state change with each time step, based on the current state and the transition probabilities defined in the Transition Matrix.
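The iteration described above takes only a few lines of code. With the matrix used in this chapter, the vector stops changing noticeably after a handful of steps:

```python
import numpy as np

A = np.array([[0.2, 0.6, 0.2],
              [0.3, 0.0, 0.7],
              [0.5, 0.0, 0.5]])

pi = np.array([0.0, 1.0, 0.0])  # pi_0: start in 'Downside'
for t in range(50):             # pi_{t+1} = pi_t . A
    pi = pi @ A

print(pi)  # settles near [0.352 0.211 0.437]
```

The fact that the vector settles down at all is the motivation for the “stationary state” idea introduced next.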
Objective: The primary goal when working with Markov Chains is to find a stationary state – a probability distribution that no longer changes with further transitions.
A stationary state, represented as π, satisfies the equation π.A = π: multiplying this state vector by the transition matrix (A) returns the same state vector.
An eigenvector of a square matrix A is a nonzero vector x such that when A operates on x, the result is a scalar multiple of x. This relationship is expressed by the eigenvector equation Ax = λx, where x is the eigenvector and λ, a scalar, is its eigenvalue.
The equation π.A = π is analogous to the eigenvector equation Ax = λx from linear algebra, with π playing the role of x (acting from the left, since π is a row vector).
In the context of Markov Chains, we’re specifically looking for an eigenvector π whose eigenvalue λ is 1, because the stationary state vector remains unchanged when multiplied by the transition matrix.
Finding the Stationary State:
Besides being an eigenvector with eigenvalue 1, the elements of π must sum to 1 because they represent a probability distribution across states.
This results in two equations: the eigenvector equation π.A = π and the normalization constraint that the elements of π sum to 1.
These equations together aid in finding the stationary state π.
Once π is found, it provides the equilibrium or stationary distribution of the Markov Chain, which remains unchanged upon further transitions.
By employing linear algebra concepts, particularly eigenvectors and eigenvalues, we can solve for the stationary distribution π of the Markov Chain in a more efficient and theoretical manner. This approach provides an elegant way to understand and calculate the long-term behavior of the Markov Chain, bridging probabilistic models with linear algebraic theory.
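A sketch of that eigenvector route with NumPy: since π multiplies A from the left, π is an ordinary eigenvector of the transpose of A with eigenvalue 1.

```python
import numpy as np

A = np.array([[0.2, 0.6, 0.2],
              [0.3, 0.0, 0.7],
              [0.5, 0.0, 0.5]])

vals, vecs = np.linalg.eig(A.T)      # left eigenvectors of A
idx = np.argmin(np.abs(vals - 1.0))  # pick the eigenvalue closest to 1
pi = np.real(vecs[:, idx])
pi = pi / pi.sum()                   # rescale so the entries sum to 1

print(pi)  # ≈ [0.35211 0.21127 0.43662]
```

The rescaling step is exactly the normalization condition: eigenvectors are only defined up to a scalar, and the sum-to-1 constraint picks the one that is a valid probability distribution.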
π.A = π
We get three equations from here –
0.2x + 0.3y + 0.5z = x
0.6x = y
0.2x + 0.7y + 0.5z = z
π_1 + π_2 + π_3 = 1
We get one equation from here –
x + y + z = 1
π = [25/71, 15/71, 31/71] = [0.35211, 0.21127, 0.43662]
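One way to double-check the hand-derived fractions is to solve the same system numerically: two of the three balance equations are independent (the third is redundant), so we keep those two and append the normalization x + y + z = 1.

```python
import numpy as np

A = np.array([[0.2, 0.6, 0.2],
              [0.3, 0.0, 0.7],
              [0.5, 0.0, 0.5]])

# pi.A = pi rearranges to (A.T - I).pi = 0. Keep the first two of those
# equations and append x + y + z = 1 to form a solvable square system.
M = np.vstack([(A.T - np.eye(3))[:2], np.ones(3)])
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(M, b)

print(pi)  # ≈ [0.35211 0.21127 0.43662], i.e. [25/71, 15/71, 31/71]
```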
Verification through Random Walk:
Previously, using the Random Walk theory, a similar vector π was derived:
π = [0.35191, 0.21245, 0.43564]
The close similarity between the π values obtained by both methods validates the consistency and robustness of the Markov Chain model.
The alignment of results from two distinct methods – one theoretical (linear algebra) and one empirical (random walk) – showcases the utility and reliability of Markov Chains in analyzing stochastic processes.
This comparison also illustrates how linear algebra provides a more structured and analytical way to find the stationary distribution, which is a crucial aspect in understanding the long-term behavior of Markov Chains.
It’s possible for a Markov Chain to have more than one stationary state. This occurs if there are multiple eigenvectors corresponding to the eigenvalue of 1 in the transition matrix. Each of these eigenvectors represents a different stationary state.
Limitation of Random Walk Method:
The random walk method discussed earlier could miss some of these stationary states because it essentially simulates the evolution of the Markov Chain from a specific starting point.
If there are multiple stationary states, the random walk method might only converge to one of them, based on the starting state and the structure of the transition matrix.
Advantages of Linear Algebraic Approach:
By employing linear algebra, specifically by solving for eigenvectors of the transition matrix with an eigenvalue of 1, you can potentially uncover all stationary states that exist.
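As an illustration with a made-up chain (not the one from this chapter): a reducible transition matrix with two absorbing states has two independent eigenvectors for eigenvalue 1, i.e. two distinct stationary states, while a single random walk can only ever converge to one of them.

```python
import numpy as np

# Hypothetical reducible chain: states 0 and 2 are absorbing.
B = np.array([[1.0, 0.0, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.0, 1.0]])

vals, vecs = np.linalg.eig(B.T)
n_stationary = int(np.sum(np.isclose(vals, 1.0)))
print(n_stationary)  # both [1, 0, 0] and [0, 0, 1] satisfy pi.B = pi
```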
Let’s jump back to our discussion using Python, connect the dots, and validate these concepts both visually and numerically. This will solidify our understanding and our ability to apply Markov Chains in practical scenarios.