In Our Last Chapter, We have discussed about Random Walk in Markov Chain using python. Now, We shall dive into more detail into Markov Chain and its relation with linear algebra more theoretically.
So, We are approaching this Model with Linear Algebra. As this is a directed graph, We can just replace this with Adjacent Matrix.
If You don’t what it is, We have already seen its usage in our last discussion with our approach of designing Transition Matrix using Pandas.
So, the Markov Chain above will translate into –
The elements of the matrix just denote the weight of the edge connecting the two corresponding vertices. But, We have already seen the example in our last discussion. So,
Downside
state to an Upside
state.This matrix is called as “Transition Matrix”. Let’s denote it as A
.
Remember that, Our goal is to find the probabilities of each state. So, Conventionally, We take a row vector named π
whose element represent the probabilities of the state. Suppose, at the beginning i.e. the first day which is at t=0
, We are having a Downside
state. So π
will look like –
π_0 = [ 0 1 0 ]
Downside
state becomes 1
.0
.Now, Let’s see what happens if we multiply our row vector with the transition matrix. So, π_0.A =
We get the second row of the Transition Matrix as a result! In other words, We get the future probabilities corresponding to the Downside
state.
The Initial Transition Matrix
Future Probabilities corresponding to the Downside
state.
Now, We take this result and put them in the place of π_0
. In this case, We get the probabilities of the States of the first day after tomorrow. Then, We keep extending it –
So, π_1.A =
Similarly, π_2.A =
Now, if there exists a stationary state. After some iterations, the output vector should be the same as the input vector.
Let’s denote this special row vector as π
. Now, We should be able to write – π.A = π
.
If You have ever taken a course of Linear Algebra, You will find this equation is somewhat similar to our famous Eigenvector Equation Ax = λx
.
Consider λ = 1
and just reverse the order of multiplication. You will get out equilibrium state equation!
Now, coming to context, We can imagine that π
is a left eigenvector of the matrix A
with an eigenvalue equal to 1.
Equation 1: π.A = π
Now, the eigenvalue has to satisfy another condition – The elements of π
must add up to 1
.
Equation 2: π[1] + π[2] + π[3] = 1
Because – It denotes a probability distribution. So, after solving the above two equations we will get π
. This is the Stationary Distribution calculated from the theories of the Markov Chain with the help of Linear Algebra.
From π.A = π
We get three equations from here –
0.2x + 0.3y + 0.5z = x
0.6x = y
0.2x + 0.7y + 0.5z = z
From: π[1] + π[2] + π[3] = 1
We get one equation from here –
x + y + z = 1
After solving,
π = [25/71, 15/71, 31/71] = [0.35211, 0.21127. 0.43662]
In the last discussion, We calculated the value of π using the Random Walk Theory.
It was –
π = [0.35191,0.21245,0.43564]
(π)
calculatd by both methods is almost similar which is what validates Markov Chains even more – π[0.35211, 0.21127. 0.43662]
and [0.35191,0.21245,0.43564]
. Now, If there exists more than one eigenvector with eigenvalue 1, it will mean there are more stationary states which were impossible to find by the method of the random walk as described above!
Lets jump back to our discussion using Python and connect the dots and do the pending calculations clearly to understand the concepts with more clarity.