# Markov Chains in Stock Market Using Python – Getting Transition Matrix

In the last chapter, we discussed Markov Chains using the stock market as an example.

Now we will take a simple example to build a Markov Chain. Let's say we have a series of "NIFTY 50" prices and want to model their behavior to make predictions about future prices.

## Getting the Sample Data

First, we use the `NSEPython` library and its `index_history()` function to fetch the last 100 calendar days of data as a sample set (65 trading days in this run).

```python
from nsepython import *
import datetime  # imported explicitly rather than relying on the star import

symbol = "NIFTY 50"
days = 100  # look-back window in calendar days

# Dates are passed as strings such as "02-Jul-2021"
end_date = datetime.datetime.now().strftime("%d-%b-%Y")
start_date = (datetime.datetime.now() - datetime.timedelta(days=days)).strftime("%d-%b-%Y")

df = index_history(symbol, start_date, end_date)
print(df)
```
Output –
```
     Index Name  INDEX_NAME  HistoricalDate      OPEN      HIGH       LOW     CLOSE
0      Nifty 50    NIFTY 50     02 Jul 2021  15705.85  15738.35  15635.95  15722.20
1      Nifty 50    NIFTY 50     01 Jul 2021  15755.05  15755.55  15667.05  15680.00
2      Nifty 50    NIFTY 50     30 Jun 2021  15776.90  15839.10  15708.75  15721.50
3      Nifty 50    NIFTY 50     29 Jun 2021  15807.50  15835.90  15724.05  15748.45
4      Nifty 50    NIFTY 50     28 Jun 2021  15915.35  15915.65  15792.15  15814.70
...         ...         ...             ...       ...       ...       ...       ...
60     Nifty 50    NIFTY 50     05 Apr 2021  14837.70  14849.85  14459.50  14637.80
61     Nifty 50    NIFTY 50     01 Apr 2021  14798.40  14883.20  14692.45  14867.35
62     Nifty 50    NIFTY 50     31 Mar 2021  14811.85  14813.75  14670.25  14690.70
63     Nifty 50    NIFTY 50     30 Mar 2021  14628.50  14876.30  14617.60  14845.10
64     Nifty 50    NIFTY 50     26 Mar 2021  14506.30  14572.90  14414.25  14507.30

65 rows × 7 columns
```
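The sample above is listed newest first, and row-based pandas operations treat the row above as the earlier observation. A minimal sketch of putting the rows in chronological order is shown here. It uses five rows copied from the output above, and it assumes the dates and prices arrive as strings; the name `sample` is only for illustration.

```python
import pandas as pd

# Five rows copied from the sample output above (newest first).
# Assumption: dates and prices arrive as strings.
sample = pd.DataFrame({
    "HistoricalDate": ["02 Jul 2021", "01 Jul 2021", "30 Jun 2021", "29 Jun 2021", "28 Jun 2021"],
    "CLOSE": ["15722.20", "15680.00", "15721.50", "15748.45", "15814.70"],
})

# Parse the date strings and convert the prices to floats.
sample["HistoricalDate"] = pd.to_datetime(sample["HistoricalDate"], format="%d %b %Y")
sample["CLOSE"] = sample["CLOSE"].astype(float)

# Sort oldest first so that each row follows the previous trading day.
sample = sample.sort_values("HistoricalDate").reset_index(drop=True)
print(sample)
```

With the rows in this order, "the row above" and "yesterday" mean the same thing.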

## The States of a Markov Chain

As discussed in the earlier NIFTY example used to frame a Markov Chain, NIFTY can be in one of three states –

• Upside: The price has increased from yesterday's price.
• Downside: The price has decreased compared to yesterday's price.
• Consolidation: The price remains unchanged from the previous day.

To obtain the states in our data frame, the first task is to calculate the daily return. Then we use a function to map each return to a state. If we defined `Consolidation` as literally zero movement on the day, it would almost never occur in practice.

So we keep a little legroom: the idea is that a move within a small range still counts as `Consolidation`. Note that the snippet below, as written, labels every return of 0.1% or less as `Downside`, so `Consolidation` is only assigned to the first row, where no return can be computed.

```python
# Fractional change of each close relative to the row above it
df["state"] = df["CLOSE"].astype(float).pct_change()
# > 0.1% is Upside, any other number is Downside; NaN (the first row) falls through to Consolidation
df['state'] = df['state'].apply(lambda x: 'Upside' if (x > 0.001) else ('Downside' if (x <= 0.001) else 'Consolidation'))
df.tail()
```

Output –

We use `df.tail()` to limit the output to the last 5 rows and keep it readable.

```
     Index Name  INDEX_NAME  HistoricalDate      OPEN      HIGH       LOW     CLOSE     state
60     Nifty 50    NIFTY 50     05 Apr 2021  14837.70  14849.85  14459.50  14637.80  Downside
61     Nifty 50    NIFTY 50     01 Apr 2021  14798.40  14883.20  14692.45  14867.35    Upside
62     Nifty 50    NIFTY 50     31 Mar 2021  14811.85  14813.75  14670.25  14690.70  Downside
63     Nifty 50    NIFTY 50     30 Mar 2021  14628.50  14876.30  14617.60  14845.10    Upside
64     Nifty 50    NIFTY 50     26 Mar 2021  14506.30  14572.90  14414.25  14507.30  Downside
```
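The small band around zero described earlier can also be written as a symmetric threshold. The following is a minimal sketch, not the code that produced the output above. `classify_return` and its `band` parameter are hypothetical names introduced for illustration, and the returns in the example are made up.

```python
import math

def classify_return(daily_return, band=0.001):
    """Label a daily return; `band` is the half-width of the Consolidation zone."""
    if math.isnan(daily_return):
        return None  # no previous close to compare against
    if daily_return > band:
        return "Upside"
    if daily_return < -band:
        return "Downside"
    return "Consolidation"

# Made-up returns: two clear moves, two small moves, and one missing value.
labels = [classify_return(r) for r in (0.0157, -0.0119, 0.0004, -0.0009, float("nan"))]
print(labels)
# ['Upside', 'Downside', 'Consolidation', 'Consolidation', None]
```

Applied to the data frame, this would read `df["CLOSE"].astype(float).pct_change().apply(classify_return)`. The resulting counts may then differ from the ones shown in this chapter.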

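The `pct_change()` function of the pandas library compares each row's price with the price in the row above it. Reading the row above as the prior day, the result is `Today's State`. Both `pct_change()` and `shift(1)` work row by row, and the sample here is listed newest first, so sort it by date first if you need strict day-over-day transitions.

• But we want to analyze the transitions from `Yesterday's State` to `Today's State`.
• That's why we add a new column, `priorstate`, that contains the values of `Yesterday's State`.

```python
# Each row receives the state of the row directly above it
df['priorstate'] = df['state'].shift(1)
df.tail()
```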

Output –

```
     Index Name  INDEX_NAME  HistoricalDate      OPEN      HIGH       LOW     CLOSE     state  priorstate
60     Nifty 50    NIFTY 50     05 Apr 2021  14837.70  14849.85  14459.50  14637.80  Downside    Downside
61     Nifty 50    NIFTY 50     01 Apr 2021  14798.40  14883.20  14692.45  14867.35    Upside    Downside
62     Nifty 50    NIFTY 50     31 Mar 2021  14811.85  14813.75  14670.25  14690.70  Downside      Upside
63     Nifty 50    NIFTY 50     30 Mar 2021  14628.50  14876.30  14617.60  14845.10    Upside    Downside
64     Nifty 50    NIFTY 50     26 Mar 2021  14506.30  14572.90  14414.25  14507.30  Downside      Upside
```
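To see what `shift(1)` does in isolation, here is a small sketch on a made-up sequence of states listed oldest first. The data frame `toy` and its values are hypothetical and are not taken from the NIFTY sample.

```python
import pandas as pd

# Hypothetical states, listed oldest first.
toy = pd.DataFrame({"state": ["Upside", "Downside", "Downside", "Upside", "Downside"]})

# shift(1) moves every label down one row, so each row sees the previous row's state.
toy["priorstate"] = toy["state"].shift(1)

# The first row has no predecessor, so drop it before looking at transitions.
pairs = toy.dropna()[["priorstate", "state"]]
print(pairs)
```

Each remaining row is one observed transition, read as `priorstate` to `state`.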

## Coding Frequency Distribution Matrix for Markov Chain Model

Now that we have the current state and the prior state, we need to build the Frequency Distribution Matrix. It is easier to explain what it is by showing the outcome of the code –

```python
# Keep only rows where both the prior and the current state are known
states = df[['priorstate', 'state']].dropna()
# Count each (priorstate, state) pair; pairs that never occur are filled with 0
states_matrix = states.groupby(['priorstate', 'state']).size().unstack().fillna(0)
print(states_matrix)
```

Output –

```
state          Downside  Upside
priorstate
Consolidation       1.0     0.0
Downside           26.0    14.0
Upside             14.0     9.0
```

The matrix above says that NIFTY was down 26 times when the previous day was also down, and was down just once when the previous day was in the Consolidation state.

That is what the Frequency Distribution Matrix does: it shows how often each state occurs, given the prior state.
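The same kind of count table can also be produced with `pd.crosstab`. The sketch below uses a handful of made-up transitions rather than the NIFTY sample, so the numbers are purely illustrative.

```python
import pandas as pd

# Hypothetical (priorstate, state) pairs, not taken from the NIFTY sample.
pairs = pd.DataFrame({
    "priorstate": ["Upside", "Downside", "Downside", "Upside", "Downside"],
    "state": ["Downside", "Downside", "Upside", "Downside", "Upside"],
})

# Rows are the prior state, columns the current state, cells the number of occurrences.
freq = pd.crosstab(pairs["priorstate"], pairs["state"])
print(freq)
```

Unlike the `groupby` version, `crosstab` returns integer counts and fills missing combinations with 0 on its own.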

## Coding Transition Matrix for Markov Chain Model

Just as we calculated the Frequency Distribution Matrix, we now need to calculate the Transition Matrix. While discussing the basics of the Markov Chain, we made two points –

• Each arrow is called a transition from one state to another.
• The sum of the weights of the outgoing arrows from any state is 1.

That is what the Transition Matrix is all about. Let's look at the output first, as it makes the definition easier to put in layman's terms.

```python
# Divide each row by its row total so that every row sums to 1
transition_matrix = states_matrix.apply(lambda x: x / float(x.sum()), axis=1)
print(transition_matrix)
```

Output –

```
state          Downside    Upside
priorstate
Consolidation  1.000000  0.000000
Downside       0.650000  0.350000
Upside         0.608696  0.391304
```

The Transition Matrix shows the probability of each occurrence, rather than the number of occurrences shown by the Frequency Distribution Matrix. Notice that the weights from a given `priorstate` to all possible `state` values always sum to 1. For example –

`P(state="Downside" | priorstate="Downside") + P(state="Upside" | priorstate="Downside") = 0.65 + 0.35 = 1`

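That's why it is also called the "Initial Probability Matrix". In standard terminology, a matrix whose rows each sum to 1 is known as a (right) stochastic matrix.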
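The arithmetic behind the Transition Matrix can be checked by hand. The sketch below copies the counts from the Frequency Distribution Matrix above and divides each one by its row total; the variable names are chosen for illustration.

```python
# Counts copied from the Frequency Distribution Matrix above.
counts = {
    "Consolidation": {"Downside": 1, "Upside": 0},
    "Downside": {"Downside": 26, "Upside": 14},
    "Upside": {"Downside": 14, "Upside": 9},
}

transition = {}
for prior, row in counts.items():
    total = sum(row.values())  # number of days that followed this prior state
    transition[prior] = {state: n / total for state, n in row.items()}
    print(prior, {s: round(p, 6) for s, p in transition[prior].items()})
# Consolidation {'Downside': 1.0, 'Upside': 0.0}
# Downside {'Downside': 0.65, 'Upside': 0.35}
# Upside {'Downside': 0.608696, 'Upside': 0.391304}
```

For instance, 26 of the 40 days that followed a Downside day were Downside again, which gives 26 / 40 = 0.65.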

Note – After `dropna()`, no row has Consolidation as its current state. That's why the result is a `3x2` matrix instead of a `3x3` matrix: `unstack()` only creates columns for states that actually occur, and the missing Consolidation column would contain only zeros.
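If a square matrix is needed, the missing column can be added back explicitly. The following is a minimal sketch that rebuilds the table from the probabilities printed above and uses `reindex` to fill the gap with zeros; `states_order` and `square` are names chosen for illustration.

```python
import pandas as pd

states_order = ["Consolidation", "Downside", "Upside"]

# Transition probabilities copied from the output above (a 3x2 table).
transition_matrix = pd.DataFrame(
    {"Downside": [1.0, 0.65, 0.608696], "Upside": [0.0, 0.35, 0.391304]},
    index=states_order,
)

# Add any state missing from the rows or columns, filled with zeros.
square = transition_matrix.reindex(index=states_order, columns=states_order, fill_value=0.0)
print(square)
```

The rows still sum to 1, because the added column contributes nothing to any row.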

Before we jump into a Pythonic approach to building the Markov Chain, let's discuss the relationship between matrices and Markov Chains in more detail.
