Markov Chains in Stock Market Using Python – Getting Transition Matrix

In the last chapter, we discussed Markov chains using a stock market example.

Now we shall take a simple example to build a Markov chain. Let’s say we have a series of “NIFTY 50” prices and we want to model their behavior to make predictions about future prices.

Getting the Sample Data

First, we use the NSEPython library and its index_history() function to fetch the last 100 calendar days of data (about 65 trading days) as a sample set.

				
import datetime

from nsepython import *

symbol = "NIFTY 50"
days = 100  # calendar days to look back, not trading days

# Dates are passed as strings such as "02-Jul-2021"
end_date = datetime.datetime.now().strftime("%d-%b-%Y")
start_date = (datetime.datetime.now() - datetime.timedelta(days=days)).strftime("%d-%b-%Y")

df = index_history(symbol, start_date, end_date)
print(df)
				
			
Output – 
				
					#
	Index Name	INDEX_NAME	HistoricalDate	OPEN	HIGH	    LOW	        CLOSE
0	Nifty 50	NIFTY 50	02 Jul 2021	15705.85	15738.35	15635.95	15722.20
1	Nifty 50	NIFTY 50	01 Jul 2021	15755.05	15755.55	15667.05	15680.00
2	Nifty 50	NIFTY 50	30 Jun 2021	15776.90	15839.10	15708.75	15721.50
3	Nifty 50	NIFTY 50	29 Jun 2021	15807.50	15835.90	15724.05	15748.45
4	Nifty 50	NIFTY 50	28 Jun 2021	15915.35	15915.65	15792.15	15814.70
...	...	...	...	...	...	...	...
60	Nifty 50	NIFTY 50	05 Apr 2021	14837.70	14849.85	14459.50	14637.80
61	Nifty 50	NIFTY 50	01 Apr 2021	14798.40	14883.20	14692.45	14867.35
62	Nifty 50	NIFTY 50	31 Mar 2021	14811.85	14813.75	14670.25	14690.70
63	Nifty 50	NIFTY 50	30 Mar 2021	14628.50	14876.30	14617.60	14845.10
64	Nifty 50	NIFTY 50	26 Mar 2021	14506.30	14572.90	14414.25	14507.30
65 rows × 7 columns
				
			

The States of a Markov Chain

As discussed in the earlier NIFTY example, NIFTY can be in one of three states –

  • Upside: The price has increased from yesterday’s price.
  • Downside: The price has decreased from yesterday’s price.
  • Consolidation: The price remains unchanged from the previous day.

To obtain the states in our data frame, the first task is to calculate the daily return. Then we use a function to map each return to a state. If we defined Consolidation as a day with literally zero movement, it would almost never occur in practice.

So we leave a little leeway: if the movement stays within a small range, it is still called a Consolidation state.
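The band described above can be sketched in plain Python. This is only a minimal sketch, assuming a symmetric threshold of 0.1% around zero and closes listed oldest-first; `daily_returns`, `classify_return`, and `threshold` are illustrative names, not part of the original code.

```python
import math

def daily_returns(closes):
    """Fractional change of each close relative to the previous one (oldest first)."""
    returns = [math.nan]  # the first day has no previous close
    for prev, curr in zip(closes, closes[1:]):
        returns.append((curr - prev) / prev)
    return returns

def classify_return(r, threshold=0.001):
    """Map a daily return to a state using a symmetric band around zero (assumed)."""
    if math.isnan(r):
        return None  # undefined return: no state assigned
    if r > threshold:
        return "Upside"
    if r < -threshold:
        return "Downside"
    return "Consolidation"

# Closes taken from the sample output above, reordered oldest-first
# (26 Mar 2021 to 05 Apr 2021)
closes = [14507.30, 14845.10, 14690.70, 14867.35, 14637.80]
states = [classify_return(r) for r in daily_returns(closes)]
print(states)
```

With this version, a return between -0.1% and +0.1% is labeled Consolidation, and the first day, which has no previous close, gets no state at all.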

				
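# Fractional change of each CLOSE relative to the row above it
df["state"] = df["CLOSE"].astype(float).pct_change()
# As written, every return <= 0.001 is 'Downside'; only NaN (the first row) reaches 'Consolidation'.
# A symmetric band would test x < -0.001 for 'Downside' instead.
df["state"] = df["state"].apply(lambda x: 'Upside' if (x > 0.001) else ('Downside' if (x <= 0.001) else 'Consolidation'))
df.tail()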
				
			

Output –

We use df.tail() to limit the output to the last 5 rows, which is easier on the eyes.

				
					#
    Index Name	INDEX_NAME	HistoricalDate	OPEN	HIGH	    LOW	        CLOSE   	state
60	Nifty 50	NIFTY 50	05 Apr 2021	14837.70	14849.85	14459.50	14637.80	Downside
61	Nifty 50	NIFTY 50	01 Apr 2021	14798.40	14883.20	14692.45	14867.35	Upside
62	Nifty 50	NIFTY 50	31 Mar 2021	14811.85	14813.75	14670.25	14690.70	Downside
63	Nifty 50	NIFTY 50	30 Mar 2021	14628.50	14876.30	14617.60	14845.10	Upside
64	Nifty 50	NIFTY 50	26 Mar 2021	14506.30	14572.90	14414.25	14507.30	Downside
				
			

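The pct_change() function of the pandas library returns the fractional change of each row relative to the row before it, and the label derived from it is treated as Today's State. Note that the data above is listed newest-first, so the row before is the later trading day; sort the frame chronologically first if a strict yesterday-to-today return is needed.

  • But we want to analyze the transitions from Yesterday's State to Today's State.
  • That’s why we add a new column, priorstate, that contains the values of Yesterday's State.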
				
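# shift(1) moves each state down one row, so every row also carries the state of the row above it
df['priorstate'] = df['state'].shift(1)
df.tail()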
				
			

Output –

				
					#
    Index Name	INDEX_NAME	HistoricalDate	OPEN	HIGH	    LOW	        CLOSE   	state	    priorstate
60	Nifty 50	NIFTY 50	05 Apr 2021	14837.70	14849.85	14459.50	14637.80	Downside	Downside
61	Nifty 50	NIFTY 50	01 Apr 2021	14798.40	14883.20	14692.45	14867.35	Upside  	Downside
62	Nifty 50	NIFTY 50	31 Mar 2021	14811.85	14813.75	14670.25	14690.70	Downside	Upside
63	Nifty 50	NIFTY 50	30 Mar 2021	14628.50	14876.30	14617.60	14845.10	Upside  	Downside
64	Nifty 50	NIFTY 50	26 Mar 2021	14506.30	14572.90	14414.25	14507.30	Downside	Upside
				
			

Coding Frequency Distribution Matrix for Markov Chain Model

Now that we have the current state and the prior state, we need to build the Frequency Distribution Matrix. It is easier to explain by showing the outcome of the code –

				
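# Keep only rows that have both a prior state and a current state
states = df[['priorstate', 'state']].dropna()
# Count each (priorstate, state) pair, then pivot the current state into columns
states_matrix = states.groupby(['priorstate', 'state']).size().unstack().fillna(0)
print(states_matrix)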
				
			

Output –

				
					state       	Downside	Upside
priorstate		
Consolidation	1.0	        0.0
Downside	    26.0    	14.0
Upside	        14.0    	9.0
				
			

The matrix above says that NIFTY was down 26 times when the previous day was also down, and down just once when the previous day was in the Consolidation state. That’s what the Frequency Distribution Matrix does!

The Frequency Distribution Matrix shows the frequency, or number of occurrences, of each state given the prior state.
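The same counting idea can be sketched without pandas. The following is a minimal illustration, assuming a chronologically ordered list of states; `count_transitions` and the short `sequence` are hypothetical and are not the NIFTY data.

```python
from collections import Counter

def count_transitions(states):
    """Count how often each (prior state, current state) pair occurs."""
    # zip pairs every state with the one that follows it in the sequence
    return Counter(zip(states, states[1:]))

# Hypothetical, chronologically ordered sequence of daily states
sequence = ["Upside", "Downside", "Downside", "Upside", "Downside", "Downside"]
counts = count_transitions(sequence)

for (prior, current), n in sorted(counts.items()):
    print(f"{prior} -> {current}: {n}")
```

Each key of `counts` corresponds to one cell of the Frequency Distribution Matrix: the first element is the row (prior state) and the second is the column (current state).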

Coding Transition Matrix for Markov Chain Model

Just as we calculated the Frequency Distribution Matrix, we now need to calculate the Transition Matrix. While discussing the basics of the Markov chain, we covered two points –

  • Each arrow is called a transition from one state to another. 
  • The sum of the weights of the outgoing arrows from any state is 1.
That’s what the Transition Matrix is all about. Let’s look at the output first, as it makes the matrix easier to define in layman’s terms.
				
# Divide each row by its total so the counts become probabilities that sum to 1
transition_matrix = states_matrix.div(states_matrix.sum(axis=1), axis=0)
print(transition_matrix)
				
			

Output –

				
					state	        Downside	Upside
priorstate		
Consolidation	1.000000	0.000000
Downside    	0.650000	0.350000
Upside      	0.608696	0.391304
				
			

The Transition Matrix shows the probability of each occurrence rather than the number of occurrences shown in the Frequency Distribution Matrix. Notice that the weights from any priorstate to all states always sum to 1. For example –

P(state="Downside" | priorstate="Downside") + P(state="Upside" | priorstate="Downside")
= 0.65 + 0.35
= 1

A matrix whose rows each sum to 1 is known as a row-stochastic matrix; here it is also called the “Initial Probability Matrix”.
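The row normalization can also be checked by hand. Below is a minimal sketch in plain Python, assuming the counts are stored as a dictionary of rows and that every prior state has at least one observed transition; `to_transition_matrix` is an illustrative helper, not part of the original code.

```python
def to_transition_matrix(freq):
    """Turn {prior: {current: count}} into {prior: {current: probability}}."""
    matrix = {}
    for prior, row in freq.items():
        total = sum(row.values())  # assumed to be non-zero for every row
        matrix[prior] = {current: count / total for current, count in row.items()}
    return matrix

# Counts copied from the Frequency Distribution Matrix above
freq = {
    "Consolidation": {"Downside": 1, "Upside": 0},
    "Downside": {"Downside": 26, "Upside": 14},
    "Upside": {"Downside": 14, "Upside": 9},
}
transition = to_transition_matrix(freq)

for prior, row in transition.items():
    # Every row of outgoing probabilities should add up to 1
    assert abs(sum(row.values()) - 1.0) < 1e-9
    print(prior, {state: round(p, 6) for state, p in row.items()})
```

For the Downside row this gives 26/40 = 0.65 and 14/40 = 0.35, matching the pandas output.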

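Note – No day has Consolidation as its current state, so the result is a 3x2 matrix instead of a 3x3 matrix: unstack() only creates columns for states that actually occur. The missing Consolidation column would contain only zeros, which is why it can be left out of the display. The single Consolidation prior state comes from the first row, whose return is undefined (NaN) and so matches neither comparison in the lambda.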
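If a square 3x3 matrix is needed later, for instance to multiply the matrix by itself, the omitted column can be written back as zeros. Here is a minimal sketch in plain Python, where `STATES` and `to_square` are illustrative names and the state order is an arbitrary choice.

```python
STATES = ["Consolidation", "Downside", "Upside"]

def to_square(matrix, states=STATES):
    """Return a list of rows with one row and one column per state."""
    # Any (prior, current) pair that was never observed gets probability 0.0
    return [[matrix.get(prior, {}).get(current, 0.0) for current in states]
            for prior in states]

# Probabilities copied from the Transition Matrix output above
transition = {
    "Consolidation": {"Downside": 1.0, "Upside": 0.0},
    "Downside": {"Downside": 0.65, "Upside": 0.35},
    "Upside": {"Downside": 0.608696, "Upside": 0.391304},
}
square = to_square(transition)

for state, row in zip(STATES, square):
    print(state, row)
```

The first column of `square` is all zeros, which is exactly the column that the pandas output leaves out.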

Before we jump into a Pythonic approach to building the Markov chain, let’s discuss the relationship between matrices and Markov chains in more detail.
