In our last chapter, we discussed Markov Chains using an example from the stock market.
Now, we shall take a simple example to build a Markov Chain. Let’s say we have a series of “NIFTY 50” prices and we want to model their behavior to make predictions about the future price.
First, we shall use the NSEPython library and its index_history() function to get the last 100 calendar days of data as a sample set. That works out to 65 trading days in the output below.
import datetime
from nsepython import index_history

symbol = "NIFTY 50"
days = 100  # calendar days, not trading days

# Dates are passed as strings such as 02-Jul-2021
today = datetime.datetime.now()
end_date = today.strftime("%d-%b-%Y")
start_date = (today - datetime.timedelta(days=days)).strftime("%d-%b-%Y")

df = index_history(symbol, start_date, end_date)
print(df)
#
Index Name INDEX_NAME HistoricalDate OPEN HIGH LOW CLOSE
0 Nifty 50 NIFTY 50 02 Jul 2021 15705.85 15738.35 15635.95 15722.20
1 Nifty 50 NIFTY 50 01 Jul 2021 15755.05 15755.55 15667.05 15680.00
2 Nifty 50 NIFTY 50 30 Jun 2021 15776.90 15839.10 15708.75 15721.50
3 Nifty 50 NIFTY 50 29 Jun 2021 15807.50 15835.90 15724.05 15748.45
4 Nifty 50 NIFTY 50 28 Jun 2021 15915.35 15915.65 15792.15 15814.70
... ... ... ... ... ... ... ...
60 Nifty 50 NIFTY 50 05 Apr 2021 14837.70 14849.85 14459.50 14637.80
61 Nifty 50 NIFTY 50 01 Apr 2021 14798.40 14883.20 14692.45 14867.35
62 Nifty 50 NIFTY 50 31 Mar 2021 14811.85 14813.75 14670.25 14690.70
63 Nifty 50 NIFTY 50 30 Mar 2021 14628.50 14876.30 14617.60 14845.10
64 Nifty 50 NIFTY 50 26 Mar 2021 14506.30 14572.90 14414.25 14507.30
65 rows × 7 columns
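Both pct_change() and shift(), used in the next steps, work row by row, so the row order decides what "the previous day" means. The output above lists the newest date first. A minimal sketch, assuming we want the rows oldest first before computing returns: the three rows below are copied from the output above, and the assumption that CLOSE arrives as text is ours.

```python
import pandas as pd

# Three rows copied from the sample output above (newest first).
# Assumption: the prices arrive as strings, hence the astype(float).
sample = pd.DataFrame({
    "HistoricalDate": ["02 Jul 2021", "01 Jul 2021", "30 Jun 2021"],
    "CLOSE": ["15722.20", "15680.00", "15721.50"],
})

# Parse the date strings and prices, then sort oldest first.
sample["HistoricalDate"] = pd.to_datetime(sample["HistoricalDate"], format="%d %b %Y")
sample["CLOSE"] = sample["CLOSE"].astype(float)
chronological = sample.sort_values("HistoricalDate").reset_index(drop=True)
print(chronological)
```

With the rows in this order, each row's return compares that day's close with the close of the day before it.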
As discussed in the earlier NIFTY example while framing a Markov Chain, NIFTY can have three states: Upside, Downside, and Consolidation.
To obtain the states in our data frame, the first task is to calculate the daily return. Then we shall use a function to assign a state according to the return. If we defined the Consolidation state as a day with literally 0 movement, it would realistically never occur. So we keep a minimum of legroom: if the movement stays within a small range (here ±0.1%), the day is still called a Consolidation state.
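# Daily return: fractional change in CLOSE from the previous row
df["state"] = df["CLOSE"].astype(float).pct_change()
# Above +0.1% is Upside, below -0.1% is Downside, anything else is Consolidation
# (the first row has no return, so it also falls into Consolidation)
df["state"] = df["state"].apply(lambda x: "Upside" if x > 0.001 else ("Downside" if x < -0.001 else "Consolidation"))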
df.tail()
Output –
We use the df.tail() function to limit the output to the last 5 rows, which is easier on the eyes.
#
Index Name INDEX_NAME HistoricalDate OPEN HIGH LOW CLOSE state
60 Nifty 50 NIFTY 50 05 Apr 2021 14837.70 14849.85 14459.50 14637.80 Downside
61 Nifty 50 NIFTY 50 01 Apr 2021 14798.40 14883.20 14692.45 14867.35 Upside
62 Nifty 50 NIFTY 50 31 Mar 2021 14811.85 14813.75 14670.25 14690.70 Downside
63 Nifty 50 NIFTY 50 30 Mar 2021 14628.50 14876.30 14617.60 14845.10 Upside
64 Nifty 50 NIFTY 50 26 Mar 2021 14506.30 14572.90 14414.25 14507.30 Downside
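The classification step can also be written as a small named function, which makes the ±0.1% band explicit. This is a minimal sketch on made-up prices listed oldest first; classify_return and THRESHOLD are hypothetical names, and returning None for the first row (which has no return) is our choice rather than something the code above does.

```python
import pandas as pd

THRESHOLD = 0.001  # 0.1% band treated as Consolidation

def classify_return(r, threshold=THRESHOLD):
    """Map one daily return to a state label (hypothetical helper)."""
    if pd.isna(r):
        return None  # the first row has no previous close
    if r > threshold:
        return "Upside"
    if r < -threshold:
        return "Downside"
    return "Consolidation"

# Made-up closes, oldest first, chosen so that all three states occur.
closes = pd.Series([100.0, 101.0, 101.05, 100.0])
returns = closes.pct_change()
states = returns.apply(classify_return)
print(states.tolist())
```

Keeping the threshold in one place makes it easy to try a wider or narrower Consolidation band.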
The pct_change() function of the pandas library computes the change from the previous row's price to the current row's price, so the state column holds Today's State.
A Markov Chain, however, describes the transition from Yesterday's State to Today's State. So we add a column priorstate that contains the values of Yesterday's State.
df['priorstate']=df['state'].shift(1)
df.tail()
Output –
#
Index Name INDEX_NAME HistoricalDate OPEN HIGH LOW CLOSE state priorstate
60 Nifty 50 NIFTY 50 05 Apr 2021 14837.70 14849.85 14459.50 14637.80 Downside Downside
61 Nifty 50 NIFTY 50 01 Apr 2021 14798.40 14883.20 14692.45 14867.35 Upside Downside
62 Nifty 50 NIFTY 50 31 Mar 2021 14811.85 14813.75 14670.25 14690.70 Downside Upside
63 Nifty 50 NIFTY 50 30 Mar 2021 14628.50 14876.30 14617.60 14845.10 Upside Downside
64 Nifty 50 NIFTY 50 26 Mar 2021 14506.30 14572.90 14414.25 14507.30 Downside Upside
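To see what shift(1) does in isolation, here is a minimal sketch on a hypothetical list of states, assumed to be ordered oldest first. Each row receives the state of the row above it, and the first row has no prior state.

```python
import pandas as pd

# Hypothetical states, oldest first.
demo = pd.DataFrame({"state": ["Upside", "Downside", "Downside", "Upside"]})

# shift(1) moves every value down one row, so each row sees the row above it.
demo["priorstate"] = demo["state"].shift(1)

# The first row has no prior state and is dropped before counting transitions.
pairs = demo.dropna()
print(pairs)
```

Each remaining row is one observed transition from priorstate to state.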
Now that we have the Current State and Prior State, we need to build the Frequency Distribution Matrix. It is easier to explain its definition by showing the outcome of the code –
states = df[['priorstate', 'state']].dropna()
states_matrix = states.groupby(['priorstate','state']).size().unstack().fillna(0)
print(states_matrix)
Output –
state Downside Upside
priorstate
Consolidation 1.0 0.0
Downside 26.0 14.0
Upside 14.0 9.0
The matrix above tells us that NIFTY was down 26 times when the previous day was also down, and that it was down just once when the previous day was in the Consolidation state. That is what the Frequency Distribution Matrix does: it shows the frequency, or number of occurrences, of each state given the Prior State.
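The same kind of count can be produced with pandas.crosstab. The sketch below uses hypothetical transition pairs and reindexes the result so that every state appears as both a row and a column, even when it was never observed. STATES and the pair data are our assumptions, not part of the data set above.

```python
import pandas as pd

STATES = ["Consolidation", "Downside", "Upside"]

# Hypothetical (priorstate, state) pairs, one row per observed transition.
pairs = pd.DataFrame({
    "priorstate": ["Upside", "Downside", "Downside", "Upside", "Downside"],
    "state": ["Downside", "Downside", "Upside", "Downside", "Downside"],
})

# Count how often each state follows each prior state.
counts = pd.crosstab(pairs["priorstate"], pairs["state"])

# Add rows and columns for states that never occurred, filled with 0.
counts = counts.reindex(index=STATES, columns=STATES, fill_value=0)
print(counts)
```

Reindexing keeps the matrix square, which is convenient if it is multiplied by itself later.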
Now, just as we calculated the Frequency Distribution Matrix, we need to calculate the Transition Matrix. While discussing the basics of the Markov Chain, we discussed two points –
transition_matrix= states_matrix.apply(lambda x: x/float(x.sum()),axis=1)
print(transition_matrix)
Output –
state Downside Upside
priorstate
Consolidation 1.000000 0.000000
Downside 0.650000 0.350000
Upside 0.608696 0.391304
The Transition Matrix shows the probability of each occurrence instead of the number of occurrences shown by the Frequency Distribution Matrix. You can also see that the probabilities from a given priorstate to every state always sum to 1. For example –
P(state="Downside" | priorstate="Downside") + P(state="Upside" | priorstate="Downside")
= 0.65 + 0.35
= 1
A matrix whose rows each sum to 1 is known as a stochastic matrix; here it is also called the “Initial Probability Matrix”.
Note – Consolidation never occurs as a current state in this sample, so unstack() creates no column for it and the result is a 3x2 matrix instead of a 3x3 matrix. The missing column is equivalent to a column of zeros, so nothing is lost by omitting it.
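If a square 3x3 matrix is preferred, one option is to write the Consolidation column out as zeros before normalising. The sketch below is one possible approach under that assumption: it reuses the counts from the Frequency Distribution Matrix above, and the STATES ordering is ours.

```python
import pandas as pd

STATES = ["Consolidation", "Downside", "Upside"]

# Counts copied from the Frequency Distribution Matrix above, with the
# unobserved Consolidation column written out as zeros (our addition).
counts = pd.DataFrame(
    [[0, 1, 0],
     [0, 26, 14],
     [0, 14, 9]],
    index=STATES,
    columns=STATES,
)

# Divide every row by its own total so that each row sums to 1.
row_totals = counts.sum(axis=1)
transition = counts.div(row_totals, axis=0)
print(transition.round(6))

# Check that each row is a probability distribution.
assert ((transition.sum(axis=1) - 1).abs() < 1e-9).all()
```

The Downside and Upside rows reproduce the probabilities shown above, and the extra column contains only zeros.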
Now, before we jump into a Pythonic approach to building the Markov Chain, let’s discuss the relation between matrices and Markov Chains in more detail.