Historical data is the most important thing if you're backtesting. NSE provides historical data for free in various time frames. To scrape that data, we have designed a function named equity_history() in the NSEPython Library. The documentation follows –
Syntax:
equity_history(symbol,series,start_date,end_date)
Program Structure:
def equity_history(symbol,series,start_date,end_date):
    payload = nsefetch("https://www.nseindia.com/api/historical/cm/equity?symbol="+symbol+"&series=[%22"+series+"%22]&from="+start_date+"&to="+end_date+"")
    return pd.DataFrame.from_records(payload["data"])
Example:
symbol = "SBIN"
series = "EQ"
start_date = "08-06-2021"
end_date ="14-06-2021"
print(equity_history(symbol,series,start_date,end_date))
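With these inputs, the function simply fills in the URL template from its body, so the request that goes out looks like this:
https://www.nseindia.com/api/historical/cm/equity?symbol=SBIN&series=[%22EQ%22]&from=08-06-2021&to=14-06-2021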
It will return a Pandas DataFrame as output. The output is huge, so we are not pasting it here. But there is a huge limitation – 365 days.
So, how do we improve this function?
Without further ado, we shall paste the code first and explain it side by side. Here, we have copied the previous code into a different function, equity_history_virgin(), which we shall use as a base.
from nsepython import *
logging.basicConfig(level=logging.DEBUG)
def equity_history_virgin(symbol,series,start_date,end_date):
url="https://www.nseindia.com/api/historical/cm/equity?symbol="+symbol+"&series=[%22"+series+"%22]&from="+str(start_date)+"&to="+str(end_date)+""
payload = nsefetch(url)
return pd.DataFrame.from_records(payload["data"])
We limit each request to 40 days because "It will return only 50 days of data even if you ask for the last 90 days and discard the rest!". So we need to divide our input into chunks of 40-day requests, and once we have the chunks, we will stitch them together.
Now the agenda is –
#We get the input as text, so it is converted from a String to a Datetime object.
start_date = datetime.datetime.strptime(start_date, "%d-%m-%Y")
end_date = datetime.datetime.strptime(end_date, "%d-%m-%Y")
logging.info("Starting Date: "+str(start_date))
logging.info("Ending Date: "+str(end_date))
#We calculate the difference in days between the two dates
diff = end_date-start_date
logging.info("Total Number of Days: "+str(diff.days))
logging.info("Total FOR Loops in the program: "+str(int(diff.days/40)))
logging.info("Remainder Loop: " + str(diff.days-(int(diff.days/40)*40)))
Now, we will run the loop. Each pass appends a fresh 40-day chunk to the growing DataFrame, so the table is stitched together segment by segment, hence the "Caterpillar" reference. That is also why total.iloc[::-1].reset_index(drop=True) is used at the end: it reverses the stitched DataFrame and resets its index.
total=pd.DataFrame()
for i in range (0,int(diff.days/40)):
    temp_date = (start_date+datetime.timedelta(days=(40))).strftime("%d-%m-%Y")
    start_date = datetime.datetime.strftime(start_date, "%d-%m-%Y")
    logging.info("Loop = "+str(i))
    logging.info("====")
    logging.info("Starting Date: "+str(start_date))
    logging.info("Ending Date: "+str(temp_date))
    logging.info("====")
    total=total.append(equity_history_virgin(symbol,series,start_date,temp_date))
    logging.info("Length of the Table: "+ str(len(total)))

    #Preparation for the next loop
    start_date = datetime.datetime.strptime(temp_date, "%d-%m-%Y")

start_date = datetime.datetime.strftime(start_date, "%d-%m-%Y")
end_date = datetime.datetime.strftime(end_date, "%d-%m-%Y")

logging.info("End Loop")
logging.info("====")
logging.info("Starting Date: "+str(start_date))
logging.info("Ending Date: "+str(end_date))
logging.info("====")

total=total.append(equity_history_virgin(symbol,series,start_date,end_date))

logging.info("Finale")
logging.info("Length of the Total Dataset: "+ str(len(total)))

payload = total.iloc[::-1].reset_index(drop=True)
print(payload)
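A side note for anyone running this snippet on a recent pandas: DataFrame.append was deprecated and then removed in pandas 2.0, so the same stitching can be done with pd.concat. Here is a minimal sketch, assuming the same start_date and end_date datetime objects and the equity_history_virgin() function from above:
total_chunks = []
for i in range(0, int(diff.days/40)):
    temp_date = (start_date + datetime.timedelta(days=40)).strftime("%d-%m-%Y")
    # Fetch one 40-day chunk and collect it in a list instead of appending in place
    total_chunks.append(equity_history_virgin(symbol, series, start_date.strftime("%d-%m-%Y"), temp_date))
    start_date = datetime.datetime.strptime(temp_date, "%d-%m-%Y")
# Remainder request for the days left over after the full 40-day chunks
total_chunks.append(equity_history_virgin(symbol, series, start_date.strftime("%d-%m-%Y"), end_date.strftime("%d-%m-%Y")))
# Stitch everything in one go, then reverse and reset the index as before
total = pd.concat(total_chunks, ignore_index=True)
payload = total.iloc[::-1].reset_index(drop=True)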
Anyway, we add these variables as parameters and fold the above two parts into the function definition of
def equity_history(symbol,series,start_date,end_date):
Then we updated the NSEPython Library. That's the perk of an open-source library, right? We can modify any function instantly. So, all we have to run is –
from nsepython import *
logging.basicConfig(level=logging.INFO)
symbol = "SBIN"
series = "EQ"
start_date = "08-01-2021"
end_date ="14-06-2021"
print(equity_history(symbol,series,start_date,end_date))
Please note that we have used logging.basicConfig(level=logging.INFO). Otherwise, the output of the logging calls will be skipped.
Anyway, here is the output –
INFO:root:Starting Date: 2021-01-08 00:00:00
INFO:root:Ending Date: 2021-06-14 00:00:00
INFO:root:Total Number of Days: 157
INFO:root:Total FOR Loops in the program: 3
INFO:root:Remainder Loop: 37
INFO:root:Loop = 0
INFO:root:====
INFO:root:Starting Date: 08-01-2021
INFO:root:Ending Date: 2021-06-14 00:00:00
INFO:root:====
INFO:root:Length of the Table: 28
INFO:root:Loop = 1
INFO:root:====
INFO:root:Starting Date: 17-02-2021
INFO:root:Ending Date: 2021-06-14 00:00:00
INFO:root:====
INFO:root:Length of the Table: 55
INFO:root:Loop = 2
INFO:root:====
INFO:root:Starting Date: 29-03-2021
INFO:root:Ending Date: 2021-06-14 00:00:00
INFO:root:====
INFO:root:Length of the Table: 81
INFO:root:End Loop
INFO:root:====
INFO:root:Starting Date: 08-05-2021
INFO:root:Ending Date: 14-06-2021
INFO:root:====
INFO:root:Finale
INFO:root:Length of the Total Dataset: 106
_id CH_SYMBOL CH_SERIES ... VWAP mTIMESTAMP CA
0 6099206528330700080ff5a2 SBIN EQ ... 362.87 10-May-2021 NaN
1 609a71fe45df9f0008b768c2 SBIN EQ ... 362.80 11-May-2021 NaN
2 609bc37ec132690009effd5e SBIN EQ ... 368.18 12-May-2021 NaN
3 609e667e885aee00088f452d SBIN EQ ... 365.59 14-May-2021 NaN
4 60a25afd4e61470008bbebf9 SBIN EQ ... 376.90 17-May-2021 NaN
.. ... ... ... ... ... ... ...
101 60251c628160710008fa5699 SBIN EQ ... 391.61 11-Feb-2021 NaN
102 60266de0e854190008018823 SBIN EQ ... 392.34 12-Feb-2021 NaN
103 602a6261e85419000827e6de SBIN EQ ... 403.08 15-Feb-2021 NaN
104 602bb3db05368f0008dd72ad SBIN EQ ... 407.39 16-Feb-2021 NaN
105 602d057ee89593000836a7a8 SBIN EQ ... 410.54 17-Feb-2021 NaN
[106 rows x 24 columns]
[Finished in 1.788s]
We will keep making the NSEPython Library better, and so can you!
Now we can get historical data of an index using the method below:
import requests
import pandas as pd
from io import StringIO

baseURL = "https://www.niftyindices.com"
# URL of the endpoint that serves historical index data
url = "https://www.niftyindices.com/Backpage.aspx/getHistoricaldatatabletoString"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "accept-language": "en-US,en;q=0.9"
}

# Create a session object
session = requests.Session()

# Send a GET request first so the session picks up the site's cookies
session.get(baseURL, headers=headers)

# Data to send in the POST request
payload = {'name': 'NIFTY 50', 'startDate': '01-Jan-2000', 'endDate': '26-Aug-2023'}

# Send the POST request with the session cookies
response = session.post(url, headers=headers, json=payload)

# Check the response status code
if response.status_code == 200:
    # The request was successful; the 'd' field holds the table as a JSON string
    data = response.json()
    df = pd.read_json(StringIO(data['d']))
    print(df)
else:
    # The request was not successful
    print(f"Request failed with status code {response.status_code}")
Hi Team, quick question –
How do we download historical data for all the NSE 500 stocks? Do we have to go one by one and then download each to a database, or is there another way to download it?
Thanks
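One possible approach, shown here as a minimal sketch rather than a built-in NSEPython feature: loop over the constituent symbols and call the equity_history() function from this post for each one. The three symbols below are only placeholders; the actual Nifty 500 constituent list has to be obtained separately.
from nsepython import *

symbols = ["SBIN", "RELIANCE", "TCS"]   # placeholders: substitute the full NSE 500 symbol list
series = "EQ"
start_date = "08-01-2021"
end_date = "14-06-2021"

for symbol in symbols:
    df = equity_history(symbol, series, start_date, end_date)
    # Persist each symbol separately, e.g. one CSV per stock, instead of a database
    df.to_csv(symbol + ".csv", index=False)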