Use Python to Conduct the Augmented Dickey-Fuller (ADF) Test on Indian Stocks

The Augmented Dickey-Fuller (ADF) test is a common statistical test used to determine whether a given time series is stationary or not. Stationarity is a crucial concept in time series analysis, as many forecasting methods assume that the time series is stationary. A time series is said to be stationary if its statistical properties, such as mean and variance, remain constant over time.
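As a rough illustration of the definition above (an informal check, not a substitute for a formal test), the sketch below compares the mean of the first and second halves of two toy series. The helper name `half_means` and both series are made up for this example.

```python
from statistics import fmean

def half_means(series):
    """Return the mean of the first half and of the second half of a series."""
    mid = len(series) // 2
    return fmean(series[:mid]), fmean(series[mid:])

# A toy series that oscillates around a constant level (its mean does not drift)
oscillating = [(-1) ** t for t in range(100)]

# A toy series with a steady upward trend (its mean drifts over time)
trending = [float(t) for t in range(100)]

print(half_means(oscillating))
print(half_means(trending))
```

The oscillating series has the same mean in both halves, while the trending series does not, which is the kind of drift that stationarity rules out.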

Formula of Sharpe Ratio

The formula for calculating the Sharpe Ratio is:
\[ \text{Sharpe Ratio} = \frac{\text{Expected return of the investment} - \text{Risk-free rate}}{\text{Standard deviation of the investment's returns}} \]
Where,
  • Expected return of the investment is the mean return of the investment.
  • Risk-free rate is the return of a risk-free investment, like a government bond.
  • Standard deviation of the investment’s returns is a measure of the investment’s volatility.
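The formula above can be sketched in a few lines of Python. The function name `sharpe_ratio` and the return figures are illustrative assumptions; the returns and the risk-free rate are assumed to be expressed over the same period.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free_rate):
    """Sharpe ratio from per-period returns and a per-period risk-free rate."""
    excess = mean(returns) - risk_free_rate
    return excess / stdev(returns)  # stdev is the sample standard deviation

period_returns = [0.01, 0.02, -0.01, 0.03, 0.00]  # illustrative values only
ratio = sharpe_ratio(period_returns, risk_free_rate=0.001)
print(round(ratio, 4))
```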

Computation of ADF Test Using NSEPython

Let’s conduct the Augmented Dickey-Fuller (ADF) test on SBIN (State Bank of India) using the nsepython library. To perform the ADF test, we first need the time series data for SBIN. Here’s how you could go about it:

This Python code snippet has been written for the NSEPython library for the first time.

Step 1: Fetch the Data

You can use the equity_history function from the nsepython library to get the historical data for SBIN.

				
from nsepythonserver import *
import datetime  # imported explicitly rather than relying on the star import

# Get daily prices
today = datetime.datetime.today().strftime('%d-%m-%Y')
one_year_ago = (datetime.datetime.today() - datetime.timedelta(days=365)).strftime('%d-%m-%Y')

# Get the historical data for SBIN
sbin_data = equity_history("SBIN", "EQ", one_year_ago, today)
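The date handling in this step can be isolated into a small helper, which makes the 'dd-mm-yyyy' format easier to check without calling the NSE servers. The helper name `date_window` is hypothetical and not part of nsepython.

```python
import datetime

def date_window(days=365, end=None):
    """Return (start, end) as 'dd-mm-yyyy' strings covering the last `days` days."""
    end = end or datetime.date.today()
    start = end - datetime.timedelta(days=days)
    fmt = '%d-%m-%Y'
    return start.strftime(fmt), end.strftime(fmt)

# A fixed end date keeps the result reproducible
start_str, end_str = date_window(365, end=datetime.date(2024, 3, 15))
print(start_str, end_str)
```

Because 2024 is a leap year, going back 365 days from 15 March 2024 lands on 16 March 2023 rather than 15 March 2023.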

				
			

Step 2: Prepare the Time Series Data

Extract the column of interest, for instance, the closing price, which will be used for the ADF test.

				
# The column 'CH_CLOSING_PRICE' contains the closing prices
sbin_time_series = sbin_data['CH_CLOSING_PRICE']
				
			

In the code provided, sbin_time_series represents the time series data of the closing prices of SBIN (State Bank of India). 

Time series data is a sequence of numerical data points in successive order, usually recorded at equally spaced time intervals. Here, each data point is the closing price of SBIN on a particular trading day, and the sequence is ordered chronologically from the oldest date to the most recent.
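Since the ADF test depends on the order of the observations, it can be worth enforcing chronological order explicitly. The sketch below works on plain row dictionaries and assumes each row carries a `CH_TIMESTAMP` date string in `YYYY-MM-DD` form; that field name and format are assumptions about the returned data, and the sample rows are made up.

```python
from datetime import datetime

def chronological_closes(rows, date_key='CH_TIMESTAMP', price_key='CH_CLOSING_PRICE'):
    """Sort rows oldest-first and return the closing prices as floats."""
    ordered = sorted(rows, key=lambda r: datetime.strptime(r[date_key], '%Y-%m-%d'))
    return [float(r[price_key]) for r in ordered]

# Made-up rows, newest first, purely for illustration
sample_rows = [
    {'CH_TIMESTAMP': '2024-01-03', 'CH_CLOSING_PRICE': '103.5'},
    {'CH_TIMESTAMP': '2024-01-02', 'CH_CLOSING_PRICE': 101.0},
    {'CH_TIMESTAMP': '2024-01-01', 'CH_CLOSING_PRICE': 100.0},
]
closes = chronological_closes(sample_rows)
print(closes)
```

If the data is held in a pandas DataFrame, `to_dict('records')` produces rows of this shape.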

Step 3: Conduct the ADF Test

Use the adfuller function from the statsmodels library to conduct the Augmented Dickey-Fuller test.

				
import statsmodels.tsa.stattools as ts

# Conduct the ADF test
adf_test = ts.adfuller(sbin_time_series, autolag='AIC')

# Output the results
print(f'ADF Statistic: {adf_test[0]}')
print(f'p-value: {adf_test[1]}')

				
			
  • The autolag='AIC' argument in the code is used to automatically determine the optimal lag length that will be used in the Augmented Dickey-Fuller test. 
  • The lag length is the number of lagged differences of the time series that will be included in the test regression.
  • The Akaike Information Criterion (AIC) is a measure used to compare statistical models fitted to the same data. It rewards goodness of fit and penalizes additional parameters, so a lower AIC value indicates a better balance between fit and model complexity.
  • When autolag='AIC' is specified, the function will iteratively test different lag lengths and choose the one that minimizes the AIC, aiming to find the most parsimonious model that explains the data well.
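For reference, the test regression these points describe can be written as follows. This is the standard textbook form with a constant term, which corresponds to the default regression='c' setting of adfuller; the symbols are notation introduced here, not taken from the code above.

\[ \Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t \]

Here \(y_t\) is the closing price on day \(t\), \(\Delta y_t = y_t - y_{t-1}\), and \(p\) is the lag length chosen by autolag. The null hypothesis is \(\gamma = 0\) (a unit root, so the series is non-stationary) against the alternative \(\gamma < 0\). For each candidate \(p\), the criterion being minimized is

\[ \text{AIC} = 2k - 2\ln(\hat{L}) \]

where \(k\) is the number of estimated parameters and \(\hat{L}\) is the maximized likelihood of the regression.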

Output

In the code above, sbin_time_series is the time series data of SBIN’s closing prices, and ts.adfuller is the function used to conduct the Augmented Dickey-Fuller test. The autolag='AIC' argument is used to automatically select the lag length that minimizes the Akaike Information Criterion.

This code will provide you with the ADF statistic and the p-value, which you can use to determine whether the time series is stationary or not.

				
ADF Statistic: -3.564168196716654
p-value: 0.006484183313370225
				
			

The Augmented Dickey-Fuller (ADF) statistic of -3.564 suggests that the time series is likely stationary: the more negative the statistic, the stronger the evidence against a unit root. Strictly, the statistic is judged against the test's critical values, which adfuller returns as a dictionary in adf_test[4].

The p-value of 0.00648 is below a common significance level such as 0.05, so we can reject the null hypothesis of a unit root. This supports the conclusion that the series is stationary over the sampled period.
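The decision rule described here can be sketched as a small helper. The function name `interpret_adf` is hypothetical, and the critical values below are approximate, illustrative numbers for a constant-only regression on a few hundred observations; in practice they would be read from the dictionary that adfuller returns.

```python
def interpret_adf(statistic, p_value, critical_values, alpha=0.05):
    """Summarize an ADF result: reject the unit-root null if p_value < alpha,
    and list the significance levels whose critical value the statistic is below."""
    levels = [level for level, cv in critical_values.items() if statistic < cv]
    return {'reject_unit_root': p_value < alpha, 'levels_rejected': levels}

# Statistic and p-value from the output above; the critical values are
# approximate, illustrative numbers and were not produced by that run
approx_critical_values = {'1%': -3.46, '5%': -2.87, '10%': -2.57}
summary = interpret_adf(-3.564168196716654, 0.006484183313370225, approx_critical_values)
print(summary)
```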

Q. Does a non-stationary time series imply that the stock is more likely to consolidate rather than show a trending behavior?

A. No, a non-stationary time series implies that the statistical properties of the series, such as its mean and variance, change over time. This could be indicative of a trending stock, where prices are consistently moving upwards or downwards over a period, rather than consolidating. On the other hand, a stationary time series has constant statistical properties over time, meaning it could be more likely to consolidate, as the prices hover around a constant mean without any clear upward or downward trend.
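One common way to work with a trending series, mentioned here as an addition rather than something from the answer above, is to look at the changes between consecutive observations instead of the levels. A minimal sketch on a made-up price path, with a hypothetical helper name:

```python
def first_differences(series):
    """Return the change between each pair of consecutive observations."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Toy price path with a steady upward trend (illustrative values only)
prices = [100 + 2 * t for t in range(10)]
changes = first_differences(prices)
print(changes)
```

The levels drift upward, but the changes stay at a constant value, so the differenced series no longer carries the trend.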

Wrapping Up

Here is the final version of the adf_test() function.

				
from nsepythonserver import *
import datetime
import statsmodels.tsa.stattools as ts

def adf_test(symbol, series, start_date=None, end_date=None, column='CH_CLOSING_PRICE'):
    """
    Conduct the Augmented Dickey-Fuller test to check the stationarity of a time series data.

    Parameters:
        symbol (str): The ticker symbol of the equity (e.g., "SBIN").
        series (str): The series code (e.g., "EQ").
        start_date (str, optional): The start date for historical data in 'dd-mm-yyyy' format.
        end_date (str, optional): The end date for historical data in 'dd-mm-yyyy' format.
        column (str, optional): The column name to be used for time series analysis.

    Returns:
        str: ADF Statistic and p-value.
    """

    # Set default values for start_date and end_date if not provided
    if not end_date:
        end_date = datetime.datetime.today().strftime('%d-%m-%Y')
    if not start_date:
        start_date = (datetime.datetime.today() - datetime.timedelta(days=365)).strftime('%d-%m-%Y')
        
    # Get the historical data for the specified equity
    equity_data = equity_history(symbol, series, start_date, end_date)
    
    # Prepare the time series data from the specified column
    time_series_data = equity_data[column]

    # Conduct the ADF test
    adf_test_result = ts.adfuller(time_series_data, autolag='AIC')
    
    # Return the ADF Statistic and p-value
    return f"ADF Statistic: {adf_test_result[0]}\np-value: {adf_test_result[1]}"

# Example usage:
result = adf_test("SBIN", "EQ")
print(result)

				
			

Output

				
ADF Statistic: -3.564168196716654
p-value: 0.006484183313370225
				
			

In this function:

  • symbol and series are used to specify the equity.
  • start_date and end_date are optional and used to specify the date range for the historical data.
  • column is optional and used to specify the column to be analyzed.
  • It fetches historical data, prepares the time series, conducts the Augmented Dickey-Fuller (ADF) test, and returns the ADF Statistic and p-value to determine the stationarity of the time series data.
