Fine-Tuning the Entropy Alpha Strategy: Incorporating Micro Conditions for Optimal Backtesting

Fine-Tuning with Micro-Conditions

The Entropy Alpha strategy’s core logic relies on the market conforming to a Log-Normal Distribution. However, the high volatility during market opening and closing hours, driven by liquidity shifts, can disrupt this assumption. To stabilise the strategy’s performance and ensure more reliable execution, we introduce a set of micro-conditions.

These parameters primarily control the timing of trades and filter out potentially false signals, without altering the fundamental Entropy Alpha setup.

Intuition. The goal of these rules is not to find more trades, but to systematically avoid trades during periods of anomalous volatility (market open/close) and to prevent over-trading on a single instrument based on noisy, repetitive signals.

Global Conditions

These rules apply to the strategy as a whole, defining its operational window.

  • Entry Time Window: As an intraday strategy, trades are initiated only between 9:30 AM and 2:20 PM. This avoids the opening volatility before 9:30 AM and the end-of-day volatility after 2:20 PM.
  • Trade Management: The strategy goes long on the stock with the day’s high as the initial target. If the stop loss is not hit during the day, all open positions are automatically squared off at 2:50 PM.
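
The two global rules reduce to simple clock comparisons. The sketch below is illustrative only: the names `ENTRY_START`, `ENTRY_END`, `SQUARE_OFF`, `can_enter` and `must_square_off` are hypothetical, and the window is assumed to be inclusive at both ends.

```python
from datetime import time

# Operational window taken from the global conditions above
ENTRY_START = time(9, 30)   # earliest allowed entry
ENTRY_END = time(14, 20)    # latest allowed entry
SQUARE_OFF = time(14, 50)   # forced exit for anything still open


def can_enter(t: time) -> bool:
    """Return True if a trigger at clock time t falls inside the entry window."""
    return ENTRY_START <= t <= ENTRY_END


def must_square_off(t: time) -> bool:
    """Return True once the square-off time has been reached."""
    return t >= SQUARE_OFF


print(can_enter(time(10, 7)))         # trigger inside the window
print(can_enter(time(15, 21)))        # trigger after the window closes
print(must_square_off(time(14, 50)))  # square-off time reached
```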

Trade-Specific Conditions

These rules govern how individual stock triggers are handled.

  • False Positive Filter: If a trade signal for a stock is triggered, the same stock cannot trigger another signal within the next 15 minutes. The strategy will only consider the first_trigger.
  • Max Daily Occurrence: A single stock can trigger a maximum of two trades in one day. If a second_trigger occurs, it must be at least 15 minutes after the first_trigger.
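
For a single stock on a single day, the two rules can be sketched in plain Python before moving to pandas. This is a minimal illustration under the reading that the 15-minute gap is measured from the first trigger; `select_triggers`, `MIN_GAP` and `MAX_TRADES_PER_DAY` are hypothetical names.

```python
from datetime import datetime, timedelta

MIN_GAP = timedelta(minutes=15)   # quiet period after the first trigger
MAX_TRADES_PER_DAY = 2            # at most two trades per stock per day


def select_triggers(trigger_times):
    """Given one stock's trigger timestamps for one day, return the ones kept."""
    ordered = sorted(trigger_times)
    if not ordered:
        return []
    kept = [ordered[0]]  # the first trigger is always kept
    for ts in ordered[1:]:
        if len(kept) >= MAX_TRADES_PER_DAY:
            break
        # A later trigger counts only if it is at least 15 minutes after the first
        if ts - kept[0] >= MIN_GAP:
            kept.append(ts)
    return kept


day = datetime(2022, 9, 13)
raw = [day.replace(hour=10, minute=m) for m in (1, 3, 20)] + [day.replace(hour=11)]
print(select_triggers(raw))  # keeps the 10:01 and 10:20 triggers
```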

Backtesting with the KiteConnect API

We will use Zerodha’s Kite Connect API to process and backtest the trigger data. Kite Connect provides a comprehensive set of REST-like HTTP APIs suitable for our analysis. As an official Zerodha Partner, Unofficed provides resources and support for integrating with their APIs.

Step 1: Initialize KiteConnect and Pandas

The first step in any Python-based analysis is to set up the environment. For this backtest, we need to initialize the `KiteConnect` object to interact with the broker’s API and use the `pandas` library for data manipulation.

The example below shows the standard initialization flow. You would typically complete the login flow once to get an `access_token`, which can be used for subsequent sessions.

import logging
from kiteconnect import KiteConnect
import pandas as pd

logging.basicConfig(level=logging.DEBUG)

# These are placeholders. Replace with your actual keys.
api_key = "your_api_key"
api_secret = "your_secret"
kite = KiteConnect(api_key=api_key)

# First-time login to generate a session
# print(kite.login_url())
# request_token = input("Enter request_token: ")
# data = kite.generate_session(request_token, api_secret=api_secret)
# access_token = data["access_token"]
# kite.set_access_token(access_token)

# For subsequent runs, you can directly use the access_token
# kite.set_access_token("your_saved_access_token")

Step 2: Load and Normalize Trigger Data

Our plan for data preparation is as follows:

  1. Load the raw trigger data from the CSV file into a pandas DataFrame.
  2. The "Triggered at" column is a string. We’ll split it into separate "Trigger Date" and "Trigger Time" columns for easier filtering.
  3. Some rows contain multiple stock names in a single line (e.g., "WHIRLPOOL, GODREJPROP, PEL"). We must treat each as a distinct trade signal, so we will “explode” these rows, creating a separate row for each stock.

# Load the raw trigger data into a pandas DataFrame
df = pd.read_csv("entropy_data.csv")

# Parse "Triggered at" once, then split it into "Trigger Date" and "Trigger Time"
triggered = pd.to_datetime(df["Triggered at"])
df["Trigger Date"] = triggered.dt.date
df["Trigger Time"] = triggered.dt.time

# Give every stock its own row: split the comma-separated names into lists,
# explode the lists into rows, then strip the leftover whitespace
stock_col = "Stocks (new stocks are highlighted)"
df[stock_col] = df[stock_col].str.split(",")
df = df.explode(stock_col, ignore_index=True)
df[stock_col] = df[stock_col].str.strip()
df

Interpretation

The code block transforms the raw data into a clean, normalized format. Expanding the comma-separated stocks yields 780 rows, each representing a single signal for a single instrument, which prepares the dataset for time-series analysis.

    Triggered at                       Count Stocks (new stocks are highlighted) Trigger Date Trigger Time
0   Fri Apr 21 2023, 10:07 am          1     APOLLOTYRE                          2023-04-21   10:07:00
1   Fri Apr 21 2023, 9:58 am           1     ASIANPAINT                          2023-04-21   09:58:00
2   Thu Apr 20 2023, 3:21 pm           1     TATACONSUM                          2023-04-20   15:21:00
3   Thu Apr 20 2023, 2:16 pm           1     BAJAJ-AUTO                          2023-04-20   14:16:00
4   Thu Apr 20 2023, 12:49 pm          1     CUB                                 2023-04-20   12:49:00
..  ...                                ...   ...                                 ...          ...
775 Tue Sep 13 2022, 10:03 am          3     DIXON                               2022-09-13   10:03:00
776 Tue Sep 13 2022, 10:03 am          3     DRREDDY                             2022-09-13   10:03:00
777 Tue Sep 13 2022, 10:03 am          3     HEROMOTOCO                          2022-09-13   10:03:00
778 Tue Sep 13 2022, 10:01 am          2     DRREDDY                             2022-09-13   10:01:00
779 Tue Sep 13 2022, 10:01 am          2     HEROMOTOCO                          2022-09-13   10:01:00

780 rows × 5 columns
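
One way to sanity-check the expansion is to compare the number of rows per trigger timestamp with the "Count" column. The sketch below assumes that "Count" holds the number of stocks listed in each original row, which matches the sample output above but is not confirmed by the data source; `count_mismatches` is a hypothetical helper.

```python
import pandas as pd

STOCK_COL = "Stocks (new stocks are highlighted)"


def count_mismatches(frame: pd.DataFrame) -> pd.DataFrame:
    """Return trigger timestamps whose exploded row count differs from 'Count'."""
    summary = frame.groupby("Triggered at").agg(
        rows=(STOCK_COL, "size"),     # rows present after the explode step
        expected=("Count", "first"),  # stocks reported for that timestamp
    )
    return summary[summary["rows"] != summary["expected"].astype(int)]


# Toy frame mirroring the last five rows of the output above
sample = pd.DataFrame({
    "Triggered at": ["Tue Sep 13 2022, 10:03 am"] * 3 + ["Tue Sep 13 2022, 10:01 am"] * 2,
    "Count": [3, 3, 3, 2, 2],
    STOCK_COL: ["DIXON", "DRREDDY", "HEROMOTOCO", "DRREDDY", "HEROMOTOCO"],
})
print(count_mismatches(sample))  # an empty frame means every timestamp is consistent
```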

Step 3: Apply Trade-Specific Conditions

Now we apply the deduplication logic. For each stock on each day, we keep the first signal and at most one further signal that arrives at least 15 minutes after it; all other signals are discarded. This enforces the “False Positive Filter” and “Max Daily Occurrence” rules.

# Convert the "Triggered at" column to datetime (format='mixed' requires pandas 2.0 or later)
df['Triggered at'] = pd.to_datetime(df['Triggered at'], format='mixed')

# Sort the dataframe by stock and time to ensure correct order
df.sort_values(by=['Stocks (new stocks are highlighted)', 'Triggered at'], inplace=True)

# Per stock and day, keep the first trigger plus at most one later trigger
# that is at least 15 minutes after it
def filter_triggers(group):
    # The first trigger is always kept
    first_trigger = group.iloc[0:1]
    
    # Find the second valid trigger (at least 15 mins after the first)
    if len(group) > 1:
        time_diff = group['Triggered at'] - group.iloc[0]['Triggered at']
        second_triggers = group[time_diff >= pd.Timedelta(minutes=15)]
        if not second_triggers.empty:
            return pd.concat([first_trigger, second_triggers.iloc[0:1]])
            
    return first_trigger

# Group by stock and date, then apply the filtering function
df_filtered = df.groupby([pd.Grouper(key='Triggered at', freq='D'), 'Stocks (new stocks are highlighted)']).apply(filter_triggers)

# Reset index to clean up the DataFrame
df_filtered.reset_index(drop=True, inplace=True)
df = df_filtered
df

Interpretation

After applying the trade-specific conditions, the number of rows drops significantly from 780 to 542. This shows the filter is effectively removing redundant, rapid-fire signals that could lead to over-trading or entries based on short-lived noise.

    Triggered at                       Count Stocks (new stocks are highlighted) Trigger Date Trigger Time
779 2022-09-13 10:01:00                2     HEROMOTOCO                          2022-09-13   10:01:00
778 2022-09-13 10:01:00                2     DRREDDY                             2022-09-13   10:01:00
775 2022-09-13 10:03:00                3     DIXON                               2022-09-13   10:03:00
770 2022-09-13 10:06:00                5     ITC                                 2022-09-13   10:06:00
771 2022-09-13 10:06:00                5     SBICARD                             2022-09-13   10:06:00
..  ...                                ...   ...                                 ...          ...
4   2023-04-20 12:49:00                1     CUB                                 2023-04-20   12:49:00
3   2023-04-20 14:16:00                1     BAJAJ-AUTO                          2023-04-20   14:16:00
2   2023-04-20 15:21:00                1     TATACONSUM                          2023-04-20   15:21:00
1   2023-04-21 09:58:00                1     ASIANPAINT                          2023-04-21   09:58:00
0   2023-04-21 10:07:00                1     APOLLOTYRE                          2023-04-21   10:07:00

542 rows × 5 columns
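
It is worth confirming that the filtered data actually obeys both rules. The following is a minimal sketch rather than part of the original workflow; `check_trade_rules` is a hypothetical helper that lists every (stock, date) pair with more than two triggers or with two triggers less than 15 minutes apart.

```python
import pandas as pd

STOCK_COL = "Stocks (new stocks are highlighted)"


def check_trade_rules(frame, max_per_day=2, min_gap=pd.Timedelta(minutes=15)):
    """Return the (stock, date) pairs that violate the trade-specific rules."""
    ts = pd.to_datetime(frame["Triggered at"])
    violations = []
    # Group the timestamps by stock and calendar date
    for (stock, day), times in ts.groupby([frame[STOCK_COL], ts.dt.date]):
        times = times.sort_values()
        too_many = len(times) > max_per_day
        too_close = bool((times.diff().dropna() < min_gap).any())
        if too_many or too_close:
            violations.append((stock, day))
    return violations


ok = pd.DataFrame({
    "Triggered at": ["2022-09-13 10:01:00", "2022-09-13 10:20:00", "2022-09-13 10:03:00"],
    STOCK_COL: ["DRREDDY", "DRREDDY", "DIXON"],
})
print(check_trade_rules(ok))  # an empty list means no violations were found
```

Running the same check on the unfiltered data should report the rapid-fire repeats that the filter is meant to remove.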

Step 4: Apply Global Conditions

Finally, we apply the global time window. We keep only triggers that occurred in April 2023 and between 9:30 AM and 2:20 PM. The month restriction is necessary because derivative instrument tokens change monthly, and this backtest focuses specifically on April F&O data.

# Filter the data to the specified time window
df['Trigger Time'] = pd.to_datetime(df['Triggered at']).dt.time
df_filtered = df[(df['Trigger Time'] >= pd.Timestamp('09:30').time()) & (df['Trigger Time'] <= pd.Timestamp('14:20').time())]

# Filter for the specific month of April 2023
df_filtered = df_filtered[df_filtered['Triggered at'].dt.strftime('%Y-%m') == '2023-04']
df = df_filtered
df

Common Pitfalls. When running these scripts, watch for a few common errors:

  • API Credentials: Using placeholder API keys/secrets will cause an authentication failure. Ensure you replace "your_api_key" with your actual credentials.
  • File Path: A FileNotFoundError indicates that the script cannot find "entropy_data.csv". Make sure the CSV file is in the same directory as your Python script, or provide an absolute path.
  • Pandas Versions: The format='mixed' argument to pd.to_datetime requires pandas 2.0 or later, and from pandas 2.2 onward .groupby().apply() emits a deprecation warning when the applied function operates on the grouping columns. Check your installed version if either call fails or warns.

After this final filtering step, the DataFrame df contains the precise list of trades that are candidates for the backtest, adhering to all the fine-tuning conditions we have defined. The next step would be to fetch historical data for these instruments and simulate the trades.
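
As a rough illustration of that next step, the sketch below walks a list of minute candles for one long trade. It assumes candles shaped like the dictionaries returned by Kite Connect's `historical_data` call (keys `date`, `open`, `high`, `low`, `close`), with timestamps comparable to `entry_time`. The function `simulate_long`, the entry at the close of the trigger candle, the stop-before-target ordering and the stop-loss level itself are all hypothetical choices, since the rules above do not specify them.

```python
from datetime import datetime, time

SQUARE_OFF = time(14, 50)  # forced exit time from the global conditions


def simulate_long(candles, entry_time, target, stop_loss):
    """Simulate one long trade over chronologically ordered candles."""
    entry_price = None
    for c in candles:
        if entry_price is None:
            # Assumption: enter at the close of the first candle at or after the trigger
            if c["date"] >= entry_time:
                entry_price = c["close"]
            continue
        if c["date"].time() >= SQUARE_OFF:
            return {"reason": "square_off", "entry": entry_price, "exit": c["open"]}
        # Pessimistic ordering: if both levels fall inside one candle, assume the stop hit first
        if c["low"] <= stop_loss:
            return {"reason": "stop_loss", "entry": entry_price, "exit": stop_loss}
        if c["high"] >= target:
            return {"reason": "target", "entry": entry_price, "exit": target}
    return {"reason": "no_exit", "entry": entry_price, "exit": None}


d = datetime(2023, 4, 20)
candles = [
    {"date": d.replace(hour=12, minute=49), "open": 100.0, "high": 100.5, "low": 99.8, "close": 100.0},
    {"date": d.replace(hour=12, minute=50), "open": 100.0, "high": 101.0, "low": 99.6, "close": 100.8},
    {"date": d.replace(hour=12, minute=51), "open": 100.8, "high": 102.3, "low": 100.6, "close": 102.0},
]
print(simulate_long(candles, d.replace(hour=12, minute=49), target=102.0, stop_loss=99.0))
```

Real candles from the API may carry timezone-aware timestamps, in which case `entry_time` would need the same timezone before the comparison works.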
