-
Notifications
You must be signed in to change notification settings - Fork 359
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #1057 from ashis2004/77
optimized the trading strategy based on sentiment analysis added
- Loading branch information
Showing
3 changed files
with
1,832 additions
and
0 deletions.
There are no files selected for viewing
316 changes: 316 additions & 0 deletions
316
...ed on sentiment analysis/optimized_the_trading_strategy_based_on_sentiment_analysis.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,316 @@ | ||
{ | ||
"nbformat": 4, | ||
"nbformat_minor": 0, | ||
"metadata": { | ||
"colab": { | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"name": "python3", | ||
"display_name": "Python 3" | ||
}, | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"optimized the trading strategy based on sentiment analysis" | ||
], | ||
"metadata": { | ||
"id": "S_I_QeRRg52m" | ||
} | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Loading the Dataset" | ||
], | ||
"metadata": { | ||
"id": "x14oHjR0g9Dk" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"import pandas as pd\n", | ||
"import numpy as np\n", | ||
"\n", | ||
"# Load data\n", | ||
"data_sentiment = pd.read_csv('sentiment_trading_data.csv')\n", | ||
"print(data_sentiment.head())\n", | ||
"\n", | ||
"# Convert to NumPy array (excluding the 'Date' column)\n", | ||
"data_sentiment = data_sentiment.drop(columns=['Date']).to_numpy()\n" | ||
], | ||
"metadata": { | ||
"colab": { | ||
"base_uri": "https://localhost:8080/" | ||
}, | ||
"id": "vHerMIPte3d2", | ||
"outputId": "6fd74df2-d792-4ef5-ac99-1e170e3c33dc" | ||
}, | ||
"execution_count": 3, | ||
"outputs": [ | ||
{ | ||
"output_type": "stream", | ||
"name": "stdout", | ||
"text": [ | ||
" Date Stock_Price Sentiment\n", | ||
"0 2020-01-01 87.454012 0.302466\n", | ||
"1 2020-01-02 145.071431 -0.786814\n", | ||
"2 2020-01-03 123.199394 0.315691\n", | ||
"3 2020-01-04 109.865848 0.998827\n", | ||
"4 2020-01-05 65.601864 -0.903576\n" | ||
] | ||
} | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Intialize the fitness function" | ||
], | ||
"metadata": { | ||
"id": "ObuEsypzhAGI" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def fitness_function(individual, data):\n", | ||
" stock_prices = data[:, 0]\n", | ||
" sentiments = data[:, 1]\n", | ||
"\n", | ||
" capital = 100000 # Starting capital\n", | ||
" position = 0 # Initial position (0 means no stock held)\n", | ||
"\n", | ||
" for i in range(len(data)):\n", | ||
" if sentiments[i] > individual[0]: # Buy signal based on sentiment threshold\n", | ||
" position += capital // stock_prices[i] # Buy as many stocks as possible\n", | ||
" capital -= position * stock_prices[i] # Deduct spent capital\n", | ||
" elif sentiments[i] < individual[1]: # Sell signal based on sentiment threshold\n", | ||
" capital += position * stock_prices[i] # Sell all stocks\n", | ||
" position = 0 # Reset position\n", | ||
"\n", | ||
" return capital\n", | ||
"\n" | ||
], | ||
"metadata": { | ||
"id": "M5oMXowze6Od" | ||
}, | ||
"execution_count": 4, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Intialize Population" | ||
], | ||
"metadata": { | ||
"id": "PH2klZcvhEct" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def initialize_population(pop_size):\n", | ||
" population = []\n", | ||
" for _ in range(pop_size):\n", | ||
" buy_threshold = np.random.uniform(-1, 1) # Random buy threshold\n", | ||
" sell_threshold = np.random.uniform(-1, 1) # Random sell threshold\n", | ||
" individual = [buy_threshold, sell_threshold]\n", | ||
" population.append(individual)\n", | ||
" return population\n" | ||
], | ||
"metadata": { | ||
"id": "uYNfoXAxe7-m" | ||
}, | ||
"execution_count": 5, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Perform Selection" | ||
], | ||
"metadata": { | ||
"id": "M9Wr94fnhHaZ" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def selection(population, fitness_scores, num_parents):\n", | ||
" parents = [population[idx] for idx in np.argsort(fitness_scores)[-num_parents:]]\n", | ||
" return parents\n" | ||
], | ||
"metadata": { | ||
"id": "BIzNDPbue9fc" | ||
}, | ||
"execution_count": 6, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Perform crossover" | ||
], | ||
"metadata": { | ||
"id": "sTsXdkBRhJ4v" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def crossover(parents, offspring_size):\n", | ||
" offspring = []\n", | ||
" for _ in range(offspring_size):\n", | ||
" parent1 = parents[np.random.randint(len(parents))]\n", | ||
" parent2 = parents[np.random.randint(len(parents))]\n", | ||
" crossover_point = np.random.randint(1, len(parent1))\n", | ||
" child = parent1[:crossover_point] + parent2[crossover_point:]\n", | ||
" offspring.append(child)\n", | ||
" return offspring\n" | ||
], | ||
"metadata": { | ||
"id": "xOLySvsHe-5E" | ||
}, | ||
"execution_count": 7, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Perform Mutation" | ||
], | ||
"metadata": { | ||
"id": "W_35vtRthMog" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def mutation(offspring, mutation_rate):\n", | ||
" for individual in offspring:\n", | ||
" if np.random.rand() < mutation_rate:\n", | ||
" mutation_point = np.random.randint(len(individual))\n", | ||
" individual[mutation_point] = np.random.uniform(-1, 1) # Mutate with new random threshold\n", | ||
" return offspring\n" | ||
], | ||
"metadata": { | ||
"id": "oQV3Xgt2fAyk" | ||
}, | ||
"execution_count": 8, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Perform Genetic algorithm" | ||
], | ||
"metadata": { | ||
"id": "ch0Uzd7OhOyc" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"def genetic_algorithm(data, num_generations, pop_size, num_parents, mutation_rate):\n", | ||
" population = initialize_population(pop_size)\n", | ||
"\n", | ||
" for generation in range(num_generations):\n", | ||
" fitness_scores = [fitness_function(individual, data) for individual in population]\n", | ||
" parents = selection(population, fitness_scores, num_parents)\n", | ||
" offspring_size = pop_size - len(parents)\n", | ||
" offspring = crossover(parents, offspring_size)\n", | ||
" offspring = mutation(offspring, mutation_rate)\n", | ||
" population = parents + offspring\n", | ||
"\n", | ||
" best_fitness = np.max(fitness_scores)\n", | ||
" print(f\"Generation {generation}: Best Fitness = {best_fitness}\")\n", | ||
"\n", | ||
" best_individual = population[np.argmax(fitness_scores)]\n", | ||
" return best_individual\n", | ||
"\n", | ||
"# Run the genetic algorithm\n", | ||
"num_generations = 50\n", | ||
"pop_size = 100\n", | ||
"num_parents = 20\n", | ||
"mutation_rate = 0.01\n", | ||
"\n", | ||
"best_params = genetic_algorithm(data_sentiment, num_generations, pop_size, num_parents, mutation_rate)\n", | ||
"print(f\"Best Trading Strategy: Buy Threshold = {best_params[0]}, Sell Threshold = {best_params[1]}\")\n" | ||
], | ||
"metadata": { | ||
"colab": { | ||
"base_uri": "https://localhost:8080/" | ||
}, | ||
"id": "hBM-j180fCPe", | ||
"outputId": "fec15d5d-e03e-477d-eff7-4c87591f2edc" | ||
}, | ||
"execution_count": 9, | ||
"outputs": [ | ||
{ | ||
"output_type": "stream", | ||
"name": "stdout", | ||
"text": [ | ||
"Generation 0: Best Fitness = 129785.02657707607\n", | ||
"Generation 1: Best Fitness = 176212273.2138567\n", | ||
"Generation 2: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 3: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 4: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 5: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 6: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 7: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 8: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 9: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 10: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 11: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 12: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 13: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 14: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 15: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 16: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 17: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 18: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 19: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 20: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 21: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 22: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 23: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 24: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 25: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 26: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 27: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 28: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 29: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 30: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 31: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 32: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 33: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 34: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 35: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 36: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 37: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 38: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 39: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 40: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 41: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 42: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 43: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 44: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 45: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 46: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 47: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 48: Best Fitness = 3.900617142553841e+32\n", | ||
"Generation 49: Best Fitness = 3.900617142553841e+32\n", | ||
"Best Trading Strategy: Buy Threshold = -0.7733929905450716, Sell Threshold = -0.9756864570452264\n" | ||
] | ||
} | ||
] | ||
} | ||
] | ||
} |
54 changes: 54 additions & 0 deletions
54
optimized the trading strategy based on sentiment analysis/readme.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# Sentiment Analysis Driven Trading Strategy Optimization | ||
|
||
This project aims to optimize a trading strategy based on stock prices and sentiment data using a genetic algorithm. The goal is to find the best buy and sell thresholds that maximize the portfolio value over a given period. | ||
|
||
## Dataset | ||
|
||
The dataset used in this project is `sentiment_trading_data.csv`, which contains the following columns: | ||
- `Date`: The date of the observation. | ||
- `Stock_Price`: The stock price on the given date. | ||
- `Sentiment`: The sentiment score for the given date, ranging from -1 to 1. | ||
|
||
## Genetic Algorithm Implementation | ||
|
||
### Step 1: Load the Data | ||
|
||
First, load the `sentiment_trading_data.csv` dataset. | ||
|
||
```python | ||
import pandas as pd | ||
import numpy as np | ||
|
||
# Load data | ||
data_sentiment = pd.read_csv('sentiment_trading_data.csv') | ||
print(data_sentiment.head()) | ||
|
||
# Convert to NumPy array (excluding the 'Date' column) | ||
data_sentiment = data_sentiment.drop(columns=['Date']).to_numpy() | ||
|
||
# Step 2: Define the Fitness Function | ||
The fitness function evaluates how well a given trading strategy (individual) performs based on stock prices and sentiment. | ||
|
||
# step 3: Initialize the Population | ||
Generate an initial population of random trading strategies. | ||
|
||
# Step 4: Selection | ||
Select the best-performing individuals to be parents for the next generation. | ||
|
||
# Step 5: Crossover | ||
Create offspring by combining parts of two parents. | ||
|
||
# Step 6: Mutation | ||
Randomly mutate some individuals to maintain genetic diversity. | ||
|
||
#Step 7: Run the Genetic Algorithm | ||
Execute the genetic algorithm with the defined functions. | ||
|
||
#Results | ||
The genetic algorithm was run for 50 generations with a population size of 100. The best trading strategy found is: | ||
|
||
Buy Threshold: -0.7733929905450716 | ||
Sell Threshold: -0.9756864570452264 | ||
|
||
#Contributor | ||
Ashish Kumar Patel |
Oops, something went wrong.