Merge pull request #1057 from ashis2004/77

optimized the trading strategy based on sentiment analysis added
Niketkumardheeryan · Aug 9, 2024 · 5f87b72
2 parents 54fad7c + 5aef055
commit 5f87b72

Showing 3 changed files with 1,832 additions and 0 deletions.
diff --git a/...ed on sentiment analysis/optimized_the_trading_strategy_based_on_sentiment_analysis.ipynb b/...ed on sentiment analysis/optimized_the_trading_strategy_based_on_sentiment_analysis.ipynb
@@ -0,0 +1,316 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "optimized the trading strategy based on sentiment analysis"
+      ],
+      "metadata": {
+        "id": "S_I_QeRRg52m"
+      }
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Loading the Dataset"
+      ],
+      "metadata": {
+        "id": "x14oHjR0g9Dk"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import pandas as pd\n",
+        "import numpy as np\n",
+        "\n",
+        "# Load data\n",
+        "data_sentiment = pd.read_csv('sentiment_trading_data.csv')\n",
+        "print(data_sentiment.head())\n",
+        "\n",
+        "# Convert to NumPy array (excluding the 'Date' column)\n",
+        "data_sentiment = data_sentiment.drop(columns=['Date']).to_numpy()\n"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "vHerMIPte3d2",
+        "outputId": "6fd74df2-d792-4ef5-ac99-1e170e3c33dc"
+      },
+      "execution_count": 3,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "         Date  Stock_Price  Sentiment\n",
+            "0  2020-01-01    87.454012   0.302466\n",
+            "1  2020-01-02   145.071431  -0.786814\n",
+            "2  2020-01-03   123.199394   0.315691\n",
+            "3  2020-01-04   109.865848   0.998827\n",
+            "4  2020-01-05    65.601864  -0.903576\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Define the fitness function"
+      ],
+      "metadata": {
+        "id": "ObuEsypzhAGI"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def fitness_function(individual, data):\n",
+        "    stock_prices = data[:, 0]\n",
+        "    sentiments = data[:, 1]\n",
+        "\n",
+        "    capital = 100000  # Starting capital\n",
+        "    position = 0      # Initial position (0 means no stock held)\n",
+        "\n",
+        "    for i in range(len(data)):\n",
+        "        if sentiments[i] > individual[0]:  # Buy signal based on sentiment threshold\n",
+        "            shares = capital // stock_prices[i]     # Whole shares affordable with the current cash\n",
+        "            position += shares\n",
+        "            capital -= shares * stock_prices[i]     # Deduct only the cost of the newly bought shares\n",
+        "        elif sentiments[i] < individual[1]:  # Sell signal based on sentiment threshold\n",
+        "            capital += position * stock_prices[i]   # Sell all stocks\n",
+        "            position = 0                            # Reset position\n",
+        "\n",
+        "    return capital\n",
+        "\n"
+      ],
+      "metadata": {
+        "id": "M5oMXowze6Od"
+      },
+      "execution_count": 4,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Initialize the population"
+      ],
+      "metadata": {
+        "id": "PH2klZcvhEct"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def initialize_population(pop_size):\n",
+        "    population = []\n",
+        "    for _ in range(pop_size):\n",
+        "        buy_threshold = np.random.uniform(-1, 1)  # Random buy threshold\n",
+        "        sell_threshold = np.random.uniform(-1, 1) # Random sell threshold\n",
+        "        individual = [buy_threshold, sell_threshold]\n",
+        "        population.append(individual)\n",
+        "    return population\n"
+      ],
+      "metadata": {
+        "id": "uYNfoXAxe7-m"
+      },
+      "execution_count": 5,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Perform Selection"
+      ],
+      "metadata": {
+        "id": "M9Wr94fnhHaZ"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def selection(population, fitness_scores, num_parents):\n",
+        "    parents = [population[idx] for idx in np.argsort(fitness_scores)[-num_parents:]]\n",
+        "    return parents\n"
+      ],
+      "metadata": {
+        "id": "BIzNDPbue9fc"
+      },
+      "execution_count": 6,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Perform crossover"
+      ],
+      "metadata": {
+        "id": "sTsXdkBRhJ4v"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def crossover(parents, offspring_size):\n",
+        "    offspring = []\n",
+        "    for _ in range(offspring_size):\n",
+        "        parent1 = parents[np.random.randint(len(parents))]\n",
+        "        parent2 = parents[np.random.randint(len(parents))]\n",
+        "        crossover_point = np.random.randint(1, len(parent1))\n",
+        "        child = parent1[:crossover_point] + parent2[crossover_point:]\n",
+        "        offspring.append(child)\n",
+        "    return offspring\n"
+      ],
+      "metadata": {
+        "id": "xOLySvsHe-5E"
+      },
+      "execution_count": 7,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Perform Mutation"
+      ],
+      "metadata": {
+        "id": "W_35vtRthMog"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def mutation(offspring, mutation_rate):\n",
+        "    for individual in offspring:\n",
+        "        if np.random.rand() < mutation_rate:\n",
+        "            mutation_point = np.random.randint(len(individual))\n",
+        "            individual[mutation_point] = np.random.uniform(-1, 1)  # Mutate with new random threshold\n",
+        "    return offspring\n"
+      ],
+      "metadata": {
+        "id": "oQV3Xgt2fAyk"
+      },
+      "execution_count": 8,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Perform Genetic algorithm"
+      ],
+      "metadata": {
+        "id": "ch0Uzd7OhOyc"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "def genetic_algorithm(data, num_generations, pop_size, num_parents, mutation_rate):\n",
+        "    population = initialize_population(pop_size)\n",
+        "\n",
+        "    for generation in range(num_generations):\n",
+        "        fitness_scores = [fitness_function(individual, data) for individual in population]\n",
+        "        parents = selection(population, fitness_scores, num_parents)\n",
+        "        offspring_size = pop_size - len(parents)\n",
+        "        offspring = crossover(parents, offspring_size)\n",
+        "        offspring = mutation(offspring, mutation_rate)\n",
+        "        population = parents + offspring\n",
+        "\n",
+        "        best_fitness = np.max(fitness_scores)\n",
+        "        print(f\"Generation {generation}: Best Fitness = {best_fitness}\")\n",
+        "\n",
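+        "    # Re-score the final population so the index matches the individuals in it\n",
+        "    fitness_scores = [fitness_function(individual, data) for individual in population]\n",
+        "    best_individual = population[np.argmax(fitness_scores)]\n",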
+        "    return best_individual\n",
+        "\n",
+        "# Run the genetic algorithm\n",
+        "num_generations = 50\n",
+        "pop_size = 100\n",
+        "num_parents = 20\n",
+        "mutation_rate = 0.01\n",
+        "\n",
+        "best_params = genetic_algorithm(data_sentiment, num_generations, pop_size, num_parents, mutation_rate)\n",
+        "print(f\"Best Trading Strategy: Buy Threshold = {best_params[0]}, Sell Threshold = {best_params[1]}\")\n"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "hBM-j180fCPe",
+        "outputId": "fec15d5d-e03e-477d-eff7-4c87591f2edc"
+      },
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Generation 0: Best Fitness = 129785.02657707607\n",
+            "Generation 1: Best Fitness = 176212273.2138567\n",
+            "Generation 2: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 3: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 4: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 5: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 6: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 7: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 8: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 9: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 10: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 11: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 12: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 13: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 14: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 15: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 16: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 17: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 18: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 19: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 20: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 21: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 22: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 23: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 24: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 25: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 26: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 27: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 28: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 29: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 30: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 31: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 32: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 33: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 34: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 35: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 36: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 37: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 38: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 39: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 40: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 41: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 42: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 43: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 44: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 45: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 46: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 47: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 48: Best Fitness = 3.900617142553841e+32\n",
+            "Generation 49: Best Fitness = 3.900617142553841e+32\n",
+            "Best Trading Strategy: Buy Threshold = -0.7733929905450716, Sell Threshold = -0.9756864570452264\n"
+          ]
+        }
+      ]
+    }
+  ]
+}
diff --git a/optimized the trading strategy based on sentiment analysis/readme.md b/optimized the trading strategy based on sentiment analysis/readme.md
@@ -0,0 +1,54 @@
+# Sentiment Analysis Driven Trading Strategy Optimization
+
+This project aims to optimize a trading strategy based on stock prices and sentiment data using a genetic algorithm. The goal is to find the best buy and sell thresholds that maximize the portfolio value over a given period.
+
+## Dataset
+
+The dataset used in this project is `sentiment_trading_data.csv`, which contains the following columns:
+- `Date`: The date of the observation.
+- `Stock_Price`: The stock price on the given date.
+- `Sentiment`: The sentiment score for the given date, ranging from -1 to 1.
+
+## Genetic Algorithm Implementation
+
+### Step 1: Load the Data
+
+First, load the `sentiment_trading_data.csv` dataset.
+
+```python
+import pandas as pd
+import numpy as np
+
+# Load data
+data_sentiment = pd.read_csv('sentiment_trading_data.csv')
+print(data_sentiment.head())
+
+# Convert to NumPy array (excluding the 'Date' column)
+data_sentiment = data_sentiment.drop(columns=['Date']).to_numpy()
+```
+### Step 2: Define the Fitness Function
+The fitness function evaluates how well a given trading strategy (individual) performs based on stock prices and sentiment.
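
A minimal sketch of such a fitness function, assuming `individual` is a `[buy_threshold, sell_threshold]` pair and `data` is the two-column array (price, sentiment) produced in Step 1. The starting capital of 100000 follows the notebook. Two points are assumptions of this sketch rather than the committed code: the `starting_capital` parameter, and valuing any shares still held at the last observed price so the result reflects total portfolio value.

```python
import numpy as np

def fitness_function(individual, data, starting_capital=100_000.0):
    """Simulate the threshold strategy and return the final portfolio value."""
    buy_threshold, sell_threshold = individual
    prices, sentiments = data[:, 0], data[:, 1]

    capital = starting_capital  # cash on hand
    position = 0.0              # number of shares currently held

    for price, sentiment in zip(prices, sentiments):
        if sentiment > buy_threshold:
            shares = capital // price     # whole shares affordable right now
            position += shares
            capital -= shares * price     # pay only for the newly bought shares
        elif sentiment < sell_threshold:
            capital += position * price   # sell everything at today's price
            position = 0.0

    # Assumption: shares still held at the end are valued at the last price.
    return capital + position * prices[-1]

# Tiny hand-made example (hypothetical numbers, not the project dataset):
# buy on day 1 at 10, sell on day 2 at 20.
toy = np.array([[10.0, 0.9], [20.0, -0.9]])
print(fitness_function([0.5, -0.5], toy))
```

In the toy example the strategy buys 10000 shares at 10 and sells them at 20, so the capital doubles.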
+
+### Step 3: Initialize the Population
+Generate an initial population of random trading strategies.
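
One way to sketch this step, assuming each strategy is a `[buy_threshold, sell_threshold]` pair drawn uniformly from the sentiment range of -1 to 1. The optional `rng` parameter is a hypothetical addition that makes runs reproducible; the notebook uses the global `np.random` functions instead.

```python
import numpy as np

def initialize_population(pop_size, rng=None):
    """Return pop_size random [buy_threshold, sell_threshold] pairs."""
    rng = np.random.default_rng() if rng is None else rng
    # One row per individual, each gene drawn uniformly from [-1, 1).
    return rng.uniform(-1.0, 1.0, size=(pop_size, 2)).tolist()

# Seeded generator so the example prints the same values on every run.
population = initialize_population(5, rng=np.random.default_rng(0))
for buy, sell in population:
    print(f"buy={buy:+.3f}  sell={sell:+.3f}")
```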
+
+### Step 4: Selection
+Select the best-performing individuals to be parents for the next generation.
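
The notebook keeps the `num_parents` highest-scoring individuals, which is commonly called truncation selection. A small sketch of that idea, with hand-picked fitness values that are purely illustrative:

```python
import numpy as np

def selection(population, fitness_scores, num_parents):
    """Keep the num_parents individuals with the highest fitness."""
    # argsort is ascending, so the last num_parents indices are the fittest.
    best_indices = np.argsort(fitness_scores)[-num_parents:]
    return [population[i] for i in best_indices]

# Hypothetical population and scores, for illustration only.
population = [[0.1, -0.2], [0.5, -0.5], [-0.3, 0.4], [0.9, -0.9]]
fitness_scores = [105_000.0, 98_000.0, 120_000.0, 101_000.0]

parents = selection(population, fitness_scores, num_parents=2)
print(parents)
```

The parents come back in ascending order of fitness, so the best individual is the last element of the returned list.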
+
+### Step 5: Crossover
+Create offspring by combining parts of two parents.
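
A sketch of single-point crossover as used in the notebook. Because each individual has only two genes, the cut point is always 1, so a child takes the buy threshold from the first parent and the sell threshold from the second. The `rng` parameter is a hypothetical addition for reproducibility.

```python
import numpy as np

def crossover(parents, offspring_size, rng=None):
    """Build offspring_size children by single-point crossover."""
    rng = np.random.default_rng() if rng is None else rng
    offspring = []
    for _ in range(offspring_size):
        # Parents are drawn with replacement, so both may be the same individual.
        parent1 = parents[rng.integers(len(parents))]
        parent2 = parents[rng.integers(len(parents))]
        point = rng.integers(1, len(parent1))  # cut point in 1..len-1
        offspring.append(list(parent1[:point]) + list(parent2[point:]))
    return offspring

# Hypothetical parents, for illustration only.
parents = [[0.1, -0.2], [0.5, -0.5], [-0.3, 0.4]]
children = crossover(parents, 4, rng=np.random.default_rng(1))
print(children)
```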
+
+### Step 6: Mutation
+Randomly mutate some individuals to maintain genetic diversity.
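
A sketch of the mutation step: with probability `mutation_rate`, one randomly chosen threshold of an individual is replaced by a fresh value from the range -1 to 1. Unlike the notebook, which mutates the offspring in place, this version returns copies; that choice and the `rng` parameter are assumptions of the sketch.

```python
import numpy as np

def mutation(offspring, mutation_rate, rng=None):
    """Return a mutated copy of offspring; the input lists are left untouched."""
    rng = np.random.default_rng() if rng is None else rng
    mutated = [list(individual) for individual in offspring]
    for individual in mutated:
        if rng.random() < mutation_rate:
            gene = rng.integers(len(individual))        # which threshold to change
            individual[gene] = rng.uniform(-1.0, 1.0)   # new random threshold
    return mutated

# Hypothetical offspring, for illustration only.
offspring = [[0.1, -0.2], [0.5, -0.5]]
print(mutation(offspring, mutation_rate=1.0, rng=np.random.default_rng(0)))
```

With a rate of 0.01, as in the notebook, most children pass through unchanged; the rate of 1.0 above is used only to make the effect visible.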
+
+### Step 7: Run the Genetic Algorithm
+Execute the genetic algorithm with the defined functions.
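
The loop below is a compact, self-contained sketch of how the steps fit together. It runs on randomly generated stand-in data because `sentiment_trading_data.csv` is not reproduced here, and it uses smaller hypothetical settings than the notebook so it finishes quickly. It also tracks the best individual as each generation is scored, which keeps the reported strategy and its fitness in sync; this bookkeeping and the mark-to-market step in `fitness` are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, chosen only for reproducibility

def fitness(individual, data, capital=100_000.0):
    """Final portfolio value of the [buy, sell] threshold strategy."""
    position = 0.0
    for price, sentiment in data:
        if sentiment > individual[0]:
            shares = capital // price
            position += shares
            capital -= shares * price
        elif sentiment < individual[1]:
            capital += position * price
            position = 0.0
    return capital + position * data[-1, 0]  # assumption: value open shares at last price

def genetic_algorithm(data, num_generations, pop_size, num_parents, mutation_rate):
    population = rng.uniform(-1, 1, size=(pop_size, 2)).tolist()
    best_individual, best_fitness = None, -np.inf
    for _ in range(num_generations):
        scores = [fitness(ind, data) for ind in population]
        top = int(np.argmax(scores))
        if scores[top] > best_fitness:  # remember the best strategy seen so far
            best_individual, best_fitness = list(population[top]), scores[top]
        # Selection: keep the num_parents fittest individuals.
        parents = [population[i] for i in np.argsort(scores)[-num_parents:]]
        # Crossover: buy threshold from one parent, sell threshold from another.
        offspring = []
        for _ in range(pop_size - num_parents):
            p1 = parents[rng.integers(num_parents)]
            p2 = parents[rng.integers(num_parents)]
            offspring.append([p1[0], p2[1]])
        # Mutation: occasionally replace one threshold with a new random value.
        for child in offspring:
            if rng.random() < mutation_rate:
                child[rng.integers(2)] = rng.uniform(-1, 1)
        population = parents + offspring
    return best_individual, best_fitness

# Hypothetical stand-in data: 200 days of random prices and sentiment scores.
data = np.column_stack([rng.uniform(50, 150, size=200), rng.uniform(-1, 1, size=200)])

best, value = genetic_algorithm(data, num_generations=20, pop_size=40,
                                num_parents=8, mutation_rate=0.01)
print(f"Best thresholds: buy={best[0]:+.3f}, sell={best[1]:+.3f}, value={value:,.2f}")
```

Because the stand-in data is random, the thresholds this prints say nothing about real markets; the sketch only illustrates the control flow.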
+
+## Results
+The genetic algorithm was run for 50 generations with a population size of 100, 20 parents per generation, and a mutation rate of 0.01. The best trading strategy reported by the notebook run is:
+
+- Buy Threshold: -0.7733929905450716
+- Sell Threshold: -0.9756864570452264
+
+## Contributor
+Ashish Kumar Patel