Backpack Trading

Backpack Trading started as an experiment in applying the unbounded knapsack algorithm to algorithmic trading. While researching the algorithm, I noticed how the word "weight" is also used in AI, which gave me the idea of applying it to trading.

The unbounded knapsack problem asks: given a backpack of a fixed capacity, how do I maximise the value of the items I put in it? The unbounded variant allows duplicates of the same item, but every item must stay whole. Flip this on its head: make the "capacity" a portfolio budget, the "weight" of an item the price of a stock, and the "value" a score for how well that stock is doing, and the solver should return an optimal portfolio. I set out to prove the theory with a proof of concept, building a simulator to test trading algorithms.
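A minimal sketch of that mapping, assuming integer prices in whole currency units (the prices and scores here are made up for illustration, not from the repository):

```python
def unbounded_knapsack(budget, prices, scores):
    """Maximise total score spending at most `budget` on whole shares."""
    best = [0.0] * (budget + 1)   # best[b] = max score with budget b
    pick = [-1] * (budget + 1)    # last stock bought to reach best[b], -1 = none
    for b in range(1, budget + 1):
        best[b] = best[b - 1]     # spending nothing extra is always allowed
        for i, (p, s) in enumerate(zip(prices, scores)):
            if p <= b and best[b - p] + s > best[b]:
                best[b] = best[b - p] + s
                pick[b] = i
    # Walk the pick table backwards to recover share counts per stock.
    counts = [0] * len(prices)
    b = budget
    while b > 0:
        if pick[b] == -1:
            b -= 1
        else:
            counts[pick[b]] += 1
            b -= prices[pick[b]]
    return best[budget], counts

# Hypothetical example: three stocks, a budget of 100.
score, shares = unbounded_knapsack(budget=100,
                                   prices=[30, 45, 12],
                                   scores=[4.0, 7.5, 1.5])
```

Because the variant is unbounded, the solver is free to buy the same stock many times over, which foreshadows the behaviour discussed in the results below.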

Three pieces were needed to make this work: the scraper, a web scraper that pulls historic data from the NASDAQ website; the trading bot, really a simulator that runs the algorithm over the historic data; and the data visualiser, which makes sense of what the data says after the trading bot has run.

The scraper is a Python script that uses the requests library to read a .csv file of company symbols and store each company's historic data in a SQLite database. It fakes the headers of a GET request (copied from my own laptop's browser), parses the API response into a list of dictionaries, then loops through that list and inserts the rows into the database. This produced a dataset of 206 companies' stocks from 2014 to 2024. To improve it, I would write each response straight to the database instead of building the intermediate list. I deliberately avoided asynchronous requests: too many requests from the same IP address risks an IP ban or a rate limiter stopping the program entirely, which is why there is a time.sleep in the get function.
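A sketch of that flow. The endpoint URL, response shape, and file names here are assumptions for illustration; the actual script scrapes the NASDAQ site's own API:

```python
import csv
import sqlite3
import time

import requests

API_URL = "https://api.nasdaq.com/api/quote/{symbol}/historical"  # hypothetical
HEADERS = {"User-Agent": "Mozilla/5.0"}  # faked browser header

def get(symbol):
    resp = requests.get(API_URL.format(symbol=symbol), headers=HEADERS)
    time.sleep(1)  # throttle so one IP doesn't trip a rate limiter or ban
    resp.raise_for_status()
    return resp.json()["data"]["tradesTable"]["rows"]  # assumed response shape

def main():
    con = sqlite3.connect("stocks.db")  # hypothetical database name
    con.execute("""CREATE TABLE IF NOT EXISTS prices
                   (symbol TEXT, date TEXT, close REAL)""")
    with open("symbols.csv") as f:      # hypothetical symbols file
        symbols = [row[0] for row in csv.reader(f)]
    for symbol in symbols:
        rows = [(symbol, r["date"], float(r["close"].lstrip("$")))
                for r in get(symbol)]
        con.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
        con.commit()
    con.close()

if __name__ == "__main__":
    main()
```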

The trading bot is where I practised my object-oriented programming, using inheritance for the stock classes and aggregation on the bank side of the project. The idea is that you create a bank account, which automatically generates an account and a portfolio with a budget. This is fed into a trading bot class that returns an optimal portfolio based on the previous generation of stocks and the new ones; the account then syncs its holdings with the optimal portfolio and waits until the next set of data is available (in the simulation this is immediately after computation, but in real life it could be a cron job). Finally, it writes a log to a file, sim.log, containing the __repr__ of the accounts and a timestamp. This log approach was naive, as I ended up having to write a script to parse the logs into a database. I should have stored results in a database from the start, but since the simulation took 5 hours to run, it was quicker to parse the logs instead.
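A structural sketch of that bank/bot relationship; the class and method names are illustrative, not the repository's actual API:

```python
class Portfolio:
    def __init__(self, budget):
        self.budget = budget
        self.holdings = {}                    # symbol -> share count

class Account:
    """Aggregation: the account owns a Portfolio rather than inheriting one."""
    def __init__(self, budget):
        self.portfolio = Portfolio(budget)

    def sync(self, optimal):
        # Replace current holdings with the bot's optimal portfolio.
        self.portfolio.holdings = dict(optimal)

    def __repr__(self):
        return (f"Account(budget={self.portfolio.budget}, "
                f"holdings={self.portfolio.holdings})")

class TradingBot:
    def __init__(self, account, optimiser):
        self.account = account
        self.optimiser = optimiser            # e.g. the knapsack solver above

    def step(self, prices, scores):
        optimal = self.optimiser(self.account.portfolio.budget, prices, scores)
        self.account.sync(optimal)

def run(bot, history, log_path="sim.log"):
    # In the simulation the next "day" arrives immediately after computation;
    # in live trading this loop would be driven by a cron job instead.
    with open(log_path, "a") as log:
        for date, prices, scores in history:
            bot.step(prices, scores)
            log.write(f"{date} {bot.account!r}\n")
```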

Finally, the data visualiser comprises the previously mentioned log parser, which uses string manipulation to load the logs into the database, and the visualiser itself. I made use of pandas DataFrames and seaborn, a library built on top of matplotlib that makes the output prettier, to generate the graphs from SQL queries.
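A minimal sketch of the query-to-plot step, assuming hypothetical database, table, and column names:

```python
import sqlite3

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

con = sqlite3.connect("sim.db")   # database built by the log parser (assumed name)

# Pull one row per logged day, per budget-percentage run.
df = pd.read_sql_query(
    "SELECT date, budget_pct, portfolio_value FROM results ORDER BY date", con
)

sns.lineplot(data=df, x="date", y="portfolio_value", hue="budget_pct")
plt.title("Portfolio value over time by budget percentage")
plt.savefig("portfolio_value.png")
```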

Looking at the graphs, you can see that the experiment was a failure: none of the budget percentages made a difference to the outcome; they just amplified what the algorithm was already doing. I realise now that I could have swapped the dynamic programming approach, with its tabulation and memoisation, for a greedy approach, as the algorithm just returned the best-performing stock plus a combination of stocks to use up the rest of the budget. Nonetheless, this was an amazing learning opportunity covering algorithms, web technologies and scraping, and data analysis and visualisation. I am now very confident in Python as well as SQL, both useful tools to add to my arsenal.
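For comparison, a hedged sketch of that greedy alternative (not code from the repository): rank stocks by score per unit price and fill the budget in that order, which reproduces the "best stock plus filler" behaviour far more cheaply than the DP table.

```python
def greedy_portfolio(budget, prices, scores):
    # Rank stocks by score per unit of price, best ratio first.
    order = sorted(range(len(prices)),
                   key=lambda i: scores[i] / prices[i], reverse=True)
    counts = [0] * len(prices)
    for i in order:
        n = int(budget // prices[i])   # buy as many whole shares as still fit
        counts[i] += n
        budget -= n * prices[i]
    return counts
```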
