This is my attempt at a GPU-accelerated machine learning engine.
The example I'm using for learning purposes is a custom genetic algorithm that I've just bodged together. It's still a work in progress, but by the end of it I hope to have a fully GPU-accelerated server instance that I can run on one computer and feed training data from a separate, outside PC in real time. It's designed to be as modular as possible, utilising the CPU to save a copy of the model to disk (and a backup copy) while the GPU trains the next instance of the model. The GPU and CPU constantly cycle data in this way, so all parts of the training computer are active as much as possible.
How it works: The CPU spends all its time saving the copy of the trained model it holds in RAM to disk (an SSD is HIGHLY recommended over an HDD). At the same time, the GPU asynchronously trains the model it holds in VRAM. At some point the CPU finishes saving the RAM copy (and its backup) to disk. The GPU then finishes what it's doing and hands a copy of the trained model over to the CPU to place in RAM, before carrying on training the model further. The CPU, now holding a newer version of the trained model in RAM, starts saving it to disk.
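The handoff described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: plain Python threads stand in for the GPU and CPU, and `train_step`, `save_to_disk`, `trainer`, `saver` and `run_demo` are all hypothetical names made up for this example. The "model" is just a dict with a generation counter.

```python
import copy
import queue
import threading


def train_step(model):
    # Hypothetical stand-in for one round of GPU training.
    model["generation"] += 1


def save_to_disk(snapshot, saved_log):
    # Hypothetical stand-in for writing the model (and its backup) to disk.
    saved_log.append(snapshot["generation"])


def trainer(model, handoff, saver_idle, total_steps):
    last_handed = None
    for _ in range(total_steps):
        train_step(model)
        # Only hand over a copy once the saver has finished its previous write.
        if saver_idle.is_set():
            saver_idle.clear()
            handoff.put(copy.deepcopy(model))
            last_handed = model["generation"]
    # Make sure the final state gets saved, then tell the saver to stop.
    if last_handed != model["generation"]:
        handoff.put(copy.deepcopy(model))
    handoff.put(None)


def saver(handoff, saver_idle, saved_log):
    while True:
        saver_idle.set()          # signal: ready for a newer snapshot
        snapshot = handoff.get()  # wait until the trainer hands one over
        if snapshot is None:
            break
        save_to_disk(snapshot, saved_log)


def run_demo(total_steps=1000):
    model = {"generation": 0}
    handoff = queue.Queue()
    saver_idle = threading.Event()
    saved_log = []
    threads = [
        threading.Thread(target=trainer,
                         args=(model, handoff, saver_idle, total_steps)),
        threading.Thread(target=saver,
                         args=(handoff, saver_idle, saved_log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return saved_log  # generations that reached "disk", in order
```

The trainer never waits for the saver: if the previous save is still running, it simply keeps training and skips the handoff, so some generations are never written. Which generations get saved depends on thread timing, but the last one always is.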
In this way, the GPU is always training, the CPU is always saving data to disk, and the disk always has both the most up-to-date version of the model and a backup copy (just in case of a system crash, corruption, or a power outage in the middle of saving).
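One possible way to get this kind of crash safety is sketched below. It assumes the model can be serialised with `pickle` and that the backup lives next to the main file with a `.bak` suffix; both are assumptions for this example, as are the names `save_with_backup` and `load_model`. Each file is written to a temporary file first and then swapped into place with `os.replace`, so a crash mid-write should leave the previous complete file untouched.

```python
import os
import pickle


def _atomic_write(data, path):
    # Write to a temporary file first, then swap it into place in one step,
    # so `path` always holds either the old or the new complete file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to the physical disk
    os.replace(tmp_path, path)


def save_with_backup(model, path):
    # Hypothetical layout: main copy at `path`, backup at `path + ".bak"`.
    data = pickle.dumps(model)
    _atomic_write(data, path)
    _atomic_write(data, path + ".bak")


def load_model(path):
    # Try the main copy first and fall back to the backup if it is
    # missing or unreadable.
    for candidate in (path, path + ".bak"):
        try:
            with open(candidate, "rb") as f:
                return pickle.load(f)
        except (OSError, pickle.UnpicklingError, EOFError):
            continue
    raise FileNotFoundError("no readable copy of the model at " + path)
```

Because the two copies are written one after the other, at least one of them is a complete file at every moment, and `load_model` picks whichever one is still readable.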
The issue I'm currently working on is converting the code from 32-bit to 64-bit to unlock significantly more processing power. If I can get my hands on an RTX 4090, I expect this change to make my code run 12 times faster than it does now in 32-bit. After that, the last step is to create a TCP server interface (or similar), so that the entire AI training system runs asynchronously, and on separate hardware, from whatever simulation environment you intend to use it with.
My plan is to train it on fake stock market data (which is how the example in "Live Runtime" is already set up) to get it ready for crypto trading, taking into account taxes on transactions. NFTs and crypto might be mostly scams at their core, but that doesn't mean I can't train an intelligent system to profit off that fact!