The point is to run an LLM to chat with, just by adding a cog to the bot.
But here are several constraints that I applied:
- To run on AMD, NVIDIA or CPU
- To run on a 'normal' GPU (8 GB VRAM), so using a GGUF (quantized) version of the model (previously GGML)
- To not use slash-command interactions
- To work with any GPT4All-compatible model, like Google Gemma or any other good model
- To be able to generate images using the bot
So, I went with GPT4All.
And here come even more constraints, lol.
So at this point, this is just a draft, a proof of concept.
Here is the current state of the Chatbot part of the bot:
- Set for Meta-Llama-3-8B-Instruct.Q4_0.gguf, but it can fit any model.
- Only one session, for all users and channels.
- Only one response at a time.
- The bot will respond only when you tag it or reply to it.
- There are 3 additional "!" commands (no need to tag the bot):
  - !stop, to stop the current text generation.
  - !reset, to reset the current conversation/session.
  - !generate, to generate a prompt, then an image.
- When the current response from the bot reaches the Discord 2000-character limit, a new message is created, to continue smoothly.
- The author is tagged in the response, to highlight the message.
- The System Prompt is 'reintroduced' in the next message when reaching the context limit, so the bot never forgets its instructions in a long conversation, without a session reset.
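To give an idea of the 2000-character splitting, here is a minimal sketch of how such a helper could look. This is a hypothetical function for illustration, not the cog's exact code (the real cog streams tokens and continues in a new message as the current one fills up):

```python
def split_for_discord(text: str, limit: int = 2000) -> list[str]:
    """Split a long reply into Discord-sized chunks, preferring to break
    at a newline before the limit so the conversation continues smoothly.
    Hypothetical helper, shown for illustration only."""
    chunks = []
    while len(text) > limit:
        # Break at the last newline before the limit; hard-cut if none.
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be sent as its own message, with the author tagged in the first one.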
To get this functional, you need to:
- add the chatbotcog.py file into the core folder
- add gpt4all to requirements.txt, after the torch line, so it gets installed (the model itself should be downloaded automagically later; see commit wizz13150@2da0c5e):
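Concretely, that's just one extra line in requirements.txt, right after torch (the surrounding lines are illustrative, not the fork's exact file):

```text
torch
gpt4all
```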
With this code, the bot will load the Llama-3-8B-Instruct model on an 8 GB GPU.
VRAM usage screenshot (8K-token context here; 16-18K max with 8 GB; can be extended with another model and more VRAM):
With this code, it's using a second AMD GPU (an RX 570, generating at ~8 tokens/sec), while the webui uses the main RTX 3060 for SD.
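The AMD/NVIDIA/CPU flexibility comes from the gpt4all package itself, which uses a Vulkan (Kompute) backend for GPU inference. A minimal loading sketch could look like the following; the parameter names follow the gpt4all Python bindings, but the exact values here are my assumptions, not the cog's real settings:

```python
def load_chat_model(device: str = "gpu", n_ctx: int = 8192):
    """Load the quantized Llama 3 model via GPT4All.

    gpt4all's Vulkan backend runs on AMD as well as NVIDIA; pass "cpu"
    to stay off the GPU entirely. Sketch only -- values are assumptions.
    """
    # Imported lazily so this file parses even without the package installed.
    from gpt4all import GPT4All
    return GPT4All(
        "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
        device=device,        # "gpu", "cpu", or a specific device name
        n_ctx=n_ctx,          # ~8K tokens of context fits in 8 GB VRAM here
        allow_download=True,  # the model is fetched automatically on first run
    )
```

Picking a larger `n_ctx` (or a bigger model) is what pushes VRAM usage up, which is why more context needs more VRAM.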
Here is a screenshot of an interaction with the Chatbot:
Here is a screenshot example of image generation from the Chatbot:
Live Preview:
Result:
Many ideas come to mind, like letting the bot search the internet.
But any other new feature implies a structure/logic change, or additional resources.
Fully implementing this in the latest version of the bot would also require many changes.
So... We'll see.
I'll edit this post when more comes to my mind.
If any peep wants to play with this, I'd be happy to get any advice or feedback.
The Chatbot cog I added to my fork lives here:
https://github.com/wizz13150/aiyabot/blob/Full_bot/core/chatbotcog.py
Cheers! 🥂