Krya.ai is an innovative LLM Automation system that empowers you to automate complex tasks on your local machine using the power of Large Language Models (LLMs) like Google Gemini. This initial release focuses on streamlining workflows by generating and executing code, with a sophisticated feedback loop that refines code through error handling and continuous learning. Think of Krya.ai as your intelligent, automated assistant capable of understanding and executing your instructions directly on your computer.
- Automated Code Execution: Krya.ai generates code based on your instructions and executes it in your local environment.
- Intelligent Task Automation: It automates cursor movements, clicks, and typing, enabling you to interact with your desktop applications programmatically.
- Error Handling & Refinement: A feedback loop allows Krya.ai to learn from errors and refine its code, improving accuracy and efficiency over time.
- Extensible Architecture: Built to be easily adaptable and extended for future integrations.
Demo screencasts:
- Screencast.from.2024-12-12.19-27-37.webm
- Screencast.from.2024-12-12.19-30-55.webm
- Python 3.8 or higher
- pyautogui library for UI automation
- google.generativeai library for Gemini integration
- pyperclip library for clipboard interaction
- streamlit for the web interface
- Access to Gemini API keys
pip install pyautogui google-generativeai pyperclip streamlit
- Gemini API key: https://aistudio.google.com/app/apikey
- Log in with your Google account.
- Create an API key.
- Select your project.
- Your API key will be created; it will look like: AIzaSxxxxxxxxxxh09xxLwCA
- Store it safely.
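The Streamlit app will ask for this key at startup, but if you prefer to load it programmatically, one common approach (an assumption on our side, not something Krya.ai requires) is to export it as an environment variable and pass it to the Gemini client:

```python
import os
import google.generativeai as genai

# Hypothetical variable name; export it in your shell before launching the app:
#   export GEMINI_API_KEY="AIzaS..."
api_key = os.environ.get("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("Set GEMINI_API_KEY before running the app.")

# Configure the Gemini SDK with your key
genai.configure(api_key=api_key)
```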
git clone https://github.com/devdattatalele/Krya.ai.git
cd Krya.ai
- Navigate to the directory where you cloned the repository.
- Launch the Streamlit app:
  streamlit run src/main.py
- Enter your saved API key when prompted in the Streamlit app interface.
- Follow the interactive prompts and begin automating your tasks.
- LLM Code Generation: Based on your instructions, Krya.ai generates code snippets.
- Execution Environment: The generated code runs on your local machine through a Python interpreter.
- Output Feedback: Results or errors from the execution are fed back to the LLM, which refines the code when needed.
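A minimal sketch of this generate-execute-feedback loop is shown below. It is not the actual Krya.ai implementation; the model name, prompt wording, and retry count are assumptions.

```python
import subprocess
import sys
import google.generativeai as genai

def generate_and_run(instruction: str, max_attempts: int = 3) -> str:
    """Ask the LLM for a script, run it locally, and feed errors back for refinement."""
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
    prompt = f"Write a Python script that does the following:\n{instruction}\nReturn only the code."
    for _ in range(max_attempts):
        code = model.generate_content(prompt).text
        # In practice the response may need markdown fences stripped before execution.
        result = subprocess.run([sys.executable, "-c", code],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout  # success: return the script's output
        # Failure: send the error back to the model and ask for a corrected script.
        prompt = (f"The previous script failed with this error:\n{result.stderr}\n"
                  f"Return a corrected Python script for: {instruction}")
    raise RuntimeError("Could not produce a working script after several attempts.")
```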
- PyAutoGUI Scripts: The LLM generates scripts using PyAutoGUI for tasks like mouse movements, clicks, and keyboard inputs.
- Script Execution: The generated scripts are executed to automate the specified UI tasks.
Example Task: Open a text editor and type "Hello, World!".
    import pyautogui
    import time

    # Open the text editor (this may vary based on the system)
    pyautogui.hotkey('win')
    pyautogui.typewrite('notepad\n', interval=0.1)
    time.sleep(1)

    # Type "Hello, World!"
    pyautogui.typewrite('Hello, World!', interval=0.1)
Execution: Running this Python script with PyAutoGUI installed will automate the process of opening Notepad and typing "Hello, World!".
- Automated Testing: Generate and execute test scripts based on user instructions.
- GUI Automation: Automate repetitive tasks such as form filling, software navigation, and more.
- Task Automation: Automate complex workflows that require both code and UI interactions.
- Personal Assistant: Use it to automate your day-to-day computer tasks.
- Phase 1 (Current): Focused on a single user prompt leading to code generation, local execution, and basic UI automation. This phase has laid the groundwork with core functionality.
- Phase 2: Enhanced Automation & Interaction: This phase will focus on making the system more robust, versatile, and interactive, extending the core functionality with new features.
- Phase 3 and Beyond: Advanced Capabilities & Ecosystem: We will explore new possibilities and build out an ecosystem around Krya.ai with additional features and integrations.
Core Improvements:
- Multi-Prompt Handling:
- Description: Move beyond single-prompt execution. Krya.ai will maintain conversation history, allowing users to interact iteratively with the system.
- Implementation: Implement a message history and context management for the LLM, enabling follow-up prompts and dependencies between actions (see the sketch after this list).
- Enhanced Error Handling & Recovery:
- Description: Improved error logging along with automated retries, suggestions, and user-driven corrections so that errors are handled gracefully.
- Implementation: Implement better error catching and feedback mechanisms so that the user is alerted when an error occurs, with options to correct the error and/or retry the previous task.
- Context-Awareness:
- Description: The system should maintain state and context between multiple prompts.
- Implementation: Use session data and a memory structure to track generated code snippets, executed actions, and user preferences. This will make the system more context-aware and provide a more personalized experience.
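As a rough illustration of the multi-prompt handling and context-awareness items above, conversation history could be kept in a small session object. This is only a sketch, not the planned implementation; the class, model name, and attribute names are hypothetical.

```python
import google.generativeai as genai

class Session:
    """Keeps the running conversation so follow-up prompts can build on earlier steps."""

    def __init__(self, model_name="gemini-1.5-flash"):  # assumed model name
        self.model = genai.GenerativeModel(model_name)
        self.chat = self.model.start_chat(history=[])   # the SDK keeps the message history
        self.executed_snippets = []                     # remember code that has already run

    def ask(self, prompt):
        # Earlier turns stay in self.chat.history, giving the model context for follow-ups.
        response = self.chat.send_message(prompt)
        return response.text

    def record_execution(self, code):
        # Track what was executed so later prompts can reference or modify it.
        self.executed_snippets.append(code)
```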
Use Cases (Phase 2)
- Data Extraction & Report Generation: Extract data from web pages or documents and summarize it into readable reports.
- Automated Data Processing: Process multiple data sources at a given interval.
- Advanced File Management: Organize and manage files based on specific criteria.
- Web Automation: Perform complex tasks on the web, such as form filling or data extraction.
Potential Features:
- Plugin Architecture:
- Description: Allow users to extend Krya.ai's functionality through custom plugins.
- Implementation: Build a plugin system so users can install or create their own custom actions and integrations (see the sketch after this list).
- GUI Development Assistant:
- Description: Help users to quickly generate GUI interfaces to interact with the automation scripts.
- Implementation: Include a UI framework for designing simple GUI interfaces from natural language requests.
- Cloud Integration:
- Description: Integrate with cloud services for storage, execution, and access.
- Implementation: Support for cloud providers (AWS, GCP, Azure) to leverage cloud-based resources for enhanced processing and accessibility.
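To illustrate the plugin idea above, custom actions might be registered behind a common interface. This is only a sketch under assumed names; Krya.ai's eventual plugin API may look different.

```python
from typing import Callable, Dict

# Hypothetical registry mapping plugin names to the functions that implement them.
PLUGINS: Dict[str, Callable[[str], str]] = {}

def register_plugin(name: str):
    """Decorator that adds a custom action to the registry."""
    def wrapper(func: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = func
        return func
    return wrapper

@register_plugin("shout")
def shout(text: str) -> str:
    # Example custom action: upper-case the input text.
    return text.upper()

# The core system could then dispatch to plugins by name:
print(PLUGINS["shout"]("hello from a plugin"))  # -> HELLO FROM A PLUGIN
```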
We welcome contributions! If you have any ideas for features or improvements, feel free to create pull requests.
This project is licensed under the [Insert License Name Here] License.
- Support for more LLMs (OpenAI, others).
- Improved error handling.
- Enhanced user interface.
- More complex automation workflows.
- Support for more operating systems.