This is a chatbot application engineered to provide detailed insights and visualisations based on user queries regarding specific datasets. It uses Streamlit for the front end and integrates backend technologies such as LangChain, OpenAI, SQL, and Python agents for data processing and retrieval.


SaM-92/DataInsightChatbot

SQL Query and Data Analysis Chatbot 💬

Technologies Used

LangChain, OpenAI, Python, PostgreSQL, SQLite, Streamlit, Heroku

Overview

This chatbot application provides detailed insights and visualisations in response to user queries about specific datasets, such as the Irish historical electricity dataset. It uses Streamlit for the front end and LangChain, OpenAI, SQL, and Python agents on the back end for data processing and retrieval.

It features a SQL agent that interprets user queries to generate SQL statements, and a Python agent that translates the SQL output into visualisation code that plots the results. Both steps rely on natural language processing (NLP) to turn a plain-language question into executable SQL and Python code, so users can analyse and visualise the data without writing queries themselves.
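The two-stage flow described above can be pictured with a minimal sketch. This is not the project's implementation: `generate_sql` and `generate_plot_code` are hypothetical stand-ins for the LLM-backed agents, and the `generation` table with its values is made up for illustration.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the SQL agent (an LLM call in the real app)."""
    # A real agent would derive this statement from the question.
    return "SELECT day, wind_mw FROM generation ORDER BY day"


def generate_plot_code(columns: list[str]) -> str:
    """Hypothetical stand-in for the Python agent: returns plotting code as text."""
    x, y = columns
    return (
        "import matplotlib.pyplot as plt\n"
        "plt.plot([r[0] for r in rows], [r[1] for r in rows])\n"
        f"plt.xlabel({x!r}); plt.ylabel({y!r})\n"
    )


def answer(question: str, conn: sqlite3.Connection):
    """Run stage 1 (question -> SQL -> rows), then stage 2 (rows -> plot code)."""
    sql = generate_sql(question)
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    return sql, rows, generate_plot_code(columns)


# Toy in-memory table with made-up values, only to exercise the flow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE generation (day TEXT, wind_mw REAL)")
conn.executemany(
    "INSERT INTO generation VALUES (?, ?)",
    [("2024-01-01", 1200.0), ("2024-01-02", 950.5)],
)
sql, rows, plot_code = answer("How much wind power was generated each day?", conn)
```

In the sketch the generated plotting code is returned as text rather than executed, which keeps the SQL step and the visualisation step separately inspectable.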

Real-time Data Scraping Diagram

Access the App

You can access the Chatbot by clicking the link below:

🔗 SQL Query and Data Analysis Chatbot 💬

Explore the features and see how large language models can simplify data queries related to renewable energy sources in Ireland.

Key Technologies and Techniques:

  • LangChain: Orchestrates the chatbot workflow, integrating SQL and Python agents for seamless data interaction and processing.

  • OpenAI: Utilizes the GPT-4 model for natural language understanding and response generation, enabling the chatbot to interpret and respond to user queries accurately.

  • SQL and Python Agents:

    • SQL Agents: Generate and execute SQL queries against the system database for precise data retrieval.
    • Python Agents: Execute Python scripts to process data and create visualizations using matplotlib.
  • Prompt Engineering:

    • Few-Shot Prompt Templates: Guide the language model in generating accurate SQL queries and responses.
    • System Message Templates: Define detailed instructions for handling various query types.
  • Semantic Similarity Example Selector: Matches user queries with relevant examples to improve response accuracy.
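To illustrate how example selection and few-shot prompting fit together, here is a rough, self-contained sketch. It deliberately swaps embedding-based similarity for a simple word-overlap cosine score so that it runs without an API key; the example questions, table names, and function names are all hypothetical and are not taken from the project's `prompts.py`.

```python
import math
import re
from collections import Counter

# Hypothetical question/SQL pairs; table and column names are made up.
EXAMPLES = [
    {"question": "What was the average wind generation in 2023?",
     "sql": "SELECT AVG(wind_mw) FROM generation WHERE year = 2023"},
    {"question": "Show the maximum electricity demand per month",
     "sql": "SELECT month, MAX(demand_mw) FROM demand GROUP BY month"},
    {"question": "How many records are in the demand table?",
     "sql": "SELECT COUNT(*) FROM demand"},
]


def vectorize(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_examples(question: str, k: int = 2) -> list[dict]:
    """Return the k stored examples most similar to the user's question."""
    q = vectorize(question)
    ranked = sorted(
        EXAMPLES,
        key=lambda ex: cosine(q, vectorize(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, k: int = 2) -> str:
    """Assemble a few-shot prompt: instruction, selected examples, new question."""
    shots = "\n\n".join(
        f"Question: {ex['question']}\nSQL: {ex['sql']}"
        for ex in select_examples(question, k)
    )
    return f"Translate the question into SQL.\n\n{shots}\n\nQuestion: {question}\nSQL:"


prompt = build_prompt("What was the average wind generation last year?")
```

With a real embedding model the ranking would capture meaning rather than shared words, but the shape of the prompt (instruction, nearest examples, then the new question) stays the same.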

Setup

Prerequisites

  • See requirements.txt

Installation

  1. Clone the repository:

    git clone https://github.com/SaM-92/ireland_res_chatbot
  2. Create and activate a conda environment:

     conda create -n sql_agent_dev python=3.10
     conda activate sql_agent_dev
  3. Install the required packages:

    pip install -r requirements.txt
  4. Set up environment variables:

    • Create a .env file in the project root directory.
    • Add your OpenAI API Key to the .env file:
      OPENAI_API_KEY=your_openai_api_key_here
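A missing or placeholder key is a common source of confusing start-up errors, so a small check like the following can help. This is a sketch under assumptions: it presumes the values from `.env` have already been loaded into the process environment (for example by a loader such as python-dotenv; whether this project does so is not confirmed here), and `get_openai_api_key` is a hypothetical helper name.

```python
import os


def get_openai_api_key(env=os.environ) -> str:
    """Return the OpenAI key from the environment, or fail with a clear message."""
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat the placeholder from the setup instructions as "not set".
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError(
            "OPENAI_API_KEY is not set; add it to the .env file in the project root."
        )
    return key
```

Passing a plain dictionary as `env` makes the check easy to try out without touching real environment variables.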

Running the Application

  1. Run the Streamlit application:

    streamlit run app.py
  2. Open your web browser and navigate to the provided local URL to interact with the chatbot.

Project Structure

emerald-insights/
├── images/
│   └── header.png
├── pages/
│   ├── service_overview.py
│   └── irish_data_chatbot.py
├── subs/
│   ├── __init__.py
│   ├── agent.py
│   ├── db_connections.py
│   ├── prompts.py
│   ├── styles.py
│   └── visualisation.py
├── data/
│   └── eirgrid_data.db
├── .env
├── app.py
├── requirements.txt
└── README.md

Key Components

  • app.py: The main entry point of the application. It sets up the page configuration and navigation, and loads the respective pages based on user interaction.

  • pages/service_overview.py: Contains the overview of the service, which is displayed on the "Service Overview" page.

  • pages/irish_data_chatbot.py: Contains the chatbot functionality that handles user queries about the Irish power system.

  • subs/ This directory contains various modules to support the application:

    • agent.py: Handles AI agents.
    • db_connections.py: Handles database connections.
    • prompts.py: Contains the templates and logic for generating prompts.
    • styles.py: Provides styles for the Streamlit application.
    • visualisation.py: Functions to visualise data in response to user queries.

Data

The data is hosted in a PostgreSQL database.

To run the code locally, you can use the SQLite database stored in the data folder as eirgrid_data.db. You also need to change connect_to_irish_db(cloud=True) to connect_to_irish_db(cloud=False) in the code.
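Before switching to the local database, it can be useful to confirm that the file opens and to see which tables it contains. The sketch below assumes `eirgrid_data.db` is a SQLite file (suggested by the technology list, but an assumption) and uses only the standard library; `open_local_db` and `list_tables` are hypothetical helper names, not functions from `db_connections.py`.

```python
import sqlite3
from pathlib import Path

# Path from the project structure, relative to the repository root.
LOCAL_DB = Path("data") / "eirgrid_data.db"


def open_local_db(path=LOCAL_DB) -> sqlite3.Connection:
    """Open the local database, failing early if the file is missing."""
    path = Path(path)
    if not path.exists():
        # sqlite3.connect would silently create an empty file otherwise.
        raise FileNotFoundError(f"Local database not found: {path}")
    return sqlite3.connect(path)


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

Running `list_tables(open_local_db())` from the repository root should then show the tables available to the chatbot when it runs against the local copy.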

Contributing

Contributions are welcome! Please ensure your pull requests are well-documented and tested. Adherence to coding standards, including the use of docstrings and comments, is encouraged.

License

This project is licensed under the GNU General Public License. See the LICENSE file for details.

Contact

Created by Saeed Misaghian
