💡 [REQUEST] - Enable API / NLU Engine Integration + MLOps #346

Open
19 tasks
marrouchi opened this issue Nov 19, 2024 · 1 comment
Assignees
Labels
question Further information is requested

Comments

@marrouchi
Contributor

marrouchi commented Nov 19, 2024

Feature Request: MLOps Integration for NLU Section Inspired by MLflow Features

Description

To enhance Hexabot's NLU capabilities, we propose integrating MLOps-inspired features to streamline and improve the management, training, and evaluation of NLU models. This will significantly improve the user experience and extend Hexabot's utility for teams managing complex NLU workflows.


Proposed Features

  1. UI Enhancements:

    • Menu Reorganization: Divide existing pages into distinct menu items:
      • NLU Samples
      • NLU Entities & NLU Values
      • NLU Train
      • NLU Experiments
      • NLU Models
    • NLU Models Page:
      • Display all available models in the NLU project.
      • Enable actions such as:
        • Initiating a new model training.
        • Selecting the default model for use.
  2. Performance Assessment & Metrics:

    • Create new pages for model performance assessment.
    • Showcase key metrics and visualization for each model's performance, such as:
      • Accuracy, F1-score, Precision, Recall, Loss, etc.
      • Epoch-wise and overall metrics.
  3. Model Training Integration:

    • Add the ability to launch a new model training directly from the UI.
    • Training requests will:
      • Include the training dataset.
      • Be processed by the NLU service (Python project).
      • Trigger updates via webhook notifications.
  4. Model Evaluation:

    • Provide functionality to run evaluations on a given model.
    • Display evaluation results with detailed metrics.
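
The metrics listed under Performance Assessment can be derived from the predicted and true intent labels of an evaluation run. The following is a minimal sketch, assuming single-label intent classification; the function name `per_label_metrics` is hypothetical and not part of Hexabot or the NLU service.

```python
from collections import Counter


def per_label_metrics(y_true, y_pred):
    """Return (accuracy, report) where report maps each label to
    its precision, recall and F1-score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted label was wrong
            fn[truth] += 1  # the true label was missed

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1_den = precision + recall
        f1 = 2 * precision * recall / f1_den if f1_den else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, report
```

Epoch-wise values could be produced by calling such a function once per epoch on a validation split and sending each result to the API.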

Backend Requirements

  1. Webhook Notifications:

    • Implement a webhook notification controller in the Hexabot API (NestJS) to handle model-related updates:
      • Use the environment variables WEBHOOK_TOKEN and WEBHOOK_SECRET to verify the hash sent with each incoming request.
      • Handle events such as:
        • New model experiment updates.
        • Epoch metrics during training.
        • Status changes (e.g., training started/completed).
  2. NLU Service Updates:

    • Introduce a class in the NLU project to handle webhook notifications:
      • The class will push notifications for key events triggered at the level of tfbp.Model (or equivalent), ensuring automatic updates for any model used.
    • Provide robust mechanisms to trigger these events and notify the API seamlessly.
  3. Versioning Support:

    • Enable versioning for both datasets and models in the NLU service:
      • Offer flexibility to store versioned data/models locally or on a remote service (e.g., Amazon S3).
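
One possible shape for the signed webhook notifications is sketched below. It assumes an HMAC-SHA256 signature over the request body keyed with WEBHOOK_SECRET, plus WEBHOOK_TOKEN sent as a plain header; the issue does not specify the hashing scheme, the header names, or the payload layout, so all of these are assumptions. The receiving side would live in the NestJS API; `verify_request` is written in Python here only to illustrate the check.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: str) -> tuple:
    """Serialize the event and return (body, hex HMAC-SHA256 signature)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return body, signature


def build_headers(token: str, signature: str) -> dict:
    # Header names are hypothetical; the issue does not define them.
    return {
        "Content-Type": "application/json",
        "X-Webhook-Token": token,
        "X-Webhook-Signature": signature,
    }


def verify_request(body: bytes, headers: dict, token: str, secret: str) -> bool:
    """Receiver-side check: the token matches and the signature
    covers exactly the bytes that were received."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    got_token = headers.get("X-Webhook-Token", "").encode("utf-8")
    got_sig = headers.get("X-Webhook-Signature", "").encode("utf-8")
    # compare_digest avoids leaking information through timing differences.
    token_ok = hmac.compare_digest(got_token, token.encode("utf-8"))
    sig_ok = hmac.compare_digest(got_sig, expected.encode("utf-8"))
    return token_ok and sig_ok
```

Signing the raw body rather than the parsed JSON means the receiver must verify the bytes before parsing them, which is worth keeping in mind when wiring this into a framework that parses request bodies automatically.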

Expected Benefits

  • Improved user experience with better UI organization and visibility into NLU models and experiments.
  • Streamlined training and evaluation processes with integrated webhook-based feedback.
  • Enhanced collaboration and reproducibility with model and dataset versioning.
  • Future-proofing Hexabot for more advanced MLOps features and enterprise use cases.

Open Questions

  1. What specific metrics should be included in the performance visualization?
  2. Should the UI include options to configure storage locations (local vs. remote) for versioned data/models?
  3. Are there additional webhook events that should be included in this iteration?

TODOS for MLOps Integration in NLU Section

Here’s an ordered list of todos for integrating MLOps-inspired features into Hexabot's NLU section. Each task is designed to be implemented in a separate PR for better tracking and iterative development. You can mark tasks as done as you progress.

  1. Menu Reorganization

    • Divide existing pages into distinct menu items:
      NLU Samples, NLU Entities & NLU Values, NLU Train, NLU Experiments, NLU Models.
  2. Create NLU Models Page

    • Display all models available under the NLU project.
    • Add functionality to initiate a new model training from the page.
    • Allow selecting a default model for use.
  3. Add Model Performance Pages

    • Create new pages to assess model performance.
    • Showcase detailed metrics for models (Accuracy, F1-score, Precision, Recall, etc.).
    • Add visualizations for epoch-wise metrics.
  4. Integrate Training Feature

    • Implement the ability to launch training for a new model.
    • Ensure training requests include the dataset and are processed by the NLU service.
  5. Implement Evaluation Feature

    • Add functionality to run evaluations for a given model.
    • Display evaluation results, including metrics and visualizations.
  6. Add Webhook Notification Support in API (NestJS)

    • Implement a webhook controller to receive model-related updates.
    • Validate webhook requests using WEBHOOK_TOKEN and WEBHOOK_SECRET.
  7. Add Webhook Notification Support in NLU Service

    • Create a class to send webhook notifications on key events (e.g., experiment creation, epoch updates).
    • Trigger webhook events automatically at the level of tfbp.Model.
  8. Enable Dataset and Model Versioning

    • Add versioning support for datasets and models in the NLU service.
    • Allow users to choose between local and remote storage (e.g., Amazon S3).
  9. UI Enhancements for Versioning

    • Add UI elements to display version history for datasets and models.
    • Provide options to manage versions (e.g., delete old versions).
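
Dataset versioning could, for instance, derive the version identifier from the dataset content, so that identical data always maps to the same version. The sketch below assumes SHA-256 content hashing and a `<root>/<version>/dataset.json` layout on local disk; both the layout and the function names are hypothetical, and a remote backend such as Amazon S3 would replace only the write step.

```python
import hashlib
import json
from pathlib import Path


def dataset_version(samples: list) -> str:
    """Derive a deterministic version id from the dataset content.
    Key order inside each sample is ignored; sample order is not."""
    canonical = json.dumps(samples, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


def save_version(samples: list, root: Path) -> Path:
    """Write the dataset under root/<version>/dataset.json (local storage only)."""
    version_dir = root / dataset_version(samples)
    version_dir.mkdir(parents=True, exist_ok=True)
    target = version_dir / "dataset.json"
    if not target.exists():  # identical content is stored only once
        target.write_text(
            json.dumps(samples, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    return target
```

With this scheme, the version history shown in the UI would simply be the list of version directories (or remote prefixes), and deleting an old version removes one directory.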

Workflow

  • Tasks are listed in the recommended implementation order.
  • Each task should be completed and reviewed as a separate PR.
  • Progressively integrate the feature set into the project to ensure functionality and avoid breaking changes.

Let me know if you’d like me to expand on any specific task or create detailed subtasks!

@marrouchi marrouchi added the question Further information is requested label Nov 19, 2024
@marrouchi marrouchi changed the title 💡 [REQUEST] - Enable API / NLU Engine Integration 💡 [REQUEST] - Enable API / NLU Engine Integration + MLOps Nov 22, 2024
@MohamedAliBouhaouala
Contributor

MohamedAliBouhaouala commented Nov 26, 2024

Current progress at the NestJS API level:

  • Nlp schemas updated to handle dataset versioning/tracking, experiment versioning/tracking, metric and hyperparameter tracking, and model caching [DONE]
  • Referential integrity on delete at the repository level [DONE]
  • Referential integrity on write at the repository level [DONE]
  • Services implemented [DONE]

@todo list:

  • Add search functionality for the best experiments of a given model (open question: should the client choose the metric used for ranking, or should we combine several metrics into a single score?)
  • Add search functionality for optimal hyperparameters (i.e., which combination of hyperparameters gives the best experiment for a given model)
  • Add unit tests
  • Push a PR
  • Push another PR in which the controllers are implemented using the webhook notifications
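
Both ranking options raised in the first to-do item can be covered by a weighted score: a single client-chosen metric is the special case of one weight. The sketch below is only an illustration under that assumption; the experiment record shape (`id`, `hyperparameters`, `metrics`) and the function names are hypothetical and do not reflect the actual Nlp schemas.

```python
def score(metrics: dict, weights: dict) -> float:
    """Weighted sum of the chosen metrics; a missing metric counts as 0."""
    return sum(w * metrics.get(name, 0.0) for name, w in weights.items())


def best_experiment(experiments: list, weights: dict) -> dict:
    """Return the experiment with the highest weighted score."""
    if not experiments:
        raise ValueError("no experiments to rank")
    return max(experiments, key=lambda e: score(e["metrics"], weights))


# Illustrative records only; the values are made up for the example.
experiments = [
    {"id": "a", "hyperparameters": {"lr": 0.01}, "metrics": {"f1": 0.80, "accuracy": 0.90}},
    {"id": "b", "hyperparameters": {"lr": 0.001}, "metrics": {"f1": 0.85, "accuracy": 0.70}},
]

by_f1 = best_experiment(experiments, {"f1": 1.0})
by_mix = best_experiment(experiments, {"f1": 0.5, "accuracy": 0.5})
```

The `hyperparameters` of the returned record then answer the second to-do item for the combinations that have already been tried. A metric where lower is better, such as loss, would need a negative weight.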

Potential enhancement:

  • Add a trigger for hyperparameter optimization (automatically train the model over different hyperparameter combinations to find the one that yields the best experiment for a given model)
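
The comment does not say which search strategy such a trigger would use. As a rough sketch, assuming an exhaustive grid search, the loop could look like the following; `train_fn` stands for whatever call launches a training run and returns its metrics, and `fake_train` is an arbitrary stand-in so the example runs without a model.

```python
from itertools import product


def grid_search(train_fn, grid: dict, metric: str = "f1"):
    """Try every combination in `grid` and return (best_params, best_metrics)."""
    names = sorted(grid)
    best_params, best_metrics = None, None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        metrics = train_fn(**params)  # hypothetical: one training run per combination
        if best_metrics is None or metrics[metric] > best_metrics[metric]:
            best_params, best_metrics = params, metrics
    return best_params, best_metrics


def fake_train(lr, batch_size):
    # Stand-in for a real training run; the formula is arbitrary.
    return {"f1": 1.0 - abs(lr - 0.01) * 10 - abs(batch_size - 32) / 1000}


best_params, best_metrics = grid_search(
    fake_train, {"lr": [0.1, 0.01, 0.001], "batch_size": [16, 32]}
)
```

Grid search grows multiplicatively with the number of values per hyperparameter, so a real trigger would probably need a cap on the number of runs or a cheaper strategy such as random search.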
