Releases: Dataherald/dataherald
v1.0.3
Release Notes for Version 1.0.3
What's New
1. New features
- Added Redshift support c8e55a2
- Added multi-schema support for db connections; it currently works only with Postgres, BigQuery, Snowflake, and Databricks d4d6f4e
2. Improvements and fixes
- Fixed `uri` validation for db connections f40ac0e
- Fixed the fallback and confidence score 30f5226
- Fixed the observation code blocks a11d1f1
- Fixed refresh endpoint error handling 15b6d46
- Fixed malformed SQL queries in intermediate steps 828c64d
- The sql-generation endpoint now raises an error when it receives invalid SQL fbd96ea
v1.0.2
Release Notes for Version 1.0.2
What's New
1. New features
- Adds Astra vector store support 6f39892
- Adds MS SQL Server support 078c17d
- Adds Streaming endpoint to show intermediate steps 1205d8a
- Adds support for Pinecone serverless 7906f03
- Adds intermediate steps in the SQL Generation response 3dbd483
- Adds a LangSmith metadata param (`langsmith_metadata`) to make filtering easier cf88a1b
- Stores the db dialect when a db connection is created 809ac31
2. Improvements and fixes
- Adds logs when a request fails 09f65c6
- Adds descriptions to the new agent faf07de
- Fixes malformed LLM output 4190b4d
- Documents error codes e94c788
- Fixes the running query forever issue cfb1d5b
- Fixes the error parsing handler 8751410
- Adds ClickHouse HyperLogLog support to improve scanning 61a92c9
- Fixes SQL generation 5160e8d
- Fixes the background scanner process running in parallel 88ee8fa
- Fixes error handling for golden SQL additions 8efb00f
3. Migration Script
- Purpose: To facilitate a smooth transition from version 1.0.1 to version 1.0.2, we've introduced a migration script.
- Data Modifications: The script performs the following actions:
  - Decrypts the `uri` column of every db connection.
  - Executes a regex method to retrieve the db dialect.
  - Stores the `dialect` column in the `database_connections` Mongo collection.
To run the migration script, use the following command:
docker-compose exec app python3 -m dataherald.scripts.populate_dialect_db_connection
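For reference, the dialect lookup can be sketched roughly as follows. This is a hypothetical illustration only: the decryption helper, field names, and database name are assumptions, not the actual implementation in dataherald.scripts.populate_dialect_db_connection.

```python
# Hypothetical sketch of how a dialect could be derived from a connection URI
# and stored back on each db connection document. Field names, the database
# name, and the decrypt() stub are assumptions for illustration.
import re
from pymongo import MongoClient

def decrypt(value: str) -> str:
    # Placeholder for the application's real decryption helper.
    return value

def extract_dialect(uri: str) -> str:
    # e.g. "postgresql+psycopg2://user:pass@host/db" -> "postgresql"
    match = re.match(r"^([A-Za-z0-9]+)", uri)
    return match.group(1) if match else "unknown"

client = MongoClient("mongodb://localhost:27017")
connections = client["dataherald"]["database_connections"]

for connection in connections.find({}):
    uri = decrypt(connection["connection_uri"])
    connections.update_one(
        {"_id": connection["_id"]},
        {"$set": {"dialect": extract_dialect(uri)}},
    )
```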
v1.0.1
Release Notes for Version 1.0.1
What's New
1. New features
- Added ClickHouse support d494fed
- MariaDB/MySQL support officially added and documented. 7b86ad3
- Added a refresh endpoint (`POST /api/v1/table-descriptions/refresh`) to fetch the table names from a specified database and store them in the `table-description` Mongo collection. This improves response time when querying the table-description list endpoint (`GET /api/v1/table-descriptions`). 28b8130 (See the example after this list.)
- Implemented error codes for better error handling. Errors now respond with a 400 HTTP status code. 2c70f16
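As a quick illustration, the refresh endpoint can be called like this. This is a minimal sketch: the base URL, request body shape, and placeholder id are assumptions.

```python
# Hypothetical call to the refresh endpoint; the base URL and the shape of
# the request body are assumptions for illustration.
import requests

BASE_URL = "http://localhost"
DB_CONNECTION_ID = "<db_connection_id>"  # placeholder

# Refresh the stored table descriptions for a database connection.
refresh = requests.post(
    f"{BASE_URL}/api/v1/table-descriptions/refresh",
    json={"db_connection_id": DB_CONNECTION_ID},
)
refresh.raise_for_status()

# Listing table descriptions is now faster because the names are pre-stored.
tables = requests.get(
    f"{BASE_URL}/api/v1/table-descriptions",
    params={"db_connection_id": DB_CONNECTION_ID},
).json()
print(len(tables), "table descriptions")
```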
2. Changes and fixes
- Reduced SSH fields in requests by utilizing the `connection_uri` field. 64ceb6e
- Updated LLM with the latest models. dd440f2
- Expanded functionality to allow SSH connections on different ports. 1a5a2be
- Improved performance for the scanning endpoint (`POST /api/v1/table-descriptions/sync-schemas`). 435884e
3. Migration Script
- You don't need to update the data if you're already using the stable 1.0.0 version; you can simply pull these changes.
v1.0.0
Release Notes for Version 1.0.0
What's New
1. New Resources, Attributes, and Endpoints
- Finetuning: One of our exciting new features is automatically finetuning GPT-family models on your golden question/SQL pairs. (See the example after this list.)
  - `POST /api/v1/finetuning`: Creates a finetuning job on your golden question/SQL pairs. The only required parameter is the `db_connection_id`, and you can optionally specify which golden question/SQL pairs to use for the finetuning process.
  - `GET /api/v1/finetuning/{finetuning_id}`: Retrieves the status of the finetuning process; once the status is SUCCEEDED you can use the model for SQL generation.
  - `POST /api/v1/finetuning/{finetuning_id}/cancel`: Cancels the finetuning job if you no longer need it.
  - `GET /api/v1/finetuning`: Lists all of the finetuned models for a given `db_connection_id`.
  - `DELETE /api/v1/finetuning/{finetuning_id}`: Deletes a given finetuned model from the finetunings collection.
- Metadata: All resources now include a `metadata` attribute, allowing you to store additional information for internal purposes. Soon, GET list endpoints will support filtering based on metadata fields.
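To illustrate the finetuning flow described above, here is a minimal sketch. Only db_connection_id is documented as required; the base URL, the response fields, and the optional body fields are assumptions.

```python
# Hypothetical finetuning workflow; everything except the endpoint paths and
# the db_connection_id parameter is an illustrative assumption.
import requests

BASE_URL = "http://localhost"

# 1. Create a finetuning job on the golden question/SQL pairs.
job = requests.post(
    f"{BASE_URL}/api/v1/finetuning",
    json={"db_connection_id": "<db_connection_id>"},
).json()

# 2. Check the status; once it is SUCCEEDED the model can be used for
#    SQL generation.
status = requests.get(f"{BASE_URL}/api/v1/finetuning/{job['id']}").json()
print(status)

# 3. Cancel the job if it is no longer needed.
requests.post(f"{BASE_URL}/api/v1/finetuning/{job['id']}/cancel")

# List all finetuned models for a db connection.
models = requests.get(
    f"{BASE_URL}/api/v1/finetuning",
    params={"db_connection_id": "<db_connection_id>"},
).json()
```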
2. Resource and Endpoint Changes
- Renaming `questions` to `prompts`: The entity has been renamed to `Prompt`, and the collection is now called `prompts`. You can use the following endpoints to interact with this resource:
  - `GET /api/v1/prompts`: List all existing prompts.
  - `POST /api/v1/prompts`: Create a new prompt.
  - `GET /api/v1/prompts/{prompt_id}`: Retrieve a specific prompt.
  - `PUT /api/v1/prompts/{prompt_id}`: Update the metadata for a prompt.
- Splitting `responses` into `sql_generations` and `nl_generations`: The previous `responses` resource has been divided into `sql_generations` and `nl_generations`. You can work with them as follows:
  - `POST /api/v1/prompts/{prompt_id}/sql-generations`: Create a sql-generation from an existing prompt.
  - `POST /api/v1/prompts/sql-generations`: Create a new prompt and a sql-generation.
  - `GET /api/v1/prompts/sql-generations`: List sql-generations.
  - `GET /api/v1/sql-generations/{sql_generation_id}`: Retrieve a specific sql-generation.
  - `PUT /api/v1/sql-generations/{sql_generation_id}`: Update the metadata for a sql-generation.
  - `GET /api/v1/sql-generations/{sql_generation_id}/execute`: Execute the created SQL and retrieve the result.
  - `GET /api/v1/sql-generations/{sql_generation_id}/csv-file`: Execute the created SQL and generate a CSV file using the result.
  - `POST /api/v1/sql-generations/{sql_generation_id}/nl-generations`: Create an nl-generation from an existing sql-generation.
  - `POST /api/v1/prompts/{prompt_id}/sql-generations/nl-generations`: Create a sql-generation and an nl-generation from an existing prompt.
  - `POST /api/v1/prompts/sql-generations/nl-generations`: Create a prompt, sql-generation, and nl-generation.
  - `GET /api/v1/nl-generations`: List all nl-generations.
  - `GET /api/v1/nl-generations/{nl_generation_id}`: Retrieve a specific nl-generation.
  - `PUT /api/v1/nl-generations/{nl_generation_id}`: Update the metadata for an nl-generation.
- Renaming `golden_records` to `golden_sqls`: We've updated the name for all endpoints, entities, and collections.
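Putting the renamed resources together, a typical end-to-end flow looks roughly like this. The endpoint paths come from the list above; the base URL and the request/response fields are assumptions for illustration.

```python
# Hypothetical prompt -> sql-generation -> nl-generation flow; request and
# response field names are illustrative assumptions.
import requests

BASE_URL = "http://localhost"

# Create a prompt.
prompt = requests.post(
    f"{BASE_URL}/api/v1/prompts",
    json={
        "text": "What was the average rent price last year?",
        "db_connection_id": "<db_connection_id>",
        "metadata": {"source": "release-notes-example"},
    },
).json()

# Create a sql-generation from the prompt.
sql_generation = requests.post(
    f"{BASE_URL}/api/v1/prompts/{prompt['id']}/sql-generations",
    json={"metadata": {}},
).json()

# Execute the generated SQL and retrieve the result.
rows = requests.get(
    f"{BASE_URL}/api/v1/sql-generations/{sql_generation['id']}/execute"
).json()

# Create an nl-generation from the sql-generation.
nl_generation = requests.post(
    f"{BASE_URL}/api/v1/sql-generations/{sql_generation['id']}/nl-generations",
    json={"metadata": {}},
).json()
print(nl_generation)
```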
3. Migration Script
- Purpose: To facilitate a smooth transition from version 0.0.5 to version 1.0.0, we've introduced a migration script.
- Data Modifications: The script performs the following actions:
  - Renames the `golden_records` collection to `golden_sqls`.
  - Converts all related data types from `ObjectId` to strings.
  - Updates table descriptions by changing the "SYNCHRONIZED" status to "SCANNED" and "NOT_SYNCHRONIZED" to "NOT_SCANNED".
  - Utilizes the existing `questions` collection to create the `prompts` collection.
  - Converts the `responses` collection into the `sql_generations` and `nl_generations` collections.
To run the migration script, use the following command:
docker-compose exec app python3 -m dataherald.scripts.migrate_v006_to_v100
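Two of the simpler steps of this migration, the collection rename and the status update, can be sketched like this. This is only an illustration with pymongo and an assumed database name; the real script in dataherald.scripts.migrate_v006_to_v100 does considerably more (rebuilding the prompts, sql_generations, and nl_generations collections and converting ObjectIds).

```python
# Simplified sketch of two steps of the v1.0.0 migration; the database name
# is an assumption and the real script performs additional work.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["dataherald"]

# Rename the golden_records collection to golden_sqls.
if "golden_records" in db.list_collection_names():
    db["golden_records"].rename("golden_sqls")

# Rename table description statuses.
db["table_descriptions"].update_many(
    {"status": "SYNCHRONIZED"}, {"$set": {"status": "SCANNED"}}
)
db["table_descriptions"].update_many(
    {"status": "NOT_SYNCHRONIZED"}, {"$set": {"status": "NOT_SCANNED"}}
)
```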
We hope that these changes enhance your experience with our platform. If you have any questions or encounter any issues, please don't hesitate to reach out to our support team.
v0.0.6
What's Changed
1. Changes in the POST /api/v1/responses endpoint:
If the `sql_query` body parameter is not set, the response is regenerated. This process generates new values for `sql_query`, `sql_result`, and `response`.
2. Introducing the generate_csv flag:
The `generate_csv` flag is a parameter that allows the generation of a CSV file populated with the `sql_query_result` rows. This parameter can be set in both the POST /api/v1/responses and POST /api/v1/questions endpoints.
- If the file is created, the response will include the field `csv_file_path`. For example: "csv_file_path": "s3://k2-core/c6ddccfc-f355-4477-a2e7-e43f77e31bbb.csv"
- Additionally, if the `generate_csv` flag is set to `True`, the `sql_query_result` will return `NULL` when it contains more than 50 rows.
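As an illustration, the flag can be passed along with a question like this. This is a hypothetical request: the placement of generate_csv in the body and the other field names are assumptions.

```python
# Hypothetical question request with CSV generation enabled; field names
# other than generate_csv are illustrative assumptions.
import requests

response = requests.post(
    "http://localhost/api/v1/questions",
    json={
        "question": "How many customers signed up last month?",
        "db_connection_id": "<db_connection_id>",
        "generate_csv": True,
    },
).json()

# With more than 50 rows, sql_query_result comes back NULL and the data is
# available at csv_file_path instead.
print(response.get("csv_file_path"))
```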
3. Configure S3 Credentials:
- You have the flexibility to set your S3 credentials to store the CSV files within the POST /api/v1/database-connections endpoint as follows:
"file_storage": {
"name": "string",
"access_key_id": "string",
"secret_access_key": "string",
"region": "string",
"bucket": "string"
}
- If S3 credentials are not specified within the `db_connection`, the system will use the S3 credentials from your environment variables, as set in your `.env` file.
These changes will improve the consistency and maintainability of your application's data structures and APIs. If you encounter any issues during the upgrade process, please don't hesitate to reach out to our support team.
v0.0.5
What's Changed
1. Endpoint Update
- Affected Endpoints: The changes impact two API endpoints:
  - POST /api/v1/database-connections: This endpoint is used to create a database connection.
  - PUT /api/v1/database-connections/{db_connection_id}: This endpoint is used to update a database connection.
- Change Description: The `llm_credentials` object in these endpoints has been replaced with the `llm_api_key` field, which only accepts strings as its value. In other words, the `llm_credentials` field has been removed and replaced with a simpler `llm_api_key` field that can only hold string values. This change provides a more straightforward approach to managing API keys and credentials within the system.
2. Migration Script
- Purpose: A migration script has been introduced to assist users in smoothly transitioning their data from version 0.0.4 to version 0.0.5.
- Data Modification: This script operates on the database_connections collection and performs the following action: it replaces the llm_credentials field with the llm_api_key field, but only if the llm_credentials field is populated. In other words, if there is data in the llm_credentials field, the script transfers it to the new llm_api_key field.
To run the migration script, use the following command:
docker-compose exec app python3 -m dataherald.scripts.migrate_v004_to_v005
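Conceptually, the field swap the script performs looks something like the following pymongo sketch. The database name and the assumed shape of the old llm_credentials object are illustrative; the real logic lives in dataherald.scripts.migrate_v004_to_v005.

```python
# Minimal sketch of the llm_credentials -> llm_api_key swap; the shape of
# the old llm_credentials object is an assumption for illustration.
from pymongo import MongoClient

connections = MongoClient("mongodb://localhost:27017")["dataherald"]["database_connections"]

# Only documents where llm_credentials is populated are touched.
for doc in connections.find({"llm_credentials": {"$ne": None}}):
    credentials = doc["llm_credentials"]
    api_key = credentials.get("api_key") if isinstance(credentials, dict) else credentials
    connections.update_one(
        {"_id": doc["_id"]},
        {"$set": {"llm_api_key": api_key}, "$unset": {"llm_credentials": ""}},
    )
```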
These changes will improve the consistency and maintainability of your application's data structures and APIs. If you encounter any issues during the upgrade process, please don't hesitate to reach out to our support team.
v0.0.4
What's Changed f57fde5
1. Endpoint Renaming
We have streamlined our API endpoints for better consistency and clarity:
Renamed Endpoints:
- POST /api/v1/nl-query-responses is now POST /api/v1/responses.
- POST /api/v1/question is now POST /api/v1/questions.
2. Endpoint Removal
In this version, we have removed the following endpoint:
- PATCH /api/v1/nl-query-responses/{query_id}.
Note: Responses resources are now immutable, so you can only create new responses and not update existing ones.
3. MongoDB Collection and Field Renaming
To improve consistency and readability, we have renamed MongoDB collection and field names:
Collection Name Changes:
- nl_questions collection has been renamed to questions.
- nl_query_responses collection has been renamed to responses.
Field Name Changes (within the responses collection):
- nl_question_id has been renamed to question_id.
- nl_response has been renamed to response.
4. Use of ObjectId for Foreign Keys
To enhance data integrity and relationships, we have transitioned to using ObjectId types for foreign keys, providing stronger data typing.
5. Migration Script
We've created a migration script to help you smoothly transition your data from version 0.0.3 to version 0.0.4. This script updates collection names, field names, and foreign keys data type to ObjectId. To run the migration script, use the following command:
docker-compose exec app python3 -m dataherald.scripts.migrate_v003_to_v004
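The kind of transformation this script performs can be sketched as follows. This is a simplified pymongo illustration; the exact fields and collections the real dataherald.scripts.migrate_v003_to_v004 script touches may differ.

```python
# Simplified sketch of the v0.0.4 migration: rename collections, rename
# fields, and convert string foreign keys to ObjectId. The database name and
# the exact fields converted are assumptions for illustration.
from bson import ObjectId
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["dataherald"]

# Collection renames.
if "nl_questions" in db.list_collection_names():
    db["nl_questions"].rename("questions")
if "nl_query_responses" in db.list_collection_names():
    db["nl_query_responses"].rename("responses")

# Field renames within the responses collection.
db["responses"].update_many(
    {}, {"$rename": {"nl_question_id": "question_id", "nl_response": "response"}}
)

# Convert string foreign keys to ObjectId (question_id shown as an example).
for doc in db["responses"].find({"question_id": {"$type": "string"}}):
    db["responses"].update_one(
        {"_id": doc["_id"]},
        {"$set": {"question_id": ObjectId(doc["question_id"])}},
    )
```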
Upgrade Instructions:
To upgrade to Version 0.0.4, follow these steps:
- Ensure you have Docker Compose installed.
- Pull the latest version of the application.
- Run the provided migration script as shown above.
These changes will improve the consistency and maintainability of your application's data structures and APIs. If you encounter any issues during the upgrade process, please don't hesitate to reach out to our support team.
v0.0.3
What's Changed
1. Validate Database Connection Requests 5937b35
- When a database connection is created or updated, it now attempts to establish a connection.
- If the connection is successfully established, it is stored, and a 200 response is returned.
- In case of failure, a 400 error response is generated.
2. Add LLM Credentials to Database Connection Endpoints 2d9e873
- With the latest update, when creating or updating a database connection, you have the option to set LLM credentials. This allows you to use different keys for different connections.
3. SSH Connection Update a66f7d8
- We have discontinued the use of the `private_key_path` field for SSH connections.
- Instead, we now utilize the `path_to_credentials_file` to specify the path to the SSH private key file.
4. Enhanced Table Scanning with Background Tasks fdc3bb7
- We have implemented background tasks for asynchronous table scanning.
- The endpoint name has been updated from `/api/v1/table-descriptions/scan` to `/api/v1/table-descriptions/sync-schemas`.
- This enhancement ensures that even if the process operates slowly, potentially taking several minutes, the HTTP response remains consistently fast and responsive.
5. Returns Scanned Tables and Not Scanned Tables 9e2d119
- The `/api/v1/table-descriptions` endpoint makes a db connection to retrieve all the table names and checks which tables have been scanned in order to generate a response.
- The status can be:
  - NOT_SYNCHRONIZED if the table has not been scanned
  - SYNCHRONIZING while the sync-schemas process is running
  - DEPRECATED if there is a row in our table-descriptions collection that is no longer in the database, probably because the table/view was deleted or renamed
  - SYNCHRONIZED when we have scanned the table
  - FAILED if anything failed during the sync-schemas process; the error_message field stores the error.
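A rough illustration of the flow: trigger a schema sync, then poll the table descriptions and inspect their statuses. The request body, query parameters, and response fields shown are assumptions.

```python
# Hypothetical schema-sync flow; request and response field names are
# illustrative assumptions.
import requests

BASE_URL = "http://localhost"
DB_CONNECTION_ID = "<db_connection_id>"  # placeholder

# Trigger the background scan; this returns quickly even for large schemas.
requests.post(
    f"{BASE_URL}/api/v1/table-descriptions/sync-schemas",
    json={"db_connection_id": DB_CONNECTION_ID},
)

# Poll the table descriptions and check each status.
for table in requests.get(
    f"{BASE_URL}/api/v1/table-descriptions",
    params={"db_connection_id": DB_CONNECTION_ID},
).json():
    # status is one of NOT_SYNCHRONIZED, SYNCHRONIZING, SYNCHRONIZED,
    # DEPRECATED, or FAILED (see the list above).
    print(table.get("table_name"), table.get("status"), table.get("error_message"))
```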
6. Migration Script from v0.0.2 to v0.0.3 9e2d119
- This script facilitates the transition from version v0.0.2 to v0.0.3 by performing the following essential task:
In the table_descriptions collection, it updates the status field to the value SYNCHRONIZED.
To execute the script, simply run the following command:
docker-compose exec app python3 -m dataherald.scripts.migrate_v002_to_v003
v0.0.2
What's Changed
1. RESTful Endpoint Names and Swagger Grouping
We have made significant changes to our endpoint naming conventions, following RESTful principles. Additionally, we have organized the endpoints into logical sections within our Swagger documentation for easier navigation and understanding.
2. MongoDB Collection Name Changes
We have updated the names of several MongoDB collections. Here are the collection name changes:
- nl_query_response ➡️ nl_query_responses
- nl_question ➡️ nl_questions
- database_connection ➡️ database_connections
- table_schema_detail ➡️ table_descriptions
3. Migration to db_connection_id for MongoDB Collections
Previously, we used a db_alias field to relate MongoDB collections. In this release, we have transitioned to using a new field called db_connection_id to establish relationships between collections.
4. Renamed Core Methods for Code Clarity
To improve the clarity of our codebase, we have renamed several core methods.
5. Migration Script from v0.0.1 to v0.0.2
We understand the importance of a smooth transition between versions. This script performs the following actions:
- Adds the db_connection_id relation for all MongoDB collections.
- Renames all MongoDB collection names to align with the new naming conventions.
- Deletes the Vector store data (Pinecone or Chroma) and utilizes the golden_records collection to upload the data seamlessly.
To execute the script, just run the following command:
docker-compose exec app python3 -m dataherald.scripts.migrate_v001_to_v002