Improved Language Detection Functionality (Tasks 1-3) #6

Closed
amit828as wants to merge 7 commits

amit828as

This pull request implements three new features for Pypacter that extend its language detection capabilities.

Task 1: Core Language Detection Function (pypacter/src/pypacter.py)

A new function, detect_programming_language(code_snippet: str) -> str, has been added to the core Pypacter package. It takes a code snippet as input, uses the DEFAULT_MODEL from the pypacter.models module to analyze it with a Large Language Model (LLM), and returns the detected programming language.

The LLM prompt is written to cover the following cases:

  1. It includes the provided code snippet explicitly.
  2. It acknowledges that the input may not be a programming language or may contain invalid syntax.
  3. It asks for the programming language name only, with no explanation.
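
As a rough illustration of the prompt structure described above, here is a minimal sketch. The prompt wording, the build_prompt helper, and the extra model parameter are assumptions made for this example; the actual function takes only code_snippet and uses DEFAULT_MODEL, whose interface is not shown in this description.

```python
from typing import Callable

# Hypothetical prompt wording; the real prompt in the PR is not reproduced here.
PROMPT_TEMPLATE = (
    "Identify the programming language of the code snippet below.\n"
    "The snippet may be plain text or contain invalid syntax; "
    "in that case answer 'unknown'.\n"
    "Reply with the language name only, without any explanation.\n\n"
    "Code snippet:\n{code_snippet}"
)


def build_prompt(code_snippet: str) -> str:
    """Embed the snippet in the prompt so the model sees it explicitly."""
    return PROMPT_TEMPLATE.format(code_snippet=code_snippet)


def detect_programming_language(
    code_snippet: str,
    model: Callable[[str], str],
) -> str:
    """Return the model's reply with surrounding whitespace removed.

    `model` is an assumed stand-in for pypacter.models.DEFAULT_MODEL:
    any callable that maps a prompt string to a reply string.
    """
    return model(build_prompt(code_snippet)).strip()
```

Passing the model in as a callable keeps the sketch runnable without an LLM, for example with a stub that always answers "Python".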

Task 2: Language Detection API Endpoint (pypacter-api/src/pypacter_api/api.py)

A new API endpoint, /detect-language, has been added to the pypacter-api package. This endpoint accepts a POST request with a JSON body containing the code_snippet field. It performs the following actions:

  1. Validates the presence of the code_snippet parameter.
  2. Calls the detect_programming_language function from the core package.
  3. Returns a JSON response containing the detected language (lowercase) on success.
  4. Raises appropriate HTTP exceptions for missing code snippets or language detection failures.
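
The four steps above could be sketched, independent of any web framework, roughly as follows. The HTTPError class, the 400 and 500 status codes, the "language" response key, and the injected detect callable are all assumptions for illustration; the endpoint in pypacter-api may differ on each of these.

```python
from typing import Callable, Dict, Mapping


class HTTPError(Exception):
    """Minimal stand-in for a web framework's HTTP exception (assumed)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def detect_language_endpoint(
    body: Mapping[str, object],
    detect: Callable[[str], str],
) -> Dict[str, str]:
    """Handle the parsed JSON body of a POST to /detect-language."""
    # Step 1: validate that code_snippet is present and non-empty.
    code_snippet = body.get("code_snippet")
    if not isinstance(code_snippet, str) or not code_snippet.strip():
        raise HTTPError(400, "code_snippet is required")

    # Step 2: call the core detection function (injected here as `detect`).
    try:
        language = detect(code_snippet)
    except Exception as exc:
        # Step 4: surface detection failures as an HTTP error.
        raise HTTPError(500, f"language detection failed: {exc}") from exc

    # Step 3: return the detected language in lowercase.
    return {"language": language.lower()}
```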

Task 3: Language Detection CLI Command (pypacter-cli/src/pypacter_cli/cli.py)

A new CLI command, detect-language, has been added to the pypacter-cli package. This command allows users to detect the programming language of a code snippet through the command line. It offers two ways to provide the code snippet:

  1. As a file path passed as an argument.
  2. Through standard input using pipes (|).

The command calls the detect_programming_language function and presents the detected language in a user-friendly format.
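
A minimal sketch of a command that accepts either a file path or piped standard input might look like the following. It assumes argparse, an injected detect callable in place of the core function, and a "Detected language: ..." output line; none of these details are confirmed by this description.

```python
import argparse
import sys
from typing import Callable, Optional, Sequence, TextIO


def read_snippet(path: Optional[str], stdin: TextIO) -> str:
    """Read the snippet from `path` if given, otherwise from standard input."""
    if path is not None:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    return stdin.read()


def main(
    argv: Sequence[str],
    detect: Callable[[str], str],
    stdin: Optional[TextIO] = None,
    stdout: Optional[TextIO] = None,
) -> int:
    """Parse arguments, detect the language, and print the result."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    parser = argparse.ArgumentParser(prog="detect-language")
    # The positional argument is optional so that piped input also works.
    parser.add_argument(
        "file",
        nargs="?",
        default=None,
        help="path to a file containing the code snippet (default: stdin)",
    )
    args = parser.parse_args(list(argv))

    snippet = read_snippet(args.file, stdin)
    # Hypothetical output format, chosen only for this example.
    print(f"Detected language: {detect(snippet)}", file=stdout)
    return 0
```

Making the positional argument optional (nargs="?") is what lets one command serve both input modes: when no path is given, the snippet is read from whatever is piped in.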

These enhancements have been implemented with a focus on the provided guidelines:

  • Readability and Clarity: The code adheres to best practices for naming conventions, structure, and comments. Documentation is included for all new features.
  • Correctness: The code handles edge cases like missing code snippets and potential LLM errors gracefully. Unit tests ensure functionality.
  • Maintainability: The code is modular and extensible, allowing for future improvements to the language detection mechanism.

Next Steps:

  • Consider exploring alternative LLM access methods or implementing a fallback mechanism for potential LLM limitations.
  • Investigate integrating the language detection functionality with other Pypacter features.

Please let me know if I have overlooked anything.

amit828as added 4 commits July 4, 2024 21:01
When running the script in the hatch environment, the cmd[3] step (mypy src tests) fails with 'error: Need type annotation for parser'. Adding the type annotation resolves the issue.
@JP-Ellis JP-Ellis closed this Nov 14, 2024
2 participants