Add support for processing diffs without line numbers in code suggestions tool

2025-07-21 04:50:39 +08:00 · 2024-11-03 17:34:30 +02:00
parent d9ef26dc1c
commit ef3241285d
6 changed files with 140 additions and 101 deletions
--- a/pr_agent/settings/pr_code_suggestions_prompts.toml
+++ b/pr_agent/settings/pr_code_suggestions_prompts.toml
@@ -14,10 +14,10 @@ The PR code diff will be in the following structured format:

@@ ... @@ def func1():
 __new hunk__
-11  unchanged code line0 in the PR
-12  unchanged code line1 in the PR
-13 +new code line2 added in the PR
-14  unchanged code line3 in the PR
+ unchanged code line0 in the PR
+ unchanged code line1 in the PR
+new code line2 added in the PR
+ unchanged code line3 in the PR
 __old hunk__
 unchanged code line0
 unchanged code line1
@@ -35,7 +35,6 @@ __new hunk__
 ======

 - In the format above, the diff is organized into separate '__new hunk__' and '__old hunk__' sections for each code chunk. '__new hunk__' contains the updated code, while '__old hunk__' shows the removed code. If no code was removed in a specific chunk, the __old hunk__ section will be omitted.
- Line numbers were added for the '__new hunk__' sections to help referencing specific lines in the code suggestions. These numbers are for reference only and are not part of the actual code.
 - Code lines are prefixed with symbols: '+' for new code added in the PR, '-' for code removed, and ' ' for unchanged code.
 {%- if is_ai_metadata %}
 - When available, an AI-generated summary will precede each file's diff, with a high-level overview of the changes. Note that this summary may not be fully accurate or complete.
@@ -44,7 +43,7 @@ __new hunk__

 Specific guidelines for generating code suggestions:
 - Provide up to {{ num_code_suggestions }} distinct and insightful code suggestions.
- Focus solely on enhancing new code introduced in the PR, identified by '+' prefixes in '__new hunk__' sections (after the line numbers).
+- Focus solely on enhancing new code introduced in the PR, identified by '+' prefixes in '__new hunk__' sections.
 - Prioritize suggestions that address potential issues, critical problems, and bugs in the PR code. Avoid repeating changes already implemented in the PR. If no pertinent suggestions are applicable, return an empty list.
 - Don't suggest to add docstring, type hints, or comments, to remove unused imports, or to use more specific exception types.
 - When referencing variables or names from the code, enclose them in backticks (`). Example: "ensure that `variable_name` is..."
@@ -67,12 +66,10 @@ class CodeSuggestion(BaseModel):
    relevant_file: str = Field(description="Full path of the relevant file")
    language: str = Field(description="Programming language used by the relevant file")
    suggestion_content: str = Field(description="An actionable suggestion to enhance, improve or fix the new code introduced in the PR. Don't present here actual code snippets, just the suggestion. Be short and concise")
-    existing_code: str = Field(description="A short code snippet from a '__new hunk__' section that the suggestion aims to enhance or fix. Include only complete code lines, without line numbers. Use ellipsis (...) for brevity if needed. This snippet should represent the specific PR code targeted for improvement.")
+    existing_code: str = Field(description="A short code snippet from a '__new hunk__' section that the suggestion aims to enhance or fix. Include only complete code lines. Use ellipsis (...) for brevity if needed. This snippet should represent the specific PR code targeted for improvement.")
    improved_code: str = Field(description="A refined code snippet that replaces the 'existing_code' snippet after implementing the suggestion.")
    one_sentence_summary: str = Field(description="A concise, single-sentence overview of the suggested improvement. Focus on the 'what'. Be general, and avoid method or variable names.")
-    relevant_lines_start: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion starts (inclusive). Should be derived from the hunk line numbers, and correspond to the beginning of the 'existing code' snippet above")
-    relevant_lines_end: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion ends (inclusive). Should be derived from the hunk line numbers, and correspond to the end of the 'existing code' snippet above")
-    label: str = Field(description="A single, descriptive label that best characterizes the suggestion type. Possible labels include 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability'. Other relevant labels are also acceptable.")
+    label: str = Field(description="A single, descriptive label that best characterizes the suggestion type. Possible labels include 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability', 'typo'. Other relevant labels are also acceptable.")


 class PRCodeSuggestions(BaseModel):
@@ -95,8 +92,6 @@ code_suggestions:
    ...
  one_sentence_summary: |
    ...
-  relevant_lines_start: 12
-  relevant_lines_end: 13
  label: |
    ...
 ```
@@ -112,7 +107,7 @@ Title: '{{title}}'

 The PR Diff:
 ======
-{{ diff|trim }}
+{{ diff_no_line_numbers|trim }}
 ======
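
Note on the prompt change above: with `relevant_lines_start` and `relevant_lines_end` removed from the model output, the line range of a suggestion presumably has to be recovered from the `existing_code` snippet instead. The sketch below shows one way that could be done. It is not taken from this commit: `locate_snippet` and its whitespace-insensitive matching are hypothetical, introduced here only for illustration.

```python
from typing import List, Optional, Tuple


def locate_snippet(new_file_lines: List[str], existing_code: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based inclusive (start, end) line range of existing_code.

    Hypothetical helper, not part of pr_agent: lines are compared after
    stripping leading/trailing whitespace, and the first match wins.
    """
    # Drop surrounding blank lines from the snippet, then normalize each line.
    snippet = [line.strip() for line in existing_code.strip().splitlines()]
    if not snippet:
        return None

    normalized = [line.strip() for line in new_file_lines]
    window = len(snippet)

    # Slide a window over the file and look for an exact normalized match.
    for start in range(len(normalized) - window + 1):
        if normalized[start:start + window] == snippet:
            return start + 1, start + window
    return None


if __name__ == "__main__":
    file_lines = ["def func1():", "    a = 1", "    b = a + 1", "    return b"]
    print(locate_snippet(file_lines, "    b = a + 1\n    return b\n"))
```

A snippet that uses an ellipsis (`...`) for brevity, which the prompt allows, would not match under this sketch; handling that case would need a looser strategy such as matching only the first and last lines.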