Compare commits

..

27 Commits

Author SHA1 Message Date
91bf3c0749 openai version 2024-12-02 09:23:51 +02:00
Tal
159155785e Update README.md 2024-12-02 08:46:36 +02:00
Tal
eabc296246 Merge pull request #1376 from pdecat/enhancement/generalize_publish_output_progress
Add publish_output_progress config support to AzureDevOps, BitBucket and Gitlab providers
2024-12-02 08:27:06 +02:00
Tal
b44030114e Merge pull request #1374 from KennyDizi/main
Add Support for GPT-4o November 2024 Model and Update Configurations
2024-12-02 08:23:26 +02:00
Tal
1d6f87be3b Merge pull request #1375 from Codium-ai/update-google-tag-manager
Update Google Tag Manager ID in custom analytics integration
2024-12-02 07:53:16 +02:00
Tal
a7c6fa7bd2 Merge pull request #1364 from ryanzll/main
Check git_provider and reference_link before using them in utils.py
2024-12-02 07:52:59 +02:00
a825aec5f3 Add publish_output_progress config support to AzureDevOps, BitBucket and Gitlab providers 2024-11-28 17:15:24 +01:00
4df097c228 Update Google Tag Manager ID in custom analytics integration 2024-11-25 15:07:28 +02:00
6871e1b27a docs: add section on customizing best practices label in improve.md 2024-11-24 17:37:35 +02:00
4afe05761d docs: add section on best practices for multiple languages in improve.md 2024-11-24 17:22:18 +02:00
7d1b6c2f0a Upgrade litellm to v1.52.12 to support model gpt-4o-2024-11-20 2024-11-21 22:12:01 +07:00
3547cf2057 Update model_turbo and fallback_models 2024-11-21 22:10:55 +07:00
f2043d639c Add support model gpt-4o-2024-11-20 2024-11-21 22:10:27 +07:00
Tal
6240de3898 Merge pull request #1373 from Codium-ai/tr/ado
Improve logging and error handling in Azure DevOps provider for code …
2024-11-21 13:41:22 +02:00
f08b20c667 Improve logging and error handling in Azure DevOps provider for code suggestions 2024-11-21 13:37:48 +02:00
Tal
e64b468556 Update azure.md 2024-11-21 09:24:45 +02:00
Tal
d48d14dac7 Merge pull request #1369 from Codium-ai/tr/committable_comments
Tr/committable comments
2024-11-20 17:49:08 +02:00
eb0c959ca9 Add validation for committable comments within PR hunks in GitHub provider 2024-11-20 17:28:13 +02:00
741a70ad9d Add detailed diff code generation for GitLab suggestions and improve comment formatting 2024-11-20 17:26:36 +02:00
22ee03981e Add diff code generation for Bitbucket code suggestions and improve logging 2024-11-20 17:25:10 +02:00
Tal
b1336e7d08 Merge pull request #1355 from Codium-ai/tr/3-way-prs
use a more modern package
2024-11-18 17:02:26 +02:00
Tal
751caca141 Merge pull request #1367 from Codium-ai/tr/focus_only_on_problems_enabled
Enable focus_only_on_problems mode by default in configuration and up…
2024-11-18 16:49:57 +02:00
612004727c true 2024-11-18 16:47:55 +02:00
577ee0241d Enable focus_only_on_problems mode by default in configuration and update README.md 2024-11-18 16:35:23 +02:00
a141ca133c Update utils.py
1. add missed emoji for "PR contains tests"
2. check git_provider and reference_link before using them
2024-11-16 09:32:05 +08:00
2f4545dc15 Refactor byte decoding in Bitbucket server provider using decode_if_bytes function 2024-11-12 08:26:33 +02:00
cbd490b3d7 use a more modern version 2024-11-12 08:23:11 +02:00
14 changed files with 267 additions and 125 deletions

View File

@ -43,14 +43,22 @@ Qode Merge PR-Agent aims to help efficiently review and handle pull requests, by
## News and Updates ## News and Updates
### November 7, 2024 ### December 2, 2024
Added new option: `--pr_code_suggestions.focus_only_on_problems=true` Open-source repositories can now freely use Qodo Merge Pro, and enjoy easy one-click installation using our dedicated [app](https://github.com/apps/qodo-merge-pro-for-open-source).
When enabled, this option reduces the number of code suggestions and categorizes them into just two groups: "Possible Issues" and "General". The suggestions will focus primarily on identifying and fixing code problems, rather than style considerations like best practices, maintainability, or readability. <kbd><img src="https://github.com/user-attachments/assets/b0838724-87b9-43b0-ab62-73739a3a855c" width="512"></kbd>
This mode is ideal for developers who want to concentrate specifically on finding and fixing potential bugs in their pull request code.
### November 18, 2024
A new mode was enabled by default for code suggestions - `--pr_code_suggestions.focus_only_on_problems=true`:
- This option reduces the number of code suggestions received
- The suggestions will focus more on identifying and fixing code problems, rather than style considerations like best practices, maintainability, or readability.
- The suggestions will be categorized into just two groups: "Possible Issues" and "General".
Still, if you prefer the previous mode, you can set `--pr_code_suggestions.focus_only_on_problems=false` in the [configuration file](https://qodo-merge-docs.qodo.ai/usage-guide/configuration_options/).
**Example results:** **Example results:**
@ -68,51 +76,6 @@ Focused mode
Qodo Merge PR Agent will now leverage context from Jira or GitHub tickets to enhance the PR Feedback. Read more about this feature Qodo Merge PR Agent will now leverage context from Jira or GitHub tickets to enhance the PR Feedback. Read more about this feature
[here](https://qodo-merge-docs.qodo.ai/core-abilities/fetching_ticket_context/) [here](https://qodo-merge-docs.qodo.ai/core-abilities/fetching_ticket_context/)
### November 3, 2024
Meaningful improvement to the quality of code suggestions by separating the code suggestion generation from [line number detection](https://github.com/Codium-ai/pr-agent/pull/1338)
<kbd>![image](https://github.com/user-attachments/assets/093c185c-31ca-47a1-a4fe-be7d9335ea66)</kbd>
### October 27, 2024
Qodo Merge PR Agent will now automatically document accepted code suggestions in a dedicated wiki page (`.pr_agent_accepted_suggestions`), enabling users to track historical changes, assess the tool's effectiveness, and learn from previously implemented recommendations in the repository.
This dedicated wiki page will also serve as a foundation for future AI model improvements, allowing it to learn from historically implemented suggestions and generate more targeted, contextually relevant recommendations.
Read more about this novel feature [here](https://qodo-merge-docs.qodo.ai/tools/improve/#suggestion-tracking).
<kbd><img href="https://qodo.ai/images/pr_agent/pr_agent_accepted_suggestions1.png" src="https://qodo.ai/images/pr_agent/pr_agent_accepted_suggestions1.png" width="768"></kbd>
### October 21, 2024
**Disable publishing labels by default:**
The default setting for `pr_description.publish_labels` has been updated to `false`. This means that labels generated by the `/describe` tool will no longer be published, unless this configuration is explicitly set to `true`.
We constantly strive to balance informative AI analysis with reducing unnecessary noise. User feedback indicated that in many cases, the original PR title alone provides sufficient information, making the generated labels (`enhancement`, `documentation`, `bug fix`, ...) redundant.
The [`review_effort`](https://qodo-merge-docs.qodo.ai/tools/review/#configuration-options) label, generated by the `review` tool, will still be published by default, as it provides valuable information enabling reviewers to prioritize small PRs first.
However, every user has different preferences. To still publish the `describe` labels, set `pr_description.publish_labels=true` in the [configuration file](https://qodo-merge-docs.qodo.ai/usage-guide/configuration_options/).
For more tailored and relevant labeling, we recommend using the [`custom_labels 💎`](https://qodo-merge-docs.qodo.ai/tools/custom_labels/) tool, that allows generating labels specific to your project's needs.
<kbd>![image](https://github.com/user-attachments/assets/8f38d222-53b1-4742-b2ec-7ea0a30c9076)</kbd>
<kbd>![image](https://github.com/user-attachments/assets/8285bd90-0dda-4c7e-9237-bbfde5e21880)</kbd>
### October 14, 2024
Improved support for GitHub enterprise server with [GitHub Actions](https://qodo-merge-docs.qodo.ai/installation/github/#action-for-github-enterprise-server)
### October 10, 2024
New ability for the `review` tool - **ticket compliance feedback**. If the PR contains a ticket number, PR-Agent will check if the PR code actually [complies](https://github.com/Codium-ai/pr-agent/pull/1279#issuecomment-2404042130) with the ticket requirements.
<kbd><img src="https://github.com/user-attachments/assets/4a2a728b-5f47-40fa-80cc-16efd296938c" width="768"></kbd>
## Overview ## Overview
<div style="text-align:left;"> <div style="text-align:left;">

View File

@ -51,10 +51,12 @@ stages:
``` ```
This script will run Qodo Merge on every new merge request, with the `improve`, `review`, and `describe` commands. This script will run Qodo Merge on every new merge request, with the `improve`, `review`, and `describe` commands.
Note that you need to export the `azure_devops__pat` and `OPENAI_KEY` variables in the Azure DevOps pipeline settings (Pipelines -> Library -> + Variable group): Note that you need to export the `azure_devops__pat` and `OPENAI_KEY` variables in the Azure DevOps pipeline settings (Pipelines -> Library -> + Variable group):
![Qodo Merge Pro](https://codium.ai/images/pr_agent/azure_devops_pipeline_secrets.png){width=468} ![Qodo Merge Pro](https://codium.ai/images/pr_agent/azure_devops_pipeline_secrets.png){width=468}
Make sure to give pipeline permissions to the `pr_agent` variable group. Make sure to give pipeline permissions to the `pr_agent` variable group.
> Note that Azure Pipelines lacks support for triggering workflows from PR comments. If you find a viable solution, please contribute it to our [issue tracker](https://github.com/Codium-ai/pr-agent/issues)
## Azure DevOps from CLI ## Azure DevOps from CLI

View File

@ -245,6 +245,32 @@ enable_global_best_practices = true
Then, create a `best_practices.md` wiki file in the root of [global](https://qodo-merge-docs.qodo.ai/usage-guide/configuration_options/#global-configuration-file) configuration repository, `pr-agent-settings`. Then, create a `best_practices.md` wiki file in the root of [global](https://qodo-merge-docs.qodo.ai/usage-guide/configuration_options/#global-configuration-file) configuration repository, `pr-agent-settings`.
##### Best practices for multiple languages
For a git organization working with multiple programming languages, you can maintain a centralized global `best_practices.md` file containing language-specific guidelines.
When reviewing pull requests, Qodo Merge automatically identifies the programming language and applies the relevant best practices from this file.
Structure your `best_practices.md` file using the following format:
```
# [Python]
...
# [Java]
...
# [JavaScript]
...
```
##### Dedicated label for best practices suggestions
Best practice suggestions are labeled as `Organization best practice` by default.
To customize this label, modify it in your configuration file:
```toml
[best_practices]
organization_name = ""
```
And the label will be: `{organization_name} best practice`.
##### Example results ##### Example results
![best_practice](https://codium.ai/images/pr_agent/org_best_practice.png){width=512} ![best_practice](https://codium.ai/images/pr_agent/org_best_practice.png){width=512}
@ -277,7 +303,7 @@ Using a combination of both can help the AI model to provide relevant and tailor
</tr> </tr>
<tr> <tr>
<td><b>focus_only_on_problems</b></td> <td><b>focus_only_on_problems</b></td>
<td>If set to true, suggestions will focus primarily on identifying and fixing code problems, and less on style considerations like best practices, maintainability, or readability. Default is false.</td> <td>If set to true, suggestions will focus primarily on identifying and fixing code problems, and less on style considerations like best practices, maintainability, or readability. Default is true.</td>
</tr> </tr>
<tr> <tr>
<td><b>persistent_comment</b></td> <td><b>persistent_comment</b></td>

View File

@ -3,5 +3,5 @@
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0], new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src= j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f); 'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-5C9KZBM3');</script> })(window,document,'script','dataLayer','GTM-M6PJSFV');</script>
<!-- End Google Tag Manager --> <!-- End Google Tag Manager -->

View File

@ -19,6 +19,7 @@ MAX_TOKENS = {
'gpt-4o-mini': 128000, # 128K, but may be limited by config.max_model_tokens 'gpt-4o-mini': 128000, # 128K, but may be limited by config.max_model_tokens
'gpt-4o-mini-2024-07-18': 128000, # 128K, but may be limited by config.max_model_tokens 'gpt-4o-mini-2024-07-18': 128000, # 128K, but may be limited by config.max_model_tokens
'gpt-4o-2024-08-06': 128000, # 128K, but may be limited by config.max_model_tokens 'gpt-4o-2024-08-06': 128000, # 128K, but may be limited by config.max_model_tokens
'gpt-4o-2024-11-20': 128000, # 128K, but may be limited by config.max_model_tokens
'o1-mini': 128000, # 128K, but may be limited by config.max_model_tokens 'o1-mini': 128000, # 128K, but may be limited by config.max_model_tokens
'o1-mini-2024-09-12': 128000, # 128K, but may be limited by config.max_model_tokens 'o1-mini-2024-09-12': 128000, # 128K, but may be limited by config.max_model_tokens
'o1-preview': 128000, # 128K, but may be limited by config.max_model_tokens 'o1-preview': 128000, # 128K, but may be limited by config.max_model_tokens

View File

@ -173,7 +173,7 @@ def convert_to_markdown_v2(output_data: dict,
if is_value_no(value): if is_value_no(value):
markdown_text += f'### {emoji} No relevant tests\n\n' markdown_text += f'### {emoji} No relevant tests\n\n'
else: else:
markdown_text += f"### PR contains tests\n\n" markdown_text += f"### {emoji} PR contains tests\n\n"
elif 'ticket compliance check' in key_nice.lower(): elif 'ticket compliance check' in key_nice.lower():
markdown_text = ticket_markdown_logic(emoji, markdown_text, value, gfm_supported) markdown_text = ticket_markdown_logic(emoji, markdown_text, value, gfm_supported)
elif 'security concerns' in key_nice.lower(): elif 'security concerns' in key_nice.lower():
@ -224,12 +224,21 @@ def convert_to_markdown_v2(output_data: dict,
issue_content = issue.get('issue_content', '').strip() issue_content = issue.get('issue_content', '').strip()
start_line = int(str(issue.get('start_line', 0)).strip()) start_line = int(str(issue.get('start_line', 0)).strip())
end_line = int(str(issue.get('end_line', 0)).strip()) end_line = int(str(issue.get('end_line', 0)).strip())
reference_link = git_provider.get_line_link(relevant_file, start_line, end_line) if git_provider:
reference_link = git_provider.get_line_link(relevant_file, start_line, end_line)
else:
reference_link = None
if gfm_supported: if gfm_supported:
issue_str = f"<a href='{reference_link}'><strong>{issue_header}</strong></a><br>{issue_content}" if reference_link is not None and len(reference_link) > 0:
issue_str = f"<a href='{reference_link}'><strong>{issue_header}</strong></a><br>{issue_content}"
else:
issue_str = f"<strong>{issue_header}</strong><br>{issue_content}"
else: else:
issue_str = f"[**{issue_header}**]({reference_link})\n\n{issue_content}\n\n" if reference_link is not None and len(reference_link) > 0:
issue_str = f"[**{issue_header}**]({reference_link})\n\n{issue_content}\n\n"
else:
issue_str = f"**{issue_header}**\n\n{issue_content}\n\n"
markdown_text += f"{issue_str}\n\n" markdown_text += f"{issue_str}\n\n"
except Exception as e: except Exception as e:
get_logger().exception(f"Failed to process 'Recommended focus areas for review': {e}") get_logger().exception(f"Failed to process 'Recommended focus areas for review': {e}")

View File

@ -92,4 +92,3 @@ def run(inargs=None, args=None):
if __name__ == '__main__': if __name__ == '__main__':
run() run()
aa= "rr"

View File

@ -67,16 +67,14 @@ class AzureDevopsProvider(GitProvider):
relevant_lines_end = suggestion['relevant_lines_end'] relevant_lines_end = suggestion['relevant_lines_end']
if not relevant_lines_start or relevant_lines_start == -1: if not relevant_lines_start or relevant_lines_start == -1:
if get_settings().config.verbosity_level >= 2: get_logger().warning(
get_logger().exception( f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}")
f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}")
continue continue
if relevant_lines_end < relevant_lines_start: if relevant_lines_end < relevant_lines_start:
if get_settings().config.verbosity_level >= 2: get_logger().warning(f"Failed to publish code suggestion, "
get_logger().exception(f"Failed to publish code suggestion, " f"relevant_lines_end is {relevant_lines_end} and "
f"relevant_lines_end is {relevant_lines_end} and " f"relevant_lines_start is {relevant_lines_start}")
f"relevant_lines_start is {relevant_lines_start}")
continue continue
if relevant_lines_end > relevant_lines_start: if relevant_lines_end > relevant_lines_start:
@ -95,9 +93,11 @@ class AzureDevopsProvider(GitProvider):
"side": "RIGHT", "side": "RIGHT",
} }
post_parameters_list.append(post_parameters) post_parameters_list.append(post_parameters)
if not post_parameters_list:
return False
try: for post_parameters in post_parameters_list:
for post_parameters in post_parameters_list: try:
comment = Comment(content=post_parameters["body"], comment_type=1) comment = Comment(content=post_parameters["body"], comment_type=1)
thread = CommentThread(comments=[comment], thread = CommentThread(comments=[comment],
thread_context={ thread_context={
@ -117,15 +117,11 @@ class AzureDevopsProvider(GitProvider):
repository_id=self.repo_slug, repository_id=self.repo_slug,
pull_request_id=self.pr_num pull_request_id=self.pr_num
) )
if get_settings().config.verbosity_level >= 2: except Exception as e:
get_logger().info( get_logger().warning(f"Azure failed to publish code suggestion, error: {e}")
f"Published code suggestion on {self.pr_num} at {post_parameters['path']}" return True
)
return True
except Exception as e:
if get_settings().config.verbosity_level >= 2:
get_logger().error(f"Failed to publish code suggestion, error: {e}")
return False
def get_pr_description_full(self) -> str: def get_pr_description_full(self) -> str:
return self.pr.description return self.pr.description
@ -382,6 +378,9 @@ class AzureDevopsProvider(GitProvider):
return [] return []
def publish_comment(self, pr_comment: str, is_temporary: bool = False, thread_context=None): def publish_comment(self, pr_comment: str, is_temporary: bool = False, thread_context=None):
if is_temporary and not get_settings().config.publish_output_progress:
get_logger().debug(f"Skipping publish_comment for temporary comment: {pr_comment}")
return None
comment = Comment(content=pr_comment) comment = Comment(content=pr_comment)
thread = CommentThread(comments=[comment], thread_context=thread_context, status=5) thread = CommentThread(comments=[comment], thread_context=thread_context, status=5)
thread_response = self.azure_devops_client.create_thread( thread_response = self.azure_devops_client.create_thread(

View File

@ -1,4 +1,6 @@
import difflib
import json import json
import re
from typing import Optional, Tuple from typing import Optional, Tuple
from urllib.parse import urlparse from urllib.parse import urlparse
@ -72,24 +74,38 @@ class BitbucketProvider(GitProvider):
post_parameters_list = [] post_parameters_list = []
for suggestion in code_suggestions: for suggestion in code_suggestions:
body = suggestion["body"] body = suggestion["body"]
original_suggestion = suggestion.get('original_suggestion', None) # needed for diff code
if original_suggestion:
try:
existing_code = original_suggestion['existing_code'].rstrip() + "\n"
improved_code = original_suggestion['improved_code'].rstrip() + "\n"
diff = difflib.unified_diff(existing_code.split('\n'),
improved_code.split('\n'), n=999)
patch_orig = "\n".join(diff)
patch = "\n".join(patch_orig.splitlines()[5:]).strip('\n')
diff_code = f"\n\n```diff\n{patch.rstrip()}\n```"
# replace ```suggestion ... ``` with diff_code, using regex:
body = re.sub(r'```suggestion.*?```', diff_code, body, flags=re.DOTALL)
except Exception as e:
get_logger().exception(f"Bitbucket failed to get diff code for publishing, error: {e}")
continue
relevant_file = suggestion["relevant_file"] relevant_file = suggestion["relevant_file"]
relevant_lines_start = suggestion["relevant_lines_start"] relevant_lines_start = suggestion["relevant_lines_start"]
relevant_lines_end = suggestion["relevant_lines_end"] relevant_lines_end = suggestion["relevant_lines_end"]
if not relevant_lines_start or relevant_lines_start == -1: if not relevant_lines_start or relevant_lines_start == -1:
if get_settings().config.verbosity_level >= 2: get_logger().exception(
get_logger().exception( f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}"
f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}" )
)
continue continue
if relevant_lines_end < relevant_lines_start: if relevant_lines_end < relevant_lines_start:
if get_settings().config.verbosity_level >= 2: get_logger().exception(
get_logger().exception( f"Failed to publish code suggestion, "
f"Failed to publish code suggestion, " f"relevant_lines_end is {relevant_lines_end} and "
f"relevant_lines_end is {relevant_lines_end} and " f"relevant_lines_start is {relevant_lines_start}"
f"relevant_lines_start is {relevant_lines_start}" )
)
continue continue
if relevant_lines_end > relevant_lines_start: if relevant_lines_end > relevant_lines_start:
@ -113,8 +129,7 @@ class BitbucketProvider(GitProvider):
self.publish_inline_comments(post_parameters_list) self.publish_inline_comments(post_parameters_list)
return True return True
except Exception as e: except Exception as e:
if get_settings().config.verbosity_level >= 2: get_logger().error(f"Bitbucket failed to publish code suggestion, error: {e}")
get_logger().error(f"Failed to publish code suggestion, error: {e}")
return False return False
def publish_file_comments(self, file_comments: list) -> bool: def publish_file_comments(self, file_comments: list) -> bool:
@ -122,7 +137,7 @@ class BitbucketProvider(GitProvider):
def is_supported(self, capability: str) -> bool: def is_supported(self, capability: str) -> bool:
if capability in ['get_issue_comments', 'publish_inline_comments', 'get_labels', 'gfm_markdown', if capability in ['get_issue_comments', 'publish_inline_comments', 'get_labels', 'gfm_markdown',
'publish_file_comments']: 'publish_file_comments']:
return False return False
return True return True
@ -310,6 +325,9 @@ class BitbucketProvider(GitProvider):
self.publish_comment(pr_comment) self.publish_comment(pr_comment)
def publish_comment(self, pr_comment: str, is_temporary: bool = False): def publish_comment(self, pr_comment: str, is_temporary: bool = False):
if is_temporary and not get_settings().config.publish_output_progress:
get_logger().debug(f"Skipping publish_comment for temporary comment: {pr_comment}")
return None
pr_comment = self.limit_output_characters(pr_comment, self.max_comment_length) pr_comment = self.limit_output_characters(pr_comment, self.max_comment_length)
comment = self.pr.comment(pr_comment) comment = self.pr.comment(pr_comment)
if is_temporary: if is_temporary:

View File

@ -1,10 +1,14 @@
from distutils.version import LooseVersion import difflib
import re
from packaging.version import parse as parse_version
from typing import Optional, Tuple from typing import Optional, Tuple
from urllib.parse import quote_plus, urlparse from urllib.parse import quote_plus, urlparse
from atlassian.bitbucket import Bitbucket from atlassian.bitbucket import Bitbucket
from requests.exceptions import HTTPError from requests.exceptions import HTTPError
from ..algo.git_patch_processing import decode_if_bytes
from ..algo.language_handler import is_valid_file from ..algo.language_handler import is_valid_file
from ..algo.types import EDIT_TYPE, FilePatchInfo from ..algo.types import EDIT_TYPE, FilePatchInfo
from ..algo.utils import (find_line_number_of_relevant_line_in_file, from ..algo.utils import (find_line_number_of_relevant_line_in_file,
@ -36,7 +40,7 @@ class BitbucketServerProvider(GitProvider):
token=get_settings().get("BITBUCKET_SERVER.BEARER_TOKEN", token=get_settings().get("BITBUCKET_SERVER.BEARER_TOKEN",
None)) None))
try: try:
self.bitbucket_api_version = LooseVersion(self.bitbucket_client.get("rest/api/1.0/application-properties").get('version')) self.bitbucket_api_version = parse_version(self.bitbucket_client.get("rest/api/1.0/application-properties").get('version'))
except Exception: except Exception:
self.bitbucket_api_version = None self.bitbucket_api_version = None
@ -66,24 +70,37 @@ class BitbucketServerProvider(GitProvider):
post_parameters_list = [] post_parameters_list = []
for suggestion in code_suggestions: for suggestion in code_suggestions:
body = suggestion["body"] body = suggestion["body"]
original_suggestion = suggestion.get('original_suggestion', None) # needed for diff code
if original_suggestion:
try:
existing_code = original_suggestion['existing_code'].rstrip() + "\n"
improved_code = original_suggestion['improved_code'].rstrip() + "\n"
diff = difflib.unified_diff(existing_code.split('\n'),
improved_code.split('\n'), n=999)
patch_orig = "\n".join(diff)
patch = "\n".join(patch_orig.splitlines()[5:]).strip('\n')
diff_code = f"\n\n```diff\n{patch.rstrip()}\n```"
# replace ```suggestion ... ``` with diff_code, using regex:
body = re.sub(r'```suggestion.*?```', diff_code, body, flags=re.DOTALL)
except Exception as e:
get_logger().exception(f"Bitbucket failed to get diff code for publishing, error: {e}")
continue
relevant_file = suggestion["relevant_file"] relevant_file = suggestion["relevant_file"]
relevant_lines_start = suggestion["relevant_lines_start"] relevant_lines_start = suggestion["relevant_lines_start"]
relevant_lines_end = suggestion["relevant_lines_end"] relevant_lines_end = suggestion["relevant_lines_end"]
if not relevant_lines_start or relevant_lines_start == -1: if not relevant_lines_start or relevant_lines_start == -1:
if get_settings().config.verbosity_level >= 2: get_logger().warning(
get_logger().exception( f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}"
f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}" )
)
continue continue
if relevant_lines_end < relevant_lines_start: if relevant_lines_end < relevant_lines_start:
if get_settings().config.verbosity_level >= 2: get_logger().warning(
get_logger().exception( f"Failed to publish code suggestion, "
f"Failed to publish code suggestion, " f"relevant_lines_end is {relevant_lines_end} and "
f"relevant_lines_end is {relevant_lines_end} and " f"relevant_lines_start is {relevant_lines_start}"
f"relevant_lines_start is {relevant_lines_start}" )
)
continue continue
if relevant_lines_end > relevant_lines_start: if relevant_lines_end > relevant_lines_start:
@ -160,7 +177,7 @@ class BitbucketServerProvider(GitProvider):
head_sha = self.pr.fromRef['latestCommit'] head_sha = self.pr.fromRef['latestCommit']
# if Bitbucket api version is >= 8.16 then use the merge-base api for 2-way diff calculation # if Bitbucket api version is >= 8.16 then use the merge-base api for 2-way diff calculation
if self.bitbucket_api_version is not None and self.bitbucket_api_version >= LooseVersion("8.16"): if self.bitbucket_api_version is not None and self.bitbucket_api_version >= parse_version("8.16"):
try: try:
base_sha = self.bitbucket_client.get(self._get_merge_base())['id'] base_sha = self.bitbucket_client.get(self._get_merge_base())['id']
except Exception as e: except Exception as e:
@ -175,7 +192,7 @@ class BitbucketServerProvider(GitProvider):
# if Bitbucket api version is None or < 7.0 then do a simple diff with a guaranteed common ancestor # if Bitbucket api version is None or < 7.0 then do a simple diff with a guaranteed common ancestor
base_sha = source_commits_list[-1]['parents'][0]['id'] base_sha = source_commits_list[-1]['parents'][0]['id']
# if Bitbucket api version is 7.0-8.15 then use 2-way diff functionality for the base_sha # if Bitbucket api version is 7.0-8.15 then use 2-way diff functionality for the base_sha
if self.bitbucket_api_version is not None and self.bitbucket_api_version >= LooseVersion("7.0"): if self.bitbucket_api_version is not None and self.bitbucket_api_version >= parse_version("7.0"):
try: try:
destination_commits = list( destination_commits = list(
self.bitbucket_client.get_commits(self.workspace_slug, self.repo_slug, base_sha, self.bitbucket_client.get_commits(self.workspace_slug, self.repo_slug, base_sha,
@ -201,25 +218,21 @@ class BitbucketServerProvider(GitProvider):
case 'ADD': case 'ADD':
edit_type = EDIT_TYPE.ADDED edit_type = EDIT_TYPE.ADDED
new_file_content_str = self.get_file(file_path, head_sha) new_file_content_str = self.get_file(file_path, head_sha)
if isinstance(new_file_content_str, (bytes, bytearray)): new_file_content_str = decode_if_bytes(new_file_content_str)
new_file_content_str = new_file_content_str.decode("utf-8")
original_file_content_str = "" original_file_content_str = ""
case 'DELETE': case 'DELETE':
edit_type = EDIT_TYPE.DELETED edit_type = EDIT_TYPE.DELETED
new_file_content_str = "" new_file_content_str = ""
original_file_content_str = self.get_file(file_path, base_sha) original_file_content_str = self.get_file(file_path, base_sha)
if isinstance(original_file_content_str, (bytes, bytearray)): original_file_content_str = decode_if_bytes(original_file_content_str)
original_file_content_str = original_file_content_str.decode("utf-8")
case 'RENAME': case 'RENAME':
edit_type = EDIT_TYPE.RENAMED edit_type = EDIT_TYPE.RENAMED
case _: case _:
edit_type = EDIT_TYPE.MODIFIED edit_type = EDIT_TYPE.MODIFIED
original_file_content_str = self.get_file(file_path, base_sha) original_file_content_str = self.get_file(file_path, base_sha)
if isinstance(original_file_content_str, (bytes, bytearray)): original_file_content_str = decode_if_bytes(original_file_content_str)
original_file_content_str = original_file_content_str.decode("utf-8")
new_file_content_str = self.get_file(file_path, head_sha) new_file_content_str = self.get_file(file_path, head_sha)
if isinstance(new_file_content_str, (bytes, bytearray)): new_file_content_str = decode_if_bytes(new_file_content_str)
new_file_content_str = new_file_content_str.decode("utf-8")
patch = load_large_diff(file_path, new_file_content_str, original_file_content_str) patch = load_large_diff(file_path, new_file_content_str, original_file_content_str)
@ -330,10 +343,10 @@ class BitbucketServerProvider(GitProvider):
for comment in comments: for comment in comments:
if 'position' in comment: if 'position' in comment:
self.publish_inline_comment(comment['body'], comment['position'], comment['path']) self.publish_inline_comment(comment['body'], comment['position'], comment['path'])
elif 'start_line' in comment: # multi-line comment elif 'start_line' in comment: # multi-line comment
# note that bitbucket does not seem to support range - only a comment on a single line - https://community.developer.atlassian.com/t/api-post-endpoint-for-inline-pull-request-comments/60452 # note that bitbucket does not seem to support range - only a comment on a single line - https://community.developer.atlassian.com/t/api-post-endpoint-for-inline-pull-request-comments/60452
self.publish_inline_comment(comment['body'], comment['start_line'], comment['path']) self.publish_inline_comment(comment['body'], comment['start_line'], comment['path'])
elif 'line' in comment: # single-line comment elif 'line' in comment: # single-line comment
self.publish_inline_comment(comment['body'], comment['line'], comment['path']) self.publish_inline_comment(comment['body'], comment['line'], comment['path'])
else: else:
get_logger().error(f"Could not publish inline comment: {comment}") get_logger().error(f"Could not publish inline comment: {comment}")

View File

@ -1,5 +1,8 @@
import copy
import difflib
import hashlib import hashlib
import itertools import itertools
import re
import time import time
import traceback import traceback
from datetime import datetime from datetime import datetime
@ -11,6 +14,7 @@ from retry import retry
from starlette_context import context from starlette_context import context
from ..algo.file_filter import filter_ignored from ..algo.file_filter import filter_ignored
from ..algo.git_patch_processing import extract_hunk_headers
from ..algo.language_handler import is_valid_file from ..algo.language_handler import is_valid_file
from ..algo.types import EDIT_TYPE from ..algo.types import EDIT_TYPE
from ..algo.utils import (PRReviewHeader, Range, clip_tokens, from ..algo.utils import (PRReviewHeader, Range, clip_tokens,
@ -415,7 +419,10 @@ class GithubProvider(GitProvider):
Publishes code suggestions as comments on the PR. Publishes code suggestions as comments on the PR.
""" """
post_parameters_list = [] post_parameters_list = []
for suggestion in code_suggestions:
code_suggestions_validated = self.validate_comments_inside_hunks(code_suggestions)
for suggestion in code_suggestions_validated:
body = suggestion['body'] body = suggestion['body']
relevant_file = suggestion['relevant_file'] relevant_file = suggestion['relevant_file']
relevant_lines_start = suggestion['relevant_lines_start'] relevant_lines_start = suggestion['relevant_lines_start']
@ -872,3 +879,100 @@ class GithubProvider(GitProvider):
def calc_pr_statistics(self, pull_request_data: dict): def calc_pr_statistics(self, pull_request_data: dict):
return {} return {}
def validate_comments_inside_hunks(self, code_suggestions):
"""
validate that all committable comments are inside PR hunks - this is a must for committable comments in GitHub
"""
code_suggestions_copy = copy.deepcopy(code_suggestions)
diff_files = self.get_diff_files()
RE_HUNK_HEADER = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
# map file extensions to programming languages
language_extension_map_org = get_settings().language_extension_map_org
extension_to_language = {}
for language, extensions in language_extension_map_org.items():
for ext in extensions:
extension_to_language[ext] = language
for file in diff_files:
extension_s = '.' + file.filename.rsplit('.')[-1]
language_name = "txt"
if extension_s and (extension_s in extension_to_language):
language_name = extension_to_language[extension_s]
file.language = language_name.lower()
for suggestion in code_suggestions_copy:
try:
relevant_file_path = suggestion['relevant_file']
for file in diff_files:
if file.filename == relevant_file_path:
# generate on-demand the patches range for the relevant file
patch_str = file.patch
if not hasattr(file, 'patches_range'):
file.patches_range = []
patch_lines = patch_str.splitlines()
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
match = RE_HUNK_HEADER.match(line)
# identify hunk header
if match:
section_header, size1, size2, start1, start2 = extract_hunk_headers(match)
file.patches_range.append({'start': start2, 'end': start2 + size2 - 1})
patches_range = file.patches_range
comment_start_line = suggestion.get('relevant_lines_start', None)
comment_end_line = suggestion.get('relevant_lines_end', None)
original_suggestion = suggestion.get('original_suggestion', None) # needed for diff code
if not comment_start_line or not comment_end_line or not original_suggestion:
continue
# check if the comment is inside a valid hunk
is_valid_hunk = False
min_distance = float('inf')
patch_range_min = None
# find the hunk that contains the comment, or the closest one
for i, patch_range in enumerate(patches_range):
d1 = comment_start_line - patch_range['start']
d2 = patch_range['end'] - comment_end_line
if d1 >= 0 and d2 >= 0: # found a valid hunk
is_valid_hunk = True
min_distance = 0
patch_range_min = patch_range
break
elif d1 * d2 <= 0: # comment is possibly inside the hunk
d1_clip = abs(min(0, d1))
d2_clip = abs(min(0, d2))
d = max(d1_clip, d2_clip)
if d < min_distance:
patch_range_min = patch_range
min_distance = min(min_distance, d)
if not is_valid_hunk:
if min_distance < 10: # 10 lines - a reasonable distance to consider the comment inside the hunk
# make the suggestion non-committable, yet multi line
suggestion['relevant_lines_start'] = max(suggestion['relevant_lines_start'], patch_range_min['start'])
suggestion['relevant_lines_end'] = min(suggestion['relevant_lines_end'], patch_range_min['end'])
body = suggestion['body'].strip()
# present new diff code in collapsible
existing_code = original_suggestion['existing_code'].rstrip() + "\n"
improved_code = original_suggestion['improved_code'].rstrip() + "\n"
diff = difflib.unified_diff(existing_code.split('\n'),
improved_code.split('\n'), n=999)
patch_orig = "\n".join(diff)
patch = "\n".join(patch_orig.splitlines()[5:]).strip('\n')
diff_code = f"\n\n<details><summary>New proposed code:</summary>\n\n```diff\n{patch.rstrip()}\n```"
# replace ```suggestion ... ``` with diff_code, using regex:
body = re.sub(r'```suggestion.*?```', diff_code, body, flags=re.DOTALL)
body += "\n\n</details>"
suggestion['body'] = body
get_logger().info(f"Comment was moved to a valid hunk, "
f"start_line={suggestion['relevant_lines_start']}, end_line={suggestion['relevant_lines_end']}, file={file.filename}")
else:
get_logger().error(f"Comment is not inside a valid hunk, "
f"start_line={suggestion['relevant_lines_start']}, end_line={suggestion['relevant_lines_end']}, file={file.filename}")
except Exception as e:
get_logger().error(f"Failed to process patch for committable comment, error: {e}")
return code_suggestions_copy

View File

@ -1,3 +1,4 @@
import difflib
import hashlib import hashlib
import re import re
from typing import Optional, Tuple from typing import Optional, Tuple
@ -193,6 +194,9 @@ class GitLabProvider(GitProvider):
self.publish_persistent_comment_full(pr_comment, initial_header, update_header, name, final_update_message) self.publish_persistent_comment_full(pr_comment, initial_header, update_header, name, final_update_message)
def publish_comment(self, mr_comment: str, is_temporary: bool = False): def publish_comment(self, mr_comment: str, is_temporary: bool = False):
if is_temporary and not get_settings().config.publish_output_progress:
get_logger().debug(f"Skipping publish_comment for temporary comment: {mr_comment}")
return None
mr_comment = self.limit_output_characters(mr_comment, self.max_comment_chars) mr_comment = self.limit_output_characters(mr_comment, self.max_comment_chars)
comment = self.mr.notes.create({'body': mr_comment}) comment = self.mr.notes.create({'body': mr_comment})
if is_temporary: if is_temporary:
@ -278,20 +282,23 @@ class GitLabProvider(GitProvider):
new_code_snippet = original_suggestion['improved_code'] new_code_snippet = original_suggestion['improved_code']
content = original_suggestion['suggestion_content'] content = original_suggestion['suggestion_content']
label = original_suggestion['label'] label = original_suggestion['label']
if 'score' in original_suggestion: score = original_suggestion.get('score', 7)
score = original_suggestion['score']
else:
score = 7
if hasattr(self, 'main_language'): if hasattr(self, 'main_language'):
language = self.main_language language = self.main_language
else: else:
language = '' language = ''
link = self.get_line_link(relevant_file, line_start, line_end) link = self.get_line_link(relevant_file, line_start, line_end)
body_fallback =f"**Suggestion:** {content} [{label}, importance: {score}]\n___\n" body_fallback =f"**Suggestion:** {content} [{label}, importance: {score}]\n\n"
body_fallback +=f"\n\nReplace lines ([{line_start}-{line_end}]({link}))\n\n```{language}\n{old_code_snippet}\n````\n\n" body_fallback +=f"\n\n<details><summary>[{target_file.filename} [{line_start}-{line_end}]]({link}):</summary>\n\n"
body_fallback +=f"with\n\n```{language}\n{new_code_snippet}\n````" body_fallback += f"\n\n___\n\n`(Cannot implement directly - GitLab API allows committable suggestions strictly on MR diff lines)`"
body_fallback += f"\n\n___\n\n`(Cannot implement this suggestion directly, as gitlab API does not enable committing to a non -+ line in a PR)`" body_fallback+="</details>\n\n"
diff_patch = difflib.unified_diff(old_code_snippet.split('\n'),
new_code_snippet.split('\n'), n=999)
patch_orig = "\n".join(diff_patch)
patch = "\n".join(patch_orig.splitlines()[5:]).strip('\n')
diff_code = f"\n\n```diff\n{patch.rstrip()}\n```"
body_fallback += diff_code
# Create a general note on the file in the MR # Create a general note on the file in the MR
self.mr.notes.create({ self.mr.notes.create({
@ -304,6 +311,7 @@ class GitLabProvider(GitProvider):
'file_path': f'{target_file.filename}', 'file_path': f'{target_file.filename}',
} }
}) })
get_logger().debug(f"Created fallback comment in MR {self.id_mr} with position {pos_obj}")
# get_logger().debug( # get_logger().debug(
# f"Failed to create comment in MR {self.id_mr} with position {pos_obj} (probably not a '+' line)") # f"Failed to create comment in MR {self.id_mr} with position {pos_obj} (probably not a '+' line)")

View File

@ -1,8 +1,8 @@
[config] [config]
# models # models
model="gpt-4-turbo-2024-04-09" model="gpt-4-turbo-2024-04-09"
model_turbo="gpt-4o-2024-08-06" model_turbo="gpt-4o-2024-11-20"
fallback_models=["gpt-4o-2024-05-13"] fallback_models=["gpt-4o-2024-08-06"]
# CLI # CLI
git_provider="github" git_provider="github"
publish_output=true publish_output=true
@ -111,7 +111,7 @@ max_context_tokens=16000
# #
commitable_code_suggestions = false commitable_code_suggestions = false
dual_publishing_score_threshold=-1 # -1 to disable, [0-10] to set the threshold (>=) for publishing a code suggestion both in a table and as commitable dual_publishing_score_threshold=-1 # -1 to disable, [0-10] to set the threshold (>=) for publishing a code suggestion both in a table and as commitable
focus_only_on_problems=false focus_only_on_problems=true
# #
extra_instructions = "" extra_instructions = ""
rank_suggestions = false rank_suggestions = false

View File

@ -12,10 +12,10 @@ google-cloud-aiplatform==1.38.0
google-generativeai==0.8.3 google-generativeai==0.8.3
google-cloud-storage==2.10.0 google-cloud-storage==2.10.0
Jinja2==3.1.2 Jinja2==3.1.2
litellm==1.52.0 litellm==1.52.12
loguru==0.7.2 loguru==0.7.2
msrest==0.7.1 msrest==0.7.1
openai==1.54.1 openai==1.55.3
pytest==7.4.0 pytest==7.4.0
PyGithub==1.59.* PyGithub==1.59.*
PyYAML==6.0.1 PyYAML==6.0.1