diff --git a/.github/workflows/pr-agent-review.yaml b/.github/workflows/pr-agent-review.yaml index 6932b4bd..166e83de 100644 --- a/.github/workflows/pr-agent-review.yaml +++ b/.github/workflows/pr-agent-review.yaml @@ -26,5 +26,6 @@ jobs: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} PINECONE.API_KEY: ${{ secrets.PINECONE_API_KEY }} PINECONE.ENVIRONMENT: ${{ secrets.PINECONE_ENVIRONMENT }} + GITHUB_ACTION.AUTO_REVIEW: true diff --git a/.pr_agent.toml b/.pr_agent.toml new file mode 100644 index 00000000..6937b547 --- /dev/null +++ b/.pr_agent.toml @@ -0,0 +1,2 @@ +[pr_reviewer] +enable_review_labels_effort = true \ No newline at end of file diff --git a/INSTALL.md b/INSTALL.md index d0298033..ee8b1dda 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -101,6 +101,7 @@ python3 -m pr_agent.cli --pr_url ask python3 -m pr_agent.cli --pr_url describe python3 -m pr_agent.cli --pr_url improve python3 -m pr_agent.cli --pr_url add_docs +python3 -m pr_agent.cli --pr_url generate_labels python3 -m pr_agent.cli --issue_url similar_issue ... ``` @@ -409,10 +410,49 @@ BITBUCKET_BEARER_TOKEN: You can get a Bitbucket token for your repository by following Repository Settings -> Security -> Access Tokens. +Note that comments on a PR are not supported in Bitbucket Pipeline. -### Run on a hosted Bitbucket app -Please contact if you're interested in a hosted BitBucket app solution that provides full functionality including PR reviews and comment handling. It's based on the [bitbucket_app.py](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/git_providers/bitbucket_provider.py) implmentation. +### Run using CodiumAI-hosted Bitbucket app +Please contact us or visit the [CodiumAI pricing page](https://www.codium.ai/pricing/) if you're interested in a hosted BitBucket app solution that provides full functionality including PR reviews and comment handling. It's based on the [bitbucket_app.py](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/git_providers/bitbucket_provider.py) implementation. + + +### Bitbucket Server and Data Center + +Log in to your on-prem instance of Bitbucket with your service account username and password. +Navigate to `Manage account`, `HTTP Access tokens`, `Create Token`. +Generate the token and add it to `.secrets.toml` under the `bitbucket_server` section: + +```toml +[bitbucket_server] +bearer_token = "" +``` + +#### Run it as CLI + +Modify `configuration.toml`: + +```toml +git_provider="bitbucket_server" +``` + +and pass the pull request URL: +```shell +python cli.py --pr_url https://git.onpreminstanceofbitbucket.com/projects/PROJECT/repos/REPO/pull-requests/1 review +``` + +#### Run it as a service + +To run pr-agent as a webhook, build the Docker image: +``` +docker build . -t codiumai/pr-agent:bitbucket_server_webhook --target bitbucket_server_webhook -f docker/Dockerfile +docker push codiumai/pr-agent:bitbucket_server_webhook # Push to your Docker repository +``` + +Navigate to `Projects` or `Repositories`, `Settings`, `Webhooks`, `Create Webhook`. +Fill in the name and URL, set Authentication to None, and select the Pull Request Opened checkbox to receive that event as a webhook. 
+ +The URL should end with `/webhook`, for example: https://domain.com/webhook diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index ab9fcf48..30b76a0f 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -1,5 +1,24 @@ -## Unreleased -- review tool now posts persistent comments by default +## [Version 0.10] - 2023-11-15 +- codiumai/pr-agent:0.10 +- codiumai/pr-agent:0.10-github_app +- codiumai/pr-agent:0.10-bitbucket-app +- codiumai/pr-agent:0.10-gitlab_webhook +- codiumai/pr-agent:0.10-github_polling +- codiumai/pr-agent:0.10-github_action + +### Added::Algo +- Review tool now works with [persistent comments](https://github.com/Codium-ai/pr-agent/pull/451) by default +- Bitbucket now publishes review suggestions with [code links](https://github.com/Codium-ai/pr-agent/pull/428) +- Enabling to limit [max number of tokens](https://github.com/Codium-ai/pr-agent/pull/437/files) +- Support ['gpt-4-1106-preview'](https://github.com/Codium-ai/pr-agent/pull/437/files) model +- Support for Google's [Vertex AI](https://github.com/Codium-ai/pr-agent/pull/436) +- Implementing [thresholds](https://github.com/Codium-ai/pr-agent/pull/423) for incremental PR reviews +- Decoupled custom labels from [PR type](https://github.com/Codium-ai/pr-agent/pull/431) + +### Fixed +- Fixed bug in [parsing quotes](https://github.com/Codium-ai/pr-agent/pull/446) in CLI +- Preserve [user-added labels](https://github.com/Codium-ai/pr-agent/pull/433) in pull requests +- Bug fixes in GitLab and BitBucket ## [Version 0.9] - 2023-10-29 - codiumai/pr-agent:0.9 diff --git a/Usage.md b/Usage.md index f11b28df..d4a7b230 100644 --- a/Usage.md +++ b/Usage.md @@ -32,12 +32,19 @@ The [Tools Guide](./docs/TOOLS_GUIDE.md) provides a detailed description of the #### Ignoring files from analysis In some cases, you may want to exclude specific files or directories from the analysis performed by CodiumAI PR-Agent. This can be useful, for example, when you have files that are generated automatically or files that shouldn't be reviewed, like vendored code. -To ignore files or directories, edit the **[ignore.toml](/pr_agent/settings/ignore.toml)** configuration file. This setting is also exposed the following environment variables: +To ignore files or directories, edit the **[ignore.toml](/pr_agent/settings/ignore.toml)** configuration file. This setting also exposes the following environment variables: - `IGNORE.GLOB` - `IGNORE.REGEX` -See [dynaconf envvars documentation](https://www.dynaconf.com/envvars/). +For example, to ignore python files in a PR with online usage, comment on a PR: +`/review --ignore.glob=['*.py']` + +To ignore python files in all PRs, set in a configuration file: +``` +[ignore] +glob = ['*.py'] +``` #### git provider The [git_provider](pr_agent/settings/configuration.toml#L4) field in the configuration file determines the GIT provider that will be used by PR-Agent. Currently, the following providers are supported: @@ -59,7 +66,7 @@ The [git_provider](pr_agent/settings/configuration.toml#L4) field in the configu ### Working from a local repo (CLI) When running from your local repo (CLI), your local configuration file will be used. 
-Examples for invoking the different tools via the CLI: +Examples of invoking the different tools via the CLI: - **Review**: `python -m pr_agent.cli --pr_url= review` - **Describe**: `python -m pr_agent.cli --pr_url= describe` @@ -83,7 +90,7 @@ python -m pr_agent.cli --pr_url= /review --pr_reviewer.extra_instructio publish_output=true verbosity_level=2 ``` -This is useful for debugging or experimenting with the different tools. +This is useful for debugging or experimenting with different tools. ### Online usage @@ -100,17 +107,17 @@ Commands for invoking the different tools via comments: To edit a specific configuration value, just add `--config_path=` to any command. -For example if you want to edit the `review` tool configurations, you can run: +For example, if you want to edit the `review` tool configurations, you can run: ``` /review --pr_reviewer.extra_instructions="..." --pr_reviewer.require_score_review=false ``` -Any configuration value in [configuration file](pr_agent/settings/configuration.toml) file can be similarly edited. comment `/config` to see the list of available configurations. +Any configuration value in [configuration file](pr_agent/settings/configuration.toml) file can be similarly edited. Comment `/config` to see the list of available configurations. ### Working with GitHub App When running PR-Agent from GitHub App, the default [configuration file](pr_agent/settings/configuration.toml) from a pre-built docker will be initially loaded. -By uploading a local `.pr_agent.toml` file, you can edit and customize any configuration parameter. +By uploading a local `.pr_agent.toml` file to the root of the repo's main branch, you can edit and customize any configuration parameter. For example, if you set in `.pr_agent.toml`: @@ -119,7 +126,7 @@ For example, if you set in `.pr_agent.toml`: num_code_suggestions=1 ``` -Than you will overwrite the default number of code suggestions to be 1. +Then you will overwrite the default number of code suggestions to 1. #### GitHub app automatic tools The [github_app](pr_agent/settings/configuration.toml#L76) section defines GitHub app-specific configurations. @@ -133,7 +140,7 @@ The GitHub app can respond to the following actions on a PR: 4. `review_requested` - Specifically requesting review (in the PR reviewers list) from the `github-actions[bot]` user The configuration parameter `handle_pr_actions` defines the list of actions for which the GitHub app will trigger the PR-Agent. -The configuration parameter `pr_commands` defines the list of tools that will be **run automatically** when one of the above action happens (e.g. a new PR is opened): +The configuration parameter `pr_commands` defines the list of tools that will be **run automatically** when one of the above actions happens (e.g., a new PR is opened): ``` [github_app] handle_pr_actions = ['opened', 'reopened', 'ready_for_review', 'review_requested'] @@ -173,11 +180,11 @@ push_commands = [ "/auto_review -i --pr_reviewer.remove_previous_review_comment=true", ] ``` -The means that when new code is pushed to the PR, the PR-Agent will run the `describe` and incremental `auto_review` tools. +This means that when new code is pushed to the PR, the PR-Agent will run the `describe` and incremental `auto_review` tools. For the describe tool, the `add_original_user_description` and `keep_original_user_title` parameters will be set to true. For the `auto_review` tool, it will run in incremental mode, and the `remove_previous_review_comment` parameter will be set to true. 
-Much like the configurations for `pr_commands`, you can override the default tool paramteres by uploading a local configuration file to the root of your repo. +Much like the configurations for `pr_commands`, you can override the default tool parameters by uploading a local configuration file to the root of your repo. #### Editing the prompts The prompts for the various PR-Agent tools are defined in the `pr_agent/settings` folder. @@ -303,6 +310,24 @@ key = ... Also review the [AiHandler](pr_agent/algo/ai_handler.py) file for instructions on how to set keys for other models. +#### Vertex AI + +To use Google's Vertex AI platform and its associated models (chat-bison/codechat-bison), set: + +``` +[config] # in configuration.toml +model = "vertex_ai/codechat-bison" +fallback_models="vertex_ai/codechat-bison" + +[vertexai] # in .secrets.toml +vertex_project = "my-google-cloud-project" +vertex_location = "" +``` + +Your [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials) will be used for authentication, so there is no need to set explicit credentials in most environments. + +If you do want to set explicit credentials, you can set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of a JSON credentials file. + ### Working with large PRs The default mode of CodiumAI is to have a single call per tool, using GPT-4, which has a token limit of 8000 tokens. diff --git a/docker/Dockerfile b/docker/Dockerfile index 951f846c..0f669e89 100644 --- a/docker/Dockerfile +++ b/docker/Dockerfile @@ -14,6 +14,10 @@ FROM base as bitbucket_app ADD pr_agent pr_agent CMD ["python", "pr_agent/servers/bitbucket_app.py"] +FROM base as bitbucket_server_webhook +ADD pr_agent pr_agent +CMD ["python", "pr_agent/servers/bitbucket_server_webhook.py"] + FROM base as github_polling ADD pr_agent pr_agent CMD ["python", "pr_agent/servers/github_polling.py"] diff --git a/docs/REVIEW.md b/docs/REVIEW.md index 342504e2..533ac466 100644 --- a/docs/REVIEW.md +++ b/docs/REVIEW.md @@ -16,17 +16,22 @@ The `review` tool can also be triggered automatically every time a new PR is ope Under the section 'pr_reviewer', the [configuration file](./../pr_agent/settings/configuration.toml#L16) contains options to customize the 'review' tool: +#### enable\\disable features - `require_focused_review`: if set to true, the tool will add a section - 'is the PR a focused one'. Default is false. - `require_score_review`: if set to true, the tool will add a section that scores the PR. Default is false. - `require_tests_review`: if set to true, the tool will add a section that checks if the PR contains tests. Default is true. - `require_security_review`: if set to true, the tool will add a section that checks if the PR contains security issues. Default is true. - `require_estimate_effort_to_review`: if set to true, the tool will add a section that estimates the effort needed to review the PR. Default is true. +#### general options - `num_code_suggestions`: number of code suggestions provided by the 'review' tool. Default is 4. - `inline_code_comments`: if set to true, the tool will publish the code suggestions as comments on the code diff. Default is false. - `automatic_review`: if set to false, no automatic reviews will be done. Default is true. - `remove_previous_review_comment`: if set to true, the tool will remove the previous review comment before adding a new one. Default is false. -- `persistent_comment`: if set to true, the review comment will be persistent. 
Default is true. +- `persistent_comment`: if set to true, the review comment will be persistent, meaning that every new review request will edit the previous one. Default is true. - `extra_instructions`: Optional extra instructions to the tool. For example: "focus on the changes in the file X. Ignore change in ...". +#### review labels +- `enable_review_labels_security`: if set to true, the tool will publish a 'possible security issue' label if it detects a security issue. Default is true. +- `enable_review_labels_effort`: if set to true, the tool will publish a 'Review effort [1-5]: x' label. Default is false. - To enable `custom labels`, apply the configuration changes described [here](./GENERATE_CUSTOM_LABELS.md#configuration-changes) #### Incremental Mode For an incremental review, which only considers changes since the last PR-Agent review, this can be useful when working on the PR in an iterative manner, and you want to focus on the changes since the last review instead of reviewing the entire PR again, the following command can be used: diff --git a/pr_agent/agent/pr_agent.py b/pr_agent/agent/pr_agent.py index 6e76c5e0..5608c50a 100644 --- a/pr_agent/agent/pr_agent.py +++ b/pr_agent/agent/pr_agent.py @@ -46,10 +46,13 @@ class PRAgent: apply_repo_settings(pr_url) # Then, apply user specific settings if exists - request = request.replace("'", "\\'") - lexer = shlex.shlex(request, posix=True) - lexer.whitespace_split = True - action, *args = list(lexer) + if isinstance(request, str): + request = request.replace("'", "\\'") + lexer = shlex.shlex(request, posix=True) + lexer.whitespace_split = True + action, *args = list(lexer) + else: + action, *args = request args = update_settings_from_args(args) action = action.lstrip("/").lower() diff --git a/pr_agent/algo/__init__.py b/pr_agent/algo/__init__.py index 5a253363..5fe82ee5 100644 --- a/pr_agent/algo/__init__.py +++ b/pr_agent/algo/__init__.py @@ -13,5 +13,9 @@ MAX_TOKENS = { 'claude-2': 100000, 'command-nightly': 4096, 'replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1': 4096, - 'meta-llama/Llama-2-7b-chat-hf': 4096 + 'meta-llama/Llama-2-7b-chat-hf': 4096, + 'vertex_ai/codechat-bison': 6144, + 'vertex_ai/codechat-bison-32k': 32000, + 'codechat-bison': 6144, + 'codechat-bison-32k': 32000, } diff --git a/pr_agent/algo/ai_handler.py b/pr_agent/algo/ai_handler.py index c3989563..9a48cdc3 100644 --- a/pr_agent/algo/ai_handler.py +++ b/pr_agent/algo/ai_handler.py @@ -23,39 +23,43 @@ class AiHandler: Initializes the OpenAI API key and other settings from a configuration file. Raises a ValueError if the OpenAI key is missing. 
""" - try: + self.azure = False + + if get_settings().get("OPENAI.KEY", None): openai.api_key = get_settings().openai.key litellm.openai_key = get_settings().openai.key - if get_settings().get("litellm.use_client"): - litellm_token = get_settings().get("litellm.LITELLM_TOKEN") - assert litellm_token, "LITELLM_TOKEN is required" - os.environ["LITELLM_TOKEN"] = litellm_token - litellm.use_client = True - self.azure = False - if get_settings().get("OPENAI.ORG", None): - litellm.organization = get_settings().openai.org - if get_settings().get("OPENAI.API_TYPE", None): - if get_settings().openai.api_type == "azure": - self.azure = True - litellm.azure_key = get_settings().openai.key - if get_settings().get("OPENAI.API_VERSION", None): - litellm.api_version = get_settings().openai.api_version - if get_settings().get("OPENAI.API_BASE", None): - litellm.api_base = get_settings().openai.api_base - if get_settings().get("ANTHROPIC.KEY", None): - litellm.anthropic_key = get_settings().anthropic.key - if get_settings().get("COHERE.KEY", None): - litellm.cohere_key = get_settings().cohere.key - if get_settings().get("REPLICATE.KEY", None): - litellm.replicate_key = get_settings().replicate.key - if get_settings().get("REPLICATE.KEY", None): - litellm.replicate_key = get_settings().replicate.key - if get_settings().get("HUGGINGFACE.KEY", None): - litellm.huggingface_key = get_settings().huggingface.key - if get_settings().get("HUGGINGFACE.API_BASE", None): - litellm.api_base = get_settings().huggingface.api_base - except AttributeError as e: - raise ValueError("OpenAI key is required") from e + if get_settings().get("litellm.use_client"): + litellm_token = get_settings().get("litellm.LITELLM_TOKEN") + assert litellm_token, "LITELLM_TOKEN is required" + os.environ["LITELLM_TOKEN"] = litellm_token + litellm.use_client = True + if get_settings().get("OPENAI.ORG", None): + litellm.organization = get_settings().openai.org + if get_settings().get("OPENAI.API_TYPE", None): + if get_settings().openai.api_type == "azure": + self.azure = True + litellm.azure_key = get_settings().openai.key + if get_settings().get("OPENAI.API_VERSION", None): + litellm.api_version = get_settings().openai.api_version + if get_settings().get("OPENAI.API_BASE", None): + litellm.api_base = get_settings().openai.api_base + if get_settings().get("ANTHROPIC.KEY", None): + litellm.anthropic_key = get_settings().anthropic.key + if get_settings().get("COHERE.KEY", None): + litellm.cohere_key = get_settings().cohere.key + if get_settings().get("REPLICATE.KEY", None): + litellm.replicate_key = get_settings().replicate.key + if get_settings().get("REPLICATE.KEY", None): + litellm.replicate_key = get_settings().replicate.key + if get_settings().get("HUGGINGFACE.KEY", None): + litellm.huggingface_key = get_settings().huggingface.key + if get_settings().get("HUGGINGFACE.API_BASE", None): + litellm.api_base = get_settings().huggingface.api_base + if get_settings().get("VERTEXAI.VERTEX_PROJECT", None): + litellm.vertex_project = get_settings().vertexai.vertex_project + litellm.vertex_location = get_settings().get( + "VERTEXAI.VERTEX_LOCATION", None + ) @property def deployment_id(self): diff --git a/pr_agent/algo/file_filter.py b/pr_agent/algo/file_filter.py index 32c61155..aa457293 100644 --- a/pr_agent/algo/file_filter.py +++ b/pr_agent/algo/file_filter.py @@ -11,7 +11,12 @@ def filter_ignored(files): try: # load regex patterns, and translate glob patterns to regex patterns = get_settings().ignore.regex - patterns += [fnmatch.translate(glob) for 
glob in get_settings().ignore.glob] + if isinstance(patterns, str): + patterns = [patterns] + glob_setting = get_settings().ignore.glob + if isinstance(glob_setting, str): # --ignore.glob=[.*utils.py], --ignore.glob=.*utils.py + glob_setting = glob_setting.strip('[]').split(",") + patterns += [fnmatch.translate(glob) for glob in glob_setting] # compile all valid patterns compiled_patterns = [] diff --git a/pr_agent/algo/pr_processing.py b/pr_agent/algo/pr_processing.py index e5b6f59e..6063dece 100644 --- a/pr_agent/algo/pr_processing.py +++ b/pr_agent/algo/pr_processing.py @@ -282,7 +282,7 @@ def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo], r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)") for file in diff_files: - if file.filename.strip() == relevant_file: + if file.filename and (file.filename.strip() == relevant_file): patch = file.patch patch_lines = patch.splitlines() diff --git a/pr_agent/algo/utils.py b/pr_agent/algo/utils.py index bd91477c..b9aaee94 100644 --- a/pr_agent/algo/utils.py +++ b/pr_agent/algo/utils.py @@ -282,41 +282,43 @@ def _fix_key_value(key: str, value: str): try: value = yaml.safe_load(value) except Exception as e: - get_logger().error(f"Failed to parse YAML for config override {key}={value}", exc_info=e) + get_logger().debug(f"Failed to parse YAML for config override {key}={value}", exc_info=e) return key, value -def load_yaml(review_text: str) -> dict: - review_text = review_text.removeprefix('```yaml').rstrip('`') +def load_yaml(response_text: str) -> dict: + response_text = response_text.removeprefix('```yaml').rstrip('`') try: - data = yaml.safe_load(review_text) + data = yaml.safe_load(response_text) except Exception as e: get_logger().error(f"Failed to parse AI prediction: {e}") - data = try_fix_yaml(review_text) + data = try_fix_yaml(response_text) return data -def try_fix_yaml(review_text: str) -> dict: - review_text_lines = review_text.split('\n') +def try_fix_yaml(response_text: str) -> dict: + response_text_lines = response_text.split('\n') + keys = ['relevant line:', 'suggestion content:', 'relevant file:'] # first fallback - try to convert 'relevant line: ...' to relevant line: |-\n ...' 
- review_text_lines_copy = review_text_lines.copy() - for i in range(0, len(review_text_lines_copy)): - if 'relevant line:' in review_text_lines_copy[i] and not '|-' in review_text_lines_copy[i]: - review_text_lines_copy[i] = review_text_lines_copy[i].replace('relevant line: ', - 'relevant line: |-\n ') + response_text_lines_copy = response_text_lines.copy() + for i in range(0, len(response_text_lines_copy)): + for key in keys: + if key in response_text_lines_copy[i] and not '|-' in response_text_lines_copy[i]: + response_text_lines_copy[i] = response_text_lines_copy[i].replace(f'{key}', + f'{key} |-\n ') try: - data = yaml.load('\n'.join(review_text_lines_copy), Loader=yaml.SafeLoader) - get_logger().info(f"Successfully parsed AI prediction after adding |-\n to relevant line") + data = yaml.safe_load('\n'.join(response_text_lines_copy)) + get_logger().info(f"Successfully parsed AI prediction after adding |-\n") return data except: - get_logger().debug(f"Failed to parse AI prediction after adding |-\n to relevant line") + get_logger().info(f"Failed to parse AI prediction after adding |-\n") # second fallback - try to remove last lines data = {} - for i in range(1, len(review_text_lines)): - review_text_lines_tmp = '\n'.join(review_text_lines[:-i]) + for i in range(1, len(response_text_lines)): + response_text_lines_tmp = '\n'.join(response_text_lines[:-i]) try: - data = yaml.load(review_text_lines_tmp, Loader=yaml.SafeLoader) + data = yaml.safe_load(response_text_lines_tmp,) get_logger().info(f"Successfully parsed AI prediction after removing {i} lines") break except: diff --git a/pr_agent/cli.py b/pr_agent/cli.py index 60948db5..5a6a6640 100644 --- a/pr_agent/cli.py +++ b/pr_agent/cli.py @@ -8,6 +8,8 @@ from pr_agent.log import setup_logger setup_logger() + + def run(inargs=None): parser = argparse.ArgumentParser(description='AI based pull request analyzer', usage= """\ @@ -55,9 +57,9 @@ For example: 'python cli.py --pr_url=... 
review --pr_reviewer.extra_instructions command = args.command.lower() get_settings().set("CONFIG.CLI_MODE", True) if args.issue_url: - result = asyncio.run(PRAgent().handle_request(args.issue_url, command + " " + " ".join(args.rest))) + result = asyncio.run(PRAgent().handle_request(args.issue_url, [command] + args.rest)) else: - result = asyncio.run(PRAgent().handle_request(args.pr_url, command + " " + " ".join(args.rest))) + result = asyncio.run(PRAgent().handle_request(args.pr_url, [command] + args.rest)) if not result: parser.print_help() diff --git a/pr_agent/git_providers/__init__.py b/pr_agent/git_providers/__init__.py index 968f0dfc..14103a95 100644 --- a/pr_agent/git_providers/__init__.py +++ b/pr_agent/git_providers/__init__.py @@ -1,5 +1,6 @@ from pr_agent.config_loader import get_settings from pr_agent.git_providers.bitbucket_provider import BitbucketProvider +from pr_agent.git_providers.bitbucket_server_provider import BitbucketServerProvider from pr_agent.git_providers.codecommit_provider import CodeCommitProvider from pr_agent.git_providers.github_provider import GithubProvider from pr_agent.git_providers.gitlab_provider import GitLabProvider @@ -12,6 +13,7 @@ _GIT_PROVIDERS = { 'github': GithubProvider, 'gitlab': GitLabProvider, 'bitbucket': BitbucketProvider, + 'bitbucket_server': BitbucketServerProvider, 'azure': AzureDevopsProvider, 'codecommit': CodeCommitProvider, 'local' : LocalGitProvider, diff --git a/pr_agent/git_providers/bitbucket_provider.py b/pr_agent/git_providers/bitbucket_provider.py index 47f2b32a..e2431645 100644 --- a/pr_agent/git_providers/bitbucket_provider.py +++ b/pr_agent/git_providers/bitbucket_provider.py @@ -153,17 +153,29 @@ class BitbucketProvider(GitProvider): self.diff_files = diff_files return diff_files - def publish_persistent_comment(self, pr_comment: str, initial_text: str, updated_text: str): + def get_latest_commit_url(self): + return self.pr.data['source']['commit']['links']['html']['href'] + + def get_comment_url(self, comment): + return comment.data['links']['html']['href'] + + def publish_persistent_comment(self, pr_comment: str, initial_header: str, update_header: bool = True): try: for comment in self.pr.comments(): body = comment.raw - if initial_text in body: - if updated_text: - pr_comment_updated = pr_comment.replace(initial_text, updated_text) + if initial_header in body: + latest_commit_url = self.get_latest_commit_url() + comment_url = self.get_comment_url(comment) + if update_header: + updated_header = f"{initial_header}\n\n### (review updated until commit {latest_commit_url})\n" + pr_comment_updated = pr_comment.replace(initial_header, updated_header) else: pr_comment_updated = pr_comment + get_logger().info(f"Persistent mode- updating comment {comment_url} to latest review message") d = {"content": {"raw": pr_comment_updated}} response = comment._update_data(comment.put(None, data=d)) + self.publish_comment( + f"**[Persistent review]({comment_url})** updated to latest commit {latest_commit_url}") return except Exception as e: get_logger().exception(f"Failed to update persistent review, error: {e}") diff --git a/pr_agent/git_providers/bitbucket_server_provider.py b/pr_agent/git_providers/bitbucket_server_provider.py new file mode 100644 index 00000000..44347850 --- /dev/null +++ b/pr_agent/git_providers/bitbucket_server_provider.py @@ -0,0 +1,351 @@ +import json +from typing import Optional, Tuple +from urllib.parse import urlparse + +import requests +from atlassian.bitbucket import Bitbucket +from starlette_context 
import context + +from .git_provider import FilePatchInfo, GitProvider, EDIT_TYPE +from ..algo.pr_processing import find_line_number_of_relevant_line_in_file +from ..algo.utils import load_large_diff +from ..config_loader import get_settings +from ..log import get_logger + + +class BitbucketServerProvider(GitProvider): + def __init__( + self, pr_url: Optional[str] = None, incremental: Optional[bool] = False + ): + s = requests.Session() + try: + bearer = context.get("bitbucket_bearer_token", None) + s.headers["Authorization"] = f"Bearer {bearer}" + except Exception: + s.headers[ + "Authorization" + ] = f'Bearer {get_settings().get("BITBUCKET_SERVER.BEARER_TOKEN", None)}' + + s.headers["Content-Type"] = "application/json" + self.headers = s.headers + self.bitbucket_server_url = None + self.workspace_slug = None + self.repo_slug = None + self.repo = None + self.pr_num = None + self.pr = None + self.pr_url = pr_url + self.temp_comments = [] + self.incremental = incremental + self.diff_files = None + self.bitbucket_pull_request_api_url = pr_url + + self.bitbucket_server_url = self._parse_bitbucket_server(url=pr_url) + self.bitbucket_client = Bitbucket(url=self.bitbucket_server_url, + token=get_settings().get("BITBUCKET_SERVER.BEARER_TOKEN", None)) + + if pr_url: + self.set_pr(pr_url) + + def get_repo_settings(self): + try: + url = (f"{self.bitbucket_server_url}/projects/{self.workspace_slug}/repos/{self.repo_slug}/src/" + f"{self.pr.destination_branch}/.pr_agent.toml") + response = requests.request("GET", url, headers=self.headers) + if response.status_code == 404: # not found + return "" + contents = response.text.encode('utf-8') + return contents + except Exception: + return "" + + def publish_code_suggestions(self, code_suggestions: list) -> bool: + """ + Publishes code suggestions as comments on the PR. 
+ """ + post_parameters_list = [] + for suggestion in code_suggestions: + body = suggestion["body"] + relevant_file = suggestion["relevant_file"] + relevant_lines_start = suggestion["relevant_lines_start"] + relevant_lines_end = suggestion["relevant_lines_end"] + + if not relevant_lines_start or relevant_lines_start == -1: + if get_settings().config.verbosity_level >= 2: + get_logger().exception( + f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}" + ) + continue + + if relevant_lines_end < relevant_lines_start: + if get_settings().config.verbosity_level >= 2: + get_logger().exception( + f"Failed to publish code suggestion, " + f"relevant_lines_end is {relevant_lines_end} and " + f"relevant_lines_start is {relevant_lines_start}" + ) + continue + + if relevant_lines_end > relevant_lines_start: + post_parameters = { + "body": body, + "path": relevant_file, + "line": relevant_lines_end, + "start_line": relevant_lines_start, + "start_side": "RIGHT", + } + else: # API is different for single line comments + post_parameters = { + "body": body, + "path": relevant_file, + "line": relevant_lines_start, + "side": "RIGHT", + } + post_parameters_list.append(post_parameters) + + try: + self.publish_inline_comments(post_parameters_list) + return True + except Exception as e: + if get_settings().config.verbosity_level >= 2: + get_logger().error(f"Failed to publish code suggestion, error: {e}") + return False + + def is_supported(self, capability: str) -> bool: + if capability in ['get_issue_comments', 'get_labels', 'gfm_markdown']: + return False + return True + + def set_pr(self, pr_url: str): + self.workspace_slug, self.repo_slug, self.pr_num = self._parse_pr_url(pr_url) + self.pr = self._get_pr() + + def get_file(self, path: str, commit_id: str): + file_content = "" + try: + file_content = self.bitbucket_client.get_content_of_file(self.workspace_slug, + self.repo_slug, + path, + commit_id) + except requests.HTTPError as e: + get_logger().debug(f"File {path} not found at commit id: {commit_id}") + return file_content + + def get_files(self): + changes = self.bitbucket_client.get_pull_requests_changes(self.workspace_slug, self.repo_slug, self.pr_num) + diffstat = [change["path"]['toString'] for change in changes] + return diffstat + + def get_diff_files(self) -> list[FilePatchInfo]: + if self.diff_files: + return self.diff_files + + commits_in_pr = self.bitbucket_client.get_pull_requests_commits( + self.workspace_slug, + self.repo_slug, + self.pr_num + ) + + commit_list = list(commits_in_pr) + base_sha, head_sha = commit_list[0]['parents'][0]['id'], commit_list[-1]['id'] + + diff_files = [] + original_file_content_str = "" + new_file_content_str = "" + + changes = self.bitbucket_client.get_pull_requests_changes(self.workspace_slug, self.repo_slug, self.pr_num) + for change in changes: + file_path = change['path']['toString'] + match change['type']: + case 'ADD': + edit_type = EDIT_TYPE.ADDED + new_file_content_str = self.get_file(file_path, head_sha) + if isinstance(new_file_content_str, (bytes, bytearray)): + new_file_content_str = new_file_content_str.decode("utf-8") + original_file_content_str = "" + case 'DELETE': + edit_type = EDIT_TYPE.DELETED + new_file_content_str = "" + original_file_content_str = self.get_file(file_path, base_sha) + if isinstance(original_file_content_str, (bytes, bytearray)): + original_file_content_str = original_file_content_str.decode("utf-8") + case 'RENAME': + edit_type = EDIT_TYPE.RENAMED + case _: + edit_type = EDIT_TYPE.MODIFIED + 
original_file_content_str = self.get_file(file_path, base_sha) + if isinstance(original_file_content_str, (bytes, bytearray)): + original_file_content_str = original_file_content_str.decode("utf-8") + new_file_content_str = self.get_file(file_path, head_sha) + if isinstance(new_file_content_str, (bytes, bytearray)): + new_file_content_str = new_file_content_str.decode("utf-8") + + patch = load_large_diff(file_path, new_file_content_str, original_file_content_str) + + diff_files.append( + FilePatchInfo( + original_file_content_str, + new_file_content_str, + patch, + file_path, + edit_type=edit_type, + ) + ) + + self.diff_files = diff_files + return diff_files + + def publish_comment(self, pr_comment: str, is_temporary: bool = False): + if not is_temporary: + self.bitbucket_client.add_pull_request_comment(self.workspace_slug, self.repo_slug, self.pr_num, pr_comment) + + def remove_initial_comment(self): + try: + for comment in self.temp_comments: + self.remove_comment(comment) + except ValueError as e: + get_logger().exception(f"Failed to remove temp comments, error: {e}") + + def remove_comment(self, comment): + pass + + # funtion to create_inline_comment + def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str): + position, absolute_position = find_line_number_of_relevant_line_in_file( + self.get_diff_files(), + relevant_file.strip('`'), + relevant_line_in_file + ) + if position == -1: + if get_settings().config.verbosity_level >= 2: + get_logger().info(f"Could not find position for {relevant_file} {relevant_line_in_file}") + subject_type = "FILE" + else: + subject_type = "LINE" + path = relevant_file.strip() + return dict(body=body, path=path, position=absolute_position) if subject_type == "LINE" else {} + + def publish_inline_comment(self, comment: str, from_line: int, file: str): + payload = { + "text": comment, + "severity": "NORMAL", + "anchor": { + "diffType": "EFFECTIVE", + "path": file, + "lineType": "ADDED", + "line": from_line, + "fileType": "TO" + } + } + + response = requests.post(url=self._get_pr_comments_url(), json=payload, headers=self.headers) + return response + + def generate_link_to_relevant_line_number(self, suggestion) -> str: + try: + relevant_file = suggestion['relevant file'].strip('`').strip("'") + relevant_line_str = suggestion['relevant line'] + if not relevant_line_str: + return "" + + diff_files = self.get_diff_files() + position, absolute_position = find_line_number_of_relevant_line_in_file \ + (diff_files, relevant_file, relevant_line_str) + + if absolute_position != -1 and self.pr_url: + link = f"{self.pr_url}/#L{relevant_file}T{absolute_position}" + return link + except Exception as e: + if get_settings().config.verbosity_level >= 2: + get_logger().info(f"Failed adding line link, error: {e}") + + return "" + + def publish_inline_comments(self, comments: list[dict]): + for comment in comments: + self.publish_inline_comment(comment['body'], comment['position'], comment['path']) + + def get_title(self): + return self.pr.title + + def get_languages(self): + return {"yaml": 0} # devops LOL + + def get_pr_branch(self): + return self.pr.fromRef['displayId'] + + def get_pr_description_full(self): + return self.pr.description + + def get_user_id(self): + return 0 + + def get_issue_comments(self): + raise NotImplementedError( + "Bitbucket provider does not support issue comments yet" + ) + + def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]: + return True + + def remove_reaction(self, issue_comment_id: int, 
reaction_id: int) -> bool: + return True + + @staticmethod + def _parse_bitbucket_server(url: str) -> str: + parsed_url = urlparse(url) + return f"{parsed_url.scheme}://{parsed_url.netloc}" + + @staticmethod + def _parse_pr_url(pr_url: str) -> Tuple[str, str, int]: + parsed_url = urlparse(pr_url) + path_parts = parsed_url.path.strip("/").split("/") + if len(path_parts) < 6 or path_parts[4] != "pull-requests": + raise ValueError( + "The provided URL does not appear to be a Bitbucket PR URL" + ) + + workspace_slug = path_parts[1] + repo_slug = path_parts[3] + try: + pr_number = int(path_parts[5]) + except ValueError as e: + raise ValueError("Unable to convert PR number to integer") from e + + return workspace_slug, repo_slug, pr_number + + def _get_repo(self): + if self.repo is None: + self.repo = self.bitbucket_client.get_repo(self.workspace_slug, self.repo_slug) + return self.repo + + def _get_pr(self): + pr = self.bitbucket_client.get_pull_request(self.workspace_slug, self.repo_slug, pull_request_id=self.pr_num) + return type('new_dict', (object,), pr) + + def _get_pr_file_content(self, remote_link: str): + return "" + + def get_commit_messages(self): + def get_commit_messages(self): + raise NotImplementedError("Get commit messages function not implemented yet.") + # bitbucket does not support labels + def publish_description(self, pr_title: str, description: str): + payload = json.dumps({ + "description": description, + "title": pr_title + }) + + response = requests.put(url=self.bitbucket_pull_request_api_url, headers=self.headers, data=payload) + return response + + # bitbucket does not support labels + def publish_labels(self, pr_types: list): + pass + + # bitbucket does not support labels + def get_labels(self): + pass + + def _get_pr_comments_url(self): + return f"{self.bitbucket_server_url}/rest/api/latest/projects/{self.workspace_slug}/repos/{self.repo_slug}/pull-requests/{self.pr_num}/comments" diff --git a/pr_agent/git_providers/git_provider.py b/pr_agent/git_providers/git_provider.py index 1e18d86e..05122f9c 100644 --- a/pr_agent/git_providers/git_provider.py +++ b/pr_agent/git_providers/git_provider.py @@ -40,45 +40,10 @@ class GitProvider(ABC): def publish_description(self, pr_title: str, pr_body: str): pass - @abstractmethod - def publish_comment(self, pr_comment: str, is_temporary: bool = False): - pass - - def publish_persistent_comment(self, pr_comment: str, initial_text: str, updated_text: str): - self.publish_comment(pr_comment) - - @abstractmethod - def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str): - pass - - @abstractmethod - def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str): - pass - - @abstractmethod - def publish_inline_comments(self, comments: list[dict]): - pass - @abstractmethod def publish_code_suggestions(self, code_suggestions: list) -> bool: pass - @abstractmethod - def publish_labels(self, labels): - pass - - @abstractmethod - def get_labels(self): - pass - - @abstractmethod - def remove_initial_comment(self): - pass - - @abstractmethod - def remove_comment(self, comment): - pass - @abstractmethod def get_languages(self): pass @@ -107,7 +72,7 @@ class GitProvider(ABC): def get_user_description(self) -> str: description = (self.get_pr_description_full() or "").strip() # if the existing description wasn't generated by the pr-agent, just return it as-is - if not description.startswith("## PR Type"): + if not any(description.startswith(header) for header in ("## PR Type", "## PR 
Description")): return description # if the existing description was generated by the pr-agent, but it doesn't contain the user description, # return nothing (empty string) because it means there is no user description @@ -117,11 +82,54 @@ class GitProvider(ABC): return description.split("## User Description:", 1)[1].strip() @abstractmethod - def get_issue_comments(self): + def get_repo_settings(self): + pass + + def get_pr_id(self): + return "" + + #### comments operations #### + @abstractmethod + def publish_comment(self, pr_comment: str, is_temporary: bool = False): + pass + + def publish_persistent_comment(self, pr_comment: str, initial_header: str, update_header: bool): + self.publish_comment(pr_comment) + + @abstractmethod + def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str): pass @abstractmethod - def get_repo_settings(self): + def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str): + pass + + @abstractmethod + def publish_inline_comments(self, comments: list[dict]): + pass + + @abstractmethod + def remove_initial_comment(self): + pass + + @abstractmethod + def remove_comment(self, comment): + pass + + @abstractmethod + def get_issue_comments(self): + pass + + def get_comment_url(self, comment) -> str: + return "" + + #### labels operations #### + @abstractmethod + def publish_labels(self, labels): + pass + + @abstractmethod + def get_labels(self): pass @abstractmethod @@ -132,11 +140,12 @@ class GitProvider(ABC): def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool: pass + #### commits operations #### @abstractmethod def get_commit_messages(self): pass - def get_pr_id(self): + def get_latest_commit_url(self) -> str: return "" def get_main_pr_language(languages, files) -> str: diff --git a/pr_agent/git_providers/github_provider.py b/pr_agent/git_providers/github_provider.py index c0b9cc11..634b8694 100644 --- a/pr_agent/git_providers/github_provider.py +++ b/pr_agent/git_providers/github_provider.py @@ -154,16 +154,28 @@ class GithubProvider(GitProvider): def publish_description(self, pr_title: str, pr_body: str): self.pr.edit(title=pr_title, body=pr_body) - def publish_persistent_comment(self, pr_comment: str, initial_text: str, updated_text: str): + def get_latest_commit_url(self) -> str: + return self.last_commit_id.html_url + + def get_comment_url(self, comment) -> str: + return comment.html_url + + def publish_persistent_comment(self, pr_comment: str, initial_header: str, update_header: bool = True): prev_comments = list(self.pr.get_issue_comments()) for comment in prev_comments: body = comment.body - if body.startswith(initial_text): - if updated_text: - pr_comment_updated = pr_comment.replace(initial_text, updated_text) + if body.startswith(initial_header): + latest_commit_url = self.get_latest_commit_url() + comment_url = self.get_comment_url(comment) + if update_header: + updated_header = f"{initial_header}\n\n### (review updated until commit {latest_commit_url})\n" + pr_comment_updated = pr_comment.replace(initial_header, updated_header) else: pr_comment_updated = pr_comment + get_logger().info(f"Persistent mode- updating comment {comment_url} to latest review message") response = comment.edit(pr_comment_updated) + self.publish_comment( + f"**[Persistent review]({comment_url})** updated to latest commit {latest_commit_url}") return self.publish_comment(pr_comment) @@ -393,7 +405,7 @@ class GithubProvider(GitProvider): raise ValueError("GitHub app installation ID is required 
when using GitHub app deployment") auth = AppAuthentication(app_id=app_id, private_key=private_key, installation_id=self.installation_id) - return Github(app_auth=auth) + return Github(app_auth=auth, base_url=get_settings().github.base_url) if deployment_type == 'user': try: @@ -402,7 +414,7 @@ class GithubProvider(GitProvider): raise ValueError( "GitHub token is required when using user deployment. See: " "https://github.com/Codium-ai/pr-agent#method-2-run-from-source") from e - return Github(auth=Auth.Token(token)) + return Github(auth=Auth.Token(token), base_url=get_settings().github.base_url) def _get_repo(self): if hasattr(self, 'repo_obj') and \ diff --git a/pr_agent/git_providers/gitlab_provider.py b/pr_agent/git_providers/gitlab_provider.py index 396483a5..078ca9dd 100644 --- a/pr_agent/git_providers/gitlab_provider.py +++ b/pr_agent/git_providers/gitlab_provider.py @@ -136,15 +136,27 @@ class GitLabProvider(GitProvider): except Exception as e: get_logger().exception(f"Could not update merge request {self.id_mr} description: {e}") - def publish_persistent_comment(self, pr_comment: str, initial_text: str, updated_text: str): + def get_latest_commit_url(self): + return self.mr.commits().next().web_url + + def get_comment_url(self, comment): + return f"{self.mr.web_url}#note_{comment.id}" + + def publish_persistent_comment(self, pr_comment: str, initial_header: str, update_header: bool = True): try: for comment in self.mr.notes.list(get_all=True)[::-1]: - if comment.body.startswith(initial_text): - if updated_text: - pr_comment_updated = pr_comment.replace(initial_text, updated_text) + if comment.body.startswith(initial_header): + latest_commit_url = self.get_latest_commit_url() + comment_url = self.get_comment_url(comment) + if update_header: + updated_header = f"{initial_header}\n\n### (review updated until commit {latest_commit_url})\n" + pr_comment_updated = pr_comment.replace(initial_header, updated_header) else: pr_comment_updated = pr_comment + get_logger().info(f"Persistent mode- updating comment {comment_url} to latest review message") response = self.mr.notes.update(comment.id, {'body': pr_comment_updated}) + self.publish_comment( + f"**[Persistent review]({comment_url})** updated to latest commit {latest_commit_url}") return except Exception as e: get_logger().exception(f"Failed to update persistent review, error: {e}") diff --git a/pr_agent/servers/bitbucket_server_webhook.py b/pr_agent/servers/bitbucket_server_webhook.py new file mode 100644 index 00000000..c6ce8353 --- /dev/null +++ b/pr_agent/servers/bitbucket_server_webhook.py @@ -0,0 +1,64 @@ +import json + +import uvicorn +from fastapi import APIRouter, FastAPI +from fastapi.encoders import jsonable_encoder +from starlette import status +from starlette.background import BackgroundTasks +from starlette.middleware import Middleware +from starlette.requests import Request +from starlette.responses import JSONResponse +from starlette_context.middleware import RawContextMiddleware + +from pr_agent.agent.pr_agent import PRAgent +from pr_agent.config_loader import get_settings +from pr_agent.log import get_logger + +router = APIRouter() + + +def handle_request(background_tasks: BackgroundTasks, url: str, body: str, log_context: dict): + log_context["action"] = body + log_context["event"] = "pull_request" if body == "review" else "comment" + log_context["api_url"] = url + with get_logger().contextualize(**log_context): + background_tasks.add_task(PRAgent().handle_request, url, body) + + +@router.post("/webhook") +async def 
handle_webhook(background_tasks: BackgroundTasks, request: Request): + log_context = {"server_type": "bitbucket_server"} + data = await request.json() + get_logger().info(json.dumps(data)) + + pr_id = data['pullRequest']['id'] + repository_name = data['pullRequest']['toRef']['repository']['slug'] + project_name = data['pullRequest']['toRef']['repository']['project']['key'] + bitbucket_server = get_settings().get("BITBUCKET_SERVER.URL") + pr_url = f"{bitbucket_server}/projects/{project_name}/repos/{repository_name}/pull-requests/{pr_id}" + + log_context["api_url"] = pr_url + log_context["event"] = "pull_request" + + handle_request(background_tasks, pr_url, "review", log_context) + return JSONResponse(status_code=status.HTTP_200_OK, content=jsonable_encoder({"message": "success"})) + + +@router.get("/") +async def root(): + return {"status": "ok"} + + +def start(): + bitbucket_server_url = get_settings().get("BITBUCKET_SERVER.URL", None) + if not bitbucket_server_url: + raise ValueError("BITBUCKET_SERVER.URL is not set") + get_settings().config.git_provider = "bitbucket_server" + middleware = [Middleware(RawContextMiddleware)] + app = FastAPI(middleware=middleware) + app.include_router(router) + uvicorn.run(app, host="0.0.0.0", port=3000) + + +if __name__ == '__main__': + start() diff --git a/pr_agent/servers/gitlab_webhook.py b/pr_agent/servers/gitlab_webhook.py index e2e66e09..a5d5a115 100644 --- a/pr_agent/servers/gitlab_webhook.py +++ b/pr_agent/servers/gitlab_webhook.py @@ -38,7 +38,7 @@ async def gitlab_webhook(background_tasks: BackgroundTasks, request: Request): try: secret_dict = json.loads(secret) gitlab_token = secret_dict["gitlab_token"] - log_context["sender"] = secret_dict["id"] + log_context["sender"] = secret_dict.get("token_name", secret_dict.get("id", "unknown")) context["settings"] = copy.deepcopy(global_settings) context["settings"].gitlab.personal_access_token = gitlab_token except Exception as e: diff --git a/pr_agent/servers/help.py b/pr_agent/servers/help.py index 0f3f3caa..c32c5666 100644 --- a/pr_agent/servers/help.py +++ b/pr_agent/servers/help.py @@ -1,12 +1,14 @@ -commands_text = "> **/review [-i]**: Request a review of your Pull Request. For an incremental review, which only " \ - "considers changes since the last review, include the '-i' option.\n" \ - "> **/describe**: Modify the PR title and description based on the contents of the PR.\n" \ - "> **/improve [--extended]**: Suggest improvements to the code in the PR. Extended mode employs several calls, and provides a more thorough feedback. \n" \ - "> **/ask \\**: Pose a question about the PR.\n" \ - "> **/update_changelog**: Update the changelog based on the PR's contents.\n\n" \ - ">To edit any configuration parameter from **configuration.toml**, add --config_path=new_value\n" \ +commands_text = "> **/review**: Request a review of your Pull Request.\n" \ + "> **/describe**: Update the PR title and description based on the contents of the PR.\n" \ + "> **/improve [--extended]**: Suggest code improvements. 
Extended mode provides a higher quality feedback.\n" \ + "> **/ask \\**: Ask a question about the PR.\n" \ + "> **/update_changelog**: Update the changelog based on the PR's contents.\n" \ + "> **/add_docs**: Generate docstring for new components introduced in the PR.\n" \ + "> **/generate_labels**: Generate labels for the PR based on the PR's contents.\n" \ + "> see the [tools guide](https://github.com/Codium-ai/pr-agent/blob/main/docs/TOOLS_GUIDE.md) for more details.\n\n" \ + ">To edit any configuration parameter from the [configuration.toml](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/configuration.toml), add --config_path=new_value.\n" \ ">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" \n" \ - ">To list the possible configuration parameters, use the **/config** command.\n" \ + ">To list the possible configuration parameters, add a **/config** comment.\n" \ def bot_help_text(user: str): diff --git a/pr_agent/servers/serverless.py b/pr_agent/servers/serverless.py index b03d9171..c0bce606 100644 --- a/pr_agent/servers/serverless.py +++ b/pr_agent/servers/serverless.py @@ -3,10 +3,8 @@ from mangum import Mangum from starlette.middleware import Middleware from starlette_context.middleware import RawContextMiddleware -from pr_agent.log import setup_logger from pr_agent.servers.github_app import router -setup_logger() middleware = [Middleware(RawContextMiddleware)] app = FastAPI(middleware=middleware) diff --git a/pr_agent/settings/.secrets_template.toml b/pr_agent/settings/.secrets_template.toml index b6b11cd4..ba51382c 100644 --- a/pr_agent/settings/.secrets_template.toml +++ b/pr_agent/settings/.secrets_template.toml @@ -36,6 +36,10 @@ api_base = "" # the base url for your huggingface inference endpoint [ollama] api_base = "" # the base url for your local Llama 2, Code Llama, and other models inference endpoint. Acquire through https://ollama.ai/ +[vertexai] +vertex_project = "" # the google cloud platform project name for your vertexai deployment +vertex_location = "" # the google cloud platform location for your vertexai deployment + [github] # ---- Set the following only for deployment type == "user" user_token = "" # A GitHub personal access token with 'repo' scope. diff --git a/pr_agent/settings/configuration.toml b/pr_agent/settings/configuration.toml index dd863ebb..38e96fd1 100644 --- a/pr_agent/settings/configuration.toml +++ b/pr_agent/settings/configuration.toml @@ -16,11 +16,13 @@ secret_provider="google_cloud_storage" cli_mode=false [pr_reviewer] # /review # +# enable/disable features require_focused_review=false require_score_review=false require_tests_review=true require_security_review=true require_estimate_effort_to_review=true +# general options num_code_suggestions=4 inline_code_comments = false ask_and_reflect=false @@ -28,6 +30,9 @@ automatic_review=true remove_previous_review_comment=false persistent_comment=true extra_instructions = "" +# review labels +enable_review_labels_security=true +enable_review_labels_effort=false # specific configurations for incremental review (/review -i) require_all_thresholds_for_incremental_review=false minimal_commits_for_incremental_review=0 @@ -74,6 +79,7 @@ extra_instructions = "" # The type of deployment to create. Valid values are 'app' or 'user'. 
deployment_type = "user" ratelimit_retries = 5 +base_url = "https://api.github.com" [github_action] # auto_review = true # set as env var in .github/workflows/pr-agent.yaml diff --git a/pr_agent/settings/pr_code_suggestions_prompts.toml b/pr_agent/settings/pr_code_suggestions_prompts.toml index a3eb93a1..42ec7441 100644 --- a/pr_agent/settings/pr_code_suggestions_prompts.toml +++ b/pr_agent/settings/pr_code_suggestions_prompts.toml @@ -90,16 +90,19 @@ Code suggestions: Example output: ```yaml Code suggestions: - - relevant file: |- - src/file1.py - suggestion content: |- - Add a docstring to func1() - existing code: |- - def func1(): - relevant lines start: 12 - relevant lines end: 12 - improved code: |- - ... +- relevant file: |- + src/file1.py + suggestion content: |- + Add a docstring to func1() + existing code: |- + def func1(): + relevant lines start: |- + 12 + relevant lines end: |- + 12 + improved code: |- + ... +... ``` diff --git a/pr_agent/settings/pr_custom_labels.toml b/pr_agent/settings/pr_custom_labels.toml index f61a208c..976258dc 100644 --- a/pr_agent/settings/pr_custom_labels.toml +++ b/pr_agent/settings/pr_custom_labels.toml @@ -13,6 +13,7 @@ Extra instructions from the user: ' {% endif %} + The output must be a YAML object equivalent to type $Labels, according to the following Pydantic definitions: ' {%- if enable_custom_labels %} diff --git a/pr_agent/settings/pr_description_prompts.toml b/pr_agent/settings/pr_description_prompts.toml index 514a1991..2a51b324 100644 --- a/pr_agent/settings/pr_description_prompts.toml +++ b/pr_agent/settings/pr_description_prompts.toml @@ -27,6 +27,7 @@ class PRType(str, Enum): {%- if enable_custom_labels %} {{ custom_labels_class }} + {%- endif %} class FileWalkthrough(BaseModel): diff --git a/pr_agent/settings/pr_reviewer_prompts.toml b/pr_agent/settings/pr_reviewer_prompts.toml index 103d5e14..b75c296a 100644 --- a/pr_agent/settings/pr_reviewer_prompts.toml +++ b/pr_agent/settings/pr_reviewer_prompts.toml @@ -93,7 +93,7 @@ PR Analysis: description: >- Estimate, on a scale of 1-5 (inclusive), the time and effort required to review this PR by an experienced and knowledgeable developer. 1 means short and easy review , 5 means long and hard review. Take into account the size, complexity, quality, and the needed changes of the PR code diff. - Explain your answer shortly (1-2 sentences). + Explain your answer shortly (1-2 sentences). Use the format: '1, because ...' {%- endif %} PR Feedback: General suggestions: @@ -130,7 +130,8 @@ PR Feedback: Security concerns: type: string description: >- - yes\\no question: does this PR code introduce possible vulnerabilities such as exposure of sensitive information (e.g., API keys, secrets, passwords), or security concerns like SQL injection, XSS, CSRF, and others ? If answered 'yes', explain your answer briefly. + does this PR code introduce possible vulnerabilities such as exposure of sensitive information (e.g., API keys, secrets, passwords), or security concerns like SQL injection, XSS, CSRF, and others ? Answer 'No' if there are no possible issues. + Answer 'Yes, because ...' if there are security concerns or issues. Explain your answer shortly. 
{%- endif %} ``` diff --git a/pr_agent/tools/pr_description.py b/pr_agent/tools/pr_description.py index c0eb6606..0e7244d3 100644 --- a/pr_agent/tools/pr_description.py +++ b/pr_agent/tools/pr_description.py @@ -157,6 +157,9 @@ class PRDescription: user=user_prompt ) + if get_settings().config.verbosity_level >= 2: + get_logger().info(f"\nAI response:\n{response}") + return response def _prepare_data(self): diff --git a/pr_agent/tools/pr_reviewer.py b/pr_agent/tools/pr_reviewer.py index 5b8e5472..c3e35295 100644 --- a/pr_agent/tools/pr_reviewer.py +++ b/pr_agent/tools/pr_reviewer.py @@ -10,7 +10,7 @@ from yaml import SafeLoader from pr_agent.algo.ai_handler import AiHandler from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models from pr_agent.algo.token_handler import TokenHandler -from pr_agent.algo.utils import convert_to_markdown, load_yaml, try_fix_yaml, set_custom_labels +from pr_agent.algo.utils import convert_to_markdown, load_yaml, try_fix_yaml, set_custom_labels, get_user_labels from pr_agent.config_loader import get_settings from pr_agent.git_providers import get_git_provider from pr_agent.git_providers.git_provider import IncrementalPR, get_main_pr_language @@ -121,8 +121,8 @@ class PRReviewer: # publish the review if get_settings().pr_reviewer.persistent_comment and not self.incremental.is_incremental: self.git_provider.publish_persistent_comment(pr_comment, - initial_text="## PR Analysis", - updated_text="## PR Analysis (updated)") + initial_header="## PR Analysis", + update_header=True) else: self.git_provider.publish_comment(pr_comment) @@ -178,6 +178,9 @@ class PRReviewer: user=user_prompt ) + if get_settings().config.verbosity_level >= 2: + get_logger().info(f"\nAI response:\n{response}") + return response def _prepare_pr_review(self) -> str: @@ -246,11 +249,18 @@ class PRReviewer: # Add help text if not in CLI mode if not get_settings().get("CONFIG.CLI_MODE", False): markdown_text += "\n### How to use\n" + if self.git_provider.is_supported("gfm_markdown"): + markdown_text += "\n**
Instructions**\n" bot_user = "[bot]" if get_settings().github_app.override_deployment_type else get_settings().github_app.bot_user if user and bot_user not in user: markdown_text += bot_help_text(user) else: markdown_text += actions_help_text + if self.git_provider.is_supported("gfm_markdown"): + markdown_text += "\n
\n" + + # Add custom labels from the review prediction (effort, security) + self.set_review_labels(data) # Log markdown response if verbosity level is high if get_settings().config.verbosity_level >= 2: @@ -268,14 +278,7 @@ class PRReviewer: if get_settings().pr_reviewer.num_code_suggestions == 0: return - review_text = self.prediction.strip() - review_text = review_text.removeprefix('```yaml').rstrip('`') - try: - data = yaml.load(review_text, Loader=SafeLoader) - except Exception as e: - get_logger().error(f"Failed to parse AI prediction: {e}") - data = try_fix_yaml(review_text) - + data = load_yaml(self.prediction.strip()) comments: List[str] = [] for suggestion in data.get('PR Feedback', {}).get('Code feedback', []): relevant_file = suggestion.get('relevant file', '').strip() @@ -372,3 +375,28 @@ class PRReviewer: ) return False return True + + def set_review_labels(self, data): + if (get_settings().pr_reviewer.enable_review_labels_security or + get_settings().pr_reviewer.enable_review_labels_effort): + try: + review_labels = [] + if get_settings().pr_reviewer.enable_review_labels_effort: + estimated_effort = data['PR Analysis']['Estimated effort to review [1-5]'] + estimated_effort_number = int(estimated_effort.split(',')[0]) + if 1 <= estimated_effort_number <= 5: # 1, because ... + review_labels.append(f'Review effort [1-5]: {estimated_effort_number}') + if get_settings().pr_reviewer.enable_review_labels_security: + security_concerns = data['PR Analysis']['Security concerns'] # yes, because ... + security_concerns_bool = 'yes' in security_concerns.lower() or 'true' in security_concerns.lower() + if security_concerns_bool: + review_labels.append('Possible security concern') + + if review_labels: + current_labels = self.git_provider.get_labels() + current_labels_filtered = [label for label in current_labels if + not label.lower().startswith('review effort [1-5]:') and not label.lower().startswith( + 'possible security concern')] + self.git_provider.publish_labels(review_labels + current_labels_filtered) + except Exception as e: + get_logger().error(f"Failed to set review labels, error: {e}") diff --git a/pr_agent/tools/pr_similar_issue.py b/pr_agent/tools/pr_similar_issue.py index c717b59f..832c577f 100644 --- a/pr_agent/tools/pr_similar_issue.py +++ b/pr_agent/tools/pr_similar_issue.py @@ -8,6 +8,7 @@ import pinecone from pinecone_datasets import Dataset, DatasetMetadata from pydantic import BaseModel, Field +from pr_agent.algo import MAX_TOKENS from pr_agent.algo.token_handler import TokenHandler from pr_agent.algo.utils import get_max_tokens from pr_agent.config_loader import get_settings diff --git a/requirements.txt b/requirements.txt index 8589b30b..eae08f4c 100644 --- a/requirements.txt +++ b/requirements.txt @@ -13,7 +13,7 @@ atlassian-python-api==3.39.0 GitPython==3.1.32 PyYAML==6.0 starlette-context==0.3.6 -litellm~=0.1.574 +litellm==0.12.5 boto3==1.28.25 google-cloud-storage==2.10.0 ujson==5.8.0 @@ -22,3 +22,4 @@ msrest==0.7.1 pinecone-client pinecone-datasets @ git+https://github.com/mrT23/pinecone-datasets.git@main loguru==0.7.2 +google-cloud-aiplatform==1.35.0 diff --git a/tests/unittest/test_bitbucket_provider.py b/tests/unittest/test_bitbucket_provider.py index 3bb64a0c..e17a26ce 100644 --- a/tests/unittest/test_bitbucket_provider.py +++ b/tests/unittest/test_bitbucket_provider.py @@ -1,3 +1,4 @@ +from pr_agent.git_providers import BitbucketServerProvider from pr_agent.git_providers.bitbucket_provider import BitbucketProvider @@ -8,3 +9,10 @@ class 
TestBitbucketProvider: assert workspace_slug == "WORKSPACE_XYZ" assert repo_slug == "MY_TEST_REPO" assert pr_number == 321 + + def test_bitbucket_server_pr_url(self): + url = "https://git.onpreminstance.com/projects/AAA/repos/my-repo/pull-requests/1" + workspace_slug, repo_slug, pr_number = BitbucketServerProvider._parse_pr_url(url) + assert workspace_slug == "AAA" + assert repo_slug == "my-repo" + assert pr_number == 1 diff --git a/tests/unittest/test_load_yaml.py b/tests/unittest/test_load_yaml.py index a345aee2..a77c847b 100644 --- a/tests/unittest/test_load_yaml.py +++ b/tests/unittest/test_load_yaml.py @@ -2,6 +2,9 @@ # Generated by CodiumAI import pytest +import yaml +from yaml.scanner import ScannerError + from pr_agent.algo.utils import load_yaml @@ -12,7 +15,7 @@ class TestLoadYaml: expected_output = {'name': 'John Smith', 'age': 35} assert load_yaml(yaml_str) == expected_output - def test_load_complicated_yaml(self): + def test_load_invalid_yaml1(self): yaml_str = \ '''\ PR Analysis: @@ -26,7 +29,23 @@ PR Feedback: Code feedback: - relevant file: pr_agent/settings/pr_description_prompts.toml suggestion: Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium] - relevant line: 'user="""PR Info:' + relevant line: user="""PR Info: aaa Security concerns: No''' - expected_output = {'PR Analysis': {'Main theme': 'Enhancing the `/describe` command prompt by adding title and description', 'Type of PR': 'Enhancement', 'Relevant tests added': False, 'Focused PR': 'Yes, the PR is focused on enhancing the `/describe` command prompt.'}, 'PR Feedback': {'General suggestions': 'The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.', 'Code feedback': [{'relevant file': 'pr_agent/settings/pr_description_prompts.toml', 'suggestion': "Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium]", 'relevant line': 'user="""PR Info:'}], 'Security concerns': False}} + with pytest.raises(ScannerError): + yaml.safe_load(yaml_str) + + expected_output = {'PR Analysis': {'Main theme': 'Enhancing the `/describe` command prompt by adding title and description', 'Type of PR': 'Enhancement', 'Relevant tests added': False, 'Focused PR': 'Yes, the PR is focused on enhancing the `/describe` command prompt.'}, 'PR Feedback': {'General suggestions': 'The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.', 'Code feedback': [{'relevant file': 'pr_agent/settings/pr_description_prompts.toml', 'suggestion': "Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. 
[medium]", 'relevant line': 'user="""PR Info: aaa'}], 'Security concerns': False}} assert load_yaml(yaml_str) == expected_output + + def test_load_invalid_yaml2(self): + yaml_str = '''\ +- relevant file: src/app.py: + suggestion content: The print statement is outside inside the if __name__ ==: \ + ''' + with pytest.raises(ScannerError): + yaml.safe_load(yaml_str) + + expected_output =[{'relevant file': 'src/app.py:', + 'suggestion content': 'The print statement is outside inside the if __name__ ==: '}] + assert load_yaml(yaml_str) == expected_output +