mirror of
https://github.com/qodo-ai/pr-agent.git
synced 2025-07-03 04:10:49 +08:00
Merge branch 'main' into feature_azure_devops
This commit is contained in:
2
.github/workflows/build-and-test.yaml
vendored
2
.github/workflows/build-and-test.yaml
vendored
@ -2,6 +2,8 @@ name: Build-and-test
|
||||
|
||||
on:
|
||||
push:
|
||||
pull_request:
|
||||
types: [ opened, reopened ]
|
||||
|
||||
jobs:
|
||||
build-and-test:
|
||||
|
115
INSTALL.md
115
INSTALL.md
@ -18,6 +18,18 @@ docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> c
|
||||
```
|
||||
docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent --pr_url <pr_url> ask "<your question>"
|
||||
```
|
||||
Note: If you want to ensure you're running a specific version of the Docker image, consider using the image's digest.
|
||||
The digest is a unique identifier for a specific version of an image. You can pull and run an image using its digest by referencing it like so: repository@sha256:digest. Always ensure you're using the correct and trusted digest for your operations.
|
||||
|
||||
1. To request a review for a PR using a specific digest, run the following command:
|
||||
```bash
|
||||
docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent@sha256:71b5ee15df59c745d352d84752d01561ba64b6d51327f97d46152f0c58a5f678 --pr_url <pr_url> review
|
||||
```
|
||||
|
||||
2. To ask a question about a PR using the same digest, run the following command:
|
||||
```bash
|
||||
docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent@sha256:71b5ee15df59c745d352d84752d01561ba64b6d51327f97d46152f0c58a5f678 --pr_url <pr_url> ask "<your question>"
|
||||
```
|
||||
|
||||
Possible questions you can ask include:
|
||||
|
||||
@ -42,6 +54,10 @@ on:
|
||||
jobs:
|
||||
pr_agent_job:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
issues: write
|
||||
pull-requests: write
|
||||
contents: write
|
||||
name: Run pr agent on every pull request, respond to user comments
|
||||
steps:
|
||||
- name: PR Agent action step
|
||||
@ -51,7 +67,28 @@ jobs:
|
||||
OPENAI_KEY: ${{ secrets.OPENAI_KEY }}
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
```
|
||||
** if you want to pin your action to a specific commit for stability reasons
|
||||
```yaml
|
||||
on:
|
||||
pull_request:
|
||||
issue_comment:
|
||||
|
||||
jobs:
|
||||
pr_agent_job:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
issues: write
|
||||
pull-requests: write
|
||||
contents: write
|
||||
name: Run pr agent on every pull request, respond to user comments
|
||||
steps:
|
||||
- name: PR Agent action step
|
||||
id: pragent
|
||||
uses: Codium-ai/pr-agent@<commit_sha>
|
||||
env:
|
||||
OPENAI_KEY: ${{ secrets.OPENAI_KEY }}
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
```
|
||||
2. Add the following secret to your repository under `Settings > Secrets`:
|
||||
|
||||
```
|
||||
@ -92,6 +129,7 @@ pip install -r requirements.txt
|
||||
|
||||
```
|
||||
cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
|
||||
chmod 600 pr_agent/settings/.secrets.toml
|
||||
# Edit .secrets.toml file
|
||||
```
|
||||
|
||||
@ -105,6 +143,14 @@ python pr_agent/cli.py --pr_url <pr_url> describe
|
||||
python pr_agent/cli.py --pr_url <pr_url> improve
|
||||
```
|
||||
|
||||
5. **Debugging LLM API Calls**
|
||||
If you're testing your codium/pr-agent server, and need to see if calls were made successfully + the exact call logs, you can use the [LiteLLM Debugger tool](https://docs.litellm.ai/docs/debugging/hosted_debugging).
|
||||
|
||||
You can do this by setting `litellm_debugger=true` in configuration.toml. Your Logs will be viewable in real-time @ `admin.litellm.ai/<your_email>`. Set your email in the `.secrets.toml` under 'user_email'.
|
||||
|
||||
<img src="./pics/debugger.png" width="900"/>
|
||||
|
||||
|
||||
---
|
||||
|
||||
#### Method 4: Run as a polling server
|
||||
@ -128,6 +174,7 @@ Allowing you to automate the review process on your private or public repositori
|
||||
- Pull requests: Read & write
|
||||
- Issue comment: Read & write
|
||||
- Metadata: Read-only
|
||||
- Contents: Read-only
|
||||
- Set the following events:
|
||||
- Issue comment
|
||||
- Pull request
|
||||
@ -216,3 +263,71 @@ docker push codiumai/pr-agent:github_app # Push to your Docker repository
|
||||
5. Configure the lambda function to have a Function URL.
|
||||
6. Go back to steps 8-9 of [Method 5](#method-5-run-as-a-github-app) with the function url as your Webhook URL.
|
||||
The Webhook URL would look like `https://<LAMBDA_FUNCTION_URL>/api/v1/github_webhooks`
|
||||
|
||||
---
|
||||
|
||||
#### AWS CodeCommit Setup
|
||||
|
||||
Not all features have been added to CodeCommit yet. As of right now, CodeCommit has been implemented to run the pr-agent CLI on the command line, using AWS credentials stored in environment variables. (More features will be added in the future.) The following is a set of instructions to have pr-agent do a review of your CodeCommit pull request from the command line:
|
||||
|
||||
1. Create an IAM user that you will use to read CodeCommit pull requests and post comments
|
||||
* Note: That user should have CLI access only, not Console access
|
||||
2. Add IAM permissions to that user, to allow access to CodeCommit (see IAM Role example below)
|
||||
3. Generate an Access Key for your IAM user
|
||||
4. Set the Access Key and Secret using environment variables (see Access Key example below)
|
||||
5. Set the `git_provider` value to `codecommit` in the `pr_agent/settings/configuration.toml` settings file
|
||||
6. Set the `PYTHONPATH` to include your `pr-agent` project directory
|
||||
* Option A: Add `PYTHONPATH="/PATH/TO/PROJECTS/pr-agent` to your `.env` file
|
||||
* Option B: Set `PYTHONPATH` and run the CLI in one command, for example:
|
||||
* `PYTHONPATH="/PATH/TO/PROJECTS/pr-agent python pr_agent/cli.py [--ARGS]`
|
||||
|
||||
#### AWS CodeCommit IAM Role Example
|
||||
|
||||
Example IAM permissions to that user to allow access to CodeCommit:
|
||||
|
||||
* Note: The following is a working example of IAM permissions that has read access to the repositories and write access to allow posting comments
|
||||
* Note: If you only want pr-agent to review your pull requests, you can tighten the IAM permissions further, however this IAM example will work, and allow the pr-agent to post comments to the PR
|
||||
* Note: You may want to replace the `"Resource": "*"` with your list of repos, to limit access to only those repos
|
||||
|
||||
```
|
||||
{
|
||||
"Version": "2012-10-17",
|
||||
"Statement": [
|
||||
{
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"codecommit:BatchDescribe*",
|
||||
"codecommit:BatchGet*",
|
||||
"codecommit:Describe*",
|
||||
"codecommit:EvaluatePullRequestApprovalRules",
|
||||
"codecommit:Get*",
|
||||
"codecommit:List*",
|
||||
"codecommit:PostComment*",
|
||||
"codecommit:PutCommentReaction"
|
||||
],
|
||||
"Resource": "*"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
#### AWS CodeCommit Access Key and Secret
|
||||
|
||||
Example setting the Access Key and Secret using environment variables
|
||||
|
||||
```sh
|
||||
export AWS_ACCESS_KEY_ID="XXXXXXXXXXXXXXXX"
|
||||
export AWS_SECRET_ACCESS_KEY="XXXXXXXXXXXXXXXX"
|
||||
export AWS_DEFAULT_REGION="us-east-1"
|
||||
```
|
||||
|
||||
#### AWS CodeCommit CLI Example
|
||||
|
||||
After you set up AWS CodeCommit using the instructions above, here is an example CLI run that tells pr-agent to **review** a given pull request.
|
||||
(Replace your specific PYTHONPATH and PR URL in the example)
|
||||
|
||||
```sh
|
||||
PYTHONPATH="/PATH/TO/PROJECTS/pr-agent" python pr_agent/cli.py \
|
||||
--pr_url https://us-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/MY_REPO_NAME/pull-requests/321 \
|
||||
review
|
||||
```
|
||||
|
42
README.md
42
README.md
@ -75,26 +75,26 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
|
||||
|
||||
## Overview
|
||||
`PR-Agent` offers extensive pull request functionalities across various git providers:
|
||||
| | | GitHub | Gitlab | Bitbucket |
|
||||
|-------|---------------------------------------------|:------:|:------:|:---------:|
|
||||
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | ⮑ Inline review | :white_check_mark: | :white_check_mark: | |
|
||||
| | Ask | :white_check_mark: | :white_check_mark: | |
|
||||
| | Auto-Description | :white_check_mark: | :white_check_mark: | |
|
||||
| | Improve Code | :white_check_mark: | :white_check_mark: | |
|
||||
| | Reflect and Review | :white_check_mark: | | |
|
||||
| | Update CHANGELOG.md | :white_check_mark: | | |
|
||||
| | | | | |
|
||||
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | App / webhook | :white_check_mark: | :white_check_mark: | |
|
||||
| | Tagging bot | :white_check_mark: | | |
|
||||
| | Actions | :white_check_mark: | | |
|
||||
| | | | | |
|
||||
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Incremental PR Review | :white_check_mark: | | |
|
||||
| | | GitHub | Gitlab | Bitbucket | CodeCommit |
|
||||
|-------|---------------------------------------------|:------:|:------:|:---------:|:----------:|
|
||||
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | ⮑ Inline review | :white_check_mark: | :white_check_mark: | | |
|
||||
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Auto-Description | :white_check_mark: | :white_check_mark: | | |
|
||||
| | Improve Code | :white_check_mark: | :white_check_mark: | | |
|
||||
| | Reflect and Review | :white_check_mark: | | | |
|
||||
| | Update CHANGELOG.md | :white_check_mark: | | | |
|
||||
| | | | | | |
|
||||
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | App / webhook | :white_check_mark: | :white_check_mark: | | |
|
||||
| | Tagging bot | :white_check_mark: | | | |
|
||||
| | Actions | :white_check_mark: | | | |
|
||||
| | | | | | |
|
||||
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
|
||||
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
|
||||
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
|
||||
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
|
||||
| | Incremental PR Review | :white_check_mark: | | | |
|
||||
|
||||
Examples for invoking the different tools via the CLI:
|
||||
- **Review**: python cli.py --pr_url=<pr_url> review
|
||||
@ -153,7 +153,7 @@ Here are some advantages of PR-Agent:
|
||||
- We emphasize **real-life practical usage**. Each tool (review, improve, ask, ...) has a single GPT-4 call, no more. We feel that this is critical for realistic team usage - obtaining an answer quickly (~30 seconds) and affordably.
|
||||
- Our [PR Compression strategy](./PR_COMPRESSION.md) is a core ability that enables to effectively tackle both short and long PRs.
|
||||
- Our JSON prompting strategy enables to have **modular, customizable tools**. For example, the '/review' tool categories can be controlled via the [configuration](./CONFIGURATION.md) file. Adding additional categories is easy and accessible.
|
||||
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
|
||||
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket, CodeCommit), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
|
||||
- We are open-source, and welcome contributions from the community.
|
||||
|
||||
|
||||
|
BIN
pics/debugger.png
Normal file
BIN
pics/debugger.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 534 KiB |
@ -15,6 +15,7 @@ from pr_agent.tools.pr_update_changelog import PRUpdateChangelog
|
||||
from pr_agent.tools.pr_config import PRConfig
|
||||
|
||||
command2class = {
|
||||
"auto_review": PRReviewer,
|
||||
"answer": PRReviewer,
|
||||
"review": PRReviewer,
|
||||
"review_pr": PRReviewer,
|
||||
@ -70,6 +71,8 @@ class PRAgent:
|
||||
if notify:
|
||||
notify()
|
||||
await PRReviewer(pr_url, is_answer=True, args=args).run()
|
||||
elif action == "auto_review":
|
||||
await PRReviewer(pr_url, is_auto=True, args=args).run()
|
||||
elif action in command2class:
|
||||
if notify:
|
||||
notify()
|
||||
|
@ -26,10 +26,10 @@ class AiHandler:
|
||||
try:
|
||||
openai.api_key = get_settings().openai.key
|
||||
litellm.openai_key = get_settings().openai.key
|
||||
litellm.debugger = get_settings().config.litellm_debugger
|
||||
self.azure = False
|
||||
if get_settings().get("OPENAI.ORG", None):
|
||||
litellm.organization = get_settings().openai.org
|
||||
self.deployment_id = get_settings().get("OPENAI.DEPLOYMENT_ID", None)
|
||||
if get_settings().get("OPENAI.API_TYPE", None):
|
||||
if get_settings().openai.api_type == "azure":
|
||||
self.azure = True
|
||||
@ -44,12 +44,23 @@ class AiHandler:
|
||||
litellm.cohere_key = get_settings().cohere.key
|
||||
if get_settings().get("REPLICATE.KEY", None):
|
||||
litellm.replicate_key = get_settings().replicate.key
|
||||
if get_settings().get("REPLICATE.KEY", None):
|
||||
litellm.replicate_key = get_settings().replicate.key
|
||||
if get_settings().get("HUGGINGFACE.KEY", None):
|
||||
litellm.huggingface_key = get_settings().huggingface.key
|
||||
except AttributeError as e:
|
||||
raise ValueError("OpenAI key is required") from e
|
||||
|
||||
@property
|
||||
def deployment_id(self):
|
||||
"""
|
||||
Returns the deployment ID for the OpenAI API.
|
||||
"""
|
||||
return get_settings().get("OPENAI.DEPLOYMENT_ID", None)
|
||||
|
||||
@retry(exceptions=(APIError, Timeout, TryAgain, AttributeError, RateLimitError),
|
||||
tries=OPENAI_RETRIES, delay=2, backoff=2, jitter=(1, 3))
|
||||
async def chat_completion(self, model: str, temperature: float, system: str, user: str):
|
||||
async def chat_completion(self, model: str, system: str, user: str, temperature: float = 0.2):
|
||||
"""
|
||||
Performs a chat completion using the OpenAI ChatCompletion API.
|
||||
Retries in case of API errors or timeouts.
|
||||
@ -70,15 +81,22 @@ class AiHandler:
|
||||
TryAgain: If there is an attribute error during OpenAI inference.
|
||||
"""
|
||||
try:
|
||||
deployment_id = self.deployment_id
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.debug(
|
||||
f"Generating completion with {model}"
|
||||
f"{(' from deployment ' + deployment_id) if deployment_id else ''}"
|
||||
)
|
||||
if self.azure:
|
||||
model = self.azure + "/" + model
|
||||
response = await acompletion(
|
||||
model=model,
|
||||
deployment_id=self.deployment_id,
|
||||
deployment_id=deployment_id,
|
||||
messages=[
|
||||
{"role": "system", "content": system},
|
||||
{"role": "user", "content": user}
|
||||
],
|
||||
temperature=temperature,
|
||||
azure=self.azure,
|
||||
force_timeout=get_settings().config.ai_timeout
|
||||
)
|
||||
except (APIError, Timeout, TryAgain) as e:
|
||||
|
@ -1,5 +1,4 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import re
|
||||
|
||||
@ -157,7 +156,7 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
|
||||
example output:
|
||||
## src/file.ts
|
||||
--new hunk--
|
||||
__new hunk__
|
||||
881 line1
|
||||
882 line2
|
||||
883 line3
|
||||
@ -166,7 +165,7 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
889 line6
|
||||
890 line7
|
||||
...
|
||||
--old hunk--
|
||||
__old hunk__
|
||||
line1
|
||||
line2
|
||||
- line3
|
||||
@ -176,8 +175,7 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
...
|
||||
"""
|
||||
|
||||
patch_with_lines_str = f"## {file.filename}\n"
|
||||
import re
|
||||
patch_with_lines_str = f"\n\n## {file.filename}\n"
|
||||
patch_lines = patch.splitlines()
|
||||
RE_HUNK_HEADER = re.compile(
|
||||
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
|
||||
@ -185,23 +183,30 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
old_content_lines = []
|
||||
match = None
|
||||
start1, size1, start2, size2 = -1, -1, -1, -1
|
||||
prev_header_line = []
|
||||
header_line =[]
|
||||
for line in patch_lines:
|
||||
if 'no newline at end of file' in line.lower():
|
||||
continue
|
||||
|
||||
if line.startswith('@@'):
|
||||
header_line = line
|
||||
match = RE_HUNK_HEADER.match(line)
|
||||
if match and new_content_lines: # found a new hunk, split the previous lines
|
||||
if new_content_lines:
|
||||
patch_with_lines_str += '\n--new hunk--\n'
|
||||
if prev_header_line:
|
||||
patch_with_lines_str += f'\n{prev_header_line}\n'
|
||||
patch_with_lines_str += '__new hunk__\n'
|
||||
for i, line_new in enumerate(new_content_lines):
|
||||
patch_with_lines_str += f"{start2 + i} {line_new}\n"
|
||||
if old_content_lines:
|
||||
patch_with_lines_str += '--old hunk--\n'
|
||||
patch_with_lines_str += '__old hunk__\n'
|
||||
for line_old in old_content_lines:
|
||||
patch_with_lines_str += f"{line_old}\n"
|
||||
new_content_lines = []
|
||||
old_content_lines = []
|
||||
if match:
|
||||
prev_header_line = header_line
|
||||
try:
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
except: # '@@ -0,0 +1 @@' case
|
||||
@ -219,12 +224,13 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
# finishing last hunk
|
||||
if match and new_content_lines:
|
||||
if new_content_lines:
|
||||
patch_with_lines_str += '\n--new hunk--\n'
|
||||
patch_with_lines_str += f'\n{header_line}\n'
|
||||
patch_with_lines_str += '\n__new hunk__\n'
|
||||
for i, line_new in enumerate(new_content_lines):
|
||||
patch_with_lines_str += f"{start2 + i} {line_new}\n"
|
||||
if old_content_lines:
|
||||
patch_with_lines_str += '\n--old hunk--\n'
|
||||
patch_with_lines_str += '\n__old hunk__\n'
|
||||
for line_old in old_content_lines:
|
||||
patch_with_lines_str += f"{line_old}\n"
|
||||
|
||||
return patch_with_lines_str.strip()
|
||||
return patch_with_lines_str.rstrip()
|
||||
|
@ -57,7 +57,7 @@ def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler, model: s
|
||||
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), diff_files)
|
||||
|
||||
# generate a standard diff string, with patch extension
|
||||
patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler,
|
||||
patches_extended, total_tokens, patches_extended_tokens = pr_generate_extended_diff(pr_languages, token_handler,
|
||||
add_line_numbers_to_hunks)
|
||||
|
||||
# if we are under the limit, return the full diff
|
||||
@ -78,9 +78,9 @@ def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler, model: s
|
||||
return final_diff
|
||||
|
||||
|
||||
def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
|
||||
add_line_numbers_to_hunks: bool) -> \
|
||||
Tuple[list, int]:
|
||||
def pr_generate_extended_diff(pr_languages: list,
|
||||
token_handler: TokenHandler,
|
||||
add_line_numbers_to_hunks: bool) -> Tuple[list, int, list]:
|
||||
"""
|
||||
Generate a standard diff string with patch extension, while counting the number of tokens used and applying diff
|
||||
minimization techniques if needed.
|
||||
@ -90,13 +90,10 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
|
||||
files.
|
||||
- token_handler: An object of the TokenHandler class used for handling tokens in the context of the pull request.
|
||||
- add_line_numbers_to_hunks: A boolean indicating whether to add line numbers to the hunks in the diff.
|
||||
|
||||
Returns:
|
||||
- patches_extended: A list of extended patches for each file in the pull request.
|
||||
- total_tokens: The total number of tokens used in the extended patches.
|
||||
"""
|
||||
total_tokens = token_handler.prompt_tokens # initial tokens
|
||||
patches_extended = []
|
||||
patches_extended_tokens = []
|
||||
for lang in pr_languages:
|
||||
for file in lang['files']:
|
||||
original_file_content_str = file.base_file
|
||||
@ -106,7 +103,7 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
|
||||
|
||||
# extend each patch with extra lines of context
|
||||
extended_patch = extend_patch(original_file_content_str, patch, num_lines=PATCH_EXTRA_LINES)
|
||||
full_extended_patch = f"## {file.filename}\n\n{extended_patch}\n"
|
||||
full_extended_patch = f"\n\n## {file.filename}\n\n{extended_patch}\n"
|
||||
|
||||
if add_line_numbers_to_hunks:
|
||||
full_extended_patch = convert_to_hunks_with_lines_numbers(extended_patch, file)
|
||||
@ -114,9 +111,10 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
|
||||
patch_tokens = token_handler.count_tokens(full_extended_patch)
|
||||
file.tokens = patch_tokens
|
||||
total_tokens += patch_tokens
|
||||
patches_extended_tokens.append(patch_tokens)
|
||||
patches_extended.append(full_extended_patch)
|
||||
|
||||
return patches_extended, total_tokens
|
||||
return patches_extended, total_tokens, patches_extended_tokens
|
||||
|
||||
|
||||
def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, model: str,
|
||||
@ -208,18 +206,45 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
|
||||
|
||||
|
||||
async def retry_with_fallback_models(f: Callable):
|
||||
all_models = _get_all_models()
|
||||
all_deployments = _get_all_deployments(all_models)
|
||||
# try each (model, deployment_id) pair until one is successful, otherwise raise exception
|
||||
for i, (model, deployment_id) in enumerate(zip(all_models, all_deployments)):
|
||||
try:
|
||||
get_settings().set("openai.deployment_id", deployment_id)
|
||||
return await f(model)
|
||||
except Exception as e:
|
||||
logging.warning(
|
||||
f"Failed to generate prediction with {model}"
|
||||
f"{(' from deployment ' + deployment_id) if deployment_id else ''}: "
|
||||
f"{traceback.format_exc()}"
|
||||
)
|
||||
if i == len(all_models) - 1: # If it's the last iteration
|
||||
raise # Re-raise the last exception
|
||||
|
||||
|
||||
def _get_all_models() -> List[str]:
|
||||
model = get_settings().config.model
|
||||
fallback_models = get_settings().config.fallback_models
|
||||
if not isinstance(fallback_models, list):
|
||||
fallback_models = [fallback_models]
|
||||
fallback_models = [m.strip() for m in fallback_models.split(",")]
|
||||
all_models = [model] + fallback_models
|
||||
for i, model in enumerate(all_models):
|
||||
try:
|
||||
return await f(model)
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to generate prediction with {model}: {traceback.format_exc()}")
|
||||
if i == len(all_models) - 1: # If it's the last iteration
|
||||
raise # Re-raise the last exception
|
||||
return all_models
|
||||
|
||||
|
||||
def _get_all_deployments(all_models: List[str]) -> List[str]:
|
||||
deployment_id = get_settings().get("openai.deployment_id", None)
|
||||
fallback_deployments = get_settings().get("openai.fallback_deployments", [])
|
||||
if not isinstance(fallback_deployments, list) and fallback_deployments:
|
||||
fallback_deployments = [d.strip() for d in fallback_deployments.split(",")]
|
||||
if fallback_deployments:
|
||||
all_deployments = [deployment_id] + fallback_deployments
|
||||
if len(all_deployments) < len(all_models):
|
||||
raise ValueError(f"The number of deployments ({len(all_deployments)}) "
|
||||
f"is less than the number of models ({len(all_models)})")
|
||||
else:
|
||||
all_deployments = [deployment_id] * len(all_models)
|
||||
return all_deployments
|
||||
|
||||
|
||||
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
|
||||
@ -297,7 +322,9 @@ def clip_tokens(text: str, max_tokens: int) -> str:
|
||||
Returns:
|
||||
str: The clipped string.
|
||||
"""
|
||||
# We'll estimate the number of tokens by hueristically assuming 2.5 tokens per word
|
||||
if not text:
|
||||
return text
|
||||
|
||||
try:
|
||||
encoder = get_token_encoder()
|
||||
num_input_tokens = len(encoder.encode(text))
|
||||
@ -310,4 +337,84 @@ def clip_tokens(text: str, max_tokens: int) -> str:
|
||||
return clipped_text
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to clip tokens: {e}")
|
||||
return text
|
||||
return text
|
||||
|
||||
|
||||
def get_pr_multi_diffs(git_provider: GitProvider,
|
||||
token_handler: TokenHandler,
|
||||
model: str,
|
||||
max_calls: int = 5) -> List[str]:
|
||||
"""
|
||||
Retrieves the diff files from a Git provider, sorts them by main language, and generates patches for each file.
|
||||
The patches are split into multiple groups based on the maximum number of tokens allowed for the given model.
|
||||
|
||||
Args:
|
||||
git_provider (GitProvider): An object that provides access to Git provider APIs.
|
||||
token_handler (TokenHandler): An object that handles tokens in the context of a pull request.
|
||||
model (str): The name of the model.
|
||||
max_calls (int, optional): The maximum number of calls to retrieve diff files. Defaults to 5.
|
||||
|
||||
Returns:
|
||||
List[str]: A list of final diff strings, split into multiple groups based on the maximum number of tokens allowed for the given model.
|
||||
|
||||
Raises:
|
||||
RateLimitExceededException: If the rate limit for the Git provider API is exceeded.
|
||||
"""
|
||||
try:
|
||||
diff_files = git_provider.get_diff_files()
|
||||
except RateLimitExceededException as e:
|
||||
logging.error(f"Rate limit exceeded for git provider API. original message {e}")
|
||||
raise
|
||||
|
||||
# Sort files by main language
|
||||
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), diff_files)
|
||||
|
||||
# Sort files within each language group by tokens in descending order
|
||||
sorted_files = []
|
||||
for lang in pr_languages:
|
||||
sorted_files.extend(sorted(lang['files'], key=lambda x: x.tokens, reverse=True))
|
||||
|
||||
patches = []
|
||||
final_diff_list = []
|
||||
total_tokens = token_handler.prompt_tokens
|
||||
call_number = 1
|
||||
for file in sorted_files:
|
||||
if call_number > max_calls:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Reached max calls ({max_calls})")
|
||||
break
|
||||
|
||||
original_file_content_str = file.base_file
|
||||
new_file_content_str = file.head_file
|
||||
patch = file.patch
|
||||
if not patch:
|
||||
continue
|
||||
|
||||
# Remove delete-only hunks
|
||||
patch = handle_patch_deletions(patch, original_file_content_str, new_file_content_str, file.filename)
|
||||
if patch is None:
|
||||
continue
|
||||
|
||||
patch = convert_to_hunks_with_lines_numbers(patch, file)
|
||||
new_patch_tokens = token_handler.count_tokens(patch)
|
||||
if patch and (total_tokens + new_patch_tokens > MAX_TOKENS[model] - OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD):
|
||||
final_diff = "\n".join(patches)
|
||||
final_diff_list.append(final_diff)
|
||||
patches = []
|
||||
total_tokens = token_handler.prompt_tokens
|
||||
call_number += 1
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Call number: {call_number}")
|
||||
|
||||
if patch:
|
||||
patches.append(patch)
|
||||
total_tokens += new_patch_tokens
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Tokens: {total_tokens}, last filename: {file.filename}")
|
||||
|
||||
# Add the last chunk
|
||||
if patches:
|
||||
final_diff = "\n".join(patches)
|
||||
final_diff_list.append(final_diff)
|
||||
|
||||
return final_diff_list
|
||||
|
@ -32,33 +32,37 @@ def convert_to_markdown(output_data: dict) -> str:
|
||||
|
||||
emojis = {
|
||||
"Main theme": "🎯",
|
||||
"PR summary": "📝",
|
||||
"Type of PR": "📌",
|
||||
"Score": "🏅",
|
||||
"Relevant tests added": "🧪",
|
||||
"Unrelated changes": "⚠️",
|
||||
"Focused PR": "✨",
|
||||
"Security concerns": "🔒",
|
||||
"General PR suggestions": "💡",
|
||||
"General suggestions": "💡",
|
||||
"Insights from user's answers": "📝",
|
||||
"Code feedback": "🤖",
|
||||
}
|
||||
|
||||
for key, value in output_data.items():
|
||||
if not value:
|
||||
if value is None or value == '' or value == {}:
|
||||
continue
|
||||
if isinstance(value, dict):
|
||||
markdown_text += f"## {key}\n\n"
|
||||
markdown_text += convert_to_markdown(value)
|
||||
elif isinstance(value, list):
|
||||
if key.lower() == 'code feedback':
|
||||
markdown_text += "\n" # just looks nicer with additional line breaks
|
||||
emoji = emojis.get(key, "")
|
||||
markdown_text += f"- {emoji} **{key}:**\n\n"
|
||||
if key.lower() == 'code feedback':
|
||||
markdown_text += f"\n\n- **<details><summary> { emoji } Code feedback:**</summary>\n\n"
|
||||
else:
|
||||
markdown_text += f"- {emoji} **{key}:**\n\n"
|
||||
for item in value:
|
||||
if isinstance(item, dict) and key.lower() == 'code feedback':
|
||||
markdown_text += parse_code_suggestion(item)
|
||||
elif item:
|
||||
markdown_text += f" - {item}\n"
|
||||
if key.lower() == 'code feedback':
|
||||
markdown_text += "</details>\n\n"
|
||||
elif value != 'n/a':
|
||||
emoji = emojis.get(key, "")
|
||||
markdown_text += f"- {emoji} **{key}:** {value}\n"
|
||||
@ -245,14 +249,13 @@ def update_settings_from_args(args: List[str]) -> List[str]:
|
||||
arg = arg.strip()
|
||||
if arg.startswith('--'):
|
||||
arg = arg.strip('-').strip()
|
||||
vals = arg.split('=')
|
||||
vals = arg.split('=', 1)
|
||||
if len(vals) != 2:
|
||||
logging.error(f'Invalid argument format: {arg}')
|
||||
if len(vals) > 2: # --extended is a valid argument
|
||||
logging.error(f'Invalid argument format: {arg}')
|
||||
other_args.append(arg)
|
||||
continue
|
||||
key, value = vals
|
||||
key = key.strip().upper()
|
||||
value = value.strip()
|
||||
key, value = _fix_key_value(*vals)
|
||||
get_settings().set(key, value)
|
||||
logging.info(f'Updated setting {key} to: "{value}"')
|
||||
else:
|
||||
@ -260,8 +263,18 @@ def update_settings_from_args(args: List[str]) -> List[str]:
|
||||
return other_args
|
||||
|
||||
|
||||
def _fix_key_value(key: str, value: str):
|
||||
key = key.strip().upper()
|
||||
value = value.strip()
|
||||
try:
|
||||
value = yaml.safe_load(value)
|
||||
except Exception as e:
|
||||
logging.error(f"Failed to parse YAML for config override {key}={value}", exc_info=e)
|
||||
return key, value
|
||||
|
||||
|
||||
def load_yaml(review_text: str) -> dict:
|
||||
review_text = review_text.lstrip('```yaml').rstrip('`')
|
||||
review_text = review_text.removeprefix('```yaml').rstrip('`')
|
||||
try:
|
||||
data = yaml.load(review_text, Loader=yaml.SafeLoader)
|
||||
except Exception as e:
|
||||
|
@ -19,13 +19,21 @@ For example:
|
||||
- cli.py --pr_url=... reflect
|
||||
|
||||
Supported commands:
|
||||
review / review_pr - Add a review that includes a summary of the PR and specific suggestions for improvement.
|
||||
ask / ask_question [question] - Ask a question about the PR.
|
||||
describe / describe_pr - Modify the PR title and description based on the PR's contents.
|
||||
improve / improve_code - Suggest improvements to the code in the PR as pull request comments ready to commit.
|
||||
reflect - Ask the PR author questions about the PR.
|
||||
update_changelog - Update the changelog based on the PR's contents.
|
||||
-review / review_pr - Add a review that includes a summary of the PR and specific suggestions for improvement.
|
||||
|
||||
-ask / ask_question [question] - Ask a question about the PR.
|
||||
|
||||
-describe / describe_pr - Modify the PR title and description based on the PR's contents.
|
||||
|
||||
-improve / improve_code - Suggest improvements to the code in the PR as pull request comments ready to commit.
|
||||
Extended mode ('improve --extended') employs several calls, and provides a more thorough feedback
|
||||
|
||||
-reflect - Ask the PR author questions about the PR.
|
||||
|
||||
-update_changelog - Update the changelog based on the PR's contents.
|
||||
|
||||
|
||||
Configuration:
|
||||
To edit any configuration parameter from 'configuration.toml', just add -config_path=<value>.
|
||||
For example: 'python cli.py --pr_url=... review --pr_reviewer.extra_instructions="focus on the file: ..."'
|
||||
""")
|
||||
|
@ -19,6 +19,7 @@ global_settings = Dynaconf(
|
||||
"settings/pr_questions_prompts.toml",
|
||||
"settings/pr_description_prompts.toml",
|
||||
"settings/pr_code_suggestions_prompts.toml",
|
||||
"settings/pr_sort_code_suggestions_prompts.toml",
|
||||
"settings/pr_information_from_user_prompts.toml",
|
||||
"settings/pr_update_changelog_prompts.toml",
|
||||
"settings_prod/.secrets.toml"
|
||||
|
@ -1,5 +1,6 @@
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers.bitbucket_provider import BitbucketProvider
|
||||
from pr_agent.git_providers.codecommit_provider import CodeCommitProvider
|
||||
from pr_agent.git_providers.github_provider import GithubProvider
|
||||
from pr_agent.git_providers.gitlab_provider import GitLabProvider
|
||||
from pr_agent.git_providers.local_git_provider import LocalGitProvider
|
||||
@ -10,6 +11,7 @@ _GIT_PROVIDERS = {
|
||||
'gitlab': GitLabProvider,
|
||||
'bitbucket': BitbucketProvider,
|
||||
'azure': AzureDevopsProvider,
|
||||
'codecommit': CodeCommitProvider,
|
||||
'local' : LocalGitProvider
|
||||
}
|
||||
|
||||
|
@ -1,3 +1,4 @@
|
||||
import json
|
||||
import logging
|
||||
from typing import Optional, Tuple
|
||||
from urllib.parse import urlparse
|
||||
@ -5,17 +6,17 @@ from urllib.parse import urlparse
|
||||
import requests
|
||||
from atlassian.bitbucket import Cloud
|
||||
|
||||
from ..algo.pr_processing import clip_tokens
|
||||
from ..config_loader import get_settings
|
||||
from .git_provider import FilePatchInfo
|
||||
from .git_provider import FilePatchInfo, GitProvider
|
||||
|
||||
|
||||
class BitbucketProvider:
|
||||
class BitbucketProvider(GitProvider):
|
||||
def __init__(self, pr_url: Optional[str] = None, incremental: Optional[bool] = False):
|
||||
s = requests.Session()
|
||||
s.headers['Authorization'] = f'Bearer {get_settings().get("BITBUCKET.BEARER_TOKEN", None)}'
|
||||
s.headers['Content-Type'] = 'application/json'
|
||||
self.headers = s.headers
|
||||
self.bitbucket_client = Cloud(session=s)
|
||||
|
||||
self.workspace_slug = None
|
||||
self.repo_slug = None
|
||||
self.repo = None
|
||||
@ -25,6 +26,64 @@ class BitbucketProvider:
|
||||
self.incremental = incremental
|
||||
if pr_url:
|
||||
self.set_pr(pr_url)
|
||||
self.bitbucket_comment_api_url = self.pr._BitbucketBase__data['links']['comments']['href']
|
||||
|
||||
def get_repo_settings(self):
|
||||
try:
|
||||
contents = self.repo_obj.get_contents(".pr_agent.toml", ref=self.pr.head.sha).decoded_content
|
||||
return contents
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
"""
|
||||
Publishes code suggestions as comments on the PR.
|
||||
"""
|
||||
post_parameters_list = []
|
||||
for suggestion in code_suggestions:
|
||||
body = suggestion['body']
|
||||
relevant_file = suggestion['relevant_file']
|
||||
relevant_lines_start = suggestion['relevant_lines_start']
|
||||
relevant_lines_end = suggestion['relevant_lines_end']
|
||||
|
||||
if not relevant_lines_start or relevant_lines_start == -1:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.exception(
|
||||
f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}")
|
||||
continue
|
||||
|
||||
if relevant_lines_end < relevant_lines_start:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.exception(f"Failed to publish code suggestion, "
|
||||
f"relevant_lines_end is {relevant_lines_end} and "
|
||||
f"relevant_lines_start is {relevant_lines_start}")
|
||||
continue
|
||||
|
||||
if relevant_lines_end > relevant_lines_start:
|
||||
post_parameters = {
|
||||
"body": body,
|
||||
"path": relevant_file,
|
||||
"line": relevant_lines_end,
|
||||
"start_line": relevant_lines_start,
|
||||
"start_side": "RIGHT",
|
||||
}
|
||||
else: # API is different for single line comments
|
||||
post_parameters = {
|
||||
"body": body,
|
||||
"path": relevant_file,
|
||||
"line": relevant_lines_start,
|
||||
"side": "RIGHT",
|
||||
}
|
||||
post_parameters_list.append(post_parameters)
|
||||
|
||||
|
||||
try:
|
||||
self.publish_inline_comments(post_parameters_list)
|
||||
return True
|
||||
except Exception as e:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.error(f"Failed to publish code suggestion, error: {e}")
|
||||
return False
|
||||
|
||||
def is_supported(self, capability: str) -> bool:
|
||||
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments', 'get_labels']:
|
||||
@ -62,14 +121,29 @@ class BitbucketProvider:
|
||||
except Exception as e:
|
||||
logging.exception(f"Failed to remove temp comments, error: {e}")
|
||||
|
||||
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
pass
|
||||
def publish_inline_comment(self, comment: str, from_line: int, to_line: int, file: str):
|
||||
payload = json.dumps( {
|
||||
"content": {
|
||||
"raw": comment,
|
||||
},
|
||||
"inline": {
|
||||
"to": from_line,
|
||||
"path": file
|
||||
},
|
||||
})
|
||||
response = requests.request(
|
||||
"POST",
|
||||
self.bitbucket_comment_api_url,
|
||||
data=payload,
|
||||
headers=self.headers
|
||||
)
|
||||
return response
|
||||
|
||||
|
||||
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
raise NotImplementedError("Bitbucket provider does not support creating inline comments yet")
|
||||
|
||||
def publish_inline_comments(self, comments: list[dict]):
|
||||
raise NotImplementedError("Bitbucket provider does not support publishing inline comments yet")
|
||||
for comment in comments:
|
||||
self.publish_inline_comment(comment['body'], comment['start_line'], comment['line'], comment['path'])
|
||||
|
||||
def get_title(self):
|
||||
return self.pr.title
|
||||
@ -81,10 +155,7 @@ class BitbucketProvider:
|
||||
def get_pr_branch(self):
|
||||
return self.pr.source_branch
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.pr.description, max_tokens)
|
||||
def get_pr_description_full(self):
|
||||
return self.pr.description
|
||||
|
||||
def get_user_id(self):
|
||||
@ -104,7 +175,7 @@ class BitbucketProvider:
|
||||
parsed_url = urlparse(pr_url)
|
||||
|
||||
if 'bitbucket.org' not in parsed_url.netloc:
|
||||
raise ValueError("The provided URL is not a valid GitHub URL")
|
||||
raise ValueError("The provided URL is not a valid Bitbucket URL")
|
||||
|
||||
path_parts = parsed_url.path.strip('/').split('/')
|
||||
|
||||
|
203
pr_agent/git_providers/codecommit_client.py
Normal file
203
pr_agent/git_providers/codecommit_client.py
Normal file
@ -0,0 +1,203 @@
|
||||
import boto3
|
||||
import botocore
|
||||
|
||||
|
||||
class CodeCommitDifferencesResponse:
|
||||
"""
|
||||
CodeCommitDifferencesResponse is the response object returned from our get_differences() function.
|
||||
It maps the JSON response to member variables of this class.
|
||||
"""
|
||||
|
||||
def __init__(self, json: dict):
|
||||
before_blob = json.get("beforeBlob", {})
|
||||
after_blob = json.get("afterBlob", {})
|
||||
|
||||
self.before_blob_id = before_blob.get("blobId", "")
|
||||
self.before_blob_path = before_blob.get("path", "")
|
||||
self.after_blob_id = after_blob.get("blobId", "")
|
||||
self.after_blob_path = after_blob.get("path", "")
|
||||
self.change_type = json.get("changeType", "")
|
||||
|
||||
|
||||
class CodeCommitPullRequestResponse:
|
||||
"""
|
||||
CodeCommitPullRequestResponse is the response object returned from our get_pr() function.
|
||||
It maps the JSON response to member variables of this class.
|
||||
"""
|
||||
|
||||
def __init__(self, json: dict):
|
||||
self.title = json.get("title", "")
|
||||
self.description = json.get("description", "")
|
||||
|
||||
self.targets = []
|
||||
for target in json.get("pullRequestTargets", []):
|
||||
self.targets.append(CodeCommitPullRequestResponse.CodeCommitPullRequestTarget(target))
|
||||
|
||||
class CodeCommitPullRequestTarget:
|
||||
"""
|
||||
CodeCommitPullRequestTarget is a subclass of CodeCommitPullRequestResponse that
|
||||
holds details about an individual target commit.
|
||||
"""
|
||||
|
||||
def __init__(self, json: dict):
|
||||
self.source_commit = json.get("sourceCommit", "")
|
||||
self.source_branch = json.get("sourceReference", "")
|
||||
self.destination_commit = json.get("destinationCommit", "")
|
||||
self.destination_branch = json.get("destinationReference", "")
|
||||
|
||||
|
||||
class CodeCommitClient:
|
||||
"""
|
||||
CodeCommitClient is a wrapper around the AWS boto3 SDK for the CodeCommit client
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
self.boto_client = None
|
||||
|
||||
def _connect_boto_client(self):
|
||||
try:
|
||||
self.boto_client = boto3.client("codecommit")
|
||||
except Exception as e:
|
||||
raise ValueError(f"Failed to connect to AWS CodeCommit: {e}")
|
||||
|
||||
def get_differences(self, repo_name: int, destination_commit: str, source_commit: str):
|
||||
"""
|
||||
Get the differences between two commits in CodeCommit.
|
||||
|
||||
Parameters:
|
||||
- repo_name: Name of the repository
|
||||
- destination_commit: Commit hash you want to merge into (the "before" hash) (usually on the main or master branch)
|
||||
- source_commit: Commit hash of the code you are adding (the "after" branch)
|
||||
|
||||
Returns:
|
||||
- List of CodeCommitDifferencesResponse objects
|
||||
|
||||
Boto3 Documentation:
|
||||
aws codecommit get-differences
|
||||
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit/client/get_differences.html
|
||||
"""
|
||||
if self.boto_client is None:
|
||||
self._connect_boto_client()
|
||||
|
||||
# The differences response from AWS is paginated, so we need to iterate through the pages to get all the differences.
|
||||
differences = []
|
||||
try:
|
||||
paginator = self.boto_client.get_paginator("get_differences")
|
||||
for page in paginator.paginate(
|
||||
repositoryName=repo_name,
|
||||
beforeCommitSpecifier=destination_commit,
|
||||
afterCommitSpecifier=source_commit,
|
||||
):
|
||||
differences.extend(page.get("differences", []))
|
||||
except botocore.exceptions.ClientError as e:
|
||||
raise ValueError(f"Failed to retrieve differences from CodeCommit PR #{self.pr_num}") from e
|
||||
|
||||
output = []
|
||||
for json in differences:
|
||||
output.append(CodeCommitDifferencesResponse(json))
|
||||
return output
|
||||
|
||||
def get_file(self, repo_name: str, file_path: str, sha_hash: str, optional: bool = False):
|
||||
"""
|
||||
Retrieve a file from CodeCommit.
|
||||
|
||||
Parameters:
|
||||
- repo_name: Name of the repository
|
||||
- file_path: Path to the file you are retrieving
|
||||
- sha_hash: Commit hash of the file you are retrieving
|
||||
|
||||
Returns:
|
||||
- File contents
|
||||
|
||||
Boto3 Documentation:
|
||||
aws codecommit get_file
|
||||
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit/client/get_file.html
|
||||
"""
|
||||
if not file_path:
|
||||
return ""
|
||||
|
||||
if self.boto_client is None:
|
||||
self._connect_boto_client()
|
||||
|
||||
try:
|
||||
response = self.boto_client.get_file(repositoryName=repo_name, commitSpecifier=sha_hash, filePath=file_path)
|
||||
except botocore.exceptions.ClientError as e:
|
||||
# if the file does not exist, but is flagged as optional, then return an empty string
|
||||
if optional and e.response["Error"]["Code"] == 'FileDoesNotExistException':
|
||||
return ""
|
||||
raise ValueError(f"CodeCommit cannot retrieve file '{file_path}' from repository '{repo_name}'") from e
|
||||
except Exception as e:
|
||||
raise ValueError(f"CodeCommit cannot retrieve file '{file_path}' from repository '{repo_name}'") from e
|
||||
if "fileContent" not in response:
|
||||
raise ValueError(f"File content is empty for file: {file_path}")
|
||||
|
||||
return response.get("fileContent", "")
|
||||
|
||||
def get_pr(self, pr_number: int):
|
||||
"""
|
||||
Get a information about a CodeCommit PR.
|
||||
|
||||
Parameters:
|
||||
- pr_number: The PR number you are requesting
|
||||
|
||||
Returns:
|
||||
- CodeCommitPullRequestResponse object
|
||||
|
||||
Boto3 Documentation:
|
||||
aws codecommit get_pull_request
|
||||
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit/client/get_pull_request.html
|
||||
"""
|
||||
if self.boto_client is None:
|
||||
self._connect_boto_client()
|
||||
|
||||
try:
|
||||
response = self.boto_client.get_pull_request(pullRequestId=str(pr_number))
|
||||
except botocore.exceptions.ClientError as e:
|
||||
if e.response["Error"]["Code"] == 'PullRequestDoesNotExistException':
|
||||
raise ValueError(f"CodeCommit cannot retrieve PR: PR number does not exist: {pr_number}") from e
|
||||
raise ValueError(f"CodeCommit cannot retrieve PR: {pr_number}: boto client error") from e
|
||||
except Exception as e:
|
||||
raise ValueError(f"CodeCommit cannot retrieve PR: {pr_number}") from e
|
||||
|
||||
if "pullRequest" not in response:
|
||||
raise ValueError("CodeCommit PR number not found: {pr_number}")
|
||||
|
||||
return CodeCommitPullRequestResponse(response.get("pullRequest", {}))
|
||||
|
||||
def publish_comment(self, repo_name: str, pr_number: int, destination_commit: str, source_commit: str, comment: str):
|
||||
"""
|
||||
Publish a comment to a pull request
|
||||
|
||||
Parameters:
|
||||
- repo_name: name of the repository
|
||||
- pr_number: number of the pull request
|
||||
- destination_commit: The commit hash you want to merge into (the "before" hash) (usually on the main or master branch)
|
||||
- source_commit: The commit hash of the code you are adding (the "after" branch)
|
||||
- pr_comment: comment
|
||||
|
||||
Returns:
|
||||
- None
|
||||
|
||||
Boto3 Documentation:
|
||||
aws codecommit post_comment_for_pull_request
|
||||
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit/client/post_comment_for_pull_request.html
|
||||
"""
|
||||
if self.boto_client is None:
|
||||
self._connect_boto_client()
|
||||
|
||||
try:
|
||||
self.boto_client.post_comment_for_pull_request(
|
||||
pullRequestId=str(pr_number),
|
||||
repositoryName=repo_name,
|
||||
beforeCommitId=destination_commit,
|
||||
afterCommitId=source_commit,
|
||||
content=comment,
|
||||
)
|
||||
except botocore.exceptions.ClientError as e:
|
||||
if e.response["Error"]["Code"] == 'RepositoryDoesNotExistException':
|
||||
raise ValueError(f"Repository does not exist: {repo_name}") from e
|
||||
if e.response["Error"]["Code"] == 'PullRequestDoesNotExistException':
|
||||
raise ValueError(f"PR number does not exist: {pr_number}") from e
|
||||
raise ValueError(f"Boto3 client error calling post_comment_for_pull_request") from e
|
||||
except Exception as e:
|
||||
raise ValueError(f"Error calling post_comment_for_pull_request") from e
|
363
pr_agent/git_providers/codecommit_provider.py
Normal file
363
pr_agent/git_providers/codecommit_provider.py
Normal file
@ -0,0 +1,363 @@
|
||||
import logging
|
||||
import os
|
||||
from collections import Counter
|
||||
from typing import List, Optional, Tuple
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from ..algo.language_handler import is_valid_file, language_extension_map
|
||||
from ..algo.pr_processing import clip_tokens
|
||||
from ..algo.utils import load_large_diff
|
||||
from ..config_loader import get_settings
|
||||
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider, IncrementalPR
|
||||
from pr_agent.git_providers.codecommit_client import CodeCommitClient
|
||||
|
||||
|
||||
class PullRequestCCMimic:
|
||||
"""
|
||||
This class mimics the PullRequest class from the PyGithub library for the CodeCommitProvider.
|
||||
"""
|
||||
|
||||
def __init__(self, title: str, diff_files: List[FilePatchInfo]):
|
||||
self.title = title
|
||||
self.diff_files = diff_files
|
||||
self.description = None
|
||||
self.source_commit = None
|
||||
self.source_branch = None # the branch containing your new code changes
|
||||
self.destination_commit = None
|
||||
self.destination_branch = None # the branch you are going to merge into
|
||||
|
||||
|
||||
class CodeCommitFile:
|
||||
"""
|
||||
This class represents a file in a pull request in CodeCommit.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
a_path: str,
|
||||
a_blob_id: str,
|
||||
b_path: str,
|
||||
b_blob_id: str,
|
||||
edit_type: EDIT_TYPE,
|
||||
):
|
||||
self.a_path = a_path
|
||||
self.a_blob_id = a_blob_id
|
||||
self.b_path = b_path
|
||||
self.b_blob_id = b_blob_id
|
||||
self.edit_type: EDIT_TYPE = edit_type
|
||||
self.filename = b_path if b_path else a_path
|
||||
|
||||
|
||||
class CodeCommitProvider(GitProvider):
|
||||
"""
|
||||
This class implements the GitProvider interface for AWS CodeCommit repositories.
|
||||
"""
|
||||
|
||||
def __init__(self, pr_url: Optional[str] = None, incremental: Optional[bool] = False):
|
||||
self.codecommit_client = CodeCommitClient()
|
||||
self.aws_client = None
|
||||
self.repo_name = None
|
||||
self.pr_num = None
|
||||
self.pr = None
|
||||
self.diff_files = None
|
||||
self.git_files = None
|
||||
if pr_url:
|
||||
self.set_pr(pr_url)
|
||||
|
||||
def provider_name(self):
|
||||
return "CodeCommit"
|
||||
|
||||
def is_supported(self, capability: str) -> bool:
|
||||
if capability in [
|
||||
"get_issue_comments",
|
||||
"create_inline_comment",
|
||||
"publish_inline_comments",
|
||||
"get_labels",
|
||||
]:
|
||||
return False
|
||||
return True
|
||||
|
||||
def set_pr(self, pr_url: str):
|
||||
self.repo_name, self.pr_num = self._parse_pr_url(pr_url)
|
||||
self.pr = self._get_pr()
|
||||
|
||||
def get_files(self) -> list[CodeCommitFile]:
|
||||
# bring files from CodeCommit only once
|
||||
if self.git_files:
|
||||
return self.git_files
|
||||
|
||||
self.git_files = []
|
||||
differences = self.codecommit_client.get_differences(self.repo_name, self.pr.destination_commit, self.pr.source_commit)
|
||||
for item in differences:
|
||||
self.git_files.append(CodeCommitFile(item.before_blob_path,
|
||||
item.before_blob_id,
|
||||
item.after_blob_path,
|
||||
item.after_blob_id,
|
||||
CodeCommitProvider._get_edit_type(item.change_type)))
|
||||
return self.git_files
|
||||
|
||||
def get_diff_files(self) -> list[FilePatchInfo]:
|
||||
"""
|
||||
Retrieves the list of files that have been modified, added, deleted, or renamed in a pull request in CodeCommit,
|
||||
along with their content and patch information.
|
||||
|
||||
Returns:
|
||||
diff_files (List[FilePatchInfo]): List of FilePatchInfo objects representing the modified, added, deleted,
|
||||
or renamed files in the merge request.
|
||||
"""
|
||||
# bring files from CodeCommit only once
|
||||
if self.diff_files:
|
||||
return self.diff_files
|
||||
|
||||
self.diff_files = []
|
||||
|
||||
files = self.get_files()
|
||||
for diff_item in files:
|
||||
patch_filename = ""
|
||||
if diff_item.a_blob_id is not None:
|
||||
patch_filename = diff_item.a_path
|
||||
original_file_content_str = self.codecommit_client.get_file(
|
||||
self.repo_name, diff_item.a_path, self.pr.destination_commit)
|
||||
if isinstance(original_file_content_str, (bytes, bytearray)):
|
||||
original_file_content_str = original_file_content_str.decode("utf-8")
|
||||
else:
|
||||
original_file_content_str = ""
|
||||
|
||||
if diff_item.b_blob_id is not None:
|
||||
patch_filename = diff_item.b_path
|
||||
new_file_content_str = self.codecommit_client.get_file(self.repo_name, diff_item.b_path, self.pr.source_commit)
|
||||
if isinstance(new_file_content_str, (bytes, bytearray)):
|
||||
new_file_content_str = new_file_content_str.decode("utf-8")
|
||||
else:
|
||||
new_file_content_str = ""
|
||||
|
||||
patch = load_large_diff(patch_filename, new_file_content_str, original_file_content_str)
|
||||
|
||||
# Store the diffs as a list of FilePatchInfo objects
|
||||
info = FilePatchInfo(
|
||||
original_file_content_str,
|
||||
new_file_content_str,
|
||||
patch,
|
||||
diff_item.b_path,
|
||||
edit_type=diff_item.edit_type,
|
||||
old_filename=None
|
||||
if diff_item.a_path == diff_item.b_path
|
||||
else diff_item.a_path,
|
||||
)
|
||||
# Only add valid files to the diff list
|
||||
# "bad extensions" are set in the language_extensions.toml file
|
||||
# a "valid file" is one that is not in the "bad extensions" list
|
||||
if is_valid_file(info.filename):
|
||||
self.diff_files.append(info)
|
||||
|
||||
return self.diff_files
|
||||
|
||||
def publish_description(self, pr_title: str, pr_body: str):
|
||||
return "" # not implemented yet
|
||||
|
||||
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
|
||||
if is_temporary:
|
||||
logging.info(pr_comment)
|
||||
return
|
||||
|
||||
try:
|
||||
self.codecommit_client.publish_comment(
|
||||
repo_name=self.repo_name,
|
||||
pr_number=str(self.pr_num),
|
||||
destination_commit=self.pr.destination_commit,
|
||||
source_commit=self.pr.source_commit,
|
||||
comment=pr_comment,
|
||||
)
|
||||
except Exception as e:
|
||||
raise ValueError(f"CodeCommit Cannot post comment for PR: {self.pr_num}") from e
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
return [""] # not implemented yet
|
||||
|
||||
def publish_labels(self, labels):
|
||||
return [""] # not implemented yet
|
||||
|
||||
def get_labels(self):
|
||||
return [""] # not implemented yet
|
||||
|
||||
def remove_initial_comment(self):
|
||||
return "" # not implemented yet
|
||||
|
||||
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
raise NotImplementedError("CodeCommit provider does not support publishing inline comments yet")
|
||||
|
||||
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
raise NotImplementedError("CodeCommit provider does not support creating inline comments yet")
|
||||
|
||||
def publish_inline_comments(self, comments: list[dict]):
|
||||
raise NotImplementedError("CodeCommit provider does not support publishing inline comments yet")
|
||||
|
||||
def get_title(self):
|
||||
return self.pr.get("title", "")
|
||||
|
||||
def get_languages(self):
|
||||
"""
|
||||
Returns a dictionary of languages, containing the percentage of each language used in the PR.
|
||||
|
||||
Returns:
|
||||
dict: A dictionary where each key is a language name and the corresponding value is the percentage of that language in the PR.
|
||||
"""
|
||||
commit_files = self.get_files()
|
||||
filenames = [ item.filename for item in commit_files ]
|
||||
extensions = CodeCommitProvider._get_file_extensions(filenames)
|
||||
|
||||
# Calculate the percentage of each file extension in the PR
|
||||
percentages = CodeCommitProvider._get_language_percentages(extensions)
|
||||
|
||||
# The global language_extension_map is a dictionary of languages,
|
||||
# where each dictionary item is a BoxList of extensions.
|
||||
# We want a dictionary of extensions,
|
||||
# where each dictionary item is a language name.
|
||||
# We build that language->extension dictionary here in main_extensions_flat.
|
||||
main_extensions_flat = {}
|
||||
for language, extensions in language_extension_map.items():
|
||||
for ext in extensions:
|
||||
main_extensions_flat[ext] = language
|
||||
|
||||
# Map the file extension/languages to percentages
|
||||
languages = {}
|
||||
for ext, pct in percentages.items():
|
||||
languages[main_extensions_flat.get(ext, "")] = pct
|
||||
|
||||
return languages
|
||||
|
||||
def get_pr_branch(self):
|
||||
return self.pr.source_branch
|
||||
|
||||
def get_pr_description_full(self) -> str:
|
||||
return self.pr.description
|
||||
|
||||
def get_user_id(self):
|
||||
return -1 # not implemented yet
|
||||
|
||||
def get_issue_comments(self):
|
||||
raise NotImplementedError("CodeCommit provider does not support issue comments yet")
|
||||
|
||||
def get_repo_settings(self):
|
||||
# a local ".pr_agent.toml" settings file is optional
|
||||
settings_filename = ".pr_agent.toml"
|
||||
return self.codecommit_client.get_file(self.repo_name, settings_filename, self.pr.source_commit, optional=True)
|
||||
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
return True
|
||||
|
||||
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
|
||||
# Example PR URL:
|
||||
# https://us-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/__MY_REPO__/pull-requests/123456"
|
||||
parsed_url = urlparse(pr_url)
|
||||
|
||||
if "us-east-1.console.aws.amazon.com" not in parsed_url.netloc:
|
||||
raise ValueError(f"The provided URL is not a valid CodeCommit URL: {pr_url}")
|
||||
|
||||
path_parts = parsed_url.path.strip("/").split("/")
|
||||
|
||||
if (
|
||||
len(path_parts) < 6
|
||||
or path_parts[0] != "codesuite"
|
||||
or path_parts[1] != "codecommit"
|
||||
or path_parts[2] != "repositories"
|
||||
or path_parts[4] != "pull-requests"
|
||||
):
|
||||
raise ValueError(f"The provided URL does not appear to be a CodeCommit PR URL: {pr_url}")
|
||||
|
||||
repo_name = path_parts[3]
|
||||
|
||||
try:
|
||||
pr_number = int(path_parts[5])
|
||||
except ValueError as e:
|
||||
raise ValueError(f"Unable to convert PR number to integer: '{path_parts[5]}'") from e
|
||||
|
||||
return repo_name, pr_number
|
||||
|
||||
def _get_pr(self):
|
||||
response = self.codecommit_client.get_pr(self.pr_num)
|
||||
|
||||
if len(response.targets) == 0:
|
||||
raise ValueError(f"No files found in CodeCommit PR: {self.pr_num}")
|
||||
|
||||
# TODO: implement support for multiple commits in one CodeCommit PR
|
||||
# for now, we are only using the first commit in the PR
|
||||
if len(response.targets) > 1:
|
||||
logging.warning(
|
||||
"Multiple commits in one PR is not supported for CodeCommit yet. Continuing, using the first commit only..."
|
||||
)
|
||||
|
||||
# Return our object that mimics PullRequest class from the PyGithub library
|
||||
# (This strategy was copied from the LocalGitProvider)
|
||||
mimic = PullRequestCCMimic(response.title, self.diff_files)
|
||||
mimic.description = response.description
|
||||
mimic.source_commit = response.targets[0].source_commit
|
||||
mimic.source_branch = response.targets[0].source_branch
|
||||
mimic.destination_commit = response.targets[0].destination_commit
|
||||
mimic.destination_branch = response.targets[0].destination_branch
|
||||
|
||||
return mimic
|
||||
|
||||
def get_commit_messages(self):
|
||||
return "" # not implemented yet
|
||||
|
||||
@staticmethod
|
||||
def _get_edit_type(codecommit_change_type):
|
||||
"""
|
||||
Convert the CodeCommit change type string to the EDIT_TYPE enum.
|
||||
The CodeCommit change type string is returned from the get_differences SDK method.
|
||||
|
||||
Returns:
|
||||
An EDIT_TYPE enum representing the modified, added, deleted, or renamed file in the PR diff.
|
||||
"""
|
||||
t = codecommit_change_type.upper()
|
||||
edit_type = None
|
||||
if t == "A":
|
||||
edit_type = EDIT_TYPE.ADDED
|
||||
elif t == "D":
|
||||
edit_type = EDIT_TYPE.DELETED
|
||||
elif t == "M":
|
||||
edit_type = EDIT_TYPE.MODIFIED
|
||||
elif t == "R":
|
||||
edit_type = EDIT_TYPE.RENAMED
|
||||
return edit_type
|
||||
|
||||
@staticmethod
|
||||
def _get_file_extensions(filenames):
|
||||
"""
|
||||
Return a list of file extensions from a list of filenames.
|
||||
The returned extensions will include the dot "." prefix,
|
||||
to accommodate for the dots in the existing language_extension_map settings.
|
||||
Filenames with no extension will return an empty string for the extension.
|
||||
"""
|
||||
extensions = []
|
||||
for filename in filenames:
|
||||
filename, ext = os.path.splitext(filename)
|
||||
if ext:
|
||||
extensions.append(ext.lower())
|
||||
else:
|
||||
extensions.append("")
|
||||
return extensions
|
||||
|
||||
@staticmethod
|
||||
def _get_language_percentages(extensions):
|
||||
"""
|
||||
Return a dictionary containing the programming language name (as the key),
|
||||
and the percentage that language is used (as the value),
|
||||
given a list of file extensions.
|
||||
"""
|
||||
total_files = len(extensions)
|
||||
if total_files == 0:
|
||||
return {}
|
||||
|
||||
# Identify language by file extension and count
|
||||
lang_count = Counter(extensions)
|
||||
# Convert counts to percentages
|
||||
lang_percentage = {
|
||||
lang: round(count / total_files * 100) for lang, count in lang_count.items()
|
||||
}
|
||||
return lang_percentage
|
@ -55,7 +55,7 @@ class GitProvider(ABC):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def publish_code_suggestions(self, code_suggestions: list):
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
@ -83,13 +83,38 @@ class GitProvider(ABC):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_pr_description(self):
|
||||
def get_pr_description_full(self) -> str:
|
||||
pass
|
||||
|
||||
def get_pr_description(self) -> str:
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.algo.pr_processing import clip_tokens
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
description = self.get_pr_description_full()
|
||||
if max_tokens:
|
||||
return clip_tokens(description, max_tokens)
|
||||
return description
|
||||
|
||||
def get_user_description(self) -> str:
|
||||
description = (self.get_pr_description_full() or "").strip()
|
||||
# if the existing description wasn't generated by the pr-agent, just return it as-is
|
||||
if not description.startswith("## PR Type"):
|
||||
return description
|
||||
# if the existing description was generated by the pr-agent, but it doesn't contain the user description,
|
||||
# return nothing (empty string) because it means there is no user description
|
||||
if "## User Description:" not in description:
|
||||
return ""
|
||||
# otherwise, extract the original user description from the existing pr-agent description and return it
|
||||
return description.split("## User Description:", 1)[1].strip()
|
||||
|
||||
@abstractmethod
|
||||
def get_issue_comments(self):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_repo_settings(self):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
pass
|
||||
|
@ -166,7 +166,7 @@ class GithubProvider(GitProvider):
|
||||
def publish_inline_comments(self, comments: list[dict]):
|
||||
self.pr.create_review(commit=self.last_commit_id, comments=comments)
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list):
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
"""
|
||||
Publishes code suggestions as comments on the PR.
|
||||
"""
|
||||
@ -233,10 +233,7 @@ class GithubProvider(GitProvider):
|
||||
def get_pr_branch(self):
|
||||
return self.pr.head.ref
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.pr.body, max_tokens)
|
||||
def get_pr_description_full(self):
|
||||
return self.pr.body
|
||||
|
||||
def get_user_id(self):
|
||||
|
@ -14,6 +14,9 @@ from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider
|
||||
|
||||
logger = logging.getLogger()
|
||||
|
||||
class DiffNotFoundError(Exception):
|
||||
"""Raised when the diff for a merge request cannot be found."""
|
||||
pass
|
||||
|
||||
class GitLabProvider(GitProvider):
|
||||
|
||||
@ -56,7 +59,7 @@ class GitLabProvider(GitProvider):
|
||||
self.last_diff = self.mr.diffs.list(get_all=True)[-1]
|
||||
except IndexError as e:
|
||||
logger.error(f"Could not get diff for merge request {self.id_mr}")
|
||||
raise ValueError(f"Could not get diff for merge request {self.id_mr}") from e
|
||||
raise DiffNotFoundError(f"Could not get diff for merge request {self.id_mr}") from e
|
||||
|
||||
|
||||
def _get_pr_file_content(self, file_path: str, branch: str) -> str:
|
||||
@ -150,16 +153,20 @@ class GitLabProvider(GitProvider):
|
||||
def create_inline_comments(self, comments: list[dict]):
|
||||
raise NotImplementedError("Gitlab provider does not support publishing inline comments yet")
|
||||
|
||||
def send_inline_comment(self, body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
|
||||
target_file, target_line_no):
|
||||
def send_inline_comment(self,body: str,edit_type: str,found: bool,relevant_file: str,relevant_line_in_file: int,
|
||||
source_line_no: int, target_file: str,target_line_no: int) -> None:
|
||||
if not found:
|
||||
logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
|
||||
else:
|
||||
d = self.last_diff
|
||||
# in order to have exact sha's we have to find correct diff for this change
|
||||
diff = self.get_relevant_diff(relevant_file, relevant_line_in_file)
|
||||
if diff is None:
|
||||
logger.error(f"Could not get diff for merge request {self.id_mr}")
|
||||
raise DiffNotFoundError(f"Could not get diff for merge request {self.id_mr}")
|
||||
pos_obj = {'position_type': 'text',
|
||||
'new_path': target_file.filename,
|
||||
'old_path': target_file.old_filename if target_file.old_filename else target_file.filename,
|
||||
'base_sha': d.base_commit_sha, 'start_sha': d.start_commit_sha, 'head_sha': d.head_commit_sha}
|
||||
'base_sha': diff.base_commit_sha, 'start_sha': diff.start_commit_sha, 'head_sha': diff.head_commit_sha}
|
||||
if edit_type == 'deletion':
|
||||
pos_obj['old_line'] = source_line_no - 1
|
||||
elif edit_type == 'addition':
|
||||
@ -171,7 +178,24 @@ class GitLabProvider(GitProvider):
|
||||
self.mr.discussions.create({'body': body,
|
||||
'position': pos_obj})
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list):
|
||||
def get_relevant_diff(self, relevant_file: str, relevant_line_in_file: int) -> Optional[dict]:
|
||||
changes = self.mr.changes() # Retrieve the changes for the merge request once
|
||||
if not changes:
|
||||
logging.error('No changes found for the merge request.')
|
||||
return None
|
||||
all_diffs = self.mr.diffs.list(get_all=True)
|
||||
if not all_diffs:
|
||||
logging.error('No diffs found for the merge request.')
|
||||
return None
|
||||
for diff in all_diffs:
|
||||
for change in changes['changes']:
|
||||
if change['new_path'] == relevant_file and relevant_line_in_file in change['diff']:
|
||||
return diff
|
||||
logging.debug(
|
||||
f'No relevant diff found for {relevant_file} {relevant_line_in_file}. Falling back to last diff.')
|
||||
return self.last_diff # fallback to last_diff if no relevant diff is found
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
for suggestion in code_suggestions:
|
||||
try:
|
||||
body = suggestion['body']
|
||||
@ -275,10 +299,7 @@ class GitLabProvider(GitProvider):
|
||||
def get_pr_branch(self):
|
||||
return self.mr.source_branch
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.mr.description, max_tokens)
|
||||
def get_pr_description_full(self):
|
||||
return self.mr.description
|
||||
|
||||
def get_issue_comments(self):
|
||||
|
@ -130,7 +130,7 @@ class LocalGitProvider(GitProvider):
|
||||
relevant_lines_start: int, relevant_lines_end: int):
|
||||
raise NotImplementedError('Publishing code suggestions is not implemented for the local git provider')
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list):
|
||||
def publish_code_suggestions(self, code_suggestions: list) -> bool:
|
||||
raise NotImplementedError('Publishing code suggestions is not implemented for the local git provider')
|
||||
|
||||
def publish_labels(self, labels):
|
||||
@ -158,7 +158,7 @@ class LocalGitProvider(GitProvider):
|
||||
def get_user_id(self):
|
||||
return -1 # Not used anywhere for the local provider, but required by the interface
|
||||
|
||||
def get_pr_description(self):
|
||||
def get_pr_description_full(self):
|
||||
commits_diff = list(self.repo.iter_commits(self.target_branch_name + '..HEAD'))
|
||||
# Get the commit messages and concatenate
|
||||
commit_messages = " ".join([commit.message for commit in commits_diff])
|
||||
|
@ -1,6 +1,8 @@
|
||||
import copy
|
||||
import logging
|
||||
import sys
|
||||
import os
|
||||
import time
|
||||
from typing import Any, Dict
|
||||
|
||||
import uvicorn
|
||||
@ -14,7 +16,7 @@ from pr_agent.config_loader import get_settings, global_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
from pr_agent.servers.utils import verify_signature
|
||||
|
||||
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
|
||||
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
|
||||
router = APIRouter()
|
||||
|
||||
|
||||
@ -34,7 +36,8 @@ async def handle_github_webhooks(request: Request, response: Response):
|
||||
context["installation_id"] = installation_id
|
||||
context["settings"] = copy.deepcopy(global_settings)
|
||||
|
||||
return await handle_request(body)
|
||||
response = await handle_request(body, event=request.headers.get("X-GitHub-Event", None))
|
||||
return response or {}
|
||||
|
||||
|
||||
@router.post("/api/v1/marketplace_webhooks")
|
||||
@ -48,70 +51,119 @@ async def get_body(request):
|
||||
except Exception as e:
|
||||
logging.error("Error parsing request body", e)
|
||||
raise HTTPException(status_code=400, detail="Error parsing request body") from e
|
||||
body_bytes = await request.body()
|
||||
signature_header = request.headers.get('x-hub-signature-256', None)
|
||||
webhook_secret = getattr(get_settings().github, 'webhook_secret', None)
|
||||
if webhook_secret:
|
||||
body_bytes = await request.body()
|
||||
signature_header = request.headers.get('x-hub-signature-256', None)
|
||||
verify_signature(body_bytes, webhook_secret, signature_header)
|
||||
return body
|
||||
|
||||
|
||||
_duplicate_requests_cache = {}
|
||||
|
||||
|
||||
async def handle_request(body: Dict[str, Any]):
|
||||
async def handle_request(body: Dict[str, Any], event: str):
|
||||
"""
|
||||
Handle incoming GitHub webhook requests.
|
||||
|
||||
Args:
|
||||
body: The request body.
|
||||
event: The GitHub event type.
|
||||
"""
|
||||
action = body.get("action")
|
||||
if not action:
|
||||
return {}
|
||||
agent = PRAgent()
|
||||
bot_user = get_settings().github_app.bot_user
|
||||
logging.info(f"action: '{action}'")
|
||||
logging.info(f"event: '{event}'")
|
||||
|
||||
if get_settings().github_app.duplicate_requests_cache and _is_duplicate_request(body):
|
||||
return {}
|
||||
|
||||
# handle all sorts of comment events (e.g. issue_comment)
|
||||
if action == 'created':
|
||||
if "comment" not in body:
|
||||
return {}
|
||||
comment_body = body.get("comment", {}).get("body")
|
||||
sender = body.get("sender", {}).get("login")
|
||||
if sender and 'bot' in sender:
|
||||
if sender and bot_user in sender:
|
||||
logging.info(f"Ignoring comment from {bot_user} user")
|
||||
return {}
|
||||
if "issue" not in body or "pull_request" not in body["issue"]:
|
||||
logging.info(f"Processing comment from {sender} user")
|
||||
if "issue" in body and "pull_request" in body["issue"] and "url" in body["issue"]["pull_request"]:
|
||||
api_url = body["issue"]["pull_request"]["url"]
|
||||
elif "comment" in body and "pull_request_url" in body["comment"]:
|
||||
api_url = body["comment"]["pull_request_url"]
|
||||
else:
|
||||
return {}
|
||||
pull_request = body["issue"]["pull_request"]
|
||||
api_url = pull_request.get("url")
|
||||
logging.info(f"Handling comment because of event={event} and action={action}")
|
||||
comment_id = body.get("comment", {}).get("id")
|
||||
provider = get_git_provider()(pr_url=api_url)
|
||||
await agent.handle_request(api_url, comment_body, notify=lambda: provider.add_eyes_reaction(comment_id))
|
||||
|
||||
|
||||
elif action == "opened" or 'reopened' in action:
|
||||
# handle pull_request event:
|
||||
# automatically review opened/reopened/ready_for_review PRs as long as they're not in draft,
|
||||
# as well as direct review requests from the bot
|
||||
elif event == 'pull_request':
|
||||
pull_request = body.get("pull_request")
|
||||
if not pull_request:
|
||||
return {}
|
||||
api_url = pull_request.get("url")
|
||||
if not api_url:
|
||||
return {}
|
||||
await agent.handle_request(api_url, "/review")
|
||||
if pull_request.get("draft", True) or pull_request.get("state") != "open" or pull_request.get("user", {}).get("login", "") == bot_user:
|
||||
return {}
|
||||
if action in get_settings().github_app.handle_pr_actions:
|
||||
if action == "review_requested":
|
||||
if body.get("requested_reviewer", {}).get("login", "") != bot_user:
|
||||
return {}
|
||||
if pull_request.get("created_at") == pull_request.get("updated_at"):
|
||||
# avoid double reviews when opening a PR for the first time
|
||||
return {}
|
||||
logging.info(f"Performing review because of event={event} and action={action}")
|
||||
for command in get_settings().github_app.pr_commands:
|
||||
logging.info(f"Performing command: {command}")
|
||||
await agent.handle_request(api_url, command)
|
||||
|
||||
logging.info("event or action does not require handling")
|
||||
return {}
|
||||
|
||||
|
||||
def _is_duplicate_request(body: Dict[str, Any]) -> bool:
|
||||
"""
|
||||
In some deployments its possible to get duplicate requests if the handling is long,
|
||||
This function checks if the request is duplicate and if so - ignores it.
|
||||
"""
|
||||
request_hash = hash(str(body))
|
||||
logging.info(f"request_hash: {request_hash}")
|
||||
request_time = time.monotonic()
|
||||
ttl = get_settings().github_app.duplicate_requests_cache_ttl # in seconds
|
||||
to_delete = [key for key, key_time in _duplicate_requests_cache.items() if request_time - key_time > ttl]
|
||||
for key in to_delete:
|
||||
del _duplicate_requests_cache[key]
|
||||
is_duplicate = request_hash in _duplicate_requests_cache
|
||||
_duplicate_requests_cache[request_hash] = request_time
|
||||
if is_duplicate:
|
||||
logging.info(f"Ignoring duplicate request {request_hash}")
|
||||
return is_duplicate
|
||||
|
||||
|
||||
@router.get("/")
|
||||
async def root():
|
||||
return {"status": "ok"}
|
||||
|
||||
|
||||
def start():
|
||||
# Override the deployment type to app
|
||||
get_settings().set("GITHUB.DEPLOYMENT_TYPE", "app")
|
||||
if get_settings().github_app.override_deployment_type:
|
||||
# Override the deployment type to app
|
||||
get_settings().set("GITHUB.DEPLOYMENT_TYPE", "app")
|
||||
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
|
||||
middleware = [Middleware(RawContextMiddleware)]
|
||||
app = FastAPI(middleware=middleware)
|
||||
app.include_router(router)
|
||||
|
||||
uvicorn.run(app, host="0.0.0.0", port=3000)
|
||||
uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "3000")))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
@ -1,9 +1,10 @@
|
||||
commands_text = "> **/review [-i]**: Request a review of your Pull Request. For an incremental review, which only " \
|
||||
"considers changes since the last review, include the '-i' option.\n" \
|
||||
"> **/describe**: Modify the PR title and description based on the contents of the PR.\n" \
|
||||
"> **/improve**: Suggest improvements to the code in the PR. \n" \
|
||||
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n\n" \
|
||||
">To edit any configuration parameter from 'configuration.toml', add --config_path=new_value\n" \
|
||||
"> **/improve [--extended]**: Suggest improvements to the code in the PR. Extended mode employs several calls, and provides a more thorough feedback. \n" \
|
||||
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n" \
|
||||
"> **/update_changelog**: Update the changelog based on the PR's contents.\n\n" \
|
||||
">To edit any configuration parameter from **configuration.toml**, add --config_path=new_value\n" \
|
||||
">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" \n" \
|
||||
">To list the possible configuration parameters, use the **/config** command.\n" \
|
||||
|
||||
|
@ -14,6 +14,7 @@ key = "" # Acquire through https://platform.openai.com
|
||||
#api_version = '2023-05-15' # Check Azure documentation for the current API version
|
||||
#api_base = "" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
|
||||
#deployment_id = "" # The deployment name you chose when you deployed the engine
|
||||
#fallback_deployments = [] # For each fallback model specified in configuration.toml in the [config] section, specify the appropriate deployment_id
|
||||
|
||||
[anthropic]
|
||||
key = "" # Optional, uncomment if you want to use Anthropic. Acquire through https://www.anthropic.com/
|
||||
|
@ -10,19 +10,23 @@ use_repo_settings_file=true
|
||||
ai_timeout=180
|
||||
max_description_tokens = 500
|
||||
max_commits_tokens = 500
|
||||
litellm_debugger=false
|
||||
|
||||
[pr_reviewer] # /review #
|
||||
require_focused_review=true
|
||||
require_focused_review=false
|
||||
require_score_review=false
|
||||
require_tests_review=true
|
||||
require_security_review=true
|
||||
num_code_suggestions=3
|
||||
num_code_suggestions=4
|
||||
inline_code_comments = false
|
||||
ask_and_reflect=false
|
||||
automatic_review=true
|
||||
extra_instructions = ""
|
||||
|
||||
[pr_description] # /describe #
|
||||
publish_description_as_comment=false
|
||||
add_original_user_description=false
|
||||
keep_original_user_title=false
|
||||
extra_instructions = ""
|
||||
|
||||
[pr_questions] # /ask #
|
||||
@ -30,6 +34,12 @@ extra_instructions = ""
|
||||
[pr_code_suggestions] # /improve #
|
||||
num_code_suggestions=4
|
||||
extra_instructions = ""
|
||||
rank_suggestions = false
|
||||
# params for '/improve --extended' mode
|
||||
num_code_suggestions_per_chunk=8
|
||||
rank_extended_suggestions = true
|
||||
max_number_of_calls = 5
|
||||
final_clip_factor = 0.9
|
||||
|
||||
[pr_update_changelog] # /update_changelog #
|
||||
push_changelog_changes=false
|
||||
@ -42,6 +52,21 @@ extra_instructions = ""
|
||||
deployment_type = "user"
|
||||
ratelimit_retries = 5
|
||||
|
||||
[github_app]
|
||||
# these toggles allows running the github app from custom deployments
|
||||
bot_user = "github-actions[bot]"
|
||||
override_deployment_type = true
|
||||
# in some deployments it's possible to get duplicate requests if the handling is long,
|
||||
# these settings are used to avoid handling duplicate requests.
|
||||
duplicate_requests_cache = false
|
||||
duplicate_requests_cache_ttl = 60 # in seconds
|
||||
# settings for "pull_request" event
|
||||
handle_pr_actions = ['opened', 'reopened', 'ready_for_review', 'review_requested']
|
||||
pr_commands = [
|
||||
"/describe --pr_description.add_original_user_description=true --pr_description.keep_original_user_title=true",
|
||||
"/auto_review",
|
||||
]
|
||||
|
||||
[gitlab]
|
||||
# URL to the gitlab service
|
||||
url = "https://gitlab.com"
|
||||
|
@ -1,19 +1,49 @@
|
||||
[pr_code_suggestions_prompt]
|
||||
system="""You are a language model called CodiumAI-PR-Code-Reviewer.
|
||||
Your task is to provide meaningfull non-trivial code suggestions to improve the new code in a PR (the '+' lines).
|
||||
- Try to give important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningfull code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Suggestions should refer only to the 'new hunk' code, and focus on improving the new added code lines, with '+'.
|
||||
system="""You are a language model called PR-Code-Reviewer.
|
||||
Your task is to provide meaningful actionable code suggestions, to improve the new code presented in a PR.
|
||||
|
||||
Example PR Diff input:
|
||||
'
|
||||
## src/file1.py
|
||||
|
||||
@@ -12,3 +12,5 @@ def func1():
|
||||
__new hunk__
|
||||
12 code line that already existed in the file...
|
||||
13 code line that already existed in the file....
|
||||
14 +new code line added in the PR
|
||||
15 code line that already existed in the file...
|
||||
16 code line that already existed in the file...
|
||||
__old hunk__
|
||||
code line that already existed in the file...
|
||||
-code line that was removed in the PR
|
||||
code line that already existed in the file...
|
||||
|
||||
|
||||
@@ ... @@ def func2():
|
||||
__new hunk__
|
||||
...
|
||||
__old hunk__
|
||||
...
|
||||
|
||||
|
||||
## src/file2.py
|
||||
...
|
||||
'
|
||||
|
||||
Specific instructions:
|
||||
- Focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Suggestions should refer only to code from the '__new hunk__' sections, and focus on new lines of code (lines starting with '+').
|
||||
- Provide the exact line number range (inclusive) for each issue.
|
||||
- Assume there is additional code in the relevant file that is not included in the diff.
|
||||
- Assume there is additional relevant code, that is not included in the diff.
|
||||
- Provide up to {{ num_code_suggestions }} code suggestions.
|
||||
- Make sure not to provide suggestions repeating modifications already implemented in the new PR code (the '+' lines).
|
||||
- Don't output line numbers in the 'improved code' snippets.
|
||||
- Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the '__new hunk__' code.
|
||||
- Don't suggest to add docstring or type hints.
|
||||
|
||||
{%- if extra_instructions %}
|
||||
|
||||
Extra instructions from the user:
|
||||
{{ extra_instructions }}
|
||||
{% endif %}
|
||||
{%- endif %}
|
||||
|
||||
You must use the following JSON schema to format your answer:
|
||||
```json
|
||||
@ -30,39 +60,26 @@ You must use the following JSON schema to format your answer:
|
||||
},
|
||||
"suggestion content": {
|
||||
"type": "string",
|
||||
"description": "a concrete suggestion for meaningfully improving the new PR code."
|
||||
"description": "a concrete suggestion for meaningfully improving the new PR code (lines from the '__new hunk__' sections, starting with '+')."
|
||||
},
|
||||
"existing code": {
|
||||
"type": "string",
|
||||
"description": "a code snippet showing authentic relevant code lines from a 'new hunk' section. It must be continuous, correctly formatted and indented, and without line numbers."
|
||||
"description": "a code snippet showing the relevant code lines from a '__new hunk__' section. It must be continuous, correctly formatted and indented, and without line numbers."
|
||||
},
|
||||
"relevant lines": {
|
||||
"type": "string",
|
||||
"description": "the relevant lines in the 'new hunk' sections, in the format of 'start_line-end_line'. For example: '10-15'. They should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above."
|
||||
"description": "the relevant lines from a '__new hunk__' section, in the format of 'start_line-end_line'. For example: '10-15'. They should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above."
|
||||
},
|
||||
"improved code": {
|
||||
"type": "string",
|
||||
"description": "a new code snippet that can be used to replace the relevant lines in 'new hunk' code. Replacement suggestions should be complete, correctly formatted and indented, and without line numbers."
|
||||
"description": "a new code snippet that can be used to replace the relevant lines in '__new hunk__' code. Replacement suggestions should be complete, correctly formatted and indented, and without line numbers."
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Example input:
|
||||
'
|
||||
## src/file1.py
|
||||
---new_hunk---
|
||||
```
|
||||
[new hunk code, annotated with line numbers]
|
||||
```
|
||||
---old_hunk---
|
||||
```
|
||||
[old hunk code]
|
||||
```
|
||||
...
|
||||
'
|
||||
|
||||
Don't output line numbers in the 'improved code' snippets.
|
||||
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
"""
|
||||
|
||||
|
@ -3,7 +3,7 @@ system="""You are CodiumAI-PR-Reviewer, a language model designed to review git
|
||||
Your task is to provide full description of the PR content.
|
||||
- Make sure not to focus the new PR code (the '+' lines).
|
||||
- Notice that the 'Previous title', 'Previous description' and 'Commit messages' sections may be partial, simplistic, non-informative or not up-to-date. Hence, compare them to the PR diff code, and use them only as a reference.
|
||||
|
||||
- If needed, each YAML output should be in block scalar format ('|-')
|
||||
{%- if extra_instructions %}
|
||||
|
||||
Extra instructions from the user:
|
||||
@ -33,7 +33,7 @@ PR Description:
|
||||
PR Main Files Walkthrough:
|
||||
type: array
|
||||
maxItems: 10
|
||||
description: >-
|
||||
description: |-
|
||||
a walkthrough of the PR changes. Review main files, and shortly describe the changes in each file (up to 10 most important files).
|
||||
items:
|
||||
filename:
|
||||
@ -46,10 +46,12 @@ PR Main Files Walkthrough:
|
||||
|
||||
Example output:
|
||||
```yaml
|
||||
PR Title: ...
|
||||
PR Title: |-
|
||||
...
|
||||
PR Type:
|
||||
- Bug fix
|
||||
PR Description: ...
|
||||
PR Description: |-
|
||||
...
|
||||
PR Main Files Walkthrough:
|
||||
- ...
|
||||
- ...
|
||||
|
@ -1,11 +1,35 @@
|
||||
[pr_review_prompt]
|
||||
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
|
||||
Your task is to provide constructive and concise feedback for the PR, and also provide meaningfull code suggestions to improve the new PR code (the '+' lines).
|
||||
system="""You are PR-Reviewer, a language model designed to review git pull requests.
|
||||
Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions.
|
||||
|
||||
Example PR Diff input:
|
||||
'
|
||||
## src/file1.py
|
||||
|
||||
@@ -12,5 +12,5 @@ def func1():
|
||||
code line that already existed in the file...
|
||||
code line that already existed in the file....
|
||||
-code line that was removed in the PR
|
||||
+new code line added in the PR
|
||||
code line that already existed in the file...
|
||||
code line that already existed in the file...
|
||||
|
||||
@@ ... @@ def func2():
|
||||
...
|
||||
|
||||
|
||||
## src/file2.py
|
||||
...
|
||||
'
|
||||
|
||||
Thre review should focus on new code added in the PR (lines starting with '+'), and not on code that already existed in the file (lines starting with '-', or without prefix).
|
||||
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
- Provide up to {{ num_code_suggestions }} code suggestions.
|
||||
- Try to focus on the most important suggestions, like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningfull code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Suggestions should focus on improving the new added code lines.
|
||||
- Make sure not to provide suggestions repeating modifications already implemented in the new PR code (the '+' lines).
|
||||
- Focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the PR code.
|
||||
- Don't suggest to add docstring or type hints.
|
||||
- Suggestions should focus on improving the new code added in the PR (lines starting with '+')
|
||||
{%- endif %}
|
||||
|
||||
{%- if extra_instructions %}
|
||||
@ -20,6 +44,9 @@ PR Analysis:
|
||||
Main theme:
|
||||
type: string
|
||||
description: a short explanation of the PR
|
||||
PR summary:
|
||||
type: string
|
||||
description: summary of the PR in 2-3 sentences.
|
||||
Type of PR:
|
||||
type: string
|
||||
enum:
|
||||
@ -32,7 +59,7 @@ PR Analysis:
|
||||
{%- if require_score %}
|
||||
Score:
|
||||
type: int
|
||||
description: >-
|
||||
description: |-
|
||||
Rate this PR on a scale of 0-100 (inclusive), where 0 means the worst
|
||||
possible PR code, and 100 means PR code of the highest quality, without
|
||||
any bugs or performance issues, that is ready to be merged immediately and
|
||||
@ -46,13 +73,13 @@ PR Analysis:
|
||||
{%- if question_str %}
|
||||
Insights from user's answer:
|
||||
type: string
|
||||
description: >-
|
||||
description: |-
|
||||
shortly summarize the insights you gained from the user's answers to the questions
|
||||
{%- endif %}
|
||||
{%- if require_focused %}
|
||||
Focused PR:
|
||||
type: string
|
||||
description: >-
|
||||
description: |-
|
||||
Is this a focused PR, in the sense that all the PR code diff changes are
|
||||
united under a single focused theme ? If the theme is too broad, or the PR
|
||||
code diff changes are too scattered, then the PR is not focused. Explain
|
||||
@ -61,12 +88,11 @@ PR Analysis:
|
||||
PR Feedback:
|
||||
General suggestions:
|
||||
type: string
|
||||
description: >-
|
||||
description: |-
|
||||
General suggestions and feedback for the contributors and maintainers of
|
||||
this PR. May include important suggestions for the overall structure,
|
||||
primary purpose, best practices, critical bugs, and other aspects of the
|
||||
PR. Don't address PR title and description, or lack of tests. Explain your
|
||||
suggestions.
|
||||
PR. Don't address PR title and description, or lack of tests. Explain your suggestions.
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
Code feedback:
|
||||
type: array
|
||||
@ -78,7 +104,7 @@ PR Feedback:
|
||||
description: the relevant file full path
|
||||
suggestion:
|
||||
type: string
|
||||
description: >-
|
||||
description: |-
|
||||
a concrete suggestion for meaningfully improving the new PR code. Also
|
||||
describe how, specifically, the suggestion can be applied to new PR
|
||||
code. Add tags with importance measure that matches each suggestion
|
||||
@ -86,10 +112,10 @@ PR Feedback:
|
||||
adding docstrings, renaming PR title and description, or linter like.
|
||||
relevant line:
|
||||
type: string
|
||||
description: >-
|
||||
a single code line taken from the relevant file, to which the
|
||||
suggestion applies. The line should be a '+' line. Make sure to output
|
||||
the line exactly as it appears in the relevant file
|
||||
description: |-
|
||||
a single code line taken from the relevant file, to which the suggestion applies.
|
||||
The code line should start with a '+'.
|
||||
Make sure to output the line exactly as it appears in the relevant file
|
||||
{%- endif %}
|
||||
{%- if require_security %}
|
||||
Security concerns:
|
||||
@ -103,22 +129,29 @@ PR Feedback:
|
||||
Example output:
|
||||
```yaml
|
||||
PR Analysis:
|
||||
Main theme: xxx
|
||||
Type of PR: Bug fix
|
||||
Main theme: |-
|
||||
xxx
|
||||
PR summary: |-
|
||||
xxx
|
||||
Type of PR: |-
|
||||
Bug fix
|
||||
{%- if require_score %}
|
||||
Score: 89
|
||||
{%- endif %}
|
||||
Relevant tests added: No
|
||||
Relevant tests added: |-
|
||||
No
|
||||
{%- if require_focused %}
|
||||
Focused PR: no, because ...
|
||||
{%- endif %}
|
||||
PR Feedback:
|
||||
General PR suggestions: ...
|
||||
General PR suggestions: |-
|
||||
...
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
Code feedback:
|
||||
- relevant file: |-
|
||||
directory/xxx.py
|
||||
suggestion: xxx [important]
|
||||
suggestion: |-
|
||||
xxx [important]
|
||||
relevant line: |-
|
||||
xxx
|
||||
...
|
||||
@ -128,7 +161,7 @@ PR Feedback:
|
||||
{%- endif %}
|
||||
```
|
||||
|
||||
Make sure to output a valid YAML. Use multi-line block scalar ('|') if needed.
|
||||
Each YAML output MUST be after a newline, indented, with block scalar indicator ('|-').
|
||||
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
"""
|
||||
|
||||
@ -160,7 +193,7 @@ The PR Git Diff:
|
||||
```
|
||||
{{diff}}
|
||||
```
|
||||
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
|
||||
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions. Focus on the '+' lines.
|
||||
|
||||
Response (should be a valid YAML, and nothing else):
|
||||
```yaml
|
||||
|
46
pr_agent/settings/pr_sort_code_suggestions_prompts.toml
Normal file
46
pr_agent/settings/pr_sort_code_suggestions_prompts.toml
Normal file
@ -0,0 +1,46 @@
|
||||
[pr_sort_code_suggestions_prompt]
|
||||
system="""
|
||||
"""
|
||||
|
||||
user="""You are given a list of code suggestions to improve a PR:
|
||||
|
||||
{{ suggestion_str|trim }}
|
||||
|
||||
|
||||
Your task is to sort the code suggestions by their order of importance, and return a list with sorting order.
|
||||
The sorting order is a list of pairs, where each pair contains the index of the suggestion in the original list.
|
||||
Rank the suggestions based on their importance to improving the PR, with critical issues first and minor issues last.
|
||||
|
||||
You must use the following YAML schema to format your answer:
|
||||
```yaml
|
||||
Sort Order:
|
||||
type: array
|
||||
maxItems: {{ suggestion_list|length }}
|
||||
uniqueItems: true
|
||||
items:
|
||||
suggestion number:
|
||||
type: integer
|
||||
minimum: 1
|
||||
maximum: {{ suggestion_list|length }}
|
||||
importance order:
|
||||
type: integer
|
||||
minimum: 1
|
||||
maximum: {{ suggestion_list|length }}
|
||||
```
|
||||
|
||||
Example output:
|
||||
```yaml
|
||||
Sort Order:
|
||||
- suggestion number: 1
|
||||
importance order: 2
|
||||
- suggestion number: 2
|
||||
importance order: 3
|
||||
- suggestion number: 3
|
||||
importance order: 1
|
||||
```
|
||||
|
||||
Make sure to output a valid YAML. Use multi-line block scalar ('|') if needed.
|
||||
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
Response (should be a valid YAML, and nothing else):
|
||||
```yaml
|
||||
"""
|
@ -2,11 +2,13 @@ import copy
|
||||
import json
|
||||
import logging
|
||||
import textwrap
|
||||
from typing import List
|
||||
|
||||
import yaml
|
||||
from jinja2 import Environment, StrictUndefined
|
||||
|
||||
from pr_agent.algo.ai_handler import AiHandler
|
||||
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
|
||||
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, get_pr_multi_diffs
|
||||
from pr_agent.algo.token_handler import TokenHandler
|
||||
from pr_agent.algo.utils import try_fix_json
|
||||
from pr_agent.config_loader import get_settings
|
||||
@ -22,6 +24,13 @@ class PRCodeSuggestions:
|
||||
self.git_provider.get_languages(), self.git_provider.get_files()
|
||||
)
|
||||
|
||||
# extended mode
|
||||
self.is_extended = any(["extended" in arg for arg in args])
|
||||
if self.is_extended:
|
||||
num_code_suggestions = get_settings().pr_code_suggestions.num_code_suggestions_per_chunk
|
||||
else:
|
||||
num_code_suggestions = get_settings().pr_code_suggestions.num_code_suggestions
|
||||
|
||||
self.ai_handler = AiHandler()
|
||||
self.patches_diff = None
|
||||
self.prediction = None
|
||||
@ -32,7 +41,7 @@ class PRCodeSuggestions:
|
||||
"description": self.git_provider.get_pr_description(),
|
||||
"language": self.main_language,
|
||||
"diff": "", # empty diff for initial calculation
|
||||
"num_code_suggestions": get_settings().pr_code_suggestions.num_code_suggestions,
|
||||
"num_code_suggestions": num_code_suggestions,
|
||||
"extra_instructions": get_settings().pr_code_suggestions.extra_instructions,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
@ -42,18 +51,26 @@ class PRCodeSuggestions:
|
||||
get_settings().pr_code_suggestions_prompt.user)
|
||||
|
||||
async def run(self):
|
||||
assert type(self.git_provider) != BitbucketProvider, "Bitbucket is not supported for now"
|
||||
|
||||
logging.info('Generating code suggestions for PR...')
|
||||
if get_settings().config.publish_output:
|
||||
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
|
||||
await retry_with_fallback_models(self._prepare_prediction)
|
||||
|
||||
logging.info('Preparing PR review...')
|
||||
data = self._prepare_pr_code_suggestions()
|
||||
if not self.is_extended:
|
||||
await retry_with_fallback_models(self._prepare_prediction)
|
||||
data = self._prepare_pr_code_suggestions()
|
||||
else:
|
||||
data = await retry_with_fallback_models(self._prepare_prediction_extended)
|
||||
|
||||
if (not self.is_extended and get_settings().pr_code_suggestions.rank_suggestions) or \
|
||||
(self.is_extended and get_settings().pr_code_suggestions.rank_extended_suggestions):
|
||||
logging.info('Ranking Suggestions...')
|
||||
data['Code suggestions'] = await self.rank_suggestions(data['Code suggestions'])
|
||||
|
||||
if get_settings().config.publish_output:
|
||||
logging.info('Pushing PR review...')
|
||||
self.git_provider.remove_initial_comment()
|
||||
logging.info('Pushing inline code comments...')
|
||||
logging.info('Pushing inline code suggestions...')
|
||||
self.push_inline_code_suggestions(data)
|
||||
|
||||
async def _prepare_prediction(self, model: str):
|
||||
@ -93,6 +110,10 @@ class PRCodeSuggestions:
|
||||
|
||||
def push_inline_code_suggestions(self, data):
|
||||
code_suggestions = []
|
||||
|
||||
if not data['Code suggestions']:
|
||||
return self.git_provider.publish_comment('No suggestions found to improve this PR.')
|
||||
|
||||
for d in data['Code suggestions']:
|
||||
try:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
@ -117,7 +138,11 @@ class PRCodeSuggestions:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Could not parse suggestion: {d}")
|
||||
|
||||
self.git_provider.publish_code_suggestions(code_suggestions)
|
||||
is_successful = self.git_provider.publish_code_suggestions(code_suggestions)
|
||||
if not is_successful:
|
||||
logging.info("Failed to publish code suggestions, trying to publish each suggestion separately")
|
||||
for code_suggestion in code_suggestions:
|
||||
self.git_provider.publish_code_suggestions([code_suggestion])
|
||||
|
||||
def dedent_code(self, relevant_file, relevant_lines_start, new_code_snippet):
|
||||
try: # dedent code snippet
|
||||
@ -141,3 +166,81 @@ class PRCodeSuggestions:
|
||||
|
||||
return new_code_snippet
|
||||
|
||||
async def _prepare_prediction_extended(self, model: str) -> dict:
|
||||
logging.info('Getting PR diff...')
|
||||
patches_diff_list = get_pr_multi_diffs(self.git_provider, self.token_handler, model,
|
||||
max_calls=get_settings().pr_code_suggestions.max_number_of_calls)
|
||||
|
||||
logging.info('Getting multi AI predictions...')
|
||||
prediction_list = []
|
||||
for i, patches_diff in enumerate(patches_diff_list):
|
||||
logging.info(f"Processing chunk {i + 1} of {len(patches_diff_list)}")
|
||||
self.patches_diff = patches_diff
|
||||
prediction = await self._get_prediction(model)
|
||||
prediction_list.append(prediction)
|
||||
self.prediction_list = prediction_list
|
||||
|
||||
data = {}
|
||||
for prediction in prediction_list:
|
||||
self.prediction = prediction
|
||||
data_per_chunk = self._prepare_pr_code_suggestions()
|
||||
if "Code suggestions" in data:
|
||||
data["Code suggestions"].extend(data_per_chunk["Code suggestions"])
|
||||
else:
|
||||
data.update(data_per_chunk)
|
||||
self.data = data
|
||||
return data
|
||||
|
||||
async def rank_suggestions(self, data: List) -> List:
|
||||
"""
|
||||
Call a model to rank (sort) code suggestions based on their importance order.
|
||||
|
||||
Args:
|
||||
data (List): A list of code suggestions to be ranked.
|
||||
|
||||
Returns:
|
||||
List: The ranked list of code suggestions.
|
||||
"""
|
||||
|
||||
suggestion_list = []
|
||||
# remove invalid suggestions
|
||||
for i, suggestion in enumerate(data):
|
||||
if suggestion['existing code'] != suggestion['improved code']:
|
||||
suggestion_list.append(suggestion)
|
||||
|
||||
data_sorted = [[]] * len(suggestion_list)
|
||||
|
||||
try:
|
||||
suggestion_str = ""
|
||||
for i, suggestion in enumerate(suggestion_list):
|
||||
suggestion_str += f"suggestion {i + 1}: " + str(suggestion) + '\n\n'
|
||||
|
||||
variables = {'suggestion_list': suggestion_list, 'suggestion_str': suggestion_str}
|
||||
model = get_settings().config.model
|
||||
environment = Environment(undefined=StrictUndefined)
|
||||
system_prompt = environment.from_string(get_settings().pr_sort_code_suggestions_prompt.system).render(
|
||||
variables)
|
||||
user_prompt = environment.from_string(get_settings().pr_sort_code_suggestions_prompt.user).render(variables)
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"\nSystem prompt:\n{system_prompt}")
|
||||
logging.info(f"\nUser prompt:\n{user_prompt}")
|
||||
response, finish_reason = await self.ai_handler.chat_completion(model=model, system=system_prompt,
|
||||
user=user_prompt)
|
||||
|
||||
sort_order = yaml.safe_load(response)
|
||||
for s in sort_order['Sort Order']:
|
||||
suggestion_number = s['suggestion number']
|
||||
importance_order = s['importance order']
|
||||
data_sorted[importance_order - 1] = suggestion_list[suggestion_number - 1]
|
||||
|
||||
if get_settings().pr_code_suggestions.final_clip_factor != 1:
|
||||
new_len = int(0.5 + len(data_sorted) * get_settings().pr_code_suggestions.final_clip_factor)
|
||||
data_sorted = data_sorted[:new_len]
|
||||
except Exception as e:
|
||||
if get_settings().config.verbosity_level >= 1:
|
||||
logging.info(f"Could not sort suggestions, error: {e}")
|
||||
data_sorted = suggestion_list
|
||||
|
||||
return data_sorted
|
||||
|
||||
|
||||
|
@ -42,6 +42,8 @@ class PRDescription:
|
||||
"extra_instructions": get_settings().pr_description.extra_instructions,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages()
|
||||
}
|
||||
|
||||
self.user_description = self.git_provider.get_user_description()
|
||||
|
||||
# Initialize the token handler
|
||||
self.token_handler = TokenHandler(
|
||||
@ -145,6 +147,9 @@ class PRDescription:
|
||||
# Load the AI prediction data into a dictionary
|
||||
data = load_yaml(self.prediction.strip())
|
||||
|
||||
if get_settings().pr_description.add_original_user_description and self.user_description:
|
||||
data["User Description"] = self.user_description
|
||||
|
||||
# Initialization
|
||||
pr_types = []
|
||||
|
||||
@ -161,13 +166,19 @@ class PRDescription:
|
||||
elif type(data['PR Type']) == str:
|
||||
pr_types = data['PR Type'].split(',')
|
||||
|
||||
# Assign the value of the 'PR Title' key to 'title' variable and remove it from the dictionary
|
||||
title = data.pop('PR Title')
|
||||
# Remove the 'PR Title' key from the dictionary
|
||||
ai_title = data.pop('PR Title')
|
||||
if get_settings().pr_description.keep_original_user_title:
|
||||
# Assign the original PR title to the 'title' variable
|
||||
title = self.vars["title"]
|
||||
else:
|
||||
# Assign the value of the 'PR Title' key to 'title' variable
|
||||
title = ai_title
|
||||
|
||||
# Iterate over the remaining dictionary items and append the key and value to 'pr_body' in a markdown format,
|
||||
# except for the items containing the word 'walkthrough'
|
||||
pr_body = ""
|
||||
for key, value in data.items():
|
||||
for idx, (key, value) in enumerate(data.items()):
|
||||
pr_body += f"## {key}:\n"
|
||||
if 'walkthrough' in key.lower():
|
||||
# for filename, description in value.items():
|
||||
@ -179,7 +190,9 @@ class PRDescription:
|
||||
# if the value is a list, join its items by comma
|
||||
if type(value) == list:
|
||||
value = ', '.join(v for v in value)
|
||||
pr_body += f"{value}\n\n___\n"
|
||||
pr_body += f"{value}\n"
|
||||
if idx < len(data) - 1:
|
||||
pr_body += "\n___\n"
|
||||
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"title:\n{title}\n{pr_body}")
|
||||
|
@ -23,7 +23,7 @@ class PRReviewer:
|
||||
"""
|
||||
The PRReviewer class is responsible for reviewing a pull request and generating feedback using an AI model.
|
||||
"""
|
||||
def __init__(self, pr_url: str, is_answer: bool = False, args: list = None):
|
||||
def __init__(self, pr_url: str, is_answer: bool = False, is_auto: bool = False, args: list = None):
|
||||
"""
|
||||
Initialize the PRReviewer object with the necessary attributes and objects to review a pull request.
|
||||
|
||||
@ -40,6 +40,7 @@ class PRReviewer:
|
||||
)
|
||||
self.pr_url = pr_url
|
||||
self.is_answer = is_answer
|
||||
self.is_auto = is_auto
|
||||
|
||||
if self.is_answer and not self.git_provider.is_supported("get_issue_comments"):
|
||||
raise Exception(f"Answer mode is not supported for {get_settings().config.git_provider} for now")
|
||||
@ -93,8 +94,12 @@ class PRReviewer:
|
||||
"""
|
||||
Review the pull request and generate feedback.
|
||||
"""
|
||||
logging.info('Reviewing PR...')
|
||||
|
||||
if self.is_auto and not get_settings().pr_reviewer.automatic_review:
|
||||
logging.info(f'Automatic review is disabled {self.pr_url}')
|
||||
return None
|
||||
|
||||
logging.info(f'Reviewing PR: {self.pr_url} ...')
|
||||
|
||||
if get_settings().config.publish_output:
|
||||
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
|
||||
|
||||
@ -237,7 +242,7 @@ class PRReviewer:
|
||||
return
|
||||
|
||||
review_text = self.prediction.strip()
|
||||
review_text = review_text.lstrip('```yaml').rstrip('`')
|
||||
review_text = review_text.removeprefix('```yaml').rstrip('`')
|
||||
try:
|
||||
data = yaml.load(review_text, Loader=SafeLoader)
|
||||
except Exception as e:
|
||||
|
@ -42,8 +42,9 @@ dependencies = [
|
||||
"atlassian-python-api==3.39.0",
|
||||
"GitPython~=3.1.32",
|
||||
"starlette-context==0.3.6",
|
||||
"litellm~=0.1.351",
|
||||
"PyYAML==6.0"
|
||||
"litellm~=0.1.445",
|
||||
"PyYAML==6.0",
|
||||
"boto3~=1.28.25"
|
||||
]
|
||||
|
||||
[project.urls]
|
||||
|
@ -11,7 +11,7 @@ pytest~=7.4.0
|
||||
aiohttp~=3.8.4
|
||||
atlassian-python-api==3.39.0
|
||||
GitPython~=3.1.32
|
||||
litellm~=0.1.351
|
||||
PyYAML==6.0
|
||||
starlette-context==0.3.6
|
||||
litellm~=0.1.351
|
||||
litellm~=0.1.445
|
||||
boto3~=1.28.25
|
||||
|
10
tests/unittest/test_bitbucket_provider.py
Normal file
10
tests/unittest/test_bitbucket_provider.py
Normal file
@ -0,0 +1,10 @@
|
||||
from pr_agent.git_providers.bitbucket_provider import BitbucketProvider
|
||||
|
||||
|
||||
class TestBitbucketProvider:
|
||||
def test_parse_pr_url(self):
|
||||
url = "https://bitbucket.org/WORKSPACE_XYZ/MY_TEST_REPO/pull-requests/321"
|
||||
workspace_slug, repo_slug, pr_number = BitbucketProvider._parse_pr_url(url)
|
||||
assert workspace_slug == "WORKSPACE_XYZ"
|
||||
assert repo_slug == "MY_TEST_REPO"
|
||||
assert pr_number == 321
|
136
tests/unittest/test_codecommit_client.py
Normal file
136
tests/unittest/test_codecommit_client.py
Normal file
@ -0,0 +1,136 @@
|
||||
from unittest.mock import MagicMock
|
||||
from pr_agent.git_providers.codecommit_client import CodeCommitClient
|
||||
|
||||
|
||||
class TestCodeCommitProvider:
|
||||
def test_get_differences(self):
|
||||
# Create a mock CodeCommitClient instance and codecommit_client member
|
||||
api = CodeCommitClient()
|
||||
api.boto_client = MagicMock()
|
||||
|
||||
# Mock the response from the AWS client for get_differences method
|
||||
api.boto_client.get_paginator.return_value.paginate.return_value = [
|
||||
{
|
||||
"differences": [
|
||||
{
|
||||
"beforeBlob": {
|
||||
"path": "file1.py",
|
||||
"blobId": "291b15c3ab4219e43a5f4f9091e5a97ee9d7400b",
|
||||
},
|
||||
"afterBlob": {
|
||||
"path": "file1.py",
|
||||
"blobId": "46ad86582da03cc34c804c24b17976571bca1eba",
|
||||
},
|
||||
"changeType": "M",
|
||||
},
|
||||
{
|
||||
"beforeBlob": {"path": "", "blobId": ""},
|
||||
"afterBlob": {
|
||||
"path": "file2.py",
|
||||
"blobId": "2404c7874fcbd684d6779c1420072f088647fd79",
|
||||
},
|
||||
"changeType": "A",
|
||||
},
|
||||
{
|
||||
"beforeBlob": {
|
||||
"path": "file3.py",
|
||||
"blobId": "9af7989045ce40e9478ebb8089dfbadac19a9cde",
|
||||
},
|
||||
"afterBlob": {"path": "", "blobId": ""},
|
||||
"changeType": "D",
|
||||
},
|
||||
{
|
||||
"beforeBlob": {
|
||||
"path": "file5.py",
|
||||
"blobId": "738e36eec120ef9d6393a149252698f49156d5b4",
|
||||
},
|
||||
"afterBlob": {
|
||||
"path": "file6.py",
|
||||
"blobId": "faecdb85f7ba199df927a783b261378a1baeca85",
|
||||
},
|
||||
"changeType": "R",
|
||||
},
|
||||
]
|
||||
}
|
||||
]
|
||||
|
||||
diffs = api.get_differences("my_test_repo", "commit1", "commit2")
|
||||
|
||||
assert len(diffs) == 4
|
||||
assert diffs[0].before_blob_path == "file1.py"
|
||||
assert diffs[0].before_blob_id == "291b15c3ab4219e43a5f4f9091e5a97ee9d7400b"
|
||||
assert diffs[0].after_blob_path == "file1.py"
|
||||
assert diffs[0].after_blob_id == "46ad86582da03cc34c804c24b17976571bca1eba"
|
||||
assert diffs[0].change_type == "M"
|
||||
assert diffs[1].before_blob_path == ""
|
||||
assert diffs[1].before_blob_id == ""
|
||||
assert diffs[1].after_blob_path == "file2.py"
|
||||
assert diffs[1].after_blob_id == "2404c7874fcbd684d6779c1420072f088647fd79"
|
||||
assert diffs[1].change_type == "A"
|
||||
assert diffs[2].before_blob_path == "file3.py"
|
||||
assert diffs[2].before_blob_id == "9af7989045ce40e9478ebb8089dfbadac19a9cde"
|
||||
assert diffs[2].after_blob_path == ""
|
||||
assert diffs[2].after_blob_id == ""
|
||||
assert diffs[2].change_type == "D"
|
||||
assert diffs[3].before_blob_path == "file5.py"
|
||||
assert diffs[3].before_blob_id == "738e36eec120ef9d6393a149252698f49156d5b4"
|
||||
assert diffs[3].after_blob_path == "file6.py"
|
||||
assert diffs[3].after_blob_id == "faecdb85f7ba199df927a783b261378a1baeca85"
|
||||
assert diffs[3].change_type == "R"
|
||||
|
||||
def test_get_file(self):
|
||||
# Create a mock CodeCommitClient instance and codecommit_client member
|
||||
api = CodeCommitClient()
|
||||
api.boto_client = MagicMock()
|
||||
|
||||
# Mock the response from the AWS client for get_pull_request method
|
||||
# def get_file(self, repo_name: str, file_path: str, sha_hash: str):
|
||||
api.boto_client.get_file.return_value = {
|
||||
"commitId": "6335d6d4496e8d50af559560997604bb03abc122",
|
||||
"blobId": "c172209495d7968a8fdad76469564fb708460bc1",
|
||||
"filePath": "requirements.txt",
|
||||
"fileSize": 65,
|
||||
"fileContent": b"boto3==1.28.25\ndynaconf==3.1.12\nfastapi==0.99.0\nPyGithub==1.59.*\n",
|
||||
}
|
||||
|
||||
repo_name = "my_test_repo"
|
||||
file_path = "requirements.txt"
|
||||
sha_hash = "84114a356ece1e5b7637213c8e486fea7c254656"
|
||||
content = api.get_file(repo_name, file_path, sha_hash)
|
||||
|
||||
assert len(content) == 65
|
||||
assert content == b"boto3==1.28.25\ndynaconf==3.1.12\nfastapi==0.99.0\nPyGithub==1.59.*\n"
|
||||
assert content.decode("utf-8") == "boto3==1.28.25\ndynaconf==3.1.12\nfastapi==0.99.0\nPyGithub==1.59.*\n"
|
||||
|
||||
def test_get_pr(self):
|
||||
# Create a mock CodeCommitClient instance and codecommit_client member
|
||||
api = CodeCommitClient()
|
||||
api.boto_client = MagicMock()
|
||||
|
||||
# Mock the response from the AWS client for get_pull_request method
|
||||
api.boto_client.get_pull_request.return_value = {
|
||||
"pullRequest": {
|
||||
"pullRequestId": "3",
|
||||
"title": "My PR",
|
||||
"description": "My PR description",
|
||||
"pullRequestTargets": [
|
||||
{
|
||||
"sourceCommit": "commit1",
|
||||
"sourceReference": "branch1",
|
||||
"destinationCommit": "commit2",
|
||||
"destinationReference": "branch2",
|
||||
"repositoryName": "my_test_repo",
|
||||
}
|
||||
],
|
||||
}
|
||||
}
|
||||
|
||||
pr = api.get_pr(321)
|
||||
|
||||
assert pr.title == "My PR"
|
||||
assert pr.description == "My PR description"
|
||||
assert len(pr.targets) == 1
|
||||
assert pr.targets[0].source_commit == "commit1"
|
||||
assert pr.targets[0].source_branch == "branch1"
|
||||
assert pr.targets[0].destination_commit == "commit2"
|
||||
assert pr.targets[0].destination_branch == "branch2"
|
119
tests/unittest/test_codecommit_provider.py
Normal file
119
tests/unittest/test_codecommit_provider.py
Normal file
@ -0,0 +1,119 @@
|
||||
import pytest
|
||||
from pr_agent.git_providers.codecommit_provider import CodeCommitFile
|
||||
from pr_agent.git_providers.codecommit_provider import CodeCommitProvider
|
||||
from pr_agent.git_providers.git_provider import EDIT_TYPE
|
||||
|
||||
|
||||
class TestCodeCommitFile:
|
||||
# Test that a CodeCommitFile object is created successfully with valid parameters.
|
||||
# Generated by CodiumAI
|
||||
def test_valid_parameters(self):
|
||||
a_path = "path/to/file_a"
|
||||
a_blob_id = "12345"
|
||||
b_path = "path/to/file_b"
|
||||
b_blob_id = "67890"
|
||||
edit_type = EDIT_TYPE.ADDED
|
||||
|
||||
file = CodeCommitFile(a_path, a_blob_id, b_path, b_blob_id, edit_type)
|
||||
|
||||
assert file.a_path == a_path
|
||||
assert file.a_blob_id == a_blob_id
|
||||
assert file.b_path == b_path
|
||||
assert file.b_blob_id == b_blob_id
|
||||
assert file.edit_type == edit_type
|
||||
assert file.filename == b_path
|
||||
|
||||
|
||||
class TestCodeCommitProvider:
|
||||
def test_parse_pr_url(self):
|
||||
url = "https://us-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/my_test_repo/pull-requests/321"
|
||||
repo_name, pr_number = CodeCommitProvider._parse_pr_url(url)
|
||||
assert repo_name == "my_test_repo"
|
||||
assert pr_number == 321
|
||||
|
||||
# Test that an error is raised when an invalid CodeCommit URL is provided to the set_pr() method of the CodeCommitProvider class.
|
||||
# Generated by CodiumAI
|
||||
def test_invalid_codecommit_url(self):
|
||||
provider = CodeCommitProvider()
|
||||
with pytest.raises(ValueError):
|
||||
provider.set_pr("https://example.com/codecommit/repositories/my_test_repo/pull-requests/4321")
|
||||
|
||||
def test_get_file_extensions(self):
|
||||
filenames = [
|
||||
"app.py",
|
||||
"cli.py",
|
||||
"composer.json",
|
||||
"composer.lock",
|
||||
"hello.py",
|
||||
"image1.jpg",
|
||||
"image2.JPG",
|
||||
"index.js",
|
||||
"provider.py",
|
||||
"README",
|
||||
"test.py",
|
||||
]
|
||||
expected_extensions = [
|
||||
".py",
|
||||
".py",
|
||||
".json",
|
||||
".lock",
|
||||
".py",
|
||||
".jpg",
|
||||
".jpg",
|
||||
".js",
|
||||
".py",
|
||||
"",
|
||||
".py",
|
||||
]
|
||||
extensions = CodeCommitProvider._get_file_extensions(filenames)
|
||||
assert extensions == expected_extensions
|
||||
|
||||
def test_get_language_percentages(self):
|
||||
extensions = [
|
||||
".py",
|
||||
".py",
|
||||
".json",
|
||||
".lock",
|
||||
".py",
|
||||
".jpg",
|
||||
".jpg",
|
||||
".js",
|
||||
".py",
|
||||
"",
|
||||
".py",
|
||||
]
|
||||
percentages = CodeCommitProvider._get_language_percentages(extensions)
|
||||
assert percentages[".py"] == 45
|
||||
assert percentages[".json"] == 9
|
||||
assert percentages[".lock"] == 9
|
||||
assert percentages[".jpg"] == 18
|
||||
assert percentages[".js"] == 9
|
||||
assert percentages[""] == 9
|
||||
|
||||
# The _get_file_extensions function needs the "." prefix on the extension,
|
||||
# but the _get_language_percentages function will work with or without the "." prefix
|
||||
extensions = [
|
||||
"txt",
|
||||
"py",
|
||||
"py",
|
||||
]
|
||||
percentages = CodeCommitProvider._get_language_percentages(extensions)
|
||||
assert percentages["py"] == 67
|
||||
assert percentages["txt"] == 33
|
||||
|
||||
# test an empty list
|
||||
percentages = CodeCommitProvider._get_language_percentages([])
|
||||
assert percentages == {}
|
||||
|
||||
def test_get_edit_type(self):
|
||||
assert CodeCommitProvider._get_edit_type("A") == EDIT_TYPE.ADDED
|
||||
assert CodeCommitProvider._get_edit_type("D") == EDIT_TYPE.DELETED
|
||||
assert CodeCommitProvider._get_edit_type("M") == EDIT_TYPE.MODIFIED
|
||||
assert CodeCommitProvider._get_edit_type("R") == EDIT_TYPE.RENAMED
|
||||
|
||||
assert CodeCommitProvider._get_edit_type("a") == EDIT_TYPE.ADDED
|
||||
assert CodeCommitProvider._get_edit_type("d") == EDIT_TYPE.DELETED
|
||||
assert CodeCommitProvider._get_edit_type("m") == EDIT_TYPE.MODIFIED
|
||||
assert CodeCommitProvider._get_edit_type("r") == EDIT_TYPE.RENAMED
|
||||
|
||||
assert CodeCommitProvider._get_edit_type("X") is None
|
@ -67,33 +67,11 @@ class TestConvertToMarkdown:
|
||||
]
|
||||
}
|
||||
expected_output = """\
|
||||
- 🎯 **Main theme:** Test
|
||||
- 📌 **Type of PR:** Test type
|
||||
- 🧪 **Relevant tests added:** no
|
||||
- ✨ **Focused PR:** Yes
|
||||
- 💡 **General PR suggestions:** general suggestion...
|
||||
|
||||
- 🤖 **Code feedback:**
|
||||
|
||||
- **Code example:**
|
||||
- **Before:**
|
||||
```
|
||||
Code before
|
||||
```
|
||||
- **After:**
|
||||
```
|
||||
Code after
|
||||
```
|
||||
|
||||
- **Code example:**
|
||||
- **Before:**
|
||||
```
|
||||
Code before 2
|
||||
```
|
||||
- **After:**
|
||||
```
|
||||
Code after 2
|
||||
```
|
||||
- 🎯 **Main theme:** Test\n\
|
||||
- 📌 **Type of PR:** Test type\n\
|
||||
- 🧪 **Relevant tests added:** no\n\
|
||||
- ✨ **Focused PR:** Yes\n\
|
||||
- **General PR suggestions:** general suggestion...\n\n\n- **<details><summary> 🤖 Code feedback:**</summary>\n\n - **Code example:**\n - **Before:**\n ```\n Code before\n ```\n - **After:**\n ```\n Code after\n ```\n\n - **Code example:**\n - **Before:**\n ```\n Code before 2\n ```\n - **After:**\n ```\n Code after 2\n ```\n\n</details>\
|
||||
"""
|
||||
assert convert_to_markdown(input_data).strip() == expected_output.strip()
|
||||
|
||||
@ -113,5 +91,5 @@ class TestConvertToMarkdown:
|
||||
'General PR suggestions': {},
|
||||
'Code suggestions': {}
|
||||
}
|
||||
expected_output = ""
|
||||
expected_output = ''
|
||||
assert convert_to_markdown(input_data).strip() == expected_output.strip()
|
||||
|
Reference in New Issue
Block a user