Compare commits

34 Commits

SHA1 Message Date
a81cbaa9bd Add in memory provider 2023-08-31 18:35:51 +03:00
e4e1eb6d6b Add Gitlab webhook secret 2023-08-31 18:29:36 +03:00
3d4a062251 Add Gitlab webhook secret 2023-08-31 18:13:47 +03:00
6378603cd8 small refactor of azure devops 2023-08-30 17:06:09 +03:00
588eb6e97f Merge remote-tracking branch 'origin/main' into pre_pr 2023-08-30 15:53:59 +03:00
c5d05d53cd Less restrictive requirements.txt 2023-08-30 15:08:47 +03:00
f6a48c4c8b Less restrictive requirements.txt 2023-08-30 12:32:19 +03:00
f619d60a78 Allow overriding GitHub app default action by using repo local file 2023-08-30 12:31:32 +03:00
d51e7ee5ad Code adjustment to support calling is library 2023-08-30 10:29:51 +03:00
f14c5d296a Merge pull request #251 from zmeir/zmeir-fix_azure_api: Fixed incorrect usage for Azure OpenAI API 2023-08-28 20:52:04 +03:00
18d46fb655 Merge pull request #250 from Codium-ai/tr/prompts_yaml: Refactor Code to Use YAML Instead of JSON for PR Code Suggestions 2023-08-28 20:25:31 +03:00
07bd926678 Merge pull request #249 from Codium-ai/tr/readme_updates: Enhancements to Installation Instructions and Readme 2023-08-28 20:22:41 +03:00
d3c7dcc407 AZURE_DEVOPS_AVAILABLE 2023-08-28 20:21:29 +03:00
f5dd7207dc Merge remote-tracking branch 'origin/main' into tr/prompts_yaml 2023-08-28 20:19:22 +03:00
e5e10d5ec5 Merge pull request #241 from szecsip/feature_azure_devops: Add Azure DevOps provider with basic functionality 2023-08-28 17:03:05 +03:00
314d13e25f Fixed incorrect usage for Azure OpenAI API 2023-08-28 16:13:26 +03:00
2dc2a45e4b yaml 2023-08-28 09:48:43 +03:00
3051dc50fb update README.md 2023-08-28 08:41:02 +03:00
e776cebc33 update README.md 2023-08-28 08:31:56 +03:00
33ef23289f Merge pull request #248 from Codium-ai/ok/requirements: Consolidation of Redundant Dependency Lists 2023-08-27 16:36:46 +03:00
85bc307186 Consolidate redundant dependency list 2023-08-27 16:00:38 +03:00
a0f53d23af Consolidate redundant dependency list 2023-08-27 15:58:14 +03:00
82ac9d447b Consolidate redundant dependency list 2023-08-27 15:39:45 +03:00
9286e61753 Consolidate redundant dependency list 2023-08-27 15:36:39 +03:00
56828f0170 Merge pull request #246 from Codium-ai/ok/bitbucket_server: Implementing Bitbucket Server Support 2023-08-27 10:27:00 +03:00
b94ed61219 Merge branch 'main' into feature_azure_devops 2023-08-24 16:41:33 +00:00
ceaff2a269 fix exception printing 2023-08-24 16:35:34 +00:00
12167bc3a1 fix imports 2023-08-24 16:34:20 +00:00
c163d47a63 fix imports 2023-08-24 15:22:14 +00:00
5d529a71ad some minor changes in Azure DevOps git provider 2023-08-24 15:20:00 +00:00
01d1cf98f4 init Azure DevOps git provider 2023-08-23 16:01:10 +00:00
52ba2793cd modify get_main_pr_language to handle azuredevops provided language format 2023-08-23 15:59:49 +00:00
524faadffb init AzureDevopsProvider 2023-08-13 23:00:45 +02:00
82710c2d15 add AzureDevopsProvider to __init__.py 2023-08-13 22:56:50 +02:00
29 changed files with 661 additions and 313 deletions

@ -1,9 +1,23 @@
## Installation ## Installation
To get started with PR-Agent quickly, you first need to acquire two tokens:
1. An OpenAI key from [here](https://platform.openai.com/), with access to GPT-4.
2. A GitHub personal access token (classic) with the repo scope.
There are several ways to use PR-Agent:
- [Method 1: Use Docker image (no installation required)](INSTALL.md#method-1-use-docker-image-no-installation-required)
- [Method 2: Run as a GitHub Action](INSTALL.md#method-2-run-as-a-github-action)
- [Method 3: Run from source](INSTALL.md#method-3-run-from-source)
- [Method 4: Run as a polling server](INSTALL.md#method-4-run-as-a-polling-server)
- [Method 5: Run as a GitHub App](INSTALL.md#method-5-run-as-a-github-app)
- [Method 6: Deploy as a Lambda Function](INSTALL.md#method-6---deploy-as-a-lambda-function)
- [Method 7: AWS CodeCommit](INSTALL.md#method-7---aws-codecommit-setup)
--- ---
#### Method 1: Use Docker image (no installation required) ### Method 1: Use Docker image (no installation required)
To request a review for a PR, or ask a question about a PR, you can run directly from the Docker image. Here's how: To request a review for a PR, or ask a question about a PR, you can run directly from the Docker image. Here's how:
@ -41,7 +55,7 @@ Possible questions you can ask include:
--- ---
#### Method 2: Run as a GitHub Action ### Method 2: Run as a GitHub Action
You can use our pre-built Github Action Docker image to run PR-Agent as a Github Action. You can use our pre-built Github Action Docker image to run PR-Agent as a Github Action.
@ -111,7 +125,7 @@ When you open your next PR, you should see a comment from `github-actions` bot w
--- ---
#### Method 3: Run from source ### Method 3: Run from source
1. Clone this repository: 1. Clone this repository:
@ -143,17 +157,9 @@ python pr_agent/cli.py --pr_url <pr_url> describe
python pr_agent/cli.py --pr_url <pr_url> improve python pr_agent/cli.py --pr_url <pr_url> improve
``` ```
5. **Debugging LLM API Calls**
If you're testing your codium/pr-agent server, and need to see if calls were made successfully + the exact call logs, you can use the [LiteLLM Debugger tool](https://docs.litellm.ai/docs/debugging/hosted_debugging).
You can do this by setting `litellm_debugger=true` in configuration.toml. Your Logs will be viewable in real-time @ `admin.litellm.ai/<your_email>`. Set your email in the `.secrets.toml` under 'user_email'.
<img src="./pics/debugger.png" width="900"/>
--- ---
#### Method 4: Run as a polling server ### Method 4: Run as a polling server
Request reviews by tagging your Github user on a PR Request reviews by tagging your Github user on a PR
Follow steps 1-3 of method 2. Follow steps 1-3 of method 2.
@ -165,7 +171,7 @@ python pr_agent/servers/github_polling.py
--- ---
#### Method 5: Run as a GitHub App ### Method 5: Run as a GitHub App
Allowing you to automate the review process on your private or public repositories. Allowing you to automate the review process on your private or public repositories.
1. Create a GitHub App from the [Github Developer Portal](https://docs.github.com/en/developers/apps/creating-a-github-app). 1. Create a GitHub App from the [Github Developer Portal](https://docs.github.com/en/developers/apps/creating-a-github-app).
@ -247,7 +253,7 @@ docker push codiumai/pr-agent:github_app # Push to your Docker repository
--- ---
#### Deploy as a Lambda Function ### Method 6 - Deploy as a Lambda Function
1. Follow steps 1-5 of [Method 5](#method-5-run-as-a-github-app). 1. Follow steps 1-5 of [Method 5](#method-5-run-as-a-github-app).
2. Build a docker image that can be used as a lambda function 2. Build a docker image that can be used as a lambda function
@ -266,7 +272,7 @@ docker push codiumai/pr-agent:github_app # Push to your Docker repository
--- ---
#### AWS CodeCommit Setup ### Method 7 - AWS CodeCommit Setup
Not all features have been added to CodeCommit yet. As of right now, CodeCommit has been implemented to run the pr-agent CLI on the command line, using AWS credentials stored in environment variables. (More features will be added in the future.) The following is a set of instructions to have pr-agent do a review of your CodeCommit pull request from the command line: Not all features have been added to CodeCommit yet. As of right now, CodeCommit has been implemented to run the pr-agent CLI on the command line, using AWS credentials stored in environment variables. (More features will be added in the future.) The following is a set of instructions to have pr-agent do a review of your CodeCommit pull request from the command line:
@ -281,7 +287,7 @@ Not all features have been added to CodeCommit yet. As of right now, CodeCommit
* Option B: Set `PYTHONPATH` and run the CLI in one command, for example: * Option B: Set `PYTHONPATH` and run the CLI in one command, for example:
* `PYTHONPATH="/PATH/TO/PROJECTS/pr-agent" python pr_agent/cli.py [--ARGS]` * `PYTHONPATH="/PATH/TO/PROJECTS/pr-agent" python pr_agent/cli.py [--ARGS]`
#### AWS CodeCommit IAM Role Example ##### AWS CodeCommit IAM Role Example
Example IAM permissions to that user to allow access to CodeCommit: Example IAM permissions to that user to allow access to CodeCommit:
@ -311,7 +317,7 @@ Example IAM permissions to that user to allow access to CodeCommit:
} }
``` ```
#### AWS CodeCommit Access Key and Secret ##### AWS CodeCommit Access Key and Secret
Example setting the Access Key and Secret using environment variables Example setting the Access Key and Secret using environment variables
@ -321,7 +327,7 @@ export AWS_SECRET_ACCESS_KEY="XXXXXXXXXXXXXXXX"
export AWS_DEFAULT_REGION="us-east-1" export AWS_DEFAULT_REGION="us-east-1"
``` ```
#### AWS CodeCommit CLI Example ##### AWS CodeCommit CLI Example
After you set up AWS CodeCommit using the instructions above, here is an example CLI run that tells pr-agent to **review** a given pull request. After you set up AWS CodeCommit using the instructions above, here is an example CLI run that tells pr-agent to **review** a given pull request.
(Replace your specific PYTHONPATH and PR URL in the example) (Replace your specific PYTHONPATH and PR URL in the example)
@ -331,3 +337,10 @@ PYTHONPATH="/PATH/TO/PROJECTS/pr-agent" python pr_agent/cli.py \
--pr_url https://us-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/MY_REPO_NAME/pull-requests/321 \ --pr_url https://us-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/MY_REPO_NAME/pull-requests/321 \
review review
``` ```
### Appendix - **Debugging LLM API Calls**
If you're testing your codium/pr-agent server, and need to see if calls were made successfully + the exact call logs, you can use the [LiteLLM Debugger tool](https://docs.litellm.ai/docs/debugging/hosted_debugging).
You can do this by setting `litellm_debugger=true` in configuration.toml. Your Logs will be viewable in real-time @ `admin.litellm.ai/<your_email>`. Set your email in the `.secrets.toml` under 'user_email'.
<img src="./pics/debugger.png" width="800"/>

MANIFEST.in Normal file
@ -0,0 +1 @@
recursive-include pr_agent/settings/ *.toml

@ -15,45 +15,45 @@ Making pull requests less painful with an AI agent
</div> </div>
<div style="text-align:left;"> <div style="text-align:left;">
CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and can provide several types of feedback: CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and can provide several types of PR feedback:
**Auto-Description**: Automatically generating PR description - title, type, summary, code walkthrough and PR labels. **Auto-Description**: Automatically generating [PR description](https://github.com/Codium-ai/pr-agent/pull/229#issue-1860711415) - title, type, summary, code walkthrough and labels.
\ \
**PR Review**: Adjustable feedback about the PR main theme, type, relevant tests, security issues, focus, score, and various suggestions for the PR content. **Auto Review**: [Adjustable feedback](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695022908) about the PR main theme, type, relevant tests, security issues, score, and various suggestions for the PR content.
\ \
**Question Answering**: Answering free-text questions about the PR. **Question Answering**: Answering [free-text questions](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695021332) about the PR.
\ \
**Code Suggestions**: Committable code suggestions for improving the PR. **Code Suggestions**: [Committable code suggestions](https://github.com/Codium-ai/pr-agent/pull/229#discussion_r1306919276) for improving the PR.
\ \
**Update Changelog**: Automatically updating the CHANGELOG.md file with the PR changes. **Update Changelog**: Automatically updating the CHANGELOG.md file with the [PR changes](https://github.com/Codium-ai/pr-agent/pull/168#discussion_r1282077645).
<h3>Example results:</h2> <h3>Example results:</h2>
</div> </div>
<h4>/describe:</h4> <h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1687561986">/describe:</a></h4>
<div align="center"> <div align="center">
<p float="center"> <p float="center">
<img src="https://www.codium.ai/images/describe-2.gif" width="800"> <img src="https://www.codium.ai/images/describe-2.gif" width="800">
</p> </p>
</div> </div>
<h4>/review:</h4> <h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695021901">/review:</a></h4>
<div align="center"> <div align="center">
<p float="center"> <p float="center">
<img src="https://www.codium.ai/images/review-2.gif" width="800"> <img src="https://www.codium.ai/images/review-2.gif" width="800">
</p> </p>
</div> </div>
<h4>/reflect_and_review:</h4> <h4><a href="https://github.com/Codium-ai/pr-agent/pull/78#issuecomment-1639739496">/reflect_and_review:</a></h4>
<div align="center"> <div align="center">
<p float="center"> <p float="center">
<img src="https://www.codium.ai/images/reflect_and_review.gif" width="800"> <img src="https://www.codium.ai/images/reflect_and_review.gif" width="800">
</p> </p>
</div> </div>
<h4>/ask:</h4> <h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695020538">/ask:</a></h4>
<div align="center"> <div align="center">
<p float="center"> <p float="center">
<img src="https://www.codium.ai/images/ask-2.gif" width="800"> <img src="https://www.codium.ai/images/ask-2.gif" width="800">
</p> </p>
</div> </div>
<h4>/improve:</h4> <h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695024952">/improve:</a></h4>
<div align="center"> <div align="center">
<p float="center"> <p float="center">
<img src="https://www.codium.ai/images/improve-2.gif" width="800"> <img src="https://www.codium.ai/images/improve-2.gif" width="800">
@ -82,6 +82,7 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Auto-Description | :white_check_mark: | :white_check_mark: | | | | | Auto-Description | :white_check_mark: | :white_check_mark: | | |
| | Improve Code | :white_check_mark: | :white_check_mark: | | | | | Improve Code | :white_check_mark: | :white_check_mark: | | |
| | ⮑ Extended | :white_check_mark: | :white_check_mark: | | |
| | Reflect and Review | :white_check_mark: | | | | | | Reflect and Review | :white_check_mark: | | | |
| | Update CHANGELOG.md | :white_check_mark: | | | | | | Update CHANGELOG.md | :white_check_mark: | | | |
| | | | | | | | | | | | | |
@ -134,7 +135,8 @@ There are several ways to use PR-Agent:
- Request reviews by tagging your GitHub user on a PR - Request reviews by tagging your GitHub user on a PR
- [Method 5: Run as a GitHub App](INSTALL.md#method-5-run-as-a-github-app) - [Method 5: Run as a GitHub App](INSTALL.md#method-5-run-as-a-github-app)
- Allowing you to automate the review process on your private or public repositories - Allowing you to automate the review process on your private or public repositories
- [Method 6: Deploy as a Lambda Function](INSTALL.md#method-6---deploy-as-a-lambda-function)
- [Method 7: AWS CodeCommit](INSTALL.md#method-7---aws-codecommit-setup)
## How it works ## How it works
@ -160,8 +162,9 @@ Here are some advantages of PR-Agent:
## Roadmap ## Roadmap
- [x] Support additional models, as a replacement for OpenAI (see [here](https://github.com/Codium-ai/pr-agent/pull/172)) - [x] Support additional models, as a replacement for OpenAI (see [here](https://github.com/Codium-ai/pr-agent/pull/172))
- [ ] Develop additional logic for handling large PRs - [x] Develop additional logic for handling large PRs (see [here](https://github.com/Codium-ai/pr-agent/pull/229))
- [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such as [ctags](https://github.com/universal-ctags/ctags) - [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such as [ctags](https://github.com/universal-ctags/ctags)
- [ ] PR-Agent for issues, and not just for pull requests
- [ ] Adding more tools. Possible directions: - [ ] Adding more tools. Possible directions:
- [x] PR description - [x] PR description
- [x] Inline code suggestions - [x] Inline code suggestions

@ -2,7 +2,8 @@ FROM python:3.10 as base
WORKDIR /app WORKDIR /app
ADD pyproject.toml . ADD pyproject.toml .
RUN pip install . && rm pyproject.toml ADD requirements.txt .
RUN pip install . && rm pyproject.toml requirements.txt
ENV PYTHONPATH=/app ENV PYTHONPATH=/app
FROM base as github_app FROM base as github_app

@ -87,8 +87,6 @@ class AiHandler:
f"Generating completion with {model}" f"Generating completion with {model}"
f"{(' from deployment ' + deployment_id) if deployment_id else ''}" f"{(' from deployment ' + deployment_id) if deployment_id else ''}"
) )
if self.azure:
model = self.azure + "/" + model
response = await acompletion( response = await acompletion(
model=model, model=model,
deployment_id=deployment_id, deployment_id=deployment_id,
@ -97,6 +95,7 @@ class AiHandler:
{"role": "user", "content": user} {"role": "user", "content": user}
], ],
temperature=temperature, temperature=temperature,
azure=self.azure,
force_timeout=get_settings().config.ai_timeout force_timeout=get_settings().config.ai_timeout
) )
except (APIError, Timeout, TryAgain) as e: except (APIError, Timeout, TryAgain) as e:

View File

@ -1,19 +1,17 @@
from __future__ import annotations from __future__ import annotations
import difflib
import logging import logging
import re
import traceback import traceback
from typing import Any, Callable, List, Tuple from typing import Callable, List, Tuple
from github import RateLimitExceededException from github import RateLimitExceededException
from pr_agent.algo import MAX_TOKENS from pr_agent.algo import MAX_TOKENS
from pr_agent.algo.git_patch_processing import convert_to_hunks_with_lines_numbers, extend_patch, handle_patch_deletions from pr_agent.algo.git_patch_processing import convert_to_hunks_with_lines_numbers, extend_patch, handle_patch_deletions
from pr_agent.algo.language_handler import sort_files_by_main_languages from pr_agent.algo.language_handler import sort_files_by_main_languages
from pr_agent.algo.token_handler import TokenHandler, get_token_encoder from pr_agent.algo.token_handler import TokenHandler
from pr_agent.config_loader import get_settings from pr_agent.config_loader import get_settings
from pr_agent.git_providers.git_provider import FilePatchInfo, GitProvider from pr_agent.git_providers.git_provider import GitProvider
DELETED_FILES_ = "Deleted files:\n" DELETED_FILES_ = "Deleted files:\n"
@ -247,99 +245,6 @@ def _get_all_deployments(all_models: List[str]) -> List[str]:
return all_deployments return all_deployments
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
relevant_file: str,
relevant_line_in_file: str) -> Tuple[int, int]:
"""
Find the line number and absolute position of a relevant line in a file.
Args:
diff_files (List[FilePatchInfo]): A list of FilePatchInfo objects representing the patches of files.
relevant_file (str): The name of the file where the relevant line is located.
relevant_line_in_file (str): The content of the relevant line.
Returns:
Tuple[int, int]: A tuple containing the line number and absolute position of the relevant line in the file.
"""
position = -1
absolute_position = -1
re_hunk_header = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
for file in diff_files:
if file.filename.strip() == relevant_file:
patch = file.patch
patch_lines = patch.splitlines()
# try to find the line in the patch using difflib, with some margin of error
matches_difflib: list[str | Any] = difflib.get_close_matches(relevant_line_in_file,
patch_lines, n=3, cutoff=0.93)
if len(matches_difflib) == 1 and matches_difflib[0].startswith('+'):
relevant_line_in_file = matches_difflib[0]
delta = 0
start1, size1, start2, size2 = 0, 0, 0, 0
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if relevant_line_in_file in line and line[0] != '-':
position = i
absolute_position = start2 + delta - 1
break
if position == -1 and relevant_line_in_file[0] == '+':
no_plus_line = relevant_line_in_file[1:].lstrip()
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if no_plus_line in line and line[0] != '-':
# The model might add a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
position = i
absolute_position = start2 + delta - 1
break
return position, absolute_position
def clip_tokens(text: str, max_tokens: int) -> str:
"""
Clip the number of tokens in a string to a maximum number of tokens.
Args:
text (str): The string to clip.
max_tokens (int): The maximum number of tokens allowed in the string.
Returns:
str: The clipped string.
"""
if not text:
return text
try:
encoder = get_token_encoder()
num_input_tokens = len(encoder.encode(text))
if num_input_tokens <= max_tokens:
return text
num_chars = len(text)
chars_per_token = num_chars / num_input_tokens
num_output_chars = int(chars_per_token * max_tokens)
clipped_text = text[:num_output_chars]
return clipped_text
except Exception as e:
logging.warning(f"Failed to clip tokens: {e}")
return text
def get_pr_multi_diffs(git_provider: GitProvider, def get_pr_multi_diffs(git_provider: GitProvider,
token_handler: TokenHandler, token_handler: TokenHandler,
model: str, model: str,
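
The `clip_tokens` helper removed from this file (and re-added in the utilities module later in this diff) does not re-encode the text after cutting it. It estimates an average characters-per-token ratio and truncates by character count. A minimal sketch of that idea, assuming a hypothetical whitespace tokenizer (`count_tokens`) in place of the project's `get_token_encoder()`:

```python
def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def clip_tokens_sketch(text: str, max_tokens: int) -> str:
    """Truncate text to roughly max_tokens using an average chars-per-token ratio."""
    if not text:
        return text
    num_input_tokens = count_tokens(text)
    # Nothing to clip (the zero check also avoids a division by zero below).
    if num_input_tokens == 0 or num_input_tokens <= max_tokens:
        return text
    chars_per_token = len(text) / num_input_tokens
    num_output_chars = int(chars_per_token * max_tokens)
    return text[:num_output_chars]
```

Because the cut is based on an average ratio, the result is only approximately `max_tokens` long; it may hold slightly more or fewer tokens depending on how token lengths are distributed in the text.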

@ -21,7 +21,7 @@ class TokenHandler:
method. method.
""" """
def __init__(self, pr, vars: dict, system, user): def __init__(self, vars: dict, system, user):
""" """
Initializes the TokenHandler object. Initializes the TokenHandler object.
@ -32,9 +32,9 @@ class TokenHandler:
- user: The user string. - user: The user string.
""" """
self.encoder = get_token_encoder() self.encoder = get_token_encoder()
self.prompt_tokens = self._get_system_user_tokens(pr, self.encoder, vars, system, user) self.prompt_tokens = self._get_system_user_tokens(self.encoder, vars, system, user)
def _get_system_user_tokens(self, pr, encoder, vars: dict, system, user): def _get_system_user_tokens(self, encoder, vars: dict, system, user):
""" """
Calculates the number of tokens in the system and user strings. Calculates the number of tokens in the system and user strings.

@ -5,14 +5,24 @@ import json
import logging import logging
import re import re
import textwrap import textwrap
from dataclasses import dataclass
from datetime import datetime from datetime import datetime
from typing import Any, List from enum import Enum
from typing import Any, List, Tuple, Optional
import yaml import yaml
from starlette_context import context from starlette_context import context
from pr_agent.algo.token_handler import get_token_encoder
from pr_agent.config_loader import get_settings, global_settings from pr_agent.config_loader import get_settings, global_settings
class EDIT_TYPE(Enum):
ADDED = 1
DELETED = 2
MODIFIED = 3
RENAMED = 4
def get_setting(key: str) -> Any: def get_setting(key: str) -> Any:
try: try:
key = key.upper() key = key.upper()
@ -276,7 +286,7 @@ def _fix_key_value(key: str, value: str):
def load_yaml(review_text: str) -> dict: def load_yaml(review_text: str) -> dict:
review_text = review_text.removeprefix('```yaml').rstrip('`') review_text = review_text.removeprefix('```yaml').rstrip('`')
try: try:
data = yaml.load(review_text, Loader=yaml.SafeLoader) data = yaml.safe_load(review_text)
except Exception as e: except Exception as e:
logging.error(f"Failed to parse AI prediction: {e}") logging.error(f"Failed to parse AI prediction: {e}")
data = try_fix_yaml(review_text) data = try_fix_yaml(review_text)
@ -294,3 +304,108 @@ def try_fix_yaml(review_text: str) -> dict:
except: except:
pass pass
return data return data
def clip_tokens(text: str, max_tokens: int) -> str:
"""
Clip the number of tokens in a string to a maximum number of tokens.
Args:
text (str): The string to clip.
max_tokens (int): The maximum number of tokens allowed in the string.
Returns:
str: The clipped string.
"""
if not text:
return text
try:
encoder = get_token_encoder()
num_input_tokens = len(encoder.encode(text))
if num_input_tokens <= max_tokens:
return text
num_chars = len(text)
chars_per_token = num_chars / num_input_tokens
num_output_chars = int(chars_per_token * max_tokens)
clipped_text = text[:num_output_chars]
return clipped_text
except Exception as e:
logging.warning(f"Failed to clip tokens: {e}")
return text
@dataclass
class FilePatchInfo:
base_file: str
head_file: str
patch: str
filename: str
tokens: int = -1
edit_type: EDIT_TYPE = EDIT_TYPE.MODIFIED
old_filename: str = None
language: Optional[str] = None
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
relevant_file: str,
relevant_line_in_file: str) -> Tuple[int, int]:
"""
Find the line number and absolute position of a relevant line in a file.
Args:
diff_files (List[FilePatchInfo]): A list of FilePatchInfo objects representing the patches of files.
relevant_file (str): The name of the file where the relevant line is located.
relevant_line_in_file (str): The content of the relevant line.
Returns:
Tuple[int, int]: A tuple containing the line number and absolute position of the relevant line in the file.
"""
position = -1
absolute_position = -1
re_hunk_header = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
for file in diff_files:
if file.filename.strip() == relevant_file:
patch = file.patch
patch_lines = patch.splitlines()
# try to find the line in the patch using difflib, with some margin of error
matches_difflib: list[str | Any] = difflib.get_close_matches(relevant_line_in_file,
patch_lines, n=3, cutoff=0.93)
if len(matches_difflib) == 1 and matches_difflib[0].startswith('+'):
relevant_line_in_file = matches_difflib[0]
delta = 0
start1, size1, start2, size2 = 0, 0, 0, 0
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if relevant_line_in_file in line and line[0] != '-':
position = i
absolute_position = start2 + delta - 1
break
if position == -1 and relevant_line_in_file[0] == '+':
no_plus_line = relevant_line_in_file[1:].lstrip()
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if no_plus_line in line and line[0] != '-':
# The model might add a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
position = i
absolute_position = start2 + delta - 1
break
return position, absolute_position
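
The position arithmetic in `find_line_number_of_relevant_line_in_file` relies on the unified-diff hunk header `@@ -start1,size1 +start2,size2 @@`: every line that is not a removal advances a counter, and the new-file line number is `start2 + delta - 1`. A minimal sketch of the same counting, reusing the regular expression from the function on a made-up patch (the helper name `new_file_line_numbers` is an illustrative assumption, not part of the project):

```python
import re

RE_HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def new_file_line_numbers(patch: str) -> dict[int, int]:
    """Map each patch line index to its line number in the new file.

    Hunk headers and removed ('-') lines have no new-file number and are skipped.
    """
    mapping = {}
    start2, delta = 0, 0
    for i, line in enumerate(patch.splitlines()):
        if line.startswith("@@"):
            match = RE_HUNK_HEADER.match(line)
            if match is None:
                continue  # malformed header: keep the previous hunk's counters
            start2 = int(match.group(3))  # first new-file line of this hunk
            delta = 0
        elif not line.startswith("-"):
            delta += 1
            mapping[i] = start2 + delta - 1
    return mapping
```

Unlike the function in the diff, this sketch converts only the `+start` group to an integer, so a header that omits the optional sizes (for example `@@ -1 +1 @@`) does not raise on a `None` group.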

@ -4,11 +4,13 @@ from pr_agent.git_providers.codecommit_provider import CodeCommitProvider
from pr_agent.git_providers.github_provider import GithubProvider from pr_agent.git_providers.github_provider import GithubProvider
from pr_agent.git_providers.gitlab_provider import GitLabProvider from pr_agent.git_providers.gitlab_provider import GitLabProvider
from pr_agent.git_providers.local_git_provider import LocalGitProvider from pr_agent.git_providers.local_git_provider import LocalGitProvider
from pr_agent.git_providers.azuredevops_provider import AzureDevopsProvider
_GIT_PROVIDERS = { _GIT_PROVIDERS = {
'github': GithubProvider, 'github': GithubProvider,
'gitlab': GitLabProvider, 'gitlab': GitLabProvider,
'bitbucket': BitbucketProvider, 'bitbucket': BitbucketProvider,
'azure': AzureDevopsProvider,
'codecommit': CodeCommitProvider, 'codecommit': CodeCommitProvider,
'local' : LocalGitProvider 'local' : LocalGitProvider
} }
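
The `_GIT_PROVIDERS` dictionary is a name-to-class registry, so registering Azure DevOps amounts to adding one key. A minimal sketch of how such a registry can be resolved, using hypothetical stand-in classes and a hypothetical `resolve_provider` helper (the project's actual lookup function is not shown in this diff):

```python
class GithubProviderStub:
    """Hypothetical stand-in for an existing provider class."""


class AzureDevopsProviderStub:
    """Hypothetical stand-in for the newly registered provider class."""


_PROVIDERS = {
    "github": GithubProviderStub,
    "azure": AzureDevopsProviderStub,
}


def resolve_provider(provider_id: str):
    """Return the provider class registered under provider_id."""
    try:
        return _PROVIDERS[provider_id]
    except KeyError:
        raise ValueError(f"Unknown git provider: {provider_id}") from None
```

Keeping the mapping in one dictionary means callers select a backend by its string identifier and never import provider modules directly.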

@ -0,0 +1,267 @@
import json
import logging
from typing import Optional, Tuple
from urllib.parse import urlparse
import os
AZURE_DEVOPS_AVAILABLE = True
try:
from msrest.authentication import BasicAuthentication
from azure.devops.connection import Connection
from azure.devops.v7_1.git.models import Comment, CommentThread, GitVersionDescriptor, GitPullRequest
except ImportError:
AZURE_DEVOPS_AVAILABLE = False
from ..config_loader import get_settings
from ..algo.utils import load_large_diff, FilePatchInfo, EDIT_TYPE, clip_tokens
from ..algo.language_handler import is_valid_file
class AzureDevopsProvider:
def __init__(self, pr_url: Optional[str] = None, incremental: Optional[bool] = False):
if not AZURE_DEVOPS_AVAILABLE:
raise ImportError("Azure DevOps provider is not available. Please install the required dependencies.")
self.azure_devops_client = self._get_azure_devops_client()
self.workspace_slug = None
self.repo_slug = None
self.repo = None
self.pr_num = None
self.pr = None
self.temp_comments = []
self.incremental = incremental
if pr_url:
self.set_pr(pr_url)
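
The `AZURE_DEVOPS_AVAILABLE` flag defers a missing-dependency failure from import time to the moment the provider is instantiated, so the module can still be imported by the registry when the Azure SDK is absent. A minimal sketch of the pattern, assuming a hypothetical package name `hypothetical_optional_sdk` that is not installed:

```python
SDK_AVAILABLE = True
try:
    # Hypothetical package, assumed NOT installed; it stands in for the Azure DevOps SDK.
    import hypothetical_optional_sdk  # noqa: F401
except ImportError:
    SDK_AVAILABLE = False


class OptionalProvider:
    """Importable without the SDK; fails only when someone tries to use it."""

    def __init__(self):
        if not SDK_AVAILABLE:
            raise ImportError("Optional SDK is not installed; install it to use this provider.")
```

The trade-off is that the error surfaces later, so the message raised in `__init__` should say clearly which dependency is missing.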
def is_supported(self, capability: str) -> bool:
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments', 'get_labels', 'remove_initial_comment']:
return False
return True
def set_pr(self, pr_url: str):
self.workspace_slug, self.repo_slug, self.pr_num = self._parse_pr_url(pr_url)
self.pr = self._get_pr()
def get_repo_settings(self):
try:
contents = self.azure_devops_client.get_item_content(repository_id=self.repo_slug,
project=self.workspace_slug, download=False,
include_content_metadata=False, include_content=True,
path=".pr_agent.toml")
return contents
except Exception as e:
logging.exception("get repo settings error")
return ""
def get_files(self):
files = []
for i in self.azure_devops_client.get_pull_request_commits(project=self.workspace_slug,
repository_id=self.repo_slug,
pull_request_id=self.pr_num):
changes_obj = self.azure_devops_client.get_changes(project=self.workspace_slug,
repository_id=self.repo_slug, commit_id=i.commit_id)
for c in changes_obj.changes:
files.append(c['item']['path'])
return list(set(files))
def get_diff_files(self) -> list[FilePatchInfo]:
try:
base_sha = self.pr.last_merge_target_commit
head_sha = self.pr.last_merge_source_commit
commits = self.azure_devops_client.get_pull_request_commits(project=self.workspace_slug,
repository_id=self.repo_slug,
pull_request_id=self.pr_num)
diff_files = []
diffs = []
diff_types = {}
for c in commits:
changes_obj = self.azure_devops_client.get_changes(project=self.workspace_slug,
repository_id=self.repo_slug, commit_id=c.commit_id)
for i in changes_obj.changes:
diffs.append(i['item']['path'])
diff_types[i['item']['path']] = i['changeType']
diffs = list(set(diffs))
for file in diffs:
if not is_valid_file(file):
continue
version = GitVersionDescriptor(version=head_sha.commit_id, version_type='commit')
new_file_content_str = self.azure_devops_client.get_item(repository_id=self.repo_slug,
path=file,
project=self.workspace_slug,
version_descriptor=version,
download=False,
include_content=True)
new_file_content_str = new_file_content_str.content
edit_type = EDIT_TYPE.MODIFIED
if diff_types[file] == 'add':
edit_type = EDIT_TYPE.ADDED
elif diff_types[file] == 'delete':
edit_type = EDIT_TYPE.DELETED
elif diff_types[file] == 'rename':
edit_type = EDIT_TYPE.RENAMED
version = GitVersionDescriptor(version=base_sha.commit_id, version_type='commit')
original_file_content_str = self.azure_devops_client.get_item(repository_id=self.repo_slug,
path=file,
project=self.workspace_slug,
version_descriptor=version,
download=False,
include_content=True)
original_file_content_str = original_file_content_str.content
patch = load_large_diff(file, new_file_content_str, original_file_content_str)
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str,
patch=patch,
filename=file,
edit_type=edit_type))
self.diff_files = diff_files
return diff_files
except Exception as e:
print(f"Error: {str(e)}")
return []
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
comment = Comment(content=pr_comment)
thread = CommentThread(comments=[comment])
thread_response = self.azure_devops_client.create_thread(comment_thread=thread, project=self.workspace_slug,
repository_id=self.repo_slug,
pull_request_id=self.pr_num)
if is_temporary:
self.temp_comments.append({'thread_id': thread_response.id, 'comment_id': comment.id})
def publish_description(self, pr_title: str, pr_body: str):
try:
updated_pr = GitPullRequest()
updated_pr.title = pr_title
updated_pr.description = pr_body
self.azure_devops_client.update_pull_request(project=self.workspace_slug,
repository_id=self.repo_slug,
pull_request_id=self.pr_num,
git_pull_request_to_update=updated_pr)
except Exception as e:
logging.exception(f"Could not update pull request {self.pr_num} description: {e}")
def remove_initial_comment(self):
return "" # not implemented yet
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Azure DevOps provider does not support publishing inline comment yet")
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Azure DevOps provider does not support creating inline comments yet")
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError("Azure DevOps provider does not support publishing inline comments yet")
def get_title(self):
return self.pr.title
def get_languages(self):
languages = []
files = self.azure_devops_client.get_items(project=self.workspace_slug, repository_id=self.repo_slug,
recursion_level="Full", include_content_metadata=True,
include_links=False, download=False)
for f in files:
if f.git_object_type == 'blob':
file_name, file_extension = os.path.splitext(f.path)
languages.append(file_extension[1:])
extension_counts = {}
for ext in languages:
if ext != '':
extension_counts[ext] = extension_counts.get(ext, 0) + 1
total_extensions = sum(extension_counts.values())
extension_percentages = {ext: (count / total_extensions) * 100 for ext, count in extension_counts.items()}
return extension_percentages
def get_pr_branch(self):
pr_info = self.azure_devops_client.get_pull_request_by_id(project=self.workspace_slug,
pull_request_id=self.pr_num)
source_branch = pr_info.source_ref_name.split('/')[-1]
return source_branch
def get_pr_description(self):
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
if max_tokens:
return clip_tokens(self.pr.description, max_tokens)
return self.pr.description
def get_user_id(self):
return 0
def get_issue_comments(self):
raise NotImplementedError("Azure DevOps provider does not support issue comments yet")
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
return True
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
return True
@staticmethod
def _parse_pr_url(pr_url: str) -> Tuple[str, str, int]:
parsed_url = urlparse(pr_url)
if 'azure.com' not in parsed_url.netloc:
raise ValueError("The provided URL is not a valid Azure DevOps URL")
path_parts = parsed_url.path.strip('/').split('/')
if len(path_parts) < 6 or path_parts[4] != 'pullrequest':
raise ValueError("The provided URL does not appear to be an Azure DevOps PR URL")
workspace_slug = path_parts[1]
repo_slug = path_parts[3]
try:
pr_number = int(path_parts[5])
except ValueError as e:
raise ValueError("Unable to convert PR number to integer") from e
return workspace_slug, repo_slug, pr_number
def _get_azure_devops_client(self):
try:
pat = get_settings().azure_devops.pat
org = get_settings().azure_devops.org
except AttributeError as e:
raise ValueError(
"Azure DevOps PAT and organization URL are required") from e
credentials = BasicAuthentication('', pat)
azure_devops_connection = Connection(base_url=org, creds=credentials)
azure_devops_client = azure_devops_connection.clients.get_git_client()
return azure_devops_client
def _get_repo(self):
if self.repo is None:
self.repo = self.azure_devops_client.get_repository(project=self.workspace_slug,
repository_id=self.repo_slug)
return self.repo
def _get_pr(self):
self.pr = self.azure_devops_client.get_pull_request_by_id(pull_request_id=self.pr_num, project=self.workspace_slug)
return self.pr
def get_commit_messages(self):
return "" # not implemented yet
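
Review note: the path-segment indexing in `_parse_pr_url` can be checked in isolation. The sketch below is a standalone version of the same logic, assuming URLs of the form `https://dev.azure.com/<org>/<project>/_git/<repo>/pullrequest/<id>`. The function name `parse_azure_pr_url` and the example URL are illustrative and not part of this change.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_azure_pr_url(pr_url: str) -> Tuple[str, str, int]:
    """Return (project, repository, PR number) from an Azure DevOps PR URL."""
    parsed = urlparse(pr_url)
    if 'azure.com' not in parsed.netloc:
        raise ValueError("The provided URL is not a valid Azure DevOps URL")

    # Assumed layout: [org, project, '_git', repo, 'pullrequest', id]
    parts = parsed.path.strip('/').split('/')
    if len(parts) < 6 or parts[4] != 'pullrequest':
        raise ValueError("The provided URL does not appear to be an Azure DevOps PR URL")

    try:
        pr_number = int(parts[5])
    except ValueError as e:
        raise ValueError("Unable to convert PR number to integer") from e

    return parts[1], parts[3], pr_number


# Hypothetical URL, used only to exercise the indexing
print(parse_azure_pr_url("https://dev.azure.com/myorg/myproject/_git/myrepo/pullrequest/42"))
```

Since three values are returned, the return annotation would need to be a three-element tuple.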

View File

@ -8,7 +8,8 @@ from atlassian.bitbucket import Cloud
from starlette_context import context from starlette_context import context
from ..config_loader import get_settings from ..config_loader import get_settings
from .git_provider import FilePatchInfo, GitProvider from .git_provider import GitProvider
from ..algo.utils import FilePatchInfo
class BitbucketProvider(GitProvider): class BitbucketProvider(GitProvider):

View File

@ -1,6 +1,9 @@
import boto3 try: # Allow this module to be imported without requiring boto3
import botocore import boto3
import botocore
except ModuleNotFoundError:
boto3 = None
botocore = None
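
Review note: with the guarded import, any code path that actually needs the SDK has to handle the `None` case. A minimal sketch of that pattern follows, written generically with `importlib` so it runs without boto3 installed. The helpers `optional_import` and `require` are hypothetical names, not part of this change.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module if it is installed, otherwise return None."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


def require(module: Optional[ModuleType], name: str) -> ModuleType:
    """Fail with a clear message at the point of use, not at import time."""
    if module is None:
        raise ImportError(f"{name} is required for this feature but is not installed")
    return module


boto3 = optional_import("boto3")  # may be None; importing this file still succeeds
```

A client constructor could then call `require(boto3, "boto3")` before touching the SDK, so a missing dependency surfaces as a readable error only when the feature is used.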
class CodeCommitDifferencesResponse: class CodeCommitDifferencesResponse:
""" """

View File

@ -4,13 +4,12 @@ from collections import Counter
from typing import List, Optional, Tuple from typing import List, Optional, Tuple
from urllib.parse import urlparse from urllib.parse import urlparse
from ..algo.language_handler import is_valid_file, language_extension_map
from ..algo.pr_processing import clip_tokens
from ..algo.utils import load_large_diff
from ..config_loader import get_settings
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider, IncrementalPR
from pr_agent.git_providers.codecommit_client import CodeCommitClient from pr_agent.git_providers.codecommit_client import CodeCommitClient
from ..algo.language_handler import is_valid_file, language_extension_map
from ..algo.utils import EDIT_TYPE, FilePatchInfo, load_large_diff
from .git_provider import GitProvider
class PullRequestCCMimic: class PullRequestCCMimic:
""" """

View File

@ -1,27 +1,10 @@
import logging
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
from dataclasses import dataclass
# enum EDIT_TYPE (ADDED, DELETED, MODIFIED, RENAMED) # enum EDIT_TYPE (ADDED, DELETED, MODIFIED, RENAMED)
from enum import Enum
from typing import Optional from typing import Optional
from pr_agent.algo.utils import FilePatchInfo
class EDIT_TYPE(Enum):
ADDED = 1
DELETED = 2
MODIFIED = 3
RENAMED = 4
@dataclass
class FilePatchInfo:
base_file: str
head_file: str
patch: str
filename: str
tokens: int = -1
edit_type: EDIT_TYPE = EDIT_TYPE.MODIFIED
old_filename: str = None
class GitProvider(ABC): class GitProvider(ABC):
@ -87,7 +70,7 @@ class GitProvider(ABC):
def get_pr_description(self) -> str: def get_pr_description(self) -> str:
from pr_agent.config_loader import get_settings from pr_agent.config_loader import get_settings
from pr_agent.algo.pr_processing import clip_tokens from pr_agent.algo.utils import clip_tokens
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None) max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
description = self.get_pr_description_full() description = self.get_pr_description_full()
if max_tokens: if max_tokens:
@ -137,6 +120,8 @@ def get_main_pr_language(languages, files) -> str:
# validate that the specific commit uses the main language # validate that the specific commit uses the main language
extension_list = [] extension_list = []
for file in files: for file in files:
if isinstance(file, str):
file = FilePatchInfo(base_file=None, head_file=None, patch=None, filename=file)
extension_list.append(file.filename.rsplit('.')[-1]) extension_list.append(file.filename.rsplit('.')[-1])
# get the most common extension # get the most common extension
@ -158,10 +143,12 @@ def get_main_pr_language(languages, files) -> str:
most_common_extension == 'scala' and top_language == 'scala' or \ most_common_extension == 'scala' and top_language == 'scala' or \
most_common_extension == 'kt' and top_language == 'kotlin' or \ most_common_extension == 'kt' and top_language == 'kotlin' or \
most_common_extension == 'pl' and top_language == 'perl' or \ most_common_extension == 'pl' and top_language == 'perl' or \
most_common_extension == 'swift' and top_language == 'swift': most_common_extension == 'swift' and top_language == 'swift' or \
most_common_extension == top_language:
main_language_str = top_language main_language_str = top_language
except Exception: except Exception as e:
logging.exception(e)
pass pass
return main_language_str return main_language_str
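
Review note: the new `isinstance(file, str)` branch lets `get_main_pr_language` accept plain file paths as well as `FilePatchInfo` objects. The sketch below isolates that normalization step, using a stand-in dataclass `FileStub` (an assumption, so the example runs without `pr_agent`) in place of `FilePatchInfo`.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass
class FileStub:
    """Stand-in for FilePatchInfo; only the field used here is modeled."""
    filename: str


def most_common_extension(files: Iterable[Union[str, FileStub]]) -> str:
    extensions = []
    for file in files:
        # Wrap bare paths so the rest of the loop can rely on `.filename`
        if isinstance(file, str):
            file = FileStub(filename=file)
        extensions.append(file.filename.rsplit('.')[-1])
    if not extensions:
        return ''
    return Counter(extensions).most_common(1)[0][0]


print(most_common_extension(["a.py", FileStub("b.py"), "c.js"]))
```

Note that a name with no dot (for example `Makefile`) yields the whole name as its "extension", which matches the behavior of `rsplit('.')[-1]` in the diff.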

View File

@ -9,10 +9,9 @@ from github import AppAuthentication, Auth, Github, GithubException, Reaction
from retry import retry from retry import retry
from starlette_context import context from starlette_context import context
from .git_provider import FilePatchInfo, GitProvider, IncrementalPR from .git_provider import GitProvider, IncrementalPR
from ..algo.language_handler import is_valid_file from ..algo.language_handler import is_valid_file
from ..algo.utils import load_large_diff from ..algo.utils import load_large_diff, clip_tokens, find_line_number_of_relevant_line_in_file, FilePatchInfo
from ..algo.pr_processing import find_line_number_of_relevant_line_in_file, clip_tokens
from ..config_loader import get_settings from ..config_loader import get_settings
from ..servers.utils import RateLimitExceeded from ..servers.utils import RateLimitExceeded

View File

@ -7,10 +7,9 @@ import gitlab
from gitlab import GitlabGetError from gitlab import GitlabGetError
from ..algo.language_handler import is_valid_file from ..algo.language_handler import is_valid_file
from ..algo.pr_processing import clip_tokens from ..algo.utils import load_large_diff, clip_tokens, EDIT_TYPE, FilePatchInfo
from ..algo.utils import load_large_diff
from ..config_loader import get_settings from ..config_loader import get_settings
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider from .git_provider import GitProvider
logger = logging.getLogger() logger = logging.getLogger()

View File

@ -0,0 +1,79 @@
import itertools
from collections import Counter
from typing import List, Optional
from pr_agent.algo.utils import FilePatchInfo
from pr_agent.git_providers.git_provider import GitProvider
class InMemoryProvider(GitProvider):
def __init__(self, head_branch: str, target_branch: str, files: List[FilePatchInfo]):
self.head_branch = head_branch
self.target_branch = target_branch
self.files = files
def is_supported(self, capability: str) -> bool:
pass
def get_files(self) -> list[FilePatchInfo]:
return self.files
def get_diff_files(self) -> list[FilePatchInfo]:
return self.get_files()
def publish_description(self, pr_title: str, pr_body: str):
pass
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
pass
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
def publish_inline_comments(self, comments: list[dict]):
pass
def publish_code_suggestions(self, code_suggestions: list) -> bool:
pass
def publish_labels(self, labels):
pass
def get_labels(self):
pass
def remove_initial_comment(self):
pass
def get_languages(self):
language_count = Counter(file.language for file in self.files)
return dict(language_count)
def get_pr_branch(self):
pass
def get_user_id(self):
pass
def get_pr_description_full(self) -> str:
pass
def get_issue_comments(self):
pass
def get_repo_settings(self):
pass
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
pass
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
pass
def get_commit_messages(self):
pass
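
Review note: `get_languages` reads `file.language`, while the `FilePatchInfo` dataclass removed from `git_provider.py` in this diff has no such field. Whether the relocated class in `algo/utils` defines it is not visible here. The sketch below shows a defensive variant under that uncertainty; `PatchStub` and `count_languages` are hypothetical names used for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PatchStub:
    """Stand-in for FilePatchInfo with an assumed optional `language` field."""
    filename: str
    language: Optional[str] = None


def count_languages(files: List[PatchStub]) -> Dict[str, int]:
    # getattr keeps this working for objects that lack `language` entirely;
    # missing or empty values are grouped under "unknown"
    counts = Counter(getattr(f, "language", None) or "unknown" for f in files)
    return dict(counts)


files = [PatchStub("a.py", "python"), PatchStub("b.py", "python"), PatchStub("c")]
print(count_languages(files))
```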

View File

@ -6,7 +6,8 @@ from typing import List
from git import Repo from git import Repo
from pr_agent.config_loader import _find_repository_root, get_settings from pr_agent.config_loader import _find_repository_root, get_settings
from pr_agent.git_providers.git_provider import EDIT_TYPE, FilePatchInfo, GitProvider from pr_agent.git_providers.git_provider import GitProvider
from pr_agent.algo.utils import EDIT_TYPE, FilePatchInfo
class PullRequestMimic: class PullRequestMimic:

View File

@ -1,8 +1,8 @@
[pr_code_suggestions_prompt] [pr_code_suggestions_prompt]
system="""You are a language model called PR-Code-Reviewer. system="""You are a language model called PR-Code-Reviewer, that specializes in suggesting code improvements for Pull Request (PR).
Your task is to provide meaningful actionable code suggestions, to improve the new code presented in a PR. Your task is to provide meaningful and actionable code suggestions, to improve the new code presented in a PR.
Example PR Diff input: Example for a PR Diff input:
' '
## src/file1.py ## src/file1.py
@ -10,8 +10,8 @@ Example PR Diff input:
__new hunk__ __new hunk__
12 code line that already existed in the file... 12 code line that already existed in the file...
13 code line that already existed in the file.... 13 code line that already existed in the file....
14 +new code line added in the PR 14 +new code line1 added in the PR
15 code line that already existed in the file... 15 +new code line2 added in the PR
16 code line that already existed in the file... 16 code line that already existed in the file...
__old hunk__ __old hunk__
code line that already existed in the file... code line that already existed in the file...
@ -31,13 +31,17 @@ __old hunk__
' '
Specific instructions: Specific instructions:
- Focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
- Suggestions should refer only to code from the '__new hunk__' sections, and focus on new lines of code (lines starting with '+').
- Provide the exact line number range (inclusive) for each issue.
- Assume there is additional relevant code, that is not included in the diff.
- Provide up to {{ num_code_suggestions }} code suggestions. - Provide up to {{ num_code_suggestions }} code suggestions.
- Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the '__new hunk__' code. - Prioritize suggestions that address major problems, issues and bugs in the code.
- Don't suggest to add docstring or type hints. As a second priority, suggestions should focus on best practices, code readability, maintainability, enhancements, performance, and other aspects.
Don't suggest to add docstring or type hints.
Try to provide diverse and insightful suggestions.
- Suggestions should refer only to code from the '__new hunk__' sections, and focus on new lines of code (lines starting with '+').
Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the '__new hunk__' code.
For each suggestion, make sure to take into consideration also the context, meaning the lines before and after the relevant code.
- Provide the exact line numbers range (inclusive) for each issue.
- Assume there is additional relevant code, that is not included in the diff.
{%- if extra_instructions %} {%- if extra_instructions %}
@ -45,63 +49,76 @@ Extra instructions from the user:
{{ extra_instructions }} {{ extra_instructions }}
{%- endif %} {%- endif %}
You must use the following JSON schema to format your answer: You must use the following YAML schema to format your answer:
```json ```yaml
{ Code suggestions:
"Code suggestions": { type: array
"type": "array", minItems: 1
"minItems": 1, maxItems: {{ num_code_suggestions }}
"maxItems": {{ num_code_suggestions }}, uniqueItems: true
"uniqueItems": "true", items:
"items": { relevant file:
"relevant file": { type: string
"type": "string", description: the relevant file full path
"description": "the relevant file full path" suggestion content:
}, type: string
"suggestion content": { description: |-
"type": "string", a concrete suggestion for meaningfully improving the new PR code.
"description": "a concrete suggestion for meaningfully improving the new PR code (lines from the '__new hunk__' sections, starting with '+')." existing code:
}, type: string
"existing code": { description: |-
"type": "string", a code snippet showing the relevant code lines from a '__new hunk__' section.
"description": "a code snippet showing the relevant code lines from a '__new hunk__' section. It must be continuous, correctly formatted and indented, and without line numbers." It must be continuous, correctly formatted and indented, and without line numbers.
}, relevant lines:
"relevant lines": { type: string
"type": "string", description: |-
"description": "the relevant lines from a '__new hunk__' section, in the format of 'start_line-end_line'. For example: '10-15'. They should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above." the relevant lines from a '__new hunk__' section, in the format of 'start_line-end_line'.
}, For example: '10-15'. They should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above.
"improved code": { improved code:
"type": "string", type: string
"description": "a new code snippet that can be used to replace the relevant lines in '__new hunk__' code. Replacement suggestions should be complete, correctly formatted and indented, and without line numbers." description: |-
} a new code snippet that can be used to replace the relevant lines in '__new hunk__' code.
} Replacement suggestions should be complete, correctly formatted and indented, and without line numbers.
}
}
``` ```
Don't output line numbers in the 'improved code' snippets. Example output:
```yaml
Code suggestions:
- relevant file: |-
src/file1.py
suggestion content: |-
Add a docstring to func1()
existing code: |-
def func1():
relevant lines: '12-12'
improved code: |-
...
```
Each YAML output MUST be after a newline, indented, with block scalar indicator ('|-').
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields. Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
""" """
user="""PR Info: user="""PR Info:
Title: '{{title}}'
Branch: '{{branch}}'
Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages: Title: '{{title}}'
{{commit_messages_str}}
Branch: '{{branch}}'
Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %} {%- endif %}
The PR Diff: The PR Diff:
``` ```
{{diff}} {{- diff|trim }}
``` ```
Response (should be a valid JSON, and nothing else): Response (should be a valid YAML, and nothing else):
```json ```yaml
""" """

View File

@ -1,16 +1,13 @@
import copy import copy
import json
import logging import logging
import textwrap import textwrap
from typing import List from typing import List, Dict
import yaml
from jinja2 import Environment, StrictUndefined from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, get_pr_multi_diffs from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, get_pr_multi_diffs
from pr_agent.algo.token_handler import TokenHandler from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import try_fix_json from pr_agent.algo.utils import load_yaml
from pr_agent.config_loader import get_settings from pr_agent.config_loader import get_settings
from pr_agent.git_providers import BitbucketProvider, get_git_provider from pr_agent.git_providers import BitbucketProvider, get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language from pr_agent.git_providers.git_provider import get_main_pr_language
@ -45,9 +42,7 @@ class PRCodeSuggestions:
"extra_instructions": get_settings().pr_code_suggestions.extra_instructions, "extra_instructions": get_settings().pr_code_suggestions.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(), "commit_messages_str": self.git_provider.get_commit_messages(),
} }
self.token_handler = TokenHandler(self.git_provider.pr, self.token_handler = TokenHandler(self.vars, get_settings().pr_code_suggestions_prompt.system,
self.vars,
get_settings().pr_code_suggestions_prompt.system,
get_settings().pr_code_suggestions_prompt.user) get_settings().pr_code_suggestions_prompt.user)
async def run(self): async def run(self):
@ -98,14 +93,11 @@ class PRCodeSuggestions:
return response return response
def _prepare_pr_code_suggestions(self) -> str: def _prepare_pr_code_suggestions(self) -> Dict:
review = self.prediction.strip() review = self.prediction.strip()
try: data = load_yaml(review)
data = json.loads(review) if isinstance(data, list):
except json.decoder.JSONDecodeError: data = {'Code suggestions': data}
if get_settings().config.verbosity_level >= 2:
logging.info(f"Could not parse json response: {review}")
data = try_fix_json(review, code_suggestions=True)
return data return data
def push_inline_code_suggestions(self, data): def push_inline_code_suggestions(self, data):
@ -227,7 +219,7 @@ class PRCodeSuggestions:
response, finish_reason = await self.ai_handler.chat_completion(model=model, system=system_prompt, response, finish_reason = await self.ai_handler.chat_completion(model=model, system=system_prompt,
user=user_prompt) user=user_prompt)
sort_order = yaml.safe_load(response) sort_order = load_yaml(response)
for s in sort_order['Sort Order']: for s in sort_order['Sort Order']:
suggestion_number = s['suggestion number'] suggestion_number = s['suggestion number']
importance_order = s['importance order'] importance_order = s['importance order']
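
Review note: `_prepare_pr_code_suggestions` now wraps a bare list in a `'Code suggestions'` mapping, since the model may return either shape. A minimal sketch of that normalization on already-parsed data is given below. The function name and the empty-list fallback for unexpected shapes are assumptions added for illustration; the diff itself only handles the list case.

```python
from typing import Any, Dict


def normalize_suggestions(data: Any) -> Dict[str, list]:
    """Coerce parsed model output into {'Code suggestions': [...]}."""
    if isinstance(data, list):
        # The model emitted the items without the top-level key
        return {'Code suggestions': data}
    if isinstance(data, dict) and isinstance(data.get('Code suggestions'), list):
        return data
    # Assumption: treat anything else (None, scalars) as "no suggestions"
    return {'Code suggestions': []}


item = {'relevant file': 'src/file1.py', 'relevant lines': '12-12'}
print(normalize_suggestions([item]))
```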

View File

@ -46,12 +46,8 @@ class PRDescription:
self.user_description = self.git_provider.get_user_description() self.user_description = self.git_provider.get_user_description()
# Initialize the token handler # Initialize the token handler
self.token_handler = TokenHandler( self.token_handler = TokenHandler(self.vars, get_settings().pr_description_prompt.system,
self.git_provider.pr, get_settings().pr_description_prompt.user)
self.vars,
get_settings().pr_description_prompt.system,
get_settings().pr_description_prompt.user,
)
# Initialize patches_diff and prediction attributes # Initialize patches_diff and prediction attributes
self.patches_diff = None self.patches_diff = None

View File

@ -26,9 +26,7 @@ class PRInformationFromUser:
"diff": "", # empty diff for initial calculation "diff": "", # empty diff for initial calculation
"commit_messages_str": self.git_provider.get_commit_messages(), "commit_messages_str": self.git_provider.get_commit_messages(),
} }
self.token_handler = TokenHandler(self.git_provider.pr, self.token_handler = TokenHandler(self.vars, get_settings().pr_information_from_user_prompt.system,
self.vars,
get_settings().pr_information_from_user_prompt.system,
get_settings().pr_information_from_user_prompt.user) get_settings().pr_information_from_user_prompt.user)
self.patches_diff = None self.patches_diff = None
self.prediction = None self.prediction = None

View File

@ -29,9 +29,7 @@ class PRQuestions:
"questions": self.question_str, "questions": self.question_str,
"commit_messages_str": self.git_provider.get_commit_messages(), "commit_messages_str": self.git_provider.get_commit_messages(),
} }
self.token_handler = TokenHandler(self.git_provider.pr, self.token_handler = TokenHandler(self.vars, get_settings().pr_questions_prompt.system,
self.vars,
get_settings().pr_questions_prompt.system,
get_settings().pr_questions_prompt.user) get_settings().pr_questions_prompt.user)
self.patches_diff = None self.patches_diff = None
self.prediction = None self.prediction = None

View File

@ -9,8 +9,7 @@ from jinja2 import Environment, StrictUndefined
from yaml import SafeLoader from yaml import SafeLoader
from pr_agent.algo.ai_handler import AiHandler from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, \ from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
find_line_number_of_relevant_line_in_file, clip_tokens
from pr_agent.algo.token_handler import TokenHandler from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import convert_to_markdown, try_fix_json, try_fix_yaml, load_yaml from pr_agent.algo.utils import convert_to_markdown, try_fix_json, try_fix_yaml, load_yaml
from pr_agent.config_loader import get_settings from pr_agent.config_loader import get_settings
@ -66,12 +65,8 @@ class PRReviewer:
"commit_messages_str": self.git_provider.get_commit_messages(), "commit_messages_str": self.git_provider.get_commit_messages(),
} }
self.token_handler = TokenHandler( self.token_handler = TokenHandler(self.vars, get_settings().pr_review_prompt.system,
self.git_provider.pr, get_settings().pr_review_prompt.user)
self.vars,
get_settings().pr_review_prompt.system,
get_settings().pr_review_prompt.user
)
def parse_args(self, args: List[str]) -> None: def parse_args(self, args: List[str]) -> None:
""" """
@ -217,8 +212,8 @@ class PRReviewer:
markdown_text = convert_to_markdown(data) markdown_text = convert_to_markdown(data)
user = self.git_provider.get_user_id() user = self.git_provider.get_user_id()
# Add help text if not in CLI mode # Add help text if not in CLI mode
if not get_settings().get("CONFIG.CLI_MODE", False): if not get_settings().get("CONFIG.CLI_MODE", False):
markdown_text += "\n### How to use\n" markdown_text += "\n### How to use\n"
if user and '[bot]' not in user: if user and '[bot]' not in user:
markdown_text += bot_help_text(user) markdown_text += bot_help_text(user)

View File

@ -40,9 +40,7 @@ class PRUpdateChangelog:
"extra_instructions": get_settings().pr_update_changelog.extra_instructions, "extra_instructions": get_settings().pr_update_changelog.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(), "commit_messages_str": self.git_provider.get_commit_messages(),
} }
self.token_handler = TokenHandler(self.git_provider.pr, self.token_handler = TokenHandler(self.vars, get_settings().pr_update_changelog_prompt.system,
self.vars,
get_settings().pr_update_changelog_prompt.system,
get_settings().pr_update_changelog_prompt.user) get_settings().pr_update_changelog_prompt.user)
async def run(self): async def run(self):

View File

@ -26,39 +26,21 @@ classifiers = [
"Operating System :: Independent", "Operating System :: Independent",
"Programming Language :: Python :: 3", "Programming Language :: Python :: 3",
] ]
dynamic = ["dependencies"]
dependencies = [ [tool.setuptools.dynamic]
"dynaconf==3.1.12", dependencies = {file = ["requirements.txt"]}
"fastapi==0.99.0",
"PyGithub==1.59.*",
"retry==0.9.2",
"openai==0.27.8",
"Jinja2==3.1.2",
"tiktoken==0.4.0",
"uvicorn==0.22.0",
"python-gitlab==3.15.0",
"pytest~=7.4.0",
"aiohttp~=3.8.4",
"atlassian-python-api==3.39.0",
"GitPython~=3.1.32",
"starlette-context==0.3.6",
"litellm~=0.1.445",
"PyYAML==6.0",
"boto3~=1.28.25",
"google-cloud-storage==2.10.0",
"ujson==5.8.0"
]
[project.urls] [project.urls]
"Homepage" = "https://github.com/Codium-ai/pr-agent" "Homepage" = "https://github.com/Codium-ai/pr-agent"
[tool.setuptools] [tool.setuptools]
include-package-data = false include-package-data = true
license-files = ["LICENSE"] license-files = ["LICENSE"]
[tool.setuptools.packages.find] [tool.setuptools.packages.find]
where = ["."] where = ["."]
include = ["pr_agent"] include = ["pr_agent", "pr_agent.*"]
[project.scripts] [project.scripts]
pr-agent = "pr_agent.cli:run" pr-agent = "pr_agent.cli:run"

View File

@ -1,19 +1,19 @@
dynaconf==3.1.12 dynaconf~=3.1.12
fastapi==0.99.0 fastapi~=0.103.0
PyGithub==1.59.* PyGithub~=1.59.0
retry==0.9.2 retry~=0.9.2
openai==0.27.8 openai~=0.27.8
Jinja2==3.1.2 Jinja2~=3.1.2
tiktoken==0.4.0 tiktoken~=0.4.0
uvicorn==0.22.0 uvicorn~=0.22.0
python-gitlab==3.15.0 python-gitlab~=3.15.0
pytest~=7.4.0 pytest~=7.4.0
aiohttp~=3.8.4 aiohttp~=3.8.4
atlassian-python-api==3.39.0 atlassian-python-api~=3.39.0
GitPython~=3.1.32 GitPython~=3.1.32
PyYAML==6.0 PyYAML~=6.0
starlette-context==0.3.6 starlette-context~=0.3.6
litellm~=0.1.445 litellm~=0.1.445
boto3~=1.28.25 boto3~=1.28.25
google-cloud-storage==2.10.0 google-cloud-storage~=2.10.0
ujson==5.8.0 ujson~=5.8.0
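
Review note: the switch from `==` to `~=` uses the PEP 440 compatible-release operator, so `~=3.1.12` means `>=3.1.12, ==3.1.*` and `~=6.0` means `>=6.0, ==6.*`. The sketch below illustrates that rule for plain numeric versions only; it is a simplified stand-in that ignores pre-releases, post-releases and epochs, and `is_compatible` is a hypothetical helper, not a packaging API.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '3.1.12'."""
    return tuple(int(part) for part in version.split('.'))


def is_compatible(candidate: str, spec: str) -> bool:
    """Check `candidate ~= spec` for numeric release segments only."""
    base = parse_version(spec)
    cand = parse_version(candidate)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments, e.g. '6.0'")

    # Pad with zeros so that '6' and '6.0' compare as equal
    width = max(len(base), len(cand))
    base_padded = base + (0,) * (width - len(base))
    cand_padded = cand + (0,) * (width - len(cand))

    prefix = base[:-1]  # everything except the last segment must match
    return cand_padded >= base_padded and cand_padded[:len(prefix)] == prefix


print(is_compatible("0.103.2", "0.103.0"), is_compatible("0.104.0", "0.103.0"))
```

Under this rule the old `PyGithub==1.59.*` pin and the new `PyGithub~=1.59.0` accept the same set of releases, while the other packages now accept later patch releases.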

View File

@ -1,7 +1,7 @@
import pytest import pytest
from pr_agent.git_providers.codecommit_provider import CodeCommitFile from pr_agent.git_providers.codecommit_provider import CodeCommitFile
from pr_agent.git_providers.codecommit_provider import CodeCommitProvider from pr_agent.git_providers.codecommit_provider import CodeCommitProvider
from pr_agent.git_providers.git_provider import EDIT_TYPE from pr_agent.algo.utils import EDIT_TYPE
class TestCodeCommitFile: class TestCodeCommitFile:

View File

@ -1,8 +1,6 @@
# Generated by CodiumAI # Generated by CodiumAI
from pr_agent.git_providers.git_provider import FilePatchInfo from pr_agent.algo.utils import FilePatchInfo, find_line_number_of_relevant_line_in_file
from pr_agent.algo.pr_processing import find_line_number_of_relevant_line_in_file
import pytest import pytest