mirror of
https://github.com/qodo-ai/pr-agent.git
synced 2025-07-04 12:50:38 +08:00
Compare commits
97 Commits
ok/repo_co
...
idavidov/g
Author | SHA1 | Date | |
---|---|---|---|
9770f4709a | |||
35afe758e9 | |||
50125ae57f | |||
6595c3e0c9 | |||
fdd16f6c75 | |||
7b7e913195 | |||
5477469a91 | |||
dee1f168f8 | |||
bb18e32c56 | |||
70286e9574 | |||
3f60d12a9a | |||
164b340c29 | |||
4bb035ec0f | |||
23a79bc8fe | |||
1db53ae1ad | |||
cca951d787 | |||
230d684cd3 | |||
0a02fa8597 | |||
f82b9620af | |||
ce29d9eb49 | |||
b7b650eb05 | |||
6ca0655517 | |||
edcf89a456 | |||
7762a67250 | |||
7049c73790 | |||
cc7be0811a | |||
d3a5aea89e | |||
dd87df49f5 | |||
e85bcf3a17 | |||
abb754b16b | |||
bb5878c99a | |||
273a9e35d9 | |||
fcc208d09f | |||
20bbdac135 | |||
ceedf2bf83 | |||
2d6b947292 | |||
2e13b12fe6 | |||
2d56c88291 | |||
cf9c6a872d | |||
0bb8ab70a4 | |||
4a47b78a90 | |||
3e542cd88b | |||
17ed050ca7 | |||
e24c5e3501 | |||
b206b1c5ff | |||
0270306d3c | |||
3e09b9ac37 | |||
725ac9e85d | |||
e00500b90c | |||
f1f271fa00 | |||
d38c5236dd | |||
49a3a1e511 | |||
1b0b90e51d | |||
64481e2d84 | |||
e0f295659d | |||
fe75e3f2ec | |||
e3274af831 | |||
95b6abef09 | |||
7f1849a867 | |||
7760f37dee | |||
ebbe655c40 | |||
164ed77d72 | |||
b1148e5f7a | |||
2012e25596 | |||
a75253097b | |||
079d62af56 | |||
6c4a5bae52 | |||
886139c6b5 | |||
8f751f7371 | |||
43297b851f | |||
4f39239e73 | |||
00e1925927 | |||
7189b3ab41 | |||
a00038fbd8 | |||
a45343793a | |||
703215fe83 | |||
0f975ccf4a | |||
7367c62cf9 | |||
fed0ea349a | |||
bd86266a4b | |||
bd07a0cd7f | |||
ed8554699b | |||
749ae1be79 | |||
0e3dbbd0f2 | |||
7a57db5d88 | |||
102edcdcf1 | |||
c92648cbd5 | |||
26b008565b | |||
0dec24aa37 | |||
68a2f2a27d | |||
cfa14178f8 | |||
b97c4b6114 | |||
3d43cecbea | |||
eb143ec851 | |||
3e94a71dcd | |||
dd14423b07 | |||
8e47fdc284 |
36
.github/workflows/build-and-test.yaml
vendored
Normal file
@ -0,0 +1,36 @@
name: Build-and-test

on:
  push:

jobs:
  build-and-test:
    runs-on: ubuntu-latest

    steps:
      - id: checkout
        uses: actions/checkout@v2

      - id: dockerx
        name: Setup Docker Buildx
        uses: docker/setup-buildx-action@v2

      - id: build
        name: Build dev docker
        uses: docker/build-push-action@v2
        with:
          context: .
          file: ./docker/Dockerfile
          push: false
          load: true
          tags: codiumai/pr-agent:test
          cache-from: type=gha,scope=dev
          cache-to: type=gha,mode=max,scope=dev
          target: test

      - id: test
        name: Test dev docker
        run: |
          docker run --rm codiumai/pr-agent:test pytest -v
@ -1,6 +1,17 @@
# This workflow enables developers to call PR-Agent's `/[actions]` in PR comments and upon PR creation.
# Learn more at https://www.codium.ai/pr-agent/
# This is v0.2 of this workflow file

name: PR-Agent

on:
  pull_request:
  issue_comment:

permissions:
  issues: write
  pull-requests: write

jobs:
  pr_agent_job:
    runs-on: ubuntu-latest
17
CHANGELOG.md
@ -1,5 +1,18 @@
## 2023-08-03

### Optimized
- Optimized PR diff processing by introducing caching for diff files, reducing the number of API calls.
- Refactored `load_large_diff` function to generate a patch only when necessary.
- Fixed a bug in the GitLab provider where the new file was not retrieved correctly.

## 2023-08-02

### Enhanced
- Updated several tools in the `pr_agent` package to use commit messages in their functionality.
- Commit messages are now retrieved and stored in the `vars` dictionary for each tool.
- Added a section to display the commit messages in the prompts of various tools.

## 2023-08-01
2023-08-01

### Enhanced
- Introduced the ability to retrieve commit messages from pull requests across different git providers.
@ -29,4 +42,4 @@
### Added
- New feature for updating the CHANGELOG.md based on the contents of a PR.
- Added support for this feature for the Github provider.
- New configuration settings and prompts for the changelog update feature.
- New configuration settings and prompts for the changelog update feature.
@ -1,12 +1,57 @@
## Configuration

The different tools and sub-tools used by CodiumAI pr-agent are adjustable via the configuration file: `/pr-agent/settings/configuration.toml`.
The different tools and sub-tools used by CodiumAI PR-Agent are adjustable via the **[configuration file](pr_agent/settings/configuration.toml)**.

### Working from CLI
When running from source (CLI), your local configuration file will be initially used.

Example of invoking the 'review' tool via the CLI:

To edit the configuration of any tool, just add `--config_path=<value>` to your command.
For example, if you want to edit the `pr_reviewer` configurations online, you can run:
```
/review --pr_reviewer.extra_instructions="focus on the file xyz" --pr_reviewer.require_score_review=false ...
python cli.py --pr-url=<pr_url> review
```
In addition to general configurations, the 'review' tool will use parameters from the `[pr_reviewer]` section (every tool has a dedicated section in the configuration file).

Note that you can print results locally, without publishing them, by setting in `configuration.toml`:

```
[config]
publish_output=false
verbosity_level=2
```
This is useful for debugging or experimenting with the different tools.

### Working from pre-built repo (GitHub Action/GitHub App/Docker)
When running PR-Agent from a pre-built repo, the default configuration file will be loaded.

To edit the configuration, you have two options:
1. Place a local configuration file in the root of your local repo. The local file will be used instead of the default one.
2. For online usage, just add `--config_path=<value>` to your command, to edit a specific configuration value.
For example, if you want to edit `pr_reviewer` configurations, you can run:
```
/review --pr_reviewer.extra_instructions="..." --pr_reviewer.require_score_review=false ...
```

Any configuration value in `configuration.toml` file can be similarly edited.
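
As a rough illustration of how overrides of the `--<section>.<key>=<value>` shape shown above could map onto sections of `configuration.toml`, the sketch below collects them into a nested dictionary. The helper name `parse_overrides` is hypothetical, and this is not PR-Agent's actual parsing code; it only assumes the command shape used in the example.

```python
import shlex


def parse_overrides(command: str) -> dict:
    """Hypothetical helper: collect `--section.key=value` flags into a nested dict."""
    overrides: dict = {}
    # shlex.split honours the quotes around values such as "focus on the file xyz"
    for token in shlex.split(command):
        if not token.startswith("--") or "=" not in token:
            continue  # skip the command name (e.g. /review) and bare flags
        dotted_key, value = token[2:].split("=", 1)
        if "." not in dotted_key:
            continue  # no section prefix, so not treated as a configuration override
        section, key = dotted_key.split(".", 1)
        overrides.setdefault(section, {})[key] = value
    return overrides


cmd = ('/review --pr_reviewer.extra_instructions="focus on the file xyz" '
       '--pr_reviewer.require_score_review=false')
print(parse_overrides(cmd))
```

Values stay as strings in this sketch; converting `"false"` to a boolean is left to whatever settings layer consumes the result.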

### General configuration parameters

#### Changing a model
See [here](pr_agent/algo/__init__.py) for the list of available models.

To use the Llama2 model, for example, set:
```
[config]
model = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
[replicate]
key = ...
```
(you can obtain a Llama2 key from [here](https://replicate.com/replicate/llama-2-70b-chat/api))

Also review the [AiHandler](pr_agent/algo/ai_handler.py) file for instructions on how to set keys for other models.

#### Extra instructions
All PR-Agent tools have a parameter called `extra_instructions` that lets you add free-text extra instructions. Example usage:
```
/update_changelog --pr_update_changelog.extra_instructions="Make sure to update also the version ..."
```
@ -92,6 +92,7 @@ pip install -r requirements.txt
|
||||
|
||||
```
|
||||
cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
|
||||
chmod 600 pr_agent/settings/.secrets.toml
|
||||
# Edit .secrets.toml file
|
||||
```
|
||||
|
||||
@ -128,6 +129,7 @@ Allowing you to automate the review process on your private or public repositori
|
||||
- Pull requests: Read & write
|
||||
- Issue comment: Read & write
|
||||
- Metadata: Read-only
|
||||
- Contents: Read-only
|
||||
- Set the following events:
|
||||
- Issue comment
|
||||
- Pull request
|
||||
|
41
README.md
@ -65,7 +65,6 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
|
||||
- [Overview](#overview)
|
||||
- [Try it now](#try-it-now)
|
||||
- [Installation](#installation)
|
||||
- [Usage and tools](#usage-and-tools)
|
||||
- [Configuration](./CONFIGURATION.md)
|
||||
- [How it works](#how-it-works)
|
||||
- [Why use PR-Agent](#why-use-pr-agent)
|
||||
@ -80,7 +79,7 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
|
||||
|-------|---------------------------------------------|:------:|:------:|:---------:|
|
||||
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | ⮑ Inline review | :white_check_mark: | :white_check_mark: | |
|
||||
| | Ask | :white_check_mark: | :white_check_mark: | |
|
||||
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Auto-Description | :white_check_mark: | :white_check_mark: | |
|
||||
| | Improve Code | :white_check_mark: | :white_check_mark: | |
|
||||
| | Reflect and Review | :white_check_mark: | | |
|
||||
@ -94,15 +93,16 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
|
||||
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: |
|
||||
| | Incremental PR Review | :white_check_mark: | | |
|
||||
|
||||
Examples for invoking the different tools via the CLI:
|
||||
- **Review**: python cli.py --pr-url=<pr_url> review
|
||||
- **Describe**: python cli.py --pr-url=<pr_url> describe
|
||||
- **Improve**: python cli.py --pr-url=<pr_url> improve
|
||||
- **Ask**: python cli.py --pr-url=<pr_url> ask "Write me a poem about this PR"
|
||||
- **Reflect**: python cli.py --pr-url=<pr_url> reflect
|
||||
- **Update Changelog**: python cli.py --pr-url=<pr_url> update_changelog
|
||||
- **Review**: python cli.py --pr_url=<pr_url> review
|
||||
- **Describe**: python cli.py --pr_url=<pr_url> describe
|
||||
- **Improve**: python cli.py --pr_url=<pr_url> improve
|
||||
- **Ask**: python cli.py --pr_url=<pr_url> ask "Write me a poem about this PR"
|
||||
- **Reflect**: python cli.py --pr_url=<pr_url> reflect
|
||||
- **Update Changelog**: python cli.py --pr_url=<pr_url> update_changelog
|
||||
|
||||
"<pr_url>" is the url of the relevant PR (for example: https://github.com/Codium-ai/pr-agent/pull/50).
|
||||
|
||||
@ -135,19 +135,11 @@ There are several ways to use PR-Agent:
|
||||
- [Method 5: Run as a GitHub App](INSTALL.md#method-5-run-as-a-github-app)
|
||||
- Allowing you to automate the review process on your private or public repositories
|
||||
|
||||
## Usage and Tools
|
||||
|
||||
**PR-Agent** provides six types of interactions ("tools"): `"PR Reviewer"`, `"PR Q&A"`, `"PR Description"`, `"PR Code Suggestions"`, `"PR Reflect and Review"` and `"PR Update Changelog"`.
|
||||
|
||||
- The "PR Reviewer" tool automatically analyzes PRs, and provides various types of feedback.
|
||||
- The "PR Q&A" tool answers free-text questions about the PR.
|
||||
- The "PR Description" tool automatically sets the PR Title and body.
|
||||
- The "PR Code Suggestion" tool provides inline code suggestions for the PR that can be applied and committed.
|
||||
- The "PR Reflect and Review" tool initiates a dialog with the user, asks them to reflect on the PR, and then provides a more focused review.
|
||||
- The "PR Update Changelog" tool automatically updates the CHANGELOG.md file with the PR changes.
|
||||
|
||||
## How it works
|
||||
|
||||
The following diagram illustrates PR-Agent tools and their flow:
|
||||
|
||||

|
||||
|
||||
Check out the [PR Compression strategy](./PR_COMPRESSION.md) page for more details on how we convert a code diff to a manageable LLM prompt
|
||||
@ -156,29 +148,28 @@ Check out the [PR Compression strategy](./PR_COMPRESSION.md) page for more detai
|
||||
|
||||
A reasonable question that can be asked is: `"Why use PR-Agent? What makes it stand out from existing tools?"`
|
||||
|
||||
Here are some of the reasons why:
|
||||
Here are some advantages of PR-Agent:
|
||||
|
||||
- We emphasize **real-life practical usage**. Each tool (review, improve, ask, ...) has a single GPT-4 call, no more. We feel that this is critical for realistic team usage - obtaining an answer quickly (~30 seconds) and affordably.
|
||||
- Our [PR Compression strategy](./PR_COMPRESSION.md) is a core ability that lets PR-Agent handle both short and long PRs effectively.
|
||||
- Our JSON prompting strategy enables to have **modular, customizable tools**. For example, the '/review' tool categories can be controlled via the configuration file. Adding additional categories is easy and accessible.
|
||||
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket), and multiple ways to use the tool (CLI, GitHub Action, GitHub App, Docker, ...).
|
||||
- Our JSON prompting strategy enables **modular, customizable tools**. For example, the '/review' tool categories can be controlled via the [configuration](./CONFIGURATION.md) file. Adding more categories is easy.
|
||||
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
|
||||
- We are open-source, and welcome contributions from the community.
|
||||
|
||||
|
||||
## Roadmap
|
||||
|
||||
- [ ] Support open-source models, as a replacement for OpenAI models. (Note - a minimal requirement for each open-source model is to have 8k+ context, and good support for generating JSON as an output)
|
||||
- [x] Support other Git providers, such as Gitlab and Bitbucket.
|
||||
- [ ] Develop additional logic for handling large PRs, and compressing git patches
|
||||
- [x] Support additional models, as a replacement for OpenAI (see [here](https://github.com/Codium-ai/pr-agent/pull/172))
|
||||
- [ ] Develop additional logic for handling large PRs
|
||||
- [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such as [ctags](https://github.com/universal-ctags/ctags)
|
||||
- [ ] Adding more tools. Possible directions:
|
||||
- [x] PR description
|
||||
- [x] Inline code suggestions
|
||||
- [x] Reflect and review
|
||||
- [x] Rank the PR (see [here](https://github.com/Codium-ai/pr-agent/pull/89))
|
||||
- [ ] Enforcing CONTRIBUTING.md guidelines
|
||||
- [ ] Performance (are there any performance issues)
|
||||
- [ ] Documentation (is the PR properly documented)
|
||||
- [ ] Rank the PR importance
|
||||
- [ ] ...
|
||||
|
||||
## Similar Projects
|
||||
|
@ -4,17 +4,21 @@ WORKDIR /app
ADD pyproject.toml .
RUN pip install . && rm pyproject.toml
ENV PYTHONPATH=/app
ADD pr_agent pr_agent

FROM base as github_app
ADD pr_agent pr_agent
CMD ["python", "pr_agent/servers/github_app.py"]

FROM base as github_polling
ADD pr_agent pr_agent
CMD ["python", "pr_agent/servers/github_polling.py"]

FROM base as test
ADD requirements-dev.txt .
RUN pip install -r requirements-dev.txt && rm requirements-dev.txt
ADD pr_agent pr_agent
ADD tests tests

FROM base as cli
ADD pr_agent pr_agent
ENTRYPOINT ["python", "pr_agent/cli.py"]
@ -12,6 +12,7 @@ from pr_agent.tools.pr_information_from_user import PRInformationFromUser
from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer
from pr_agent.tools.pr_update_changelog import PRUpdateChangelog
from pr_agent.tools.pr_config import PRConfig

command2class = {
    "answer": PRReviewer,
@ -26,6 +27,8 @@ command2class = {
    "ask": PRQuestions,
    "ask_question": PRQuestions,
    "update_changelog": PRUpdateChangelog,
    "config": PRConfig,
    "settings": PRConfig,
}

commands = list(command2class.keys())
@ -34,7 +37,7 @@ class PRAgent:
    def __init__(self):
        pass

    async def handle_request(self, pr_url, request) -> bool:
    async def handle_request(self, pr_url, request, notify=None) -> bool:
        # First, apply repo specific settings if exists
        if get_settings().config.use_repo_settings_file:
            repo_settings_file = None
@ -64,8 +67,12 @@ class PRAgent:
        if action == "reflect_and_review" and not get_settings().pr_reviewer.ask_and_reflect:
            action = "review"
        if action == "answer":
            if notify:
                notify()
            await PRReviewer(pr_url, is_answer=True, args=args).run()
        elif action in command2class:
            if notify:
                notify()
            await command2class[action](pr_url, args=args).run()
        else:
            return False
|
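
The dispatch pattern in this hunk (a command-name table with aliases, plus an optional `notify` callback fired just before a tool runs) can be sketched in isolation. Everything below is a simplified stand-in: `EchoTool` is a hypothetical class, and this `handle_request` omits the repo-settings and argument handling of the real method.

```python
import asyncio
from typing import Callable, Optional


class EchoTool:
    """Hypothetical stand-in for a tool class; it only records that it ran."""
    ran = []

    def __init__(self, pr_url: str, args=None):
        self.pr_url = pr_url
        self.args = args or []

    async def run(self) -> None:
        EchoTool.ran.append((self.pr_url, tuple(self.args)))


# Several command names may map to the same class (aliases).
command2class = {"config": EchoTool, "settings": EchoTool}


async def handle_request(pr_url: str, request: str,
                         notify: Optional[Callable[[], None]] = None) -> bool:
    parts = request.strip().split()
    if not parts:
        return False
    action, args = parts[0].lstrip("/").lower(), parts[1:]
    if action not in command2class:
        return False  # unknown command: nothing runs and nobody is notified
    if notify:
        notify()  # signal that work is about to start
    await command2class[action](pr_url, args=args).run()
    return True


events = []
ok = asyncio.run(handle_request("https://example.com/pr/1", "/settings",
                                notify=lambda: events.append("started")))
print(ok, events)
```

Calling `notify` only after the command is recognised means a caller can use it for side effects such as an acknowledgement reaction without firing on unrelated comments.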
@ -7,4 +7,8 @@ MAX_TOKENS = {
    'gpt-4': 8000,
    'gpt-4-0613': 8000,
    'gpt-4-32k': 32000,
    'claude-instant-1': 100000,
    'claude-2': 100000,
    'command-nightly': 4096,
    'replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1': 4096,
}
|
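
A table like `MAX_TOKENS` is typically consulted to decide how much room is left for the diff once the prompt is counted. The sketch below shows one way that lookup could work; `remaining_budget` and its `reserved_for_output` margin are hypothetical names chosen for illustration, and only a subset of the entries above is copied.

```python
# Illustrative subset of the table above; the limits are copied from the diff.
MAX_TOKENS = {
    'gpt-4': 8000,
    'claude-2': 100000,
    'command-nightly': 4096,
}


def remaining_budget(model: str, prompt_tokens: int, reserved_for_output: int = 1000) -> int:
    """Hypothetical helper: tokens left for the diff after the prompt and an output margin."""
    if model not in MAX_TOKENS:
        raise KeyError(f"No token limit registered for model '{model}'")
    return max(0, MAX_TOKENS[model] - prompt_tokens - reserved_for_output)


print(remaining_budget('gpt-4', prompt_tokens=1500))  # 8000 - 1500 - 1000 = 5500
```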
@ -1,12 +1,15 @@
|
||||
import logging
|
||||
|
||||
import litellm
|
||||
import openai
|
||||
from litellm import acompletion
|
||||
from openai.error import APIError, RateLimitError, Timeout, TryAgain
|
||||
from retry import retry
|
||||
|
||||
from pr_agent.config_loader import get_settings
|
||||
|
||||
OPENAI_RETRIES=5
|
||||
OPENAI_RETRIES = 5
|
||||
|
||||
|
||||
class AiHandler:
|
||||
"""
|
||||
@ -22,18 +25,34 @@ class AiHandler:
|
||||
"""
|
||||
try:
|
||||
openai.api_key = get_settings().openai.key
|
||||
litellm.openai_key = get_settings().openai.key
|
||||
self.azure = False
|
||||
if get_settings().get("OPENAI.ORG", None):
|
||||
openai.organization = get_settings().openai.org
|
||||
self.deployment_id = get_settings().get("OPENAI.DEPLOYMENT_ID", None)
|
||||
litellm.organization = get_settings().openai.org
|
||||
if get_settings().get("OPENAI.API_TYPE", None):
|
||||
openai.api_type = get_settings().openai.api_type
|
||||
if get_settings().openai.api_type == "azure":
|
||||
self.azure = True
|
||||
litellm.azure_key = get_settings().openai.key
|
||||
if get_settings().get("OPENAI.API_VERSION", None):
|
||||
openai.api_version = get_settings().openai.api_version
|
||||
litellm.api_version = get_settings().openai.api_version
|
||||
if get_settings().get("OPENAI.API_BASE", None):
|
||||
openai.api_base = get_settings().openai.api_base
|
||||
litellm.api_base = get_settings().openai.api_base
|
||||
if get_settings().get("ANTHROPIC.KEY", None):
|
||||
litellm.anthropic_key = get_settings().anthropic.key
|
||||
if get_settings().get("COHERE.KEY", None):
|
||||
litellm.cohere_key = get_settings().cohere.key
|
||||
if get_settings().get("REPLICATE.KEY", None):
|
||||
litellm.replicate_key = get_settings().replicate.key
|
||||
except AttributeError as e:
|
||||
raise ValueError("OpenAI key is required") from e
|
||||
|
||||
@property
|
||||
def deployment_id(self):
|
||||
"""
|
||||
Returns the deployment ID for the OpenAI API.
|
||||
"""
|
||||
return get_settings().get("OPENAI.DEPLOYMENT_ID", None)
|
||||
|
||||
@retry(exceptions=(APIError, Timeout, TryAgain, AttributeError, RateLimitError),
|
||||
tries=OPENAI_RETRIES, delay=2, backoff=2, jitter=(1, 3))
|
||||
async def chat_completion(self, model: str, temperature: float, system: str, user: str):
|
||||
@ -57,15 +76,23 @@ class AiHandler:
|
||||
TryAgain: If there is an attribute error during OpenAI inference.
|
||||
"""
|
||||
try:
|
||||
response = await openai.ChatCompletion.acreate(
|
||||
model=model,
|
||||
deployment_id=self.deployment_id,
|
||||
messages=[
|
||||
{"role": "system", "content": system},
|
||||
{"role": "user", "content": user}
|
||||
],
|
||||
temperature=temperature,
|
||||
)
|
||||
deployment_id = self.deployment_id
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.debug(
|
||||
f"Generating completion with {model}"
|
||||
f"{(' from deployment ' + deployment_id) if deployment_id else ''}"
|
||||
)
|
||||
response = await acompletion(
|
||||
model=model,
|
||||
deployment_id=deployment_id,
|
||||
messages=[
|
||||
{"role": "system", "content": system},
|
||||
{"role": "user", "content": user}
|
||||
],
|
||||
temperature=temperature,
|
||||
azure=self.azure,
|
||||
force_timeout=get_settings().config.ai_timeout
|
||||
)
|
||||
except (APIError, Timeout, TryAgain) as e:
|
||||
logging.error("Error during OpenAI inference: %s", e)
|
||||
raise
|
||||
@ -75,8 +102,9 @@ class AiHandler:
|
||||
except (Exception) as e:
|
||||
logging.error("Unknown error during OpenAI inference: %s", e)
|
||||
raise TryAgain from e
|
||||
if response is None or len(response.choices) == 0:
|
||||
if response is None or len(response["choices"]) == 0:
|
||||
raise TryAgain
|
||||
resp = response.choices[0]['message']['content']
|
||||
finish_reason = response.choices[0].finish_reason
|
||||
return resp, finish_reason
|
||||
resp = response["choices"][0]['message']['content']
|
||||
finish_reason = response["choices"][0]["finish_reason"]
|
||||
print(resp, finish_reason)
|
||||
return resp, finish_reason
|
||||
|
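
The last hunk above switches from attribute access (`response.choices`) to key access (`response["choices"]`). A minimal sketch of that extraction step, written against a plain dictionary, is given below. `EmptyResponseError` is a hypothetical stand-in for the retryable `TryAgain` signal, and the response shape is assumed from the keys used in the diff.

```python
from typing import Tuple


class EmptyResponseError(Exception):
    """Hypothetical error type standing in for a retryable 'try again' signal."""


def extract_reply(response) -> Tuple[str, str]:
    """Return (content, finish_reason) from a chat-completion style mapping."""
    if response is None or len(response.get("choices", [])) == 0:
        raise EmptyResponseError("no choices in response")
    first = response["choices"][0]
    return first["message"]["content"], first["finish_reason"]


sample = {"choices": [{"message": {"content": "LGTM"}, "finish_reason": "stop"}]}
print(extract_reply(sample))
```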
@ -41,7 +41,11 @@ def extend_patch(original_file_str, patch_str, num_lines) -> str:
|
||||
extended_patch_lines.extend(
|
||||
original_lines[start1 + size1 - 1:start1 + size1 - 1 + num_lines])
|
||||
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
try:
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
except (TypeError, ValueError):  # '@@ -0,0 +1 @@' case
|
||||
start1, size1, size2 = map(int, match.groups()[:3])
|
||||
start2 = 0
|
||||
section_header = match.groups()[4]
|
||||
extended_start1 = max(1, start1 - num_lines)
|
||||
extended_size1 = size1 + (start1 - extended_start1) + num_lines
|
||||
@ -198,7 +202,12 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
|
||||
patch_with_lines_str += f"{line_old}\n"
|
||||
new_content_lines = []
|
||||
old_content_lines = []
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
try:
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
except (TypeError, ValueError):  # '@@ -0,0 +1 @@' case
|
||||
start1, size1, size2 = map(int, match.groups()[:3])
|
||||
start2 = 0
|
||||
|
||||
elif line.startswith('+'):
|
||||
new_content_lines.append(line)
|
||||
elif line.startswith('-'):
|
||||
|
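
The `'@@ -0,0 +1 @@'` case handled above arises because unified diff headers may omit a size, in which case the hunk covers exactly one line. The sketch below parses such headers explicitly. The regular expression is an assumption modelled on the usual header shape, and `parse_hunk_header` is a hypothetical helper; note that it defaults a missing size to 1, which differs from the fallback in the hunk above (that one sets `start2 = 0`).

```python
import re
from typing import Tuple

# Both sizes are optional: "@@ -0,0 +1 @@" omits the new-file size.
RE_HUNK_HEADER = re.compile(
    r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def parse_hunk_header(line: str) -> Tuple[int, int, int, int, str]:
    """Hypothetical helper: return (start1, size1, start2, size2, section_header)."""
    match = RE_HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    start1, size1, start2, size2, section = match.groups()
    # An omitted size means the hunk spans a single line.
    return (int(start1), int(size1) if size1 is not None else 1,
            int(start2), int(size2) if size2 is not None else 1, section)


print(parse_hunk_header("@@ -0,0 +1 @@"))
print(parse_hunk_header("@@ -41,7 +41,11 @@ def extend_patch(a, b)"))
```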
@ -1,17 +1,19 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import difflib
|
||||
import logging
|
||||
from typing import Callable, Tuple
|
||||
import re
|
||||
import traceback
|
||||
from typing import Any, Callable, List, Tuple
|
||||
|
||||
from github import RateLimitExceededException
|
||||
|
||||
from pr_agent.algo import MAX_TOKENS
|
||||
from pr_agent.algo.git_patch_processing import convert_to_hunks_with_lines_numbers, extend_patch, handle_patch_deletions
|
||||
from pr_agent.algo.language_handler import sort_files_by_main_languages
|
||||
from pr_agent.algo.token_handler import TokenHandler
|
||||
from pr_agent.algo.utils import load_large_diff
|
||||
from pr_agent.algo.token_handler import TokenHandler, get_token_encoder
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers.git_provider import GitProvider
|
||||
from pr_agent.git_providers.git_provider import FilePatchInfo, GitProvider
|
||||
|
||||
DELETED_FILES_ = "Deleted files:\n"
|
||||
|
||||
@ -46,7 +48,7 @@ def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler, model: s
|
||||
PATCH_EXTRA_LINES = 0
|
||||
|
||||
try:
|
||||
diff_files = list(git_provider.get_diff_files())
|
||||
diff_files = git_provider.get_diff_files()
|
||||
except RateLimitExceededException as e:
|
||||
logging.error(f"Rate limit exceeded for git provider API. original message {e}")
|
||||
raise
|
||||
@ -98,12 +100,7 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
|
||||
for lang in pr_languages:
|
||||
for file in lang['files']:
|
||||
original_file_content_str = file.base_file
|
||||
new_file_content_str = file.head_file
|
||||
patch = file.patch
|
||||
|
||||
# handle the case of large patch, that initially was not loaded
|
||||
patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
|
||||
|
||||
if not patch:
|
||||
continue
|
||||
|
||||
@ -161,7 +158,6 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
|
||||
original_file_content_str = file.base_file
|
||||
new_file_content_str = file.head_file
|
||||
patch = file.patch
|
||||
patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
|
||||
if not patch:
|
||||
continue
|
||||
|
||||
@ -212,15 +208,133 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
|
||||
|
||||
|
||||
async def retry_with_fallback_models(f: Callable):
|
||||
all_models = _get_all_models()
|
||||
all_deployments = _get_all_deployments(all_models)
|
||||
# try each (model, deployment_id) pair until one is successful, otherwise raise exception
|
||||
for i, (model, deployment_id) in enumerate(zip(all_models, all_deployments)):
|
||||
try:
|
||||
get_settings().set("openai.deployment_id", deployment_id)
|
||||
return await f(model)
|
||||
except Exception as e:
|
||||
logging.warning(
|
||||
f"Failed to generate prediction with {model}"
|
||||
f"{(' from deployment ' + deployment_id) if deployment_id else ''}: "
|
||||
f"{traceback.format_exc()}"
|
||||
)
|
||||
if i == len(all_models) - 1: # If it's the last iteration
|
||||
raise # Re-raise the last exception
|
||||
|
||||
|
||||
def _get_all_models() -> List[str]:
|
||||
model = get_settings().config.model
|
||||
fallback_models = get_settings().config.fallback_models
|
||||
if not isinstance(fallback_models, list):
|
||||
fallback_models = [fallback_models]
|
||||
fallback_models = [m.strip() for m in fallback_models.split(",")]
|
||||
all_models = [model] + fallback_models
|
||||
for i, model in enumerate(all_models):
|
||||
try:
|
||||
return await f(model)
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to generate prediction with {model}: {e}")
|
||||
if i == len(all_models) - 1: # If it's the last iteration
|
||||
raise # Re-raise the last exception
|
||||
return all_models
|
||||
|
||||
|
||||
def _get_all_deployments(all_models: List[str]) -> List[str]:
|
||||
deployment_id = get_settings().get("openai.deployment_id", None)
|
||||
fallback_deployments = get_settings().get("openai.fallback_deployments", [])
|
||||
if not isinstance(fallback_deployments, list) and fallback_deployments:
|
||||
fallback_deployments = [d.strip() for d in fallback_deployments.split(",")]
|
||||
if fallback_deployments:
|
||||
all_deployments = [deployment_id] + fallback_deployments
|
||||
if len(all_deployments) < len(all_models):
|
||||
raise ValueError(f"The number of deployments ({len(all_deployments)}) "
|
||||
f"is less than the number of models ({len(all_models)})")
|
||||
else:
|
||||
all_deployments = [deployment_id] * len(all_models)
|
||||
return all_deployments
|
||||
|
||||
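
The pairing rule in `_get_all_models` and `_get_all_deployments` can be summarised as a single function over plain values. This is a sketch under stated assumptions: `split_csv` and `pair_models_with_deployments` are hypothetical names, and settings are passed in as arguments rather than read through `get_settings()`.

```python
from typing import List, Optional, Tuple


def split_csv(value) -> List[str]:
    """Accept either a list or a comma-separated string."""
    if isinstance(value, list):
        return value
    return [item.strip() for item in value.split(",") if item.strip()]


def pair_models_with_deployments(model: str, fallback_models,
                                 deployment_id: Optional[str] = None,
                                 fallback_deployments=None) -> List[Tuple[str, Optional[str]]]:
    all_models = [model] + split_csv(fallback_models)
    fallbacks = split_csv(fallback_deployments or [])
    if fallbacks:
        all_deployments = [deployment_id] + fallbacks
        if len(all_deployments) < len(all_models):
            raise ValueError(f"The number of deployments ({len(all_deployments)}) "
                             f"is less than the number of models ({len(all_models)})")
    else:
        # No fallback deployments configured: reuse the same deployment for every model.
        all_deployments = [deployment_id] * len(all_models)
    return list(zip(all_models, all_deployments))


print(pair_models_with_deployments("gpt-4", "gpt-3.5-turbo-16k, claude-2"))
```

Because `zip` stops at the shorter sequence, surplus deployments are silently ignored in this sketch; only a shortage is treated as an error.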
|
||||
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
|
||||
relevant_file: str,
|
||||
relevant_line_in_file: str) -> Tuple[int, int]:
|
||||
"""
|
||||
Find the line number and absolute position of a relevant line in a file.
|
||||
|
||||
Args:
|
||||
diff_files (List[FilePatchInfo]): A list of FilePatchInfo objects representing the patches of files.
|
||||
relevant_file (str): The name of the file where the relevant line is located.
|
||||
relevant_line_in_file (str): The content of the relevant line.
|
||||
|
||||
Returns:
|
||||
Tuple[int, int]: A tuple containing the line number and absolute position of the relevant line in the file.
|
||||
"""
|
||||
position = -1
|
||||
absolute_position = -1
|
||||
re_hunk_header = re.compile(
|
||||
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
|
||||
|
||||
for file in diff_files:
|
||||
if file.filename.strip() == relevant_file:
|
||||
patch = file.patch
|
||||
patch_lines = patch.splitlines()
|
||||
|
||||
# try to find the line in the patch using difflib, with some margin of error
|
||||
matches_difflib: list[str | Any] = difflib.get_close_matches(relevant_line_in_file,
|
||||
patch_lines, n=3, cutoff=0.93)
|
||||
if len(matches_difflib) == 1 and matches_difflib[0].startswith('+'):
|
||||
relevant_line_in_file = matches_difflib[0]
|
||||
|
||||
delta = 0
|
||||
start1, size1, start2, size2 = 0, 0, 0, 0
|
||||
for i, line in enumerate(patch_lines):
|
||||
if line.startswith('@@'):
|
||||
delta = 0
|
||||
match = re_hunk_header.match(line)
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
elif not line.startswith('-'):
|
||||
delta += 1
|
||||
|
||||
if relevant_line_in_file in line and line[0] != '-':
|
||||
position = i
|
||||
absolute_position = start2 + delta - 1
|
||||
break
|
||||
|
||||
if position == -1 and relevant_line_in_file[0] == '+':
|
||||
no_plus_line = relevant_line_in_file[1:].lstrip()
|
||||
for i, line in enumerate(patch_lines):
|
||||
if line.startswith('@@'):
|
||||
delta = 0
|
||||
match = re_hunk_header.match(line)
|
||||
start1, size1, start2, size2 = map(int, match.groups()[:4])
|
||||
elif not line.startswith('-'):
|
||||
delta += 1
|
||||
|
||||
if no_plus_line in line and line[0] != '-':
|
||||
# The model might add a '+' to the beginning of the relevant_line_in_file even if originally
|
||||
# it's a context line
|
||||
position = i
|
||||
absolute_position = start2 + delta - 1
|
||||
break
|
||||
return position, absolute_position
|
||||
|
||||
|
||||
def clip_tokens(text: str, max_tokens: int) -> str:
|
||||
"""
|
||||
Clip the number of tokens in a string to a maximum number of tokens.
|
||||
|
||||
Args:
|
||||
text (str): The string to clip.
|
||||
max_tokens (int): The maximum number of tokens allowed in the string.
|
||||
|
||||
Returns:
|
||||
str: The clipped string.
|
||||
"""
|
||||
# Count the tokens with the encoder, then clip by characters using the average characters-per-token ratio
|
||||
try:
|
||||
encoder = get_token_encoder()
|
||||
num_input_tokens = len(encoder.encode(text))
|
||||
if num_input_tokens <= max_tokens:
|
||||
return text
|
||||
num_chars = len(text)
|
||||
chars_per_token = num_chars / num_input_tokens
|
||||
num_output_chars = int(chars_per_token * max_tokens)
|
||||
clipped_text = text[:num_output_chars]
|
||||
return clipped_text
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to clip tokens: {e}")
|
||||
return text
|
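
The clipping approach in `clip_tokens` (count tokens, derive an average characters-per-token ratio, then cut the string by characters) can be tried without a real tokenizer. In the sketch below, `whitespace_encode` is a stand-in tokenizer assumed for illustration; the code in the diff uses a tiktoken encoder instead, and `clip_text` is a hypothetical name.

```python
from typing import Callable, List


def whitespace_encode(text: str) -> List[str]:
    """Stand-in tokenizer (an assumption for this sketch): one token per word."""
    return text.split()


def clip_text(text: str, max_tokens: int,
              encode: Callable[[str], list] = whitespace_encode) -> str:
    num_tokens = len(encode(text))
    if num_tokens <= max_tokens:
        return text
    # Estimate the average characters per token, then cut by characters.
    chars_per_token = len(text) / num_tokens
    return text[:int(chars_per_token * max_tokens)]


print(repr(clip_text("aaaa bbbb cccc dddd", max_tokens=2)))
```

The cut is approximate: because it works on characters, the clipped text may land slightly above or below the requested token count for text with uneven token lengths.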
@ -1,9 +1,13 @@
|
||||
from jinja2 import Environment, StrictUndefined
|
||||
from tiktoken import encoding_for_model
|
||||
from tiktoken import encoding_for_model, get_encoding
|
||||
|
||||
from pr_agent.config_loader import get_settings
|
||||
|
||||
|
||||
def get_token_encoder():
|
||||
return encoding_for_model(get_settings().config.model) if "gpt" in get_settings().config.model else get_encoding(
|
||||
"cl100k_base")
|
||||
|
||||
class TokenHandler:
|
||||
"""
|
||||
A class for handling tokens in the context of a pull request.
|
||||
@ -27,7 +31,7 @@ class TokenHandler:
|
||||
- system: The system string.
|
||||
- user: The user string.
|
||||
"""
|
||||
self.encoder = encoding_for_model(get_settings().config.model)
|
||||
self.encoder = get_token_encoder()
|
||||
self.prompt_tokens = self._get_system_user_tokens(pr, self.encoder, vars, system, user)
|
||||
|
||||
def _get_system_user_tokens(self, pr, encoder, vars: dict, system, user):
|
||||
@ -47,7 +51,6 @@ class TokenHandler:
|
||||
environment = Environment(undefined=StrictUndefined)
|
||||
system_prompt = environment.from_string(system).render(vars)
|
||||
user_prompt = environment.from_string(user).render(vars)
|
||||
|
||||
system_prompt_tokens = len(encoder.encode(system_prompt))
|
||||
user_prompt_tokens = len(encoder.encode(user_prompt))
|
||||
return system_prompt_tokens + user_prompt_tokens
|
||||
|
@ -8,8 +8,8 @@ import textwrap
|
||||
from datetime import datetime
|
||||
from typing import Any, List
|
||||
|
||||
import yaml
|
||||
from starlette_context import context
|
||||
|
||||
from pr_agent.config_loader import get_settings, global_settings
|
||||
|
||||
|
||||
@ -40,7 +40,7 @@ def convert_to_markdown(output_data: dict) -> str:
|
||||
"Security concerns": "🔒",
|
||||
"General PR suggestions": "💡",
|
||||
"Insights from user's answers": "📝",
|
||||
"Code suggestions": "🤖",
|
||||
"Code feedback": "🤖",
|
||||
}
|
||||
|
||||
for key, value in output_data.items():
|
||||
@ -50,12 +50,12 @@ def convert_to_markdown(output_data: dict) -> str:
|
||||
markdown_text += f"## {key}\n\n"
|
||||
markdown_text += convert_to_markdown(value)
|
||||
elif isinstance(value, list):
|
||||
if key.lower() == 'code suggestions':
|
||||
if key.lower() == 'code feedback':
|
||||
markdown_text += "\n" # just looks nicer with additional line breaks
|
||||
emoji = emojis.get(key, "")
|
||||
markdown_text += f"- {emoji} **{key}:**\n\n"
|
||||
for item in value:
|
||||
if isinstance(item, dict) and key.lower() == 'code suggestions':
|
||||
if isinstance(item, dict) and key.lower() == 'code feedback':
|
||||
markdown_text += parse_code_suggestion(item)
|
||||
elif item:
|
||||
markdown_text += f" - {item}\n"
|
||||
@ -100,7 +100,7 @@ def try_fix_json(review, max_iter=10, code_suggestions=False):
|
||||
Args:
|
||||
- review: A string containing the JSON message to be fixed.
|
||||
- max_iter: An integer representing the maximum number of iterations to try and fix the JSON message.
|
||||
- code_suggestions: A boolean indicating whether to try and fix JSON messages with code suggestions.
|
||||
- code_suggestions: A boolean indicating whether to try and fix JSON messages with code feedback.
|
||||
|
||||
Returns:
|
||||
- data: A dictionary containing the parsed JSON data.
|
||||
@ -108,7 +108,7 @@ def try_fix_json(review, max_iter=10, code_suggestions=False):
|
||||
The function attempts to fix broken or incomplete JSON messages by parsing until the last valid code suggestion.
|
||||
If the JSON message ends with a closing bracket, the function calls the fix_json_escape_char function to fix the
|
||||
message.
|
||||
If code_suggestions is True and the JSON message contains code suggestions, the function tries to fix the JSON
|
||||
If code_suggestions is True and the JSON message contains code feedback, the function tries to fix the JSON
|
||||
message by parsing until the last valid code suggestion.
|
||||
The function uses regular expressions to find the last occurrence of "}," with any number of whitespaces or
|
||||
newlines.
|
||||
@ -128,7 +128,8 @@ def try_fix_json(review, max_iter=10, code_suggestions=False):
|
||||
else:
|
||||
closing_bracket = "]}}"
|
||||
|
||||
if review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0:
|
||||
if (review.rfind("'Code feedback': [") > 0 or review.rfind('"Code feedback": [') > 0) or \
|
||||
(review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0):
|
||||
last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
|
||||
valid_json = False
|
||||
iter_count = 0
|
||||
@ -195,38 +196,30 @@ def convert_str_to_datetime(date_str):
|
||||
return datetime.strptime(date_str, datetime_format)
|
||||
|
||||
|
||||
def load_large_diff(file, new_file_content_str: str, original_file_content_str: str, patch: str) -> str:
|
||||
def load_large_diff(filename, new_file_content_str: str, original_file_content_str: str) -> str:
|
||||
"""
|
||||
Generate a patch for a modified file by comparing the original content of the file with the new content provided as
|
||||
input.
|
||||
|
||||
Args:
|
||||
file: The file object for which the patch needs to be generated.
|
||||
new_file_content_str: The new content of the file as a string.
|
||||
original_file_content_str: The original content of the file as a string.
|
||||
patch: An optional patch string that can be provided as input.
|
||||
|
||||
Returns:
|
||||
The generated or provided patch string.
|
||||
|
||||
Raises:
|
||||
None.
|
||||
|
||||
Additional Information:
|
||||
- If 'patch' is not provided as input, the function generates a patch using the 'difflib' library and returns it
|
||||
as output.
|
||||
- If the 'settings.config.verbosity_level' is greater than or equal to 2, a warning message is logged indicating
|
||||
that the file was modified but no patch was found, and a patch is manually created.
|
||||
"""
|
||||
if not patch: # to Do - also add condition for file extension
|
||||
try:
|
||||
diff = difflib.unified_diff(original_file_content_str.splitlines(keepends=True),
|
||||
new_file_content_str.splitlines(keepends=True))
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.warning(f"File was modified, but no patch was found. Manually creating patch: {file.filename}.")
|
||||
patch = ''.join(diff)
|
||||
except Exception:
|
||||
pass
|
||||
patch = ""
|
||||
try:
|
||||
diff = difflib.unified_diff(original_file_content_str.splitlines(keepends=True),
|
||||
new_file_content_str.splitlines(keepends=True))
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.warning(f"File was modified, but no patch was found. Manually creating patch: {filename}.")
|
||||
patch = ''.join(diff)
|
||||
except Exception:
|
||||
pass
|
||||
return patch
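The patch generation in `load_large_diff` relies on the standard library's `difflib.unified_diff`. A minimal sketch of that call, using made-up file contents and a hypothetical helper name `make_patch`:

```python
import difflib


def make_patch(original_content: str, new_content: str) -> str:
    """Build a unified-diff string from two versions of a file's content."""
    diff = difflib.unified_diff(
        original_content.splitlines(keepends=True),  # keep newlines so the joined patch is well-formed
        new_content.splitlines(keepends=True),
    )
    return "".join(diff)


original = "a\nb\nc\n"
new = "a\nB\nc\n"
patch = make_patch(original, new)
print(patch)
```

When the two contents are identical, `unified_diff` yields nothing and the patch is an empty string, which matches the `patch = ""` default in the function above.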
|
||||
|
||||
|
||||
@ -265,3 +258,26 @@ def update_settings_from_args(args: List[str]) -> List[str]:
|
||||
else:
|
||||
other_args.append(arg)
|
||||
return other_args
|
||||
|
||||
|
||||
def load_yaml(review_text: str) -> dict:
|
||||
review_text = review_text.removeprefix('```yaml').rstrip('`')
|
||||
try:
|
||||
data = yaml.load(review_text, Loader=yaml.SafeLoader)
|
||||
except Exception as e:
|
||||
logging.error(f"Failed to parse AI prediction: {e}")
|
||||
data = try_fix_yaml(review_text)
|
||||
return data
|
||||
|
||||
def try_fix_yaml(review_text: str) -> dict:
|
||||
review_text_lines = review_text.split('\n')
|
||||
data = {}
|
||||
for i in range(1, len(review_text_lines)):
|
||||
review_text_lines_tmp = '\n'.join(review_text_lines[:-i])
|
||||
try:
|
||||
data = yaml.load(review_text_lines_tmp, Loader=yaml.SafeLoader)
|
||||
logging.info(f"Successfully parsed AI prediction after removing {i} lines")
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
return data
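The recovery strategy in `try_fix_yaml` (drop trailing lines one at a time until the remainder parses) does not depend on YAML itself. The sketch below shows the same loop with a pluggable parser. `parse_with_trim` and `strict_key_value_parse` are hypothetical names introduced for illustration, and the toy parser stands in for `yaml.load(..., Loader=yaml.SafeLoader)`.

```python
from typing import Callable, Dict


def parse_with_trim(text: str, parse: Callable[[str], Dict]) -> Dict:
    """Retry parsing after dropping 1, 2, ... trailing lines; return {} if nothing parses."""
    lines = text.split("\n")
    for i in range(1, len(lines)):
        try:
            return parse("\n".join(lines[:-i]))
        except Exception:
            continue  # still malformed, drop one more line
    return {}


def strict_key_value_parse(text: str) -> Dict:
    """Hypothetical stand-in parser: every line must look like 'key: value'."""
    result = {}
    for line in text.split("\n"):
        key, sep, value = line.partition(": ")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        result[key] = value
    return result


# A response that was cut off in the middle of its last line.
truncated = "title: Fix bug\ntype: Bug fix\nsummar"
recovered = parse_with_trim(truncated, strict_key_value_parse)
print(recovered)  # {'title': 'Fix bug', 'type': 'Bug fix'}
```

The trade-off is that whatever sat in the dropped lines is silently lost, so this suits output that is truncated at the end rather than malformed in the middle.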
|
||||
|
@ -10,13 +10,13 @@ from pr_agent.config_loader import get_settings
|
||||
def run(inargs=None):
|
||||
parser = argparse.ArgumentParser(description='AI based pull request analyzer', usage=
|
||||
"""\
|
||||
Usage: cli.py --pr-url <URL on supported git hosting service> <command> [<args>].
|
||||
Usage: cli.py --pr-url=<URL on supported git hosting service> <command> [<args>].
|
||||
For example:
|
||||
- cli.py --pr-url=... review
|
||||
- cli.py --pr-url=... describe
|
||||
- cli.py --pr-url=... improve
|
||||
- cli.py --pr-url=... ask "write me a poem about this PR"
|
||||
- cli.py --pr-url=... reflect
|
||||
- cli.py --pr_url=... review
|
||||
- cli.py --pr_url=... describe
|
||||
- cli.py --pr_url=... improve
|
||||
- cli.py --pr_url=... ask "write me a poem about this PR"
|
||||
- cli.py --pr_url=... reflect
|
||||
|
||||
Supported commands:
|
||||
review / review_pr - Add a review that includes a summary of the PR and specific suggestions for improvement.
|
||||
@ -27,7 +27,7 @@ reflect - Ask the PR author questions about the PR.
|
||||
update_changelog - Update the changelog based on the PR's contents.
|
||||
|
||||
To edit any configuration parameter from 'configuration.toml', just add --config_path=<value>.
|
||||
For example: '- cli.py --pr-url=... review --pr_reviewer.extra_instructions="focus on the file: ..."'
|
||||
For example: 'python cli.py --pr_url=... review --pr_reviewer.extra_instructions="focus on the file: ..."'
|
||||
""")
|
||||
parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
|
||||
parser.add_argument('command', type=str, help='The', choices=commands, default='review')
|
||||
|
@ -5,6 +5,7 @@ from urllib.parse import urlparse
|
||||
import requests
|
||||
from atlassian.bitbucket import Cloud
|
||||
|
||||
from ..algo.pr_processing import clip_tokens
|
||||
from ..config_loader import get_settings
|
||||
from .git_provider import FilePatchInfo
|
||||
|
||||
@ -25,6 +26,13 @@ class BitbucketProvider:
|
||||
if pr_url:
|
||||
self.set_pr(pr_url)
|
||||
|
||||
def get_repo_settings(self):
|
||||
try:
|
||||
contents = self.repo_obj.get_contents(".pr_agent.toml", ref=self.pr.head.sha).decoded_content
|
||||
return contents
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
def is_supported(self, capability: str) -> bool:
|
||||
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments', 'get_labels']:
|
||||
return False
|
||||
@ -81,6 +89,9 @@ class BitbucketProvider:
|
||||
return self.pr.source_branch
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.pr.description, max_tokens)
|
||||
return self.pr.description
|
||||
|
||||
def get_user_id(self):
|
||||
@ -89,12 +100,25 @@ class BitbucketProvider:
|
||||
def get_issue_comments(self):
|
||||
raise NotImplementedError("Bitbucket provider does not support issue comments yet")
|
||||
|
||||
def get_repo_settings(self):
|
||||
try:
|
||||
contents = self.repo_obj.get_contents(".pr_agent.toml", ref=self.pr.head.sha).decoded_content
|
||||
return contents
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
return True
|
||||
|
||||
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
|
||||
parsed_url = urlparse(pr_url)
|
||||
|
||||
if 'bitbucket.org' not in parsed_url.netloc:
|
||||
raise ValueError("The provided URL is not a valid GitHub URL")
|
||||
raise ValueError("The provided URL is not a valid Bitbucket URL")
|
||||
|
||||
path_parts = parsed_url.path.strip('/').split('/')
|
||||
|
||||
|
@ -3,6 +3,7 @@ from dataclasses import dataclass
|
||||
|
||||
# enum EDIT_TYPE (ADDED, DELETED, MODIFIED, RENAMED)
|
||||
from enum import Enum
|
||||
from typing import Optional
|
||||
|
||||
|
||||
class EDIT_TYPE(Enum):
|
||||
@ -88,6 +89,21 @@ class GitProvider(ABC):
|
||||
def get_issue_comments(self):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_repo_settings(self):
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_commit_messages(self):
|
||||
pass
|
||||
|
||||
def get_main_pr_language(languages, files) -> str:
|
||||
"""
|
||||
|
@ -1,15 +1,18 @@
|
||||
import logging
|
||||
import hashlib
|
||||
|
||||
from datetime import datetime
|
||||
from typing import Optional, Tuple
|
||||
from typing import Optional, Tuple, Any
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from github import AppAuthentication, Auth, Github, GithubException
|
||||
from github import AppAuthentication, Auth, Github, GithubException, Reaction
|
||||
from retry import retry
|
||||
from starlette_context import context
|
||||
|
||||
from .git_provider import FilePatchInfo, GitProvider, IncrementalPR
|
||||
from ..algo.language_handler import is_valid_file
|
||||
from ..algo.utils import load_large_diff
|
||||
from ..algo.pr_processing import find_line_number_of_relevant_line_in_file, clip_tokens
|
||||
from ..config_loader import get_settings
|
||||
from ..servers.utils import RateLimitExceeded
|
||||
|
||||
@ -27,6 +30,7 @@ class GithubProvider(GitProvider):
|
||||
self.pr = None
|
||||
self.github_user_id = None
|
||||
self.diff_files = None
|
||||
self.git_files = None
|
||||
self.incremental = incremental
|
||||
if pr_url:
|
||||
self.set_pr(pr_url)
|
||||
@ -81,40 +85,56 @@ class GithubProvider(GitProvider):
|
||||
def get_files(self):
|
||||
if self.incremental.is_incremental and self.file_set:
|
||||
return self.file_set.values()
|
||||
return self.pr.get_files()
|
||||
if not self.git_files:
|
||||
# bring files from GitHub only once
|
||||
self.git_files = self.pr.get_files()
|
||||
return self.git_files
|
||||
|
||||
@retry(exceptions=RateLimitExceeded,
|
||||
tries=get_settings().github.ratelimit_retries, delay=2, backoff=2, jitter=(1, 3))
|
||||
def get_diff_files(self) -> list[FilePatchInfo]:
|
||||
"""
|
||||
Retrieves the list of files that have been modified, added, deleted, or renamed in a pull request in GitHub,
|
||||
along with their content and patch information.
|
||||
|
||||
Returns:
|
||||
diff_files (List[FilePatchInfo]): List of FilePatchInfo objects representing the modified, added, deleted,
|
||||
or renamed files in the merge request.
|
||||
"""
|
||||
try:
|
||||
if self.diff_files:
|
||||
return self.diff_files
|
||||
|
||||
files = self.get_files()
|
||||
diff_files = []
|
||||
for file in files:
|
||||
if is_valid_file(file.filename):
|
||||
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha)
|
||||
patch = file.patch
|
||||
if self.incremental.is_incremental and self.file_set:
|
||||
original_file_content_str = self._get_pr_file_content(file,
|
||||
self.incremental.last_seen_commit_sha)
|
||||
patch = load_large_diff(file,
|
||||
new_file_content_str,
|
||||
original_file_content_str,
|
||||
None)
|
||||
self.file_set[file.filename] = patch
|
||||
else:
|
||||
original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
|
||||
|
||||
diff_files.append(
|
||||
FilePatchInfo(original_file_content_str, new_file_content_str, patch, file.filename))
|
||||
for file in files:
|
||||
if not is_valid_file(file.filename):
|
||||
continue
|
||||
|
||||
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha) # communication with GitHub
|
||||
patch = file.patch
|
||||
|
||||
if self.incremental.is_incremental and self.file_set:
|
||||
original_file_content_str = self._get_pr_file_content(file, self.incremental.last_seen_commit_sha)
|
||||
patch = load_large_diff(file.filename, new_file_content_str, original_file_content_str)
|
||||
self.file_set[file.filename] = patch
|
||||
else:
|
||||
original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
|
||||
if not patch:
|
||||
patch = load_large_diff(file.filename, new_file_content_str, original_file_content_str)
|
||||
|
||||
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, patch, file.filename))
|
||||
|
||||
self.diff_files = diff_files
|
||||
return diff_files
|
||||
|
||||
except GithubException.RateLimitExceededException as e:
|
||||
logging.error(f"Rate limit exceeded for GitHub API. Original message: {e}")
|
||||
raise RateLimitExceeded("Rate limit exceeded for GitHub API.") from e
|
||||
|
||||
def publish_description(self, pr_title: str, pr_body: str):
|
||||
self.pr.edit(title=pr_title, body=pr_body)
|
||||
# self.pr.create_issue_comment(pr_comment)
|
||||
|
||||
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
|
||||
if is_temporary and not get_settings().config.publish_output_progress:
|
||||
@ -131,22 +151,9 @@ class GithubProvider(GitProvider):
|
||||
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
self.publish_inline_comments([self.create_inline_comment(body, relevant_file, relevant_line_in_file)])
|
||||
|
||||
|
||||
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
|
||||
position = -1
|
||||
for file in self.diff_files:
|
||||
if file.filename.strip() == relevant_file:
|
||||
patch = file.patch
|
||||
patch_lines = patch.splitlines()
|
||||
for i, line in enumerate(patch_lines):
|
||||
if relevant_line_in_file in line:
|
||||
position = i
|
||||
break
|
||||
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:].lstrip() in line:
|
||||
# The model often adds a '+' to the beginning of the relevant_line_in_file even if originally
|
||||
# it's a context line
|
||||
position = i
|
||||
break
|
||||
position, absolute_position = find_line_number_of_relevant_line_in_file(self.diff_files, relevant_file.strip('`'), relevant_line_in_file)
|
||||
if position == -1:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
|
||||
@ -154,8 +161,6 @@ class GithubProvider(GitProvider):
|
||||
else:
|
||||
subject_type = "LINE"
|
||||
path = relevant_file.strip()
|
||||
# placeholder for future API support (already supported in single inline comment)
|
||||
# return dict(body=body, path=path, position=position, subject_type=subject_type)
|
||||
return dict(body=body, path=path, position=position) if subject_type == "LINE" else {}
|
||||
|
||||
def publish_inline_comments(self, comments: list[dict]):
|
||||
@ -229,6 +234,9 @@ class GithubProvider(GitProvider):
|
||||
return self.pr.head.ref
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.pr.body, max_tokens)
|
||||
return self.pr.body
|
||||
|
||||
def get_user_id(self):
|
||||
@ -258,6 +266,23 @@ class GithubProvider(GitProvider):
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
try:
|
||||
reaction = self.pr.get_issue_comment(issue_comment_id).create_reaction("eyes")
|
||||
return reaction.id
|
||||
except Exception as e:
|
||||
logging.exception(f"Failed to add eyes reaction, error: {e}")
|
||||
return None
|
||||
|
||||
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
|
||||
try:
|
||||
self.pr.get_issue_comment(issue_comment_id).delete_reaction(reaction_id)
|
||||
return True
|
||||
except Exception as e:
|
||||
logging.exception(f"Failed to remove eyes reaction, error: {e}")
|
||||
return False
|
||||
|
||||
|
||||
@staticmethod
|
||||
def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
|
||||
parsed_url = urlparse(pr_url)
|
||||
@ -353,17 +378,45 @@ class GithubProvider(GitProvider):
|
||||
logging.exception(f"Failed to get labels, error: {e}")
|
||||
return []
|
||||
|
||||
def get_commit_messages(self) -> str:
|
||||
def get_commit_messages(self):
|
||||
"""
|
||||
Retrieves the commit messages of a pull request.
|
||||
|
||||
Returns:
|
||||
str: A string containing the commit messages of the pull request.
|
||||
"""
|
||||
max_tokens = get_settings().get("CONFIG.MAX_COMMITS_TOKENS", None)
|
||||
try:
|
||||
commit_list = self.pr.get_commits()
|
||||
commit_messages = [commit.commit.message for commit in commit_list]
|
||||
commit_messages_str = "\n".join([f"{i + 1}. {message}" for i, message in enumerate(commit_messages)])
|
||||
except:
|
||||
except Exception:
|
||||
commit_messages_str = ""
|
||||
if max_tokens:
|
||||
commit_messages_str = clip_tokens(commit_messages_str, max_tokens)
|
||||
return commit_messages_str
|
||||
|
||||
def generate_link_to_relevant_line_number(self, suggestion) -> str:
|
||||
try:
|
||||
relevant_file = suggestion['relevant file'].strip('`').strip("'")
|
||||
relevant_line_str = suggestion['relevant line']
|
||||
if not relevant_line_str:
|
||||
return ""
|
||||
|
||||
position, absolute_position = find_line_number_of_relevant_line_in_file \
|
||||
(self.diff_files, relevant_file, relevant_line_str)
|
||||
|
||||
if absolute_position != -1:
|
||||
# # link to right file only
|
||||
# link = f"https://github.com/{self.repo}/blob/{self.pr.head.sha}/{relevant_file}" \
|
||||
# + "#" + f"L{absolute_position}"
|
||||
|
||||
# link to diff
|
||||
sha_file = hashlib.sha256(relevant_file.encode('utf-8')).hexdigest()
|
||||
link = f"https://github.com/{self.repo}/pull/{self.pr_num}/files#diff-{sha_file}R{absolute_position}"
|
||||
return link
|
||||
except Exception as e:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Failed adding line link, error: {e}")
|
||||
|
||||
return ""
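The link built in `generate_link_to_relevant_line_number` combines a SHA-256 hash of the file path with an `R<line>` suffix. A minimal sketch of that construction follows. The function name `diff_anchor_link` and the repository, PR number, and path values are placeholders, not values from the source.

```python
import hashlib


def diff_anchor_link(repo: str, pr_num: int, relevant_file: str, line: int) -> str:
    """Build a link to a line on the right-hand side of a file's diff in a PR's 'files' view."""
    sha_file = hashlib.sha256(relevant_file.encode("utf-8")).hexdigest()
    return f"https://github.com/{repo}/pull/{pr_num}/files#diff-{sha_file}R{line}"


# Placeholder values for illustration only.
link = diff_anchor_link("example-org/example-repo", 1, "src/app.py", 42)
print(link)
```

The `R` prefix addresses the new (right-hand) side of the diff, which is why the function above passes the absolute line position in the new file rather than the position within the patch.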
|
||||
|
@ -7,11 +7,16 @@ import gitlab
|
||||
from gitlab import GitlabGetError
|
||||
|
||||
from ..algo.language_handler import is_valid_file
|
||||
from ..algo.pr_processing import clip_tokens
|
||||
from ..algo.utils import load_large_diff
|
||||
from ..config_loader import get_settings
|
||||
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider
|
||||
|
||||
logger = logging.getLogger()
|
||||
|
||||
class DiffNotFoundError(Exception):
|
||||
"""Raised when the diff for a merge request cannot be found."""
|
||||
pass
|
||||
|
||||
class GitLabProvider(GitProvider):
|
||||
|
||||
@ -30,6 +35,7 @@ class GitLabProvider(GitProvider):
|
||||
self.id_mr = None
|
||||
self.mr = None
|
||||
self.diff_files = None
|
||||
self.git_files = None
|
||||
self.temp_comments = []
|
||||
self._set_merge_request(merge_request_url)
|
||||
self.RE_HUNK_HEADER = re.compile(
|
||||
@ -53,7 +59,7 @@ class GitLabProvider(GitProvider):
|
||||
self.last_diff = self.mr.diffs.list(get_all=True)[-1]
|
||||
except IndexError as e:
|
||||
logger.error(f"Could not get diff for merge request {self.id_mr}")
|
||||
raise ValueError(f"Could not get diff for merge request {self.id_mr}") from e
|
||||
raise DiffNotFoundError(f"Could not get diff for merge request {self.id_mr}") from e
|
||||
|
||||
|
||||
def _get_pr_file_content(self, file_path: str, branch: str) -> str:
|
||||
@ -65,19 +71,27 @@ class GitLabProvider(GitProvider):
|
||||
return ''
|
||||
|
||||
def get_diff_files(self) -> list[FilePatchInfo]:
|
||||
"""
|
||||
Retrieves the list of files that have been modified, added, deleted, or renamed in a pull request in GitLab,
|
||||
along with their content and patch information.
|
||||
|
||||
Returns:
|
||||
diff_files (List[FilePatchInfo]): List of FilePatchInfo objects representing the modified, added, deleted,
|
||||
or renamed files in the merge request.
|
||||
"""
|
||||
|
||||
if self.diff_files:
|
||||
return self.diff_files
|
||||
|
||||
diffs = self.mr.changes()['changes']
|
||||
diff_files = []
|
||||
for diff in diffs:
|
||||
if is_valid_file(diff['new_path']):
|
||||
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
|
||||
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
|
||||
edit_type = EDIT_TYPE.MODIFIED
|
||||
if diff['new_file']:
|
||||
edit_type = EDIT_TYPE.ADDED
|
||||
elif diff['deleted_file']:
|
||||
edit_type = EDIT_TYPE.DELETED
|
||||
elif diff['renamed_file']:
|
||||
edit_type = EDIT_TYPE.RENAMED
|
||||
# original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
|
||||
# new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
|
||||
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.diff_refs['base_sha'])
|
||||
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.diff_refs['head_sha'])
|
||||
|
||||
try:
|
||||
if isinstance(original_file_content_str, bytes):
|
||||
original_file_content_str = bytes.decode(original_file_content_str, 'utf-8')
|
||||
@ -86,15 +100,33 @@ class GitLabProvider(GitProvider):
|
||||
except UnicodeDecodeError:
|
||||
logging.warning(
|
||||
f"Cannot decode file {diff['old_path']} or {diff['new_path']} in merge request {self.id_mr}")
|
||||
|
||||
edit_type = EDIT_TYPE.MODIFIED
|
||||
if diff['new_file']:
|
||||
edit_type = EDIT_TYPE.ADDED
|
||||
elif diff['deleted_file']:
|
||||
edit_type = EDIT_TYPE.DELETED
|
||||
elif diff['renamed_file']:
|
||||
edit_type = EDIT_TYPE.RENAMED
|
||||
|
||||
filename = diff['new_path']
|
||||
patch = diff['diff']
|
||||
if not patch:
|
||||
patch = load_large_diff(filename, new_file_content_str, original_file_content_str)
|
||||
|
||||
diff_files.append(
|
||||
FilePatchInfo(original_file_content_str, new_file_content_str, diff['diff'], diff['new_path'],
|
||||
FilePatchInfo(original_file_content_str, new_file_content_str,
|
||||
patch=patch,
|
||||
filename=filename,
|
||||
edit_type=edit_type,
|
||||
old_filename=None if diff['old_path'] == diff['new_path'] else diff['old_path']))
|
||||
self.diff_files = diff_files
|
||||
return diff_files
|
||||
|
||||
def get_files(self):
|
||||
return [change['new_path'] for change in self.mr.changes()['changes']]
|
||||
if not self.git_files:
|
||||
self.git_files = [change['new_path'] for change in self.mr.changes()['changes']]
|
||||
return self.git_files
|
||||
|
||||
def publish_description(self, pr_title: str, pr_body: str):
|
||||
try:
|
||||
@ -110,7 +142,6 @@ class GitLabProvider(GitProvider):
|
||||
self.temp_comments.append(comment)
|
||||
|
||||
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
|
||||
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
|
||||
edit_type, found, source_line_no, target_file, target_line_no = self.search_line(relevant_file,
|
||||
relevant_line_in_file)
|
||||
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
|
||||
@ -122,16 +153,20 @@ class GitLabProvider(GitProvider):
|
||||
def create_inline_comments(self, comments: list[dict]):
|
||||
raise NotImplementedError("Gitlab provider does not support publishing inline comments yet")
|
||||
|
||||
def send_inline_comment(self, body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
|
||||
target_file, target_line_no):
|
||||
def send_inline_comment(self, body: str, edit_type: str, found: bool, relevant_file: str, relevant_line_in_file: int,
|
||||
source_line_no: int, target_file: str, target_line_no: int) -> None:
|
||||
if not found:
|
||||
logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
|
||||
else:
|
||||
d = self.last_diff
|
||||
# in order to have exact sha's we have to find correct diff for this change
|
||||
diff = self.get_relevant_diff(relevant_file, relevant_line_in_file)
|
||||
if diff is None:
|
||||
logger.error(f"Could not get diff for merge request {self.id_mr}")
|
||||
raise DiffNotFoundError(f"Could not get diff for merge request {self.id_mr}")
|
||||
pos_obj = {'position_type': 'text',
|
||||
'new_path': target_file.filename,
|
||||
'old_path': target_file.old_filename if target_file.old_filename else target_file.filename,
|
||||
'base_sha': d.base_commit_sha, 'start_sha': d.start_commit_sha, 'head_sha': d.head_commit_sha}
|
||||
'base_sha': diff.base_commit_sha, 'start_sha': diff.start_commit_sha, 'head_sha': diff.head_commit_sha}
|
||||
if edit_type == 'deletion':
|
||||
pos_obj['old_line'] = source_line_no - 1
|
||||
elif edit_type == 'addition':
|
||||
@ -143,6 +178,23 @@ class GitLabProvider(GitProvider):
|
||||
self.mr.discussions.create({'body': body,
|
||||
'position': pos_obj})
|
||||
|
||||
def get_relevant_diff(self, relevant_file: str, relevant_line_in_file: int) -> Optional[dict]:
|
||||
changes = self.mr.changes() # Retrieve the changes for the merge request once
|
||||
if not changes:
|
||||
logging.error('No changes found for the merge request.')
|
||||
return None
|
||||
all_diffs = self.mr.diffs.list(get_all=True)
|
||||
if not all_diffs:
|
||||
logging.error('No diffs found for the merge request.')
|
||||
return None
|
||||
for diff in all_diffs:
|
||||
for change in changes['changes']:
|
||||
if change['new_path'] == relevant_file and relevant_line_in_file in change['diff']:
|
||||
return diff
|
||||
logging.debug(
|
||||
f'No relevant diff found for {relevant_file} {relevant_line_in_file}. Falling back to last diff.')
|
||||
return self.last_diff # fallback to last_diff if no relevant diff is found
|
||||
|
||||
def publish_code_suggestions(self, code_suggestions: list):
|
||||
for suggestion in code_suggestions:
|
||||
try:
|
||||
@ -151,9 +203,9 @@ class GitLabProvider(GitProvider):
|
||||
relevant_lines_start = suggestion['relevant_lines_start']
|
||||
relevant_lines_end = suggestion['relevant_lines_end']
|
||||
|
||||
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
|
||||
diff_files = self.get_diff_files()
|
||||
target_file = None
|
||||
for file in self.diff_files:
|
||||
for file in diff_files:
|
||||
if file.filename == relevant_file:
|
||||
if file.filename == relevant_file:
|
||||
target_file = file
|
||||
@ -180,7 +232,7 @@ class GitLabProvider(GitProvider):
|
||||
target_file = None
|
||||
|
||||
edit_type = self.get_edit_type(relevant_line_in_file)
|
||||
for file in self.diff_files:
|
||||
for file in self.get_diff_files():
|
||||
if file.filename == relevant_file:
|
||||
edit_type, found, source_line_no, target_file, target_line_no = self.find_in_file(file,
|
||||
relevant_line_in_file)
|
||||
@ -248,6 +300,9 @@ class GitLabProvider(GitProvider):
|
||||
return self.mr.source_branch
|
||||
|
||||
def get_pr_description(self):
|
||||
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
|
||||
if max_tokens:
|
||||
return clip_tokens(self.mr.description, max_tokens)
|
||||
return self.mr.description
|
||||
|
||||
def get_issue_comments(self):
|
||||
@ -260,6 +315,12 @@ class GitLabProvider(GitProvider):
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
|
||||
return True
|
||||
|
||||
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
|
||||
return True
|
||||
|
||||
def _parse_merge_request_url(self, merge_request_url: str) -> Tuple[str, int]:
|
||||
parsed_url = urlparse(merge_request_url)
|
||||
|
||||
@ -305,16 +366,19 @@ class GitLabProvider(GitProvider):
|
||||
def get_labels(self):
|
||||
return self.mr.labels
|
||||
|
||||
def get_commit_messages(self) -> str:
|
||||
def get_commit_messages(self):
|
||||
"""
|
||||
Retrieves the commit messages of a pull request.
|
||||
|
||||
Returns:
|
||||
str: A string containing the commit messages of the pull request.
|
||||
"""
|
||||
max_tokens = get_settings().get("CONFIG.MAX_COMMITS_TOKENS", None)
|
||||
try:
|
||||
commit_messages_list = [commit['message'] for commit in self.mr.commits()._list]
|
||||
commit_messages_str = "\n".join([f"{i + 1}. {message}" for i, message in enumerate(commit_messages_list)])
|
||||
except:
|
||||
except Exception:
|
||||
commit_messages_str = ""
|
||||
return commit_messages_str
|
||||
if max_tokens:
|
||||
commit_messages_str = clip_tokens(commit_messages_str, max_tokens)
|
||||
return commit_messages_str
|
@ -4,6 +4,7 @@ import os
|
||||
|
||||
from pr_agent.agent.pr_agent import PRAgent
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
from pr_agent.tools.pr_reviewer import PRReviewer
|
||||
|
||||
|
||||
@ -14,6 +15,8 @@ async def run_action():
|
||||
OPENAI_KEY = os.environ.get('OPENAI_KEY')
|
||||
OPENAI_ORG = os.environ.get('OPENAI_ORG')
|
||||
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN')
|
||||
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
|
||||
|
||||
|
||||
# Check if required environment variables are set
|
||||
if not GITHUB_EVENT_NAME:
|
||||
@ -61,7 +64,9 @@ async def run_action():
|
||||
pr_url = event_payload.get("issue", {}).get("pull_request", {}).get("url")
|
||||
if pr_url:
|
||||
body = comment_body.strip().lower()
|
||||
await PRAgent().handle_request(pr_url, body)
|
||||
comment_id = event_payload.get("comment", {}).get("id")
|
||||
provider = get_git_provider()(pr_url=pr_url)
|
||||
await PRAgent().handle_request(pr_url, body, notify=lambda: provider.add_eyes_reaction(comment_id))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
@ -11,6 +11,7 @@ from starlette_context.middleware import RawContextMiddleware
|
||||
|
||||
from pr_agent.agent.pr_agent import PRAgent
|
||||
from pr_agent.config_loader import get_settings, global_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
from pr_agent.servers.utils import verify_signature
|
||||
|
||||
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
|
||||
@ -80,7 +81,10 @@ async def handle_request(body: Dict[str, Any]):
|
||||
return {}
|
||||
pull_request = body["issue"]["pull_request"]
|
||||
api_url = pull_request.get("url")
|
||||
await agent.handle_request(api_url, comment_body)
|
||||
comment_id = body.get("comment", {}).get("id")
|
||||
provider = get_git_provider()(pr_url=api_url)
|
||||
await agent.handle_request(api_url, comment_body, notify=lambda: provider.add_eyes_reaction(comment_id))
|
||||
|
||||
|
||||
elif action == "opened" or 'reopened' in action:
|
||||
pull_request = body.get("pull_request")
|
||||
@ -102,6 +106,7 @@ async def root():
|
||||
def start():
|
||||
# Override the deployment type to app
|
||||
get_settings().set("GITHUB.DEPLOYMENT_TYPE", "app")
|
||||
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
|
||||
middleware = [Middleware(RawContextMiddleware)]
|
||||
app = FastAPI(middleware=middleware)
|
||||
app.include_router(router)
|
||||
|
@ -36,6 +36,7 @@ async def polling_loop():
|
||||
git_provider = get_git_provider()()
|
||||
user_id = git_provider.get_user_id()
|
||||
agent = PRAgent()
|
||||
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
|
||||
|
||||
try:
|
||||
deployment_type = get_settings().github.deployment_type
|
||||
@ -98,8 +99,10 @@ async def polling_loop():
|
||||
if user_tag not in comment_body:
|
||||
continue
|
||||
rest_of_comment = comment_body.split(user_tag)[1].strip()
|
||||
|
||||
success = await agent.handle_request(pr_url, rest_of_comment)
|
||||
comment_id = comment['id']
|
||||
git_provider.set_pr(pr_url)
|
||||
success = await agent.handle_request(pr_url, rest_of_comment,
|
||||
notify=lambda: git_provider.add_eyes_reaction(comment_id)) # noqa E501
|
||||
if not success:
|
||||
git_provider.set_pr(pr_url)
|
||||
git_provider.publish_comment("### How to use PR-Agent\n" +
|
||||
|
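Note on the three server hunks above: each one now passes a `notify` callback to `handle_request`, wrapping `add_eyes_reaction(comment_id)` in a lambda. The diff does not show how `handle_request` uses the callback, so the sketch below is an assumption: it calls `notify` once a command is recognized and before the long-running work starts. The dispatcher logic and the comment id are made up for illustration.

```python
import asyncio
from typing import Callable, List, Optional


async def handle_request(command: str, notify: Optional[Callable[[], None]] = None) -> bool:
    # Hypothetical dispatcher: only commands starting with "/" are recognized.
    if not command.startswith("/"):
        return False
    # Acknowledge the comment before the (potentially slow) work begins.
    if notify:
        notify()
    await asyncio.sleep(0)  # placeholder for the actual tool run
    return True


reactions: List[int] = []
comment_id = 42  # made-up comment id for illustration
handled = asyncio.run(handle_request("/review", notify=lambda: reactions.append(comment_id)))
ignored = asyncio.run(handle_request("hello", notify=lambda: reactions.append(-1)))
```

Passing a zero-argument callable keeps the handler unaware of the git provider and of comment ids; the caller binds both in the lambda.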
@ -2,9 +2,11 @@ commands_text = "> **/review [-i]**: Request a review of your Pull Request. For
|
||||
"considers changes since the last review, include the '-i' option.\n" \
|
||||
"> **/describe**: Modify the PR title and description based on the contents of the PR.\n" \
|
||||
"> **/improve**: Suggest improvements to the code in the PR. \n" \
|
||||
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n\n" \
|
||||
">To edit any configuration parameter from 'configuration.toml', add --config_path=new_value\n" \
|
||||
">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" " \
|
||||
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n" \
|
||||
"> **/update_changelog**: Update the changelog based on the PR's contents.\n\n" \
|
||||
">To edit any configuration parameter from **configuration.toml**, add --config_path=new_value\n" \
|
||||
">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" \n" \
|
||||
">To list the possible configuration parameters, use the **/config** command.\n" \
|
||||
|
||||
|
||||
def bot_help_text(user: str):
|
||||
|
@ -7,17 +7,27 @@
|
||||
# See README for details about GitHub App deployment.
|
||||
|
||||
[openai]
|
||||
key = "<API_KEY>" # Acquire through https://platform.openai.com
|
||||
org = "<ORGANIZATION>" # Optional, may be commented out.
|
||||
key = "" # Acquire through https://platform.openai.com
|
||||
#org = "<ORGANIZATION>" # Optional, may be commented out.
|
||||
# Uncomment the following for Azure OpenAI
|
||||
#api_type = "azure"
|
||||
#api_version = '2023-05-15' # Check Azure documentation for the current API version
|
||||
#api_base = "<API_BASE>" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
|
||||
#deployment_id = "<DEPLOYMENT_ID>" # The deployment name you chose when you deployed the engine
|
||||
#api_base = "" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
|
||||
#deployment_id = "" # The deployment name you chose when you deployed the engine
|
||||
#fallback_deployments = [] # For each fallback model specified in configuration.toml in the [config] section, specify the appropriate deployment_id
|
||||
|
||||
[anthropic]
|
||||
key = "" # Optional, uncomment if you want to use Anthropic. Acquire through https://www.anthropic.com/
|
||||
|
||||
[cohere]
|
||||
key = "" # Optional, uncomment if you want to use Cohere. Acquire through https://dashboard.cohere.ai/
|
||||
|
||||
[replicate]
|
||||
key = "" # Optional, uncomment if you want to use Replicate. Acquire through https://replicate.com/
|
||||
[github]
|
||||
# ---- Set the following only for deployment type == "user"
|
||||
user_token = "<TOKEN>" # A GitHub personal access token with 'repo' scope.
|
||||
user_token = "" # A GitHub personal access token with 'repo' scope.
|
||||
deployment_type = "user" #set to user by default
|
||||
|
||||
# ---- Set the following only for deployment type == "app", see README for details.
|
||||
private_key = """\
|
||||
|
@ -7,14 +7,17 @@ publish_output_progress=true
|
||||
verbosity_level=0 # 0,1,2
|
||||
use_extra_bad_extensions=false
|
||||
use_repo_settings_file=true
|
||||
ai_timeout=180
|
||||
max_description_tokens = 500
|
||||
max_commits_tokens = 500
|
||||
|
||||
[pr_reviewer] # /review #
|
||||
require_focused_review=true
|
||||
require_score_review=false
|
||||
require_tests_review=true
|
||||
require_security_review=true
|
||||
num_code_suggestions=0
|
||||
inline_code_comments = true
|
||||
num_code_suggestions=3
|
||||
inline_code_comments = false
|
||||
ask_and_reflect=false
|
||||
extra_instructions = ""
|
||||
|
||||
@ -32,6 +35,8 @@ extra_instructions = ""
|
||||
push_changelog_changes=false
|
||||
extra_instructions = ""
|
||||
|
||||
[pr_config] # /config #
|
||||
|
||||
[github]
|
||||
# The type of deployment to create. Valid values are 'app' or 'user'.
|
||||
deployment_type = "user"
|
||||
|
@ -73,6 +73,11 @@ Description: '{{description}}'
|
||||
{%- if language %}
|
||||
Main language: {{language}}
|
||||
{%- endif %}
|
||||
{%- if commit_messages_str %}
|
||||
|
||||
Commit messages:
|
||||
{{commit_messages_str}}
|
||||
{%- endif %}
|
||||
|
||||
|
||||
The PR Diff:
|
||||
|
@ -2,38 +2,67 @@
|
||||
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
|
||||
Your task is to provide full description of the PR content.
|
||||
- Make sure not to focus the new PR code (the '+' lines).
|
||||
|
||||
- Notice that the 'Previous title', 'Previous description' and 'Commit messages' sections may be partial, simplistic, non-informative or not up-to-date. Hence, compare them to the PR diff code, and use them only as a reference.
|
||||
- If needed, each YAML output should be in block scalar format ('|-')
|
||||
{%- if extra_instructions %}
|
||||
|
||||
Extra instructions from the user:
|
||||
{{ extra_instructions }}
|
||||
{% endif %}
|
||||
|
||||
You must use the following JSON schema to format your answer:
|
||||
```json
|
||||
{
|
||||
"PR Title": {
|
||||
"type": "string",
|
||||
"description": "an informative title for the PR, describing its main theme"
|
||||
},
|
||||
"PR Type": {
|
||||
"type": "string",
|
||||
"description": possible values are: ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
|
||||
},
|
||||
"PR Description": {
|
||||
"type": "string",
|
||||
"description": "an informative and concise description of the PR"
|
||||
},
|
||||
"PR Main Files Walkthrough": {
|
||||
"type": "string",
|
||||
"description": "a walkthrough of the PR changes. Review main files, in bullet points, and shortly describe the changes in each file (up to 10 most important files). Format: -`filename`: description of changes\n..."
|
||||
}
|
||||
}
|
||||
You must use the following YAML schema to format your answer:
|
||||
```yaml
|
||||
PR Title:
|
||||
type: string
|
||||
description: an informative title for the PR, describing its main theme
|
||||
PR Type:
|
||||
type: array
|
||||
items:
|
||||
type: string
|
||||
enum:
|
||||
- Bug fix
|
||||
- Tests
|
||||
- Bug fix with tests
|
||||
- Refactoring
|
||||
- Enhancement
|
||||
- Documentation
|
||||
- Other
|
||||
PR Description:
|
||||
type: string
|
||||
description: an informative and concise description of the PR
|
||||
PR Main Files Walkthrough:
|
||||
type: array
|
||||
maxItems: 10
|
||||
description: |-
|
||||
a walkthrough of the PR changes. Review main files, and shortly describe the changes in each file (up to 10 most important files).
|
||||
items:
|
||||
filename:
|
||||
type: string
|
||||
description: the relevant file full path
|
||||
changes in file:
|
||||
type: string
|
||||
description: minimal and concise description of the changes in the relevant file
|
||||
|
||||
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
|
||||
Example output:
|
||||
```yaml
|
||||
PR Title: |-
|
||||
...
|
||||
PR Type:
|
||||
- Bug fix
|
||||
PR Description: |-
|
||||
...
|
||||
PR Main Files Walkthrough:
|
||||
- ...
|
||||
- ...
|
||||
```
|
||||
|
||||
Make sure to output a valid YAML. Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
"""
|
||||
|
||||
user="""PR Info:
|
||||
Previous title: '{{title}}'
|
||||
Previous description: '{{description}}'
|
||||
Branch: '{{branch}}'
|
||||
{%- if language %}
|
||||
|
||||
@ -52,6 +81,6 @@ The PR Git Diff:
|
||||
```
|
||||
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
|
||||
|
||||
Response (should be a valid JSON, and nothing else):
|
||||
```json
|
||||
Response (should be a valid YAML, and nothing else):
|
||||
```yaml
|
||||
"""
|
||||
|
@ -21,6 +21,11 @@ Description: '{{description}}'
|
||||
{%- if language %}
|
||||
Main language: {{language}}
|
||||
{%- endif %}
|
||||
{%- if commit_messages_str %}
|
||||
|
||||
Commit messages:
|
||||
{{commit_messages_str}}
|
||||
{%- endif %}
|
||||
|
||||
|
||||
The PR Git Diff:
|
||||
|
@ -13,6 +13,11 @@ Description: '{{description}}'
|
||||
{%- if language %}
|
||||
Main language: {{language}}
|
||||
{%- endif %}
|
||||
{%- if commit_messages_str %}
|
||||
|
||||
Commit messages:
|
||||
{{commit_messages_str}}
|
||||
{%- endif %}
|
||||
|
||||
|
||||
The PR Git Diff:
|
||||
|
@ -1,12 +1,13 @@
|
||||
[pr_review_prompt]
|
||||
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
|
||||
Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines).
|
||||
- Provide up to {{ num_code_suggestions }} code suggestions.
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
- Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Provide up to {{ num_code_suggestions }} code suggestions.
|
||||
- Try to focus on the most important suggestions, like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
|
||||
- Suggestions should focus on improving the new added code lines.
|
||||
- Make sure not to provide suggestions repeating modifications already implemented in the new PR code (the '+' lines).
|
||||
{%- endif %}
|
||||
- If needed, each YAML output should be in block scalar format ('|-')
|
||||
|
||||
{%- if extra_instructions %}
|
||||
|
||||
@ -14,117 +15,121 @@ Extra instructions from the user:
|
||||
{{ extra_instructions }}
|
||||
{% endif %}
|
||||
|
||||
You must use the following JSON schema to format your answer:
|
||||
```json
|
||||
{
|
||||
"PR Analysis": {
|
||||
"Main theme": {
|
||||
"type": "string",
|
||||
"description": "a short explanation of the PR"
|
||||
},
|
||||
"Type of PR": {
|
||||
"type": "string",
|
||||
"enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
|
||||
},
|
||||
You must use the following YAML schema to format your answer:
|
||||
```yaml
|
||||
PR Analysis:
|
||||
Main theme:
|
||||
type: string
|
||||
description: a short explanation of the PR
|
||||
Type of PR:
|
||||
type: string
|
||||
enum:
|
||||
- Bug fix
|
||||
- Tests
|
||||
- Refactoring
|
||||
- Enhancement
|
||||
- Documentation
|
||||
- Other
|
||||
{%- if require_score %}
|
||||
"Score": {
|
||||
"type": "int",
|
||||
"description": "Rate this PR on a scale of 0-100 (inclusive), where 0 means the worst possible PR code, and 100 means PR code of the highest quality, without any bugs or performance issues, that is ready to be merged immediately and run in production at scale."
|
||||
},
|
||||
Score:
|
||||
type: int
|
||||
description: >-
|
||||
Rate this PR on a scale of 0-100 (inclusive), where 0 means the worst
|
||||
possible PR code, and 100 means PR code of the highest quality, without
|
||||
any bugs or performance issues, that is ready to be merged immediately and
|
||||
run in production at scale.
|
||||
{%- endif %}
|
||||
{%- if require_tests %}
|
||||
"Relevant tests added": {
|
||||
"type": "string",
|
||||
"description": "yes\\no question: does this PR have relevant tests ?"
|
||||
},
|
||||
Relevant tests added:
|
||||
type: string
|
||||
description: yes\\no question: does this PR have relevant tests ?
|
||||
{%- endif %}
|
||||
{%- if question_str %}
|
||||
"Insights from user's answer": {
|
||||
"type": "string",
|
||||
"description": "shortly summarize the insights you gained from the user's answers to the questions"
|
||||
},
|
||||
Insights from user's answer:
|
||||
type: string
|
||||
description: >-
|
||||
shortly summarize the insights you gained from the user's answers to the questions
|
||||
{%- endif %}
|
||||
{%- if require_focused %}
|
||||
"Focused PR": {
|
||||
"type": "string",
|
||||
"description": "Is this a focused PR, in the sense that it has a clear and coherent title and description, and all PR code diff changes are properly derived from the title and description? Explain your response."
|
||||
}
|
||||
},
|
||||
Focused PR:
|
||||
type: string
|
||||
description: >-
|
||||
Is this a focused PR, in the sense that all the PR code diff changes are
|
||||
united under a single focused theme ? If the theme is too broad, or the PR
|
||||
code diff changes are too scattered, then the PR is not focused. Explain
|
||||
your answer shortly.
|
||||
{%- endif %}
|
||||
"PR Feedback": {
|
||||
"General PR suggestions": {
|
||||
"type": "string",
|
||||
"description": "General suggestions and feedback for the contributors and maintainers of this PR. May include important suggestions for the overall structure, primary purpose, best practices, critical bugs, and other aspects of the PR. Explain your suggestions."
|
||||
},
|
||||
PR Feedback:
|
||||
General suggestions:
|
||||
type: string
|
||||
description: >-
|
||||
General suggestions and feedback for the contributors and maintainers of
|
||||
this PR. May include important suggestions for the overall structure,
|
||||
primary purpose, best practices, critical bugs, and other aspects of the
|
||||
PR. Don't address PR title and description, or lack of tests. Explain your
|
||||
suggestions.
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
"Code suggestions": {
|
||||
"type": "array",
|
||||
"maxItems": {{ num_code_suggestions }},
|
||||
"uniqueItems": true,
|
||||
"items": {
|
||||
"relevant file": {
|
||||
"type": "string",
|
||||
"description": "the relevant file full path"
|
||||
},
|
||||
"suggestion content": {
|
||||
"type": "string",
|
||||
"description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
|
||||
},
|
||||
"relevant line in file": {
|
||||
"type": "string",
|
||||
"description": "an authentic single code line from the PR git diff section, to which the suggestion applies."
|
||||
}
|
||||
}
|
||||
},
|
||||
Code feedback:
|
||||
type: array
|
||||
maxItems: {{ num_code_suggestions }}
|
||||
uniqueItems: true
|
||||
items:
|
||||
relevant file:
|
||||
type: string
|
||||
description: the relevant file full path
|
||||
suggestion:
|
||||
type: string
|
||||
description: |
|
||||
a concrete suggestion for meaningfully improving the new PR code. Also
|
||||
describe how, specifically, the suggestion can be applied to new PR
|
||||
code. Add tags with importance measure that matches each suggestion
|
||||
('important' or 'medium'). Do not make suggestions for updating or
|
||||
adding docstrings, renaming PR title and description, or linter like.
|
||||
relevant line:
|
||||
type: string
|
||||
description: |
|
||||
a single code line taken from the relevant file, to which the suggestion applies.
|
||||
The line should be a '+' line.
|
||||
Make sure to output the line exactly as it appears in the relevant file
|
||||
{%- endif %}
|
||||
{%- if require_security %}
|
||||
"Security concerns": {
|
||||
"type": "string",
|
||||
"description": "yes\\no question: does this PR code introduce possible security concerns or issues, like SQL injection, XSS, CSRF, and others ? explain your answer"
|
||||
? explain your answer"
|
||||
}
|
||||
Security concerns:
|
||||
type: string
|
||||
description: >-
|
||||
yes\\no question: does this PR code introduce possible security concerns or
|
||||
issues, like SQL injection, XSS, CSRF, and others ? If answered 'yes', explain your answer shortly
|
||||
{%- endif %}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Example output:
|
||||
'
|
||||
{
|
||||
"PR Analysis":
|
||||
{
|
||||
"Main theme": "xxx",
|
||||
"Type of PR": "Bug fix",
|
||||
```yaml
|
||||
PR Analysis:
|
||||
Main theme: xxx
|
||||
Type of PR: Bug fix
|
||||
{%- if require_score %}
|
||||
"Score": 89,
|
||||
{%- endif %}
|
||||
{%- if require_tests %}
|
||||
"Relevant tests added": "No",
|
||||
Score: 89
|
||||
{%- endif %}
|
||||
Relevant tests added: No
|
||||
{%- if require_focused %}
|
||||
"Focused PR": "yes\\no, because ..."
|
||||
Focused PR: no, because ...
|
||||
{%- endif %}
|
||||
},
|
||||
"PR Feedback":
|
||||
{
|
||||
"General PR suggestions": "..., `xxx`...",
|
||||
PR Feedback:
|
||||
General PR suggestions: ...
|
||||
{%- if num_code_suggestions > 0 %}
|
||||
"Code suggestions": [
|
||||
{
|
||||
"relevant file": "directory/xxx.py",
|
||||
"suggestion content": "xxx [important]",
|
||||
"relevant line in file": "xxx",
|
||||
},
|
||||
...
|
||||
]
|
||||
Code feedback:
|
||||
- relevant file: |-
|
||||
directory/xxx.py
|
||||
suggestion: xxx [important]
|
||||
relevant line: |-
|
||||
xxx
|
||||
...
|
||||
{%- endif %}
|
||||
{%- if require_security %}
|
||||
"Security concerns": "No, because ..."
|
||||
Security concerns: No
|
||||
{%- endif %}
|
||||
}
|
||||
}
|
||||
'
|
||||
```
|
||||
|
||||
Make sure to output a valid YAML. Use multi-line block scalar ('|') if needed.
|
||||
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
|
||||
"""
|
||||
|
||||
@ -135,6 +140,11 @@ Description: '{{description}}'
|
||||
{%- if language %}
|
||||
Main language: {{language}}
|
||||
{%- endif %}
|
||||
{%- if commit_messages_str %}
|
||||
|
||||
Commit messages:
|
||||
{{commit_messages_str}}
|
||||
{%- endif %}
|
||||
|
||||
{%- if question_str %}
|
||||
######
|
||||
@ -153,6 +163,6 @@ The PR Git Diff:
|
||||
```
|
||||
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
|
||||
|
||||
Response (should be a valid JSON, and nothing else):
|
||||
```json
|
||||
Response (should be a valid YAML, and nothing else):
|
||||
```yaml
|
||||
"""
|
||||
|
@ -19,6 +19,11 @@ Description: '{{description}}'
|
||||
{%- if language %}
|
||||
Main language: {{language}}
|
||||
{%- endif %}
|
||||
{%- if commit_messages_str %}
|
||||
|
||||
Commit messages:
|
||||
{{commit_messages_str}}
|
||||
{%- endif %}
|
||||
|
||||
|
||||
The PR Diff:
|
||||
|
@ -34,6 +34,7 @@ class PRCodeSuggestions:
|
||||
"diff": "", # empty diff for initial calculation
|
||||
"num_code_suggestions": get_settings().pr_code_suggestions.num_code_suggestions,
|
||||
"extra_instructions": get_settings().pr_code_suggestions.extra_instructions,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
self.token_handler = TokenHandler(self.git_provider.pr,
|
||||
self.vars,
|
||||
@ -57,12 +58,12 @@ class PRCodeSuggestions:
|
||||
|
||||
async def _prepare_prediction(self, model: str):
|
||||
logging.info('Getting PR diff...')
|
||||
# we are using extended hunk with line numbers for code suggestions
|
||||
self.patches_diff = get_pr_diff(self.git_provider,
|
||||
self.token_handler,
|
||||
model,
|
||||
add_line_numbers_to_hunks=True,
|
||||
disable_extra_lines=True)
|
||||
|
||||
logging.info('Getting AI prediction...')
|
||||
self.prediction = await self._get_prediction(model)
|
||||
|
||||
@ -92,6 +93,10 @@ class PRCodeSuggestions:
|
||||
|
||||
def push_inline_code_suggestions(self, data):
|
||||
code_suggestions = []
|
||||
|
||||
if not data['Code suggestions']:
|
||||
return self.git_provider.publish_comment('No suggestions found to improve this PR.')
|
||||
|
||||
for d in data['Code suggestions']:
|
||||
try:
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
|
48
pr_agent/tools/pr_config.py
Normal file
@ -0,0 +1,48 @@
|
||||
import logging
|
||||
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
|
||||
|
||||
class PRConfig:
|
||||
"""
|
||||
The PRConfig class is responsible for listing all configuration options available for the user.
|
||||
"""
|
||||
def __init__(self, pr_url: str, args=None):
|
||||
"""
|
||||
Initialize the PRConfig object with the necessary attributes and objects to comment on a pull request.
|
||||
|
||||
Args:
|
||||
pr_url (str): The URL of the pull request to be reviewed.
|
||||
args (list, optional): List of arguments passed to the PRConfig class. Defaults to None.
|
||||
"""
|
||||
self.git_provider = get_git_provider()(pr_url)
|
||||
|
||||
async def run(self):
|
||||
logging.info('Getting configuration settings...')
|
||||
logging.info('Preparing configs...')
|
||||
pr_comment = self._prepare_pr_configs()
|
||||
if get_settings().config.publish_output:
|
||||
logging.info('Pushing configs...')
|
||||
self.git_provider.publish_comment(pr_comment)
|
||||
self.git_provider.remove_initial_comment()
|
||||
return ""
|
||||
|
||||
def _prepare_pr_configs(self) -> str:
|
||||
import tomli
|
||||
with open(get_settings().find_file("configuration.toml"), "rb") as conf_file:
|
||||
configuration_headers = [header.lower() for header in tomli.load(conf_file).keys()]
|
||||
relevant_configs = {
|
||||
header: configs for header, configs in get_settings().to_dict().items()
|
||||
if header.lower().startswith("pr_") and header.lower() in configuration_headers
|
||||
}
|
||||
comment_str = "Possible Configurations:"
|
||||
for header, configs in relevant_configs.items():
|
||||
if configs:
|
||||
comment_str += "\n"
|
||||
for key, value in configs.items():
|
||||
comment_str += f"\n{header.lower()}.{key.lower()} = {repr(value) if isinstance(value, str) else value}"
|
||||
comment_str += " "
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"comment_str:\n{comment_str}")
|
||||
return comment_str
|
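Note on `_prepare_pr_configs` above: it keeps only the settings sections whose name starts with `pr_` and that also appear as headers in `configuration.toml`, then prints one `section.key = value` line per entry. A simplified sketch of that filter follows, assuming plain dictionaries instead of the settings object and the TOML file. `format_configs` is a hypothetical name, and the trailing spaces the original appends after each section are omitted.

```python
from typing import Any, Dict, List


def format_configs(settings: Dict[str, Dict[str, Any]], known_headers: List[str]) -> str:
    known = {h.lower() for h in known_headers}
    lines = ["Possible Configurations:"]
    for header, configs in settings.items():
        name = header.lower()
        # Keep only tool sections ("pr_*") that the configuration file declares.
        if not (name.startswith("pr_") and name in known) or not configs:
            continue
        lines.append("")  # blank line between sections
        for key, value in configs.items():
            # Strings are shown with quotes via repr(); other values as-is.
            shown = repr(value) if isinstance(value, str) else value
            lines.append(f"{name}.{key.lower()} = {shown}")
    return "\n".join(lines)
```

Checking against the declared headers filters out sections that exist only at runtime, so users see just the options they can override with `--section.key=value`.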
@ -8,6 +8,7 @@ from jinja2 import Environment, StrictUndefined
|
||||
from pr_agent.algo.ai_handler import AiHandler
|
||||
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
|
||||
from pr_agent.algo.token_handler import TokenHandler
|
||||
from pr_agent.algo.utils import load_yaml
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
from pr_agent.git_providers.git_provider import get_main_pr_language
|
||||
@ -27,7 +28,6 @@ class PRDescription:
|
||||
self.main_pr_language = get_main_pr_language(
|
||||
self.git_provider.get_languages(), self.git_provider.get_files()
|
||||
)
|
||||
commit_messages_str = self.git_provider.get_commit_messages()
|
||||
|
||||
# Initialize the AI handler
|
||||
self.ai_handler = AiHandler()
|
||||
@ -40,7 +40,7 @@ class PRDescription:
|
||||
"language": self.main_pr_language,
|
||||
"diff": "", # empty diff for initial calculation
|
||||
"extra_instructions": get_settings().pr_description.extra_instructions,
|
||||
"commit_messages_str": commit_messages_str
|
||||
"commit_messages_str": self.git_provider.get_commit_messages()
|
||||
}
|
||||
|
||||
# Initialize the token handler
|
||||
@ -140,34 +140,45 @@ class PRDescription:
|
||||
- title: a string containing the PR title.
|
||||
- pr_body: a string containing the PR body in a markdown format.
|
||||
- pr_types: a list of strings containing the PR types.
|
||||
- markdown_text: a string containing the AI prediction data in a markdown format.
|
||||
- markdown_text: a string containing the AI prediction data in a markdown format, used for publishing a comment.
|
||||
"""
|
||||
# Load the AI prediction data into a dictionary
|
||||
data = json.loads(self.prediction)
|
||||
data = load_yaml(self.prediction.strip())
|
||||
|
||||
# Initialization
|
||||
markdown_text = pr_body = ""
|
||||
pr_types = []
|
||||
|
||||
# Iterate over the dictionary items and append the key and value to 'markdown_text' in a markdown format
|
||||
markdown_text = ""
|
||||
for key, value in data.items():
|
||||
markdown_text += f"## {key}\n\n"
|
||||
markdown_text += f"{value}\n\n"
|
||||
|
||||
# If the 'PR Type' key is present in the dictionary, split its value by comma and assign it to 'pr_types'
|
||||
if 'PR Type' in data:
|
||||
pr_types = data['PR Type'].split(',')
|
||||
if type(data['PR Type']) == list:
|
||||
pr_types = data['PR Type']
|
||||
elif type(data['PR Type']) == str:
|
||||
pr_types = data['PR Type'].split(',')
|
||||
|
||||
# Assign the value of the 'PR Title' key to 'title' variable and remove it from the dictionary
|
||||
title = data.pop('PR Title')
|
||||
|
||||
# Iterate over the remaining dictionary items and append the key and value to 'pr_body' in a markdown format,
|
||||
# except for the items containing the word 'walkthrough'
|
||||
pr_body = ""
|
||||
for key, value in data.items():
|
||||
pr_body += f"## {key}:\n"
|
||||
if 'walkthrough' in key.lower():
|
||||
pr_body += f"{value}\n"
|
||||
# for filename, description in value.items():
|
||||
for file in value:
|
||||
filename = file['filename'].replace("'", "`")
|
||||
description = file['changes in file']
|
||||
pr_body += f'`{filename}`: {description}\n'
|
||||
else:
|
||||
# if the value is a list, join its items by comma
|
||||
if type(value) == list:
|
||||
value = ', '.join(v for v in value)
|
||||
pr_body += f"{value}\n\n___\n"
|
||||
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
|
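Note on the `PRDescription` hunk above: after the switch to YAML, `PR Type` may arrive as a list or as a comma-separated string, and the walkthrough becomes a list of `filename` / `changes in file` entries. A small sketch of both steps follows, assuming already-parsed dictionaries. `normalize_pr_types` and `render_walkthrough` are hypothetical helper names, and stripping whitespace around each type is an addition not present in the diff.

```python
from typing import Any, Dict, List


def normalize_pr_types(value: Any) -> List[str]:
    # A YAML list is used as-is; a plain string is split on commas.
    if isinstance(value, list):
        return [str(v) for v in value]
    if isinstance(value, str):
        return [part.strip() for part in value.split(",")]
    return []


def render_walkthrough(files: List[Dict[str, str]]) -> str:
    lines = []
    for entry in files:
        # Single quotes inside the filename are swapped for backticks, as in the hunk.
        filename = entry["filename"].replace("'", "`")
        lines.append(f"`{filename}`: {entry['changes in file']}")
    return "\n".join(lines)
```

Using `isinstance` rather than `type(x) == list` also accepts subclasses, which is the more common idiom in Python.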
@ -24,6 +24,7 @@ class PRInformationFromUser:
|
||||
"description": self.git_provider.get_pr_description(),
|
||||
"language": self.main_pr_language,
|
||||
"diff": "", # empty diff for initial calculation
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
self.token_handler = TokenHandler(self.git_provider.pr,
|
||||
self.vars,
|
||||
|
@ -27,6 +27,7 @@ class PRQuestions:
|
||||
"language": self.main_pr_language,
|
||||
"diff": "", # empty diff for initial calculation
|
||||
"questions": self.question_str,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
self.token_handler = TokenHandler(self.git_provider.pr,
|
||||
self.vars,
|
||||
|
@ -4,12 +4,15 @@ import logging
|
||||
from collections import OrderedDict
|
||||
from typing import List, Tuple
|
||||
|
||||
import yaml
|
||||
from jinja2 import Environment, StrictUndefined
|
||||
from yaml import SafeLoader
|
||||
|
||||
from pr_agent.algo.ai_handler import AiHandler
|
||||
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
|
||||
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, \
|
||||
find_line_number_of_relevant_line_in_file, clip_tokens
|
||||
from pr_agent.algo.token_handler import TokenHandler
|
||||
from pr_agent.algo.utils import convert_to_markdown, try_fix_json
|
||||
from pr_agent.algo.utils import convert_to_markdown, try_fix_json, try_fix_yaml, load_yaml
|
||||
from pr_agent.config_loader import get_settings
|
||||
from pr_agent.git_providers import get_git_provider
|
||||
from pr_agent.git_providers.git_provider import IncrementalPR, get_main_pr_language
|
||||
@ -59,6 +62,7 @@ class PRReviewer:
|
||||
'question_str': question_str,
|
||||
'answer_str': answer_str,
|
||||
"extra_instructions": get_settings().pr_reviewer.extra_instructions,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
|
||||
self.token_handler = TokenHandler(
|
||||
@ -158,28 +162,43 @@ class PRReviewer:
|
||||
Prepare the PR review by processing the AI prediction and generating a markdown-formatted text that summarizes
|
||||
the feedback.
|
||||
"""
|
||||
review = self.prediction.strip()
|
||||
|
||||
try:
|
||||
data = json.loads(review)
|
||||
except json.decoder.JSONDecodeError:
|
||||
data = try_fix_json(review)
|
||||
data = load_yaml(self.prediction.strip())
|
||||
|
||||
# Move 'Security concerns' key to 'PR Analysis' section for better display
|
||||
if 'PR Feedback' in data and 'Security concerns' in data['PR Feedback']:
|
||||
val = data['PR Feedback']['Security concerns']
|
||||
del data['PR Feedback']['Security concerns']
|
||||
data['PR Analysis']['Security concerns'] = val
|
||||
pr_feedback = data.get('PR Feedback', {})
|
||||
security_concerns = pr_feedback.get('Security concerns')
|
||||
if security_concerns is not None:
|
||||
del pr_feedback['Security concerns']
|
||||
if type(security_concerns) == bool and security_concerns == False:
|
||||
data.setdefault('PR Analysis', {})['Security concerns'] = 'No security concerns found'
|
||||
else:
|
||||
data.setdefault('PR Analysis', {})['Security concerns'] = security_concerns
|
||||
|
||||
# Filter out code suggestions that can be submitted as inline comments
|
||||
if get_settings().config.git_provider != 'bitbucket' and get_settings().pr_reviewer.inline_code_comments \
|
||||
and 'Code suggestions' in data['PR Feedback']:
|
||||
data['PR Feedback']['Code suggestions'] = [
|
||||
d for d in data['PR Feedback']['Code suggestions']
|
||||
if any(key not in d for key in ('relevant file', 'relevant line in file', 'suggestion content'))
|
||||
]
|
||||
if not data['PR Feedback']['Code suggestions']:
|
||||
del data['PR Feedback']['Code suggestions']
|
||||
#
|
||||
if 'Code feedback' in pr_feedback:
|
||||
code_feedback = pr_feedback['Code feedback']
|
||||
|
||||
# Filter out code suggestions that can be submitted as inline comments
|
||||
if get_settings().pr_reviewer.inline_code_comments:
|
||||
del pr_feedback['Code feedback']
|
||||
else:
|
||||
for suggestion in code_feedback:
|
||||
if ('relevant file' in suggestion) and (not suggestion['relevant file'].startswith('``')):
|
||||
suggestion['relevant file'] = f"``{suggestion['relevant file']}``"
|
||||
|
||||
if 'relevant line' not in suggestion:
|
||||
suggestion['relevant line'] = ''
|
||||
|
||||
relevant_line_str = suggestion['relevant line'].split('\n')[0]
|
||||
|
||||
# removing '+'
|
||||
suggestion['relevant line'] = relevant_line_str.lstrip('+').strip()
|
||||
|
||||
# try to add line numbers link to code suggestions
|
||||
if hasattr(self.git_provider, 'generate_link_to_relevant_line_number'):
|
||||
link = self.git_provider.generate_link_to_relevant_line_number(suggestion)
|
||||
if link:
|
||||
suggestion['relevant line'] = f"[{suggestion['relevant line']}]({link})"
|
||||
|
||||
# Add incremental review section
|
||||
if self.incremental.is_incremental:
|
||||
@ -204,7 +223,10 @@ class PRReviewer:
|
||||
# Log markdown response if verbosity level is high
|
||||
if get_settings().config.verbosity_level >= 2:
|
||||
logging.info(f"Markdown response:\n{markdown_text}")
|
||||
|
||||
|
||||
if markdown_text is None or len(markdown_text) == 0:
|
||||
markdown_text = ""
|
||||
|
||||
return markdown_text
|
||||
|
||||
def _publish_inline_code_comments(self) -> None:
|
||||
@ -214,17 +236,19 @@ class PRReviewer:
|
||||
if get_settings().pr_reviewer.num_code_suggestions == 0:
|
||||
return
|
||||
|
||||
review = self.prediction.strip()
|
||||
review_text = self.prediction.strip()
|
||||
review_text = review_text.removeprefix('```yaml').rstrip('`')
|
||||
try:
|
||||
data = json.loads(review)
|
||||
except json.decoder.JSONDecodeError:
|
||||
data = try_fix_json(review)
|
||||
data = yaml.load(review_text, Loader=SafeLoader)
|
||||
except Exception as e:
|
||||
logging.error(f"Failed to parse AI prediction: {e}")
|
||||
data = try_fix_yaml(review_text)
|
||||
|
||||
comments: List[str] = []
|
||||
for suggestion in data.get('PR Feedback', {}).get('Code suggestions', []):
|
||||
for suggestion in data.get('PR Feedback', {}).get('Code feedback', []):
|
||||
relevant_file = suggestion.get('relevant file', '').strip()
|
||||
relevant_line_in_file = suggestion.get('relevant line in file', '').strip()
|
||||
content = suggestion.get('suggestion content', '')
|
||||
relevant_line_in_file = suggestion.get('relevant line', '').strip()
|
||||
content = suggestion.get('suggestion', '')
|
||||
if not relevant_file or not relevant_line_in_file or not content:
|
||||
logging.info("Skipping inline comment with missing file/line/content")
|
||||
continue
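The `_publish_inline_code_comments` hunk above removes a Markdown fence from the model output by calling `removeprefix` with the opening yaml fence and then `rstrip` with a single backtick. Two things are worth noting: `str.removeprefix` requires Python 3.9 or later, and `rstrip` with a backtick removes every trailing backtick, including one that belongs to the last value. The following is a minimal sketch of a stricter alternative, not the project's code; `strip_yaml_fence` and `FENCE` are hypothetical names introduced here for illustration.

```python
FENCE = "`" * 3  # a Markdown code-fence marker, built this way to keep the example readable


def strip_yaml_fence(text: str) -> str:
    """Remove one surrounding Markdown fence (optionally tagged 'yaml') from text."""
    text = text.strip()
    opening = FENCE + "yaml"
    if text.startswith(opening):
        text = text[len(opening):]
    elif text.startswith(FENCE):
        text = text[len(FENCE):]
    # Only an exact three-backtick closing fence is removed, so a value that
    # legitimately ends in a single backtick is left untouched.
    if text.endswith(FENCE):
        text = text[:-len(FENCE)]
    return text.strip()
```

The cleaned string could then be handed to the YAML parser in place of `review_text`.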
|
||||
|
@ -38,6 +38,7 @@ class PRUpdateChangelog:
|
||||
"changelog_file_str": self.changelog_file_str,
|
||||
"today": date.today(),
|
||||
"extra_instructions": get_settings().pr_update_changelog.extra_instructions,
|
||||
"commit_messages_str": self.git_provider.get_commit_messages(),
|
||||
}
|
||||
self.token_handler = TokenHandler(self.git_provider.pr,
|
||||
self.vars,
|
||||
|
@ -41,7 +41,9 @@ dependencies = [
|
||||
"aiohttp~=3.8.4",
|
||||
"atlassian-python-api==3.39.0",
|
||||
"GitPython~=3.1.32",
|
||||
"starlette-context==0.3.6"
|
||||
"starlette-context==0.3.6",
|
||||
"litellm~=0.1.351",
|
||||
"PyYAML==6.0"
|
||||
]
|
||||
|
||||
[project.urls]
|
||||
|
@ -1 +1,17 @@
|
||||
-e .
|
||||
dynaconf==3.1.12
|
||||
fastapi==0.99.0
|
||||
PyGithub==1.59.*
|
||||
retry==0.9.2
|
||||
openai==0.27.8
|
||||
Jinja2==3.1.2
|
||||
tiktoken==0.4.0
|
||||
uvicorn==0.22.0
|
||||
python-gitlab==3.15.0
|
||||
pytest~=7.4.0
|
||||
aiohttp~=3.8.4
|
||||
atlassian-python-api==3.39.0
|
||||
GitPython~=3.1.32
|
||||
litellm~=0.1.351
|
||||
PyYAML==6.0
|
||||
starlette-context==0.3.6
|
||||
litellm~=0.1.351
|
10
tests/unittest/test_bitbucket_provider.py
Normal file
@ -0,0 +1,10 @@
from pr_agent.git_providers.bitbucket_provider import BitbucketProvider


class TestBitbucketProvider:
    def test_parse_pr_url(self):
        url = "https://bitbucket.org/WORKSPACE_XYZ/MY_TEST_REPO/pull-requests/321"
        workspace_slug, repo_slug, pr_number = BitbucketProvider._parse_pr_url(url)
        assert workspace_slug == "WORKSPACE_XYZ"
        assert repo_slug == "MY_TEST_REPO"
        assert pr_number == 321
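The test above pins down the contract of `BitbucketProvider._parse_pr_url`: a pull-request URL goes in, and a `(workspace, repo, PR number)` tuple with an integer PR number comes out. The implementation is not part of this hunk, so the following is only a sketch of one way to satisfy that contract using the standard library; `parse_bitbucket_pr_url` is a hypothetical stand-in, and its error handling is an assumption.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_bitbucket_pr_url(pr_url: str) -> Tuple[str, str, int]:
    """Split a Bitbucket pull-request URL into (workspace, repo, pr_number)."""
    parsed = urlparse(pr_url)
    if "bitbucket.org" not in parsed.netloc:
        raise ValueError("The provided URL is not a Bitbucket URL")

    # Expected path shape: /<workspace>/<repo>/pull-requests/<number>
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 4 or parts[2] != "pull-requests":
        raise ValueError("The provided URL does not look like a Bitbucket PR URL")

    try:
        pr_number = int(parts[3])
    except ValueError as exc:
        raise ValueError("Unable to convert the PR number to an integer") from exc

    return parts[0], parts[1], pr_number
```

Raising `ValueError` for every malformed input keeps the caller's error handling to a single exception type.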
|
@ -51,7 +51,7 @@ class TestConvertToMarkdown:
|
||||
'Unrelated changes': 'n/a', # won't be included in the output
|
||||
'Focused PR': 'Yes',
|
||||
'General PR suggestions': 'general suggestion...',
|
||||
'Code suggestions': [
|
||||
'Code feedback': [
|
||||
{
|
||||
'Code example': {
|
||||
'Before': 'Code before',
|
||||
@ -73,7 +73,7 @@ class TestConvertToMarkdown:
|
||||
- ✨ **Focused PR:** Yes
|
||||
- 💡 **General PR suggestions:** general suggestion...
|
||||
|
||||
- 🤖 **Code suggestions:**
|
||||
- 🤖 **Code feedback:**
|
||||
|
||||
- **Code example:**
|
||||
- **Before:**
|
||||
|
@ -0,0 +1,68 @@

# Generated by CodiumAI
from pr_agent.git_providers.git_provider import FilePatchInfo
from pr_agent.algo.pr_processing import find_line_number_of_relevant_line_in_file


import pytest

class TestFindLineNumberOfRelevantLineInFile:
    # Tests that the function returns the correct line number and absolute position when the relevant line is found in the patch
    def test_relevant_line_found_in_patch(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+line2\n+relevant_line\n', filename='file1')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = 'relevant_line'
        expected = (3, 2)  # (position in patch, absolute_position in new file)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected

    # Tests that the function returns the correct line number and absolute position when a similar line is found using difflib
    def test_similar_line_found_using_difflib(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line in file similar match\n', filename='file1')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = '+relevant_line in file similar match '  # note the space at the end. This is to simulate a similar line found using difflib
        expected = (2, 1)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected

    # Tests that the function returns (-1, -1) when the relevant line is not found in the patch and no similar line is found using difflib
    def test_relevant_line_not_found(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file1')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = 'not_found'
        expected = (-1, -1)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected

    # Tests that the function returns (-1, -1) when the relevant file is not found in any of the patches
    def test_relevant_file_not_found(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file2')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = 'relevant_line'
        expected = (-1, -1)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected

    # Tests that the function returns (0, 0) when the relevant_line_in_file is an empty string
    def test_empty_relevant_line(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file1')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = ''
        expected = (0, 0)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected

    # Tests that the function returns (-1, -1) when the relevant_line_in_file is found in the patch but it is a deleted line
    def test_relevant_line_found_but_deleted(self):
        diff_files = [
            FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,2 +1,1 @@\n-line1\n-relevant_line\n', filename='file1')
        ]
        relevant_file = 'file1'
        relevant_line_in_file = 'relevant_line'
        expected = (-1, -1)
        assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
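The tests above describe a result of the form `(position in patch, absolute position in new file)`. The function under test is not shown in this diff, so the following is a simplified sketch of how such a pair can be computed from a unified-diff patch by exact matching on added lines. `locate_added_line` and `HUNK_HEADER` are hypothetical names; the sketch leaves out the difflib-based similarity fallback, and it returns `(-1, -1)` for an empty target, whereas the test above expects `(0, 0)` from the real function.

```python
import re
from typing import Tuple

# Captures the starting line number of the new-file side of a hunk header,
# e.g. the "1" after "+" in "@@ -1,1 +1,2 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def locate_added_line(patch: str, target: str) -> Tuple[int, int]:
    """Return (index of the line within the patch, line number in the new file).

    Returns (-1, -1) when no added line matches the target exactly.
    """
    target = target.lstrip("+").strip()
    if not target:
        return -1, -1

    new_line_no = 0
    for position, line in enumerate(patch.splitlines()):
        header = HUNK_HEADER.match(line)
        if header:
            # The next surviving line will carry this number in the new file.
            new_line_no = int(header.group(1)) - 1
            continue
        if line.startswith("-") or line.startswith("\\"):
            # Deleted lines and "\ No newline at end of file" markers
            # do not exist in the new file.
            continue
        new_line_no += 1
        if line.startswith("+") and line[1:].strip() == target:
            return position, new_line_no
    return -1, -1
```

With the patch from the first test, the hunk header is at index 0, so `relevant_line` sits at index 3 of the patch and at line 2 of the new file.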
|
32
tests/unittest/test_load_yaml.py
Normal file
@ -0,0 +1,32 @@

# Generated by CodiumAI

import pytest
from pr_agent.algo.utils import load_yaml


class TestLoadYaml:
    # Tests that load_yaml loads a valid YAML string
    def test_load_valid_yaml(self):
        yaml_str = 'name: John Smith\nage: 35'
        expected_output = {'name': 'John Smith', 'age': 35}
        assert load_yaml(yaml_str) == expected_output

    def test_load_complicated_yaml(self):
        yaml_str = \
'''\
PR Analysis:
  Main theme: Enhancing the `/describe` command prompt by adding title and description
  Type of PR: Enhancement
  Relevant tests added: No
  Focused PR: Yes, the PR is focused on enhancing the `/describe` command prompt.

PR Feedback:
  General suggestions: The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.
  Code feedback:
    - relevant file: pr_agent/settings/pr_description_prompts.toml
      suggestion: Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium]
      relevant line: 'user="""PR Info:'
  Security concerns: No'''
        expected_output = {'PR Analysis': {'Main theme': 'Enhancing the `/describe` command prompt by adding title and description', 'Type of PR': 'Enhancement', 'Relevant tests added': False, 'Focused PR': 'Yes, the PR is focused on enhancing the `/describe` command prompt.'}, 'PR Feedback': {'General suggestions': 'The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.', 'Code feedback': [{'relevant file': 'pr_agent/settings/pr_description_prompts.toml', 'suggestion': "Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium]", 'relevant line': 'user="""PR Info:'}], 'Security concerns': False}}
        assert load_yaml(yaml_str) == expected_output
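In the expected output above, `Security concerns: No` arrives as the boolean `False`, because PyYAML's safe loader resolves an unquoted `No` as a YAML 1.1 boolean. That is the case the `PRReviewer` hunk earlier in this diff handles when it moves the key into `PR Analysis`. A minimal sketch of that normalization as a standalone function, assuming the parsed review is a plain dict; `relocate_security_concerns` is a hypothetical name and not part of the project:

```python
from typing import Any, Dict


def relocate_security_concerns(data: Dict[str, Any]) -> Dict[str, Any]:
    """Move 'Security concerns' from 'PR Feedback' to 'PR Analysis' in place."""
    feedback = data.get("PR Feedback", {})
    value = feedback.get("Security concerns")
    if value is not None:
        del feedback["Security concerns"]
        analysis = data.setdefault("PR Analysis", {})
        # A bare YAML "No" has already been parsed to False at this point,
        # so it is replaced with a readable sentence for display.
        if value is False:
            analysis["Security concerns"] = "No security concerns found"
        else:
            analysis["Security concerns"] = value
    return data
```

Any other value, such as a free-text description of a concern, is carried over unchanged.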