Compare commits

168 Commits

Author SHA1 Message Date
20bbdac135 Test github action 2023-08-10 16:41:50 +03:00
ceedf2bf83 Merge branch 'main' into ok/test_action 2023-08-10 16:40:01 +03:00
2d6b947292 Test github action 2023-08-10 16:37:02 +03:00
2e13b12fe6 Merge pull request #193 from coditamar/fix/workflow_yaml_permissions
adding `permissions` to `review.yaml`, also adding some comments
2023-08-10 16:17:32 +03:00
2d56c88291 Merge remote-tracking branch 'upstream/main' into fix/workflow_yaml_permissions 2023-08-10 16:16:47 +03:00
cf9c6a872d Test github action 2023-08-10 16:09:29 +03:00
0bb8ab70a4 Merge remote-tracking branch 'origin/main' 2023-08-10 15:16:10 +03:00
4a47b78a90 Rename workflow 2023-08-10 15:16:03 +03:00
3e542cd88b adding permissions to review.yaml, also adding some comments 2023-08-10 08:10:10 +03:00
17ed050ca7 Merge pull request #192 from coditamar/fix/minor_cli_and_requirements_fixes
Correcting CLI and README Descriptions and Fixing Requirements.txt
2023-08-10 02:18:13 +03:00
e24c5e3501 Update requirements.txt 2023-08-10 02:16:16 +03:00
b206b1c5ff Protect for empty description 2023-08-10 02:08:36 +03:00
0270306d3c litellm was mentioned twice in the requirements.txt 2023-08-10 01:34:24 +03:00
3e09b9ac37 fixing pr_url param description (was wrongly mentioned as pr-url) 2023-08-10 01:31:06 +03:00
725ac9e85d fixing cli pr_url help description 2023-08-10 01:30:12 +03:00
e00500b90c PyYAML dependency 2023-08-10 00:56:28 +03:00
f1f271fa00 PyYAML dependency 2023-08-10 00:44:00 +03:00
d38c5236dd Merge pull request #187 from Codium-ai/ok/limit_description
Limiting Description and Commit Messages Length
2023-08-09 14:14:47 +03:00
49a3a1e511 Merge pull request #188 from Codium-ai/tr/update_review_prompt
Update PR Review and Description Generation to Use YAML
2023-08-09 14:14:36 +03:00
64481e2d84 block scalar 2023-08-09 14:01:48 +03:00
e0f295659d A less hacky way 2023-08-09 12:17:54 +03:00
fe75e3f2ec yaml
yaml
2023-08-09 12:15:52 +03:00
e3274af831 A (still) hacky way to clip description and commit messages 2023-08-09 10:17:58 +03:00
7760f37dee Merge pull request #185 from zmeir/zmeir-fix_inline_comment_position
Attempt to fix bug in create_inline_comment
2023-08-07 20:41:52 +03:00
ebbe655c40 Don't commment on Github, only eyes reaction 2023-08-07 18:09:39 +03:00
164ed77d72 Attempt to fix bug in create_inline_comment 2023-08-07 17:09:50 +03:00
b1148e5f7a Don't commment on Github, only eyes reaction 2023-08-07 16:34:28 +03:00
2012e25596 Merge pull request #182 from Codium-ai/ok/add_eyes_reaction
Add Eyes Reaction to Comments and Configure AI Timeout
2023-08-07 16:28:38 +03:00
a75253097b Don't remove eyes 2023-08-07 16:28:20 +03:00
079d62af56 Merge pull request #181 from Codium-ai/ok/inference_timeout
Configurable AI Timeout
2023-08-07 16:23:06 +03:00
886139c6b5 Support adding / removing reaction from comments in GitHub different servers 2023-08-07 16:18:08 +03:00
8f751f7371 Default timeout for AI is now 180s, configurable 2023-08-07 13:26:28 +03:00
43297b851f Merge pull request #177 from Codium-ai/tr/update_readme
Update README and CONFIGURATION Documentation
2023-08-07 09:26:12 +03:00
4f39239e73 readme update
readme update
2023-08-07 09:11:54 +03:00
00e1925927 Merge pull request #172 from krrishdholakia/patch-1
adding support for Anthropic, Cohere, Replicate, Azure
2023-08-06 18:38:36 +03:00
7189b3ab41 suggestions -> feedback 2023-08-06 18:20:39 +03:00
a00038fbd8 Merge remote-tracking branch 'origin/main' into patch-1 2023-08-06 18:09:09 +03:00
a45343793a Merge pull request #175 from Codium-ai/tr/review_adjustments
Making the 'Review' Feature Great Again
2023-08-06 12:14:43 +03:00
703215fe83 updating secrets template 2023-08-05 22:53:59 -07:00
0f975ccf4a bug fixes 2023-08-05 22:50:41 -07:00
7367c62cf9 TestFindLineNumberOfRelevantLineInFile 2023-08-06 08:31:15 +03:00
fed0ea349a find_line_number_of_relevant_line_in_file
find_line_number_of_relevant_line_in_file
2023-08-06 08:13:07 +03:00
bd86266a4b Merge pull request #173 from Codium-ai/tr/caching
Optimization of PR Diff Processing
2023-08-05 09:23:45 +03:00
bd07a0cd7f Update Configuration.md 2023-08-04 12:13:04 +03:00
ed8554699b bug fixes and updates 2023-08-03 16:05:46 -07:00
749ae1be79 Update CHANGELOG.md 2023-08-03 19:55:51 +00:00
0e3dbbd0f2 fix major bug in gitlab 2023-08-03 22:51:38 +03:00
7a57db5d88 load_large_diff is done once 2023-08-03 22:14:05 +03:00
102edcdcf1 adding support for Anthropic, Cohere, Replicate, Azure 2023-08-03 12:04:08 -07:00
c92648cbd5 caching 2023-08-03 21:38:18 +03:00
26b008565b Merge pull request #170 from Codium-ai/tr/edge_case_for_hunks
Handling edge case for hunks in git patch processing
2023-08-03 12:11:27 +03:00
0dec24aa37 edge case for hunks 2023-08-03 10:50:22 +03:00
68a2f2a27d fix requirement.txt 2023-08-03 10:19:51 +03:00
cfa14178f8 Merge pull request #168 from Codium-ai/tr/further_use_commit_messages
Use commit messages in PR tools
2023-08-03 07:58:25 +03:00
b97c4b6114 Update CHANGELOG.md 2023-08-02 18:36:34 +03:00
3d43cecbea Merge pull request #167 from zmeir/zmeir-list_configurations_as_comment
Add /config command to list the possible configuration settings
2023-08-02 18:35:20 +03:00
eb143ec851 Update CHANGELOG.md 2023-08-02 15:32:15 +00:00
3e94a71dcd commit_messages_str is used in all tools 2023-08-02 18:26:39 +03:00
dd14423b07 Add /config command to list the possible configuration settings 2023-08-02 16:42:54 +03:00
8e47fdc284 Merge pull request #164 from Codium-ai/ok/repo_config
Support for Repo-Specific Configuration File
2023-08-01 19:09:23 +03:00
ab607d74be Support repo-specific configuration file 2023-08-01 18:36:20 +03:00
bfe7304449 Support repo-specific configuration file 2023-08-01 18:04:52 +03:00
e12874b696 Support repo-specific configuration file 2023-08-01 17:44:08 +03:00
696e2bd6ff Support repo-specific configuration file 2023-08-01 17:27:25 +03:00
450f410e3c Support repo-specific configuration file 2023-08-01 17:22:03 +03:00
08a3f033cb Merge pull request #162 from Codium-ai/ok/settings_refactor
Refactor settings usage and CLI
2023-08-01 16:05:20 +03:00
c5a79ceedd Merge remote-tracking branch 'origin/main' into ok/settings_refactor 2023-08-01 16:01:04 +03:00
13547afc58 Merge pull request #163 from Codium-ai/tr/commit_messages
Adding Commit Messages Retrieval Functionality
2023-08-01 15:59:26 +03:00
8ae936e504 Bug fixes 2023-08-01 15:58:23 +03:00
e577d27f9b Update CHANGELOG.md 2023-08-01 12:38:31 +00:00
dfb73c963a get_commit_messages for gitlab 2023-08-01 15:30:14 +03:00
8c0370a166 Commit messages in pr-description 2023-08-01 15:15:59 +03:00
d7b77764c3 Support context aware settings (for each incoming request), support override of settings, refactor CLI to use pr_agent.py 2023-08-01 14:43:26 +03:00
6605f9c444 typos in 'commands_text' 2023-07-31 11:02:30 +03:00
2a8adcbbd6 update README.md 2023-07-30 22:16:56 +03:00
0b22c8d427 update README.md 2023-07-30 22:04:59 +03:00
dfa0d9fd43 update README.md 2023-07-30 22:01:14 +03:00
c8470645e2 add tests and update README.md 2023-07-30 21:54:07 +03:00
5a181e52d5 Merge pull request #159 from Codium-ai/tr/edit_any_config_setting
The Configurator Strikes Back
2023-07-30 15:19:07 +03:00
0ad8dcd2aa Merge remote-tracking branch 'origin/tr/edit_any_config_setting' into tr/edit_any_config_setting 2023-07-30 12:27:40 +03:00
e2d015a20c final 2023-07-30 12:27:32 +03:00
a0cfe4b48a Update CHANGELOG.md 2023-07-30 12:26:53 +03:00
a6ba8b614a Example args 2023-07-30 12:16:43 +03:00
4f0fabd2ca update_settings_from_args refactor 2023-07-30 12:14:26 +03:00
42b047a14e update_settings_from_args 2023-07-30 12:04:57 +03:00
3daf94954a update_settings_from_args 2023-07-30 11:43:44 +03:00
b564d8ac32 Merge pull request #147 from zmeir/zmeir-align_describe_styling
Minor improvements to describe command
2023-07-28 20:55:15 +03:00
d8e6da74db Update .dockerignore 2023-07-28 12:15:17 +03:00
278f1883fd Merge pull request #153 from marshally/fix_iteration_error_in_reflect_tmp
fix TypeError when iterating discussion_messages
2023-07-28 12:12:12 +03:00
ef71a7049e fix TypeError when iterating discussion_messages
When `pr-agent` is reviewing a long list of messages, a TypeError is thrown on the line

```python
for message in reversed(discussion_messages):
```

When reviewing the PyGithub library, the recommend an alternate syntax for iterating a paginated list in reverse.

https://github.com/PyGithub/PyGithub/blob/v1.59.0/github/PaginatedList.py#L122-L125

```
    If you want to iterate in reversed order, just do::

        for repo in user.get_repos().reversed:
            print(repo.name)
```

And here's a copy of the actual traceback

```
Traceback (most recent call last):
  File "/app/pr_agent/servers/github_action_runner.py", line 68, in <module>
    asyncio.run(run_action())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/app/pr_agent/servers/github_action_runner.py", line 64, in run_action
    await PRAgent().handle_request(pr_url, body)
  File "/app/pr_agent/agent/pr_agent.py", line 19, in handle_request
    await PRReviewer(pr_url, is_answer=True).review()
  File "/app/pr_agent/tools/pr_reviewer.py", line 49, in __init__
    answer_str, question_str = self._get_user_answers()
  File "/app/pr_agent/tools/pr_reviewer.py", line 253, in _get_user_answers
    for message in reversed(discussion_messages):
TypeError: object of type 'PaginatedList' has no len()
```
2023-07-28 11:04:46 +02:00
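The failure described in this commit message can be reproduced without PyGithub. The sketch below uses a hypothetical `LazyPages` class (not part of PyGithub) that, like the paginated list in the traceback, supports iteration and indexing but defines no `__len__`. Python's built-in `reversed()` falls back to the sequence protocol for such objects and needs a length, which is why the call raises `TypeError`. The `reversed` property mirrors the alternative quoted from the PyGithub docstring above; its implementation here is only an illustration.

```python
class LazyPages:
    """Hypothetical stand-in for a paginated result set (not PyGithub code)."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, index):
        return self._items[index]

    # Deliberately no __len__ and no __reversed__: the built-in reversed()
    # therefore cannot work on this object.

    @property
    def reversed(self):
        # Return a new object that iterates in the opposite order.
        return LazyPages(self._items[::-1])


def newest_first(pages):
    """Iterate in reverse via the property instead of the built-in."""
    return list(pages.reversed)
```

With this shape, `reversed(LazyPages([1, 2, 3]))` raises `TypeError`, while `newest_first(LazyPages([1, 2, 3]))` returns the items in reverse order.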
6fde87b3bd Merge pull request #152 from Codium-ai/tr/gitlab_fixes
Improvements and Error Handling for GitLab Provider
2023-07-28 11:40:53 +03:00
07fe91e57b Update CHANGELOG.md 2023-07-28 08:39:42 +00:00
01e2f3f0cd Merge pull request #150 from Codium-ai/ok/handle_installation_id_properly
Github App: handle concurrent requests from multiple installations of app
2023-07-28 11:38:14 +03:00
63a703c000 Handle marketplace hook 2023-07-28 11:30:51 +03:00
4664d91844 bug fixes in gitlab code suggestion 2023-07-28 11:24:14 +03:00
8f16c46012 try-except 2023-07-28 10:52:49 +03:00
a8780f722d Handle marketplace hook 2023-07-28 03:22:25 +03:00
1a8fce1505 Updated handling of installation id 2023-07-28 02:44:28 +03:00
8519b106f9 Updated .gitignore 2023-07-28 02:28:50 +03:00
d375dd62fe Merge pull request #141 from patryk-kowalski-ds/pg/pip_package
Transition to pip package with pyproject.toml
2023-07-28 02:23:06 +03:00
3770bf8031 Update setup.py 2023-07-28 02:22:38 +03:00
5c527eca66 Merge remote-tracking branch 'origin/main' into pg/pip_package 2023-07-28 02:19:04 +03:00
b4ca52c7d8 updated Dockerfile.github_action 2023-07-28 02:18:12 +03:00
a78d741292 updated pyproject.toml 2023-07-28 02:09:01 +03:00
42388b1f8d Merge pull request #146 from idavidov/idsvidov/gitlabpaginator_fix
Fix for GitLab Paginator in GitLab Provider
2023-07-28 02:01:04 +03:00
2ce91fbdf5 Merge pull request #148 from eltociear/patch-1
Fix typo in PR_COMPRESSION.md
2023-07-28 01:50:30 +03:00
aa7659d6bf Fix typo in PR_COMPRESSION.md
Withing -> Within
2023-07-28 00:18:58 +09:00
4aa54b9bd4 Add /describe -c option 2023-07-27 17:42:50 +03:00
c6d0bacc08 Match styling of both /describe modes 2023-07-27 17:31:31 +03:00
a50e137bba Merge pull request #133 from idavidov/idavidov/github-ratelimit-message
Handling GitHub API Rate Limit Exceeded Exception
2023-07-27 14:22:11 +03:00
92c0522f4d Merge pull request #144 from Codium-ai/tr/readme_update
Update README with 'Why use PR-Agent?' section
2023-07-27 10:43:56 +03:00
6a72df2981 Merge pull request #139 from Codium-ai/tr/changelog
Add feature to update CHANGELOG.md based on PR content
2023-07-27 09:04:48 +03:00
808ca48605 if not self.commit_changelog: 2023-07-27 08:48:39 +03:00
c827cbc0ae final touches 2023-07-27 08:47:26 +03:00
48fcb46d4f Delete CHANGELOG.md 2023-07-27 08:46:14 +03:00
66b94599ec Update CHANGELOG.md 2023-07-27 08:45:33 +03:00
231efb33c1 add CHANGELOG.md 2023-07-27 08:43:29 +03:00
eb798dae6f Why use PR-Agent
Why use PR-Agent
2023-07-27 08:25:05 +03:00
52576c79b3 Update CHANGELOG.md 2023-07-26 20:40:28 +03:00
cce2a79a1f add CHANGELOG.md 2023-07-26 20:40:15 +03:00
413e5f6d77 general 2023-07-26 20:37:38 +03:00
09ca848d4c Merge remote-tracking branch 'origin/tr/changelog' into tr/changelog 2023-07-26 20:33:32 +03:00
801923789b final 2023-07-26 20:33:21 +03:00
cfb696dfd5 Delete CHANGELOG.md 2023-07-26 20:09:18 +03:00
2e7a0a88fa Update CHANGELOG.md 2023-07-26 20:08:29 +03:00
1dbbafc30a add CHANGELOG.md 2023-07-26 20:08:06 +03:00
d8eae7faab Delete CHANGELOG.md 2023-07-26 20:06:23 +03:00
14eceb6e61 PRUpdateChangelog 2023-07-26 20:05:18 +03:00
884317c4f7 stable 2023-07-26 20:03:22 +03:00
c5f4b229b8 Merge pull request #142 from patryk-kowalski-ds/pk/local-git-provider-impvs
Improvements to Local Git Provider
2023-07-26 19:18:35 +03:00
5a2a17ec25 Merge pull request #140 from Codium-ai/tr/enhance_review
Enhancement of PRReviewer class in pr_reviewer.py
2023-07-26 17:32:15 +03:00
1bd47b0d53 enhance pr_reviewer.py code 2023-07-26 17:24:03 +03:00
7531ccd31f stable 2023-07-26 16:29:42 +03:00
3b19827ae2 Add validation for repository path 2023-07-26 15:29:09 +02:00
ea6e1811c1 Fixed PR title - should be feature branch name, not target branch name 2023-07-26 14:15:50 +02:00
bc2cf75b76 Use pyproject.toml to install dependencies instead of requirements.txt. Fix incorrect mangum version 2023-07-26 09:14:24 +02:00
9e1e0766b7 Set python min version to 3.10 2023-07-26 09:13:54 +02:00
ccde68293f Update README.md 2023-07-26 10:09:01 +03:00
99d53af28d Update CHANGELOG.md 2023-07-26 09:50:21 +03:00
5ea607be58 Add package setup 2023-07-26 08:48:12 +02:00
e3846a480e s 2023-07-26 09:21:31 +03:00
a60a58794c Merge pull request #132 from Codium-ai/tr/code_enhancment
Enhancement of GitHub Webhook and Polling Server
2023-07-26 07:24:46 +03:00
8ae5faca53 Fix cyclic dependency 2023-07-25 16:52:18 +03:00
28d6adf62a Quick fix for github action 2023-07-25 16:41:29 +03:00
1229fba346 + settings.github.ratelimit_retries setup in configuration.toml 2023-07-25 16:37:13 +03:00
59a59ebf66 Quick fix for github action 2023-07-25 16:36:58 +03:00
36ab12c486 Merge pull request #136 from Codium-ai/ok/handle_sub_group
Handle subgroup in GitLab merge request URL parsing
2023-07-25 16:15:35 +03:00
0254e3d04a Merge pull request #128 from patryk-kowalski-ds/deepsense.ai/local-git-provider
Add Local Git Provider Support
2023-07-25 16:15:02 +03:00
f6036e936e + settings.github.ratelimit_retries setup in configuration.toml 2023-07-25 15:23:40 +03:00
10a07e497d Handle sub group in gitlab MR URLs 2023-07-25 15:15:51 +03:00
3b334805ee still need GithubException.RateLimitExceededException in pr_processing.py for correct exception catch 2023-07-25 15:14:56 +03:00
b6f6c903a0 moved @retry to github_provider.py and fetch number of retries from settings 2023-07-25 15:12:02 +03:00
55637a5620 added retry decorator similar to used in ai_handler following @okotek suggestion 2023-07-25 14:42:54 +03:00
404cc0a00e small change to show message and fail 2023-07-25 14:20:20 +03:00
0815e2024c - Replaced two dot diff with three dot diff. Cleaned up obsolete code linked to double dot diff.
- Moved target_branch_existence assertion to _prepare_repo method
- Renamed branch_name -> target_branch_name
- Simplified get_files method
2023-07-25 13:07:21 +02:00
41dcb75e8e Merge pull request #134 from Codium-ai/ok/gitlat_use_oauth
Use OAuth token for GitLab API
2023-07-25 14:04:50 +03:00
d23daf880f Change gitlab API to use oauth_token instead of PAT (PAT shuold work as well) 2023-07-25 13:58:48 +03:00
d1a8a610e9 Revert "show how much time until rate limit reset"
This reverts commit 8f482cd41a.
2023-07-25 13:38:55 +03:00
918549a4fc Implementing 'is_supported' method 2023-07-25 12:35:39 +02:00
8f482cd41a show how much time until rate limit reset
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-25 13:23:19 +03:00
34096059ff quick and dirty response for github API ratelimit, until some smart solution will be implemented 2023-07-25 13:05:56 +03:00
2dfbfec8c2 refactor 2023-07-24 19:48:24 +03:00
6170995665 replaced hardcoded main with actual target_branch name' 2023-07-24 16:59:07 +02:00
ca42a54bc3 Update pr_agent/git_providers/local_git_provider.py
Co-authored-by: Ori Kotek <orikotek@gmail.com>
2023-07-24 16:47:05 +02:00
c0610afe2a Update pr_agent/git_providers/local_git_provider.py
Co-authored-by: Ori Kotek <orikotek@gmail.com>
2023-07-24 16:46:46 +02:00
d4cbcc465c Update pr_agent/git_providers/local_git_provider.py
Co-authored-by: Ori Kotek <orikotek@gmail.com>
2023-07-24 16:46:36 +02:00
8e6518f071 Added GitPython to requirements. Changed default review path (aesthetics) 2023-07-24 14:28:37 +02:00
02ecaa340f Local Git Provider Implementation 2023-07-24 12:49:57 +02:00
60 changed files with 2152 additions and 667 deletions

@ -1,3 +1,5 @@
venv/
pr_agent/settings/.secrets.toml
pics/
pics/
pr_agent.egg-info/
build/

36
.github/workflows/build-and-test.yaml vendored Normal file

@ -0,0 +1,36 @@
name: Build-and-test
on:
  push:
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - id: checkout
        uses: actions/checkout@v2
      - id: dockerx
        name: Setup Docker Buildx
        uses: docker/setup-buildx-action@v2
      - id: build
        name: Build dev docker
        uses: docker/build-push-action@v2
        with:
          context: .
          file: ./docker/Dockerfile
          push: false
          load: true
          tags: codiumai/pr-agent:test
          cache-from: type=gha,scope=dev
          cache-to: type=gha,mode=max,scope=dev
          target: test
      - id: test
        name: Test dev docker
        run: |
          docker run --rm codiumai/pr-agent:test pytest -v

@ -1,6 +1,17 @@
# This workflow enables developers to call PR-Agents `/[actions]` in PR's comments and upon PR creation.
# Learn more at https://www.codium.ai/pr-agent/
# This is v0.2 of this workflow file
name: PR-Agent
on:
  pull_request:
  issue_comment:
permissions:
  issues: write
  pull-requests: write
jobs:
  pr_agent_job:
    runs-on: ubuntu-latest

6
.gitignore vendored

@ -1,4 +1,8 @@
.idea/
venv/
pr_agent/settings/.secrets.toml
__pycache__
__pycache__
dist/
*.egg-info/
build/
review.md

45
CHANGELOG.md Normal file

@ -0,0 +1,45 @@
## 2023-08-03
### Optimized
- Optimized PR diff processing by introducing caching for diff files, reducing the number of API calls.
- Refactored `load_large_diff` function to generate a patch only when necessary.
- Fixed a bug in the GitLab provider where the new file was not retrieved correctly.
## 2023-08-02
### Enhanced
- Updated several tools in the `pr_agent` package to use commit messages in their functionality.
- Commit messages are now retrieved and stored in the `vars` dictionary for each tool.
- Added a section to display the commit messages in the prompts of various tools.
## 2023-08-01
### Enhanced
- Introduced the ability to retrieve commit messages from pull requests across different git providers.
- Implemented commit messages retrieval for GitHub and GitLab providers.
- Updated the PR description template to include a section for commit messages if they exist.
- Added support for repository-specific configuration files (.pr_agent.yaml) for the PR Agent.
- Implemented this feature for both GitHub and GitLab providers.
- Added a new configuration option 'use_repo_settings_file' to enable or disable the use of a repo-specific settings file.
## 2023-07-30
### Enhanced
- Added the ability to modify any configuration parameter from 'configuration.toml' on-the-fly.
- Updated the command line interface and bot commands to accept configuration changes as arguments.
- Improved the PR agent to handle additional arguments for each action.
## 2023-07-28
### Improved
- Enhanced error handling and logging in the GitLab provider.
- Improved handling of inline comments and code suggestions in GitLab.
- Fixed a bug where an additional unneeded line was added to code suggestions in GitLab.
## 2023-07-26
### Added
- New feature for updating the CHANGELOG.md based on the contents of a PR.
- Added support for this feature for the Github provider.
- New configuration settings and prompts for the changelog update feature.

@ -1,19 +1,57 @@
## Configuration
The different tools and sub-tools used by CodiumAI pr-agent are easily configurable via the configuration file: `/pr-agent/settings/configuration.toml`.
##### Git Provider:
You can select your git_provider with the flag `git_provider` in the `config` section
The different tools and sub-tools used by CodiumAI PR-Agent are adjustable via the **[configuration file](pr_agent/settings/configuration.toml)**
##### PR Reviewer:
### Working from CLI
When running from source (CLI), your local configuration file will be initially used.
Example for invoking the 'review' tools via the CLI:
You can enable/disable the different PR Reviewer abilities with the following flags (`pr_reviewer` section):
```
require_focused_review=true
require_score_review=true
require_tests_review=true
require_security_review=true
python cli.py --pr-url=<pr_url> review
```
You can contol the number of suggestions returned by the PR Reviewer with the following flag:
```inline_code_comments=3```
And enable/disable the inline code suggestions with the following flag:
```inline_code_comments=true```
In addition to general configurations, the 'review' tool will use parameters from the `[pr_reviewer]` section (every tool has a dedicated section in the configuration file).
Note that you can print results locally, without publishing them, by setting in `configuration.toml`:
```
[config]
publish_output=true
verbosity_level=2
```
This is useful for debugging or experimenting with the different tools.
### Working from pre-built repo (GitHub Action/GitHub App/Docker)
When running PR-Agent from a pre-built repo, the default configuration file will be loaded.
To edit the configuration, you have two options:
1. Place a local configuration file in the root of your local repo. The local file will be used instead of the default one.
2. For online usage, just add `--config_path=<value>` to you command, to edit a specific configuration value.
For example if you want to edit `pr_reviewer` configurations, you can run:
```
/review --pr_reviewer.extra_instructions="..." --pr_reviewer.require_score_review=false ...
```
Any configuration value in `configuration.toml` file can be similarly edited.
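As a rough illustration of how a comment such as the `/review` example above could be split into an action and per-section overrides, here is a minimal sketch. It is not the project's `update_settings_from_args` implementation; the function name `parse_overrides` and the returned dictionary layout are assumptions made for this example, and values are kept as plain strings rather than converted to booleans or numbers.

```python
import shlex


def parse_overrides(command: str):
    """Split a bot command into (action, {section: {key: value}}).

    Hypothetical helper: only arguments of the form
    --section.key=value are treated as configuration overrides.
    """
    action, *args = shlex.split(command)
    overrides = {}
    for arg in args:
        if not arg.startswith("--") or "=" not in arg:
            continue  # not an override, e.g. free text for another tool
        dotted_key, value = arg[2:].split("=", 1)
        section, _, key = dotted_key.partition(".")
        if not key:
            continue  # no section prefix, ignored in this sketch
        overrides.setdefault(section, {})[key] = value
    return action, overrides
```

For instance, `parse_overrides('/review --pr_reviewer.require_score_review=false')` yields the action `/review` and a `pr_reviewer` section holding the string `"false"`; a real implementation would still need to coerce that string to the type expected by the settings object.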
### General configuration parameters
#### Changing a model
See [here](pr_agent/algo/__init__.py) for the list of available models.
To use Llama2 model, for example, set:
```
[config]
model = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
[replicate]
key = ...
```
(you can obtain a Llama2 key from [here](https://replicate.com/replicate/llama-2-70b-chat/api))
Also review the [AiHandler](pr_agent/algo/ai_handler.py) file for instruction how to set keys for other models.
#### Extra instructions
All PR-Agent tools have a parameter called `extra_instructions`, that enables to add free-text extra instructions. Example usage:
```
/update_changelog --pr_update_changelog.extra_instructions="Make sure to update also the version ..."
```

@ -1,8 +1,8 @@
FROM python:3.10 as base
WORKDIR /app
ADD requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
ADD pyproject.toml .
RUN pip install . && rm pyproject.toml
ENV PYTHONPATH=/app
ADD pr_agent pr_agent
ADD github_action/entrypoint.sh /

@ -31,7 +31,7 @@ We prioritize additions over deletions:
- File patches are a list of hunks, remove all hunks of type deletion-only from the hunks in the file patch
#### Adaptive and token-aware file patch fitting
We use [tiktoken](https://github.com/openai/tiktoken) to tokenize the patches after the modifications described above, and we use the following strategy to fit the patches into the prompt:
1. Withing each language we sort the files by the number of tokens in the file (in descending order):
1. Within each language we sort the files by the number of tokens in the file (in descending order):
* ```[[file2.py, file.py],[file4.jsx, file3.js],[readme.md]]```
2. Iterate through the patches in the order described above
2. Add the patches to the prompt until the prompt reaches a certain buffer from the max token length
@ -39,4 +39,4 @@ We use [tiktoken](https://github.com/openai/tiktoken) to tokenize the patches af
4. If we haven't reached the max token length, add the `deleted files` to the prompt until the prompt reaches the max token length (hard stop), skip the rest of the patches.
### Example
![](https://codium.ai/images/git_patch_logic.png)
![](https://codium.ai/images/git_patch_logic.png)
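The sorting and fitting steps visible in this hunk can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `count_tokens` is a whitespace-based stand-in for tiktoken, `fit_patches` and its parameters are hypothetical names, and stopping at the first patch that does not fit is an assumption (the hunk does not show the step that handles that case, nor is the handling of deleted files modelled here).

```python
def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): counts whitespace-separated words.
    # The document itself uses tiktoken for this step.
    return len(text.split())


def fit_patches(files_by_language, max_tokens, buffer_tokens, tokenizer=count_tokens):
    """Pick patches in priority order until the token budget is used up.

    files_by_language: list of lists of (filename, patch) pairs, already
    ordered by language priority, e.g. [[py files], [js files], [md files]].
    Returns (selected filenames, tokens used).
    """
    budget = max_tokens - buffer_tokens  # keep a buffer below the hard limit
    used = 0
    selected = []
    for language_files in files_by_language:
        # Within each language, largest patches (by token count) come first.
        ordered = sorted(language_files, key=lambda f: tokenizer(f[1]), reverse=True)
        for name, patch in ordered:
            cost = tokenizer(patch)
            if used + cost > budget:
                # Assumption: stop at the first patch that does not fit.
                return selected, used
            selected.append(name)
            used += cost
    return selected, used
```

Keeping the buffer as a separate parameter makes it explicit that the soft stop happens below the model's maximum, leaving room for the rest of the prompt.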

@ -23,7 +23,9 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
\
**Question Answering**: Answering free-text questions about the PR.
\
**Code Suggestion**: Committable code suggestions for improving the PR.
**Code Suggestions**: Committable code suggestions for improving the PR.
\
**Update Changelog**: Automatically updating the CHANGELOG.md file with the PR changes.
<h3>Example results:</h2>
</div>
@ -63,9 +65,9 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
- [Overview](#overview)
- [Try it now](#try-it-now)
- [Installation](#installation)
- [Usage and tools](#usage-and-tools)
- [Configuration](./CONFIGURATION.md)
- [How it works](#how-it-works)
- [Why use PR-Agent](#why-use-pr-agent)
- [Roadmap](#roadmap)
- [Similar projects](#similar-projects)
</div>
@ -81,6 +83,7 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
| | Auto-Description | :white_check_mark: | :white_check_mark: | |
| | Improve Code | :white_check_mark: | :white_check_mark: | |
| | Reflect and Review | :white_check_mark: | | |
| | Update CHANGELOG.md | :white_check_mark: | | |
| | | | | |
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | App / webhook | :white_check_mark: | :white_check_mark: | |
@ -90,14 +93,16 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Incremental PR Review | :white_check_mark: | | |
Examples for invoking the different tools via the CLI:
- **Review**: python cli.py --pr-url=<pr_url> review
- **Describe**: python cli.py --pr-url=<pr_url> describe
- **Improve**: python cli.py --pr-url=<pr_url> improve
- **Ask**: python cli.py --pr-url=<pr_url> ask "Write me a poem about this PR"
- **Reflect**: python cli.py --pr-url=<pr_url> reflect
- **Review**: python cli.py --pr_url=<pr_url> review
- **Describe**: python cli.py --pr_url=<pr_url> describe
- **Improve**: python cli.py --pr_url=<pr_url> improve
- **Ask**: python cli.py --pr_url=<pr_url> ask "Write me a poem about this PR"
- **Reflect**: python cli.py --pr_url=<pr_url> reflect
- **Update Changelog**: python cli.py --pr_url=<pr_url> update_changelog
"<pr_url>" is the url of the relevant PR (for example: https://github.com/Codium-ai/pr-agent/pull/50).
@ -130,36 +135,41 @@ There are several ways to use PR-Agent:
- [Method 5: Run as a GitHub App](INSTALL.md#method-5-run-as-a-github-app)
- Allowing you to automate the review process on your private or public repositories
## Usage and Tools
**PR-Agent** provides five types of interactions ("tools"): `"PR Reviewer"`, `"PR Q&A"`, `"PR Description"`, `"PR Code Sueggestions"` and `"PR Reflect and Review"`.
- The "PR Reviewer" tool automatically analyzes PRs, and provides various types of feedback.
- The "PR Q&A" tool answers free-text questions about the PR.
- The "PR Description" tool automatically sets the PR Title and body.
- The "PR Code Suggestion" tool provide inline code suggestions for the PR that can be applied and committed.
- The "PR Reflect and Review" tool initiates a dialog with the user, asks them to reflect on the PR, and then provides a more focused review.
## How it works
The following diagram illustrates PR-Agent tools and their flow:
![PR-Agent Tools](https://www.codium.ai/wp-content/uploads/2023/07/codiumai-diagram-v4.jpg)
Check out the [PR Compression strategy](./PR_COMPRESSION.md) page for more details on how we convert a code diff to a manageable LLM prompt
## Why use PR-Agent?
A reasonable question that can be asked is: `"Why use PR-Agent? What make it stand out from existing tools?"`
Here are some advantages of PR-Agent:
- We emphasize **real-life practical usage**. Each tool (review, improve, ask, ...) has a single GPT-4 call, no more. We feel that this is critical for realistic team usage - obtaining an answer quickly (~30 seconds) and affordably.
- Our [PR Compression strategy](./PR_COMPRESSION.md) is a core ability that enables to effectively tackle both short and long PRs.
- Our JSON prompting strategy enables to have **modular, customizable tools**. For example, the '/review' tool categories can be controlled via the [configuration](./CONFIGURATION.md) file. Adding additional categories is easy and accessible.
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
- We are open-source, and welcome contributions from the community.
## Roadmap
- [ ] Support open-source models, as a replacement for OpenAI models. (Note - a minimal requirement for each open-source model is to have 8k+ context, and good support for generating JSON as an output)
- [x] Support other Git providers, such as Gitlab and Bitbucket.
- [ ] Develop additional logic for handling large PRs, and compressing git patches
- [x] Support additional models, as a replacement for OpenAI (see [here](https://github.com/Codium-ai/pr-agent/pull/172))
- [ ] Develop additional logic for handling large PRs
- [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such a [ctags](https://github.com/universal-ctags/ctags)
- [ ] Adding more tools. Possible directions:
- [x] PR description
- [x] Inline code suggestions
- [x] Reflect and review
- [x] Rank the PR (see [here](https://github.com/Codium-ai/pr-agent/pull/89))
- [ ] Enforcing CONTRIBUTING.md guidelines
- [ ] Performance (are there any performance issues)
- [ ] Documentation (is the PR properly documented)
- [ ] Rank the PR importance
- [ ] ...
## Similar Projects

@ -1,20 +1,24 @@
FROM python:3.10 as base
WORKDIR /app
ADD requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
ADD pyproject.toml .
RUN pip install . && rm pyproject.toml
ENV PYTHONPATH=/app
ADD pr_agent pr_agent
FROM base as github_app
ADD pr_agent pr_agent
CMD ["python", "pr_agent/servers/github_app.py"]
FROM base as github_polling
ADD pr_agent pr_agent
CMD ["python", "pr_agent/servers/github_polling.py"]
FROM base as test
ADD requirements-dev.txt .
RUN pip install -r requirements-dev.txt && rm requirements-dev.txt
ADD pr_agent pr_agent
ADD tests tests
FROM base as cli
ADD pr_agent pr_agent
ENTRYPOINT ["python", "pr_agent/cli.py"]

@ -4,9 +4,9 @@ RUN yum update -y && \
yum install -y gcc python3-devel && \
yum clean all
ADD requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
RUN pip install mangum==16.0.0
ADD pyproject.toml .
RUN pip install . && rm pyproject.toml
RUN pip install mangum==0.17.0
COPY pr_agent/ ${LAMBDA_TASK_ROOT}/pr_agent/
CMD ["pr_agent.servers.serverless.serverless"]

@ -1,33 +1,79 @@
import re
import logging
import os
import shlex
import tempfile
from pr_agent.config_loader import settings
from pr_agent.algo.utils import update_settings_from_args
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.tools.pr_code_suggestions import PRCodeSuggestions
from pr_agent.tools.pr_description import PRDescription
from pr_agent.tools.pr_information_from_user import PRInformationFromUser
from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer
from pr_agent.tools.pr_update_changelog import PRUpdateChangelog
from pr_agent.tools.pr_config import PRConfig
command2class = {
"answer": PRReviewer,
"review": PRReviewer,
"review_pr": PRReviewer,
"reflect": PRInformationFromUser,
"reflect_and_review": PRInformationFromUser,
"describe": PRDescription,
"describe_pr": PRDescription,
"improve": PRCodeSuggestions,
"improve_code": PRCodeSuggestions,
"ask": PRQuestions,
"ask_question": PRQuestions,
"update_changelog": PRUpdateChangelog,
"config": PRConfig,
"settings": PRConfig,
}
commands = list(command2class.keys())
class PRAgent:
def __init__(self):
pass
async def handle_request(self, pr_url, request) -> bool:
action, *args = request.strip().split()
if any(cmd == action for cmd in ["/answer"]):
await PRReviewer(pr_url, is_answer=True).review()
elif any(cmd == action for cmd in ["/review", "/review_pr", "/reflect_and_review"]):
if settings.pr_reviewer.ask_and_reflect or "/reflect_and_review" in request:
await PRInformationFromUser(pr_url).generate_questions()
else:
await PRReviewer(pr_url, args=args).review()
elif any(cmd == action for cmd in ["/describe", "/describe_pr"]):
await PRDescription(pr_url).describe()
elif any(cmd == action for cmd in ["/improve", "/improve_code"]):
await PRCodeSuggestions(pr_url).suggest()
elif any(cmd == action for cmd in ["/ask", "/ask_question"]):
await PRQuestions(pr_url, args).answer()
async def handle_request(self, pr_url, request, notify=None) -> bool:
# First, apply repo specific settings if exists
if get_settings().config.use_repo_settings_file:
repo_settings_file = None
try:
git_provider = get_git_provider()(pr_url)
repo_settings = git_provider.get_repo_settings()
if repo_settings:
repo_settings_file = None
fd, repo_settings_file = tempfile.mkstemp(suffix='.toml')
os.write(fd, repo_settings)
get_settings().load_file(repo_settings_file)
finally:
if repo_settings_file:
try:
os.remove(repo_settings_file)
except Exception as e:
logging.error(f"Failed to remove temporary settings file {repo_settings_file}: {e}")
# Then, apply user specific settings if exists
request = request.replace("'", "\\'")
lexer = shlex.shlex(request, posix=True)
lexer.whitespace_split = True
action, *args = list(lexer)
args = update_settings_from_args(args)
action = action.lstrip("/").lower()
if action == "reflect_and_review" and not get_settings().pr_reviewer.ask_and_reflect:
action = "review"
if action == "answer":
if notify:
notify()
await PRReviewer(pr_url, is_answer=True, args=args).run()
elif action in command2class:
if notify:
notify()
await command2class[action](pr_url, args=args).run()
else:
return False
return True
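
The tokenizing step of the new `handle_request` can be tried in isolation. The following is a minimal sketch that assumes only the standard-library `shlex` module; `parse_request` is a hypothetical helper name and does not appear in the PR.

```python
import shlex


def parse_request(request: str) -> tuple[str, list[str]]:
    """Split a PR comment into a normalized action and its arguments."""
    # Escape apostrophes so text like "what's new" is not read as an open quote.
    request = request.replace("'", "\\'")
    lexer = shlex.shlex(request, posix=True)
    lexer.whitespace_split = True  # split on whitespace only, keep punctuation in tokens
    action, *args = list(lexer)
    # "/Review" and "review" both map to the same key in command2class.
    return action.lstrip("/").lower(), args


print(parse_request('/ask "what does this PR do?"'))
print(parse_request('/Review --pr_reviewer.extra_instructions="focus on tests"'))
```

Because the lexer runs in POSIX mode, a double-quoted value stays in one token even when it contains spaces, which is what lets `--key="some value"` overrides reach `update_settings_from_args` intact.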


@ -7,4 +7,8 @@ MAX_TOKENS = {
'gpt-4': 8000,
'gpt-4-0613': 8000,
'gpt-4-32k': 32000,
'claude-instant-1': 100000,
'claude-2': 100000,
'command-nightly': 4096,
'replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1': 4096,
}


@ -1,12 +1,15 @@
import logging
import litellm
import openai
from openai.error import APIError, Timeout, TryAgain, RateLimitError
from litellm import acompletion
from openai.error import APIError, RateLimitError, Timeout, TryAgain
from retry import retry
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
OPENAI_RETRIES = 5
OPENAI_RETRIES=5
class AiHandler:
"""
@ -21,16 +24,26 @@ class AiHandler:
Raises a ValueError if the OpenAI key is missing.
"""
try:
openai.api_key = settings.openai.key
if settings.get("OPENAI.ORG", None):
openai.organization = settings.openai.org
self.deployment_id = settings.get("OPENAI.DEPLOYMENT_ID", None)
if settings.get("OPENAI.API_TYPE", None):
openai.api_type = settings.openai.api_type
if settings.get("OPENAI.API_VERSION", None):
openai.api_version = settings.openai.api_version
if settings.get("OPENAI.API_BASE", None):
openai.api_base = settings.openai.api_base
openai.api_key = get_settings().openai.key
litellm.openai_key = get_settings().openai.key
self.azure = False
if get_settings().get("OPENAI.ORG", None):
litellm.organization = get_settings().openai.org
self.deployment_id = get_settings().get("OPENAI.DEPLOYMENT_ID", None)
if get_settings().get("OPENAI.API_TYPE", None):
if get_settings().openai.api_type == "azure":
self.azure = True
litellm.azure_key = get_settings().openai.key
if get_settings().get("OPENAI.API_VERSION", None):
litellm.api_version = get_settings().openai.api_version
if get_settings().get("OPENAI.API_BASE", None):
litellm.api_base = get_settings().openai.api_base
if get_settings().get("ANTHROPIC.KEY", None):
litellm.anthropic_key = get_settings().anthropic.key
if get_settings().get("COHERE.KEY", None):
litellm.cohere_key = get_settings().cohere.key
if get_settings().get("REPLICATE.KEY", None):
litellm.replicate_key = get_settings().replicate.key
except AttributeError as e:
raise ValueError("OpenAI key is required") from e
@ -57,15 +70,17 @@ class AiHandler:
TryAgain: If there is an attribute error during OpenAI inference.
"""
try:
response = await openai.ChatCompletion.acreate(
model=model,
deployment_id=self.deployment_id,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user}
],
temperature=temperature,
)
response = await acompletion(
model=model,
deployment_id=self.deployment_id,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user}
],
temperature=temperature,
azure=self.azure,
force_timeout=get_settings().config.ai_timeout
)
except (APIError, Timeout, TryAgain) as e:
logging.error("Error during OpenAI inference: %s", e)
raise
@ -75,8 +90,9 @@ class AiHandler:
except (Exception) as e:
logging.error("Unknown error during OpenAI inference: %s", e)
raise TryAgain from e
if response is None or len(response.choices) == 0:
if response is None or len(response["choices"]) == 0:
raise TryAgain
resp = response.choices[0]['message']['content']
finish_reason = response.choices[0].finish_reason
return resp, finish_reason
resp = response["choices"][0]['message']['content']
finish_reason = response["choices"][0]["finish_reason"]
print(resp, finish_reason)
return resp, finish_reason


@ -3,7 +3,7 @@ from __future__ import annotations
import logging
import re
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
def extend_patch(original_file_str, patch_str, num_lines) -> str:
@ -41,7 +41,11 @@ def extend_patch(original_file_str, patch_str, num_lines) -> str:
extended_patch_lines.extend(
original_lines[start1 + size1 - 1:start1 + size1 - 1 + num_lines])
start1, size1, start2, size2 = map(int, match.groups()[:4])
try:
start1, size1, start2, size2 = map(int, match.groups()[:4])
except: # '@@ -0,0 +1 @@' case
start1, size1, size2 = map(int, match.groups()[:3])
start2 = 0
section_header = match.groups()[4]
extended_start1 = max(1, start1 - num_lines)
extended_size1 = size1 + (start1 - extended_start1) + num_lines
@ -55,7 +59,7 @@ def extend_patch(original_file_str, patch_str, num_lines) -> str:
continue
extended_patch_lines.append(line)
except Exception as e:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.error(f"Failed to extend patch: {e}")
return patch_str
@ -126,14 +130,14 @@ def handle_patch_deletions(patch: str, original_file_content_str: str,
"""
if not new_file_content_str:
# logic for handling deleted files - don't show patch, just show that the file was deleted
if settings.config.verbosity_level > 0:
if get_settings().config.verbosity_level > 0:
logging.info(f"Processing file: {file_name}, minimizing deletion file")
patch = None # file was deleted
else:
patch_lines = patch.splitlines()
patch_new = omit_deletion_hunks(patch_lines)
if patch != patch_new:
if settings.config.verbosity_level > 0:
if get_settings().config.verbosity_level > 0:
logging.info(f"Processing file: {file_name}, hunks were deleted")
patch = patch_new
return patch
@ -141,7 +145,8 @@ def handle_patch_deletions(patch: str, original_file_content_str: str,
def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
"""
Convert a given patch string into a string with line numbers for each hunk, indicating the new and old content of the file.
Convert a given patch string into a string with line numbers for each hunk, indicating the new and old content of
the file.
Args:
patch (str): The patch string to be converted.
@ -197,7 +202,12 @@ def convert_to_hunks_with_lines_numbers(patch: str, file) -> str:
patch_with_lines_str += f"{line_old}\n"
new_content_lines = []
old_content_lines = []
start1, size1, start2, size2 = map(int, match.groups()[:4])
try:
start1, size1, start2, size2 = map(int, match.groups()[:4])
except: # '@@ -0,0 +1 @@' case
start1, size1, size2 = map(int, match.groups()[:3])
start2 = 0
elif line.startswith('+'):
new_content_lines.append(line)
elif line.startswith('-'):
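
The `'@@ -0,0 +1 @@'` case handled above arises because a unified-diff hunk header may omit a line count. The sketch below reuses the regular expression from `find_line_number_of_relevant_line_in_file` and shows one way to parse such headers. It is not what the PR does: the PR's fallback assigns the first three groups to `start1, size1, size2` and sets `start2 = 0`, whereas this sketch follows the unified-diff convention that an omitted count means one line. `parse_hunk_header` is a hypothetical name.

```python
import re

RE_HUNK_HEADER = re.compile(
    r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def parse_hunk_header(line: str) -> tuple[int, int, int, int, str]:
    """Return (start1, size1, start2, size2, section_header) for a hunk header."""
    match = RE_HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    start1, size1, start2, size2, section = match.groups()
    # An omitted size is captured as None; treat it as a one-line hunk.
    return (int(start1), int(size1) if size1 is not None else 1,
            int(start2), int(size2) if size2 is not None else 1, section)


print(parse_hunk_header("@@ -0,0 +1 @@"))
```

Checking each optional group for `None` avoids a bare `except`, so unrelated failures (for example a line that is not a hunk header at all) are not silently absorbed.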


@ -1,19 +1,19 @@
# Language Selection, source: https://github.com/bigcode-project/bigcode-dataset/blob/main/language_selection/programming-languages-to-file-extensions.json # noqa E501
from typing import Dict
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
language_extension_map_org = settings.language_extension_map_org
language_extension_map_org = get_settings().language_extension_map_org
language_extension_map = {k.lower(): v for k, v in language_extension_map_org.items()}
# Bad Extensions, source: https://github.com/EleutherAI/github-downloader/blob/345e7c4cbb9e0dc8a0615fd995a08bf9d73b3fe6/download_repo_text.py # noqa: E501
bad_extensions = settings.bad_extensions.default
if settings.config.use_extra_bad_extensions:
bad_extensions += settings.bad_extensions.extra
bad_extensions = get_settings().bad_extensions.default
if get_settings().config.use_extra_bad_extensions:
bad_extensions += get_settings().bad_extensions.extra
def filter_bad_extensions(files):
return [f for f in files if is_valid_file(f.filename)]
return [f for f in files if f.filename is not None and is_valid_file(f.filename)]
def is_valid_file(filename):


@ -1,15 +1,19 @@
from __future__ import annotations
import difflib
import logging
from typing import Tuple, Union, Callable, List
import re
import traceback
from typing import Any, Callable, List, Tuple
from github import RateLimitExceededException
from pr_agent.algo import MAX_TOKENS
from pr_agent.algo.git_patch_processing import convert_to_hunks_with_lines_numbers, extend_patch, handle_patch_deletions
from pr_agent.algo.language_handler import sort_files_by_main_languages
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import load_large_diff
from pr_agent.config_loader import settings
from pr_agent.git_providers.git_provider import GitProvider
from pr_agent.algo.token_handler import TokenHandler, get_token_encoder
from pr_agent.config_loader import get_settings
from pr_agent.git_providers.git_provider import FilePatchInfo, GitProvider
DELETED_FILES_ = "Deleted files:\n"
@ -19,18 +23,21 @@ OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD = 1000
OUTPUT_BUFFER_TOKENS_HARD_THRESHOLD = 600
PATCH_EXTRA_LINES = 3
def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler, model: str,
add_line_numbers_to_hunks: bool = False, disable_extra_lines: bool = False) -> str:
"""
Returns a string with the diff of the pull request, applying diff minimization techniques if needed.
Args:
git_provider (GitProvider): An object of the GitProvider class representing the Git provider used for the pull request.
token_handler (TokenHandler): An object of the TokenHandler class used for handling tokens in the context of the pull request.
git_provider (GitProvider): An object of the GitProvider class representing the Git provider used for the pull
request.
token_handler (TokenHandler): An object of the TokenHandler class used for handling tokens in the context of the
pull request.
model (str): The name of the model used for tokenization.
add_line_numbers_to_hunks (bool, optional): A boolean indicating whether to add line numbers to the hunks in the diff. Defaults to False.
disable_extra_lines (bool, optional): A boolean indicating whether to disable the extension of each patch with extra lines of context. Defaults to False.
add_line_numbers_to_hunks (bool, optional): A boolean indicating whether to add line numbers to the hunks in the
diff. Defaults to False.
disable_extra_lines (bool, optional): A boolean indicating whether to disable the extension of each patch with
extra lines of context. Defaults to False.
Returns:
str: A string with the diff of the pull request, applying diff minimization techniques if needed.
@ -40,7 +47,11 @@ def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler, model: s
global PATCH_EXTRA_LINES
PATCH_EXTRA_LINES = 0
diff_files = list(git_provider.get_diff_files())
try:
diff_files = git_provider.get_diff_files()
except RateLimitExceededException as e:
logging.error(f"Rate limit exceeded for git provider API. original message {e}")
raise
# get pr languages
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), diff_files)
@ -71,10 +82,12 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
add_line_numbers_to_hunks: bool) -> \
Tuple[list, int]:
"""
Generate a standard diff string with patch extension, while counting the number of tokens used and applying diff minimization techniques if needed.
Generate a standard diff string with patch extension, while counting the number of tokens used and applying diff
minimization techniques if needed.
Args:
- pr_languages: A list of dictionaries representing the languages used in the pull request and their corresponding files.
- pr_languages: A list of dictionaries representing the languages used in the pull request and their corresponding
files.
- token_handler: An object of the TokenHandler class used for handling tokens in the context of the pull request.
- add_line_numbers_to_hunks: A boolean indicating whether to add line numbers to the hunks in the diff.
@ -87,12 +100,7 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
for lang in pr_languages:
for file in lang['files']:
original_file_content_str = file.base_file
new_file_content_str = file.head_file
patch = file.patch
# handle the case of large patch, that initially was not loaded
patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
if not patch:
continue
@ -114,10 +122,13 @@ def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler,
def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, model: str,
convert_hunks_to_line_numbers: bool) -> Tuple[list, list, list]:
"""
Generate a compressed diff string for a pull request, using diff minimization techniques to reduce the number of tokens used.
Generate a compressed diff string for a pull request, using diff minimization techniques to reduce the number of
tokens used.
Args:
top_langs (list): A list of dictionaries representing the languages used in the pull request and their corresponding files.
token_handler (TokenHandler): An object of the TokenHandler class used for handling tokens in the context of the pull request.
top_langs (list): A list of dictionaries representing the languages used in the pull request and their
corresponding files.
token_handler (TokenHandler): An object of the TokenHandler class used for handling tokens in the context of the
pull request.
model (str): The model used for tokenization.
convert_hunks_to_line_numbers (bool): A boolean indicating whether to convert hunks to line numbers in the diff.
Returns:
@ -147,7 +158,6 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
original_file_content_str = file.base_file
new_file_content_str = file.head_file
patch = file.patch
patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
if not patch:
continue
@ -176,7 +186,7 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
# Current logic is to skip the patch if it's too large
# TODO: Option for alternative logic to remove hunks from the patch to reduce the number of tokens
# until we meet the requirements
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.warning(f"Patch too large, minimizing it, {file.filename}")
if not modified_files_list:
total_tokens += token_handler.count_tokens(MORE_MODIFIED_FILES_)
@ -191,15 +201,15 @@ def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler, mo
patch_final = patch
patches.append(patch_final)
total_tokens += token_handler.count_tokens(patch_final)
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Tokens: {total_tokens}, last filename: {file.filename}")
return patches, modified_files_list, deleted_files_list
async def retry_with_fallback_models(f: Callable):
model = settings.config.model
fallback_models = settings.config.fallback_models
model = get_settings().config.model
fallback_models = get_settings().config.fallback_models
if not isinstance(fallback_models, list):
fallback_models = [fallback_models]
all_models = [model] + fallback_models
@ -207,6 +217,97 @@ async def retry_with_fallback_models(f: Callable):
try:
return await f(model)
except Exception as e:
logging.warning(f"Failed to generate prediction with {model}: {e}")
logging.warning(f"Failed to generate prediction with {model}: {traceback.format_exc()}")
if i == len(all_models) - 1: # If it's the last iteration
raise # Re-raise the last exception
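
The fallback loop above can be exercised without any model provider. This is a minimal sketch under stated assumptions: `retry_with_fallback` takes the model list as a parameter instead of reading it from settings, and `fake_predict` with its model names is a hypothetical stand-in for a real prediction call.

```python
import asyncio
import logging
from typing import Awaitable, Callable


async def retry_with_fallback(f: Callable[[str], Awaitable[str]], models: list[str]) -> str:
    """Call f(model) for each model in order and return the first success."""
    for i, model in enumerate(models):
        try:
            return await f(model)
        except Exception as e:
            logging.warning(f"Failed to generate prediction with {model}: {e}")
            if i == len(models) - 1:  # no fallback left, surface the last error
                raise
    raise ValueError("no models configured")


async def fake_predict(model: str) -> str:
    # Hypothetical model call: the primary model always times out.
    if model == "primary-model":
        raise TimeoutError("simulated timeout")
    return f"answer from {model}"


result = asyncio.run(retry_with_fallback(fake_predict, ["primary-model", "fallback-model"]))
print(result)
```

Only the failure of the last model propagates to the caller; earlier failures are logged and the next model is tried.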
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
relevant_file: str,
relevant_line_in_file: str) -> Tuple[int, int]:
"""
Find the line number and absolute position of a relevant line in a file.
Args:
diff_files (List[FilePatchInfo]): A list of FilePatchInfo objects representing the patches of files.
relevant_file (str): The name of the file where the relevant line is located.
relevant_line_in_file (str): The content of the relevant line.
Returns:
Tuple[int, int]: A tuple containing the line number and absolute position of the relevant line in the file.
"""
position = -1
absolute_position = -1
re_hunk_header = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
for file in diff_files:
if file.filename.strip() == relevant_file:
patch = file.patch
patch_lines = patch.splitlines()
# try to find the line in the patch using difflib, with some margin of error
matches_difflib: list[str | Any] = difflib.get_close_matches(relevant_line_in_file,
patch_lines, n=3, cutoff=0.93)
if len(matches_difflib) == 1 and matches_difflib[0].startswith('+'):
relevant_line_in_file = matches_difflib[0]
delta = 0
start1, size1, start2, size2 = 0, 0, 0, 0
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if relevant_line_in_file in line and line[0] != '-':
position = i
absolute_position = start2 + delta - 1
break
if position == -1 and relevant_line_in_file[0] == '+':
no_plus_line = relevant_line_in_file[1:].lstrip()
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
if no_plus_line in line and line[0] != '-':
# The model might add a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
position = i
absolute_position = start2 + delta - 1
break
return position, absolute_position
def clip_tokens(text: str, max_tokens: int) -> str:
"""
Clip the number of tokens in a string to a maximum number of tokens.
Args:
text (str): The string to clip.
max_tokens (int): The maximum number of tokens allowed in the string.
Returns:
str: The clipped string.
"""
# Estimate how many characters to keep from the text's average characters-per-token ratio
try:
encoder = get_token_encoder()
num_input_tokens = len(encoder.encode(text))
if num_input_tokens <= max_tokens:
return text
num_chars = len(text)
chars_per_token = num_chars / num_input_tokens
num_output_chars = int(chars_per_token * max_tokens)
clipped_text = text[:num_output_chars]
return clipped_text
except Exception as e:
logging.warning(f"Failed to clip tokens: {e}")
return text
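
The clipping arithmetic in `clip_tokens` can be checked by hand with a toy tokenizer. The sketch below assumes a hypothetical `WhitespaceEncoder` in place of the tiktoken encoder, and passes the encoder as an argument, which the PR's function does not do.

```python
class WhitespaceEncoder:
    """Hypothetical stand-in for a tiktoken encoder: one token per word."""

    def encode(self, text: str) -> list[str]:
        return text.split()


def clip_by_token_ratio(text: str, max_tokens: int, encoder) -> str:
    """Cut text to roughly max_tokens using its average characters per token."""
    num_tokens = len(encoder.encode(text))
    if num_tokens <= max_tokens:
        return text
    chars_per_token = len(text) / num_tokens
    # The cut is an estimate: it lands on a character index, not a token boundary.
    return text[:int(chars_per_token * max_tokens)]


# 19 characters and 4 tokens give 4.75 chars per token, so 2 tokens keep 9 characters.
print(clip_by_token_ratio("aaaa bbbb cccc dddd", 2, WhitespaceEncoder()))
```

Since the ratio is an average, the clipped text can come out slightly above or below `max_tokens` when token lengths vary.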


@ -1,18 +1,24 @@
from jinja2 import Environment, StrictUndefined
from tiktoken import encoding_for_model
from tiktoken import encoding_for_model, get_encoding
from pr_agent.algo import MAX_TOKENS
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
def get_token_encoder():
return encoding_for_model(get_settings().config.model) if "gpt" in get_settings().config.model else get_encoding(
"cl100k_base")
class TokenHandler:
"""
A class for handling tokens in the context of a pull request.
Attributes:
- encoder: An object of the encoding_for_model class from the tiktoken module. Used to encode strings and count the number of tokens in them.
- limit: The maximum number of tokens allowed for the given model, as defined in the MAX_TOKENS dictionary in the pr_agent.algo module.
- prompt_tokens: The number of tokens in the system and user strings, as calculated by the _get_system_user_tokens method.
- encoder: A tiktoken encoding object, obtained via get_token_encoder. Used to encode strings and count the
number of tokens in them.
- limit: The maximum number of tokens allowed for the given model, as defined in the MAX_TOKENS dictionary in the
pr_agent.algo module.
- prompt_tokens: The number of tokens in the system and user strings, as calculated by the _get_system_user_tokens
method.
"""
def __init__(self, pr, vars: dict, system, user):
@ -25,7 +31,7 @@ class TokenHandler:
- system: The system string.
- user: The user string.
"""
self.encoder = encoding_for_model(settings.config.model)
self.encoder = get_token_encoder()
self.prompt_tokens = self._get_system_user_tokens(pr, self.encoder, vars, system, user)
def _get_system_user_tokens(self, pr, encoder, vars: dict, system, user):
@ -45,7 +51,6 @@ class TokenHandler:
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(system).render(vars)
user_prompt = environment.from_string(user).render(vars)
system_prompt_tokens = len(encoder.encode(system_prompt))
user_prompt_tokens = len(encoder.encode(user_prompt))
return system_prompt_tokens + user_prompt_tokens


@ -1,15 +1,25 @@
from __future__ import annotations
import difflib
from datetime import datetime
import json
import logging
import re
import textwrap
from datetime import datetime
from typing import Any, List
from pr_agent.config_loader import settings
import yaml
from starlette_context import context
from pr_agent.config_loader import get_settings, global_settings
def get_setting(key: str) -> Any:
try:
key = key.upper()
return context.get("settings", global_settings).get(key, global_settings.get(key, None))
except Exception:
return global_settings.get(key, None)
def convert_to_markdown(output_data: dict) -> str:
"""
Convert a dictionary of data into markdown format.
@ -30,7 +40,7 @@ def convert_to_markdown(output_data: dict) -> str:
"Security concerns": "🔒",
"General PR suggestions": "💡",
"Insights from user's answers": "📝",
"Code suggestions": "🤖",
"Code feedback": "🤖",
}
for key, value in output_data.items():
@ -40,12 +50,12 @@ def convert_to_markdown(output_data: dict) -> str:
markdown_text += f"## {key}\n\n"
markdown_text += convert_to_markdown(value)
elif isinstance(value, list):
if key.lower() == 'code suggestions':
if key.lower() == 'code feedback':
markdown_text += "\n" # just looks nicer with additional line breaks
emoji = emojis.get(key, "")
markdown_text += f"- {emoji} **{key}:**\n\n"
for item in value:
if isinstance(item, dict) and key.lower() == 'code suggestions':
if isinstance(item, dict) and key.lower() == 'code feedback':
markdown_text += parse_code_suggestion(item)
elif item:
markdown_text += f" - {item}\n"
@ -90,18 +100,22 @@ def try_fix_json(review, max_iter=10, code_suggestions=False):
Args:
- review: A string containing the JSON message to be fixed.
- max_iter: An integer representing the maximum number of iterations to try and fix the JSON message.
- code_suggestions: A boolean indicating whether to try and fix JSON messages with code suggestions.
- code_suggestions: A boolean indicating whether to try and fix JSON messages with code feedback.
Returns:
- data: A dictionary containing the parsed JSON data.
The function attempts to fix broken or incomplete JSON messages by parsing until the last valid code suggestion.
If the JSON message ends with a closing bracket, the function calls the fix_json_escape_char function to fix the message.
If code_suggestions is True and the JSON message contains code suggestions, the function tries to fix the JSON message by parsing until the last valid code suggestion.
The function uses regular expressions to find the last occurrence of "}," with any number of whitespaces or newlines.
If the JSON message ends with a closing bracket, the function calls the fix_json_escape_char function to fix the
message.
If code_suggestions is True and the JSON message contains code feedback, the function tries to fix the JSON
message by parsing until the last valid code suggestion.
The function uses regular expressions to find the last occurrence of "}," with any number of whitespaces or
newlines.
It tries to parse the JSON message with the closing bracket and checks if it is valid.
If the JSON message is valid, the parsed JSON data is returned.
If the JSON message is not valid, the last code suggestion is removed and the process is repeated until a valid JSON message is obtained or the maximum number of iterations is reached.
If the JSON message is not valid, the last code suggestion is removed and the process is repeated until a valid JSON
message is obtained or the maximum number of iterations is reached.
If a valid JSON message is not obtained, an error is logged and an empty dictionary is returned.
"""
@ -114,7 +128,8 @@ def try_fix_json(review, max_iter=10, code_suggestions=False):
else:
closing_bracket = "]}}"
if review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0:
if (review.rfind("'Code feedback': [") > 0 or review.rfind('"Code feedback": [') > 0) or \
(review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0) :
last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
valid_json = False
iter_count = 0
@ -181,33 +196,88 @@ def convert_str_to_datetime(date_str):
return datetime.strptime(date_str, datetime_format)
def load_large_diff(file, new_file_content_str: str, original_file_content_str: str, patch: str) -> str:
def load_large_diff(filename, new_file_content_str: str, original_file_content_str: str) -> str:
"""
Generate a patch for a modified file by comparing the original content of the file with the new content provided as input.
Generate a patch for a modified file by comparing the original content of the file with the new content provided as
input.
Args:
file: The file object for which the patch needs to be generated.
new_file_content_str: The new content of the file as a string.
original_file_content_str: The original content of the file as a string.
patch: An optional patch string that can be provided as input.
Returns:
The generated or provided patch string.
Raises:
None.
Additional Information:
- If 'patch' is not provided as input, the function generates a patch using the 'difflib' library and returns it as output.
- If the 'settings.config.verbosity_level' is greater than or equal to 2, a warning message is logged indicating that the file was modified but no patch was found, and a patch is manually created.
"""
if not patch: # to Do - also add condition for file extension
try:
diff = difflib.unified_diff(original_file_content_str.splitlines(keepends=True),
new_file_content_str.splitlines(keepends=True))
if settings.config.verbosity_level >= 2:
logging.warning(f"File was modified, but no patch was found. Manually creating patch: {file.filename}.")
patch = ''.join(diff)
except Exception:
pass
patch = ""
try:
diff = difflib.unified_diff(original_file_content_str.splitlines(keepends=True),
new_file_content_str.splitlines(keepends=True))
if get_settings().config.verbosity_level >= 2:
logging.warning(f"File was modified, but no patch was found. Manually creating patch: {filename}.")
patch = ''.join(diff)
except Exception:
pass
return patch
def update_settings_from_args(args: List[str]) -> List[str]:
"""
Update the settings of the Dynaconf object based on the arguments passed to the function.
Args:
args: A list of arguments passed to the function.
Example args: ['--pr_code_suggestions.extra_instructions="be funny"',
'--pr_code_suggestions.num_code_suggestions=3']
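Returns:
A list of the arguments that were not applied as settings.
Note:
An argument that starts with '--' but is not of the form key=value is logged as an error and returned
with the other arguments; no exception is raised.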
"""
other_args = []
if args:
for arg in args:
arg = arg.strip()
if arg.startswith('--'):
arg = arg.strip('-').strip()
vals = arg.split('=')
if len(vals) != 2:
logging.error(f'Invalid argument format: {arg}')
other_args.append(arg)
continue
key, value = vals
key = key.strip().upper()
value = value.strip()
get_settings().set(key, value)
logging.info(f'Updated setting {key} to: "{value}"')
else:
other_args.append(arg)
return other_args
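
The splitting rule in `update_settings_from_args` can be illustrated without Dynaconf. In this minimal sketch a plain dict stands in for `get_settings()`, and `split_setting_args` is a hypothetical name; the accept/reject logic mirrors the function above.

```python
def split_setting_args(args: list[str]) -> tuple[dict[str, str], list[str]]:
    """Separate '--key=value' overrides from the remaining arguments."""
    overrides: dict[str, str] = {}  # stand-in for the Dynaconf settings object
    other_args: list[str] = []
    for arg in args:
        arg = arg.strip()
        if not arg.startswith("--"):
            other_args.append(arg)
            continue
        arg = arg.strip("-").strip()
        vals = arg.split("=")
        if len(vals) != 2:  # same rule as above: exactly one '=' is expected
            other_args.append(arg)
            continue
        key, value = vals
        overrides[key.strip().upper()] = value.strip()
    return overrides, other_args


print(split_setting_args(["--pr_reviewer.num_code_suggestions=3", "what changed?"]))
```

One consequence of splitting on every `=` is that a value which itself contains `=` is rejected rather than applied, so such values would need a different splitting rule.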
def load_yaml(review_text: str) -> dict:
review_text = review_text.removeprefix('```yaml').rstrip('`')
try:
data = yaml.load(review_text, Loader=yaml.SafeLoader)
except Exception as e:
logging.error(f"Failed to parse AI prediction: {e}")
data = try_fix_yaml(review_text)
return data
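
When a model wraps its YAML answer in a Markdown fence, the fence has to be removed before parsing. One thing to keep in mind is that `str.lstrip` takes a set of characters rather than a prefix, so it can also eat the first letters of an unfenced answer. The following is a minimal sketch of prefix-based stripping; `strip_yaml_fence` is a hypothetical helper, not part of the PR.

```python
FENCE_END = "`" * 3            # closing Markdown fence
FENCE_START = FENCE_END + "yaml"  # opening fence a model may put before YAML


def strip_yaml_fence(text: str) -> str:
    """Remove a surrounding Markdown yaml fence, if present, and nothing else."""
    text = text.strip()
    if text.startswith(FENCE_START):
        text = text[len(FENCE_START):]
    if text.endswith(FENCE_END):
        text = text[:-len(FENCE_END)]
    return text.strip()


unfenced = "languages:\n- python"
# lstrip removes any leading run of the characters `, y, a, m, l:
print(repr(unfenced.lstrip(FENCE_START)))
print(repr(strip_yaml_fence(unfenced)))
print(repr(strip_yaml_fence(FENCE_START + "\n" + unfenced + "\n" + FENCE_END)))
```

Checking for the exact prefix and suffix leaves an unfenced reply untouched, whichever letter it starts with.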
def try_fix_yaml(review_text: str) -> dict:
review_text_lines = review_text.split('\n')
data = {}
for i in range(1, len(review_text_lines)):
review_text_lines_tmp = '\n'.join(review_text_lines[:-i])
try:
data = yaml.load(review_text_lines_tmp, Loader=yaml.SafeLoader)
logging.info(f"Successfully parsed AI prediction after removing {i} lines")
break
except:
pass
return data


@ -3,23 +3,20 @@ import asyncio
import logging
import os
from pr_agent.tools.pr_code_suggestions import PRCodeSuggestions
from pr_agent.tools.pr_description import PRDescription
from pr_agent.tools.pr_information_from_user import PRInformationFromUser
from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer
from pr_agent.agent.pr_agent import PRAgent, commands
from pr_agent.config_loader import get_settings
def run(args=None):
def run(inargs=None):
parser = argparse.ArgumentParser(description='AI based pull request analyzer', usage=
"""\
Usage: cli.py --pr-url <URL on supported git hosting service> <command> [<args>].
Usage: cli.py --pr_url=<URL on supported git hosting service> <command> [<args>].
For example:
- cli.py --pr-url=... review
- cli.py --pr-url=... describe
- cli.py --pr-url=... improve
- cli.py --pr-url=... ask "write me a poem about this PR"
- cli.py --pr-url=... reflect
- cli.py --pr_url=... review
- cli.py --pr_url=... describe
- cli.py --pr_url=... improve
- cli.py --pr_url=... ask "write me a poem about this PR"
- cli.py --pr_url=... reflect
Supported commands:
review / review_pr - Add a review that includes a summary of the PR and specific suggestions for improvement.
@ -27,75 +24,22 @@ ask / ask_question [question] - Ask a question about the PR.
describe / describe_pr - Modify the PR title and description based on the PR's contents.
improve / improve_code - Suggest improvements to the code in the PR as pull request comments ready to commit.
reflect - Ask the PR author questions about the PR.
update_changelog - Update the changelog based on the PR's contents.
To override any configuration parameter from 'configuration.toml', append --<section>.<key>=<value> after the command.
For example: 'python cli.py --pr_url=... review --pr_reviewer.extra_instructions="focus on the file: ..."'
""")
parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
parser.add_argument('command', type=str, help='The', choices=['review', 'review_pr',
'ask', 'ask_question',
'describe', 'describe_pr',
'improve', 'improve_code',
'reflect', 'review_after_reflect'],
default='review')
parser.add_argument('command', type=str, help='The command to run', choices=commands, default='review')
parser.add_argument('rest', nargs=argparse.REMAINDER, default=[])
args = parser.parse_args(args)
args = parser.parse_args(inargs)
logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
command = args.command.lower()
commands = {
'ask': _handle_ask_command,
'ask_question': _handle_ask_command,
'describe': _handle_describe_command,
'describe_pr': _handle_describe_command,
'improve': _handle_improve_command,
'improve_code': _handle_improve_command,
'review': _handle_review_command,
'review_pr': _handle_review_command,
'reflect': _handle_reflect_command,
'review_after_reflect': _handle_review_after_reflect_command
}
if command in commands:
commands[command](args.pr_url, args.rest)
else:
print(f"Unknown command: {command}")
get_settings().set("CONFIG.CLI_MODE", True)
result = asyncio.run(PRAgent().handle_request(args.pr_url, command + " " + " ".join(args.rest)))
if not result:
parser.print_help()
def _handle_ask_command(pr_url: str, rest: list):
if len(rest) == 0:
print("Please specify a question")
return
print(f"Question: {' '.join(rest)} about PR {pr_url}")
reviewer = PRQuestions(pr_url, rest)
asyncio.run(reviewer.answer())
def _handle_describe_command(pr_url: str, rest: list):
print(f"PR description: {pr_url}")
reviewer = PRDescription(pr_url)
asyncio.run(reviewer.describe())
def _handle_improve_command(pr_url: str, rest: list):
print(f"PR code suggestions: {pr_url}")
reviewer = PRCodeSuggestions(pr_url)
asyncio.run(reviewer.suggest())
def _handle_review_command(pr_url: str, rest: list):
print(f"Reviewing PR: {pr_url}")
reviewer = PRReviewer(pr_url, cli_mode=True, args=rest)
asyncio.run(reviewer.review())
def _handle_reflect_command(pr_url: str, rest: list):
print(f"Asking the PR author questions: {pr_url}")
reviewer = PRInformationFromUser(pr_url)
asyncio.run(reviewer.generate_questions())
def _handle_review_after_reflect_command(pr_url: str, rest: list):
print(f"Processing author's answers and sending review: {pr_url}")
reviewer = PRReviewer(pr_url, cli_mode=True, is_answer=True)
asyncio.run(reviewer.review())
if __name__ == '__main__':
run()
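
The new `run` no longer dispatches to per-command handlers; it parses the arguments and hands a single request string to the agent. The sketch below shows only the parsing half, under stated assumptions: `parse_cli` and the `COMMANDS` list are hypothetical names (the source imports its command list from `pr_agent.agent.pr_agent`), and the agent call is omitted.

```python
import argparse
from typing import List, Optional, Tuple

# Hypothetical subset of command names; the source imports the real list.
COMMANDS = ["review", "describe", "improve", "ask", "reflect"]


def parse_cli(argv: Optional[List[str]] = None) -> Tuple[str, str]:
    """Return (pr_url, request), where request is the command plus its arguments."""
    parser = argparse.ArgumentParser(description="AI based pull request analyzer")
    parser.add_argument("--pr_url", type=str, required=True)
    parser.add_argument("command", type=str, choices=COMMANDS)
    # REMAINDER collects everything after the command into a list.
    parser.add_argument("rest", nargs=argparse.REMAINDER, default=[])
    args = parser.parse_args(argv)
    request = " ".join([args.command.lower(), *args.rest])
    return args.pr_url, request


url, request = parse_cli(["--pr_url=https://example.com/pr/1", "ask", "what", "changed?"])
```

Joining the command and its arguments into one string lets the CLI reuse the same request parsing that comment-triggered commands go through.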

@ -1,20 +1,64 @@
from os.path import abspath, dirname, join
from pathlib import Path
from typing import Optional
from dynaconf import Dynaconf
from starlette_context import context
PR_AGENT_TOML_KEY = 'pr-agent'
current_dir = dirname(abspath(__file__))
settings = Dynaconf(
global_settings = Dynaconf(
envvar_prefix=False,
merge_enabled=True,
settings_files=[join(current_dir, f) for f in [
"settings/.secrets.toml",
"settings/configuration.toml",
"settings/language_extensions.toml",
"settings/pr_reviewer_prompts.toml",
"settings/pr_questions_prompts.toml",
"settings/pr_description_prompts.toml",
"settings/pr_code_suggestions_prompts.toml",
"settings/pr_information_from_user_prompts.toml",
"settings_prod/.secrets.toml"
]]
"settings/.secrets.toml",
"settings/configuration.toml",
"settings/language_extensions.toml",
"settings/pr_reviewer_prompts.toml",
"settings/pr_questions_prompts.toml",
"settings/pr_description_prompts.toml",
"settings/pr_code_suggestions_prompts.toml",
"settings/pr_information_from_user_prompts.toml",
"settings/pr_update_changelog_prompts.toml",
"settings_prod/.secrets.toml"
]]
)
def get_settings():
try:
return context["settings"]
except Exception:
return global_settings
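
`get_settings` prefers request-scoped settings and falls back to the module-level object when no request context exists (for example, when running from the CLI). A rough standard-library analogue is sketched below; `contextvars` is a stand-in for `starlette_context`, and the dictionary contents are made up for illustration.

```python
from contextvars import ContextVar
from typing import Any, Dict, Optional

# Stand-in for the module-level Dynaconf object in the source.
GLOBAL_SETTINGS: Dict[str, Any] = {"git_provider": "github"}

# Stand-in for the per-request context; empty outside a request.
_request_settings: ContextVar[Optional[Dict[str, Any]]] = ContextVar(
    "request_settings", default=None
)


def get_settings() -> Dict[str, Any]:
    """Return request-scoped settings if present, else the global ones."""
    scoped = _request_settings.get()
    return scoped if scoped is not None else GLOBAL_SETTINGS


# Inside a request handler, settings can be overridden without touching globals.
token = _request_settings.set({"git_provider": "gitlab"})
scoped_provider = get_settings()["git_provider"]
_request_settings.reset(token)
```

The point of the indirection is that concurrent requests can carry different configuration without mutating shared state.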
# Add local configuration from pyproject.toml of the project being reviewed
def _find_repository_root() -> Optional[Path]:
"""
Identify the project root directory by walking up the parent directories until one contains a .git directory.
Returns None if no such directory is found.
"""
cwd = Path.cwd().resolve()
no_way_up = False
while not no_way_up:
no_way_up = cwd == cwd.parent
if (cwd / ".git").is_dir():
return cwd
cwd = cwd.parent
return None
def _find_pyproject() -> Optional[Path]:
"""
Search for file pyproject.toml in the repository root.
"""
repo_root = _find_repository_root()
if repo_root:
pyproject = repo_root / "pyproject.toml"
return pyproject if pyproject.is_file() else None
return None
pyproject_path = _find_pyproject()
if pyproject_path is not None:
get_settings().load_file(pyproject_path, env=f'tool.{PR_AGENT_TOML_KEY}')
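
The repository-root search can be written more compactly with `Path.parents`. This is a sketch, not the code in the diff: it takes the starting directory as a parameter (an assumption made so that it can be exercised without changing the working directory), and the function names are illustrative.

```python
from pathlib import Path
from typing import Optional


def find_repository_root(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root looking for a .git directory."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").is_dir():
            return candidate
    return None


def find_pyproject(start: Path) -> Optional[Path]:
    """Return the pyproject.toml in the repository root, if both exist."""
    root = find_repository_root(start)
    if root is None:
        return None
    pyproject = root / "pyproject.toml"
    return pyproject if pyproject.is_file() else None
```

Calling `find_pyproject(Path.cwd())` would correspond to the behaviour in the diff, which always starts from the current working directory.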

@ -1,17 +1,19 @@
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers.bitbucket_provider import BitbucketProvider
from pr_agent.git_providers.github_provider import GithubProvider
from pr_agent.git_providers.gitlab_provider import GitLabProvider
from pr_agent.git_providers.local_git_provider import LocalGitProvider
_GIT_PROVIDERS = {
'github': GithubProvider,
'gitlab': GitLabProvider,
'bitbucket': BitbucketProvider,
'local' : LocalGitProvider
}
def get_git_provider():
try:
provider_id = settings.config.git_provider
provider_id = get_settings().config.git_provider
except AttributeError as e:
raise ValueError("git_provider is a required attribute in the configuration file") from e
if provider_id not in _GIT_PROVIDERS:

@ -5,15 +5,15 @@ from urllib.parse import urlparse
import requests
from atlassian.bitbucket import Cloud
from pr_agent.config_loader import settings
from ..algo.pr_processing import clip_tokens
from ..config_loader import get_settings
from .git_provider import FilePatchInfo
class BitbucketProvider:
def __init__(self, pr_url: Optional[str] = None, incremental: Optional[bool] = False):
s = requests.Session()
s.headers['Authorization'] = f'Bearer {settings.get("BITBUCKET.BEARER_TOKEN", None)}'
s.headers['Authorization'] = f'Bearer {get_settings().get("BITBUCKET.BEARER_TOKEN", None)}'
self.bitbucket_client = Cloud(session=s)
self.workspace_slug = None
@ -82,6 +82,9 @@ class BitbucketProvider:
return self.pr.source_branch
def get_pr_description(self):
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
if max_tokens:
return clip_tokens(self.pr.description, max_tokens)
return self.pr.description
def get_user_id(self):
@ -90,6 +93,12 @@ class BitbucketProvider:
def get_issue_comments(self):
raise NotImplementedError("Bitbucket provider does not support issue comments yet")
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
return True
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
return True
@staticmethod
def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
parsed_url = urlparse(pr_url)
@ -121,3 +130,6 @@ class BitbucketProvider:
def _get_pr_file_content(self, remote_link: str):
return ""
def get_commit_messages(self):
return "" # not implemented yet
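
`get_pr_description` clips the description only when `CONFIG.MAX_DESCRIPTION_TOKENS` is set. The implementation of `clip_tokens` is not shown in this diff, so the sketch below substitutes a crude stand-in that counts whitespace-separated words instead of model tokens; `clip_text` is a hypothetical name.

```python
from typing import Optional


def clip_text(text: Optional[str], max_tokens: Optional[int]) -> Optional[str]:
    """Keep at most `max_tokens` whitespace-separated words of `text`.

    Words are an approximation of tokens, used here only for illustration.
    """
    if not text or not max_tokens:
        return text  # empty description, or clipping disabled
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[:max_tokens])
```

Guarding against an empty or missing description matters because hosting APIs may return `None` for a pull request without a body.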

@ -3,6 +3,7 @@ from dataclasses import dataclass
# enum EDIT_TYPE (ADDED, DELETED, MODIFIED, RENAMED)
from enum import Enum
from typing import Optional
class EDIT_TYPE(Enum):
@ -88,6 +89,17 @@ class GitProvider(ABC):
def get_issue_comments(self):
pass
@abstractmethod
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
pass
@abstractmethod
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
pass
@abstractmethod
def get_commit_messages(self):
pass
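
Declaring the three methods abstract means every provider must define them, even if only as a no-op, which is why the Bitbucket and GitLab providers gain stub implementations in this change. A self-contained sketch of the pattern follows; `ReactionProvider` and `NoReactionProvider` are hypothetical names, not classes from the repository.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ReactionProvider(ABC):
    """Cut-down interface mirroring the three methods added in the diff."""

    @abstractmethod
    def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
        ...

    @abstractmethod
    def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
        ...

    @abstractmethod
    def get_commit_messages(self) -> str:
        ...


class NoReactionProvider(ReactionProvider):
    """Provider for a host without reaction support: every call is a no-op."""

    def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
        return None  # no reaction was created, so there is no ID to return

    def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
        return True  # nothing to remove counts as success

    def get_commit_messages(self) -> str:
        return ""
```

Instantiating `ReactionProvider` directly raises `TypeError`, which is how a missing override is caught at construction time rather than at call time.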
def get_main_pr_language(languages, files) -> str:
"""
@ -136,3 +148,4 @@ class IncrementalPR:
self.commits_range = None
self.first_new_commit_sha = None
self.last_seen_commit_sha = None

@ -1,27 +1,36 @@
import logging
import hashlib
from datetime import datetime
from typing import Optional, Tuple
from typing import Optional, Tuple, Any
from urllib.parse import urlparse
from github import AppAuthentication, Github, Auth
from pr_agent.config_loader import settings
from github import AppAuthentication, Auth, Github, GithubException, Reaction
from retry import retry
from starlette_context import context
from .git_provider import FilePatchInfo, GitProvider, IncrementalPR
from ..algo.language_handler import is_valid_file
from ..algo.utils import load_large_diff
from ..algo.pr_processing import find_line_number_of_relevant_line_in_file, clip_tokens
from ..config_loader import get_settings
from ..servers.utils import RateLimitExceeded
class GithubProvider(GitProvider):
def __init__(self, pr_url: Optional[str] = None, incremental=IncrementalPR(False)):
self.repo_obj = None
self.installation_id = settings.get("GITHUB.INSTALLATION_ID")
try:
self.installation_id = context.get("installation_id", None)
except Exception:
self.installation_id = None
self.github_client = self._get_github_client()
self.repo = None
self.pr_num = None
self.pr = None
self.github_user_id = None
self.diff_files = None
self.git_files = None
self.incremental = incremental
if pr_url:
self.set_pr(pr_url)
@ -76,36 +85,59 @@ class GithubProvider(GitProvider):
def get_files(self):
if self.incremental.is_incremental and self.file_set:
return self.file_set.values()
return self.pr.get_files()
if not self.git_files:
# bring files from GitHub only once
self.git_files = self.pr.get_files()
return self.git_files
@retry(exceptions=RateLimitExceeded,
tries=get_settings().github.ratelimit_retries, delay=2, backoff=2, jitter=(1, 3))
def get_diff_files(self) -> list[FilePatchInfo]:
files = self.get_files()
diff_files = []
for file in files:
if is_valid_file(file.filename):
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha)
"""
Retrieves the list of files that have been modified, added, deleted, or renamed in a pull request in GitHub,
along with their content and patch information.
Returns:
diff_files (List[FilePatchInfo]): List of FilePatchInfo objects representing the modified, added, deleted,
or renamed files in the merge request.
"""
try:
if self.diff_files:
return self.diff_files
files = self.get_files()
diff_files = []
for file in files:
if not is_valid_file(file.filename):
continue
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha) # communication with GitHub
patch = file.patch
if self.incremental.is_incremental and self.file_set:
original_file_content_str = self._get_pr_file_content(file, self.incremental.last_seen_commit_sha)
patch = load_large_diff(file,
new_file_content_str,
original_file_content_str,
None)
patch = load_large_diff(file.filename, new_file_content_str, original_file_content_str)
self.file_set[file.filename] = patch
else:
original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
if not patch:
patch = load_large_diff(file.filename, new_file_content_str, original_file_content_str)
diff_files.append(
FilePatchInfo(original_file_content_str, new_file_content_str, patch, file.filename))
self.diff_files = diff_files
return diff_files
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, patch, file.filename))
self.diff_files = diff_files
return diff_files
except GithubException.RateLimitExceededException as e:
logging.error(f"Rate limit exceeded for GitHub API. Original message: {e}")
raise RateLimitExceeded("Rate limit exceeded for GitHub API.") from e
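
The `@retry(...)` decorator retries `get_diff_files` with exponential backoff whenever `RateLimitExceeded` is raised. The following is a hand-rolled approximation using only the standard library, not the `retry` package itself; `retry_on` and the injectable `sleep` parameter are assumptions added so the behaviour can be observed without waiting.

```python
import random
import time
from functools import wraps
from typing import Callable, Tuple, Type


class RateLimitExceeded(Exception):
    """Local stand-in for the exception raised on API rate limiting."""


def retry_on(exc: Type[Exception], tries: int, delay: float, backoff: float,
             jitter: Tuple[float, float] = (0.0, 0.0),
             sleep: Callable[[float], None] = time.sleep):
    """Retry the wrapped call up to `tries` times, growing the wait each time."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exc:
                    if attempt == tries:
                        raise  # out of attempts; surface the error
                    sleep(wait)
                    # Multiply by the backoff factor, then add random jitter.
                    wait = wait * backoff + random.uniform(*jitter)
        return wrapper
    return decorator
```

With `delay=2, backoff=2` and no jitter, the waits between attempts are 2 s, then 4 s, and so on; the jitter range in the diff spreads retries from concurrent workers apart.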
def publish_description(self, pr_title: str, pr_body: str):
self.pr.edit(title=pr_title, body=pr_body)
# self.pr.create_issue_comment(pr_comment)
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
if is_temporary and not settings.config.publish_output_progress:
if is_temporary and not get_settings().config.publish_output_progress:
logging.debug(f"Skipping publish_comment for temporary comment: {pr_comment}")
return
response = self.pr.create_issue_comment(pr_comment)
@ -119,31 +151,16 @@ class GithubProvider(GitProvider):
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
self.publish_inline_comments([self.create_inline_comment(body, relevant_file, relevant_line_in_file)])
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
position = -1
for file in self.diff_files:
if file.filename.strip() == relevant_file:
patch = file.patch
patch_lines = patch.splitlines()
for i, line in enumerate(patch_lines):
if relevant_line_in_file in line:
position = i
break
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:].lstrip() in line:
# The model often adds a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
position = i
break
position, absolute_position = find_line_number_of_relevant_line_in_file(self.diff_files, relevant_file.strip('`'), relevant_line_in_file)
if position == -1:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
subject_type = "FILE"
else:
subject_type = "LINE"
path = relevant_file.strip()
# placeholder for future API support (already supported in single inline comment)
# return dict(body=body, path=path, position=position, subject_type=subject_type)
return dict(body=body, path=path, position=position) if subject_type == "LINE" else {}
def publish_inline_comments(self, comments: list[dict]):
@ -161,13 +178,13 @@ class GithubProvider(GitProvider):
relevant_lines_end = suggestion['relevant_lines_end']
if not relevant_lines_start or relevant_lines_start == -1:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.error(
f"Failed to publish code suggestion, relevant_lines_start is {relevant_lines_start}")
continue
if relevant_lines_end < relevant_lines_start:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.error(f"Failed to publish code suggestion, "
f"relevant_lines_end is {relevant_lines_end} and "
f"relevant_lines_start is {relevant_lines_start}")
@ -194,7 +211,7 @@ class GithubProvider(GitProvider):
self.pr.create_review(commit=self.last_commit_id, comments=post_parameters_list)
return True
except Exception as e:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.error(f"Failed to publish code suggestion, error: {e}")
return False
@ -217,6 +234,9 @@ class GithubProvider(GitProvider):
return self.pr.head.ref
def get_pr_description(self):
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
if max_tokens:
return clip_tokens(self.pr.body, max_tokens)
return self.pr.body
def get_user_id(self):
@ -228,7 +248,7 @@ class GithubProvider(GitProvider):
return self.github_user_id
def get_notifications(self, since: datetime):
deployment_type = settings.get("GITHUB.DEPLOYMENT_TYPE", "user")
deployment_type = get_settings().get("GITHUB.DEPLOYMENT_TYPE", "user")
if deployment_type != 'user':
raise ValueError("Deployment mode must be set to 'user' to get notifications")
@ -239,6 +259,30 @@ class GithubProvider(GitProvider):
def get_issue_comments(self):
return self.pr.get_issue_comments()
def get_repo_settings(self):
try:
contents = self.repo_obj.get_contents(".pr_agent.toml", ref=self.pr.head.sha).decoded_content
return contents
except Exception:
return ""
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
try:
reaction = self.pr.get_issue_comment(issue_comment_id).create_reaction("eyes")
return reaction.id
except Exception as e:
logging.exception(f"Failed to add eyes reaction, error: {e}")
return None
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
try:
self.pr.get_issue_comment(issue_comment_id).delete_reaction(reaction_id)
return True
except Exception as e:
logging.exception(f"Failed to remove eyes reaction, error: {e}")
return False
@staticmethod
def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
parsed_url = urlparse(pr_url)
@ -269,12 +313,12 @@ class GithubProvider(GitProvider):
return repo_name, pr_number
def _get_github_client(self):
deployment_type = settings.get("GITHUB.DEPLOYMENT_TYPE", "user")
deployment_type = get_settings().get("GITHUB.DEPLOYMENT_TYPE", "user")
if deployment_type == 'app':
try:
private_key = settings.github.private_key
app_id = settings.github.app_id
private_key = get_settings().github.private_key
app_id = get_settings().github.app_id
except AttributeError as e:
raise ValueError("GitHub app ID and private key are required when using GitHub app deployment") from e
if not self.installation_id:
@ -285,7 +329,7 @@ class GithubProvider(GitProvider):
if deployment_type == 'user':
try:
token = settings.github.user_token
token = get_settings().github.user_token
except AttributeError as e:
raise ValueError(
"GitHub token is required when using user deployment. See: "
@ -314,7 +358,9 @@ class GithubProvider(GitProvider):
def publish_labels(self, pr_types):
try:
label_color_map = {"Bug fix": "1d76db", "Tests": "e99695", "Bug fix with tests": "c5def5", "Refactoring": "bfdadc", "Enhancement": "bfd4f2", "Documentation": "d4c5f9", "Other": "d1bcf9"}
label_color_map = {"Bug fix": "1d76db", "Tests": "e99695", "Bug fix with tests": "c5def5",
"Refactoring": "bfdadc", "Enhancement": "bfd4f2", "Documentation": "d4c5f9",
"Other": "d1bcf9"}
post_parameters = []
for p in pr_types:
color = label_color_map.get(p, "d1bcf9") # default to "Other" color
@ -330,4 +376,47 @@ class GithubProvider(GitProvider):
return [label.name for label in self.pr.labels]
except Exception as e:
logging.exception(f"Failed to get labels, error: {e}")
return []
return []
def get_commit_messages(self):
"""
Retrieves the commit messages of a pull request.
Returns:
str: A string containing the commit messages of the pull request.
"""
max_tokens = get_settings().get("CONFIG.MAX_COMMITS_TOKENS", None)
try:
commit_list = self.pr.get_commits()
commit_messages = [commit.commit.message for commit in commit_list]
commit_messages_str = "\n".join([f"{i + 1}. {message}" for i, message in enumerate(commit_messages)])
except Exception:
commit_messages_str = ""
if max_tokens:
commit_messages_str = clip_tokens(commit_messages_str, max_tokens)
return commit_messages_str
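
The commit messages are joined into a single numbered list before optional clipping. A minimal sketch of the numbering step alone, with `format_commit_messages` as an illustrative name:

```python
from typing import Iterable


def format_commit_messages(messages: Iterable[str]) -> str:
    """Join commit messages into a 1-based numbered list, one entry per line."""
    return "\n".join(f"{i}. {message}" for i, message in enumerate(messages, start=1))


numbered = format_commit_messages(["Fix bug", "Add tests"])
```

Multi-line commit messages are inserted as-is, so their continuation lines appear unnumbered between entries.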
def generate_link_to_relevant_line_number(self, suggestion) -> str:
try:
relevant_file = suggestion['relevant file'].strip('`').strip("'")
relevant_line_str = suggestion['relevant line']
if not relevant_line_str:
return ""
position, absolute_position = find_line_number_of_relevant_line_in_file \
(self.diff_files, relevant_file, relevant_line_str)
if absolute_position != -1:
# # link to right file only
# link = f"https://github.com/{self.repo}/blob/{self.pr.head.sha}/{relevant_file}" \
# + "#" + f"L{absolute_position}"
# link to diff
sha_file = hashlib.sha256(relevant_file.encode('utf-8')).hexdigest()
link = f"https://github.com/{self.repo}/pull/{self.pr_num}/files#diff-{sha_file}R{absolute_position}"
return link
except Exception as e:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Failed adding line link, error: {e}")
return ""
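
The link built above points at a line in the pull request's "Files changed" tab: the anchor is the SHA-256 hex digest of the file path, followed by `R` and the line number on the right-hand side of the diff. A standalone sketch of that construction, with `diff_link` and its parameters as illustrative names:

```python
import hashlib


def diff_link(repo: str, pr_number: int, file_path: str, line: int) -> str:
    """Build a link to `line` of `file_path` in the PR's diff view."""
    anchor = hashlib.sha256(file_path.encode("utf-8")).hexdigest()
    return f"https://github.com/{repo}/pull/{pr_number}/files#diff-{anchor}R{line}"


link = diff_link("owner/repo", 193, "pr_agent/cli.py", 12)
```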

@ -6,9 +6,10 @@ from urllib.parse import urlparse
import gitlab
from gitlab import GitlabGetError
from pr_agent.config_loader import settings
from ..algo.language_handler import is_valid_file
from ..algo.pr_processing import clip_tokens
from ..algo.utils import load_large_diff
from ..config_loader import get_settings
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider
logger = logging.getLogger()
@ -16,22 +17,22 @@ logger = logging.getLogger()
class GitLabProvider(GitProvider):
def __init__(self, merge_request_url: Optional[str] = None, incremental: Optional[bool] = False):
gitlab_url = settings.get("GITLAB.URL", None)
gitlab_url = get_settings().get("GITLAB.URL", None)
if not gitlab_url:
raise ValueError("GitLab URL is not set in the config file")
gitlab_access_token = settings.get("GITLAB.PERSONAL_ACCESS_TOKEN", None)
gitlab_access_token = get_settings().get("GITLAB.PERSONAL_ACCESS_TOKEN", None)
if not gitlab_access_token:
raise ValueError("GitLab personal access token is not set in the config file")
self.gl = gitlab.Gitlab(
gitlab_url,
gitlab_access_token
url=gitlab_url,
oauth_token=gitlab_access_token
)
self.id_project = None
self.id_mr = None
self.mr = None
self.diff_files = None
self.git_files = None
self.temp_comments = []
self._set_merge_request(merge_request_url)
self.RE_HUNK_HEADER = re.compile(
@ -67,19 +68,27 @@ class GitLabProvider(GitProvider):
return ''
def get_diff_files(self) -> list[FilePatchInfo]:
"""
Retrieves the list of files that have been modified, added, deleted, or renamed in a pull request in GitLab,
along with their content and patch information.
Returns:
diff_files (List[FilePatchInfo]): List of FilePatchInfo objects representing the modified, added, deleted,
or renamed files in the merge request.
"""
if self.diff_files:
return self.diff_files
diffs = self.mr.changes()['changes']
diff_files = []
for diff in diffs:
if is_valid_file(diff['new_path']):
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
edit_type = EDIT_TYPE.MODIFIED
if diff['new_file']:
edit_type = EDIT_TYPE.ADDED
elif diff['deleted_file']:
edit_type = EDIT_TYPE.DELETED
elif diff['renamed_file']:
edit_type = EDIT_TYPE.RENAMED
# original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
# new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.diff_refs['base_sha'])
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.diff_refs['head_sha'])
try:
if isinstance(original_file_content_str, bytes):
original_file_content_str = bytes.decode(original_file_content_str, 'utf-8')
@ -88,15 +97,33 @@ class GitLabProvider(GitProvider):
except UnicodeDecodeError:
logging.warning(
f"Cannot decode file {diff['old_path']} or {diff['new_path']} in merge request {self.id_mr}")
edit_type = EDIT_TYPE.MODIFIED
if diff['new_file']:
edit_type = EDIT_TYPE.ADDED
elif diff['deleted_file']:
edit_type = EDIT_TYPE.DELETED
elif diff['renamed_file']:
edit_type = EDIT_TYPE.RENAMED
filename = diff['new_path']
patch = diff['diff']
if not patch:
patch = load_large_diff(filename, new_file_content_str, original_file_content_str)
diff_files.append(
FilePatchInfo(original_file_content_str, new_file_content_str, diff['diff'], diff['new_path'],
FilePatchInfo(original_file_content_str, new_file_content_str,
patch=patch,
filename=filename,
edit_type=edit_type,
old_filename=None if diff['old_path'] == diff['new_path'] else diff['old_path']))
self.diff_files = diff_files
return diff_files
def get_files(self):
return [change['new_path'] for change in self.mr.changes()['changes']]
if not self.git_files:
self.git_files = [change['new_path'] for change in self.mr.changes()['changes']]
return self.git_files
def publish_description(self, pr_title: str, pr_body: str):
try:
@ -112,7 +139,6 @@ class GitLabProvider(GitProvider):
self.temp_comments.append(comment)
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
edit_type, found, source_line_no, target_file, target_line_no = self.search_line(relevant_file,
relevant_line_in_file)
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
@ -141,38 +167,48 @@ class GitLabProvider(GitProvider):
else:
pos_obj['new_line'] = target_line_no - 1
pos_obj['old_line'] = source_line_no - 1
logging.debug(f"Creating comment in {self.id_mr} with body {body} and position {pos_obj}")
self.mr.discussions.create({'body': body,
'position': pos_obj})
def publish_code_suggestions(self, code_suggestions: list):
for suggestion in code_suggestions:
body = suggestion['body']
relevant_file = suggestion['relevant_file']
relevant_lines_start = suggestion['relevant_lines_start']
relevant_lines_end = suggestion['relevant_lines_end']
try:
body = suggestion['body']
relevant_file = suggestion['relevant_file']
relevant_lines_start = suggestion['relevant_lines_start']
relevant_lines_end = suggestion['relevant_lines_end']
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
target_file = None
for file in self.diff_files:
if file.filename == relevant_file:
diff_files = self.get_diff_files()
target_file = None
for file in diff_files:
if file.filename == relevant_file:
target_file = file
break
range = relevant_lines_end - relevant_lines_start + 1
body = body.replace('```suggestion', f'```suggestion:-0+{range}')
if file.filename == relevant_file:
target_file = file
break
range = relevant_lines_end - relevant_lines_start # no need to add 1
body = body.replace('```suggestion', f'```suggestion:-0+{range}')
lines = target_file.head_file.splitlines()
relevant_line_in_file = lines[relevant_lines_start - 1]
lines = target_file.head_file.splitlines()
relevant_line_in_file = lines[relevant_lines_start - 1]
edit_type, found, source_line_no, target_file, target_line_no = self.find_in_file(target_file,
relevant_line_in_file)
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
target_file, target_line_no)
# edit_type, found, source_line_no, target_file, target_line_no = self.find_in_file(target_file,
# relevant_line_in_file)
# for code suggestions, we want to edit the new code
source_line_no = None
target_line_no = relevant_lines_start + 1
found = True
edit_type = 'addition'
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
target_file, target_line_no)
except Exception as e:
logging.exception(f"Could not publish code suggestion:\nsuggestion: {suggestion}\nerror: {e}")
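
For multi-line suggestions the opening fence is rewritten to `suggestion:-0+N`, where `N` is the number of lines below the commented line that the suggestion replaces. That is why the updated code drops the `+ 1`. The sketch below isolates that rewrite; `widen_suggestion` is a hypothetical helper, and the fence is assembled from single backticks only to keep this example readable.

```python
FENCE = "`" * 3  # three backticks, built programmatically


def widen_suggestion(body: str, start_line: int, end_line: int) -> str:
    """Rewrite the opening suggestion fence to cover start_line..end_line."""
    extra_lines = end_line - start_line  # lines below the commented line
    return body.replace(f"{FENCE}suggestion", f"{FENCE}suggestion:-0+{extra_lines}")


body = f"{FENCE}suggestion\nnew code\n{FENCE}"
widened = widen_suggestion(body, start_line=10, end_line=12)
```

A suggestion covering lines 10 to 12 therefore gets the offset `-0+2`: the commented line itself plus two lines after it.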
def search_line(self, relevant_file, relevant_line_in_file):
target_file = None
edit_type = self.get_edit_type(relevant_line_in_file)
for file in self.diff_files:
for file in self.get_diff_files():
if file.filename == relevant_file:
edit_type, found, source_line_no, target_file, target_line_no = self.find_in_file(file,
relevant_line_in_file)
@ -240,25 +276,51 @@ class GitLabProvider(GitProvider):
return self.mr.source_branch
def get_pr_description(self):
max_tokens = get_settings().get("CONFIG.MAX_DESCRIPTION_TOKENS", None)
if max_tokens:
return clip_tokens(self.mr.description, max_tokens)
return self.mr.description
def get_issue_comments(self):
raise NotImplementedError("GitLab provider does not support issue comments yet")
def _parse_merge_request_url(self, merge_request_url: str) -> Tuple[int, int]:
def get_repo_settings(self):
try:
contents = self.gl.projects.get(self.id_project).files.get(file_path='.pr_agent.toml', ref=self.mr.source_branch)
return contents
except Exception:
return ""
def add_eyes_reaction(self, issue_comment_id: int) -> Optional[int]:
return True
def remove_reaction(self, issue_comment_id: int, reaction_id: int) -> bool:
return True
def _parse_merge_request_url(self, merge_request_url: str) -> Tuple[str, int]:
parsed_url = urlparse(merge_request_url)
path_parts = parsed_url.path.strip('/').split('/')
if path_parts[-2] != 'merge_requests':
if 'merge_requests' not in path_parts:
raise ValueError("The provided URL does not appear to be a GitLab merge request URL")
mr_index = path_parts.index('merge_requests')
# Ensure there is an ID after 'merge_requests'
if len(path_parts) <= mr_index + 1:
raise ValueError("The provided URL does not contain a merge request ID")
try:
mr_id = int(path_parts[-1])
mr_id = int(path_parts[mr_index + 1])
except ValueError as e:
raise ValueError("Unable to convert merge request ID to integer") from e
# Gitlab supports access by both project numeric ID as well as 'namespace/project_name'
return "/".join(path_parts[:2]), mr_id
# Handle special delimiter (-)
project_path = "/".join(path_parts[:mr_index])
if project_path.endswith('/-'):
project_path = project_path[:-2]
# Return the path before 'merge_requests' and the ID
return project_path, mr_id
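
The revised parser locates `merge_requests` anywhere in the path, so nested groups and the `/-/` delimiter are both handled. A standalone version of the same logic, written as a free function for illustration (the error messages are paraphrased):

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_merge_request_url(url: str) -> Tuple[str, int]:
    """Split a GitLab merge request URL into (project_path, merge_request_id)."""
    parts = urlparse(url).path.strip("/").split("/")
    if "merge_requests" not in parts:
        raise ValueError("The URL does not appear to be a GitLab merge request URL")
    index = parts.index("merge_requests")
    if len(parts) <= index + 1:
        raise ValueError("The URL does not contain a merge request ID")
    try:
        mr_id = int(parts[index + 1])
    except ValueError as e:
        raise ValueError("Unable to convert merge request ID to integer") from e
    project_parts = parts[:index]
    # Newer GitLab URLs separate the project path from the resource with '-'.
    if project_parts and project_parts[-1] == "-":
        project_parts = project_parts[:-1]
    return "/".join(project_parts), mr_id


project, mr_id = parse_merge_request_url(
    "https://gitlab.com/group/subgroup/project/-/merge_requests/42/diffs"
)
```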
def _get_merge_request(self):
mr = self.gl.projects.get(self.id_project).mergerequests.get(self.id_mr)
@ -279,3 +341,20 @@ class GitLabProvider(GitProvider):
def get_labels(self):
return self.mr.labels
def get_commit_messages(self):
"""
Retrieves the commit messages of a pull request.
Returns:
str: A string containing the commit messages of the pull request.
"""
max_tokens = get_settings().get("CONFIG.MAX_COMMITS_TOKENS", None)
try:
commit_messages_list = [commit['message'] for commit in self.mr.commits()._list]
commit_messages_str = "\n".join([f"{i + 1}. {message}" for i, message in enumerate(commit_messages_list)])
except Exception:
commit_messages_str = ""
if max_tokens:
commit_messages_str = clip_tokens(commit_messages_str, max_tokens)
return commit_messages_str

@ -0,0 +1,178 @@
import logging
from collections import Counter
from pathlib import Path
from typing import List
from git import Repo
from pr_agent.config_loader import _find_repository_root, get_settings
from pr_agent.git_providers.git_provider import EDIT_TYPE, FilePatchInfo, GitProvider
class PullRequestMimic:
"""
This class mimics the PullRequest class from the PyGithub library for the LocalGitProvider.
"""
def __init__(self, title: str, diff_files: List[FilePatchInfo]):
self.title = title
self.diff_files = diff_files
class LocalGitProvider(GitProvider):
"""
This class implements the GitProvider interface for local git repositories.
It mimics the PR functionality of the GitProvider interface,
but does not require a hosted git repository.
Instead of providing a PR url, the user provides a local branch path to generate a diff-patch.
For the MVP it only supports the /review and /describe capabilities.
"""
def __init__(self, target_branch_name, incremental=False):
self.repo_path = _find_repository_root()
if self.repo_path is None:
raise ValueError('Could not find repository root')
self.repo = Repo(self.repo_path)
self.head_branch_name = self.repo.head.ref.name
self.target_branch_name = target_branch_name
self._prepare_repo()
self.diff_files = None
self.pr = PullRequestMimic(self.get_pr_title(), self.get_diff_files())
self.description_path = get_settings().get('local.description_path') \
if get_settings().get('local.description_path') is not None else self.repo_path / 'description.md'
self.review_path = get_settings().get('local.review_path') \
if get_settings().get('local.review_path') is not None else self.repo_path / 'review.md'
# inline code comments are not supported for local git repositories
get_settings().pr_reviewer.inline_code_comments = False
def _prepare_repo(self):
"""
Prepare the repository for PR-mimic generation.
"""
logging.debug('Preparing repository for PR-mimic generation...')
if self.repo.is_dirty():
raise ValueError('The repository is not in a clean state. Please commit or stash pending changes.')
if self.target_branch_name not in self.repo.heads:
raise KeyError(f'Branch: {self.target_branch_name} does not exist')
def is_supported(self, capability: str) -> bool:
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments', 'get_labels']:
return False
return True
def get_diff_files(self) -> list[FilePatchInfo]:
diffs = self.repo.head.commit.diff(
self.repo.merge_base(self.repo.head, self.repo.branches[self.target_branch_name]),
create_patch=True,
R=True
)
diff_files = []
for diff_item in diffs:
if diff_item.a_blob is not None:
original_file_content_str = diff_item.a_blob.data_stream.read().decode('utf-8')
else:
original_file_content_str = "" # empty file
if diff_item.b_blob is not None:
new_file_content_str = diff_item.b_blob.data_stream.read().decode('utf-8')
else:
new_file_content_str = "" # empty file
edit_type = EDIT_TYPE.MODIFIED
if diff_item.new_file:
edit_type = EDIT_TYPE.ADDED
elif diff_item.deleted_file:
edit_type = EDIT_TYPE.DELETED
elif diff_item.renamed_file:
edit_type = EDIT_TYPE.RENAMED
diff_files.append(
FilePatchInfo(original_file_content_str,
new_file_content_str,
diff_item.diff.decode('utf-8'),
diff_item.b_path,
edit_type=edit_type,
old_filename=None if diff_item.a_path == diff_item.b_path else diff_item.a_path
)
)
self.diff_files = diff_files
return diff_files
def get_files(self) -> List[str]:
"""
Returns a list of files with changes in the diff.
"""
diff_index = self.repo.head.commit.diff(
self.repo.merge_base(self.repo.head, self.repo.branches[self.target_branch_name]),
R=True
)
# Get the list of changed files
diff_files = [item.a_path for item in diff_index]
return diff_files
def publish_description(self, pr_title: str, pr_body: str):
with open(self.description_path, "w") as file:
# Write the string to the file
file.write(pr_title + '\n' + pr_body)
def publish_comment(self, pr_comment: str, is_temporary: bool = False):
with open(self.review_path, "w") as file:
# Write the string to the file
file.write(pr_comment)
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError('Publishing inline comments is not implemented for the local git provider')
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError('Creating inline comments is not implemented for the local git provider')
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError('Publishing inline comments is not implemented for the local git provider')
def publish_code_suggestion(self, body: str, relevant_file: str,
relevant_lines_start: int, relevant_lines_end: int):
raise NotImplementedError('Publishing code suggestions is not implemented for the local git provider')
def publish_code_suggestions(self, code_suggestions: list):
raise NotImplementedError('Publishing code suggestions is not implemented for the local git provider')
def publish_labels(self, labels):
pass # Not applicable to the local git provider, but required by the interface
def remove_initial_comment(self):
pass # Not applicable to the local git provider, but required by the interface
def get_languages(self):
"""
Calculate percentage of languages in repository. Used for hunk prioritisation.
"""
# Get all files in repository
filepaths = [Path(item.path) for item in self.repo.tree().traverse() if item.type == 'blob']
# Identify language by file extension and count
lang_count = Counter(ext.lstrip('.') for filepath in filepaths for ext in [filepath.suffix.lower()])
# Convert counts to percentages
total_files = len(filepaths)
lang_percentage = {lang: count / total_files * 100 for lang, count in lang_count.items()}
return lang_percentage
def get_pr_branch(self):
return self.repo.head
def get_user_id(self):
return -1 # Not used anywhere for the local provider, but required by the interface
def get_pr_description(self):
commits_diff = list(self.repo.iter_commits(self.target_branch_name + '..HEAD'))
# Get the commit messages and concatenate
commit_messages = " ".join([commit.message for commit in commits_diff])
# TODO Handle the description better - maybe use gpt-3.5 summarisation here?
return commit_messages[:200] # Use max 200 characters
def get_pr_title(self):
"""
Substitutes the branch-name as the PR-mimic title.
"""
return self.head_branch_name
def get_issue_comments(self):
raise NotImplementedError('Getting issue comments is not implemented for the local git provider')
def get_labels(self):
raise NotImplementedError('Getting labels is not implemented for the local git provider')
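
Note on `get_languages` above: the percentage calculation can be tried out without a Git repository. The following is a minimal sketch, assuming a plain list of path strings stands in for the blobs returned by `self.repo.tree().traverse()`; the function name `language_percentages` is hypothetical and not part of the PR.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable


def language_percentages(paths: Iterable[str]) -> Dict[str, float]:
    """Map each file extension (without the dot) to its share of files, in percent."""
    filepaths = [Path(p) for p in paths]
    # Identify the language by lower-cased file extension and count occurrences
    counts = Counter(fp.suffix.lower().lstrip('.') for fp in filepaths)
    total_files = len(filepaths)
    # Convert counts to percentages of all files
    return {ext: count / total_files * 100 for ext, count in counts.items()}


shares = language_percentages(["a.py", "b.PY", "src/c.js", "Makefile"])
print(shares)
```

As in the provider code, files without an extension (such as `Makefile`) are counted under the empty-string key, so they still contribute to the total.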

@ -3,7 +3,8 @@ import json
import os
from pr_agent.agent.pr_agent import PRAgent
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.tools.pr_reviewer import PRReviewer
@ -14,6 +15,8 @@ async def run_action():
OPENAI_KEY = os.environ.get('OPENAI_KEY')
OPENAI_ORG = os.environ.get('OPENAI_ORG')
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN')
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
# Check if required environment variables are set
if not GITHUB_EVENT_NAME:
@ -30,11 +33,11 @@ async def run_action():
return
# Set the environment variables in the settings
settings.set("OPENAI.KEY", OPENAI_KEY)
get_settings().set("OPENAI.KEY", OPENAI_KEY)
if OPENAI_ORG:
settings.set("OPENAI.ORG", OPENAI_ORG)
settings.set("GITHUB.USER_TOKEN", GITHUB_TOKEN)
settings.set("GITHUB.DEPLOYMENT_TYPE", "user")
get_settings().set("OPENAI.ORG", OPENAI_ORG)
get_settings().set("GITHUB.USER_TOKEN", GITHUB_TOKEN)
get_settings().set("GITHUB.DEPLOYMENT_TYPE", "user")
# Load the event payload
try:
@ -50,7 +53,7 @@ async def run_action():
if action in ["opened", "reopened"]:
pr_url = event_payload.get("pull_request", {}).get("url")
if pr_url:
await PRReviewer(pr_url).review()
await PRReviewer(pr_url).run()
# Handle issue comment event
elif GITHUB_EVENT_NAME == "issue_comment":
@ -61,7 +64,9 @@ async def run_action():
pr_url = event_payload.get("issue", {}).get("pull_request", {}).get("url")
if pr_url:
body = comment_body.strip().lower()
await PRAgent().handle_request(pr_url, body)
comment_id = event_payload.get("comment", {}).get("id")
provider = get_git_provider()(pr_url=pr_url)
await PRAgent().handle_request(pr_url, body, notify=lambda: provider.add_eyes_reaction(comment_id))
if __name__ == '__main__':

@ -1,11 +1,17 @@
import copy
import logging
import sys
from typing import Any, Dict
import uvicorn
from fastapi import APIRouter, FastAPI, HTTPException, Request, Response
from starlette.middleware import Middleware
from starlette_context import context
from starlette_context.middleware import RawContextMiddleware
from pr_agent.agent.pr_agent import PRAgent
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings, global_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.servers.utils import verify_signature
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
@ -14,7 +20,29 @@ router = APIRouter()
@router.post("/api/v1/github_webhooks")
async def handle_github_webhooks(request: Request, response: Response):
logging.debug("Received a github webhook")
"""
Receives and processes incoming GitHub webhook requests.
Verifies the request signature, parses the request body, and passes it to the handle_request function for further
processing.
"""
logging.debug("Received a GitHub webhook")
body = await get_body(request)
logging.debug(f'Request body:\n{body}')
installation_id = body.get("installation", {}).get("id")
context["installation_id"] = installation_id
context["settings"] = copy.deepcopy(global_settings)
return await handle_request(body)
@router.post("/api/v1/marketplace_webhooks")
async def handle_marketplace_webhooks(request: Request, response: Response):
body = await get_body(request)
logging.info(f'Request body:\n{body}')
async def get_body(request):
try:
body = await request.json()
except Exception as e:
@ -22,43 +50,52 @@ async def handle_github_webhooks(request: Request, response: Response):
raise HTTPException(status_code=400, detail="Error parsing request body") from e
body_bytes = await request.body()
signature_header = request.headers.get('x-hub-signature-256', None)
try:
webhook_secret = settings.github.webhook_secret
except AttributeError:
webhook_secret = None
webhook_secret = getattr(get_settings().github, 'webhook_secret', None)
if webhook_secret:
verify_signature(body_bytes, webhook_secret, signature_header)
logging.debug(f'Request body:\n{body}')
return await handle_request(body)
return body
async def handle_request(body):
action = body.get("action", None)
installation_id = body.get("installation", {}).get("id", None)
settings.set("GITHUB.INSTALLATION_ID", installation_id)
async def handle_request(body: Dict[str, Any]):
"""
Handle incoming GitHub webhook requests.
Args:
body: The request body.
"""
action = body.get("action")
if not action:
return {}
agent = PRAgent()
if action == 'created':
if "comment" not in body:
return {}
comment_body = body.get("comment", {}).get("body", None)
if 'sender' in body and 'login' in body['sender'] and 'bot' in body['sender']['login']:
comment_body = body.get("comment", {}).get("body")
sender = body.get("sender", {}).get("login")
if sender and 'bot' in sender:
return {}
if "issue" not in body and "pull_request" not in body["issue"]:
if "issue" not in body or "pull_request" not in body["issue"]:
return {}
pull_request = body["issue"]["pull_request"]
api_url = pull_request.get("url", None)
await agent.handle_request(api_url, comment_body)
api_url = pull_request.get("url")
comment_id = body.get("comment", {}).get("id")
provider = get_git_provider()(pr_url=api_url)
await agent.handle_request(api_url, comment_body, notify=lambda: provider.add_eyes_reaction(comment_id))
elif action in ["opened"] or 'reopened' in action:
pull_request = body.get("pull_request", None)
elif action == "opened" or 'reopened' in action:
pull_request = body.get("pull_request")
if not pull_request:
return {}
api_url = pull_request.get("url", None)
if api_url is None:
api_url = pull_request.get("url")
if not api_url:
return {}
await agent.handle_request(api_url, "/review")
else:
return {}
return {}
@router.get("/")
@ -68,12 +105,14 @@ async def root():
def start():
# Override the deployment type to app
settings.set("GITHUB.DEPLOYMENT_TYPE", "app")
app = FastAPI()
get_settings().set("GITHUB.DEPLOYMENT_TYPE", "app")
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
middleware = [Middleware(RawContextMiddleware)]
app = FastAPI(middleware=middleware)
app.include_router(router)
uvicorn.run(app, host="0.0.0.0", port=3000)
if __name__ == '__main__':
start()
start()
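
Note on `handle_request` above: the branching can be exercised in isolation from FastAPI and `PRAgent`. The sketch below mirrors the new-side conditions of the diff as a pure function; `route_event` and its return convention (a `(api_url, command)` tuple, or `None` when the event is ignored) are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Optional, Tuple


def route_event(body: Dict[str, Any]) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (api_url, command) for events the handler acts on, otherwise None."""
    action = body.get("action")
    if not action:
        return None
    if action == "created":
        if "comment" not in body:
            return None
        sender = body.get("sender", {}).get("login")
        if sender and "bot" in sender:
            return None  # ignore comments posted by bot accounts
        if "issue" not in body or "pull_request" not in body["issue"]:
            return None  # the comment is on a plain issue, not a pull request
        api_url = body["issue"]["pull_request"].get("url")
        return api_url, body.get("comment", {}).get("body")
    if action == "opened" or "reopened" in action:
        pull_request = body.get("pull_request")
        if not pull_request:
            return None
        api_url = pull_request.get("url")
        # A newly opened or reopened PR triggers an automatic review
        return (api_url, "/review") if api_url else None
    return None


opened = route_event({"action": "opened", "pull_request": {"url": "https://example.invalid/pr/1"}})
print(opened)
```

The `or` in the issue check matters: with the previous `and`, a payload without an `issue` key would fall through to `body["issue"]` and raise `KeyError`.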

@ -6,7 +6,7 @@ from datetime import datetime, timezone
import aiohttp
from pr_agent.agent.pr_agent import PRAgent
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.servers.help import bot_help_text
@ -15,28 +15,41 @@ NOTIFICATION_URL = "https://api.github.com/notifications"
def now() -> str:
"""
Get the current UTC time in ISO 8601 format.
Returns:
str: The current UTC time in ISO 8601 format.
"""
now_utc = datetime.now(timezone.utc).isoformat()
now_utc = now_utc.replace("+00:00", "Z")
return now_utc
async def polling_loop():
"""
Polls for notifications and handles them accordingly.
"""
handled_ids = set()
since = [now()]
last_modified = [None]
git_provider = get_git_provider()()
user_id = git_provider.get_user_id()
agent = PRAgent()
get_settings().set("CONFIG.PUBLISH_OUTPUT_PROGRESS", False)
try:
deployment_type = settings.github.deployment_type
token = settings.github.user_token
deployment_type = get_settings().github.deployment_type
token = get_settings().github.user_token
except AttributeError:
deployment_type = 'none'
token = None
if deployment_type != 'user':
raise ValueError("Deployment mode must be set to 'user' to get notifications")
if not token:
raise ValueError("User token must be set to get notifications")
async with aiohttp.ClientSession() as session:
while True:
try:
@ -52,6 +65,7 @@ async def polling_loop():
params["since"] = since[0]
if last_modified[0]:
headers["If-Modified-Since"] = last_modified[0]
async with session.get(NOTIFICATION_URL, headers=headers, params=params) as response:
if response.status == 200:
if 'Last-Modified' in response.headers:
@ -85,8 +99,10 @@ async def polling_loop():
if user_tag not in comment_body:
continue
rest_of_comment = comment_body.split(user_tag)[1].strip()
success = await agent.handle_request(pr_url, rest_of_comment)
comment_id = comment['id']
git_provider.set_pr(pr_url)
success = await agent.handle_request(pr_url, rest_of_comment,
notify=lambda: git_provider.add_eyes_reaction(comment_id)) # noqa E501
if not success:
git_provider.set_pr(pr_url)
git_provider.publish_comment("### How to use PR-Agent\n" +
@ -100,4 +116,4 @@ async def polling_loop():
if __name__ == '__main__':
asyncio.run(polling_loop())
asyncio.run(polling_loop())
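
Note on `now()` above: the `since` value sent to the notifications endpoint is an ISO 8601 UTC timestamp with a `Z` suffix. The sketch below shows the same conversion on a fixed `datetime` so the result is reproducible; the helper name `to_utc_z` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc_z(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC, using 'Z' instead of '+00:00'."""
    as_utc = moment.astimezone(timezone.utc).isoformat()
    return as_utc.replace("+00:00", "Z")


stamp = to_utc_z(datetime(2023, 8, 10, 13, 41, 50, tzinfo=timezone.utc))
print(stamp)  # 2023-08-10T13:41:50Z
```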

@ -7,7 +7,7 @@ from fastapi.responses import JSONResponse
from starlette.background import BackgroundTasks
from pr_agent.agent.pr_agent import PRAgent
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
app = FastAPI()
router = APIRouter()
@ -29,13 +29,13 @@ async def gitlab_webhook(background_tasks: BackgroundTasks, request: Request):
return JSONResponse(status_code=status.HTTP_200_OK, content=jsonable_encoder({"message": "success"}))
def start():
gitlab_url = settings.get("GITLAB.URL", None)
gitlab_url = get_settings().get("GITLAB.URL", None)
if not gitlab_url:
raise ValueError("GITLAB.URL is not set")
gitlab_token = settings.get("GITLAB.PERSONAL_ACCESS_TOKEN", None)
gitlab_token = get_settings().get("GITLAB.PERSONAL_ACCESS_TOKEN", None)
if not gitlab_token:
raise ValueError("GITLAB.PERSONAL_ACCESS_TOKEN is not set")
settings.config.git_provider = "gitlab"
get_settings().config.git_provider = "gitlab"
app = FastAPI()
app.include_router(router)

@ -1,9 +1,11 @@
commands_text = "> **/review [-i]**: Request a review of your Pull Request. For an incremental review, which only " \
"considers changes since the last review, include the '-i' option.\n" \
"> **/describe**: Modify the PR title and description based on the contents of the PR.\n" \
"> **/improve**: Suggest improvements to the code in the PR. " \
"These will be provided as pull request comments, ready to commit.\n" \
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n"
"> **/improve**: Suggest improvements to the code in the PR. \n" \
"> **/ask \\<QUESTION\\>**: Pose a question about the PR.\n\n" \
">To edit any configuration parameter from 'configuration.toml', add --config_path=new_value\n" \
">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" \n" \
">To list the possible configuration parameters, use the **/config** command.\n" \
def bot_help_text(user: str):

@ -21,3 +21,7 @@ def verify_signature(payload_body, secret_token, signature_header):
if not hmac.compare_digest(expected_signature, signature_header):
raise HTTPException(status_code=403, detail="Request signatures didn't match!")
class RateLimitExceeded(Exception):
"""Raised when the git provider API rate limit has been exceeded."""
pass
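
Note on `verify_signature` above: only the final `hmac.compare_digest` check is visible in this hunk. The sketch below shows how the expected value is commonly derived for GitHub's `x-hub-signature-256` header (HMAC-SHA256 of the raw body, hex-encoded, prefixed with `sha256=`). It assumes that scheme and returns a boolean instead of raising `HTTPException`; the name `signature_matches` is hypothetical and this is not necessarily how the elided part of the file is written.

```python
import hashlib
import hmac
from typing import Optional


def signature_matches(payload_body: bytes, secret_token: str, signature_header: Optional[str]) -> bool:
    """Check a webhook body against its 'sha256=<hexdigest>' signature header."""
    if not signature_header:
        return False  # a missing header can never match
    digest = hmac.new(secret_token.encode("utf-8"), msg=payload_body, digestmod=hashlib.sha256).hexdigest()
    expected_signature = "sha256=" + digest
    # compare_digest runs in constant time, which avoids leaking how many characters matched
    return hmac.compare_digest(expected_signature, signature_header)


secret = "s3cret"
payload = b'{"action": "opened"}'
header = "sha256=" + hmac.new(secret.encode("utf-8"), msg=payload, digestmod=hashlib.sha256).hexdigest()
print(signature_matches(payload, secret, header))
```

The signature has to be computed over the raw request bytes (`await request.body()` in the server code), not over the re-serialized JSON, since re-serialization may change whitespace or key order.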

@ -7,17 +7,26 @@
# See README for details about GitHub App deployment.
[openai]
key = "<API_KEY>" # Acquire through https://platform.openai.com
org = "<ORGANIZATION>" # Optional, may be commented out.
key = "" # Acquire through https://platform.openai.com
#org = "<ORGANIZATION>" # Optional, may be commented out.
# Uncomment the following for Azure OpenAI
#api_type = "azure"
#api_version = '2023-05-15' # Check Azure documentation for the current API version
#api_base = "<API_BASE>" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
#deployment_id = "<DEPLOYMENT_ID>" # The deployment name you chose when you deployed the engine
#api_base = "" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
#deployment_id = "" # The deployment name you chose when you deployed the engine
[anthropic]
key = "" # Optional, uncomment if you want to use Anthropic. Acquire through https://www.anthropic.com/
[cohere]
key = "" # Optional, uncomment if you want to use Cohere. Acquire through https://dashboard.cohere.ai/
[replicate]
key = "" # Optional, uncomment if you want to use Replicate. Acquire through https://replicate.com/
[github]
# ---- Set the following only for deployment type == "user"
user_token = "<TOKEN>" # A GitHub personal access token with 'repo' scope.
user_token = "" # A GitHub personal access token with 'repo' scope.
deployment_type = "user" #set to user by default
# ---- Set the following only for deployment type == "app", see README for details.
private_key = """\

@ -1,32 +1,46 @@
[config]
model="gpt-4"
fallback-models=["gpt-3.5-turbo-16k", "gpt-3.5-turbo"]
fallback_models=["gpt-3.5-turbo-16k"]
git_provider="github"
publish_output=true
publish_output_progress=true
verbosity_level=0 # 0,1,2
use_extra_bad_extensions=false
use_repo_settings_file=true
ai_timeout=180
max_description_tokens = 500
max_commits_tokens = 500
[pr_reviewer]
[pr_reviewer] # /review #
require_focused_review=true
require_score_review=false
require_tests_review=true
require_security_review=true
num_code_suggestions=0
inline_code_comments = true
num_code_suggestions=3
inline_code_comments = false
ask_and_reflect=false
extra_instructions = ""
[pr_description]
[pr_description] # /describe #
publish_description_as_comment=false
extra_instructions = ""
[pr_questions]
[pr_questions] # /ask #
[pr_code_suggestions]
[pr_code_suggestions] # /improve #
num_code_suggestions=4
extra_instructions = ""
[pr_update_changelog] # /update_changelog #
push_changelog_changes=false
extra_instructions = ""
[pr_config] # /config #
[github]
# The type of deployment to create. Valid values are 'app' or 'user'.
deployment_type = "user"
ratelimit_retries = 5
[gitlab]
# URL to the gitlab service
@ -40,3 +54,8 @@ magic_word = "AutoReview"
# Polling interval
polling_interval_seconds = 30
[local]
# LocalGitProvider settings - uncomment to use paths other than default
# description_path= "path/to/description.md"
# review_path= "path/to/review.md"
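
Note on the `[local]` section above: when `description_path` or `review_path` is left commented out, `LocalGitProvider` falls back to a file in the repository root. A minimal sketch of that fallback, assuming a plain dict in place of `get_settings()` and wrapping the configured value in `Path` (the provider itself keeps the raw setting); `resolve_output_path` is a hypothetical helper.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_output_path(settings: Mapping[str, Optional[str]], key: str,
                        repo_path: Path, default_name: str) -> Path:
    """Use the configured path if present, otherwise <repo_path>/<default_name>."""
    configured = settings.get(key)
    return Path(configured) if configured is not None else repo_path / default_name


repo = Path("/tmp/my-repo")
default_review = resolve_output_path({}, "local.review_path", repo, "review.md")
custom_review = resolve_output_path({"local.review_path": "out/review.md"},
                                    "local.review_path", repo, "review.md")
print(default_review, custom_review)
```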

@ -9,6 +9,12 @@ Your task is to provide meaningful non-trivial code suggestions to improve the
- Make sure not to provide suggestions repeating modifications already implemented in the new PR code (the '+' lines).
- Don't output line numbers in the 'improved code' snippets.
{%- if extra_instructions %}
Extra instructions from the user:
{{ extra_instructions }}
{% endif %}
You must use the following JSON schema to format your answer:
```json
{
@ -67,6 +73,11 @@ Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
The PR Diff:

@ -2,36 +2,75 @@
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
Your task is to provide full description of the PR content.
- Make sure to focus on the new PR code (the '+' lines).
- Notice that the 'Previous title', 'Previous description' and 'Commit messages' sections may be partial, simplistic, non-informative or not up-to-date. Hence, compare them to the PR diff code, and use them only as a reference.
You must use the following JSON schema to format your answer:
```json
{
"PR Title": {
"type": "string",
"description": "an informative title for the PR, describing its main theme"
},
"PR Type": {
"type": "string",
"description": possible values are: ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
},
"PR Description": {
"type": "string",
"description": "an informative and concise description of the PR"
},
"PR Main Files Walkthrough": {
"type": "string",
"description": "a walkthrough of the PR changes. Review main files, in bullet points, and shortly describe the changes in each file (up to 10 most important files). Format: -`filename`: description of changes\n..."
}
}
{%- if extra_instructions %}
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
Extra instructions from the user:
{{ extra_instructions }}
{% endif %}
You must use the following YAML schema to format your answer:
```yaml
PR Title:
type: string
description: an informative title for the PR, describing its main theme
PR Type:
type: array
items:
type: string
enum:
- Bug fix
- Tests
- Bug fix with tests
- Refactoring
- Enhancement
- Documentation
- Other
PR Description:
type: string
description: an informative and concise description of the PR
PR Main Files Walkthrough:
type: array
maxItems: 10
description: >-
a walkthrough of the PR changes. Review main files, and shortly describe the changes in each file (up to 10 most important files).
items:
filename:
type: string
description: the relevant file full path
changes in file:
type: string
description: minimal and concise description of the changes in the relevant file
Example output:
```yaml
PR Title: ...
PR Type:
- Bug fix
PR Description: ...
PR Main Files Walkthrough:
- ...
- ...
```
Make sure to output a valid YAML. Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
"""
user="""PR Info:
Previous title: '{{title}}'
Previous description: '{{description}}'
Branch: '{{branch}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
The PR Git Diff:
@ -40,6 +79,6 @@ The PR Git Diff:
```
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
Response (should be a valid JSON, and nothing else):
```json
Response (should be a valid YAML, and nothing else):
```yaml
"""

@ -21,6 +21,11 @@ Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
The PR Git Diff:

@ -13,6 +13,11 @@ Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
The PR Git Diff:

@ -1,124 +1,134 @@
[pr_review_prompt]
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines).
- Provide up to {{ num_code_suggestions }} code suggestions.
{%- if num_code_suggestions > 0 %}
- Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
- Provide up to {{ num_code_suggestions }} code suggestions.
- Try to focus on the most important suggestions, like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
- Suggestions should focus on improving the new added code lines.
- Make sure not to provide suggestions repeating modifications already implemented in the new PR code (the '+' lines).
{%- endif %}
You must use the following JSON schema to format your answer:
```json
{
"PR Analysis": {
"Main theme": {
"type": "string",
"description": "a short explanation of the PR"
},
"Type of PR": {
"type": "string",
"enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
},
{%- if extra_instructions %}
Extra instructions from the user:
{{ extra_instructions }}
{% endif %}
You must use the following YAML schema to format your answer:
```yaml
PR Analysis:
Main theme:
type: string
description: a short explanation of the PR
Type of PR:
type: string
enum:
- Bug fix
- Tests
- Refactoring
- Enhancement
- Documentation
- Other
{%- if require_score %}
"Score": {
"type": "int",
"description": "Rate this PR on a scale of 0-100 (inclusive), where 0 means the worst possible PR code, and 100 means PR code of the highest quality, without any bugs or performance issues, that is ready to be merged immediately and run in production at scale."
},
Score:
type: int
description: >-
Rate this PR on a scale of 0-100 (inclusive), where 0 means the worst
possible PR code, and 100 means PR code of the highest quality, without
any bugs or performance issues, that is ready to be merged immediately and
run in production at scale.
{%- endif %}
{%- if require_tests %}
"Relevant tests added": {
"type": "string",
"description": "yes\\no question: does this PR have relevant tests ?"
},
Relevant tests added:
type: string
description: yes\\no question: does this PR have relevant tests ?
{%- endif %}
{%- if question_str %}
"Insights from user's answer": {
"type": "string",
"description": "shortly summarize the insights you gained from the user's answers to the questions"
},
Insights from user's answer:
type: string
description: >-
shortly summarize the insights you gained from the user's answers to the questions
{%- endif %}
{%- if require_focused %}
"Focused PR": {
"type": "string",
"description": "Is this a focused PR, in the sense that it has a clear and coherent title and description, and all PR code diff changes are properly derived from the title and description? Explain your response."
}
},
Focused PR:
type: string
description: >-
Is this a focused PR, in the sense that all the PR code diff changes are
united under a single focused theme ? If the theme is too broad, or the PR
code diff changes are too scattered, then the PR is not focused. Explain
your answer shortly.
{%- endif %}
"PR Feedback": {
"General PR suggestions": {
"type": "string",
"description": "General suggestions and feedback for the contributors and maintainers of this PR. May include important suggestions for the overall structure, primary purpose, best practices, critical bugs, and other aspects of the PR. Explain your suggestions."
},
PR Feedback:
General suggestions:
type: string
description: >-
General suggestions and feedback for the contributors and maintainers of
this PR. May include important suggestions for the overall structure,
primary purpose, best practices, critical bugs, and other aspects of the
PR. Don't address PR title and description, or lack of tests. Explain your
suggestions.
{%- if num_code_suggestions > 0 %}
"Code suggestions": {
"type": "array",
"maxItems": {{ num_code_suggestions }},
"uniqueItems": true,
"items": {
"relevant file": {
"type": "string",
"description": "the relevant file full path"
},
"suggestion content": {
"type": "string",
"description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
},
"relevant line in file": {
"type": "string",
"description": "an authentic single code line from the PR git diff section, to which the suggestion applies."
}
}
},
Code feedback:
type: array
maxItems: {{ num_code_suggestions }}
uniqueItems: true
items:
relevant file:
type: string
description: the relevant file full path
suggestion:
type: string
description: >-
a concrete suggestion for meaningfully improving the new PR code. Also
describe how, specifically, the suggestion can be applied to new PR
code. Add tags with importance measure that matches each suggestion
('important' or 'medium'). Do not make suggestions for updating or
adding docstrings, renaming PR title and description, or linter like.
relevant line:
type: string
description: >-
a single code line taken from the relevant file, to which the
suggestion applies. The line should be a '+' line. Make sure to output
the line exactly as it appears in the relevant file
{%- endif %}
{%- if require_security %}
"Security concerns": {
"type": "string",
"description": "yes\\no question: does this PR code introduce possible security concerns or issues, like SQL injection, XSS, CSRF, and others ? explain your answer"
? explain your answer"
}
Security concerns:
type: string
description: >-
yes\\no question: does this PR code introduce possible security concerns or
issues, like SQL injection, XSS, CSRF, and others ? If answered 'yes', explain your answer shortly
{%- endif %}
}
}
```
Example output:
'
{
"PR Analysis":
{
"Main theme": "xxx",
"Type of PR": "Bug fix",
```yaml
PR Analysis:
Main theme: xxx
Type of PR: Bug fix
{%- if require_score %}
"Score": 89,
{%- endif %}
{%- if require_tests %}
"Relevant tests added": "No",
Score: 89
{%- endif %}
Relevant tests added: No
{%- if require_focused %}
"Focused PR": "yes\\no, because ..."
Focused PR: no, because ...
{%- endif %}
},
"PR Feedback":
{
"General PR suggestions": "..., `xxx`...",
PR Feedback:
General PR suggestions: ...
{%- if num_code_suggestions > 0 %}
"Code suggestions": [
{
"relevant file": "directory/xxx.py",
"suggestion content": "xxx [important]",
"relevant line in file": "xxx",
},
...
]
Code feedback:
- relevant file: |-
directory/xxx.py
suggestion: xxx [important]
relevant line: |-
xxx
...
{%- endif %}
{%- if require_security %}
"Security concerns": "No, because ..."
Security concerns: No
{%- endif %}
}
}
'
```
Make sure to output a valid YAML. Use multi-line block scalar ('|') if needed.
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
"""
@ -129,6 +139,11 @@ Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
{%- if question_str %}
######
@ -147,6 +162,6 @@ The PR Git Diff:
```
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
Response (should be a valid JSON, and nothing else):
```json
Response (should be a valid YAML, and nothing else):
```yaml
"""

@ -0,0 +1,45 @@
[pr_update_changelog_prompt]
system="""You are a language model called CodiumAI-PR-Changlog-summarizer.
Your task is to update the CHANGELOG.md file of the project, to shortly summarize important changes introduced in this PR (the '+' lines).
- The output should match the existing CHANGELOG.md format, style and conventions, so it will look like a natural part of the file. For example, if previous changes were summarized in a single line, you should do the same.
- Don't repeat previous changes. Generate only new content, that is not already in the CHANGELOG.md file.
- Be general, and avoid specific details, files, etc. The output should be minimal, no more than 3-4 short lines. Ignore non-relevant subsections.
{%- if extra_instructions %}
Extra instructions from the user:
{{ extra_instructions }}
{%- endif %}
"""
user="""PR Info:
Title: '{{title}}'
Branch: '{{branch}}'
Description: '{{description}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
{%- if commit_messages_str %}
Commit messages:
{{commit_messages_str}}
{%- endif %}
The PR Diff:
```
{{diff}}
```
Current date:
```
{{today}}
```
The current CHANGELOG.md:
```
{{ changelog_file_str }}
```
Response:
"""

@ -9,18 +9,19 @@ from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import try_fix_json
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import BitbucketProvider, get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
class PRCodeSuggestions:
def __init__(self, pr_url: str, cli_mode=False):
def __init__(self, pr_url: str, cli_mode=False, args: list = None):
self.git_provider = get_git_provider()(pr_url)
self.main_language = get_main_pr_language(
self.git_provider.get_languages(), self.git_provider.get_files()
)
self.ai_handler = AiHandler()
self.patches_diff = None
self.prediction = None
@ -31,23 +32,25 @@ class PRCodeSuggestions:
"description": self.git_provider.get_pr_description(),
"language": self.main_language,
"diff": "", # empty diff for initial calculation
'num_code_suggestions': settings.pr_code_suggestions.num_code_suggestions,
"num_code_suggestions": get_settings().pr_code_suggestions.num_code_suggestions,
"extra_instructions": get_settings().pr_code_suggestions.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
settings.pr_code_suggestions_prompt.system,
settings.pr_code_suggestions_prompt.user)
get_settings().pr_code_suggestions_prompt.system,
get_settings().pr_code_suggestions_prompt.user)
async def suggest(self):
async def run(self):
assert type(self.git_provider) != BitbucketProvider, "Bitbucket is not supported for now"
logging.info('Generating code suggestions for PR...')
if settings.config.publish_output:
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
logging.info('Preparing PR review...')
data = self._prepare_pr_code_suggestions()
if settings.config.publish_output:
if get_settings().config.publish_output:
logging.info('Pushing PR review...')
self.git_provider.remove_initial_comment()
logging.info('Pushing inline code comments...')
@@ -55,12 +58,12 @@ class PRCodeSuggestions:
async def _prepare_prediction(self, model: str):
logging.info('Getting PR diff...')
# we are using extended hunk with line numbers for code suggestions
self.patches_diff = get_pr_diff(self.git_provider,
self.token_handler,
model,
add_line_numbers_to_hunks=True,
disable_extra_lines=True)
logging.info('Getting AI prediction...')
self.prediction = await self._get_prediction(model)
@@ -68,9 +71,9 @@ class PRCodeSuggestions:
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_code_suggestions_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_code_suggestions_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
system_prompt = environment.from_string(get_settings().pr_code_suggestions_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_code_suggestions_prompt.user).render(variables)
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
@@ -83,7 +86,7 @@ class PRCodeSuggestions:
try:
data = json.loads(review)
except json.decoder.JSONDecodeError:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Could not parse json response: {review}")
data = try_fix_json(review, code_suggestions=True)
return data
@@ -91,22 +94,28 @@ class PRCodeSuggestions:
def push_inline_code_suggestions(self, data):
code_suggestions = []
for d in data['Code suggestions']:
if settings.config.verbosity_level >= 2:
logging.info(f"suggestion: {d}")
relevant_file = d['relevant file'].strip()
relevant_lines_str = d['relevant lines'].strip()
relevant_lines_start = int(relevant_lines_str.split('-')[0]) # absolute position
relevant_lines_end = int(relevant_lines_str.split('-')[-1])
content = d['suggestion content']
new_code_snippet = d['improved code']
try:
if get_settings().config.verbosity_level >= 2:
logging.info(f"suggestion: {d}")
relevant_file = d['relevant file'].strip()
relevant_lines_str = d['relevant lines'].strip()
if ',' in relevant_lines_str: # handling 'relevant lines': '181, 190' or '178-184, 188-194'
relevant_lines_str = relevant_lines_str.split(',')[0]
relevant_lines_start = int(relevant_lines_str.split('-')[0]) # absolute position
relevant_lines_end = int(relevant_lines_str.split('-')[-1])
content = d['suggestion content']
new_code_snippet = d['improved code']
if new_code_snippet:
new_code_snippet = self.dedent_code(relevant_file, relevant_lines_start, new_code_snippet)
if new_code_snippet:
new_code_snippet = self.dedent_code(relevant_file, relevant_lines_start, new_code_snippet)
body = f"**Suggestion:** {content}\n```suggestion\n" + new_code_snippet + "\n```"
code_suggestions.append({'body': body,'relevant_file': relevant_file,
'relevant_lines_start': relevant_lines_start,
'relevant_lines_end': relevant_lines_end})
body = f"**Suggestion:** {content}\n```suggestion\n" + new_code_snippet + "\n```"
code_suggestions.append({'body': body, 'relevant_file': relevant_file,
'relevant_lines_start': relevant_lines_start,
'relevant_lines_end': relevant_lines_end})
except Exception:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Could not parse suggestion: {d}")
self.git_provider.publish_code_suggestions(code_suggestions)
@@ -127,7 +136,8 @@ class PRCodeSuggestions:
if delta_spaces > 0:
new_code_snippet = textwrap.indent(new_code_snippet, delta_spaces * " ").rstrip('\n')
except Exception as e:
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"Could not dedent code snippet for file {relevant_file}, error: {e}")
return new_code_snippet
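
The new branch for comma-separated `relevant lines` values keeps only the first range before splitting on `-`. A minimal standalone sketch of that parsing, assuming the input shapes named in the diff comment; `parse_relevant_lines` is a hypothetical helper for illustration, not a function in pr_agent:

```python
from typing import Tuple


def parse_relevant_lines(relevant_lines_str: str) -> Tuple[int, int]:
    """Return (start, end) for strings such as '12', '178-184' or '178-184, 188-194'.

    Hypothetical helper that mirrors the parsing steps shown in the diff.
    """
    text = relevant_lines_str.strip()
    if ',' in text:
        # Only the first range is kept; any later ranges are dropped.
        text = text.split(',')[0]
    parts = text.split('-')
    start = int(parts[0])   # int() tolerates surrounding whitespace
    end = int(parts[-1])    # equals start when there is no '-'
    return start, end
```

A value that is not numeric at all still raises `ValueError`, which is the case the surrounding `try`/`except` in the diff is there to absorb.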

@@ -0,0 +1,48 @@
import logging
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
class PRConfig:
"""
The PRConfig class is responsible for listing all configuration options available for the user.
"""
def __init__(self, pr_url: str, args=None):
"""
Initialize the PRConfig object with the necessary attributes and objects to comment on a pull request.
Args:
pr_url (str): The URL of the pull request to be reviewed.
args (list, optional): List of arguments passed to the PRConfig class. Defaults to None.
"""
self.git_provider = get_git_provider()(pr_url)
async def run(self):
logging.info('Getting configuration settings...')
logging.info('Preparing configs...')
pr_comment = self._prepare_pr_configs()
if get_settings().config.publish_output:
logging.info('Pushing configs...')
self.git_provider.publish_comment(pr_comment)
self.git_provider.remove_initial_comment()
return ""
def _prepare_pr_configs(self) -> str:
import tomli
with open(get_settings().find_file("configuration.toml"), "rb") as conf_file:
configuration_headers = [header.lower() for header in tomli.load(conf_file).keys()]
relevant_configs = {
header: configs for header, configs in get_settings().to_dict().items()
if header.lower().startswith("pr_") and header.lower() in configuration_headers
}
comment_str = "Possible Configurations:"
for header, configs in relevant_configs.items():
if configs:
comment_str += "\n"
for key, value in configs.items():
comment_str += f"\n{header.lower()}.{key.lower()} = {repr(value) if isinstance(value, str) else value}"
comment_str += " "
if get_settings().config.verbosity_level >= 2:
logging.info(f"comment_str:\n{comment_str}")
return comment_str
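
The filtering and formatting in `_prepare_pr_configs` can be tried without Dynaconf or a `configuration.toml` on disk. The sketch below is an approximation under stated assumptions: a plain nested dict stands in for `get_settings().to_dict()`, a list of names stands in for the TOML headers, and `format_configs` is a hypothetical name. It also omits the trailing spaces the original appends after each entry.

```python
from typing import Any, Dict, Iterable


def format_configs(settings: Dict[str, Dict[str, Any]], known_headers: Iterable[str]) -> str:
    """Build the comment text from a plain dict of sections (stand-in for the real settings)."""
    known = {h.lower() for h in known_headers}
    # Keep only sections that start with "pr_" and are declared in the config file.
    relevant = {
        header: configs for header, configs in settings.items()
        if header.lower().startswith("pr_") and header.lower() in known
    }
    lines = ["Possible Configurations:"]
    for header, configs in relevant.items():
        if not configs:
            continue
        lines.append("")  # blank line before each section
        for key, value in configs.items():
            # Strings are shown with quotes so empty values stay visible.
            shown = repr(value) if isinstance(value, str) else value
            lines.append(f"{header.lower()}.{key.lower()} = {shown}")
    return "\n".join(lines)
```

Sections such as `CONFIG`, or any `pr_*` section that is not declared in the configuration file, are left out of the comment.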

@@ -1,32 +1,34 @@
import copy
import json
import logging
from typing import Tuple, List
from typing import List, Tuple
from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.config_loader import settings
from pr_agent.algo.utils import load_yaml
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
class PRDescription:
def __init__(self, pr_url: str):
def __init__(self, pr_url: str, args: list = None):
"""
Initialize the PRDescription object with the necessary attributes and objects for generating a PR description using an AI model.
Initialize the PRDescription object with the necessary attributes and objects for generating a PR description
using an AI model.
Args:
pr_url (str): The URL of the pull request.
args (list, optional): List of arguments passed to the PRDescription class. Defaults to None.
"""
# Initialize the git provider and main PR language
self.git_provider = get_git_provider()(pr_url)
self.main_pr_language = get_main_pr_language(
self.git_provider.get_languages(), self.git_provider.get_files()
)
# Initialize the AI handler
self.ai_handler = AiHandler()
@@ -37,26 +39,28 @@ class PRDescription:
"description": self.git_provider.get_pr_description(),
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
"extra_instructions": get_settings().pr_description.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages()
}
# Initialize the token handler
self.token_handler = TokenHandler(
self.git_provider.pr,
self.vars,
settings.pr_description_prompt.system,
settings.pr_description_prompt.user,
get_settings().pr_description_prompt.system,
get_settings().pr_description_prompt.user,
)
# Initialize patches_diff and prediction attributes
self.patches_diff = None
self.prediction = None
async def describe(self):
async def run(self):
"""
Generates a PR description using an AI model and publishes it to the PR.
"""
logging.info('Generating a PR description...')
if settings.config.publish_output:
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing pr description...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
@@ -64,9 +68,9 @@ class PRDescription:
logging.info('Preparing answer...')
pr_title, pr_body, pr_types, markdown_text = self._prepare_pr_answer()
if settings.config.publish_output:
if get_settings().config.publish_output:
logging.info('Pushing answer...')
if settings.pr_description.publish_description_as_comment:
if get_settings().pr_description.publish_description_as_comment:
self.git_provider.publish_comment(markdown_text)
else:
self.git_provider.publish_description(pr_title, pr_body)
@@ -112,10 +116,10 @@ class PRDescription:
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_description_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_description_prompt.user).render(variables)
system_prompt = environment.from_string(get_settings().pr_description_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_description_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
@@ -136,37 +140,48 @@ class PRDescription:
- title: a string containing the PR title.
- pr_body: a string containing the PR body in a markdown format.
- pr_types: a list of strings containing the PR types.
- markdown_text: a string containing the AI prediction data in a markdown format.
- markdown_text: a string containing the AI prediction data in a markdown format, used for publishing a comment.
"""
# Load the AI prediction data into a dictionary
data = json.loads(self.prediction)
data = load_yaml(self.prediction.strip())
# Initialization
markdown_text = pr_body = ""
pr_types = []
# Iterate over the dictionary items and append the key and value to 'markdown_text' in a markdown format
markdown_text = ""
for key, value in data.items():
markdown_text += f"## {key}\n\n"
markdown_text += f"{value}\n\n"
# If the 'PR Type' key is present in the dictionary, split its value by comma and assign it to 'pr_types'
if 'PR Type' in data:
pr_types = data['PR Type'].split(',')
if type(data['PR Type']) == list:
pr_types = data['PR Type']
elif type(data['PR Type']) == str:
pr_types = data['PR Type'].split(',')
# Assign the value of the 'PR Title' key to 'title' variable and remove it from the dictionary
title = data.pop('PR Title')
# Iterate over the remaining dictionary items and append the key and value to 'pr_body' in a markdown format,
# except for the items containing the word 'walkthrough'
pr_body = ""
for key, value in data.items():
pr_body += f"{key}:\n"
pr_body += f"## {key}:\n"
if 'walkthrough' in key.lower():
pr_body += f"{value}\n"
# for filename, description in value.items():
for file in value:
filename = file['filename'].replace("'", "`")
description = file['changes in file']
pr_body += f'`{filename}`: {description}\n'
else:
pr_body += f"**{value}**\n\n___\n"
# if the value is a list, join its items by comma
if type(value) == list:
value = ', '.join(v for v in value)
pr_body += f"{value}\n\n___\n"
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"title:\n{title}\n{pr_body}")
return title, pr_body, pr_types, markdown_text
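
With the switch from JSON to YAML, `PR Type` may arrive either as a list or as a comma-separated string, and the walkthrough becomes a list of per-file entries. The sketch below shows both conversions in isolation. The function names are hypothetical, and stripping whitespace around each type is an addition of this sketch; the code in the diff uses a plain `split(',')`.

```python
from typing import Dict, List, Optional, Union


def normalize_pr_types(value: Union[str, List[str], None]) -> List[str]:
    """Accept a YAML list or a comma-separated string and return a clean list."""
    if isinstance(value, list):
        return [str(v).strip() for v in value]
    if isinstance(value, str):
        # Assumption of this sketch: drop empty items and surrounding spaces.
        return [part.strip() for part in value.split(',') if part.strip()]
    return []


def render_walkthrough(files: List[Dict[str, str]]) -> str:
    """Render entries shaped like {'filename': ..., 'changes in file': ...} as markdown lines."""
    body = ""
    for file in files:
        # Single quotes in the model output are turned into backticks, as in the diff.
        filename = file['filename'].replace("'", "`")
        description = file['changes in file']
        body += f"`{filename}`: {description}\n"
    return body
```

Handling both shapes in one place keeps the rest of `_prepare_pr_answer` independent of how the model chose to format the field.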

@@ -6,15 +6,13 @@ from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
class PRInformationFromUser:
def __init__(self, pr_url: str):
def __init__(self, pr_url: str, args: list = None):
self.git_provider = get_git_provider()(pr_url)
self.main_pr_language = get_main_pr_language(
self.git_provider.get_languages(), self.git_provider.get_files()
@@ -26,22 +24,23 @@ class PRInformationFromUser:
"description": self.git_provider.get_pr_description(),
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
settings.pr_information_from_user_prompt.system,
settings.pr_information_from_user_prompt.user)
get_settings().pr_information_from_user_prompt.system,
get_settings().pr_information_from_user_prompt.user)
self.patches_diff = None
self.prediction = None
async def generate_questions(self):
async def run(self):
logging.info('Generating question to the user...')
if settings.config.publish_output:
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing questions...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
logging.info('Preparing questions...')
pr_comment = self._prepare_pr_answer()
if settings.config.publish_output:
if get_settings().config.publish_output:
logging.info('Pushing questions...')
self.git_provider.publish_comment(pr_comment)
self.git_provider.remove_initial_comment()
@@ -57,9 +56,9 @@ class PRInformationFromUser:
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_information_from_user_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_information_from_user_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
system_prompt = environment.from_string(get_settings().pr_information_from_user_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_information_from_user_prompt.user).render(variables)
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
@@ -68,7 +67,7 @@ class PRInformationFromUser:
def _prepare_pr_answer(self) -> str:
model_output = self.prediction.strip()
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"answer_str:\n{model_output}")
answer_str = f"{model_output}\n\n Please respond to the questions above in the following format:\n\n" +\
"\n>/answer\n>1) ...\n>2) ...\n>...\n"

@@ -6,7 +6,7 @@ from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
@@ -27,11 +27,12 @@ class PRQuestions:
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
"questions": self.question_str,
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
settings.pr_questions_prompt.system,
settings.pr_questions_prompt.user)
get_settings().pr_questions_prompt.system,
get_settings().pr_questions_prompt.user)
self.patches_diff = None
self.prediction = None
@@ -42,14 +43,14 @@ class PRQuestions:
question_str = ""
return question_str
async def answer(self):
async def run(self):
logging.info('Answering a PR question...')
if settings.config.publish_output:
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing answer...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
logging.info('Preparing answer...')
pr_comment = self._prepare_pr_answer()
if settings.config.publish_output:
if get_settings().config.publish_output:
logging.info('Pushing answer...')
self.git_provider.publish_comment(pr_comment)
self.git_provider.remove_initial_comment()
@@ -65,9 +66,9 @@ class PRQuestions:
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_questions_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_questions_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
system_prompt = environment.from_string(get_settings().pr_questions_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_questions_prompt.user).render(variables)
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
@@ -77,6 +78,6 @@ class PRQuestions:
def _prepare_pr_answer(self) -> str:
answer_str = f"Question: {self.question_str}\n\n"
answer_str += f"Answer:\n{self.prediction.strip()}\n\n"
if settings.config.verbosity_level >= 2:
if get_settings().config.verbosity_level >= 2:
logging.info(f"answer_str:\n{answer_str}")
return answer_str
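
Across these files the per-tool entry points (`suggest`, `describe`, `generate_questions`, `answer`) are all renamed to `run`, and the constructors gain an `args` parameter. A uniform interface like this makes a table-driven dispatcher possible. The sketch below illustrates that idea only: `BaseTool`, `ReviewTool`, `DescribeTool`, `COMMANDS` and `handle_command` are hypothetical names, and it is not a claim about how pr_agent routes commands.

```python
import asyncio
from typing import Dict, List, Optional, Type


class BaseTool:
    """Hypothetical common shape: a constructor taking (pr_url, args) and an async run()."""

    def __init__(self, pr_url: str, args: Optional[List[str]] = None):
        self.pr_url = pr_url
        self.args = args or []

    async def run(self) -> str:
        raise NotImplementedError


class ReviewTool(BaseTool):
    async def run(self) -> str:
        return f"review {self.pr_url} {self.args}"


class DescribeTool(BaseTool):
    async def run(self) -> str:
        return f"describe {self.pr_url} {self.args}"


# One lookup table replaces a chain of per-tool method names.
COMMANDS: Dict[str, Type[BaseTool]] = {"review": ReviewTool, "describe": DescribeTool}


async def handle_command(pr_url: str, request: str) -> Optional[str]:
    """Split '/review -i' into a command name and its args, then call run()."""
    parts = request.lstrip("/").split()
    if not parts:
        return None
    name, args = parts[0], parts[1:]
    tool_cls = COMMANDS.get(name)
    if tool_cls is None:
        return None
    return await tool_cls(pr_url, args=args).run()
```

For instance, `asyncio.run(handle_command("https://example.com/pr/1", "/review -i"))` constructs `ReviewTool` with `args=['-i']` and awaits its `run()`.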

@@ -2,22 +2,37 @@ import copy
import json
import logging
from collections import OrderedDict
from typing import List, Tuple
import yaml
from jinja2 import Environment, StrictUndefined
from yaml import SafeLoader
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models, \
find_line_number_of_relevant_line_in_file, clip_tokens
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import convert_to_markdown, try_fix_json
from pr_agent.config_loader import settings
from pr_agent.algo.utils import convert_to_markdown, try_fix_json, try_fix_yaml, load_yaml
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language, IncrementalPR
from pr_agent.git_providers.git_provider import IncrementalPR, get_main_pr_language
from pr_agent.servers.help import actions_help_text, bot_help_text
class PRReviewer:
def __init__(self, pr_url: str, cli_mode=False, is_answer: bool = False, args=None):
self.parse_args(args)
"""
The PRReviewer class is responsible for reviewing a pull request and generating feedback using an AI model.
"""
def __init__(self, pr_url: str, is_answer: bool = False, args: list = None):
"""
Initialize the PRReviewer object with the necessary attributes and objects to review a pull request.
Args:
pr_url (str): The URL of the pull request to be reviewed.
is_answer (bool, optional): Indicates whether the review is being done in answer mode. Defaults to False.
args (list, optional): List of arguments passed to the PRReviewer class. Defaults to None.
"""
self.parse_args(args) # -i command
self.git_provider = get_git_provider()(pr_url, incremental=self.incremental)
self.main_language = get_main_pr_language(
@@ -25,34 +40,48 @@ class PRReviewer:
)
self.pr_url = pr_url
self.is_answer = is_answer
if self.is_answer and not self.git_provider.is_supported("get_issue_comments"):
raise Exception(f"Answer mode is not supported for {settings.config.git_provider} for now")
answer_str, question_str = self._get_user_answers()
raise Exception(f"Answer mode is not supported for {get_settings().config.git_provider} for now")
self.ai_handler = AiHandler()
self.patches_diff = None
self.prediction = None
self.cli_mode = cli_mode
question_str, answer_str = self._get_user_answers()
self.vars = {
"title": self.git_provider.pr.title,
"branch": self.git_provider.get_pr_branch(),
"description": self.git_provider.get_pr_description(),
"language": self.main_language,
"diff": "", # empty diff for initial calculation
"require_score": settings.pr_reviewer.require_score_review,
"require_tests": settings.pr_reviewer.require_tests_review,
"require_security": settings.pr_reviewer.require_security_review,
"require_focused": settings.pr_reviewer.require_focused_review,
'num_code_suggestions': settings.pr_reviewer.num_code_suggestions,
#
"require_score": get_settings().pr_reviewer.require_score_review,
"require_tests": get_settings().pr_reviewer.require_tests_review,
"require_security": get_settings().pr_reviewer.require_security_review,
"require_focused": get_settings().pr_reviewer.require_focused_review,
'num_code_suggestions': get_settings().pr_reviewer.num_code_suggestions,
'question_str': question_str,
'answer_str': answer_str,
"extra_instructions": get_settings().pr_reviewer.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
settings.pr_review_prompt.system,
settings.pr_review_prompt.user)
def parse_args(self, args):
self.token_handler = TokenHandler(
self.git_provider.pr,
self.vars,
get_settings().pr_review_prompt.system,
get_settings().pr_review_prompt.user
)
def parse_args(self, args: List[str]) -> None:
"""
Parse the arguments passed to the PRReviewer class and set the 'incremental' attribute accordingly.
Args:
args: A list of arguments passed to the PRReviewer class.
Returns:
None
"""
is_incremental = False
if args and len(args) >= 1:
arg = args[0]
@@ -60,70 +89,121 @@ class PRReviewer:
is_incremental = True
self.incremental = IncrementalPR(is_incremental)
async def review(self):
async def run(self) -> None:
"""
Review the pull request and generate feedback.
"""
logging.info('Reviewing PR...')
if settings.config.publish_output:
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
logging.info('Preparing PR review...')
pr_comment = self._prepare_pr_review()
if settings.config.publish_output:
if get_settings().config.publish_output:
logging.info('Pushing PR review...')
self.git_provider.publish_comment(pr_comment)
self.git_provider.remove_initial_comment()
if settings.pr_reviewer.inline_code_comments:
if get_settings().pr_reviewer.inline_code_comments:
logging.info('Pushing inline code comments...')
self._publish_inline_code_comments()
return ""
async def _prepare_prediction(self, model: str):
async def _prepare_prediction(self, model: str) -> None:
"""
Prepare the AI prediction for the pull request review.
Args:
model: A string representing the AI model to be used for the prediction.
Returns:
None
"""
logging.info('Getting PR diff...')
self.patches_diff = get_pr_diff(self.git_provider, self.token_handler, model)
logging.info('Getting AI prediction...')
self.prediction = await self._get_prediction(model)
async def _get_prediction(self, model: str):
async def _get_prediction(self, model: str) -> str:
"""
Generate an AI prediction for the pull request review.
Args:
model: A string representing the AI model to be used for the prediction.
Returns:
A string representing the AI prediction for the pull request review.
"""
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_review_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_review_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
system_prompt = environment.from_string(get_settings().pr_review_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_review_prompt.user).render(variables)
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
system=system_prompt, user=user_prompt)
response, finish_reason = await self.ai_handler.chat_completion(
model=model,
temperature=0.2,
system=system_prompt,
user=user_prompt
)
return response
def _prepare_pr_review(self) -> str:
review = self.prediction.strip()
try:
data = json.loads(review)
except json.decoder.JSONDecodeError:
data = try_fix_json(review)
"""
Prepare the PR review by processing the AI prediction and generating a markdown-formatted text that summarizes
the feedback.
"""
data = load_yaml(self.prediction.strip())
# reordering for nicer display
if 'PR Feedback' in data:
if 'Security concerns' in data['PR Feedback']:
val = data['PR Feedback']['Security concerns']
del data['PR Feedback']['Security concerns']
data['PR Analysis']['Security concerns'] = val
# Move 'Security concerns' key to 'PR Analysis' section for better display
pr_feedback = data.get('PR Feedback', {})
security_concerns = pr_feedback.get('Security concerns')
if security_concerns is not None:
del pr_feedback['Security concerns']
if security_concerns is False:
data.setdefault('PR Analysis', {})['Security concerns'] = 'No security concerns found'
else:
data.setdefault('PR Analysis', {})['Security concerns'] = security_concerns
if settings.config.git_provider != 'bitbucket' and \
settings.pr_reviewer.inline_code_comments and \
'Code suggestions' in data['PR Feedback']:
# keeping only code suggestions that can't be submitted as inline comments
data['PR Feedback']['Code suggestions'] = [
d for d in data['PR Feedback']['Code suggestions']
if any(key not in d for key in ('relevant file', 'relevant line in file', 'suggestion content'))
]
if not data['PR Feedback']['Code suggestions']:
del data['PR Feedback']['Code suggestions']
#
if 'Code feedback' in pr_feedback:
code_feedback = pr_feedback['Code feedback']
# Filter out code suggestions that can be submitted as inline comments
if get_settings().pr_reviewer.inline_code_comments:
del pr_feedback['Code feedback']
else:
for suggestion in code_feedback:
if ('relevant file' in suggestion) and (not suggestion['relevant file'].startswith('``')):
suggestion['relevant file'] = f"``{suggestion['relevant file']}``"
if 'relevant line' not in suggestion:
suggestion['relevant line'] = ''
relevant_line_str = suggestion['relevant line'].split('\n')[0]
# removing '+'
suggestion['relevant line'] = relevant_line_str.lstrip('+').strip()
# try to add line numbers link to code suggestions
if hasattr(self.git_provider, 'generate_link_to_relevant_line_number'):
link = self.git_provider.generate_link_to_relevant_line_number(suggestion)
if link:
suggestion['relevant line'] = f"[{suggestion['relevant line']}]({link})"
# Add incremental review section
if self.incremental.is_incremental:
# Rename title when incremental review - Add to the beginning of the dict
last_commit_url = f"{self.git_provider.get_pr_url()}/commits/{self.git_provider.incremental.first_new_commit_sha}"
last_commit_url = f"{self.git_provider.get_pr_url()}/commits/" \
f"{self.git_provider.incremental.first_new_commit_sha}"
data = OrderedDict(data)
data.update({'Incremental PR Review': {
"⏮️ Review for commits since previous PR-Agent review": f"Starting from commit {last_commit_url}"}})
@@ -132,32 +212,43 @@ class PRReviewer:
markdown_text = convert_to_markdown(data)
user = self.git_provider.get_user_id()
if not self.cli_mode:
# Add help text if not in CLI mode
if not get_settings().get("CONFIG.CLI_MODE", False):
markdown_text += "\n### How to use\n"
if user and '[bot]' not in user:
markdown_text += bot_help_text(user)
else:
markdown_text += actions_help_text
if settings.config.verbosity_level >= 2:
# Log markdown response if verbosity level is high
if get_settings().config.verbosity_level >= 2:
logging.info(f"Markdown response:\n{markdown_text}")
if markdown_text == None or len(markdown_text) == 0:
markdown_text = ""
return markdown_text
def _publish_inline_code_comments(self):
if settings.pr_reviewer.num_code_suggestions == 0:
def _publish_inline_code_comments(self) -> None:
"""
Publishes inline comments on a pull request with code suggestions generated by the AI model.
"""
if get_settings().pr_reviewer.num_code_suggestions == 0:
return
review = self.prediction.strip()
review_text = self.prediction.strip()
review_text = review_text.lstrip('```yaml').rstrip('`')
try:
data = json.loads(review)
except json.decoder.JSONDecodeError:
data = try_fix_json(review)
data = yaml.load(review_text, Loader=SafeLoader)
except Exception as e:
logging.error(f"Failed to parse AI prediction: {e}")
data = try_fix_yaml(review_text)
comments = []
for d in data['PR Feedback']['Code suggestions']:
relevant_file = d.get('relevant file', '').strip()
relevant_line_in_file = d.get('relevant line in file', '').strip()
content = d.get('suggestion content', '')
comments: List[str] = []
for suggestion in data.get('PR Feedback', {}).get('Code feedback', []):
relevant_file = suggestion.get('relevant file', '').strip()
relevant_line_in_file = suggestion.get('relevant line', '').strip()
content = suggestion.get('suggestion', '')
if not relevant_file or not relevant_line_in_file or not content:
logging.info("Skipping inline comment with missing file/line/content")
continue
@@ -172,15 +263,26 @@ class PRReviewer:
if comments:
self.git_provider.publish_inline_comments(comments)
def _get_user_answers(self):
answer_str = question_str = ""
def _get_user_answers(self) -> Tuple[str, str]:
"""
Retrieves the question and answer strings from the discussion messages related to a pull request.
Returns:
A tuple containing the question and answer strings.
"""
question_str = ""
answer_str = ""
if self.is_answer:
discussion_messages = self.git_provider.get_issue_comments()
for message in discussion_messages.reversed:
if "Questions to better understand the PR:" in message.body:
question_str = message.body
elif '/answer' in message.body:
answer_str = message.body
if answer_str and question_str:
break
return question_str, answer_str
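
One detail in `_publish_inline_code_comments` deserves a note: `str.lstrip` takes a set of characters, not a prefix, so passing it the fence marker plus `yaml` also eats leading `y`, `a`, `m`, `l` or backtick characters from text that has no fence at all. A prefix-aware alternative could look like the sketch below; `strip_yaml_fence` is a hypothetical helper, and the fence string is built with `"`" * 3` only to keep triple backticks out of the example.

```python
FENCE = "`" * 3 + "yaml"  # the opening marker of a fenced YAML block


def strip_yaml_fence(text: str) -> str:
    """Remove a leading YAML fence and trailing backticks if present (hypothetical helper)."""
    text = text.strip()
    if text.startswith(FENCE):
        # Slice off the exact prefix instead of stripping a character set.
        text = text[len(FENCE):]
    return text.rstrip("`").strip()


# Character-set behaviour of lstrip on unfenced text: 'a' and 'l' are both
# in the set, so they are removed until 'w' stops the scan.
assert "always: true".lstrip(FENCE) == "ways: true"

# The prefix-aware helper leaves unfenced text untouched.
assert strip_yaml_fence("always: true") == "always: true"
```

On Python 3.9 and later, `str.removeprefix` gives the same prefix-only behaviour without the explicit slice.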

@@ -0,0 +1,160 @@
import copy
import logging
from datetime import date
from time import sleep
from typing import Tuple
from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff, retry_with_fallback_models
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.config_loader import get_settings
from pr_agent.git_providers import GithubProvider, get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
CHANGELOG_LINES = 50
class PRUpdateChangelog:
def __init__(self, pr_url: str, cli_mode=False, args=None):
self.git_provider = get_git_provider()(pr_url)
self.main_language = get_main_pr_language(
self.git_provider.get_languages(), self.git_provider.get_files()
)
self.commit_changelog = get_settings().pr_update_changelog.push_changelog_changes
self._get_changlog_file() # self.changelog_file_str
self.ai_handler = AiHandler()
self.patches_diff = None
self.prediction = None
self.cli_mode = cli_mode
self.vars = {
"title": self.git_provider.pr.title,
"branch": self.git_provider.get_pr_branch(),
"description": self.git_provider.get_pr_description(),
"language": self.main_language,
"diff": "", # empty diff for initial calculation
"changelog_file_str": self.changelog_file_str,
"today": date.today(),
"extra_instructions": get_settings().pr_update_changelog.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
get_settings().pr_update_changelog_prompt.system,
get_settings().pr_update_changelog_prompt.user)
async def run(self):
assert type(self.git_provider) == GithubProvider, "Currently only Github is supported"
logging.info('Updating the changelog...')
if get_settings().config.publish_output:
self.git_provider.publish_comment("Preparing changelog updates...", is_temporary=True)
await retry_with_fallback_models(self._prepare_prediction)
logging.info('Preparing PR changelog updates...')
new_file_content, answer = self._prepare_changelog_update()
if get_settings().config.publish_output:
self.git_provider.remove_initial_comment()
logging.info('Publishing changelog updates...')
if self.commit_changelog:
logging.info('Pushing PR changelog updates to repo...')
self._push_changelog_update(new_file_content, answer)
else:
logging.info('Publishing PR changelog as comment...')
self.git_provider.publish_comment(f"**Changelog updates:**\n\n{answer}")
async def _prepare_prediction(self, model: str):
logging.info('Getting PR diff...')
self.patches_diff = get_pr_diff(self.git_provider, self.token_handler, model)
logging.info('Getting AI prediction...')
self.prediction = await self._get_prediction(model)
async def _get_prediction(self, model: str):
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(get_settings().pr_update_changelog_prompt.system).render(variables)
user_prompt = environment.from_string(get_settings().pr_update_changelog_prompt.user).render(variables)
if get_settings().config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
system=system_prompt, user=user_prompt)
return response
def _prepare_changelog_update(self) -> Tuple[str, str]:
answer = self.prediction.strip().strip("```").strip() # noqa B005
if hasattr(self, "changelog_file"):
existing_content = self.changelog_file.decoded_content.decode()
else:
existing_content = ""
if existing_content:
new_file_content = answer + "\n\n" + self.changelog_file.decoded_content.decode()
else:
new_file_content = answer
if not self.commit_changelog:
answer += "\n\n\n>to commit the new content to the CHANGELOG.md file, please type:" \
"\n>'/update_changelog --pr_update_changelog.push_changelog_changes=true'\n"
if get_settings().config.verbosity_level >= 2:
logging.info(f"answer:\n{answer}")
return new_file_content, answer
def _push_changelog_update(self, new_file_content, answer):
self.git_provider.repo_obj.update_file(path=self.changelog_file.path,
message="Update CHANGELOG.md",
content=new_file_content,
sha=self.changelog_file.sha,
branch=self.git_provider.get_pr_branch())
d = dict(body="CHANGELOG.md update",
path=self.changelog_file.path,
line=max(2, len(answer.splitlines())),
start_line=1)
sleep(5) # wait for the file to be updated
last_commit_id = list(self.git_provider.pr.get_commits())[-1]
try:
self.git_provider.pr.create_review(commit=last_commit_id, comments=[d])
except Exception:
# we can't create a review for some reason, let's just publish a comment
self.git_provider.publish_comment(f"**Changelog updates:**\n\n{answer}")
def _get_default_changelog(self):
example_changelog = \
"""
Example:
## <current_date>
### Added
...
### Changed
...
### Fixed
...
"""
return example_changelog
def _get_changlog_file(self):
try:
self.changelog_file = self.git_provider.repo_obj.get_contents("CHANGELOG.md",
ref=self.git_provider.get_pr_branch())
changelog_file_lines = self.changelog_file.decoded_content.decode().splitlines()
changelog_file_lines = changelog_file_lines[:CHANGELOG_LINES]
self.changelog_file_str = "\n".join(changelog_file_lines)
except Exception:
self.changelog_file_str = ""
if self.commit_changelog:
logging.info("No CHANGELOG.md file found in the repository. Creating one...")
changelog_file = self.git_provider.repo_obj.create_file(path="CHANGELOG.md",
message='add CHANGELOG.md',
content="",
branch=self.git_provider.get_pr_branch())
self.changelog_file = changelog_file['content']
if not self.changelog_file_str:
self.changelog_file_str = self._get_default_changelog()
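
Review note: the two text transformations in this file are small enough to check by hand. The sketch below restates them as pure functions, assuming the same 50-line cap as `CHANGELOG_LINES`. `head_of_changelog` and `prepend_entry` are hypothetical helper names, not part of the PR.

```python
CHANGELOG_LINES = 50  # same cap as the constant in the diff


def head_of_changelog(text: str, max_lines: int = CHANGELOG_LINES) -> str:
    """Keep only the first max_lines lines, as done before building the prompt."""
    return "\n".join(text.splitlines()[:max_lines])


def prepend_entry(prediction: str, existing: str) -> str:
    """Clean the model output and put it on top of the existing changelog."""
    # str.strip takes a set of characters, so stripping a run of three
    # backticks is equivalent to stripping the single backtick character.
    entry = prediction.strip().strip("`").strip()
    if existing:
        return entry + "\n\n" + existing
    return entry


short = head_of_changelog("line 1\nline 2\nline 3", max_lines=2)
```

Since only backtick characters are stripped, a language tag directly after an opening fence (for example `markdown`) would remain in the entry.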


@ -1,3 +1,66 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "pr_agent"
version = "0.0.1"
authors = [
{name = "Itamar Friedman", email = "itamar.f@codium.ai"},
]
maintainers = [
{name = "Ori Kotek", email = "ori.k@codium.ai"},
{name = "Tal Ridnik", email = "tal.r@codium.ai"},
{name = "Hussam Lawen", email = "hussam.l@codium.ai"},
{name = "Sagi Medina", email = "sagi.m@codium.ai"}
]
description = "CodiumAI PR-Agent is an open-source tool to automatically analyze a pull request and provide several types of feedback"
readme = "README.md"
requires-python = ">=3.9"
keywords = ["ai", "tool", "developer", "review", "agent"]
license = {file = "LICENSE", name = "Apache 2.0 License"}
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
]
dependencies = [
"dynaconf==3.1.12",
"fastapi==0.99.0",
"PyGithub==1.59.*",
"retry==0.9.2",
"openai==0.27.8",
"Jinja2==3.1.2",
"tiktoken==0.4.0",
"uvicorn==0.22.0",
"python-gitlab==3.15.0",
"pytest~=7.4.0",
"aiohttp~=3.8.4",
"atlassian-python-api==3.39.0",
"GitPython~=3.1.32",
"starlette-context==0.3.6",
"litellm~=0.1.351",
"PyYAML==6.0"
]
[project.urls]
"Homepage" = "https://github.com/Codium-ai/pr-agent"
[tool.setuptools]
include-package-data = false
license-files = ["LICENSE"]
[tool.setuptools.packages.find]
where = ["."]
include = ["pr_agent"]
[project.scripts]
pr-agent = "pr_agent.cli:run"
[tool.ruff]
line-length = 120


@ -10,3 +10,8 @@ python-gitlab==3.15.0
pytest~=7.4.0
aiohttp~=3.8.4
atlassian-python-api==3.39.0
GitPython~=3.1.32
litellm~=0.1.351
PyYAML==6.0
starlette-context==0.3.6
litellm~=0.1.351

5
setup.py Normal file

@ -0,0 +1,5 @@
# for compatibility with legacy tools
# see: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html
from setuptools import setup
setup()


@ -51,7 +51,7 @@ class TestConvertToMarkdown:
'Unrelated changes': 'n/a', # won't be included in the output
'Focused PR': 'Yes',
'General PR suggestions': 'general suggestion...',
'Code suggestions': [
'Code feedback': [
{
'Code example': {
'Before': 'Code before',
@ -73,7 +73,7 @@ class TestConvertToMarkdown:
- **Focused PR:** Yes
- 💡 **General PR suggestions:** general suggestion...
- 🤖 **Code suggestions:**
- 🤖 **Code feedback:**
- **Code example:**
- **Before:**


@ -0,0 +1,68 @@
# Generated by CodiumAI
from pr_agent.git_providers.git_provider import FilePatchInfo
from pr_agent.algo.pr_processing import find_line_number_of_relevant_line_in_file
import pytest
class TestFindLineNumberOfRelevantLineInFile:
# Tests that the function returns the correct line number and absolute position when the relevant line is found in the patch
def test_relevant_line_found_in_patch(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+line2\n+relevant_line\n', filename='file1')
]
relevant_file = 'file1'
relevant_line_in_file = 'relevant_line'
expected = (3, 2) # (position in patch, absolute_position in new file)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
# Tests that the function returns the correct line number and absolute position when a similar line is found using difflib
def test_similar_line_found_using_difflib(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line in file similar match\n', filename='file1')
]
relevant_file = 'file1'
relevant_line_in_file = '+relevant_line in file similar match ' # note the space at the end. This is to simulate a similar line found using difflib
expected = (2, 1)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
# Tests that the function returns (-1, -1) when the relevant line is not found in the patch and no similar line is found using difflib
def test_relevant_line_not_found(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file1')
]
relevant_file = 'file1'
relevant_line_in_file = 'not_found'
expected = (-1, -1)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
# Tests that the function returns (-1, -1) when the relevant file is not found in any of the patches
def test_relevant_file_not_found(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file2')
]
relevant_file = 'file1'
relevant_line_in_file = 'relevant_line'
expected = (-1, -1)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
# Tests that the function returns (0, 0) when the relevant_line_in_file is an empty string
def test_empty_relevant_line(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,1 +1,2 @@\n-line1\n+relevant_line\n', filename='file1')
]
relevant_file = 'file1'
relevant_line_in_file = ''
expected = (0, 0)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
# Tests that the function returns (-1, -1) when the relevant_line_in_file is found in the patch but it is a deleted line
def test_relevant_line_found_but_deleted(self):
diff_files = [
FilePatchInfo(base_file='file1', head_file='file1', patch='@@ -1,2 +1,1 @@\n-line1\n-relevant_line\n', filename='file1')
]
relevant_file = 'file1'
relevant_line_in_file = 'relevant_line'
expected = (-1, -1)
assert find_line_number_of_relevant_line_in_file(diff_files, relevant_file, relevant_line_in_file) == expected
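
Review note: the `(position in patch, absolute position in new file)` pairs expected by these tests can be derived by hand from a unified diff. The sketch below is a simplified illustration that handles exact matches on added lines only. It is not the project's `find_line_number_of_relevant_line_in_file`: it has no `difflib` fallback and does not reproduce the empty-string case. `locate_added_line` is a hypothetical name.

```python
import re
from typing import Tuple

# Matches a unified-diff hunk header such as "@@ -1,1 +1,2 @@".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def locate_added_line(patch: str, wanted: str) -> Tuple[int, int]:
    """Return (index of the line inside the patch, line number in the new file).

    Returns (-1, -1) when no added line matches exactly.
    """
    new_line_no = 0
    for position, line in enumerate(patch.splitlines()):
        header = HUNK_HEADER.match(line)
        if header:
            # The next new-file line will carry the hunk's start number.
            new_line_no = int(header.group(3)) - 1
            continue
        if line.startswith("-") or line.startswith("\\"):
            # Deleted lines and "\ No newline" markers are not in the new file.
            continue
        new_line_no += 1
        if line.startswith("+") and line[1:].strip() == wanted.strip():
            return position, new_line_no
    return -1, -1


found = locate_added_line("@@ -1,1 +1,2 @@\n-line1\n+line2\n+relevant_line\n", "relevant_line")
```

The hunk header counts as patch line 0, so the second added line sits at patch position 3 and at line 2 of the new file. Deleted lines advance the patch position but not the new-file line number.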


@ -2,7 +2,7 @@
import logging
from pr_agent.algo.git_patch_processing import handle_patch_deletions
from pr_agent.config_loader import settings
from pr_agent.config_loader import get_settings
"""
Code Analysis
@ -49,7 +49,7 @@ class TestHandlePatchDeletions:
original_file_content_str = 'foo\nbar\n'
new_file_content_str = ''
file_name = 'file.py'
settings.config.verbosity_level = 1
get_settings().config.verbosity_level = 1
with caplog.at_level(logging.INFO):
handle_patch_deletions(patch, original_file_content_str, new_file_content_str, file_name)


@ -0,0 +1,32 @@
# Generated by CodiumAI
import pytest
from pr_agent.algo.utils import load_yaml
class TestLoadYaml:
# Tests that load_yaml loads a valid YAML string
def test_load_valid_yaml(self):
yaml_str = 'name: John Smith\nage: 35'
expected_output = {'name': 'John Smith', 'age': 35}
assert load_yaml(yaml_str) == expected_output
def test_load_complicated_yaml(self):
yaml_str = \
'''\
PR Analysis:
Main theme: Enhancing the `/describe` command prompt by adding title and description
Type of PR: Enhancement
Relevant tests added: No
Focused PR: Yes, the PR is focused on enhancing the `/describe` command prompt.
PR Feedback:
General suggestions: The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.
Code feedback:
- relevant file: pr_agent/settings/pr_description_prompts.toml
suggestion: Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium]
relevant line: 'user="""PR Info:'
Security concerns: No'''
expected_output = {'PR Analysis': {'Main theme': 'Enhancing the `/describe` command prompt by adding title and description', 'Type of PR': 'Enhancement', 'Relevant tests added': False, 'Focused PR': 'Yes, the PR is focused on enhancing the `/describe` command prompt.'}, 'PR Feedback': {'General suggestions': 'The PR seems to be well-structured and focused on a specific enhancement. However, it would be beneficial to add tests to ensure the new feature works as expected.', 'Code feedback': [{'relevant file': 'pr_agent/settings/pr_description_prompts.toml', 'suggestion': "Consider using a more descriptive variable name than 'user' for the command prompt. A more descriptive name would make the code more readable and maintainable. [medium]", 'relevant line': 'user="""PR Info:'}], 'Security concerns': False}}
assert load_yaml(yaml_str) == expected_output