Compare commits

..

28 Commits

Author SHA1 Message Date
3b27c834a4 Merge remote-tracking branch 'origin/main' into ok/update_readme 2023-07-19 11:14:44 +03:00
5bc2ef1eff Merge pull request #92 from YuviGold/deploy-on-lambda
Deployment on AWS Lambda
2023-07-19 11:12:29 +03:00
2f558006bf Update INSTALL.md, add notes about injecting secrets 2023-07-19 11:09:35 +03:00
370520df51 Update docker/Dockerfile.lambda
have a fixed mangum version

Co-authored-by: Ori Kotek <orikotek@gmail.com>
2023-07-19 11:05:24 +03:00
2e832b8fb4 Merge pull request #86 from Codium-ai/GadiZimerman-patch-1
Update README.md
2023-07-19 10:51:31 +03:00
a47fa342cb Merge pull request #88 from zmeir/zmeir-cli_args
CLI Arguments Refactoring
2023-07-19 08:15:19 +03:00
bdf7eff7cd Merge pull request #87 from Codium-ai/tr/bug_fix
Add Insights from User's Answers and Fix User Answers Fetching
2023-07-18 18:20:15 +03:00
dc67e6a66e Support deploying pr-agent on AWS Lambda 2023-07-18 17:46:42 +03:00
77f243b7ab Allow passing CLI args (helps with debugging) 2023-07-18 16:39:46 +03:00
c507785475 bugfix 2023-07-18 16:32:51 +03:00
5c5015b267 Update README.md 2023-07-18 14:45:15 +03:00
3efe08d619 Merge pull request #85 from Codium-ai/hl/always_filer_bad_extensions
Filter out bad files before getting their head and original source code and diff
2023-07-18 13:50:25 +03:00
2e36fce4eb Merge pull request #83 from Codium-ai/hl/gitlab_description
Support describe for Gitlab
2023-07-18 13:47:32 +03:00
d6d4427545 Merge pull request #84 from Codium-ai/GadiZimerman-patch-1
Update README.md
2023-07-18 13:37:43 +03:00
5d45632247 Performance improvement: Filter out bad files before getting their head and original source code and diff 2023-07-18 13:33:32 +03:00
90c045e3d0 Update README.md
changing image
2023-07-18 13:26:19 +03:00
7f0a96d8f7 readme 2023-07-18 13:17:30 +03:00
8fb9affef3 add try catch 2023-07-18 13:14:01 +03:00
6c42a471e1 Merge pull request #76 from zmeir/zmeir-publish_inline_comments_single_api_call
Optimization of Inline Comments Publishing
2023-07-18 13:05:11 +03:00
f2b74b6970 support gitlab describe function 2023-07-18 13:03:36 +03:00
ffd11aeffc Merge pull request #81 from Codium-ai/GadiZimerman-patch-1
Update README.md
2023-07-18 12:55:26 +03:00
05e4e09dfc Lint 2023-07-18 12:27:28 +03:00
13092118dc Move the new git provider function to the abstract interface 2023-07-18 12:26:49 +03:00
7d108992fc Merge remote-tracking branch 'origin/main' into zmeir-publish_inline_comments_single_api_call 2023-07-18 11:53:41 +03:00
e5a8ed205e Merge pull request #82 from Codium-ai/ok/lint
Linting and Code Cleanup
2023-07-18 11:40:43 +03:00
978348240b Update README.md 2023-07-18 09:59:47 +03:00
4d92e7d9c2 Update README.md
consider changing section headers to reflect commands format
2023-07-18 09:56:40 +03:00
24583b05f7 Publish GitHub review comments with single API call 2023-07-17 10:41:02 +03:00
14 changed files with 187 additions and 63 deletions

View File

@ -149,16 +149,35 @@ git clone https://github.com/Codium-ai/pr-agent.git
```
5. Copy the secrets template file and fill in the following:
```
cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
# Edit .secrets.toml file
```
- Your OpenAI key.
- Set deployment_type to 'app'
- Copy your app's private key to the private_key field.
- Copy your app's ID to the app_id field.
- Copy your app's webhook secret to the webhook_secret field.
- Set deployment_type to 'app' in [configuration.toml](./pr_agent/settings/configuration.toml)
> The .secrets.toml file is not copied to the Docker image by default, and is only used for local development.
> If you want to use the .secrets.toml file in your Docker image, you can add remove it from the .dockerignore file.
> In most production environments, you would inject the secrets file as environment variables or as mounted volumes.
> For example, in order to inject a secrets file as a volume in a Kubernetes environment you can update your pod spec to include the following,
> assuming you have a secret named `pr-agent-settings` with a key named `.secrets.toml`:
```
volumes:
- name: settings-volume
secret:
secretName: pr-agent-settings
// ...
containers:
// ...
volumeMounts:
- mountPath: /app/pr_agent/settings_prod
name: settings-volume
```
cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
# Edit .secrets.toml file
```
> Another option is to set the secrets as environment variables in your deployment environment, for example `OPENAI.KEY` and `GITHUB.USER_TOKEN`.
6. Build a Docker image for the app and optionally push it to a Docker repository. We'll use Dockerhub as an example:
@ -169,6 +188,7 @@ docker push codiumai/pr-agent:github_app # Push to your Docker repository
7. Host the app using a server, serverless function, or container environment. Alternatively, for development and
debugging, you may use tools like smee.io to forward webhooks to your local machine.
You can check [Deploy as a Lambda Function](#deploy-as-a-lambda-function)
8. Go back to your app's settings, and set the following:
@ -178,3 +198,20 @@ docker push codiumai/pr-agent:github_app # Push to your Docker repository
9. Install the app by navigating to the "Install App" tab and selecting your desired repositories.
---
#### Deploy as a Lambda Function
1. Follow steps 1-5 of [Method 5](#method-5-run-as-a-github-app).
2. Build a docker image that can be used as a lambda function
```shell
docker buildx build --platform=linux/amd64 . -t codiumai/pr-agent:serverless -f docker/Dockerfile.lambda
```
3. Push image to ECR
```shell
docker tag codiumai/pr-agent:serverless <AWS_ACCOUNT>.dkr.ecr.<AWS_REGION>.amazonaws.com/codiumai/pr-agent:serverless
docker push <AWS_ACCOUNT>.dkr.ecr.<AWS_REGION>.amazonaws.com/codiumai/pr-agent:serverless
```
4. Create a lambda function that uses the uploaded image. Set the lambda timeout to be at least 3m.
5. Configure the lambda function to have a Function URL.
6. Go back to steps 8-9 of [Method 5](#method-5-run-as-a-github-app) with the function url as your Webhook URL.
The Webhook URL would look like `https://<LAMBDA_FUNCTION_URL>/api/v1/github_webhooks`

View File

@ -24,25 +24,25 @@ CodiumAI `PR-Agent` is an open-source tool aiming to help developers review PRs
<h3>Example results:</h2>
</div>
<h4>Describe:</h4>
<h4>/describe:</h4>
<div align="center">
<p float="center">
<img src="https://codium.ai/images/describe.gif" width="800">
</p>
</div>
<h4>Review:</h4>
<h4>/review:</h4>
<div align="center">
<p float="center">
<img src="https://codium.ai/images/review.gif" width="800">
</p>
</div>
<h4>Ask:</h4>
<h4>/ask:</h4>
<div align="center">
<p float="center">
<img src="https://codium.ai/images/ask.gif" width="800">
</p>
</div>
<h4>Improve:</h4>
<h4>/improve:</h4>
<div align="center">
<p float="center">
<img src="https://codium.ai/images/improve.gif" width="800">
@ -76,7 +76,7 @@ To set up your own PR-Agent, see the [Quickstart](#Quickstart) section
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | ⮑ Inline review | :white_check_mark: | :white_check_mark: | |
| | Ask | :white_check_mark: | :white_check_mark: | |
| | Auto-Description | :white_check_mark: | | |
| | Auto-Description | :white_check_mark: | :white_check_mark: | |
| | Improve Code | :white_check_mark: | :white_check_mark: | |
| | Reflect and Review | :white_check_mark: | | |
| | | | | |
@ -132,7 +132,7 @@ Here are several ways to install and run PR-Agent:
## How it works
![PR-Agent Tools](https://codium.ai/images/pr_agent_overview.png)
![PR-Agent Tools](https://www.codium.ai/wp-content/uploads/2023/07/codiumai-diagram-v4.jpg)
Check out the [PR Compression strategy](./PR_COMPRESSION.md) page for more details on how we convert a code diff to a manageable LLM prompt

12
docker/Dockerfile.lambda Normal file
View File

@ -0,0 +1,12 @@
FROM public.ecr.aws/lambda/python:3.10
RUN yum update -y && \
yum install -y gcc python3-devel && \
yum clean all
ADD requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
RUN pip install mangum==16.0.0
COPY pr_agent/ ${LAMBDA_TASK_ROOT}/pr_agent/
CMD ["pr_agent.servers.serverless.serverless"]

View File

@ -64,7 +64,11 @@ bad_extensions = [
def filter_bad_extensions(files):
return [f for f in files if f.filename.split('.')[-1] not in bad_extensions]
return [f for f in files if is_valid_file(f.filename)]
def is_valid_file(filename):
return filename.split('.')[-1] not in bad_extensions
def sort_files_by_main_languages(languages: Dict, files: list):

View File

@ -29,10 +29,10 @@ def get_pr_diff(git_provider: Union[GithubProvider, Any], token_handler: TokenHa
global PATCH_EXTRA_LINES
PATCH_EXTRA_LINES = 0
git_provider.pr.diff_files = list(git_provider.get_diff_files())
diff_files = list(git_provider.get_diff_files())
# get pr languages
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), git_provider.pr.diff_files)
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), diff_files)
# generate a standard diff string, with patch extension
patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler,

View File

@ -17,6 +17,7 @@ def convert_to_markdown(output_data: dict) -> str:
"Focused PR": "",
"Security concerns": "🔒",
"General PR suggestions": "💡",
"Insights from user's answers": "📝",
"Code suggestions": "🤖"
}

View File

@ -10,7 +10,7 @@ from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer
def run():
def run(args=None):
parser = argparse.ArgumentParser(description='AI based pull request analyzer', usage="""\
Usage: cli.py --pr-url <URL on supported git hosting service> <command> [<args>].
For example:
@ -35,7 +35,7 @@ reflect - Ask the PR author questions about the PR.
'reflect', 'review_after_reflect'],
default='review')
parser.add_argument('rest', nargs=argparse.REMAINDER, default=[])
args = parser.parse_args()
args = parser.parse_args(args)
logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
command = args.command.lower()
if command in ['ask', 'ask_question']:

View File

@ -26,7 +26,7 @@ class BitbucketProvider:
self.set_pr(pr_url)
def is_supported(self, capability: str) -> bool:
if capability == 'get_issue_comments':
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments']:
return False
return True
@ -64,6 +64,12 @@ class BitbucketProvider:
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Bitbucket provider does not support creating inline comments yet")
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError("Bitbucket provider does not support publishing inline comments yet")
def get_title(self):
return self.pr.title

View File

@ -44,6 +44,14 @@ class GitProvider(ABC):
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
@abstractmethod
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
@abstractmethod
def publish_inline_comments(self, comments: list[dict]):
pass
@abstractmethod
def publish_code_suggestion(self, body: str, relevant_file: str,
relevant_lines_start: int, relevant_lines_end: int):

View File

@ -3,11 +3,12 @@ from datetime import datetime
from typing import Optional, Tuple
from urllib.parse import urlparse
from github import AppAuthentication, Github
from github import AppAuthentication, Github, Auth
from pr_agent.config_loader import settings
from .git_provider import FilePatchInfo, GitProvider
from ..algo.language_handler import is_valid_file
class GithubProvider(GitProvider):
@ -37,9 +38,10 @@ class GithubProvider(GitProvider):
files = self.pr.get_files()
diff_files = []
for file in files:
original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha)
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename))
if is_valid_file(file.filename):
original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha)
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename))
self.diff_files = diff_files
return diff_files
@ -57,6 +59,9 @@ class GithubProvider(GitProvider):
self.pr.comments_list.append(response)
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
self.publish_inline_comments([self.create_inline_comment(body, relevant_file, relevant_line_in_file)])
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
self.diff_files = self.diff_files if self.diff_files else self.get_diff_files()
position = -1
for file in self.diff_files:
@ -67,7 +72,7 @@ class GithubProvider(GitProvider):
if relevant_line_in_file in line:
position = i
break
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:] in line:
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:].lstrip() in line:
# The model often adds a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
position = i
@ -75,9 +80,16 @@ class GithubProvider(GitProvider):
if position == -1:
if settings.config.verbosity_level >= 2:
logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
subject_type = "FILE"
else:
path = relevant_file.strip()
self.pr.create_review_comment(body=body, commit_id=self.last_commit_id, path=path, position=position)
subject_type = "LINE"
path = relevant_file.strip()
# placeholder for future API support (already supported in single inline comment)
# return dict(body=body, path=path, position=position, subject_type=subject_type)
return dict(body=body, path=path, position=position) if subject_type == "LINE" else {}
def publish_inline_comments(self, comments: list[dict]):
self.pr.create_review(commit=self.last_commit_id, comments=comments)
def publish_code_suggestion(self, body: str,
relevant_file: str,
@ -218,7 +230,7 @@ class GithubProvider(GitProvider):
raise ValueError(
"GitHub token is required when using user deployment. See: "
"https://github.com/Codium-ai/pr-agent#method-2-run-from-source") from e
return Github(token)
return Github(auth=Auth.Token(token))
def _get_repo(self):
return self.github_client.get_repo(self.repo)

View File

@ -9,6 +9,7 @@ from gitlab import GitlabGetError
from pr_agent.config_loader import settings
from .git_provider import EDIT_TYPE, FilePatchInfo, GitProvider
from ..algo.language_handler import is_valid_file
class GitLabProvider(GitProvider):
@ -33,7 +34,7 @@ class GitLabProvider(GitProvider):
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
def is_supported(self, capability: str) -> bool:
if capability == 'get_issue_comments':
if capability in ['get_issue_comments', 'create_inline_comment', 'publish_inline_comments']:
return False
return True
@ -59,27 +60,28 @@ class GitLabProvider(GitProvider):
diffs = self.mr.changes()['changes']
diff_files = []
for diff in diffs:
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
edit_type = EDIT_TYPE.MODIFIED
if diff['new_file']:
edit_type = EDIT_TYPE.ADDED
elif diff['deleted_file']:
edit_type = EDIT_TYPE.DELETED
elif diff['renamed_file']:
edit_type = EDIT_TYPE.RENAMED
try:
if isinstance(original_file_content_str, bytes):
original_file_content_str = bytes.decode(original_file_content_str, 'utf-8')
if isinstance(new_file_content_str, bytes):
new_file_content_str = bytes.decode(new_file_content_str, 'utf-8')
except UnicodeDecodeError:
logging.warning(
f"Cannot decode file {diff['old_path']} or {diff['new_path']} in merge request {self.id_mr}")
diff_files.append(
FilePatchInfo(original_file_content_str, new_file_content_str, diff['diff'], diff['new_path'],
edit_type=edit_type,
old_filename=None if diff['old_path'] == diff['new_path'] else diff['old_path']))
if is_valid_file(diff['new_path']):
original_file_content_str = self._get_pr_file_content(diff['old_path'], self.mr.target_branch)
new_file_content_str = self._get_pr_file_content(diff['new_path'], self.mr.source_branch)
edit_type = EDIT_TYPE.MODIFIED
if diff['new_file']:
edit_type = EDIT_TYPE.ADDED
elif diff['deleted_file']:
edit_type = EDIT_TYPE.DELETED
elif diff['renamed_file']:
edit_type = EDIT_TYPE.RENAMED
try:
if isinstance(original_file_content_str, bytes):
original_file_content_str = bytes.decode(original_file_content_str, 'utf-8')
if isinstance(new_file_content_str, bytes):
new_file_content_str = bytes.decode(new_file_content_str, 'utf-8')
except UnicodeDecodeError:
logging.warning(
f"Cannot decode file {diff['old_path']} or {diff['new_path']} in merge request {self.id_mr}")
diff_files.append(
FilePatchInfo(original_file_content_str, new_file_content_str, diff['diff'], diff['new_path'],
edit_type=edit_type,
old_filename=None if diff['old_path'] == diff['new_path'] else diff['old_path']))
self.diff_files = diff_files
return diff_files
@ -87,8 +89,12 @@ class GitLabProvider(GitProvider):
return [change['new_path'] for change in self.mr.changes()['changes']]
def publish_description(self, pr_title: str, pr_body: str):
logging.exception("Not implemented yet")
pass
try:
self.mr.title = pr_title
self.mr.description = pr_body
self.mr.save()
except Exception as e:
logging.exception(f"Could not update merge request {self.id_mr} description: {e}")
def publish_comment(self, mr_comment: str, is_temporary: bool = False):
comment = self.mr.notes.create({'body': mr_comment})
@ -102,6 +108,12 @@ class GitLabProvider(GitProvider):
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
target_file, target_line_no)
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Gitlab provider does not support creating inline comments yet")
def create_inline_comment(self, comments: list[dict]):
raise NotImplementedError("Gitlab provider does not support publishing inline comments yet")
def send_inline_comment(self, body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
target_file, target_line_no):
if not found:
@ -181,7 +193,7 @@ class GitLabProvider(GitProvider):
found = True
edit_type = self.get_edit_type(line)
break
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:] in line:
elif relevant_line_in_file[0] == '+' and relevant_line_in_file[1:].lstrip() in line:
# The model often adds a '+' to the beginning of the relevant_line_in_file even if originally
# it's a context line
found = True

View File

@ -0,0 +1,18 @@
import logging
from fastapi import FastAPI
from mangum import Mangum
from pr_agent.servers.github_app import router
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
app = FastAPI()
app.include_router(router)
handler = Mangum(app, lifespan="off")
def serverless(event, context):
return handler(event, context)

View File

@ -24,7 +24,7 @@ class PRReviewer:
self.is_answer = is_answer
if self.is_answer and not self.git_provider.is_supported("get_issue_comments"):
raise Exception(f"Answer mode is not supported for {settings.config.git_provider} for now")
answer_str = question_str = self._get_user_answers()
answer_str, question_str = self._get_user_answers()
self.ai_handler = AiHandler()
self.patches_diff = None
self.prediction = None
@ -51,7 +51,7 @@ class PRReviewer:
async def review(self):
logging.info('Reviewing PR...')
if settings.config.publish_output:
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
self.git_provider.publish_comment("Preparing review...", is_temporary=True)
logging.info('Getting PR diff...')
self.patches_diff = get_pr_diff(self.git_provider, self.token_handler)
logging.info('Getting AI prediction...')
@ -96,10 +96,16 @@ class PRReviewer:
del data['PR Feedback']['Security concerns']
data['PR Analysis']['Security concerns'] = val
if settings.config.git_provider == 'github' and \
if settings.config.git_provider != 'bitbucket' and \
settings.pr_reviewer.inline_code_comments and \
'Code suggestions' in data['PR Feedback']:
del data['PR Feedback']['Code suggestions']
# keeping only code suggestions that can't be submitted as inline comments
data['PR Feedback']['Code suggestions'] = [
d for d in data['PR Feedback']['Code suggestions']
if any(key not in d for key in ('relevant file', 'relevant line in file', 'suggestion content'))
]
if not data['PR Feedback']['Code suggestions']:
del data['PR Feedback']['Code suggestions']
markdown_text = convert_to_markdown(data)
user = self.git_provider.get_user_id()
@ -125,16 +131,24 @@ class PRReviewer:
except json.decoder.JSONDecodeError:
data = try_fix_json(review)
if settings.pr_reviewer.num_code_suggestions > 0:
try:
for d in data['PR Feedback']['Code suggestions']:
relevant_file = d['relevant file'].strip()
relevant_line_in_file = d['relevant line in file'].strip()
content = d['suggestion content']
comments = []
for d in data['PR Feedback']['Code suggestions']:
relevant_file = d.get('relevant file', '').strip()
relevant_line_in_file = d.get('relevant line in file', '').strip()
content = d.get('suggestion content', '')
if not relevant_file or not relevant_line_in_file or not content:
logging.info("Skipping inline comment with missing file/line/content")
continue
self.git_provider.publish_inline_comment(content, relevant_file, relevant_line_in_file)
except KeyError:
pass
if self.git_provider.is_supported("create_inline_comment"):
comment = self.git_provider.create_inline_comment(content, relevant_file, relevant_line_in_file)
if comment:
comments.append(comment)
else:
self.git_provider.publish_inline_comment(content, relevant_file, relevant_line_in_file)
if comments:
self.git_provider.publish_inline_comments(comments)
def _get_user_answers(self):
answer_str = question_str = ""

View File

@ -1,6 +1,6 @@
dynaconf==3.1.12
fastapi==0.99.0
PyGithub==1.58.2
PyGithub==1.59.*
retry==0.9.2
openai==0.27.8
Jinja2==3.1.2