Compare commits

36 Commits

SHA1 Message Date
ea1cd7ae45 Github custom action development - WIP 2023-07-13 19:14:44 +03:00
1c1aad2806 Github custom action development - WIP 2023-07-13 19:08:10 +03:00
f466d79031 Github custom action development - WIP 2023-07-13 18:59:54 +03:00
e2323dfb9f Github custom action development - WIP 2023-07-13 18:54:40 +03:00
e51e443adc Github custom action development - WIP 2023-07-13 18:54:11 +03:00
f6d4a214ca Github custom action development - WIP 2023-07-13 18:40:03 +03:00
4bb46d9faa Github custom action development - WIP 2023-07-13 18:37:32 +03:00
f337d76af6 Github custom action development - WIP 2023-07-13 18:32:28 +03:00
4033303c1f Github custom action development - WIP 2023-07-13 18:18:23 +03:00
38c8d187d2 Github custom action development - WIP 2023-07-13 18:16:25 +03:00
f8ddfd2f25 Merge remote-tracking branch 'origin/tr/description_tool' into feature/github_action 2023-07-13 18:06:35 +03:00
4b4fda37a6 publish_description as abstract method 2023-07-13 18:04:28 +03:00
9ca6b789a7 Github custom action development - WIP 2023-07-13 18:02:38 +03:00
0f73f5f906 set as title 2023-07-13 17:53:17 +03:00
5742a9be1e Github custom action development 2023-07-13 17:46:12 +03:00
914cc6639a ignore current title 2023-07-13 17:34:18 +03:00
f34cda126a stable 2023-07-13 17:31:28 +03:00
dece20c984 PRDescription 2023-07-13 17:24:56 +03:00
94c1f430af General PR suggestions prompt 2023-07-13 16:34:56 +03:00
9fadde388b remove title and description 2023-07-13 16:26:33 +03:00
d1b6b3bc95 Merge pull request #43 from Codium-ai/tr/inline_code_suggestions (Tr/inline code suggestions) 2023-07-13 10:48:42 +03:00
77a451ada0 inline_code_comments 2023-07-13 09:44:33 +03:00
4b8420aa16 remove suggestion number 2023-07-13 08:10:36 +03:00
25bc69f70e Merge pull request #41 from Codium-ai/gitlab_small_fix (Update gitlab config) 2023-07-12 18:16:43 +03:00
e2faf117c5 Update gitlab config 2023-07-12 18:02:28 +03:00
aaff03bb60 Merge pull request #40 from Codium-ai/feature/support_azure_openai (Add Azure OpenAI support) 2023-07-12 13:37:00 +03:00
cd1e62ec96 Add Azure OpenAI support 2023-07-12 11:53:46 +03:00
7767cae181 Merge pull request #39 from Codium-ai/bugfix/cli (Remove installation_id from cli) 2023-07-12 11:31:43 +03:00
1bc206e7b2 Remove installation_id from cli 2023-07-12 11:31:06 +03:00
52a438b3c8 Merge pull request #38 from Codium-ai/hl/try_fix_when_broken_output (Try to fix json output when it's broken or incomplete) 2023-07-11 22:23:07 +03:00
b8a71b369d add max_iter 2023-07-11 22:22:08 +03:00
72af2a1f9c Add tests 2023-07-11 22:11:55 +03:00
fd4a2bf7ff refactor try_fix_json, generalize finding the ending of a json item (support new lines, spaces tab) 2023-07-11 22:11:42 +03:00
a3211d4958 Merge commit '210d94f2aa6ebf872b9b85051d1842c32d4fc34e' into hl/try_fix_when_broken_output 2023-07-11 17:33:02 +03:00
86d7ed5f82 Try to fix broken json output 2023-07-11 17:32:48 +03:00
210d94f2aa Merge pull request #24 from Xyand/feature/gitlab_provider (Feature/gitlab provider) 2023-07-11 16:56:44 +03:00
23 changed files with 417 additions and 122 deletions

.github/workflows/review.yaml (vendored, normal file, 16 lines changed)

@ -0,0 +1,16 @@
on:
pull_request:
issue_comment:
jobs:
pr_agent_job:
runs-on: ubuntu-latest
name: Run pr agent on every pull request
steps:
- name: PR Agent action step
id: pragent
uses: Codium-ai/pr-agent@feature/github_action
env:
OPENAI_KEY: ${{ secrets.OPENAI_KEY }}
OPENAI_ORG: ${{ secrets.OPENAI_ORG }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Dockerfile.github_action (normal file, 10 lines changed)

@ -0,0 +1,10 @@
FROM python:3.10 as base
WORKDIR /app
ADD requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
ENV PYTHONPATH=/app
ADD pr_agent pr_agent
ADD github_action/entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

@ -183,7 +183,6 @@ Here is a quick overview of the different sub-tools of PR Reviewer:
- PR Analysis - PR Analysis
- Summarize main theme - Summarize main theme
- PR description and title
- PR type classification - PR type classification
- Is the PR covered by relevant tests - Is the PR covered by relevant tests
- Is this a focused PR - Is this a focused PR
@ -199,7 +198,6 @@ This is how a typical output of the PR Reviewer looks like:
#### PR Analysis #### PR Analysis
- 🎯 **Main theme:** Adding language extension handler and token handler - 🎯 **Main theme:** Adding language extension handler and token handler
- 🔍 **Description and title:** Yes
- 📌 **Type of PR:** Enhancement - 📌 **Type of PR:** Enhancement
- 🧪 **Relevant tests added:** No - 🧪 **Relevant tests added:** No
-**Focused PR:** Yes, the PR is focused on adding two new handlers for language extension and token counting. -**Focused PR:** Yes, the PR is focused on adding two new handlers for language extension and token counting.
@ -250,45 +248,6 @@ require_tests_review=true
require_security_review=true require_security_review=true
``` ```
#### Code Suggestions configuration:
There are also configuration options to control different aspects of the `code suggestions` feature.
The number of suggestions provided can be controlled by adjusting the following parameter:
```
num_code_suggestions=4
```
You can also enable more verbose and informative mode of code suggestions:
```
extended_code_suggestions=false
```
This is a comparison of the regular and extended code suggestions modes:
- **relevant file:** sql.py
- **suggestion content:** Remove hardcoded sensitive information like username and password. Use environment variables or a secure method to store these values. [important]
Example for extended suggestion:
- **relevant file:** sql.py
- **suggestion content:** Remove hardcoded sensitive information (username and password) [important]
- **why:** Hardcoding sensitive information is a security risk. It's better to use environment variables or a secure way to store these values.
- **code example:**
- **before code:**
```
user = "root",
password = "Mysql@123",
```
- **after code:**
```
user = os.getenv('DB_USER'),
password = os.getenv('DB_PASSWORD'),
```
---
## How it works ## How it works
![PR-Agent Tools](./pics/pr_agent_overview.png) ![PR-Agent Tools](./pics/pr_agent_overview.png)

action.yaml (normal file, 5 lines changed)

@ -0,0 +1,5 @@
name: 'PR Agent'
description: 'Summarize, review and suggest improvements for pull requests'
runs:
using: 'docker'
image: 'Dockerfile.github_action'

@ -0,0 +1,2 @@
#!/bin/bash
python /app/pr_agent/servers/github_action_runner.py

@ -14,6 +14,13 @@ class AiHandler:
openai.api_key = settings.openai.key openai.api_key = settings.openai.key
if settings.get("OPENAI.ORG", None): if settings.get("OPENAI.ORG", None):
openai.organization = settings.openai.org openai.organization = settings.openai.org
self.deployment_id = settings.get("OPENAI.DEPLOYMENT_ID", None)
if settings.get("OPENAI.API_TYPE", None):
openai.api_type = settings.openai.api_type
if settings.get("OPENAI.API_VERSION", None):
openai.api_version = settings.openai.api_version
if settings.get("OPENAI.API_BASE", None):
openai.api_base = settings.openai.api_base
except AttributeError as e: except AttributeError as e:
raise ValueError("OpenAI key is required") from e raise ValueError("OpenAI key is required") from e
@ -23,6 +30,7 @@ class AiHandler:
try: try:
response = await openai.ChatCompletion.acreate( response = await openai.ChatCompletion.acreate(
model=model, model=model,
deployment_id=self.deployment_id,
messages=[ messages=[
{"role": "system", "content": system}, {"role": "system", "content": system},
{"role": "user", "content": user} {"role": "user", "content": user}
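The Azure OpenAI hunk above copies optional settings onto module-level attributes of the `openai` client. Below is a minimal sketch of that mapping, written against a plain dictionary so it runs without the `openai` package. The helper name `azure_overrides` and the sample values are hypothetical, and the sketch assumes the API version belongs on the `api_version` attribute.

```python
from typing import Dict, Optional

# Setting key -> attribute name on the legacy (pre-1.0) `openai` module.
# Pairing API_VERSION with `api_version` is an assumption of this sketch.
SETTING_TO_ATTRIBUTE = {
    "OPENAI.API_TYPE": "api_type",
    "OPENAI.API_VERSION": "api_version",
    "OPENAI.API_BASE": "api_base",
}


def azure_overrides(config: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return only the attributes whose settings are present and non-empty."""
    overrides = {}
    for key, attribute in SETTING_TO_ATTRIBUTE.items():
        value = config.get(key)
        if value:
            overrides[attribute] = value
    return overrides


sample = {
    "OPENAI.API_TYPE": "azure",
    "OPENAI.API_VERSION": "2023-05-15",
    "OPENAI.API_BASE": None,  # unset values are skipped
}
print(azure_overrides(sample))
```

Each resulting pair could then be applied with `setattr(openai, attribute, value)`, which keeps the "only set what was configured" behaviour of the hunk in one place.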

@ -24,10 +24,10 @@ def get_pr_diff(git_provider: Union[GithubProvider, Any], token_handler: TokenHa
Returns a string with the diff of the PR. Returns a string with the diff of the PR.
If needed, apply diff minimization techniques to reduce the number of tokens If needed, apply diff minimization techniques to reduce the number of tokens
""" """
files = list(git_provider.get_diff_files()) git_provider.pr.files = list(git_provider.get_diff_files())
# get pr languages # get pr languages
pr_languages = sort_files_by_main_languages(git_provider.get_languages(), files) pr_languages = sort_files_by_main_languages(git_provider.get_languages(), git_provider.pr.files)
# generate a standard diff string, with patch extension # generate a standard diff string, with patch extension
patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler) patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler)

@ -1,5 +1,8 @@
from __future__ import annotations from __future__ import annotations
import json
import logging
import re
import textwrap import textwrap
@ -8,7 +11,6 @@ def convert_to_markdown(output_data: dict) -> str:
emojis = { emojis = {
"Main theme": "🎯", "Main theme": "🎯",
"Description and title": "🔍",
"Type of PR": "📌", "Type of PR": "📌",
"Relevant tests added": "🧪", "Relevant tests added": "🧪",
"Unrelated changes": "⚠️", "Unrelated changes": "⚠️",
@ -50,10 +52,7 @@ def parse_code_suggestion(code_suggestions: dict) -> str:
code_str_indented = textwrap.indent(code_str, ' ') code_str_indented = textwrap.indent(code_str, ' ')
markdown_text += f" - **{code_key}:**\n{code_str_indented}\n" markdown_text += f" - **{code_key}:**\n{code_str_indented}\n"
else: else:
if "suggestion number" in sub_key.lower(): if "relevant file" in sub_key.lower():
# markdown_text += f"- **suggestion {sub_value}:**\n" # prettier formatting
pass
elif "relevant file" in sub_key.lower():
markdown_text += f"\n - **{sub_key}:** {sub_value}\n" markdown_text += f"\n - **{sub_key}:** {sub_value}\n"
else: else:
markdown_text += f" **{sub_key}:** {sub_value}\n" markdown_text += f" **{sub_key}:** {sub_value}\n"
@ -61,3 +60,25 @@ def parse_code_suggestion(code_suggestions: dict) -> str:
markdown_text += "\n" markdown_text += "\n"
return markdown_text return markdown_text
def try_fix_json(review, max_iter=10):
# Try to fix JSON if it is broken/incomplete: parse until the last valid code suggestion
data = {}
if review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0:
last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
valid_json = False
iter_count = 0
while last_code_suggestion_ind > 0 and not valid_json and iter_count < max_iter:
try:
data = json.loads(review[:last_code_suggestion_ind] + "]}}")
valid_json = True
review = review[:last_code_suggestion_ind].strip() + "]}}"
except json.decoder.JSONDecodeError:
review = review[:last_code_suggestion_ind]
# Use regular expression to find the last occurrence of "}," with any number of whitespaces or newlines
last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
iter_count += 1
if not valid_json:
logging.error("Unable to decode JSON response from AI")
data = {}
return data
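The idea behind `try_fix_json` is to cut a truncated model response back to the last complete code suggestion and then close the brackets that were left open. Below is a self-contained sketch of that technique, not the project's implementation: the name `repair_truncated_json` is hypothetical, the closing suffix `]}}` is assumed to match the review schema, and unlike the diff it returns an empty dict when no `},` boundary exists instead of indexing into an empty list.

```python
import json
import re


def repair_truncated_json(text: str, closing: str = "]}}", max_iter: int = 10) -> dict:
    """Cut `text` back to the last complete list item and close the open brackets."""
    # Offset just past each "}" that is followed by optional whitespace and a comma.
    cut_points = [m.end() - 1 for m in re.finditer(r"\}\s*,", text)]
    # Try the latest boundary first, then walk backwards, at most max_iter times.
    for cut in list(reversed(cut_points))[:max_iter]:
        try:
            return json.loads(text[:cut] + closing)
        except json.JSONDecodeError:
            continue
    return {}


# A response that was cut off in the middle of the second suggestion.
broken = '{"PR Feedback": {"Code suggestions": [{"relevant file": "a.py"}, {"relevant file": "b.p'
print(repair_truncated_json(broken))
```

Walking backwards over precomputed boundaries avoids re-running the regular expression on every attempt, at the cost of holding all match offsets in memory.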

@ -3,6 +3,7 @@ import asyncio
import logging import logging
import os import os
from pr_agent.tools.pr_description import PRDescription
from pr_agent.tools.pr_questions import PRQuestions from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer from pr_agent.tools.pr_reviewer import PRReviewer
@ -11,15 +12,20 @@ def run():
parser = argparse.ArgumentParser(description='AI based pull request analyzer') parser = argparse.ArgumentParser(description='AI based pull request analyzer')
parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True) parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
parser.add_argument('--question', type=str, help='Optional question to ask', required=False) parser.add_argument('--question', type=str, help='Optional question to ask', required=False)
parser.add_argument('--pr_description', action='store_true', help='Generate a title and description for the PR', required=False)
args = parser.parse_args() args = parser.parse_args()
logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO")) logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
if args.question: if args.question:
print(f"Question: {args.question} about PR {args.pr_url}") print(f"Question: {args.question} about PR {args.pr_url}")
reviewer = PRQuestions(args.pr_url, args.question, installation_id=None) reviewer = PRQuestions(args.pr_url, args.question)
asyncio.run(reviewer.answer()) asyncio.run(reviewer.answer())
elif args.pr_description:
print(f"PR description: {args.pr_url}")
reviewer = PRDescription(args.pr_url)
asyncio.run(reviewer.describe())
else: else:
print(f"Reviewing PR: {args.pr_url}") print(f"Reviewing PR: {args.pr_url}")
reviewer = PRReviewer(args.pr_url, installation_id=None, cli_mode=True) reviewer = PRReviewer(args.pr_url, cli_mode=True)
asyncio.run(reviewer.review()) asyncio.run(reviewer.review())

@ -11,6 +11,7 @@ settings = Dynaconf(
"settings/configuration.toml", "settings/configuration.toml",
"settings/pr_reviewer_prompts.toml", "settings/pr_reviewer_prompts.toml",
"settings/pr_questions_prompts.toml", "settings/pr_questions_prompts.toml",
"settings/pr_description_prompts.toml",
"settings_prod/.secrets.toml" "settings_prod/.secrets.toml"
]] ]]
) )

@ -16,6 +16,10 @@ class GitProvider(ABC):
def get_diff_files(self) -> list[FilePatchInfo]: def get_diff_files(self) -> list[FilePatchInfo]:
pass pass
@abstractmethod
def publish_description(self, pr_title: str, pr_body: str):
pass
@abstractmethod @abstractmethod
def publish_comment(self, pr_comment: str, is_temporary: bool = False): def publish_comment(self, pr_comment: str, is_temporary: bool = False):
pass pass
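Declaring `publish_description` with `@abstractmethod` means that any subclass of `GitProvider` that omits it can no longer be instantiated. Below is a small sketch of that behaviour. The class names `Provider`, `CompleteProvider`, and `IncompleteProvider` are hypothetical stand-ins, not classes from this repository.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    @abstractmethod
    def publish_description(self, pr_title: str, pr_body: str):
        """Replace the PR title and body."""


class CompleteProvider(Provider):
    def __init__(self):
        self.published = None

    def publish_description(self, pr_title: str, pr_body: str):
        # Record the call instead of talking to a real git host.
        self.published = (pr_title, pr_body)


class IncompleteProvider(Provider):
    pass  # does not override publish_description


provider = CompleteProvider()
provider.publish_description("Add Azure support", "Body text")

try:
    IncompleteProvider()
    instantiated = True
except TypeError:
    instantiated = False  # abstract methods block instantiation
```

The check applies only to classes that actually inherit from the abstract base; a provider declared without that base class is not covered by it.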

@ -26,6 +26,8 @@ class GithubProvider:
self.pr = self._get_pr() self.pr = self._get_pr()
def get_files(self): def get_files(self):
if hasattr(self.pr, 'files'):
return self.pr.files
return self.pr.get_files() return self.pr.get_files()
def get_diff_files(self) -> list[FilePatchInfo]: def get_diff_files(self) -> list[FilePatchInfo]:
@ -37,6 +39,10 @@ class GithubProvider:
diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename)) diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename))
return diff_files return diff_files
def publish_description(self, pr_title: str, pr_body: str):
self.pr.edit(title=pr_title, body=pr_body)
# self.pr.create_issue_comment(pr_comment)
def publish_comment(self, pr_comment: str, is_temporary: bool = False): def publish_comment(self, pr_comment: str, is_temporary: bool = False):
response = self.pr.create_issue_comment(pr_comment) response = self.pr.create_issue_comment(pr_comment)
if hasattr(response, "user") and hasattr(response.user, "login"): if hasattr(response, "user") and hasattr(response.user, "login"):

@ -44,6 +44,10 @@ class GitLabProvider(GitProvider):
def get_files(self): def get_files(self):
return [change['new_path'] for change in self.mr.changes()['changes']] return [change['new_path'] for change in self.mr.changes()['changes']]
def publish_description(self, pr_title: str, pr_body: str):
logging.warning("publish_description is not implemented for GitLab yet")
def publish_comment(self, mr_comment: str, is_temporary: bool = False): def publish_comment(self, mr_comment: str, is_temporary: bool = False):
comment = self.mr.notes.create({'body': mr_comment}) comment = self.mr.notes.create({'body': mr_comment})
if is_temporary: if is_temporary:

@ -0,0 +1,58 @@
import asyncio
import json
import os
from pr_agent.config_loader import settings
from pr_agent.tools.pr_questions import PRQuestions
from pr_agent.tools.pr_reviewer import PRReviewer
async def run_action():
GITHUB_EVENT_NAME = os.environ.get('GITHUB_EVENT_NAME', None)
if not GITHUB_EVENT_NAME:
print("GITHUB_EVENT_NAME not set")
return
GITHUB_EVENT_PATH = os.environ.get('GITHUB_EVENT_PATH', None)
if not GITHUB_EVENT_PATH:
print("GITHUB_EVENT_PATH not set")
return
event_payload = json.load(open(GITHUB_EVENT_PATH, 'r'))
RUNNER_DEBUG = os.environ.get('RUNNER_DEBUG', None)
if not RUNNER_DEBUG:
print("RUNNER_DEBUG not set")
OPENAI_KEY = os.environ.get('OPENAI_KEY', None)
if not OPENAI_KEY:
print("OPENAI_KEY not set")
return
OPENAI_ORG = os.environ.get('OPENAI_ORG', None)
GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', None)
if not GITHUB_TOKEN:
print("GITHUB_TOKEN not set")
return
settings.set("OPENAI.KEY", OPENAI_KEY)
if OPENAI_ORG:
settings.set("OPENAI.ORG", OPENAI_ORG)
settings.set("GITHUB.USER_TOKEN", GITHUB_TOKEN)
settings.set("GITHUB.DEPLOYMENT_TYPE", "user")
if GITHUB_EVENT_NAME == "pull_request":
action = event_payload.get("action", None)
if action in ["opened", "reopened"]:
pr_url = event_payload.get("pull_request", {}).get("url", None)
if pr_url:
await PRReviewer(pr_url).review()
elif GITHUB_EVENT_NAME == "issue_comment":
action = event_payload.get("action", None)
if action in ["created", "edited"]:
comment_body = event_payload.get("comment", {}).get("body", None)
if comment_body:
pr_url = event_payload.get("issue", {}).get("pull_request", {}).get("url", None)
if pr_url:
if comment_body.strip().lower() == "review":
await PRReviewer(pr_url).review()
elif comment_body.lstrip().lower().startswith("answer"):
await PRQuestions(pr_url, comment_body).answer()
if __name__ == '__main__':
asyncio.run(run_action())
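The runner above decides what to do from the event name and the event payload. The routing can be sketched as a pure function that is easy to test without GitHub or OpenAI credentials. The name `select_tool` and its `(tool, pr_url)` return shape are hypothetical and only mirror the branches in the diff.

```python
from typing import Optional, Tuple


def select_tool(event_name: str, payload: dict) -> Optional[Tuple[str, str]]:
    """Return (tool, pr_url) for events the runner reacts to, otherwise None."""
    if event_name == "pull_request":
        if payload.get("action") in ("opened", "reopened"):
            url = payload.get("pull_request", {}).get("url")
            return ("review", url) if url else None
    elif event_name == "issue_comment":
        if payload.get("action") in ("created", "edited"):
            body = (payload.get("comment", {}).get("body") or "").strip().lower()
            # Comments on plain issues carry no "pull_request" entry.
            url = payload.get("issue", {}).get("pull_request", {}).get("url")
            if not url:
                return None
            if body == "review":
                return ("review", url)
            if body.startswith("answer"):
                return ("answer", url)
    return None


opened = {"action": "opened", "pull_request": {"url": "https://example.invalid/pulls/1"}}
print(select_tool("pull_request", opened))
```

Keeping the decision separate from the `await` calls means the event handling can be exercised with hand-written payloads before the action is run in a workflow.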

@ -9,6 +9,11 @@
[openai] [openai]
key = "<API_KEY>" # Acquire through https://platform.openai.com key = "<API_KEY>" # Acquire through https://platform.openai.com
org = "<ORGANIZATION>" # Optional, may be commented out. org = "<ORGANIZATION>" # Optional, may be commented out.
# Uncomment the following for Azure OpenAI
#api_type = "azure"
#api_version = '2023-05-15' # Check Azure documentation for the current API version
#api_base = "<API_BASE>" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
#deployment_id = "<DEPLOYMENT_ID>" # The deployment name you chose when you deployed the engine
[github] [github]
# ---- Set the following only for deployment type == "user" # ---- Set the following only for deployment type == "user"

@ -2,14 +2,14 @@
model="gpt-4-0613" model="gpt-4-0613"
git_provider="github" git_provider="github"
publish_review=true publish_review=true
verbosity_level=0 # 0,1,2 verbosity_level=0 # 0,1,2
[pr_reviewer] [pr_reviewer]
require_focused_review=true require_focused_review=true
require_tests_review=true require_tests_review=true
require_security_review=true require_security_review=true
extended_code_suggestions=false
num_code_suggestions=4 num_code_suggestions=4
inline_code_comments = true
[pr_questions] [pr_questions]
@ -19,7 +19,7 @@ deployment_type = "user"
[gitlab] [gitlab]
# URL to the gitlab service # URL to the gitlab service
gitlab_url = "https://gitlab.com" url = "https://gitlab.com"
# Polling (either project id or namespace/project_name) syntax can be used # Polling (either project id or namespace/project_name) syntax can be used
projects_to_monitor = ['org_name/repo_name'] projects_to_monitor = ['org_name/repo_name']

@ -0,0 +1,45 @@
[pr_description_prompt]
system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
Your task is to provide a full description of the PR content.
- Make sure to focus on the new PR code (the '+' lines).
You must use the following JSON schema to format your answer:
```json
{
"PR Title": {
"type": "string",
"description": "an informative title for the PR, describing its main theme"
},
"Type of PR": {
"type": "string",
"enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
},
"PR Description": {
"type": "string",
"description": "an informative and concise description of the PR"
},
"PR Main Files Walkthrough": {
"type": "string",
"description": "a walkthrough of the PR changes. Review main files, in bullet points, and shortly describe the changes in each file (up to 10 most important files). Format: -`filename`: description of changes\n..."
}
}
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
"""
user="""PR Info:
Branch: '{{branch}}'
{%- if language %}
Main language: {{language}}
{%- endif %}
The PR Git Diff:
```
{{diff}}
```
Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
Response (should be a valid JSON, and nothing else):
```json
"""

@ -3,9 +3,6 @@ system="""You are CodiumAI-PR-Reviewer, a language model designed to review git
Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines). Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines).
- Provide up to {{ num_code_suggestions }} code suggestions. - Provide up to {{ num_code_suggestions }} code suggestions.
- Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices. - Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
{%- if extended_code_suggestions %}
- For each suggestion, provide a short and concise code snippet to illustrate the existing code, and the improved code.
{%- endif %}
- Make sure not to provide suggestion repeating modifications already implemented in the new PR code (the '+' lines). - Make sure not to provide suggestion repeating modifications already implemented in the new PR code (the '+' lines).
You must use the following JSON schema to format your answer: You must use the following JSON schema to format your answer:
@ -16,10 +13,6 @@ You must use the following JSON schema to format your answer:
"type": "string", "type": "string",
"description": "a short explanation of the PR" "description": "a short explanation of the PR"
}, },
"Description and title": {
"type": "string",
"description": "yes\\no question: does this PR have a relevant description and title"
},
"Type of PR": { "Type of PR": {
"type": "string", "type": "string",
"enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"] "enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
@ -40,48 +33,25 @@ You must use the following JSON schema to format your answer:
"PR Feedback": { "PR Feedback": {
"General PR suggestions": { "General PR suggestions": {
"type": "string", "type": "string",
"description": "important suggestions for the contributors and maintainers of this PR, may include overall structure, primary purpose and best practices. consider using specific filenames, classes and functions names. explain yourself!" "description": "General suggestions and feedback for the contributors and maintainers of this PR. May include important suggestions for the overall structure, primary purpose, best practices, critical bugs, and other aspects of the PR. Explain your suggestions."
}, },
"Code suggestions": { "Code suggestions": {
"type": "array", "type": "array",
"maxItems": {{ num_code_suggestions }}, "maxItems": {{ num_code_suggestions }},
"uniqueItems": true, "uniqueItems": true,
"items": { "items": {
"suggestion number": {
"type": "int",
"description": "suggestion number, starting from 1"
},
"relevant file": { "relevant file": {
"type": "string", "type": "string",
"description": "the relevant file name" "description": "the relevant file full path"
}, },
"suggestion content": { "suggestion content": {
"type": "string", "type": "string",
{%- if extended_code_suggestions %}
"description": "a concrete suggestion for meaningfully improving the new PR code. Don't repeat previous suggestions. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
{%- else %}
"description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like. "description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
{%- endif %}
}, },
{%- if extended_code_suggestions %} "relevant line in file": {
"why": {
"type": "string", "type": "string",
"description": "shortly explain why this suggestion is important" "description": "an authentic single code line from the PR git diff section, to which the suggestion applies."
},
"code example": {
"type": "object",
"properties": {
"before code": {
"type": "string",
"description": "Short and concise code snippet, to illustrate the existing code"
},
"after code": {
"type": "string",
"description": "Short and concise code snippet, to illustrate the improved code"
}
}
} }
{%- endif %}
} }
}, },
{%- if require_security %} {%- if require_security %}
@ -101,7 +71,6 @@ Example output:
"PR Analysis": "PR Analysis":
{ {
"Main theme": "xxx", "Main theme": "xxx",
"Description and title": "Yes",
"Type of PR": "Bug fix", "Type of PR": "Bug fix",
{%- if require_tests %} {%- if require_tests %}
"Relevant tests added": "No", "Relevant tests added": "No",
@ -115,17 +84,9 @@ Example output:
"General PR suggestions": "..., `xxx`...", "General PR suggestions": "..., `xxx`...",
"Code suggestions": [ "Code suggestions": [
{ {
"suggestion number": 1, "relevant file": "directory/xxx.py",
"relevant file": "xxx.py",
"suggestion content": "xxx [important]", "suggestion content": "xxx [important]",
{%- if extended_code_suggestions %} "relevant line in file": "xxx",
"why": "xxx",
"code example":
{
"before code": "xxx",
"after code": "xxx"
}
{%- endif %}
}, },
... ...
] ]

@ -0,0 +1,83 @@
import copy
import json
import logging
from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff
from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import convert_to_markdown
from pr_agent.config_loader import settings
from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language
class PRDescription:
def __init__(self, pr_url: str):
self.git_provider = get_git_provider()(pr_url)
self.main_pr_language = get_main_pr_language(
self.git_provider.get_languages(), self.git_provider.get_files()
)
self.ai_handler = AiHandler()
self.vars = {
"title": self.git_provider.pr.title,
"branch": self.git_provider.get_pr_branch(),
"description": self.git_provider.get_description(),
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
}
self.token_handler = TokenHandler(self.git_provider.pr,
self.vars,
settings.pr_description_prompt.system,
settings.pr_description_prompt.user)
self.patches_diff = None
self.prediction = None
async def describe(self):
logging.info('Generating a PR description...')
if settings.config.publish_review:
self.git_provider.publish_comment("Preparing pr description...", is_temporary=True)
logging.info('Getting PR diff...')
self.patches_diff = get_pr_diff(self.git_provider, self.token_handler)
logging.info('Getting AI prediction...')
self.prediction = await self._get_prediction()
logging.info('Preparing answer...')
pr_title, pr_body = self._prepare_pr_answer()
if settings.config.publish_review:
logging.info('Pushing answer...')
self.git_provider.publish_description(pr_title, pr_body)
self.git_provider.remove_initial_comment()
return ""
async def _get_prediction(self):
variables = copy.deepcopy(self.vars)
variables["diff"] = self.patches_diff # update diff
environment = Environment(undefined=StrictUndefined)
system_prompt = environment.from_string(settings.pr_description_prompt.system).render(variables)
user_prompt = environment.from_string(settings.pr_description_prompt.user).render(variables)
if settings.config.verbosity_level >= 2:
logging.info(f"\nSystem prompt:\n{system_prompt}")
logging.info(f"\nUser prompt:\n{user_prompt}")
model = settings.config.model
response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
system=system_prompt, user=user_prompt)
return response
def _prepare_pr_answer(self):
data = json.loads(self.prediction)
pr_body = ""
# for key, value in data.items():
# markdown_text += f"## {key}\n\n"
# markdown_text += f"{value}\n\n"
title = data['PR Title']
del data['PR Title']
for key, value in data.items():
pr_body += f"{key}:\n"
if 'walkthrough' in key.lower():
pr_body += f"{value}\n"
else:
pr_body += f"**{value}**\n\n___\n"
if settings.config.verbosity_level >= 2:
logging.info(f"title:\n{title}\n{pr_body}")
return title, pr_body
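`_prepare_pr_answer` turns the model's JSON into a title plus a Markdown body. Below is a standalone sketch of the same transformation on a hand-written response. The function name `build_description` and the sample values are hypothetical, and the sketch uses `dict.pop` where the diff uses `del`.

```python
import json
from typing import Tuple


def build_description(prediction: str) -> Tuple[str, str]:
    """Split a JSON prediction into (title, markdown_body)."""
    data = json.loads(prediction)
    title = data.pop("PR Title")  # the title is published separately from the body
    body = ""
    for key, value in data.items():
        body += f"{key}:\n"
        if "walkthrough" in key.lower():
            body += f"{value}\n"  # already a bullet list, keep as-is
        else:
            body += f"**{value}**\n\n___\n"  # bold value followed by a divider
    return title, body


sample_prediction = json.dumps({
    "PR Title": "Add PR description tool",
    "Type of PR": "Enhancement",
    "PR Main Files Walkthrough": "-`cli.py`: add --pr_description flag",
})
title, body = build_description(sample_prediction)
print(title)
print(body)
```

A response that is missing the `PR Title` key raises `KeyError` here, just as it would in the diff, so callers may want to handle that case before publishing.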

@ -7,7 +7,7 @@ from jinja2 import Environment, StrictUndefined
from pr_agent.algo.ai_handler import AiHandler from pr_agent.algo.ai_handler import AiHandler
from pr_agent.algo.pr_processing import get_pr_diff from pr_agent.algo.pr_processing import get_pr_diff
from pr_agent.algo.token_handler import TokenHandler from pr_agent.algo.token_handler import TokenHandler
from pr_agent.algo.utils import convert_to_markdown from pr_agent.algo.utils import convert_to_markdown, try_fix_json
from pr_agent.config_loader import settings from pr_agent.config_loader import settings
from pr_agent.git_providers import get_git_provider from pr_agent.git_providers import get_git_provider
from pr_agent.git_providers.git_provider import get_main_pr_language from pr_agent.git_providers.git_provider import get_main_pr_language
@ -33,7 +33,6 @@ class PRReviewer:
"require_tests": settings.pr_reviewer.require_tests_review, "require_tests": settings.pr_reviewer.require_tests_review,
"require_security": settings.pr_reviewer.require_security_review, "require_security": settings.pr_reviewer.require_security_review,
"require_focused": settings.pr_reviewer.require_focused_review, "require_focused": settings.pr_reviewer.require_focused_review,
'extended_code_suggestions': settings.pr_reviewer.extended_code_suggestions,
'num_code_suggestions': settings.pr_reviewer.num_code_suggestions, 'num_code_suggestions': settings.pr_reviewer.num_code_suggestions,
} }
self.token_handler = TokenHandler(self.git_provider.pr, self.token_handler = TokenHandler(self.git_provider.pr,
@ -55,6 +54,9 @@ class PRReviewer:
logging.info('Pushing PR review...') logging.info('Pushing PR review...')
self.git_provider.publish_comment(pr_comment) self.git_provider.publish_comment(pr_comment)
self.git_provider.remove_initial_comment() self.git_provider.remove_initial_comment()
if settings.pr_reviewer.inline_code_comments:
logging.info('Pushing inline code comments...')
self._publish_inline_code_comments()
return "" return ""
async def _get_prediction(self): async def _get_prediction(self):
@@ -69,11 +71,7 @@ class PRReviewer:
 model = settings.config.model
 response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
 system=system_prompt, user=user_prompt)
-try:
-json.loads(response)
-except json.decoder.JSONDecodeError:
-logging.warning("Could not decode JSON")
-response = {}
 return response
 def _prepare_pr_review(self) -> str:
@@ -81,8 +79,7 @@ class PRReviewer:
 try:
 data = json.loads(review)
 except json.decoder.JSONDecodeError:
-logging.error("Unable to decode JSON response from AI")
-data = {}
+data = try_fix_json(review)
 # reordering for nicer display
 if 'PR Feedback' in data:
@@ -91,6 +88,9 @@ class PRReviewer:
 del data['PR Feedback']['Security concerns']
 data['PR Analysis']['Security concerns'] = val
+if settings.config.git_provider == 'github' and settings.pr_reviewer.inline_code_comments:
+del data['PR Feedback']['Code suggestions']
 markdown_text = convert_to_markdown(data)
 user = self.git_provider.get_user_id()
@@ -109,3 +109,36 @@ class PRReviewer:
 if settings.config.verbosity_level >= 2:
 logging.info(f"Markdown response:\n{markdown_text}")
 return markdown_text
+def _publish_inline_code_comments(self):
+if settings.config.git_provider != 'github': # inline comments are currently only supported for github
+return
+review = self.prediction.strip()
+try:
+data = json.loads(review)
+except json.decoder.JSONDecodeError:
+data = try_fix_json(review)
+pr = self.git_provider.pr
+last_commit_id = list(pr.get_commits())[-1]
+files = list(self.git_provider.get_diff_files())
+for d in data['PR Feedback']['Code suggestions']:
+relevant_file = d['relevant file'].strip()
+relevant_line_in_file = d['relevant line in file'].strip()
+content = d['suggestion content']
+position = -1
+for file in files:
+if file.filename.strip() == relevant_file:
+patch = file.patch
+patch_lines = patch.splitlines()
+for i, line in enumerate(patch_lines):
+if relevant_line_in_file in line:
+position = i
+if position == -1:
+logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
+else:
+body = content
+path = relevant_file.strip()
+pr.create_review_comment(body=body, commit_id=last_commit_id, path=path, position=position)
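
Reviewer note: the position lookup in `_publish_inline_code_comments` can be read in isolation as a small search over the patch text. The sketch below is only an illustration under stated assumptions: `find_patch_position` and `sample_patch` are hypothetical names that do not appear in this change, and unlike the loop above (which keeps the last matching line) it returns the first match. It relies on the hunk header being line 0 of the patch, so the returned index counts lines below that header.

```python
def find_patch_position(patch: str, needle: str) -> int:
    """Return the index of the first patch line containing `needle`, or -1.

    Hypothetical helper for illustration; not part of pr_agent.
    """
    for i, line in enumerate(patch.splitlines()):
        if needle in line:
            return i  # index 0 is the hunk header, so i counts lines below it
    return -1


# A made-up single-hunk patch, used only to exercise the helper.
sample_patch = (
    "@@ -1,3 +1,4 @@\n"
    " import os\n"
    "+import json\n"
    " def main():\n"
    "     pass"
)

print(find_patch_position(sample_patch, "import json"))
print(find_patch_position(sample_patch, "not in the patch"))
```

Returning early on the first match is a design choice of this sketch; if the suggested line text occurs more than once in a patch, neither the first nor the last match is guaranteed to be the intended one.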


@@ -46,7 +46,6 @@ class TestConvertToMarkdown:
 def test_simple_dictionary_input(self):
 input_data = {
 'Main theme': 'Test',
-'Description and title': 'Test description',
 'Type of PR': 'Test type',
 'Relevant tests added': 'no',
 'Unrelated changes': 'n/a', # won't be included in the output
@@ -54,14 +53,12 @@ class TestConvertToMarkdown:
 'General PR suggestions': 'general suggestion...',
 'Code suggestions': [
 {
-'Suggestion number': 1,
 'Code example': {
 'Before': 'Code before',
 'After': 'Code after'
 }
 },
 {
-'Suggestion number': 2,
 'Code example': {
 'Before': 'Code before 2',
 'After': 'Code after 2'
@@ -71,7 +68,6 @@ class TestConvertToMarkdown:
 }
 expected_output = """\
 - 🎯 **Main theme:** Test
-- 🔍 **Description and title:** Test description
 - 📌 **Type of PR:** Test type
 - 🧪 **Relevant tests added:** no
 - ✨ **Focused PR:** Yes
@@ -110,7 +106,6 @@ class TestConvertToMarkdown:
 def test_dictionary_input_containing_only_empty_dictionaries(self):
 input_data = {
 'Main theme': {},
-'Description and title': {},
 'Type of PR': {},
 'Relevant tests added': {},
 'Unrelated changes': {},


@@ -0,0 +1,83 @@
# Generated by CodiumAI
from pr_agent.algo.utils import try_fix_json
import pytest
class TestTryFixJson:
# Tests that JSON with an incomplete 'Code suggestions' section returns the expected output
def test_incomplete_code_suggestions(self):
review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"}, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy [incomp...'
expected_output = {
'PR Analysis': {
'Main theme': 'xxx',
'Type of PR': 'Bug fix'
},
'PR Feedback': {
'General PR suggestions': '..., `xxx`...',
'Code suggestions': [
{
'relevant file': 'xxx.py',
'suggestion content': 'xxx [important]'
}
]
}
}
assert try_fix_json(review) == expected_output
def test_incomplete_code_suggestions_new_line(self):
review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"} \n\t, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy [incomp...'
expected_output = {
'PR Analysis': {
'Main theme': 'xxx',
'Type of PR': 'Bug fix'
},
'PR Feedback': {
'General PR suggestions': '..., `xxx`...',
'Code suggestions': [
{
'relevant file': 'xxx.py',
'suggestion content': 'xxx [important]'
}
]
}
}
assert try_fix_json(review) == expected_output
def test_incomplete_code_suggestions_many_close_brackets(self):
review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"} \n, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy }, [}\n ,incomp.} ,..'
expected_output = {
'PR Analysis': {
'Main theme': 'xxx',
'Type of PR': 'Bug fix'
},
'PR Feedback': {
'General PR suggestions': '..., `xxx`...',
'Code suggestions': [
{
'relevant file': 'xxx.py',
'suggestion content': 'xxx [important]'
}
]
}
}
assert try_fix_json(review) == expected_output
def test_incomplete_code_suggestions_relevant_file(self):
review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"}, {"suggestion number": 2, "relevant file": "yyy.p'
expected_output = {
'PR Analysis': {
'Main theme': 'xxx',
'Type of PR': 'Bug fix'
},
'PR Feedback': {
'General PR suggestions': '..., `xxx`...',
'Code suggestions': [
{
'relevant file': 'xxx.py',
'suggestion content': 'xxx [important]'
}
]
}
}
assert try_fix_json(review) == expected_output
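
Reviewer note: the four tests above share one idea, which is that a response truncated in the middle of the 'Code suggestions' list should be cut back to the last complete suggestion and then closed. The body of `try_fix_json` is not shown in this diff, so the following is only a minimal sketch of that idea under stated assumptions, not the function in `pr_agent.algo.utils`. The names `repair_truncated_review`, `closing`, and `max_iter` are hypothetical, and the `"]}}"` suffix assumes the list sits two objects deep, as it does in the test inputs.

```python
import json
import re


def repair_truncated_review(review: str, closing: str = "]}}", max_iter: int = 10) -> dict:
    """Best-effort repair of a JSON review truncated inside a list of objects.

    Hypothetical sketch: it tries cutting the text just after each '}' that is
    followed by a comma, starting from the end, and appends `closing`.
    """
    try:
        return json.loads(review)
    except json.JSONDecodeError:
        pass

    # Candidate cut points: the index just past every '}' followed by ','.
    cut_points = [m.start() + 1 for m in re.finditer(r"\}\s*,", review)]

    # Walk backwards so that as many complete suggestions as possible are kept.
    for cut in reversed(cut_points[-max_iter:]):
        try:
            return json.loads(review[:cut] + closing)
        except json.JSONDecodeError:
            continue
    return {}  # nothing could be salvaged


# Made-up truncated input in the same shape as the test data above.
truncated = (
    '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, '
    '"PR Feedback": {"General PR suggestions": "...", "Code suggestions": ['
    '{"relevant file": "xxx.py", "suggestion content": "xxx [important]"}, '
    '{"relevant file": "yyy.py", "suggestion content": "yyy [incomp'
)

repaired = repair_truncated_review(truncated)
print(repaired["PR Feedback"]["Code suggestions"])
```

Bounding the number of attempts with `max_iter` keeps the worst case small when the truncated tail contains many stray `},` sequences, as in the "many close brackets" test.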


@@ -41,14 +41,6 @@ class TestParseCodeSuggestion:
 expected_output = "\n" # modified to expect a newline character
 assert parse_code_suggestion(input_data) == expected_output
-# Tests that function returns correct output when 'suggestion number' key has a non-integer value
-def test_non_integer_suggestion_number(self):
-input_data = {
-"Suggestion number": "one",
-"Description": "This is a suggestion"
-}
-expected_output = " **Description:** This is a suggestion\n\n"
-assert parse_code_suggestion(input_data) == expected_output
 # Tests that function returns correct output when 'before' or 'after' key has a non-string value
 def test_non_string_before_or_after(self):
@@ -64,7 +56,6 @@ class TestParseCodeSuggestion:
 # Tests that function returns correct output when input dictionary does not have 'code example' key
 def test_no_code_example_key(self):
 code_suggestions = {
-'suggestion number': 1,
 'suggestion': 'Suggestion 1',
 'description': 'Description 1',
 'before': 'Before 1',
@@ -76,7 +67,6 @@ class TestParseCodeSuggestion:
 # Tests that function returns correct output when input dictionary has 'code example' key
 def test_with_code_example_key(self):
 code_suggestions = {
-'suggestion number': 2,
 'suggestion': 'Suggestion 2',
 'description': 'Description 2',
 'code example': {