Github custom action development - WIP

2025-07-21 04:50:39 +08:00 · 2023-07-13 19:14:44 +03:00 · 2023-07-13 19:08:10 +03:00 · 2023-07-13 18:59:54 +03:00 · 2023-07-13 18:54:40 +03:00 · 2023-07-13 18:54:11 +03:00
23 changed files with 417 additions and 122 deletions
--- a/.github/workflows/review.yaml
+++ b/.github/workflows/review.yaml
@ -0,0 +1,16 @@
 on:
  pull_request:
  issue_comment:
 jobs:
  pr_agent_job:
    runs-on: ubuntu-latest
    name: Run pr agent on every pull request
    steps:
      - name: PR Agent action step
        id: pragent
        uses: Codium-ai/pr-agent@feature/github_action
        env:
          OPENAI_KEY: ${{ secrets.OPENAI_KEY }}
          OPENAI_ORG: ${{ secrets.OPENAI_ORG }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
--- a/Dockerfile.github_action
+++ b/Dockerfile.github_action
@ -0,0 +1,10 @@
 FROM python:3.10 as base
 WORKDIR /app
 ADD requirements.txt .
 RUN pip install -r requirements.txt && rm requirements.txt
 ENV PYTHONPATH=/app
 ADD pr_agent pr_agent
 ADD github_action/entrypoint.sh /
 RUN chmod +x /entrypoint.sh
 ENTRYPOINT ["/entrypoint.sh"]
--- a/README.md
+++ b/README.md
@ -183,7 +183,6 @@ Here is a quick overview of the different sub-tools of PR Reviewer:
 - PR Analysis
  - Summarize main theme
-  - PR description and title
  - PR type classification
  - Is the PR covered by relevant tests
  - Is this a focused PR
@ -199,7 +198,6 @@ This is how a typical output of the PR Reviewer looks like:
 #### PR Analysis
 - 🎯 **Main theme:** Adding language extension handler and token handler
-- 🔍 **Description and title:** Yes
 - 📌 **Type of PR:** Enhancement
 - 🧪 **Relevant tests added:** No
 - ✨ **Focused PR:** Yes, the PR is focused on adding two new handlers for language extension and token counting.
@ -250,45 +248,6 @@ require_tests_review=true
 require_security_review=true
 ```
-#### Code Suggestions configuration:
-There are also configuration options to control different aspects of the `code suggestions` feature.
-The number of suggestions provided can be controlled by adjusting the following parameter:
-```
-num_code_suggestions=4
-```
-You can also enable more verbose and informative mode of code suggestions:
-```
-extended_code_suggestions=false
-```
-This is a comparison of the regular and extended code suggestions modes:
-- **relevant file:** sql.py
-- **suggestion content:** Remove hardcoded sensitive information like username and password. Use environment variables or a secure method to store these values. [important]
-Example for extended suggestion:
-- **relevant file:** sql.py
-- **suggestion content:** Remove hardcoded sensitive information (username and password) [important]
-- **why:** Hardcoding sensitive information is a security risk. It's better to use environment variables or a secure way to store these values.
-- **code example:**
-  - **before code:**
-    ```
-    user = "root",
-    password = "Mysql@123",
-    ```
-  - **after code:**
-    ```
-    user = os.getenv('DB_USER'),
-    password = os.getenv('DB_PASSWORD'),
-    ```
 ---
 ## How it works
 ![PR-Agent Tools](./pics/pr_agent_overview.png)
--- a/action.yaml
+++ b/action.yaml
@ -0,0 +1,5 @@
 name: 'PR Agent'
 description: 'Summarize, review and suggest improvements for pull requests'
 runs:
  using: 'docker'
  image: 'Dockerfile.github_action'
--- a/github_action/entrypoint.sh
+++ b/github_action/entrypoint.sh
@ -0,0 +1,2 @@
 #!/bin/bash
 python /app/pr_agent/servers/github_action_runner.py
--- a/pr_agent/algo/ai_handler.py
+++ b/pr_agent/algo/ai_handler.py
@ -14,6 +14,13 @@ class AiHandler:
            openai.api_key = settings.openai.key
            if settings.get("OPENAI.ORG", None):
                openai.organization = settings.openai.org
            self.deployment_id = settings.get("OPENAI.DEPLOYMENT_ID", None)
            if settings.get("OPENAI.API_TYPE", None):
                openai.api_type = settings.openai.api_type
            if settings.get("OPENAI.API_VERSION", None):
                openai.api_version = settings.openai.api_version
            if settings.get("OPENAI.API_BASE", None):
                openai.api_base = settings.openai.api_base
        except AttributeError as e:
            raise ValueError("OpenAI key is required") from e
@ -23,6 +30,7 @@ class AiHandler:
        try:
            response = await openai.ChatCompletion.acreate(
                            model=model,
                            deployment_id=self.deployment_id,
                            messages=[
                                {"role": "system", "content": system},
                                {"role": "user", "content": user}
--- a/pr_agent/algo/pr_processing.py
+++ b/pr_agent/algo/pr_processing.py
@ -24,10 +24,10 @@ def get_pr_diff(git_provider: Union[GithubProvider, Any], token_handler: TokenHa
    Returns a string with the diff of the PR.
    If needed, apply diff minimization techniques to reduce the number of tokens
    """
-    files = list(git_provider.get_diff_files())
+    git_provider.pr.files = list(git_provider.get_diff_files())
    # get pr languages
-    pr_languages = sort_files_by_main_languages(git_provider.get_languages(), files)
+    pr_languages = sort_files_by_main_languages(git_provider.get_languages(), git_provider.pr.files)
    # generate a standard diff string, with patch extension
    patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler)
--- a/pr_agent/algo/utils.py
+++ b/pr_agent/algo/utils.py
@ -1,5 +1,8 @@
 from __future__ import annotations
 import json
 import logging
 import re
 import textwrap
@ -8,7 +11,6 @@ def convert_to_markdown(output_data: dict) -> str:
    emojis = {
        "Main theme": "🎯",
-        "Description and title": "🔍",
        "Type of PR": "📌",
        "Relevant tests added": "🧪",
        "Unrelated changes": "⚠️",
@ -50,10 +52,7 @@ def parse_code_suggestion(code_suggestions: dict) -> str:
                code_str_indented = textwrap.indent(code_str, '        ')
                markdown_text += f"    - **{code_key}:**\n{code_str_indented}\n"
        else:
-            if "suggestion number" in sub_key.lower():
-                # markdown_text += f"- **suggestion {sub_value}:**\n"  # prettier formatting
-                pass
-            elif "relevant file" in sub_key.lower():
+            if "relevant file" in sub_key.lower():
                markdown_text += f"\n  - **{sub_key}:** {sub_value}\n"
            else:
                markdown_text += f"   **{sub_key}:** {sub_value}\n"
@ -61,3 +60,25 @@ def parse_code_suggestion(code_suggestions: dict) -> str:
    markdown_text += "\n"
    return markdown_text
 def try_fix_json(review, max_iter=10):
    # Try to fix JSON if it is broken/incomplete: parse until the last valid code suggestion
    data = {}
    if review.rfind("'Code suggestions': [") > 0 or review.rfind('"Code suggestions": [') > 0:
        last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
        valid_json = False
        iter_count = 0
        while last_code_suggestion_ind > 0 and not valid_json and iter_count < max_iter:
            try:
                data = json.loads(review[:last_code_suggestion_ind] + "]}}")
                valid_json = True
                review = review[:last_code_suggestion_ind].strip() + "]}}"
            except json.decoder.JSONDecodeError:
                review = review[:last_code_suggestion_ind]
                # Use regular expression to find the last occurrence of "}," with any number of whitespaces or newlines
                last_code_suggestion_ind = [m.end() for m in re.finditer(r"\}\s*,", review)][-1] - 1
                iter_count += 1
        if not valid_json:
            logging.error("Unable to decode JSON response from AI")
            data = {}
    return data
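
Review note on `try_fix_json` (a sketch, not part of this commit): indexing the list comprehension with `[-1]` raises `IndexError` when the text contains no `},` at all, which can happen on the first lookup or after the loop has trimmed the string. The variant below walks the candidate cut points backwards instead. The name `repair_truncated_review` is hypothetical, and unlike the diff it does not first check for the `Code suggestions` marker.

```python
import json
import re


def repair_truncated_review(review: str, max_iter: int = 10) -> dict:
    """Cut the text back to the last complete suggestion and close the JSON."""
    # Offset of the comma in every "}," pair (whitespace allowed in between).
    # Each one is a possible end of a complete object in the suggestions array.
    cut_points = [m.end() - 1 for m in re.finditer(r"\}\s*,", review)]
    # Try the latest cut point first, then walk backwards, at most max_iter times.
    for cut in reversed(cut_points[-max_iter:]):
        candidate = review[:cut].strip() + "]}}"
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    # No cut point produced valid JSON (or there were none at all).
    return {}
```

With no `},` in the input, `cut_points` is empty and the function returns `{}` rather than raising.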
--- a/pr_agent/cli.py
+++ b/pr_agent/cli.py
@ -3,6 +3,7 @@ import asyncio
 import logging
 import os
 from pr_agent.tools.pr_description import PRDescription
 from pr_agent.tools.pr_questions import PRQuestions
 from pr_agent.tools.pr_reviewer import PRReviewer
@ -11,15 +12,20 @@ def run():
    parser = argparse.ArgumentParser(description='AI based pull request analyzer')
    parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
    parser.add_argument('--question', type=str, help='Optional question to ask', required=False)
    parser.add_argument('--pr_description', action='store_true', help='Generate a description for the PR', required=False)
    args = parser.parse_args()
    logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
    if args.question:
        print(f"Question: {args.question} about PR {args.pr_url}")
-        reviewer = PRQuestions(args.pr_url, args.question, installation_id=None)
+        reviewer = PRQuestions(args.pr_url, args.question)
        asyncio.run(reviewer.answer())
    elif args.pr_description:
        print(f"PR description: {args.pr_url}")
        reviewer = PRDescription(args.pr_url)
        asyncio.run(reviewer.describe())
    else:
        print(f"Reviewing PR: {args.pr_url}")
-        reviewer = PRReviewer(args.pr_url, installation_id=None, cli_mode=True)
+        reviewer = PRReviewer(args.pr_url, cli_mode=True)
        asyncio.run(reviewer.review())
--- a/pr_agent/config_loader.py
+++ b/pr_agent/config_loader.py
@ -11,6 +11,7 @@ settings = Dynaconf(
         "settings/configuration.toml",
         "settings/pr_reviewer_prompts.toml",
         "settings/pr_questions_prompts.toml",
         "settings/pr_description_prompts.toml",
         "settings_prod/.secrets.toml"
        ]]
 )
--- a/pr_agent/git_providers/git_provider.py
+++ b/pr_agent/git_providers/git_provider.py
@ -16,6 +16,10 @@ class GitProvider(ABC):
    def get_diff_files(self) -> list[FilePatchInfo]:
        pass
    @abstractmethod
    def publish_description(self, pr_title: str, pr_body: str):
        pass
    @abstractmethod
    def publish_comment(self, pr_comment: str, is_temporary: bool = False):
        pass
--- a/pr_agent/git_providers/github_provider.py
+++ b/pr_agent/git_providers/github_provider.py
@ -26,6 +26,8 @@ class GithubProvider:
        self.pr = self._get_pr()
    def get_files(self):
        if hasattr(self.pr, 'files'):
            return self.pr.files
        return self.pr.get_files()
    def get_diff_files(self) -> list[FilePatchInfo]:
@ -37,6 +39,10 @@ class GithubProvider:
            diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename))
        return diff_files
    def publish_description(self, pr_title: str, pr_body: str):
        self.pr.edit(title=pr_title, body=pr_body)
        # self.pr.create_issue_comment(pr_comment)
    def publish_comment(self, pr_comment: str, is_temporary: bool = False):
        response = self.pr.create_issue_comment(pr_comment)
        if hasattr(response, "user") and hasattr(response.user, "login"):
--- a/pr_agent/git_providers/gitlab_provider.py
+++ b/pr_agent/git_providers/gitlab_provider.py
@ -44,6 +44,10 @@ class GitLabProvider(GitProvider):
    def get_files(self):
        return [change['new_path'] for change in self.mr.changes()['changes']]
    def publish_description(self, pr_title: str, pr_body: str):
        logging.warning("Not implemented yet")
        pass
    def publish_comment(self, mr_comment: str, is_temporary: bool = False):
        comment = self.mr.notes.create({'body': mr_comment})
        if is_temporary:
--- a/pr_agent/servers/github_action_runner.py
+++ b/pr_agent/servers/github_action_runner.py
@ -0,0 +1,58 @@
 import asyncio
 import json
 import os
 from pr_agent.config_loader import settings
 from pr_agent.tools.pr_questions import PRQuestions
 from pr_agent.tools.pr_reviewer import PRReviewer
 async def run_action():
    GITHUB_EVENT_NAME = os.environ.get('GITHUB_EVENT_NAME', None)
    if not GITHUB_EVENT_NAME:
        print("GITHUB_EVENT_NAME not set")
        return
    GITHUB_EVENT_PATH = os.environ.get('GITHUB_EVENT_PATH', None)
    if not GITHUB_EVENT_PATH:
        print("GITHUB_EVENT_PATH not set")
        return
    with open(GITHUB_EVENT_PATH, 'r') as event_file:
        event_payload = json.load(event_file)
    RUNNER_DEBUG = os.environ.get('RUNNER_DEBUG', None)
    if not RUNNER_DEBUG:
        print("RUNNER_DEBUG not set")
    OPENAI_KEY = os.environ.get('OPENAI_KEY', None)
    if not OPENAI_KEY:
        print("OPENAI_KEY not set")
        return
    OPENAI_ORG = os.environ.get('OPENAI_ORG', None)
    GITHUB_TOKEN = os.environ.get('GITHUB_TOKEN', None)
    if not GITHUB_TOKEN:
        print("GITHUB_TOKEN not set")
        return
    settings.set("OPENAI.KEY", OPENAI_KEY)
    if OPENAI_ORG:
        settings.set("OPENAI.ORG", OPENAI_ORG)
    settings.set("GITHUB.USER_TOKEN", GITHUB_TOKEN)
    settings.set("GITHUB.DEPLOYMENT_TYPE", "user")
    if GITHUB_EVENT_NAME == "pull_request":
        action = event_payload.get("action", None)
        if action in ["opened", "reopened"]:
            pr_url = event_payload.get("pull_request", {}).get("url", None)
            if pr_url:
                await PRReviewer(pr_url).review()
    elif GITHUB_EVENT_NAME == "issue_comment":
        action = event_payload.get("action", None)
        if action in ["created", "edited"]:
            comment_body = event_payload.get("comment", {}).get("body", None)
            if comment_body:
                pr_url = event_payload.get("issue", {}).get("pull_request", {}).get("url", None)
                if pr_url:
                    if comment_body.strip().lower() == "review":
                        await PRReviewer(pr_url).review()
                    elif comment_body.lstrip().lower().startswith("answer"):
                        await PRQuestions(pr_url, comment_body).answer()
 if __name__ == '__main__':
    asyncio.run(run_action())
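
Review note on the runner (a sketch, not part of this commit): the event handling above could be pulled into a pure function so it can be checked without calling OpenAI or GitHub. `resolve_command` and its return shape are hypothetical; the event names, actions, and payload keys are the ones the runner already reads.

```python
from typing import Optional, Tuple

Command = Tuple[str, str, Optional[str]]  # (tool, pr_url, comment_body)


def resolve_command(event_name: str, payload: dict) -> Optional[Command]:
    """Map a GitHub event to the tool that should run, or None to do nothing."""
    action = payload.get("action")
    if event_name == "pull_request" and action in ("opened", "reopened"):
        pr_url = payload.get("pull_request", {}).get("url")
        return ("review", pr_url, None) if pr_url else None
    if event_name == "issue_comment" and action in ("created", "edited"):
        body = payload.get("comment", {}).get("body") or ""
        pr_url = payload.get("issue", {}).get("pull_request", {}).get("url")
        if not pr_url:
            return None  # comment on a plain issue, not on a pull request
        if body.strip().lower() == "review":
            return ("review", pr_url, None)
        if body.lstrip().lower().startswith("answer"):
            return ("answer", pr_url, body)
    return None
```

`run_action` would then only need to translate the returned tuple into `PRReviewer(pr_url).review()` or `PRQuestions(pr_url, body).answer()`.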
--- a/pr_agent/settings/.secrets_template.toml
+++ b/pr_agent/settings/.secrets_template.toml
@ -9,6 +9,11 @@
 [openai]
 key = "<API_KEY>"  # Acquire through https://platform.openai.com
 org = "<ORGANIZATION>"  # Optional, may be commented out.
 # Uncomment the following for Azure OpenAI
 #api_type = "azure"
 #api_version = '2023-05-15'  # Check Azure documentation for the current API version
 #api_base = "<API_BASE>"  # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
 #deployment_id = "<DEPLOYMENT_ID>"  # The deployment name you chose when you deployed the engine
 [github]
 # ---- Set the following only for deployment type == "user"
--- a/pr_agent/settings/configuration.toml
+++ b/pr_agent/settings/configuration.toml
@ -2,14 +2,14 @@
 model="gpt-4-0613"
 git_provider="github"
 publish_review=true
-verbosity_level=0  # 0,1,2
+verbosity_level=0 # 0,1,2
 [pr_reviewer]
 require_focused_review=true
 require_tests_review=true
 require_security_review=true
-extended_code_suggestions=false
 num_code_suggestions=4
+inline_code_comments = true
 [pr_questions]
@ -19,7 +19,7 @@ deployment_type = "user"
 [gitlab]
 # URL to the gitlab service
-gitlab_url = "https://gitlab.com"
+url = "https://gitlab.com"
 # Polling (either project id or namespace/project_name) syntax can be used
 projects_to_monitor = ['org_name/repo_name']
--- a/pr_agent/settings/pr_description_prompts.toml
+++ b/pr_agent/settings/pr_description_prompts.toml
@ -0,0 +1,45 @@
 [pr_description_prompt]
 system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
 Your task is to provide full description of the PR content.
 - Make sure to focus on the new PR code (the '+' lines).
 You must use the following JSON schema to format your answer:
 ```json
 {
  "PR Title": {
      "type": "string",
      "description": "an informative title for the PR, describing its main theme"
  },
  "Type of PR": {
      "type": "string",
      "enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
    },
  "PR Description": {
      "type": "string",
      "description": "an informative and concise description of the PR"
  },
  "PR Main Files Walkthrough": {
      "type": "string",
      "description": "a walkthrough of the PR changes. Review main files, in bullet points, and shortly describe the changes in each file (up to 10 most important files). Format: -`filename`: description of changes\n..."
  }
 }
 Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
 """
 user="""PR Info:
 Branch: '{{branch}}'
 {%- if language %}
 Main language: {{language}}
 {%- endif %}
 The PR Git Diff:
 ```
 {{diff}}
 ```
 Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
 Response (should be a valid JSON, and nothing else):
 ```json
 """
--- a/pr_agent/settings/pr_reviewer_prompts.toml
+++ b/pr_agent/settings/pr_reviewer_prompts.toml
@ -3,9 +3,6 @@ system="""You are CodiumAI-PR-Reviewer, a language model designed to review git
 Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines).
 - Provide up to {{ num_code_suggestions }} code suggestions.
 - Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
-{%- if extended_code_suggestions %}
-- For each suggestion, provide a short and concise code snippet to illustrate the existing code, and the improved code.
-{%- endif %}
 - Make sure not to provide suggestion repeating modifications already implemented in the new PR code (the '+' lines).
 You must use the following JSON schema to format your answer:
@ -16,10 +13,6 @@ You must use the following JSON schema to format your answer:
      "type": "string",
      "description": "a short explanation of the PR"
    },
-    "Description and title": {
-      "type": "string",
-      "description": "yes\\no question: does this PR have a relevant description and title"
-    },
    "Type of PR": {
      "type": "string",
      "enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
@ -40,48 +33,25 @@ You must use the following JSON schema to format your answer:
  "PR Feedback": {
    "General PR suggestions": {
      "type": "string",
-      "description": "important suggestions for the contributors and maintainers of this PR, may include overall structure, primary purpose and best practices. consider using specific filenames, classes and functions names. explain yourself!"
+      "description": "General suggestions and feedback for the contributors and maintainers of this PR. May include important suggestions for the overall structure, primary purpose, best practices, critical bugs, and other aspects of the PR. Explain your suggestions."
    },
    "Code suggestions": {
      "type": "array",
      "maxItems": {{ num_code_suggestions }},
      "uniqueItems": true,
      "items": {
-        "suggestion number": {
-          "type": "int",
-          "description": "suggestion number, starting from 1"
-        },
        "relevant file": {
          "type": "string",
-          "description": "the relevant file name"
+          "description": "the relevant file full path"
        },
        "suggestion content": {
          "type": "string",
-{%- if extended_code_suggestions %}
-          "description": "a concrete suggestion for meaningfully improving the new PR code. Don't repeat previous suggestions. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
-{%- else %}
          "description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like.
-{%- endif %}
        },
-{%- if extended_code_suggestions %}
-        "why": {
+        "relevant line in file": {
          "type": "string",
-          "description": "shortly explain why this suggestion is important"
+          "description": "an authentic single code line from the PR git diff section, to which the suggestion applies."
        },
-        "code example": {
-          "type": "object",
-          "properties": {
-            "before code": {
-              "type": "string",
-              "description": "Short and concise code snippet, to illustrate the existing code"
-            },
-            "after code": {
-              "type": "string",
-              "description": "Short and concise code snippet, to illustrate the improved code"
-            }
-          }
-        }
-{%- endif %}
      }
    },
 {%- if require_security %}
@ -101,7 +71,6 @@ Example output:
    "PR Analysis":
    {
        "Main theme": "xxx",
-        "Description and title": "Yes",
        "Type of PR": "Bug fix",
 {%- if require_tests %}
        "Relevant tests added": "No",
@ -115,17 +84,9 @@ Example output:
        "General PR suggestions": "..., `xxx`...",
        "Code suggestions": [
            {
-                "suggestion number": 1,
-                "relevant file": "xxx.py",
+                "relevant file": "directory/xxx.py",
                "suggestion content": "xxx [important]",
-{%- if extended_code_suggestions %}
-                "why": "xxx",
-                "code example":
-                {
-                    "before code": "xxx",
-                    "after code": "xxx"
-                }
-{%- endif %}
+                "relevant line in file": "xxx",
            },
            ...
        ]
--- a/pr_agent/tools/pr_description.py
+++ b/pr_agent/tools/pr_description.py
@ -0,0 +1,83 @@
 import copy
 import json
 import logging
 from jinja2 import Environment, StrictUndefined
 from pr_agent.algo.ai_handler import AiHandler
 from pr_agent.algo.pr_processing import get_pr_diff
 from pr_agent.algo.token_handler import TokenHandler
 from pr_agent.algo.utils import convert_to_markdown
 from pr_agent.config_loader import settings
 from pr_agent.git_providers import get_git_provider
 from pr_agent.git_providers.git_provider import get_main_pr_language
 class PRDescription:
    def __init__(self, pr_url: str):
        self.git_provider = get_git_provider()(pr_url)
        self.main_pr_language = get_main_pr_language(
            self.git_provider.get_languages(), self.git_provider.get_files()
        )
        self.ai_handler = AiHandler()
        self.vars = {
            "title": self.git_provider.pr.title,
            "branch": self.git_provider.get_pr_branch(),
            "description": self.git_provider.get_description(),
            "language": self.main_pr_language,
            "diff": "",  # empty diff for initial calculation
        }
        self.token_handler = TokenHandler(self.git_provider.pr,
                                          self.vars,
                                          settings.pr_description_prompt.system,
                                          settings.pr_description_prompt.user)
        self.patches_diff = None
        self.prediction = None
    async def describe(self):
        logging.info('Generating a PR description...')
        if settings.config.publish_review:
            self.git_provider.publish_comment("Preparing PR description...", is_temporary=True)
        logging.info('Getting PR diff...')
        self.patches_diff = get_pr_diff(self.git_provider, self.token_handler)
        logging.info('Getting AI prediction...')
        self.prediction = await self._get_prediction()
        logging.info('Preparing answer...')
        pr_title, pr_body = self._prepare_pr_answer()
        if settings.config.publish_review:
            logging.info('Pushing answer...')
            self.git_provider.publish_description(pr_title, pr_body)
            self.git_provider.remove_initial_comment()
        return ""
    async def _get_prediction(self):
        variables = copy.deepcopy(self.vars)
        variables["diff"] = self.patches_diff  # update diff
        environment = Environment(undefined=StrictUndefined)
        system_prompt = environment.from_string(settings.pr_description_prompt.system).render(variables)
        user_prompt = environment.from_string(settings.pr_description_prompt.user).render(variables)
        if settings.config.verbosity_level >= 2:
            logging.info(f"\nSystem prompt:\n{system_prompt}")
            logging.info(f"\nUser prompt:\n{user_prompt}")
        model = settings.config.model
        response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
                                                                        system=system_prompt, user=user_prompt)
        return response
    def _prepare_pr_answer(self):
        data = json.loads(self.prediction)
        pr_body = ""
        # for key, value in data.items():
        #     markdown_text += f"## {key}\n\n"
        #     markdown_text += f"{value}\n\n"
        title = data['PR Title']
        del data['PR Title']
        for key, value in data.items():
            pr_body += f"{key}:\n"
            if 'walkthrough' in key.lower():
                pr_body += f"{value}\n"
            else:
                pr_body += f"**{value}**\n\n___\n"
        if settings.config.verbosity_level >= 2:
            logging.info(f"title:\n{title}\n{pr_body}")
        return title, pr_body
--- a/pr_agent/tools/pr_reviewer.py
+++ b/pr_agent/tools/pr_reviewer.py
@ -7,7 +7,7 @@ from jinja2 import Environment, StrictUndefined
 from pr_agent.algo.ai_handler import AiHandler
 from pr_agent.algo.pr_processing import get_pr_diff
 from pr_agent.algo.token_handler import TokenHandler
-from pr_agent.algo.utils import convert_to_markdown
+from pr_agent.algo.utils import convert_to_markdown, try_fix_json
 from pr_agent.config_loader import settings
 from pr_agent.git_providers import get_git_provider
 from pr_agent.git_providers.git_provider import get_main_pr_language
@ -33,7 +33,6 @@ class PRReviewer:
            "require_tests": settings.pr_reviewer.require_tests_review,
            "require_security": settings.pr_reviewer.require_security_review,
            "require_focused": settings.pr_reviewer.require_focused_review,
-            'extended_code_suggestions': settings.pr_reviewer.extended_code_suggestions,
            'num_code_suggestions': settings.pr_reviewer.num_code_suggestions,
        }
        self.token_handler = TokenHandler(self.git_provider.pr,
@ -55,6 +54,9 @@ class PRReviewer:
            logging.info('Pushing PR review...')
            self.git_provider.publish_comment(pr_comment)
            self.git_provider.remove_initial_comment()
            if settings.pr_reviewer.inline_code_comments:
                logging.info('Pushing inline code comments...')
                self._publish_inline_code_comments()
        return ""
    async def _get_prediction(self):
@ -69,11 +71,7 @@ class PRReviewer:
        model = settings.config.model
        response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
                                                                        system=system_prompt, user=user_prompt)
-        try:
-            json.loads(response)
-        except json.decoder.JSONDecodeError:
-            logging.warning("Could not decode JSON")
-            response = {}
+
        return response
    def _prepare_pr_review(self) -> str:
@ -81,8 +79,7 @@ class PRReviewer:
        try:
            data = json.loads(review)
        except json.decoder.JSONDecodeError:
-            logging.error("Unable to decode JSON response from AI")
-            data = {}
+            data = try_fix_json(review)
        # reordering for nicer display
        if 'PR Feedback' in data:
@ -91,6 +88,9 @@ class PRReviewer:
                del data['PR Feedback']['Security concerns']
                data['PR Analysis']['Security concerns'] = val
        if settings.config.git_provider == 'github' and settings.pr_reviewer.inline_code_comments:
            del data['PR Feedback']['Code suggestions']
        markdown_text = convert_to_markdown(data)
        user = self.git_provider.get_user_id()
@ -109,3 +109,36 @@ class PRReviewer:
        if settings.config.verbosity_level >= 2:
            logging.info(f"Markdown response:\n{markdown_text}")
        return markdown_text
    def _publish_inline_code_comments(self):
        if settings.config.git_provider != 'github': # inline comments are currently only supported for github
            return
        review = self.prediction.strip()
        try:
            data = json.loads(review)
        except json.decoder.JSONDecodeError:
            data = try_fix_json(review)
        pr = self.git_provider.pr
        last_commit_id = list(pr.get_commits())[-1]
        files = list(self.git_provider.get_diff_files())
        for d in data['PR Feedback']['Code suggestions']:
            relevant_file = d['relevant file'].strip()
            relevant_line_in_file = d['relevant line in file'].strip()
            content = d['suggestion content']
            position = -1
            for file in files:
                if file.filename.strip() == relevant_file:
                    patch = file.patch
                    patch_lines = patch.splitlines()
                    for i, line in enumerate(patch_lines):
                        if relevant_line_in_file in line:
                            position = i
            if position == -1:
                logging.info(f"Could not find position for {relevant_file} {relevant_line_in_file}")
            else:
                body = content
                path = relevant_file.strip()
                pr.create_review_comment(body=body, commit_id=last_commit_id, path=path, position=position)
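
Review note on `_publish_inline_code_comments` (a sketch, not part of this commit): the lookup in the diff matches the quoted line against every patch line and keeps the last hit. The hypothetical helper `find_patch_position` below isolates that search and, as an assumed tightening rather than the commit's behavior, only accepts added (`+`) lines and stops at the first match. As far as I understand GitHub's `position` parameter, it counts lines from the first hunk header, with the line just below the `@@` line being position 1, so the index into `patch.splitlines()` lines up with it when the patch starts with a hunk header.

```python
def find_patch_position(patch: str, relevant_line: str) -> int:
    """Return the index of the first added line containing relevant_line, or -1."""
    for index, line in enumerate(patch.splitlines()):
        # Index 0 is expected to be the "@@ ... @@" hunk header.
        if line.startswith("+") and relevant_line in line:
            return index
    return -1
```

Restricting the match to `+` lines keeps a suggestion from being attached to an unchanged or deleted line that happens to contain the same text.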
--- a/tests/unit/test_convert_to_markdown.py
+++ b/tests/unit/test_convert_to_markdown.py
@ -46,7 +46,6 @@ class TestConvertToMarkdown:
    def test_simple_dictionary_input(self):
        input_data = {
            'Main theme': 'Test',
-            'Description and title': 'Test description',
            'Type of PR': 'Test type',
            'Relevant tests added': 'no',
            'Unrelated changes': 'n/a',  # won't be included in the output
@ -54,14 +53,12 @@ class TestConvertToMarkdown:
            'General PR suggestions': 'general suggestion...',
            'Code suggestions': [
                {
-                    'Suggestion number': 1,
                    'Code example': {
                        'Before': 'Code before',
                        'After': 'Code after'
                    }
                },
                {
-                    'Suggestion number': 2,
                    'Code example': {
                        'Before': 'Code before 2',
                        'After': 'Code after 2'
@ -71,7 +68,6 @@ class TestConvertToMarkdown:
        }
        expected_output = """\
 - 🎯 **Main theme:** Test
-- 🔍 **Description and title:** Test description
 - 📌 **Type of PR:** Test type
 - 🧪 **Relevant tests added:** no
 - ✨ **Focused PR:** Yes
@ -110,7 +106,6 @@ class TestConvertToMarkdown:
    def test_dictionary_input_containing_only_empty_dictionaries(self):
        input_data = {
            'Main theme': {},
-            'Description and title': {},
            'Type of PR': {},
            'Relevant tests added': {},
            'Unrelated changes': {},
--- a/tests/unit/test_fix_output.py
+++ b/tests/unit/test_fix_output.py
@ -0,0 +1,83 @@
 # Generated by CodiumAI
 from pr_agent.algo.utils import try_fix_json
 import pytest
 class TestTryFixJson:
    # Tests that JSON with an incomplete 'Code suggestions' section is repaired and returns the expected output
    def test_incomplete_code_suggestions(self):
        review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"}, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy [incomp...'
        expected_output = {
            'PR Analysis': {
                'Main theme': 'xxx',
                'Type of PR': 'Bug fix'
            },
            'PR Feedback': {
                'General PR suggestions': '..., `xxx`...',
                'Code suggestions': [
                    {
                        'relevant file': 'xxx.py',
                        'suggestion content': 'xxx [important]'
                    }
                ]
            }
        }
        assert try_fix_json(review) == expected_output
    def test_incomplete_code_suggestions_new_line(self):
        review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"} \n\t, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy [incomp...'
        expected_output = {
            'PR Analysis': {
                'Main theme': 'xxx',
                'Type of PR': 'Bug fix'
            },
            'PR Feedback': {
                'General PR suggestions': '..., `xxx`...',
                'Code suggestions': [
                    {
                        'relevant file': 'xxx.py',
                        'suggestion content': 'xxx [important]'
                    }
                ]
            }
        }
        assert try_fix_json(review) == expected_output
    def test_incomplete_code_suggestions_many_close_brackets(self):
        review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"} \n, {"suggestion number": 2, "relevant file": "yyy.py", "suggestion content": "yyy }, [}\n ,incomp.}  ,..'
        expected_output = {
            'PR Analysis': {
                'Main theme': 'xxx',
                'Type of PR': 'Bug fix'
            },
            'PR Feedback': {
                'General PR suggestions': '..., `xxx`...',
                'Code suggestions': [
                    {
                        'relevant file': 'xxx.py',
                        'suggestion content': 'xxx [important]'
                    }
                ]
            }
        }
        assert try_fix_json(review) == expected_output
    def test_incomplete_code_suggestions_relevant_file(self):
        review = '{"PR Analysis": {"Main theme": "xxx", "Type of PR": "Bug fix"}, "PR Feedback": {"General PR suggestions": "..., `xxx`...", "Code suggestions": [{"relevant file": "xxx.py", "suggestion content": "xxx [important]"}, {"suggestion number": 2, "relevant file": "yyy.p'
        expected_output = {
            'PR Analysis': {
                'Main theme': 'xxx',
                'Type of PR': 'Bug fix'
            },
            'PR Feedback': {
                'General PR suggestions': '..., `xxx`...',
                'Code suggestions': [
                    {
                        'relevant file': 'xxx.py',
                        'suggestion content': 'xxx [important]'
                    }
                ]
            }
        }
        assert try_fix_json(review) == expected_output
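The tests above all feed `try_fix_json` a review string that is cut off partway through the second code suggestion and expect only the first, complete suggestion to survive. One way such a repair could work is sketched below. It is a minimal illustration under stated assumptions, not the project's actual `try_fix_json`: the function name `repair_truncated_json`, the `closing` suffix, and the `max_iter` default are all hypothetical. The idea is to cut the text back to the last `}` that is followed by a comma, append the closing brackets, and retry until the result parses.

```python
import json
import re

# Hypothetical helper for illustration only; not pr_agent's try_fix_json.
# A "}" followed by optional whitespace (spaces, tabs, newlines) and a comma
# is treated as the possible end of a complete list item.
ITEM_END = re.compile(r"\}\s*,")


def repair_truncated_json(text, closing="]}}", max_iter=10):
    """Try to parse `text`; if it is truncated, drop trailing partial items.

    `closing` is the assumed suffix needed to close the list and the two
    enclosing objects, matching the nesting used in the tests above.
    Returns the parsed data, or an empty dict if no repair succeeds.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    candidate = text
    for _ in range(max_iter):
        ends = [m.end() for m in ITEM_END.finditer(candidate)]
        if not ends:
            break
        # Keep everything up to (but not including) the last such comma.
        candidate = candidate[: ends[-1] - 1]
        try:
            return json.loads(candidate + closing)
        except json.JSONDecodeError:
            # The cut landed inside a string or a partial item; cut further back.
            continue
    return {}


if __name__ == "__main__":
    review = (
        '{"PR Analysis": {"Main theme": "xxx"}, "PR Feedback": '
        '{"Code suggestions": [{"relevant file": "xxx.py"} \n\t, '
        '{"relevant file": "yyy.p'
    )
    print(repair_truncated_json(review))
```

Each failed attempt shortens the candidate, so the loop always makes progress, and `max_iter` bounds the work on input containing many stray `},` sequences, as in the `many_close_brackets` case.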
--- a/tests/unit/test_parse_code_suggestion.py
+++ b/tests/unit/test_parse_code_suggestion.py
@ -41,14 +41,6 @@ class TestParseCodeSuggestion:
        expected_output = "\n"  # modified to expect a newline character
        assert parse_code_suggestion(input_data) == expected_output
    # Tests that function returns correct output when 'suggestion number' key has a non-integer value
    def test_non_integer_suggestion_number(self):
        input_data = {
            "Suggestion number": "one",
            "Description": "This is a suggestion"
        }
        expected_output = "   **Description:** This is a suggestion\n\n"
        assert parse_code_suggestion(input_data) == expected_output
    # Tests that function returns correct output when 'before' or 'after' key has a non-string value
    def test_non_string_before_or_after(self):
@ -64,7 +56,6 @@ class TestParseCodeSuggestion:
    # Tests that function returns correct output when input dictionary does not have 'code example' key
    def test_no_code_example_key(self):
        code_suggestions = {
            'suggestion number': 1,
            'suggestion': 'Suggestion 1',
            'description': 'Description 1',
            'before': 'Before 1',
@ -76,7 +67,6 @@ class TestParseCodeSuggestion:
    # Tests that function returns correct output when input dictionary has 'code example' key
    def test_with_code_example_key(self):
        code_suggestions = {
            'suggestion number': 2,
            'suggestion': 'Suggestion 2',
            'description': 'Description 2',
            'code example': {
Author	SHA1	Message	Date
Ori Kotek	ea1cd7ae45	Github custom action development - WIP	2023-07-13 19:14:44 +03:00
Ori Kotek	1c1aad2806	Github custom action development - WIP	2023-07-13 19:08:10 +03:00
Ori Kotek	f466d79031	Github custom action development - WIP	2023-07-13 18:59:54 +03:00
Ori Kotek	e2323dfb9f	Github custom action development - WIP	2023-07-13 18:54:40 +03:00
Ori Kotek	e51e443adc	Github custom action development - WIP	2023-07-13 18:54:11 +03:00
Ori Kotek	f6d4a214ca	Github custom action development - WIP	2023-07-13 18:40:03 +03:00
Ori Kotek	4bb46d9faa	Github custom action development - WIP	2023-07-13 18:37:32 +03:00
Ori Kotek	f337d76af6	Github custom action development - WIP	2023-07-13 18:32:28 +03:00
Ori Kotek	4033303c1f	Github custom action development - WIP	2023-07-13 18:18:23 +03:00
Ori Kotek	38c8d187d2	Github custom action development - WIP	2023-07-13 18:16:25 +03:00
Ori Kotek	f8ddfd2f25	Merge remote-tracking branch 'origin/tr/description_tool' into feature/github_action	2023-07-13 18:06:35 +03:00
mrT23	4b4fda37a6	publish_description as abstract method	2023-07-13 18:04:28 +03:00
Ori Kotek	9ca6b789a7	Github custom action development - WIP	2023-07-13 18:02:38 +03:00
mrT23	0f73f5f906	set as title	2023-07-13 17:53:17 +03:00
Ori Kotek	5742a9be1e	Github custom action development	2023-07-13 17:46:12 +03:00
mrT23	914cc6639a	ignore current title	2023-07-13 17:34:18 +03:00
mrT23	f34cda126a	stable	2023-07-13 17:31:28 +03:00
mrT23	dece20c984	PRDescription	2023-07-13 17:24:56 +03:00
mrT23	94c1f430af	General PR suggestions prompt	2023-07-13 16:34:56 +03:00
mrT23	9fadde388b	remove title and description	2023-07-13 16:26:33 +03:00
mrT23	d1b6b3bc95	Merge pull request #43 from Codium-ai/tr/inline_code_suggestions Tr/inline code suggestions	2023-07-13 10:48:42 +03:00
mrT23	77a451ada0	inline_code_comments	2023-07-13 09:44:33 +03:00
mrT23	4b8420aa16	remove suggestion number	2023-07-13 08:10:36 +03:00
Ori Kotek	25bc69f70e	Merge pull request #41 from Codium-ai/gitlab_small_fix Update gitlab config	2023-07-12 18:16:43 +03:00
Hussam Lawen	e2faf117c5	Update gitlab config	2023-07-12 18:02:28 +03:00
Hussam Lawen	aaff03bb60	Merge pull request #40 from Codium-ai/feature/support_azure_openai Add Azure OpenAI support	2023-07-12 13:37:00 +03:00
Ori Kotek	cd1e62ec96	Add Azure OpenAI support	2023-07-12 11:53:46 +03:00
Ori Kotek	7767cae181	Merge pull request #39 from Codium-ai/bugfix/cli Remove installation_id from cli	2023-07-12 11:31:43 +03:00
Ori Kotek	1bc206e7b2	Remove installation_id from cli	2023-07-12 11:31:06 +03:00
Hussam Lawen	52a438b3c8	Merge pull request #38 from Codium-ai/hl/try_fix_when_broken_output Try to fix json output when it's broken or incomplete	2023-07-11 22:23:07 +03:00
Hussam.lawen	b8a71b369d	add max_iter	2023-07-11 22:22:08 +03:00
Hussam.lawen	72af2a1f9c	Add tests	2023-07-11 22:11:55 +03:00
Hussam.lawen	fd4a2bf7ff	refactor try_fix_json, generalize finding the ending of a json item (support new lines, spaces tab)	2023-07-11 22:11:42 +03:00
Hussam.lawen	a3211d4958	Merge commit '210d94f2aa6ebf872b9b85051d1842c32d4fc34e' into hl/try_fix_when_broken_output	2023-07-11 17:33:02 +03:00
Hussam.lawen	86d7ed5f82	Try to fix broken json output	2023-07-11 17:32:48 +03:00
Ori Kotek	210d94f2aa	Merge pull request #24 from Xyand/feature/gitlab_provider Feature/gitlab provider	2023-07-11 16:56:44 +03:00
@ -0,0 +1,2 @@
 #!/bin/bash
 python /app/pr_agent/servers/github_action_runner.py