Initial commit - PR-Agent OSS release

2025-07-21 04:50:39 +08:00 · 2023-07-06 00:21:08 +03:00
commit 4b4d91dfe9
44 changed files with 2426 additions and 0 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -0,0 +1 @@
+venv/
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,4 @@
+.idea/
+venv/
+pr_agent/settings/.secrets.toml
+__pycache__
--- a/0
+++ b/0
--- a/202
+++ b/202
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [2023] [Codium ltd]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/README.md
+++ b/README.md
@@ -0,0 +1,283 @@
+<div align="center">
+
+# 🛡️ CodiumAI PR-Agent
+[![GitHub license](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/Codium-ai/pr-agent/blob/main/LICENSE)
+[![Discord](https://badgen.net/badge/icon/discord?icon=discord&label&color=purple)](https://discord.com/channels/1057273017547378788/1126104260430528613)
+
+CodiumAI `PR-Agent` is an open-source tool that helps developers review PRs faster and more efficiently. 
+It automatically analyzes the PR, provides feedback and suggestions, and can answer questions about it.
+It is powered by GPT-4, and is based on the [CodiumAI](https://github.com/Codium-ai/) platform.
+</div>
+
+TBD: Add screenshot of the PR Reviewer (could be gif)
+
+
+* [Quickstart](#quickstart)
+* [Configuration](#configuration)
+* [Usage and Tools](#usage-and-tools)
+* [Roadmap](#roadmap)
+* [Similar projects](#similar-projects)
+* Additional files:
+  * CONTRIBUTION.md
+  * LICENSE
+
+## Quickstart
+
+To get started with PR-Agent quickly, you first need to acquire two tokens:
+1. An OpenAI key from [here](https://platform.openai.com/), with access to GPT-4.
+2. A GitHub personal access token (classic) with the repo scope.
+
+There are several ways to use PR-Agent. Let's start with the simplest one:
+
+---
+
+### Method 1: Use Docker image (no installation required)
+
+To request a review for a PR, or ask a question about a PR, you can run the appropriate
+Python scripts from the scripts folder. Here's how:
+
+1. To request a review for a PR, run the following command:
+```
+docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent \
+python pr_agent/scripts/review_pr_from_url.py --pr_url <pr url>
+```
+
+---
+
+2. To ask a question about a PR, run the following command:
+```
+docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent \
+python pr_agent/scripts/answer_pr_questions_from_url.py --pr_url <pr url> --question "<your question>"
+```
+
+Possible questions you can ask include:
+- What is the main theme of this PR?
+- Is the PR ready for merge?
+- What are the main changes in this PR?
+- Should this PR be split into smaller parts?
+- Can you compose a rhymed song about this PR?
+
+---
+
+### Method 2: Run from source
+
+1. Clone this repository:
+```
+git clone https://github.com/Codium-ai/pr-agent.git
+```
+
+2. Install the requirements in your favorite virtual environment:
+```
+pip install -r requirements.txt
+```
+
+3. Copy the secrets template file and fill in your OpenAI key and your GitHub user token:
+```
+cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
+# Edit the .secrets.toml file
+```
+
+4. Run the appropriate Python scripts from the scripts folder:
+```
+python pr_agent/scripts/review_pr_from_url.py --pr_url <pr url>
+python pr_agent/scripts/answer_pr_questions_from_url.py --pr_url <pr url> --question "<your question>"
+```
+
+---
+
+### Method 3: Run as a polling server; request reviews by tagging your GitHub user on a PR
+
+Follow steps 1-3 of method 2.
+Run the following command to start the server:
+```
+python pr_agent/servers/github_polling.py
+```
+
+---
+
+### Method 4: Run as a GitHub App, allowing you to automate the review process on your private or public repositories
+
+1. Create a GitHub App from the [GitHub Developer Portal](https://docs.github.com/en/developers/apps/creating-a-github-app).
+   - Set the following permissions:
+     - Pull requests: Read & write
+     - Issue comment: Read & write
+     - Metadata: Read-only
+   - Set the following events:
+     - Issue comment
+     - Pull request
+
+2. Generate a random secret for your app, and save it for later. For example, you can use:
+```
+WEBHOOK_SECRET=$(python -c "import secrets; print(secrets.token_hex(10))")
+```
+
+3. Acquire the following pieces of information from your app's settings page:
+   - App private key (click "Generate a private key", and save the file)
+   - App ID
+
+4. Clone this repository:
+```
+git clone https://github.com/Codium-ai/pr-agent.git
+```
+
+5. Copy the secrets template file and fill in the following:
+   - Your OpenAI key.
+   - Set deployment_type to 'app'
+   - Copy your app's private key to the private_key field.
+   - Copy your app's ID to the app_id field.
+   - Copy your app's webhook secret to the webhook_secret field.
+```
+cp pr_agent/settings/.secrets_template.toml pr_agent/settings/.secrets.toml
+# Edit the .secrets.toml file
+```
+
+6. Build a Docker image for the app and optionally push it to a Docker repository. We'll use Docker Hub as an example:
+```
+docker build . -t codiumai/pr-agent:github_app --target github_app -f docker/Dockerfile
+docker push codiumai/pr-agent:github_app  # Push to your Docker repository
+```
+
+7. Host the app using a server, serverless function, or container environment. Alternatively, for development and 
+   debugging, you may use tools like smee.io to forward webhooks to your local machine. 
+
+8. Go back to your app's settings, set the following:
+   - Webhook URL: The URL of your app's server, or the URL of the smee.io channel.
+   - Webhook secret: The secret you generated earlier.
+
+9. Install the app by navigating to the "Install App" tab, and selecting your desired repositories.
+
+---
+
+## Usage and Tools
+CodiumAI PR-Agent provides two types of interactions ("tools"): `"PR Reviewer"` and `"PR Q&A"`.
+- The "PR Reviewer" tool automatically analyzes PRs, and provides different types of feedback.
+- The "PR Q&A" tool answers free-text questions about the PR.
+
+### PR Reviewer
+Here is a quick overview of the different sub-tools of PR Reviewer:
+
+- PR Analysis
+  - Summarize main theme
+  - PR description and title
+  - PR type classification
+  - Is the PR covered by relevant tests
+  - Is the PR minimal and focused
+- PR Feedback
+  - General PR suggestions
+  - Code suggestions
+  - Security concerns
+
+This is what a typical output of the PR Reviewer looks like:
+
+---
+#### PR Analysis
+
+- 🎯 **Main theme:** Adding language extension handler and token handler
+- 🔍 **Description and title:** Yes
+- 📌 **Type of PR:** Enhancement
+- 🧪 **Relevant tests added:** No
+- ✨ **Minimal and focused:** Yes, the PR is focused on adding two new handlers for language extension and token counting.
+#### PR Feedback
+
+- 💡 **General PR suggestions:** The PR is generally well-structured and the code is clean. However, it would be beneficial to add some tests to ensure the new handlers work as expected. Also, consider adding docstrings to the new functions and classes to improve code readability and maintainability.
+
+- 🤖 **Code suggestions:**
+
+- **suggestion 1:**
+  - **relevant file:** pr_agent/algo/language_handler.py
+  - **suggestion content:** Consider using a set instead of a list for 'bad_extensions' as checking membership in a set is faster than in a list. [medium]
+
+- **suggestion 2:**
+  - **relevant file:** pr_agent/algo/language_handler.py
+  - **suggestion content:** In the 'filter_bad_extensions' function, you are splitting the filename on '.' and taking the last element to get the extension. This might not work as expected if the filename contains multiple '.' characters. Consider using 'os.path.splitext' to get the file extension more reliably. [important]
+
+- 🔒 **Security concerns:** No, the PR does not introduce possible security concerns or issues.
+
+---
+
+
+### PR Q&A
+This tool answers free-text questions about the PR. This is what a typical output of the PR Q&A looks like:
+
+---
+**Question**: summarize for me the PR in 4 bullet points
+
+**Answer**: 
+- The PR introduces a new feature to sort files by their main languages. It uses a mapping of programming languages to their file extensions to achieve this.
+- It also introduces a filter to exclude files with certain extensions, deemed as 'bad extensions', from the sorting process.
+- The PR modifies the `get_pr_diff` function in `pr_processing.py` to use the new sorting function. It also refactors the code to move the PR pruning logic into a separate function.
+- A new `TokenHandler` class is introduced in `token_handler.py` to handle token counting operations. This class is initialized with a PR, variables, system, and user, and provides methods to get system and user tokens and to count tokens in a patch.
+
+---
+
+## Configuration
+The different tools and sub-tools used by CodiumAI PR-Agent are easily configurable via the configuration file: `/settings/configuration.toml`.
+#### Enabling/disabling sub-tools:
+You can enable/disable the different PR Reviewer sub-sections with the following flags:
+```
+require_minimal_and_focused_review=true
+require_tests_review=true
+require_security_review=true
+```
+#### Code Suggestions configuration:
+There are also configuration options to control different aspects of the `code suggestions` feature.
+The number of suggestions provided can be controlled by adjusting the following parameter:
+```
+num_code_suggestions=4
+```
+You can also enable a more verbose and informative mode of code suggestions:
+```
+extended_code_suggestions=false
+``` 
+This is a comparison of the regular and extended code suggestions modes:
+
+---
+Example for regular suggestion:
+
+
+- **suggestion 1:**
+  - **relevant file:** sql.py
+  - **suggestion content:** Remove hardcoded sensitive information like username and password. Use environment variables or a secure method to store these values. [important]
+---
+
+Example for extended suggestion:
+
+
+- **suggestion 1:**
+  - **relevant file:** sql.py
+  - **suggestion content:** Remove hardcoded sensitive information (username and password) [important]
+  - **why:** Hardcoding sensitive information is a security risk. It's better to use environment variables or a secure way to store these values.
+  - **code example:**
+    - **before code:**
+        ```
+        user = "root",
+        password = "Mysql@123",
+        ```
+    - **after code:**
+        ```
+        user = os.getenv('DB_USER'),
+        password = os.getenv('DB_PASSWORD'),
+        ```
+---
+
+
+## Roadmap
+- [ ] Support open-source models, as a replacement for OpenAI models. Note that a minimal requirement for each open-source model is to have 8k+ context, and good support for generating JSON as an output.
+- [ ] Support other Git providers, such as GitLab and Bitbucket.
+- [ ] Develop additional logic for handling large PRs, and compressing git patches.
+- [ ] Dedicated tools and sub-tools for specific programming languages (Python, JavaScript, Java, C++, etc.).
+- [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such as [ctags](https://github.com/universal-ctags/ctags).
+- [ ] Adding more tools. Possible directions:
+  - [ ] Code Quality
+  - [ ] Coding Style
+  - [ ] Performance (are there any performance issues)
+  - [ ] Documentation (is the PR properly documented)
+  - [ ] Rank the PR importance
+  - [ ] ...
+
+## Similar Projects
+- [CodiumAI - Meaningful tests for busy devs](https://github.com/Codium-ai/codiumai-vscode-release)
+- [Aider - GPT powered coding in your terminal](https://github.com/paul-gauthier/aider)
+- [GPT-Engineer](https://github.com/AntonOsika/gpt-engineer)
+- [CodeReview BOT](https://github.com/anc95/ChatGPT-CodeReview)
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -0,0 +1,20 @@
+FROM python:3.10 as base
+
+WORKDIR /app
+ADD requirements.txt .
+RUN pip install -r requirements.txt && rm requirements.txt
+ENV PYTHONPATH=/app
+ADD pr_agent pr_agent
+
+FROM base as github_app
+CMD ["python", "pr_agent/servers/github_app.py"]
+
+FROM base as github_polling
+CMD ["python", "pr_agent/servers/github_polling.py"]
+
+FROM base as test
+ADD requirements-dev.txt .
+RUN pip install -r requirements-dev.txt && rm requirements-dev.txt
+
+FROM base as cli
+CMD ["bash"]
--- a/pics/extended_code_suggestion.png
+++ b/pics/extended_code_suggestion.png
--- a/pics/pr_questions.png
+++ b/pics/pr_questions.png
--- a/pics/pr_reviewer.png
+++ b/pics/pr_reviewer.png
--- a/pics/regular_code_suggestion.png
+++ b/pics/regular_code_suggestion.png
--- a/pr_agent/__init__.py
+++ b/pr_agent/__init__.py
@@ -0,0 +1 @@
+
--- a/pr_agent/agent/__init__.py
+++ b/pr_agent/agent/__init__.py
--- a/pr_agent/agent/pr_agent.py
+++ b/pr_agent/agent/pr_agent.py
@@ -0,0 +1,20 @@
+import re
+from typing import Optional
+
+from pr_agent.tools.pr_questions import PRQuestions
+from pr_agent.tools.pr_reviewer import PRReviewer
+
+
+class PRAgent:
+    def __init__(self, installation_id: Optional[int] = None):
+        self.installation_id = installation_id
+
+    async def handle_request(self, pr_url, request):
+        if 'please review' in request.lower():
+            reviewer = PRReviewer(pr_url, self.installation_id)
+            await reviewer.review()
+
+        elif 'please answer' in request.lower():
+            question = re.split(r'(?i)please answer', request)[1].strip()
+            answerer = PRQuestions(pr_url, question, self.installation_id)
+            await answerer.answer()
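
Reviewer note on `pr_agent/agent/pr_agent.py`: `handle_request` routes on two phrases, "please review" and "please answer", matched case-insensitively. The sketch below restates that parsing as a pure function, which may be easier to unit-test than the async method. `parse_request` and its return shape are hypothetical names used for illustration; they are not part of this commit.

```python
import re
from typing import Optional, Tuple


def parse_request(request: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: map a comment body to (action, question)."""
    lowered = request.lower()
    if 'please review' in lowered:
        return 'review', None
    if 'please answer' in lowered:
        # Same case-insensitive split that handle_request uses.
        question = re.split(r'(?i)please answer', request)[1].strip()
        return 'answer', question
    # Anything else is ignored, as in handle_request.
    return None, None
```

Keeping the parsing separate from the tool invocation means the routing rules can be checked without a GitHub token or an OpenAI key.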
--- a/pr_agent/algo/__init__.py
+++ b/pr_agent/algo/__init__.py
@@ -0,0 +1,10 @@
+MAX_TOKENS = {
+    'gpt-3.5-turbo': 4000,
+    'gpt-3.5-turbo-0613': 4000,
+    'gpt-3.5-turbo-0301': 4000,
+    'gpt-3.5-turbo-16k': 16000,
+    'gpt-3.5-turbo-16k-0613': 16000,
+    'gpt-4': 8000,
+    'gpt-4-0613': 8000,
+    'gpt-4-32k': 32000,
+}
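
Reviewer note on `MAX_TOKENS`: the values look like rounded-down context sizes (for example, 8000 rather than 8192 for `gpt-4`), which leaves a small safety margin. A minimal sketch of how such a table might be used to decide whether a prompt fits, assuming an output buffer of 800 tokens as in `OUTPUT_BUFFER_TOKENS` from `pr_processing.py`. The function name `fits_in_context` is hypothetical.

```python
# Subset of the table in this file, copied so the sketch is self-contained.
MAX_TOKENS = {
    'gpt-3.5-turbo': 4000,
    'gpt-4': 8000,
    'gpt-4-32k': 32000,
}


def fits_in_context(model: str, prompt_tokens: int, output_buffer: int = 800) -> bool:
    """Hypothetical check: does the prompt leave room for the model's reply?"""
    try:
        limit = MAX_TOKENS[model]
    except KeyError:
        # Fail loudly on an unknown model instead of guessing a limit.
        raise ValueError(f"Unknown model: {model!r}") from None
    # Same comparison shape as get_pr_diff: tokens + buffer must stay under the limit.
    return prompt_tokens + output_buffer < limit
```

Raising on an unknown model is a design choice of this sketch; the commit itself does not show how a missing key is handled.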
--- a/pr_agent/algo/ai_handler.py
+++ b/pr_agent/algo/ai_handler.py
@@ -0,0 +1,37 @@
+import logging
+
+import openai
+from openai.error import APIError, Timeout, TryAgain
+from retry import retry
+
+from pr_agent.config_loader import settings
+
+OPENAI_RETRIES = 2
+
+
+class AiHandler:
+    def __init__(self):
+        try:
+            openai.api_key = settings.openai.key
+        except AttributeError as e:
+            raise ValueError("OpenAI key is required") from e
+
+    @retry(exceptions=(APIError, Timeout, TryAgain, AttributeError),
+           tries=OPENAI_RETRIES, delay=2, backoff=2, jitter=(1, 3))
+    async def chat_completion(self, model: str, temperature: float, system: str, user: str):
+        try:
+            response = await openai.ChatCompletion.acreate(
+                            model=model,
+                            messages=[
+                                {"role": "system", "content": system},
+                                {"role": "user", "content": user}
+                            ],
+                            temperature=temperature,
+                        )
+        except (APIError, Timeout, TryAgain) as e:
+            logging.error("Error during OpenAI inference: %s", e)
+            raise
+        if response is None or len(response.choices) == 0:
+            raise TryAgain
+        resp = response.choices[0]['message']['content']
+        finish_reason = response.choices[0].finish_reason
+        return resp, finish_reason
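
Reviewer note on `pr_agent/algo/ai_handler.py`: `chat_completion` is a coroutine function. A synchronous retry wrapper that only calls the function gets back a coroutine object, and an exception raised later, when that coroutine is awaited, falls outside the wrapper's `try` block. Whether the `retry` package handles coroutines specially was not verified here, so treat that concern as an assumption. The sketch below shows one way to retry around the `await` itself. `retry_async` is a hypothetical helper, not an API of any library used in this commit.

```python
import asyncio
import functools


def retry_async(exceptions, tries=2, delay=0.0, backoff=1.0):
    """Hypothetical decorator: retry a coroutine function when awaiting it fails."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    # The await sits inside the try block, so failures are caught here.
                    return await func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts: surface the last error
                    await asyncio.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator
```

A usage along the lines of `@retry_async((APIError, Timeout, TryAgain), tries=OPENAI_RETRIES, delay=2, backoff=2)` would mirror the parameters in this file, minus the jitter, which this sketch omits.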
--- a/pr_agent/algo/git_patch_processing.py
+++ b/pr_agent/algo/git_patch_processing.py
@@ -0,0 +1,107 @@
+from __future__ import annotations
+
+import logging
+import re
+
+from pr_agent.config_loader import settings
+
+
+def extend_patch(original_file_str, patch_str, num_lines) -> str:
+    """
+    Extends the patch to include 'num_lines' more surrounding lines
+    """
+    if not patch_str or num_lines == 0:
+        return patch_str
+
+    original_lines = original_file_str.splitlines()
+    patch_lines = patch_str.splitlines()
+    extended_patch_lines = []
+
+    start1, size1, start2, size2 = -1, -1, -1, -1
+    RE_HUNK_HEADER = re.compile(
+        r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
+    try:
+        for line in patch_lines:
+            if line.startswith('@@'):
+                match = RE_HUNK_HEADER.match(line)
+                if match:
+                    # finish previous hunk
+                    if start1 != -1:
+                        extended_patch_lines.extend(
+                            original_lines[start1 + size1 - 1:start1 + size1 - 1 + num_lines])
+
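+                    # a hunk header may omit the size (e.g. "@@ -1 +1 @@"), which means a size of 1
+                    start1, size1, start2, size2 = (
+                        int(g) if g is not None else 1 for g in match.groups()[:4])
+                    section_header = match.groups()[4]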
+                    extended_start1 = max(1, start1 - num_lines)
+                    extended_size1 = size1 + (start1 - extended_start1) + num_lines
+                    extended_start2 = max(1, start2 - num_lines)
+                    extended_size2 = size2 + (start2 - extended_start2) + num_lines
+                    extended_patch_lines.append(
+                        f'@@ -{extended_start1},{extended_size1} '
+                        f'+{extended_start2},{extended_size2} @@ {section_header}')
+                    extended_patch_lines.extend(
+                        original_lines[extended_start1 - 1:start1 - 1])  # one to zero based
+                    continue
+            extended_patch_lines.append(line)
+    except Exception as e:
+        if settings.config.verbosity_level >= 2:
+            logging.error(f"Failed to extend patch: {e}")
+        return patch_str
+
+    # finish previous hunk
+    if start1 != -1:
+        extended_patch_lines.extend(
+            original_lines[start1 + size1 - 1:start1 + size1 - 1 + num_lines])
+
+    extended_patch_str = '\n'.join(extended_patch_lines)
+    return extended_patch_str
+
+
+def omit_deletion_hunks(patch_lines) -> str:
+    """
+    Drop hunks that contain no added ('+') lines and return the remaining patch as a string
+    """
+    temp_hunk = []
+    added_patched = []
+    add_hunk = False
+    inside_hunk = False
+    RE_HUNK_HEADER = re.compile(
+        r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
+
+    for line in patch_lines:
+        if line.startswith('@@'):
+            match = RE_HUNK_HEADER.match(line)
+            if match:
+                # finish previous hunk: keep it only if it added lines
+                if inside_hunk:
+                    if add_hunk:
+                        added_patched.extend(temp_hunk)
+                    # always reset, so that a deletion-only hunk is not
+                    # carried over into the next hunk
+                    temp_hunk = []
+                    add_hunk = False
+                temp_hunk.append(line)
+                inside_hunk = True
+        else:
+            temp_hunk.append(line)
+            # startswith is safe for empty lines, unlike line[0]
+            if line.startswith('+'):
+                add_hunk = True
+    if inside_hunk and add_hunk:
+        added_patched.extend(temp_hunk)
+
+    return '\n'.join(added_patched)
+
+
+def handle_patch_deletions(patch: str, original_file_content_str: str,
+                           new_file_content_str: str, file_name: str) -> str:
+    """
+    Handle entire file or deletion patches
+    """
+    if not new_file_content_str:
+        # logic for handling deleted files - don't show patch, just show that the file was deleted
+        if settings.config.verbosity_level > 0:
+            logging.info(f"Processing file: {file_name}, minimizing deletion file")
+        patch = "File was deleted\n"
+    else:
+        patch_lines = patch.splitlines()
+        patch_new = omit_deletion_hunks(patch_lines)
+        if patch != patch_new:
+            if settings.config.verbosity_level > 0:
+                logging.info(f"Processing file: {file_name}, hunks were deleted")
+            patch = patch_new
+    return patch
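
Reviewer note on `pr_agent/algo/git_patch_processing.py`: the hunk-header regex in this file makes both sizes optional, so those groups can come back as `None`. In the unified diff format an omitted size means the hunk covers exactly one line. A minimal sketch of parsing a header with that default applied; `parse_hunk_header` is a hypothetical helper, and the regex is copied from `extend_patch`.

```python
import re
from typing import Optional, Tuple

RE_HUNK_HEADER = re.compile(
    r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")


def parse_hunk_header(line: str) -> Optional[Tuple[int, int, int, int, str]]:
    """Hypothetical helper: return (start1, size1, start2, size2, section) or None."""
    match = RE_HUNK_HEADER.match(line)
    if not match:
        return None
    start1, size1, start2, size2, section = match.groups()
    # An omitted size means the hunk covers exactly one line.
    return (int(start1), int(size1) if size1 is not None else 1,
            int(start2), int(size2) if size2 is not None else 1,
            section)
```

For example, `parse_hunk_header("@@ -10,3 +12,4 @@ def foo():")` yields the two ranges plus the section header text, while a line that is not a hunk header yields `None`.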
--- a/pr_agent/algo/language_handler.py
+++ b/pr_agent/algo/language_handler.py
--- a/pr_agent/algo/pr_processing.py
+++ b/pr_agent/algo/pr_processing.py
@@ -0,0 +1,128 @@
+from __future__ import annotations
+
+import difflib
+import logging
+from typing import Any, Dict, Tuple
+
+from pr_agent.algo.git_patch_processing import extend_patch, handle_patch_deletions
+from pr_agent.algo.language_handler import sort_files_by_main_languages
+from pr_agent.algo.token_handler import TokenHandler
+from pr_agent.config_loader import settings
+from pr_agent.git_providers import GithubProvider
+
+OUTPUT_BUFFER_TOKENS = 800
+PATCH_EXTRA_LINES = 3
+
+
+def get_pr_diff(git_provider: GithubProvider | Any, token_handler: TokenHandler) -> str:
+    """
+    Returns a string with the diff of the PR.
+    If needed, apply diff minimization techniques to reduce the number of tokens
+    """
+    files = list(git_provider.get_diff_files())
+
+    # get pr languages
+    pr_languages = sort_files_by_main_languages(git_provider.get_languages(), files)
+
+    # generate a standard diff string, with patch extension
+    patches_extended, total_tokens = pr_generate_extended_diff(pr_languages, token_handler)
+
+    # if we are under the limit, return the full diff
+    if total_tokens + OUTPUT_BUFFER_TOKENS < token_handler.limit:
+        return "\n".join(patches_extended)
+
+    # if we are over the limit, start pruning
+    patches_compressed = pr_generate_compressed_diff(pr_languages, token_handler)
+    return "\n".join(patches_compressed)
+
+
+def pr_generate_extended_diff(pr_languages: list, token_handler: TokenHandler) -> \
+        Tuple[list, int]:
+    """
+    Generate a standard diff string, with patch extension
+    """
+    total_tokens = token_handler.prompt_tokens  # initial tokens
+    patches_extended = []
+    for lang in pr_languages:
+        for file in lang['files']:
+            original_file_content_str = file.base_file
+            new_file_content_str = file.head_file
+            patch = file.patch
+
+            # handle the case of large patch, that initially was not loaded
+            patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
+
+            if not patch:
+                continue
+
+            # extend each patch with extra lines of context
+            extended_patch = extend_patch(original_file_content_str, patch, num_lines=PATCH_EXTRA_LINES)
+            full_extended_patch = f"## {file.filename}\n\n{extended_patch}\n"
+
+            patch_tokens = token_handler.count_tokens(full_extended_patch)
+            file.tokens = patch_tokens
+            total_tokens += patch_tokens
+            patches_extended.append(full_extended_patch)
+
+    return patches_extended, total_tokens
+
+
+def pr_generate_compressed_diff(top_langs: list, token_handler: TokenHandler) -> list:
+    # Apply Diff Minimization techniques to reduce the number of tokens:
+    # 0. Start from the largest diff patch to smaller ones
+    # 1. Don't extend the diff with extra context lines
+    # 2. Minimize deleted files
+    # 3. Minimize deleted hunks
+    # 4. Minimize all remaining files when you reach token limit
+
+    patches = []
+
+    # sort each one of the languages in top_langs by the number of tokens in the diff
+    sorted_files = []
+    for lang in top_langs:
+        sorted_files.extend(sorted(lang['files'], key=lambda x: x.tokens, reverse=True))
+
+    total_tokens = token_handler.prompt_tokens
+    for file in sorted_files:
+        original_file_content_str = file.base_file
+        new_file_content_str = file.head_file
+        patch = file.patch
+        patch = load_large_diff(file, new_file_content_str, original_file_content_str, patch)
+        if not patch:
+            continue
+
+        # removing delete-only hunks
+        patch = handle_patch_deletions(patch, original_file_content_str,
+                                       new_file_content_str, file.filename)
+        new_patch_tokens = token_handler.count_tokens(patch)
+
+        if total_tokens > token_handler.limit - OUTPUT_BUFFER_TOKENS // 2:
+            logging.warning(f"File was fully skipped, no more tokens: {file.filename}.")
+            continue  # Hard Stop, no more tokens
+        if total_tokens + new_patch_tokens > token_handler.limit - OUTPUT_BUFFER_TOKENS:
+            # Current logic is to skip the patch if it's too large
+            # TODO: Option for alternative logic to remove hunks from the patch to reduce the number of tokens
+            #  until we meet the requirements
+            if settings.config.verbosity_level >= 2:
+                logging.warning(f"Patch too large, minimizing it, {file.filename}")
+            patch = "File was modified"
+        if patch:
+            patch_final = f"## {file.filename}\n\n{patch}\n"
+            patches.append(patch_final)
+            total_tokens += token_handler.count_tokens(patch_final)
+            if settings.config.verbosity_level >= 2:
+                logging.info(f"Tokens: {total_tokens}, last filename: {file.filename}")
+    return patches
+
+
+def load_large_diff(file, new_file_content_str: str, original_file_content_str: str, patch: str) -> str:
+    if not patch:  # TODO: also add a condition for the file extension
+        try:
+            diff = difflib.unified_diff(original_file_content_str.splitlines(keepends=True),
+                                        new_file_content_str.splitlines(keepends=True))
+            if settings.config.verbosity_level >= 2:
+                logging.warning(f"File was modified, but no patch was found. Manually creating patch: {file.filename}.")
+            patch = ''.join(diff)
+        except Exception:
+            pass
+    return patch
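The two-pass budgeting in `get_pr_diff` above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the project's implementation: `count_tokens` is a hypothetical whitespace-based stand-in for the tiktoken encoder, `select_patches` is an illustrative name, and the per-language grouping and deletion handling are left out.

```python
from typing import Dict, List

OUTPUT_BUFFER_TOKENS = 800  # same constant as in pr_processing.py


def count_tokens(text: str) -> int:
    # Hypothetical stand-in for TokenHandler.count_tokens (whitespace split, not tiktoken).
    return len(text.split())


def select_patches(patches: Dict[str, str], prompt_tokens: int, limit: int) -> List[str]:
    """Return every patch if the total fits; otherwise go largest-first and stub what does not fit."""
    sections = {name: f"## {name}\n\n{patch}\n" for name, patch in patches.items()}
    total = prompt_tokens + sum(count_tokens(s) for s in sections.values())
    if total + OUTPUT_BUFFER_TOKENS < limit:
        return list(sections.values())

    # Over budget: visit files from the largest patch to the smallest.
    ordered = sorted(patches, key=lambda n: count_tokens(sections[n]), reverse=True)
    selected: List[str] = []
    total = prompt_tokens
    for name in ordered:
        if total > limit - OUTPUT_BUFFER_TOKENS // 2:
            continue  # hard stop: the file is skipped entirely
        patch = patches[name]
        if total + count_tokens(patch) > limit - OUTPUT_BUFFER_TOKENS:
            patch = "File was modified"  # too large: keep only a stub
        section = f"## {name}\n\n{patch}\n"
        selected.append(section)
        total += count_tokens(section)
    return selected
```

With a small limit, a large patch is reduced to the `File was modified` stub while smaller patches that still fit are kept in full.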
--- a/pr_agent/algo/token_handler.py
+++ b/pr_agent/algo/token_handler.py
@ -0,0 +1,24 @@
+from jinja2 import Environment, StrictUndefined
+from tiktoken import encoding_for_model
+
+from pr_agent.algo import MAX_TOKENS
+from pr_agent.config_loader import settings
+
+
+class TokenHandler:
+    def __init__(self, pr, vars: dict, system, user):
+        self.encoder = encoding_for_model(settings.config.model)
+        self.limit = MAX_TOKENS[settings.config.model]
+        self.prompt_tokens = self._get_system_user_tokens(pr, self.encoder, vars, system, user)
+
+    def _get_system_user_tokens(self, pr, encoder, vars: dict, system, user):
+        environment = Environment(undefined=StrictUndefined)
+        system_prompt = environment.from_string(system).render(vars)
+        user_prompt = environment.from_string(user).render(vars)
+
+        system_prompt_tokens = len(encoder.encode(system_prompt))
+        user_prompt_tokens = len(encoder.encode(user_prompt))
+        return system_prompt_tokens + user_prompt_tokens
+
+    def count_tokens(self, patch: str) -> int:
+        return len(self.encoder.encode(patch))
--- a/pr_agent/algo/utils.py
+++ b/pr_agent/algo/utils.py
@ -0,0 +1,59 @@
+from __future__ import annotations
+
+import textwrap
+
+
+def convert_to_markdown(output_data: dict) -> str:
+    markdown_text = ""
+
+    emojis = {
+        "Main theme": "🎯",
+        "Description and title": "🔍",
+        "Type of PR": "📌",
+        "Relevant tests added": "🧪",
+        "Unrelated changes": "⚠️",
+        "Minimal and focused": "✨",
+        "Security concerns": "🔒",
+        "General PR suggestions": "💡",
+        "Code suggestions": "🤖"
+    }
+
+    for key, value in output_data.items():
+        if not value:
+            continue
+        if isinstance(value, dict):
+            markdown_text += f"## {key}\n\n"
+            markdown_text += convert_to_markdown(value)
+        elif isinstance(value, list):
+            if key.lower() == 'code suggestions':
+                markdown_text += "\n"  # just looks nicer with additional line breaks
+            emoji = emojis.get(key, "‣")  # Use a bullet if no emoji is found for the key
+            markdown_text += f"- {emoji} **{key}:**\n\n"
+            for item in value:
+                if isinstance(item, dict) and key.lower() == 'code suggestions':
+                    markdown_text += parse_code_suggestion(item)
+                elif item:
+                    markdown_text += f"  - {item}\n"
+        elif value != 'n/a':
+            emoji = emojis.get(key, "‣")  # Use a bullet if no emoji is found for the key
+            markdown_text += f"- {emoji} **{key}:** {value}\n"
+    return markdown_text
+
+
+def parse_code_suggestion(code_suggestions: dict) -> str:
+    markdown_text = ""
+    for sub_key, sub_value in code_suggestions.items():
+        if isinstance(sub_value, dict):  # "code example"
+            markdown_text += f"  - **{sub_key}:**\n"
+            for code_key, code_value in sub_value.items():  # 'before' and 'after' code
+                code_str = f"```\n{code_value}\n```"
+                code_str_indented = textwrap.indent(code_str, '        ')
+                markdown_text += f"    - **{code_key}:**\n{code_str_indented}\n"
+        else:
+            if "suggestion number" in sub_key.lower():
+                markdown_text += f"- **suggestion {sub_value}:**\n"  # prettier formatting
+            else:
+                markdown_text += f"  - **{sub_key}:** {sub_value}\n"
+    markdown_text += "\n"
+    return markdown_text
+
--- a/pr_agent/config_loader.py
+++ b/pr_agent/config_loader.py
@ -0,0 +1,14 @@
+from os.path import abspath, dirname, join
+
+from dynaconf import Dynaconf
+
+current_dir = dirname(abspath(__file__))
+settings = Dynaconf(
+    envvar_prefix=False,
+    settings_files=[join(current_dir, f) for f in [
+         "settings/.secrets.toml",
+         "settings/configuration.toml",
+         "settings/pr_reviewer_prompts.toml",
+         "settings/pr_questions_prompts.toml"
+        ]]
+)
--- a/pr_agent/git_providers/__init__.py
+++ b/pr_agent/git_providers/__init__.py
@ -0,0 +1,15 @@
+from pr_agent.config_loader import settings
+from pr_agent.git_providers.github_provider import GithubProvider
+
+_GIT_PROVIDERS = {
+    'github': GithubProvider
+}
+
+def get_git_provider():
+    try:
+        provider_id = settings.config.git_provider
+    except AttributeError as e:
+        raise ValueError("git_provider is a required attribute in the configuration file") from e
+    if provider_id not in _GIT_PROVIDERS:
+        raise ValueError(f"Unknown git provider: {provider_id}")
+    return _GIT_PROVIDERS[provider_id]
--- a/pr_agent/git_providers/github_provider.py
+++ b/pr_agent/git_providers/github_provider.py
@ -0,0 +1,170 @@
+from collections import namedtuple
+from dataclasses import dataclass
+from datetime import datetime
+from typing import Optional, Tuple
+from urllib.parse import urlparse
+
+from github import AppAuthentication, File, Github
+
+from pr_agent.config_loader import settings
+
+@dataclass
+class FilePatchInfo:
+    base_file: str
+    head_file: str
+    patch: str
+    filename: str
+    tokens: int = -1
+
+class GithubProvider:
+    def __init__(self, pr_url: Optional[str] = None, installation_id: Optional[int] = None):
+        self.installation_id = installation_id
+        self.github_client = self._get_github_client()
+        self.repo = None
+        self.pr_num = None
+        self.pr = None
+        if pr_url:
+            self.set_pr(pr_url)
+
+    def set_pr(self, pr_url: str):
+        self.repo, self.pr_num = self._parse_pr_url(pr_url)
+        self.pr = self._get_pr()
+
+    def get_diff_files(self) -> list[FilePatchInfo]:
+        files = self.pr.get_files()
+        diff_files = []
+        for file in files:
+            original_file_content_str = self._get_pr_file_content(file, self.pr.base.sha)
+            new_file_content_str = self._get_pr_file_content(file, self.pr.head.sha)
+            diff_files.append(FilePatchInfo(original_file_content_str, new_file_content_str, file.patch, file.filename))
+        return diff_files
+
+    def publish_comment(self, pr_comment: str):
+        self.pr.create_issue_comment(pr_comment)
+
+    def get_title(self):
+        return self.pr.title
+
+    def get_description(self):
+        return self.pr.body
+
+    def get_languages(self):
+        return self._get_repo().get_languages()
+
+    def get_main_pr_language(self) -> str:
+        """
+        Get the main language of the PR. Return an empty string if it cannot be determined.
+        """
+        main_language_str = ""
+        try:
+            languages = self.get_languages()
+            top_language = max(languages, key=languages.get).lower()
+
+            # validate that the specific commit uses the main language
+            extension_list = []
+            files = self.pr.get_files()
+            for file in files:
+                extension_list.append(file.filename.rsplit('.')[-1])
+
+            # get the most common extension
+            most_common_extension = max(set(extension_list), key=extension_list.count)
+
+            # look for a match. TBD: add more languages, do this systematically
+            if most_common_extension == 'py' and top_language == 'python' or \
+                    most_common_extension == 'js' and top_language == 'javascript' or \
+                    most_common_extension == 'ts' and top_language == 'typescript' or \
+                    most_common_extension == 'go' and top_language == 'go' or \
+                    most_common_extension == 'java' and top_language == 'java' or \
+                    most_common_extension == 'c' and top_language == 'c' or \
+                    most_common_extension == 'cpp' and top_language == 'c++' or \
+                    most_common_extension == 'cs' and top_language == 'c#' or \
+                    most_common_extension == 'swift' and top_language == 'swift' or \
+                    most_common_extension == 'php' and top_language == 'php' or \
+                    most_common_extension == 'rb' and top_language == 'ruby' or \
+                    most_common_extension == 'rs' and top_language == 'rust' or \
+                    most_common_extension == 'scala' and top_language == 'scala' or \
+                    most_common_extension == 'kt' and top_language == 'kotlin' or \
+                    most_common_extension == 'pl' and top_language == 'perl':
+                main_language_str = top_language
+
+        except Exception:
+            pass
+
+        return main_language_str
+
+    def get_pr_branch(self):
+        return self.pr.head.ref
+
+    def get_notifications(self, since: datetime):
+        deployment_type = settings.get("GITHUB.DEPLOYMENT_TYPE", "user")
+
+        if deployment_type != 'user':
+            raise ValueError("Deployment mode must be set to 'user' to get notifications")
+
+        notifications = self.github_client.get_user().get_notifications(since=since)
+        return notifications
+
+    @staticmethod
+    def _parse_pr_url(pr_url: str) -> Tuple[str, int]:
+        parsed_url = urlparse(pr_url)
+
+        if 'github.com' not in parsed_url.netloc:
+            raise ValueError("The provided URL is not a valid GitHub URL")
+
+        path_parts = parsed_url.path.strip('/').split('/')
+        if 'api.github.com' in parsed_url.netloc:
+            if len(path_parts) < 5 or path_parts[3] != 'pulls':
+                raise ValueError("The provided URL does not appear to be a GitHub PR URL")
+            repo_name = '/'.join(path_parts[1:3])
+            try:
+                pr_number = int(path_parts[4])
+            except ValueError as e:
+                raise ValueError("Unable to convert PR number to integer") from e
+            return repo_name, pr_number
+
+        if len(path_parts) < 4 or path_parts[2] != 'pull':
+            raise ValueError("The provided URL does not appear to be a GitHub PR URL")
+
+        repo_name = '/'.join(path_parts[:2])
+        try:
+            pr_number = int(path_parts[3])
+        except ValueError as e:
+            raise ValueError("Unable to convert PR number to integer") from e
+
+        return repo_name, pr_number
+
+    def _get_github_client(self):
+        deployment_type = settings.get("GITHUB.DEPLOYMENT_TYPE", "user")
+
+        if deployment_type == 'app':
+            try:
+                private_key = settings.github.private_key
+                app_id = settings.github.app_id
+            except AttributeError as e:
+                raise ValueError("GitHub app ID and private key are required when using GitHub app deployment") from e
+            if not self.installation_id:
+                raise ValueError("GitHub app installation ID is required when using GitHub app deployment")
+            auth = AppAuthentication(app_id=app_id, private_key=private_key,
+                                     installation_id=self.installation_id)
+            return Github(app_auth=auth)
+
+        if deployment_type == 'user':
+            try:
+                token = settings.github.user_token
+            except AttributeError as e:
+                raise ValueError("GitHub token is required when using user deployment") from e
+            return Github(token)
+
+        raise ValueError(f"Unknown GitHub deployment type: {deployment_type}")
+
+    def _get_repo(self):
+        return self.github_client.get_repo(self.repo)
+
+    def _get_pr(self):
+        return self._get_repo().get_pull(self.pr_num)
+
+    def _get_pr_file_content(self, file: FilePatchInfo, sha: str):
+        try:
+            file_content_str = self._get_repo().get_contents(file.filename, ref=sha).decoded_content.decode()
+        except Exception:
+            file_content_str = ""
+        return file_content_str
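The URL handling in `_parse_pr_url` accepts both the web form and the API form of a pull request URL. The following is a minimal standalone sketch of that idea, not the provider's code: `parse_pr_url` is an illustrative name, the repository `octocat/hello-world` in the usage note is hypothetical, and the sketch compares the host exactly rather than by substring, which is a deliberate variation.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_pr_url(pr_url: str) -> Tuple[str, int]:
    """Return ("owner/repo", pr_number) for a web or API pull request URL."""
    parsed = urlparse(pr_url)
    parts = parsed.path.strip("/").split("/")
    if parsed.netloc == "api.github.com":
        # API form: /repos/<owner>/<repo>/pulls/<number>
        if len(parts) < 5 or parts[3] != "pulls":
            raise ValueError("not a GitHub PR URL")
        return "/".join(parts[1:3]), int(parts[4])
    if parsed.netloc != "github.com":
        raise ValueError("not a GitHub URL")
    # Web form: /<owner>/<repo>/pull/<number>
    if len(parts) < 4 or parts[2] != "pull":
        raise ValueError("not a GitHub PR URL")
    return "/".join(parts[:2]), int(parts[3])
```

For example, `parse_pr_url("https://github.com/octocat/hello-world/pull/7")` and the matching `https://api.github.com/repos/octocat/hello-world/pulls/7` both resolve to the same repository name and PR number.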
--- a/pr_agent/scripts/answer_pr_questions_from_url.py
+++ b/pr_agent/scripts/answer_pr_questions_from_url.py
@ -0,0 +1,16 @@
+import argparse
+import asyncio
+import logging
+import os
+
+from pr_agent.tools.pr_questions import PRQuestions
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Answer questions about a PR from a URL')
+    parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
+    parser.add_argument('--question_str', type=str, help='The question to answer', required=True)
+
+    args = parser.parse_args()
+    logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
+    reviewer = PRQuestions(args.pr_url, args.question_str, None)
+    asyncio.run(reviewer.answer())
--- a/pr_agent/scripts/review_pr_from_url.py
+++ b/pr_agent/scripts/review_pr_from_url.py
@ -0,0 +1,14 @@
+import argparse
+import asyncio
+import logging
+import os
+
+from pr_agent.tools.pr_reviewer import PRReviewer
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Review a PR from a URL')
+    parser.add_argument('--pr_url', type=str, help='The URL of the PR to review', required=True)
+    args = parser.parse_args()
+    logging.basicConfig(level=os.environ.get("LOGLEVEL", "INFO"))
+    reviewer = PRReviewer(args.pr_url, None)
+    asyncio.run(reviewer.review())
--- a/pr_agent/servers/github_app_webhook.py
+++ b/pr_agent/servers/github_app_webhook.py
@ -0,0 +1,78 @@
+import logging
+import sys
+
+import uvicorn
+from fastapi import APIRouter, FastAPI, HTTPException, Request, Response
+
+from pr_agent.agent.pr_agent import PRAgent
+from pr_agent.config_loader import settings
+from pr_agent.servers.utils import verify_signature
+
+logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
+router = APIRouter()
+
+
+@router.post("/api/v1/github_webhooks")
+async def handle_github_webhooks(request: Request, response: Response):
+    logging.debug("Received a github webhook")
+    try:
+        body = await request.json()
+    except Exception as e:
+        logging.error("Error parsing request body: %s", e)
+        raise HTTPException(status_code=400, detail="Error parsing request body") from e
+    body_bytes = await request.body()
+    signature_header = request.headers.get('x-hub-signature-256', None)
+    try:
+        webhook_secret = settings.github.webhook_secret
+    except AttributeError:
+        webhook_secret = None
+    if webhook_secret:
+        verify_signature(body_bytes, webhook_secret, signature_header)
+    logging.debug(f'Request body:\n{body}')
+    return await handle_request(body)
+
+
+async def handle_request(body):
+    action = body.get("action", None)
+    installation_id = body.get("installation", {}).get("id", None)
+    agent = PRAgent(installation_id)
+    if action == 'created':
+        if "comment" not in body:
+            return {}
+        comment_body = body.get("comment", {}).get("body", None)
+        if "says 'Please" in comment_body:
+            return {}
+        if "issue" not in body or "pull_request" not in body["issue"]:
+            return {}
+        pull_request = body["issue"]["pull_request"]
+        api_url = pull_request.get("url", None)
+        await agent.handle_request(api_url, comment_body)
+
+    elif action in ("opened", "reopened"):
+        pull_request = body.get("pull_request", None)
+        if not pull_request:
+            return {}
+        api_url = pull_request.get("url", None)
+        if api_url is None:
+            return {}
+        await agent.handle_request(api_url, "please review")
+    else:
+        return {}
+
+
+@router.get("/")
+async def root():
+    return {"status": "ok"}
+
+
+def start():
+    if settings.get("GITHUB.DEPLOYMENT_TYPE", "user") != "app":
+        raise Exception("Please set deployment type to app in .secrets.toml file")
+    app = FastAPI()
+    app.include_router(router)
+
+    uvicorn.run(app, host="0.0.0.0", port=3000)
+
+
+if __name__ == '__main__':
+    start()
--- a/pr_agent/servers/github_polling.py
+++ b/pr_agent/servers/github_polling.py
@ -0,0 +1,73 @@
+import asyncio
+import logging
+import sys
+from datetime import datetime, timezone
+
+import aiohttp
+
+from pr_agent.agent.pr_agent import PRAgent
+from pr_agent.config_loader import settings
+
+logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
+NOTIFICATION_URL = "https://api.github.com/notifications"
+
+
+def now() -> str:
+    now_utc = datetime.now(timezone.utc).isoformat()
+    now_utc = now_utc.replace("+00:00", "Z")
+    return now_utc
+
+
+async def polling_loop():
+    since = [now()]
+    last_modified = [None]
+    try:
+        deployment_type = settings.github.deployment_type
+        token = settings.github.user_token
+    except AttributeError:
+        deployment_type = 'none'
+        token = None
+    if deployment_type != 'user':
+        raise ValueError("Deployment mode must be set to 'user' to get notifications")
+    if not token:
+        raise ValueError("User token must be set to get notifications")
+    async with aiohttp.ClientSession() as session:
+        while True:
+            headers = {
+                "Accept": "application/vnd.github.v3+json",
+                "Authorization": f"Bearer {token}"
+            }
+            params = {
+                "participating": "true"
+            }
+            if since[0]:
+                params["since"] = since[0]
+            if last_modified[0]:
+                headers["If-Modified-Since"] = last_modified[0]
+            async with session.get(NOTIFICATION_URL, headers=headers, params=params) as response:
+                if response.status == 200:
+                    if 'Last-Modified' in response.headers:
+                        last_modified[0] = response.headers['Last-Modified']
+                        since[0] = None
+                    notifications = await response.json()
+                    for notification in notifications:
+                        if 'reason' in notification and notification['reason'] == 'mention':
+                            if 'subject' in notification and notification['subject']['type'] == 'PullRequest':
+                                pr_url = notification['subject']['url']
+                                latest_comment = notification['subject']['latest_comment_url']
+                                async with session.get(latest_comment, headers=headers) as comment_response:
+                                    if comment_response.status == 200:
+                                        comment = await comment_response.json()
+                                        comment_body = comment['body'] if 'body' in comment else ''
+                                        commenter_github_user = comment['user']['login'] if 'user' in comment else ''
+                                        logging.info(f"Commenter: {commenter_github_user}\nComment: {comment_body}")
+                                        if comment_body.strip().startswith("@"):
+                                            agent = PRAgent()
+                                            await agent.handle_request(pr_url, comment_body)
+                elif response.status != 304:
+                    logging.error(f"Failed to fetch notifications. Status code: {response.status}")
+
+            await asyncio.sleep(5)
+
+if __name__ == '__main__':
+    asyncio.run(polling_loop())
--- a/pr_agent/servers/utils.py
+++ b/pr_agent/servers/utils.py
@ -0,0 +1,23 @@
+import hashlib
+import hmac
+
+from fastapi import HTTPException
+
+
+def verify_signature(payload_body, secret_token, signature_header):
+    """Verify that the payload was sent from GitHub by validating SHA256.
+
+    Raise an HTTPException with status 403 if the signature is missing or does not match.
+
+    Args:
+        payload_body: original request body to verify (request.body())
+        secret_token: GitHub app webhook token (WEBHOOK_SECRET)
+        signature_header: header received from GitHub (x-hub-signature-256)
+    """
+    if not signature_header:
+        raise HTTPException(status_code=403, detail="x-hub-signature-256 header is missing!")
+    hash_object = hmac.new(secret_token.encode('utf-8'), msg=payload_body, digestmod=hashlib.sha256)
+    expected_signature = "sha256=" + hash_object.hexdigest()
+    if not hmac.compare_digest(expected_signature, signature_header):
+        raise HTTPException(status_code=403, detail="Request signatures didn't match!")
+
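To exercise `verify_signature` locally, a matching header value has to be produced first. The sketch below shows one way to do that with the standard library; `sign_payload` and `signature_matches` are hypothetical helper names that are not part of this commit, and the secret used in the usage note is a placeholder.

```python
import hashlib
import hmac


def sign_payload(payload_body: bytes, secret_token: str) -> str:
    """Build the value expected in the x-hub-signature-256 header."""
    digest = hmac.new(secret_token.encode("utf-8"), msg=payload_body, digestmod=hashlib.sha256)
    return "sha256=" + digest.hexdigest()


def signature_matches(payload_body: bytes, secret_token: str, signature_header: str) -> bool:
    # compare_digest runs in time independent of where the strings differ.
    return hmac.compare_digest(sign_payload(payload_body, secret_token), signature_header)
```

A test client could compute `sign_payload(body, "s3cret")` and send the result as the header. Any change to the body afterwards makes `signature_matches` return `False`.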
--- a/pr_agent/settings/.secrets_template.toml
+++ b/pr_agent/settings/.secrets_template.toml
@ -0,0 +1,26 @@
+# QUICKSTART:
+# Copy this file to .secrets.toml in the same folder.
+# The minimum workable settings - set openai.key to your API key.
+# Set github.deployment_type to "user" and github.user_token to your GitHub personal access token.
+# This will allow you to run the CLI scripts in the scripts/ folder and the github_polling server.
+#
+# See README for details about GitHub App deployment.
+
+[openai]
+key = "<API_KEY>"
+
+[github]
+# The type of deployment to create. Valid values are 'app' or 'user'.
+deployment_type = "user"
+
+# ---- Set the following only for deployment type == "user"
+user_token = "<TOKEN>"  # A GitHub personal access token with 'repo' scope.
+
+# ---- Set the following only for deployment type == "app", see README for details.
+private_key = """\
+-----BEGIN RSA PRIVATE KEY-----
+<GITHUB PRIVATE KEY>
+-----END RSA PRIVATE KEY-----
+"""
+app_id = 123456  # The GitHub App ID, replace with your own.
+webhook_secret = "<WEBHOOK SECRET>"  # Optional, may be commented out.
--- a/pr_agent/settings/configuration.toml
+++ b/pr_agent/settings/configuration.toml
@ -0,0 +1,15 @@
+[config]
+model="gpt-4-0613"
+git_provider="github"
+publish_review=true
+verbosity_level=0  # 0,1,2
+
+[pr_reviewer]
+require_minimal_and_focused_review=true
+require_tests_review=true
+require_security_review=true
+extended_code_suggestions=false
+num_code_suggestions=4
+
+
+[pr_questions]
--- a/pr_agent/settings/pr_questions_prompts.toml
+++ b/pr_agent/settings/pr_questions_prompts.toml
@ -0,0 +1,30 @@
+[pr_questions_prompt]
+system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
+Your task is to answer questions about the new PR code (the '+' lines), and provide feedback.
+Be informative, constructive, and give examples. Try to be as specific as possible, and don't avoid answering the questions.
+Make sure not to repeat modifications already implemented in the new PR code (the '+' lines).
+"""
+
+user="""PR Info:
+Title: '{{title}}'
+Branch: '{{branch}}'
+Description: '{{description}}'
+{%- if language %}
+Main language: {{language}}
+{%- endif %}
+
+
+The PR Git Diff:
+```
+{{diff}}
+```
+Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
+
+
+The PR Questions:
+```
+{{ questions }}
+```
+
+Response:
+"""
--- a/pr_agent/settings/pr_reviewer_prompts.toml
+++ b/pr_agent/settings/pr_reviewer_prompts.toml
@ -0,0 +1,159 @@
+[pr_review_prompt]
+system="""You are CodiumAI-PR-Reviewer, a language model designed to review git pull requests.
+Your task is to provide constructive and concise feedback for the PR, and also provide meaningful code suggestions to improve the new PR code (the '+' lines).
+- Provide up to {{ num_code_suggestions }} code suggestions.
+- Try to focus on important suggestions like fixing code problems, issues and bugs. As a second priority, provide suggestions for meaningful code improvements, like performance, vulnerability, modularity, and best practices.
+{%- if extended_code_suggestions %}
+- For each suggestion, provide a short and concise code snippet to illustrate the existing code, and the improved code.
+{%- endif %}
+- Make sure not to provide suggestion repeating modifications already implemented in the new PR code (the '+' lines).
+
+You must use the following JSON schema to format your answer:
+```json
+{
+  "PR Analysis": {
+    "Main theme": {
+      "type": "string",
+      "description": "a short explanation of the PR"
+    },
+    "Description and title": {
+      "type": "string",
+      "description": "yes\\no question: does this PR have a relevant description and title"
+    },
+    "Type of PR": {
+      "type": "string",
+      "enum": ["Bug fix", "Tests", "Bug fix with tests", "Refactoring", "Enhancement", "Documentation", "Other"]
+    },
+{%- if require_tests %}
+    "Relevant tests added": {
+      "type": "string",
+      "description": "yes\\no question: does this PR have relevant tests ?"
+    },
+{%- endif %}
+{%- if require_minimal_and_focused %}
+    "Minimal and focused": {
+      "type": "string",
+      "description": "is this PR as minimal and focused as possible, with all code changes centered around a single coherent theme, described in the PR description and title ? explain your answer"
+    }
+{%- endif %}
+  },
+  "PR Feedback": {
+    "General PR suggestions": {
+      "type": "string",
+      "description": "important suggestions for the contributors and maintainers of this PR, may include overall structure, primary purpose and best practices. consider using specific filenames, classes and functions names. explain yourself!"
+    },
+    "Code suggestions": {
+      "type": "array",
+      "maxItems": {{ num_code_suggestions }},
+      "uniqueItems": true,
+      "items": {
+        "suggestion number": {
+          "type": "int",
+          "description": "suggestion number, starting from 1"
+        },
+        "relevant file": {
+          "type": "string",
+          "description": "the relevant file name"
+        },
+        "suggestion content": {
+          "type": "string",
+{%- if extended_code_suggestions %}
+          "description": "a concrete suggestion for meaningfully improving the new PR code. Don't repeat previous suggestions. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like."
+{%- else %}
+          "description": "a concrete suggestion for meaningfully improving the new PR code. Also describe how, specifically, the suggestion can be applied to new PR code. Add tags with importance measure that matches each suggestion ('important' or 'medium'). Do not make suggestions for updating or adding docstrings, renaming PR title and description, or linter like."
+{%- endif %}
+        },
+{%- if extended_code_suggestions %}
+        "why": {
+          "type": "string",
+          "description": "shortly explain why this suggestion is important"
+        },
+        "code example": {
+          "type": "object",
+          "properties": {
+            "before code": {
+              "type": "string",
+              "description": "Short and concise code snippet, to illustrate the existing code"
+            },
+            "after code": {
+              "type": "string",
+              "description": "Short and concise code snippet, to illustrate the improved code"
+            }
+          }
+        }
+{%- endif %}
+      }
+    },
+{%- if require_security %}
+    "Security concerns": {
+      "type": "string",
+      "description": "yes\\no question: does this PR code introduce possible security concerns or issues, like SQL injection, XSS, CSRF, and others ? explain your answer"
+    }
+{%- endif %}
+  }
+}
+```
+
+Example output:
+'
+{
+    "PR Analysis":
+    {
+        "Main theme": "xxx",
+        "Description and title": "Yes",
+        "Type of PR": "Bug fix",
+{%- if require_tests %}
+        "Relevant tests added": "No",
+{%- endif %}
+{%- if require_minimal_and_focused %}
+        "Minimal and focused": "No, because ..."
+{%- endif %}
+    },
+    "PR Feedback":
+    {
+        "General PR suggestions": "..., `xxx`...",
+        "Code suggestions": [
+            {
+                "suggestion number": 1,
+                "relevant file": "xxx.py",
+                "suggestion content": "xxx [important]",
+{%- if extended_code_suggestions %}
+                "why": "xxx",
+                "code example":
+                {
+                    "before code": "xxx",
+                    "after code": "xxx"
+                }
+{%- endif %}
+            },
+            ...
+        ]
+{%- if require_security %},
+       "Security concerns": "No, because ..."
+{%- endif %}
+    }
+}
+'
+
+Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
+"""
+
+user="""PR Info:
+Title: '{{title}}'
+Branch: '{{branch}}'
+Description: '{{description}}'
+{%- if language %}
+Main language: {{language}}
+{%- endif %}
+
+
+The PR Git Diff:
+```
+{{diff}}
+```
+Note that lines in the diff body are prefixed with a symbol that represents the type of change: '-' for deletions, '+' for additions, and ' ' (a space) for unchanged lines.
+
+Response (should be a valid JSON, and nothing else):
+```json
+"""
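The note in the user prompt about diff prefixes can be illustrated with a short sketch. The function name `classify_diff_lines` and the sample diff are assumptions made for illustration only; neither is part of PR-Agent.

```python
from collections import Counter

# Prefix symbols as described in the prompt note above.
PREFIX_KINDS = {"+": "addition", "-": "deletion", " ": "unchanged"}


def classify_diff_lines(diff_body: str) -> Counter:
    """Count diff body lines by change type (hypothetical helper)."""
    counts = Counter()
    for line in diff_body.splitlines():
        if line.startswith("@@"):
            continue  # hunk headers are not body lines
        kind = PREFIX_KINDS.get(line[:1])
        if kind:
            counts[kind] += 1
    return counts


sample = "@@ -1,2 +1,2 @@\n line1\n-old line\n+new line\n"
counts = classify_diff_lines(sample)
```

The sketch expects only the diff body: file header lines such as `--- a/file.py` and `+++ b/file.py` would be counted as a deletion and an addition if passed in.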
--- a/pr_agent/tools/__init__.py
+++ b/pr_agent/tools/__init__.py
--- a/pr_agent/tools/pr_questions.py
+++ b/pr_agent/tools/pr_questions.py
@@ -0,0 +1,67 @@
+import copy
+import logging
+from typing import Optional
+
+from jinja2 import Environment, StrictUndefined
+
+from pr_agent.algo.ai_handler import AiHandler
+from pr_agent.algo.pr_processing import get_pr_diff
+from pr_agent.algo.token_handler import TokenHandler
+from pr_agent.config_loader import settings
+from pr_agent.git_providers import get_git_provider
+
+
+class PRQuestions:
+    def __init__(self, pr_url: str, question_str: str, installation_id: Optional[int] = None):
+        self.git_provider = get_git_provider()(pr_url, installation_id)
+        self.main_pr_language = self.git_provider.get_main_pr_language()
+        self.installation_id = installation_id
+        self.ai_handler = AiHandler()
+        self.question_str = question_str
+        self.vars = {
+            "title": self.git_provider.pr.title,
+            "branch": self.git_provider.get_pr_branch(),
+            "description": self.git_provider.pr.body,
+            "language": self.main_pr_language,  # reuse the value fetched above
+            "diff": "",  # empty diff for initial calculation
+            "questions": self.question_str,
+        }
+        self.token_handler = TokenHandler(self.git_provider.pr,
+                                          self.vars,
+                                          settings.pr_questions_prompt.system,
+                                          settings.pr_questions_prompt.user)
+        self.patches_diff = None
+        self.prediction = None
+
+    async def answer(self):
+        logging.info('Answering a PR question...')
+        self.git_provider.publish_comment("Preparing answer...")
+        logging.info('Getting PR diff...')
+        self.patches_diff = get_pr_diff(self.git_provider, self.token_handler)
+        logging.info('Getting AI prediction...')
+        self.prediction = await self._get_prediction()
+        logging.info('Preparing answer...')
+        pr_comment = self._prepare_pr_answer()
+        if settings.config.publish_review:
+            logging.info('Pushing answer...')
+            self.git_provider.publish_comment(pr_comment)
+        return ""
+
+    async def _get_prediction(self):
+        variables = copy.deepcopy(self.vars)
+        variables["diff"] = self.patches_diff  # update diff
+        environment = Environment(undefined=StrictUndefined)
+        system_prompt = environment.from_string(settings.pr_questions_prompt.system).render(variables)
+        user_prompt = environment.from_string(settings.pr_questions_prompt.user).render(variables)
+        if settings.config.verbosity_level >= 2:
+            logging.info(f"\nSystem prompt:\n{system_prompt}")
+            logging.info(f"\nUser prompt:\n{user_prompt}")
+        model = settings.config.model
+        response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
+                                                                        system=system_prompt, user=user_prompt)
+        return response
+
+    def _prepare_pr_answer(self) -> str:
+        answer_str = f"Questions: {self.question_str}\n\n"
+        answer_str += f"Answer: {self.prediction.strip()}\n\n"
+        return answer_str
--- a/pr_agent/tools/pr_reviewer.py
+++ b/pr_agent/tools/pr_reviewer.py
@@ -0,0 +1,88 @@
+import copy
+import json
+import logging
+from typing import Optional
+
+from jinja2 import Environment, StrictUndefined
+
+from pr_agent.algo.ai_handler import AiHandler
+from pr_agent.algo.pr_processing import get_pr_diff
+from pr_agent.algo.token_handler import TokenHandler
+from pr_agent.algo.utils import convert_to_markdown
+from pr_agent.config_loader import settings
+from pr_agent.git_providers import get_git_provider
+
+
+class PRReviewer:
+    def __init__(self, pr_url: str, installation_id: Optional[int] = None):
+
+        self.git_provider = get_git_provider()(pr_url, installation_id)
+        self.main_language = self.git_provider.get_main_pr_language()
+        self.installation_id = installation_id
+        self.ai_handler = AiHandler()
+        self.patches_diff = None
+        self.prediction = None
+        self.vars = {
+            "title": self.git_provider.pr.title,
+            "branch": self.git_provider.get_pr_branch(),
+            "description": self.git_provider.pr.body,
+            "language": self.main_language,  # reuse the value fetched above
+            "diff": "",  # empty diff for initial calculation
+            "require_tests": settings.pr_reviewer.require_tests_review,
+            "require_security": settings.pr_reviewer.require_security_review,
+            "require_minimal_and_focused": settings.pr_reviewer.require_minimal_and_focused_review,
+            "extended_code_suggestions": settings.pr_reviewer.extended_code_suggestions,
+            "num_code_suggestions": settings.pr_reviewer.num_code_suggestions,
+        }
+        self.token_handler = TokenHandler(self.git_provider.pr,
+                                          self.vars,
+                                          settings.pr_review_prompt.system,
+                                          settings.pr_review_prompt.user)
+
+    async def review(self):
+        logging.info('Reviewing PR...')
+        if settings.config.publish_review:
+            self.git_provider.publish_comment("Preparing review...")
+        logging.info('Getting PR diff...')
+        self.patches_diff = get_pr_diff(self.git_provider, self.token_handler)
+        logging.info('Getting AI prediction...')
+        self.prediction = await self._get_prediction()
+        logging.info('Preparing PR review...')
+        pr_comment = self._prepare_pr_review()
+        if settings.config.publish_review:
+            logging.info('Pushing PR review...')
+            self.git_provider.publish_comment(pr_comment)
+        return ""
+
+    async def _get_prediction(self):
+        variables = copy.deepcopy(self.vars)
+        variables["diff"] = self.patches_diff  # update diff
+        environment = Environment(undefined=StrictUndefined)
+        system_prompt = environment.from_string(settings.pr_review_prompt.system).render(variables)
+        user_prompt = environment.from_string(settings.pr_review_prompt.user).render(variables)
+        if settings.config.verbosity_level >= 2:
+            logging.info(f"\nSystem prompt:\n{system_prompt}")
+            logging.info(f"\nUser prompt:\n{user_prompt}")
+        model = settings.config.model
+        response, finish_reason = await self.ai_handler.chat_completion(model=model, temperature=0.2,
+                                                                        system=system_prompt, user=user_prompt)
+        try:
+            json.loads(response)
+        except json.decoder.JSONDecodeError:
+            logging.warning("Could not decode JSON")
+            response = "{}"  # keep a string: _prepare_pr_review() calls .strip() and json.loads() on it
+        return response
+
+    def _prepare_pr_review(self) -> str:
+        review = self.prediction.strip()
+        try:
+            data = json.loads(review)
+        except json.decoder.JSONDecodeError:
+            logging.error("Unable to decode JSON response from AI")
+            data = {}
+        markdown_text = convert_to_markdown(data)
+        markdown_text += "\nAdd a comment that says 'Please review' to ask for a new review after you update the PR.\n"
+        markdown_text += "Add a comment that says 'Please answer <QUESTION...>' to ask a question about this PR.\n"
+        if settings.config.verbosity_level >= 2:
+            logging.info(f"Markdown response:\n{markdown_text}")
+        return markdown_text
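The two JSON decoding steps in `PRReviewer` could be folded into one defensive helper. The following is a minimal sketch under that assumption; `parse_review_json` is a hypothetical name and is not part of this commit.

```python
import json
import logging


def parse_review_json(raw: str) -> dict:
    """Return the decoded review, or an empty dict if decoding fails (hypothetical helper)."""
    try:
        data = json.loads(raw.strip())
    except json.JSONDecodeError:
        logging.warning("Could not decode JSON")
        return {}
    # The prompt asks for a JSON object; treat anything else as unusable.
    return data if isinstance(data, dict) else {}


good = parse_review_json('{"PR Analysis": {"Main theme": "xxx"}}')
bad = parse_review_json("not json")
```

Always returning a dict means the caller can pass the result straight to a formatter without first checking its type.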
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,32 @@
+[tool.ruff]
+
+line-length = 120
+
+select = [
+  "E",  # pycodestyle errors
+  "F",  # Pyflakes
+  "B",  # flake8-bugbear
+  "I001",  # isort basic checks
+  "I002",  # isort missing-required-import
+  ]
+
+# First commit - only fixing isort
+fixable = [
+  "I001",  # isort basic checks
+]
+
+unfixable = [
+  "B",  # Avoid trying to fix flake8-bugbear (`B`) violations.
+  ]
+
+exclude = [
+  "api/code_completions",
+]
+
+ignore = [
+  "E999", "B008"
+]
+
+[tool.ruff.per-file-ignores]
+"__init__.py" = ["E402"]  # Ignore `E402` (module-level import not at top of file) in all `__init__.py` files.
+# TODO: decide whether these violations should be ignored at all.
--- a/requirements-dev.txt
+++ b/requirements-dev.txt
@@ -0,0 +1 @@
+pytest==7.4.0
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,8 @@
+dynaconf==3.1.12
+fastapi==0.99.0
+PyGithub==1.58.2
+retry==0.9.2
+openai==0.27.8
+Jinja2==3.1.2
+tiktoken==0.4.0
+uvicorn==0.22.0
--- a/tests/unit/test_convert_to_markdown.py
+++ b/tests/unit/test_convert_to_markdown.py
@@ -0,0 +1,124 @@
+# Generated by CodiumAI
+from pr_agent.algo.utils import convert_to_markdown
+
+"""
+Code Analysis
+
+Objective:
+The objective of the 'convert_to_markdown' function is to convert a dictionary of data into a markdown-formatted text. 
+The function takes in a dictionary as input and recursively iterates through its keys and values to generate the 
+markdown text.
+
+Inputs:
+- A dictionary of data containing information about a pull request.
+
+Flow:
+- Initialize an empty string variable 'markdown_text'.
+- Create a dictionary 'emojis' containing emojis for each key in the input dictionary.
+- Iterate through the input dictionary:
+  - If the value is empty, continue to the next iteration.
+  - If the value is a dictionary, recursively call the 'convert_to_markdown' function with the value as input and 
+  append the returned markdown text to 'markdown_text'.
+  - If the value is a list:
+    - If the key is 'code suggestions', add an additional line break to 'markdown_text'.
+    - Get the corresponding emoji for the key from the 'emojis' dictionary. If no emoji is found, use a dash.
+    - Append the emoji and key to 'markdown_text'.
+    - Iterate through the items in the list:
+      - If the item is a dictionary and the key is 'code suggestions', call the 'parse_code_suggestion' function with 
+      the item as input and append the returned markdown text to 'markdown_text'.
+      - If the item is not empty, append it to 'markdown_text'.
+  - If the value is not 'n/a', get the corresponding emoji for the key from the 'emojis' dictionary. If no emoji is 
+  found, use a dash. Append the emoji, key, and value to 'markdown_text'.
+- Return 'markdown_text'.
+
+Outputs:
+- A markdown-formatted string containing the information from the input dictionary.
+
+Additional aspects:
+- The function uses recursion to handle nested dictionaries.
+- The 'parse_code_suggestion' function is called for items in the 'code suggestions' list.
+- The function uses emojis to add visual cues to the markdown text.
+"""
+
+
+class TestConvertToMarkdown:
+    # Tests that the function works correctly with a simple dictionary input
+    def test_simple_dictionary_input(self):
+        input_data = {
+            'Main theme': 'Test',
+            'Description and title': 'Test description',
+            'Type of PR': 'Test type',
+            'Relevant tests added': 'no',
+            'Unrelated changes': 'n/a',  # won't be included in the output
+            'Minimal and focused': 'Yes',
+            'General PR suggestions': 'general suggestion...',
+            'Code suggestions': [
+                {
+                    'Suggestion number': 1,
+                    'Code example': {
+                        'Before': 'Code before',
+                        'After': 'Code after'
+                    }
+                },
+                {
+                    'Suggestion number': 2,
+                    'Code example': {
+                        'Before': 'Code before 2',
+                        'After': 'Code after 2'
+                    }
+                }
+            ]
+        }
+        expected_output = """\
+- 🎯 **Main theme:** Test
+- 🔍 **Description and title:** Test description
+- 📌 **Type of PR:** Test type
+- 🧪 **Relevant tests added:** no
+- ✨ **Minimal and focused:** Yes
+- 💡 **General PR suggestions:** general suggestion...
+
+- 🤖 **Code suggestions:**
+
+- **suggestion 1:**
+  - **Code example:**
+    - **Before:**
+        ```
+        Code before
+        ```
+    - **After:**
+        ```
+        Code after
+        ```
+
+- **suggestion 2:**
+  - **Code example:**
+    - **Before:**
+        ```
+        Code before 2
+        ```
+    - **After:**
+        ```
+        Code after 2
+        ```
+"""
+        assert convert_to_markdown(input_data).strip() == expected_output.strip()
+
+    # Tests that the function works correctly with an empty dictionary input
+    def test_empty_dictionary_input(self):
+        input_data = {}
+        expected_output = ""
+        assert convert_to_markdown(input_data).strip() == expected_output.strip()
+
+    def test_dictionary_input_containing_only_empty_dictionaries(self):
+        input_data = {
+            'Main theme': {},
+            'Description and title': {},
+            'Type of PR': {},
+            'Relevant tests added': {},
+            'Unrelated changes': {},
+            'Minimal and focused': {},
+            'General PR suggestions': {},
+            'Code suggestions': {}
+        }
+        expected_output = ""
+        assert convert_to_markdown(input_data).strip() == expected_output.strip()
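The recursive flow described in the docstring above can be sketched in simplified form. This is not the project's `convert_to_markdown`: it omits emojis and the special handling of code suggestions, and the name `to_markdown_sketch` is an assumption for illustration.

```python
def _markdown_lines(data: dict, indent: int = 0) -> list:
    """Recursively turn a nested dict into indented markdown bullet lines."""
    pad = "  " * indent
    lines = []
    for key, value in data.items():
        if not value or value == "n/a":
            continue  # skip empty values and explicit 'n/a'
        if isinstance(value, dict):
            lines.append(f"{pad}- **{key}:**")
            lines.extend(_markdown_lines(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}- **{key}:**")
            for item in value:
                if isinstance(item, dict):
                    lines.extend(_markdown_lines(item, indent + 1))
                elif item:
                    lines.append(f"{pad}  - {item}")
        else:
            lines.append(f"{pad}- **{key}:** {value}")
    return lines


def to_markdown_sketch(data: dict) -> str:
    """Simplified stand-in for the real function (hypothetical name)."""
    return "\n".join(_markdown_lines(data))


text = to_markdown_sketch({
    "Main theme": "Test",
    "Unrelated changes": "n/a",
    "Empty section": {},
    "Code suggestions": [{"Suggestion number": 1}],
})
```

The sketch shows the three branches only (nested dict, list, scalar) and does not reproduce the exact output format asserted by the tests above.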
--- a/tests/unit/test_delete_hunks.py
+++ b/tests/unit/test_delete_hunks.py
@@ -0,0 +1,84 @@
+# Generated by CodiumAI
+
+from pr_agent.algo.git_patch_processing import omit_deletion_hunks
+
+"""
+Code Analysis
+
+Objective:
+The objective of the "omit_deletion_hunks" function is to remove deletion-only hunks from a patch and return the
+remaining hunks, i.e. those that contain at least one added line.
+
+Inputs:
+- "patch_lines": a list of strings representing the lines of a patch file.
+
+Flow:
+- Initialize empty lists "temp_hunk" and "added_patched", and boolean variables "add_hunk" and "inside_hunk".
+- Compile a regular expression pattern to match hunk headers.
+- Iterate through each line in "patch_lines".
+- If the line starts with "@@", match the line with the hunk header pattern, finish the previous hunk if necessary, 
+and append the line to "temp_hunk".
+- If the line does not start with "@@", append the line to "temp_hunk", check if it is an added line, and set 
+"add_hunk" to True if it is.
+- If the function reaches the end of "patch_lines" and there is an unfinished hunk with added lines, append it to 
+"added_patched".
+- Join the lines in "added_patched" with newline characters and return the resulting string.
+
+Outputs:
+- A string containing the lines of the retained hunks, joined with newline characters.
+
+Additional aspects:
+- The function keeps a hunk in full (including its deleted lines) if it has at least one added line, and drops it otherwise.
+- The function assumes that the input patch file is well-formed and follows the unified diff format.
+"""
+
+
+class TestOmitDeletionHunks:
+    # Tests that the function correctly handles a simple patch containing only additions
+    def test_simple_patch_additions(self):
+        patch_lines = ['@@ -1,0 +1,1 @@\n', '+added line\n']
+        expected_output = '@@ -1,0 +1,1 @@\n\n+added line\n'
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+    # Tests that the function correctly omits deletion hunks and concatenates multiple hunks in a patch.
+    def test_patch_multiple_hunks(self):
+        patch_lines = ['@@ -1,0 +1,1 @@\n', '-deleted line', '+added line\n', '@@ -2,0 +3,1 @@\n', '-deleted line\n',
+                       '-another deleted line\n']
+        expected_output = '@@ -1,0 +1,1 @@\n\n-deleted line\n+added line\n'
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+    # Tests that the function correctly omits deletion lines from the patch when there are no additions or context
+    # lines.
+    def test_patch_only_deletions(self):
+        patch_lines = ['@@ -1,1 +1,0 @@\n', '-deleted line\n']
+        expected_output = ''
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+        # Additional deletion lines
+        patch_lines = ['@@ -1,1 +1,0 @@\n', '-deleted line\n', '-another deleted line\n']
+        expected_output = ''
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+        # Additional context lines
+        patch_lines = ['@@ -1,1 +1,0 @@\n', '-deleted line\n', '-another deleted line\n', 'context line 1\n',
+                       'context line 2\n', 'context line 3\n']
+        expected_output = ''
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+    # Tests that the function correctly handles an empty patch
+    def test_empty_patch(self):
+        patch_lines = []
+        expected_output = ''
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+    # Tests that the function correctly handles a patch containing only one hunk
+    def test_patch_one_hunk(self):
+        patch_lines = ['@@ -1,0 +1,1 @@\n', '+added line\n']
+        expected_output = '@@ -1,0 +1,1 @@\n\n+added line\n'
+        assert omit_deletion_hunks(patch_lines) == expected_output
+
+    # Tests that the function correctly handles a patch containing only deletions and no additions
+    def test_patch_deletions_no_additions(self):
+        patch_lines = ['@@ -1,1 +1,0 @@\n', '-deleted line\n']
+        expected_output = ''
+        assert omit_deletion_hunks(patch_lines) == expected_output
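The flow described in the docstring above can be sketched as follows. This is a reconstruction from the docstring and the test expectations, not the code in `pr_agent.algo.git_patch_processing`. The name `omit_deletion_hunks_sketch` and the header regular expression are assumptions, and the sketch discards any lines that precede the first hunk header.

```python
import re

# Unified-diff hunk header, e.g. "@@ -1,2 +1,3 @@ optional section"
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def omit_deletion_hunks_sketch(patch_lines):
    """Keep only hunks that contain at least one added ('+') line."""
    kept = []             # lines of the hunks we decided to keep
    current = []          # lines of the hunk currently being scanned
    has_addition = False
    inside_hunk = False

    for line in patch_lines:
        if line.startswith("@@") and HUNK_HEADER.match(line):
            # A new hunk starts: flush the previous one if it had additions.
            if inside_hunk and has_addition:
                kept.extend(current)
            current = [line]
            has_addition = False
            inside_hunk = True
        else:
            current.append(line)
            if line.startswith("+"):
                has_addition = True

    # Flush the last hunk.
    if inside_hunk and has_addition:
        kept.extend(current)
    return "\n".join(kept)


result = omit_deletion_hunks_sketch(
    ["@@ -1,0 +1,1 @@\n", "-deleted line", "+added line\n",
     "@@ -2,0 +3,1 @@\n", "-deleted line\n", "-another deleted line\n"]
)
```

Because the retained lines are joined with `"\n"` and the input lines may already end in a newline, blank lines can appear in the output, which matches the expected strings in the tests above.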
--- a/tests/unit/test_extend_patch.py
+++ b/tests/unit/test_extend_patch.py
@@ -0,0 +1,93 @@
+
+# Generated by CodiumAI
+
+
+from pr_agent.algo.git_patch_processing import extend_patch
+
+"""
+Code Analysis
+
+Objective:
+The objective of the 'extend_patch' function is to extend a given patch to include a specified number of surrounding 
+lines. This function takes in an original file string, a patch string, and the number of lines to extend the patch by, 
+and returns the extended patch string.
+
+Inputs:
+- original_file_str: a string representing the original file
+- patch_str: a string representing the patch to be extended
+- num_lines: an integer representing the number of lines to extend the patch by
+
+Flow:
+1. Split the original file string and patch string into separate lines
+2. Initialize variables to keep track of the current hunk's start and size for both the original file and the patch
+3. Iterate through each line in the patch string
+4. If the line starts with '@@', extract the start and size values for both the original file and the patch, and 
+calculate the extended start and size values
+5. Append the extended hunk header to the extended patch lines list
+6. Append the specified number of lines before the hunk to the extended patch lines list
+7. Append the current line to the extended patch lines list
+8. If the line is not a hunk header, append it to the extended patch lines list
+9. Return the extended patch string
+
+Outputs:
+- extended_patch_str: a string representing the extended patch
+
+Additional aspects:
+- The function uses regular expressions to extract the start and size values from the hunk header
+- The function handles cases where the start value of a hunk is less than the number of lines to extend by by setting 
+the extended start value to 1
+- The function handles cases where the hunk extends beyond the end of the original file by only including lines up to 
+the end of the original file in the extended patch
+"""
+
+
+class TestExtendPatch:
+    # Tests that the function works correctly with valid input
+    def test_happy_path(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5'
+        patch_str = '@@ -2,2 +2,2 @@ init()\n-line2\n+new_line2\nline3'
+        num_lines = 1
+        expected_output = '@@ -1,4 +1,4 @@ init()\nline1\n-line2\n+new_line2\nline3\nline4'
+        actual_output = extend_patch(original_file_str, patch_str, num_lines)
+        assert actual_output == expected_output
+
+    # Tests that the function returns an empty string when patch_str is empty
+    def test_empty_patch(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5'
+        patch_str = ''
+        num_lines = 1
+        expected_output = ''
+        assert extend_patch(original_file_str, patch_str, num_lines) == expected_output
+
+    # Tests that the function returns the original patch when num_lines is 0
+    def test_zero_num_lines(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5'
+        patch_str = '@@ -2,2 +2,2 @@ init()\n-line2\n+new_line2\nline3'
+        num_lines = 0
+        assert extend_patch(original_file_str, patch_str, num_lines) == patch_str
+
+    # Tests that the function returns the original patch when patch_str contains no hunks
+    def test_no_hunks(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5'
+        patch_str = 'no hunks here'
+        num_lines = 1
+        expected_output = 'no hunks here'
+        assert extend_patch(original_file_str, patch_str, num_lines) == expected_output
+
+    # Tests that the function extends a patch with a single hunk correctly
+    def test_single_hunk(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5'
+        patch_str = '@@ -2,3 +2,3 @@ init()\n-line2\n+new_line2\nline3\nline4'
+        num_lines = 1
+        expected_output = '@@ -1,5 +1,5 @@ init()\nline1\n-line2\n+new_line2\nline3\nline4\nline5'
+        actual_output = extend_patch(original_file_str, patch_str, num_lines)
+        assert actual_output == expected_output
+
+    # Tests the functionality of extending a patch with multiple hunks.
+    def test_multiple_hunks(self):
+        original_file_str = 'line1\nline2\nline3\nline4\nline5\nline6'
+        patch_str = '@@ -2,3 +2,3 @@ init()\n-line2\n+new_line2\nline3\nline4\n@@ -4,1 +4,1 @@ init2()\n-line4\n+new_line4'  # noqa: E501
+        num_lines = 1
+        expected_output = '@@ -1,5 +1,5 @@ init()\nline1\n-line2\n+new_line2\nline3\nline4\nline5\n@@ -3,3 +3,3 @@ init2()\nline3\n-line4\n+new_line4\nline5'  # noqa: E501
+        actual_output = extend_patch(original_file_str, patch_str, num_lines)
+        assert actual_output == expected_output
--- a/tests/unit/test_handle_patch_deletions.py
+++ b/tests/unit/test_handle_patch_deletions.py
@@ -0,0 +1,84 @@
+# Generated by CodiumAI
+import logging
+
+from pr_agent.algo.git_patch_processing import handle_patch_deletions
+from pr_agent.config_loader import settings
+
+"""
+Code Analysis
+
+Objective:
+The objective of the function is to handle patches for deleted files and patches containing deletion hunks, and to
+return the patch after omitting the deletion hunks.
+
+Inputs:
+- patch: a string representing the patch to be handled
+- original_file_content_str: a string representing the original content of the file
+- new_file_content_str: a string representing the new content of the file
+- file_name: a string representing the name of the file
+
+Flow:
+- If new_file_content_str is empty, set patch to "File was deleted" and return it
+- Otherwise, split patch into lines and omit the deletion hunks using the omit_deletion_hunks function
+- If the resulting patch is different from the original patch, log a message and set patch to the new patch
+- Return the resulting patch
+
+Outputs:
+- A string representing the patch after omitting the deletion hunks
+
+Additional aspects:
+- The function uses the settings from the configuration files to determine the verbosity level of the logging messages
+- The omit_deletion_hunks function is called to remove the deletion hunks from the patch
+- The function handles the case where the new_file_content_str is empty by setting the patch to "File was deleted"
+"""
+
+
+class TestHandlePatchDeletions:
+    # Tests that handle_patch_deletions returns the original patch when new_file_content_str is not empty
+    def test_handle_patch_deletions_happy_path_new_file_content_exists(self):
+        patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar\n+baz\n'
+        original_file_content_str = 'foo\nbar\n'
+        new_file_content_str = 'foo\nbaz\n'
+        file_name = 'file.py'
+        assert handle_patch_deletions(patch, original_file_content_str, new_file_content_str,
+                                      file_name) == patch.rstrip()
+
+    # Tests that handle_patch_deletions logs a message when verbosity_level is greater than 0
+    def test_handle_patch_deletions_happy_path_verbosity_level_greater_than_0(self, caplog):
+        patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar\n+baz\n'
+        original_file_content_str = 'foo\nbar\n'
+        new_file_content_str = ''
+        file_name = 'file.py'
+        settings.config.verbosity_level = 1
+
+        with caplog.at_level(logging.INFO):
+            handle_patch_deletions(patch, original_file_content_str, new_file_content_str, file_name)
+            assert any("Processing file" in message for message in caplog.messages)
+
+    # Tests that handle_patch_deletions returns 'File was deleted' when new_file_content_str is empty
+    def test_handle_patch_deletions_edge_case_new_file_content_empty(self):
+        patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar\n'
+        original_file_content_str = 'foo\nbar\n'
+        new_file_content_str = ''
+        file_name = 'file.py'
+        assert handle_patch_deletions(patch, original_file_content_str, new_file_content_str,
+                                      file_name) == 'File was deleted\n'
+
+    # Tests that handle_patch_deletions returns the original patch when patch and patch_new are equal
+    def test_handle_patch_deletions_edge_case_patch_and_patch_new_are_equal(self):
+        patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar\n'
+        original_file_content_str = 'foo\nbar\n'
+        new_file_content_str = 'foo\nbar\n'
+        file_name = 'file.py'
+        assert handle_patch_deletions(patch, original_file_content_str, new_file_content_str,
+                                      file_name).rstrip() == patch.rstrip()
+
+    # Tests that handle_patch_deletions returns the modified patch when patch and patch_new are not equal
+    def test_handle_patch_deletions_edge_case_patch_and_patch_new_are_not_equal(self):
+        patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar\n'
+        original_file_content_str = 'foo\nbar\n'
+        new_file_content_str = 'foo\nbaz\n'
+        file_name = 'file.py'
+        expected_patch = '--- a/file.py\n+++ b/file.py\n@@ -1,2 +1,2 @@\n-foo\n-bar'
+        assert handle_patch_deletions(patch, original_file_content_str, new_file_content_str,
+                                      file_name) == expected_patch
--- a/tests/unit/test_language_handler
+++ b/tests/unit/test_language_handler
@@ -0,0 +1,121 @@
+
+# Generated by CodiumAI
+from pr_agent.algo.language_handler import sort_files_by_main_languages
+
+
+import pytest
+
+"""
+Code Analysis
+
+Objective:
+The objective of the function is to sort a list of files by their main language, putting the files that are in the main language first and the rest of the files after. It takes in a dictionary of languages and their sizes, and a list of files.
+
+Inputs:
+- languages: a dictionary containing the languages and their sizes
+- files: a list of files
+
+Flow:
+1. Sort the languages by their size in descending order
+2. Get all extensions for the languages
+3. Filter out files with bad extensions
+4. Sort files by their extension, putting the files that are in the main extension first and the rest of the files after
+5. Map languages_sorted to their respective files
+6. Append the files to the files_sorted list
+7. Append the rest of the files to the files_sorted list under the "Other" language category
+8. Return the files_sorted list
+
+Outputs:
+- files_sorted: a list of dictionaries containing the language and its respective files
+
+Additional aspects:
+- The function uses a language_extension_map dictionary to map the languages to their respective extensions
+- The function uses the filter_bad_extensions function to filter out files with bad extensions
+- The function uses a rest_files dictionary to store the files that do not belong to any of the main extensions
+"""
+class TestSortFilesByMainLanguages:
+    # Tests that files are sorted by main language, with files in main language first and the rest after
+    def test_happy_path_sort_files_by_main_languages(self):
+        languages = {'Python': 10, 'Java': 5, 'C++': 3}
+        files = [
+            type('', (object,), {'filename': 'file1.py'})(),
+            type('', (object,), {'filename': 'file2.java'})(),
+            type('', (object,), {'filename': 'file3.cpp'})(),
+            type('', (object,), {'filename': 'file4.py'})(),
+            type('', (object,), {'filename': 'file5.py'})()
+        ]
+        expected_output = [
+            {'language': 'Python', 'files': [files[0], files[3], files[4]]},
+            {'language': 'Java', 'files': [files[1]]},
+            {'language': 'C++', 'files': [files[2]]},
+            {'language': 'Other', 'files': []}
+        ]
+        assert sort_files_by_main_languages(languages, files) == expected_output
+
+    # Tests that function handles empty languages dictionary
+    def test_edge_case_empty_languages(self):
+        languages = {}
+        files = [
+            type('', (object,), {'filename': 'file1.py'})(),
+            type('', (object,), {'filename': 'file2.java'})()
+        ]
+        expected_output = [{'language': 'Other', 'files': []}]
+        assert sort_files_by_main_languages(languages, files) == expected_output
+
+    # Tests that function handles empty files list
+    def test_edge_case_empty_files(self):
+        languages = {'Python': 10, 'Java': 5}
+        files = []
+        expected_output = [
+            {'language': 'Other', 'files': []}
+        ]
+        assert sort_files_by_main_languages(languages, files) == expected_output
+
+    # Tests that function handles languages with no extensions
+    def test_edge_case_languages_with_no_extensions(self):
+        languages = {'Python': 10, 'Java': 5, 'C++': 3}
+        files = [
+            type('', (object,), {'filename': 'file1.py'})(),
+            type('', (object,), {'filename': 'file2.java'})(),
+            type('', (object,), {'filename': 'file3.cpp'})()
+        ]
+        expected_output = [
+            {'language': 'Python', 'files': [files[0]]},
+            {'language': 'Java', 'files': [files[1]]},
+            {'language': 'C++', 'files': [files[2]]},
+            {'language': 'Other', 'files': []}
+        ]
+        assert sort_files_by_main_languages(languages, files) == expected_output
+
+    # Tests that files with bad extensions (.csv, .pdf) are dropped and only the single valid file is returned.
+    def test_edge_case_files_with_bad_extensions_only(self):
+        languages = {'Python': 10, 'Java': 5, 'C++': 3}
+        files = [
+            type('', (object,), {'filename': 'file1.csv'})(),
+            type('', (object,), {'filename': 'file2.pdf'})(),
+            type('', (object,), {'filename': 'file3.py'})()  # new valid file
+        ]
+        expected_output = [{'language': 'Python', 'files': [files[2]]}, {'language': 'Other', 'files': []}]
+        assert sort_files_by_main_languages(languages, files) == expected_output
+
+    # Tests general behaviour: several files per language, with input order preserved within each language
+    def test_general_behaviour_sort_files_by_main_languages(self):
+        languages = {'Python': 10, 'Java': 5, 'C++': 3}
+        files = [
+            type('', (object,), {'filename': 'file1.py'})(),
+            type('', (object,), {'filename': 'file2.java'})(),
+            type('', (object,), {'filename': 'file3.cpp'})(),
+            type('', (object,), {'filename': 'file4.py'})(),
+            type('', (object,), {'filename': 'file5.py'})(),
+            type('', (object,), {'filename': 'file6.py'})(),
+            type('', (object,), {'filename': 'file7.java'})(),
+            type('', (object,), {'filename': 'file8.cpp'})(),
+            type('', (object,), {'filename': 'file9.py'})()
+        ]
+        expected_output = [
+            {'language': 'Python', 'files': [files[0], files[3], files[4], files[5], files[8]]},
+            {'language': 'Java', 'files': [files[1], files[6]]},
+            {'language': 'C++', 'files': [files[2], files[7]]},
+            {'language': 'Other', 'files': []}
+        ]
+        assert sort_files_by_main_languages(languages, files) == expected_output
--- a/tests/unit/test_parse_code_suggestion.py
+++ b/tests/unit/test_parse_code_suggestion.py
@ -0,0 +1,88 @@
+
+# Generated by CodiumAI
+from pr_agent.algo.utils import parse_code_suggestion
+
+"""
+Code Analysis
+
+Objective:
+The objective of the function is to convert a dictionary into markdown format. The function takes a dictionary as
+input and recursively converts it into markdown text. It is specifically designed to handle dictionaries
+that contain code suggestions.
+
+Inputs:
+- output_data: a dictionary containing the data to be converted into markdown format
+
+Flow:
+- Initialize an empty string variable called markdown_text
+- Create a dictionary of emojis to be used in the markdown format
+- Iterate through the items in the input dictionary
+- If the value is empty, skip to the next item
+- If the value is a dictionary, recursively call the function with the value as input
+- If the value is a list, iterate through the list and add each item to the markdown format
+- If the value is not 'n/a', add it to the markdown format
+- If the key is 'code suggestions', call the parse_code_suggestion function to handle the list of code suggestions
+- Return the markdown format as a string
+
+Outputs:
+- markdown_text: a string containing the input dictionary converted into markdown format
+
+Additional aspects:
+- The function uses the textwrap module to indent code examples in the markdown format
+- The parse_code_suggestion function is called to handle the 'code suggestions' key in the input dictionary
+- The function uses emojis to add visual cues to the markdown format
+"""
+
+
+class TestParseCodeSuggestion:
+    # Tests that function returns a single newline character when input is an empty dictionary
+    def test_empty_dict(self):
+        input_data = {}
+        expected_output = "\n"  # modified to expect a newline character
+        assert parse_code_suggestion(input_data) == expected_output
+
+    # Tests that function returns correct output when 'suggestion number' key has a non-integer value
+    def test_non_integer_suggestion_number(self):
+        input_data = {
+            "Suggestion number": "one",
+            "Description": "This is a suggestion"
+        }
+        expected_output = "- **suggestion one:**\n  - **Description:** This is a suggestion\n\n"
+        assert parse_code_suggestion(input_data) == expected_output
+
+    # Tests that function returns correct output when 'before' or 'after' key has a non-string value
+    def test_non_string_before_or_after(self):
+        input_data = {
+            "Code example": {
+                "Before": 123,
+                "After": ["a", "b", "c"]
+            }
+        }
+        expected_output = "  - **Code example:**\n    - **Before:**\n        ```\n        123\n        ```\n    - **After:**\n        ```\n        ['a', 'b', 'c']\n        ```\n\n"  # noqa: E501
+        assert parse_code_suggestion(input_data) == expected_output
+
+    # Tests that function returns correct output when input dictionary does not have 'code example' key
+    def test_no_code_example_key(self):
+        code_suggestions = {
+            'suggestion number': 1,
+            'suggestion': 'Suggestion 1',
+            'description': 'Description 1',
+            'before': 'Before 1',
+            'after': 'After 1'
+        }
+        expected_output = "- **suggestion 1:**\n  - **suggestion:** Suggestion 1\n  - **description:** Description 1\n  - **before:** Before 1\n  - **after:** After 1\n\n"  # noqa: E501
+        assert parse_code_suggestion(code_suggestions) == expected_output
+
+    # Tests that function returns correct output when input dictionary has 'code example' key
+    def test_with_code_example_key(self):
+        code_suggestions = {
+            'suggestion number': 2,
+            'suggestion': 'Suggestion 2',
+            'description': 'Description 2',
+            'code example': {
+                'before': 'Before 2',
+                'after': 'After 2'
+            }
+        }
+        expected_output = "- **suggestion 2:**\n  - **suggestion:** Suggestion 2\n  - **description:** Description 2\n  - **code example:**\n    - **before:**\n        ```\n        Before 2\n        ```\n    - **after:**\n        ```\n        After 2\n        ```\n\n"  # noqa: E501
+        assert parse_code_suggestion(code_suggestions) == expected_output