Merge branch 'main' of github.com:PrashantDixit0/pr-agent

This commit is contained in:
PrashantDixit-dev
2023-12-25 00:41:28 +05:30
27 changed files with 353 additions and 262 deletions

123
README.md
View File

@ -21,7 +21,7 @@ Making pull requests less painful with an AI agent
</div>
<div style="text-align:left;">
CodiumAI `PR-Agent` is an open-source tool for efficient pull request reviewing and handling. It automatically analyzes the pull request and can provide several types of commands:
CodiumAI PR-Agent is an open-source tool to help efficiently review and handle pull requests. It automatically analyzes the pull request and can provide several types of commands:
**Auto Description ([`/describe`](./docs/DESCRIBE.md))**: Automatically generating PR description - title, type, summary, code walkthrough and labels.
\
@ -35,9 +35,12 @@ CodiumAI `PR-Agent` is an open-source tool for efficient pull request reviewing
\
**Find Similar Issue ([`/similar_issue`](./docs/SIMILAR_ISSUE.md))**: Automatically retrieves and presents similar issues.
\
**Add Documentation ([`/add_docs`](./docs/ADD_DOCUMENTATION.md))**: Automatically adds documentation to un-documented functions/classes in the PR.
**Add Documentation 💎 ([`/add_docs`](./docs/ADD_DOCUMENTATION.md))**: Automatically adds documentation to methods/functions/classes that changed in the PR.
\
**Generate Custom Labels ([`/generate_labels`](./docs/GENERATE_CUSTOM_LABELS.md))**: Automatically suggests custom labels based on the PR code changes.
**Generate Custom Labels 💎 ([`/generate_labels`](./docs/GENERATE_CUSTOM_LABELS.md))**: Automatically suggests custom labels based on the PR code changes.
\
**Analyze 💎 ([`/analyze`](./docs/Analyze.md))**: Automatically analyzes the PR, and presents changes walkthrough for each component.
See the [Installation Guide](./INSTALL.md) for instructions on installing and running the tool on different git platforms.
@ -47,17 +50,28 @@ See the [Tools Guide](./docs/TOOLS_GUIDE.md) for detailed description of the dif
<h3>Example results:</h3>
</div>
<h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1687561986">/describe:</a></h4>
<h4><a href="https://github.com/Codium-ai/pr-agent/pull/530">/describe</a></h4>
<div align="center">
<p float="center">
<img src="https://www.codium.ai/images/describe-2.gif" width="800">
<img src="https://www.codium.ai/images/pr_agent/describe_short_main.png" width="800">
</p>
</div>
<h4><a href="https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695021901">/review:</a></h4>
<h4><a href="https://github.com/Codium-ai/pr-agent/pull/472#discussion_r1435819374">/improve</a></h4>
<div align="center">
<p float="center">
<img src="https://www.codium.ai/images/review-2.gif" width="800">
<kbd>
<img src="https://www.codium.ai/images/pr_agent/improve_short_main.png" width="768">
</kbd>
</p>
</div>
<h4><a href="https://github.com/Codium-ai/pr-agent/pull/530">/generate_labels</a></h4>
<div align="center">
<p float="center">
<kbd><img src="https://www.codium.ai/images/pr_agent/geneare_custom_labels_main_short.png" width="300"></kbd>
</p>
</div>
@ -104,39 +118,42 @@ See the [Tools Guide](./docs/TOOLS_GUIDE.md) for detailed description of the dif
- [Installation](#installation)
- [How it works](#how-it-works)
- [Why use PR-Agent?](#why-use-pr-agent)
- [Roadmap](#roadmap)
</div>
## Overview
`PR-Agent` offers extensive pull request functionalities across various git providers:
| | | GitHub | Gitlab | Bitbucket | CodeCommit | Azure DevOps | Gerrit |
|-------|---------------------------------------------|:------:|:------:|:---------:|:----------:|:----------:|:----------:|
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | ⮑ Incremental | :white_check_mark: | | | | | |
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Auto-Description | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Improve Code | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | ⮑ Extended | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | Reflect and Review | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: | :white_check_mark: |
| | Update CHANGELOG.md | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | |
| | Find similar issue | :white_check_mark: | | | | | |
| | Add Documentation | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | Generate Labels | :white_check_mark: | :white_check_mark: | | | | |
| | | | | | | |
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | App / webhook | :white_check_mark: | :white_check_mark: | | | |
| | Tagging bot | :white_check_mark: | | | | |
| | Actions | :white_check_mark: | | | | |
| | Web server | | | | | | :white_check_mark: |
| | | | | | | |
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Incremental PR Review | :white_check_mark: | | | | | |
| | | GitHub | Gitlab | Bitbucket |
|-------|---------------------------------------------|:------:|:------:|:---------:|
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | ⮑ Incremental | :white_check_mark: | | |
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Auto-Description | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Improve Code | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | ⮑ Extended | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Reflect and Review | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Update CHANGELOG.md | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Find Similar Issue | :white_check_mark: | | |
| | Add PR Documentation 💎 | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Generate Custom Labels 💎 | :white_check_mark: | :white_check_mark: | |
| | Analyze PR Components 💎 | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | | | | |
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | App / webhook | :white_check_mark: | :white_check_mark: | |
| | Tagging bot | :white_check_mark: | | |
| | Actions | :white_check_mark: | | |
| | | | | |
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Incremental PR review | :white_check_mark: | | |
| | Static code analysis 💎 | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Global configuration 💎 | :white_check_mark: | :white_check_mark: | :white_check_mark: |
Review the [usage guide](./Usage.md) section for detailed instructions how to use the different tools, select the relevant git provider (GitHub, Gitlab, Bitbucket,...), and adjust the configuration file to your needs.
- 💎 means this feature is available only in [PR-Agent Pro](https://www.codium.ai/pricing/)
- Support for additional git providers is described in [here](./docs/Full_enviroments.md)
## Try it now
@ -156,8 +173,10 @@ Note that when you set your own PR-Agent or use CodiumAI hosted PR-Agent, there
---
## Installation
When you sign up to [PR-Agent-Pro 💎](https://www.codium.ai/pricing/), you will get access to a hosted PR-Agent, which is regularly updated with the latest features and abilities. This is the easiest way to use PR-Agent.
To get started with PR-Agent quickly, you first need to acquire two tokens:
To use your own version of PR-Agent, you first need to acquire two tokens:
1. An OpenAI key from [here](https://platform.openai.com/), with access to GPT-4.
2. A GitHub personal access token (classic) with the repo scope.
@ -193,43 +212,15 @@ Here are some advantages of PR-Agent:
- We emphasize **real-life practical usage**. Each tool (review, improve, ask, ...) has a single GPT-4 call, no more. We feel that this is critical for realistic team usage - obtaining an answer quickly (~30 seconds) and affordably.
- Our [PR Compression strategy](./PR_COMPRESSION.md) is a core ability that enables to effectively tackle both short and long PRs.
- Our JSON prompting strategy enables to have **modular, customizable tools**. For example, the '/review' tool categories can be controlled via the [configuration](pr_agent/settings/configuration.toml) file. Adding additional categories is easy and accessible.
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket, CodeCommit), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
- We are open-source, and welcome contributions from the community.
- We support **multiple git providers** (GitHub, Gitlab, Bitbucket), **multiple ways** to use the tool (CLI, GitHub Action, GitHub App, Docker, ...), and **multiple models** (GPT-4, GPT-3.5, Anthropic, Cohere, Llama2).
## Roadmap
- [x] Support additional models, as a replacement for OpenAI (see [here](https://github.com/Codium-ai/pr-agent/pull/172))
- [x] Develop additional logic for handling large PRs (see [here](https://github.com/Codium-ai/pr-agent/pull/229))
- [ ] Add additional context to the prompt. For example, repo (or relevant files) summarization, with tools such a [ctags](https://github.com/universal-ctags/ctags)
- [x] PR-Agent for issues
- [ ] Adding more tools. Possible directions:
- [x] PR description
- [x] Inline code suggestions
- [x] Reflect and review
- [x] Rank the PR (see [here](https://github.com/Codium-ai/pr-agent/pull/89))
- [ ] Enforcing CONTRIBUTING.md guidelines
- [ ] Performance (are there any performance issues)
- [x] Documentation (is the PR properly documented)
- [ ] ...
See the [Release notes](./RELEASE_NOTES.md) for updates on the latest changes.
## Similar Projects
- [CodiumAI - Meaningful tests for busy devs](https://github.com/Codium-ai/codiumai-vscode-release) (although various capabilities are much more advanced in the CodiumAI IDE plugins)
- [Aider - GPT powered coding in your terminal](https://github.com/paul-gauthier/aider)
- [openai-pr-reviewer](https://github.com/coderabbitai/openai-pr-reviewer)
- [CodeReview BOT](https://github.com/anc95/ChatGPT-CodeReview)
- [AI-Maintainer](https://github.com/merwanehamadi/AI-Maintainer)
## Data Privacy
If you use a self-hosted PR-Agent with your OpenAI API key, it is between you and OpenAI. You can read their API data privacy policy here:
If you host PR-Agent with your OpenAI API key, it is between you and OpenAI. You can read their API data privacy policy here:
https://openai.com/enterprise-privacy
When using a PR-Agent app hosted by CodiumAI, we will not store any of your data, nor will we used it for training.
When using PR-Agent-Pro 💎, hosted by CodiumAI, we will not store any of your data, nor will we used it for training.
You will also benefit from an OpenAI account with zero data retention.
## Links

View File

@ -1,5 +1,5 @@
# Add Documentation Tool
The `add_docs` tool scans the PR code changes, and automatically suggests documentation for the undocumented code components (functions, classes, etc.).
# Add Documentation Tool 💎
The `add_docs` tool scans the PR code changes, and automatically suggests documentation for any code components that changed in the PR (functions, classes, etc.).
It can be invoked manually by commenting on any PR:
```
@ -7,9 +7,18 @@ It can be invoked manually by commenting on any PR:
```
For example:
<kbd><img src=https://codium.ai/images/pr_agent/add_docs_comment.png width="768"></kbd>
<kbd><img src=https://codium.ai/images/pr_agent/add_docs.png width="768"></kbd>
<kbd><img src=https://codium.ai/images/pr_agent/docs_command.png width="768"></kbd>
___
<kbd><img src=https://codium.ai/images/pr_agent/docs_components.png width="768"></kbd>
___
<kbd><img src=https://codium.ai/images/pr_agent/docs_single_component.png width="768"></kbd>
### Configuration options
- `docs_style`: The exact style of the documentation (for python docstring). you can choose between: `google`, `numpy`, `sphinx`, `restructuredtext`, `plain`. Default is `sphinx`.
- `extra_instructions`: Optional extra instructions to the tool. For example: "focus on the changes in the file X. Ignore change in ...".
Notes
- Language that are currently fully supported: Python, Java, C++, JavaScript, TypeScript.
- For languages that are not fully supported, the tool will suggest documentation only for new components in the PR.
- A previous version of the tool, that offered support only for new components, was deprecated.

21
docs/Analyze.md Normal file
View File

@ -0,0 +1,21 @@
# Analyze Tool 💎
The `analyze` tool combines static code analysis with LLM capabilities to provide a comprehensive analysis of the PR code changes.
The tool scans the PR code changes, find the code components (methods, functions, classes) that changed, and summarizes the changes in each component.
It can be invoked manually by commenting on any PR:
```
/analyze
```
An example [result](https://github.com/Codium-ai/pr-agent/pull/546#issuecomment-1868524805):
<kbd><img src=https://codium.ai/images/pr_agent/analyze_1.png width="768"></kbd>
___
<kbd><img src=https://codium.ai/images/pr_agent/analyze_2.png width="768"></kbd>
___
<kbd><img src=https://codium.ai/images/pr_agent/analyze_3.png width="768"></kbd>
Notes
- Language that are currently supported: Python, Java, C++, JavaScript, TypeScript.

27
docs/Full_enviroments.md Normal file
View File

@ -0,0 +1,27 @@
## Overview
`PR-Agent` offers extensive pull request functionalities across various git providers:
| | | GitHub | Gitlab | Bitbucket | CodeCommit | Azure DevOps | Gerrit |
|-------|---------------------------------------------|:------:|:------:|:---------:|:----------:|:----------:|:----------:|
| TOOLS | Review | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | ⮑ Incremental | :white_check_mark: | | | | | |
| | Ask | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Auto-Description | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Improve Code | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | ⮑ Extended | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | Reflect and Review | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: | :white_check_mark: |
| | Update CHANGELOG.md | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | |
| | Find similar issue | :white_check_mark: | | | | | |
| | Add Documentation | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | :white_check_mark: |
| | Generate Custom Labels 💎 | :white_check_mark: | :white_check_mark: | | | | |
| | | | | | | |
| USAGE | CLI | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | App / webhook | :white_check_mark: | :white_check_mark: | | | |
| | Tagging bot | :white_check_mark: | | | | |
| | Actions | :white_check_mark: | | | | |
| | Web server | | | | | | :white_check_mark: |
| | | | | | | |
| CORE | PR compression | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Repo language prioritization | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Adaptive and token-aware<br />file patch fitting | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Multiple models support | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| | Incremental PR Review | :white_check_mark: | | | | | |

View File

@ -1,4 +1,4 @@
# Generate Custom Labels
# Generate Custom Labels 💎
The `generate_labels` tool scans the PR code changes, and given a list of labels and their descriptions, it automatically suggests labels that match the PR code changes.
It can be invoked manually by commenting on any PR:
@ -25,7 +25,7 @@ When working from CLI, you need to apply the [configuration changes](#configurat
#### 2. Repo configuration file
To enable custom labels, you need to apply the [configuration changes](#configuration-changes) to the local `.pr_agent.toml` file in you repository.
#### 3. Handle custom labels from the Repo's labels page :gem:
#### 3. Handle custom labels from the Repo's labels page
> This feature is available only in PR-Agent Pro
* GitHub : `https://github.com/{owner}/{repo}/labels`, or click on the "Labels" tab in the issues or PRs page.
* GitLab : `https://gitlab.com/{owner}/{repo}/-/labels`, or click on "Manage" -> "Labels" on the left menu.

View File

@ -26,6 +26,7 @@ Under the section 'pr_code_suggestions', the [configuration file](./../pr_agent/
- `num_code_suggestions`: number of code suggestions provided by the 'improve' tool. Default is 4.
- `extra_instructions`: Optional extra instructions to the tool. For example: "focus on the changes in the file X. Ignore change in ...".
- `rank_suggestions`: if set to true, the tool will rank the suggestions, based on importance. Default is false.
- `include_improved_code`: if set to true, the tool will include an improved code implementation in the suggestion. Default is true.
#### params for '/improve --extended' mode
- `num_code_suggestions_per_chunk`: number of code suggestions provided by the 'improve' tool, per chunk. Default is 8.

View File

@ -7,5 +7,6 @@
- [UPDATE CHANGELOG](./UPDATE_CHANGELOG.md)
- [ADD DOCUMENTATION](./ADD_DOCUMENTATION.md)
- [GENERATE CUSTOM LABELS](./GENERATE_CUSTOM_LABELS.md)
- [Analyze](./Analyze.md)
See the **[installation guide](/INSTALL.md)** for instructions on how to setup PR-Agent.

View File

@ -264,19 +264,10 @@ def _get_all_deployments(all_models: List[str]) -> List[str]:
def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
relevant_file: str,
relevant_line_in_file: str) -> Tuple[int, int]:
"""
Find the line number and absolute position of a relevant line in a file.
Args:
diff_files (List[FilePatchInfo]): A list of FilePatchInfo objects representing the patches of files.
relevant_file (str): The name of the file where the relevant line is located.
relevant_line_in_file (str): The content of the relevant line.
Returns:
Tuple[int, int]: A tuple containing the line number and absolute position of the relevant line in the file.
"""
relevant_line_in_file: str,
absolute_position: int = None) -> Tuple[int, int]:
position = -1
if absolute_position is None:
absolute_position = -1
re_hunk_header = re.compile(
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)")
@ -285,15 +276,32 @@ def find_line_number_of_relevant_line_in_file(diff_files: List[FilePatchInfo],
if file.filename and (file.filename.strip() == relevant_file):
patch = file.patch
patch_lines = patch.splitlines()
delta = 0
start1, size1, start2, size2 = 0, 0, 0, 0
if absolute_position != -1: # matching absolute to relative
for i, line in enumerate(patch_lines):
# new hunk
if line.startswith('@@'):
delta = 0
match = re_hunk_header.match(line)
start1, size1, start2, size2 = map(int, match.groups()[:4])
elif not line.startswith('-'):
delta += 1
#
absolute_position_curr = start2 + delta - 1
if absolute_position_curr == absolute_position:
position = i
break
else:
# try to find the line in the patch using difflib, with some margin of error
matches_difflib: list[str | Any] = difflib.get_close_matches(relevant_line_in_file,
patch_lines, n=3, cutoff=0.93)
if len(matches_difflib) == 1 and matches_difflib[0].startswith('+'):
relevant_line_in_file = matches_difflib[0]
delta = 0
start1, size1, start2, size2 = 0, 0, 0, 0
for i, line in enumerate(patch_lines):
if line.startswith('@@'):
delta = 0

View File

@ -102,7 +102,7 @@ def parse_code_suggestion(code_suggestions: dict, i: int = 0, gfm_supported: boo
markdown_text += f"<tr><td>{sub_key}</td><td>{relevant_file}</td></tr>"
# continue
elif sub_key.lower() == 'suggestion':
markdown_text += f"<tr><td>{sub_key} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td><strong>{sub_value}</strong></td></tr>"
markdown_text += f"<tr><td>{sub_key} &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td><strong>\n\n{sub_value}</strong></td></tr>"
elif sub_key.lower() == 'relevant line':
markdown_text += f"<tr><td>relevant line</td>"
sub_value_list = sub_value.split('](')
@ -316,19 +316,21 @@ def _fix_key_value(key: str, value: str):
return key, value
def load_yaml(response_text: str) -> dict:
def load_yaml(response_text: str, keys_fix_yaml: List[str] = []) -> dict:
response_text = response_text.removeprefix('```yaml').rstrip('`')
try:
data = yaml.safe_load(response_text)
except Exception as e:
get_logger().error(f"Failed to parse AI prediction: {e}")
data = try_fix_yaml(response_text)
data = try_fix_yaml(response_text, keys_fix_yaml=keys_fix_yaml)
return data
def try_fix_yaml(response_text: str) -> dict:
def try_fix_yaml(response_text: str, keys_fix_yaml: List[str] = []) -> dict:
response_text_lines = response_text.split('\n')
keys = ['relevant line:', 'suggestion content:', 'relevant file:', 'existing code:', 'improved code:']
keys = keys + keys_fix_yaml
# first fallback - try to convert 'relevant line: ...' to relevant line: |-\n ...'
response_text_lines_copy = response_text_lines.copy()
for i in range(0, len(response_text_lines_copy)):
@ -343,22 +345,34 @@ def try_fix_yaml(response_text: str) -> dict:
except:
get_logger().info(f"Failed to parse AI prediction after adding |-\n")
# second fallback - try to remove last lines
# second fallback - try to extract only range from first ```yaml to ````
snippet_pattern = r'```(yaml)?[\s\S]*?```'
snippet = re.search(snippet_pattern, '\n'.join(response_text_lines_copy))
if snippet:
snippet_text = snippet.group()
try:
data = yaml.safe_load(snippet_text.removeprefix('```yaml').rstrip('`'))
get_logger().info(f"Successfully parsed AI prediction after extracting yaml snippet")
return data
except:
pass
# third fallback - try to remove leading and trailing curly brackets
response_text_copy = response_text.strip().rstrip().removeprefix('{').removesuffix('}')
try:
data = yaml.safe_load(response_text_copy,)
get_logger().info(f"Successfully parsed AI prediction after removing curly brackets")
return data
except:
pass
# fourth fallback - try to remove last lines
data = {}
for i in range(1, len(response_text_lines)):
response_text_lines_tmp = '\n'.join(response_text_lines[:-i])
try:
data = yaml.safe_load(response_text_lines_tmp,)
get_logger().info(f"Successfully parsed AI prediction after removing {i} lines")
break
except:
pass
# thrid fallback - try to remove leading and trailing curly brackets
response_text_copy = response_text.strip().rstrip().removeprefix('{').removesuffix('}')
try:
data = yaml.safe_load(response_text_copy,)
get_logger().info(f"Successfully parsed AI prediction after removing curly brackets")
return data
except:
pass

View File

@ -174,9 +174,6 @@ class AzureDevopsProvider:
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Azure DevOps provider does not support publishing inline comment yet")
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("Azure DevOps provider does not support creating inline comments yet")
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError("Azure DevOps provider does not support publishing inline comments yet")

View File

@ -201,8 +201,10 @@ class BitbucketProvider(GitProvider):
get_logger().exception(f"Failed to remove comment, error: {e}")
# funtion to create_inline_comment
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
position, absolute_position = find_line_number_of_relevant_line_in_file(self.get_diff_files(), relevant_file.strip('`'), relevant_line_in_file)
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str, absolute_position: int = None):
position, absolute_position = find_line_number_of_relevant_line_in_file(self.get_diff_files(),
relevant_file.strip('`'),
relevant_line_in_file, absolute_position)
if position == -1:
if get_settings().config.verbosity_level >= 2:
get_logger().info(f"Could not find position for {relevant_file} {relevant_line_in_file}")

View File

@ -210,11 +210,14 @@ class BitbucketServerProvider(GitProvider):
pass
# funtion to create_inline_comment
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str,
absolute_position: int = None):
position, absolute_position = find_line_number_of_relevant_line_in_file(
self.get_diff_files(),
relevant_file.strip('`'),
relevant_line_in_file
relevant_line_in_file,
absolute_position
)
if position == -1:
if get_settings().config.verbosity_level >= 2:

View File

@ -229,9 +229,6 @@ class CodeCommitProvider(GitProvider):
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit/client/post_comment_for_compared_commit.html
raise NotImplementedError("CodeCommit provider does not support publishing inline comments yet")
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError("CodeCommit provider does not support creating inline comments yet")
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError("CodeCommit provider does not support publishing inline comments yet")

View File

@ -380,11 +380,6 @@ class GerritProvider(GitProvider):
'Publishing inline comments is not implemented for the gerrit '
'provider')
def create_inline_comment(self, body: str, relevant_file: str,
relevant_line_in_file: str):
raise NotImplementedError(
'Creating inline comments is not implemented for the gerrit '
'provider')
def publish_labels(self, labels):
# Not applicable to the local git provider,

View File

@ -106,9 +106,9 @@ class GitProvider(ABC):
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
@abstractmethod
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
pass
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str,
absolute_position: int = None):
raise NotImplementedError("This git provider does not support creating inline comments yet")
@abstractmethod
def publish_inline_comments(self, comments: list[dict]):

View File

@ -206,8 +206,12 @@ class GithubProvider(GitProvider):
self.publish_inline_comments([self.create_inline_comment(body, relevant_file, relevant_line_in_file)])
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
position, absolute_position = find_line_number_of_relevant_line_in_file(self.diff_files, relevant_file.strip('`'), relevant_line_in_file)
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str,
absolute_position: int = None):
position, absolute_position = find_line_number_of_relevant_line_in_file(self.diff_files,
relevant_file.strip('`'),
relevant_line_in_file,
absolute_position)
if position == -1:
if get_settings().config.verbosity_level >= 2:
get_logger().info(f"Could not find position for {relevant_file} {relevant_line_in_file}")

View File

@ -183,7 +183,7 @@ class GitLabProvider(GitProvider):
self.send_inline_comment(body, edit_type, found, relevant_file, relevant_line_in_file, source_line_no,
target_file, target_line_no)
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str, absolute_position: int = None):
raise NotImplementedError("Gitlab provider does not support creating inline comments yet")
def create_inline_comments(self, comments: list[dict]):

View File

@ -121,9 +121,6 @@ class LocalGitProvider(GitProvider):
def publish_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError('Publishing inline comments is not implemented for the local git provider')
def create_inline_comment(self, body: str, relevant_file: str, relevant_line_in_file: str):
raise NotImplementedError('Creating inline comments is not implemented for the local git provider')
def publish_inline_comments(self, comments: list[dict]):
raise NotImplementedError('Publishing inline comments is not implemented for the local git provider')

View File

@ -3,8 +3,6 @@ commands_text = "> **/review**: Request a review of your Pull Request. \n" \
"> **/improve [--extended]**: Suggest code improvements. Extended mode provides a higher quality feedback. \n" \
"> **/ask \\<QUESTION\\>**: Ask a question about the PR. \n" \
"> **/update_changelog**: Update the changelog based on the PR's contents. \n" \
"> **/add_docs**: Generate docstring for new components introduced in the PR. \n" \
"> **/generate_labels**: Generate labels for the PR based on the PR's contents. \n" \
"> see the [tools guide](https://github.com/Codium-ai/pr-agent/blob/main/docs/TOOLS_GUIDE.md) for more details.\n\n" \
">To edit any configuration parameter from the [configuration.toml](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/configuration.toml), add --config_path=new_value. \n" \
">For example: /review --pr_reviewer.extra_instructions=\"focus on the file: ...\" \n" \

View File

@ -62,6 +62,7 @@ include_generated_by_header=true
[pr_code_suggestions] # /improve #
num_code_suggestions=4
summarize = false
include_improved_code = true
extra_instructions = ""
rank_suggestions = false
# params for '/improve --extended' mode
@ -78,6 +79,8 @@ docs_style = "Sphinx Style" # "Google Style with Args, Returns, Attributes...etc
push_changelog_changes=false
extra_instructions = ""
[pr_analyze] # /analyze #
[pr_config] # /config #
[github]

View File

@ -32,13 +32,13 @@ __old hunk__
Specific instructions:
- Provide up to {{ num_code_suggestions }} code suggestions. Try to provide diverse and insightful suggestions.
- Prioritize suggestions that address major problems, issues and bugs in the code. As a second priority, suggestions should focus on best practices, code readability, maintainability, enhancments, performance, and other aspects.
- Prioritize suggestions that address major problems, issues and bugs in the code. As a second priority, suggestions should focus on enhancment, best practice, performance, maintainability, and other aspects.
- Don't suggest to add docstring, type hints, or comments.
- Suggestions should refer only to code from the '__new hunk__' sections, and focus on new lines of code (lines starting with '+').
- Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the '__new hunk__' code.
- For each suggestion, make sure to take into consideration also the context, meaning the lines before and after the relevant code.
- Provide the exact line numbers range (inclusive) for each suggestion.
- Assume there is additional relevant code, that is not included in the diff.
- When quoting variables or names from the code, use backticks (`) instead of single quote (').
{%- if extra_instructions %}
@ -49,65 +49,41 @@ Extra instructions from the user:
======
{%- endif %}
The output must be a YAML object equivalent to type PRCodeSuggestions, according to the following Pydantic definitions:
=====
class CodeSuggestion(BaseModel):
relevant_file: str = Field(description="the relevant file full path")
suggestion_content: str = Field(description="an actionable suggestion for meaningfully improving the new code introduced in the PR")
existing_code: str = Field(description="a code snippet, showing the relevant code lines from a '__new hunk__' section. It must be contiguous, correctly formatted and indented, and without line numbers")
relevant_lines_start: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion starts (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
relevant_lines_end: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion ends (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
improved_code: str = Field(description="a new code snippet, that can be used to replace the relevant lines in '__new hunk__' code. Replacement suggestions should be complete, correctly formatted and indented, and without line numbers")
label: str = Field(description="a single label for the suggestion, to help the user understand the suggestion type. For example: 'security', 'bug', 'performance', 'enhancement', 'possible issue', 'best practice', 'maintainability', etc. Other labels are also allowed")
class PRCodeSuggestions(BaseModel):
code_suggestions: List[CodeSuggestion]
=====
You must use the following YAML schema to format your answer:
```yaml
Code suggestions:
type: array
minItems: 1
maxItems: {{ num_code_suggestions }}
uniqueItems: true
items:
relevant file:
type: string
description: the relevant file full path
suggestion content:
type: string
description: |-
a concrete suggestion for meaningfully improving the new PR code.
existing code:
type: string
description: |-
a code snippet showing the relevant code lines from a '__new hunk__' section.
It must be contiguous, correctly formatted and indented, and without line numbers.
relevant lines start:
type: integer
description: |-
The relevant line number from a '__new hunk__' section where the suggestion starts (inclusive).
Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above.
relevant lines end:
type: integer
description: |-
The relevant line number from a '__new hunk__' section where the suggestion ends (inclusive).
Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above.
improved code:
type: string
description: |-
a new code snippet that can be used to replace the relevant lines in '__new hunk__' code.
Replacement suggestions should be complete, correctly formatted and indented, and without line numbers.
```
Example output:
```yaml
Code suggestions:
- relevant file: |-
code_suggestions:
- relevant_file: |-
src/file1.py
suggestion content: |-
suggestion_content: |-
Add a docstring to func1()
existing code: |-
existing_code: |-
def func1():
relevant lines start: |-
12
relevant lines end: |-
12
improved code: |-
relevant_lines_start: 12
relevant_lines_end: 12
improved_code: |-
...
label: |-
...
```
Each YAML output MUST be after a newline, indented, with block scalar indicator ('|-').
Don't repeat the prompt in the answer, and avoid outputting the 'type' and 'description' fields.
"""
user="""PR Info:

View File

@ -1,10 +1,15 @@
[pr_description_prompt]
system="""You are PR-Reviewer, a language model designed to review a Git Pull Request (PR).
Your task is to provide a full description for the PR content - title, type, description, and main files walkthrough.
{%- if enable_custom_labels %}
Your task is to provide a full description for the PR content - files walkthrough, title, type, description and labels.
{%- else %}
Your task is to provide a full description for the PR content - files walkthrough, title, type, and description.
{%- endif %}
- Focus on the new PR code (lines starting with '+').
- Keep in mind that the 'Previous title', 'Previous description' and 'Commit messages' sections may be partial, simplistic, non-informative or out of date. Hence, compare them to the PR diff code, and use them only as a reference.
- The generated title and description should prioritize the most significant changes.
- If needed, each YAML output should be in block scalar indicator ('|-')
- When quoting variables or names from the code, use backticks (`) instead of single quote (').
{%- if extra_instructions %}
@ -45,43 +50,24 @@ Class FileDescription(BaseModel):
{%- endif %}
Class PRDescription(BaseModel):
type: List[PRType] = Field(description="one or more types that describe the PR content. Return the label value, not the name.")
{%- if enable_semantic_files_types %}
pr_files[List[FileDescription]] = Field(max_items=15, description="a list of the files in the PR, and their changes summary.")
{%- endif %}
description: str = Field(description="an informative and concise description of the PR. Use bullet points. Display first the most significant changes.")
title: str = Field(description="an informative title for the PR, describing its main theme")
type: List[PRType] = Field(description="one or more types that describe the PR type. Return the label value, not the name.")
description: str = Field(description="an informative and concise description of the PR. {%- if use_bullet_points %} Use bullet points.{% endif %}")
{%- if enable_custom_labels %}
labels: List[Label] = Field(min_items=0, description="choose the relevant custom labels that describe the PR content, and return their keys. Use the value field of the Label object to better understand the label meaning.")
{%- endif %}
{%- if enable_file_walkthrough %}
main_files_walkthrough: List[FileWalkthrough] = Field(max_items=10)
{%- endif %}
{%- if enable_semantic_files_types %}
pr_files[List[FileDescription]] = Field(max_items=15")
{%- endif %}
=====
Example output:
```yaml
title: |-
...
type:
- ...
- ...
{%- if enable_custom_labels %}
labels:
- |
...
- |
...
{%- endif %}
description: |-
...
{%- if enable_file_walkthrough %}
main_files_walkthrough:
- ...
- ...
{%- endif %}
{%- if enable_semantic_files_types %}
pr_files:
- filename: |
@ -92,6 +78,17 @@ pr_files:
...
...
{%- endif %}
description: |-
...
title: |-
...
{%- if enable_custom_labels %}
labels:
- |
...
- |
...
{%- endif %}
```
Answer should be a valid YAML, and nothing else. Each YAML output MUST be after a newline, with proper indent, and block scalar indicator ('|-')

View File

@ -32,6 +32,7 @@ Code suggestions guidelines:
- Avoid making suggestions that have already been implemented in the PR code. For example, if you want to add logs, or change a variable to const, or anything else, make sure it isn't already in the PR code.
- Don't suggest to add docstring, type hints, or comments.
- Suggestions should focus on the new code added in the PR diff (lines starting with '+')
- When quoting variables or names from the code, use backticks (`) instead of single quote (').
{%- endif %}
{%- if extra_instructions %}

View File

@ -65,14 +65,14 @@ class PRCodeSuggestions:
data = self._prepare_pr_code_suggestions()
else:
data = await retry_with_fallback_models(self._prepare_prediction_extended)
if (not data) or (not 'Code suggestions' in data):
if (not data) or (not 'code_suggestions' in data):
get_logger().info('No code suggestions found for PR.')
return
if (not self.is_extended and get_settings().pr_code_suggestions.rank_suggestions) or \
(self.is_extended and get_settings().pr_code_suggestions.rank_extended_suggestions):
get_logger().info('Ranking Suggestions...')
data['Code suggestions'] = await self.rank_suggestions(data['Code suggestions'])
data['code_suggestions'] = await self.rank_suggestions(data['code_suggestions'])
if get_settings().config.publish_output:
get_logger().info('Pushing PR code suggestions...')
@ -116,44 +116,73 @@ class PRCodeSuggestions:
def _prepare_pr_code_suggestions(self) -> Dict:
review = self.prediction.strip()
data = load_yaml(review)
data = load_yaml(review,
keys_fix_yaml=["relevant_file", "suggestion_content", "existing_code", "improved_code"])
if isinstance(data, list):
data = {'Code suggestions': data}
data = {'code_suggestions': data}
# remove invalid suggestions
suggestion_list = []
for i, suggestion in enumerate(data['code_suggestions']):
if suggestion['existing_code'] != suggestion['improved_code']:
suggestion_list.append(suggestion)
else:
get_logger().debug(
f"Skipping suggestion {i + 1}, because existing code is equal to improved code {suggestion['existing_code']}")
data['code_suggestions'] = suggestion_list
return data
def push_inline_code_suggestions(self, data):
code_suggestions = []
if not data['Code suggestions']:
if not data['code_suggestions']:
get_logger().info('No suggestions found to improve this PR.')
return self.git_provider.publish_comment('No suggestions found to improve this PR.')
for d in data['Code suggestions']:
for d in data['code_suggestions']:
try:
if get_settings().config.verbosity_level >= 2:
get_logger().info(f"suggestion: {d}")
relevant_file = d['relevant file'].strip()
relevant_lines_start = int(d['relevant lines start']) # absolute position
relevant_lines_end = int(d['relevant lines end'])
content = d['suggestion content']
new_code_snippet = d['improved code']
relevant_file = d['relevant_file'].strip()
relevant_lines_start = int(d['relevant_lines_start']) # absolute position
relevant_lines_end = int(d['relevant_lines_end'])
content = d['suggestion_content'].rstrip()
new_code_snippet = d['improved_code'].rstrip()
label = d['label'].strip()
if new_code_snippet:
new_code_snippet = self.dedent_code(relevant_file, relevant_lines_start, new_code_snippet)
body = f"**Suggestion:** {content}\n```suggestion\n" + new_code_snippet + "\n```"
if get_settings().pr_code_suggestions.include_improved_code:
body = f"**Suggestion:** {content} [{label}]\n```suggestion\n" + new_code_snippet + "\n```"
code_suggestions.append({'body': body, 'relevant_file': relevant_file,
'relevant_lines_start': relevant_lines_start,
'relevant_lines_end': relevant_lines_end})
else:
if self.git_provider.is_supported("create_inline_comment"):
body = f"**Suggestion:** {content} [{label}]"
comment = self.git_provider.create_inline_comment(body, relevant_file, "",
absolute_position=relevant_lines_end)
if comment:
code_suggestions.append(comment)
else:
get_logger().error("Inline comments are not supported by the git provider")
except Exception:
if get_settings().config.verbosity_level >= 2:
get_logger().info(f"Could not parse suggestion: {d}")
if get_settings().pr_code_suggestions.include_improved_code:
is_successful = self.git_provider.publish_code_suggestions(code_suggestions)
else:
is_successful = self.git_provider.publish_inline_comments(code_suggestions)
if not is_successful:
get_logger().info("Failed to publish code suggestions, trying to publish each suggestion separately")
for code_suggestion in code_suggestions:
if get_settings().pr_code_suggestions.include_improved_code:
self.git_provider.publish_code_suggestions([code_suggestion])
else:
self.git_provider.publish_inline_comments([code_suggestion])
def dedent_code(self, relevant_file, relevant_lines_start, new_code_snippet):
try: # dedent code snippet
@ -195,8 +224,8 @@ class PRCodeSuggestions:
for prediction in prediction_list:
self.prediction = prediction
data_per_chunk = self._prepare_pr_code_suggestions()
if "Code suggestions" in data:
data["Code suggestions"].extend(data_per_chunk["Code suggestions"])
if "code_suggestions" in data:
data["code_suggestions"].extend(data_per_chunk["code_suggestions"])
else:
data.update(data_per_chunk)
self.data = data
@ -214,11 +243,8 @@ class PRCodeSuggestions:
"""
suggestion_list = []
# remove invalid suggestions
for i, suggestion in enumerate(data):
if suggestion['existing code'] != suggestion['improved code']:
for suggestion in data:
suggestion_list.append(suggestion)
data_sorted = [[]] * len(suggestion_list)
try:
@ -264,24 +290,25 @@ class PRCodeSuggestions:
for ext in extensions:
extension_to_language[ext] = language
for s in data['Code suggestions']:
for s in data['code_suggestions']:
try:
extension_s = s['relevant file'].rsplit('.')[-1]
code_snippet_link = self.git_provider.get_line_link(s['relevant file'], s['relevant lines start'],
s['relevant lines end'])
data_markdown += f"\n💡 Suggestion:\n\n**{s['suggestion content']}**\n\n"
extension_s = s['relevant_file'].rsplit('.')[-1]
code_snippet_link = self.git_provider.get_line_link(s['relevant_file'], s['relevant_lines_start'],
s['relevant_lines_end'])
label = s['label'].strip()
data_markdown += f"\n💡 [{label}]\n\n**{s['suggestion_content'].rstrip().rstrip()}**\n\n"
if code_snippet_link:
data_markdown += f" File: [{s['relevant file']} ({s['relevant lines start']}-{s['relevant lines end']})]({code_snippet_link})\n\n"
data_markdown += f" File: [{s['relevant_file']} ({s['relevant_lines_start']}-{s['relevant_lines_end']})]({code_snippet_link})\n\n"
else:
data_markdown += f"File: {s['relevant file']} ({s['relevant lines start']}-{s['relevant lines end']})\n\n"
data_markdown += f"File: {s['relevant_file']} ({s['relevant_lines_start']}-{s['relevant_lines_end']})\n\n"
if self.git_provider.is_supported("gfm_markdown"):
data_markdown += "<details> <summary> Example code:</summary>\n\n"
data_markdown += f"___\n\n"
language_name = "python"
if extension_s and (extension_s in extension_to_language):
language_name = extension_to_language[extension_s]
data_markdown += f"Existing code:\n```{language_name}\n{s['existing code']}\n```\n"
data_markdown += f"Improved code:\n```{language_name}\n{s['improved code']}\n```\n"
data_markdown += f"Existing code:\n```{language_name}\n{s['existing_code'].rstrip()}\n```\n"
data_markdown += f"Improved code:\n```{language_name}\n{s['improved_code'].rstrip()}\n```\n"
if self.git_provider.is_supported("gfm_markdown"):
data_markdown += "</details>\n"
data_markdown += "\n___\n\n"
@ -290,5 +317,3 @@ class PRCodeSuggestions:
self.git_provider.publish_comment(data_markdown)
except Exception as e:
get_logger().info(f"Failed to publish summarized code suggestions, error: {e}")

View File

@ -48,7 +48,6 @@ class PRDescription:
"description": self.git_provider.get_pr_description(full=False),
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
"use_bullet_points": get_settings().pr_description.use_bullet_points,
"extra_instructions": get_settings().pr_description.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
"enable_custom_labels": get_settings().config.enable_custom_labels,
@ -187,6 +186,18 @@ class PRDescription:
# Load the AI prediction data into a dictionary
self.data = load_yaml(self.prediction.strip())
# re-order keys
if 'title' in self.data:
self.data['title'] = self.data.pop('title')
if 'type' in self.data:
self.data['type'] = self.data.pop('type')
if 'labels' in self.data:
self.data['labels'] = self.data.pop('labels')
if 'description' in self.data:
self.data['description'] = self.data.pop('description')
if 'pr_files' in self.data:
self.data['pr_files'] = self.data.pop('pr_files')
if get_settings().pr_description.add_original_user_description and self.user_description:
self.data["User Description"] = self.user_description
@ -371,7 +382,8 @@ class PRDescription:
<summary><strong>{filename_publish}</strong></summary>
<ul>
{filename}<br><br>
<strong>{file_change_description}</strong>
**{file_change_description}**
</ul>
</details>
</td>

View File

@ -43,7 +43,6 @@ class PRGenerateLabels:
"description": self.git_provider.get_pr_description(full=False),
"language": self.main_pr_language,
"diff": "", # empty diff for initial calculation
"use_bullet_points": get_settings().pr_description.use_bullet_points,
"extra_instructions": get_settings().pr_description.extra_instructions,
"commit_messages_str": self.git_provider.get_commit_messages(),
"enable_custom_labels": get_settings().config.enable_custom_labels,

View File

@ -19,6 +19,19 @@ class TestTryFixYaml:
expected_output = {"relevant line": "value: 3"}
assert try_fix_yaml(review_text) == expected_output
# The function extracts YAML snippet
def test_extract_snippet(self):
review_text = '''\
Here is the answer in YAML format:
```yaml
name: John Smith
age: 35
```
'''
expected_output = {'name': 'John Smith', 'age': 35}
assert try_fix_yaml(review_text) == expected_output
# The function removes the last line(s) of the YAML string and successfully parses the YAML string.
def test_remove_last_line(self):
review_text = "key: value\nextra invalid line\n"