From 416b150d666939837725d5df92d9431053eee2dc Mon Sep 17 00:00:00 2001
From: mrT23
Date: Sun, 2 Jun 2024 11:28:48 +0300
Subject: [PATCH 1/2] Add documentation for PR-Agent code fine-tuning benchmark and update mkdocs.yml

---
 docs/docs/finetuning_benchmark/index.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/docs/docs/finetuning_benchmark/index.md b/docs/docs/finetuning_benchmark/index.md
index b41660d1..a89eb93d 100644
--- a/docs/docs/finetuning_benchmark/index.md
+++ b/docs/docs/finetuning_benchmark/index.md
@@ -52,11 +52,18 @@ Here are the results:

 ### Training dataset

-Our training dataset is comprised of 25,000 pull requests, aggregated from permissive license repos. For each pull request, we generated responses for the three main tools of PR-Agent:
+Our training dataset comprises 25,000 pull requests, aggregated from permissive license repos. For each pull request, we generated responses for the three main tools of PR-Agent:
 [Describe](https://pr-agent-docs.codium.ai/tools/describe/), [Review](https://pr-agent-docs.codium.ai/tools/improve/) and [Improve](https://pr-agent-docs.codium.ai/tools/improve/).

 On the raw data collected, we employed various automatic and manual cleaning techniques to ensure the outputs were of the highest quality, and suitable for instruct-tuning.
-An example input prompt can be found [here](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_code_suggestions_prompts.toml), and an example output can be found [here](https://github.com/Codium-ai/pr-agent/pull/910#issuecomment-2118761309).
+
+Here are the prompts, and example outputs, used to fine-tune the models:
+
+| Tool     | Prompt                                                                                                     | Example output |
+|----------|------------------------------------------------------------------------------------------------------------|----------------|
+| Describe | [link](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_description_prompts.toml)      | [link](https://github.com/Codium-ai/pr-agent/pull/910#issue-2303989601) |
+| Review   | [link](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_reviewer_prompts.toml)         | [link](https://github.com/Codium-ai/pr-agent/pull/910#issuecomment-2118761219) |
+| Improve  | [link](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_code_suggestions_prompts.toml) | [link](https://github.com/Codium-ai/pr-agent/pull/910#issuecomment-2118761309) |

 ### Evaluation dataset

From f3aa9c02ccf8cd268fb4eeb7947b684dcee8b7f5 Mon Sep 17 00:00:00 2001
From: mrT23
Date: Sun, 2 Jun 2024 11:30:56 +0300
Subject: [PATCH 2/2] Add documentation for PR-Agent code fine-tuning benchmark and update mkdocs.yml

---
 docs/docs/finetuning_benchmark/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/finetuning_benchmark/index.md b/docs/docs/finetuning_benchmark/index.md
index a89eb93d..35ec064b 100644
--- a/docs/docs/finetuning_benchmark/index.md
+++ b/docs/docs/finetuning_benchmark/index.md
@@ -57,7 +57,7 @@ Our training dataset comprises 25,000 pull requests, aggregated from permissive

 On the raw data collected, we employed various automatic and manual cleaning techniques to ensure the outputs were of the highest quality, and suitable for instruct-tuning.

-Here are the prompts, and example outputs, used to fine-tune the models:
+Here are the prompts, and example outputs, used as input-output pairs to fine-tune the models:

 | Tool     | Prompt                                                                                                     | Example output |
 |----------|------------------------------------------------------------------------------------------------------------|----------------|