From eaf7cfbcf28f14d59505e1b92e84a4ddb11baee7 Mon Sep 17 00:00:00 2001
From: mrT23
Date: Mon, 2 Oct 2023 18:18:15 +0300
Subject: [PATCH] readme updates

---
 PR_COMPRESSION.md | 6 +++---
 Usage.md          | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/PR_COMPRESSION.md b/PR_COMPRESSION.md
index 8e3e5fd7..ef053efe 100644
--- a/PR_COMPRESSION.md
+++ b/PR_COMPRESSION.md
@@ -1,4 +1,4 @@
-# Git Patch Logic
+# PR Compression Strategy
 There are two scenarios:
 1. The PR is small enough to fit in a single prompt (including system and user prompt)
 2. The PR is too large to fit in a single prompt (including system and user prompt)
@@ -16,7 +16,7 @@ We prioritize the languages of the repo based on the following criteria:
 ## Small PR
 In this case, we can fit the entire PR in a single prompt:
 1. Exclude binary files and non code files (e.g. images, pdfs, etc)
-2. We Expand the surrounding context of each patch to 6 lines above and below the patch
+2. We expand the surrounding context of each patch to 3 lines above and below the patch
 
 ## Large PR
 ### Motivation
@@ -25,7 +25,7 @@ We want to be able to pack as much information as possible in a single LMM promp
 
 
 
-#### PR compression strategy
+#### Compression strategy
 We prioritize additions over deletions:
 - Combine all deleted files into a single list (`deleted files`)
 - File patches are a list of hunks, remove all hunks of type deletion-only from the hunks in the file patch
diff --git a/Usage.md b/Usage.md
index 96ffc8c0..6176eaf0 100644
--- a/Usage.md
+++ b/Usage.md
@@ -248,9 +248,9 @@ This mode provide a very good speed-quality-cost tradeoff, and can handle most P
 When the PR is above the token limit, it employs a [PR Compression strategy](./PR_COMPRESSION.md).
 
 However, for very large PRs, or in case you want to emphasize quality over speed and cost, there are 2 possible solutions:
-1) [use a model](#changing-a-model) with larger context, like GPT-32K, or claude-100K. This solution will be applicable for all the tools
+1) [Use a model](#changing-a-model) with a larger context, like GPT-32K or claude-100K. This solution is applicable to all the tools.
 2) For the `/improve` tool, there is an ['extended' mode](./docs/IMPROVE.md) (`/improve --extended`),
-which divides the PR to chunks, and process each chunk separately, so regardless of the model, no compression will be done (but for large PRs, multiple calls may occur)
+which divides the PR into chunks and processes each chunk separately. With this mode, regardless of the model, no compression will be done (but for large PRs, multiple model calls may occur).
 
 
 ### Appendix - additional configurations walkthrough