diff --git a/INSTALL.md b/INSTALL.md
index 285b3f14..5f107b20 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -24,9 +24,15 @@ To request a review for a PR, or ask a question about a PR, you can run directly
 
 1. To request a review for a PR, run the following command:
 
+For GitHub:
 ```
 docker run --rm -it -e OPENAI.KEY=<your key> -e GITHUB.USER_TOKEN=<your token> codiumai/pr-agent --pr_url <pr_url> review
 ```
+For GitLab:
+```
+docker run --rm -it -e OPENAI.KEY=<your key> -e CONFIG.GIT_PROVIDER=gitlab -e GITLAB.PERSONAL_ACCESS_TOKEN=<your token> codiumai/pr-agent --pr_url <pr_url> review
+```
+For other git providers, update CONFIG.GIT_PROVIDER accordingly, and check the `pr_agent/settings/.secrets_template.toml` file for the expected names and values of the environment variables.
 
 2. To ask a question about a PR, run the following command:
@@ -354,7 +360,7 @@ PYTHONPATH="/PATH/TO/PROJECTS/pr-agent" python pr_agent/cli.py \
 ```
 WEBHOOK_SECRET=$(python -c "import secrets; print(secrets.token_hex(10))")
 ```
-3. Follow the instructions to build the Docker image, setup a secrets file and deploy on your own server from [Method 5](#method-5-run-as-a-github-app).
+3. Follow the instructions to build the Docker image, set up a secrets file and deploy on your own server from [Method 5](#method-5-run-as-a-github-app), steps 4-7.
 4. In the secrets file, fill in the following:
    - Your OpenAI key.
    - In the [gitlab] section, fill in personal_access_token and shared_secret. The access token can be a personal access token, or a group or project access token.
@@ -363,11 +369,5 @@ WEBHOOK_SECRET=$(python -c "import secrets; print(secrets.token_hex(10))")
    In the "Trigger" section, check the ‘comments’ and ‘merge request events’ boxes.
 6. Test your installation by opening a merge request or commenting on a merge request using one of CodiumAI's commands.
-
----
-### Appendix - **Debugging LLM API Calls**
-If you're testing your codium/pr-agent server, and need to see if calls were made successfully + the exact call logs, you can use the [LiteLLM Debugger tool](https://docs.litellm.ai/docs/debugging/hosted_debugging).
-
-You can do this by setting `litellm_debugger=true` in configuration.toml. Your Logs will be viewable in real-time @ `admin.litellm.ai/`. Set your email in the `.secrets.toml` under 'user_email'.
-
-
\ No newline at end of file
+=======
diff --git a/README.md b/README.md
index eff850e6..ade40209 100644
--- a/README.md
+++ b/README.md
@@ -15,20 +15,20 @@ Making pull requests less painful with an AI agent
-CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and supports several types of commands:
+CodiumAI `PR-Agent` is an open-source tool aiming to help developers review pull requests faster and more efficiently. It automatically analyzes the pull request and supports several types of commands:
 
-**Auto Description (/describe)**: Automatically generating [PR description](https://github.com/Codium-ai/pr-agent/pull/229#issue-1860711415) - title, type, summary, code walkthrough and labels.
+‣ **Auto Description (`/describe`)**: Automatically generating [PR description](https://github.com/Codium-ai/pr-agent/pull/229#issue-1860711415) - title, type, summary, code walkthrough and labels.
 \
-**Auto Review (/review)**: [Adjustable feedback](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695022908) about the PR main theme, type, relevant tests, security issues, score, and various suggestions for the PR content.
+‣ **Auto Review (`/review`)**: [Adjustable feedback](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695022908) about the PR main theme, type, relevant tests, security issues, score, and various suggestions for the PR content.
 \
-**Question Answering (/ask ...)**: Answering [free-text questions](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695021332) about the PR.
+‣ **Question Answering (`/ask ...`)**: Answering [free-text questions](https://github.com/Codium-ai/pr-agent/pull/229#issuecomment-1695021332) about the PR.
 \
-**Code Suggestions (/improve)**: [Committable code suggestions](https://github.com/Codium-ai/pr-agent/pull/229#discussion_r1306919276) for improving the PR.
+‣ **Code Suggestions (`/improve`)**: [Committable code suggestions](https://github.com/Codium-ai/pr-agent/pull/229#discussion_r1306919276) for improving the PR.
 \
-**Update Changelog (/update_changelog)**: Automatically updating the CHANGELOG.md file with the [PR changes](https://github.com/Codium-ai/pr-agent/pull/168#discussion_r1282077645).
+‣ **Update Changelog (`/update_changelog`)**: Automatically updating the CHANGELOG.md file with the [PR changes](https://github.com/Codium-ai/pr-agent/pull/168#discussion_r1282077645).
 
-See the [usage guide](./Usage.md) for instructions how to run the different tools from [CLI](./Usage.md#working-from-a-local-repo-cli), or by [online usage](./Usage.md#online-usage).
+See the [usage guide](./Usage.md) for instructions on how to run the different tools from the [CLI](./Usage.md#working-from-a-local-repo-cli) or via [online usage](./Usage.md#online-usage), as well as additional details on optional commands and configurations.
 
 Example results:
 
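The WEBHOOK_SECRET step in the INSTALL.md hunk above generates the GitLab shared secret with a Python one-liner. As a minimal sketch of what that call produces (stdlib only; nothing here beyond `secrets.token_hex(10)` comes from the diff):

```python
import secrets

# Same call as the INSTALL.md one-liner: 10 random bytes, hex-encoded.
webhook_secret = secrets.token_hex(10)

# 10 bytes become 20 lowercase hexadecimal characters, suitable for
# pasting into a GitLab webhook "Secret token" field.
assert len(webhook_secret) == 20
assert all(c in "0123456789abcdef" for c in webhook_secret)
print(webhook_secret)
```

`token_hex` draws from the OS CSPRNG, so each run prints a different 20-character secret.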
@@ -199,4 +199,4 @@ Here are some advantages of PR-Agent:
 - [Aider - GPT powered coding in your terminal](https://github.com/paul-gauthier/aider)
 - [openai-pr-reviewer](https://github.com/coderabbitai/openai-pr-reviewer)
 - [CodeReview BOT](https://github.com/anc95/ChatGPT-CodeReview)
-- [AI-Maintainer](https://github.com/merwanehamadi/AI-Maintainer)
\ No newline at end of file
+- [AI-Maintainer](https://github.com/merwanehamadi/AI-Maintainer)
diff --git a/Usage.md b/Usage.md
index f8624d7e..80dbc3bd 100644
--- a/Usage.md
+++ b/Usage.md
@@ -149,15 +149,58 @@ TBD
 
 #### Changing a model
 
 See [here](pr_agent/algo/__init__.py) for the list of available models.
 
-To use Llama2 model, for example, set:
+#### Azure
+To use Azure, set:
+```
+api_key = "" # your azure api key
+api_type = "azure"
+api_version = '2023-05-15' # Check Azure documentation for the current API version
+api_base = "" # The base URL for your Azure OpenAI resource. e.g. "https://<your resource name>.openai.azure.com"
+deployment_id = "" # The deployment name you chose when you deployed the engine
+```
+in your .secrets.toml
+
+and
+```
+[config]
+model = "" # the OpenAI model you've deployed on Azure (e.g. gpt-3.5-turbo)
+```
+in the configuration.toml
+
+#### Huggingface
+
+To use a new model with Huggingface Inference Endpoints, for example, set:
+```
+[__init__.py]
+MAX_TOKENS = {
+    "model-name-on-huggingface": <max_tokens>
+}
+e.g.
+MAX_TOKENS = {
+    ...,
+    "meta-llama/Llama-2-7b-chat-hf": 4096
+}
+
+[config] # in configuration.toml
+model = "huggingface/meta-llama/Llama-2-7b-chat-hf"
+
+[huggingface] # in .secrets.toml
+key = ... # your huggingface api key
+api_base = ... # the base url for your huggingface inference endpoint
+```
+
+#### Replicate
+
+To use Llama2 model with Replicate, for example, set:
+```
+[config] # in configuration.toml
 model = "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"
-[replicate]
+[replicate] # in .secrets.toml
 key = ...
 ```
 (you can obtain a Llama2 key from [here](https://replicate.com/replicate/llama-2-70b-chat/api))
+
 
 Also review the [AiHandler](pr_agent/algo/ai_handler.py) file for instructions on how to set keys for other models.
 
 #### Extra instructions
diff --git a/pr_agent/algo/__init__.py b/pr_agent/algo/__init__.py
index 82a2af40..56511cd0 100644
--- a/pr_agent/algo/__init__.py
+++ b/pr_agent/algo/__init__.py
@@ -12,4 +12,5 @@ MAX_TOKENS = {
     'claude-2': 100000,
     'command-nightly': 4096,
     'replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1': 4096,
+    'meta-llama/Llama-2-7b-chat-hf': 4096
 }
diff --git a/pr_agent/algo/ai_handler.py b/pr_agent/algo/ai_handler.py
index fcc5f04c..b48924d6 100644
--- a/pr_agent/algo/ai_handler.py
+++ b/pr_agent/algo/ai_handler.py
@@ -5,9 +5,7 @@ import openai
 from litellm import acompletion
 from openai.error import APIError, RateLimitError, Timeout, TryAgain
 from retry import retry
-
 from pr_agent.config_loader import get_settings
-
 OPENAI_RETRIES = 5
@@ -26,7 +24,6 @@ class AiHandler:
         try:
             openai.api_key = get_settings().openai.key
             litellm.openai_key = get_settings().openai.key
-            litellm.debugger = get_settings().config.litellm_debugger
             self.azure = False
             if get_settings().get("OPENAI.ORG", None):
                 litellm.organization = get_settings().openai.org
@@ -48,6 +45,8 @@ class AiHandler:
                 litellm.replicate_key = get_settings().replicate.key
             if get_settings().get("HUGGINGFACE.KEY", None):
                 litellm.huggingface_key = get_settings().huggingface.key
+            if get_settings().get("HUGGINGFACE.API_BASE", None):
+                litellm.api_base = get_settings().huggingface.api_base
         except AttributeError as e:
             raise ValueError("OpenAI key is required") from e
diff --git a/pr_agent/settings/.secrets_template.toml b/pr_agent/settings/.secrets_template.toml
index 16c121ff..d7631b3c 100644
--- a/pr_agent/settings/.secrets_template.toml
+++ b/pr_agent/settings/.secrets_template.toml
@@ -28,6 +28,11 @@ key = "" # Optional, uncomment if you want to use Cohere. Acquire through https:
 
 [replicate]
 key = "" # Optional, uncomment if you want to use Replicate. Acquire through https://replicate.com/
+
+[huggingface]
+key = "" # Optional, uncomment if you want to use Huggingface Inference API. Acquire through https://huggingface.co/docs/api-inference/quicktour
+api_base = "" # the base url for your huggingface inference endpoint
+
 [github]
 # ---- Set the following only for deployment type == "user"
 user_token = "" # A GitHub personal access token with 'repo' scope.
diff --git a/pr_agent/settings/configuration.toml b/pr_agent/settings/configuration.toml
index 2188e8cc..da3e1924 100644
--- a/pr_agent/settings/configuration.toml
+++ b/pr_agent/settings/configuration.toml
@@ -10,7 +10,6 @@ use_repo_settings_file=true
 ai_timeout=180
 max_description_tokens = 500
 max_commits_tokens = 500
-litellm_debugger=false
 secret_provider="google_cloud_storage"
 
 [pr_reviewer] # /review #
diff --git a/requirements.txt b/requirements.txt
index cc5254fd..a4c6756f 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -7,17 +7,15 @@ Jinja2==3.1.2
 tiktoken==0.4.0
 uvicorn==0.22.0
 python-gitlab==3.15.0
-pytest==7.4.0
+pytest~=7.4.0
 aiohttp==3.8.4
 atlassian-python-api==3.39.0
 GitPython==3.1.32
 PyYAML==6.0
 starlette-context==0.3.6
-litellm==0.1.504
+litellm~=0.1.538
 boto3==1.28.25
 google-cloud-storage==2.10.0
 ujson==5.8.0
 azure-devops==7.1.0b3
-msrest==0.7.1
-pinecone-client==2.2.2
-pinecone_datasets==0.6.1
\ No newline at end of file
+msrest==0.7.1
\ No newline at end of file
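The `meta-llama/Llama-2-7b-chat-hf` entry added to `pr_agent/algo/__init__.py` above is a plain token-limit registration. A small sketch of how such a table is typically consumed; the model names and limits come from the diff, but the `token_limit` helper is illustrative and not part of the codebase:

```python
# Mirrors the MAX_TOKENS table touched in pr_agent/algo/__init__.py
# (subset shown; the full table lives in the repo).
MAX_TOKENS = {
    'claude-2': 100000,
    'command-nightly': 4096,
    'replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1': 4096,
    'meta-llama/Llama-2-7b-chat-hf': 4096,  # entry added by this diff
}

def token_limit(model: str) -> int:
    # Illustrative helper (not in the diff): fail fast when a model is
    # configured in configuration.toml but never registered in MAX_TOKENS.
    try:
        return MAX_TOKENS[model]
    except KeyError:
        raise ValueError(
            f"Model {model!r} has no MAX_TOKENS entry; "
            f"register it in pr_agent/algo/__init__.py"
        )

print(token_limit('meta-llama/Llama-2-7b-chat-hf'))  # → 4096
```

This is why the Usage.md Huggingface instructions start with editing `__init__.py`: without the registration, the prompt-budgeting code has no token ceiling to work against.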