adding documentation on how to call local hf models

Krrish Dholakia
2023-09-08 09:59:44 -07:00
parent e90c9e5853
commit 1b098aea13
2 changed files with 28 additions and 0 deletions


@@ -169,6 +169,31 @@ in the configuration.toml
#### Huggingface
**Local**
You can run Huggingface models locally through either [VLLM](https://docs.litellm.ai/docs/providers/vllm) or [Ollama](https://docs.litellm.ai/docs/providers/ollama).
For example, to use a new Huggingface model locally via Ollama, set:
```
[__init__.py]
MAX_TOKENS = {
    "model-name-on-ollama": <max_tokens>
}
e.g.
MAX_TOKENS = {
    ...,
    "llama2": 4096
}

[config] # in configuration.toml
model = "ollama/llama2"

[ollama] # in .secrets.toml
api_base = ... # the base url of your local Ollama server, e.g. http://localhost:11434
```
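As a quick sanity check of the Ollama setup, you can call the model directly through LiteLLM before wiring it into the agent. A minimal sketch, assuming `ollama serve` is running locally with the `llama2` model pulled and listening on the default port 11434:
```python
import litellm

# Smoke test: assumes a local Ollama server with the llama2 model pulled.
response = litellm.completion(
    model="ollama/llama2",
    api_base="http://localhost:11434",  # default Ollama address; adjust if yours differs
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```
If this prints a completion, the same `model` and `api_base` values will work in `configuration.toml` and `.secrets.toml`.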
**Inference Endpoints**
To use a new model with Huggingface Inference Endpoints, for example, set:
```
[__init__.py]


@@ -29,6 +29,9 @@ key = "" # Optional, uncomment if you want to use Replicate. Acquire through htt
key = "" # Optional, uncomment if you want to use Huggingface Inference API. Acquire through https://huggingface.co/docs/api-inference/quicktour key = "" # Optional, uncomment if you want to use Huggingface Inference API. Acquire through https://huggingface.co/docs/api-inference/quicktour
api_base = "" # the base url for your huggingface inference endpoint api_base = "" # the base url for your huggingface inference endpoint
[ollama]
api_base = "" # the base url for your huggingface inference endpoint
[github]
# ---- Set the following only for deployment type == "user"
user_token = "" # A GitHub personal access token with 'repo' scope.