Running a local LLM
Notice:
- This tutorial is intended for expert users.
- Local LLMs require a lot of computational power.
Smaller models (in terms of parameter count) typically give qualitatively worse responses than bigger ones, but they are faster, need less memory, and might already be sufficient for your use case.
High-level explanation
You can use any program that provides a server with an OpenAI-compatible API. After you have started your service, do the following:
The "Chat Model" field in AI preferences is editable, so you can enter any model you have downloaded
There is a field called "API base URL" in "Expert Settings" where you need to provide the address of an OpenAI-compatible API server
Voilà! You can use a local LLM right away in JabRef.
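To check that a server really speaks the OpenAI-compatible API before pointing JabRef at it, you can query it from the command line. This is a minimal sketch, assuming the server listens at `http://localhost:11434/v1/` (ollama's default; substitute the base URL of whatever tool you use):

```bash
# List the models the server exposes. Any OpenAI-compatible server
# should answer with a JSON list rather than a connection error.
curl http://localhost:11434/v1/models
```

If this returns JSON, JabRef should be able to reach the server with the same base URL.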
Step-by-step guide for ollama
The following steps guide you through using ollama to download and run local LLMs.
1. Install ollama from their website.
2. Select a model that you want to run. ollama provides a large list of models to choose from (we recommend trying `gemma2:2b`, `mistral:7b`, or `tinyllama`).
3. When you have selected your model, type `ollama pull <MODEL>:<PARAMETERS>` in your terminal. `<MODEL>` refers to the model name, like `gemma2` or `mistral`, and `<PARAMETERS>` refers to the parameter count, like `2b` or `9b`. ollama will then download the model for you.
4. After that, you can run `ollama serve` to start a local web server. This server will accept requests and respond with LLM output. Note: the ollama server may already be running, so do not be alarmed by a "cannot bind" error. If it is not yet running, use the following command: `ollama run <MODEL>:<PARAMETERS>` (a complete example session follows after this list).
5. Go to JabRef "Preferences" -> "AI".
6. Set the "AI provider" to "OpenAI".
7. Set the "Chat Model" to the model you have downloaded, in the format `<MODEL>:<PARAMETERS>`.
8. Set the "API base URL" in "Expert Settings" to `http://localhost:11434/v1/`.
Now, you are all set and can chat "locally".
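For example, a full terminal session with the recommended `gemma2:2b` model might look like the following sketch; adjust the model name to whichever one you pulled:

```bash
# Start the local web server (skip this if ollama is already
# running, e.g. as a background service).
ollama serve

# In another terminal: download the model weights.
ollama pull gemma2:2b

# Optional: chat with the model directly in the terminal.
ollama run gemma2:2b

# Verify the OpenAI-compatible endpoint that JabRef will talk to.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma2:2b", "messages": [{"role": "user", "content": "Hello"}]}'
```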
Step-by-step guide for GPT4All
The following steps guide you through using GPT4All to download and run local LLMs.
1. Install GPT4All from their website.
2. Open GPT4All, download a model, configure it in the settings, and run it as a server.
3. Open JabRef and go to "File" > "Preferences" > "AI".
4. Set the "AI provider" to "GPT4All".
5. Set the "Chat model" to the name (including the `.gguf` part) of the model you have downloaded in GPT4All.
6. Set the "API base URL" in "Expert Settings" to `http://localhost:4891/v1/chat/completions` (you can test this endpoint with the request shown below).
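To confirm that GPT4All's server is reachable before configuring JabRef, you can send a test request from the command line. This is a minimal sketch; `Meta-Llama-3-8B-Instruct.Q4_0.gguf` is a hypothetical file name, so replace it with the model name shown in GPT4All:

```bash
# GPT4All's local API server listens on port 4891 by default.
# The model value must match the .gguf file name, including the extension.
curl http://localhost:4891/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama-3-8B-Instruct.Q4_0.gguf",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 50
  }'
```

A JSON response (rather than a connection error) means JabRef can use the same URL.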