Connect to a Local Model
Connect to a model running on your local machine
Connecting to any model served from 127.0.0.1 or localhost is now FREE.
Video walkthrough: Intro & Initial Setup (0:00-1:18), LM Studio (1:19-2:25), Ollama (2:26-3:10), Jan (3:11-3:59)
What is a Local Model?
When you run a local model, the LLM runs entirely on your own computer, and Steve can use that model instead of one in the cloud. The primary benefit is that your requests never leave your machine, so all your data stays private and secure. The secondary benefit is that it's FREE. We have tested Ask Steve with the providers below, but you should be able to use any system that can expose a model via a local web server.
What are the downsides?
- You have to download big model files (e.g. 3+GB)
- You need to have a capable machine
- The models your local machine can run generally aren't as good as the ones you can access in the cloud
- It can be slower, even though it's running locally
How Do I Use LM Studio With Ask Steve?
- Download and install LM Studio
- Open LM Studio. Click the purple Discover icon in the left nav to find a model and download it.
- Once it's downloaded, press the Load Model button on the 'Download Complete' dialog.
- When it's done loading, go to the Local Server page in LM Studio by pressing the green icon in the left nav. Switch the Status toggle to Running. This will start up a local server on port 1234 with the selected model.
- Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: Streaming from the menu. You won't need an API Key since it's running on your own machine.
- Change the port number in the URL field to match what LM Studio is serving on. LM Studio's port is typically 1234, so the beginning of the URL should be http://localhost:1234...
- Change the model name to match what you downloaded, and any other attributes that you want (context window, output tokens, temperature, etc.). Press TEST to ensure it works, then SAVE NEW MODEL to save it. If the test fails, try the terminal check after this list.
- Congratulations! Steve is now using a model running completely on your own computer!
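If TEST reports an error, it can help to confirm that LM Studio's server is answering outside of Ask Steve. LM Studio's local server speaks an OpenAI-compatible API, so, assuming the default port of 1234, a quick terminal check might look like the two commands below (the model name is just a placeholder; use the name of the model you actually loaded):

curl http://localhost:1234/v1/models

curl http://localhost:1234/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "Say hello"}]}'

The first command lists the models the server knows about (useful for copying the exact model name into Ask Steve), and the second asks for a short reply. If neither returns anything, the server isn't running or is listening on a different port.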
How Do I Use Jan.ai With Ask Steve?
- Download and install Jan
- Open Jan. Click the Explore the Hub button and pick a model to download.
- Once it's downloaded, press the button near the bottom of the left nav with < > inside. Press the Start Server button.
- Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: Streaming from the menu. You won't need an API Key since it's running on your own machine.
- Change the port number in the URL field to match what Jan is serving on. Jan's port is typically 1337, so the beginning of the URL should be http://localhost:1337...
- Change the model name to match what you downloaded, and any other attributes that you want (context window, output tokens, temperature, etc.). Press TEST to ensure it works, then SAVE NEW MODEL to save it. If the test fails, try the terminal check after this list.
- Congratulations! Steve is now using a model running completely on your own computer!
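As with LM Studio, you can sanity-check Jan's server from a terminal before (or after) configuring Ask Steve. Jan's local API server is also OpenAI-compatible, so assuming the default port of 1337, and substituting your downloaded model for the placeholder name, something like this should come back with a response:

curl http://localhost:1337/v1/models

curl http://localhost:1337/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "Say hello"}]}'

If these fail, make sure you pressed Start Server in Jan and check which port the server page shows.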
How Do I Use gpt4all With Ask Steve?
- Download and install gpt4all
- Open gpt4all. Click the Install a Model button and pick a model to download.
- Once it's downloaded, press the Settings icon and, under Application Settings, check Enable Local API Server.
- Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: Non-Streaming from the menu. You won't need an API Key since it's running on your own machine.
- Change the port number in the URL field to match what gpt4all is serving on. gpt4all's port is typically 4891, so the beginning of the URL should be http://localhost:4891...
- Change the model name to match what you downloaded, and any other attributes that you want (context window, output tokens, temperature, etc.). Press TEST to ensure it works, then SAVE NEW MODEL to save it. If the test fails, try the terminal check after this list.
- Congratulations! Steve is now using a model running completely on your own computer!
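gpt4all's local API server also follows the OpenAI chat completions format, so a terminal check similar to the other providers should work here too; this assumes the default port of 4891 and uses a placeholder model name that you should replace with the model you installed:

curl http://localhost:4891/v1/models

curl http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "Say hello"}]}'

If you get no response, confirm that Enable Local API Server is checked and that gpt4all is still running.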
How Do I Use Ollama With Ask Steve?
- Download and install Ollama
- Download a model. For example, this terminal command will pull Meta's Llama 3:
ollama pull llama3
- You will need to enable the Ask Steve extension to connect to Ollama. To do this, configure the Ollama server with the environment variable OLLAMA_ORIGINS set to "chrome-extension://gldebcpkoojijledacjeboaehblhfbjg". For Microsoft Edge it should be "chrome-extension://hiefenciocpoafbgoocochpolfjfmjfg".
- Instructions for how to do so on various platforms are here. For example, on a Mac using Chrome you'd issue this terminal command to give Ask Steve access to Ollama:
launchctl setenv OLLAMA_ORIGINS "chrome-extension://gldebcpkoojijledacjeboaehblhfbjg"
Other alternatives for starting Ollama with the correct configuration are described here.
- After setting OLLAMA_ORIGINS, you will need to restart the Ollama server. On a Mac you can do this by quitting and restarting Ollama from the menu bar.
- Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: Streaming from the menu. You won't need an API Key since it's running on your own machine.
- Change the port number in the URL field to match what Ollama is serving on. Ollama's port is typically 11434, so the beginning of the URL should be http://localhost:11434...
- Change the model name to match what you downloaded, and any other attributes that you want (context window, output tokens, temperature, etc.). Press TEST to ensure it works, then SAVE NEW MODEL to save it. If the test fails, try the terminal check after this list.
- Congratulations! Steve is now using a model running completely on your own computer!
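You can verify that Ollama itself is serving correctly with its native API. Assuming the default port of 11434 and the llama3 model pulled above, these terminal commands list your installed models and request a short, non-streamed reply:

curl http://localhost:11434/api/tags

curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Say hello", "stream": false}'

Note that a successful curl only proves the server and model are working; it does not test the OLLAMA_ORIGINS setting, because curl isn't subject to the browser's origin check. If curl works but Ask Steve's TEST fails, double-check the OLLAMA_ORIGINS value and restart Ollama.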