Connecting to a Local Model

Connect to a model running on your local machine via Ollama or LM Studio

This is only available on the Premium Plan.

What is a Local Model?

When you run a local model, the LLM runs entirely on your own computer, and Steve can use that model instead of one in the cloud. The primary benefit is that your requests never leave your machine, so all your data stays private. The secondary benefit is that it's FREE. We have tested Ask Steve with LM Studio and Ollama, but you should be able to use any system that can expose a model via a local web server.
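If you're curious what "exposing a model via a local web server" means in practice, here's a rough sketch of the kind of HTTP request a client sends to such a server. It assumes an OpenAI-compatible endpoint on port 1234 (LM Studio's default); the model name is just a placeholder, so match both to your own setup.

```python
# A minimal sketch of talking to a model exposed via a local web server.
# The port (1234) and model name are assumptions -- adjust to your setup.
import json
import urllib.request

payload = {
    "model": "llama-3-8b-instruct",  # whatever model your local server has loaded
    "messages": [{"role": "user", "content": "Summarize this page in one sentence."}],
}

req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",  # the request never leaves your machine
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},  # no API key needed for a local server
)

with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
    print(reply["choices"][0]["message"]["content"])
```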

What are the downsides?

  • You have to download large model files (often 3+ GB)
  • You need a reasonably fast machine
  • The models your local machine can run generally aren't as good as the ones you can access in the cloud
  • It can be slower, even though it's running locally

How Do I Use LM Studio With Ask Steve?

  1. Download and set up LM Studio
  2. Open LM Studio. From the home page, download a model. Try Llama 3 - 8B Instruct.
  3. Once it's downloaded, go to the Local Server page in LM Studio and select the model you just downloaded from the dropdown at the top. If it doesn't start automatically, press the green Start Server button. This will start up a local server on port 1234 with the selected model.
  4. Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: LM Studio from the menu. You won't need an API Key since the model is running on your own machine. Press TEST to ensure it works (if it fails, see the sanity-check sketch after this list), then SAVE NEW MODEL to save it.
  5. Congratulations! Steve is now using a model running completely on your own computer!
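If the TEST button fails, it can help to confirm that the LM Studio server itself is running before debugging anything else. Here's a rough sketch of a check against LM Studio's OpenAI-compatible models endpoint, assuming the default port of 1234 from step 3:

```python
# Optional sanity check: ask the LM Studio local server which models it has loaded.
# Port 1234 is LM Studio's default; change it if you started the server elsewhere.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:1234/v1/models") as resp:
    models = json.loads(resp.read())

# The OpenAI-compatible /v1/models endpoint lists the currently available model(s).
for entry in models.get("data", []):
    print(entry["id"])
```

If this prints the model you selected in step 3, the server is up and Ask Steve should be able to reach it.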

How Do I Use Ollama With Ask Steve?

  1. Download and set up Ollama
  2. Download a model. For example, this terminal command will pull Meta's Llama 3: ollama pull llama3
  3. You will need to allow the Ask Steve extension to connect to Ollama. To do this, configure the Ollama server with the environment variable OLLAMA_ORIGINS set to "chrome-extension://gldebcpkoojijledacjeboaehblhfbjg". For Microsoft Edge it should be "chrome-extension://hiefenciocpoafbgoocochpolfjfmjfg".
  4. Instructions for how to do so on various platforms are here. For example, on a Mac using Chrome you'd issue this terminal command to give Ask Steve access to Ollama: launchctl setenv OLLAMA_ORIGINS "chrome-extension://gldebcpkoojijledacjeboaehblhfbjg"
  5. After setting OLLAMA_ORIGINS, you will need to restart the Ollama server. On a Mac you can do this by quitting and restarting Ollama from the menu bar.
  6. Finally, go to the Models page in Ask Steve Settings, press ADD NEW MODEL and select Local: Ollama - Meta Llama 3 from the menu. You won't need an API Key since the model is running on your own machine. Press TEST to ensure it works (if it fails, see the sanity-check sketch after this list), then SAVE NEW MODEL to save it. Note that this assumes you downloaded the Llama 3 model; change the "model" parameter in the body if you downloaded a different model.
  7. Congratulations! Steve is now using a model running completely on your own computer!
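If the TEST button fails, it can help to confirm that Ollama itself is reachable before debugging the extension setup. Here's a rough sketch of a check against Ollama's default chat endpoint; the model name assumes you pulled llama3 in step 2, and the port (11434) is Ollama's default:

```python
# Optional sanity check: send one chat request to the local Ollama server.
# Note: OLLAMA_ORIGINS only affects requests coming from the browser extension --
# a script run from a terminal like this one doesn't need it.
import json
import urllib.request

payload = {
    "model": "llama3",  # swap for whichever model you pulled in step 2
    "messages": [{"role": "user", "content": "Reply with the single word: ready"}],
    "stream": False,  # ask for one complete JSON response instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/chat",  # Ollama's default local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["message"]["content"])
```

If this prints a reply, Ollama is running and the model is available; any remaining issue is most likely the OLLAMA_ORIGINS setting from steps 3-5.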