Running a local/private AI
Here are a few tools to run AI locally on your computer.
Option 1. Using Ollama
Step 1 - Download and install Ollama
One command for Linux:
$ curl -fsSL https://ollama.com/install.sh | sh
>>> Downloading ollama...
######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
NVIDIA GPU installed.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
If you have a fast GPU, make sure you see that "NVIDIA GPU installed." line in the output. AI is GPU hungry, so the faster the GPU the better. If you don't have a GPU, you can still run AI, but it will be much slower and you will have to get by with smaller models.
Step 2 - Choose your AI model.
You can download thousands of AI models from Hugging Face.
In particular, here we are going to set up the Llama 2 LLM (Large Language Model), which comes in 3 flavours depending on the size of the model you want to use. The larger the model, the more accurate the results, but also the more resources it will consume.
You may want to run the biggest model your computer can handle. As an example, the approximate RAM requirements for the 3 models are:
- llama2 7B - 4GB to 10GB of RAM
- llama2 13B - 10GB to 18GB of RAM
- llama2 70B - 40GB to 80GB of RAM
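As a rough sketch, the list above can be turned into a small helper that suggests the largest variant your machine's RAM can hold. The pick_model function and its thresholds are my own illustration, using the upper-end figures from the list:

```shell
# Hypothetical helper: suggest the biggest llama2 variant that fits in RAM.
# Thresholds are the upper-end RAM figures from the list above.
pick_model() {
  ram_gb="$1"   # available RAM in GB
  if [ "$ram_gb" -ge 80 ]; then
    echo "llama2:70b"
  elif [ "$ram_gb" -ge 18 ]; then
    echo "llama2:13b"
  else
    echo "llama2:7b"
  fi
}

# On Linux, read total RAM from /proc/meminfo and suggest a model.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
pick_model $((total_kb / 1024 / 1024))
```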
Step 3 - Run the model you want with ollama. The first run will download the model if it is not already on disk.
ollama run llama2:70b
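The install output above mentioned that the Ollama API listens on 127.0.0.1:11434, so besides the interactive prompt you can also talk to a running model over that REST API. As a sketch, the payload helper below is my own illustration; the model name and prompt are just examples:

```shell
# Build the JSON body for Ollama's /api/generate endpoint.
# "stream":false asks for a single JSON response instead of a stream.
payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

payload "llama2:70b" "Why is the sky blue?"

# With the server running, you would send it with curl:
# payload "llama2:70b" "Why is the sky blue?" | curl -s http://127.0.0.1:11434/api/generate -d @-
```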
Step 4 - You can have multiple models installed and view them with ollama list:
$ ollama list
NAME ID SIZE MODIFIED
llama2:13b d475bf4c50bc 7.4 GB 10 days ago
llama2:70b e7f6c06ffef4 38 GB 10 days ago
llama2:latest 78e26419b446 3.8 GB 10 days ago
llama2-uncensored:latest 44040b922233 3.8 GB 10 days ago
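As a small illustration, the SIZE column of that listing can be totalled to see how much disk the models take. The listing below is the sample output from above pasted into a variable; with ollama installed you would pipe ollama list directly:

```shell
# Sum the SIZE column (3rd field) of an `ollama list`-style listing.
total_gb() {
  awk 'NR > 1 {sum += $3} END {print sum " GB"}'
}

listing='NAME ID SIZE MODIFIED
llama2:13b d475bf4c50bc 7.4 GB 10 days ago
llama2:70b e7f6c06ffef4 38 GB 10 days ago
llama2:latest 78e26419b446 3.8 GB 10 days ago
llama2-uncensored:latest 44040b922233 3.8 GB 10 days ago'

printf '%s\n' "$listing" | total_gb
```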
Option 2. Using Private-GPT
With PrivateGPT you can also feed the model your own data: it ingests your documents and uses them as context when answering questions (document ingestion and retrieval, rather than true fine-tuning).
To ingest a folder of your own documents, and keep watching it for changes, with one command:
make ingest ./my-data -- --watch