12 free and open source alternatives to ChatGPT

Create your ChatGPT clone with free and open-source alternatives! Discover 12 options to develop your chatbot at no additional cost.


If you are interested in creating your own ChatGPT clone but prefer free and open-source alternatives, you’ve come to the right place. In this article, we’ll explore 12 alternatives to ChatGPT that you can use to build your own chatbot model without spending any money.

Advantages of Free and Open-Source Alternatives to ChatGPT

Before we dive into specific alternatives, it’s important to understand the benefits of using open-source language models over ChatGPT.

  1. Data Privacy. Many companies want to maintain control over their data and do not want to grant access to third parties.
  2. Customization. Open-source alternatives allow developers to train custom language models using their own data and apply filters on certain topics if needed.
  3. Cost-Effectiveness. Open-source GPT models allow you to train sophisticated language models without worrying about expensive hardware.
  4. Democratization of AI. The use of open-source models paves the way for further research that can be used to solve real-world problems.

Now that we’ve clarified the benefits, let’s move on to examining the free and open-source alternatives to ChatGPT.

1. Llama

Introducing Llama

Llama stands for Large Language Model Meta AI. It is a family of models ranging from 7 billion to 65 billion parameters. Meta AI researchers focused on improving performance by increasing the volume of training data rather than the number of parameters, and reported that the 13 billion parameter model outperformed GPT-3's 175 billion parameter model on most benchmarks. Llama uses the transformer architecture and was trained on 1.4 trillion tokens drawn from Wikipedia, GitHub, Stack Exchange, Project Gutenberg books, and scientific articles on ArXiv.

Python Code for Llama

# Install the package
pip install llama-cpp-python

from llama_cpp import Llama

# Load a 7B Llama model converted to ggml format
llm = Llama(model_path="./models/7B/ggml-model.bin")
output = llm("Q: Name the planets in the solar system? A: ", max_tokens=128, stop=["Q:", "\n"], echo=True)
print(output)

2. Llama 2

What’s New in Llama 2

There are a few key differences between Llama 2 and Llama:

  • Training data: Llama 2 was trained on 40% more tokens than Llama, for a total of 2 trillion tokens. This gives it a broader knowledge base and allows it to generate more accurate responses.
  • Model sizes: Llama 2 is available in three sizes: 7 billion, 13 billion, and 70 billion parameters, whereas Llama's maximum size is 65 billion parameters.
  • Chat optimization: Llama 2-Chat is a specialized version of Llama 2 optimized for two-way conversations. It was trained on a dataset of human conversations, which allows it to generate more natural and engaging responses.
  • Safety and bias mitigation: Llama 2 was trained with a focus on safety and bias mitigation, making it less likely to generate toxic or harmful content.
  • Open source: Llama 2 is released under a license that allows both research and commercial use, unlike the original Llama, which was restricted to non-commercial research.

Python Code for Llama 2

To run the 7 billion parameter Llama 2 model, here is the reference code:

%cd /content
!apt-get -y install -qq aria2

!git clone -b v1.3 https://github.com/camenduru/text-generation-webui
%cd /content/text-generation-webui
!pip install -r requirements.txt
!pip install -U gradio==3.28.3

!mkdir /content/text-generation-webui/repositories
%cd /content/text-generation-webui/repositories
!git clone -b v1.2 https://github.com/camenduru/GPTQ-for-LLaMa.git
%cd GPTQ-for-LLaMa
!python setup_cuda.py install

!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/raw/main/config.json -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/raw/main/generation_config.json -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o generation_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/raw/main/special_tokens_map.json -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o special_tokens_map.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/resolve/main/tokenizer.model -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o tokenizer.model
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/raw/main/tokenizer_config.json -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o tokenizer_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-7b-Chat-GPTQ/resolve/main/gptq_model-4bit-128g.safetensors -d /content/text-generation-webui/models/Llama-2-7b-Chat-GPTQ -o gptq_model-4bit-128g.safetensors

%cd /content/text-generation-webui
!python server.py --share --chat --wbits 4 --groupsize 128 --model_type llama

For the 13 billion parameter Llama 2 model, you can refer to the following code:

%cd /content
!apt-get -y install -qq aria2

!git clone -b v1.8 https://github.com/camenduru/text-generation-webui
%cd /content/text-generation-webui
!pip install -r requirements.txt

!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/resolve/main/model-00001-of-00003.safetensors -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o model-00001-of-00003.safetensors
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/resolve/main/model-00002-of-00003.safetensors -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o model-00002-of-00003.safetensors
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/resolve/main/model-00003-of-00003.safetensors -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o model-00003-of-00003.safetensors
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/raw/main/model.safetensors.index.json -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o model.safetensors.index.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/raw/main/special_tokens_map.json -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o special_tokens_map.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/resolve/main/tokenizer.model -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o tokenizer.model
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/raw/main/tokenizer_config.json -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o tokenizer_config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/raw/main/config.json -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o config.json
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/4bit/Llama-2-13b-chat-hf/raw/main/generation_config.json -d /content/text-generation-webui/models/Llama-2-13b-chat-hf -o generation_config.json

%cd /content/text-generation-webui
!python server.py --share --chat --load-in-8bit --model /content/text-generation-webui/models/Llama-2-13b-chat-hf

3. Alpaca

Introduction to Alpaca

A team of researchers from Stanford University developed an open-source language model called Alpaca, based on Meta's large language model Llama. The team used OpenAI's GPT API (text-davinci-003) to fine-tune the 7 billion parameter Llama model. Their goal is to make AI freely available so academics can do further research without worrying about the expensive hardware needed to run these memory-intensive algorithms. Although these models are not licensed for commercial use, individuals and researchers can still use them to build their own chatbots.

How Alpaca Works

The Stanford team started with the smallest Llama model, the 7 billion parameter version, which Meta had pre-trained on 1 trillion tokens. They began with 175 human-written instruction-output pairs from the self-instruct seed set, then used the OpenAI API to generate further instructions from that seed set. In this way they obtained about 52,000 sample instruction-output pairs, which they used to fine-tune the Llama model with Hugging Face's training framework.

Llama is available in different sizes: 7B, 13B, 30B, and 65B parameters. Alpaca has also been extended to the 13B, 30B, and 65B models.
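Each of those 52,000 examples follows a simple instruction/input/output schema that is rendered into a fixed prompt template before fine-tuning. A minimal sketch of that rendering step (the template wording follows the published Stanford Alpaca repository; the sample record is invented for illustration):

```python
# Sketch of how Alpaca turns one self-instruct record into a training prompt.

record = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "I loved this movie!",
    "output": "Positive",
}

def build_prompt(rec):
    """Render one instruction record into Alpaca's training prompt."""
    if rec.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{rec['instruction']}\n\n"
            f"### Input:\n{rec['input']}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{rec['instruction']}\n\n"
        "### Response:\n"
    )

prompt = build_prompt(record)
print(prompt + record["output"])  # the model is trained to produce the output
```

During fine-tuning the model learns to continue each rendered prompt with the record's output field.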
Alpaca Performance

The Alpaca model was blind-tested against text-davinci-003 (the model used to generate its training data) on tasks such as email writing, social media, and productivity tools: Alpaca won 90 comparisons while text-davinci-003 won 89. The model can be used in the real world for various purposes. It will be of great help to researchers in fields such as ethical AI and cybersecurity, for example in scam and phishing detection.

Alpaca Limitations

Like the commercial version of ChatGPT, Alpaca has similar limitations: hallucination, toxicity, and stereotyping. In other words, it can be used to generate text that spreads misinformation, racism, and hatred towards vulnerable sections of society.

Alpaca Memory Requirements

Alpaca requires a GPU; it cannot run on CPUs. For the 7B and 13B models, a single GPU with 12 GB of VRAM is sufficient. The 30B model requires additional system resources.
Python Code for Alpaca

Here is reference code for Alpaca. It is a chat application that uses the Alpaca 7B model:

import sys
sys.path.append("/usr/local/lib/python3.9/site-packages")

!nvidia-smi

!git clone https://github.com/deepanshu88/Alpaca-LoRA-Serve.git

%cd Alpaca-LoRA-Serve
!python3.9 -m pip install -r requirements.txt

base_model = 'decapoda-research/llama-7b-hf'
finetuned_model = 'tloen/alpaca-lora-7b'

!python3.9 app.py --base_url $base_model --ft_ckpt_url $finetuned_model --share

Remember that this is only reference code for Alpaca 7B. You can adapt it to use models of different sizes, such as Alpaca 13B and Alpaca 30B.

Alpaca Output
Below is an example of Alpaca output in response to two relatively simple questions. One question is about a general topic, and the other is about coding. Alpaca answers both questions correctly.

4. GPT4All

Introduction to GPT4All

The Nomic AI team was inspired by Alpaca and used OpenAI's GPT-3.5-Turbo API to collect around 800,000 prompt-response pairs, from which they created about 430,000 assistant-style prompts and generations covering code, dialogue, and narrative. At 800,000 pairs, the dataset is roughly 16 times larger than Alpaca's. The interesting thing about this model is that it can run on a CPU; it does not require a GPU. Like Alpaca, it is open source and easy to run on your own computer, though the original GPT4All is based on Llama and therefore inherits Llama's non-commercial license (see GPT4All-J below for an Apache-2.0-licensed variant).
How GPT4All Works

It works similarly to Alpaca and is based on the 7 billion parameter Llama model. The team trained 7B Llama models, and the final model was trained on 437,605 post-processed assistant-style prompts.

GPT4All Performance

In Natural Language Processing, perplexity is used to evaluate the quality of language models. It measures how surprised a language model would be by a new sequence of words it has never encountered, based on its training data. A lower perplexity score indicates that the model is better at predicting the next word in a sequence and is therefore more accurate. The Nomic AI team claims their models achieve lower perplexity than Alpaca. However, actual accuracy depends on the prompts you use; Alpaca may be more accurate in some cases.
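Concretely, perplexity is the exponential of the average negative log-likelihood the model assigns to each token: a model that is certain of every next word scores 1, and higher scores mean more "surprise". A minimal sketch with made-up probabilities (not taken from any real model):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood over the tokens)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Probabilities a hypothetical model assigns to each successive token
confident = [0.9, 0.8, 0.95, 0.85]   # strong next-word predictions
uncertain = [0.2, 0.1, 0.25, 0.15]   # weak next-word predictions

print(perplexity(confident))  # low perplexity: better predictions
print(perplexity(uncertain))  # high perplexity: worse predictions
```

Real evaluations compute the per-token probabilities from the model itself over a held-out corpus; the formula is the same.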
GPT4All Memory Requirements

It can run on a CPU with 8 GB of RAM. If your laptop has only 4 GB of RAM, it may be time to upgrade to at least 8 GB.

Python Code for GPT4All

Here is reference code for GPT4All. You can run it as is or adapt it to your needs:

# Clone the Git repository
!git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git

# Install the required packages
%cd /content/gpt4all
!python -m pip install -r requirements.txt

%cd transformers
!pip install -e .

%cd ../peft
!pip install -e .

# Launch training
!accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config.json train.py --config configs/train/finetune.yaml

# Download the GPT4All model checkpoint
%cd /content/gpt4all/chat
!wget https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin

# Run the chat client
!./gpt4all-lora-quantized-linux-x86

If you are running the code on a local machine with an operating system other than Linux, use the following commands instead:

  • Windows (PowerShell): ./gpt4all-lora-quantized-win64.exe
  • Mac (M1): ./gpt4all-lora-quantized-OSX-m1
  • Mac (Intel): ./gpt4all-lora-quantized-OSX-intel

GPT4All Output

GPT4All was unable to answer a coding-related question correctly. This is just one example, and we cannot judge the model's accuracy from a single instance; it may perform well on other prompts, so accuracy depends on your usage. Notably, when I ran it again two days later, it handled coding questions well, so the model appears to have been further fine-tuned.

Troubleshooting GPT4All

If you run into NCCL-related errors with the distributed setup, it may be because CUDA is not installed on your machine.

Some users have reported strange errors on the Windows 10/11 platform. As a last resort, you can install the Windows Subsystem for Linux, which allows you to install a Linux distribution on your Windows computer and follow the code above.

5. GPT4All-J

This model's name is similar to the previous one because it comes from the same Nomic AI team; the difference is that it is trained on GPT-J instead of Llama. Since GPT-J carries a permissive license, GPT4All-J is Apache-2.0 licensed, meaning you can use it for commercial purposes and easily run it on your computer.
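A minimal way to try GPT4All-J locally is through the gpt4all Python bindings. This is a sketch, not an official recipe: the model identifier below is an assumption that depends on your installed package version, and the first call downloads several GB of quantized weights, so the actual call is left commented out.

```python
# Sketch: chat with GPT4All-J through the gpt4all Python bindings
# (pip install gpt4all). The model name is an assumption tied to the
# package version; the first use downloads the quantized weights.

def ask_gpt4all_j(prompt, model_name="ggml-gpt4all-j-v1.3-groovy"):
    """Generate a response from a locally downloaded GPT4All-J model."""
    from gpt4all import GPT4All

    model = GPT4All(model_name)
    return model.generate(prompt, max_tokens=128)

# ask_gpt4all_j("Name three open source licenses.")
# (uncomment to run; CPU-only is fine, but the first call downloads the model)
```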

6. Dolly 2

The Databricks team created a language model based on EleutherAI's Pythia models, fine-tuned on about 15,000 instruction-response records (the databricks-dolly-15k dataset) written by Databricks employees. It is available in three sizes: 12B, 7B, and 3B parameters.

Dolly 2 Memory Requirements

It requires a GPU with about 10GB of RAM for the 7B model with 8-bit quantization. For the 12B model, at least 18GB of GPU VRAM is required.
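Figures like these can be sanity-checked with a back-of-the-envelope rule: a model's weights occupy roughly (parameter count) × (bytes per parameter), plus overhead for activations, the KV cache, and framework buffers. A rough estimator follows; the 20% overhead factor is an illustrative assumption, not a measured value.

```python
def estimate_vram_gb(params_billions, bytes_per_param, overhead=1.2):
    """Very rough inference-memory estimate: weights times precision,
    plus ~20% overhead for activations and framework buffers (the
    overhead factor is an illustrative assumption, not a measured value)."""
    return params_billions * bytes_per_param * overhead

print(estimate_vram_gb(7, 1))   # 7B at 8-bit  -> ~8.4 GB
print(estimate_vram_gb(12, 1))  # 12B at 8-bit -> ~14.4 GB
print(estimate_vram_gb(12, 2))  # 12B at fp16  -> ~28.8 GB
```

Real usage also includes the CUDA context and grows with sequence length, so treat these numbers as lower bounds.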

Dolly 2 Python Code

Dolly 2's training code is published in Databricks' dolly GitHub repository, and the model weights are available on Hugging Face as databricks/dolly-v2-12b, dolly-v2-7b, and dolly-v2-3b.
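As a minimal sketch, Dolly 2 can be loaded through the Hugging Face transformers pipeline, following the usage shown on the model cards. The 3B variant is used here to keep the download smaller (still several GB), so the model call itself is left commented out.

```python
# Sketch: run Dolly 2 with the Hugging Face transformers pipeline,
# following the usage documented on the databricks/dolly-v2-* model cards.

def ask_dolly(prompt, model_name="databricks/dolly-v2-3b"):
    """Load a Dolly 2 checkpoint and answer a single instruction.
    Downloads several GB of weights on first use."""
    import torch
    from transformers import pipeline

    generate = pipeline(
        model=model_name,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,   # Dolly ships a custom instruction pipeline
        device_map="auto",
    )
    return generate(prompt)

# ask_dolly("Explain what a language model is in one sentence.")
# (uncomment to run; requires a GPU or plenty of RAM)
```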

7. Vicuna

Introducing Vicuna

A team of researchers from UC Berkeley, CMU, Stanford, and UC San Diego developed this model. It was fine-tuned on Llama using a chat dataset extracted from the ShareGPT website. The researchers claim the model reached over 90% of ChatGPT's quality, using GPT-4 as the judge, and that its performance is nearly on par with Bard. They built on Alpaca's training recipe and improved two aspects: multi-turn conversations and long sequences.

Vicuna Python Code

You can refer to this post – Vicuna Detailed Guide to access the Python code and a detailed description of the Vicuna model.

8. StableVicuna

Introducing StableVicuna

Stability AI released StableVicuna, a fine-tuned version of the 13B Vicuna model. To improve the Vicuna model, they further trained it using supervised fine-tuning (SFT). They used three different datasets to train it:

  • OpenAssistant Conversations Dataset, containing 161,443 human conversation messages in 35 different languages.
  • GPT4All Prompt Generations, which is a dataset of 437,605 prompts and responses generated by GPT-3.5.
  • Alpaca, which is a dataset of 52,000 prompts and responses generated by the text-davinci-003 model.

They used trlX to train a reward model, initialized from their SFT model. The reward model was trained on three datasets of human preferences:

  • OpenAssistant Conversations Dataset with 7,213 preference samples.
  • Anthropic HH-RLHF with 160,800 labels of people expressing what they think about the helpfulness or harmlessness of AI assistants.
  • Stanford Human Preferences with 348,718 human preferences on answers to questions or instructions in various areas, such as cooking or philosophy.

Finally, the SFT model was trained with RLHF via trlX, using Proximal Policy Optimization (PPO). This is how StableVicuna was created.

StableVicuna Memory Requirements

To run the 4-bit GPTQ StableVicuna model, approximately 10GB of GPU VRAM is required.

StableVicuna Performance Issues

Stability AI claims this model is an improvement over the original Vicuna model, but many people have reported the opposite. This model hallucinates more than the original model, producing worse responses. In other words, the model generates inaccurate output that does not match the prompt. As these models have only recently been released, rigorous evaluation is still needed. It is possible that this model may perform better on some tasks but much worse on others.

Python Code for StableVicuna

We can run the model using Text Generation WebUI, which makes it easy to run open source LLM models. The following code runs a 4-bit quantization which reduces the model’s memory requirements and makes it possible to run with less VRAM.

%cd /content
!apt-get -y install -qq aria2

!git clone -b v1

9. Alpaca GPT-4 Model

Introduction to Alpaca GPT-4

Despite the name, Alpaca GPT-4 is not an OpenAI model. It follows the same recipe as Stanford's Alpaca, but the 52,000 instruction-following examples used to fine-tune Llama were generated with GPT-4 rather than text-davinci-003 (the "Instruction Tuning with GPT-4" project from Microsoft Research). The higher-quality training data helps it generate more coherent, higher-quality responses than the original Alpaca.

Python Code for Alpaca GPT-4

The GPT-4-generated training data is published in the project's Instruction-Tuning-with-GPT-4 GitHub repository, and the model can be reproduced with the original Alpaca/LoRA training code. Alternatively, you can use any of the other free and open-source alternatives listed in this article to create your own chatbot.

10. Cerebras-GPT

Introduction to Cerebras-GPT

Cerebras-GPT is a family of seven open language models, from 111 million to 13 billion parameters, developed by the AI hardware company Cerebras. The models were trained on the Pile dataset following the Chinchilla compute-optimal scaling laws, using Cerebras' wafer-scale hardware, and are released under the Apache 2.0 license.

Cerebras-GPT Memory Requirements

Memory requirements scale with model size: the smaller checkpoints (111M to 1.3B parameters) run on a CPU or a modest GPU, while the largest 13B model needs a high-memory GPU, roughly 32 GB of VRAM or more in half precision with overhead.

Python Code for Cerebras-GPT

Cerebras has released all Cerebras-GPT checkpoints on Hugging Face under the cerebras organization, so they can be loaded directly with the Hugging Face transformers library.
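A minimal sketch of loading the smallest Cerebras-GPT checkpoint with transformers follows; the repository names are assumptions based on Cerebras' published Hugging Face organization, and the call is left commented out because it triggers a download.

```python
# Sketch: load the smallest Cerebras-GPT checkpoint (111M parameters)
# from Hugging Face; the larger variants follow the same naming scheme
# (e.g. cerebras/Cerebras-GPT-1.3B, cerebras/Cerebras-GPT-13B).

def ask_cerebras(prompt, model_name="cerebras/Cerebras-GPT-111M"):
    """Greedy-generate a short continuation with a Cerebras-GPT model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# ask_cerebras("Generative AI is")  # downloads the weights on first use
```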

11. GPT-J 6B

Introduction to GPT-J 6B

GPT-J 6B is a 6-billion-parameter language model developed by EleutherAI and trained on the Pile dataset. It is known for generating coherent, high-quality text on a wide range of topics.

Python Code for GPT-J 6B

EleutherAI has released both the code and the pre-trained weights for GPT-J 6B; the model is available on Hugging Face as EleutherAI/gpt-j-6B under the Apache 2.0 license.
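A minimal sketch of loading GPT-J 6B with transformers follows. The call is left commented out: the checkpoint is roughly 24 GB in full precision, and loading in fp16 (as below) still needs on the order of 16 GB of RAM or VRAM.

```python
# Sketch: load GPT-J 6B with transformers. The fp32 checkpoint is ~24 GB;
# loading in fp16 (as below) roughly halves the memory footprint.

def ask_gptj(prompt, model_name="EleutherAI/gpt-j-6B"):
    """Generate a short continuation with GPT-J 6B."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# ask_gptj("Open source language models")  # large download; needs ~16 GB RAM/VRAM
```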

12. OpenChatKit Model

Introduction to OpenChatKit Model

OpenChatKit is an open-source chatbot toolkit developed by Together, in collaboration with LAION and Ontocord, not by OpenAI. Its main model, GPT-NeoXT-Chat-Base-20B, is fine-tuned from EleutherAI's GPT-NeoX-20B on the OIG instruction dataset and can be used to build chatbots with coherent, high-quality responses.

Python Code for OpenChatKit Model

Both the training code and the models are openly available: the code lives in Together's OpenChatKit GitHub repository, and the checkpoints are on Hugging Face under the togethercomputer organization.
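As a reference-only sketch, the main OpenChatKit model can be queried with transformers using the `<human>`/`<bot>` turn markers from its training data. The repo name and prompt format below are based on the published model card; at 20B parameters the model needs a large GPU, so the call is left commented out.

```python
# Sketch: query OpenChatKit's main chat model. At 20B parameters this
# needs a large GPU (or multi-GPU device_map); shown for reference only.

def ask_openchatkit(prompt, model_name="togethercomputer/GPT-NeoXT-Chat-Base-20B"):
    """Generate a reply using OpenChatKit's <human>/<bot> turn format."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    text = f"<human>: {prompt}\n<bot>:"
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# ask_openchatkit("What is an open source license?")  # very large download
```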
