Introduction

Large language models (LLMs) have emerged as one of the most transformative technologies of our time, revolutionizing the way we interact with machines and process information. At the heart of this innovation lies the ability of LLMs to understand, generate, and interpret human language with remarkable accuracy and nuance. These advanced AI systems are trained on vast datasets, encompassing a wide range of human knowledge and language use, enabling them to perform a variety of tasks that were once thought to be exclusively human domains. From writing and summarizing articles to answering complex questions and even creating poetic verse, large language models are breaking new ground in AI’s capabilities.

Requirements

To follow along with the demonstrations of running an LLM locally, here is what you will need:

  • Hardware: A computer with a modern mid-tier CPU and at least 16GB of RAM is recommended for running LLMs; a GPU helps but is not required. Also make sure you have plenty of disk space: the model we will work with is roughly 4GB, but models can be substantially larger (about 39GB for Llama 2 70B).

  • Software: The setup runs on Windows (with WSL2), macOS, or Ubuntu. We will use Ollama to run an LLM locally.
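The RAM and disk numbers above are easy to verify before you download anything. The snippet below is a minimal sketch using only Python's standard library; it assumes a Unix-like system (Linux, macOS, or WSL2), and the function and key names are my own, not part of any tool mentioned here:

```python
import os
import shutil

def hardware_summary(path: str = "/") -> dict:
    """Collect a rough picture of CPU count, total RAM, and free disk space."""
    summary = {"cpus": os.cpu_count()}
    # Total physical memory in GiB, via the POSIX sysconf interface (Unix only)
    page_size = os.sysconf("SC_PAGE_SIZE")
    phys_pages = os.sysconf("SC_PHYS_PAGES")
    summary["ram_gib"] = round(page_size * phys_pages / 2**30, 1)
    # Free space in GiB on the filesystem holding `path`
    summary["free_disk_gib"] = round(shutil.disk_usage(path).free / 2**30, 1)
    return summary

print(hardware_summary())
```

If the reported RAM is below 16 GiB or free disk space is under ~5 GiB, expect slow generation or a failed model download.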

Getting Started

The best way to get started with Large Language Models is to dive right in. Ollama provides an extremely easy method to run an LLM locally.

  1. Download and install Ollama for your platform from https://ollama.com/download.
  2. Open a terminal and run:
ollama run mistral

There you have it! You are running your own chat model locally. Interact with it a bit and note how quickly it responds, the length and verbosity of its answers, and its ability to carry a conversation.

To exit, type /bye. Each new ollama run session starts with a fresh context/history, so exiting also serves as a way to reset the conversation.

Why run locally?

Why run models locally when numerous free and subscription services offer more features and higher-quality responses? The simple answer is experimentation. Hugging Face and the broader open-source community have democratized access to a large swath of AI models. At the time of writing, Hugging Face hosts more than 586,000 models. While acceptable-use terms and licensing vary from model to model, they are generally permissive for research use.
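That experimentation does not have to stay in the chat terminal: Ollama also exposes a local REST API (by default on port 11434), whose /api/generate endpoint accepts a JSON body with model, prompt, and stream fields. The sketch below uses only Python's standard library and assumes the Ollama server and mistral model from the Getting Started section; the helper names build_request and ask are my own, not part of Ollama:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "mistral") -> dict:
    """Build the JSON body for a one-shot (non-streaming) generation call."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt: str, model: str = "mistral") -> str:
    """Send a prompt to the local Ollama server and return the model's reply."""
    body = json.dumps(build_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, print(ask("Why is the sky blue?")) round-trips a single prompt; swapping the model argument lets you compare any model you have pulled, which is exactly the kind of experimentation that local hosting makes cheap.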

Conclusion

In conclusion, Large Language Models (LLMs) are reshaping our interaction with technology, offering unprecedented capabilities in understanding and generating human language. This guide has provided a practical entry point for experimenting with LLMs locally, highlighting their potential for personal learning and exploration. Running an LLM on your own machine not only demystifies the technology behind these models but also opens up avenues for creativity and innovation in AI. As we continue to explore and understand LLMs, they promise to play a pivotal role in the future of artificial intelligence, bridging the gap between human creativity and machine efficiency.