Small language model

Capturing my journey through Ollama, a small language model that I deployed for fun

2/16/2026 · 1 min read


I spent time learning about prediction modeling in a Harvard University course and decided to get some hands-on practice. This led me to look into small language models (SLMs).

I deployed Ollama and built a simple cloud assistant chatbot at first. Then I built out a network troubleshooting bot, which I actually found useful. Nothing here is super technical, but I did hit a snag with the Modelfile - the file that tells the model how to behave. After writing my instructions into the Modelfile, I saved it as a text file, and it took a few minutes of frustration to figure out why Ollama could not find it. The trick is to save the Modelfile with no extension - choose "All Types" in the save dialog.
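For reference, a minimal Modelfile sketch - the base model, parameter, and system prompt here are illustrative assumptions, not my exact file:

```
# Modelfile (saved with no extension)
# Base model and system prompt are illustrative, not the exact ones I used.
FROM llama3

# Lower temperature for more deterministic troubleshooting answers
PARAMETER temperature 0.3

SYSTEM """
You are a Fortinet network troubleshooting assistant.
Suggest concrete debugging steps where possible.
"""
```

Build the custom model with `ollama create Fortinet-assistant -f Modelfile`, then start chatting with `ollama run Fortinet-assistant`.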

>ollama run Fortinet-assistant

>>> Traffic is hitting the wrong NAT rule. Explain how to debug rule shadowing.

Frustrating issue!

What is rule shadowing?

In most firewalls, when you apply a security policy to a device, it overwrites any existing rules on that device. This is called rule shadowing. However, if there are changes made to the device's configuration or policies after the initial installation, those changes can be lost due to rule shadowing.

Debugging rule shadowing:

To debug rule shadowing, we'll need to investigate the following steps:

1. Check for policy changes: Verify that no recent changes have been made to the security policy on the device.

Having more fun with the model, I decided to build a RAG (Retrieval-Augmented Generation) pipeline. I pulled in LlamaIndex core, the Ollama LLM wrapper, and vector store indexing.

Everything was going well until I tried running the RAG and it errored out: my utils file was calling out to OpenAI for authentication. Since I wanted the RAG to run locally, I had to edit the file to point at a local embedding model:

from llama_index.embeddings.ollama import OllamaEmbedding

# Swap the default OpenAI embedding for a local Ollama model
embed_model = OllamaEmbedding(model_name="nomic-embed-text")

Then I pulled the embedding model:

ollama pull nomic-embed-text

That looked promising until I got an error pointing to a version mismatch in the llama-index-embeddings package. After an upgrade, the RAG was happy and I was able to run it.