My experience with Ollama

manusz
Posts: 14
Joined: 29 Feb 24 12:34
Location: Netherlands

My experience with Ollama

Post by manusz »

I have been experimenting with Ollama and Gemma3 to see if it is a tool I can use.
My test environment is a MacBook Air M1 with 8 GB of memory running Sequoia and PSU build 7632.

On this machine Ollama is terribly slow.
I suspect this is a memory issue: as soon as PSU and Ollama are running, memory pressure is high, with approximately 7 GB of the available 8 GB in use, leading to memory swapping.
Moreover, after PSU is closed, Ollama still uses 5 GB of RAM, which is never released.
Apparently the LLM resides permanently in memory.
Permanently claiming more than 50% of available RAM obviously also has some impact on other applications running on this laptop.
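
For reference, the Ollama documentation describes a keep-alive setting that controls how long a model stays loaded after a request; sending a request with keep_alive set to 0 is supposed to unload it. A minimal sketch, assuming the default local API on port 11434 and a model pulled under the gemma3 tag:

```python
# Ask Ollama to unload a model from memory immediately.
# Assumes the default local API on port 11434 and a model pulled as
# "gemma3" -- adjust the tag to whatever `ollama list` shows.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3",
        "keep_alive": 0,  # 0 = unload as soon as this request finishes
        "stream": False,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("done_reason"))  # expected: "unload"
```

There is also an OLLAMA_KEEP_ALIVE environment variable that changes the default keep-alive time, if the documentation is current.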

As for the results Ollama produces I am not very impressed.
I always use Who - What - Where - Why labels to make my image collection searchable.
Of course, AI cannot detect the Who and Why.
Not all my images have embedded geo tags, but Ollama does not seem to guess a Where.
So what remains for my use case is suggestions for What.
During my tests I found the results a mixed bag, some are spot-on, some are synonyms, some are nonsense.
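
For context, a keyword request to the local model looks roughly like the sketch below. It assumes Ollama's default API on port 11434, the gemma3 tag, and a placeholder file name (IMG_0001.jpg); this is not how PSU calls the model internally, just an illustration of the kind of "What" suggestions involved:

```python
# Ask a local vision-capable model for "What" keyword suggestions for
# one image. Illustrative only: assumes Ollama's default API on port
# 11434, the "gemma3" tag, and a placeholder file name.
import base64
import requests

with open("IMG_0001.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3",
        "prompt": "List up to five keywords describing what is shown in this image.",
        "images": [image_b64],
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```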

I suspect OpenAI will perform better than Ollama, both in memory usage and in results, but sending images to a cloud service is a no-go for me.

So in the end I removed Ollama from my machine; it is a tool I will not use.
Hert
Posts: 7928
Joined: 13 Sep 03 6:24

Re: My experience with Ollama

Post by Hert »

While Gemma3 is arguably the best model currently available for use in Ollama, it's important to remember that it's still "just" a 4B model...meaning it has around 4 billion parameters.

In comparison, although not officially documented, OpenAI’s GPT-4.1 is estimated to have roughly 1.8 trillion (or 1800 billion) parameters.

As a result, you can generally expect GPT-4.1 to deliver significantly higher accuracy.

Check out this topic where I demonstrate the difference between the models.
viewtopic.php?p=138797#p138797

Ollama runs locally, which means you'll need solid hardware, both in terms of processing power and memory, to run AI models effectively.

On Windows systems, you'll typically need a dedicated GPU to get decent performance. The key requirement is that the GPU must have enough VRAM to load the entire model.

Most Macs, on the other hand, don’t have dedicated GPUs in the traditional sense. Instead, Apple uses integrated graphics paired with a unified memory architecture, where the CPU and GPU share the same memory pool on the Apple Silicon chip (like M1, M2, or M3). These integrated GPUs, especially in the Pro, Max, and Ultra versions, are surprisingly powerful, often rivaling or exceeding discrete GPUs.

In practical terms, this means a MacBook Air with an M1 chip uses its internal memory for everything, including running models. With only 8GB of unified memory, that becomes the primary limitation when trying to run AI models locally.
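
As a rough back-of-the-envelope estimate (assuming the roughly 4-bit quantized build that Ollama serves by default; exact sizes vary), the weights of a 4B model alone take about 2 GB, and the runtime, context cache, macOS, and PSU itself come on top of that, which would line up with the memory pressure described above:

```python
# Back-of-the-envelope memory estimate for a 4B-parameter model.
# Illustrative only: actual usage depends on quantization, context
# length, and runtime overhead.
params = 4e9           # ~4 billion parameters (Gemma3 4B)
bits_per_param = 4     # assuming a 4-bit quantized build
weights_gb = params * bits_per_param / 8 / 1e9
print(f"weights alone: ~{weights_gb:.1f} GB")  # ~2.0 GB before any overhead
```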

In PSU with OpenAI-GPT4.1-Nano (smallest and fastest) it takes about 2 seconds to analyze an image.
With OpenAI-GPT4.1-Mini (best average) it takes about 6 seconds to analyze an image.
With OpenAI-GPT4.1 (top) it takes about 14 seconds to analyze an image.
manusz wrote: So what remains for my use case is suggestions for What.
You also gain a rich AI description, which adds to the searchability of your catalog.
No need to "process" the AI results.

Everything that the AI returns is instantly searchable using the search box or advanced search.

Also read this post:
viewtopic.php?p=138813#p138813
This is a user-to-user forum. If you have suggestions, requests, or need support, please send a message.