“Only Apple can do this”
Variously attributed to Tim Cook
Apple introduced Apple Intelligence at WWDC 2024.
Almost a year after Apple promised to,
in Craig Federighi’s words, “get it right”,
its vision of “AI for the rest of us” feels as distant as ever.
While we wait for Apple Intelligence to arrive on our devices,
something remarkable is already running on our Macs.
Think of it as a locavore approach to artificial intelligence:
homegrown, sustainable, and available year-round.
This week on NSHipster,
we’ll look at how you can use Ollama to run
LLMs locally on your Mac —
both as an end-user and as a developer.
What is Ollama?
Ollama is the easiest way to run large language models on your Mac.
You can think of it as “Docker for LLMs” –
a way to pull, run, and manage AI models as easily as containers.
Download Ollama with Homebrew
or directly from their website.
Then pull and run llama3.2 (2GB).
$ brew install --cask ollama
$ ollama run llama3.2
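Under the hood, the Ollama app runs a local HTTP server (on port 11434 by default) that the CLI and client libraries talk to. You can exercise it directly with curl — this assumes the server is running and you've already pulled llama3.2:

```shell
# Generate a completion from the local Ollama server.
# Requires the Ollama app (or `ollama serve`) to be running,
# and the llama3.2 model to have been pulled.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain LLMs in one sentence.",
  "stream": false
}'
```

This is the same API that the Swift client used later in this article calls behind the scenes.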
// Note: the signature below is reconstructed — the original was truncated.
// `documents` and their precomputed `embeddings` are assumed to be defined
// in the surrounding scope; the `threshold` and `limit` defaults are assumptions.
func findRelevantDocuments(
    query: String,
    threshold: Float = 0.7,
    limit: Int = 5
) async throws -> [String] {
    // Get embedding for the query
    guard let queryEmbedding = try await client.embeddings(
        model: "llama3.2",
        texts: [query]
    ).first else { return [] }

    // See: https://en.wikipedia.org/wiki/Cosine_similarity
    func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
        let dotProduct = zip(a, b).map(*).reduce(0, +)
        let magnitude = { (v: [Float]) -> Float in
            sqrt(v.map { $0 * $0 }.reduce(0, +))
        }
        return dotProduct / (magnitude(a) * magnitude(b))
    }

    // Find documents above similarity threshold
    let rankedDocuments = zip(embeddings, documents)
        .map { embedding, document in
            (similarity: cosineSimilarity(embedding, queryEmbedding),
             document: document)
        }
        .filter { $0.similarity >= threshold }
        .sorted { $0.similarity > $1.similarity }
        .prefix(limit)

    return rankedDocuments.map(\.document)
}
Building a RAG System
Embeddings really shine when combined with text generation in a
RAG (Retrieval Augmented Generation) workflow.
Instead of asking the model to generate information from its training data,
we can ground its responses in our own documents by:
Converting documents into embeddings
Finding relevant documents based on the query
Using those documents as context for generation
Here’s a simple example:
let query = "What were AAPL's earnings in Q3 2024?"
let relevantDocs = try await findRelevantDocuments(query: query)
let context = """
Use the following documents to answer the question.
If the answer isn't contained in the documents, say so.

Documents:
\(relevantDocs.joined(separator: "\n—\n"))

Question: \(query)
"""
let response = try await client.generate(
    model: "llama3.2",
    prompt: context
)
To summarize:
Different models have different capabilities.
Models like llama3.2
and deepseek-r1
generate text.
Some text models have “base” or “instruct” variants,
suitable for fine-tuning or chat completion, respectively.
Some text models are tuned to support tool use,
which lets them perform more complex tasks and interact with the outside world.
Models like llama3.2-vision
can take images along with text as inputs.
Models like nomic-embed-text
create numerical vectors that capture semantic meaning.
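To make that last point concrete, here's a toy sketch of how cosine similarity separates related from unrelated texts. The three-dimensional vectors are invented for illustration; real embedding models produce vectors with hundreds of dimensions:

```swift
import Foundation

// Toy "embeddings": invented 3-dimensional vectors for illustration only.
let cat: [Float] = [0.9, 0.1, 0.0]
let kitten: [Float] = [0.8, 0.2, 0.1]
let invoice: [Float] = [0.0, 0.1, 0.9]

// The same cosine similarity measure used when ranking documents.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    let dotProduct = zip(a, b).map(*).reduce(0, +)
    let magnitude = { (v: [Float]) -> Float in
        sqrt(v.map { $0 * $0 }.reduce(0, +))
    }
    return dotProduct / (magnitude(a) * magnitude(b))
}

print(cosineSimilarity(cat, kitten))   // close to 1: semantically similar
print(cosineSimilarity(cat, invoice))  // close to 0: unrelated
```

Scores near 1 mean the vectors point in nearly the same direction, which is how "find relevant documents" reduces to simple arithmetic.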
With Ollama,
you get unlimited access to a wealth of these and many more open-source language models.
So, what can you build with all of this?
Here’s just one example:
Nominate.app
Nominate
is a macOS app that uses Ollama to intelligently rename PDF files based on their contents.
Like many of us striving for a paperless lifestyle,
you might find yourself scanning documents only to end up with
cryptically-named PDFs like Scan2025-02-03_123456.pdf.
Nominate solves this by combining AI with traditional NLP techniques
to automatically generate descriptive filenames based on document contents.
The app leverages several technologies we’ve discussed:
Ollama’s API for content analysis via the ollama-swift package
Apple’s PDFKit for OCR
The Natural Language framework for text processing
Foundation’s DateFormatter for parsing dates
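As an illustration of that last point, here's a minimal sketch of parsing a scanner-style timestamp with DateFormatter. The filename, the "Scan" prefix, and the "yyyy-MM-dd" format string are assumptions for illustration, not Nominate's actual code:

```swift
import Foundation

// Hypothetical filename in the style produced by a document scanner.
let filename = "Scan2025-02-03_123456"

// Drop the assumed "Scan" prefix and take the 10-character date portion.
let datePart = String(filename.dropFirst("Scan".count).prefix(10))

// en_US_POSIX avoids surprises from the user's locale and calendar settings.
let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd"
formatter.locale = Locale(identifier: "en_US_POSIX")

if let date = formatter.date(from: datePart) {
    print("Parsed date: \(date)")
}
```

A parsed date like this can then be slotted into a generated filename alongside the model's description of the document's contents.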
Looking Ahead
“The future is already here – it’s just not evenly distributed yet.”
William Gibson
Think about the timelines:
Apple Intelligence was announced last year.
Swift came out 10 years ago.
SwiftUI 6 years ago.
If you wait for Apple to deliver on its promises,
you’re going to miss out on the most important technological shift in a generation.
The future is here today.
You don’t have to wait.
With Ollama, you can start building the next generation of AI-powered apps
right now.