Project Idea: Using Ollama for Emacs Completions

Posted on Nov 18, 2024

I’d like to use a local LLM for GitHub Copilot-esque completions. I like Copilot; it’s downright uncanny when it serves up that perfect completion. However, it comes with a few drawbacks:

  • Cost: $10/mo adds up, especially when I already pay for general-purpose LLMs
  • Privacy: even if GitHub sticks to its privacy policy and doesn’t keep queries for future training, sending code to third parties introduces a risk surface many organizations find unacceptable.

A local LLM sidesteps both of these issues. On top of that, rolling my own opens up a few extras:

  • Retrieval Augmented Generation (RAG): bring in useful context from the project
  • Dynamic Model Selection: allow live switching between large and small models
  • Hackability: I mean, it’s Emacs plus any LLM

Where is the model running?

At home, I can leverage my desktop’s Radeon 6900 XT GPU, which can run inference at twice the speed of my laptop’s CPU. To maximize flexibility, I’d like the Ollama host to be switchable along with the model.
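
A couple of user options would keep both knobs switchable. This is only a sketch; the my/ollama-* names are hypothetical placeholders rather than an existing package, and the default model tag is just an example:

    ;; Hypothetical user options for a switchable Ollama endpoint and model.
    (defcustom my/ollama-host "http://localhost:11434"
      "Base URL of the Ollama server (laptop or the desktop GPU box)."
      :type 'string :group 'tools)

    (defcustom my/ollama-model "qwen2.5-coder:7b"
      "Model tag to request completions from."
      :type 'string :group 'tools)

    (defun my/ollama-switch (host model)
      "Point completions at a different HOST and MODEL."
      (interactive "sOllama host: \nsModel: ")
      (setq my/ollama-host host
            my/ollama-model model))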

How is the completion inserted?

Currently, I don’t see a tool that provides exactly what I’m envisioning, so this might require a greenfield solution. Key UX considerations include the following (a rough sketch of the insertion path comes after the list):

  • Speed: fast inference allows multiple ranked completion options
  • Simplicity: the interface should remain lightweight and unobtrusive
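
As a first cut, the insertion path could be a single request to Ollama’s /api/generate endpoint followed by an insert at point. A minimal synchronous sketch, reusing the hypothetical my/ollama-* options from above; a real version would want async requests and an overlay preview rather than blocking and inserting directly:

    (require 'json)
    (require 'url)
    (require 'url-http)

    (defun my/ollama-complete-at-point ()
      "Ask Ollama to continue the code before point and insert the result."
      (interactive)
      (let* ((prompt (buffer-substring-no-properties (point-min) (point)))
             (url-request-method "POST")
             (url-request-extra-headers '(("Content-Type" . "application/json")))
             (url-request-data
              (encode-coding-string
               (json-encode `((model . ,my/ollama-model)
                              (prompt . ,prompt)
                              (stream . :json-false)))
               'utf-8))
             (buf (url-retrieve-synchronously
                   (concat my/ollama-host "/api/generate")))
             completion)
        (when buf
          (with-current-buffer buf
            ;; Skip the HTTP headers, then parse the JSON body; the generated
            ;; text is returned in the "response" field.
            (goto-char url-http-end-of-headers)
            (setq completion (alist-get 'response (json-read)))
            (kill-buffer)))
        (when completion
          (insert completion))))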

Prompting and RAG

There’s no one way to skin the RAG cat, but bringing in the right context from the project could supercharge the completions. For example, when completing a query pattern, bringing in samples of similar queries from the project makes the problem akin to multi-shot completion, and LLMs are very good at multi-shot completion.
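
Concretely, the prompt could prepend a handful of retrieved snippets as worked examples ahead of the code being completed. In the sketch below, my/find-similar-snippets is a placeholder for whatever retrieval ends up backing the RAG step (grep, tags, or an embedding index); it doesn’t exist yet:

    (defun my/build-multishot-prompt (code-before-point)
      "Assemble a multi-shot prompt around CODE-BEFORE-POINT.
    Prepends project snippets returned by the hypothetical
    `my/find-similar-snippets' as examples before the completion request."
      (let ((examples (my/find-similar-snippets code-before-point 3)))
        (concat
         "Here are similar snippets from this project:\n\n"
         (mapconcat (lambda (snippet) (concat snippet "\n---\n")) examples "")
         "\nContinue the following code. Reply with code only:\n\n"
         code-before-point)))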

Initially, though, start simple: ask the model for the code completion and provide no other context. Modern instruct models might be powerful enough to limit the output to code on their own, but getting under the hood and constraining the token search during decoding could force syntactically valid output.
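
In that spirit, the zero-context request could carry just the instruction plus a few of Ollama’s documented generation options (temperature, num_predict, stop) to keep the output short and code-shaped; truly constraining the token search to a grammar likely needs deeper hooks than the plain HTTP API exposes. A sketch of the payload:

    ;; Hypothetical zero-context request body: instruction only, with
    ;; conservative sampling options from Ollama's documented parameter set.
    (json-encode
     `((model . ,my/ollama-model)
       (prompt . ,(concat "Complete the following code. "
                          "Reply with code only, no explanation:\n\n"
                          (buffer-substring-no-properties (point-min) (point))))
       (stream . :json-false)
       (options . ((temperature . 0)
                   (num_predict . 128)
                   (stop . ["\n\n\n"])))))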

Next Steps

  • Research existing options / decide on using gptel (a possible configuration sketch is below)
  • Integrate with Ollama
  • Implement completion code
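
If gptel wins that research step, pointing it at Ollama looks roughly like the backend setup from its README. Treat this as a sketch: the model tag is just an example, and the expected format of :models (strings vs. symbols) has shifted between gptel versions:

    ;; Register a local Ollama backend with gptel and make it the default.
    ;; Host and model are placeholders for whichever box is serving.
    (setq gptel-model 'qwen2.5-coder:7b
          gptel-backend (gptel-make-ollama "Ollama"
                          :host "localhost:11434"
                          :stream t
                          :models '(qwen2.5-coder:7b)))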