Project Idea: Using Ollama for Emacs Completions


I’d like to use a local LLM for GitHub Copilot-esque completions. I like Copilot; it’s downright uncanny when it serves up the perfect completion. However, it comes with a few drawbacks:

  • Cost: $10/mo adds up, especially when I already pay for general-purpose LLMs
  • Privacy: even if GitHub sticks to its privacy policy and doesn’t keep queries for future training, sending code to third parties introduces a risk surface many organizations find unacceptable.

A local LLM sidesteps these issues.
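As a sketch of the plumbing, Ollama exposes a local HTTP API at port 11434; fetching a completion might look like the below. The model name is just an example, and the Emacs glue (overlays, keybindings) is out of scope here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(prompt, model="qwen2.5-coder:7b"):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def complete(prompt, model="qwen2.5-coder:7b"):
    """Send the prompt to a locally running Ollama server; return the completion text."""
    data = json.dumps(build_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

An Emacs package would call something like `complete()` with the buffer text before point and insert the response as an overlay.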

Read more ⟶

Ollama Benchmarks: The Server (GPU) vs The Laptop (CPU)


Intro

This post collects initial benchmarks of Ollama running LLM inference on my server and my laptop: the server armed with a Radeon 6900 XT GPU, and the laptop using CPU-only processing. Both setups run Arch Linux, and ROCm provides AMD GPU acceleration.

The benchmark focuses on token generation speeds (tokens/s) for various models.

The Setup

  • The Server (GPU):
    • Radeon RX 6900 XT
    • 16GB GDDR6 RAM (~448 GB/s)
  • The Laptop (CPU):
    • 11th Gen Intel i7-1185G7 @ 3.00GHz
    • 32GB DDR4 RAM (~26 GB/s)
  • OS & Setup:
    • Arch Linux with ROCm for GPU acceleration (see archwiki)
    • ollama v0.4.2
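Those memory-bandwidth figures matter because single-stream token generation is largely memory-bound: each generated token streams roughly the full set of weights through memory once, so bandwidth divided by model size gives a back-of-the-envelope upper bound on tokens/s. A sketch (the ~4.7 GB size for a 4-bit-quantized 7B model is an approximation):

```python
def max_tokens_per_sec(bandwidth_gb_s, model_size_gb):
    """Rough upper bound on generation speed for a memory-bound workload:
    every token requires reading the full weights once."""
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.7  # approx. size of a 7B model at 4-bit quantization

server = max_tokens_per_sec(448, MODEL_GB)  # Radeon RX 6900 XT GDDR6
laptop = max_tokens_per_sec(26, MODEL_GB)   # i7-1185G7 DDR4
```

Real numbers land well below these ceilings (compute, cache effects, and software overhead all intrude), but the ratio between the two machines is a useful sanity check on benchmark results.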

Benchmark Results

There was a 35%–110% speedup moving from the Intel i7 CPU to the Radeon GPU, with the larger models generally seeing the bigger gains (qwen2.5-coder:7b being the exception).
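For reference, the speedup percentages come from comparing tokens/s directly; a quick helper, with sample rates that are hypothetical rather than taken from the actual runs:

```python
def speedup_pct(gpu_tps, cpu_tps):
    """Percentage speedup of GPU over CPU token generation."""
    return (gpu_tps / cpu_tps - 1) * 100

# Hypothetical illustration: 42 tok/s on GPU vs 20 tok/s on CPU.
print(round(speedup_pct(42, 20)))  # 110
```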

Read more ⟶

GNU Shave: Vanilla Emacs


A change of plans left me with extra time on my hands today, so of course I spent it on an Emacs reconfigure. Yup, I too felt like it was a waste of time. I swear I tried to avoid it, but Python & LSP & eglot were fighting, and without Python support it’s real tough to justify Emacs. I truly tried, but given the complexity of Doom I wasn’t able to debug it within an hour.

Read more ⟶

OpenAI introduces the GPT Store


First announced back at their DevDay, today OpenAI officially opened up their app store. I’m looking at how it’s structured, which applications are featured, and the missing payouts. I’m still waiting for access to the store to be rolled out to me, so I’ll start with the glaring hole: there is still no transparency on revenue sharing. OpenAI briefly noted:

In Q1 we will launch a GPT builder revenue program. As a first step, US builders will be paid based on user engagement with their GPTs. We’ll provide details on the criteria for payments as we get closer.

Read more ⟶

Documents worth chatting to


I have reams of documents from over the years. Initially handwritten, post-college as digital text—the documents cover communications, all kinds of notes, recordings, personal metadata, etc. To plumb together a personal RAG system I’ll need to nail down the sources.

Let’s start by brainstorming easily accessible digital sources:

  • org/obsidian files (how to chunk?)
  • emails (pull using offlineimap)
  • calendar (one-time export)
  • health data (Apple HealthKit, Google Fit)
  • voice/video chat recordings (which I don’t have)
  • source code (not really useful to bring in)
  • ‘sketches’, blueprints, mind maps which don’t really exist
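On the “how to chunk?” question for org/obsidian files, one simple starting point is splitting at top-level headings; a minimal sketch (a real pipeline would likely want size limits and overlap between chunks):

```python
import re

def chunk_org(text):
    """Split an org-mode document into chunks at top-level headings.
    Naive: ignores nesting, chunk-size limits, and overlap."""
    # (?m)^(?=\* ) splits before each line starting with "* "
    chunks = re.split(r"(?m)^(?=\* )", text)
    return [c.strip() for c in chunks if c.strip()]

doc = """* Meeting notes
Discussed the RAG pipeline.
* Ideas
Chunk by heading first.
"""
print(chunk_org(doc))
```

Each chunk keeps its heading, which doubles as cheap metadata when embedding for retrieval.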

Beyond the digital sources I’ve been accumulating are the potential sources of the future: for example, logging my thoughts and processes more deliberately, in the hope that LLMs will give that personal text real utility and accessibility (yes, that’s what this doc is).

Read more ⟶