jtmoulia’s intermittent journal, guides, and recipes. See recent posts below, and links to archives in the header.
Spring Things
Guix: Adding Packages / Applying Diffs
My guix home upgrade has been blocked by one of its dependencies failing to build (python-lazr-restfulclient). A patch fixing it has already been submitted, but not yet accepted into the Guix mainline. If I could cherry-pick those changes into my own “cutting edge” branch, I could keep my build working.
Guix has published three videos showing the process, which feels foundational to figuring out my own Guix fork.
…Wednesday back from South Lake
LLM
Some more reading. Still behind the curve, but starting to see how the pieces fit together to run these models locally on the hardware we have today:
- Vicuna is a LLaMA derivative fine-tuned as a chatbot on user-shared ChatGPT conversations, with GPT-4 used as the judge to score it at roughly 90% of ChatGPT quality (whatever that means). The 13B model runs on local commodity hardware, e.g. an Apple M1 chip with 32GB of RAM has no problem running it, and the fine-tuning reportedly cost only a few hundred dollars.
- Alpaca-LoRA opens the door to instruction-tuning on local hardware. The sauce is Low-Rank Adaptation (LoRA), which freezes the pre-trained model and trains only small low-rank update matrices alongside it, a much cheaper process than updating the full-rank model (rough sketch after this list).
- Found my way to the LLM Wikipedia article; always good to read.
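For my own notes, a minimal PyTorch sketch of the LoRA trick mentioned above – not Alpaca-LoRA’s actual code (I believe that project builds on Hugging Face’s peft library), just the core idea: freeze the pre-trained weight matrix and train only two small rank-r matrices whose product gets added to it. Names and hyperparameters here are made up for illustration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update.

    Effective weight is W + (alpha / r) * B @ A, where only A and B are
    trained. Initialization and scaling follow the LoRA paper; names are mine.
    """

    def __init__(self, pretrained: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = pretrained
        for p in self.base.parameters():           # freeze the original weights
            p.requires_grad = False
        in_f, out_f = pretrained.in_features, pretrained.out_features
        self.lora_a = nn.Parameter(torch.randn(r, in_f) * 0.01)  # (r, in)
        self.lora_b = nn.Parameter(torch.zeros(out_f, r))        # (out, r), starts at zero
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(512, 512), r=8)
    x = torch.randn(2, 512)
    print(layer(x).shape)                          # torch.Size([2, 512])
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable params: {trainable} / {total}")
```

Because B starts at zero, the adapted layer initially behaves exactly like the frozen base, and only the ~8K adapter parameters (versus ~260K in the base layer) ever receive gradients.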
Exercise
Slight regression from last week: the ribs still hurt from Friday and I don’t want to push it. Similar story with grip strength – the left middle finger tendons have been strained for the last two weeks or so.
…Tuesday from South Lake
Drove back today. Will miss the snow.
No exercise, tired and traveling. Not much space for work in Tahoe, so back to focus.
…Monday in South Lake
Wonderful time up here in the thin air with family and friends, headed back home tomorrow.
Starship launch was scrubbed, moving the next window back to Wednesday.
LLMs
What I’m reading today re LLMs:
- RedPajama is a project to create open-source base models (like LLaMA) from the ground up. They’ve released a 1.2-trillion-token training dataset that’ll serve as the foundation for an open Vicuna-like chat model.
- MiniGPT-4 looks to “enhance vision-language understanding with advanced large language models”: it combines Vicuna with a vision encoder through a small, easily trained linear projection layer. So the “GPT-4” aspect is the vision-language combination, and the notable part is that the model can run on a commodity GPU (a 3090). Sketch after this list.
- Google web comic on federated learning. Reads like propaganda, but it brought up the concept of secure aggregation, where model updates from end-user devices can be combined without exposing any individual user’s data (toy example after this list). Particularly relevant when trying to encourage collaboration between entities holding silos of private data.
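To make the MiniGPT-4 point concrete, a toy sketch (my own, not the project’s code) of what the small linear layer does: the vision encoder and the language model are both frozen, and the only trainable parameters are a single linear projection mapping image features into the LLM’s token-embedding space. The dimensions below are illustrative.

```python
import torch
import torch.nn as nn

# Toy sketch: a single trainable linear layer bridging a frozen vision
# encoder and a frozen LLM. Dimensions are made up for illustration.
vision_dim, llm_dim = 1024, 4096
proj = nn.Linear(vision_dim, llm_dim)   # the only parameters that get trained

image_feats = torch.randn(1, 32, vision_dim)   # 32 visual tokens from a frozen encoder
text_embeds = torch.randn(1, 16, llm_dim)      # 16 text token embeddings from the frozen LLM

img_tokens = proj(image_feats)                            # (1, 32, 4096)
llm_input = torch.cat([img_tokens, text_embeds], dim=1)   # image tokens prepended to the text
print(llm_input.shape)                                    # torch.Size([1, 48, 4096])
```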
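And a toy example of the masking idea behind secure aggregation (not Google’s actual protocol, which also handles key exchange and client dropouts): each pair of clients shares a random mask that one adds and the other subtracts, so the server sees only noise per client, yet the masks cancel exactly in the aggregate.

```python
import numpy as np

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]   # each client's model update

# Every pair of clients agrees on a shared random mask.
n = len(updates)
pair_masks = {(i, j): rng.normal(size=4) for i in range(n) for j in range(i + 1, n)}

masked = []
for i, u in enumerate(updates):
    m = u.copy()
    for (a, b), mask in pair_masks.items():
        if a == i:
            m += mask     # lower-indexed client adds the shared mask
        elif b == i:
            m -= mask     # higher-indexed client subtracts it
    masked.append(m)

# The server only ever sees `masked`, yet the sum equals the true aggregate.
print(np.allclose(sum(masked), sum(updates)))   # True
```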
…Sunday in South Lake
Exercise
Full day of boarding out at Kirkwood, and a reminder that Kirkwood is my favorite mountain. No Wagon Wheel, but I felt solid on everything else.
…