LLMs Get Reflexive

Posted on Apr 3, 2023

Self-Reflecting LLMs

My dad shared some fascinating news about GPT-4 (via this video): it performs better with Reflexion, i.e. if you ask it why it gave an incorrect answer, it can sometimes catch the mistake and offer an improved response. Here is a Substack post from the paper’s authors Noah Shinn and Ashwin Gopinath – one of the provocative points is that GPT-4 crossed a threshold of complexity that lets it improve via Reflexion, unlike the earlier GPT models.
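The loop is simple enough to sketch. Here’s a minimal toy version in Python – the `mock_model` function is a hypothetical stand-in for a real LLM API call (it deliberately answers wrong once, then corrects itself when challenged), and the prompt wording is my own, not from the paper:

```python
def mock_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: wrong at first, corrects on challenge."""
    if "Are you sure?" in prompt:
        return "On reflection, 12 * 12 = 144."
    return "12 * 12 = 142."

def answer_with_reflexion(question: str, model, max_rounds: int = 1) -> str:
    """Ask a question, then prompt the model to critique and revise its answer."""
    answer = model(question)
    for _ in range(max_rounds):
        # Feed the model its own answer and ask it to find the mistake.
        critique_prompt = (
            f"{question}\nYour answer was: {answer}\n"
            "Are you sure? Explain any mistake and give a corrected answer."
        )
        answer = model(critique_prompt)
    return answer

print(answer_with_reflexion("What is 12 * 12?", mock_model))
```

The interesting claim in the post is that this loop only pays off once the model is strong enough to actually notice its own mistakes – with weaker models the critique round just shuffles the error around.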

A related paper describes a similar paradigm called Dialog-Enabled Resolving Agents (DERA), in which the model challenges itself through a dialog between agents. Particularly interesting to me, the paper uses clinical documentation examples to show how DERA improves output quality and reduces hallucinations.
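The shape of that dialog can be sketched too: one agent surfaces concerns with a draft, another decides how to revise it. The agent functions below are hypothetical string-based stand-ins for LLM-backed roles, just to show the control flow, not the paper’s actual prompts:

```python
def dera_dialog(draft: str, researcher, decider, max_rounds: int = 3) -> str:
    """Iterate a researcher/decider dialog until no concerns remain."""
    for _ in range(max_rounds):
        feedback = researcher(draft)   # researcher flags problems in the draft
        if not feedback:
            break                      # no remaining concerns: accept the draft
        draft = decider(draft, feedback)  # decider revises in light of feedback
    return draft

# Hypothetical stand-ins for the two LLM agents:
def mock_researcher(draft: str) -> str:
    return "Remove the unsupported claim." if "(unsupported)" in draft else ""

def mock_decider(draft: str, feedback: str) -> str:
    return draft.replace(" (unsupported)", "")

print(dera_dialog("Patient reports fever (unsupported).",
                  mock_researcher, mock_decider))
```

The clinical-note angle makes sense here: a hallucinated symptom is exactly the kind of thing a dedicated challenger agent should catch before the note is finalized.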

On a separate GPT note, I’d like to also read the HuggingGPT paper as an exploration of executive orchestration among task-specific AI models. I took a glance at the available pipelines.

Exercise:

Off-topically: today I shave the mustache. It’s caterpillared too far.

I need to get back in the saddle and get some cycling miles under my belt. I’m happy I kept up with lifting while in LA, but I didn’t get in the cardio / jogging like I was hoping. That’s a definite downside of working with people out of an office-house: there’s no going back to your own place, so no room to get in independent work.

This weekend is a good opportunity, though I need to balance it against basketball: full court ball for three hours wrings me out.

Table 1: Lifting: Shoulders #7
Exercise              Set  Weight (lbs)  Reps
DBell Overhead Press  1    102.5         5
DBell Overhead Press  2    90            8
DBell Overhead Press  3    90            8
Reverse Fly           1    50            10
Reverse Fly           2    50            11
Reverse Fly           3    50            9
Lateral Raise         1    50            8
Lateral Raise         2    50            8
Lateral Raise         3    50            7
Front Raise           1    50            8
Front Raise           2    50            9
Front Raise           3    50            8

Next time I’ll drop the lateral raise weight down to 45 lbs – it was super sloppy at 50 lbs.