Thinking out loud about LLM tech

Thinking more seriously about use cases for LLMs. I can’t see any way to cajole them into being accurate. It seems really obvious to me what kind of AI people actually want. One that “understands” the data and gives them answers they can trust. But that’s not possible here in any meaningful way.

So instead I’m trying to learn the underlying tech a little better and asking myself “what else could this possibly do?”

Here are some qualities of the tech that I think are interesting. Folks should feel free to correct me where I’m wrong. I’m still trying to wrap my head around it.

  • It’s made up of a stack of transformer blocks that can be massively parallelized. The capabilities of the system scale up with the amount of compute. (See the first sketch below.)

  • There are two phases with different affordances and properties. The training phase, where you can try to teach the system how to interpret the underlying data. And the inference phase, where you can try to teach it what kind of output is acceptable. (Second sketch below.)

  • The training phase is a one-time, up-front investment. Given some interesting and sufficiently large training set, you could make your own LLM rather than piggybacking off existing models.

  • The inference phase happens at runtime, given some input. The size of the input the model will take also seems to be a function of how much compute you have available (??). But it has no memory between calls, so you have to send in the whole input each time. That’s why the size of the input the model can take is important: it’s an unchangeable constraint on a particular instance of a model. (Third sketch below.)

  • There is also RLHF: Reinforcement Learning from Human Feedback. Essentially, you can have humans in the loop giving the model feedback on the quality of its answers, and that can yield significant improvements in how the model performs at a given task. (Fourth sketch below.)

  • I don’t understand RLHF all that well yet. For instance, what exactly is it that you can affect with RLHF? Does it happen at training time or at inference time? I’m reading this today: https://huggingface.co/blog/rlhf
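
To make the parallelism point concrete for myself, here’s a toy sketch of scaled dot-product attention in plain numpy. This isn’t the architecture of any particular model, just the core operation: every token position is handled in one pair of matrix multiplies, with no token-by-token loop, which is what makes the whole thing so friendly to parallel hardware.

```python
import numpy as np

def attention(Q, K, V):
    """One attention head over a whole sequence at once.

    Q, K, V are (seq_len, d) arrays. Every position attends to every
    other position via two matrix multiplies, with no sequential loop
    over tokens, which is why this parallelizes so well.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (seq_len, seq_len): all pairs at once
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # each output is a weighted mix of values

# Toy usage: 5 tokens with 8-dimensional embeddings, self-attending.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
print(attention(x, x, x).shape)  # (5, 8)
```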
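Second, the two phases. This toy is a counting-based bigram model rather than a transformer (purely to keep it short), but the shape is the same: training is a one-time pass that bakes statistics from the data into parameters, and inference runs with those parameters frozen, steered only by the input you send in.

```python
import random
from collections import Counter, defaultdict

# Training phase: a one-time, up-front pass over the data that bakes
# statistics into fixed "parameters" (here, bigram counts).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

# Inference phase: the parameters are frozen; the only lever left is
# the input you feed in.
def generate(prompt_word: str, n_tokens: int) -> str:
    out = [prompt_word]
    for _ in range(n_tokens):
        options = counts.get(out[-1])
        if not options:
            break
        words = list(options.keys())
        weights = list(options.values())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

print(generate("the", 6))  # e.g. "the cat sat on the rug ."
```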
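Third, the “no memory” point as code. call_model is a hypothetical stand-in for whatever LLM API you’re using, and the 4096-token limit is an assumed number, not any specific model’s. The point is that the only way to give the model “memory” is to resend the transcript on every call, and the fixed context window eventually forces you to drop (or summarize) old turns.

```python
MAX_CONTEXT_TOKENS = 4096  # fixed for a given model instance (assumed number)

def num_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call.
    return "(model reply)"

history: list[str] = []

def chat(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # No memory between calls: every call resends the whole transcript.
    prompt = "\n".join(history)
    # The fixed context window means the oldest turns eventually get
    # dropped so the transcript still fits.
    while num_tokens(prompt) > MAX_CONTEXT_TOKENS and len(history) > 1:
        history.pop(0)
        prompt = "\n".join(history)
    reply = call_model(prompt)
    history.append(f"Assistant: {reply}")
    return reply

print(chat("hello"))
```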
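Finally, RLHF. My current reading (going by the Hugging Face post above) is that the feedback enters at fine-tuning time, not inference time: humans rank pairs of model outputs, a separate reward model is trained on those rankings, and the LLM is then further trained with RL to score well against that reward model. Here’s a sketch of just the reward-model piece, with illustrative numbers; the RL step itself is the part I still need to wrap my head around.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reward_model_loss(r_chosen, r_rejected):
    """Pairwise loss for training a reward model on human rankings.

    r_chosen and r_rejected are the scalar rewards the model assigns
    to the human-preferred answer and the less-preferred one. The loss
    is small when the preferred answer scores higher.
    """
    return -np.log(sigmoid(r_chosen - r_rejected))

# Reward model agrees with the human ranking: small loss.
print(reward_model_loss(2.0, -1.0))  # ~0.05
# Reward model disagrees: large loss, so training pushes the scores apart.
print(reward_model_loss(-1.0, 2.0))  # ~3.05
```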