Llamafile

December 4, 2023

https://github.com/Mozilla-Ocho/llamafile/ - llamafile packages the weights of various LLMs (the default being LLaVA, which is based on CLIP and Vicuna / Llama) together with everything else needed for distribution and inference, so it runs locally across operating systems with no installation or setup required. It's just one big 4GB executable lol, and runs at ~4 tokens per second on my machine. Great to see what is coming out of the OSS space, and what is increasingly becoming possible on commodity hardware with privacy in mind. A lot of this has been galvanized by https://github.com/ggerganov/llama.cpp.
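
For anyone who wants to check a tokens-per-second figure like the one above on their own machine, here is a rough sketch. It assumes the llamafile is already running and serving HTTP on localhost port 8080 with a llama.cpp-style /completion endpoint, and that the JSON response reports the generated token count as "tokens_predicted". Both are assumptions about the bundled server, not something stated in this post, and the helper names (build_payload, tokens_per_second, measure) are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: the llamafile was started locally and serves HTTP on port 8080
# with a llama.cpp-style /completion endpoint. Adjust if your build differs.
SERVER_URL = "http://localhost:8080/completion"


def build_payload(prompt: str, n_predict: int = 64) -> bytes:
    """Encode a completion request as JSON bytes."""
    return json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")


def tokens_per_second(n_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: tokens generated divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return n_tokens / elapsed_seconds


def measure(prompt: str, n_predict: int = 64, url: str = SERVER_URL) -> float:
    """Send one request and return a rough tokens/second figure."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt, n_predict),
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    elapsed = time.perf_counter() - start
    # Assumption: the response reports the generated token count as
    # "tokens_predicted"; fall back to the requested count if it is absent.
    n_tokens = body.get("tokens_predicted", n_predict)
    return tokens_per_second(n_tokens, elapsed)


if __name__ == "__main__":
    # Only runs when executed directly, and needs the local server to be up.
    print(f"{measure('Why is the sky blue?'):.1f} tokens/s")
```

The number this gives includes prompt processing and HTTP overhead, so treat it as a ballpark. For a sense of scale: at 4 tokens per second, a 256-token answer takes about 64 seconds.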