Hi John, thanks for the blog post. A few things I'd like to comment on:
1) Llama 2 comes in several sizes - apart from the 70B model you mention, there are also 7B and 13B variants, which require far less GPU memory.
2) Thanks to cloud providers, hosting these models is not prohibitively expensive. You mention Google Colab, but there are alternatives, e.g. Amazon SageMaker (a minimal deployment sketch is below): https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/
3) Re fine-tuning - there are alternatives to updating all of the model's weights. Parameter-efficient fine-tuning techniques like QLoRA let you train Llama 2 with a much smaller GPU memory footprint (see the second sketch below): https://www.philschmid.de/sagemaker-llama2-qlora
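To make point 2 concrete, here is a minimal sketch of deploying Llama 2 via SageMaker JumpStart, roughly along the lines of the linked AWS post. The model id and the generation parameters are assumptions on my part - check the JumpStart catalog for the current values:

```python
# Minimal sketch: deploy a Llama 2 endpoint with SageMaker JumpStart.
# model_id is an assumption -- verify it against the JumpStart catalog.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")  # 7B variant; 13B/70B ids follow the same pattern
predictor = model.deploy()  # provisions a real-time endpoint with the model's default instance type

response = predictor.predict(
    {"inputs": "What is parameter-efficient fine-tuning?", "parameters": {"max_new_tokens": 128}},
    custom_attributes="accept_eula=true",  # Llama 2 requires accepting Meta's EULA
)
print(response)
```

Remember to delete the endpoint afterwards (predictor.delete_endpoint()) so you don't keep paying for the instance.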
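And for point 3, a rough QLoRA-style sketch using Hugging Face transformers + peft + bitsandbytes. This is not the exact code from Phil Schmid's post; the hyperparameters and target modules are illustrative assumptions:

```python
# Sketch: QLoRA-style fine-tuning setup -- 4-bit base model, small trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # gated repo; requires accepting Meta's license on the Hub

# Load the base model in 4-bit NF4 quantization -- this is what shrinks the memory footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach LoRA adapters; the 4-bit base weights stay frozen, only the adapters are trained.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; an illustrative choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the total parameters

# From here you train as usual, e.g. with the transformers Trainer or TRL's SFTTrainer.
```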