Heiko Hotz
Aug 25, 2023

Hi John, thanks for the blog post. A few things I'd like to comment on:

1) Llama 2 comes in various sizes - apart from the 70B model that you mention, there are also 7B and 13B variants, which need far less GPU memory (at roughly 2 bytes per parameter in fp16, the 7B model's weights alone fit in about 14 GB).

2) Thanks to cloud providers, it is not prohibitively expensive to host these models. You mention Google Colab, but there are alternatives, e.g. Amazon SageMaker (see the deployment sketch after this list): https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/

3) Re finetuning - there are alternatives to updating the entire set of model weights. Parameter-efficient finetuning techniques like QLoRA let you train Llama 2 with a much smaller GPU memory footprint (see the QLoRA sketch below): https://www.philschmid.de/sagemaker-llama2-qlora
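
To make point 2 concrete, hosting Llama 2 via SageMaker JumpStart is only a few lines with the SageMaker Python SDK. This is a minimal sketch, not the exact recipe from the linked post: the model ID and the EULA-acceptance mechanism are assumptions that may differ across SDK versions (some versions instead pass `custom_attributes="accept_eula=true"` on the predict call).

```python
# Minimal sketch: hosting Llama 2 7B on a SageMaker real-time endpoint
# via JumpStart. The model ID and EULA handling are assumptions; check
# the JumpStart catalog and your sagemaker SDK version.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")
predictor = model.deploy(accept_eula=True)  # provisions a GPU-backed endpoint

response = predictor.predict({
    "inputs": "Explain parameter-efficient finetuning in one sentence.",
    "parameters": {"max_new_tokens": 64, "temperature": 0.2},
})
print(response)

predictor.delete_endpoint()  # tear down to stop paying for idle compute
```

Endpoints bill per instance-hour, so deleting them when idle is what keeps hosting the smaller variants affordable.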
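
And for point 3, here is a minimal QLoRA sketch using the Hugging Face transformers/peft/bitsandbytes stack (the linked post runs this on SageMaker, but the core idea is the same). The model ID assumes you have been granted access to the gated Llama 2 weights on the Hub; the rank, alpha, and target modules are illustrative defaults, not tuned values.

```python
# Minimal QLoRA sketch: 4-bit base model + small trainable LoRA adapters.
# Assumes access to the gated meta-llama/Llama-2-7b-hf weights; all
# hyperparameters below are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"

# Load the frozen base model in 4-bit precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; only these small matrices receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because gradients and optimizer state exist only for the adapters while the base weights sit in 4-bit, the memory footprint drops enough to finetune the 7B model on a single consumer-class GPU.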
