Building Your Own Gemini Pro Chatbot

Use Vertex AI APIs to access Google’s new AI model

Heiko Hotz
4 min read · Dec 14, 2023
Image by author

What is this about?

This tutorial guides you through creating a straightforward chatbot that interfaces with Google’s latest Gemini Pro model. It utilises Vertex AI APIs for model interaction, Streamlit for UI setup, and LangChain to manage context, enabling the chatbot to remember previous conversation segments. Using this application is straightforward and doesn’t necessitate specialised computing hardware.

All the code is available on GitHub:

Why is it important?

When Google announced the Gemini models on 6 December 2023, they stated that Bard would be powered by the new Gemini Pro model. However, this hasn’t been implemented yet in the UK or EU, likely due to regulatory constraints. Currently, the only method to explore the Gemini Pro model is through the Vertex AI APIs. This tutorial demonstrates how to do so in fewer than 100 lines of code. Let’s get started!

Code Walkthrough

Custom LLM Class for LangChain

The initial requirement is a custom LLM class that integrates with LangChain. While LangChain already includes a ChatVertexAI class, as of 13 December 2023, it hasn’t yet integrated the new Gemini Pro model.

Image by author

So, we need to create our own LLM class:

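# Import paths as of December 2023; newer library releases may differ
from typing import Any, List, Mapping, Optional

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM
from vertexai.preview.generative_models import GenerativeModel


class GeminiProLLM(LLM):
    @property
    def _llm_type(self) -> str:
        return "gemini-pro"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        if stop is not None:
            raise ValueError("stop kwargs are not permitted.")

        gemini_pro_model = GenerativeModel("gemini-pro")

        # Send the prompt to the model with a fixed temperature
        model_response = gemini_pro_model.generate_content(
            prompt,
            generation_config={"temperature": 0.1},
        )
        # Take the text of the first part of the first candidate
        text_content = model_response.candidates[0].content.parts[0].text
        return text_content

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        return {"model_id": "gemini-pro", "temperature": 0.1}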

In this class, we specify the model’s usage and the parsing of its responses. Currently, I’ve set only one parameter, temperature, with a hardcoded value. However, it’s quite simple to enhance this by configuring the temperature (and other parameters) in the UI and dynamically passing them to the model.
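One way to make these parameters configurable is to build the generation_config dictionary from user-supplied values instead of hardcoding it. The sketch below is a minimal illustration and not part of the original code: the helper name build_generation_config and the 0.0 to 1.0 temperature bounds are assumptions, so check the permitted ranges in the Vertex AI documentation for the model you use. In the app, the arguments could come from Streamlit widgets such as st.slider, and the result would be passed as generation_config to generate_content.

```python
from typing import Any, Dict, Optional


def build_generation_config(
    temperature: float = 0.1,
    max_output_tokens: Optional[int] = None,
    min_temperature: float = 0.0,  # assumed lower bound
    max_temperature: float = 1.0,  # assumed upper bound
) -> Dict[str, Any]:
    """Build a generation_config dict from user-supplied values (hypothetical helper)."""
    if not min_temperature <= temperature <= max_temperature:
        raise ValueError(
            f"temperature must be between {min_temperature} and {max_temperature}"
        )
    config: Dict[str, Any] = {"temperature": temperature}
    # Only include the token limit when the user actually set one
    if max_output_tokens is not None:
        if max_output_tokens < 1:
            raise ValueError("max_output_tokens must be a positive integer")
        config["max_output_tokens"] = max_output_tokens
    return config


# The hardcoded value used in the class corresponds to the default
default_config = build_generation_config()
# Values that might come from UI widgets (illustrative numbers)
custom_config = build_generation_config(temperature=0.7, max_output_tokens=256)
```

Validating the values before the API call means an out-of-range setting fails early in the UI rather than as a model error.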

How to use the Gemini models is described in Google’s Gemini API documentation.
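On the parsing side, it may be worth guarding against responses that contain no text, for example when a candidate comes back without any content parts. The following is a sketch under that assumption: extract_text is a hypothetical helper, the fallback message is arbitrary, and the stand-in objects built with SimpleNamespace only imitate a candidates, content, parts, text nesting so that the example runs without calling the API.

```python
from types import SimpleNamespace
from typing import Any

FALLBACK_MESSAGE = "<No answer returned by the model>"  # arbitrary placeholder text


def extract_text(model_response: Any, fallback: str = FALLBACK_MESSAGE) -> str:
    """Return the text of the first part of the first candidate, or a fallback."""
    candidates = getattr(model_response, "candidates", None) or []
    if not candidates:
        return fallback
    content = getattr(candidates[0], "content", None)
    parts = getattr(content, "parts", None) or []
    if not parts:
        return fallback
    return parts[0].text


# Stand-in objects that imitate the assumed response shape (not real API objects)
full_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text="Moscow")]))
    ]
)
empty_response = SimpleNamespace(
    candidates=[SimpleNamespace(content=SimpleNamespace(parts=[]))]
)

answer = extract_text(full_response)
no_answer = extract_text(empty_response)
```

Inside _call, such a helper would replace the direct indexing into the response, so an empty reply shows up in the chat as a readable message instead of an IndexError.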

Creating the ChatChain

Once the LLM class is ready, we can create the ChatChain, which takes care of the memory component and ensures that the chatbot remembers previous parts of the conversation:

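import streamlit as st
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory


@st.cache_resource  # cache the chain across reruns; otherwise each rerun would create a new chain with empty memory
def load_chain():
    llm = GeminiProLLM()
    memory = ConversationBufferMemory()
    chain = ConversationChain(llm=llm, memory=memory)
    return chain


chatchain = load_chain()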
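Conceptually, a buffer memory keeps the full transcript and prepends it to each new prompt, which is why the model can resolve references to earlier turns. The sketch below illustrates that idea in plain Python. It is a simplified stand-in, not LangChain's actual implementation, and SimpleBufferMemory, build_prompt, and the prompt wording are all hypothetical.

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Toy conversation memory that stores every turn verbatim."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []  # (human_message, ai_message)

    def save_turn(self, human_message: str, ai_message: str) -> None:
        self.turns.append((human_message, ai_message))

    def history(self) -> str:
        lines = []
        for human_message, ai_message in self.turns:
            lines.append(f"Human: {human_message}")
            lines.append(f"AI: {ai_message}")
        return "\n".join(lines)


def build_prompt(memory: SimpleBufferMemory, user_input: str) -> str:
    """Combine the stored history with the new input into a single prompt."""
    return (
        "Current conversation:\n"
        f"{memory.history()}\n"
        f"Human: {user_input}\n"
        "AI:"
    )


memory = SimpleBufferMemory()
memory.save_turn("Tell me about Russia.", "Russia is the largest country by area.")
prompt = build_prompt(memory, "What is its capital?")
print(prompt)
```

Because the whole transcript is resent on every turn, the prompt grows with the conversation. For long chats, a memory that keeps only recent turns or a summary may be a better fit.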

Session variables

Next, we need to initialise a few session variables in Streamlit. Session variables keep their values across script reruns for the duration of a user's session, in contrast to regular variables, which are reset every time Streamlit reruns the script in response to a user interaction. Here, our aim is to keep track of the previous messages in the chat so that they can be displayed again after each rerun.

# Initialise session state variables
if 'messages' not in st.session_state:
    st.session_state['messages'] = []
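Streamlit reruns the script from top to bottom on each interaction, so anything held in an ordinary variable is lost between reruns. The toy simulation below mimics that behaviour without Streamlit: run_script stands in for one rerun, and a plain dictionary plays the role of st.session_state. Both names are illustrative assumptions.

```python
from typing import Dict, List


def run_script(session_state: Dict[str, List[str]], user_input: str) -> List[str]:
    """Simulate one top-to-bottom rerun of the app script."""
    local_messages: List[str] = []  # regular variable: recreated on every rerun

    # Same guard as in the app: initialise the key only once per session
    if "messages" not in session_state:
        session_state["messages"] = []

    local_messages.append(user_input)
    session_state["messages"].append(user_input)
    return local_messages


session_state: Dict[str, List[str]] = {}  # stand-in for st.session_state
first_local = run_script(session_state, "Tell me about Russia.")
second_local = run_script(session_state, "What is its capital?")

print(second_local)               # only the latest input survives
print(session_state["messages"])  # both inputs are kept
```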

Chatbot UI

Now, all that remains is to display the previous messages and respond to new inputs from the user.

# Display previous messages
for message in st.session_state['messages']:
    role = message["role"]
    content = message["content"]
    with st.chat_message(role):
        st.markdown(content)

# Chat input
prompt = st.chat_input("You:")
if prompt:
    # Store and display the user's message
    st.session_state['messages'].append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Query the chain, then store and display the model's reply
    response = chatchain(prompt)["response"]
    st.session_state['messages'].append({"role": "assistant", "content": response})
    with st.chat_message("assistant"):
        st.markdown(response)
That is it 😃 Now we can test our chatbot application!

Testing the app

To run the chatbot app we use Streamlit like so:

streamlit run

This will launch a local Streamlit server on port 8501, meaning it can be accessed through a browser at the address http://localhost:8501/. Now, we’re ready to interact with the model. Let’s test if our memory component is functioning properly:

Image by author

As we can see, the model recalls our previous discussion about Russia and seamlessly answers the question ‘What is its capital?’ without any issues.


We’ve developed a simple and user-friendly chatbot application for interacting with Gemini Pro. You can download the code and experiment with it yourself! There are numerous ways to enhance this app, including:

  • Incorporating additional generation parameters, such as max_output_tokens
  • Integrating the Gemini Pro Vision model to enable image inputs
  • Allowing users to select different models within the same app for side-by-side comparison

Heiko Hotz

👋 Follow me on Medium and LinkedIn to read more about Generative AI, Machine Learning, and Natural Language Processing.

👥 If you’re based in London join one of our NLP London Meetups.


Generative AI Blackbelt @ Google — All opinions are my own