How to Use Google's Gemini AI Model in R

Deepanshu Bhalla
Integrate Google's Gemini AI Model into R

In this tutorial, you will learn how to integrate Google's Gemini AI model into R. Google AI has an official Python package and documentation for the Gemini API, but nothing comparable for R. This tutorial gives R users the information they need to start using the Gemini API in R.

Steps to Integrate Gemini into R

To use Google's Gemini via API in R, please follow these steps.

  1. Step 1 : Get API Key -

    You can access the Gemini API through Google AI Studio. Once you have access, create an API key by clicking the Create API Key button. Copy and save your API key for future reference.

    Please note that the Gemini API is currently available for free. In the future, there may be a cost involved in using the Gemini API. Check out the pricing page here.

  2. Step 2 : Install the Required Libraries -

    Before we can start using the Gemini AI model in R, we need to install two libraries: httr and jsonlite. The "httr" library lets us send our question to the Gemini API and fetch the response, while the "jsonlite" library converts R objects to JSON format.

    To install these libraries, you can use the following code in R:

    install.packages("httr")
    install.packages("jsonlite")
    
  3. Step 3 : Generate Content Based on Prompt -

    The following function generates a response from the Gemini Model based on your question (prompt).

    library(httr)
    library(jsonlite)
    
    # Function
    gemini <- function(prompt, 
                     temperature=0.5,
                     max_output_tokens=1024,
                     api_key=Sys.getenv("GEMINI_API_KEY"),
                     model = "gemini-1.5-flash-latest") {
      
      if(nchar(api_key)<1) {
        api_key <- readline("Paste your API key here: ")
        Sys.setenv(GEMINI_API_KEY = api_key)
      }
      
      model_query <- paste0(model, ":generateContent")
      
      response <- POST(
        url = paste0("https://generativelanguage.googleapis.com/v1beta/models/", model_query),
        query = list(key = api_key),
        content_type_json(),
        encode = "json",
        body = list(
          contents = list(
            parts = list(
              list(text = prompt)
            )),
          generationConfig = list(
            temperature = temperature,
            maxOutputTokens = max_output_tokens
          )
        )
      )
      
      # Stop on any HTTP error status (4xx or 5xx) and show the API's message
      if(http_error(response)) {
        stop(paste("Error - ", content(response)$error$message))
      }
      
      candidates <- content(response)$candidates
      outputs <- unlist(lapply(candidates, function(candidate) candidate$content$parts))
      
      return(outputs)
      
    }
    
    prompt <- "R code to remove duplicates using dplyr."
    cat(gemini(prompt))
    

    The first time you run the function above, it asks you to enter your API key. It saves the key in the GEMINI_API_KEY environment variable, so it won't ask again during the current R session. Sys.setenv( ) stores the API key, whereas Sys.getenv( ) retrieves it.

Output

```r
# Remove duplicate rows from a data frame

df <- data.frame(
  x = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  y = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
)

# Using `distinct()`

df %>% distinct()
```
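
The last two lines of gemini() pull the generated text out of the parsed response. The sketch below illustrates that step offline on a hand-built list. The list only mimics the fields the function reads (candidates, content, parts, text), and `mock_response` and `extract_text()` are hypothetical names used for illustration, not part of the API.

```r
# Hand-built stand-in for content(response); the field names mirror the ones
# that gemini() reads (candidates -> content -> parts -> text).
mock_response <- list(
  candidates = list(
    list(content = list(
      role = "model",
      parts = list(list(text = "Use distinct() "), list(text = "from dplyr."))
    ))
  )
)

# Collect the text of every part of every candidate into one string.
# Parts without a text field contribute an empty string.
extract_text <- function(parsed) {
  texts <- unlist(lapply(parsed$candidates, function(candidate) {
    vapply(candidate$content$parts, function(part) {
      if (is.null(part$text)) "" else part$text
    }, character(1))
  }))
  paste(texts, collapse = "")
}

extract_text(mock_response)
# [1] "Use distinct() from dplyr."
```

Returning a single string rather than a vector makes the result easier to pass to cat() or to store in a data frame.
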
Terminologies Related to Gemini API
  • Prompt: The question or instruction you send to the model. It is also called a query.
  • Tokens: Tokens are words or subwords. For example, the word "lower" may be split into two tokens: "low" and "er".
  • Temperature: A model parameter that controls how random the response is. In this tutorial it is set between 0 and 1. A value close to 0 makes the model favor the response with the highest probability, while a value closer to 1 produces responses that are more creative or random.
  • Max output tokens: A model parameter that sets the maximum number of tokens that can be generated in the response.
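
To make the effect of temperature concrete, here is a small sketch of a temperature-scaled softmax, the usual textbook way of describing this parameter. It is an illustration only: the scores are made up, `softmax_with_temperature()` is a hypothetical helper, and nothing here claims to reproduce how Gemini samples tokens internally.

```r
# Temperature-scaled softmax: divide the scores by the temperature before
# normalising them into probabilities.
softmax_with_temperature <- function(scores, temperature) {
  stopifnot(temperature > 0)
  scaled <- scores / temperature
  weights <- exp(scaled - max(scaled))  # subtract the max for numerical stability
  weights / sum(weights)
}

# Made-up scores for three candidate next tokens
scores <- c(low = 2.0, lower = 1.0, lowest = 0.5)

# Low temperature: almost all probability goes to the top-scoring token
round(softmax_with_temperature(scores, temperature = 0.1), 3)

# Higher temperature: probability is spread more evenly across the tokens
round(softmax_with_temperature(scores, temperature = 1.0), 3)
```
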

How to Handle Image as Input

To handle an image as input, we can use the gemini-1.5-flash-latest model, which accepts images along with text. You can ask it to describe the image or answer any question related to it. Make sure to install the "base64enc" library first.

library(httr)
library(jsonlite)
library(base64enc)

# Function
gemini_vision <- function(prompt, 
                   image,
                   temperature=0.5,
                   max_output_tokens=4096,
                   api_key=Sys.getenv("GEMINI_API_KEY"),
                   model = "gemini-1.5-flash-latest") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(GEMINI_API_KEY = api_key)
  }
  
  model_query <- paste0(model, ":generateContent")
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta/models/", model_query),
    query = list(key = api_key),
    content_type_json(),
    encode = "json",
    body = list(
      contents = list(
        parts = list(
          list(
            text = prompt
          ),
          list(
            inlineData = list(
              mimeType = "image/png",  # hardcoded; use "image/jpeg" for JPEG files
              data = base64encode(image)
            )
          )
        )
      ),
      generationConfig = list(
        temperature = temperature,
        maxOutputTokens = max_output_tokens
      )
    )
  )
  
  # Stop on any HTTP error status (4xx or 5xx) and show the API's message
  if(http_error(response)) {
    stop(paste("Error - ", content(response)$error$message))
  }
  
  candidates <- content(response)$candidates
  outputs <- unlist(lapply(candidates, function(candidate) candidate$content$parts))
  
  return(outputs)
  
}

gemini_vision(prompt = "Describe what people are doing in this image", 
              image = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Soccer-1490541_960_720.jpg")
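
The function above always sends "image/png" as the MIME type, while the example image is a .jpg file. One way to keep the two in sync is to derive the MIME type from the file extension. The helper below is a minimal sketch: `guess_mime_type()` is a hypothetical name, and the list of extensions is an assumption covering a few common formats rather than a statement of everything the API accepts.

```r
# Hypothetical helper: map a file extension to an image MIME type
guess_mime_type <- function(path) {
  ext <- tolower(tools::file_ext(path))
  mime_types <- c(png = "image/png", jpg = "image/jpeg",
                  jpeg = "image/jpeg", webp = "image/webp")
  if (!ext %in% names(mime_types)) {
    stop("Unsupported image extension: ", ext)
  }
  unname(mime_types[ext])
}

guess_mime_type("https://upload.wikimedia.org/wikipedia/commons/a/a7/Soccer-1490541_960_720.jpg")
# [1] "image/jpeg"
```

The result could then be passed as the mimeType value in the request body instead of the fixed string.
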

Shiny App to Explain Image using Gemini API

You can create an interactive Shiny app that describes an image uploaded by the user. The app uses the gemini_vision() function defined in the previous section.

library(shiny)

Sys.setenv(GEMINI_API_KEY = "xxxxxxxxxxx")

ui <- fluidPage(
  mainPanel(
    fluidRow(
      fileInput(
        inputId = "imgFile",
        label = "Select image to upload"
      ),
      textInput(
        inputId = "prompt", 
        label = "Prompt", 
        placeholder = "Enter Your Query"
      ),
      actionButton("submit", "Talk to Gemini"),
      textOutput("response")
    ),
    imageOutput(outputId = "myimage")
  )
)

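server <- function(input, output) {
  
  # Show the uploaded image once a file has been selected
  output$myimage <- renderImage({
    req(input$imgFile)
    list(
      src = input$imgFile$datapath
    )
  }, deleteFile = FALSE)
  
  # Call Gemini only when the button is clicked and both inputs are present
  answer <- eventReactive(input$submit, {
    req(input$imgFile, input$prompt)
    gemini_vision(input$prompt, input$imgFile$datapath)
  })
  
  output$response <- renderText(answer())
}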

shinyApp(ui = ui, server = server)
Explain Image using Gemini API

Question Answering

Suppose you have some documents and want to build a system where people can ask questions about them and a chatbot answers based on their content.

make_prompt <- function(query, relevant_passage) {
  escaped <- gsub("'", "", gsub('"', "", gsub("\n", " ", relevant_passage)))
  prompt <- sprintf("You are a helpful and informative bot that answers questions using text from the reference passage included below. \
  Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. \
  However, you are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone. \
  If the passage is irrelevant to the answer, you may ignore it.
  QUESTION: '%s'
  PASSAGE: '%s'

    ANSWER:
  ", query, escaped)
  
  return(prompt)
}

passage <- "Title: Is AI a Threat to Content Writers?\n Author: Deepanshu Bhalla\nFull article:\n Both Open source and commercial generative AI models have made content writing easy and quick. Now you can create content in a few mins which used to take hours."
Example 1
query <- "Who is the author of this article?"
prompt = make_prompt(query, passage)
cat(gemini(prompt))
Output
Deepanshu Bhalla is the author of this article.
Example 2
query <- "Summarize this article"
prompt = make_prompt(query, passage)
cat(gemini(prompt))
Output
This article discusses whether AI is a threat to content writers. It argues that both open source and commercial generative AI models have made content writing easier and quicker, and that this could potentially put content writers out of a job.
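
The first line of make_prompt() cleans the passage before it is placed between single quotes in the prompt template. The sketch below runs the same three substitutions in isolation so their effect is easy to see. `escape_passage()` and the sample text are hypothetical and used only for illustration.

```r
# Same substitutions as in make_prompt(): newlines become spaces,
# then double quotes and single quotes are dropped.
escape_passage <- function(text) {
  gsub("'", "", gsub('"', "", gsub("\n", " ", text)))
}

sample_passage <- "Title: \"AI\" and writers\nIt's a short note."
escape_passage(sample_passage)
# [1] "Title: AI and writers Its a short note."
```

Note that dropping single quotes also removes apostrophes, so "It's" becomes "Its". This is usually harmless for question answering, but it is worth knowing when the exact wording of the passage matters.
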

Chat Functionality

In many use cases, a chatbot needs to remember your previous questions in order to answer follow-up questions. For example, if you ask "What is 2+2?" and then follow up with "What is the square of it?", it should understand that "it" refers to the previous answer and respond accordingly.

library(httr)
library(jsonlite)

chat_gemini <- function(prompt, 
                      temperature=0.5,
                      api_key=Sys.getenv("GEMINI_API_KEY"),
                      model="gemini-1.5-flash-latest") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(GEMINI_API_KEY = api_key)
  }
  
  model_query <- paste0(model, ":generateContent")
  
  # Add new message
  chatHistory <<- append(chatHistory, list(list(role = 'user', 
                                                parts = list(
                                                  list(text = prompt)
                                                ))))
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta/models/", model_query),
    query = list(key = api_key),
    content_type_json(),
    body = toJSON(list(
      contents = chatHistory,
      generationConfig = list(
        temperature = temperature
      )
    ),  auto_unbox = TRUE))
  
  # On any HTTP error (4xx or 5xx), drop the message we just added and stop
  if(http_error(response)) {
    chatHistory <<- chatHistory[-length(chatHistory)]
    stop(paste("Status Code - ", response$status_code))
  } else {
    answer <- content(response)$candidates[[1]]$content$parts[[1]]$text
    chatHistory <<- append(chatHistory, list(list(role = 'model', 
                                          parts = list(list(text = answer)))))
  }
  
  return(answer)
  
}
chatHistory <- list()
cat(chat_gemini(prompt="3+5"))
cat(chat_gemini(prompt="square of it"))
cat(chat_gemini(prompt="Add 2 to it"))
Output
> chatHistory <- list()
> cat(chat_gemini(prompt="3+5"))
3+5 is equal to 8.
> cat(chat_gemini(prompt="square of it"))
The square of 8 is 64.
> cat(chat_gemini(prompt="Add 2 to it"))
64 + 2 = 66.
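
chat_gemini() keeps the conversation in a global chatHistory variable, which means only one conversation can exist at a time. A possible alternative, sketched below, is to keep the history inside a closure. `new_chat()`, `send_fn` and `stub_send()` are hypothetical names. In practice `send_fn` would wrap the POST request shown above, while here an offline stub stands in for the API so the example runs without a key.

```r
# Hypothetical constructor: returns a chat function with its own private history.
# `send_fn` takes the full history and returns the model's reply as a string.
new_chat <- function(send_fn) {
  history <- list()
  
  function(prompt) {
    user_turn <- list(role = "user", parts = list(list(text = prompt)))
    # Send the pending turn without storing it yet, so a failed request
    # leaves the history unchanged
    answer <- send_fn(c(history, list(user_turn)))
    model_turn <- list(role = "model", parts = list(list(text = answer)))
    history <<- c(history, list(user_turn, model_turn))
    answer
  }
}

# Offline stub standing in for the API: reports how many turns it received
stub_send <- function(history) paste("turns received:", length(history))

chat <- new_chat(stub_send)
chat("3+5")
# [1] "turns received: 1"
chat("square of it")
# [1] "turns received: 3"
```

The second call receives three turns (the first question, the first answer and the new question), which is the same growing history that chat_gemini() sends in the contents field.
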

How to Generate Text Embeddings

In this section, we will show you how to use the Gemini API to generate text embeddings. Embeddings let you search through a list of documents and find the ones most relevant to a question about a certain topic.

In the example below, we have three documents on AI. We want to identify the most relevant document based on a question about the AI topic.

embedding_gemini <- function(prompt, 
                   api_key=Sys.getenv("GEMINI_API_KEY"),
                   model = "text-embedding-004") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(GEMINI_API_KEY = api_key)
  }
  
  model_query <- paste0(model, ":embedContent")
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta/models/", model_query),
    query = list(key = api_key),
    content_type_json(),
    encode = "json",
    body = list(
      model = paste0("models/",model),
      content = list(
        parts = list(
          list(text = prompt)
        ))
    )
  )
  
  # Stop on any HTTP error status (4xx or 5xx)
  if(http_error(response)) {
    stop(paste("Status Code - ", response$status_code))
  }
  
  return(unlist(content(response)))
  
}

DOCUMENT1 = "AI is like a smart helper in healthcare. It can find problems early by looking at lots of information, help doctors make plans just for you, and even make new medicines faster."
DOCUMENT2 = "AI needs to be open and fair. Sometimes, it can learn things that aren't right. We need to be careful and make sure it's fair for everyone. If AI makes a mistake, someone needs to take responsibility."
DOCUMENT3 = "AI is making school exciting. It can make learning fit you better, help teachers make fun lessons, and show when you need more help."
df <-  data.frame(Text = c(DOCUMENT1, DOCUMENT2, DOCUMENT3))

# Get the embeddings of each text
embedding_out <- list()
for(i in 1:nrow(df)) {
  result <- embedding_gemini(prompt = df[i,"Text"])
  embedding_out[[i]] <- result
}

# Identify Most relevant document
query <-  "AI can generate misleading results many times."
scores_query <- embedding_gemini(query)

# Calculate the dot products
dot_products <- sapply(embedding_out, function(x) sum(x * scores_query))

# Find the index of the maximum dot product to view the most relevant document
idx <- which.max(dot_products)
df$Text[idx]
Output
[1] "AI needs to be open and fair. Sometimes, it can learn things that aren't right. We need to be careful and make sure it's fair for everyone. If AI makes a mistake, someone needs to take responsibility."
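
The ranking above uses the raw dot product. Dot product and cosine similarity give the same ranking when all vectors have unit length. If that is not guaranteed, cosine similarity is a common alternative because it ignores vector length. The sketch below uses made-up 3-dimensional vectors in place of real embeddings, and `cosine_similarity()` is a hypothetical helper.

```r
# Cosine similarity: dot product divided by the product of the vector lengths
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Toy 3-dimensional vectors standing in for real embeddings
doc_vectors <- list(
  healthcare = c(0.9, 0.1, 0.0),
  fairness   = c(0.1, 0.8, 0.3),
  education  = c(0.2, 0.1, 0.9)
)
query_vector <- c(0.0, 0.9, 0.2)

# Score every document against the query and pick the best match
scores <- sapply(doc_vectors, cosine_similarity, b = query_vector)
names(which.max(scores))
# [1] "fairness"
```

With real embeddings, `doc_vectors` would be the embedding_out list and `query_vector` would be scores_query from the code above.
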
List of AI Models Available via Gemini API

The following code returns all the AI models available through the Gemini API.

library(httr)
library(jsonlite)

models <- GET(
  url = "https://generativelanguage.googleapis.com/v1beta/models", 
  query = list(key = Sys.getenv("GEMINI_API_KEY")))

lapply(content(models)[["models"]], function(model) c(description = model$description,
                                                 displayName = model$displayName,
                                                 name = model$name,
                                                 method = model$supportedGenerationMethods[1]))
Output
[[1]]
[[1]]$description
[1] "A legacy text-only model optimized for chat conversations"

[[1]]$displayName
[1] "PaLM 2 Chat (Legacy)"

[[1]]$name
[1] "models/chat-bison-001"

[[1]]$method
[1] "generateMessage"


[[2]]
[[2]]$description
[1] "A legacy model that understands text and generates text as an output"

[[2]]$displayName
[1] "PaLM 2 (Legacy)"

[[2]]$name
[1] "models/text-bison-001"

[[2]]$method
[1] "generateText"


[[3]]
[[3]]$description
[1] "Obtain a distributed representation of a text."

[[3]]$displayName
[1] "Embedding Gecko"

[[3]]$name
[1] "models/embedding-gecko-001"

[[3]]$method
[1] "embedText"


[[4]]
[[4]]$description
[1] "The best model for scaling across a wide range of tasks. This is the latest model."

[[4]]$displayName
[1] "Gemini 1.0 Pro Latest"

[[4]]$name
[1] "models/gemini-1.0-pro-latest"

[[4]]$method
[1] "generateContent"


[[5]]
[[5]]$description
[1] "The best model for scaling across a wide range of tasks"

[[5]]$displayName
[1] "Gemini 1.0 Pro"

[[5]]$name
[1] "models/gemini-1.0-pro"

[[5]]$method
[1] "generateContent"


[[6]]
[[6]]$description
[1] "The best model for scaling across a wide range of tasks"

[[6]]$displayName
[1] "Gemini 1.0 Pro"

[[6]]$name
[1] "models/gemini-pro"

[[6]]$method
[1] "generateContent"


[[7]]
[[7]]$description
[1] "The best model for scaling across a wide range of tasks. This is a stable model that supports tuning."

[[7]]$displayName
[1] "Gemini 1.0 Pro 001 (Tuning)"

[[7]]$name
[1] "models/gemini-1.0-pro-001"

[[7]]$method
[1] "generateContent"


[[8]]
[[8]]$description
[1] "The best image understanding model to handle a broad range of applications"

[[8]]$displayName
[1] "Gemini 1.0 Pro Vision"

[[8]]$name
[1] "models/gemini-1.0-pro-vision-latest"

[[8]]$method
[1] "generateContent"


[[9]]
[[9]]$description
[1] "The best image understanding model to handle a broad range of applications"

[[9]]$displayName
[1] "Gemini 1.0 Pro Vision"

[[9]]$name
[1] "models/gemini-pro-vision"

[[9]]$method
[1] "generateContent"


[[10]]
[[10]]$description
[1] "Mid-size multimodal model that supports up to 2 million tokens"

[[10]]$displayName
[1] "Gemini 1.5 Pro Latest"

[[10]]$name
[1] "models/gemini-1.5-pro-latest"

[[10]]$method
[1] "generateContent"


[[11]]
[[11]]$description
[1] "Mid-size multimodal model that supports up to 2 million tokens"

[[11]]$displayName
[1] "Gemini 1.5 Pro 001"

[[11]]$name
[1] "models/gemini-1.5-pro-001"

[[11]]$method
[1] "generateContent"


[[12]]
[[12]]$description
[1] "Mid-size multimodal model that supports up to 2 million tokens"

[[12]]$displayName
[1] "Gemini 1.5 Pro"

[[12]]$name
[1] "models/gemini-1.5-pro"

[[12]]$method
[1] "generateContent"


[[13]]
[[13]]$description
[1] "Mid-size multimodal model that supports up to 2 million tokens"

[[13]]$displayName
[1] "Gemini 1.5 Pro Experimental 0801"

[[13]]$name
[1] "models/gemini-1.5-pro-exp-0801"

[[13]]$method
[1] "generateContent"


[[14]]
[[14]]$description
[1] "Mid-size multimodal model that supports up to 2 million tokens"

[[14]]$displayName
[1] "Gemini 1.5 Pro Experimental 0827"

[[14]]$name
[1] "models/gemini-1.5-pro-exp-0827"

[[14]]$method
[1] "generateContent"


[[15]]
[[15]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[15]]$displayName
[1] "Gemini 1.5 Flash Latest"

[[15]]$name
[1] "models/gemini-1.5-flash-latest"

[[15]]$method
[1] "generateContent"


[[16]]
[[16]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[16]]$displayName
[1] "Gemini 1.5 Flash 001"

[[16]]$name
[1] "models/gemini-1.5-flash-001"

[[16]]$method
[1] "generateContent"


[[17]]
[[17]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[17]]$displayName
[1] "Gemini 1.5 Flash 001 Tuning"

[[17]]$name
[1] "models/gemini-1.5-flash-001-tuning"

[[17]]$method
[1] "generateContent"


[[18]]
[[18]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[18]]$displayName
[1] "Gemini 1.5 Flash"

[[18]]$name
[1] "models/gemini-1.5-flash"

[[18]]$method
[1] "generateContent"


[[19]]
[[19]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[19]]$displayName
[1] "Gemini 1.5 Flash Experimental 0827"

[[19]]$name
[1] "models/gemini-1.5-flash-exp-0827"

[[19]]$method
[1] "generateContent"


[[20]]
[[20]]$description
[1] "Fast and versatile multimodal model for scaling across diverse tasks"

[[20]]$displayName
[1] "Gemini 1.5 Flash 8B Experimental 0827"

[[20]]$name
[1] "models/gemini-1.5-flash-8b-exp-0827"

[[20]]$method
[1] "generateContent"


[[21]]
[[21]]$description
[1] "Obtain a distributed representation of a text."

[[21]]$displayName
[1] "Embedding 001"

[[21]]$name
[1] "models/embedding-001"

[[21]]$method
[1] "embedContent"


[[22]]
[[22]]$description
[1] "Obtain a distributed representation of a text."

[[22]]$displayName
[1] "Text Embedding 004"

[[22]]$name
[1] "models/text-embedding-004"

[[22]]$method
[1] "embedContent"


[[23]]
[[23]]$description
[1] "Model trained to return answers to questions that are grounded in provided sources, along with estimating answerable probability."

[[23]]$displayName
[1] "Model that performs Attributed Question Answering."

[[23]]$name
[1] "models/aqa"

[[23]]$method
[1] "generateAnswer"
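
A long nested list like the one above is easier to scan as a data frame. The sketch below shows one way to flatten it, using a hand-built list with three entries copied from the output above in place of a live API call. `model_list` and `models_df` are hypothetical names.

```r
# Stand-in for content(models)[["models"]], using three entries from the output above
model_list <- list(
  list(name = "models/gemini-1.5-flash", displayName = "Gemini 1.5 Flash",
       supportedGenerationMethods = list("generateContent")),
  list(name = "models/text-embedding-004", displayName = "Text Embedding 004",
       supportedGenerationMethods = list("embedContent")),
  list(name = "models/aqa",
       displayName = "Model that performs Attributed Question Answering.",
       supportedGenerationMethods = list("generateAnswer"))
)

# One row per model, keeping only the first supported method as in the code above
models_df <- do.call(rbind, lapply(model_list, function(model) {
  data.frame(
    name = model$name,
    displayName = model$displayName,
    method = model$supportedGenerationMethods[[1]],
    stringsAsFactors = FALSE
  )
}))

# Names of the models that support generateContent
models_df$name[models_df$method == "generateContent"]
# [1] "models/gemini-1.5-flash"
```
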
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
