How to Integrate Bard into R

In this tutorial, you will learn how to integrate Google Bard into R using the PaLM API, which can be called from R to run Bard. Google AI provides an official Python package and documentation for the PaLM API, but R users need not feel let down by the lack of official documentation for this API, as this tutorial will give you the necessary information to get started with the PaLM API in R.

What is Bard?

Bard is an AI chatbot from Google that can understand and respond to your questions like a human. It is trained on a huge amount of text data, so it can generate relevant responses. It can be used for a variety of purposes such as customer support, coding, grammar checking, rewriting text, etc.

Terminologies related to PaLM API

  • Prompt: A prompt is the question or instruction you send to the model; it is also called a search query. Think of it like this: you have a very smart machine that can answer anything. You can ask it to write an essay, a piece of programming code, or anything else you can think of, but the machine requires specific instructions on what exactly you want it to do.
  • Tokens: Tokens are words or subwords. For example, the word "lower" splits into two tokens: "low" and "er". Similarly, the word "unhappy" splits into two tokens: "un" and "happy". Words are split into tokens because they can have different prefixes and suffixes. "Low" can become "lower" or "lowest", so splitting helps the model understand that these words are related.
  • Temperature: A model parameter used to tune the response. It lies between 0 and 1. A value close to 0 tells the model to generate the response with the highest probability, while a value closer to 1 produces more creative responses.
  • Max output tokens: A model parameter that defines the maximum number of tokens the response can contain.
  • Candidate count: The PaLM API can generate more than one response. This model parameter specifies the number of responses you want.
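To make these parameters concrete, here is a minimal sketch (with illustrative values) of how they map onto the JSON body that the later request code in this tutorial sends to the API. The field names temperature, maxOutputTokens, and candidateCount are the ones the PaLM API expects.

```r
library(jsonlite)

# Illustrative request body: a prompt plus the three tuning parameters
body <- list(
  prompt = list(text = "Write a haiku about R."),
  temperature = 0.3,       # close to 0 = pick the most likely response
  maxOutputTokens = 1024,  # upper limit on response length
  candidateCount = 1       # number of responses to generate
)

# Convert the R list to the JSON the API expects
cat(toJSON(body, auto_unbox = TRUE, pretty = TRUE))
```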

Steps to run Bard in R

Step 1 : Get API Key

The PaLM API is currently available for free. The documentation states that there will be no charge for the PaLM API until the end of this year; after that, there will be a cost involved in using it.

First, you need to join the PaLM API waitlist by visiting this link: PaLM API Waitlist. You may get access in a few minutes, or it may take a day or so. Google will notify you of your access via email.

Once you have access, you can create an API key to access the PaLM API by going to this link. Copy and save your API key for future reference.

Please note that the PaLM API is not currently available in some countries. See the list of supported countries here.
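If you prefer not to paste the key directly into your scripts, a common approach (a sketch, assuming you store the key in your ~/.Renviron file) is to keep it in an environment file that R reads at startup:

```r
# Add this line to the file ~/.Renviron (without the leading #):
# PALM_API_KEY=XXXXXXXXXXXXX

# Reload the file in the current session (it is read automatically at startup)
readRenviron("~/.Renviron")

# The functions later in this tutorial look up the key like this:
api_key <- Sys.getenv("PALM_API_KEY")
```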
Step 2 : Install the Required Libraries

Before we can start using Bard in R, we need to install the necessary libraries: httr and jsonlite. The "httr" library lets us post our question to the PaLM API and fetch the response, while the "jsonlite" library converts R objects to JSON format.

To install these libraries, you can use the following code in R:
install.packages("httr")
install.packages("jsonlite")
Models

The PaLM API has the following three models which cover various use cases.

  1. text-bison: Generates text and code. It is also useful for problem solving, extracting key information, etc.
  2. chat-bison: Generates text in a conversational format, like a chatbot. It remembers your prior conversation while answering your question.
  3. embedding-gecko: It generates text embeddings for the input text. These embeddings can be used for a variety of tasks such as searching text, classifying text, clustering text, and detecting outliers in text.

The following code returns all the available models supported in the PaLM API. Make sure to enter your API key in the api_key vector.

library(httr)
library(jsonlite)

api_key <- "XXXXXXXXXXXXX"

models <- GET(
  url = "https://generativelanguage.googleapis.com/v1beta2/models", 
  query = list(key = api_key))

lapply(content(models)$models, function(model) c(description = model$description,
                                                 displayName = model$displayName,
                                                 name = model$name,
                                                 method = model$supportedGenerationMethods[1]))
Output
[[1]]
[[1]]$description
[1] "Chat-optimized generative language model."

[[1]]$displayName
[1] "Chat Bison"

[[1]]$name
[1] "models/chat-bison-001"

[[1]]$method
[1] "generateMessage"


[[2]]
[[2]]$description
[1] "Model targeted for text generation."

[[2]]$displayName
[1] "Text Bison"

[[2]]$name
[1] "models/text-bison-001"

[[2]]$method
[1] "generateText"


[[3]]
[[3]]$description
[1] "Obtain a distributed representation of a text."

[[3]]$displayName
[1] "Embedding Gecko"

[[3]]$name
[1] "models/embedding-gecko-001"

[[3]]$method
[1] "embedText"

How to Generate Response based on Prompt

The code below defines two functions: find_model and bard. The "find_model" function returns the current model name for a given method. This is required because the model name can change when a new version is released. The "bard" function generates a response from the model based on your question (prompt).

find_model <- function(method, api_key=Sys.getenv("PALM_API_KEY")) {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(PALM_API_KEY = api_key)
  }
  
  response <- GET(
    url = "https://generativelanguage.googleapis.com/v1beta2/models", 
    query = list(key = api_key))
  
  if(status_code(response)>200) {
    stop(content(response)$error$message)
  }
  
  find_model <- lapply(content(response)$models, function(x) {
    if (method %in% x$supportedGenerationMethods) {
      return(x$name)
    }
  })
  
  modelName <- unlist(find_model[!sapply(find_model, is.null)])[1]
  
  return(modelName)
  
}

# Function
bard <- function(prompt, 
                 temperature=0.3,
                 max_output_tokens=1024,
                 candidate_count=1,
                 api_key=Sys.getenv("PALM_API_KEY"),
                 model = "auto") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(PALM_API_KEY = api_key)
  }
  
  if(tolower(model)=="auto") {
    model <- find_model("generateText")
  }
  
  model_query <- paste0(model, ":generateText")
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta2/", model_query),
    query = list(key = api_key),
    content_type_json(),
    encode = "json",
    body = list(
      prompt = list(
        text = prompt
      ),
      temperature=temperature, 
      maxOutputTokens=max_output_tokens,
      candidateCount=candidate_count
    )
  )
  
  if(status_code(response)>200) {
    stop(content(response)$error$message)
  }
  
  candidates <- content(response)$candidates
  outputs <- unlist(lapply(candidates, function(candidate) candidate$output))
  
  return(outputs)
  
}

prompt <- "R code to remove duplicates using dplyr."
cat(bard(prompt))
Output
```r
# Remove duplicate rows from a data frame

df <- data.frame(
  x = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  y = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
)

# Using `distinct()`

df %>% distinct()

# Using `filter()` and `duplicated()`

df %>% filter(!duplicated(x, y))
```

When you run the function above for the first time, it will ask you to enter your API key. The key is saved in the PALM_API_KEY environment variable, so you won't be asked for it again the next time you run the function. Sys.setenv() stores the API key, whereas Sys.getenv() retrieves the stored key.

Sys.setenv(PALM_API_KEY = "APIKey") # Set API Key
Sys.getenv("PALM_API_KEY") # Get API Key

By changing the candidate_count parameter, you can generate more than one response. This is useful when you want to rewrite some text: you can pick the response that best meets your expectations.

prompt <- "Be sure to respond with less verbosity. Rewrite the text. Text : 'There is always a debate about Python vs R but I love both of them equally. Am I a lovable person?'."
result <- bard(prompt, candidate_count = 3, temperature = 0.8)
cat(result, fill = TRUE, labels = paste0("{", 1:3, "}:"))
Output
{1}: People debate whether Python or R is better, but I like both. Does that make me likeable? 
{2}: Python and R are both great languages. I love them both equally. Does that make me lovable? 
{3}: Python and R are both great languages. It's good to be open-minded and like both.

Question and Answering

Suppose you have some documents and want to build a system where users can ask any question related to those documents, and the chatbot responds based on their content.

make_prompt <- function(query, relevant_passage) {
  escaped <- gsub("'", "", gsub('"', "", gsub("\n", " ", relevant_passage)))
  prompt <- sprintf("You are a helpful and informative bot that answers questions using text from the reference passage included below. \
  Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. \
  However, you are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone. \
  If the passage is irrelevant to the answer, you may ignore it.
  QUESTION: '%s'
  PASSAGE: '%s'

    ANSWER:
  ", query, escaped)
  
  return(prompt)
}
passage <- "Title: Is AI a Threat to Content Writers?\n Author: Deepanshu Bhalla\nFull article:\n Both Open source and commercial generative AI models have made content writing easy and quick. Now you can create content in a few mins which used to take hours."

query <- "Who is the author of this article?"
prompt <- make_prompt(query, passage)
cat(bard(prompt))

Output: Deepanshu Bhalla is the author of this article.

query <- "Summarize this article"
prompt <- make_prompt(query, passage)
cat(bard(prompt))
Output:
This article discusses whether AI is a threat to content writers. It argues that both open source and commercial generative AI models have made content writing easier and quicker, and that this could potentially put content writers out of a job.

R Function to Chat like Bard

There are many use cases where it is important for a chatbot to remember your previous questions in order to answer subsequent ones. For example, if you ask "What is 2+2?" and then follow up with another question, "What is the square of it?", it should understand your query and respond accordingly.

chat_bard <- function(prompt, 
                      temperature=0.25,
                      api_key=Sys.getenv("PALM_API_KEY"),
                      model="auto") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(PALM_API_KEY = api_key)
  }
  
  if(tolower(model)=="auto") {
    model <- find_model("generateMessage")
  }
  
  model_query <- paste0(model, ":generateMessage")
    
  # Add new message
  chatHistory <<- append(chatHistory, list(list(author = '0', content = prompt)))
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta2/", model_query),
    query = list(key = api_key),
    content_type_json(),
    body = toJSON(list(
      prompt = list(messages = chatHistory),
      temperature=temperature
    ),  auto_unbox = T))
  
  if(status_code(response)>200) {
    chatHistory <<- chatHistory[-length(chatHistory)]
    stop(content(response)$error$message)
  } else {
    answer <- content(response)$candidates[[1]]$content
    chatHistory <<- append(chatHistory, list(list(author = '1', content = answer)))
  }
  
  return(answer)
  
}
chatHistory <- list()
cat(chat_bard(prompt="3+5"))
cat(chat_bard(prompt="square of it"))
cat(chat_bard(prompt="Add 2 to it"))
Output
> chatHistory <- list()
> cat(chat_bard(prompt="3+5"))
3+5 is equal to 8.
> cat(chat_bard(prompt="square of it"))
The square of 8 is 64.
> cat(chat_bard(prompt="Add 2 to it"))
64 + 2 = 66.
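Under the hood, chat_bard keeps the whole conversation in the chatHistory list, alternating author "0" (you) and "1" (the model). A minimal sketch of that structure, using a mock history rather than a live API call:

```r
# chatHistory alternates author "0" (user) and "1" (model); after one
# exchange like the example above, it looks like this:
chatHistory <- list(
  list(author = "0", content = "3+5"),
  list(author = "1", content = "3+5 is equal to 8.")
)

# Inspect who said what
sapply(chatHistory, function(msg) msg$author)  # "0" "1"

# Start a fresh conversation by clearing the history
chatHistory <- list()
```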

How to Generate Text Embeddings

In this section, we will show you how to use the PaLM API to generate text embeddings. Embeddings let you search through a list of documents and find the ones most relevant to a question on a certain topic.

In the example below, we have three documents about AI. We want to identify the document most relevant to a question on an AI topic.

embedding_bard <- function(prompt, 
                 api_key=Sys.getenv("PALM_API_KEY"),
                 model="auto") {
  
  if(nchar(api_key)<1) {
    api_key <- readline("Paste your API key here: ")
    Sys.setenv(PALM_API_KEY = api_key)
  }

  if(tolower(model)=="auto") {
    model <- find_model("embedText")
  }
  
  model_query <- paste0(model, ":embedText")
  
  response <- POST(
    url = paste0("https://generativelanguage.googleapis.com/v1beta2/", model_query), 
    query = list(key = api_key),
    content_type_json(),
    encode = "json",
    body = list(text = prompt)
  )
  
  if(status_code(response)>200) {
    stop(content(response)$error$message)
  }
  
  return(unlist(content(response)))
  
}

DOCUMENT1 = "AI is like a smart helper in healthcare. It can find problems early by looking at lots of information, help doctors make plans just for you, and even make new medicines faster."
DOCUMENT2 = "AI needs to be open and fair. Sometimes, it can learn things that aren't right. We need to be careful and make sure it's fair for everyone. If AI makes a mistake, someone needs to take responsibility."
DOCUMENT3 = "AI is making school exciting. It can make learning fit you better, help teachers make fun lessons, and show when you need more help."
df <-  data.frame(Text = c(DOCUMENT1, DOCUMENT2, DOCUMENT3))

# Get the embeddings of each text
embedding_out <- list()
for(i in 1:nrow(df)) {
  result <- embedding_bard(df[i,"Text"])
  embedding_out[[i]] <- result
}

# Identify Most relevant document
query <-  "AI can generate misleading results many times."
scores_query <- embedding_bard(query)

# Calculate the dot products
dot_products <- sapply(embedding_out, function(x) sum(x * scores_query))

# Find the index of the maximum dot product to view the most relevant document
idx <- which.max(dot_products)
df$Text[idx]
Output
[1] "AI needs to be open and fair. Sometimes, it can learn things that aren't right. We need to be careful and make sure it's fair for everyone. If AI makes a mistake, someone needs to take responsibility."
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
