In this tutorial, we will see how to extract and analyze Facebook data using R. Facebook has more than 1 billion active users and has gathered what may be the most extensive data set ever on human behavior. In R, we can extract data from Facebook and then analyze it. Social media mining is one of the most interesting areas of data science. For example, you can analyze sentiment around an important event by pulling information about the event from Facebook and deriving insights from the data in R.
Step by Step Guide : Extract Data from Facebook
[Image: Extract Facebook Data using R]
Step 1 : Facebook Developer Registration
Go to https://developers.facebook.com and register by clicking the Get Started button at the top right of the page (see the snapshot below). A registration form opens; fill it in to complete your registration.
[Image: Facebook Developer Registration]
Step 2 : Add a new App
Once you have registered as described in Step 1, click the My Apps button (see the snapshot below), then select Add a New App from the drop-down.
[Image: Facebook : My Apps]
Enter a Display Name for the app (any name will do), choose a Category from the drop-down (for example, Education), and press the Create App ID button.
[Image: Create a new App]
Step 3 : Get App ID and App Secret
Note down your App ID and App Secret (see the screenshot below).
[Image: Fb App ID and App Secret]
Step 4 : OAuth Settings
- On the left hand side menu, click on Add Product Button
- Click on Facebook Login link
- Under Settings, make sure YES is selected in Client OAuth Login
- Type http://localhost:1410/ in Valid OAuth redirect URIs box
- Click on Save Changes button
[Image: OAuth redirect URIs]
If these settings are not entered correctly, you will get the following error -
Can't Load URL: The domain of this URL isn't included in the app's domains. To be able to load this URL, add all domains and subdomains of your app to the App Domains field in your app settings.
Step 5 : Write R Script
1. Install required packages
Open R and install the Rfacebook and RCurl packages by running the following code. The Rfacebook package lets you access the Facebook API from R.
install.packages("Rfacebook")
install.packages("RCurl")
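If you rerun the script often, you may prefer to install only what is missing. The following is a minimal sketch, not part of the original tutorial; missing_packages() is a hypothetical helper name, and the install call is left commented out so the snippet has no side effects beyond loading namespaces.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("Rfacebook", "RCurl"))

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

requireNamespace() returns FALSE instead of raising an error when a package is absent, which is why it is used here rather than library().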
2. Load desired packages
In this step, we load the packages installed above.
library(Rfacebook)
library(RCurl)
3. Paste your app id and app secret below
fb_oauth <- fbOAuth(app_id="183xxxxxxxx3748", app_secret="7bfxxxxxxxxcf0", extended_permissions = TRUE)
Press ENTER in the R console or CTRL+ENTER in RStudio.
It returns the following message -
Copy and paste into Site URL on Facebook App Settings: http://localhost:1410/
When done, press any key to continue...
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort
[Image: Authentication in Browser]
[Image: Authentication Status]
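Since the App Secret should stay private, one option is to keep it out of the script altogether. The sketch below assumes you have stored the values in two environment variables; the names FB_APP_ID and FB_APP_SECRET and the helper read_fb_credentials() are hypothetical, not something Rfacebook defines.

```r
# Hypothetical helper: read the app credentials from environment variables
# (for example set in ~/.Renviron) instead of hard-coding them in the script
read_fb_credentials <- function() {
  app_id <- Sys.getenv("FB_APP_ID")
  app_secret <- Sys.getenv("FB_APP_SECRET")
  if (!nzchar(app_id) || !nzchar(app_secret)) {
    stop("Set FB_APP_ID and FB_APP_SECRET before calling fbOAuth()")
  }
  list(app_id = app_id, app_secret = app_secret)
}

# Usage (needs a real app and an interactive session):
# creds <- read_fb_credentials()
# fb_oauth <- fbOAuth(app_id = creds$app_id, app_secret = creds$app_secret,
#                     extended_permissions = TRUE)
```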
4. Check your profile account information
me <- getUsers("me",token=fb_oauth, private_info=TRUE)
me$name
[1] "Deepanshu Bhalla"
Fix : Error
Are you getting the error below?
Error in callAPI(query, token) : An active access token must be used to query information about the current user.
Facebook recently changed its API, which causes errors in some functions of the Rfacebook package. The workaround below redefines the fbOAuth() function to correct it.
Step 1 : Run the following program
library(httr)  # provides oauth_callback(), oauth_endpoint(), oauth_app(), oauth2.0_token(), GET() and config()
fbOAuth <- function(app_id, app_secret, extended_permissions=FALSE, legacy_permissions=FALSE, scope=NULL)
{
  ## getting callback URL
  full_url <- oauth_callback()
  full_url <- gsub("(.*localhost:[0-9]{1,5}/).*", x=full_url, replacement="\\1")
  message <- paste("Copy and paste into Site URL on Facebook App Settings:",
                   full_url, "\nWhen done, press any key to continue...")
  ## prompting user to introduce callback URL in app page
  invisible(readline(message))
  ## a simplified version of the example in httr package
  facebook <- oauth_endpoint(
    authorize = "https://www.facebook.com/dialog/oauth",
    access = "https://graph.facebook.com/oauth/access_token")
  myapp <- oauth_app("facebook", app_id, app_secret)
  ## choosing the permissions to request when no scope is supplied
  if (is.null(scope)) {
    if (extended_permissions==TRUE){
      scope <- c("user_birthday", "user_hometown", "user_location", "user_relationships",
                 "publish_actions","user_status","user_likes")
    }
    else { scope <- c("public_profile", "user_friends")}
    if (legacy_permissions==TRUE) {
      scope <- c(scope, "read_stream")
    }
  }
  if (packageVersion('httr') < "1.2"){
    stop("Rfacebook requires httr version 1.2.0 or greater")
  }
  ## with early httr versions
  if (packageVersion('httr') <= "0.2"){
    facebook_token <- oauth2.0_token(facebook, myapp,
                                     scope=scope)
    fb_oauth <- sign_oauth2.0(facebook_token$access_token)
    if (GET("https://graph.facebook.com/me", config=fb_oauth)$status==200){
      message("Authentication successful.")
    }
  }
  ## less early httr versions
  if (packageVersion('httr') > "0.2" & packageVersion('httr') <= "0.6.1"){
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }
  ## httr version from 0.6 to 1.1
  if (packageVersion('httr') > "0.6.1" & packageVersion('httr') < "1.2"){
    Sys.setenv("HTTR_SERVER_PORT" = "1410/")
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }
  ## httr version after 1.2
  if (packageVersion('httr') >= "1.2"){
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }
  ## identifying API version of token
  error <- tryCatch(callAPI('https://graph.facebook.com/pablobarbera', fb_oauth),
                    error = function(e) e)
  if (inherits(error, 'error')){
    class(fb_oauth)[4] <- 'v2'
  }
  if (!inherits(error, 'error')){
    class(fb_oauth)[4] <- 'v1'
  }
  return(fb_oauth)
}
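To see what the gsub() call near the top of this function does, the sketch below applies the same regular expression to two sample URLs. The helper name strip_callback() and the sample URLs are assumptions made for illustration; the pattern itself is copied from the function above.

```r
# Hypothetical helper wrapping the pattern used inside fbOAuth():
# keep everything up to and including "localhost:<port>/" and drop the rest
strip_callback <- function(full_url) {
  gsub("(.*localhost:[0-9]{1,5}/).*", x = full_url, replacement = "\\1")
}

# A callback URL with a trailing query string is trimmed back to the base URL
with_query <- strip_callback("http://localhost:1410/?code=abc123")

# A URL that is already in base form is returned unchanged
already_clean <- strip_callback("http://localhost:1410/")
```

Both calls give http://localhost:1410/, which matches the value entered under Valid OAuth redirect URIs in Step 4.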
Step 2 : Run the fbOAuth function again
Make sure you replace app_id and app_secret with your own values before running the code below.
fb_oauth <- fbOAuth(app_id="183385******33748", app_secret="7bf18f8********4cf7def77cf0", extended_permissions = TRUE)
The getUsers() function will now work.
5. List of all the pages you have liked
Suppose you want to see all the pages you have liked in the past.
likes = getLikes(user="me", token = fb_oauth)
The sample() function is used to list 10 random pages you have liked.
sample(likes$names, 10)
[1] "The Hindu" "ADGPI - Indian Army" "Brain Humor"
[4] "Jokes Corner" "The New York Times" "Oye! Extra Pen Hai?"
[7] "So You Think You Can Dance" "Shankar Tucker" "Rihanna"
[10] "Lindsey Stirling"
6. Update Facebook Status from R
You can also update your Facebook status from R.
updateStatus("this is just a test", token=fb_oauth)
7. Search Pages that contain a particular keyword
pages <- searchPages(string="trump", token=fb_oauth, n=200)
The code above tells R to search for all pages that contain 'trump' as a keyword. The n=200 argument sets the number of pages to return.
It returns 16 variables. See the list of variables -
[1] "id" "about" "category"
[4] "description" "general_info" "likes"
[7] "link" "city" "state"
[10] "country" "latitude" "longitude"
[13] "name" "talking_about_count" "username"
[16] "website"
head(pages$name)
[1] "Donald J. Trump" "Ivanka Trump"
[3] "President Donald Trump Fan Club" "President Donald J. Trump"
[5] "Donald Trump Is My President" "Donald Trump For President"
8. Extract list of posts from a Facebook page
The code below retrieves the statuses posted by BBC News. The Facebook page name of BBC News is bbcnews.
page <- getPage(page="bbcnews", token=fb_oauth, n=200)
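[Image: Posts Details]
The image above is truncated. In total, it returns 11 variables. See the list of variables -
[1] "from_id" "from_name" "message" "created_time"
[5] "type" "link" "id" "story"
[9] "likes_count" "comments_count" "shares_count"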
9. Get all the posts from a particular date
You can also specify the start and end dates of the posts you want to extract.
page <- getPage("bbcnews", token=fb_oauth, n=100,
                since='2016/06/01', until='2017/03/20')
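Once posts are downloaded, a similar date window can be applied locally through the created_time column. The sketch below uses a small hypothetical data frame in place of real getPage() output and assumes created_time is a string such as "2016-08-12T18:40:00+0000"; check the format of your own data before relying on it.

```r
# Hypothetical stand-in for the data frame returned by getPage()
posts <- data.frame(
  message = c("Post A", "Post B", "Post C"),
  created_time = c("2016-05-30T10:15:00+0000",
                   "2016-08-12T18:40:00+0000",
                   "2017-03-25T07:05:00+0000"),
  stringsAsFactors = FALSE
)

# Parse the timestamp string (assumed format) and keep only the calendar date
posts$date <- as.Date(as.POSIXct(posts$created_time,
                                 format = "%Y-%m-%dT%H:%M:%S+0000", tz = "GMT"))

# Keep the posts that fall between the two dates used in the getPage() call
in_window <- posts$date >= as.Date("2016-06-01") & posts$date <= as.Date("2017-03-20")
posts_in_window <- posts[in_window, ]
```

Filtering locally like this lets you try several date windows without calling the API again.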
10. Which of these posts got maximum likes?
To find the most popular BBC News post, run the following code.
summary = page[which.max(page$likes_count),]
summary$message
[1] "Could circular runways take off? (via BBC World Hacks)"
11. Which of these posts got maximum comments?
Some posts are not so popular in terms of likes but attract the most comments, possibly because they are controversial.
summary1 = page[which.max(page$comments_count),]
summary1$message
[1] "When Angela Merkel met Donald J. Trump, did her reactions speak louder than words?"
12. Which post was shared the most?
summary2 = page[which.max(page$shares_count),]
summary2$message
[1] "Islam will be the world's largest religion by 2070, new research suggests."
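The three lookups above follow the same pattern, which can be tried without an access token. The sketch below uses a hypothetical data frame, page_demo, that mimics only the columns involved; the messages and counts are made up for illustration.

```r
# Hypothetical stand-in for the data frame returned by getPage()
page_demo <- data.frame(
  message = c("Post A", "Post B", "Post C", "Post D"),
  likes_count = c(120, 950, 430, 75),
  comments_count = c(300, 40, 85, 10),
  shares_count = c(15, 60, 220, 5),
  stringsAsFactors = FALSE
)

# which.max() returns the row index of the first maximum value
most_liked <- page_demo[which.max(page_demo$likes_count), ]
most_commented <- page_demo[which.max(page_demo$comments_count), ]
most_shared <- page_demo[which.max(page_demo$shares_count), ]

# order() generalises the idea to a top-N ranking (here the two most-liked posts)
top2_liked <- page_demo[order(page_demo$likes_count, decreasing = TRUE)[1:2], ]
```

Note that which.max() reports only one row even when several posts tie for the maximum, so the order() form is safer when ties matter.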
13. Extract a list of users who liked the most-liked post
For marketing or for growing a website, it is important to know which users liked a certain post.
post <- getPost(summary$id[1], token=fb_oauth, comments = FALSE, n.likes=2000)
To view the list of people:
likes <- post$likes
head(likes)
Result -
from_name from_id
Tommy Johnson 10154527932013108
Mirtunjay Raj 399490251425210
Sony Joseph 142559101272027
Note - I have edited the IDs to maintain privacy
14. Extract FB comments on a specific post
To know what users think about a post, it is important to analyze their comments.
post <- getPost(page$id[1], token=fb_oauth, n.comments=1000, likes=FALSE)
comments <- post$comments
fix(comments)
15. What is the comment that got the most likes?
comments[which.max(comments$likes_count),]
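Beyond the single most-liked comment, a simple word count gives a rough picture of what commenters are talking about. The sketch below works on a hypothetical comments_demo data frame with made-up text; it is a basic base-R tokenisation, not a full text-mining pipeline.

```r
# Hypothetical stand-in for post$comments (only the columns used here)
comments_demo <- data.frame(
  message = c("Great story, great reporting",
              "Not sure this story is true",
              "Great!"),
  likes_count = c(12, 30, 4),
  stringsAsFactors = FALSE
)

# Most-liked comment, using the same pattern as above
top_comment <- comments_demo[which.max(comments_demo$likes_count), ]

# Lower-case the text, replace every run of non-letters with a space, split into words
clean <- gsub("[^a-z]+", " ", tolower(comments_demo$message))
words <- unlist(strsplit(trimws(clean), " +"))

# Count each word and sort from most to least frequent
word_freq <- sort(table(words), decreasing = TRUE)
head(word_freq, 3)
```

On real comments you would normally also drop common stop words such as "the" or "is" before counting.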
16. What are the most common first names in the user list?
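# Assumption: users holds the profiles of the people who liked the post in section 13
users <- getUsers(likes$from_id, token=fb_oauth)
head(sort(table(users$first_name), decreasing=TRUE), n=3)
David  John Daniel
   14    13     10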
17. Extract Reactions for most recent post
Facebook offers more than a Like button: last year it launched emoji reactions. If a post gets 1k likes, it does not mean everyone really loves the post. The reaction can be happy, sad or angry.
post <- getReactions(post=page$id[1], token=fb_oauth)
Result - love_count = 60, haha_count = 286, wow_count = 62, sad_count = 169, angry_count = 532
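Raw counts are easier to compare as percentages. The sketch below copies the five counts shown above into a named vector (the vector layout is an assumption; getReactions() returns its own structure) and computes each reaction's share.

```r
# Reaction counts copied from the result above, arranged as a named vector
reactions <- c(love = 60, haha = 286, wow = 62, sad = 169, angry = 532)

# Share of each reaction as a percentage of these five reaction types
share <- round(100 * reactions / sum(reactions), 1)

# Name of the most frequent reaction
dominant <- names(reactions)[which.max(reactions)]
```

Here angry is the dominant reaction, at roughly 48% of the 1109 reactions counted.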
18. Get Posts of a particular group
First, the searchGroup() function looks up the ID of the group from which you want to pull posts. The group ID is then passed as an input value to the getGroup() function.
# Extract posts from Machine Learning Facebook group
ids <- searchGroup(name="machinelearningforum", token=fb_oauth)
group <- getGroup(group_id=ids[1,]$id, token=fb_oauth, n=25)
If the searchGroup() function cannot find the group ID, you can look it up on the lookup-id website.
End Notes
Text mining of social media has gained a lot of interest in the last couple of years. Companies have started analyzing customers' opinions about their products and what customers say about them on social media. This helps marketing teams define marketing strategies and development teams modify upcoming products based on customer feedback.