Facebook Data Mining using R

In this tutorial, we will see how to extract and analyze Facebook data using R. Facebook has more than 1 billion active users and has gathered one of the most extensive data sets ever about human behavior. In R, we can extract data from Facebook and then analyze it. Social media mining is one of the most interesting areas of data science. For example, you can analyze sentiment around an important event by pulling information about the event from Facebook and drawing insights from the data in R.
Step by Step Guide : Extract Data from Facebook

Step 1 : Facebook Developer Registration

Go to https://developers.facebook.com and register by clicking the Get Started button at the top right of the page (see the snapshot below). A registration form opens, which you need to fill in to get registered.
Facebook Developer Registration



Step 2 : Add a new App

Once you are done with registration as shown in step 1, click the My Apps button (check out the snapshot below). Then select Add a New App from the drop-down.

Facebook : My Apps

Then enter a Display Name for the app (type any name) and select a Category from the drop-down (choose Education). Press the Create App ID button.
Create a new App

Step 3 : Get App ID and App Secret

In this step, we need to note down our App ID and App Secret (refer to the screenshot below).
Fb App ID and App Secret

Step 4 : OAuth Settings

  1. On the left hand side menu, click on Add Product Button
  2. Click on Facebook Login link
  3. Under Settings, make sure YES is selected in Client OAuth Login
  4. Type http://localhost:1410/ in Valid OAuth redirect URIs box
  5. Click on Save Changes button

OAuth redirect URIs

If you don't enter this information correctly, you will get the following error -
Can't Load URL: The domain of this URL isn't included in the app's domains. To be able to load this URL, add all domains and subdomains of your app to the App Domains field in your app settings. 
Step 5 :  Write R Script

1. Install required packages

Go to R and install Rfacebook and RCurl packages. Run the following code to install them.
install.packages("Rfacebook")
install.packages("RCurl")
The Rfacebook package lets you access your Facebook app via R.

2. Load desired packages

In this step, we will load the above installed packages.
library(Rfacebook)
library(RCurl)
3. Paste your app id and app secret below 
fb_oauth <- fbOAuth(app_id="183xxxxxxxx3748", app_secret="7bfxxxxxxxxcf0",extended_permissions = TRUE)
Press ENTER in R Console or CTRL+ENTER in R Studio.

It returns the following message -
Copy and paste into Site URL on Facebook App Settings: http://localhost:1410/ 
When done, press any key to continue...
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort

Authentication in Browser

Authentication Status
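
The call in step 3 writes the app secret directly into the script. A minimal sketch of one alternative is to read both values from environment variables, so the script can be shared without the secret in it. The variable names FB_APP_ID and FB_APP_SECRET and the helper read_fb_credentials() are hypothetical names chosen for this illustration; they are not part of Rfacebook.

```r
# Read the Facebook app credentials from environment variables.
# FB_APP_ID and FB_APP_SECRET are hypothetical names used only in this sketch.
read_fb_credentials <- function() {
  app_id <- Sys.getenv("FB_APP_ID", unset = "")
  app_secret <- Sys.getenv("FB_APP_SECRET", unset = "")
  if (!nzchar(app_id) || !nzchar(app_secret)) {
    stop("Set FB_APP_ID and FB_APP_SECRET before calling fbOAuth()")
  }
  list(app_id = app_id, app_secret = app_secret)
}

# Demo with the masked placeholder values from step 3.
# In practice the two variables could be set once in ~/.Renviron instead.
Sys.setenv(FB_APP_ID = "183xxxxxxxx3748", FB_APP_SECRET = "7bfxxxxxxxxcf0")
creds <- read_fb_credentials()

# The values can then be passed on exactly as in step 3:
# fb_oauth <- fbOAuth(app_id = creds$app_id, app_secret = creds$app_secret,
#                     extended_permissions = TRUE)
```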

4. Check your profile account information
me <- getUsers("me",token=fb_oauth, private_info=TRUE)
me$name
[1] "Deepanshu Bhalla"

Fix : Error

Are you getting the error below?
Error in callAPI(query, token) :  An active access token must be used to query information about the current user.
Facebook recently changed its API, which causes errors in some functions of the Rfacebook package. Use the method below to correct it.

Step 1 : Run the following program
fbOAuth <- function(app_id, app_secret, extended_permissions=FALSE, legacy_permissions=FALSE, scope=NULL)
{
  ## getting callback URL
  full_url <- oauth_callback()
  full_url <- gsub("(.*localhost:[0-9]{1,5}/).*", x=full_url, replacement="\\1")
  message <- paste("Copy and paste into Site URL on Facebook App Settings:",
                   full_url, "\nWhen done, press any key to continue...")
  ## prompting user to introduce callback URL in app page
  invisible(readline(message))
  ## a simplified version of the example in httr package
  facebook <- oauth_endpoint(
    authorize = "https://www.facebook.com/dialog/oauth",
    access = "https://graph.facebook.com/oauth/access_token")
  myapp <- oauth_app("facebook", app_id, app_secret)
  if (is.null(scope)) {
    if (extended_permissions==TRUE){
      scope <- c("user_birthday", "user_hometown", "user_location", "user_relationships",
                 "publish_actions","user_status","user_likes")
    }
    else { scope <- c("public_profile", "user_friends")}
 
    if (legacy_permissions==TRUE) {
      scope <- c(scope, "read_stream")
    }
  }

  if (packageVersion('httr') < "1.2"){
    stop("Rfacebook requires httr version 1.2.0 or greater")
  }

  ## with early httr versions
  if (packageVersion('httr') <= "0.2"){
    facebook_token <- oauth2.0_token(facebook, myapp,
                                     scope=scope)
    fb_oauth <- sign_oauth2.0(facebook_token$access_token)
    if (GET("https://graph.facebook.com/me", config=fb_oauth)$status==200){
      message("Authentication successful.")
    }
  }

  ## less early httr versions
  if (packageVersion('httr') > "0.2" & packageVersion('httr') <= "0.6.1"){
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }

  ## httr version from 0.6 to 1.1
  if (packageVersion('httr') > "0.6.1" & packageVersion('httr') < "1.2"){
    Sys.setenv("HTTR_SERVER_PORT" = "1410/")
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }

  ## httr version after 1.2
  if (packageVersion('httr') >= "1.2"){
    fb_oauth <- oauth2.0_token(facebook, myapp,
                               scope=scope, cache=FALSE)
    if (GET("https://graph.facebook.com/me", config(token=fb_oauth))$status==200){
      message("Authentication successful.")
    }
  }

  ## identifying API version of token
  error <- tryCatch(callAPI('https://graph.facebook.com/pablobarbera', fb_oauth),
                    error = function(e) e)
  if (inherits(error, 'error')){
    class(fb_oauth)[4] <- 'v2'
  }
  if (!inherits(error, 'error')){
    class(fb_oauth)[4] <- 'v1'
  }

  return(fb_oauth)
}
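
To see which permissions the function above requests for each combination of arguments, the scope logic can be pulled out on its own. This is only a sketch that mirrors the if/else block in the listing; choose_scope() is a hypothetical helper name, and nothing here contacts Facebook. The last line also illustrates how the httr version checks compare a package version against a string.

```r
# Mirror of the scope selection inside fbOAuth() above (no network access).
# choose_scope() is a hypothetical name used only for this illustration.
choose_scope <- function(extended_permissions = FALSE, legacy_permissions = FALSE) {
  if (extended_permissions) {
    scope <- c("user_birthday", "user_hometown", "user_location",
               "user_relationships", "publish_actions", "user_status",
               "user_likes")
  } else {
    scope <- c("public_profile", "user_friends")
  }
  if (legacy_permissions) {
    scope <- c(scope, "read_stream")
  }
  scope
}

basic_scope <- choose_scope()
extended_scope <- choose_scope(extended_permissions = TRUE)
legacy_scope <- choose_scope(extended_permissions = TRUE, legacy_permissions = TRUE)

# Version objects compare component by component, not as plain text.
old_httr <- package_version("0.6.1") < "1.2"
```

With extended_permissions = TRUE, as used in this tutorial, the token asks for user_likes and publish_actions, which steps 5 and 6 rely on.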

Step 2 :  Run fbOAuth function again

Make sure you put in your own app_id and app_secret before running the code below.
fb_oauth <- fbOAuth(app_id="183385******33748", app_secret="7bf18f8********4cf7def77cf0",extended_permissions = TRUE)

Now, the getUsers() function will work.


5. List of all the pages you have liked

Suppose you want to see all the pages you have liked in the past.
likes = getLikes(user="me", token = fb_oauth)
sample(likes$names, 10)
The sample() function is used to list 10 random pages you have liked.

 [1] "The Hindu"                  "ADGPI - Indian Army"        "Brain Humor"            
 [4] "Jokes Corner"               "The New York Times"         "Oye! Extra Pen Hai?"    
 [7] "So You Think You Can Dance" "Shankar Tucker"             "Rihanna"                
[10] "Lindsey Stirling"


6. Update Facebook Status from R

You can also update status in Facebook via R.
updateStatus("this is just a test", token=fb_oauth)

7. Search Pages that contain a particular keyword
pages <- searchPages( string="trump", token=fb_oauth, n=200)
In the above code, we are telling R to search for all pages that contain 'trump' as a keyword. n=200 is the number of pages to return.

It returns 16 variables. See the list of variables -

[1] "id"                  "about"               "category"        
 [4] "description"         "general_info"        "likes"            
 [7] "link"                "city"                "state"            
[10] "country"             "latitude"            "longitude"        
[13] "name"                "talking_about_count" "username"        
[16] "website"
head(pages$name)
[1] "Donald J. Trump"                 "Ivanka Trump"                
[3] "President Donald Trump Fan Club" "President Donald J. Trump"    
[5] "Donald Trump Is My President"    "Donald Trump For President"  
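
Once the search result is in a data frame, the pages can be ranked and summarised with base R. The sketch below uses a small made-up data frame, pages_demo, that borrows four of the column names listed above; the page names and numbers are invented for illustration and are not real search results.

```r
# Made-up stand-in for the data frame returned by searchPages().
pages_demo <- data.frame(
  name = c("Page A", "Page B", "Page C", "Page D"),
  category = c("Politician", "Community", "Politician", "News"),
  likes = c(1200, 45000, NA, 300),
  talking_about_count = c(80, 5100, 12, 9),
  stringsAsFactors = FALSE
)

# Rank pages by likes; na.last = NA drops pages whose like count is missing.
ranked <- pages_demo[order(pages_demo$likes, decreasing = TRUE, na.last = NA), ]
top_pages <- head(ranked[, c("name", "category", "likes")], 3)

# How many pages fall into each category?
category_counts <- table(pages_demo$category)
```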


8. Extract list of posts from a Facebook page

See the statuses posted by BBC News. The Facebook page name of BBC News is bbcnews.
page <- getPage(page="bbcnews", token=fb_oauth, n=200)
Posts Details
The above image is truncated. In total, it returns 11 variables. See the list of variables -

 [1] "from_id"        "from_name"      "message"        "created_time"
 [5] "type"           "link"           "id"             "story"      
 [9] "likes_count"    "comments_count" "shares_count"
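
The created_time column comes back as text, so it has to be parsed before posts can be grouped by day or month. A minimal sketch, assuming timestamps of the form 2017-03-20T17:40:12+0000 (this format is an assumption; adjust the format string if your data differs). page_demo is a made-up stand-in for the result of getPage(), and parse_fb_time() is a hypothetical helper name.

```r
# Made-up stand-in for the data frame returned by getPage().
page_demo <- data.frame(
  created_time = c("2017-03-18T09:15:00+0000",
                   "2017-03-20T17:40:12+0000",
                   "2017-02-02T08:00:00+0000"),
  likes_count = c(120, 340, 75),
  stringsAsFactors = FALSE
)

# Convert the text timestamps to date-times (assumed to be in GMT).
parse_fb_time <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S+0000", tz = "GMT")
}

page_demo$datetime <- parse_fb_time(page_demo$created_time)
page_demo$month <- format(page_demo$datetime, "%Y-%m")

# Total likes per month.
likes_by_month <- aggregate(likes_count ~ month, data = page_demo, FUN = sum)
```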

9. Get all the posts from a particular date

You can also specify the start and end dates of the posts you want to extract.
page <- getPage("bbcnews", token=fb_oauth, n=100,
since='2016/06/01', until='2017/03/20')
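
The since and until values above are typed by hand as 'YYYY/MM/DD' strings. For a rolling window, such as the 30 days before a given date, they can be built from Date objects instead. date_window() is a hypothetical helper written for this sketch, not an Rfacebook function.

```r
# Build since/until strings in the 'YYYY/MM/DD' form used above.
# date_window() is a hypothetical helper, not part of Rfacebook.
date_window <- function(end = Sys.Date(), days = 30) {
  stopifnot(days > 0)
  list(
    since = format(end - days, format = "%Y/%m/%d"),
    until = format(end, format = "%Y/%m/%d")
  )
}

w <- date_window(as.Date("2017-03-20"), days = 30)

# The strings can then be passed on in the same way as above:
# page <- getPage("bbcnews", token=fb_oauth, n=100,
#                 since=w$since, until=w$until)
```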

10. Which of these posts got maximum likes?

To find the most popular BBC News post, you can run the following lines of code.
summary = page[which.max(page$likes_count),]
summary$message
[1] "Could circular runways take off? (via BBC World Hacks)"
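
Note that which.max() returns only the first row that reaches the maximum, so ties are silently dropped. A short sketch of two variations follows: listing every post tied for the maximum, and ranking the top n posts by any count column. page_demo is a made-up data frame that uses the column names listed in step 8, and top_posts() is a hypothetical helper.

```r
# Made-up stand-in for the data frame returned by getPage().
page_demo <- data.frame(
  id = c("p1", "p2", "p3", "p4"),
  message = c("Post one", "Post two", "Post three", "Post four"),
  likes_count = c(150, 900, 900, 20),
  comments_count = c(30, 10, 400, 5),
  shares_count = c(12, 70, 45, 1),
  stringsAsFactors = FALSE
)

# which.max() keeps only the first of the two posts with 900 likes.
first_max <- page_demo[which.max(page_demo$likes_count), ]

# All posts tied for the maximum number of likes.
all_max <- page_demo[page_demo$likes_count == max(page_demo$likes_count), ]

# Top n posts by a chosen metric column.
top_posts <- function(df, metric, n = 3) {
  ranked <- df[order(df[[metric]], decreasing = TRUE), ]
  head(ranked[, c("id", "message", metric)], n)
}

top_liked <- top_posts(page_demo, "likes_count", n = 2)
top_commented <- top_posts(page_demo, "comments_count", n = 1)
```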

11. Which of these posts got maximum comments?

Some posts are not so popular in terms of likes but attract the most comments. It might be because they are controversial.
summary1 = page[which.max(page$comments_count),]
summary1$message
[1] "When Angela Merkel met Donald J. Trump, did her reactions speak louder than words?"

12. Which post was shared the most?
summary2 = page[which.max(page$shares_count),]
summary2$message
[1] "Islam will be the world's largest religion by 2070, new research suggests."

13. Extract a list of users who liked the maximum liked posts

In terms of marketing or growth of a website, it is very important to know about the users who liked a certain post.
post <- getPost(summary$id[1], token=fb_oauth, comments = FALSE, n.likes=2000)
To view the list of people:
likes <- post$likes
head(likes)
Result - 
from_name           from_id
Tommy Johnson 10154527932013108
Mirtunjay Raj   399490251425210
Sony Joseph   142559101272027

Note - I have edited the IDs to maintain privacy


14. Extract FB comments on a specific post

To know what users think about a post, it is important to analyze their comments.
post <- getPost(page$id[1], token=fb_oauth, n.comments=1000, likes=FALSE)
comments <- post$comments
fix(comments)

15. What is the comment that got the most likes?
comments[which.max(comments$likes_count),]
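
Beyond the single most-liked comment, a quick word count gives a rough first look at what people are talking about. The sketch below uses only base R and a made-up comments_demo data frame with the message and likes_count columns used above; the comment texts are invented. It ignores stop words and non-English text, so treat it as a starting point rather than a full text-mining workflow.

```r
# Made-up stand-in for post$comments.
comments_demo <- data.frame(
  message = c("Great idea, great runway!",
              "Not a great idea.",
              "Would this runway work in wind?"),
  likes_count = c(5, 12, 3),
  stringsAsFactors = FALSE
)

# Lower-case the text and split it on anything that is not a letter.
words <- unlist(strsplit(tolower(comments_demo$message), "[^a-z]+"))

# Drop empty strings and very short words.
words <- words[nchar(words) > 2]

# Most frequent words first.
word_freq <- sort(table(words), decreasing = TRUE)
head(word_freq, 5)
```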
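16. What are the most common first names in the user list?
users <- getUsers(likes$from_id, token=fb_oauth)  # assumption: users holds the profiles of the people who liked the post in step 13
head(sort(table(users$first_name), decreasing=TRUE), n=3)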
  David   John Daniel
    14     13     10
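
The count above reads first_name from a users data frame. When only the from_name column of the likes list is available, a rough alternative is to take everything before the first space as the first name. This is a sketch on made-up names; it will misread names that do not follow a "first last" pattern.

```r
# Made-up stand-in for post$likes (names and IDs are invented).
likes_demo <- data.frame(
  from_name = c("David Example", "John Sample", "David Placeholder", "Maria Test"),
  from_id = c("1001", "1002", "1003", "1004"),
  stringsAsFactors = FALSE
)

# Keep the text before the first space as the first name.
first_names <- sub(" .*$", "", likes_demo$from_name)

# Most common first names first.
name_counts <- sort(table(first_names), decreasing = TRUE)
head(name_counts, n = 3)
```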

17. Extract Reactions for most recent post

Facebook offers more than a like button. Last year, it launched reactions (emoji). If a post got 1k likes, it does not mean everyone really loves the post. The reaction can be happy, sad or angry.
post <- getReactions(post=page$id[1], token=fb_oauth)
love_count = 60, haha_count = 286, wow_count = 62, sad_count = 169, angry_count = 532
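
Raw counts are easier to compare as percentages. The sketch below takes the five reaction counts shown above, typed in by hand as a named vector, and works out each reaction's share. The likes count is left out because it is not shown in the output above.

```r
# Reaction counts copied from the output above.
reactions <- c(love_count = 60, haha_count = 286, wow_count = 62,
               sad_count = 169, angry_count = 532)

# Share of each reaction among these five, in percent.
reaction_share <- round(100 * reactions / sum(reactions), 1)

# Name of the most common reaction.
dominant <- names(which.max(reactions))

reaction_share
dominant
```

For this post, angry is the most common of the five reactions, at close to half of the total.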

18. Get Posts of a particular group

First, the searchGroup() function finds the ID of the group from which you want to pull posts. The group ID is then used as an input value in the getGroup() function.
# Extract posts from Machine Learning Facebook group
ids <- searchGroup(name="machinelearningforum", token=fb_oauth)
group <- getGroup(group_id=ids[1,]$id, token=fb_oauth, n=25)
In case the searchGroup() function cannot find the group ID, you can search for it on the lookup-id website.

End Notes

Social text mining has gained a lot of interest in the last couple of years. Many companies have started analyzing customers' opinions about their products and what customers say about the company on social media. It helps the marketing team define marketing strategies and the development team modify upcoming products based on customer feedback.


About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.



30 Responses to "Facebook Data Mining using R"

  1. this step bombed:

    me <- getUsers("me", token=fb_Oauth, private_info=TRUE)

    Error in callAPI(query, token) :
    An active access token must be used to query information about the current user.

    Replies
    1. I have added a method in the post to fix this error.
    2. Yes, you need to run both the steps - Step 1 and Step 2. Run one by one. Make sure you change app_id and app_secret number before running step2 code.

  2. me <- getUsers("me", token=fb_Oauth, private_info=TRUE)

    Error in callAPI(query, token) :
    An active access token must be used to query information about the current user.

    Replies
    1. facing the same issue.
      any fix for this pls?????
    2. @Deepanshu....thank you for the original post and then for the fix as well. Will test it out and let you know. Looking forward to more learning. Thanks again.
    3. @ Deepanshu.
      I just run the fix. It works. awesome . thanks a ton!

  3. @ Step 7 :
    I am getting this error:
    In vect[notnulls] <- unlist(lapply(lst, function(x) x[[field]])) :
    number of items to replace is not a multiple of replacement length

    Replies
    1. It's a warning. It returns information about public pages that match the keyword. It seems to be some issues with handling missing data in this function.
    2. I also face the same error admin please help out to resolve
    3. iam also facing same issue with searchpages method.due to this iam unable to proceed furthur.can u plz help me

  4. Thanks very much for the fix you've published!

  5. it's a very useful tutorial! it helped me a lot, thank you very much

  6. Amazing content! It was very helpful, especially your error handling

  7. Great content..!

  14. when i do step 8, the only output i get is
    "25 posts 50 posts 75 posts 100 posts 125 posts 150 posts 175 posts 200 posts" instead of the list. help me sir

  15. Any method for deriving a list of users who have liked a particular facebook page?

  16. we get only name and id how can we get emails ?

  18. I seem to be stuck and running in loops
    I am trying to scrape a list of facebook posts, NOT FROM A PARTICULAR GROUP, and extract basic information like SHares, Likes and date of activity. It seems that the Rfacebook packages may be limited is that correct?