Twitter constantly produces a copious amount of data, and the locations of tweets can help answer some interesting questions. One that comes to mind, and that I plan to look at when the time is right, is: which parts of the world are more interested in the Champions League final, and which in the World Cup final? The motivation is the amount of debate currently happening on this topic.
Basically, this post answers the question "where in the world are people tweeting about [something]?" All of the explanation is in the code comments themselves.
Code
library(ROAuth)
library(twitteR)
library(ggplot2)
library(maps)
library(dismo)

# Check to see if R is connected to twitter
registerTwitterOAuth(twitCred)

# searchString parameter for Twitter API
key <- "#UCLfinal"

# requesting Twitter API
tag <- searchTwitter(key, n = 2000, lang = "en")

# Tweets data frame
# At this stage it is quite possible to get rid of all the non-geotagged tweets.
# However, a very very small portion of users geotag tweets. Therefore, another approach
# is used here. In the next step, the location in their profile description will
# be extracted.
df <- do.call("rbind", lapply(tag, as.data.frame))

# User data frame
userInfo <- do.call("rbind", lapply(lookupUsers(df$screenName), as.data.frame))

# geocoding all users with some sort of location identification
# also creating "Interpreted Place"
# All locations with invalid location will be dropped after this step
# Package dismo used here
# Although using oneRecord=TRUE decreases the size, it produces more
# reliable location data frame
locations <- geocode(userInfo$location, progress = "text", oneRecord = TRUE)

# getting rid of all rows with na
locations <- locations[complete.cases(locations), ]

# also getting rid of all tweets that only have country name as locations
# For example, it is good to avoid the center of Australia (which is very very sparse)
# showing massive number of tweets.
# The easiest way to do this is to get rid of all rows that do not have a comma
locations <- locations[grep("\\,", locations$interpretedPlace), ]

# Map of the world
# It is also possible to do the same with country/places maps.
# However, it is necessary to make sure that the coordinates are correct
result <- ggplot(map_data("world")) +
  geom_path(aes(x = long, y = lat, group = group))

# Adding Tweet locations
result <- result +
  geom_point(data = locations, aes(x = longitude, y = latitude),
             color = "red", alpha = .2, size = 3)
result <- result +
  ggtitle(key) +
  theme_minimal() +
  theme(axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank())

# Time Stamping the file name
filename <- paste(format(Sys.Date(), "%d%m%y"), format(Sys.time(), "%H%M%S"), ".png", sep = "")

# Saving the file
ggsave(filename, plot = result, units = "in", width = 8.15, height = 5.20, dpi = 300)
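To see what the two filtering steps in the code do without calling the Twitter or geocoding APIs, here is a minimal sketch on a made-up data frame. The rows and coordinates are hypothetical values invented for illustration; only the column names (interpretedPlace, longitude, latitude) are taken from the code above, and the real geocode() output is assumed to have more columns than this.

```r
# Hypothetical geocoded rows, shaped like a subset of the `locations`
# data frame used in the post. The values are invented for illustration.
locations <- data.frame(
  interpretedPlace = c("Sydney NSW, Australia", "Australia", "London, UK", NA),
  longitude        = c(151.21, 133.78, -0.13, NA),
  latitude         = c(-33.87, -25.27, 51.51, NA),
  stringsAsFactors = FALSE
)

# Step 1: drop rows where geocoding failed (any NA in the row).
locations <- locations[complete.cases(locations), ]

# Step 2: keep only places more specific than a bare country name,
# i.e. rows whose interpreted place contains a comma.
has_comma <- grepl(",", locations$interpretedPlace, fixed = TRUE)
locations <- locations[has_comma, ]

print(locations$interpretedPlace)
```

In this toy example the failed lookup and the bare "Australia" row are dropped, so nothing is plotted at the country's centroid. Using grepl() with fixed = TRUE is just one way to write the comma test; the grep() call in the post selects the same rows.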
After combining the result with the wordcloud code (covered in a separate blog post) and some Photoshop work, these are the final products.