Monday, 19 November 2012

Saving Tweets to MongoDB using Java

MongoDB is one of the fastest-growing NoSQL databases. Apart from the general advantages of NoSQL technology, its JSON-style queries and easy installation make it an attractive choice. In this article I'm going to show you how to save tweets into MongoDB using Java.

Here I'm going to use the Twitter Search API to get the tweets as JSON and save them to the database. I previously tried this with MySQL, and I think MongoDB has some clear advantages:
  • We don't have to create a schema for the database where we are going to store the tweets. If you look closely at the structure of the data returned by the search API, you would need at least five tables to model it in an RDBMS.
  • Each tweet has a different set of attributes associated with it. Some contain mention information, some contain retweet information, geolocation, URLs and a lot more. All of this can be stored without designing a complex database schema.
  • The MongoDB drivers have built-in support for JSON. Since JSON is the format of choice for many web services, this is a great advantage.
  • Twitter may add or remove some of the properties in the result returned by the search API. This will not affect our program, because each tweet is stored exactly as it arrives.
  • Querying the database becomes easier and more efficient. Each tweet is stored as a single document, so we don't need joins any more.
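To make the schema-less point a bit more concrete, here is a minimal in-memory sketch. It uses plain Java maps as stand-ins for stored documents and does not touch MongoDB or its driver at all. The class name, the helper names and the field names (`from_user`, `text`, `geo`) are my own assumptions for illustration; check them against the actual search API response.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SchemalessSketch {

    // Builds a "tweet" as a free-form map. The field names are assumptions.
    static Map<String, Object> tweet(String user, String text) {
        Map<String, Object> t = new HashMap<String, Object>();
        t.put("from_user", user);
        t.put("text", text);
        return t;
    }

    // Returns the tweets that carry the given field, whatever else they hold.
    static List<Map<String, Object>> withField(
            List<Map<String, Object>> tweets, String field) {
        List<Map<String, Object>> out = new ArrayList<Map<String, Object>>();
        for (Map<String, Object> t : tweets) {
            if (t.containsKey(field)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> tweets = new ArrayList<Map<String, Object>>();

        // A tweet with only the basic fields
        tweets.add(tweet("alice", "Trying out MongoDB"));

        // A tweet that also carries a nested location; no schema change needed
        Map<String, Object> located = tweet("bob", "Hello from the conference");
        Map<String, Object> geo = new HashMap<String, Object>();
        geo.put("type", "Point");
        located.put("geo", geo);
        tweets.add(located);

        // Only the second tweet has a "geo" field
        System.out.println(withField(tweets, "geo").size()); // prints 1
    }
}
```

Both tweets sit in the same list even though they have different shapes, and the nested location travels with its tweet instead of living in a separate table.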
You can download the Java driver for MongoDB here. The following simple code is enough to do our task.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.util.JSON;

public class Main {

    public static void main(String[] args) throws Exception {
        // Honor the operating system's proxy settings, if any
        System.setProperty("java.net.useSystemProxies", "true");

        // Connecting to MongoDB on the driver's default host and port.
        // MongoClient replaces the older Mongo class as of driver 2.10.
        MongoClient m = new MongoClient();
        try {
            DB db = m.getDB("twitter");
            DBCollection coll = db.getCollection("tweets");

            // Fetching tweets from Twitter
            String urlstr = "http://search.twitter.com/" +
                    "search.json?q=mongodb";
            URL url = new URL(urlstr);
            URLConnection con = url.openConnection();
            StringBuilder content = new StringBuilder();
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(con.getInputStream(), "UTF-8"));
            try {
                int c;
                while ((c = br.read()) != -1) {
                    content.append((char) c);
                }
            } finally {
                br.close();
            }

            // Inserting tweets into the database: the response is a JSON
            // object whose "results" array holds one document per tweet
            BasicDBObject res = (BasicDBObject)
                    JSON.parse(content.toString());
            BasicDBList list = (BasicDBList) res.get("results");
            for (Object obj : list) {
                coll.insert((DBObject) obj);
            }
        } finally {
            // Release the connection even if fetching or parsing fails
            m.close();
        }
    }

}
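The program above hard-codes the query as `q=mongodb`. If you want to search for something with spaces or a hashtag in it, the query has to be URL-encoded first. Here is a small sketch of how that could be done with the standard library only; `SearchUrl` and `forQuery` are hypothetical names of my own, and the base address is simply the one used in the program above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrl {

    // Base endpoint copied from the program above
    static final String BASE = "http://search.twitter.com/search.json";

    // Hypothetical helper: encodes the query so that spaces, '#' and '@'
    // survive as a single q parameter
    static String forQuery(String query) {
        try {
            return BASE + "?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://search.twitter.com/search.json?q=%23mongodb+java
        System.out.println(forQuery("#mongodb java"));
    }
}
```

The returned string can be passed to `new URL(...)` in place of `urlstr`.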
To verify, open the mongo shell and type the following commands. You will see all the stored tweets in the form of JSON.
> use twitter
> db.tweets.find();


1 comment:

  1. Mongo is dead, long live mongo client.. :)
    Please update code.
