Monday, September 29, 2014

Facebook without News Feed

Don't want to see any of your friends' feeds at all? Then what's the point of using Facebook anyway? Still, I've been looking for a way to check important group posts without any other feeds, and I stumbled onto this awesome blog post: http://maxfriedrich.de/post/86417669824 . Check it out; I'm enjoying it.

Consuming the New York Times API with Python

Hi,

I didn't know until recently that the New York Times actually has an API for accessing its resources, some of which date all the way back to 1851! Tons of articles, columns, and news items can be mined quite easily, and consuming them is nothing complex: just obtain a key for each type of resource you want to access and call the API with it.
Below is an example of accessing it with Python.

1. Get the key from this link: http://developer.nytimes.com/docs
2. Install this module: https://pypi.python.org/pypi/nytimesarticle/0.1.0
3. Learn about filtering the search: http://developer.nytimes.com/docs/read/article_search_api_v2#filters

Now, here is a simple example that fetches all the articles by Roger Cohen, one of my favorite writers, matching the query 'usa'.

from nytimesarticle import articleAPI

api = articleAPI('your_article_access_token')

# Search for 'usa' in articles bylined by Roger Cohen, published in
# The New York Times from September 1, 2014 onward, with facet counts
# broken down by source and day of the week.
res = api.search(q='usa',
                 fq={'byline': 'ROGER COHEN', 'source': ['The New York Times']},
                 begin_date=20140901,
                 facet_field=['source', 'day_of_week'],
                 facet_filter=True)

for m in res['response']['docs']:
    print m['web_url']
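
The API hands back ten results per request. As a rough sketch of walking further result pages (assuming the module forwards a page parameter to the underlying Article Search API, which documents zero-based paging), you could do something like this:

# Fetch the first three pages of results, ten docs per page.
for page in range(3):
    res = api.search(q='usa',
                     fq={'byline': 'ROGER COHEN'},
                     begin_date=20140901,
                     page=page)
    for doc in res['response']['docs']:
        print doc['web_url']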

Monday, September 22, 2014

Shell script for updating a project on a server from local using git

Hi,

This is a trick for lazy people like me. Say you have a simple project hosted on GitHub or Bitbucket, and the same project is deployed on a server too. Then there is an easy way to port local changes to the server through GitHub or Bitbucket.

First, you need to add the same remote, either GitHub or Bitbucket, in your local copy and in your server copy, so that pushing from local and pulling from the server both go through GitHub or Bitbucket in the middle.

Now, the steps involved in syncing the server with the local copy after making changes are as follows:

1. Go to your project folder.
2. Add your changes.
3. Commit these changes.
4. Pull from GitHub or Bitbucket.
5. Push to it.
6. Log in to your server using SSH.
7. Go to the project folder on the server.
8. Pull from GitHub or Bitbucket.

Quite a few commands need to be executed. Let's put them in a simple script that does it all.

For that, you need sshpass to supply the password while logging into the server with ssh. Now, the script would look like this:
#!/bin/sh
# Sync local changes up to the remote, then pull them down on the server.
# Chaining with && stops the script at the first command that fails.
cd "/path/to/your/local/project/" &&
git add . &&
git commit -am "lazy commit message" &&
git pull origin master &&
git push origin master &&
sshpass -p 'your_server_password' ssh username@server_address 'cd /path/to/project/ && git pull origin master'

:)

Friday, September 12, 2014

Find strings in your Facebook chat history

Hi,

I lost a number that a friend gave me in Facebook chat, and I couldn't remember when he sent it. So, rather than checking all those previous messages manually, I ended up writing a program that searches for a string in my Facebook chat history.

So, there are three steps involved in this program:

1. Find the id of the conversation
2. Request its messages from the Graph API
3. Iterate through the paged responses

To accomplish the first task, send a request to the 'user/inbox' node of the Graph API. It returns a list of conversations, each with its own id; find your desired one. It's possible to write a program that finds it automatically, but I'm going to settle for this much for now. A minimal sketch of the lookup is below.
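
Something like this should list the ids with facepy, assuming the same access token used later in this post (the 'snippet' preview field is my assumption and may be absent on some conversations):

from facepy import GraphAPI

graph = GraphAPI('your_secret_token')  # assumed: the same token as below

# Print each conversation's id plus a short preview, if one is present,
# so you can spot the conversation you are after.
inbox = graph.get('me/inbox')
for conv in inbox['data']:
    print conv['id'], conv.get('snippet', '')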

Now, define these three functions:

import requests
from facepy import GraphAPI
import re

def search_in_message(message):
    # Scan one page of messages and print any that contain a number.
    for m in message['data']:
        try:
            matched_obj = re.search(r'(\d+)', m['message'])
            if matched_obj:
                print m['message']
                print m['created_time']
                print matched_obj.group()
        except KeyError:
            # Some entries (stickers, attachments) carry no 'message' field.
            continue

def get_message_from_graph(graph, stream_id, limit, since):
    # Fetch one page of the conversation, 'limit' messages at a time.
    return graph.get(stream_id + '/comments?limit=' + str(limit) + '&since=' + str(since))

def get_next_page(next_url):
    # Follow the URL the Graph API provides for the next page of results.
    return requests.get(next_url).json()

Each function should be self-explanatory. We can search for different things, such as a specific number, pattern, or string in the messages, just by modifying the search_in_message function; I've shaped it here to find numbers in the conversation (a variant sketch follows the full example below). The remaining part is pretty straightforward:
token = 'your_secret_token'
graph = GraphAPI(token)

paged_messages = get_message_from_graph(graph, 'your_conversation_id', your_limit, your_since_timestamp)

# Keep searching page by page until the API stops returning a 'next' URL.
while paged_messages.get('paging', {}).get('next'):
    print paged_messages['paging']['next']
    search_in_message(paged_messages)
    paged_messages = get_next_page(paged_messages['paging']['next'])
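
As mentioned above, a variant that hunts for an arbitrary substring instead of a number is just a small change; here is a hypothetical sketch (the needle parameter is my own addition):

def search_string_in_message(message, needle):
    # Print every message on this page that contains the given substring.
    for m in message['data']:
        try:
            if needle in m['message']:
                print m['created_time'], m['message']
        except KeyError:
            continue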

Thursday, August 28, 2014

Get various information from Facebook Graph API

Finding information in datasets has always fascinated me. And Facebook is such a vast reservoir of data that it's practically a crime for enthusiasts to leave it untapped.

I've been looking for ways to get different questions answered by Facebook.

Like: who has liked my posts the most? Whose posts have I liked the most? Comments, photos, everything can be explored to find this sort of information. There are also a lot of apps for these purposes, but as a developer, you should be able to do it yourself, or at least I think so.

I'm going to explain this one: "Who liked your last N feeds the most?"

In this post, I've explained the way to connect to the Graph API. Keep in mind that Facebook has deprecated FQL as of version 2.0, so being an expert in FQL is not going to help much in the future.

Now, Facebook has opened up different root nodes in the Graph API; they can be found in this link. We are going to use the 'user/feed' node for our purpose. Of course, Facebook is not going to give all the data in one request, nor should you expect it to: it returns data in a paged format. There are three types of pagination that Facebook uses, and our node uses time-based pagination.
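
To make that concrete: each page of results carries a 'paging' object whose 'next' URL already embeds the since/until cursors, so following pages is just a matter of requesting that URL. A minimal sketch (iter_pages is my own helper, not part of any library):

import requests

def iter_pages(first_page):
    # Yield each page of results, following the 'next' URL
    # until the API stops providing one.
    page = first_page
    while page.get('data'):
        yield page
        next_url = page.get('paging', {}).get('next')
        if not next_url:
            break
        page = requests.get(next_url).json()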

Let's look at this code chunk:
import requests
from facepy import GraphAPI
from pymongo import MongoClient
from bson.son import SON

graph = GraphAPI('your_secret_token')  # access token for the Graph API

client = MongoClient('localhost', 27017)
db = client.likes               # database that will hold the likes
like_table = db.like_table      # one document per like
We're going to use Mongo a little bit. Given the scope of this post, I'm not going to describe its installation or usage; this link has very good resources on it.

To find the id to query, we can simply issue a request in the browser to 'http://graph.facebook.com/username'. We could also append parameters like until and since to specify the time period we are concerned about; however, I had trouble getting them to work, so I've opted for a counter instead. We start by fetching the user's latest feed item:

feeds = graph.get('friends_or_self_id/feed?limit=1')
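
The same id lookup can be done from Python too; a quick sketch, assuming the public profile endpoint returns a JSON object with an 'id' field ('username' is a placeholder):

import requests

# The public profile endpoint returns basic fields, including the numeric id.
user = requests.get('http://graph.facebook.com/username').json()
print(user['id'])

With the id in hand, the main loop walks the feed pages: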
counter = 0
while True:
    try:
        likes = feeds['data'][0]['likes']
        # Walk through every page of likes on this feed item.
        while True:
            try:
                for m in likes['data']:
                    like_table.insert(m)
                likes = requests.get(likes['paging']['next']).json()
            except KeyError:
                print("keyerror in likes pagination")
                break
        if counter == 30:
            break
        counter = counter + 1
        # Move on to the next feed item.
        feeds = requests.get(feeds['paging']['next']).json()
    except KeyError:
        print("keyerror happened")
        break

This will retrieve the likes on the last 30 feeds on that user's timeline. Then, let's find which id appears the most; this is where Mongo comes in handy:
results = like_table.aggregate([
    {'$group': {
        '_id': '$id',           # group the stored likes by user id
        'count': {'$sum': 1}    # count how many times each id appears
    }},
    {'$sort': SON([('count', -1), ('_id', -1)])}
])

for k in results['result']:
    print(k['_id'])
    print(k['count'])

This is just one simple example of the endless possibilities of Facebook. I'm a big fan of it just because of its database; people can practically be 'profiled'.

The code is also hosted on GitHub.