banner-gsoc2016_2

As Google Summer of Code 2016 has officially begun, I am all excited to be working with FOSSASIA yet again. This time, I have been assigned the project Loklak where I shall be spending the summer working on implementing indexing (harvester/scrapers) for different services like weibo.com, angel.co, meetup.com, Instagram etc.

Loklak is a server application which is able to collect messages from various sources, including Twitter. This server contains a search index and a peer-to-peer index sharing interface.

If one likes to be anonymous when searching things, want to archive Tweets or messages about specific topics and if you are looking for a tool to create statistics about Tweet topics, then Loklak is the best option to consider.

loklak_anonymous.png



Welcoming Wiki GeoData to Loklak !

 

Loklak has grown vast with due course of time and it’s capabilities have extended manifold especially by the inclusion of sundry website scraper services and data provider services.

loklak_org_sticker

The recent addition includes a special service which would provide the user with a list of Wikipedia articles tagged with the location when supplied with a specific name of the place.

Thanks to the Media Wiki GeoData API, this service was smoothly integrated in the Loklak Server and SUSI (our very own cute and quirky personal digital assistant)

When the name of the place is sent in the query , firstly the home-grown API loklak.org/api/geocode.json was utilized to get the location co-ordinates i.e. Latitude and Longitude.


URL getCoordURL = null;

String path = "data={\"places\":[\"" + place + "\"]}";

try {
    getCoordURL = new URL("http://loklak.org/api/geocode.json?" + path);
} catch (MalformedURLException e) {
    e.printStackTrace();
}

JSONTokener tokener = null;
try {
    tokener = new JSONTokener(getCoordURL.openStream());
} catch (Exception e1) {
    e1.printStackTrace();
}

JSONObject obj = new JSONObject(tokener);

String longitude = obj.getJSONObject("locations").getJSONObject(place).getJSONArray("location").get(0)
                .toString();
String lattitude = obj.getJSONObject("locations").getJSONObject(place).getJSONArray("location").get(1)
                .toString();

The resultant geographical co-ordinates were then passed on to the Media Wiki GeoData API with other parameters like the radius of the geographical bound to be considered and format of the resultant data along-with the co-ordinates to obtain a list of Page IDs of the corresponding Wikipedia Articles besides Title and Distance.


URL getWikiURL = null;

try {
    getWikiURL = new URL(
                      "https://en.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=10000&gscoord=" + latitude
            + "|" + longitude + "&format=json");
} catch (MalformedURLException e) {
    e.printStackTrace();
}

JSONTokener wikiTokener = null;

try {
    wikiTokener = new JSONTokener(getWikiURL.openStream());
} catch (Exception e1) {
    e1.printStackTrace();
}

JSONObject wikiGeoResult = new JSONObject(wikiTokener);

When implemented as a Console Service for Loklak, the servlet was registered as /api/wikigeodata.json?place={place-name} and the API endpoint for example goes like http://localhost:9000/api/console.json?q=SELECT * FROM wikigeodata WHERE place=’Singapore’;

Presto !!

We have a JSON Object as the result with a list of Wikipedia Articles as:

56976976-5e41-11e6-95b6-19e570099739

The Page-IDs thus obtained can now be utilized very easily to diaply the articles by using the placeholder https://en.wikipedia.org/?curid={pageid} for the purpose.

And this way, another facility was included in our diversifying Loklak server.

Questions, feedback, suggestions appreciated 🙂

Advertisements