Dave recently pushed some of my API changes to the main TalkingPuffin codebase, and there are quite a few updates. The API is now more complete and more resilient, supports optional REST arguments, and has a method that loads every page of a paged API call. I thought I’d show a few of the enhancements here.
This first listing shows how to use the new TwitterArgs class.
package org.talkingpuffin.twitter

object ShowAPI {
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = "foo"
    val password = "bar"
    val sess = TwitterSession(user, password)

    // same method as before
    var friendsTweets = sess.getFriendsTimeline()
    System.out.println("got " + friendsTweets.size + " tweets from old style method")

    // use per page count
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200))
    System.out.println("got " + friendsTweets.size + " tweets using per page count")

    // use chained twitter args
    friendsTweets = sess.getFriendsTimeline(TwitterArgs.maxResults(200).page(2))
    System.out.println("got " + friendsTweets.size + " tweets using chained twitter args")
  }
}
A variety of methods now accept a TwitterArgs instance. You construct one by calling one of the methods on the TwitterArgs object, e.g.
val args = TwitterArgs.maxResults(200)
If you want to pass in multiple optional arguments, you can chain further calls on an existing args instance, e.g.
val args = oldArgs.page(2)
The arguments are converted to a URL query segment and appended to the URLs that are requested for Twitter data.
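To make the chaining idea concrete, here is a minimal sketch of how a chainable argument builder could render a query segment. It is not the TalkingPuffin implementation: QueryArgs, toQuery, and the "/timeline" path are hypothetical names, and the `count` and `page` parameter names are my assumption about what gets sent.

```scala
// Hypothetical stand-in for TwitterArgs; NOT the TalkingPuffin implementation.
case class QueryArgs(params: List[(String, String)] = Nil) {
  // Each builder method returns a new instance, so calls can be chained.
  // The parameter names "count" and "page" are assumptions.
  def maxResults(n: Int): QueryArgs = copy(params = params :+ ("count" -> n.toString))
  def page(n: Int): QueryArgs = copy(params = params :+ ("page" -> n.toString))

  // Render the accumulated arguments as a URL query segment.
  // Values here are plain integers, so no URL encoding is needed.
  def toQuery: String =
    if (params.isEmpty) ""
    else params.map { case (k, v) => k + "=" + v }.mkString("?", "&", "")
}

object QueryArgsDemo {
  def main(args: Array[String]): Unit = {
    val q = QueryArgs().maxResults(200).page(2)
    // "/timeline" is a placeholder path, not a real endpoint.
    println("/timeline" + q.toQuery)
  }
}
```

Running the demo prints /timeline?count=200&page=2. Because every call returns a fresh instance, an existing set of arguments can be reused and extended without side effects.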
The next listing shows some Scala functional programming neatness, and builds on my previous post about retry logic.
package org.talkingpuffin.twitter

object ShowAPI {
  def main(args: Array[String]) = {
    // set up our credentials and session
    val user = ""
    val password = ""
    val sess = TwitterSession(user, password)

    // demo load all... this just loads the first page
    var myTweets = sess.getUserTimeline("mccv")
    System.out.println("got " + myTweets.size + " tweets from my timeline")

    // now we show load all. loadAll just wants a function that takes an int as an arg,
    // which is the page. Scala's partially applied functions make this pretty easy
    // to use in a general purpose way
    myTweets = sess.loadAll(sess.getUserTimeline("mccv", _: Int))
    System.out.println("got " + myTweets.size + " tweets using loadAll")

    // this is even fancier. Here I add retry logic to load all.
    // note that retryPage is a function I defined here... but the session
    // doesn't care. It keeps iterating through pages, retrying and loading
    // until it reaches the end.
    myTweets = sess.loadAll(retryPage(_: Int, sess.getUserTimeline("mccv", _: Int)))
    System.out.println("got " + myTweets.size + " tweets using loadAll and retries")
  }

  /**
   * this is a function that is sort of a thunk through to tryNTimes.
   */
  def retryPage[T](page: Int, func: (Int) => T): T = {
    // here we define a privately scoped function
    // that can be passed to tryNTimes
    def tryPage() = {
      func(page)
    }
    // and now we try N (5) times
    tryNTimes(tryPage, 5)
  }

  /**
   * from the last blog post, a retrier
   */
  def tryNTimes[T](func: () => T, runNumber: Int): T = {
    try {
      func()
    } catch {
      case e if runNumber > 1 => tryNTimes(func, runNumber - 1)
      case e => throw e
    }
  }
}
Hopefully this code is more or less self-documenting. The first session call just gets the first page of the user timeline. This is usually sufficient for writing a Twitter client, but it isn’t so great if you are doing data mining. The new API introduces a method called loadAll of type (f: (Int) => List[T]) => List[T]. This means that any function that takes a single Int argument (a page number) and returns a list can be passed to loadAll. It keeps calling the passed-in function with increasing page numbers until an empty list is returned. Note that, as currently implemented, this relies on a page overrun returning an empty list: if the overrun URI returned a 404, we’d get an exception instead. Luckily, Twitter currently just returns an empty list.
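The session’s own loadAll isn’t shown in this post, so here is a minimal sketch of the behavior just described, under the assumption that pages are numbered from 1. LoadAllSketch and fakePage are hypothetical names used only for illustration; fakePage stands in for a real paged API call.

```scala
object LoadAllSketch {
  // Calls fetch(1), fetch(2), ... and concatenates the results,
  // stopping as soon as a page comes back empty.
  def loadAll[T](fetch: Int => List[T]): List[T] = {
    @annotation.tailrec
    def loop(page: Int, acc: List[T]): List[T] = {
      val items = fetch(page)
      if (items.isEmpty) acc else loop(page + 1, acc ++ items)
    }
    loop(1, Nil)
  }

  // Fake paged source: 45 items, 20 per page, and an empty list
  // for any page past the end (the behavior loadAll depends on).
  def fakePage(page: Int): List[Int] =
    (1 to 45).toList.slice((page - 1) * 20, page * 20)

  def main(args: Array[String]): Unit = {
    val all = loadAll(fakePage)
    println("got " + all.size + " items")
  }
}
```

The demo makes four calls (pages of 20, 20, 5, and 0 items) and prints "got 45 items". If fetch threw on an overrun instead of returning an empty list, the exception would propagate out of loadAll, which is the caveat mentioned above.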
The second call shows this in action. It’s a slightly more complicated case, because getUserTimeline takes a String and an Int. Scala’s partially applied functions make this a snap. The expression
sess.getUserTimeline("mccv",_:Int)
takes the getUserTimeline call with one argument bound and one left unbound. It evaluates to a function of type (Int) => List[TwitterStatus], which is exactly what loadAll wants.
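The same trick can be tried without a Twitter session. In this small sketch, timeline is a hypothetical stand-in with the same shape as getUserTimeline (a String and an Int in, a list out).

```scala
object PartialApplicationDemo {
  // Hypothetical stand-in with the same shape as getUserTimeline(user, page).
  def timeline(user: String, page: Int): List[String] =
    List(user + " status on page " + page)

  def main(args: Array[String]): Unit = {
    // Bind the user and leave the page open:
    // the result has type Int => List[String].
    val pageOnly: Int => List[String] = timeline("mccv", _: Int)
    println(pageOnly(3))
  }
}
```

This prints List(mccv status on page 3). The placeholder `_: Int` marks the argument that is left open, and the type annotation on pageOnly is there only to make the resulting function type visible.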
The third call is even more complicated... um, sophisticated. Let’s say we want to retry operations five times, just in case we get dropped connections in the middle of a big load. Well, all we need to do is hand loadAll a function that takes an Int and returns a list.
In a previous blog post I wrote about implementing a retryable method. You can see this more or less unchanged at the end of the file. Unfortunately its signature isn’t quite what we want: tryNTimes expects a function with no arguments, while loadAll supplies a page number. So we define retryPage, which acts as an adapter from loadAll to tryNTimes. With this setup in place, we can make our last call, which uses two partially applied functions. The first converts getUserTimeline into the page-argument-only form, and the second converts retryPage into a page-argument-only form.
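To watch the retry composition work without a network, here is a self-contained sketch that feeds a deliberately flaky fake fetcher through the same nesting of partially applied functions. RetryDemo, flakyPage, and calls are hypothetical names, loadAll is my own stand-in for the session method, and tryNTimes is a compact variant of the one in the listing that catches only Exception.

```scala
object RetryDemo {
  // Compact variant of the retrier: retries on Exception only.
  def tryNTimes[T](func: () => T, attempts: Int): T =
    try func()
    catch { case _: Exception if attempts > 1 => tryNTimes(func, attempts - 1) }

  // Adapter: fixes the page so tryNTimes gets a no-argument function.
  def retryPage[T](page: Int, func: Int => T): T =
    tryNTimes(() => func(page), 5)

  // Stand-in for the session's loadAll (pages assumed to start at 1).
  def loadAll[T](fetch: Int => List[T]): List[T] = {
    def loop(page: Int, acc: List[T]): List[T] = {
      val items = fetch(page)
      if (items.isEmpty) acc else loop(page + 1, acc ++ items)
    }
    loop(1, Nil)
  }

  var calls = 0

  // Fake fetcher: every odd-numbered call fails, simulating a dropped
  // connection. Pages 1 and 2 hold data; later pages are empty.
  def flakyPage(page: Int): List[Int] = {
    calls += 1
    if (calls % 2 == 1) throw new RuntimeException("dropped connection")
    if (page <= 2) List(page * 10, page * 10 + 1) else Nil
  }

  def main(args: Array[String]): Unit = {
    // The inner placeholder binds to flakyPage, the outer one to retryPage.
    val all = loadAll(retryPage(_: Int, flakyPage(_: Int)))
    println(all + " after " + calls + " calls")
  }
}
```

Each of the three pages fails once and succeeds on the second attempt, so the demo prints List(10, 11, 20, 21) after 6 calls. loadAll never sees the failures, which is the point of putting the adapter between it and the fetcher.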
Running the two samples together gives the following output (the "trying to get page" lines come from a debug print that isn’t shown in the listing above):
got 20 tweets from old style method
got 199 tweets using per page count
got 200 tweets using chained twitter args
got 20 tweets from my timeline
got 824 tweets using loadAll
trying to get page 1
trying to get page 2
trying to get page 3
trying to get page 4
trying to get page 5
trying to get page 6
trying to get page 7
trying to get page 8
trying to get page 9
trying to get page 10
trying to get page 11
trying to get page 12
trying to get page 13
trying to get page 14
trying to get page 15
trying to get page 16
trying to get page 17
trying to get page 18
trying to get page 19
trying to get page 20
trying to get page 21
trying to get page 22
trying to get page 23
trying to get page 24
trying to get page 25
trying to get page 26
trying to get page 27
trying to get page 28
trying to get page 29
trying to get page 30
trying to get page 31
trying to get page 32
trying to get page 33
trying to get page 34
trying to get page 35
trying to get page 36
trying to get page 37
trying to get page 38
trying to get page 39
trying to get page 40
trying to get page 41
trying to get page 42
trying to get page 43
got 824 tweets using loadAll and retries