Getting Started With The Talking Puffin Twitter API

Talking Puffin is a full-blown desktop Twitter client, but it also includes an independently consumable Twitter API. This provides a Scala implementation of most Twitter API functions, buildable with Maven and includable in other projects.

The first step in using the Twitter API is getting the source. You can either download it from http://github.com/dcbriccetti/talking-puffin/ or use git to clone your own repository.

The next step is building the code. Go to the base directory of the project (the one with README.md), and execute mvn install. This will run the scala compiler on both the desktop and Twitter API projects and install them into your Maven repository. Note that you can also do this from the twitter-api subdirectory if you only want to build the Twitter API.

After the code has built successfully you can interact with the Twitter API via the Scala console. To do this, go to the twitter-api directory and run mvn scala:console. You should see something like the following:

mcmac:twitter-api mmcbride$ mvn scala:console
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'scala'.
[INFO] ------------------------------------------------------------------------
[INFO] Building TalkingPuffin Twitter API
[INFO]    task-segment: [scala:console]
[INFO] ------------------------------------------------------------------------
[INFO] Preparing scala:console
[INFO] [resources:resources]
[INFO] Using default encoding to copy filtered resources.
[INFO] [compiler:compile]
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:compile {execution: default}]
[INFO] Checking for multiple versions of scala
[INFO] Nothing to compile - all classes are up to date
[INFO] [scala:console]
[INFO] Checking for multiple versions of scala
Welcome to Scala version 2.7.3.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_07).
Type in expressions to have them evaluated.
Type :help for more information.

scala>

In the console you can now import the Twitter API package and create a session.

scala> import org.talkingpuffin.twitter._
import org.talkingpuffin.twitter._
import org.talkingpuffin.twitter._
 
scala> val sess = TwitterSession(twitterUser,twitterPassword)
val sess = TwitterSession(twitterUser,twitterPassword)
sess: org.talkingpuffin.twitter.AuthenticatedSession = org.talkingpuffin.twitter.AuthenticatedSession@9fbc913

Once you have a session, you can call Twitter methods. This example fetches the friends timeline for the user mccv:

scala> val timeline = sess.getFriendsTimeline("mccv")
val timeline = sess.getFriendsTimeline("mccv")
timeline: List[org.talkingpuffin.twitter.TwitterStatus] = List(org.talkingpuffin.twitter.TwitterStatus@73ec8c2, org.talkingpuffin.twitter.TwitterStatus@2aee3c45, org.talkingpuffin.twitter.TwitterStatus@7eb6ec07, org.talkingpuffin.twitter.TwitterStatus@1b42008f, org.talkingpuffin.twitter.TwitterStatus@a32ba44, org.talkingpuffin.twitter.TwitterStatus@862cb97, org.talkingpuffin.twitter.T...

The object returned is a List of TwitterStatus objects. The TwitterStatus object is a container for a single tweet, and has fields for most of the elements of the XML returned from the Twitter API. The following example shows a way to iterate over this list and print out the user and text. Note that the user field of the TwitterStatus object is a TwitterUser object, which contains info about a user.

scala> timeline foreach {(x) => println(x.user.name + "--\t" + x.text)}
timeline foreach {(x) => println(x.user.name + "--\t" + x.text)}
Dave Winer--	Switched at Birth, Women Find New Identity. http://tr.im/lixe
Dave Winer--	Yes, there is something super-ironic about a long list of URL shorteners. :-) http://tr.im/li3i
Robert Dempsey--	Yes, I'm on a bus in Austin.   http://yfrog.com/emz3oj
Dave Winer--	The NY Times/Twitter feed (Scripting News). http://tr.im/lixY
Dave Winer--	No LOST spoilers!!
Twitter API--	Starting something serious on the API? Want to launch your project May 26? Launch at @140tc: http://bit.ly/8jj8x ^DW
Dave Briccetti--	Trying to get more RAM on my shit dedicated server, davebsoft.com. Shit because OLM wanted a fortune per month to add dirt-cheap RAM.
Michael Arrington--	FriendFeed Enables People/Group Tracking http://tcrn.ch/1t4 by @parislemon
rick--	OMG LOST finale in a couple hours
Steve Spalding--	A hard lesson to learn is that you can't hand people success. The best you can do is give them the tools.
Michael Arrington--	Google's Last MySpace Payment: $75 Million On June 20, 2010 http://tcrn.ch/1t9 by @arrington
Alex Payne--	On my way to have a beer with two of the @dropbox guys. Once tipsy, I may request more jiggabytes.
Robert Dempsey--	OH: "They all run in the same circles. They fight, they hang out, they have transvestite beauty pageants."
Albert Wenger--	On the ground at SFO - meeting up with @joshu for dinner.
NewsGang--	Calling all iPhones! Emergency scanner apps on the loose! (from Steven Sande) : Filed under: Software, Odds and .. http://tinyurl.com/qpom96
NewsGang--	Quite Possibly the Coolest Hot Tub You Have Ever Seen | dornob (from dornob.com) http://tinyurl.com/p2r54h
Dave Winer--	Red tree: http://tr.im/liMR
Dave Winer--	Guerilla Cafe, Berkeley: http://tr.im/liMS
Dave Winer--	Tree with red leaves in sunset: http://tr.im/liMT
Dave Troy--	There will be a 2010 re-enactment of the 1871 Baltimore parade celebrating the ratification of the 15th amendment (my college thesis).
scala>

Note that we can get a bit fancier with this and use for comprehensions to do more or less the same thing. The following sample uses the API to filter the friends timeline, looking for any text with the word “Berkeley”

scala> for(x <- sess.getFriendsTimeline("mccv") if x.text.indexOf("Berkeley") >= 0) println(x.text)
for(x <- sess.getFriendsTimeline("mccv") if x.text.indexOf("Berkeley") >= 0) println(x.text)
Guerilla Cafe, Berkeley: http://tr.im/liMS
 
scala>
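As a rough illustration of what the for comprehension above is doing, here is a minimal sketch that runs without a Twitter session. The Status class and the sample list are made-up stand-ins for the API's TwitterStatus objects (only the first two texts come from the output above), and the withFilter call assumes Scala 2.8 or later. It shows a guarded for comprehension next to the method calls the compiler turns it into.

```scala
object ForDesugar {
  // Hypothetical stand-in for the TwitterStatus objects returned by the API
  case class Status(user: String, text: String)

  // Sample data; the third entry is invented to show a match at index 0
  val timeline = List(
    Status("Dave Winer", "Guerilla Cafe, Berkeley: http://tr.im/liMS"),
    Status("rick", "OMG LOST finale in a couple hours"),
    Status("someone", "Berkeley is sunny today")
  )

  // A for comprehension with a guard, yielding the matching texts
  def viaFor(word: String): List[String] =
    for (s <- timeline if s.text.contains(word)) yield s.text

  // The same query written as the method calls the compiler generates
  def viaMethods(word: String): List[String] =
    timeline.withFilter(s => s.text.contains(word)).map(s => s.text)
}
```

Both functions return the same list. Note that the third sample text starts with the search word, which is why the guard should test for a match at any index, including 0.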

This should give you a pretty decent start on using the API. There are many more functions available, and you can take a look by generating the scala doc and checking out the API methods. To generate the site, go to your twitter-api directory and run “mvn site”. This should generate a set of HTML pages in target/site. Open index.html, click project reports, click scala docs, and look at the generated documentation for AuthenticatedSession.

Posted in scala.


Everybody fights

I really like Starship Troopers. Note that this is not to say that I in any way like Starship Troopers. One of the things that caught me was the notion that in the mobile infantry everybody fights. Cooks, captains, chaplains… when it comes time to drop everybody has skin in the game.

There have been a variety of things bothering me about the latest project I’m on, and I think I’ve finally identified the root cause. There are a number of people who have a say… but when it comes time to get work done they don’t/can’t get their hands dirty and get things done. We’re entering planning for the next six to eighteen months of our project, and one of my big pushes is going to be reducing the number of people who don’t fight. Feedback and collaboration are nice. But if you want a seat at the decision making table, be ready to fight.

Posted in agile.


Scala exceptions vs. pattern matching

I’ve recently been working on the talking puffin project, particularly on the lower level twitter API. At the plumbing level, we want a class that allows us to fetch XML from a URL. In Java doing this sort of stuff usually involves quite a few checked exceptions. The interwebs are flakey, and ignoring that fact usually leads to really bad things happening. However Scala doesn’t have checked exceptions… and that’s a nice thing for flexibility, conciseness, etc. So what to do?

My two basic approaches were

  1. Throw exceptions on errors. This allows more concise code when you don’t care about the errors (maybe you’re dealing with them higher up).
  2. Use case classes with subclasses indicating successful and unsuccessful execution. This forces you to deal with error conditions right away, which may lead to more stable code.

Let’s take a look at implementations of both. First we set up a simple base class to deal with HTTP plumbing. This just provides a way to open a URL and a helper method to read the error stream into a string.

import java.io.{BufferedReader, InputStreamReader}
import java.net.{HttpURLConnection, URL}
import scala.xml.{Node, XML}

class HttpBase{
  /**
  * Open the specified URL and return a tuple containing the corresponding
  * HTTP response code and HttpURLConnection
  */
  def openConnection(urlStr:String) = {
    val url = new URL(urlStr)
    val conn = (url.openConnection).asInstanceOf[HttpURLConnection]
    val responseCode = conn.getResponseCode()
    (responseCode,conn)
  }
 
  /**
  * Utility to open a connection's error stream and read it into a string.
  * Returns an empty string if the server sent no error body.
  */
  def getErrMsg(conn:HttpURLConnection) = {
    // getErrorStream returns null when there is no error body to read
    val errStream = conn.getErrorStream()
    if(errStream == null) ""
    else {
      var errMsg = ""
      val reader = new BufferedReader(new InputStreamReader(errStream))
      var line = reader.readLine()
      while(line != null){
        errMsg += line
        line = reader.readLine()
      }
      errMsg
    }
  }
}

Now let’s try to implement a class that actually fetches XML, and throws an exception if errors are encountered.

case class HttpException(msg:String, code:Int) extends Exception(msg)
 
/**
* Provides a getXML method that fetches XML from a URL.
* Throws an HttpException if any errors are encountered
*/
class HttpXMLExceptions extends HttpBase{
  def getXML(urlStr:String):Node = {
    try{
      openConnection(urlStr) match {
        case (200,conn) => XML.load(conn.getInputStream())
        case (code,conn) => throw HttpException(getErrMsg(conn),code)
      }
    }catch{
      // rethrow our own exception untouched so the HTTP status code survives
      case e:HttpException => throw e
      // anything else (unknown host, dropped connection, bad XML) gets code -1
      case e:Exception => throw HttpException(e.toString,-1)
    }
  }
}

And here’s a usage sample. This is concise, but (by design) we aren’t forced to think about nasty things like 404s, 401s, dropped connections, broken pipes, and all sorts of other networky hobgoblins.

    val excs = new HttpXMLExceptions()
    val url = "http://twitter.com/statuses/public_timeline.xml"
    val content = excs.getXML(url)

Let’s try to force people to think about them. Instead of throwing an exception, we can define a hierarchy of case classes for our responses, like so.

// sealed: the compiler can warn when a match forgets one of the subclasses
sealed abstract class Response
case class Success(content:Node) extends Response
case class Error(msg:String, code:Int) extends Response

Our method will return a Response. If our request is successful we’ll get a Success object that has a content field. If we run into any errors we’ll instead get an Error object back with a response code and error message from the response body. This is similar to using Scala’s Option type, but it lets us provide information on failure instead of simply returning the None instance. Here we go…

/**
* Provides a getXML method that fetches XML from a URL.
* Always returns a Response object, regardless of success/failure
*/
class HttpXMLMatches extends HttpBase{
  def getXML(urlStr:String):Response = {
    try{
      openConnection(urlStr) match {
        case (200,conn) => Success(XML.load(conn.getInputStream()))
        case (code,conn) => Error(getErrMsg(conn),code)
      }
    }catch{
      // non-HTTP failures (unknown host, dropped connection, bad XML) get code -1
      case e:Exception => Error(e.toString,-1)
    }
  }
}

This is pleasantly similar in implementation… However usage looks quite a bit different…

    val matches = new HttpXMLMatches()
    val url = "http://twitter.com/statuses/public_timeline.xml"
    val content = matches.getXML(url) match {
      case Success(node) => node
      case Error(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }

This is quite a bit more verbose. However it does send a message that developers need to be conscious of the potential error case. Let’s look at side by side usage when we add error handling for an unknown host

    val excs = new HttpXMLExceptions()
    val badUrl = "http://nohost.twitter.com/statuses/public_timeline.xml"
    val noContent = try{
      excs.getXML(badUrl)
    }catch{
      case HttpException(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }
 
    val matches = new HttpXMLMatches()
    val noMatchContent = matches.getXML(badUrl) match {
      case Success(node) => node
      case Error(msg,code) => 
        <error><msg>{msg}</msg><code>{code}</code></error>
    }

This usage actually looks pretty similar too. That’s because the catch block of a try/catch is itself a pattern match, and the whole try/catch is an expression that yields a value. In the exception case your error conditions are separated from the main logic, which some may like and some may not.

The primary difference is that using the Response class hierarchy forces the user to consider the unsunny-day scenarios. I don’t think there’s a black and white rule on when to use either approach, but case classes provide you a way to expose error conditions more visibly than unchecked exceptions. In the talking puffin project I think we’ll stick with exceptions, but it’s nice to have options.
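To make the earlier comparison with Option a little more concrete, here is a small sketch that needs no network access. It is not code from the talking puffin project: the Result, Ok, and Failed names and the two parse functions are invented for illustration. It parses a string into an Int two ways, once returning an Option and once returning a case class that carries failure details.

```scala
object OptionVsResponse {
  // Hypothetical result hierarchy, mirroring the shape of Response above
  sealed abstract class Result
  case class Ok(value: Int) extends Result
  case class Failed(msg: String, code: Int) extends Result

  // Option version: the caller learns only that something went wrong
  def parseOpt(s: String): Option[Int] =
    try Some(s.trim.toInt)
    catch { case _: NumberFormatException => None }

  // Case class version: the failure carries a message and a code
  def parseResult(s: String): Result =
    try Ok(s.trim.toInt)
    catch { case e: NumberFormatException => Failed(e.getMessage, -1) }
}
```

A caller matching on the Result has to write a Failed case to get at the value, and in that case the message is available for logging or display. With the Option version the reason for the failure is gone by the time the caller sees None.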

Posted in scala.


How J2EE set architecture back a ways

I’ll start this off by saying this is pure opinion.  I don’t have any statistics.  Instead I have some gut intuition based on numerous “enterprise Java” projects within my company, and through observing the development community.

I’m currently involved in architecting a next generation systems management framework.  It has fairly hefty requirements… monitor and manage millions of objects distributed across the globe in real time.  Take out the “millions” bit, and this isn’t so bad.  There are a bunch of standards out there (JMX, SNMP), there are some off the shelf tools, and you can knock something together pretty quickly that covers a reasonable subset of equipment for small installs.  But we need to do better.   We need to manage more than just “most” of the equipment, and we need to do it in some of the world’s biggest data centers.  And as I talk about this with my peers from other groups,  I’m getting more convinced that J2EE has poisoned a good chunk of the current generation of architects.

I know this isn’t how J2EE  has to work, but in most projects, you start them like this…
1) Build your object model, use magic to map it to the DB
2) Build your business logic
3) Build your front end
4) Ship it!

And there you are.  You have a nice central database so you don’t have to worry about distributed data so much.  It’s all nice and safe on the disk.  You have some adapters to get stuff in and out, and a cozy UI that serves it all up to the user.  And it scales… for as much load as you can throw at your desktop.

Then you take it to some place big.  And it just flat out doesn’t work.  It’s not a matter of getting a bigger database, because you just can’t scale it up past a certain point. You can cluster, but now you need a professional services group to install your product, and the performance gains to be had aren’t crystal clear.

It becomes a matter of deciding what goes in there.  And when, and how.  But looking forward to these issues is met with fierce resistance in design meetings… Building distributed systems is much harder than building something around a central store.  Objections are typically based on reading marketing propaganda from database vendors.  Responses of the form “well, we’ll partition the data when we hit that load.  Or use a cluster… yeah, a cluster will fix everything” are extremely painful to work through.  After all, some “expert” from VendorX says it will work. Who are you, my peer, to assume you know more than me?  

But guess what?  Each of these objectors has a product in the field, based on a central database, handling far less load than our targets, and none of them scale well.

Scaling out is beating scaling up.  Processors are going multi-core, not taking huge leaps forward in the GHz war.  Scala and Erlang are gaining in popularity, and with them a different model of parallelism (not new, just different) than that offered by J2EE.  The actor model embraced by both of those languages is primarily targeted at small processes within a service.  But as you look at larger chunks of the architecture, I think the sanest way forward is to embrace a similar model.  

ESB has been a buzzword for years now.  I don’t buy all the marketing hype around it, but at its core there is a model that the current generation of architects needs to get a handle on.  It’s very similar in concept to the Erlang/Scala actor model.  Yes, it’s harder to design around asynchronous messages and a large number of distributed components.  But that’s how you handle the big problems.  And that’s why you hire good developers to work on them, not just some random CS grad who happens to be able to regurgitate the latest Spring book.  

I’m not saying there isn’t a place for the traditional J2EE buildout (rails has a similar model, and works wonderfully for small-medium sized projects).  And in fact, even with these asynchronous systems there are almost certainly services that need a DB backing them.  But there are other tools that should be in your toolbelt.

My suggestion to Java architects today would be to pick up another language that fosters a totally different architecture.  I’m currently biased towards Ruby, Erlang or Scala.  There are no guarantees that any of these scale better than Java (and it’s fairly easily arguable that Ruby doesn’t).  But any of these will make you see the world in a different way, and allow you to more effectively consider and evolve your current architecture.

Posted in opinion.



Scala Lift Off – Static Companion to Ruby?

So I went to the last half of Scala Lift Off on Saturday (only half, because the first half was taken up by my final MBA class.  Ever.).  I went primarily out of curiosity, not knowing much about Scala or Lift.  The main draw was the built in comet support for Lift, which seems to not be a focus in other frameworks… at least not for Rails.  We currently use Juggernaut for comet support, but depending on flash is something of a liability (see: iPhone), and Juggernaut itself isn’t as smoothly integrated with Rails as I’d like.

I came away extremely impressed.  Scala is relatively unheralded in the world of alternative JVM languages (see Groovy, Jython and JRuby publicity), but shows a lot of promise.  It’s a functional language with an expressive syntax that allows you to easily create code that looks DSL-ish.  These are the primary features that drew me to Ruby (ok, Ruby isn’t a functional language, but you can sorta fake it).  But Scala has a better integration story with existing Java libraries, is statically typed, and has a stronger functional bent.

I’m a big believer in the right tool for the job, and as such don’t fall into a pure-dynamic or pure-static language camp.  I also don’t fall into a single language camp. I really enjoy Ruby for quick prototyping, and love Rails for quick prototyping of webapps, and maintaining a nimble production face on web applications.  But Rails falls down when I need to run background processing.  The times I think hardest about moving back to a Java webapp environment are when I need to go write something that doesn’t just receive a web request and terminate.  This is where concurrency issues get painful to deal with, Ruby daemons/DRb are painful, and starting up a whole Rails env for simple processing is rough.

So I’m hoping Scala/Lift fills that void.  I’m mentally sketching out a replacement of our background processing jobs (Twitter integration, email processing, etc.) with Scala, and in particular the Actors library.  These are relatively simple processing tasks, and should give me a decent feel for the language.  It should also improve the stability and scalability of our background processing.  It may also yield some reasonable libraries to contribute back (Scala Twitter library, Scala ActiveRecord bridge). 

Once I have that nailed down, an evaluation of how Lift can/should fit into the framework is in order… or maybe I’ll have to start my Rettiwt side project based on Lift.

Posted in scala.


System and Organizational Scaling – the Enterprise View

Albert Wenger put up a good post talking about the challenges faced by their startup portfolio, and how a vertical approach to subsystem division helps scaling the organization.  In fact the Web 2.0 landscape is very reflective of this approach.  10 years ago if Yahoo needed an authorization framework they would have built it themselves.  Today people use OAuth.  Flickr, Twitter, Delicious, Campfire, Meebo, S3… all very focused services that delegate non-core functionality to another service where possible.  For those things that are duplicated across services (web frameworks, database backends), nobody cares if they’re different… it’s hidden behind the service.

This is very attractive for the startup ecosystem.  The tangible results have been products that are cheaper to launch, quicker to market, and easy to adapt to customer feedback.  But how does it apply to the enterprise?

Issue 1:

For better or worse, in large engineering organizations people tend to care that the common horizontal components are indeed common.  If you work in a 100+ person engineering organization, and the rest of the team is using Spring + Struts for web development, it’s usually a tough sell to start using Rails.  If the organization insists on this homogeneity, your vertically sliced org is cross cut by the commonality police, hampering the agility of the vertical team.  There are good reasons for this.  Somebody needs to support and test the thing.  If nobody else can deal with what you just built, it doesn’t have a path to market, and is therefore useless.

Issue 2:

It’s easier to add 1 person or 1 feature to an existing service than it is to spin up a whole new service team.  Most organizations usually don’t have the luxury of five new reqs to apply to a service.  And they’re also hesitant to carve out five people from existing teams to spin up a new service.  So teams and services are usually built by accretion.  The result is usually a gradual march to collapse, as services become bloated and so difficult to maintain they need to be replaced.

Issue 3:

The organization only has capacity for a limited number of products.  A startup of 5-10 people can fully support the development, launch, and marketing of a vertical service.  In a large organization, the amount of infrastructure required to take a product to market means that unless you have a multi-million dollar revenue stream guaranteed in year 1 your chances of getting marketing, sales, training, doc, etc. spun up to support you are very small.  That means actual releases are larger than individual services.

Issue 4:

Coordination challenges across vertical services.  This ties in to issue 3.  Because a product release is a composition of multiple services, there is a desire for tight coordination across these services.  Contrast this to the Web 2.0 world.  Basecamp uses Amazon’s S3 storage service.  They are under no illusion that they can call the shots on S3’s feature roadmap.  And even more important, they would not plan for a release based on features that S3 has not committed to.  In the enterprise, these rules don’t apply.  Feature roadmaps for a collection of services are determined at the same time, and management hopes to coalesce them all into product at a predefined future date.

Possible Solutions

To address issue 1, the organization should focus on service capability, not underlying implementation.  To support this, service teams will need to be self sufficient, and should make implementation decisions as a team.  It needs to be tested.  Involve quality engineering in the technology selection.  It needs to be doced.  Involve doc.  At the end of the day the customer rarely cares if you use VB or Python to get them their value.

To address issue 2, the entire organization needs to be focused on keeping services focused on their core function.  When somebody needs feature X, run through the list of services.  Teams should be willing to say no to features not because they don’t have capacity, but because the feature corrupts the purpose of the service. If the feature doesn’t fit with any project, spin up a new service rather than accrete it onto an existing project.  This requires the organization to be flexible, to be willing to work on new things, and to give up old responsibilities.  It also demands a rejection of empire building.  You should take more pride in your project being small, focused, and absolutely fantastic at what it does, rather than measuring your worth by the number of people on your project.

Issues 3 and 4 are the stickiest.  In the enterprise the opportunity for experimentation is much lower than it is for a consumer web offering.  Perhaps one option is to create a “labs” organization, that releases products for free for use by bleeding edge customers.  Services that are vetted through the “labs” channel could be productized at a later date. 

Coordination problems could be solved by eliminating planning that spans services. Take a snapshot of service capabilities, and plan from there.  This likely slows down development to a certain extent, but also makes individual plans more predictable, and significantly reduces coordination challenges.

Wrapup 

While challenges exist in the enterprise, I think it’s worthwhile to look at how we can make vertical slicing of the org work.  The advantages realized by Web 2.0 companies are compelling, and we definitely have challenges with our current horizontally structured groups.

Posted in organization.


Transparent PNGs in IE w/ Rails

I’ve been working on Kebima for several months now, using Firefox and Linux/OSX.  Chalk it up to not doing enough research, but I just figured transparent PNGs worked in IE.  Oh well.  They don’t.  At least not in 6 and earlier.  So began my mission to get them to work.

I was already using a lightbox package that uses the technique mentioned in the MS support article, but I didn’t want to have to apply a div-specific solution for every png on my page… and since I’m using the silk icon set, this would mean a lot of specificity.

Googling gets you a lot of results.  The first hit is actually pretty good, in that it states that the script isn’t maintained, but points you to 24 Ways, which has a pretty good solution.  This likely works out of the box for normal web development (I did run into one bug… for some reason the section of the script that sets root on line 17 failed… I took out the bit that allowed you to limit the div the script was applied to).  However rails likes to throw timestamps on the end of images, so instead of ‘/images/icon.png’ you get ‘/images/icon.png?1209327623’.

Because the 24 Ways script is looking for an img tag whose src attribute ends with ‘.png’, this makes things not work.  My solution was to add a regex to match rails style image srcs

var png_pattern        = /\.png(\?\d*)?$/i;

And then match against that in fnLoadPngs

// background pngs
if (obj.currentStyle.backgroundImage.match(png_pattern) !== null) {
  bg_fnFixPng(obj);
}
// image elements
if (obj.tagName=='IMG' && obj.src.match(png_pattern) !== null){
  el_fnFixPng(obj);
}

And it worked.  Plus I got a pretty decent workout running up and down the stairs between Mac and PC.

Posted in rails, ruby.



Twitter, Jabber, Stability – Should Twitter be more ejabberd and less rails?

So I switched our Twitter integration for Kebima from using the HTTP interface to using the XMPP interface.  We really wanted real time updates, and polling just seems so barbaric.  I found some code on how to create a twitter bot and got the conversion made surprisingly fast.  It’s still ugly because auto-following has to be done through HTTP, but in a few hours I had a pretty simple bot going.

Then we went to the Web 2.0 Expo and tried it out.  And it didn’t work.  Turns out Twitter’s Jabber replies were delayed or somesuch… probably fallout of Twitter’s other greyout problems this week.  But it got me thinking…

If you had to build twitter from scratch, how thin a veneer over XMPP could you do it with?  At its heart, Twitter is a message router.  There are interfaces with SMS systems, HTTP, and Jabber.  But messages come in, messages go out.  I’ve heard they run XMPP under the hood to handle this, but I’ve also heard they run rails for a good chunk of functionality.  I’m a rails fan as well, but from what I can see, perhaps there’s too much rails and not enough ejabberd in the mix.

Conceptually it seems like you could set this up as a set of processes that each act as an internal Jabber client for an individual twitter user.  The process is responsible for receiving messages

  1. Pushing them to SMS or the user’s Jabber client
  2. Building the web page that people visit
  3. Handling API calls as they come in
  4. Dispatching messages to followers

In fact conceptually you could set up a separate client for each of these jobs.  Each of these processes could in essence be a fully functional twitter service for an individual user.

In addition to this you need gateways for SMS and the HTTP API, but it seems like those could be scaled out fairly easily as they’re not user specific.  The SMS gateway is just going to build an XMPP message and dispatch it to the matching twitter client.  The HTTP gateway is doing the same thing.

The beauty of this is that it’s naturally sharded.  No shared data between users.  You have to parse messages for @responses and route them to appropriate destinations, but it seems like that could easily be written as an ejabberd plugin. 

Google has proven this scales to a fairly large user base.  Is Twitter already beyond that scale, or am I missing something?

Regardless, we’ll be pushing ahead with our plans to add direct GTalk integration for our app.  It may go down at times as well, but it seems more robust than Twitter’s infrastructure at this point.
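For what it’s worth, the @response parsing mentioned above is a small amount of code. The sketch below is purely illustrative and says nothing about how Twitter or ejabberd actually do it: the MentionRouter object, its regex, and the recipients function are all made up, and the code is Scala rather than an ejabberd plugin. It pulls the mentioned user names out of a message and combines them with the sender’s followers to get the set of per-user processes that should receive it.

```scala
object MentionRouter {
  // Assumed user name syntax: letters, digits and underscores after an @
  private val Mention = "@([A-Za-z0-9_]+)".r

  // Extract the user names mentioned in a message, in order of appearance
  def mentions(text: String): List[String] =
    Mention.findAllMatchIn(text).map(_.group(1)).toList

  // A message goes to the sender's followers plus anyone mentioned in it
  def recipients(text: String, followers: Set[String]): Set[String] =
    followers ++ mentions(text)
}
```

A pattern this naive would also pick up the domain part of an email address, so a real router would need a stricter rule about what may come before the @.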

Posted in ruby.



Standalone ActiveRecord and SQLite3

I recently wanted to set up a quick daemon process that used ActiveRecord outside the rails framework. I also wanted it to use sqlite, just to keep install/dependencies simple. I found not so good documentation on how to do this… and after a few missteps it turns out it’s not that hard. The only gem dependencies are activerecord and sqlite3-ruby. You’ll also need sqlite working. Code follows

require 'rubygems'
require 'sqlite3'
require 'active_record'
# connect to database.  This will create one if it doesn't exist
MY_DB_NAME = ".my.db"
MY_DB = SQLite3::Database.new(MY_DB_NAME)
# get active record set up
ActiveRecord::Base.establish_connection(:adapter => 'sqlite3', :database => MY_DB_NAME)
# create your AR class
class Update < ActiveRecord::Base
end
# do a quick pseudo migration.  This should only get executed on the first run
if !Update.table_exists?
  ActiveRecord::Base.connection.create_table(:updates) do |t|
    t.column :account_name, :string
    t.column :last_update_time, :timestamp
    t.column :last_update_id, :integer
  end
end

Posted in ruby.



Scaling and Deploying Rails: It’s not Java. Or PHP.

Recently there have been a slew of posts complaining that rails doesn’t scale and is hard to deploy.  I think the root of this is that people come to rails after experience with Java, PHP or similar frameworks.   Rails is neither of these.  It is heavier than PHP, and it seems that this makes its fit with the typical PHP deployment methods slow at best.  At the same time it has an entirely different deployment model than Java web applications.  The Java application containers are heavier, and designed to support a much more highly available application within  that JVM instance.

But rails still has its sweet spot.  So why would I use rails instead of Java? It’s not as light as PHP, but from a development perspective it’s usually close enough.  Bouncing webrick or mongrel takes a second or two, and for most changes I don’t even need to do this.  To accomplish the same code/test cycle in Java takes  at least 10 times as long, often much worse.  At the same time rails offers.  Plus I can do a lot of things I can’t do with Java, thanks to dynamic typing, better metaprogramming facilities, and things like the rails console.  Plus, from a hosting perspective you can get much better deals on hosting a rails application than a full Java servlet environment, which tells you something about the relative difficulty of hosting rails and J2EE.

Why would I use it instead of PHP?  The framework does enough magic that getting an application up and running is faster than raw PHP (disclaimer:  I haven’t used any of the newer PHP frameworks like CakePHP. Maybe they’re more on par with rails here.).  MVC is nicely broken out.  Ruby itself is a wonderful language to develop in.  Yes, you’re going to have to do more work around deployment.  I can’t imagine going live using FastCGI, SCGI, or that sort of thing.  But learn how to set up a mongrel cluster.  That seems to work.

I think Java and PHP still have their places as well.  Java provides much better concurrency in a single process, better built-in capabilities for high availability, security, and a larger library/tool ecosystem.  Enterprises know how to deploy and maintain Java applications, and in general feel more comfortable putting those in production than a rails system.  PHP is lighter weight, will likely scale better out of the box, and is better understood by most hosting companies.

At the end of the day the decision comes down to what you’re building and where you’re deploying.  I think rails works in a wide range of situations, but there is an equally wide range where PHP or Java may be a better choice.

Posted in rails.