Tuesday, July 15, 2008

Google App Engine Followup

Following up on my last rambling post, here are some GAE deficiencies that I have found.

Inability to communicate with external servers in meaningful ways

Oftentimes when building an application it is necessary to communicate with another server. A classic example would be an email server: while the client connects to CRUD email, the server needs to connect to other servers to deliver new email. Granted, you can use GAE's facilities to do this for you, but if you wanted to invent "Email2.0" and needed to directly poll servers or otherwise communicate with them, you can pretty much fuggetaboutit.

That being said, you can make HTTP/HTTPS connections (ports 80/443 only). However, these connections have to be made within the context of a user request to the server (youch!). Worse, you can't time out these connections, so you could potentially hang your user indefinitely without any ability to control this from the GAE side. So these just are not useful for server-to-server work. I'm not sure why they have limited connections in this way; I'm guessing security concerns.

Cost of running routine database maintenance seems very high

I'm not a rocket scientist (OK, you didn't need that pointed out for you :P). It is very difficult for me to get an app up and running well. My preference is to build, tinker, build, tinker, etc. Some people like to start with requirements, move on to use cases, create a design doc and so on until they have an app ready for the world. I prefer to start small, move it into production and continually add to it. That's just "the way I roll."

That being said, I find that the GFS/Bigtable/Datastore stack on GAE doesn't fit well into the way I work. For example, API calls are supposed to be scarce. For the free/beta version you are limited to 2.5m API calls/day. That seems like a lot, but I will often build some type of monitor utility that continually scans my db and does things like massage rankings, update FK relationships and various other tasks. In my apache+postgres/mysql world that isn't a big deal. When load is light I can fire off the maintenance apps and they can have their way with the db. No can do with GAE, and this is for two reasons:

  1. There is no way to run a cron job on GAE. Worse, even if you could fire off a simulated job by touching a URL, the engine will limit your processing time to what is reasonable for a typical user request. Probably < 1 min but definitely < 5 mins.
  2. Maintenance utils would burn through the API credits. Let's say I have a modest app with 100k records that I massage every hour with a ranking. At one API call per record, that is 100k * 24 passes = 2.4m API calls per day, and my users haven't even done anything yet.
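To make the arithmetic in the second point concrete, here is a quick back-of-the-envelope sketch. The function names are made up for illustration, and it assumes (as the estimate above does) that each record costs exactly one API call per maintenance pass; real usage could be higher or lower depending on how the datastore counts calls.

```python
# Rough quota estimate using the figures quoted in this post.
# Assumption: one API call per record per maintenance pass.
DAILY_API_QUOTA = 2500000  # free/beta limit mentioned above


def maintenance_calls_per_day(records, passes_per_day, calls_per_record=1):
    """Number of API calls the maintenance job alone would use in a day."""
    return records * passes_per_day * calls_per_record


def quota_left_for_users(records, passes_per_day, daily_quota=DAILY_API_QUOTA):
    """API calls remaining for real user traffic after maintenance runs."""
    return daily_quota - maintenance_calls_per_day(records, passes_per_day)


if __name__ == "__main__":
    used = maintenance_calls_per_day(100000, 24)
    left = quota_left_for_users(100000, 24)
    print("maintenance uses %d calls/day, leaving %d for users" % (used, left))
```

With 100k records and hourly passes, the maintenance job eats 2.4m of the 2.5m daily calls, leaving only 100k for actual users.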

There are some ways that I could work within the constraints of App Engine, but for me, for now, it makes more sense to stick with LAMP on some rented hardware. I'll check back again in a few months and see if any of this gets addressed.
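For what it's worth, one such workaround might look like the sketch below: do the maintenance in small slices, each short enough to fit inside a single request, and remember where you left off. This is plain Python with an ordinary list standing in for the datastore; `run_maintenance_slice`, its parameters and the time budget are all hypothetical names and numbers of mine, not anything GAE provides. It also does nothing about the API quota problem, only the per-request time limit.

```python
import time


def run_maintenance_slice(records, start, process,
                          budget_seconds=5.0, batch_size=100,
                          clock=time.time):
    """Process records from index `start` until the time budget runs out.

    Returns the index to resume from on the next call; a return value
    equal to len(records) means the full pass is finished.
    """
    deadline = clock() + budget_seconds
    position = start
    # Check the clock between batches, never in the middle of one.
    while position < len(records) and clock() < deadline:
        for record in records[position:position + batch_size]:
            process(record)
        position += batch_size
    return min(position, len(records))


if __name__ == "__main__":
    rankings = []
    data = list(range(250))
    resume_at = 0
    # Each loop iteration stands in for one separate request.
    while resume_at < len(data):
        resume_at = run_maintenance_slice(data, resume_at, rankings.append)
    print("processed %d records" % len(rankings))
```

Something still has to keep poking the URL that runs each slice, which brings you right back to the missing cron job.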

GWT notes

On the plus side I was able to get GWT working with GAE rather effortlessly. Pretty straightforward if you don't mind rolling your own RPCs. I think the RPCs out of the box are pretty bloated anyway. FWIW, I still find it strange that Google picked Java over Python for GWT. They should just buy out the Pyjamas guys and integrate it already.
