Free Coffee for Voting

Written on 10:50:00 AM by S. Potter

As if anyone needs any more enticement to vote this election season, Starbucks has decided to reward people who cast their vote Tuesday November 4th, 2008 with a free Tall coffee. I do not believe you have to vote for anyone in particular:) Just go and vote. While you are enjoying your free coffee at Starbucks, perhaps explore the latest Merb 1.0 release candidate. To install simply do the following:

$ sudo gem update --system
$ gem --version # make sure it is at least 1.2.0
$ sudo gem install merb
Now to create a new Merb application do the following:
$ merb-gen app your-app-name-here
$ cd your-app-name-here
# now start editing the files you need to edit for your app.
# I promise more tutorials will be coming once I have 
# free time

Book Shelf Clearing

Written on 9:52:00 PM by S. Potter

In an effort to reorganize my office, I am clearing off space on my bookshelves and selling books (on Amazon marketplace) that I have already read. Since they are all software development related I thought I would list the most relevant below in case any blog readers are looking for books on the subject for cheap:

  • [SOLD] Deploying Rails Applications: A Step-by-Step Guide (Facets of Ruby) (buy it new or used)
    Very good for the beginner or intermediate Capistrano recipe writer. It also contains some good pointers at the end of the book on benchmarking. Released relatively recently (within 3 months I think?)
  • SOA in Practice: The Art of Distributed System Design (Theory in Practice) (Published Aug 2007)
    This book is more for the unRESTful SOA types than the RESTful folk like me. The author obviously has some old-time experience, so if you are into old school SOA standards (WSDL, SOAP, UDDI) then this would provide a lot of good architect-level areas to consider, which are often ignored including: service lifecycle, versioning, security, service management and more. One gripe I had with the book is in the versioning section. Let's just say his preference for naming is definitely NOT my style:) Otherwise this contains good reading material for the architect just getting into SOA in an enterprise setting.
  • Rails Cookbook (Cookbooks (O'Reilly))
    Great for starting Rails projects with limited Ruby on Rails experience. It covers Rails 1.2, but most recipes will only require minor changes.
  • The Art of Agile Development
    Covers all aspects of agile development from planning to delivery. If you aren't already an agile development ninja, this is in my top three agile books.
  • Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL
    Last year I had to work (for a short while) with PHP in the crawling/bot arena. This book got me developing bots and crawlers in PHP in no time.
  • Ajax on Rails
    This would be a decent book for Rails developers that aren't believers in Unobtrusive Javascript (UJS). I have to purge this book from my bookcase now that I am a big UJS convert:) I hope you understand. I got this when it first came out, before I had seen the light. Seriously though, if you don't care about UJS principles, this book covers good ground on using AJAX Rails helpers. You should realize that the book was written pre-2.0 (Rails release that is).
Also, for those wanting to branch out on their own as consultants, the following books might be for you; I found both very helpful when first starting my software consulting practice in 2001:
  • Getting Started in Consulting
    If I could only recommend one of these two consulting books it would be this one hands down, since this is more about general consulting principles and practices and the second book might be slightly dated in the initial chapters, because both were written before the software outsourcing/offshoring mania.
  • Getting Started in Computer Consulting
PS I have been using Amazon Prime for two months now and love it. You can try Amazon Prime FREE for one month if you like too.

Tips for Entrepreneurs Hiring Ruby on Rails Consulting Firms

Written on 10:20:00 AM by S. Potter

I know I am taking a break from my DataMapper/Merb series of posts today, but I have seen enough fakers out there in the Ruby on Rails world that I just have to get a few things off my chest: Below is a list of things to avoid when hiring a Ruby on Rails consulting firm:

  • Overexcited marketing speak. Drop any firm that has something like the following on their website: "We believe Ruby on Rails is the most elegant, powerful programming language on the planet". These people might know a thing or two about configuring WordPress or Drupal or perhaps only Photoshop, but definitely no serious Object Oriented development of large scale Ruby server applications. Basically anyone that calls "Ruby on Rails" as a whole a programming language has an inadequate skill set to build a non-trivial web application. In addition anyone that knows more than three non-similar programming languages will know Ruby is not "the most powerful programming language on the planet". Sorry I had to find a quote on the web to prove I wasn't exaggerating. There are many more fakers on the web too, so look out! Remember you aren't looking for people using the most powerful programming language, you want results.
  • Agile this, agile that, agile everywhere. Beware of firms using the word agile like there is no tomorrow. Some use is fine, but ask them what they mean by agile. And what benefits you, the client, will receive from an agile team. Don't get me wrong if a web application is built in an iterative fashion with lots of client involvement and sign off, with lots of useful automated tests/specs that are run often then this is almost certainly a great thing for the project. However, beware of the firms or freelancers that are using agile simply as a buzzword. (Note: some firms like mine use agile a lot in web page titles for SEO purposes, so not all firms that use agile a lot in titles are faking. So make sure to ask firms what they mean by the word agile when they use it).
  • Why Rails? If any firm tells you that a web application written using Django, TurboGears, Merb or other web frameworks will never be as good as a Rails web application, then drop them off your consideration list immediately. The same is true if the converse is said. If they answer with something more measured and sensible like "our engineers have found developing applications in Rails to be more productive than X, Y and Z, which we previously used, and it provides us with the right amount of control/flexibility without us needing to worry about as much configuration as with previous frameworks" then at least they aren't misleading you just to get a sale. Of course, you might be looking for other points in the answer (other attributes of a web framework that would make business sense like active community, skillset, etc.). Also in general you should look for measured answers to all your questions, not answers that lack insight and promise they can do EVERYTHING under the sun for you in 2 weeks' time. You will be disappointed if you hire that firm or freelancer.
  • Expert in everything, expert in nothing. Consulting firms and/or freelancers need to choose a specific set of technical competencies so they can be experts in those areas. Beware of consulting firms or freelancers that claim to be experts in every framework, programming language and operating system.
  • Ubiquitous buzzwords. Beware of too many buzzwords. Even if you know what a buzzword means, ask the person using too much jargon to clarify so you can evaluate whether they really know what they are saying.
  • Everything AJAX. I previously worked through a consulting firm on the west coast (I'll not name names). The consulting manager who led project management functions on all the firm's projects was convinced that if we just put an AJAX form in here and there it would solve all our problems. My experience with this firm wasn't overly positive. I like the people on a personal level, but their work quality and focus tended to create hard-to-use web applications. Beware of people who think AJAX is the answer to everything. It has its place, but don't go overboard. PS Make sure the people you hire write unobtrusive Javascript rather than embedded Rails Prototype helpers. UJS has many benefits including graceful degradation (benefiting those that do not have Javascript enabled), avoiding browser incompatibilities and improved code organization (separation of concerns). Users of your web application will appreciate the forethought and your developers can be more productive!:) Win-win.
  • Novice.has_many :problems Look for previous Object Oriented modeling experience for the engineers on the project team. Even though ActiveRecord (the default Ruby on Rails Object Relational Mapping framework) provides a higher level of abstraction for ORM associations than other frameworks, don't be fooled into thinking anyone can model your business domain correctly without experience and/or know-how. Many developers who have previously used Hibernate or TOPLink (as only two examples) in Java or Smalltalk respectively will likely have done a lot of OO modeling. Ask them about the most complex business domain they have modeled previously to be able to gauge their skill level and experience in this area. This point is most important in applications that have more complex business domains.
  • Why should I care about your data? Look at my cool Photoshop spinning logo. Rails consulting firms and/or freelancers should have a good understanding of the importance of data modeling. Even if they do not have specific data modeling expertise they should at least know that indexing appropriate columns and defining column size limits where appropriate can save not just storage space, but also shave time off queries (depending on the query and usage scenario). That way when your application gets to the point where the database needs to be fine tuned, the data architect you hire will have an easier job than if your schema was designed by a PHP weekender that thinks the Drupal or WordPress database schema is finely tuned (sorry, it's horrendous and yes you should know that anyway). This is a very important consideration if your application is supposed to hold a considerable amount of data and/or if you have strong reporting requirements. As a side note, make sure to ask them if different database schemas are sometimes needed for querying large data sets (classical reporting requirements) versus CRUD usages. The answer should be yes and you should ask them to explain this a little to see how much they really understand.
  • I can design, develop, deploy and manage perfectly. Usually firms and freelancers are much better in one or two areas than all equally. Some are better at graphic design and got into development to earn $20/hr extra. That's fine, but you should know what you are getting yourself into. Ask what their strengths and weaknesses are. For example, I am excellent at developing and very good at deploying and systems engineering. I can use Photoshop and Illustrator (just like graphic designers can copy Rails code from forums and paste it into their editor), but ask me to create a visually stunning logo that encapsulates your company's image and I am lost (just like the graphic designers that can't write original code themselves). Know what is most important for your application needs and hire the appropriate people. Usually outsourcing graphic design to a design firm and hiring a flexible developer that can work with external teams well, gets you the best value for money. Throw away the firms and people that don't have any weaknesses. They are lying.
One point I would like to make to round this piece off is that if you find a young, eager engineer with higher than average aptitude, he/she can really do a lot for your project. However, you should expect them to make a few mistakes. Nobody is perfect and without prior experience with large scale server applications it is virtually impossible for someone not to make mistakes. If you can afford a few mistakes here and there, then this might be the way to go. Give them an equity stake if capital is an issue for you; they are more likely to take a risk with you than a seasoned consultant with a mortgage to pay or spouse and children to feed. However, never hire someone who claims to be a Ruby/Rails expert at a high rate if they simply copy and paste sample code from Ruby/Rails forums online (you would be surprised how many there are out there). It's a terrible waste of your capital, it takes work away from better firms and/or freelancers out there, and you will end up with badly written software that runs poorly. As a final note, hire a firm, freelancer or in-house engineer that has a similar energy level and a compatible personality to yours. Or at a minimum hire a firm or freelancer that you can trust and that has been recommended, but not just through popularity contests like those at WorkingWithRails.com (I know three freelancers who are excellent Ruby engineers and have only 1 recommendation on the site, and I have worked with one person who has more than ten recommendations and consider them subpar to most Ruby/Rails developers I know).

DataMapper: Flexibility of mapping model attributes to table columns

Written on 10:13:00 AM by S. Potter

I know an esteemed ex-colleague of mine (even if he is an Apple zombie now - yes, becoming a fan of Apple on Facebook deserves some heckling;)) has reservations about the apparent unDRY-ness of DataMapper with migrations. I used to share his reservations before I wrote two applications using DataMapper. Neither Merb application was particularly spectacular, but both had to work with legacy database schemas. Oh the joys of crazy DBA naming conventions! Let's look at a brief example of DBA naming madness (some names were changed, but the conventions used are the same):

ordersTable:
  - uid_int: int (PK)
  - acc_id_int: int
  - amt_int: int
  - ref_code_string: varchar(64) natural key
  - desc_string: varchar(128)
  - entered_dt: datetime
  - changed_dt: datetime
In another table managed by a different group we had:
Account:
  - ID: int (PK)
  - Balance: int
  - UserID: int (FK)
  - EnteredDateTime: datetime
  - ChangedDateTime: datetime
I kid you not. Two different conventions, both relatively unreadable as Ruby attributes as they both violate popular coding conventions in some way. In ActiveRecord you would be left doing non-trivial coding to hide the ugly column names in the model and map Ruby-friendly names to the appropriate column. In DataMapper this is not so. Borrowing from the Data Mapper design pattern's primary purpose - to explicitly map model attributes to database columns to decouple the two - we can write something like the following using DM:

class Order
  include DataMapper::Resource
  set_table_name("ordersTable") #ridiculous name really, but what can you do if other older applications rely on this already?

  has(1, :account, :child_key => 'acc_id_int', :repository => repository(:accounts))
  property(:id, Integer, :serial => true, :field => 'uid_int')
  property(:amount, Integer, :field => 'amt_int')
  property(:reference, String, :field => 'ref_code_string', :key => true)
  property(:description, String, :field => 'desc_string')
  property(:created_at, DateTime, :field => 'entered_dt')
  property(:updated_at, DateTime, :field => 'changed_dt')
end

class Account
  include DataMapper::Resource
  set_table_name('Account')

  property(:id, Integer, :serial => true, :field => 'ID')
  property(:balance, Integer, :field => 'Balance')
  property(:created_at, DateTime, :field => 'EnteredDateTime')
  property(:updated_at, DateTime, :field => 'ChangedDateTime')
end

This way we can write readable Ruby code with the model. Next time we can look at lazy loading: how to switch it on and off and when to use it for greater effect.
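
To make the idea of explicit attribute-to-column mapping concrete, here is a minimal plain-Ruby sketch. It is not DataMapper's implementation and does not use its API; the ORDER_FIELDS table and the two helper methods are hypothetical names introduced only for illustration, and the column names are taken from the ordersTable example above.

```ruby
# Hypothetical mapping table: Ruby-friendly attribute => legacy column name.
ORDER_FIELDS = {
  id: "uid_int",
  amount: "amt_int",
  reference: "ref_code_string",
  description: "desc_string",
  created_at: "entered_dt",
  updated_at: "changed_dt"
}.freeze

# Translate a raw database row (keyed by column name) into a hash keyed by
# the friendly attribute names. Columns missing from the row are skipped.
def row_to_attributes(row, fields)
  fields.each_with_object({}) do |(attribute, column), attributes|
    attributes[attribute] = row[column] if row.key?(column)
  end
end

# Translate friendly attributes back into legacy column names, failing
# loudly on any attribute that has no mapping.
def attributes_to_row(attributes, fields)
  attributes.each_with_object({}) do |(attribute, value), row|
    column = fields.fetch(attribute) do
      raise ArgumentError, "unmapped attribute: #{attribute}"
    end
    row[column] = value
  end
end

legacy_row = { "uid_int" => 42, "amt_int" => 1999, "ref_code_string" => "ORD-42" }
order = row_to_attributes(legacy_row, ORDER_FIELDS)
```

The rest of the application only ever sees names like :amount and :reference; the DBA's naming conventions stay confined to the one mapping table.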

DataMapper does have migrations

Written on 9:27:00 AM by S. Potter

I wanted to bust a myth out there that DataMapper does not have regular migrations, just the auto migrations. At least in post-0.9 versions, DataMapper has both. In other blogs about the topic I found quotes like: "DataMapper migrations pull your database structure directly from the Ruby code for your models, so there’s no need to write separate migration files...". This, in my opinion, gives the wrong impression to those that are worried about non-trivial schema migrations. The author of that quote appears never to have split one column into several or merged several into one (a non-trivial schema change), among other schema changes where a DataMapper auto migration would not work in a production environment. In Merb you can create a new DM migration with the following:
$ merb-gen migration name_of_your_migration
(Assuming you have set the use_orm setting to :datamapper in config/init.rb.) A simple DataMapper migration might look something like:


migration 1, :create_orders  do
  up do
    create_table(:orders) do
      column(:id, Integer, :serial => true)
      column(:amount, Integer)
      column(:completed, Boolean)
      column(:description, String, :size => 255)
      column(:created_at, DateTime)
      column(:updated_at, DateTime)
    end
  end

  down do
    drop_table(:orders)
  end
end

If you have problems with Boolean not being recognized, throw in the line: include DataMapper::Types at the top of the migration (yes, this needs some finessing obviously). Hope that helps a few people that may have been inadvertently misled by some blog posts. Migrations are needed for many cases I come across where auto migrations just will not cut it without losing data in production (not a good thing if data matters in your application:)). BTW A minor annoyance of mine in Merb is that for DM migrations you need to use this ugly Rake task: dm:db:migrate to migrate to the latest version. To roll back migrations you need to use this ugly thing: dm:db:migrate:down[version], whereas those using AR in Merb just keep using db:migrate and db:rollback. This is an area that needs cleaning up to promote ORM agnosticism the way Merb is supposed to do.

Next Generation Syndication and Publishing

Written on 8:34:00 AM by S. Potter

For the last year or so, I have been focusing on implementing RSS, Atom publishing and related standards for a particular client in the podcasting industry. Both RSS and (to a lesser extent, due to its standard being finalized much later) Atom are being heavily used not just for blogs, podcasts, videos and multimedia content, but also for web 2.0 (or social media) activity "watching". Most people should be familiar with FriendFeed by now (if not, where have you been? really!). This month (August 2008) FriendFeed came out with their Simple Update Protocol (SUP), which sits on top of RSS/Atom publishing to make it easier for content consumers (very self-serving for FriendFeed, but at least they are coming up with some kind of "solution" instead of just bitching about the status quo) to identify feeds that have changed, rather than making a timed request for every feed every X minutes regardless of whether the feed has changed or not. In a nutshell, SUP requires publishers to expose a meta JSON feed that tells consumers which feeds have recently been updated. The first time an RSS/Atom feed is subscribed to, the consuming agent (e.g. FriendFeed) will extract the SUP-ID and the SUP feed URL from the xpath:channel/link element and associate this information with the relevant user/account or whatever. Then, using the feed identifiers returned by the meta JSON feed, the consumer can look up the URLs associated with those IDs and knows which feeds it needs to fetch again. While this might solve some problems, it really does feel very much like a monkey patch. On the flip side I do understand to some extent why they took this path of monkey patching as opposed to coming up with something more revolutionary. Revolutionaries tend to get killed and only the second wave settlers reap the benefits of their blood, sweat and tears.
However, I did want to mention SixApart's efforts to broadcast frequently updating consumable data in what should be a much more obvious way for former enterprise MOM developers like myself. The Six Apart Update Stream broadcasts new public updates/posts over HTTP as they happen. It's an interesting idea and much closer to the XMPP ideal that companies like Twitter have attempted. Do any other former enterprise architects see this new trend in web [de facto] standards for publishing and consuming data as just another message oriented middleware (MOM)? I am hopeful that the web has finally found a MOM in XMPP that can stand the test of public scrutiny.
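
The consumer side of the SUP flow described above can be sketched in a few lines of Ruby. This is an illustration under assumptions, not FriendFeed's code: the SupSubscriptions class, the sample SUP-IDs and feed URLs are made up, and the JSON shape (an "updates" array of [SUP-ID, update-id] pairs) is my reading of the protocol and should be checked against the published SUP documentation.

```ruby
require "json"

# Assumed shape of a SUP meta feed; the IDs and values are invented.
SAMPLE_SUP_JSON = <<~JSON
  {
    "period": 60,
    "updates": [["4496672d", "1"], ["0a3cc5e1", "7"]]
  }
JSON

# Hypothetical consumer-side registry of SUP-IDs and the feeds behind them.
class SupSubscriptions
  def initialize
    @feeds_by_sup_id = Hash.new { |hash, key| hash[key] = [] }
  end

  # Called once, when a feed is first subscribed to and its SUP-ID has
  # been extracted from the feed's link element.
  def register(sup_id, feed_url)
    @feeds_by_sup_id[sup_id] << feed_url
  end

  # Given the body of the meta JSON feed, return only the subscribed feed
  # URLs that the publisher reports as updated.
  def feeds_to_refresh(sup_json)
    updates = JSON.parse(sup_json).fetch("updates", [])
    updates.flat_map { |sup_id, _update_id| @feeds_by_sup_id.fetch(sup_id, []) }.uniq
  end
end

subscriptions = SupSubscriptions.new
subscriptions.register("4496672d", "http://example.com/alice.atom")
subscriptions.register("deadbeef", "http://example.com/bob.atom")
stale_feeds = subscriptions.feeds_to_refresh(SAMPLE_SUP_JSON)
```

Only the feeds named in the meta feed are fetched again; everything else is left alone until its SUP-ID shows up, which is the whole saving over blind polling.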

NEVER use or trust Paypal

Written on 5:20:00 PM by S. Potter

For those that have been screwed by Paypal and need to find another venue to conduct business here are some alternative services:

For *some* of the many issues surrounding Paypal/Ebay policies and rules you can check out the following articles: I have recently been royally screwed by Paypal. However, I have been advised not to comment on specifics until I play all my cards:)

LA Times Travel uses Twitter4R

Written on 6:51:00 PM by S. Potter

LA Times Travel is now using Twitter4R (Ruby library for the Twitter API). You can see on their Twitter timeline that they sent messages via Twitter4R for a few hours today (2008-07-22) before changing the Twitter4R configuration so that it no longer shows which library/application they are using (or they are just waiting to get Alex to approve their application source code). Please let me know if you see Twitter4R being used by other well known or cutting edge applications.

Serendipity: An original Pradipta's Rolodex member

Written on 8:44:00 AM by S. Potter

Last night I got the Pradipta's Rolodex message. Rock on Pradipta!:) The Few, The Proud, The Pradipta 416

Twitter4R Development Releases

Written on 10:04:00 PM by S. Potter

Over the last week or so I have been putting Twitter4R sources in GitHub and getting the on-the-fly GemSpec to work "safely" enough for GitHub servers. Tonight (June 30) I decided I would announce Twitter4R development micro-releases available only via the GitHub gem server. To set up your environment to install the latest development release of Twitter4R, the most complete Ruby client bindings for the Twitter.com REST API, please read HowTo: Install GitHub Development Releases. Official documentation will still exist through the official Rubyforge project site of Twitter4R, but any documentation related to current development tasks, features and bugfixes will now live on the Official Twitter4R GitHub Wiki. If you want to keep an eye on the Git repository, feel free to watch the mbbx6spp/twitter4r GitHub repository.

Rails rocks, X sucks: How Provincial

Written on 2:12:00 PM by S. Potter

I seem to be coming across more and more [Steve] Jobs-clones or simply no-thinking Apple fan boys lately. I don't mind the Apple fans that thought for themselves about why they love their stylish new Macbook Pro, I have just had enough of the no-thinking variety of Apple zealots. This part of the Rails community appears to be a large population unfortunately. Because these fan boys are actually incapable of thinking on their own, they adopt other people's arguments and say things like: "Rails rules, X sucks because so-and-so said this...". Hmmmm. Need I say more? On the other side I recently came across a Django fan boy who tried to convince his non-technical manager that Rails "cannot scale period". Hmmmm. How original. A blanket statement like this is not only misleading, it is actually in many ways technically incorrect. This is why in middle school we learned how to qualify our arguments. Why is it that supposed college graduates cannot do this in their 20s and 30s? It is one thing to strip out technical jargon for non-technical managers to understand a situation, but it is quite another to oversimplify and mislead them because you have your own tech-religion agenda. Be passionate, but admit to yourself and others when passion is getting the better of you. Why is acknowledging your chosen solution's weakness bad? Doesn't it actually make the solution implementation stronger if you have thought about scenario-based weaknesses? How about we each try to *qualify* our positions on technology solutions out there, rather than figuratively urinate all over others without any basic respect for them. I encourage criticism, but criticize constructively. Say why. On the flip side, we should also accept constructive criticism. The first step to constructive criticism need not necessarily be candy coating your critique, but having the right intentions behind your critique.
For example, if your motivation for giving criticism is to start a flame war (sorry that is probably a very out of date term now), then you will probably get what you want. However, if you really want to help a project get better because you care, then that is the first step. In addition, the way you word and tone statements (as Twitter's Alex Payne should have learned all too well by now) has a big impact. For example, Alex Payne's statements about Rails and Ruby cost me a contract at a former Rails shop that had a conniving PHP zealot waiting in the wings to jump on anything and take issue with Rails to upper management. Lucky for him, it worked and he had 3 to 4 months of leading a team to miserable failure using Drupal as a "platform". Not necessarily a reflection of PHP (or Drupal for CMS applications since it was far from a CMS application), but a reflection of this calculating pseudo-coder's inability to lead a project. A "technical" manager who thought the kernel version on his Feisty Ubuntu laptop was 7.0.4. Enough said!:) Overall, I hope we can start to have intelligent conversations about software stacks rather than "yours sucks ass". Because the latter in the end favors nobody. PS Merb rulez and Rails suckz!:) If you want to know why read my teaser entry titled Why Merb is delicious, though I plan on writing more soon. PPS Merb wouldn't be where it is today without Rails.

Why Merb is delicious

Written on 8:49:00 AM by S. Potter

Below are the reasons why Merb is more delicious than other MVC web frameworks I have worked with (including Rails):

  • On a diet...It is lighter weight than most, but still has enough oomph to implement common features with minimal code
  • Religion-free...It is ORM, JavaScript/AJAX and testing/speccing library agnostic.
  • The sum of its parts...OK that was a lame segue, but parts is a mightily useful feature where a plugin is just total overkill. Just because Java calls something a "component" does not make it the general definition of a component in concept. And now look at the Rails plugin debacle today.
  • Precious gems...Plugins are created, distributed and installed in the form of beautiful RubyGems as opposed to the Rails plugin catastrophe with Gigabytes of duplicate disk space and lame version/revision control
  • It's exceptional...Merb actually thought about how to do exception handling in controllers before Rails came up with something and also came up with IMHO a better solution.
  • Speedy Gonzalez...Compared especially to Rails, Merb performs very nicely indeed.
  • Loose threads...Merb is thread-safe unlike Rails, so that means one process can handle multiple concurrent file uploads, where Rails cannot right now.
Altogether I think most people will find Merb to be a much better thought out (conceptually) MVC framework than Rails. Over the next few weeks I'll be talking a lot about Merb and its delicious APIs, Plugins, etc. as it nears its 1.0 release.

Comedy of Errors

Written on 10:23:00 AM by S. Potter

Does anyone else find it comical that the Chicago Java Users Group website is written in Rails? And apparently not very well if it is not only sending an HTTP 500 error, but still using the default Rails generated 500.html page. Poor all around. An IM buddy mentioned it might be JRuby, but I do think it is still very ironic considering the hostility from the Java world regarding Rails in general (whether Ruby on Rails or JRuby on Rails). I just had to share!:) Update: Someone fixed it shortly after I reported it to CJUG.
Update: Actually it was only temporarily fixed, it is back to the default 500 Rails error page.
Update: In addition they have exposed their SVN information. See the public entries and from that information you can find their non-password protected source code and check it out if you like. If anyone from CJUG is reading, you can fix this by reading my blog entry from two months back called Preventing Information Leaks, Part 3. Although it might be useful to read part 1 and part 2 as well. Cheers! PS I also sent an email to info at cjug.org and received an email delivery error.
Update: I may have read too much into the SVN repository information exposure in this specific case since this is an open source Rails application project (I only just realized), thus the non-password protected repository, but I still stand by the rest!:) It is also not good practice to expose such information from the outset of deploying a web application.
Update (1.5 hours after trying to notify JUG and posting the blog post): The site has been fixed now for over 5 minutes, so good job fixing it CJUG!:)
Update (8 minutes later): Ooooops! It is back to the default Rails 500 error page. Perhaps it is 2 out of three bad FastCGI processes that need to be killed on the server? HTH someone at CJUG, but since I can't look at the system I can't say for sure.

Coding with Explicit Intentions

Written on 10:34:00 AM by S. Potter

Recently I have had the [mis]fortune to be working with PHP for integration purposes for a client. A partner of my client's (in the advertising space) sent us some PHP code to add to our servers for resolving the content type of a file based on its extension. The line of code in the PHP script they sent over that was supposed to determine the file extension was:

  $ext = substr($file,-4);
As you can see they are making a pretty big assumption - that the extension is exactly three letters long. The media files that this script *may* support in the future are: Real Media (.rm), MPEG (.mpeg), Quicktime (.qt), etc. These extensions are not exactly three letters long. Why would someone want to write code that is quite likely to fail and also not communicate more explicitly their intentions? This partner already doesn't support PHP < 4.0.3, so why not substitute the line above with:
  $ext = ".".pathinfo($file, PATHINFO_EXTENSION);
It is standard, will never fail (unless there is a defect in the PHP version you are using for the standard function pathinfo, but what can you do about that?) and communicates the explicit intent of the original author, thus improving code maintainability. Anyway, just a pet peeve. This is not by any means the only example I see day in, day out; it just happened to be a great one for demonstrating my point. While this is only a minor one-line example, if developers introduce even 2-3 lines a day whose intent is not explicit and which do not always do what you want them to do, then the codebase quickly becomes a minefield.
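
The same pitfall is easy to reproduce outside PHP. The following Ruby sketch contrasts the "last four characters" guess with a call that states the intent directly; the two method names and the sample filenames are made up for illustration, while File.extname is part of Ruby's core library.

```ruby
# Guesswork: silently assumes a dot followed by exactly three characters.
def extension_by_slice(filename)
  filename[-4, 4]
end

# Explicit intent: ask the standard library for the extension.
def extension_by_intent(filename)
  File.extname(filename)
end

samples = ["show.mp3", "episode.rm", "trailer.mpeg", "clip.qt"]
comparison = samples.map do |name|
  [name, extension_by_slice(name), extension_by_intent(name)]
end
```

For "show.mp3" both approaches agree, which is exactly why the slicing version survives code review; for the other three names the slice returns fragments such as "e.rm" or "mpeg" that will never match a content type table.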

Waking Up From Hibernation

Written on 9:49:00 PM by S. Potter

Just a quick post to say that I will be waking up from OSS and blog hibernation at the beginning of May. I have a stack of bug fixes to apply to Twitter4R for a pretty large 0.3.1 release (in terms of bug fixes and supporting Rails 2.x compatibility). The next two major features on my Twitter4R v0.4.0 release radar are: (a) supporting Ruby 1.9 and (b) adding the newish "track" feature in some capacity (probably via XMPP instead of the REST API on Twitter.com). I also have a couple of minor features to add to the metafusion-crypto gem and finally release the first version of metafusion-rails. Then I will probably abandon OSS work specific to Rails and write numerous Merb plugins/extensions. The catch-up work will likely take 2-3 weeks into May, and then hopefully my new venture workload will be small enough to sustain OSS development over the summer months. Thanks for all your patience!

Preventing Information Leaks, Part 3

Written on 10:44:00 AM by S. Potter

In the third part of the how to prevent information leaks blog post series, we will look for unguarded "hidden" files from which we can glean quite a bit of information. Make sure you check out part 1 and part 2 on how to prevent information leaks in web applications before continuing.

Seek Unguarded Hidden Files

For web applications that use deployment tools that pull code from code repositories, you can find out quite a bit of information about the code repository (e.g. host, path, usernames, file listings), by finding unguarded hidden files. One obvious example especially in the SVN saturated Rails world is the path /.svn/entries and similar URLs. You will want to make sure the following URIs are not accessible on your public sites if you use Subversion (SVN):
  • /.svn/entries
  • /javascripts/.svn/entries
  • /stylesheets/.svn/entries
  • /images/.svn/entries
If you are using a superior SCM for your project like Git you ought to be looking for the following:
  • /.git/config
  • /javascripts/.git/config
  • /stylesheets/.git/config
  • /images/.git/config
If your project is still in the stone-age using CVS, then check the following:
  • /CVS/Entries
  • /javascripts/CVS/Entries
  • /stylesheets/CVS/Entries
  • /images/CVS/Entries
Adapt these patterns for your application framework's default structures. The above is for Rails or Rails-based frameworks (e.g. Merb). When I first tried this last week on the Rails top 10 sites, three out of ten of these sites exposed their SVN information (and last week I had only checked SVN, no Git or CVS URIs). Since last week at least one of those sites has fixed the problem! What information is exposed in these files? Quite a lot. Do you want potential hackers knowing:
  • code repository URL (host, port, protocol, path)
  • usernames of committers
  • code repository listings of files in a directory
None of these are things you really want a potential hacker knowing. On my travels investigating the top 100 Rails sites I found one top 20 site that exposed enough information in its SVN entries file that I found a prototyping directory that included a "specs.rtf" document, which I was able to download. The document was not very interesting (at least not to me), but if they had written more in-depth specifications in the document, it may have served as a nice guide for a potential hacker to garner enough information about their setup to take over! Moral of this story? Guard hidden files. I recommend "forbidding" access to all troublesome path regex patterns. For example, one of the LigHTTPd servers I administer has the following at the top of the configuration file:

$HTTP["url"] =~ ".*/\..*" {
  url.access-deny = ("")
}
For Apache httpd servers I use something like the following in the httpd.conf:

<DirectoryMatch "^.*/\..*">
  ErrorDocument 403 /403.html
  Order allow,deny
  Deny from all
  Satisfy All
</DirectoryMatch>
Nginx configuration might look something like:

location ~ /\. {
  deny    all;
}
This guards against URLs like /.svn/entries and /.git/config as well as /.htpasswd, etc. This means that some script-kiddie hackers may mistakenly think you are using one SCM system instead of another, but they will never be able to know for sure which SCM you use or learn anything about your code repository. Forbidding access to all paths containing "/." somewhere in the URL is generally a good idea (IMHO), as sometimes people change web servers and leave "secret" files in the directory structure, but the new server doesn't know that, for example, .htaccess or .htpasswd is "special", so it will just serve it without thinking.
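To audit your own site against the URI lists above, it can help to generate the candidate paths rather than type them by hand. The following is a minimal sketch, not a finished tool: the constant and method names are hypothetical, the directory list assumes a Rails-style public layout, and actually requesting each path against your own server is left out.

```ruby
# Hypothetical helper names; adapt the directory list to your framework.
SCM_MARKERS = {
  svn: ".svn/entries",
  git: ".git/config",
  cvs: "CVS/Entries"
}.freeze

# Rails-style public directories ("" stands for the site root).
PUBLIC_DIRS = ["", "javascripts", "stylesheets", "images"].freeze

# Builds every directory/marker combination as an absolute URL path.
def candidate_paths(markers = SCM_MARKERS.values, dirs = PUBLIC_DIRS)
  dirs.flat_map do |dir|
    markers.map { |marker| "/" + [dir, marker].reject(&:empty?).join("/") }
  end
end

# True when the path contains "/.", i.e. the pattern the deny rules forbid.
def dot_path?(path)
  !!(path =~ %r{/\.})
end

candidate_paths.each do |path|
  puts "#{path} covered by the dot rule: #{dot_path?(path)}"
end
```

Running this shows that the CVS paths are not covered by a "/." rule, since CVS directories do not start with a dot, so they need a deny rule of their own.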

Preventing Information Leaks, Part 2

Written on 1:04:00 PM by S. Potter

Continuing the series that looks at how to prevent information leakage, today we look at the information leakage from web server HTTP headers.

HTTP Header Leakage

Let's look at the information potential hackers can get from HTTP headers from just a GET / HTTP request. Let us look at http://www.cnn.com:
$ curl -I http://www.cnn.com
HTTP/1.1 200 OK
Date: Thu, 07 Feb 2008 15:22:32 GMT
Server: Apache
Accept-Ranges: bytes
Cache-Control: max-age=60, private
Expires: Thu, 07 Feb 2008 15:23:23 GMT
Vary: Accept-Encoding,User-Agent
Content-Type: text/html
X-Pad: avoid browser bug
Content-Length: 90458
This example is pretty decent. What you want to look for is the Server HTTP header value. In this case it is just "Apache". Now it does identify the web server used, but it doesn't pinpoint the version being used. Now I am going to try a popular Rails website:

HTTP/1.1 200 OK
Date: Thu, 07 Feb 2008 15:25:34 GMT
Server: Apache/2.2.2 (FreeBSD) mod_ssl/2.2.2 OpenSSL/0.9.8b DAV/2 PHP/5.1.4 SVN/1.3.2 mod_vd/2.0 mod_fastcgi/2.4.2 proxy_html/2.5
Last-Modified: Thu, 07 Feb 2008 14:58:58 GMT
ETag: "4da437-36ac-b69b1080"
Accept-Ranges: bytes
Content-Length: 13996
Vary: Accept-Encoding
Content-Type: text/html
In this case we can see what OS Apache is running on and the version of Apache. Not only this, but we see all the enabled modules in Apache and their respective versions. IMHO this is too much information, especially considering this site is supposedly (at least as far as I know) hosted in a fully controlled environment (either dedicated or VPS with root access). Applications on shared hosts cannot help this much without assistance from the shared hosting company. I have a couple of very small sites that get little traffic on a shared host, so I appreciate this obstacle. Moral of this story: If you have full control over your environment you should always either change the "Server" HTTP header to something generic (e.g. "Apache" as in the CNN example) or disable it from being returned to the client. This setting is very easy to change in Apache, LigHTTPd and Nginx. I assume this wouldn't be difficult in LiteSpeed either, but I do not have configuration experience with LiteSpeed. Apache configuration:
# ServerTokens Prod reduces the header to just "Apache"; Header unset Server is not honored for this header
ServerTokens Prod
LigHTTPd configuration:
server.tag = ""
Nginx configuration:
server_tokens off;
Also make sure there aren't any other headers that give away too much information. Especially look at the X- HTTP headers.
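A quick way to judge a Server header value is to look for product tokens that carry a version number. The sketch below is an assumption-laden illustration (the method name is hypothetical, and the "name/version" token shape is simply what the two example responses above show); feed it the header value however you obtain it, for example from curl -I output.

```ruby
# Returns the product tokens in a Server header value that carry a version,
# e.g. "Apache/2.2.2". Parenthesized comments such as "(FreeBSD)" are skipped.
def versioned_tokens(server_header)
  server_header.to_s.scan(%r{[^\s()/]+/\d[^\s()]*})
end

generic = "Apache"
leaky   = "Apache/2.2.2 (FreeBSD) mod_ssl/2.2.2 OpenSSL/0.9.8b DAV/2 PHP/5.1.4"

puts "generic header leaks: #{versioned_tokens(generic).inspect}"
puts "leaky header leaks:   #{versioned_tokens(leaky).inspect}"
```

An empty result is what you want to see for your own servers; anything else tells a visitor exactly which versions to look up.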

DataMapper vs. ActiveRecord

Written on 12:17:00 PM by S. Potter

Today I came across a thread on ruby-talk about the age old question of which Ruby ORM to use (specifically DM vs. AR). I responded based on my production use of AR, development use of DM and my understanding of the patterns of the same names from Fowler's PoEAA, which I have applied in Java and Python countless times. As some readers of this blog might find this discussion interesting, I am linking to my response to the DataMapper vs. ActiveRecord discussion here. The executive summary (for the infinitely lazy) is:

It appears that if DataMapper (the Ruby library) is able to hide enough database logic in more complex business logic scenarios, then the DM library might be more beneficial when working with a legacy database schema where you are not able to create primarily isomorphic relationships between class attributes and table columns, whereas AR would be a slam dunk and simpler to use in isomorphic schema scenarios where you have control over the database schema.
I will be returning to the Preventing Information Leaks series on Monday/Tuesday (it is already written, I just need to publish).

Preventing Information Leaks, Part 1

Written on 10:04:00 AM by S. Potter

Before I start I want to mention that the techniques suggested in this blog post are for readers to use to secure their own web applications. These techniques do not let anyone take over a server without further hacking, but they do reveal a lot of information about the environment the web application is deployed to or developed in, unless system administrators/engineers or software engineers prevent this information leakage. Some of these techniques exploit settings that are usually found in Rails when default settings are used, but web applications written in different frameworks may also use settings that leak information. This all just comes down to thinking through the security scenarios. Just like developers need to consider the usage of their applications or APIs or frameworks from the user perspective, those tasked with securing their applications need to consider how potential hackers might try to access key information to help them hack your systems.

Identifying the framework or "stack"

Let us check if your application smells too much like you are using a particular framework or another. This may or may not be terrible, BUT if you can tell which version of the framework you are using from publicly accessible information on your site, then you are leaking too much information. In fact, some firms/developers advertise very publicly which frameworks or stacks they use, but they usually try not to talk too much about the exact versions they use. For example, if an application responds with a 200 OK HTTP status for the majority of the following URLs, it is VERY likely it is a Rails application:
  • /javascripts/application.js
  • /404.html
  • /500.html
  • /422.html
In the case of the last URL, we can identify that the application is written using Rails 2.0. We can also look at the 404 to see if it looks like the default looking 404 file for different versions of Rails. If /422.html doesn't exist we might still be able to tell the difference between Rails 1.1 and 1.2 applications by what /404.html looks like. Moral of this story: At a minimum always change the default error pages in your Rails application to fit your design. Also consider changing the URLs of the error pages too. Not only does the redesign prevent potential hackers from identifying which version of Rails you might be using it also looks more professional and makes the user experience in case of an error a little more acceptable! If you are not using Rails, think about what files are created by default by your framework and change things around a little to prevent information leakage. I will continue this series in subsequent parts. Stay tuned!
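The check described above can be sketched as a small function you run against your own application. Everything named here is hypothetical, and the "more than half" threshold is just one reading of "the majority of the following URLs"; the statuses hash is assumed to have been collected by whatever HTTP client you prefer.

```ruby
# Default Rails files named in the list above.
RAILS_DEFAULT_PATHS = [
  "/javascripts/application.js",
  "/404.html",
  "/500.html",
  "/422.html"
].freeze

# statuses: Hash of path => HTTP status code gathered from your own site.
def rails_fingerprint(statuses)
  hits = RAILS_DEFAULT_PATHS.select { |path| statuses[path] == 200 }
  {
    likely_rails: hits.size > RAILS_DEFAULT_PATHS.size / 2.0,
    hints_rails_2: hits.include?("/422.html")
  }
end

sample = {
  "/javascripts/application.js" => 200,
  "/404.html" => 200,
  "/500.html" => 200,
  "/422.html" => 404
}
puts rails_fingerprint(sample).inspect
```

If both flags come back false for your site, the default files are not giving the framework away; comparing the content of the error pages against the stock ones would be the next step.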

Stupid Easy: 7 Deadly [Rails] Sins, Part 3

Written on 9:23:00 AM by S. Potter

This is the third and final part in the Stupid Easy: 7 Deadly [Rails] Sins series. See part 1 and part 2 if you haven't already. I have left the most heinous sin to the end. If you are a serious sinner in this respect, there is little hope for you unless you take steps now to rectify your behavior.

Stupid Easy Ruby

When in Rome, do as the Romans do
That's right. This is the sin of not knowing how to really code in Ruby. Sure, some coders that move to Ruby can write code that does what they want it to do, but often at the severe expense of compromising project code maintainability (e.g. making things very unDRY, etc.) or even introducing significant performance issues by not understanding basics about the Ruby language. I see people using Java, Perl, PHP, C# or even Python idioms in Ruby instead of understanding the essence of the Ruby language, which is distinct from each of these languages in different ways. The same is true for any language/environment you work within. I know former PHP people that moved to Java and do not have the first clue about basic Java idioms that work well. This isn't just to pick on PHP heads as I imagine this occurs on any language migration path. I have simply seen this sin committed most by former PHP coders moving toward Java, Ruby and Python. However, I will say I see a LOT of former Java developers (massive static language proponents - Joshua Bloch you are a nonobjective snob) that do not understand or appreciate the non-static design mentality; they create large class hierarchies and don't understand what a Mixin is (or they pretend to be "cool" and talk a lot about these idioms, but don't really utilize them in the right way themselves). The Ruby mindset is still different from its more similar looking cousins, Python and Perl, but moving between Python, Perl and Ruby (in any direction) feels more intuitive (at least in my mind). Now I should stress that I can no longer code in Java without pulling my hair out because the thought of creating 5 interfaces and a factory for every three class implementations drives me temporarily insane. 
The SPI design principle allows Java to be fairly flexible (especially for a statically typed language), but at the expense of my personal sanity now that I prefer to think the non-static way (yes, I still appreciate Python's strongly-typed ways). This is why I prefer to use JRuby, Jython or Groovy scripting in Java environments. There is no magic bullet to stop sinning in this regard if you aren't willing to take a journey to learn and understand Ruby idioms (not just basic *knowledge*). There are no shortcuts; it simply depends on how easily your brain can shift gears. This doesn't mean people who can't shift gears are "bad" developers, just that they are not well suited to migrating to significantly different environments very often. Just remember the rules in Ruby aren't the same. Think in terms of sending messages to objects that may or may not respond to the message you send, instead of expected interfaces, and you are half way there if you are coming from Java, C++ and similar languages. Think of Java as a big government structure with a complicated tax code and Ruby as minimal government with a simplified tax code without loopholes accessible only to the rich! That means in the Ruby world wild things such as adult services can legally be procured between consenting adults, so bend your brain to think that way when working in Ruby...almost anything is possible, especially in the realm of metaprogramming, that is not possible in the Java world.
That government is best which governs least.
--Thomas Paine
I personally happen to agree with Thomas Paine's statement (at least in the context of programming), but there are various programming philosophies that different languages cater for, so it comes down to your own preference. Pick the ones you can live with and think in terms of at the time, and you will be happier for it. Obviously in a programming context some people (e.g. Joshua Bloch, who is still a static language snob) will oppose legal procurement of "adult services". They have a moral objection to these activities and thus do not appreciate the beauty of dynamic design. If you have such moral reservations in programming, do everyone a favor and stay in the Java world or at least static languages! This is the beauty of different opinions. Nobody is right in absolute terms, but you need to make the right choice for yourself. It is the zealots who only accept "absolute correctness" that we should be wary of (on any side of the argument, but including Joshua Bloch). My general philosophy when I had the Java Way ingrained in my brain was that as an API designer I had to protect stupid developers from their own stupidity. Now my philosophy in the Ruby world is that I write APIs for smart developers and if you aren't [Ruby] smart, use at your own risk. Of course, I oppose complication for the sake of appearing to be "smart". But I try to write APIs that allow more advanced Ruby developers to take advantage of language features such as blocks, metaprogramming, etc. as opposed to providing an over-simplified, dumbed-down API that ends up looking ugly to the wiser Ruby developer due to being less DRY than might be possible without restriction.
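The "sending messages" mindset and the Mixin idea mentioned above can be shown in a few lines. This is a minimal sketch with made-up class and module names, not code from any project discussed here: behavior is shared through a module rather than a class hierarchy, and the caller asks whether the receiver responds to a message instead of demanding an interface.

```ruby
# A mixin: shared behavior without a class hierarchy.
module Greeter
  def greet
    "Hello from #{name}"
  end
end

class Person
  include Greeter
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# A class that knows nothing about greeting.
class Rock; end

# Send the message only if the receiver responds to it.
def try_greet(obj)
  obj.respond_to?(:greet) ? obj.public_send(:greet) : "(no response)"
end

puts try_greet(Person.new("Ann"))
puts try_greet(Rock.new)
```

No interface or factory is declared anywhere; any object that answers greet will do, which is the shift in thinking this section argues for.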

Ruby Language Pointers

If you haven't acquainted yourself with the pickaxe book you should pay special attention to the following sections: Blocks, Mixins, Inspecting Objects, Inheritance & Messages, Inheritance & Mixins, Classes & Objects Interacting, The Ruby Language. Then I would look at the source code of the ActiveSupport open source project (part of Rails) for techniques that are very useful. Some of these techniques I have discussed in previous blog posts: Ruby Idiom: Reopening Classes, Ruby Idiom: forwardables, Ruby Idiom: Inject-ing Understanding, Ruby Equality, Ruby Higher Order Messaging.

Stupid Easy: 7 Deadly [Rails] Sins, Part 2

Written on 9:43:00 AM by S. Potter

In the second part of Stupid Easy: 7 Deadly [Rails] Sins, we look at sins four through six and save the most heinous sin for last in the third part of this series.

4. Stupid Easy Views

Everyone has seen spaghetti view code, either in JSP, PHP, or other similar web templating environments. So why do people honestly think (ok, only the no-thinkers) that when they use Rails they can't violate MVC because it is built around the MVC "pattern"? Don't even get me started on the whole "pattern" thing... If you see a piece of view code like the following, then you are violating MVC. Yes, even when using Rails:

<% 
if current_user && current_user.admin?
  @somevar = something goes here
  # and do something else here
else
  @somevar = something else goes here
end
%>
This is a small (but very stupid) violation, but a violation nonetheless. I have seen not just logic in view code, but oddly model-specific code (in one case I even saw some attributes of a model being updated at the END of the view template - go figure). Now simply checking the admin flag of the current_user to insert some view specific code would not be violating MVC. There are various ways of making sure you don't make this mistake. Some people think using a specialized templating language in place of eRB is the way to go (one of the many potential replacements that I happen to know about is Haml). These specialized template languages basically make it virtually impossible to embed any kind of logic into the view code. Others (pro-thinkers, rather than no-thinkers) don't believe that level of enforcement is necessary and just make a point of highlighting to the team (and then quickly refactoring) any violations in view code to educate all the team members about this sin. I have even seen some Rake tasks written (ok, I may have written a couple) that flag any views that have specific keywords within the <% %> eRB brackets as potential violators so that we can look through the suspects and make judgments before a release is made (because you never want to release code that violates basic principles you believe in). Another strategy is to only work with developers with whom you share a very similar development philosophy and who you know wouldn't violate this basic rule of thumb in the first place. In any case, you need to incorporate one of these strategies so you can be sure to be rid of the Stupid Easy View faux pas in Rails projects.
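The core of such a flagging task could look something like the sketch below. It is not the Rake task mentioned above; the method name and the list of suspicious patterns are assumptions chosen for illustration, and a real task would loop over the files under app/views and print the suspects for a human to judge.

```ruby
# Patterns that suggest controller or model work inside a view (an assumed list).
SUSPECT_PATTERNS = [
  /@\w+\s*=(?![=~])/,                        # instance variable assignment
  /\.(?:save|update_attributes?|destroy)\b/  # model persistence calls
].freeze

# Returns the contents of every eRB tag in the template that matches a suspect pattern.
def suspect_erb_tags(template)
  template.scan(/<%=?(.*?)%>/m).flatten.map(&:strip).select do |code|
    SUSPECT_PATTERNS.any? { |pattern| code =~ pattern }
  end
end

view = '<% @somevar = compute %> <%= @count == 1 ? "one" : "many" %>'
puts suspect_erb_tags(view).inspect
```

The output is only a list of suspects: a comparison such as the second tag is left alone, while the assignment in the first tag is flagged for review.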

5. Stupid Easy Configuration (and Routes)

In Rails 2.0 there is absolutely ZERO excuse for this (with very nice initializer hooks available), but even pre-2.0 it is hard to justify a config/environment.rb consisting of more than 30 lines (including basic Rails configuration minus comments). Separating configuration into separate files that are required by config/environment.rb was my personal preference pre-2.0. On the config/routes.rb side, I would often see the default routes still uncommented. In Rails post-1.2 there is really no excuse, since almost everyone should be using RESTful routes with RESTful controllers anyway (even if the ridiculous amount of code Rails generates for them, when it could be refactored nicely into a mixin or base class, is pretty dumb). In fact, default routes are almost always an invitation for troublemakers to hack into your site (especially if they know it is a Rails site). I can't say I have never deployed a Rails 1.2.x site with default routes uncommented, but I certainly haven't in the last year since Rails 1.2's official release.

6. Stupid Easy Library Code

For some reason numerous popular Rails plugins promote the idea of putting generated files into your RAILS_ROOT/lib directory (e.g. acts_as_authentication, restful_authentication, etc.). Instead of being real plugins they are basically just Rails generators. Now we could debate how generators could be better supported in Rails, but that is really a tangent to the point. Don't be fooled by these popular "plugins" (remember they are really only generators). If you have significant code in your RAILS_ROOT/lib that has the same focus (e.g. authentication, authorization, credit card processing, etc.), then you really ought to create your own plugin (even if it isn't going to be shared with other projects in the foreseeable future). How you and/or your team defines "significant" is a judgment call, but I personally think any unit of code that works together and is greater than 200 lines of elegant Ruby code is significant (minus empty lines and comments of course). It isn't a hard and fast rule, but a general rule of thumb, which may have exceptions. The main purpose of this is twofold from my perspective. When you have a lot of code in your RAILS_ROOT/lib it is harder to be disciplined about organization, whereas with a plugin you know where plugin initialization code goes vs. mixin or other code. Another big benefit (which is related to code organization) is testing/specing. It is much easier to see (or specifically notice) test/spec-coverage shortcomings of plugins than holes in your tests/specs for RAILS_ROOT/lib code. Benefits of creating plugins include that it forces you (the developer) to think more about:
  • Testing/specing your code
  • code organization from a maintainability perspective
  • what chunks of code go together (should this be one or two plugins)
  • the scope of your plugin vs. what needs to be defined or written in your application code
  • who is responsible for developing this functionality, maybe it is the infrastructure/core/framework development team who needs to be responsible for authentication or authorization code rather than application developers. With plugins you can separate out these responsibilities very easily
In addition, a plugin is much more easily shareable across projects (in a more organized and disciplined way than just copying or even linking files/directories into your RAILS_ROOT/lib directory). Of course, I also find the way Rails plugins are supported by Rails a hindrance, especially when RubyGems could provide many facilities to support this in a more elegant way, but that is a topic for another day!:) There is only one Stupid Easy Deadly [Rails] Sin left, which I will publish in the next two weeks.

Stupid Easy: 7 Deadly [Rails] Sins, Part 1

Written on 10:46:00 AM by S. Potter

Everyone has been talking about Rails for a few years now as the framework that makes it "stupid easy" to create web applications. The problem I have with this mentality is it creates a "stupid easy" sub-culture within the Rails community that promotes stupid laziness (as opposed to smart laziness). Over the last 3 Rails contracts, I have worked with supposedly experienced developers that have switched to Rails from OO languages like Java and C++ (as opposed to pretend OO languages like PHP). The results have been VERY disappointing. The biggest problem I see is that some, even quite experienced developers, have the [stupid] lazy mentality burned into their brains. They want to visit blogs and copy and paste code into their applications without thinking about it. In fact, recently I found a ridiculous real world example of such madness. No thought put into the pasted code AT ALL. I could identify the blog posting on a popular Rails blog that had the EXACT same code. The variables weren't EVEN renamed to be meaningful in the current application. Instead a finance web application had references to customers and invoices in the controller and view code when there were no such entities/models involved in the application. Not only is this a maintenance nightmare, it shows how little thought went into the code just to get a small AJAX effect that (a) wasn't difficult to write from scratch (4 lines in total after refactoring from the 11 lines of code used) and (b) was probably not even that necessary from the user experience perspective. These are the developers I think ought to return to their no-thinking PHP or Java/JEE recipe books and leave the Ruby world alone. Of course, the authors of "recipe" books should also bear some of the blame for encouraging the no-thinking hackers out there to join the Ruby world in the first place. Is that really the type of mindset we wish to harbor in our community? 
Look at the PHP and Java universes today for a reference point of where we will end up very soon if we are not careful.

1. Stupid Easy Schemas

One of the deadly sins I saw on my travels through their code was extending the "stupid easy" mentality in the form of not defining the schema properly. Sure they *used* Rails' migrations, but they didn't really create a usable schema based on the user stories and cross-story functional requirements. While you might not want to use database specific features like triggers, stored procedures, or even foreign keys (the latter assumes you are using Rails-based equivalents in place of FKs), you should still make your schema sensible. For example,

# Migration code
  create_table :countries do |t|
    t.column :code, :string
    t.column :name, :string
  end

# Model code
class Country < ActiveRecord::Base
end
In this example we created a table called countries that has two string fields: code, name. In this case the code is the ISO country code that the application is going to use for almost all country lookups in the database. We knew that pretty much from the beginning because of the nature of our application. (Note: this example is a little contrived as I had to change things so as not to expose too much about the application this is from). Not only do we know that the Country model will almost always be looked up by its code (from the initial set of use(r) cases/stories we need to implement), we also know that the ISO code is always 3 characters long and that these ISO codes are unique for each country. All these things we knew before we needed to create the schema. This important information is missing from the schema, and the model is missing the relevant validation. Before committing this new model and migration for Country, the developer writing the code (one of the developers I am thinking of loves to tout how XP/agile and great his code is) should have had the following code on initial checkin (there really is no excuse IMHO):

# migration code
  create_table :countries do |t|
    t.column :code, :string, :limit => 3
    t.column :name, :string # You may even decide to cap the length of the name field too, but this is more of a DBA style thing.
  end
  add_index :countries, :code, :unique => true

# Model code
class Country < ActiveRecord::Base
  validates_length_of :code, :is => 3
  validates_uniqueness_of :code
end
While I was on my soapbox about this point, another developer I respect said this might be premature optimization. In this specific case I totally disagree; however, this is a good point to make in general. You don't want to do premature optimization either. This is something we all need to be aware of and careful about. This is the point where you need to think for yourself. There is no cheat sheet on these types of thinking points. The [stupid] lazy Rails developers should go back to hacking PHP senselessly or following J2EE/JEE blueprint patterns if they find thinking for themselves to develop their own rules of thumb too much work. Remember [smart] lazy is the way to go.

2. Stupid Easy Models

Another cardinal sin I often see, both in client work I have inherited from others and in open source projects, is dumbing down models. Instead of creating member methods on models that relate to *what* they are and do, some Rails coders (usually not very experienced) write this functionality within the controller layer, which leads to drastic controller layer bloat, which is really ugly. Remember if your code is starting to look ugly, do something about it. The way to fix this is *almost* always to create relevant member methods on the corresponding models that this functionality works on. One easy example of this might be to add an authenticate method to the User model instead of having extra logic in the ApplicationController. Think CRC instead of procedural.
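The authenticate example above might look roughly like the following. This is a plain Ruby sketch under loud assumptions: the in-memory REGISTRY stands in for the users table so the example runs without ActiveRecord, all method names other than authenticate are made up, and the salted SHA-256 digest is only a placeholder for whatever password hashing your application really uses.

```ruby
require "digest"
require "securerandom"

class User
  REGISTRY = {} # assumption: stand-in for the users table

  attr_reader :login

  def self.create(login, password)
    REGISTRY[login] = new(login, password)
  end

  # The model knows how to authenticate; controllers just call this.
  def self.authenticate(login, password)
    user = REGISTRY[login]
    user if user && user.password_matches?(password)
  end

  def initialize(login, password)
    @login = login
    @salt = SecureRandom.hex(8)
    @digest = digest_for(password)
  end

  def password_matches?(candidate)
    digest_for(candidate) == @digest
  end

  private

  # Placeholder hashing for the sketch only.
  def digest_for(password)
    Digest::SHA256.hexdigest(@salt + password)
  end
end

User.create("ann", "s3cret")
puts User.authenticate("ann", "s3cret") ? "welcome" : "denied"
```

With the logic on the model, a controller action shrinks to a single call to User.authenticate, and the behavior can be tested without any controller at all.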

3. Stupid Easy Controller Methods

This is perhaps the most obviously harmful (from a security perspective) sin. While traveling through the Stupid Easy Rails universe of code, I almost always see unprotected controller methods that are used as filters or for utility purposes in some capacity. This is a sin that I myself must confess to, though I haven't committed it since the first 3 months of my Rails development around 2005 Q4. The following is typically what I see around:

# in SparksController
  def check_permissions
    return false unless current_user.has_permission?(:MANAGE_USERS)
    true
  end
The problem with this is that we can do the following (unless using resource routes where default routes are taken out):
GET /sparks/check_permissions?some_var=some_value
And perhaps hack into the web application. In this case we may not be able to do anything too interesting, but there are other scenarios in which this could create a very open hole in the web application. The way to fix this and keep your team's sanity from a maintenance perspective is to protect these filter or utility methods by scoping them appropriately with protected or private, which is not difficult at all. However, if you are a true Rubyist you will probably opt to keep your controller utility methods in Acts::As mixins or similar.
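The effect of scoping can be seen without Rails at all. In the sketch below the controller is a plain Ruby class (an assumption made so the example runs standalone); the relevant point is that Rails treats a controller's public methods as routable actions, so a private filter method simply is not one of them.

```ruby
# Plain Ruby stand-in for a controller; no Rails required for the illustration.
class SparksController
  def index
    "listing sparks"
  end

  private

  # Filter/utility method: callable from inside the class only.
  def check_permissions
    true
  end
end

# Only public methods are candidates for being reached from outside.
p SparksController.public_instance_methods(false)

begin
  SparksController.new.check_permissions
rescue NoMethodError => e
  puts "not callable from outside: #{e.class}"
end
```

The filter still works when invoked from within the controller, but it no longer shows up alongside the real actions.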

'Twas the night before launch...

Written on 7:10:00 PM by S. Potter

Oh fun times! On such fateful nights you know you will find something that will screw things up unless immediately addressed. It must be someone's law already, but if it isn't I call dibs on it and it should thus be called "Potter's Law":). On a night before launch that I recently experienced, I had a heart stopping moment. The last 5 days of the sprint I had been churning out last minute changes based on usability feedback from the client. In the process, I had neglected to apply benchmarking and memory leak testing with my usual discipline. On the night before launch I ran my usual benchmarking scripts. The results were pretty good except on one action. We could have lived with that performance, but the real problem came when I ran tests to detect memory leaks in my Rails application. Let the real fun begin! Before continuing, I should mention there is only one thing I hate more than uber visual tasks (editing/creating graphics, layout tweaking, etc.) and that is debugging memory leaks. I ran the ab (Apache Bench) utility on a few different types of pages in the staging environment and didn't notice any problems. The Rails application was consistently teetering under 50M RAM. Excellent! Then I tried the second most requested type of page (this was a static site we were rewriting in Rails to create an easy to use domain-specific CMS - so we had live production web statistics) and my heart skipped a beat or three. Memory usage spiraled out of control. From 49.5M the single mongrel process eventually grew to 250M after a few ab -c2 -n100 .... runs. I first checked out the view since I knew the controller action code was only 6 lines (how much could go wrong there?). I tried removing different parts of the view code and rerunning my tests while monitoring the memory usage of the process. Still no change. So I reluctantly looked in the controller and saw something like the following:


  def show
    @eto = ExchangeTradedOption.find_active(params[:id])

    respond_to do |format|
      format.html # show.rhtml
      format.xml  { render :action => "xml", :layout => false }
      format.csv  { render :action => "csv", :layout => false }
    end
  end
What on earth was in the ExchangeTradedOption.find_active method?

  class << self
    def find_active(id)
      now = Time.now
      find(
        id,
        :include => [:exchange, {:vendor_symbols => [:vendor]}],
        :conditions => [
          %{options.expires_at <= ? AND options.active = 1 AND 
            vendor_symbols.effective_date <= ? AND vendor_symbols.expiration_date >= ?}, 
          now, now, now
        ],
        :order => 'vendor_symbols.effective_date DESC'
      )
    end
  end
Ouch! The problem was that I still needed all vendor symbols and vendors for the view in question and didn't want to make the extra SQL queries for the Vendor on each VendorSymbol, as that would have added 5 to 10 extra SQL queries per action invocation. I guessed the nested includes in the find were the most likely cause of the issue. So I refactored as follows:

# Controller action code: ExchangeTradedOptionsController#show
  def show
    @eto = ExchangeTradedOption.find_active(params[:id])
    @vendor_symbols = VendorSymbol.find_active_for(params[:id])

    respond_to do |format|
      format.html # show.rhtml
      format.xml  { render :action => "xml", :layout => false }
      format.csv  { render :action => "csv", :layout => false }
    end
  end

# Model finder code: ExchangeTradedOption.find_active
  class << self
    def find_active(id)
      now = Time.now
      find(
        id,
        :include => [:exchange],
        :conditions => [%{options.expires_at <= ? AND options.active = 1}, now]
      )
    end
  end

# The VendorSymbol.find_active_for method implementation is left as an exercise for the reader
# but it is very, very simple!

# Changed view to refer to @vendor_symbols array instead of @eto.vendor_symbols
# This also helped reduce indirection and prevented Law of Demeter violations in the view code!
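
For anyone who would rather skip the exercise, here is a minimal sketch of what VendorSymbol.find_active_for might look like. It is not the code from the application. The module name, the active_options_for helper and the option_id foreign key column are all assumptions; the date conditions and ordering are simply carried over from the original nested finder above.

```ruby
# Hypothetical sketch, not the application's real code.
# Assumes vendor_symbols has an option_id foreign key (a guess) plus the
# effective_date / expiration_date columns seen in the original finder.
module ActiveVendorSymbolFinder
  # Builds the options hash for an old-style ActiveRecord find(:all, ...) call.
  def active_options_for(option_id, now = Time.now)
    {
      :include    => [:vendor], # eager load the vendor for each symbol
      :conditions => [
        %{vendor_symbols.option_id = ? AND
          vendor_symbols.effective_date <= ? AND
          vendor_symbols.expiration_date >= ?},
        option_id, now, now
      ],
      :order      => 'vendor_symbols.effective_date DESC'
    }
  end

  # Delegates to the find method of whatever class extends this module.
  def find_active_for(option_id)
    find(:all, active_options_for(option_id))
  end
end
```

In the application the model would pull this in with something like `extend ActiveVendorSymbolFinder` inside `class VendorSymbol < ActiveRecord::Base`. Keeping the options in their own method also makes the conditions easy to check without touching the database.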
Of course, I didn't leave it without verifying! Never just guess that the problem is solved; you should always verify this with tests (either automated or manual).

There were a few solutions, of course. The first is the one I provided above. The second solution I thought of trying was removing the vendor_symbols SQL conditions and sorting in Ruby. This seemed a waste of CPU to me, and it meant extra lines of code that I felt were unnecessary (remember, the less code you write the less you need to maintain!). The third solution was to execute an optimized raw SQL query myself. I also didn't like this option, as the first one presented above seemed cleaner and more maintainable going forward.

The first solution does cause more than one SQL query to be executed each time the action is run, but it is scalable as it remains constant at two queries per invocation. In addition I added one extra instance variable, which is also not ideal, but again this didn't seem to me to be the worst problem at this stage, with only two instance variables in the action being set for the view. The benchmarking results proved my assumptions correct regarding the various solutions.

The moral of the story is to find the best solution based on the context rather than attempting to apply a one-query-per-action-invocation rule to the whole application, which might be ideal but is occasionally unrealistic and/or catastrophic.
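
To make the "verify, don't guess" step a little more concrete, here is a rough sketch of how the manual memory check could be scripted. It assumes a Unix-like host where `ps -o rss= -p PID` prints the resident set size in kilobytes. The helper names and the 1024 KB tolerance are made up for illustration and are not what I actually ran that night.

```ruby
# Hypothetical helpers for watching a process's memory between ab runs.

# Parses the output of `ps -o rss= -p PID` (resident set size in KB).
# Raises ArgumentError if the output is empty, e.g. the process has gone away.
def parse_rss_kb(ps_output)
  Integer(ps_output.strip, 10)
end

# Samples the resident memory of a running process by shelling out to ps.
def rss_kb(pid)
  parse_rss_kb(`ps -o rss= -p #{Integer(pid)}`)
end

# True when every sample is larger than the one before it by more than the
# tolerance, i.e. memory keeps climbing instead of levelling off.
def steadily_growing?(samples_kb, tolerance_kb = 1024)
  return false if samples_kb.size < 2
  samples_kb.each_cons(2).all? { |before, after| after - before > tolerance_kb }
end
```

The idea would be to call rss_kb with the mongrel PID after each ab run, collect the numbers in an array, and pass that array to steadily_growing?. A process that settles around the same figure after warm-up is probably fine; one that grows on every run deserves a closer look.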