ephekt Programming seems to be my existence

14 Aug 2009

Ruby Regular Expressions – Security Risk

This post is half a reminder to myself to elaborate when I have free time. In short, there is nothing wrong with Ruby regular expressions, except that they behave differently from what one might expect, both in general and when coming from Perl regular expressions.

Here is the deal, from the Programming Ruby book by Dave Thomas:

The patterns ^ and $ match the beginning and end of a line, respectively. These are often
used to anchor a pattern match: for example, /^option/ matches the word option only if it
appears at the start of a line. The sequence \A matches the beginning of a string, and \z and
\Z match the end of a string.

All sounds good, right? Well, because ^ and $ anchor to lines rather than to the whole string, a pattern anchored with them accepts multi-line input as long as any single line matches, and whatever sits on the other lines rides along unvalidated. For example, given

class EmailAttachment < ActiveRecord::Base
  validates_format_of :attachment, :with => /^[\w\.\-\+]+$/
end

You can easily pass in

attachment.txt%0A<script>alert('open_sesame')</script>

which Rails converts (%0A is a URL-encoded newline) into

"attachment.txt\n<script>alert('open_sesame')</script>"

You can think about the implications of this yourself. I have had some fun with my own personal site, getting arbitrary JavaScript and (worse) shell commands to execute. I also believe this may open a larger security hole in Rails routes (at least in 2.1.0). I'll investigate this further later, as the beginning of this post says.
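The anchoring behaviour is easy to reproduce in plain Ruby, outside Rails. The sketch below reuses the pattern from the model above and compares it with a variant anchored with \A and \z. The constant names and the `passes?` helper are made up for illustration, and `URI.decode_www_form_component` merely stands in for whatever decoding Rails performs on the request.

```ruby
require 'uri'

# Pattern from the validation above: ^ and $ anchor to a line.
LINE_ANCHORED   = /^[\w\.\-\+]+$/
# Assumed alternative: \A and \z anchor to the whole string.
STRING_ANCHORED = /\A[\w\.\-\+]+\z/

RAW     = "attachment.txt%0A<script>alert('open_sesame')</script>"
DECODED = URI.decode_www_form_component(RAW) # %0A becomes "\n"

# Hypothetical helper: true when the pattern matches anywhere in value.
def passes?(pattern, value)
  !(value =~ pattern).nil?
end

puts passes?(LINE_ANCHORED, DECODED)            # true: the first line alone matches
puts passes?(STRING_ANCHORED, DECODED)          # false: the whole string must match
puts passes?(STRING_ANCHORED, "attachment.txt") # true
```

With the line-anchored pattern the script tag is never examined at all, which is why it gets through.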

25 Jul 2009

Validating Emails in Ruby on Rails

It's time to validate e-mail addresses and you're sitting in a Ruby on Rails application. Fortunately, there are a few methods to tackle such a task, and a combination of them can yield a pretty nice solution.

The first idea is to use some lengthy regular expressions. But why spell out in regular expressions what we are looking for when TMail has it built in? With TMail, we can let a mail-handling library parse the e-mail address and decide whether it is correct.

The second task is to make up for some of TMail's odd shortcomings: the fact that the text "bob" passes as valid for TMail is alarming, but adding a simple regular expression to get past this gives a pretty solid solution. (For the curious, "bob" is a valid e-mail address to TMail because you could be sending messages to the local domain.)

First, here is our regular expression for a basic e-mail address...

/^([^@\s'"]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i
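To see what this pattern accepts, here is a small sketch that runs it against a few sample inputs. The constant `EMAIL_PATTERN` and the helper `split_email` are names invented for this example; the pattern itself is the one above.

```ruby
EMAIL_PATTERN = /^([^@\s'"]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i

# Hypothetical helper: returns [local_part, domain] or nil when there is no match.
def split_email(address)
  match = EMAIL_PATTERN.match(address)
  match && [match[1], match[2]]
end

p split_email("bob@example.com") # => ["bob", "example.com"]
p split_email("bob")             # => nil (no @ and no domain)
p split_email("bob@localhost")   # => nil (the domain needs at least one dot)
```

The requirement for a dotted domain is what rejects the bare "bob" that TMail lets through.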

Next, we create a TMail object with our e-mail address...

tmail = TMail::Address.parse(address.to_s) rescue nil

The trailing rescue nil turns any parse error into nil. You could rescue a specific error class instead, but for the sake of the example I am not concerned with why TMail fails to parse the e-mail address. You can pass in a multitude of e-mail formats and TMail will do its best to match the RFC standard for e-mail addresses.

You now have access to a TMail object with a flurry of options (see the TMail documentation). Let's proceed.
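For readers who have not seen the modifier form of rescue, here is a sketch of how it differs from rescuing one specific error class. It uses Ruby's built-in `Integer()` as a stand-in parser so it runs without TMail, and both method names are invented for the example.

```ruby
# Modifier form: any StandardError raised by the expression becomes nil.
def parse_or_nil(text)
  Integer(text) rescue nil
end

# Explicit form: only ArgumentError is swallowed; other errors still propagate.
def parse_strict(text)
  Integer(text)
rescue ArgumentError
  nil
end

p parse_or_nil("42")   # => 42
p parse_or_nil("abc")  # => nil (ArgumentError swallowed)
p parse_or_nil(nil)    # => nil (TypeError swallowed as well)
p parse_strict("abc")  # => nil
# parse_strict(nil) would raise TypeError instead of returning nil.
```

The modifier form is convenient, but it also hides errors you might have wanted to hear about, which is the trade-off to weigh here.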

My simple method calls TMail and then follows it with the regular expression match, to ensure the e-mail address in question is ready to be used on the web. Here is my final result: a fairly foolproof e-mail handler (so far, tested on a rather large web site). The method returns the TMail object when the address passes both checks.

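def validate_email(address)
  # tmail is nil if TMail cannot parse the address.
  tmail = TMail::Address.parse(address.to_s) rescue nil
  # Guard against nil before applying the stricter pattern.
  tmail if tmail && tmail.address =~ /^([^@\s'"]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})$/i
end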

And for those of you who want the extra step, here is a helper method that takes a TMail object and gives you back the fully qualified e-mail address as a string.

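def formatted_email tmail
  if !tmail.blank?
    # Swap double quotes for single quotes so the name can be quoted safely.
    friendly_name = (tmail.name.blank? ? "" : tmail.name).tr('"', "'")
    # Quote the name only when it contains characters special to address syntax.
    quote = '"' if friendly_name =~ /[<,@;]/
    "#{quote}#{friendly_name}#{quote} <#{tmail.address}>"
  end
end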
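To show the kind of output such a helper produces without needing TMail installed, here is a sketch in plain Ruby. `ParsedAddress` is a hypothetical stand-in that only mimics the `name` and `address` readers assumed above, and `format_address` is a renamed, Rails-free variant of the helper, not the original method.

```ruby
# Hypothetical stand-in for a parsed address object; not part of TMail.
ParsedAddress = Struct.new(:name, :address)

def format_address(parsed)
  return nil if parsed.nil?
  # Swap double quotes for single quotes so the name can be wrapped in quotes.
  friendly_name = parsed.name.to_s.tr('"', "'")
  # Quote the display name only when it contains address-syntax characters.
  quote = friendly_name =~ /[<,@;]/ ? '"' : ''
  "#{quote}#{friendly_name}#{quote} <#{parsed.address}>"
end

puts format_address(ParsedAddress.new("Smith, Bob", "bob@example.com"))
# prints: "Smith, Bob" <bob@example.com>
puts format_address(ParsedAddress.new("Bob", "bob@example.com"))
# prints: Bob <bob@example.com>
```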

As always, test this code and do not trust it blindly. I could have fat-fingered something...