Nov 10 2013
 

Today at RubyConf in Miami, David Copeland gave a great talk titled, “Eliminating branching, nil and attributes – let’s get weird.” It’s always fun to get weird, and I sat in on his talk to see what direction it would take.

David mentioned that it’s fun to see how we can implement basic constructs in a language by using basic building blocks, and I wholeheartedly agree. In fact, the exercise he posed is a great one, since it leads us into the lambda calculus, a simple language that is actually just as powerful as languages like Ruby. Because Ruby supports lambdas, it’s beautifully simple (almost as much so as in Scheme or Clojure) to implement a construct like the standard ‘if’ statement. This kind of thing is often done as a learning exercise in a Lisp dialect, but I couldn’t find an example in Ruby, so I decided to translate it here.

The example that I wrote using lambdas in Ruby also takes advantage of currying: the idea that a function taking more than one argument can always be expressed as a series of one-argument functions, each returning a function that accepts the next argument. In other words, not only can we dispense with the ‘if’ statement, we can also get rid of functions that take more than one argument to further ‘simplify’ our programs.
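To make currying a little more concrete, here is a minimal sketch. The names add, curried_add, auto_curried and increment are only illustrative; Proc#curry is part of Ruby's core library from 1.9 onward.

```ruby
# An ordinary two-argument lambda.
add = ->(x, y) { x + y }

# The same computation written by hand as nested one-argument lambdas.
curried_add = ->(x) { ->(y) { x + y } }

# Proc#curry derives the nested, one-argument-at-a-time form for us.
auto_curried = add.curry

add.call(1, 2)               # => 3
curried_add.call(1).call(2)  # => 3
auto_curried.call(1).call(2) # => 3

# Supplying only the first argument gives back a reusable function.
increment = curried_add.call(1)
increment.call(41) # => 42
```

The hand-written nested form is the style used in the conditional example that follows.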

I’d encourage you to read up on this if these concepts or the following code fragment seem foreign. Without further ado, here is my attempt at if..then..else without the Ruby ‘if’ statement, along with MiniTest specs.

Note that this program will only work on Ruby 1.9 and above. For Ruby 1.8.7 and below you will have to change the ‘arrow’ syntax for defining lambdas to a syntax compatible with those interpreters.

require 'minitest/spec'
require 'minitest/autorun'

# We first define lambdas for true and false expressions. 'tru' is a function
# taking one argument that returns a function accepting a second argument, and
# which returns the value given to the outer function. 'fls' also takes one
# argument and returns a function taking a second argument, but it returns the
# value given to the inner (second) function.
#
# Note that we use a convention of indicating variables whose values are ignored
# with the '_' character.
tru = -> (x) { -> (_) { x } }
fls = -> (_) { -> (y) { y } }

# We now define a sequence of three nested lambdas corresponding to the
# conditional test followed by lambdas representing the true and the false
# "branches" in the conditional. Again this could have been represented as a
# single lambda accepting three arguments, but to keep the language as simple as
# possible and to demonstrate currying and higher-order functions we implement
# the conditional by a series of one-argument lambdas.
#
# For a true expression, you can think of this in terms of the following
# evaluation:
#
#     if_then_else.call(tru).call('true expression').call('false expression')
#
# We call the if_then_else lambda with a predicate that is either tru or fls.
# The next two invocations of 'call' supply the two branch values, which are
# passed on to the predicate: tru returns the value given to its outer function
# (the first branch) and fls the value given to its inner function (the second).
if_then_else = -> (p) { -> (a) { -> (b) { p.call(a).call(b) } } }


describe 'Implementing conditionals in the lambda calculus' do
  describe 'the tru (true) lambda expression' do
    it 'returns the argument given to the first function' do
      tru.call('first').call('second').must_equal 'first'
    end
  end

  describe 'the fls (false) lambda expression' do
    it 'returns the argument given to the second function' do
      fls.call('first').call('second').must_equal 'second'
    end
  end

  describe 'if..then..else using the lambda calculus' do
    it 'returns the value of the "then" branch if the conditional is true' do
      if_then_else.call(tru).call('true value').call('false value').
        must_equal 'true value'
    end

    it 'returns the value of the "else" branch if the conditional is false' do
      if_then_else.call(fls).call('true value').call('false value').
        must_equal 'false value'
    end
  end
end

This code is also available on GitHub.
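For interpreters that lack the ‘arrow’ syntax, the three definitions above could be written with the older lambda keyword instead. This is only a sketch of the translation; the behavior should be the same as in the listing above.

```ruby
# The same true/false lambdas and conditional, using `lambda { |arg| ... }`
# in place of the `->` syntax.
tru = lambda { |x| lambda { |_| x } }
fls = lambda { |_| lambda { |y| y } }

if_then_else = lambda { |p| lambda { |a| lambda { |b| p.call(a).call(b) } } }

if_then_else.call(tru).call('true value').call('false value') # => "true value"
if_then_else.call(fls).call('true value').call('false value') # => "false value"
```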

Nov 06 2013
 

We all know that responsible programming in a highly dynamic language like Ruby requires a significant amount of time spent writing tests. We also know that these tests don’t just serve to demonstrate correct operation of the program under certain conditions, they can also give rise to well-structured code [1]. Still, we want to make sure that the tests we’re writing add the most possible value to our program, rather than becoming a burden to the development process.

The “test pyramid” is one of the most useful concepts that I’ve come across to make sure that tests are helpful without slowing down your development. Basically, this concept says most of your automated tests should be in the form of isolated unit tests, and more integrative tests should be used sparingly. There are several reasons that this makes sense:

  1. Unit tests are fast to run, since they run without dependencies. In contrast, integration tests, and especially tests that drive a web UI, are relatively slow.
  2. Unit tests give you the most direct feedback about the source of problems. A single change in code could break many integration tests, and you would have to dig through layers of code to find out the single line that caused an error. In contrast an isolated unit test should indicate the cause of a bug within a few lines.

You may have some arguments against following such a practice. Won’t an emphasis on unit testing miss bugs that occur when components are integrated? It’s true that you do want some integration tests for high-value scenarios in your application. They make sure that components generally work together and don’t cause bugs. However, remember that any non-trivial program can’t be integration-tested in its entirety. If you tried to do that, your development process would drown under the weight of the tests, and you would probably still be missing coverage. In contrast, abundant, well-focused unit tests combined with a few high-level integration tests will cover the details of your code while still checking that the pieces work together.
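As a rough sketch of what “isolated” means here, the hypothetical InvoiceCalculator below receives its rate source as an argument, so a test can hand it a tiny in-memory fake instead of a database or web service. All class and method names are invented for illustration.

```ruby
# Hypothetical class under test: it depends only on "something that
# responds to rate_for", not on a concrete database or HTTP client.
class InvoiceCalculator
  def initialize(rate_source)
    @rate_source = rate_source
  end

  def total(client, hours)
    @rate_source.rate_for(client) * hours
  end
end

# Hand-rolled fake standing in for the slow, real dependency.
class FakeRates
  def rate_for(_client)
    50
  end
end

calculator = InvoiceCalculator.new(FakeRates.new)
calculator.total('acme', 3) # => 150
```

Because nothing outside the two classes is touched, a failure in a test like this points at a handful of lines rather than at a whole stack of integrated components.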

1. http://www.growing-object-oriented-software.com/

Aug 27 2013
 

I’m currently in Quito, Ecuador training new developers joining the Stack Builders team. As we’re working on a practice project, a time billing system, I came across a really useful class that I hadn’t used much in the past. Ruby’s Method class helped us to clean up a part of our code and I wanted to share how we used it.

In our system, we had a method that would either add or subtract time from a Date object. The problem is, you can’t just pass ‘+’ or ‘-’ to be applied as concisely as you would in more functional languages. For example, in Haskell (using the interactive shell, ghci) you can do the following to pass addition or subtraction to a function that combines two numbers:

λ> let modifyNumbers x y f = x `f` y
λ> modifyNumbers 3 4 (+)
7
λ> modifyNumbers 3 4 (-)
-1

What’s the closest way to approach this elegance in Ruby? I’ll cover a few different possibilities including Ruby’s `method` method.

For someone not used to a language with higher-order functions, perhaps the most straightforward way to accomplish this would be a simple conditional that decides which calculation to apply to a pair of numbers:

def modify_numbers(x, y, modification_type)
  if modification_type == :add
    x + y
  elsif modification_type == :subtract
    x - y
  end
end

modify_numbers(1, 2, :add)
=> 3

In Ruby, we can use higher-order functions – that is, we can write functions that take functions as arguments. Let’s take a look at this approach:

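add = -> (x, y) { x + y }
subtract = -> (x, y) { x - y }

# The operation is passed in as a lambda and applied to the two numbers.
def modify_numbers(x, y, operation)
  operation.call(x, y)
end

modify_numbers(1, 2, add)
=> 3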

Passing an anonymous block of code around may help to write more generic functions that you can compose in different ways later. But it still feels a bit kludgy to have to wrap a function inside another function just to pass it around. After all, this kind of thing comes for free in Haskell and Clojure! Instead of passing an anonymous function, you could just pass in a symbol representing the method to be invoked and use send:

def modify_numbers(x, y, modification_type)
  x.send(modification_type, y)
end

modify_numbers(1, 2, :+)
=> 3

This is nice since you don’t have to wrap the method in a lambda in order to use it. But I think that we can do a bit better using Ruby’s ‘method’ method:

def modify_numbers(x, y, method_name)
  x.method(method_name).call(y)
end

modify_numbers(1, 2, :+)
=> 3

I like this approach the best since it avoids having to explicitly wrap the method in a lambda. You also get something that acts like a lambda (i.e., the instance of Method that ‘method’ returns responds to ‘call’ just like a lambda). This may help in refactoring later if you decide that you really need a higher-order function.
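Here is a small sketch of the same idea applied to Date objects, along with a Method instance being passed around like a lambda. The helper name modify and the sample values are made up for illustration; Object#method, Method#call and the & conversion to a block are standard Ruby.

```ruby
require 'date'

# Look up the method by name on the receiver and call it with the argument.
def modify(x, y, method_name)
  x.method(method_name).call(y)
end

start = Date.new(2013, 8, 27)
modify(start, 3, :+) # => the Date for 2013-08-30
modify(start, 3, :-) # => the Date for 2013-08-24

# A Method object can be stored and reused like a lambda...
plus_ten = 10.method(:+)
plus_ten.call(5) # => 15

# ...and handed to methods that expect a block.
[1, 2, 3].map(&plus_ten) # => [11, 12, 13]
```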

After writing this, I’m curious how many other Rubyists have found a use for the Method class in their code. I know that I spent six years programming in Ruby without using it a single time. Let me know in the comments!

Aug 08 2013
 

As programmers we always want to make our code more robust and error-free. There is a large body of research on how to achieve these goals, particularly by using functional programming languages [1]. Language constructs such as static types and immutability, not to mention innovations such as proof-carrying code, are used in order to make it easier to write safer software.

Ruby eschews the trend toward safer programs by placing far more than enough rope to hang oneself at the foot of the programmer. In fact, even if you don’t hang yourself with the rope provided by Ruby, you are likely at some point in your Ruby programming career to trip over the rope provided by Ruby, perhaps because of decisions or mistakes made by designers of the libraries that you have relied on [2].

It’s probably a safe assumption that Ruby will never reach the level of safety present in some newer languages and frameworks [3]. I think it’s also a safe assumption that given the wonderful software and frameworks that have been produced in Ruby the language will be popular for application development for years to come. With that said, here are some things that you can look out for and measures you can take in order to produce safer programs in Ruby.

Watch out for unsafe deserialization

The recent vulnerability in Rails came from unsafe deserialization (i.e., creation of Ruby objects from a text string). Essentially, the YAML library that ships with Ruby has a method that blindly instantiates objects of whatever class types are specified by the creator of the YAML document. If you allow untrusted users to feed YAML to your system and it is deserialized into Ruby objects, they may be able to find ways to leverage these objects to gain access to unintended parts of your application or the system that it is hosted on.

Also, since Ruby (as of this writing) never garbage collects symbols, an attacker may create a large number of symbols and effectively shut down your application for lack of memory.

In the context of today’s society and internet it seems unwise to continue using this method of deserialization by default. Fortunately it is easy to replace the current method of deserialization with a new one. Ruby’s YAML parser, Psych, is built upon the libyaml library and it allows you to provide a different handler in response to the elements found in the YAML document. Soon after the recent Rails vulnerabilities were discovered, the safe_yaml gem was released by Dan Tao. This library essentially only allows the instantiation of certain types of objects such as Strings, Booleans, and Floats as they are found in the YAML input. In addition, it isn’t re-inventing the wheel of YAML parsing, as it relies on Ruby’s YAML parsers (Psych or Syck depending on Ruby version).

I hope that safe_yaml, or something like it, becomes the default in Rails applications. Most of the Rails applications that I have seen don’t need to be able to deserialize YAML into any kind of object form – uses of YAML in Rails applications are primarily for things like configuration files, which just need basic data types. We should have to explicitly turn on unsafe deserialization if we want it, rather than locking things down in every part of our application to make it safer.
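As a sketch of what restricted deserialization looks like in practice, the example below uses YAML.safe_load, which newer versions of Psych in Ruby’s standard library provide. Note that this is a stand-in for the safe_yaml gem discussed above, not that gem’s API, and the class name SomeInternalClass in the hostile document is hypothetical.

```ruby
require 'yaml'

# Plain configuration data made of basic types loads without trouble.
config = YAML.safe_load("host: localhost\nport: 3000\n")
# config is {"host"=>"localhost", "port"=>3000}

# A document that asks for an arbitrary Ruby object to be instantiated
# is refused instead of being blindly deserialized.
hostile = "--- !ruby/object:SomeInternalClass\nsecret: 1\n"

rejected = begin
  YAML.safe_load(hostile)
  false
rescue Psych::DisallowedClass
  true
end
# rejected is true
```

Only basic data types come through by default, which is all that a typical configuration file needs.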

Avoid careless use of send

Ruby programmers know that we don’t call methods on objects – we send messages to objects. That is, in the Ruby world calling Person.foo is the same as calling Person.send(“foo”). You should be careful when using send and avoid using it unnecessarily. Aside from the fact that it allows you to violate encapsulation by calling private methods (use public_send to indicate that you’re not poking through object interfaces that you’re not supposed to), it also allows you to send an arbitrary string as a message to an object!

An example of a vulnerability that you could introduce due to use of send would be where you send a message to an object based on input from the user, perhaps through a form submission. Think about what happens if the user submits “destroy_all” as input and you pass that to an ActiveRecord model in your application – you would quickly lose all of your data in the table which it references.

Avoid using send unless you absolutely need it. If you do need it, always *whitelist* the strings that it should accept. Blacklisting should be avoided, since an unscrupulous user could find edge cases that the blacklist misses.
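A minimal sketch of such a whitelist might look like the following. The Report class, the ALLOWED_ACTIONS list and the run_action helper are all hypothetical names invented for this example; public_send is the standard Ruby method mentioned above.

```ruby
# Hypothetical object with some harmless methods and one dangerous one.
class Report
  def summary
    'summary'
  end

  def details
    'details'
  end

  def destroy_all
    'everything deleted'
  end
end

# Only these messages may ever be sent on behalf of a user.
ALLOWED_ACTIONS = %w[summary details].freeze

def run_action(report, requested)
  unless ALLOWED_ACTIONS.include?(requested)
    raise ArgumentError, "unsupported action: #{requested}"
  end

  # public_send also refuses to call private methods.
  report.public_send(requested)
end

run_action(Report.new, 'summary') # => "summary"
# run_action(Report.new, 'destroy_all') raises ArgumentError
```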

“Eval” is one character from “evil”

Evaluating strings as code is a common characteristic of dynamic languages including Ruby. While it allows for some really neat things such as concise DSLs (domain-specific languages), it’s one of the most dangerous things that you can do in a program. The danger of using eval is pretty obvious so I won’t dwell on it. If you do decide to use a form of eval (e.g., eval, instance_eval, class_eval) make sure that it’s far from any kind of untrusted user input. Once you’re sure it’s far from user input, check again just in case. This isn’t a place where you want to make a mistake.

Use static analysis tools

There are static analysis tools for frameworks built on Ruby like Rails. Static analysis tools scan your code without running it. Brakeman is a popular tool for Rails applications that will tell you if your application has clear cases where SQL may be injected, for example. It also looks for recent vulnerabilities in Rails and tells you which applications may need to be upgraded or patched. It’s simple to run, and can be hooked up to continuous integration tools like Jenkins. At Stack Builders we have a CI server that produces Brakeman reports for all of our Rails projects every time a commit is merged to the master branch in our git repositories.

Since Brakeman is so easy to run and configure as an automated part of your build there is no reason it shouldn’t be used for all Rails applications as an added measure of security.

Have someone else review your code

It’s always a good idea to have a second set of eyes on the code that you write. At Stack Builders we use GitHub pull requests (or a similar process without GitHub when necessary) to ensure that all code is reviewed by another team member. Even if you’re carefully following all of the recommendations above and are a very experienced programmer, it could be that you missed something that would be picked up by another set of eyes. It’s worth the minimal amount of extra time to catch potential vulnerabilities in the code you’ve written.

Conclusion

Ruby is a language with a lot of great features, but safety isn’t its strong suit. As we’ve recently discovered, our Rails applications aren’t as safe as we thought. This isn’t particularly surprising given the amount of freedom that Ruby gives us as developers to dynamically evaluate code. Even though the language we’re using to build web applications doesn’t have the safety constructs that are built into more modern languages, it’s still one of the fastest and most fun ways to build great web applications.

If we’re going to continue using Ruby we need to recognize that its flexibility comes with an added burden of responsibility for the programmer. The features that are the most dynamic, including the ability to deserialize text into any object type, arbitrary message sending, and the ability to eval code need to be scrutinized very carefully and considered in the context of your applications’ exposure to untrusted input. Spending extra time on the parts of your program that rely on dynamic features in Ruby is well worth the effort when compared to the risk of exposing your system and data to attackers.

Notes

  1. If you’re interested in real safety in programming languages you may enjoy some of these papers: http://www.cs.berkeley.edu/~necula/papers.html
  2. These features aren’t particular to Ruby, and most dynamic languages (e.g., Perl, Python and PHP) have aspects that aren’t ideal in terms of safety.
  3. Haskell and Ur/Web are two languages that are designed to produce code that is safer by default than Ruby.