Bitwise Magazine :: Ruby programming tutorial

AN INTRODUCTION TO RUBY #3

In the third part of our Ruby tutorial, Huw Collingbourne untangles strings...

Requirements:
Ruby

Download The Source Code:
ruby3src.zip

Note: For a more complete Ruby tutorial, download Huw's free eBook (and source code), The Little Book Of Ruby, from SapphireSteel Software.

See also: Part One and Part Two of this series

Note: You can download all the Ruby programs used in this column and run them either from the command prompt or via an editor/IDE. The screenshots in this article show an early beta of Ruby In Steel 'Personal Edition' – a free Ruby IDE for Visual Studio 2005. The final version of Ruby In Steel PE 1.0 was released after this article was published. A commercial edition of Ruby In Steel ('Developer Edition') has also been released, which includes the ultra-fast 'Cylon' debugger and powerful analytical IntelliSense capabilities.

To download, evaluate or buy Ruby In Steel, go to the SapphireSteel Software site: http://www.sapphiresteel.com

As I mentioned back in part one of this series, double-quoted strings do more work than single-quoted strings. In particular, they have the ability to evaluate bits of themselves as though they were programming code. Here is an example from one of this month’s sample programs, 1strings.rb:

myname = 'Fred'

puts( 'Hello #{myname}' )
# This displays: ‘Hello #{myname}’

puts( "Hello #{myname}" )
# This displays: ‘Hello Fred’

In the example above, #{myname} in the double-quoted string tells Ruby to evaluate the myname variable and insert its value into the string itself. So, if myname equals “Fred”, the string “Hello Fred” will be displayed. No evaluation is done in a single-quoted string, however, which is why this type of string displays the actual characters entered: ‘Hello #{myname}’.

A double-quoted string is also able to evaluate attributes (‘properties’) such as ob.name, expressions such as 2*3 and bits of code such as the method-call ob.ten. It also interprets ‘escape characters’ such as “\n” and “\t”, which represent a newline and a tab. Once again, a single-quoted string does no such evaluation. A single-quoted string can, however, use a backslash to indicate that the next character should be treated literally. This is useful when a single-quoted string contains a single-quote character, like this:

'It\'s my party'
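The points above can be tried out with a short sketch. The Thing class, with its name attribute and ten method, is a hypothetical stand-in for the ob object mentioned in the text; it is not taken from the sample programs.

```ruby
# Hypothetical class, invented only to give us an object with an
# attribute (name) and a method (ten) to embed in strings.
class Thing
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ten
    10
  end
end

ob = Thing.new('Fred')

# Double quotes: attributes, expressions, method calls and escapes are evaluated.
double_quoted = "Name: #{ob.name}, product: #{2 * 3}, call: #{ob.ten}\tdone"

# Single quotes: everything is kept literally, including #{...} and \t.
single_quoted = 'Name: #{ob.name}\tdone'

# Single quotes still honour a backslash before a single-quote character.
escaped_quote = 'It\'s my party'

puts(double_quoted)
puts(single_quoted)
puts(escaped_quote)
```

Only the double-quoted string has its embedded code and its tab escape replaced; the single-quoted string prints exactly as typed.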

User-Defined String Delimiters

If, for some reason, single and double-quotes aren’t convenient – if, for example, your strings contain lots of quote characters and you don’t want to have to keep putting backslashes in front of them – you have the option of delimiting strings in many other ways. See 2strings.rb for some examples.

The standard alternative delimiters for double-quoted strings are %Q/ and / or %/ and /, while for single-quoted strings they are %q/ and /. Thus…

%Q/This is the same as a double-quoted string./
%/This is also the same as a double-quoted string./
%q/And this is the same as a single-quoted string/

You can even define your own string delimiters. These must be non-alphanumeric characters and they may even be non-printing characters such as newlines or characters which normally have a special meaning in Ruby such as the ‘pound’ or ‘hash’ (#).

Whichever character you choose, it should be placed after %q or %Q and you should be sure to terminate the string with the same character. If your delimiter is an opening bracket, the corresponding closing bracket should be used at the end of the string, like this:

# %Q[ causes Ruby to treat square brackets
# like double-quote delimiters
puts( %Q[Hello #{myname}] ) # displays ‘Hello Fred’

# %q+ causes Ruby to treat plus signs
# like single-quote delimiters
puts( %q+Hello #{myname}+ ) # displays ‘Hello #{myname}’

You will find examples of a broad range of user-selected string delimiters in the sample program, 3strings.rb. Needless to say, while there may be times when it is useful to delimit a string by some esoteric character such as a newline or an asterisk, in many cases the disadvantages (not least the mental anguish and confusion) resulting from such arcane practices may significantly outweigh the advantages.
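As a rough illustration of where alternative delimiters earn their keep, here is a small sketch. The variable names are invented for this example and are not taken from 2strings.rb or 3strings.rb.

```ruby
myname = 'Fred'

a = %Q/Hello #{myname}/                        # behaves like "Hello #{myname}"
b = %/Hello #{myname}/                         # also behaves like a double-quoted string
c = %q/Hello #{myname}/                        # behaves like 'Hello #{myname}'

# Bracket delimiters save us from escaping embedded quote characters.
d = %Q[He said "Hello" to #{myname}]
e = %q(It's [still] a 'single-quoted' string)

puts(a, b, c, d, e)
```

Note that in the last string the square brackets are ordinary characters, because the delimiters chosen for that string are the round brackets.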

Backquotes

One other type of string deserves a special mention: namely, strings enclosed by back-quotes – that is, the inward-pointing quote character which is usually tucked away up towards the top left-hand corner of the keyboard: `

Ruby treats anything enclosed by back-quotes as a command to be passed to the operating system for execution. The back-quoted expression returns whatever the command writes to standard output as a string, which you can then display using a method such as print or puts. By now, you will probably already have guessed that (as with so many things) Ruby provides more than one way of doing this. It turns out that %x/some command/ has the same effect as `some command`, and so does %x{some command}. On the Windows operating system, for example, each of the three lines shown below would pass the command dir to the operating system, causing a directory listing to be displayed:

puts(`dir`)
puts(%x/dir/)
puts(%x{dir})

You can also embed commands inside double-quoted strings like this:

print( "Goodbye #{%x{calc}}" )

Be careful if you do this. The command itself is evaluated first. Your Ruby program then waits until the process which starts has terminated. In the present case, the calculator will pop up. You are now free to do some calculations, if you wish. Only when you close the calculator will the string “Goodbye” be displayed. The program, 4backquotes.rb, provides a few more examples.

Using backquoted strings you can run operating system commands or start external applications. Here, for example, we have started up the Windows Notepad.
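Because a backquoted string returns the command's output, you can capture that output in a variable rather than printing it straight away. The sketch below assumes that an echo command is available in your shell (it is on Windows and on Unix-like systems); it is not taken from 4backquotes.rb.

```ruby
# Both forms run the command and return its standard output as a string.
output = `echo hello`
also   = %x{echo hello}

# $? holds the exit status of the most recently run command.
succeeded = $?.success?

puts(output)
puts("Command succeeded: #{succeeded}")
```

The returned string normally ends with a line break, which is why the output is often passed through a method such as chomp or strip before further use.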

String Handling

Let’s take a quick look at a few common string operations.

Concatenation

You can concatenate strings using << or + or just by placing two string literals next to each other, separated by a space (see: string_concat.rb). Here are three examples of string concatenation; in each case, s is assigned the string “Hello world”:

s = "Hello " << "world"
s = "Hello " + "world"
s = "Hello " "world"

Refer to the comments in the source code for more information on these operations.

String Assignment

The Ruby String class provides a number of useful string handling methods. Most of these methods create and return new string objects. So, for example, in the following code, the s on the left-hand side of the assignment on the second line is not the same object as the s on the right-hand side:

s = "Hello world"
s = s + "!"

A few string methods alter the string itself without creating a new object. These methods generally end with an exclamation mark (e.g. the capitalize! method). In fact, the << concatenation method which I mentioned earlier also modifies the receiver object (the string to its left). Watch out for this!

If in doubt, you can check an object’s identity using the object_id method. I’ve provided a few examples of operations which do and do not create new strings in the string_assign.rb program. Run this and check the object_id of s after each string operation is performed.
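The following sketch shows the kind of check described above. It is an illustration written for this purpose rather than the contents of string_assign.rb, and it uses String.new so that the strings are safe to modify under any Ruby settings.

```ruby
s = String.new('Hello')
original_id = s.object_id

s << ' world'                                  # modifies the receiver in place
same_after_append = (s.object_id == original_id)

s = s + '!'                                    # builds a brand new string
same_after_plus = (s.object_id == original_id)

t = String.new('hello')
id_before = t.object_id
t.capitalize!                                  # the ! version alters t itself
same_after_bang = (t.object_id == id_before)
u = t.upcase                                   # the plain version returns a new object

puts("after <<          : same object? #{same_after_append}")
puts("after +           : same object? #{same_after_plus}")
puts("after capitalize! : same object? #{same_after_bang}")
puts("upcase result     : same object? #{u.object_id == t.object_id}")
```

Here << and capitalize! leave the object_id unchanged, while + and upcase each produce an object with a different id.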

Indexing Into A String

You can treat a string as an array of characters and index into that array to find a character at a specific index using square brackets. Strings and arrays in Ruby are indexed from 0 (the first character). So, for instance, to replace the character ‘e’ with ‘a’ in the string, s, which currently contains ‘Hello world’, you would assign a new character to index 1:

s[1] = 'a'

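However, in Ruby 1.8 (the version current when this article was written), if you index into a string in order to find a character at a specific location, Ruby doesn’t return the character itself; it returns its ASCII value (from Ruby 1.9 onwards, s[1] returns the one-character string ‘e’ instead):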

s = "Hello world"
puts( s[1] ) # prints out 101 – the ASCII value of ‘e’

In order to obtain the actual character, you can do this:

s = "Hello world"
puts( s[1,1] ) # prints out ‘e’

This tells Ruby to index into the string at position 1 and return one character. If you want to return three characters starting at position 1, you would enter this:

puts( s[1,3] ) # prints ‘ell’

This tells Ruby to start at position 1 and return the next 3 characters. Alternatively, you could use the two-dot ‘range’ notation:

puts( s[1..3] ) # also prints ‘ell’

Strings can also be indexed using minus values, in which case -1 is the index of the last character and, once again, you can specify the number of characters to be returned:

puts( s[-1,1] )        # prints ‘d’
puts( s[-5,1] )        # prints ‘w’
puts( s[-5,5] )        # prints ‘world’

When specifying ranges using a minus index (see: string_index.rb), it is safest to use minus values for both the start and end indices, since a positive end index that falls before the start position produces an empty string:

puts( s[-5..5] ) # this prints an empty string!
puts( s[-5..-1] ) # prints ‘world’
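The indexing operations in this section can be gathered into one sketch. It assumes Ruby 1.9 or later, where s[1] returns a one-character string and the ord method gives the character's numeric code; it is not the code from string_index.rb.

```ruby
s = 'Hello world'

char      = s[1]        # 'e' (a one-character string in Ruby 1.9 and later)
code      = s[1].ord    # 101, the ASCII value of 'e'
three     = s[1, 3]     # 'ell': start at index 1, take 3 characters
ranged    = s[1..3]     # 'ell': indices 1 through 3 inclusive
last      = s[-1, 1]    # 'd'
tail      = s[-5, 5]    # 'world'
neg_range = s[-5..-1]   # 'world'
empty     = s[-5..5]    # '' because index -5 is position 6, which lies after 5

# Assigning through an index changes the string, so work on a copy here.
t = s.dup
t[1] = 'a'              # t is now 'Hallo world'

puts(char, code, three, ranged, last, tail, neg_range, empty.inspect, t)
```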

Finally, you may want to experiment with a few of the standard methods available for manipulating strings. These include methods to change the case of a string, reverse it, insert substrings, remove repeating characters and so on. I’ve provided a few examples in string_methods.rb.

If you want to mangle the works of Shakespeare, a few of Ruby's string handling methods will do the job nicely!

Chop and chomp…

A couple of handy string processing methods deserve special mention. The chop and chomp methods can be used to remove characters from the end of a string. The chop method returns a string with the last character removed or with the carriage return and newline characters removed (“\r\n”) if these are found at the end of the string. The chomp method returns a string with the terminating carriage return or newline character removed (or both the carriage return and the newline character if both are found). Run the chop_chomp.rb program to see these in use.

The Record Separator - $/

Ruby pre-defines a variable, $/, as a ‘record separator’. This variable is used by methods such as gets and chomp. The gets method reads in a string up to and including the record separator. The chomp method returns a string with the record separator removed from the end if it is present; otherwise it returns the original string unmodified. You can redefine the record separator if you wish, like this:

$/ = "*" # the “*” character is now the record separator

When you redefine the record separator, the new character (or string) will be used by methods such as gets and chomp. For example:

$/ = "world"
s = gets()         # user enters “Had we but world enough and time…”
puts( s )          # displays “Had we but world”
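If you would rather not change the global $/ variable, gets also accepts a separator as an argument. The sketch below is an alternative to the approach shown above, not the article's own code: it uses a StringIO object to stand in for text typed by the user, so it can run without any keyboard input.

```ruby
require 'stringio'

# StringIO plays the part of the user's input here (an assumption made
# purely so that the example is self-contained).
input = StringIO.new('Had we but world enough and time')

first = input.gets('world')      # reads up to and including 'world'
rest  = input.gets('world')      # no more separators, so reads to the end
trimmed = first.chomp('world')   # chomp removes the separator again

puts(first)    # Had we but world
puts(rest)     #  enough and time
puts(trimmed)  # Had we but
```

Passing the separator explicitly keeps the effect local to one call, whereas assigning to $/ changes the behaviour of every later gets and chomp in the program.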

The chop and chomp methods are useful when you need to remove line feeds entered by the user or read from a file. For instance, when you use gets to read in a line of text, it returns the line including the terminating ‘record separator’ which, by default, is the newline character. You can remove the newline character using either chop or chomp. In most cases, chomp is preferable as it won’t remove the final character unless it is the record separator (a newline) whereas chop will remove the last character no matter what it is. Here are some examples:

# Note: s1 includes a carriage return and linefeed
s1 = "Hello world
"
s2 = "Hello world"

s1.chop                 # returns “Hello world”
s1.chomp                # returns “Hello world”
s2.chop                 # returns “Hello worl” – note the missing ‘d’!
s2.chomp                # returns “Hello world”

The chomp method also lets you specify a character or string to use as the separator:

s2.chomp('rld') # returns “Hello wo”
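Here is a self-contained sketch of the same comparisons, written with explicit “\r\n” and “\n” escapes so that the line endings are visible. It assumes the default record separator and is not the code from chop_chomp.rb.

```ruby
s1 = "Hello world\r\n"   # ends with a carriage return and newline
s2 = 'Hello world'       # no line ending at all
s3 = "Hello world\n"     # ends with a newline only

chopped_crlf   = s1.chop           # "Hello world": \r\n is removed as a pair
chomped_crlf   = s1.chomp          # "Hello world"
chopped_plain  = s2.chop           # "Hello worl": the 'd' is lost
chomped_plain  = s2.chomp          # "Hello world": nothing to remove
chopped_lf     = s3.chop           # "Hello world"
chomped_custom = s2.chomp('rld')   # "Hello wo"

puts(chopped_crlf, chomped_crlf, chopped_plain, chomped_plain,
     chopped_lf, chomped_custom)

# None of these calls alter the original strings.
puts(s1.inspect)
```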

Format Strings

Ruby provides the printf method to print ‘format strings’ containing special field specifiers starting with a percent sign (see: string_printf.rb). The format string may be followed by one or more data items separated by commas; the list of data items should match the number and type of the format specifiers. The actual data items replace the matching specifiers in the string and they are formatted accordingly.

These are some common formatting specifiers:

%d – decimal number
%f – floating point number
%o – octal number
%p – inspect object
%s – string
%x – hexadecimal number

You can control floating point precision by putting a point followed by a number before the f of the floating point formatting specifier, %f. For example, this would display the floating point value rounded to two decimal places:

printf( "%0.02f", 10.12945 ) # displays 10.13
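To see the specifiers listed above side by side, the sketch below uses format, which builds the same formatted text as printf but returns it as a string instead of printing it. The values are arbitrary ones chosen for this illustration and are not taken from string_printf.rb.

```ruby
decimal   = format('%d', 42)            # "42"
float     = format('%f', 1.5)           # "1.500000" (six decimal places by default)
octal     = format('%o', 8)             # "10"
inspected = format('%p', 'hi')          # "\"hi\"", as the inspect method would show it
string    = format('%s', :ruby)         # "ruby"
hex       = format('%x', 255)           # "ff"
rounded   = format('%0.2f', 10.12945)   # "10.13"

# Several data items are matched to the specifiers from left to right.
mixed = format('%s is %d years old', 'Fred', 30)

puts(decimal, float, octal, inspected, string, hex, rounded, mixed)
```

Returning a string is handy when the formatted text is to be stored or concatenated rather than displayed immediately.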

Well, that just about wraps up this introduction to Ruby programming. However, if it’s whetted your appetite for the language, fear not – we’ll have more Ruby coverage in future issues…!

April 2006
