Here, in Part One, we look at objects and serialization. See Part Two for an explanation of class hierarchies and custom collections.
If you have experience of another programming language but are new to C#, this series will introduce you to the fundamental features of the C# language and the .NET Framework as we develop a simple ‘classic style’ text adventure game.
It is a truth universally acknowledged that the highest expression of the programmer’s art is the adventure game. When I say ‘adventure game’, I am of course referring to that noble variety of the species which rejects graphics and spurns sound. In other words: the text adventure or, as it is sometimes known, ‘interactive fiction’.
This month we serialize various data types, including user-defined objects.
The heyday of the text adventure was the ‘70s and ‘80s. Back then Infocom was the company which dominated the field with games such as Zork, Starcross, The Hitchhiker’s Guide To The Galaxy and Trinity. I must confess a particular attachment to those games. But for them, I would never have bothered learning to program. I wanted to know how Zork had managed to create an internally consistent landscape which one could traverse only by following well-defined paths. How was it possible that you could find an object in a certain place, pick it up, drop it somewhere else and then find it again, hours later, exactly where you had left it?
In order to discover how all this was achieved, I spent a year programming my own adventure game, The Golden Wombat Of Destiny, using Borland’s Turbo Pascal 3.02 for DOS. Turbo Pascal was an excellent compiler for its day though the Pascal language was by no means ideal for writing adventure games. It wasn’t even object orientated! And an adventure game is full of objects - treasures, weapons, rooms and trolls - objects, objects, every one of them! Mind you, Turbo Pascal’s lack of objects wasn’t really surprising. Object orientation was rather an obscure methodology back in the early ‘80s. Nevertheless, given a free choice between writing a game in Turbo Pascal and a properly object orientated language such as C#, I’d go for the latter without a moment’s hesitation.
In this series, I’ll be explaining how to use some of the fundamental features of the C# language and the .NET Framework to write an adventure game. If you don’t happen to want to write games, don’t worry. The techniques I’ll be covering could equally well be used to create any type of application which needs to create hierarchies of classes, maintain data items or save and load objects to and from disk. This month I’ll look at one way of saving lists or networks of objects - which might potentially be anything from a list of Customers or Employees to a Map object containing Room objects containing Treasure objects, containing… well, you get the drift.
Objects In The Stream
It goes without saying that most programs operate on data. And, in most cases, you will want to be able to save that data and reload it at a later stage. In C#, you can write data to, and read it back from, stream objects. For example, the following code illustrates a basic technique for copying one file, “inp.xxx”, to another file, “outp.xxx”. The File class provides methods to open the first file for reading and the second file for writing. The code initialises a stream object on each of the two files and then just reads bytes from the first stream into a buffer and writes them out to the second stream:
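// Requires: using System.IO;
const int BUFFSIZE = 1024;
Stream instream = File.OpenRead("inp.xxx");
// Note: OpenWrite() does not truncate an existing file.
Stream outstream = File.OpenWrite("outp.xxx");
byte[] buffer = new byte[BUFFSIZE];
int numbytes;
// Read() returns 0 once the end of the stream has been reached.
while ((numbytes = instream.Read(buffer, 0, BUFFSIZE)) > 0)
{
    outstream.Write(buffer, 0, numbytes);
}
instream.Close();
outstream.Close();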
Streaming a discrete block of data such as a disk file is pretty straightforward. As you can see from the code above, you simply start with the first byte and carry on until you get to the last byte. Streaming objects that have been created during the execution of a program is not quite so simple. The problem is that the first object you want to stream may itself contain one or more other objects and these other objects may contain yet more objects.
The number, types and sizes of all these objects cannot be determined at the time the code is written. So, in order to save the objects to disk, you need to have some way of traversing the entire network of objects in the correct order, saving the appropriate types and amounts of data from each object you encounter.
When you reload the saved data, you have to be able to recreate the original network of objects, once again in the correct order and with the appropriate data. This is called Serialization and it is what we’ll be doing this month.
Serial Offenders
In brief, ‘serialization’ is the name given to the process of saving the state of one or more objects and transferring it to some other location – typically, but not necessarily, from memory to disk. When serialized, the internal state of an object is reduced to a single stream of bytes. The process of reconstructing objects from a saved stream of bytes is called deserialization.
Serialization tends to be regarded as one of the more arcane arts of object orientated programming. In some languages, serializing objects can be a difficult and error-prone process (see: Serial Killers). Fortunately, C# does most of the hard work for us. Indeed, serializing objects in C# is no more difficult than traditional file I/O in most other languages. To prove this point, let’s get started.
Load up the ConsoleApplication1.sln solution from the \serial1 directory which you can unzip from this month’s source code. This project shows how to serialize and deserialize some integers to and from disk. Although this might seem a rather trivial example, it demonstrates most of the fundamental techniques needed to do serialization of any type of data including user-defined objects.
It starts by creating an ArrayList object, named arrlist. An ArrayList is a dynamically resizable array. As with all else in C#, it is defined as a class. An instance of it (i.e. an object) is created using the new keyword.
Next we set up a for loop to iterate through int i from 0 to 9, adding the value of i to arrlist using the Add() method at each iteration. Just so we can see what’s happening, we also display each integer to the Console using WriteLine().
Having filled the ArrayList with integers, we now serialize the array to disk. First we open or create a file, “mydata.bin”, for writing using the OpenWrite() method of the File class and we assign the result to a Stream, st_out:
Stream st_out = File.OpenWrite("mydata.bin");
Next we create a BinaryFormatter object, binfmt, and call its Serialize() method, passing to it the Stream, st_out, and the ArrayList, arrlist. In .NET, the BinaryFormatter class can be used to serialize and deserialize whole lists or networks of connected objects. In .NET terminology, a network of objects is called a ‘graph’. In the current example, the ArrayList, arrlist, contains ten integers. It is not necessary to save each integer one by one. Instead we serialize the top-level or ‘root’ object, the ArrayList. C# does all the hard work of figuring out how many and what type of objects are in this list and saves these to the stream for us. While this may not seem any big deal when dealing with integers, bear in mind that it would be just as easy to save an array list containing other types of objects, including other array lists! When programming with other languages and libraries (a few languages such as Smalltalk and Java being honourable exceptions), serializing lists of lists of mixed objects can be a nightmare. C# and .NET make this remarkably simple – as we shall see shortly.
Deserializing objects is a bit like serializing them in reverse. We create a stream with File.OpenRead() instead of File.OpenWrite(). Then we call the BinaryFormatter object’s Deserialize() method instead of Serialize(). This reads in the stream of bytes and, from these bytes, it reconstructs the original graph of objects. We assign this object graph to the root object which, in the present case, is an ArrayList. Note that the return type of the Deserialize() method is ‘object’. This is the base type of the .NET class hierarchy and it is compatible with all its descendant classes. However, when deserializing a particular type of object, it is the programmer’s responsibility to specify the precise class type with a cast. I have done this by placing the class name, ArrayList, between parentheses before the call to binfmt2.Deserialize(). Finally, having finished with the input and output streams, I call their Close() methods to close the streams and release any associated resources. And that’s all there is to it!
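The steps described above can be sketched roughly as follows. This is not the actual source of the serial1 project, just a minimal reconstruction from the description: the class name IntRoundTrip and the helper methods Save() and Load() are assumptions made for illustration, while arrlist, arrlist2, binfmt, binfmt2, st_out and “mydata.bin” follow the names used in the text. It also assumes the .NET Framework of the time; BinaryFormatter has since been marked obsolete in newer .NET releases.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Sketch only: assumes the .NET Framework, where BinaryFormatter is available.
public static class IntRoundTrip   // hypothetical class name
{
    // Serializes the root object (the ArrayList) and everything it contains.
    public static void Save(string path, ArrayList list)
    {
        // The using statement closes the stream, as calling Close() would.
        using (Stream st_out = File.OpenWrite(path))
        {
            BinaryFormatter binfmt = new BinaryFormatter();
            binfmt.Serialize(st_out, list);
        }
    }

    // Rebuilds the object graph from the bytes stored in the file.
    public static ArrayList Load(string path)
    {
        using (Stream st_in = File.OpenRead(path))
        {
            BinaryFormatter binfmt2 = new BinaryFormatter();
            // Deserialize() returns object, so cast to the root object's type.
            return (ArrayList)binfmt2.Deserialize(st_in);
        }
    }

    public static void Main()
    {
        ArrayList arrlist = new ArrayList();
        for (int i = 0; i < 10; i++)
        {
            arrlist.Add(i);
            Console.WriteLine(i);
        }

        Save("mydata.bin", arrlist);
        ArrayList arrlist2 = Load("mydata.bin");

        foreach (int i in arrlist2)
        {
            Console.WriteLine(i);
        }
    }
}
```

If the cast names the wrong type, the program compiles but fails at run time with an InvalidCastException, which is why the cast is the programmer’s responsibility.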
A Class Act
To see an example of serializing user-defined objects, load ConsoleApplication2.sln from the \serial2 subdirectory. In this project, I have defined a simple class, myClass, which has two properties, MyInt and MyString, each of which has get and set accessors to private int and string data fields. Notice that this class is preceded by the [Serializable] attribute. This is required in order to allow the class to be serialized. If you comment out the [Serializable] attribute, an exception will occur when the program is run.
Find the Main() method. This is where all the action happens. First two myClass objects are created and initialised with an int and a string. Thereafter, the code is pretty much the same as in the previous project. As before, I serialize and deserialize the ArrayList containing the objects. The fact that this time I am serializing objects that I have defined myself rather than plain ints is of no real consequence. C# and .NET work out which types of data they are working with at any given moment.
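For readers without the sample code to hand, a class of the kind described might look something like the sketch below. The names myClass, MyInt, MyString, _myint and _mystring come from the article; the constructor signature is borrowed from the later serial3 listing, and the ObjectRoundTrip class, its RoundTrip() method and the file name “myobjects.bin” are purely hypothetical. As before, this assumes the .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]   // without this attribute, Serialize() throws a SerializationException
public class myClass
{
    private int _myint;
    private string _mystring;

    public myClass(int anint, string astring)
    {
        _myint = anint;
        _mystring = astring;
    }

    public int MyInt
    {
        get { return _myint; }
        set { _myint = value; }
    }

    public string MyString
    {
        get { return _mystring; }
        set { _mystring = value; }
    }
}

public static class ObjectRoundTrip   // hypothetical helper, not from the sample code
{
    // Writes the list to the file, then reads it straight back in.
    public static ArrayList RoundTrip(ArrayList list, string path)
    {
        BinaryFormatter binfmt = new BinaryFormatter();
        using (Stream st_out = File.OpenWrite(path))
        {
            binfmt.Serialize(st_out, list);
        }
        using (Stream st_in = File.OpenRead(path))
        {
            return (ArrayList)binfmt.Deserialize(st_in);
        }
    }

    public static void Main()
    {
        ArrayList arrlist = new ArrayList();
        arrlist.Add(new myClass(1, "one"));
        arrlist.Add(new myClass(2, "two"));

        ArrayList arrlist2 = RoundTrip(arrlist, "myobjects.bin");
        foreach (myClass mc in arrlist2)
        {
            Console.WriteLine(mc.MyInt);
            Console.WriteLine(mc.MyString);
        }
    }
}
```

The serialization code is the same shape as for the integers; only the contents of the list have changed.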
It doesn’t take much more programming effort to go on to create and serialize indefinitely complex networks of objects. Load up ConsoleApplication3.sln from the \serial3 directory. This time, the myClass definition has been extended by the addition of an ArrayList object called _myarrlist. When a myClass object is created, its _myarrlist object is created and initialised with a series of strings. The number of strings in the _myarrlist object is determined by the int argument, anint, which is passed to the myClass constructor. This is the full code of the constructor:
public myClass(int anint, string astring)
{
    _myint = anint;
    _mystring = astring;
    _myarrlist = new ArrayList();
    for (int i = 0; i < anint; i++)
    {
        _myarrlist.Add("Item #: " + i);
    }
}
I’ve also given the class a method called arraystring(). This returns a string made up of all the strings in _myarrlist, with each string separated by a carriage return. This makes it easier to print the strings to the Console later on.
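One way such a method could be written is sketched below. This is a guess at the shape of arraystring(), not the code from the serial3 project: in particular, the use of StringBuilder and of Environment.NewLine as the separator are assumptions, as is the small ArrayStringDemo class used to try it out.

```csharp
using System;
using System.Collections;
using System.Text;

[Serializable]
public class myClass
{
    private int _myint;
    private string _mystring;
    private ArrayList _myarrlist;

    public myClass(int anint, string astring)
    {
        _myint = anint;
        _mystring = astring;
        _myarrlist = new ArrayList();
        for (int i = 0; i < anint; i++)
        {
            _myarrlist.Add("Item #: " + i);
        }
    }

    // Joins every string in _myarrlist, one per line.
    // Assumption: Environment.NewLine stands in for the 'carriage return'.
    public string arraystring()
    {
        StringBuilder sb = new StringBuilder();
        bool first = true;
        foreach (string s in _myarrlist)
        {
            if (!first)
            {
                sb.Append(Environment.NewLine);
            }
            sb.Append(s);
            first = false;
        }
        return sb.ToString();
    }
}

public static class ArrayStringDemo   // hypothetical demo class
{
    public static void Main()
    {
        myClass mc = new myClass(3, "three");
        Console.WriteLine(mc.arraystring());
    }
}
```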
If you look at the code in Main(), you’ll find that this has barely altered since the previous projects. The serialization of an ArrayList of user-defined objects, each of which contains a variable-length ArrayList of strings, is no more difficult than serializing the simple list of integers that I started out with.
Finally, this month, I’ve started work on an application that will, in due course, put serialization to more practical use. Load up the cddb.sln solution. This is an incomplete version of a CD database. It defines a serializable class called CDClass, which has three string properties: Name, Artist and Comments.
The form provides a button labelled ‘Add CD’. When you click this button, a new CDClass object is created and its properties are assigned the strings from the three text boxes on the form. As you can see, I have also added a menu with Save and Load items, plus two buttons which are intended to move to the previous and next CD object in sequence and display its data in the appropriate text boxes. In its present form, the behaviour of the buttons and menu items has yet to be coded. You may like to try finishing the program yourself. Refer to the notes on ‘A Serialized CD Database’ in Going Further (below) for a few hints.
The ins and outs of Serialization and Iteration...
Serial Killers
One of the secrets of simple serialization using C# is the BinaryFormatter class. Here I’m using the drop-down list of members in Visual Studio to add the Serialize() method to my code.
Serialization describes the persistence of the internal state of an object. If you want to be able to reload an application and find it in the same state as when it was shut down, you will have to be able to save and restore the internal state of all the objects from which it is composed. When you serialize an object its internal data is written to the stream as a sequence of bytes. That is simple enough, you might think. The problems arise when you attempt to deserialize the data. After all, if it’s saved as a series of undifferentiated bytes, how can you know which particular type of object you need to construct when those bytes are read back into memory?
This problem makes deserializing quite a tricky task in many programming languages. Typically, the programmer has to write out each individual object one at a time. When written to a stream, each piece of data in each object has to be preceded by a tag that defines its type and, when necessary, its length. So if you stream out a ten-character string, you have to be sure to precede the string with a tag that says the next chunk of data is a string that’s ten characters long.
When deserializing the data, the program reads the tag, then reads in ten characters and assigns them to a string. Then it goes on to read the next tag and reconstruct an object that’s appropriate to whichever chunk of data follows that tag. Even in a relatively user-friendly system such as Delphi, deserialization can get very complicated when you start saving and restoring whole networks or ‘graphs’ of interlinked objects. C#, fortunately, does all the hard work for us!
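To make the contrast concrete, here is a small sketch of the hand-rolled, tag-based approach, written in C# for the sake of comparison. Everything in it is invented for illustration: the TaggedStream class, the tag values and the byte layout do not correspond to any real file format or to the way BinaryFormatter works internally. It copes only with ints and strings, which hints at how quickly the work grows once real objects are involved.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Text;

// Illustrative only: a made-up tagged format that handles ints and strings.
public static class TaggedStream   // hypothetical class name
{
    private const byte TagInt = 1;      // arbitrary tag values
    private const byte TagString = 2;

    public static void Write(Stream st, ArrayList items)
    {
        BinaryWriter w = new BinaryWriter(st);
        foreach (object item in items)
        {
            if (item is int)
            {
                w.Write(TagInt);              // tag: an int follows
                w.Write((int)item);
            }
            else if (item is string)
            {
                byte[] bytes = Encoding.UTF8.GetBytes((string)item);
                w.Write(TagString);           // tag: a string follows
                w.Write(bytes.Length);        // its length in bytes
                w.Write(bytes);
            }
            else
            {
                throw new NotSupportedException("No tag for " + item.GetType());
            }
        }
        w.Flush();
    }

    // Requires a seekable stream, since it checks Position against Length.
    public static ArrayList Read(Stream st)
    {
        ArrayList items = new ArrayList();
        BinaryReader r = new BinaryReader(st);
        while (st.Position < st.Length)
        {
            byte tag = r.ReadByte();
            if (tag == TagInt)
            {
                items.Add(r.ReadInt32());
            }
            else if (tag == TagString)
            {
                int len = r.ReadInt32();
                items.Add(Encoding.UTF8.GetString(r.ReadBytes(len)));
            }
            else
            {
                throw new InvalidDataException("Unknown tag " + tag);
            }
        }
        return items;
    }

    public static void Main()
    {
        ArrayList data = new ArrayList();
        data.Add(42);
        data.Add("ten chars!");

        MemoryStream ms = new MemoryStream();
        Write(ms, data);
        ms.Position = 0;                      // rewind before reading back

        foreach (object o in Read(ms))
        {
            Console.WriteLine(o);
        }
    }
}
```

Every new type needs its own tag, its own writing code and its own reading code, and objects that refer to other objects need more bookkeeping still.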
A Serialized CD Database
I’ve started work on a database to help keep track of a CD collection. What I haven’t done yet is add code to save and load the database objects. Should be simple…?
It’s always the case that the best way to learn something is by doing it. With that in mind, you may want to have a go at completing the CD database project (cddb.sln) in my sample code. This already defines the CDClass from which individual objects will be created. It also has a button to create and add new objects to an ArrayList. All you have to do is add code to take care of saving and loading the ArrayList object named CDList. Also, you might want to add code to the two buttons labelled ‘<<’ and ‘>>’ to allow the user to move to the previous and next objects in the CDList.
In fact, adding the appropriate functionality is not really that difficult. Using the techniques which I’ve explained this month, it should be trivial to code the saving and loading functionality using serialization. And as for the navigation buttons? Well, bear in mind that the first element is at index 0 and the last is at CDList.Count - 1. Also, look at the code in AddObBtn_Click(). Notice that CDList.Add() returns the index of the object that has just been added.
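If you would like a nudge in the right direction, the hints above might translate into something like the following console-only sketch. It is one possible approach rather than the solution in the sample code: CDClass, its three properties and CDList are named in the article, but the CDStore helper, its Save(), Load(), Next() and Prev() methods, the file name “cddb.bin” and the sample strings are all hypothetical. It leaves out the form, buttons and menus entirely and, as before, assumes the .NET Framework’s BinaryFormatter.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class CDClass
{
    private string _name = "";
    private string _artist = "";
    private string _comments = "";

    public string Name { get { return _name; } set { _name = value; } }
    public string Artist { get { return _artist; } set { _artist = value; } }
    public string Comments { get { return _comments; } set { _comments = value; } }
}

public static class CDStore   // hypothetical helper class
{
    public static void Save(string path, ArrayList cdList)
    {
        using (Stream st = File.OpenWrite(path))
        {
            new BinaryFormatter().Serialize(st, cdList);
        }
    }

    public static ArrayList Load(string path)
    {
        using (Stream st = File.OpenRead(path))
        {
            return (ArrayList)new BinaryFormatter().Deserialize(st);
        }
    }

    // Move forward, but never beyond the last index (count - 1).
    public static int Next(int pos, int count)
    {
        return (pos < count - 1) ? pos + 1 : pos;
    }

    // Move back, but never before the first index (0).
    public static int Prev(int pos)
    {
        return (pos > 0) ? pos - 1 : pos;
    }
}

public static class CDDemo   // hypothetical demo class
{
    public static void Main()
    {
        ArrayList CDList = new ArrayList();

        CDClass cd = new CDClass();
        cd.Name = "First CD";
        cd.Artist = "Some Artist";
        cd.Comments = "Placeholder entry";
        int pos = CDList.Add(cd);   // Add() returns the new item's index

        CDStore.Save("cddb.bin", CDList);
        ArrayList loaded = CDStore.Load("cddb.bin");

        pos = CDStore.Next(pos, loaded.Count);   // already at the end, so stays put
        Console.WriteLine(((CDClass)loaded[pos]).Name);
    }
}
```

In the real application the Next() and Prev() logic would sit in the click handlers of the ‘>>’ and ‘<<’ buttons, with the text boxes refreshed from the object at the new position.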
I’ll develop this application further next month. Amongst other things, I shall be creating a descendant class of ArrayList which will include a Pos property to indicate the position of the object that is currently being displayed. I shall also add buttons to ‘wind’ and ‘rewind’ to the last or first objects in the CDList. You might want to see if you can code all this yourself then compare your solution with mine next month.
All For One and One Foreach
ArrayList is one of several collection types defined within the System.Collections namespace. The items in any of these collections can be visited one by one with a foreach loop, which uses this syntax:
foreach (ItemType item in aCollection)
Unlike a for loop, foreach does not need to be provided with high and low indexes. It simply iterates through all the items available. There are several examples of foreach in this month’s code. This one, which iterates through integers, may bear a superficial resemblance to a standard for loop:
foreach (int i in arrlist2)
{
    Console.WriteLine(i);
}
But bear in mind that foreach can iterate through collections of any type of object. In the following example, it iterates through user-defined myClass objects, assigning each object in turn to the variable, mc. As you can see, this lets the code use the mc object’s properties, MyInt and MyString:
foreach (myClass mc in arrlist2)
{
    Console.WriteLine(mc.MyInt);
    Console.WriteLine(mc.MyString);
}
June 2005