Tree Collection 1

An implementation of a Tree Collection in C#.
By Nicholas Butler

profile for Nicholas Butler at Stack Overflow, Q&A for professional and enthusiast programmers
Publicly recommend on Google


[Update]

After this article, I wrote a new and improved Tree Collection using generics. It is available here.

Introduction

I needed a tree collection for something or other a while back. I couldn't find one, so I decided to roll my own -- after all, how hard could it be? 2000 lines of code later... I hope you can gain from my experience, and at least use this code as a starting point. I included everything I could think of, including a verbose event set.

Background

This code started life as a Composite pattern. However, I soon realized that I didn't need two classes as a node is a tree, and a tree is a node. So I rolled them into one class called NodeTree. This has probably been done before, but it just seemed like a good idea to me.

You can derive a class from NodeTree to provide specialization to a particular data type, but you don't have to. If you are happy to rely on runtime type checking and exceptions, then you can create a strongly-typed tree by providing the data type in the static constructor:

ITree tree = NodeTree.NewTree( typeof( Element ) );

Using the code

I wrote a test application to check my code as I wrote it. This grew to almost as long as the collection. It is the best resource to find out how I intended the collection to be used. I will elaborate on various aspects in the following sections:

Interfaces

Although only one class, NodeTree, is being used to model the tree, the consumer does not directly use it. They manipulate the collection through two interfaces: INode and ITree. The NodeTree class derives from both of these interfaces:

[Serializable]
internal class NodeTree : INode, ITree, ISerializable

So a NodeTree is the internal representation of both a node and a tree. However, the two interfaces only declare members that make sense to that particular point of view. There is considerable equivalent behavior between the two interfaces; for example they both declare InsertChild methods. In general, the INode interface is the larger, although ITree does declare some unique members like Clear.

Conversion

Each interface has a property that allows conversion to the other.

INode declares a property Tree, which gets a reference to the node's root node, as an ITree.

ITree declares a property Root, which gets a reference to the tree's root node, as an INode.

Data

The purpose of a collection is to hold data. In this collection, the data is held in a private field, _Data, in the NodeTree class. Access to this object is through the INode interface, which declares a public property Data:

object Data { get; set; }

ITree does not declare this property, as it does not make sense, as we shall see later.

A lot of the members act on a data object rather than a node. For example, one InsertChild method is declared as:

INode InsertChild( object o )

It takes the data object as a parameter, creates a node wrapper, returns the new INode. It is actually an error to try to insert an INode, although you can insert a ITree, as we shall see later.

Copying

There are two ways to copy the subtree defined by a NodeTree: Copy and DeepCopy.

Copy creates new tree nodes, but sets the data property of each new node to reference the same instance of the data as the original node.

DeepCopy attempts to make a copy of the data as well. I have defined an interface IDeepCopy as:

internal interface IDeepCopy
{
    object CreateDeepCopy();
}

If the data object supports this interface, then CreateDeepCopy is called on the data from each node being copied.

If the data does not support this interface, then an attempt is made to instantiate a new object of the same type as the data object, using the copy constructor:

Activator.CreateInstance( data.GetType(), new object[] { data } );

If no copy constructor exists either, then DeepCopy gives up and just copies a reference to the data.

Exceptions

I make extensive use of exceptions to check for invalid operations. This reflects my preference (in this collection, anyway) for runtime type checking.

In particular, there are many places which throw an exception if you pass an unwrapped node rather than a tree.

Structure

There is a structural difference between a tree accessed through INode and one accessed through ITree. The ITree NodeTree has an extra node at the root, which has as its data object a RootObject object. This is a protected class nested in NodeTree, so you can't instantiate one. This is to make sure that exactly one node has a RootObject as its data, that is the root node of an ITree. It also proved a handy place to store the data type of the tree's data objects.

So there is an inherent difference between an INode and an ITree. If you try to pass an INode to methods that take an ITree, an exception will be raised. This is by design. Generally, you use ITree most of the time, and only use INode to reference a particular node that is contained in an ITree.

This sounds complicated, but actually is quite intuitive when you get used to it.

For example, if you want to move a node from one place in a tree to another, then you could use the following:

void Move( INode source, INode destination )
{
    ITree temp = source.Cut();
    destination.InsertChild( temp );
}

Note that the Cut method returns an ITree, not an INode. Similarly, the InsertChild method takes an ITree, not an INode.

Details

I will now go through the interfaces in detail. Most of this is obvious, but it's worth skimming through or using as a reference.

Navigation

These properties return an INode because they reference nodes within a tree:

INode Parent    { get; }
INode Previous  { get; }
INode Next      { get; }
INode Child     { get; }

INode Root      { get; }
INode Top       { get; }
INode First     { get; }
INode Last      { get; }

INode LastChild { get; }

The child property references the first child of a node.

In a tree, there is one root NodeTree, which is hidden from the user. The direct children of this root node are the "top" nodes.

Boolean properties

These properties get information about the structure around a node:

bool IsTree      { get; }
bool IsRoot      { get; }
bool IsTop       { get; }
bool HasParent   { get; }
bool HasPrevious { get; }
bool HasNext     { get; }
bool HasChild    { get; }
bool IsFirst     { get; }
bool IsLast      { get; }

The IsTree property checks for the existence of a RootObject as the node's data. If it is there, then it returns true.

The IsRoot property simply checks for the existence of a parent node.

Search

These methods perform a linear search of the subtree below the current node, looking for a node with the specified data.

INode this[ object o ] { get; }

bool Contains( object o );

Insert objects

These methods create a wrapper NodeTree around the specified data object, insert this node into the tree, and return the new node:

INode InsertPrevious( object o );
INode InsertNext    ( object o );
INode InsertChild   ( object o );
INode Add           ( object o );
INode AddChild      ( object o );

Insert nodes

These methods just throw exceptions as it is illegal to add a node to a tree - you must add a new object, or a complete tree:

void InsertPrevious( INode node );
void InsertNext    ( INode node );
void InsertChild   ( INode node );
void Add           ( INode node );
void AddChild      ( INode node );

Insert trees

These methods insert the specified tree into this tree:

void InsertPrevious( ITree tree );
void InsertNext    ( ITree tree );
void InsertChild   ( ITree tree );
void Add           ( ITree tree );
void AddChild      ( ITree tree );

Cut, copy, remove nodes

These methods operate on nodes:

ITree Cut           ();
ITree Copy          ();
ITree DeepCopy      ();
void  Remove        ();

Cut, copy, remove data

These methods operate on the first node found that has the specified object as its data:

ITree Cut           ( object o );
ITree Copy          ( object o );
ITree DeepCopy      ( object o );
void  Remove        ( object o );

More boolean properties

These properties return values that indicate whether the corresponding methods are valid:

bool CanMoveToParent   { get; }
bool CanMoveToPrevious { get; }
bool CanMoveToNext     { get; }
bool CanMoveToChild    { get; }
bool CanMoveToFirst    { get; }
bool CanMoveToLast     { get; }

Moving

These methods perform common movements on the current node:

void MoveToParent   (); 
void MoveToPrevious (); 
void MoveToNext     (); 
void MoveToChild    (); 
void MoveToFirst    ();
void MoveToLast     ();

Counting

This method returns the number of direct children of a node:

int DirectChildCount { get; }

Collections

INode and ITree both derive from IValuesCollection, which is derived from ICollection, which is derived from IEnumerable, so they can both act as enumerators in themselves. I have added these four extra collections that operate on this NodeTree's children:

IValuesCollection Nodes                   { get; }
IValuesCollection AllChildren             { get; }
IValuesCollection DirectChildren          { get; }
IValuesCollection DirectChildrenInReverse { get; }
The Nodes collection is exactly the same as the DirectChildren collection.

The interface IValuesCollection just adds the Values property to ICollection. This means that your can access this property of all the collections to get a collection that returns the data in a node, not the node itself. This means that you can access the data objects directly with code like this :

foreach ( Element o in node.DirectChildren.Values )

DataType

This property is only declared in the ITree interface. It retrieves the data type of the data objects held in the tree:

Type DataType { get; }

Clear

This method is only declared in the ITree interface. It just empties a tree:

void Clear();

Events

The following events are made available by the NodeTree class:

event NodeTreeDataEventHandler     Validate;
event EventHandler                 Clearing;
event EventHandler                 Cleared;
event NodeTreeDataEventHandler     Setting;
event NodeTreeDataEventHandler     SetDone;
event NodeTreeInsertEventHandler   Inserting;
event NodeTreeInsertEventHandler   Inserted;
event EventHandler                 Cutting;
event EventHandler                 CutDone;
event NodeTreeNodeEventHandler     Copying;
event NodeTreeNodeEventHandler     Copied;
event NodeTreeNodeEventHandler     DeepCopying;
event NodeTreeNodeEventHandler     DeepCopied;

You can attach to an event at the node or tree level. Every event is raised for the current node, and then for the containing tree. I thought about raising the events for each parent of the current node, but this seemed a bit too much.

The user will probably only ever attach to the Validate event of a tree, but the option is there to attach to every event at every node.

The default Validate handler checks the type of the data object, and throws an exception if this does not match the type set in the tree's constructor.

See "Points of Interest" which explains about using an EventHandlerList object to minimize the space overhead of so many events.

Serialization

I have added serialization support by implementing ISerializable. This is useful if you want to persist a tree, or if you want to put a tree onto the clipboard.

I have added support for XmlSerialization in version 4.

ICollection

Both interfaces derive from ICollection, and thus also from IEnumerable.

Points of Interest

Serialization

Note the use of ISerializable, and the persistence of a Version number to help future-proof the serialization process.

The default serialization implementation is inflexible, but these two operations go a long way to mitigating its failures.

I wrote a couple of Adapter classes to support XmlSerialization - see the bottom of Form1.cs.

You can select which formatter to use by moving the commented lines in the handlers for the Serialization and Deserialization buttons.

Events

I have made a lot of events available - probably more than anyone will ever need outside of a test application. This would have had an unacceptable increase in the space requirements of the NodeTree class, so I used an EventHandlerList object to minimize the impact. Basically, instead of having a collection for each event in a class, you only have one collection for all events, and use key objects to only record events that are attached. Thus, each instance of NodeTree just has one instance of EventHandlerList, and this only records attached events. This makes the raising an event a little more complicated, but not very much so.

Conclusion

This collection is not meant to be a panacea. It favors functionality over efficiency, which has made it quite fat. However, it does fill a gap in the .NET Framework, and is certainly better than using an invisible TreeView. I present it here as another option to add to your toolbox.

History

  • 2005 February 25 : Version 5 : Upgraded all enumerations to collections. See Collections.
  • 2004 June 7 : Version 4 : Added support for XmlSerialization
  • 2004 April 19 : Version 3 : Enumerations now throw exceptions properly.
  • 2004 March 30 : Version 2 : Fixed naming conventions.
  • 2004 March 30 : Version 1.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License.

Contact

If you have any feedback, please feel free to contact me.

Publicly recommend on Google: