Tree Collection 2

An implementation of a tree collection using generics.
By Nicholas Butler

profile for Nicholas Butler at Stack Overflow, Q&A for professional and enthusiast programmers
Publicly recommend on Google


Table of Contents

Introduction

I wrote a tree collection for .NET 1.1 last year - A Tree Collection. This has proved to be quite popular, so I decided to write one for .NET 2.0 using generics. I reused a lot of code, but I had to write a lot of new interfaces from scratch, especially the enumerators. Once again, I don't know why MS didn't include an implementation of such a basic collection in the framework. This implementation fills that gap and is useable straight out of the box.

I recommend reading the Structure section first, but if you just want to use this code, then look at the Quickstart section.

Structure

This code started life as a composite pattern. However, I soon realized that I didn't need two classes as a node is a tree, and a tree is a node. So I rolled them into one class called NodeTree. This has probably been done before, but it just seemed like a good idea to me.

Interfaces

Although only one class, NodeTree, is being used to model the tree, the consumer does not directly use it. They manipulate the collection through two interfaces: INode and ITree. The NodeTree class derives from both of these interfaces:

partial class NodeTree<T> : ITree<T>, INode<T>

So, a NodeTree is the internal representation of both a node and a tree. However the two interfaces only declare members that make sense to that particular point of view. There is considerable equivalent behaviour between the two interfaces; for example, they both declare InsertChild methods. In general, the INode interface is the larger, although ITree does declare some unique members like Clear.

Data

The purpose of a collection is to hold data. In this collection, the data is held in a private field, _Data, in the NodeTree class. Access to this object is through the INode interface, which declares a property Data:

partial interface INode<T>
{
    T Data { get; set; }
}

ITree does not declare this property, as it does not make sense, as we shall see later.

Node structure

Each node has the following structure:

Figure 1: Node structure

Tree structure

Nodes link together to form a tree with the following structure:

Figure 2: Tree structure

Terminology

The Root node defines a tree. You cannot use it to store data - it is just a place-holder. The Top node(s) defines the user part of the tree. You can have multiple Top nodes, but you don't have to. You are responsible for building your tree to match your requirements. A branch is the collection of nodes which are the Children of a common Parent and are connected by the Previous and Next properties. The node at the beginning of a branch is called a First node. Similarly, the node at the end of a branch is called a Last node. The Child node of a parent is the First node in its branch.

Quickstart

This section will get you going in the shortest possible time. The examples show a tree with a data type of Element.

Firstly, you must create your tree:

ITree<Element> tree = NodeTree<Element>.NewTree();

Next, create a top node:

INode<Element> top = tree.AddChild( new Element() );

Note that you add instances of your data type and a node is created and inserted into the tree for you.

Then add leaf nodes:

INode<Element> leaf1 = top.AddChild(new Element());
INode<Element> leaf2 = top.AddChild( new Element() );

You can now iterate over your tree:

foreach( INode<Element> node in tree.All.Nodes )
    DoSomething( node.Data );

or:

foreach( Element element in tree.All.Values )
    DoSomething( element );

Documentation

This section details the capabilities of the collection.

Instantiation

These static methods create new trees:

partial class NodeTree<T>
{
    public static ITree<T> NewTree() { ... }
    public static ITree<T> 
       NewTree(IEqualityComparer<T> dataComparer) { ... }
}

The first method creates a new tree with the default DataComparer, using EqualityComparer<T>.Default. If T implements IEquatable, then it uses that implementation. Otherwise, it uses the Equals method.

The second method allows you to specify the IEqualityComparer<T> to use.

Conversion

Each interface has a property that allows conversion to the other:

partial interface ITree<T>
{
    INode<T> Root { get; }
}

partial interface INode<T>
{
    ITree<T> Tree { get; }
}

Counts

Various metrics about a tree or node are available:

partial interface ITree<T>
{
    int Count { get; }
    int DirectChildCount { get; }
}

partial interface INode<T>
{
    int Count { get; }
    int DirectChildCount { get; }

    int Depth { get; }
    int BranchIndex { get; }
    int BranchCount { get; }
}

Count returns the number of nodes below the current node + 1 for the current node. The root node is not counted. DirectChildCount returns just the number of direct children of the current node. The Depth of a node is the number of parents it has, not including the root node. Thus, the depth of a top node is 0 and the depth of a top node's child is 1, etc. A branch is a collection of sibling nodes. BranchCount is the number of sibling nodes and BranchIndex is the zero-based index of the current node in its branch.

Relationships

These properties provide access to other nodes in a tree:

partial interface ITree<T>
{
}

partial interface INode<T>
{
    INode<T> Parent { get; }
    INode<T> Previous { get; }
    INode<T> Next { get; }
    INode<T> Child { get; }

    INode<T> Root { get; }
    INode<T> Top { get; }
    INode<T> First { get; }
    INode<T> Last { get; }

    INode<T> LastChild { get; }
}

The Parent, Previous, Next and Child properties allow you to navigate through the immediate relations of a node. More distant relations can be accessed through the Root, Top, First, Last and LastChild properties of a node.

Boolean properties

These properties provide information about the relations of a node:

partial interface ITree<T>
{
}

partial interface INode<T>
{
    bool IsTree { get; }
    bool IsRoot { get; }

    bool IsTop { get; }
    bool IsFirst { get; }
    bool IsLast { get; }

    bool HasParent { get; }
    bool HasPrevious { get; }
    bool HasNext { get; }
    bool HasChild { get; }
}

The IsTree property is only true for a root node at the base of a tree. The IsRoot property is true for any node that has no parent. This should only be true for a root node at the base of a tree, as nodes cannot exist outside a tree. The IsTop, IsFirst and IsLast properties provide information about the position of a node within a tree. The HasParent, HasPrevious, HasNext and HasChild properties provide information about the immediate relations of a node.

Adding an element

These methods allow you to populate your tree:

partial interface ITree<T>
{
    INode<T> InsertChild( T o );

    INode<T> AddChild( T o );
}

partial interface INode<T>
{
    INode<T> InsertPrevious( T o );
    INode<T> InsertNext( T o );
    INode<T> InsertChild( T o );

    INode<T> Add( T o );
    INode<T> AddChild( T o );
}

These methods wrap the given element in a new node and insert or add this node in the tree. The InsertChild methods insert a node at the beginning of the child branch and the AddChild methods add a node to the end of the child branch. The Add method adds a node to the end of the current branch.

Adding a tree

These methods work with complete trees:

partial interface ITree<T>
{
    void InsertChild( ITree<T> tree );

    void AddChild( ITree<T> tree );
}

partial interface INode<T>
{
    void InsertPrevious( ITree<T> tree );
    void InsertNext( ITree<T> tree );
    void InsertChild( ITree<T> tree );

    void Add( ITree<T> tree );
    void AddChild( ITree<T> tree );
}

These methods work in the same way as adding an element, but operate on complete trees.

Moving a node

These methods allow you to move nodes around in your tree:

partial interface ITree<T>
{
}

partial interface INode<T>
{
    bool CanMoveToParent { get; }
    bool CanMoveToPrevious { get; }
    bool CanMoveToNext { get; }
    bool CanMoveToChild { get; }
    bool CanMoveToFirst { get; }
    bool CanMoveToLast { get; }

    void MoveToParent();
    void MoveToPrevious();
    void MoveToNext();
    void MoveToChild();
    void MoveToFirst();
    void MoveToLast();
}

The Can properties indicate whether a particular operation is possible. The methods actually do the work of moving a node (and its children along with it).

Copying

There are two ways to copy the sub-tree defined by a NodeTree: Copy and DeepCopy. Copy creates new tree nodes, but sets the data property of each new node to reference the same instance of the data as the original node. DeepCopy attempts to make a copy of the data as well. I have defined an interface IDeepCopy as:

public interface IDeepCopy
{
    object CreateDeepCopy();
}

If the data object supports this interface, then CreateDeepCopy is called on the data from each node being copied. If the data does not support this interface, then ICloneable is tried. If this interface is also not implemented, an attempt is made to instantiate a new object of the same type as the data object, using the copy constructor through reflection. If no copy constructor exists, then DeepCopy gives up and just copies a reference to the data.

Manipulating sub-trees

These methods allow you to create a new tree from a node and its children:

partial interface ITree<T>
{
    ITree<T> Copy();
    ITree<T> DeepCopy();
}

partial interface INode<T>
{
    ITree<T> Cut();
    ITree<T> Copy();
    ITree<T> DeepCopy();
    void Remove();
}

These methods operate on a node to produce a tree.

Working with elements

These methods find a node that contains the specified element and then operate on that node:

partial interface ITree<T>
{
    INode<T> this[ T item ] { get; }

    bool Contains( INode<T> node );
    bool Contains( T item );

    ITree<T> Cut( T item );
    ITree<T> Copy( T item );
    ITree<T> DeepCopy( T item );
    bool Remove( T item );
}

partial interface INode<T>
{
    INode<T> this[ T item ] { get; }

    bool Contains( INode<T> node );
    bool Contains( T item );

    ITree<T> Cut( T item );
    ITree<T> Copy( T item );
    ITree<T> DeepCopy( T item );
    bool Remove( T item );
}

The indexer property returns the first node that has the specified data element, using the tree's DataComparer. The methods use the indexer to find the required node and then operate on that node.

Enumerators

These interfaces and members allow you to perform enumerations:

public interface IEnumerableCollection<T> : IEnumerable<T>, 
                                                ICollection
{
    bool Contains( T item );
}

public interface IEnumerableCollectionPair<T>
{
    IEnumerableCollection<INode<T>> Nodes { get; }
    IEnumerableCollection<T> Values { get; }
}

partial interface ITree<T> : IEnumerableCollectionPair<T>
{
    IEnumerableCollectionPair<T> All { get; }
    IEnumerableCollectionPair<T> AllChildren { get; }
    IEnumerableCollectionPair<T> DirectChildren { get; }
    IEnumerableCollectionPair<T> DirectChildrenInReverse { get; }
}

partial interface INode<T> : IEnumerableCollectionPair<T>
{
    IEnumerableCollectionPair<T> All { get; }
    IEnumerableCollectionPair<T> AllChildren { get; }
    IEnumerableCollectionPair<T> DirectChildren { get; }
    IEnumerableCollectionPair<T> DirectChildrenInReverse { get; }
}

The IEnumerableCollection<T> interface is the base of all the enumerators. The IEnumerableCollectionPair<T> interface provides the two views of an EnumerablePair: the Nodes and the Values enumerations. Both ITree and INode implement IEnumerableCollectionPair<T>; these implementations return the All EnumerablePair.

The four properties that return EnumerablePairs provide access to differing parts of the tree, or sub-tree under a node.

Sorting

These methods allow you to sort direct children or whole sub-trees.

The implementation uses List<T>.Sort, which uses the QuickSort algorithm.

partial interface ITree<T>
{
    void SortAllChildren();
    void SortAllChildren( Comparison<T> comparison );
    void SortAllChildren( IComparer<T> comparer );
}

partial interface INode<T>
{
    void SortAllChildren();
    void SortAllChildren( Comparison<T> comparison );
    void SortAllChildren( IComparer<T> comparer );

    void SortDirectChildren();
    void SortDirectChildren( Comparison<T> comparison );
    void SortDirectChildren( IComparer<T> comparer );
}

Serialization

This implementation supports serialization to binary or XML streams:

partial interface ITree<T>
{
    void XmlSerialize( Stream stream );
}

[ Serializable ]
partial class NodeTree<T> : ITree<T>, 
                            INode<T>, ISerializable
{
    public static ITree<T> XmlDeserialize( Stream stream )
}

Binary serialization is provided through the ISerializable interface. To use this method, your would write something like this:

private void BinarySerialize()
{
    using ( Stream stream = File.Open( Filename, 
                      FileMode.Create, FileAccess.Write ) )
    {
        IFormatter f = new BinaryFormatter();

        ITree<Element> tree = CurrentNode.Copy();

        f.Serialize( stream, tree );
    }
}
private ITree<Element> BinaryDeserialize()
{
    using ( Stream stream = File.Open( Filename, 
                         FileMode.Open, FileAccess.Read ) )
    {
        IFormatter f = new BinaryFormatter();

        return ( ITree<Element> ) f.Deserialize( stream );
    }
}

To use binary serialization, your element data type must be marked with the Serializable attribute.

XML serialization is provided by methods exposed in the ITree interface and NodeTree class. To use these methods, you would write something like this:

private void XMLSerialize()
{
    using ( Stream stream = File.Open( Filename, 
                     FileMode.Create, FileAccess.Write ) )
    {
        ITree<Element> tree = CurrentNode.Copy();

        tree.XmlSerialize( stream );
    }
}

private ITree<Element> XMLDeserialize()
{
    using ( Stream stream = File.Open( Filename, 
                        FileMode.Open, FileAccess.Read ) )
    {
        return NodeTree<Element>.XmlDeserialize( stream );
    }
}

To use the XML serialization, your element data type must support standard XML serialization. By default, the XML serializer serializes all public fields and public properties with get and set accessors, which may not be what you want. See the documentation for the XmlSerializer class in MSDN.

Events

The two interfaces, ITree and INode, both expose many events:

partial interface ITree<T>
{
    event EventHandler<NodeTreeDataEventArgs<T>> Validate;
    event EventHandler Clearing;
    event EventHandler Cleared;
    event EventHandler<NodeTreeDataEventArgs<T>> Setting;
    event EventHandler<NodeTreeDataEventArgs<T>> SetDone;
    event EventHandler<NodeTreeInsertEventArgs<T>> Inserting;
    event EventHandler<NodeTreeInsertEventArgs<T>> Inserted;
    event EventHandler Cutting;
    event EventHandler CutDone;
    event EventHandler<NodeTreeNodeEventArgs<T>> Copying;
    event EventHandler<NodeTreeNodeEventArgs<T>> Copied;
    event EventHandler<NodeTreeNodeEventArgs<T>> DeepCopying;
    event EventHandler<NodeTreeNodeEventArgs<T>> DeepCopied;
}

partial interface INode<T>
{
    event EventHandler<NodeTreeDataEventArgs<T>> Validate;
    event EventHandler<NodeTreeDataEventArgs<T>> Setting;
    event EventHandler<NodeTreeDataEventArgs<T>> SetDone;
    event EventHandler<NodeTreeInsertEventArgs<T>> Inserting;
    event EventHandler<NodeTreeInsertEventArgs<T>> Inserted;
    event EventHandler Cutting;
    event EventHandler CutDone;
    event EventHandler<NodeTreeNodeEventArgs<T>> Copying;
    event EventHandler<NodeTreeNodeEventArgs<T>> Copied;
    event EventHandler<NodeTreeNodeEventArgs<T>> DeepCopying;
    event EventHandler<NodeTreeNodeEventArgs<T>> DeepCopied;
}

You can attach to an event at the node or tree level. Every event is raised for the current node, and then for the containing tree. I thought about raising the events for each parent of the current node, but this seemed a bit too much. The default Validate handler checks the type of the data object, and throws an exception if this does not match the type of the tree element.

See Points of interest which explains about using an EventHandlerList object to minimize the space overhead of so many events.

Points of interest

Serialization

Note the use of ISerializable, and the persistence of a version number to help future-proof the serialization process. The default serialization implementation is inflexible, but these two operations go a long way to mitigating its failures. You may like to use a SerializationBinder to return the current types when types from a previous version are being deserialized.

Events

I have made a lot of events available - probably more than anyone will ever need outside of a test application. This would have had an unacceptable increase in the space requirements of the NodeTree class, so I used an EventHandlerList object to minimize the impact. Basically, instead of having a collection for each event in a class, you only have one collection for all events, and use key objects to only record events that are attached. Thus, each instance of the NodeTree just has one instance of EventHandlerList, and this only records attached events. This makes the raising an event a little more complicated, but not very much so.

Version 2: The EventHandlerList is now only created when an event is hooked. This saves about 16 bytes per node when no events are hooked.

Conclusion

This collection is not meant to be a panacea. It favors functionality over efficiency, which has made it quite fat. However, it does fill a gap in the .NET Framework, and is certainly better than using an invisible TreeView. I present it here as another option to add to your toolbox.

History

  • 20th November, 2005: Version 1
    • First release.
  • 24th November, 2005: Version 2
    • Now only creates an EventHandlerList for nodes that have event handlers.
    • Space overhead per node is now 28 bytes (without events).
    • Added "Memory" tab to show the basic performance metrics.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License.

Contact

If you have any feedback, please feel free to contact me.

Publicly recommend on Google: