University of Connecticut University of UC Title Fallback Connecticut

Create a rooted, 3-taxon tree

The next step is to add a function that creates a 3-taxon rooted tree. We will call this function from the Tree constructor in order to have a way to generate a test tree.

This way of generating a tree is admittedly tedious, and we will soon abandon it for a more general method that automatically creates trees from newick tree descriptions

Additions to tree.hpp

Most of the code below is already in your tree.hpp file. Just add the bold, blue text. Note that the clear() function call has been commented out in the Tree constructor.

#pragma once

#include <memory>
#include <iostream>
#include "node.hpp" 

namespace strom
    {

    //class TreeManip;
    //class Likelihood;

    class Tree
        {

            //friend class TreeManip;
            //friend class Likelihood;

        public:

                                        Tree();
                                        ~Tree();

            void                        createTestTree();     

        private:

            void                        clear();

            bool                        _is_rooted;
            Node *                      _root;
            unsigned                    _nleaves;
            Node::PtrVector             _preorder;
            Node::Vector                _nodes;
      
        public:
        
            typedef std::shared_ptr<Tree> SharedPtr;
        };

    inline Tree::Tree()
        {
        std::cout << "Constructing a Tree" << std::endl;
        //clear();
        createTestTree();
        }

    inline Tree::~Tree()
        {
        std::cout << "Destroying a Tree" << std::endl;
        }

    inline void Tree::clear()
        {
        _is_rooted = false;
        _root = 0;
        _nodes.clear();
        _preorder.clear();
        }


inline void Tree::createTestTree()
        {
        clear();
        _nodes.resize(6);

        Node * root_node       = &_nodes[0];
        Node * first_internal  = &_nodes[1];
        Node * second_internal = &_nodes[2];
        Node * first_leaf      = &_nodes[3];
        Node * second_leaf     = &_nodes[4];
        Node * third_leaf      = &_nodes[5];

        // Here is the structure of the tree (numbers in
        // parentheses are node numbers, other numbers
        // are edge lengths):
        //
        // first_leaf (0)   second_leaf (1)   third_leaf (2)
        //      \              /                  /
        //       \ 0.1        / 0.1              /
        //        \          /                  /
        //     second_internal (3)             / 0.2
        //             \                      /
        //              \ 0.1                /
        //               \                  /
        //                first_internal (4)
        //                        |
        //                        | 0.1
        //                        |
        //                    root_node (5)
        //
        root_node->_parent            = 0;
        root_node->_left_child        = first_internal;
        root_node->_right_sib         = 0;
        root_node->_number            = 5;
        root_node->_name              = "root node";
        root_node->_edge_length       = 0.0;

        first_internal->_parent       = root_node;
        first_internal->_left_child   = second_internal;
        first_internal->_right_sib    = 0;
        first_internal->_number       = 4;
        first_internal->_name         = "first internal node";
        first_internal->_edge_length  = 0.1;

        second_internal->_parent      = first_internal;
        second_internal->_left_child  = first_leaf;
        second_internal->_right_sib   = third_leaf;
        second_internal->_number      = 3;
        second_internal->_name        = "second internal node";
        second_internal->_edge_length = 0.1;

        first_leaf->_parent           = second_internal;
        first_leaf->_left_child       = 0;
        first_leaf->_right_sib        = second_leaf;
        first_leaf->_number           = 0;
        first_leaf->_name             = "first leaf";
        first_leaf->_edge_length      = 0.1;

        second_leaf->_parent          = second_internal;
        second_leaf->_left_child      = 0;
        second_leaf->_right_sib       = 0;
        second_leaf->_number          = 1;
        second_leaf->_name            = "second leaf";
        second_leaf->_edge_length     = 0.1;

        third_leaf->_parent           = first_internal;
        third_leaf->_left_child       = 0;
        third_leaf->_right_sib        = 0;
        third_leaf->_number           = 2;
        third_leaf->_name             = "third leaf";
        third_leaf->_edge_length      = 0.2;

        _is_rooted             = true;
        _root                  = root_node;
        _nleaves               = 3;

        // Note that root node is not included in _preorder
        _preorder.push_back(first_internal);
        _preorder.push_back(second_internal);
        _preorder.push_back(first_leaf);
        _preorder.push_back(second_leaf);
        _preorder.push_back(third_leaf);
        }
    }

Explanation of createTestTree()

We’ve added a member function to the Tree class named createTestTree(), and this function is now called, instead of clear(), in the Tree constructor. Let’s go through this function to see how it creates a tree.

  _nodes.resize(6);

This line sets the size of the _nodes vector to 6. The Node constructor will be called 6 times when this line is executed because 6 Node objects are created, one for each of the first 6 elements of the _nodes vector.

  T * root_node       = &_nodes[0];
  T * first_internal  = &_nodes[1];
  T * second_internal = &_nodes[2];
  T * first_leaf      = &_nodes[3];
  T * second_leaf     = &_nodes[4];
  T * third_leaf      = &_nodes[5];

These 6 lines create pointers to each of the 6 node objects stored in the _nodes vector. This will make it easier (and less error-prone) to hook up each node to its parent and children. Note the ampersand symbol (&): in your mind, replace the & symbol with the words "memory address of." The asterisk symbol (*) means "pointer," and Node * means "pointer to an object of type Node." Pointers store the memory address of an object. Thus, the pointer named root_node stores the memory address of the first element (element 0) of the _nodes vector.

  root_node->_parent = 0;
  root_node->_left_child = first_internal;
  root_node->_right_sib = 0;
  root_node->_number = 0;
  root_node->_name = "root node";
  root_node->_edge_length = 0.0;

The lines above provide values for all the data members of the Node object pointed to by the root_node pointer. The arrow symbol (->) is used to specify a data member or method of an object when a pointer is used to refer to the object, while a dot (.) would be used if attributes are being set for the object itself. The root node has no parent and no right sibling, so we set these pointers to the value 0. The left child of the root node should point to the first internal node, and first_internal is a pointer to that node.

The other nodes are set up similarly according to the tree diagram given in the comments.

  _is_rooted = true;
  _root = root_node;
  _nleaves = 3;

The above two lines set data members of the Tree class. We specify true for _is_rooted because we want this tree to be interpreted as a rooted tree (setting this to false would imply that the root node is really a leaf and observed data is associated with the root node).


  _preorder.push_back(first_internal);
  _preorder.push_back(second_internal);
  _preorder.push_back(first_leaf);
  _preorder.push_back(second_leaf);
  _preorder.push_back(third_leaf);

These final 5 lines build up the _preorder vector one element at a time. Each element is a pointer to a Node object already stored in the _nodes vector. The order in which we push pointers onto _preorder is important. It is called _preorder because it stores nodes in preorder sequence (ancestral nodes visited before their children). One could walk through all the nodes of the tree in preorder sequence by simply iterating through this vector. Iterating through _preorder in reverse would walk the tree in postorder sequence (child nodes visited before their parents).

Important: note that the root node is not included in the _preorder vector. There are two reasons for this. First, we already have a pointer (_root) to the root node. Second, in most situations the root node either is ignored (rooted trees) or is treated specially (unrooted trees), so you will soon see that we almost always start iterating through nodes with the first (and only) child of the root node anyway: starting at the root node would only cause us to add extra lines to skip the root node.