Introduction

The primary motivation is to find or develop a data structure that allows for efficient, arbitrary retrieval (search), insertion, and deletion of stored elements.

  • Array-based Lists:
    • $O(n)$ time complexity for insertion and deletion (requiring element shifting).
    • $O(1)$ fast retrieval based on indices.
    • Offers efficient $O(\log n)$ binary search when sorted.
  • Linked Lists:
    • Efficient $O(1)$ time complexity for insertion and deletion at the head or tail.
    • Inefficient $O(n)$ time complexity for arbitrary search, insertion, and deletion.
    • Efficient binary search is not possible due to the lack of random access.
  • Stacks and Queues:
    • These are efficient for their operations but are restricted access data structures that do not allow arbitrary element manipulation.
  • Possible Alternative: Trees:
    • Tree data structures have the potential to provide $O(\log n)$ efficiency for retrieval, insertion, and deletion operations.

Definitions & Terminology

  • A tree is an acyclic graph.
  • For practical algorithmic purposes, we define a tree as a collection of nodes (which can hold keys, data, etc.) connected by edges.
  • Trees are oriented: each node has exactly one parent (except for the root) and potentially multiple children.
  • A node with no parents is called the root of the tree. All other child nodes are oriented “downward” relative to the root.
  • Nodes not immediately connected by an edge can maintain ancestor, descendant, or cousin relationships.
  • A node that has no children is called a leaf.
  • A tree structure where all nodes have at most two children is known as a binary tree.
  • A binary tree is also horizontally oriented: each node may have a left child, a right child, or both.
  • Example: A Binary Tree

Properties:

  • In a tree, any two nodes are connected by exactly one unique simple path.

  • The maximum possible number of nodes at any level $k$ is $2^k$.

  • Consequently, the maximum total number of nodes $n$ in any binary tree with depth $d$ is given by the sum of nodes across all levels up to $d$:

    $$
    n = 2^0 + 2^1 + 2^2 + \dots + 2^d = \sum_{k=0}^d 2^k = 2^{d+1} - 1
    $$

  • Given a full binary tree containing $n$ nodes, its depth $d$ can be derived as:

    $$
    d = \log_2(n+1) - 1
    $$

  • In essence, for such balanced trees, $d = O(\log n)$.

Motivation Restated: If we can design a tree-based data structure whose operations are proportional to its depth $d$, we could potentially achieve efficient retrieval (search), insertion, and deletion in $O(\log n)$ time.