Trees
- Acyclic and connected.
- Rooted, directed and ordered.
Binary Tree T defined recursively as: $T = \emptyset$ or $T = (data, T_L, T_R)$, where $T_L$ and $T_R$ are binary trees.
Tree Height
Empty tree has a height of $-1$, so $height(T) = 1 + \max(height(T_L), height(T_R))$.
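A minimal sketch of this recurrence, assuming a simple TreeNode struct with left/right pointers (the notes don't define one):

```cpp
#include <algorithm>

// Hypothetical node type; these notes don't define one.
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
};

// Empty tree has height -1; otherwise 1 + the taller subtree's height.
int height(const TreeNode* root) {
    if (root == nullptr) { return -1; }
    return 1 + std::max(height(root->left), height(root->right));
}
```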
Tree Properties
- Full : every node has either 0 or 2 children.
- Perfect : every leaf node is at the same level, and every internal node has 2 children.
- Complete : perfect at every level except the last, where all leaves are pushed leftward.
Not every full tree is complete. Not every complete tree is full.
We use BSTs because we can't run binary search on a linked list.
- Height $h$ is bounded by $lg(n) \leq h \leq n$
- Can be full, complete, and/or perfect
- $n$ nodes in a binary tree means $n+1$ NULL pointers
- BST has an average height of $O(lg(n))$
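To make the binary-search connection concrete, here is a minimal find() sketch, reusing the hypothetical TreeNode struct from the height example above:

```cpp
// Standard BST lookup on the hypothetical TreeNode struct from the
// height sketch above. Returns nullptr if the key is not present.
TreeNode* find(TreeNode* root, int key) {
    if (root == nullptr || root->data == key) { return root; }
    if (key < root->data) { return find(root->left, key); }
    return find(root->right, key);
}
```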
Tree Analysis
- Best case BST: a complete tree
- Worst case BST: a linked-list tree
- If inserting a sorted list of elements, start at the middle to avoid this outcome.
- All height-based BST operations have a lower bound of $\Omega(\log(n))$ time and an upper bound of $O(n)$ time
- Min height of a BST with $n$ nodes: $O(lg(n))$
- Max height of a BST with $n$ nodes: $O(n)$
Traversals
- Pre/Post/In-Order Traversals: always Left then Right; only where the current node is visited changes.
  - Pre-Order: Self, L, R -- Stack (DFS)
  - In-Order: L, Self, R -- produces an ordered list
  - Post-Order: L, R, Self
- Level-Order Traversal -- Queue (BFS)
  - Enqueue the root
  - While the queue isn't empty: dequeue, and if the element isn't NULL, print it and enqueue its children.
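Minimal sketches of an in-order and a level-order traversal, again assuming the hypothetical TreeNode struct from above:

```cpp
#include <iostream>
#include <queue>

// In-order: L, Self, R -- prints a BST's keys in ascending order.
void inOrder(const TreeNode* root) {
    if (root == nullptr) { return; }
    inOrder(root->left);
    std::cout << root->data << ' ';
    inOrder(root->right);
}

// Level-order (BFS): enqueue the root, then repeatedly dequeue,
// print non-NULL nodes, and enqueue their children.
void levelOrder(const TreeNode* root) {
    std::queue<const TreeNode*> q;
    q.push(root);
    while (!q.empty()) {
        const TreeNode* curr = q.front();
        q.pop();
        if (curr == nullptr) { continue; }
        std::cout << curr->data << ' ';
        q.push(curr->left);
        q.push(curr->right);
    }
}
```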
Traversals vs. Searches
A traversal visits each node exactly once. A search visits nodes only until it finds the node being searched for.
An invalid in-order traversal is one that doesn't strictly ascend.
BFS : Visits all nodes on same level before going deeper.
DFS : Visits leaf nodes as quickly as possible.
Two Child Removal
- Find In Order Predecessor (rightmost element of left subtree)
- Swap IOP with node
- Delete removal node
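A sketch of the two-child case, assuming the hypothetical TreeNode struct from above; the helper name and the pointer-to-pointer walk are illustrative, not from the notes:

```cpp
#include <utility>

// Two-child case only: swap the node's data with its in-order
// predecessor (IOP), then splice out and delete the old IOP node,
// which has no right child.
void removeTwoChild(TreeNode*& node) {
    TreeNode** iop = &node->left;              // IOP lives in the left subtree
    while ((*iop)->right != nullptr) {
        iop = &(*iop)->right;                  // walk to the rightmost element
    }
    std::swap(node->data, (*iop)->data);       // swap IOP with the removal node
    TreeNode* toDelete = *iop;
    *iop = toDelete->left;                     // reattach the IOP's left child
    delete toDelete;                           // free the spliced-out node
}
```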
Run Time Analysis
| Operation | BST Avg. Case | BST Worst Case | Sorted Array | Sorted List |
|---|---|---|---|---|
| find() | O(log(n)) | O(n) | O(log(n)) (binary search) | O(n) |
| insert() | O(log(n)) | O(n) | O(n) | Given: O(1), Random: O(n) |
| delete() | O(log(n)) | O(n) | O(n) | Given: O(1), Random: O(n) |
| traverse() | O(n) | O(n) | O(n) | O(n) |
AVL Trees
An AVL tree is considered height balanced if the absolute value of the balance factor of each node is at most 1, where
BalanceFactor = heightOfRightSubtree - heightOfLeftSubtree
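The rotation and rebalance code below calls updateHeight() and getBalanceFactor(); here is a minimal sketch of those helpers, assuming an AVL node that caches its height in a height_ field (the actual struct isn't shown in these notes):

```cpp
#include <algorithm>

// Assumed AVL node shape; the real struct isn't shown in these notes.
struct AVL {
    int key;
    int height_ = 0;
    AVL* left_ = nullptr;
    AVL* right_ = nullptr;
};

// Height of an empty subtree is -1, matching the convention above.
int heightOf(const AVL* node) {
    return node == nullptr ? -1 : node->height_;
}

void updateHeight(AVL* node) {
    if (node == nullptr) { return; }
    node->height_ = 1 + std::max(heightOf(node->left_), heightOf(node->right_));
}

// BalanceFactor = heightOfRightSubtree - heightOfLeftSubtree
int getBalanceFactor(const AVL* node) {
    return node == nullptr ? 0 : heightOf(node->right_) - heightOf(node->left_);
}
```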
Insertion
- If the current node is NULL, insert the new node there and return.
- If the key is greater than the current key, go to the right child.
- If the key is less than the current key, go to the left child.
- Rebalance and update the height on the way back up (the height update can be done inside rebalance).
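A minimal recursive sketch of this insertion, assuming the hypothetical AVL struct and helpers above and the rotation/rebalance routines from the section below:

```cpp
// rebalance() is defined in the Rotations section below.
void rebalance(AVL*& subtree);

// Recursive AVL insert: walk down by key, create the node at a NULL
// link, then rebalance (which also updates heights) on the way back up.
void insert(AVL*& subtree, int key) {
    if (subtree == nullptr) {
        subtree = new AVL{key};         // base case: insert the new node here
        return;
    }
    if (key > subtree->key) {
        insert(subtree->right_, key);   // greater: go to the right child
    } else if (key < subtree->key) {
        insert(subtree->left_, key);    // less: go to the left child
    }                                   // equal: ignore duplicate keys
    rebalance(subtree);                 // fix balance and height on the way up
}
```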
Removal
1. find(subNodeKey)
2. If find does not return a valid node, end the remove method.
3. Deal with children:
   - If no children, delete the node.
   - Else if two children:
     1. Find the In-Order Predecessor (IOP), the rightmost value of the left subtree.
     2. swap(IOP, node)
     3. remove(node->left)
   - Else (one child), just overwrite the removal node with its child.
4. Rebalance (and update heights) on the way back up, as in insertion.
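A condensed sketch of remove() under the same assumptions as the insertion sketch; the IOP swap mirrors the two-child removal described earlier:

```cpp
#include <utility>

// Recursive AVL removal: find the key, handle the 0/1/2-children cases,
// then rebalance on the way back up the search path.
void remove(AVL*& subtree, int key) {
    if (subtree == nullptr) { return; }        // key not found: end the remove
    if (key < subtree->key) {
        remove(subtree->left_, key);
    } else if (key > subtree->key) {
        remove(subtree->right_, key);
    } else if (subtree->left_ != nullptr && subtree->right_ != nullptr) {
        // Two children: swap with the IOP (rightmost node of the left
        // subtree), then remove the key from the left subtree.
        AVL* iop = subtree->left_;
        while (iop->right_ != nullptr) { iop = iop->right_; }
        std::swap(subtree->key, iop->key);
        remove(subtree->left_, key);
    } else {
        // Zero or one child: splice the node out.
        AVL* child = (subtree->left_ != nullptr) ? subtree->left_ : subtree->right_;
        delete subtree;
        subtree = child;
    }
    rebalance(subtree);
}
```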
Rotations
All rotations run in O(1) time, and are done in rebalance(). No rotations possible in find() call.
void rotateRight(AVL*& root) {
    AVL* origRoot = root;
    root = root->left_;
    origRoot->left_ = root->right_;
    root->right_ = origRoot;
    updateHeight(root->right_);
    updateHeight(root);
}

void rotateLeft(AVL*& root) {
    AVL* origRoot = root;
    root = root->right_;
    origRoot->right_ = root->left_;
    root->left_ = origRoot;
    updateHeight(root->left_);
    updateHeight(root);
}

void rotateRightLeft(AVL*& root) {
    rotateRight(root->right_);
    rotateLeft(root);
}

void rotateLeftRight(AVL*& root) {
    rotateLeft(root->left_);
    rotateRight(root);
}
void rebalance(AVL*& subtree) {
    if (subtree == NULL) { return; }
    int balance = getBalanceFactor(subtree);
    int leftBalance = getBalanceFactor(subtree->left_);
    int rightBalance = getBalanceFactor(subtree->right_);
    // If balance is negative, the subtree is left heavy
    if (balance == -2) {
        if (leftBalance == -1) {
            rotateRight(subtree);
        } else {
            rotateLeftRight(subtree);
        }
    }
    // If balance is positive, the subtree is right heavy
    if (balance == 2) {
        if (rightBalance == 1) {
            rotateLeft(subtree);
        } else {
            rotateRightLeft(subtree);
        }
    }
    updateHeight(subtree);
}

Run Time Analysis
| AVL Operation | Run Time |
|---|---|
| find() | O(h), where h = O(lg(n)) |
| insert() | O(h), where h = O(lg(n)) |
| remove() | O(h), where h = O(lg(n)) |
| traverse() | O(n) |
Summary of Advantages/Disadvantages
- Advantages
  - Run time of O(log(n)) is an improvement over arrays and lists.
  - Great for applications where the exact key is unknown (nearest neighbor, interval search, etc.).
- Disadvantages
  - Not the fastest find() run time; outperformed by hash tables.
  - AVL trees do not work well with disk seeks. They must be stored in memory, so they are not good for large data sets.
When you see "Red-Black Tree", you should think of an AVL Tree. They're both balanced BSTs.
B-Trees
- Use for large data sets. Minimizes the number of disk seeks. Chunks data for better caching.
- B-trees can be expensive if frequently modified
- Self-balancing and sorted
B-Tree of order m
- All internal nodes have one more child than keys.
- All leaves are on the same level and hold no more than $m-1$ keys.
- When a node reaches $m$ keys, split it into two nodes and push the median value up to the parent.
- Binary search within a node to find where a key belongs when inserting.
- All non-root internal nodes have between $\lceil m/2 \rceil$ and $m$ children.
- The root can be a leaf or have between 2 and $m$ children.
- All keys are sorted within a node.
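As an illustration only (this struct is not from the notes), a node of such a B-tree might look like:

```cpp
#include <vector>

// Illustrative node of a B-tree of order m: up to m-1 sorted keys and,
// for internal nodes, exactly keys.size() + 1 children.
struct BTreeNode {
    std::vector<int> keys;              // sorted, size <= m - 1
    std::vector<BTreeNode*> children;   // empty for a leaf, else keys.size() + 1
    bool isLeaf() const { return children.empty(); }
};
```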
B-Tree Upper and Lower Bounds
- Max number of keys in a tree of height $h$: $m^{h+1} - 1$
- Min number of keys in a tree of height $h$: $2\lceil m/2 \rceil^{h} - 1$
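As a worked check of these bounds (not from the original notes): a B-tree of order $m = 3$ and height $h = 2$ holds at most $3^{3} - 1 = 26$ keys and at least $2\lceil 3/2 \rceil^{2} - 1 = 7$ keys.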
Time Complexity
| B-Tree Operation/Property | Run Time |
|---|---|
| Search through a single node | $O(lg(m))$ (binary search over at most $m-1$ keys) |
| height | $O(\log_m(n))$ |
| find() | height $\times$ single-node search $= O(lg(m) \cdot \log_m(n))$ |
| insert() | Same run time as find() |
K-D Trees
- CANNOT be used as a dictionary, because we can't insert or remove elements efficiently.
- "Think about how a K-D tree is created. We have to know information about all of the nodes that will eventually be in our tree so we can find the median and medians of the subarrays. Now that we have created our tree, we try to add in a new node. This could potentially change which node is our median, meaning we'd have to reconstruct the entire tree which is not efficient. Same with removals"
Constructor
- Quickselect allows us to find the k-th smallest element (used to find the median at each split).
- Partition actually does the swapping
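A minimal sketch of quickselect with a Lomuto-style partition (illustrative; the notes don't give an implementation), returning the k-th smallest element:

```cpp
#include <utility>
#include <vector>

// Lomuto partition: place the last element (pivot) into its sorted
// position and return that index; smaller elements end up to its left.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j) {
        if (a[j] < pivot) { std::swap(a[i], a[j]); ++i; }
    }
    std::swap(a[i], a[hi]);
    return i;
}

// Quickselect: recurse only into the side that contains the k-th
// smallest element (k is 0-indexed); O(n) on average.
int quickselect(std::vector<int>& a, int lo, int hi, int k) {
    if (lo == hi) { return a[lo]; }
    int p = partition(a, lo, hi);
    if (k == p) { return a[p]; }
    if (k < p)  { return quickselect(a, lo, p - 1, k); }
    return quickselect(a, p + 1, hi, k);
}
```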
Run Time
- Worst Case Find Nearest Neighbor: $O(n)$
- Average Case Find Nearest Neighbor: $O(\log(n))$