My notes from an online C++ course
Named after inventors Adelson-Velsky and Landis, AVL trees are height-balanced BSTs.
A balance factor = height of left subtree - height of right subtree
, is used to ensure there are no excessively unbalanced nodes. It minimises the overall height of a tree by maximising the use of the nodes available. Balance factors of [-1, 0, 1] are sought after.
Generally, balance is taken as the absolute value |height of left subtree - height of right subtree|
since it does not always matter which subtree is longer. Thus, a tree is imbalanced
if |height of left subtree - height of right subtree| > 1
.
The above balance factors assume the absolute balance factor. An imbalance of 3 or more is largely ignored because the method of counting levels and deducing balancing factors concerns three consecutive nodes only (then 2 - 0 = 2, at most).
In building an AVL tree, current nodes are rotated so as to balance the overall tree.
The above imbalanced node is referred to as a left-of-left imbalanced
node. This naming convention states the relationship between the imbalanced node (3) and the inserted node (1). The rotation is similarly referred to as LL-rotation
, or right-rotation
or even clockwise rotation
. There is no standard nomenclature. In this article, we will use LL-rotation to match the LL-imbalance.
All rotations (and balance factors) are performed over three nodes only.
Analogously, there is also RR-imbalanced
nodes possible and rectified with RR-rotations
. Note that both LL-rotations and RR-rotations lead to the same balanced tree.
Another two types of imbalanced nodes are LR
and RL
.
RL-imbalanced nodes and rotations follow similarly.
LL- and RR-rotations are viewed as single rotations, and RL- and LR-rotations are viewed as double rotations. Rotations not only reveal how to balance BSTs into AVL trees, they also relate all BSTs together.
If one were to observe the balance factor of a node and neglect the absolute expression, then:
If one were to expand to larger BSTs, then one can observe the effect on other, largely ignored, child nodes/subtrees.
What happens when a new node is inserted? Note the changes to the balance factors, and how node A becomes imbalanced.
The insertion is described by looking at three nodes only, in this case, A, B and C. As far as these three nodes are concerned, node N was inserted and resulted in node A becoming LL-imbalanced. Node N traversed the left child of A (B) and the left child of B (C), presumably because it has the lowest value of the tree in view. The insertion caused on the three nodes to become imbalanced. Other nodes may become imbalanced but they would require independent attention.
Note that this description is not concerned about the relationship between node N and node CL. Node N could be the left-child or right-child of node CL. To balance the tree, we need to perform an LL-rotation around node A.
Note, also, how the relationship between BR1 and BR2 are preserved. Only the edge of B-BR1 is rotated.
Regarding a LR imbalance and rotation, consider the following:
The node inserted was either BRL or BRR, either of which traversed left of A and right of B. The LR rotation thus focuses on the three nodes A, B, and BR.
Replacing the generalised labels with real values reveals how one BST is converted into another BST.
The complete code for AVL insertion and rotations is given below.
#include <stdio.h>
struct Node
{
struct Node *lchild;
int data;
int height;
struct Node *rchild;
}*root = NULL;
int NodeHeight(struct Node *p)
{
int hl, hr;
//if p and its lchild are NULL, then return 0
hl = p && p->lchild ? p->lchild->height : 0;
hr = p && p->rchild ? p->rchild->height : 0;
//return the greatest height
return hl > hr ? hl + 1 : hr + 1;
}
int BalanceFactor(struct Node *p)
{
int hl, hr;
//if p and its lchild are NULL, then return 0
hl = p && p->lchild ? p->lchild->height : 0;
hr = p && p->rchild ? p->rchild->height : 0;
//positive or negative balanace factors assumed
return hl - hr;
}
//performed when left-subtree is larger than the right-subtree from node p
struct Node * LLRotation(struct Node *p)
{
//temporary pointers: pl is p's lchild; plr is pl's rchild (recall, plr's edge might need re-assigning)
struct Node *pl = p->lchild;
struct Node *plr = pl->rchild;
//perform the quick switch ('single' rotation)
pl->rchild = p;
p->lchild = plr;
//with nodes in place, re-assign their height property
p->height = NodeHeight(p);
pl->height = NodeHeight(pl);
//change the identity of root if necessary
if(root == p)
root = pl;
//return address of previously imbalanced node
return pl;
}
//ideas for LRRotation carry across from LLRotation
struct Node * LRRotation(struct Node *p)
{
struct Node *pl = p->lchild;
struct Node *plr = pl->rchild;
pl->rchild = plr->lchild;
p->lchild = plr->rchild;
plr->lchild = pl;
plr->rchild = p;
pl->height = NodeHeight(pl);
p->height = NodeHeight(p);
plr->height = NodeHeight(plr);
if(root == p)
root = plr;
return plr;
}
//for brevity, RR and RL rotations, while analogous, are omitted from this snippet
struct Node *RInsert(struct Node *p, int key){
//t is the pointer to the inserted Node
struct Node *t = NULL;
//if insert to an empty tree...
if(p == NULL)
{
t = (struct Node *)malloc(sizeof(struct Node));
t->data = key;
t->height = 1;
t->lchild = t->rchild = NULL;
return t;
}
// if insertion requires rebalancing
// choose left or right from root/parent
if(key < p->data)
p->lchild = RInsert(p->lchild, key);
else if(key > p->data)
p->rchild=RInsert(p->rchild,key);
//check balance of root/parent
p->height = NodeHeight(p);
if(BalanceFactor(p) == 2 && BalanceFactor(p->lchild) == 1)
//p is an LL-imbalanced node
return LLRotation(p);
else if(BalanceFactor(p) == 2 && BalanceFactor(p->lchild) == -1)
//p is an LR-imbalanced node
return LRRotation(p);
else if(BalanceFactor(p) == -2 && BalanceFactor(p->rchild) == -1)
//p is an RR-imbalanced node
return RRRotation(p);
else if(BalanceFactor(p) == -2 && BalanceFactor(p->rchild) == 1)
//p is an RL-imbalanced node
return RLRotation(p);
return p;
}
int main()
{
root = RInsert(root, 50);
RInsert(root, 10);
RInsert(root, 20);
return 0;
}
Balanced BSTs are generated by inserting nodes, checking balance factors and then rotating if required. This also ensures that absolute balance factors greater than 2 are produced.
When two or more imbalanced nodes are introduced on insertion, the first node that is re-balanced is the first ancestor node. Successive ancestor imbalanced nodes are then balanced.
Essentially, the procedure to delete nodes from an AVL tree is the same the deletion of nodes from a BST. The difference is is that rotations are carried out with AVL trees if one or more nodes become imbalanced.
Deleting a node from one side makes the opposite side longer. One can consider a series of general cases.
L1
rotation, where 1 represents the balance factor of the left-child node. A single rotation is needed, that is, LL.L-1
rotation. A double rotation, LR, is then needed. Note the result of L1 and L-1 is the same.L0
rotation.The nomenclature L1
, L-1
and L0
are specifically referring to rotations which follow deletions.
When deleting from the left-side, the right-side becomes ‘heavier’ and gives rise to R1
, R-1
and R0
rotations in the same way as outlined above.
The minimum and maximum no. of nodes for an AVL tree of given height h
is the same as that for BSTs. Recall that BSTs discussed previously are not always strict binary trees.
h
The last relation is actually the Fibonacci series, a ‘balanced’ series.
Note that these formulae apply when the height of an AVL tree with one node is taken as 1
not 0
.
The minimum and maximum height with a given no. of nodes n
is given by the inverse of the above formulae.
h
The last relation is approximately 1.44 log[2] (n + 2)
.