My notes from an online C++ course
A linked list a collection of nodes which contain a value (data) sometimes referred to as the key and an address to the next node. Unlike arrays, these have potentially unlimited memory capacity (hardware memory dependent) and are distributed over the heap. Linked lists are not stored in the stack.
The first node or head node (pointer) points to the second node, which contains the first value. The head node does not possess a data value but is merely a pointer which resides in the stack. Subsequent nodes reside in the heap. For circular linked lists, described below, the head node can have a data value.
Linked lists can be implemented by structures:
struct Node
{
int data;
struct Node* next; //an example of a self-referential structure
};
Since int
pointers occupy the same amount of memory as the value they reference, a single node above requires twice the memory of the data type. The head node can be declared in C with:
struct Node * p;
p = (struct Node *) malloc(sizeof(struct Node));
In C++, the node can defined by a class or a structure. As a structure, the first node is declared with C++ using:
struct Node * p;
p = new Node;
//access the data values
p->data = 10;
p->next = 0; //points to nothing; a null pointer NULL
A useful set of statements in which the head node traverses the linked list is given below. Pointer q
is the temporary pointer which traverses across the linked list. Pointer p
is initially located in the stack, as the head node.
struct Node *p, *q;
q = p; // assign q to the location of p->data
q = p->next; // move q to the data of the next node (p->next is itself a pointer to the next data value)
//check if q != NULL (or effectively, !(p->next)) then continue...
p = p->next; // move p to the data of the next node
A null pointer has an address of zero, a pointer is NULL
or !pointer
is true. Pointers which point to some definite address are true
.
In C and C++, with structures:
struct Node
{
int data;
struct Node *next;
}*first=NULL; //first is a global pointer to Node
void Display(struct Node *p)
{
while(p!=NULL)
{
//insert a counter > 1 here to record the number of nodes processed...
printf("%d ", p->data);
p = p->next;
}
}
The recursive form of Display can be constructed as follows:
void RecursiveDisplay(struct Node *p)
{
if(p!=NULL)
{
RecursiveDisplay(p->next);
printf("%d ", p->data);
}
}
Note that the above RecursiveDisplay also displays the nodes in reverse order. Changing the order so that we have a tail recursion, as opposed to head recursion, will print the nodes in the order they appear starting from the head node.
In C++, with classes:
class Node{
public:
int data;
Node* next;
};
//somewhere in main() or some other calling function (nullptr is a C++ keyword)
Node* p = head;
while (p != nullptr){
cout << p->data << " -> " << flush;
p = p->next;
}
Both recursive and iterative forms of Display() are time complexity of O(n). The space complexity for recursive form is also O(n+1) or just O(n), compared to iterative which is a constant O(1).
A linked list can be initialised from an array A
, of length n
. This pattern is applied for many of the examples outlined below.
struct Node
{
int data;
struct Node *next;
}*first = NULL, *second = NULL, *third = NULL; //have three global pointers to potentially build three linked lists
void newLinkedList(int A[], int n)
{
int i;
struct Node *temp, *last;
//build the first, head node (simply a pointer) and initialise the second node (key and pointer)
first=(struct Node *) malloc(sizeof(struct Node));
first->data = A[0];
first->next = NULL;
last = first;
//build the self-referential struct Node, in Node.
for(i = 1; i < n; i++)
{
temp = (struct Node*) malloc(sizeof(struct Node)); //new node created somewhere on the heap
temp->data = A[i]; //fill with key with array data
temp->next = NULL; //set the next node location as NULL (signifies the 'new end')
last->next = temp; //update the previous node's NULL pointer to point to temp's 'data'
last = temp; //set pointer last to point to temp's key
//by now, the temp node is part of the growing linked list...
}
}
The sum of all nodes and the number of nodes in a linked list can be implemented by cycling through while
loops, incrementing or summing as long as the Node
pointer p->next == true
. The sum code is commented in the node count function below:
int count(struct Node *p)
{
int l = 0;
// int s=0;
while(p)
{
l++;
// s += p->data;
p = p->next;
}
return l;
}
//the recursive form is...
int RecursiveCount(struct Node *p)
{
if(p!=NULL)
return RecursiveCount(p->next) + 1;
else
return 0;
}
One can traverse through the entire linked list to find the largest key (and similarly the smallest key).
int Max(struct Node *p)
{
int max = INT32_MIN;
while(p)
{
if(p->data > max)
max = p->data;
p = p->next;
}
return max;
}
Searching linked lists is generally quite expensive compared to arrays. One could perform a linear search, processing each node until the required node data is found.
Improvements to the linear search of linked lists borrows from the improvements to linear searching of arrays. Recall, for arrays:
Transposition
: where the required element is moved one place closer to the initial first element A[0]
and if necessary, substituting the initial element. This makes subsequent searches for the same element faster.Move to head/front
: where the required swaps places with the initial element. This is particularly useful if the element is frequently sought after with all other elements mostly neglected.For linked lists, it is more convenient to code the second improvement, move-to-front.
For linked lists, we move nodes as opposed to swapping the key between nodes. In this case, we have two pointers, one as a tail-pointer which follows (and is one node behind) the key pointer. Upon finding the key, we assign the tail pointer to the address of the node after the key pointer. The code is given below:
struct Node * LinearSearch(struct Node *lead, int key)
{
struct Node *tail = 0;
while(lead != NULL)
{
if(key == lead->data)
{
tail->next = lead->next;
lead->next = first;
first = lead;
return lead;
}
tail = lead; //move tail pointer one node forward
lead = lead->next; //move leading pointer one node forward
}
return NULL;
}
First create a new node anywhere in the heap. Then decide where in the linked list the new node will be inserted.
Inserting at the beginning is O(1) time complexity. After declaring the new node, assign the data value and set the new node’s next
pointer to point to the first data node in the linked list. Change the first pointer (in the stack) address so that it points to the new data of the new node.
Inserting a new node at a given node N
anywhere after the first node is more involved. Build a new node somewhere in the heap. Traverse the linked list N-1
times with a pointer p
which was initially set to the first node (use a for
loop). Set the new node’s next
pointer to match p->next
, then change p->next
to point to N
. The time complexity here is worst-case, O(N). Compare this to an insertion into an array which is similarly expensive.
The code (with the structure repeated for reference) is given below for both cases:
struct Node
{
int data;
struct Node *next;
}*first=NULL, *second=NULL, *third=NULL;
void Insert(struct Node *p, int index, int x)
{
struct Node *newNode;
int i;
if(index < 0 || index > count(p))
return;
newNode = (struct Node *) malloc(sizeof(struct Node));
newNode->data = x;
if(index == 0)
{
newNode->next = first;
first = newNode;
}
else
{
for(i = 0; i < index-1; i++)
p=p->next;
newNode->next = p->next;
p->next = newNode;
}
}
int count(struct Node *p)
{
int l = 0;
while(p)
{
l++;
p = p->next;
}
return l;
}
Incidentally, the insert()
method can be looped to create a new linked list, extending from the beginning or end, or inserting somewhere in between nodes.
To insert a new node after the last node only, the time complexity is constant O(1), as long as you know the address of the first and last nodes.
struct Node
{
int data;
struct Node *next;
}*first=NULL, *second=NULL, *third=NULL;
insertLast(int x)
{
Node * t = new Node;
t->data = x;
t->next = NULL;
if (!first)
{
//if the linked list is empty
first = t;
last = t;
}
else
{
last->next = t;
last = t;
}
}
One can check if a linked list is sorted by calling the function isSorted()
:
int isSorted(struct Node *p)
{
int min = -65536;
while(p != NULL)
{
if(p->data < x)
return 0;
min = p->data;
p = p->next;
}
return 1;
}
This achieved by traversing along the linked list, comparing the data value and incrementing the pointer value at the same time.
struct Node
{
int data;
struct Node *next;
}*first=NULL, *second=NULL, *third=NULL;
void insertInSorted(struct Node *p, int x)
{
struct Node *newNode, *trailing = NULL;
newNode = (struct Node*) malloc(sizeof(struct Node));
newNode->data = x;
newNode->next = NULL;
if(first == NULL)
//if the linked list is empty
first = newNode;
else
{
//traverse along the linked list, while reassigning the trailing pointer
while(p && p->data<x)
{
trailing = p;
p = p->next;
}
if(p == first)
{
newNode->next = first;
first = newNode;
}
else
{
newNode->next = trailing->next;
trailing->next = newNode;
}
}
}
The most important step to remember here is to free the memory that the deleted node occupies. Use free
.
//assume here that the linked list holds integers, the type returned
int Delete(struct Node *p, int index)
{
struct Node *trailing = NULL;
int x = -1, i;
if(index < 1 || index > count(p))
return -1;
if(index == 1)
{
trailing = first;
x = first->data;
first = first->next;
free(trailing); //use free() if malloc was used; use delete, an operator, if new was used
return x;
}
else
{
for(i = 0; i < index-1; i++)
{
trailing = p;
p = p->next;
}
trailing->next = p->next;
x = p->data;
free(p); //use free() if malloc was used; use delete, an operator, if new was used
return x;
}
}
The essential approach here is the compare adjacent node data values and free the node if the values are equal.
void removeDuplicate(struct Node *p)
{
struct Node *trailing = p->next;
while(trailing != NULL)
{
if(p->data != trailing->data)
{
//move along the list
p = trailing;
trailing = trailing->next;
}
else
{
p->next = trailing->next;
free(trailing);
trailing = p->next;
}
}
}
Two methods can reverse the nodes of a linked list.
The latter method is generally preferred. Linked list nodes reside at seemingly random positions in the heap and so it would be largely meaningless to interchange data values when the resultant list is also randomly distributed. Instead, one should change the direction of the link, given be Node’s next
pointer.
Initialise an (auxiliary) array of the same length as the linked list. Copy across all the elements to the array. Then re-assign all elements of the linked list in the reverse order.
void changeElements(struct Node *p)
{
int *A, i = 0;
struct Node *q = p;
A = (int *) malloc(sizeof(int)* count(p));
//fill the auxiliary array and move the pointer q along the linked list
while(q != NULL)
{
A[i] = q->data;
q = q->next;
i++;
}
q = p;
i--;
//re-assign the linked list in the reverse order
while(q != NULL)
{
q->data = A[i];
q = q->next;
i--; //we're going backwards...
}
}
The time complexity is O(2n) or just O(n). The method requires the movement of data (with a type which must be known) between linked lists and arrays.
This starts with three pointers. The three pointers p
, q
and r
occupy three consecutive nodes. The central node is the node which is modified at the time. The first and third pointer are needed to recall the previous pointer values.
The method effectively reverses the ‘direction’ of the link to the next node. Instead of pointing the next node, the pointer is changed so that it points to the previous node.
void changeLinks(struct Node *p)
{
struct Node *q=NULL, *r=NULL;
while(p!=NULL)
{
// first three statements slide the pointers r, p and q
r = q;
q = p;
p = p->next;
//reverse the direction of the link
q->next = r;
}
first=q;
}
This approach is much simpler. Data need not be exchanged.
The recursive version of changeLinks(), in outline, slides to the end of the linked list from the beginning. The function terminates and the previous call re-assigns the link. In the method below, pointer tail = NULL
and pointer p
should point to the second node.
void changeLinksRecursive(struct Node *tail, struct Node *p)
{
//keep calling itself until p reaches the last node, where
if(p)
{
changeLinksRecursive(p, p->next);
//initially, q now points to the first node and tails pointer p
//when p == null, the direction of the last link is reversed
p->next = tail;
}
else
//Node's first
first = tail;
}
Two linked lists can be joined be setting the last
property of the last node (in one list) to the address of the second node of another linked list.
//two nodes A and B
p = A;
while (p->next != NULL)
{
p = p->next;
}
p->next = B;
B = NULL; //previously pointed to the second node of Node B
The time complexity depends on the number nodes in Node A, so O(n).
One can also combine two sorted linked lists into a single linked list (specifically, merging) is somewhat analogous to the code needed to merge two sorted arrays, with the exception of data transfer. Briefly, one checks the value at both linked lists and re-assigns the link.
struct Node
{
int data;
struct Node *next;
}*first = NULL, *second = NULL, *third = NULL;
//merge two Node, p and q
void Merge(struct Node *p, struct Node *q)
{
struct Node *last;
if(p->data < q->data)
{
third = last = p;
p = p->next;
third->next = NULL;
}
else
{
third = last = q;
q = q->next;
third->next = NULL;
}
while(p && q)
{
if(p->data < q->data)
{
last->next = p;
last = p;
p = p->next;
last->next = NULL;
}
else
{
last->next = q;
last = q;
q = q->next;
last->next = NULL;
}
}
if(p)
last->next = p;
if(q)
last->next = q;
}
Follow the above code with the schematic below for the first few iterations:
The pointer last trails pointers p
and q
. The last assignments link pointer ‘last’ to remaining nodes of one linked list. Note that pointers p
or q
can be more than one node ahead of the other.
The attached unit in C++ will be used the to demonstrate some of the examples presented in the remainder of this article.
A looped linked list does not have any node with a NULL
pointer at the last node. The looped linked list possesses a last node which points to any other node in the same linked list. Linked lists described thus far do have a last node with a NULL
pointer, and are known as linear linked lists.
One can check for looped linked lists by
We describe the last option here.
One pointer will migrate forward by two nodes (faster) and the other pointer will migrate by one node. At some point in a looped linked list (although not necessarily during the first lap) both pointers will be assigned to the same node. If one of the pointers ends up being NULL
then the linked list is not looped.
The length of the loop section is always less than n
and determines how long it takes for the two pointers to ‘coincide’. The time complexity is O(n).
int isLoop(struct Node *f)
{
struct Node *p, *q;
// start at the beginning of Node f
p = q = f;
//start the race and continue until a NULL pointer is found or p == q
do
{
p = p->next;
q = q->next;
// q is advanced an extra node ahead during each iteration
q = q ? q->next : q;
}
while((p && q) && (p != q));
if(p == q)
return 1;
else
return 0;
}
A node with a head node only, and one which points to itself, is also a circular linked list.
Previously, we displayed linear linked lists by processing each node until we reached the end node with NULL pointer. For circular linked lists, we process each node until we return to the head node. One stores the address of the head node and uses a do...while
loops to force at least one iteration:
void Display(Node *p)
{
do
{
printf("%d ", p->data);
p = p->next;
}
while (p != head);
}
Recursively, the circular list is navigated with a flag value that marks the number of iterations carried out.
void Display(Node *p)
{
static int flag = 0;
if (p != head || flag == 0)
{
flag = 1;
printf("%d ", p->data);
Display(p->next);
}
flag = 0;
}
Once the second call to Display() is performed, flag = 1
. When the last node is printed, p = head
. Hence, the tail recursion ends and the circular list is not traversed a second time. The second assignment of flag = 0
is needed if Display() is called again. Recall, static variables are initialised on program start-up and persist until shutdown. The static variable is only accessible in the block it is declared in.
For any circular list of length n
, there are n+1
points of insertion (the point between the head and tail node is the same; in linear lists, the node before the head is not the same as the node after the last). In the case of insertion before the head node, a pointer must first traverse the entire list and reach the last (tail) node. This is thus O(n) in time complexity. In the case of insertion after the head node, the insertion follows the same procedure as the insertion of a linear linked list.
The snippet below was taken from the C++ unit referenced earlier. The Length() method is also provided in the unit.
struct Node
{
int data;
struct Node *next;
} * Head;
void Insert(struct Node *p, int index, int x)
{
struct Node *t;
int i;
if(index < 0 || index > Length(p))
return;
//this part relates to the insertion before the head node
if(index == 0)
{
t = (struct Node *) malloc(sizeof(struct Node));
t->data = x;
if (Head == NULL)
{
Head = t;
Head->next = Head;
}
else
{
while(p->next != Head)
p = p->next;
p->next = t;
t->next = Head;
Head = t;
}
}
//this part relates to the insertion at any other point after the head node
else
{
for(i = 0; i < index-1; i++)
p=p->next;
t=(struct Node *)malloc(sizeof(struct Node));
t->data = x;
t->next = p->next;
p->next = t;
}
}
The deletion of a node at any point after the head node (but not before it) is the same as the deletion of a node of a linear linked list. The deletion of the head node is a new consideration in this article. The deletion requires the reassignment of the head node and then the deletion of the head node.
struct Node
{
int data;
struct Node *next;
} * Head;
int Delete(struct Node *p, int index)
{
struct Node *q;
int i, x;
if(index < 0 || index > Length(Head))
return -1;
//deleting the head node starts here...
if(index == 1)
{
//start at any node and traverse to the last node with pointer p
while(p->next != Head)
p = p->next;
x = Head->data;
//if circular linked list is made up of one node (the head node)
if(Head == p)
{
free(Head);
Head = NULL;
}
//if the circular list has more than one node then re-assign the node addresses
else
{
p->next = Head->next;
free(Head);
Head = p->next;
}
}
//deleting any other node starts here...
else
{
for(i = 0; i < index-2; i++)
p = p->next;
q = p->next;
p->next = q->next;
x = q->data;
free(q);
}
return x;
}
Linked lists discussed to this point are known specifically as singly linked lists, which contain only one address, referred here as next
. A linked list with two addresses, previous
and next
as it were, are known as doubly linked lists
.
Doubly linked lists can be traversed in two directions, whereas singly linked lists can only be traversed in one direction.
The structure of a doubly linked list can take the following form:
struct Node
{
struct Node * prev;
int data; //could also take other data-types if needed
struct Node * next;
}
Initially, a new node is set such that the prev
and next
addresses are initially null
and the data value is assigned as required. Many of the methods on singly linked lists apply to doubly linked lists.
The initialisation of an array of integers (in this case) starts by preparing the first node and then handling all other nodes.
class Node{
public:
Node* prev;
int data;
Node* next;
};
//this could also be approached from a constructor point of view
void newLinkedList(int *A, int n) {
head = new Node;
head->prev = NULL;
head->data = A[0];
head->next = NULL;
Node* tail = head; //this will be updated below, as appropriate
//prepare all remaining nodes
for (int i = 1; i < n; i++){
Node* t = new Node;
t->prev = tail;
t->data = A[i];
t->next = tail->next; // tail->next is pointing to NULL
tail->next = t;
tail = t; //doubly-linked list in action
}
}
This is where one utilises the prev
pointer. Inserting before the head node involves the setting of three pointers, and then changing the pointer of the list, first
. The time complexity is constant, O(1).
Node *t = new Node;
t->data = someValue;
t->prev = NULL;
t->next = first; //first could labelled instead as head
first->prev = t;
first = t;
Inserting at any point after the head node to position pos
(that is, position lies in between the nodes where insertion is needed), can be achieved with:
Node *t = new Node;
Node *p = first;
t->data = someValue;
for (int i = 0; i < pos-1; i++)
p = p->next;
t->next = p->next; //p->next might be NULL, thus t becomes the tail node
t->prev = p;
//if p-next was NULL then p->next->prev is also NULL, so we only set p->next->prev if p->next is not NULL
if (p->next)
p->next->prev = t;
p->next = t;
Note the notation p->next->prev
is actually the next node’s prev
pointer, visualised from:
The worst-case time complexity is O(n), n is the length of the linked list.
As with insertion, the deletion of the first node is special case and the deletion of other nodes can be generalised. For deletion of the first node:
Node *p = first;
first = first->next; //this could be NULL if there is only one node present; no problem
extractedValue = p->data;
delete p;
//if there is more than one node present
if(first)
first->prev = NULL;
Deleting any other node (in this example, node N
) is carried out with:
Node *p = first;
for (int i = 0; i < N-1 i++)
p = p->next;
p->prev->next = p->next; //this might be NULL, again, no problem
//as above insertion, check if there is a successive node
if (p->next)
p->next->prev = p->prev;
extractedValue = p->data;
delete p;
The time complexity of deleting the first node is O(1) and the complexity of other nodes is worst-case O(n).
Reversal is achieved by swapping prev
and next
for all nodes in the doubly linked list. The conditional applied to the last node also applies here.
void Reverse() {
Node* p = head;
Node* temp;
while (p){
temp = p->next;
p->next = p->prev;
p->prev = temp;
p = p->prev;
// Need to check the condition at the last node again
if (!p->next){
p->next = p->prev;
p->prev = NULL;
head = p;
break; //reversal finished
}
}
Circular doubly linked list
operations involve the greatest number of pointer changes. There are never any tail (or head) nodes as such, so both prev
and next
pointers of three nodes are edited on insertion and deletion.
In summary, the node labelled head
(or first
) functions as a reference point for all pointer based operations on linked lists.
Doubly linked lists require more memory than singly linked lists. Both require more memory than equally sized arrays.
Doubly linked lists generally require more pointer based operations then singly linked lists.
These days, circular doubly linked lists offer all features of the linked lists available and are implemented in libraries provided in other languages, such as Java. To save memory, one can implement their own circular singly linked lists.