Linked lists are preferable over arrays when:
you need constant-time insertions/deletions from the list (such as in real-time computing where time predictability is absolutely critical)
you don't know how many items will be in the list. With arrays, you may need to re-declare and copy memory if the array grows too big
you don't need random access to any elements
you want to be able to insert items in the middle of the list (such as a priority queue)
Arrays are preferable when:
you need indexed/random access to elements
you know the number of elements in the array ahead of time so that you can allocate the correct amount of memory for the array
you need speed when iterating through all the elements in sequence. You can use pointer math on the array to access each element, whereas you need to lookup the node based on the pointer for each element in linked list, which may result in page faults which may result in performance hits.
memory is a concern. Filled arrays take up less memory than linked lists. Each element in the array is just the data. Each linked list node requires the data as well as one (or more) pointers to the other elements in the linked list.
Array Lists (like those in .Net) give you the benefits of arrays, but dynamically allocate resources for you so that you don't need to worry too much about list size and you can delete items at any index without any effort or re-shuffling elements around. Performance-wise, arraylists are slower than raw arrays.
Both store a sequence of elements, but using different techniques.
An array stores elements in successive order in memory, i.e. it looks like follows:
--------------------------------------------------------------------------------------
| item 1 | item 2 | item 3 | ... ... | item x | //here comes other stuff
--------------------------------------------------------------------------------------
This means, elements are one after another consecutive in memory.
A ((doubly) linked) list, on the other hand, stores the items the following way: It creates an own "list item" for each element; this "list item" holds the actual element and a pointer/reference/hint/etc to the next "list item". If it is doubly linked, it also contains a pointer/reference/hint/etc to the previous "list item":
------------
------------ ---------- | item 4 |
| item 1 | | item 2 | | next ---+----...
| next ---+------->| next | ------------
------------ ---+------ ^
| |
| |
v |
---------- |
| item 3 | |
| next --+---+
----------
This means, the elements can be spread all over the memory and are not stored at specific memory locations.
Now that we know this, we can compare some usual operations on sequences of elements:
Accessing an element at a specific index: Using an array, we simply compute the offset for this index and have the element at the index.
This is very cheap. With a list on the other hand, we do not know a priori where the element at index is stored (since all elements can be anywhere in memory), so we have to walk the list item by item until we find the element at the index. This is an expensive operation.
Adding a new element at the end of the sequence: Using an array this can lead to the following problem: Arrays are usually of fixed size, so if we have the situation that our array is already completely filled (see //here comes other stuff
), we probably cant put the new element there because there might already be something else. So, maybe we have to copy the whole array. However, if the array is not filled, we can simply put the element there.
Using a list, we have to generate a new "list item", put the element into it and set the next
pointer of the currently last element to this new list item. Usually, we store a reference to the currently last element so that we don't have to search through the whole list to find it. Thus, inserting a new element is no real problem with lists.
Adding a new element somewhere in the middle: Let's first consider arrays: here, we might get into the following situation: We have an array with elements 1 to 1000:
1 | 2 | 3 | 4 | 5 | 6 | ... | 999 | 1000 | free | free
Now, we want to insert 4.5
between 4
and 5
: This means, we have to move all elements from 5
to 1000
one position right in order to make space for the 4.5
:
1 | 2 | 3 | 4 | free | 5 | 6 | ... | 999 | 1000 | free
||
vv
1 | 2 | 3 | 4 | 4.5 | 5 | 6 | ... | 999 | 1000 | free
Moving all these elements, is a very expensive operation. So better don't do this too often.
Now we consider, how a list can perform this task: Let's say we have currently the following list:
1 -> 2 -> 3 -> 4 -> 5 -> ... -> 999 -> 1000
Again, we want to insert 4.5
between 4
and 5
. This can be done very easily: We generate a new list item and update the pointers/references:
1 -> 2 -> 3 -> 4 5 -> ... -> 999 -> 1000
| ^
+-> 4.5 -+
We have simply created a new list element and generated sort of "bypass" to insert it - very cheap (as long as we have a pointer/reference to the list item the new element will be inserted after).
So, let's sum up: Linked lists are really nice when it comes to inserting at random positions (as long as you have a pointer to the adequate list item). If your operation involves adding lots of elements dynamically and traversing all elements anyway, a list might be a good choice.
An array is very good when it comes to index accesses. If your application needs to access elements at specific positions very often, you should rather use an array.
Notable things:
Solving the fixed-size problem for arrays: As mentioned before, (raw) arrays are usually of fixed size. However, this problem is nowadays no real point anymore, since almost all programming languages provide mechanisms to generate arrays that grow (and possibly shrink) dynamically - just as needed. This growing and shrinking can be implemented such that we have amortized runtime of O(1) for inserting and removing elements (at the end of the array) and such that the programmer doesn't have to call grow
and shrink
explicitly.
Since lists provide such nice properties for insertions, they can be used as underlying data structures for search trees, etc. I.e. you construct a search tree, whose lowest level consists of the linked list.
Best Answer
If you back a linked list with an array, you'll end up with the disadvantages of both. Consequently, this is probably not a very good way to implement it.
Some immediate disadvantages:
I suppose some advantages are:
mmap()
call easily. Though, you'd be better off using some sort of protocol buffer for portability.