19. Dynamic Data Structures
Algorithms + Data Structures = Programs
19.1. Problem: Infix conversion
If a math teacher writes the expression 1 + 7 × 8 - 6 × (4 + 5) ÷ 3 on the blackboard and asks a group of 10-year-old children to solve it, different children might give different answers. The difficulty lies not in the operations themselves but in the order of operations. Modern graphing calculators and many computer programs can, of course, evaluate such expressions correctly, but how do they do it? You intuitively grasp order of operations (left to right, multiplication and division take higher precedence than addition and subtraction, and so on), but encoding that intuition into a computer program requires effort.
One way a computer scientist might approach this problem is to turn the mathematical expression from one that’s difficult to evaluate into one that’s easy. The normal style of writing mathematical expressions is called infix notation, because the operators are written in between the operands they’re used on. An easier notation for automatic evaluation is called postfix notation, because an operator is placed after the operands it works on. The following table gives a few examples of expressions written in both infix and postfix notation.
Infix | Postfix |
---|---|
|
|
|
|
|
|
|
|
Although infix notation is probably more familiar to you, postfix notation has the benefit of exactly specifying the order of operations without using any precedence rules and without needing parentheses to clarify. To understand how to compute an expression in postfix notation, we rely on the idea of a stack, which we first introduced in Chapter 9 and examine in greater depth here.
Recall that a stack has three operations: push, pop, and top. It works like a stack of physical objects. The push operation places an object on the top, the pop operation removes an item from the top, and the top operation tells you what’s at the top of the stack.
Using a stack, postfix evaluation rules are easy: Scan the expression from left to right, if you see an operand (a number, in our case), put it on the stack. If you see an operator, pop the last two operands off the stack and use the operator on them. Then, push the result back on the stack. When you run out of input, the value at the top of the stack is your answer.
For example, with 1 9 2 * +
, all three operands are pushed onto the
stack. Then, the *
is read, and so 2
and 9
are popped off the stack
and multiplied. The result 18
is pushed back on the stack. Then, the
+
is read, and 18
and 1
are popped off the stack and summed,
resulting in 19
, which is pushed back on the stack as the final
answer.
Our problem, however, is not to evaluate an expression in postfix notation but to convert an expression in infix notation to postfix notation. Again, the concept of a stack is useful. To do the conversion, we initialize a stack and scan through the input in infix notation. As we scan through the input, we do one of four things depending on which of the four possible inputs we see.
- Operand
-
Simply copy the operand to the postfix output.
- Operator
-
If the stack’s empty, push the operator onto the stack. If the stack’s not empty and the operator at the top of the stack has the same or greater precedence than our new operator, put the top operator into our postfix output and pop the stack. Continue this process as long as the top operator has the same or greater precedence compared to our new operator and the stack is not empty. Finally, push the new operator onto the stack.
- Left Parenthesis
-
Push the left parenthesis onto the stack.
- Right Parenthesis
-
Pop everything off the stack and add it to the output until you reach a left parenthesis on the stack. Then pop the left parenthesis.
Precedence comes from order of operations: *
and /
have high
precedence and ` and `-` have low precedence. When you encounter it on
the stack, treat `(` as if it has even lower precedence than `
and
-
. A right parenthesis should never appear on the stack.
Following this algorithm, we’re able to write a program that converts infix notation to postfix notation. We further restrict our problem to the case when the only operands are positive integers in the range [0, 9]. Since each is a single character, parsing the input is much easier. The same ideas for postfix conversion hold no matter how the input is formatted, but parsing arbitrarily formatted numbers is a difficult problem in its own right. This restriction also makes spaces unnecessary.
To solve the infix conversion problem, we need to create a stack data structure whose elements are terms from an infix expression, where a term is an operator, operand, or a parenthesis. We created a stack to solve the nesting expression problem in Section 9.1, but we explore stacks in this chapter as one of many different kinds of dynamic data structure.
19.2. Concepts: Dynamic data structures
By now you’ve seen several ways to organize data in your programs. For example, you’ve used arrays to store a sequence of values and class definitions to store (and operate on) collections of related values in objects. These data structures have the property that they’re static in size: Once allocated, they do not grow as the program runs. If you allocate an array to store 100 integers, you’ll get an error if you try to store 101 integers into it.
In this chapter, we examine dynamic data structures: As more data is read or processed, these data structures grow and shrink in memory to store what is needed. The stack used to solve the nesting expressions problem from Chapter 9 was not actually a dynamic data structure since it was defined with a fixed maximum size. In this chapter, we implement a true stack as well as many other kinds of dynamic data structures.
19.2.1. Dynamic arrays
There are two broad classes of dynamic data structures we examine here. The first kind are based on arrays that grow and shrink. Dynamic arrays allow for fast access to individual elements in the data structure. One drawback of dynamic arrays is that the array that stores that data has a fixed amount of space. When too many elements are added, a new array must to be allocated and all the original elements copied over.
14
would need to be moved back to insert 17
in order.Another drawback of dynamic arrays is that they’re poorly suited for insertion or deletion of elements in the middle of the array. When an element is inserted, each element after it must be moved back one position. Likewise, when an element is deleted, all of the elements after it must be moved forward one position. Thus, insertions and deletions at the end of a dynamic array are usually efficient, but insertions and deletions in the middle are time consuming.
19.2.2. Linked lists
The second kind of dynamic data structure is based on objects that link
to (or reference) other objects. The simplest form of such a data type
is a linked list. A linked list is a data structure made up of a
sequence of objects. Each object contains some data value (such as a
String
) and a link to the next object in the sequence.
Linked lists are flexible because they have no preset size. Whenever a new element is needed, it can be created and linked into the list. Unfortunately, they can be slow if you need to access arbitrary elements in the list. The only way to reach an element in the list is to walk from element to element until you find what you’re looking for. If the element is at the beginning (or the end) of the list, this process can be quick. If the element is in the middle, there’s no fast way to get there.
Linked lists work well when inserting new elements at arbitrary locations in the list. Unlike arrays, they’re not implemented as a contiguous block of memory. Linking a new element into the middle of the list automatically creates the correct relationship among elements, and there’s no need to move all the elements after an insertion.
Among their downsides is the memory overhead of linked lists. A new object must be allocated for each element in the list, which must include a reference to the next element as well as its own data. Consequently, using a linked list to solve a problem usually take more memory than an equivalent dynamic array solution.
It turns out that either dynamic arrays or linked lists can be used to create an efficient solution to the infix conversion problem defined at the beginning of the chapter.
19.2.3. Abstract data types
The fact that dynamic arrays and linked lists can be used to solve similar problems points out that we may often be more interested in the capabilities of a data structure rather than its implementation.
An abstract data type (ADT) is a set of operations that can be applied to a set of data values with well-defined results that are independent of any particular implementation. In other words, it is a list of things that a data type can do (or have done to it).
A stack is a great example of an ADT. A stack needs to be able to push a value, pop a value, and tell us what value is on top. The internal workings of the stack are irrelevant (as long as they’re efficient). It’s possible to use either a dynamic array or a linked list to implement a stack ADT. A queue is another ADT we discuss in Section 19.4, but there are many other useful ADTs.
19.3. Syntax: Dynamic arrays and linked lists
19.3.1. Dynamic arrays
Suppose you’re faced with the problem of reading a list of names from a
file, sorting them into alphabetical order, and printing them out. You’ve
already looked at simple sorting algorithms to handle the sorting
part, or you could use the Java Arrays.sort()
method. In previous
problems when you needed to use an array for storing items, you knew in
advance how many (or a maximum of how many) items you’d need to store.
In this new problem, the number of names in the input file is
unspecified, so you must allow an arbitrary number to be handled.
One approach is to make a guess at how many names are in the input file and allocate an array of that size. If your guess is too small, and you don’t check array accesses, you’ll cause an exception once you’ve filled the array and try to store the next name into the index one past the last legal one. If your guess is too large, you could be wasting a significant amount of storage space.
Our first solution to the problem of dealing with dynamic or unknown amounts of data is to watch our array accesses and expand the array as necessary during processing. It’s also possible to contract an array once you determine that the array has more space than needed.
A simple solution
Program 19.1 allocates an array of 10 String
references
and reads a list of names from standard input until it reaches the end
of the file, storing each name in successive array locations. If the
number of names in the input is larger than the size of the array, it
generates an exception.
Since programs that generate uncaught exceptions are, in general, a bad idea, our first change to this program should be either to catch the exception or check the index before storing the name in the array. In either case, we would then take some action that’s more user friendly than generating an exception, perhaps simply printing an explanatory message before exiting.
import java.util.Arrays;
import java.util.Scanner;
public class ReadIntoFixedArray {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String[] names = new String[10];
int count = 0;
while (in.hasNextLine()) {
names[count] = in.nextLine();
++count;
}
Arrays.sort(names, 0, count);
for (String name: names) { (1)
System.out.println(name);
}
}
}
1 | Note that we use an enhanced for loop for convenient iteration through
the array names . If you don’t recall this syntax, refer to
Section 6.9.1. |
A second possibility is to take a recovery action that allows the program to proceed. What went wrong? We made a guess of the input size and allocated an array of that size, but our guess was too small. We could modify the code to allocate a larger array at the beginning, recompile, and re-run the program, but that option might not be available to us if the program’s been distributed to users around the world. Instead, we fix the problem on the fly by allocating a larger array, copying the old array into the new array, and continuing.
Program 19.2 begins like the previous program by allocating a fixed-size array.
import java.util.Arrays;
import java.util.Scanner;
public class ReadAndGrowArray {
public static void main(String[] args) {
String[] names = new String[10];
Scanner in = new Scanner(System.in);
int count = 0;
String line = null;
while (in.hasNextLine()) {
line = in.nextLine();
try {
names[count] = line;
}
catch (ArrayIndexOutOfBoundsException e) { (1)
names = Arrays.copyOfRange(names, 0, names.length*2); (2)
names[count] = line;
}
++count;
}
Arrays.sort(names, 0, count);
for (String name: names) {
System.out.println(name);
}
}
}
1 | The program catches the ArrayOutOfBoundsException if there isn’t enough
space for another name in the array. |
2 | The catch statement allocates a new array, twice the size of the original
(current) array, copies the existing array into it, and replaces the reference
to the current array with a reference to the new array. |
Note that it was necessary to refactor the code in
Program 19.1 slightly. We added the line
variable
to hold the temporary result of reading the input line and moved the
count
increment outside of the try
-catch
block.
Can this new, improved program still fail? Yes, but only for very large input, in the case when the Java virtual machine runs out of memory when doubling the size of the array.
A potentially more serious problem is the way we set names
to point at
a new array.
names = Arrays.copyOfRange(names, 0, names.length*2);
This line works because we know the only variable that references the
array is names
. If other variables referenced that array, they would
continue to reference the old, smaller, and now out-of-date version of
the names
array. Figure 19.3 shows this problem.
array1
and array2
begin pointing at the same array. (b) A new array has been allocated, and 42
has been added to it. array1
has been updated to point at the new array, but array2
still points at the original.A more complete solution
The problem of updating variables that reference the dynamic array is a serious issue in large programs. It might not be enough to allocate a larger array and assign the new reference to only one variable. There might be hundreds of variables (or other objects) that reference the original array.
A solution to this problem is to create a new class whose objects
contain the array as a private field. References to the array are then
mediated, as usual, via accessor methods, which always refers to the
same version of the array. Program 19.3 is a simple
implementation of a dynamic array class. This class maintains an
internal array of String
objects, which it extends whenever a call to
set()
tries to write a new element just past the end of the array.
import java.util.Arrays;
public class DynamicArray {
private String[] strings = new String[10];
public synchronized void set(int i, String string) {
if (i == strings.length) {
strings = Arrays.copyOfRange(strings, 0, strings.length*2);
}
strings[i] = string;
}
public String get(int i) {
return strings[i];
}
public synchronized void sort(int first, int last) {
Arrays.sort(strings, first, last);
}
}
Note that the set()
and sort()
methods are both synchronized
in
case this class is used by multiple threads simultaneously.
Program 19.4 illustrates how to modify and extend Program 19.1 to use this new class. Since the array grows automatically, there is no need for the original program to check for out-of-bounds exceptions. Of course, the array expansion only works if the reference occurs exactly at the index corresponding to one beyond the end of the array. Other out-of-bound references generate an exception.
DynamicArray
class to store input read from a file.import java.util.Scanner;
public class UseDynamicArray {
public static void main(String[] args) {
DynamicArray names = new DynamicArray();
Scanner in = new Scanner(System.in);
int count = 0;
String line = null;
while (in.hasNextLine()) {
line = in.nextLine();
names.set(count, line); (1)
count++;
}
names.sort(0, count); (2)
for (int i = 0; i < count; i++) { (3)
System.out.println(names.get(i));
}
}
}
1 | Since names is no longer an array, but rather an object of class
DynamicArray , we can no longer use braces ([] ) to access elements and
must use methods set() and get() instead. |
2 | Arrays.sort() can’t sort this object directly which is why we provided a
sort() method in the class that sorts the private array on demand. |
3 | We also can’t use an enhanced for loop on names and must iterate through
its contents explicitly. |
This implementation, like most implementations of dynamic arrays, has potentially serious performance penalties. If the initial array is too small, it will have been doubled and the elements copied multiple times, resulting in slower execution. After a resize, the array is only half full, resulting in wasted space. Even on average, the array will only be three-quarters full.
19.3.2. Linked lists
As we’ve seen, while dynamic arrays can grow to accommodate a large number of items, the performance penalties of repeated copying and the space wasted by unoccupied array elements can negatively affect program behavior. In this section, we introduce the linked list, an alternative data structure that can efficiently grow to accommodate a large number of objects. As we shall see, this efficient growth comes at the expense of limitations on how the structure can be accessed.
Consider again the problem of reading an arbitrary number of names from an input file and storing them. Since we don’t know in advance how many names there are, it might not be efficient to pre-allocate or dynamically grow an array to store them. Instead, imagine that we could write each name on a small index card, and then link the index cards together to keep track of them, much like the cars of a railroad train are linked by the coupling from one to the next.
Constructing a linked list
In Java, a linked list is usually implemented as a class that provides methods to interact with a sequence of objects. The objects in the list are implemented as a private static nested class. A private static nested class behaves like a normal class but can only be created and accessed by the class surrounding it. In this way, the internal representation of the list is hidden and protected from outside modification. The nested class has two fields, one containing the data to be stored and the other containing a link or reference to the next object, or node, in the list. Since they’re only accessed by the outer class, it’s reasonable to make these fields public. If you need a refresher on static nested classes, refer to Section 9.4.
public class LinkedList {
private static class Node {
public String value;
public Node next;
}
// Methods for interacting with the list
}
Note that the type next
is the same as the class it’s inside of! This
apparent circular reference works because the variable only references
an object, but the object is not actually contained within the variable.
In fact, the value of the link may be null
, indicating that there are
no additional nodes in the list.
In the railroad metaphor, the node is a train car (with its freight as the value), and the link to the next node is the coupling to the next car.
The definition of LinkedList
given above is a good start, but it needs
a head
reference that keeps track of the first node in the list.
Initially, this value is null
. We also need an add()
method so that
we can add nodes to the list. Without checking through the entire list,
it’s useful to know how many nodes are in it. We can create a size
field that we increment whenever we add a node, as well as an accessor
to read its value. Finally, we can create a fillArray()
method that
fills an array with the values in the list.
String
objects.public class LinkedList {
private static class Node {
public String value;
public Node next;
}
private Node head = null;
private int size = 0;
public void add(String value) {
Node temp = new Node();
temp.value = value;
temp.next = head;
head = temp;
++size;
}
public int size() {
return size;
}
public void fillArray(String[] array) {
Node temp = head;
int position = 0;
while (temp != null) {
array[position] = temp.value;
++position;
temp = temp.next;
}
}
}
Program 19.6 is a re-implementation of the
name-reading program using class LinkedList
. Note that no array needs
to be pre-allocated. Instead, we capture all lines of input in a
linked list called list
.
LinkedList
class to store input read from a file.import java.util.Arrays;
import java.util.Scanner;
public class UseLinkedList {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
LinkedList list = new LinkedList();
while (in.hasNextLine()) {
list.add(in.nextLine());
}
String[] names = new String[list.size()];
list.fillArray(names);
Arrays.sort(names);
for (String name: names) {
System.out.println(name);
}
}
}
Each time we read a new line from the file, the LinkedList
class
internally creates a new Node
with the input line as its value
. It
also sets its next
reference to the current head
so that the rest
of the list (which could be empty if head
is null
) comes after the
new Node
. We then update the head
field to reference the new Node
.
Thus, each new line read from the file is stored at the beginning of
the linked list. The last node in the list, which contains the first
String
read in, has a next
value of null
.
Figure 19.4 shows a visualization of the contents of this
implementation of a linked list. An “X” is used in place of an arrow
that points to null
.
Since we also increment the size
field inside of LinkedList
on each
add, we know how many String
objects it contains. Thus, we know how large of
an array to allocate in Program 19.6. The fillArray()
method visits
every node in the linked list, storing its value
into the allocated array.
array. We sort the returned array and then print it as before.
Appending to the end of a linked list
The LinkedList
class maintains a field named head
that references
the first node in the linked list. As we saw, that element was actually
the last or most recent String
read from input. This head
element was followed by the next most recent String
, followed by the
next most recent String
, and so on. The last node contained the first
String
read from input and had a null
next
field.
If we want the linked list to be ordered in the natural way, with head
pointing to the first element read from the file and the last element in
the list (the one with next
pointing to null
) containing the
String
most recently read, we can maintain a second field that
references the tail of the list.
Program 19.7 adds a tail pointer called
tail
to the LinkedList
class. Note that we have changed the add()
method to the addFirst()
method, and we have also added an addLast()
method to make it easy to append elements to the end of a linked list.
tail
, to reference the last element (tail) of the list.public class LinkedListWithTail {
private static class Node {
public String value;
public Node next;
}
private Node head = null;
private Node tail = null;
private int size = 0;
public void addFirst(String value) { (1)
Node temp = new Node();
temp.value = value;
temp.next = head;
head = temp;
if (tail == null) {
tail = head;
}
++size;
}
public void addLast(String value) { (2)
Node temp = new Node();
temp.value = value;
if (tail == null) {
head = temp;
} else {
tail.next = temp;
}
tail = temp;
++size;
}
public int size() {
return size;
}
public void fillArray(String[] array) {
Node temp = head;
int position = 0;
while (temp != null) {
array[position] = temp.value;
++position;
temp = temp.next;
}
}
}
1 | The addFirst() method has been updated to change the tail
pointer, but only if the list is empty (when head is null ). After all,
adding to the front of a list only changes tail if the front is also
the back. |
2 | In the addLast() method, adding a value to an empty list
also sets the head to point at the new node. Once the list has a node in it,
subsequent calls to addLast() will point the next field of
the old tail at this new node, linking the earlier nodes in the list to the
new node at the end. |
Inserting into a linked list
In the running example for this chapter, we’re interested in printing a
sorted list of String
objects read from input. Thus far we’ve
captured the lines into a linked list of elements, dumped these elements
into an array of the right size, and then sorted the array. An
alternative solution is to insert the elements into the linked list at
the right point in the first place.
Program 19.8 is a version of a linked list that
inserts elements into the linked list in sorted order. The only
significant difference between it and the previous implementations of a
linked list is its add()
method. This method walks down the linked
list, starting at head
, until it either walks off the end of the list
or finds an element before which the new String
should go. There are
special cases that must be handled to make this process work correctly:
empty list, inserting at the beginning, inserting in the middle, and inserting
at the end.
add()
method inserts each value in sorted order.public class SortedLinkedList {
private static class Node {
public String value;
public Node next;
}
private Node head = null;
private Node tail = null;
private int size = 0;
public void add(String value) {
Node temp = new Node();
temp.value = value;
temp.next = null;
if (head == null) { // Empty list (1)
head = tail = temp;
} else if (value.compareTo(head.value) < 0) { // Insert at beginning (2)
temp.next = head;
head = temp;
} else { // Insert at middle or end (3)
Node previous = head;
Node current = head.next;
while (current != null && value.compareTo(current.value) >= 0) {
previous = current;
current = current.next;
}
previous.next = temp;
temp.next = current;
if (current == null) { // Inserting at end of list (4)
tail = temp;
}
}
++size;
}
1 | When adding to an empty linked list, the head and tail fields must be set to
reference the new node. The next field of the new node is null by default. |
2 | If inserting at the beginning of a non-empty list, the head must be
updated to point to the new node. The next field of the new node is set to
the old value of head . |
3 | Inserting into the middle or the end of a linked list are similar
operations. To insert a node into the middle, it’s common to maintain two
variables to reference the current and previous nodes while walking down
the list. Once the proper insertion point is found (between the previous and
current nodes), the next field for the previous node is adjusted to
reference the new node, and next field for the new node is set to current . |
4 | Inserting at the end of a linked list is the same as inserting into the
middle, except that the tail field must also be updated to reference the new
node. |
public int size() {
return size;
}
public void fillArray(String[] array) {
Node temp = head;
int position = 0;
while(temp != null) {
array[position] = temp.value;
++position;
temp = temp.next;
}
}
}
19.4. Syntax: Abstract data types (ADT)
We’ve seen two examples so far of dynamic data structures: dynamic arrays and linked lists. A great deal of complexity can go on inside these data structures, but code that uses these data structures doesn’t need to be aware of the details of the internal implementation. Ideally, user programs could use any data structure that provides the required set of operations.
Our dynamic array and linked list classes were simple examples of abstract data types (ADT). We can design many data structures that hide the details of their implementation inside a class. The user of each class is aware of the operations (public methods) that can be performed on objects of the class but not of the intricacies used to implement those operations. Defining an ADT without regard to an implementation keeps users of the ADT from becoming dependent on details of any particular implementation. It gives maximum freedom to the programmer to choose (and change) the implementation as appropriate for the overall system design.
We generalize a data structure by observing which operations are applied to it. Then, we create an abstraction that formalizes these observations. The idea is to cleanly separate the use and behavior of the data structure from the way in which it’s implemented.
Interfaces are the obvious tool for defining the behavior of a class in Java without specifying its implementation. When defining an ADT in Java, its set of operations becomes the set of methods specified by the interface. Then, any class that implements the ADT must implement the corresponding interface.
In subsequent sections we look at two fundamental abstract data types, stacks and queues, and sample classes that implement them.
19.4.1. Stacks
We’ve already used stacks to solve problems in Chapter 9. Recall that a stack data structure behaves like a stack of books on your desk. When you place a book on the stack, it covers the books that are already there. When you take a book off the stack, you remove the book most recently placed there, exposing the one beneath it.
You can find a simple implementation of a stack in the solution to the infix conversion problem in Section 19.6, but we now examine the stack more deeply as an archetypal ADT. A stack’s restricted set of operations (pushing and popping) is adequate for many tasks and can be implemented in a number of different ways, some more efficient than others.
The acronym FILO (first in, last out) is sometimes used to describe a stack. The last item that’s been pushed onto the stack is the first item to be popped off the stack. In the next section, we’ll study the queue, which is a FIFO (first in, first out) data structure.
19.4.2. Abstract Data Type: Operations on a stack
There are two essential operations on a stack abstract data type,
corresponding to placing a book on the pile and removing it: push()
and pop()
. We also define two additional operations, top()
and
isEmpty()
.
-
push(x)
Push valuex
onto the stack. -
pop()
Pop the object off the top of the stack and return its value. -
top()
Return the value from the top of the stack but do not remove it. -
isEmpty()
Returntrue
if the stack is empty andfalse
otherwise.
Because a stack’s an abstract data type, we’re not specifically
concerned with how these operations are implemented, merely that they
are. Thus, we can specify an interface called Stack
that requires
these four methods.
public interface Stack {
void push(String value);
String pop();
String top();
boolean isEmpty();
}
Linked list implementation
All the operations defined by the stack ADT (and interface) are
implemented as methods in the class LinkedListStack
, shown in
Program 19.10.
public class LinkedListStack implements Stack {
private static class Node {
public String value;
public Node next;
}
private Node head = null; (1)
public void push(String value) { (2)
Node temp = new Node();
temp.value = value;
temp.next = head;
head = temp;
}
public String pop() { (3)
String value = null;
if (isEmpty()) {
System.out.println("Can't pop empty stack!");
} else {
value = head.value;
head = head.next;
}
return value;
}
public String top() {
String value = null;
if (isEmpty()) {
System.out.println("No top on an empty stack!");
} else {
value = head.value;
}
return value;
}
public boolean isEmpty() {
return head == null;
}
}
1 | The head field is used to maintain a reference to the linked list that
defines the stack. It is initialized to null . |
2 | The method push() must create a new node for the linked list and push
it onto the front of the list. It does so by creating a new Node ,
setting its value field to the incoming value , and pointing its
next pointer to the beginning of the list, stored by head . Since
temp is now the new top of the stack, head is made to point at it. |
3 | The pop() method needs to return the value of the head node and
remove that node from the linked list. It does this by replacing the
head node with the node pointed at by the next link in head . The
pop() method from the simpler stack used in the solution to the nested
expressions problem in Section 9.5 merely
removed the top and didn’t return the value. Most real-world stack
implementations of pop() do return this value, giving programmers
more flexibility. |
Note that both pop()
and top()
print an error message if the stack
is empty. More elaborate error handling is possible by throwing an exception.
Dynamic array implementation
Like the dynamic array example of Program 19.3,
Program 19.11 implements a stack of String
values using a dynamic array data structure.
import java.util.Arrays;
public class DynamicArrayStack implements Stack {
private String[] strings = new String[10];
private int size = 0;
public void push(String string) {
if (size == strings.length) {
doubleArray();
}
strings[size++] = string;
}
public String pop() {
String value = null;
if (size == 0) {
System.out.println("Can't pop empty stack!");
} else {
value = strings[--size];
}
return value;
}
public String top() {
String value = null;
if (isEmpty()) {
System.out.println("No top on an empty stack!");
} else {
value = strings[size - 1];
}
return value;
}
public boolean isEmpty() {
return size == 0;
}
private void doubleArray() {
strings = Arrays.copyOfRange(strings, 0, strings.length*2);
}
}
This stack implementation using a dynamic array omits the top()
and
isEmpty()
methods, causing a compiler error until the Stack
interface is
fully implemented.
At the beginning of the chapter, we introduced the problem of converting an expression from infix to postfix notation. In Section 19.6, we give the solution to this problem, but without a program that can evaluate a postfix expression, the conversion tool isn’t very useful.
Here we give a simple postfix evaluator. Recall the algorithm: Scan the input expression from left to right, if you see a number, put it on the stack. If you see an operator, pop the last two operands off the stack and use the operator on them. Then, push the result back on the stack. When you run out of input, the value at the top of the stack is your answer.
Like the infix to postfix converter, we restrict our input to positive
integers of a single digit. To make this program simpler, we introduce
two new classes that will be useful in our infix to postfix converter.
The first is Term
.
public class Term {
private int value;
public Term(int value) { this.value = value; }
public int getValue() { return value; }
}
This class allows us to hold an int
value. Although its structure is
simple, we update the definition of Term
later in the solution to the
infix to postfix conversion problem. By doing so, we can keep exactly
the same definition for TermStack
given next.
Term
objects.public class TermStack {
private static class Node {
public Term value;
public Node next;
}
private Node head = null;
public void push(Term value) {
Node temp = new Node();
temp.value = value;
temp.next = head;
head = temp;
}
public Term pop() {
Term value = null;
if (isEmpty()) {
System.out.println("Can't pop empty stack!");
} else {
value = head.value;
head = head.next;
}
return value;
}
public Term top() {
Term value = null;
if (isEmpty()) {
System.out.println("No top on an empty stack!");
} else {
value = head.value;
}
return value;
}
public boolean isEmpty() {
return head == null;
}
}
This class gives a linked list implementation of a stack. In fact, it’s
virtually identical to Program 19.10 with the
substitution of Term
for String
.
With our utility classes in place, the code for the postfix evaluator is short.
import java.util.*;
public class PostfixEvaluator {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String expression = in.nextLine(); (1)
TermStack stack = new TermStack();
for (int i = 0; i < expression.length(); i++) { (2)
char term = expression.charAt(i);
if (term >= '0'&& term <= '9') { (3)
stack.push(new Term(term - '0'));
} else {
int b = stack.pop().getValue();
int a = stack.pop().getValue();
switch (term) { (4)
case '+': stack.push(new Term(a + b));
break;
case '-': stack.push(new Term(a - b));
break;
case '*': stack.push(new Term(a * b));
break;
case '/': stack.push(new Term(a / b));
break;
}
}
}
System.out.println("The answer is: " + stack.top().getValue()); (5)
}
}
1 | Our main() method reads in the expression from the user and
creates a TermStack called stack . |
2 | Then, it iterates through the
expression with a for loop. |
3 | For each number we find, we supply it as
an argument to the constructor of a new Term object, which we push
onto stack . |
4 | For each operator, we pop two items off stack and apply the operator
to them. We create a new Term from the result and push this value onto
stack . |
5 | Finally, after all input is exhausted, we print the value on
the top of stack . |
To test this program properly, you must supply expressions in postfix form. Also, remember that these operations are all integer operations without fractional parts. Be careful to avoid division by zero!
19.4.3. Queues
A queue data structure is similar to a stack data structure, except that when getting an item from a queue, the item that’s been in the queue longest is the one retrieved. A queue data structure models an ordinary queue or line of people. The first person in line at a bank, for example, is the first one to receive service. Late comers are served in the order in which they arrive.
A queue is sometimes called a FIFO (first in, first out) data structure due to this property. To distinguish the operations on a queue from those on a stack, we use the terms enqueue and dequeue instead of push and pop.
19.4.4. Abstract Data Type: Operations on a queue
Four typical operations on a queue data structure are the following.
-
enqueue(x)
Put valuex
at the end of the queue. -
dequeue()
Remove and return the value at the front of the queue, that is, the value that’s waiting the longest. -
front()
Return the value at the front of the queue but do not remove it. -
isEmpty()
Returntrue
if the queue is empty andfalse
otherwise.
As with stacks, we can specify an interface called Queue
that requires
these four methods.
public interface Queue {
void enqueue(String value);
String dequeue();
String front();
boolean isEmpty();
}
Linked list implementation
Program 19.15 shows an implementation of the queue
ADT operations using a linked list. Because we need to keep track of
nodes at both ends of the linked list, we maintain head
and tail
variables to reference these nodes. The enqueue()
and dequeue()
methods manipulate these variables to manage the queue as values are put
onto it and removed from it.
public class LinkedListQueue implements Queue {
private static class Node {
public String value;
public Node next;
}
private Node head = null;
private Node tail = null;
public void enqueue(String value) {
Node temp = new Node();
temp.value = value;
temp.next = null;
if (isEmpty()) {
head = temp;
} else {
tail.next = temp;
}
tail = temp;
}
public String dequeue() {
String value = null;
if (isEmpty()) {
System.out.println("Can't dequeue an empty queue!");
} else {
value = head.value;
head = head.next;
if (head == null) {
tail = null;
}
}
return value;
}
public String front() {
String value = null;
if (isEmpty()) {
System.out.println("No front on an empty queue!");
} else {
value = head.value;
}
return value;
}
public boolean isEmpty() {
return head == null;
}
}
Note that the implementation of the LinkedListQueue
class is very
similar to the implementation of the LinkedListWithTail
class. The
enqueue()
method in the former is almost identical to the addLast()
method in the latter.
19.5. Advanced: Generic data structures
Most of the dynamic data structures we’ve seen in this chapter store
values of type String
. We’ve explored dynamic arrays of String
values,
linked lists of String
objects, queues of String
objects, and stacks
of String
objects. In Example 19.1, we
created the stack class TermStack
to hold Term
objects, but
TermStack
is identical to the existing LinkedListStack
class with
the substitution of Term
for String
.
What if you wanted to store values of some other type in these data
structures? What if you wanted a stack of int
values or a queue of
Thread
objects? You might think that you need to create a distinct but
similar implementation of each ADT for each type, as we do in
Example 19.1.
One possible solution is to take advantage of the fact that a variable
of type Object
can hold a reference to a value of any reference type,
since all classes are subtypes of Object
. If we create data
structures using Object
as the underlying type, we can store values of
any type in the data structure. For example,
Program 19.16 is an implementation of a stack ADT with
an underlying data type of Object
.
Object
references.public class ObjectStack {
private static class Node {
public Object value;
public Node next;
}
private Node head = null;
public void push(Object value) {
Node temp = new Node();
temp.value = value;
temp.next = head;
head = temp;
}
public Object pop() {
Object value = null;
if (isEmpty()) {
System.out.println("Can't pop empty stack!");
} else {
value = head.value;
head = head.next;
}
return value;
}
public Object top() {
Object value = null;
if (isEmpty()) {
System.out.println("Can't get top of empty stack!");
} else {
value = head.value;
}
return value;
}
public boolean isEmpty() {
return head == null;
}
}
A stack of Object
references could be an example of a
heterogeneous data structure since it’s possible to push objects of
different types onto the same stack. While there are situations in which
this technique is useful, in most cases a homogeneous data structure
(where all values are of the same type) is all that’s needed.
Homogeneous data structures allow type checking to occur at compile
time, thus helping to avoid run-time errors.
Using a stack of Object
references is generally more cumbersome, since
you must cast values returned from pop()
or top()
to the appropriate
data type.
ObjectStack stack = new ObjectStack();
stack.push("hello");
String result = (String)stack.pop(); (1)
1 | Without the explicit cast to String , the compiler gives an error:
Type mismatch: cannot convert from Object to String . |
Casting the returning value from a heterogeneous data structure essentially forces type checking to move from compile time to run time. Instead of having the Java compiler verify the type correctness of operations, we force the Java virtual machine to do the check.
19.5.1. Generics in Java
Java provides a general facility to create classes that implement the
same basic ADT but with a different underlying data type. This mechanism
preserves the advantages of compile-time type checking and eliminates
the need for run-time casting. A generic class is a class that gives a
template for creating classes in which a placeholder for an underlying
data type can be filled in when a specific instance of that class is
created. In the case of Example 19.1, we need
a stack that can hold Term
objects instead of String
objects, and a
generic class would allow us to create a stack of any reference type.
The generics facility in Java only supports underlying data types that
are reference types like String
and other classes, not
primitive types like int
or boolean
. However, we can use
wrapper classes to hold primitives types. Thus, a generic stack of int
values needs to be implemented as a stack of Integer
objects.
Fortunately, Java automatically converts between int
and Integer
in
most cases.
Defining a simple generic class in Java is done by appending a type
parameter within angle brackets (<>
) to the end of the class name
being defined.
public class GenericClass<T> {
...
T transform(T item) {
...
}
}
This code defines a new generic class (think class template)
GenericClass
with underlying type T
. It includes a method
transform()
that takes a value of type T
and transforms it (in some
unspecified way) to another value of type T
.
To use a generic class properly, you must create instances of it
specifying the underlying type. Then, the compiler will check using the
appropriate type at compile time. The compiler must make sure that
all the operations are valid with the supplied type substituted for the
type parameter (T
in this example).
For example, to create and use an instance of GenericClass
with
underlying type String
, you might do the following.
GenericClass<String> genericString = new GenericClass<String>();
String output = generic.transform("hello");
Because this use of the GenericClass
class is defined for underlying
type String
, no casting is necessary to assign the result of the
transform()
method to the String
variable output
.
To create and use an instance of GenericClass
with underlying type
Integer
, you would type:
GenericClass<Integer> genericInteger = new GenericClass<Integer>();
int i = generic.transform(27);
The same definition of GenericClass
is used in both instances with
different underlying data types, and the compiler is able to verify at
compile time that the uses are type safe.
If you omit the underlying type when declaring a generic variable or
creating an instance of a generic type, the compiler uses Object
as
the underlying type. This use, called a raw type, is essentially like
not using generics at all. There’s no compile-time type checking, and
references must be cast as needed. Modern Java compilers generally issue
a warning when raw types are used.
GenericClass genericRaw = new GenericClass(); // Raw type
int i = (Integer)genericRaw.transform(27); // Cast needed
The next two examples illustrate defining generic classes in Java.
Program 19.17 defines a generic version of the
LinkedList
class shown earlier. Note that it’s necessary to include
the type parameter T
on the outer class as well as the nested class
Node
.
public class GenericLinkedList<T> {
private static class Node<T> {
public T value;
public Node<T> next;
}
private Node<T> head = null;
private int size = 0;
public void add(T value) {
Node<T> temp = new Node<T>();
temp.value = value;
temp.next = head;
head = temp;
size++;
}
public int size() {
return size;
}
public void fillArray(T[] array) {
Node<T> temp = head;
int position = 0;
while (temp != null) {
array[position++] = temp.value;
temp = temp.next;
}
}
}
This class is almost indistinguishable from Program 19.5 except that
it uses type T
instead of String
.
Using generics is mostly like using any other class, but there are a few
oddities. In particular, there are problems instantiating arrays with generic
types. The fillArray()
method works because it never creates the array, only
fills it.
19.5.2. Using a Generic Class
Creating an instance of a generic class is similar to creating an
instance of a regular class, except that you should
specify the missing type (or types) used to parameterize the generic
class. For example, if you want to create an instance of the
GenericClass<T>
class, you must specify the type T
, for example
new GenericClass<String>()
.
Program 19.18 uses the generic class
GenericLinkedList
parameterized by String
to re-implement
Program 19.6.
GenericLinkedList
to create and use a linked list of Strings.import java.util.Arrays;
import java.util.Scanner;
public class UseGenericLinkedList {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
GenericLinkedList<String> list = new GenericLinkedList<String>();
while (in.hasNextLine()) {
list.add(in.nextLine());
}
String[] names = new String[list.size()];
list.fillArray(names);
Arrays.sort(names);
for (String name: names) {
System.out.println(name);
}
}
}
Java 7 and later allow generic type inference. Type inference means that
the compiler is able to guess what type you mean without explicitly typing it.
Type inference is intended as a convenience so that you don’t have to type as
much repetitive code. The most common form of generic type inference uses the
diamond operator (<>
). This operator is used when instantiating a generic
type to show the compiler that you want to use a generic type parameter that
you believe the compiler can determine for itself.
For example, the following line appears above without type inference.
GenericLinkedList<String> list = new GenericLinkedList<String>();
Using type inference, it can be shortened to the following.
GenericLinkedList<String> list = new GenericLinkedList<>();
In general, you must always type the full generic type parameter when declaring a variable, but you can often use the diamond operator when instantiating a new generic object.
Combining the var
keyword with generic type inference is allowed, but it will
usually infer the wrong type.
var list = new GenericLinkedList<>(); // Don't do this!
In this case, the type inferred for list
is GenericLinkedList<Object>
. Using
var
infers a type from the right half of the assignment, and using the diamond
operator infers a type from the left half of the assignment. Both are forced to
guess and wind up with a legal but unspecific type. Avoid combining the two.
19.5.3. The Java Collections Framework
The Java Collections Framework (JCF) contains many classes and interfaces that
use generics for flexibility. The java.util
package includes classes to
implement stacks, queues, lists, sets, maps, and other useful data structures.
These container classes are parameterized so that they can be created to hold
many different types. We illustrate two examples here: ArrayList
and
HashMap
. Note that there’s also a LinkedList
class which is a great deal
more powerful than the LinkedList
class defined in this chapter.
Any class that implements the Iterable
interface can also be used in the
enhanced for
loops described in Section 6.9.1. All classes that
implement the List
interface (including ArrayList
, LinkedList
, and
Vector
) as well as those that implement the Set
interface (including
HashSet
and TreeSet
) also implement Iterable
. In our examples, a List
object and a Set
object (returned by the entrySet()
method of a HashMap
)
are used as targets of enhanced for
loops.
Lists of data are useful for almost every Java program. An array is a simple
implementation of a list, but arrays can’t shrink or grow as needed. When you
need a list with that kind of flexibility, the List
interface is a great
choice. It contains add()
methods to add elements, a get()
method to
retrieve an element, and remove()
methods to remove elements. To simplify
code, we often store lists in a variable with a List
type, but List
is an
interface, not a class that can be instantiated.
The three most common classes that implement List
are ArrayList
,
LinkedList
, and Vector
. ArrayList
(java.util.ArrayList
) implements an
array of objects that can grow at run time. The array is automatically extended
when it’s full and an attempt is made to store another item. Unlike using a
linked list, ArrayList
elements can be efficiently accessed in any order (by
specifying the index, just like an ordinary array). An element can be inserted
into the middle of the ArrayList
, causing any elements after the insertion
point to be pushed back by one index. Arbitrary elements can also be deleted
from the ArrayList
using the remove()
method.
Program 19.19 illustrates a use of the ArrayList
class. The program creates an empty ArrayList
and generates random
integers between 1 and 10, appending them to the end of the list,
until their sum is at least 100. Then, it prints the integers, their
sum, and how many were generated.
ArrayList
class.import java.util.*;
public class ArrayListExample {
public static void main(String[] args) {
Random random = new Random();
List<Integer> list = new ArrayList<>();
int sum = 0;
while (sum < 100) {
int n = random.nextInt(10) + 1;
list.add(n); // Append n to end of list
sum += n;
}
for (int n: list) {
System.out.format("%3d%n", n);
}
System.out.println("---");
System.out.format("%3d (%d values)%n", sum, list.size());
}
}
Output from a typical run of Program 19.19 is shown below.
9 9 8 7 7 4 7 6 8 7 9 4 9 10 --- 104 (14 values)
We recommend the ArrayList
class for most situations when you want to store
a list of data. It will usually be the list option whose operations execute
fastest. However, the LinkedList
class can perform better in some situations
when the size of the list is rapidly increasing and decreasing, especially when
elements are being added to and removed from near the beginning of the list. As
we’ll discuss in Example 19.5, a Vector
is similar to an ArrayList
in
functionality, but its operations are synchronized so that it’s safe to share
between multiple threads.
The Map
(java.util.Map
) is an interface for a useful, general-purpose data
structure that maintains a dictionary of entries. A dictionary associates unique
keys with values. You can think of it as mapping a key to a value. In the
Java Map
interface, keys and values can be arbitrary Java classes. As with the
List
interface, there are several different implementations of the Map
interface. The most common are the HashMap
and TreeMap
classes.
To show a map in action, Program 19.20 reads a sequence of lines
containing names and ages. For simplicity, each name is one word, and each age
is a simple integer. It stores these (name, age) pairs in a
HashMap<String,Integer>
data structure. Once all the input is read and
in.hasNext()
returns false
, the program prints all the keys (names), then
all the values (ages), and finally it prints the names and ages of each person
in the input file.
HashMap
dictionary to store a mapping from names to ages.import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
public class HashMapExample {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
Map<String,Integer> map = new HashMap<>();
while (in.hasNext()) {
String name = in.next();
int age = in.nextInt();
map.put(name, age);
}
System.out.println("Keys");
for (String name: map.keySet()) {
System.out.println("\t" + name);
}
System.out.println("Values");
for (int age: map.values()) {
System.out.println("\t" + age);
}
for (Map.Entry<String, Integer> entry: map.entrySet()) {
System.out.println(entry.getKey() + " -> " + entry.getValue());
}
}
}
Shown below is the output for a simple input file.
Keys Kathy Martha Fred Henway Michael Henry John Margarette Edward Tim Hamcost Values 60 22 15 1 21 31 23 57 12 57 2 Kathy -> 60 Martha -> 22 Fred -> 15 Henway -> 1 Michael -> 21 Henry -> 31 John -> 23 Margarette -> 57 Edward -> 12 Tim -> 57 Hamcost -> 2
In general, the HashMap
class will be the faster implementation of the Map
interface. It uses a data structure called a hash table that contains a large
array with many empty locations. By cleverly jumping to the location associated
with a key, adding, finding, and removing data from a HashMap
can be
remarkably quick. Unfortunately, a HashMap
doesn’t order the keys. If you want
to be able to retrieve your keys in order (assuming that they can be ordered,
like the Integer
and String
classes can be), you should use a TreeMap
,
which internally uses a binary search tree data structure, similar to the one
we’ll discuss in Section 20.4.1.
Sets are similar to maps, except that they only have keys with no associated
values. They’re useful data structures for holding collections of unique
values. If you add a value that’s already present in a set, nothing will happen.
Unsurprisingly, Java provides the Set
interface for sets. HashSet
, an
implementation using a hash table, will usually be the fastest option. As with
maps, TreeSet
, an implementation using a binary search tree, might be slower
but will allow the values in the set to be retrieved in order.
The following table summarizes several important JCF classes. All of these are used frequently in professional Java code.
Interface | Use | Implementation | Details |
---|---|---|---|
|
Stores sequential list of data |
|
Uses dynamic array |
|
Uses linked list |
||
|
Like |
||
|
Stores key-value pairs |
|
Uses hash table |
|
Uses tree, allows ordering |
||
|
Stores unique values |
|
Uses hash table |
|
Uses tree, allows ordering |
19.6. Solution: Infix conversion
Here we give our solution to the infix conversion problem from the
beginning of the chapter. As in Example 19.1,
we use a stack of Term
objects to solve the problem. However, we
expand the Term
class to hold both operands and operators. We only add
methods and fields to the earlier definition, taking nothing away. In
this way, we should be able to use the Term
class for both infix to
postfix conversion and postfix calculation.
public class Term {
private int value;
private char operator;
private boolean isOperator;
Here we’ve augmented the earlier Term
class by adding two more
fields, a char
called operator
to hold an operator and a boolean
called isOperator
to keep track of whether or not our Term
object
holds an operator or an operand.
public Term(int value) {
this.value = value;
isOperator = false;
}
public Term(char operator) {
this.operator = operator;
isOperator = true;
}
We now have two constructors. The first one takes an int
value and
stores it into value
, setting isOperator
to false
to indicate that
the Term
object must be an operand. The second constructor takes a
char
value and stores it into operator
, setting isOperator
to
true
to indicate that the Term
object must be an operator (such as
+
, -
,*
, or /
).
public int getValue() { return value; }
public char getOperator() { return operator; }
public boolean isOperator() { return isOperator; }
These three accessors give back the operand value, the operator
character, and whether or not the object is an operator, respectively.
This solution is not necessarily the most elegant from an OOP
perspective. The code that uses a Term
object needs to choose the
getOperator()
method or the getValue()
method depending on whether
the Term
is an operator. This design opens up the possibility
that some code will call the wrong accessor method and get a useless
default value.
public boolean greaterOrEqual(Term term) {
if (isOperator()) {
switch(operator) {
case '*':
case '/': return true;
case '+':
case '-': return term.operator != '*' && term.operator != '/';
default: return false;
}
} else {
return false;
}
}
}
The most complicated addition to the Term
class is the
greaterOrEqual()
method, which takes in another Term
object. This
method compares the operator of the Term
object being called with the
one that’s being passed in as a parameter. Because this method is in
the Term
class, it can access the private
variables of the term
parameter. This method returns true
if the operator of the called
object has a greater or equal precedence compared to the operator of the
parameter object. The meat of the method is the switch
statement that
establishes the high precedence of *
and /
, the medium precedence of
+
and -
, and the low precedence of anything else, namely the left
parenthesis (
.
With this updated Term
class, we can create Term
objects that hold
either an operator or an operand and allow the precedence of operators
to be compared. We use exactly the same TermStack
class from
Example 19.1 for our stack. All that remains
is the client code that parses the input.
import java.util.*;
public class InfixToPostfix {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String expression = in.nextLine(); (1)
TermStack stack = new TermStack(); (2)
String postfix = ""; (3)
1 | We read in the input expression. |
2 | We create a TermStack called stack to aid in conversion. |
3 | We also declare an empty String called postfix to hold the output. |
for (int i = 0; i < expression.length(); i++) { (1)
char term = expression.charAt(i);
if (term >= '0' && term <= '9') { (2)
postfix += term;
} else if (term == '(') { (3)
stack.push(new Term(term));
} else if (term == ')') { (4)
while (stack.top().getOperator() != '(') {
postfix += stack.top().getOperator();
stack.pop();
}
stack.pop(); // Pop off the '('
}
else if (term == '*' || term == '/' || term == '+' || term == '-') {
Term operator = new Term(term); (5)
while (!stack.isEmpty() && stack.top().greaterOrEqual(operator)) {
postfix += stack.top().getOperator();
stack.pop();
}
stack.push(operator);
}
}
1 | This for loop runs through each char in the input expression and
applies the four rules given in the description of the infix conversion
problem. |
2 | If a term is an operand, it’s added directly to the output. |
3 | If a term is a left parenthesis, it’s pushed onto the stack. |
4 | If a term is a right parenthesis, all the terms on the stack are popped off and added to the output until a left parenthesis is reached. |
5 | If a term is a normal operator, the top of the stack is repeatedly popped
and added to output as long as it has a precedence greater than or equal to the
new operator. The complexity of doing this precedence comparison is tucked
away inside of the Term class. |
while (!stack.isEmpty()) { (1)
postfix += stack.top().getOperator();
stack.pop();
}
System.out.println(postfix); (2)
}
}
1 | After the input has all been consumed, we pop any remaining operators off the stack and add them to the output. |
2 | Finally, we print the output. |
The output from this program could be used as the input to the postfix
evaluator program from Example 19.1. A more complex program
that did both the conversion and the calculation might want to store
everything in a queue of Term
objects instead of producing String
output and
then recreating Term
objects.
19.7. Concurrency: Linked lists and thread safety
The implementations of stacks and queues in the previous sections are
not thread-safe. If multiple threads use a stack or queue object
simultaneously, the head
or tail
pointers can become inconsistent or
be updated incorrectly, potentially causing the stack or queue to lose
elements. As you’ve seen, multiple threads operating on the same data
can produce unexpected results.
Program 19.21 is a simple multi-threaded program to test (and break!) the thread safety of the queue implementation in Program 19.15.
public class UseLinkedListQueue extends Thread {
private static final int THREADS = 10;
private LinkedListQueue queue;
private boolean adding;
public static void main(String[] args) throws InterruptedException {
Thread[] threads = new Thread[THREADS];
LinkedListQueue queue = new LinkedListQueue(); (1)
for (int i = 0; i < THREADS; i++) {
threads[i] = new UseLinkedListQueue(queue, true);
threads[i].start(); (2)
}
for (int i = 0; i < THREADS; i++) {
threads[i].join(); (3)
}
for (int i = 0; i < THREADS; i++) {
threads[i] = new UseLinkedListQueue(queue, false);
threads[i].start(); (4)
}
for (int i = 0; i < THREADS; i++) {
threads[i].join();
}
while (!queue.isEmpty()) {
System.out.println("Left in queue: ID = " + queue.dequeue());
}
}
1 | The program creates a queue. |
2 | It then creates and starts 10 threads, passing them the queue and true
for their adding value. |
3 | Then, the program joins the threads until each has finished. |
4 | The program starts 10 more threads, passing them the same queue but false
for their adding value. |
public UseLinkedListQueue(LinkedListQueue queue, boolean adding) { (1)
this.queue = queue;
this.adding = adding;
}
public void run() {
if (adding) {
long ID = Thread.currentThread().getId();
System.out.println("Thread ID added to queue: " + ID);
queue.enqueue("" + ID); (2)
}
else {
String ID = queue.dequeue();
System.out.println("Thread ID removed from queue: " + ID); (3)
}
}
}
1 | Each thread’s constructor takes a reference to the shared queue and a
boolean that specifies whether it’s an adding or a removing thread. |
2 | Threads in the adding phase add a String version of their thread ID
numbers to the queue and print it out. |
3 | Threads not in the adding phase each remove one value from the queue and print it out. |
Without appropriate synchronization, the program may not correctly link all values into the queue nor remove them at the end. A typical error-prone output run is shown here.
Thread ID added to queue: 9 Thread ID added to queue: 14 Thread ID added to queue: 13 Thread ID added to queue: 12 Thread ID added to queue: 11 Thread ID added to queue: 10 Thread ID added to queue: 18 Thread ID added to queue: 17 Thread ID added to queue: 16 Thread ID added to queue: 15 Thread ID removed from queue: 14 Thread ID removed from queue: 11 Thread ID removed from queue: 12 Thread ID removed from queue: 16 Thread ID removed from queue: 17 Thread ID removed from queue: 18 Thread ID removed from queue: 10 Can't dequeue an empty queue! Can't dequeue an empty queue! Thread ID removed from queue: 15 Thread ID removed from queue: null Thread ID removed from queue: null
How does this implementation fail? Consider the situation in which two
threads are attempting to put a value in the queue simultaneously by calling
the enqueue()
method in Program 19.15. Suppose
the first thread tests the queue and finds it empty (isEmpty()
returns
true
) but is then interrupted. If a second thread gets control, it will
also see that the queue’s empty, set the head
and tail
variables to the new Node
object temp
, and
return. The first thread will eventually wake up, still thinking that
the queue is empty, and also set the head
and tail
variables to its
own new Node
temp
. But these assignments overwrite the assignments
just done by the previous thread! The initial node that was in the queue
is now lost. Note that there are other sequences of execution that can cause
similar race conditions.
This problem can be fixed by ensuring that once one thread starts
examining and modifying queue variables, no other thread can access the
same variables until the first one is finished. As shown in
Chapter 15, this mutual exclusion can be
achieved by using the synchronized
keyword on methods that need to
have exclusive access to object variables. In this queue implementation,
we need to synchronize access by threads that are using either the
enqueue()
or dequeue()
methods, since both methods access and
manipulate variables in the object. Although it’s not called in this
program, the front()
method should also be synchronized so that a
null
head
is not accessed accidentally. The isEmpty()
method doesn’t
need to be synchronized since the only methods that call it that can
do any harm are already synchronized. Outside code that calls
isEmpty()
might get the wrong value if another thread modifies the
contents of the queue, but there’s no guarantee that other threads won’t
modify the state of the queue at any point after the isEmpty()
method is called anyway.
public class LinkedListQueueSafe implements Queue {
private static class Node {
public String value;
public Node next;
}
private Node head = null;
private Node tail = null;
public synchronized void enqueue(String value) {
Node temp = new Node();
temp.value = value;
temp.next = null;
if (isEmpty()) {
head = temp;
} else {
tail.next = temp;
}
tail = temp;
}
public synchronized String dequeue() {
String value = null;
if (isEmpty()) {
System.out.println("Can't dequeue an empty queue!");
} else {
value = head.value;
head = head.next;
if(head == null)
tail = null;
}
return value;
}
public synchronized String front() {
String value = null;
if (isEmpty()) {
System.out.println("No front on an empty queue!");
} else {
value = head.value;
}
return value;
}
public boolean isEmpty() {
return head == null;
}
}
With both enqueue()
and dequeue()
methods synchronized as in
Program 19.22, a typical output generated by the
program is shown below.
Thread ID added to queue: 9 Thread ID added to queue: 14 Thread ID added to queue: 12 Thread ID added to queue: 13 Thread ID added to queue: 10 Thread ID added to queue: 11 Thread ID added to queue: 18 Thread ID added to queue: 17 Thread ID added to queue: 16 Thread ID added to queue: 15 Thread ID removed from queue: 9 Thread ID removed from queue: 18 Thread ID removed from queue: 13 Thread ID removed from queue: 17 Thread ID removed from queue: 15 Thread ID removed from queue: 16 Thread ID removed from queue: 14 Thread ID removed from queue: 12 Thread ID removed from queue: 10 Thread ID removed from queue: 11
19.8. Concurrency: Thread-safe libraries
As we mentioned in Section 9.6, some libraries are thread-safe and some are not. The JCF is a useful library, but it’s also a library that requires you to manage your own thread safety if it matters for your program.
The JCF defines the Collection
interface and the Map
interface. The
Collection
interface, which any collection of objects should
implement, has subinterfaces Set
, List
, and Queue
which define the
basic operations in Java that are needed to implement a set, list, or
queue of items. The Map
interface gives the basic operations for a
dictionary, a collection of key-value pairs, one implementation of which
is the HashMap
from Example 19.4.
As we mentioned in Chapter 10, an interface can’t
mark a method with the synchronized
keyword. Consequently, the JCF makes
no guarantee about the thread safety of a container based on which
interface it implements. The programmer must read the documentation
carefully in order to know if a container is thread-safe and react accordingly.
Vector
A Vector
is like an ArrayList
, with essentially the same interface
but adding synchronization. That is, if two threads attempt to insert or
remove an element from the same ArrayList
at the same time, the
internal state of the ArrayList
might become corrupt, or the results might be
incorrect. For a Vector
, however, these problems won’t happen.
Program 19.23 is an example that makes updates to a Vector
class
with multiple threads.
Vector
.import java.util.*;
public class VectorExample extends Thread {
private List<String> list;
public static void main(String[] args) throws InterruptedException {
List<String> list = new ArrayList<>(); (1)
Thread t1 = new VectorExample(list); (2)
Thread t2 = new VectorExample(list);
t1.start();
t2.start();
t1.join();
t2.join();
for (String text: list) { (3)
System.out.println(text);
}
}
1 | The main() method creates a Vector . |
2 | It then creates and starts two threads, passing in the Vector as an
argument to each so that they both share the list. |
3 | After waiting for the threads to finish, the main() method prints
out the contents of the list. |
public VectorExample(List<String> list) {
this.list = list; (1)
}
public void run() {
for (int i = 0; i < 10; i++) { (2)
list.add(this.getName() + ": " + i); (3)
try {
Thread.sleep(1); (4)
}
catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
1 | The constructor stores a reference to the shared Vector . |
2 | Each thread repeats a loop 10 times, appending a String to the
Vector on each iteration. |
3 | To prevent concurrent updates from happening, each thread
synchronizes on the shared variable list . |
4 | To make concurrent update attempts more likely without synchronization, the thread sleeps for a millisecond on each iteration. |
By using Vector
, each run includes exactly the same number of entries from
each thread, although the threads don’t always alternate in strict lockstep.
Thread-0: 0 Thread-1: 0 Thread-0: 1 Thread-1: 1 Thread-1: 2 Thread-0: 2 Thread-1: 3 Thread-0: 3 Thread-1: 4 Thread-0: 4 Thread-0: 5 Thread-1: 5 Thread-1: 6 Thread-0: 6 Thread-1: 7 Thread-0: 7 Thread-1: 8 Thread-0: 8 Thread-1: 9 Thread-0: 9
However, if we had used an ArrayList
instead, a possible run, shown below,
includes a null
reference in the output, indicating that the internal
ArrayList
data structure wasn’t updated correctly.
Thread-1: 0 Thread-0: 0 Thread-1: 1 Thread-0: 1 Thread-1: 2 Thread-0: 2 Thread-0: 3 Thread-1: 3 Thread-1: 4 Thread-0: 4 null Thread-0: 5 Thread-1: 6 Thread-0: 6 Thread-1: 7 Thread-0: 7 Thread-1: 8 Thread-0: 8 Thread-0: 9 Thread-1: 9
ArrayList
is much more widely used than Vector
, not in spite of its lack
synchronization but because of that lack. Synchronization tools have
overhead, slowing down execution. Most programmers aren’t focused on writing
thread-safe code and prefer faster execution.
Whenever synchronization doesn’t matter, ArrayList
is a better choice than
Vector
. When synchronization does matter, the programmer must decide
whether to use Vector
or to use ArrayList
with explicit synchronization
tools.
19.9. Exercises
Conceptual Problems
-
Explain the difference between static data structures and dynamic data structures.
-
In which situations is it better to use a dynamic array? In which situations is it better to use a linked list? Explain why in each case.
-
On which line in Program 19.1 is an exception generated? Why?
-
In Program 19.2, is it possible to increment
count
inside thetry
clause rather than at the bottom of thewhile
loop? -
Explain why the array inside the
names
object in Program 19.4 is, on average, only three-quarters full. -
Based on the stack implementation in Program 19.10, draw a picture of the linked list structure after each of the following statements.
LinkedListStack stack = new LinkedListStack(); stack.push("hello"); stack.push("goodbye"); stack.pop(); stack.push("there"); stack.push("cruel"); stack.pop(); stack.push("world");
-
Implement the methods
top()
andisEmpty()
for the dynamic array implementation of the stack in Program 19.11. -
Based on queue implementation in Program 19.15, draw a picture of the linked list structure after each of the following statements.
LinkedListStack queue = new LinkedListStack(); stack.enqueue("hello"); stack.enqueue("there"); stack.enqueue("world"); stack.dequeue(); stack.enqueue("cruel"); stack.dequeue(); stack.enqueue("goodbye");
Programming Practice
-
Implement a version of
DynamicArray
from Program 19.3 that shrinks the size of its internal storage array to half its size when only one quarter of its capacity is being used. This design can save significant amounts of space if a large number of items are added to the dynamic array at once and then removed. -
Consider Program 19.7 which defines the
LinkedListWithTail
class for storing a linked list ofString
values. Add areverse()
method to the class which reverses the order of the nodes in the linked list. The key idea is make a new linked list that holds the head of the list. Then, remove the head from the original linked list. Put the next node in front of the head in the new linked list and remove it from the old. Continue the process until there’s nothing left in the original list. Be sure to reset thehead
andtail
references correctly after the reversal. -
In Section 19.2.2, we used two kinds of linked lists to store data but copy all that data back into an array before sorting it. We also used a third linked list class,
SortedLinkedList
, to insert data and maintain a sorted order. However, it’s possible to add data in non-sorted order to a linked list and then sort it afterward. Add asort()
method to theLinkedListWithTail
class that performs a bubble sort on the nodes inside.The algorithm for a bubble sort is described in Section 10.1. The idea is to make repeated passes through a list, swapping two adjacent items if they’re out of order. You keep making passes over the list until no adjacent items are out of order. For a this
sort()
method, you’ll need to use thecompareTo()
method to compare theString
values in the linked list nodes. Also, it might be necessary to have special cases that update thehead
andtail
pointers if those nodes are swapped with other nodes. Note that bubble sort is not the fastest way to sort a linked list. We introduce a faster approach in Chapter 20. -
Create JUnit test cases to verify that the
synchronized
keywords are needed on theset()
andsort()
methods of theDynamicArray
example from Program 19.3. To test theset()
method, you can create one thread that repeatedly sets, gets, and tests a changing value at a fixed location (e.g., 0) and another thread that continuously appends to the array (causing it to grow by copy and replace, thus occasionally overwriting the value at the fixed location). To test thesort()
method, create two threads that sort the same large random array at the same time. Check to see if the array is, in fact, actually sorted after the threads have exited. For both tests, you might need to repeat the operations a number of times to trigger the race condition. -
To make an infix converter that can handle floating-point values or even just integers with more than one digit, you need to make a pass over the input, parsing the sequence of characters into terms. When an expression is in infix notation, the order of terms is an operand followed by an operator, repeated over and over, and finishing on an operand. There are two exceptions: Whenever you’re expecting an operand, you might get a left parenthesis, but after the parenthesis, you’ll continue to look for an operand. Whenever you’re expecting an operator, you might get a right parenthesis, but after that parenthesis, you’ll continue to look for an operator.
Using this first pass over input to separate terms as well as the
Double.parseDouble()
method to compute the equivalentdouble
values of operands, rewrite the solution from Section 19.6 to convert your terms into postfix ordering and then calculate the answer. -
Re-implement the solution to the infix conversion problem given in Section 19.6 so that it uses
GenericStack
with a type parameter ofTerm
instead ofTermStack
. -
Interfaces can also be generic. Consider the following generic version of
Queue
.public interface Queue<T> { void enqueue(T value); T dequeue(); T front(); boolean isEmpty(); }
Re-implement
LinkedListQueue
so that it’s generic with type parameterT
and implements interfaceQueue<T>
.
Experiments