In this course, we use the term “representation” exclusively to mean the representation of abstract ideas by means of physical signs.
If we want to convey an abstract idea to another person, we are forced to represent that idea by means of physical signs, because only physical signs can be perceived by the other person. For example, to convey the abstract idea “tomorrow”, we can represent it using sound waves by speaking the word “tomorrow”, or we can represent it using blots of ink on paper in the shape of the letters “tomorrow”. These are two physical representations of the abstract idea “tomorrow”. As another example, we can convey the abstract idea “five” by holding up all five fingers of one hand; in this case, we are representing the abstract idea “five” by means of the physical sign of holding up the five fingers of our hand.
There are generally many ways to physically represent a given abstract idea, and many ways to interpret a physical sign as an abstract idea. For example, the letters “chair” can mean “seat” (in English) or “flesh” (in French). Therefore, conveying an abstract idea using physical signs works only if the relation between abstract ideas and corresponding physical signs (which we could call an interpretation relation or, as we do in this course, an abstraction relation) is well-understood. To avoid ambiguity, a given physical sign should be associated with at most one abstract idea. It is however possible that a given physical sign is not associated with any abstract idea; in that case we say the physical sign is nonsense. Thus, an unambiguous abstraction relation corresponds to a partial function from physical signs to abstract ideas; the domain of this function is the set of meaningful (or valid) physical signs for the given abstraction relation.
The abstract ideas we wish to represent using physical signs in this course are abstract states from some set of abstract states (called an abstract state space). We wish to represent them by means of physical states of some part of the world. For example, if we take the set of numbers between zero and five as the abstract state space, then our possible abstract states are 0, 1, 2, 3, 4, and 5. We can represent those abstract states physically by means of physical states of our hand: we represent 0 by keeping our hand closed, 1 by showing a single finger, 2 by showing two fingers, 3 by showing three fingers, 4 by showing four fingers, and 5 by showing five fingers. Looking at it this way, our hand can be in six possible states. We call these states concrete states because they are states of the physical world. This abstraction relation specifies how we can interpret a state of our hand as an abstract value from our abstract state space. We call the set of possible states of our hand the concrete state space.
This, however, is not the only way we can interpret states of our hand as numbers between 0 and 5. Another possible way to represent these abstract states using concrete states of our hand is as follows: showing no fingers means 0, showing our thumb means 1, showing our index finger means 2, showing our middle finger means 3, showing our ring finger means 4, and showing our little finger means 5. Under this abstraction relation, showing multiple fingers is nonsense; it is not a valid concrete state. Only concrete states in which at most one finger is shown are valid concrete states and map to abstract states.
In this course, we are more specifically interested in defining and implementing data abstractions. We wish to enable clients of a data abstraction implemented by some class C to think of objects of class C as storing an abstract state from some abstract state space. For example, we wish to enable clients of a TimeOfDay class to think of a TimeOfDay object as storing a time-of-day value such as 10:30am or 2:15pm. In order to implement such a data abstraction, we need to choose a way to represent these abstract states as physical states of the TimeOfDay object. This means that we need to choose a specific data representation: which fields should we declare in class TimeOfDay, and what should be the types of those fields.
One valid choice of data representation for class TimeOfDay is to declare a field hours
of type int
and a field minutes
, also of type int
. In that case, the concrete states of a TimeOfDay object are characterized by a value for field hours
and a value for field minutes
. One possible abstraction relation is relate a concrete state with value H for hours
and value M for minutes
with the time of day H:M. Under this abstraction relation, concrete states where H is less than 0 or greater than 23, or M is less than zero or greater than 59, are not meaningful, i.e. they are not valid.
Another valid choice for an abstraction relation for the same choice of fields allows any value for minutes
, provided that H * 60 + M is not less than 0 and less than 24 times 60. Under this abstraction relation, there are fewer invalid concrete states, and multiple different concrete states may represent the same abstract state. For example, a value of 0 for hours
and 600 for minutes
represents 10:00am, and so does a value of 5 for hours
and a value of 300 for minutes
.
A third valid choice of data representation is to have only a field minutesSinceMidnight
of type int
.
A data representation for a data abstraction, then, at least in simple cases like TimeOfDay, consists of a choice of fields and field types for the class being designed, and an abstraction relation which determines the set of valid concrete states and the correspondence between valid concrete states and abstract states.
As another example of a data abstraction, we wish to enable clients of a class IntList to think of objects of class IntList as storing a sequence of int values. The abstract state space in this case is the set of sequences of int values, including sequences such as 10, 20, 30 (a sequence of length 3), or 400, 300, 350, 150 (a sequence of length 4), or the empty sequence (which is of length 0). Note that this abstract state space is infinite: there are infinitely many sequences of int values. Note also that an object of a class whose fields have primitive types (such as int
or char
, but not a class or array type), has only finitely many possible concrete states, since a class can have only finitely many fields and each of Java's primitive types has only finitely many values. Therefore, we cannot represent the abstract states of an IntList object using only the concrete states of the object itself. In such cases, we need to consider the concrete states of other objects as well. For example, we can implement the IntList abstraction by having a field elements
in class IntList of type int[]
(array-of-int
). Such a field can hold a reference to an array-of-int
object. By marking this field as @representationObject
, we indicate that we include the array object into the part of the physical world that we use to represent the abstract states of the IntList object. That is, the part of the physical world that we use to represent the abstract states of an IntList object L is object L itself as well as the array object pointed to by L's elements
field. We consider the concrete state of an instance L of the data abstraction to consist of the concrete states of the fields of L together with the concrete state of the representation object of L. We say that such a concrete state is valid if the elements
field of L is not null
, and if so, that the concrete state corresponds to the abstract state given by the sequence of int
values stored in the array object.
Another possible abstraction relation for class IntList is to consider a null
value for elements
to be valid, and to represent the empty sequence. In that case, the empty sequence has two possible representations: one where the elements
field is null
, and one where the elements
field points to an empty array. The choice of which abstraction relation is used is expressed by the presence or absence of a representation invariant that states that elements
must not be null
. (Note that if we allow elements
to be null
, we must take this case into account and properly handle it in every method of our class! Therefore, the more sensible choice is to not allow elements
to be null
.)
A third possible choice of a data representation for class IntList is to define an auxiliary class Node
with fields value
(of type int
) and next
(of type Node
) and to use a linked list of Node
objects to represent an abstract state of an IntList object. In that case, class IntList has a single field called head
of type Node
, and the concrete state of an instance L of the data abstraction consists of the state of the head
field of L and the states of the Node
objects reachable from head
by following the next
pointers. That is, the Node
objects are included in the part of the physical world that is used to represent an abstract state of the IntList object. (The way to indicate that the Node objects are representation objects of the IntList object is advanced material and need not be known for the exam.)