By
Date, C.J.
Introduction: The term 'encapsulation', which describes a type that has
no user-visible components, is misleading and not very useful. The term
'scalar' captures much better the idea that encapsulation tries to
express. Apart from this nomenclature problem, encapsulation has other
problems. The concept conflicts with the need to perform ad hoc queries.
Also, often types are referred to as encapsulated when they are not,
nor would it be beneficial if they were. Just because data types are
not encapsulated does not mean that any information is lost.
Because of these points, the term 'encapsulation' should be avoided
entirely.
How this article is organized:
- WHAT DOES ENCAPSULATION MEAN?
- WHAT ABOUT AD HOC QUERIES?
- WE DON'T ALWAYS WANT ENCAPSULATION
- SCALAR VS NONSCALAR TYPES
WHAT DOES ENCAPSULATION MEAN?
"Encapsulation [means each type has] a set of [operations and] a
representation ... that is allocated for each of its instances.
This representation is used to store the state of the object.
Only the methods implementing operations for the objects are
allowed to access the representation, thereby making it possible
to change the representation without disturbing the rest of the
system. Only the methods would need to be recoded."
[Fundamentals of Object-Oriented Databases]
"Encapsulation refers to the concept of including processing or behavior
with the object instances defined by the class. Encapsulation allows code
and data to be packaged together." [The Object Database Handbook]
Abstraction, information hiding, and encapsulation are the three concepts
that appear during the analysis and design of a system.
Some writers seem to think the concept refers specifically to
the physical bundling, or "packaging," of data representation
definitions and operator definitions.
But it seems to me that to interpret the term in this way is to
mix model and implementation considerations. The user shouldn't care,
and shouldn't need to care, whether or not code and data are "packaged
together"! Thus, it's my belief that--from the user's point of view, at
least, which is to say from the point of view of the model--encapsulation
simply means what I said before: namely, that the data in question has no
user-visible components and can be operated upon only by means of the
pertinent operators.
BUT WHAT ABOUT AD HOC QUERIES?
As you might already know, the objective of encapsulation conflicts
somewhat with the need to be able to perform ad hoc queries. After all,
encapsulation means data can be accessed only via predefined operators,
while an ad hoc query means, more or less by definition, that access is
required in ways that can't have been predefined. For example, suppose
we have a data type POINT; suppose we also have a (predefined) operator
to "get"--that is, read or retrieve--the X coordinate of any given
point, but no analogous operator to get the corresponding Y coordinate.
Then even the following simple queries, and many others like them,
obviously can't be handled:
* Get the Y coordinate of point P
* Get all points on the X axis
* Get all points with Y coordinate less than five.
In fact, they can't even be stated.
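The situation can be sketched in Python, which is used here only for
illustration. The class Point and its operator get_x are hypothetical
stand-ins for the POINT type and its single "get X" operator. Python
hides the representation only by convention, so this is a sketch of the
idea rather than true encapsulation.

```python
class Point:
    """Hypothetical POINT type whose only read operator is get_x."""

    def __init__(self, x, y):
        # Hidden representation (name-mangled, private by convention).
        self.__rep = (float(x), float(y))

    def get_x(self):
        """The one predefined operator: return the X coordinate."""
        return self.__rep[0]


points = [Point(1, 0), Point(2, 5), Point(3, 0)]

# A query that needs only X can be stated with the predefined operator.
xs_greater_than_one = [p.get_x() for p in points if p.get_x() > 1]

# "Get all points on the X axis" needs Y, and no operator supplies it,
# so the query cannot even be written against the public interface.
can_ask_for_y = hasattr(Point, "get_y")
```

Here xs_greater_than_one can be computed, while any query mentioning Y
has nothing to call.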
Solution: define operators that expose some possible representation
for instances of that type. For POINT, for example, operators THE_X and
THE_Y effectively expose a possible representation - namely, Cartesian
coordinates X and Y - for points, thereby making it possible to perform
ad hoc queries involving points. This fact doesn't mean that points are
actually represented by Cartesian coordinates inside the system; the
actual representation stays hidden, so this method does not violate
data independence.
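As a minimal sketch of the distinction between a possible and an actual
representation, the Python class below stores each point in polar form
yet offers the_x and the_y operators. The polar storage is an assumption
made purely for illustration, since the article does not say how points
are really stored, and the method names are Python spellings of THE_X
and THE_Y.

```python
import math


class Point:
    """Hypothetical POINT type: Cartesian operators over polar storage."""

    def __init__(self, x, y):
        # Actual (hidden) representation: polar coordinates. This choice
        # is an illustrative assumption, not something the article states.
        self._r = math.hypot(x, y)
        self._theta = math.atan2(y, x)

    def the_x(self):
        """Expose the X component of the Cartesian possible representation."""
        return self._r * math.cos(self._theta)

    def the_y(self):
        """Expose the Y component of the Cartesian possible representation."""
        return self._r * math.sin(self._theta)


points = [Point(1, 0), Point(2, 6), Point(3, 0), Point(0, 7)]

# Ad hoc queries can now be stated using only the exposed operators.
# Floating-point round-off from the polar conversion is why a tolerance
# is used instead of an exact comparison with zero.
on_x_axis = [p for p in points if math.isclose(p.the_y(), 0.0, abs_tol=1e-9)]
y_below_five = [p for p in points if p.the_y() < 5]
```

Switching the stored form back to Cartesian coordinates would change
only the bodies of the_x and the_y, not the queries that call them.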
WE DON'T ALWAYS WANT ENCAPSULATION
Some types are definitely not encapsulated: certain "generated" types,
such as ARRAY, LIST, TUPLE, and RELATION. For example, given
VAR POINT...RELATION{X..., Y...}
the relation has user-visible components, namely the attributes X and Y,
and it is those visible components that make ad hoc queries possible.
Base types vs composite types. C++ takes an intermediate approach.
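A rough Python analogue of such a nonencapsulated relation is a set of
named tuples. The name PointTuple and the float attribute types are
assumptions, since the declaration above elides the attribute types.

```python
from typing import NamedTuple


class PointTuple(NamedTuple):
    """Hypothetical tuple type with user-visible attributes x and y."""
    x: float
    y: float


# A relation is modeled here as a set of tuples (no duplicates, no order).
point_relation = {
    PointTuple(1.0, 0.0),
    PointTuple(2.0, 6.0),
    PointTuple(3.0, 0.0),
}

# Because the attributes are visible, ad hoc queries need no predefined
# operators beyond attribute access.
on_x_axis = {t for t in point_relation if t.y == 0.0}
xs_where_y_below_five = {t.x for t in point_relation if t.y < 5}
```

Nothing is hidden here and nothing needs to be: the attributes
themselves are the interface.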
SCALAR vs. NONSCALAR TYPES
"Scalar" is a generic way of referring to "simple" types (POINT, LENGTH,
AREA, LINE, and so on--possibly even types like INTEGER, if the system
doesn't provide them as built-in types).
The reasons to choose the term scalar:
* It was already available. (It's been used with the meaning we had in
mind for many years in the programming languages world.)
* It did seem to be the obviously correct term to contrast with ones such
as tuple and relation (and array and list and all the rest).
And I now observe that our term scalar means exactly the same thing as
encapsulated.
Copyright © University of Colorado. All rights reserved.
Revised: November 17, 1998