A friend of mine once said, "What do you mean you don't want to argue
about semantics? What else is there to argue about?"
In the realm of programming languages, expressing the intended
semantics of an interface is one of the most important aspects of the
design. Naturally, I'll focus this discussion on the issues with
expressing semantics in C++ programs.
A strictly typed language can be used to express many simple semantics,
such as "the first argument of the function is a signed integer."
However, the C++ language does not offer a way to express ideas such as
"the range of legal values for the first argument is between -2 and +2
inclusive."
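As a minimal sketch of that gap, consider the hypothetical operation below (set_trim and its parameter name are my own inventions for illustration). The signature tells the client only that the argument is an int; the legal range has to be stated in a comment and enforced, if at all, at run time.

```cpp
#include <stdexcept>

// Hypothetical operation. The type says "int"; the legal range of
// -2 to +2 inclusive exists only in this comment and in the check below.
int set_trim(int step)
{
    if (step < -2 || step > 2) {
        throw std::out_of_range("step must be between -2 and +2 inclusive");
    }
    return step;  // stand-in for whatever the real operation would do
}
```

Throwing an exception is only one possible policy here; an assertion or a documented precondition with no check at all would be equally legitimate choices, which is exactly why the semantic needs to be written down.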
The purpose of this document is to 1) provide a place to collect ideas
regarding the various interface semantics that are not expressed
directly by C++, 2) analyze those semantics, 3) evolve a consistent set
of terminology, and 4) develop a set of questions that a developer can
ask himself with regard to an interface, that will help him to correctly
document the semantics of each argument, the return type, and the
behavior of the operation overall.
Ambiguous Semantic Analysis
Each subsection of this document will discuss a particular semantic
issue.
Life Expectancy For Objects Passed By Reference
It's common knowledge that passing arguments by reference rather than by
value is more efficient once the size of the argument exceeds a certain
threshold. The threshold varies, but is generally related to the
register size of the target CPU.
Pass by value is the default semantic in C++ (for integers, for
example), and involves a copy of the object being passed to the
operation. If, during its execution, the operation modifies the
argument, it is modifying a copy of the original object rather than the
object itself.
If an object is passed by reference, and the operation modifies the
object, it is actually modifying the original object, rather than a
copy. Therefore, although the act of passing by reference can be more
efficient than passing by value, the semantics are different from the
point-of-view of the implementation of the operation. In other words,
either the operation must be careful not to modify the referenced
argument, or the semantic must be made clear to the client.
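The difference can be sketched with two hypothetical operations that differ only in how the argument is passed (all of the names here are assumptions made for illustration):

```cpp
// Modifies only the local copy; the caller's object is untouched.
void increment_copy(int n) { ++n; }

// Modifies the caller's object through the reference.
void increment_original(int& n) { ++n; }

// Returns the caller's value after a by-value call: still 1.
int demo_by_value()
{
    int x = 1;
    increment_copy(x);
    return x;
}

// Returns the caller's value after a by-reference call: now 2.
int demo_by_reference()
{
    int x = 1;
    increment_original(x);
    return x;
}
```

At the call site the two invocations look identical, so the client cannot tell from the call alone which semantic applies.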
Of course, using the "const" qualifier for the reference in the
interface specification neatly "prevents" the implementation of the
interface from modifying the argument, effectively making the argument
read-only.
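A minimal sketch of a read-only reference argument follows; count_spaces is a hypothetical operation, not something taken from a real library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical read-only operation: the const qualifier makes the
// compiler reject direct modification of `text` inside the body.
std::size_t count_spaces(const std::string& text)
{
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') {
            ++n;
        }
    }
    // text.clear();  // would not compile: `text` is const here
    return n;
}
```

Note that a deliberate const_cast can still get around the qualifier, so the protection is against accidents rather than against a determined implementation.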
However, what C++ does not cover are the time-related aspects of the
referenced object. When an object is passed by value, the operation
implementation is free to keep a copy of the copy, for its own use, and
the lifetime of the copy is under its control.
The possibility of an operation storing a reference beyond the
execution of the operation itself, however, has other implications. Does
the operation expect the referenced object to persist beyond the
execution of the operation? For example, what happens if the operation
is part of an object, a copy of the reference is stored within the
state of the object, and subsequently another operation is invoked that
uses the reference? The expectation of the interface, in this case, is
that the referenced object remain valid for some period of time after
the reference is obtained through the original operation.
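The scenario just described can be sketched as follows. Greeter and its member names are hypothetical; the point is only that nothing in the signature of set_name() reveals that the reference outlives the call.

```cpp
#include <string>

// Hypothetical class whose set_name() keeps the address of its argument.
class Greeter {
public:
    // Stores a pointer to `name`. The signature looks the same as one
    // that merely reads the argument and forgets it.
    void set_name(const std::string& name) { name_ = &name; }

    // Uses the stored reference, so the object passed to set_name()
    // must still be alive when this is invoked.
    std::string greet() const
    {
        return name_ ? "Hello, " + *name_ : std::string("Hello");
    }

private:
    const std::string* name_ = nullptr;
};
```

A call such as g.set_name(std::string("Ada")) compiles without complaint, yet the temporary is destroyed at the end of that statement, and a later g.greet() would have undefined behavior. The language accepts the call; only documentation can tell the client not to make it.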
The question then becomes, "How long must the referenced object persist
beyond the invocation of this operation?" Possible answers to this
question include:
- The referenced object is never accessed after the operation
completes.
- The referenced object must remain valid for the lifetime of the object
of which the operation is a member.
- The referenced object must remain valid "forever".
- The referenced object must remain valid until one or more
specified operations of the object of which the operation is a member
are invoked.
- The referenced object must remain valid for a specific period of
time after the operation is invoked (Yuk.)
- The referenced object must remain valid until some specific event
occurs.
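Since the language cannot express any of these answers, one option is simply to state the answer in a comment on each operation. The sketch below does this for a hypothetical Journal class (the class and all of its members are assumptions for illustration), covering the first answer and the "until a specified operation is invoked" answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface whose comments answer the lifetime question.
class Journal {
public:
    // `entry` is never accessed after add() returns; a copy is kept.
    void add(const std::string& entry) { entries_.push_back(entry); }

    // `sink` must remain valid until detach() is invoked or this
    // Journal is destroyed, whichever comes first.
    void attach(std::vector<std::string>& sink) { sink_ = &sink; }

    // Ends the lifetime requirement that attach() placed on `sink`.
    void detach() { sink_ = nullptr; }

    // Moves the stored entries into the attached sink, if there is one,
    // and returns the number of entries moved (0 when nothing is attached).
    std::size_t flush()
    {
        if (!sink_) {
            return 0;
        }
        const std::size_t count = entries_.size();
        sink_->insert(sink_->end(), entries_.begin(), entries_.end());
        entries_.clear();
        return count;
    }

private:
    std::vector<std::string> entries_;
    std::vector<std::string>* sink_ = nullptr;
};
```

The signatures of add() and attach() both take a reference, and only the comments distinguish the two lifetime contracts.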