Brian Silverman | af2eaa8 | 2018-08-04 17:28:31 -0700 | [diff] [blame^] | 1 | [/ |
| 2 | Boost.Optional |
| 3 | |
| 4 | Copyright (c) 2003-2007 Fernando Luis Cacciola Carballal |
| 5 | |
| 6 | Distributed under the Boost Software License, Version 1.0. |
| 7 | (See accompanying file LICENSE_1_0.txt or copy at |
| 8 | http://www.boost.org/LICENSE_1_0.txt) |
| 9 | ] |
| 10 | |
| 11 | [section Design Overview] |
| 12 | |
| 13 | [section The models] |
| 14 | |
| 15 | In C++, we can ['declare] an object (a variable) of type `T`, and we can give this |
| 16 | variable an ['initial value] (through an ['initializer]. (cf. 8.5)). |
| 17 | When a declaration includes a non-empty initializer (an initial value is given), |
| 18 | it is said that the object has been initialized. |
| 19 | If the declaration uses an empty initializer (no initial value is given), and |
| 20 | neither default nor value initialization applies, it is said that the object is |
| 21 | [*uninitialized]. Its actual value exist but has an ['indeterminate initial value] |
| 22 | (cf. 8.5/11). |
| 23 | `optional<T>` intends to formalize the notion of initialization (or lack of it) |
| 24 | allowing a program to test whether an object has been initialized and stating |
| 25 | that access to the value of an uninitialized object is undefined behavior. That |
| 26 | is, when a variable is declared as `optional<T>` and no initial value is given, |
| 27 | the variable is ['formally] uninitialized. A formally uninitialized optional object |
| 28 | has conceptually no value at all and this situation can be tested at runtime. It |
| 29 | is formally ['undefined behavior] to try to access the value of an uninitialized |
| 30 | optional. An uninitialized optional can be assigned a value, in which case its initialization state changes to initialized. Furthermore, given the formal |
| 31 | treatment of initialization states in optional objects, it is even possible to |
| 32 | reset an optional to ['uninitialized]. |
| 33 | |
| 34 | In C++ there is no formal notion of uninitialized objects, which means that |
| 35 | objects always have an initial value even if indeterminate. |
| 36 | As discussed on the previous section, this has a drawback because you need |
| 37 | additional information to tell if an object has been effectively initialized. |
| 38 | One of the typical ways in which this has been historically dealt with is via |
| 39 | a special value: `EOF`, `npos`, -1, etc... This is equivalent to adding the |
| 40 | special value to the set of possible values of a given type. This super set of |
| 41 | `T` plus some ['nil_t]—where `nil_t` is some stateless POD—can be modeled in modern |
| 42 | languages as a [*discriminated union] of T and nil_t. Discriminated unions are |
| 43 | often called ['variants]. A variant has a ['current type], which in our case is either |
| 44 | `T` or `nil_t`. |
| 45 | Using the __BOOST_VARIANT__ library, this model can be implemented in terms of `boost::variant<T,nil_t>`. |
| 46 | There is precedent for a discriminated union as a model for an optional value: |
| 47 | the __HASKELL__ [*Maybe] built-in type constructor. Thus, a discriminated union |
| 48 | `T+nil_t` serves as a conceptual foundation. |
| 49 | |
| 50 | A `variant<T,nil_t>` follows naturally from the traditional idiom of extending |
| 51 | the range of possible values adding an additional sentinel value with the |
| 52 | special meaning of ['Nothing]. However, this additional ['Nothing] value is largely |
| 53 | irrelevant for our purpose since our goal is to formalize the notion of |
| 54 | uninitialized objects and, while a special extended value can be used to convey |
| 55 | that meaning, it is not strictly necessary in order to do so. |
| 56 | |
| 57 | The observation made in the last paragraph about the irrelevant nature of the |
| 58 | additional `nil_t` with respect to [_purpose] of `optional<T>` suggests an |
| 59 | alternative model: a ['container] that either has a value of `T` or nothing. |
| 60 | |
| 61 | As of this writing I don't know of any precedent for a variable-size |
| 62 | fixed-capacity (of 1) stack-based container model for optional values, yet I |
| 63 | believe this is the consequence of the lack of practical implementations of |
| 64 | such a container rather than an inherent shortcoming of the container model. |
| 65 | |
| 66 | In any event, both the discriminated-union or the single-element container |
| 67 | models serve as a conceptual ground for a class representing optional—i.e. |
| 68 | possibly uninitialized—objects. |
| 69 | For instance, these models show the ['exact] semantics required for a wrapper |
| 70 | of optional values: |
| 71 | |
| 72 | Discriminated-union: |
| 73 | |
| 74 | * [*deep-copy] semantics: copies of the variant implies copies of the value. |
| 75 | * [*deep-relational] semantics: comparisons between variants matches both |
| 76 | current types and values |
| 77 | * If the variant's current type is `T`, it is modeling an ['initialized] optional. |
| 78 | * If the variant's current type is not `T`, it is modeling an ['uninitialized] |
| 79 | optional. |
| 80 | * Testing if the variant's current type is `T` models testing if the optional |
| 81 | is initialized |
| 82 | * Trying to extract a `T` from a variant when its current type is not `T`, models |
| 83 | the undefined behavior of trying to access the value of an uninitialized optional |
| 84 | |
| 85 | Single-element container: |
| 86 | |
| 87 | * [*deep-copy] semantics: copies of the container implies copies of the value. |
| 88 | * [*deep-relational] semantics: comparisons between containers compare container |
| 89 | size and if match, contained value |
| 90 | * If the container is not empty (contains an object of type `T`), it is modeling |
| 91 | an ['initialized] optional. |
| 92 | * If the container is empty, it is modeling an ['uninitialized] optional. |
| 93 | * Testing if the container is empty models testing if the optional is |
| 94 | initialized |
| 95 | * Trying to extract a `T` from an empty container models the undefined behavior |
| 96 | of trying to access the value of an uninitialized optional |
| 97 | |
| 98 | [endsect] |
| 99 | |
| 100 | [section The semantics] |
| 101 | |
| 102 | Objects of type `optional<T>` are intended to be used in places where objects of |
| 103 | type `T` would but which might be uninitialized. Hence, `optional<T>`'s purpose is |
| 104 | to formalize the additional possibly uninitialized state. |
| 105 | From the perspective of this role, `optional<T>` can have the same operational |
| 106 | semantics of `T` plus the additional semantics corresponding to this special |
| 107 | state. |
| 108 | As such, `optional<T>` could be thought of as a ['supertype] of `T`. Of course, we |
| 109 | can't do that in C++, so we need to compose the desired semantics using a |
| 110 | different mechanism. |
| 111 | Doing it the other way around, that is, making `optional<T>` a ['subtype] of `T` |
| 112 | is not only conceptually wrong but also impractical: it is not allowed to |
| 113 | derive from a non-class type, such as a built-in type. |
| 114 | |
| 115 | We can draw from the purpose of `optional<T>` the required basic semantics: |
| 116 | |
| 117 | * [*Default Construction:] To introduce a formally uninitialized wrapped |
| 118 | object. |
| 119 | * [*Direct Value Construction via copy:] To introduce a formally initialized |
| 120 | wrapped object whose value is obtained as a copy of some object. |
| 121 | * [*Deep Copy Construction:] To obtain a new yet equivalent wrapped object. |
| 122 | * [*Direct Value Assignment (upon initialized):] To assign a value to the |
| 123 | wrapped object. |
| 124 | * [*Direct Value Assignment (upon uninitialized):] To initialize the wrapped |
| 125 | object with a value obtained as a copy of some object. |
| 126 | * [*Assignment (upon initialized):] To assign to the wrapped object the value |
| 127 | of another wrapped object. |
| 128 | * [*Assignment (upon uninitialized):] To initialize the wrapped object with |
| 129 | value of another wrapped object. |
| 130 | * [*Deep Relational Operations (when supported by the type T):] To compare |
| 131 | wrapped object values taking into account the presence of uninitialized states. |
| 132 | * [*Value access:] To unwrap the wrapped object. |
| 133 | * [*Initialization state query:] To determine if the object is formally |
| 134 | initialized or not. |
| 135 | * [*Swap:] To exchange wrapped objects. (with whatever exception safety |
| 136 | guarantees are provided by `T`'s swap). |
| 137 | * [*De-initialization:] To release the wrapped object (if any) and leave the |
| 138 | wrapper in the uninitialized state. |
| 139 | |
| 140 | Additional operations are useful, such as converting constructors and |
| 141 | converting assignments, in-place construction and assignment, and safe |
| 142 | value access via a pointer to the wrapped object or null. |
| 143 | |
| 144 | [endsect] |
| 145 | |
| 146 | [section The Interface] |
| 147 | |
| 148 | Since the purpose of optional is to allow us to use objects with a formal |
| 149 | uninitialized additional state, the interface could try to follow the |
| 150 | interface of the underlying `T` type as much as possible. In order to choose |
| 151 | the proper degree of adoption of the native `T` interface, the following must |
| 152 | be noted: Even if all the operations supported by an instance of type `T` are |
| 153 | defined for the entire range of values for such a type, an `optional<T>` |
| 154 | extends such a set of values with a new value for which most |
| 155 | (otherwise valid) operations are not defined in terms of `T`. |
| 156 | |
| 157 | Furthermore, since `optional<T>` itself is merely a `T` wrapper (modeling a `T` |
| 158 | supertype), any attempt to define such operations upon uninitialized optionals |
| 159 | will be totally artificial w.r.t. `T`. |
| 160 | |
| 161 | This library chooses an interface which follows from `T`'s interface only for |
| 162 | those operations which are well defined (w.r.t the type `T`) even if any of the |
| 163 | operands are uninitialized. These operations include: construction, |
| 164 | copy-construction, assignment, swap and relational operations. |
| 165 | |
| 166 | For the value access operations, which are undefined (w.r.t the type `T`) when |
| 167 | the operand is uninitialized, a different interface is chosen (which will be |
| 168 | explained next). |
| 169 | |
| 170 | Also, the presence of the possibly uninitialized state requires additional |
| 171 | operations not provided by `T` itself which are supported by a special interface. |
| 172 | |
| 173 | [heading Lexically-hinted Value Access in the presence of possibly |
| 174 | uninitialized optional objects: The operators * and ->] |
| 175 | |
| 176 | A relevant feature of a pointer is that it can have a [*null pointer value]. |
| 177 | This is a ['special] value which is used to indicate that the pointer is not |
| 178 | referring to any object at all. In other words, null pointer values convey |
| 179 | the notion of nonexistent objects. |
| 180 | |
| 181 | This meaning of the null pointer value allowed pointers to became a ['de |
| 182 | facto] standard for handling optional objects because all you have to do |
| 183 | to refer to a value which you don't really have is to use a null pointer |
| 184 | value of the appropriate type. Pointers have been used for decades—from |
| 185 | the days of C APIs to modern C++ libraries—to ['refer] to optional (that is, |
| 186 | possibly nonexistent) objects; particularly as optional arguments to a |
| 187 | function, but also quite often as optional data members. |
| 188 | |
| 189 | The possible presence of a null pointer value makes the operations that |
| 190 | access the pointee's value possibly undefined, therefore, expressions which |
| 191 | use dereference and access operators, such as: `( *p = 2 )` and `( p->foo() )`, |
| 192 | implicitly convey the notion of optionality, and this information is tied to |
| 193 | the ['syntax] of the expressions. That is, the presence of operators `*` and `->` |
| 194 | tell by themselves —without any additional context— that the expression will |
| 195 | be undefined unless the implied pointee actually exist. |
| 196 | |
| 197 | Such a ['de facto] idiom for referring to optional objects can be formalized |
| 198 | in the form of a concept: the __OPTIONAL_POINTEE__ concept. |
| 199 | This concept captures the syntactic usage of operators `*`, `->` and |
| 200 | contextual conversion to `bool` to convey the notion of optionality. |
| 201 | |
| 202 | However, pointers are good to [_refer] to optional objects, but not particularly |
| 203 | good to handle the optional objects in all other respects, such as initializing |
| 204 | or moving/copying them. The problem resides in the shallow-copy of pointer |
| 205 | semantics: if you need to effectively move or copy the object, pointers alone |
| 206 | are not enough. The problem is that copies of pointers do not imply copies of |
| 207 | pointees. For example, as was discussed in the motivation, pointers alone |
| 208 | cannot be used to return optional objects from a function because the object |
| 209 | must move outside from the function and into the caller's context. |
| 210 | |
| 211 | A solution to the shallow-copy problem that is often used is to resort to |
| 212 | dynamic allocation and use a smart pointer to automatically handle the details |
| 213 | of this. For example, if a function is to optionally return an object `X`, it can |
| 214 | use `shared_ptr<X>` as the return value. However, this requires dynamic allocation |
| 215 | of `X`. If `X` is a built-in or small POD, this technique is very poor in terms of |
| 216 | required resources. Optional objects are essentially values so it is very |
| 217 | convenient to be able to use automatic storage and deep-copy semantics to |
| 218 | manipulate optional values just as we do with ordinary values. Pointers do |
| 219 | not have this semantics, so are inappropriate for the initialization and |
| 220 | transport of optional values, yet are quite convenient for handling the access |
| 221 | to the possible undefined value because of the idiomatic aid present in the |
| 222 | __OPTIONAL_POINTEE__ concept incarnated by pointers. |
| 223 | |
| 224 | |
| 225 | [heading Optional<T> as a model of OptionalPointee] |
| 226 | |
| 227 | For value access operations `optional<>` uses operators `*` and `->` to |
| 228 | lexically warn about the possibly uninitialized state appealing to the |
| 229 | familiar pointer semantics w.r.t. to null pointers. |
| 230 | |
| 231 | [caution |
| 232 | However, it is particularly important to note that `optional<>` objects |
| 233 | are not pointers. [_`optional<>` is not, and does not model, a pointer]. |
| 234 | ] |
| 235 | |
| 236 | For instance, `optional<>` does not have shallow-copy so does not alias: |
| 237 | two different optionals never refer to the ['same] value unless `T` itself is |
| 238 | a reference (but may have ['equivalent] values). |
| 239 | The difference between an `optional<T>` and a pointer must be kept in mind, |
| 240 | particularly because the semantics of relational operators are different: |
| 241 | since `optional<T>` is a value-wrapper, relational operators are deep: they |
| 242 | compare optional values; but relational operators for pointers are shallow: |
| 243 | they do not compare pointee values. |
| 244 | As a result, you might be able to replace `optional<T>` by `T*` on some |
| 245 | situations but not always. Specifically, on generic code written for both, |
| 246 | you cannot use relational operators directly, and must use the template |
| 247 | functions __FUNCTION_EQUAL_POINTEES__ and __FUNCTION_LESS_POINTEES__ instead. |
| 248 | |
| 249 | [endsect] |
| 250 | |
| 251 | [endsect] |