| 1 | <!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| 2 | <html> |
| 3 | <!-- |
| 4 | (C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com . |
| 5 | Use, modification and distribution is subject to the Boost Software |
| 6 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
| 7 | http://www.boost.org/LICENSE_1_0.txt) |
| 8 | --> |
| 9 | <head> |
| 10 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| 11 | <link rel="stylesheet" type="text/css" href="../../../boost.css"> |
| 12 | <link rel="stylesheet" type="text/css" href="style.css"> |
| 13 | <title>Serialization - Dataflow Iterators</title> |
| 14 | </head> |
| 15 | <body link="#0000ff" vlink="#800080"> |
| 16 | <table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header"> |
| 17 | <tr> |
| 18 | <td valign="top" width="300"> |
| 19 | <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3> |
| 20 | </td> |
| 21 | <td valign="top"> |
| 22 | <h1 align="center">Serialization</h1> |
| 23 | <h2 align="center">Dataflow Iterators</h2> |
| 24 | </td> |
| 25 | </tr> |
| 26 | </table> |
| 27 | <hr> |
| 28 | <h3>Motivation</h3> |
| 29 | Consider the problem of translating an arbitrary length sequence of 8 bit bytes |
| 30 | to base64 text. Such a process can be summarized as: |
| 31 | <p> |
| 32 | source => 8 bit bytes => 6 bit integers => encode to base64 characters => insert line breaks => destination |
| 33 | <p> |
| 34 | We would prefer a solution that is: |
| 35 | <ul> |
| 36 | <li>Decomposable, so we can code, test, verify and use each (simple) stage of the conversion |
| 37 | independently. |
| 38 | <li>Composable, so we can use this composite as a new component somewhere else. |
| 39 | <li>Efficient, so we're not required to re-implement it later. |
| 40 | <li>Scalable, so that it works well for short and arbitrarily long sequences. |
| 41 | </ul> |
| 42 | The approach that comes closest to meeting these requirements is that described |
| 43 | and implemented with <a href="../../iterator/doc/index.html">Iterator Adaptors</a>. |
| 44 | The fundamental feature of an Iterator Adaptor template that makes it interesting to |
| 45 | us is that it takes as a parameter a base iterator from which it derives its |
| 46 | input. This suggests that something like the following might be possible. |
| 47 | <pre><code> |
| 48 | typedef |
| 49 | insert_linebreaks< // insert line breaks every 76 characters |
| 50 | base64_from_binary< // convert binary values to base64 characters |
| 51 | transform_width< // retrieve 6 bit integers from a sequence of 8 bit bytes |
| 52 | const char *, |
| 53 | 6, |
| 54 | 8 |
| 55 | > |
| 56 | > |
| 57 | ,76 |
| 58 | > |
| 59 | base64_text; // compose all the above operations into a new iterator |
| 60 | |
| 61 | std::copy( |
| 62 | base64_text(address), |
| 63 | base64_text(address + count), |
| 64 | ostream_iterator<char>(os) |
| 65 | ); |
| 66 | </code></pre> |
| 67 | Indeed, this seems to be exactly the kind of problem that iterator adaptors are |
| 68 | intended to address. The Iterator Adaptor library already includes |
| 69 | modules which can be configured to implement some of the operations above. For example, |
| 70 | included is <a target="transform_iterator" href="../../iterator/doc/transform_iterator.html"> |
| 71 | transform_iterator</a>, which can be used to implement 6 bit integer => base64 code. |
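<p>
As a point of reference, the stages of the pipeline above can be sketched as plain functions over strings and vectors. This is only an illustration of what each stage computes, not the library's iterator implementation. The names <code>to_six_bit</code>, <code>to_base64_chars</code>, <code>insert_breaks</code> and <code>encode_base64</code> are hypothetical, and the sketch assumes no <code>=</code> padding is appended.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stage 1 (hypothetical helper): regroup 8 bit bytes into 6 bit integers (0..63).
// Leftover bits at the end of the input are padded with zero bits.
std::vector<unsigned> to_six_bit(const std::string& bytes) {
    std::vector<unsigned> out;
    unsigned buffer = 0;  // bits read from the input but not yet emitted
    int bits = 0;         // number of valid bits held in buffer
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        buffer = (buffer << 8) | static_cast<unsigned char>(bytes[i]);
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            out.push_back((buffer >> bits) & 0x3Fu);
        }
        buffer &= (1u << bits) - 1u;  // keep only the leftover bits
    }
    if (bits > 0)
        out.push_back((buffer << (6 - bits)) & 0x3Fu);
    return out;
}

// Stage 2 (hypothetical helper): map each 6 bit integer to a base64 character.
std::string to_base64_chars(const std::vector<unsigned>& sixes) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < sixes.size(); ++i)
        out += alphabet[sixes[i]];
    return out;
}

// Stage 3 (hypothetical helper): insert a newline after every `width` characters.
// Assumes width > 0. No newline is added after the final character.
std::string insert_breaks(const std::string& text, std::size_t width) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i > 0 && i % width == 0)
            out += '\n';
        out += text[i];
    }
    return out;
}

// Composition of the three stages. Note: no '=' padding is produced.
std::string encode_base64(const std::string& bytes, std::size_t width) {
    return insert_breaks(to_base64_chars(to_six_bit(bytes)), width);
}
```

Unlike the iterator composition, each function here materializes a complete intermediate sequence, which is the overhead the iterator approach is meant to avoid for long inputs.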
| 72 | |
| 73 | <h3>Dataflow Iterators</h3> |
| 74 | Unfortunately, not all iterators which inherit from Iterator Adaptors are guaranteed |
| 75 | to meet the composability goals stated above. To accomplish this purpose, they have |
| 76 | to be written with some additional considerations in mind. |
| 77 | |
| 78 | We define a Dataflow Iterator as a class inherited from <code style="white-space: normal">iterator_adaptor</code> which |
| 79 | fulfills a small set of additional requirements. |
| 80 | |
| 81 | <h4>Templated Constructors</h4> |
| 82 | <p> |
| 83 | Templated constructors have the form: |
| 84 | <pre><code> |
| 85 | template<class T> |
| 86 | dataflow_iterator(T start) : |
| 87 | iterator_adaptor(Base(start)) |
| 88 | {} |
| 89 | </code></pre> |
| 90 | When these constructors are applied to the base64 example above, the following code is generated: |
| 91 | <pre><code> |
| 92 | std::copy( |
| 93 | insert_linebreaks( |
| 94 | base64_from_binary( |
| 95 | transform_width( |
| 96 | address |
| 97 | ) |
| 98 | ) |
| 99 | ), |
| 100 | insert_linebreaks( |
| 101 | base64_from_binary( |
| 102 | transform_width( |
| 103 | address + count |
| 104 | ) |
| 105 | ) |
| 106 | ), |
| 107 | ostream_iterator<char>(os) |
| 108 | ); |
| 109 | </code></pre> |
| 110 | The recursive application of this template is what automatically generates the |
| 111 | constructor <code style="white-space: normal">base64_text(const char *)</code> in our example above. The original |
| 112 | Iterator Adaptors include <code style="white-space: normal">make_xxx_iterator</code> functions to fulfill this role. |
| 113 | However, I believe these are unwieldy to use compared to the above solution using |
| 114 | templated constructors. |
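<p>
The recursive forwarding done by templated constructors can be sketched with two small stages that do not depend on <code>iterator_adaptor</code>. Everything here (<code>add_one</code>, <code>times_two</code>, <code>composed</code>, <code>run_pipeline</code>) is hypothetical and exists only to show how one constructor argument travels down to the innermost base iterator.

```cpp
#include <vector>

// Hypothetical stage: yields each base element plus one.
template<class Base>
class add_one {
    Base base_;
public:
    // Templated constructor: takes whatever the innermost iterator is built
    // from and hands it to the base, which repeats the same step.
    template<class T>
    add_one(T start) : base_(Base(start)) {}

    int operator*() const { return *base_ + 1; }
    add_one& operator++() { ++base_; return *this; }
    bool operator==(const add_one& rhs) const { return base_ == rhs.base_; }
    bool operator!=(const add_one& rhs) const { return !(*this == rhs); }
};

// Hypothetical stage: yields each base element multiplied by two.
template<class Base>
class times_two {
    Base base_;
public:
    template<class T>
    times_two(T start) : base_(Base(start)) {}

    int operator*() const { return *base_ * 2; }
    times_two& operator++() { ++base_; return *this; }
    bool operator==(const times_two& rhs) const { return base_ == rhs.base_; }
    bool operator!=(const times_two& rhs) const { return !(*this == rhs); }
};

// Compose the stages, in the same style as the base64_text typedef.
typedef times_two<add_one<const int*> > composed;

// Build both ends of the range directly from raw pointers and collect the output.
std::vector<int> run_pipeline(const int* first, const int* last) {
    std::vector<int> out;
    composed it(first);   // forwards `first` through times_two into add_one
    composed end(last);
    for (; it != end; ++it)
        out.push_back(*it);
    return out;
}
```

The caller never names the intermediate types when constructing <code>composed</code>, which is the convenience being claimed over the <code>make_xxx_iterator</code> style.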
| 115 | <p> |
| 116 | Unfortunately, some compilers that fail to properly support partial function template |
| 117 | ordering cannot support the concept of a templated constructor as implemented above. |
| 118 | A special "wrapper" macro has been created to work around this problem. With this "wrapper", |
| 119 | the above example is modified to: |
| 120 | <pre><code> |
| 121 | std::copy( |
| 122 | base64_text(BOOST_MAKE_PFTO_WRAPPER(address)), |
| 123 | base64_text(BOOST_MAKE_PFTO_WRAPPER(address + count)), |
| 124 | ostream_iterator<char>(os) |
| 125 | ); |
| 126 | </code></pre> |
| 127 | This macro is defined in <a target="pfto" href="../../../boost/serialization/pfto.hpp"><boost/serialization/pfto.hpp></a>. |
| 128 | For more information about this topic, check the source. |
| 129 | |
| 130 | <h4>Dereferencing</h4> |
| 131 | Dereferencing some iterators can cause problems. For example, a natural |
| 132 | way to write a <code style="white-space: normal">remove_whitespace</code> iterator is to increment past the initial |
| 133 | whitespace when the iterator is constructed. This will fail if the iterator passed to the |
| 134 | constructor "points" to the end of a string, since that position must not be dereferenced. The |
| 135 | <a target="filter_iterator" href="../../iterator/doc/filter_iterator.html"> |
| 136 | <code style="white-space: normal">filter_iterator</code></a> is implemented |
| 137 | in this way, so it can't be used in our context. Instead, in our implementation of this iterator, |
| 138 | whitespace removal is deferred until the iterator is actually dereferenced. |
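<p>
The deferred approach can be sketched as follows. This is a simplified, hypothetical iterator (<code>lazy_remove_whitespace</code>) over <code>const char *</code>, not the library's <code>remove_whitespace</code>. It assumes the range does not end in whitespace, since skipping would otherwise run up to or past the end position.

```cpp
#include <cctype>
#include <string>

// Hypothetical iterator that drops whitespace from a character range.
class lazy_remove_whitespace {
    const char* base_;
    bool skipped_;  // true once whitespace at the current position has been skipped
public:
    // The constructor only stores the position. It never dereferences it,
    // so it is safe to construct from an end-of-range pointer.
    explicit lazy_remove_whitespace(const char* start)
        : base_(start), skipped_(false) {}

    // Whitespace is skipped here, on first dereference of each position.
    char operator*() {
        if (!skipped_) {
            while (std::isspace(static_cast<unsigned char>(*base_)))
                ++base_;
            skipped_ = true;
        }
        return *base_;
    }

    lazy_remove_whitespace& operator++() {
        **this;            // make sure we sit on a non-whitespace character
        ++base_;           // then step past it
        skipped_ = false;  // the next position has not been examined yet
        return *this;
    }

    bool operator==(const lazy_remove_whitespace& rhs) const {
        return base_ == rhs.base_;
    }
    bool operator!=(const lazy_remove_whitespace& rhs) const {
        return !(*this == rhs);
    }
};

// Copy [first, last) with whitespace removed.
// Assumption: the range does not end in whitespace.
std::string strip_whitespace(const char* first, const char* last) {
    std::string out;
    lazy_remove_whitespace it(first);
    lazy_remove_whitespace end(last);  // constructed without touching *last
    for (; it != end; ++it)
        out += *it;
    return out;
}
```

Note that dereferencing moves the base pointer forward here, so two copies of the same iterator can compare unequal after one of them has been dereferenced.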
| 139 | |
| 140 | <h4>Comparison</h4> |
| 141 | The default implementation of iterator equality in <code style="white-space: normal">iterator_adaptor</code> just |
| 142 | invokes the equality operator on the base iterators. Generally this is satisfactory. |
| 143 | However, this requires that other operations (e.g. dereference) do not prematurely |
| 144 | increment the base iterator. Avoiding this can be surprisingly tricky in some cases |
| 145 | (e.g. <code style="white-space: normal">transform_width</code>). |
| 146 | |
| 147 | <p> |
| 148 | Iterators which fulfill the above requirements should be composable and the above sample |
| 149 | code should implement our binary to base64 conversion. |
| 150 | |
| 151 | <h3>Iterators Included in the Library</h3> |
| 152 | Dataflow iterators for the serialization library are all defined in the namespace |
| 153 | <code style="white-space: normal">boost::archive::iterators</code>. Included here are: |
| 154 | <dl class="index"> |
| 155 | <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/base64_from_binary.hpp"> |
| 156 | base64_from_binary</a></dt> |
| 157 | <dd>transforms a sequence of integers to base64 text</dd> |
| 158 | |
| 159 | <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/binary_from_base64.hpp"> |
| 160 | binary_from_base64</a></dt> |
| 161 | <dd>transforms a sequence of base64 characters to a sequence of integers</dd> |
| 162 | |
| 163 | <dt><a target="insert_linebreaks" href="../../../boost/archive/iterators/insert_linebreaks.hpp"> |
| 164 | insert_linebreaks</a></dt> |
| 165 | <dd>given a sequence, creates a sequence with newline characters inserted</dd> |
| 166 | |
| 167 | <dt><a target="mb_from_wchar" href="../../../boost/archive/iterators/mb_from_wchar.hpp"> |
| 168 | mb_from_wchar</a></dt> |
| 169 | <dd>transforms a sequence of wide characters to a sequence of multi-byte characters</dd> |
| 170 | |
| 171 | <dt><a target="remove_whitespace" href="../../../boost/archive/iterators/remove_whitespace.hpp"> |
| 172 | remove_whitespace</a></dt> |
| 173 | <dd>given a sequence of characters, returns a sequence with the whitespace characters |
| 174 | removed. This is a derivation from the <code style="white-space: normal">boost::filter_iterator</code></dd> |
| 175 | |
| 176 | <dt><a target="transform_width" href="../../../boost/archive/iterators/transform_width.hpp"> |
| 177 | transform_width</a></dt> |
| 178 | <dd>transforms a sequence of x bit elements into a sequence of y bit elements. This |
| 179 | is a key component in iterators which translate to and from base64 text.</dd> |
| 180 | |
| 181 | <dt><a target="wchar_from_mb" href="../../../boost/archive/iterators/wchar_from_mb.hpp"> |
| 182 | wchar_from_mb</a></dt> |
| 183 | <dd>transforms a sequence of multi-byte characters in the current locale to wide characters.</dd> |
| 184 | |
| 185 | <dt><a target="xml_escape" href="../../../boost/archive/iterators/xml_escape.hpp"> |
| 186 | xml_escape</a></dt> |
| 187 | <dd>escapes XML meta-characters in XML text</dd> |
| 188 | |
| 189 | <dt><a target="xml_unescape" href="../../../boost/archive/iterators/xml_unescape.hpp"> |
| 190 | xml_unescape</a></dt> |
| 191 | <dd>unescapes XML escape sequences to create a sequence of normal text</dd> |
| 192 | </dl> |
| 193 | <p> |
| 194 | The standard stream iterators don't quite work for us. On systems which implement <code style="white-space: normal">wchar_t</code> |
| 195 | as unsigned short integers (e.g. VC 6) they didn't function as I expected. I also made some |
| 196 | adjustments to be consistent with our concept of Dataflow Iterators. Like the rest of our |
| 197 | iterators, they are found in the namespace <code style="white-space: normal">boost::archive::iterators</code> to avoid |
| 198 | conflicts with the standard library versions. |
| 199 | <dl class = "index"> |
| 200 | <dt><a target="istream_iterator" href="../../../boost/archive/iterators/istream_iterator.hpp"> |
| 201 | istream_iterator</a></dt> |
| 202 | <dt><a target="ostream_iterator" href="../../../boost/archive/iterators/ostream_iterator.hpp"> |
| 203 | ostream_iterator</a></dt> |
| 204 | </dl> |
| 205 | |
| 206 | <hr> |
| 207 | <p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004. |
| 208 | Distributed under the Boost Software License, Version 1.0. (See |
| 209 | accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
| 210 | </i></p> |
| 211 | </body> |
| 212 | </html> |