| 1 | [/ |
| 2 | / Copyright (c) 2012 Marshall Clow |
| 3 | / |
| 4 | / Distributed under the Boost Software License, Version 1.0. (See accompanying |
| 5 | / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
| 6 | /] |
| 7 | |
| 8 | [article String_Ref |
| 9 | [quickbook 1.5] |
| 10 | [authors [Clow, Marshall]] |
| 11 | [copyright 2012 Marshall Clow] |
| 12 | [license |
| 13 | Distributed under the Boost Software License, Version 1.0. |
| 14 | (See accompanying file LICENSE_1_0.txt or copy at |
| 15 | [@http://www.boost.org/LICENSE_1_0.txt]) |
| 16 | ] |
| 17 | ] |
| 18 | |
| 19 | [/===============] |
| 20 | [section Overview] |
| 21 | [/===============] |
| 22 | |
| 23 | Boost.StringRef is an implementation of Jeffrey Yasskin's [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3442.html N3442: |
| 24 | string_ref: a non-owning reference to a string]. |
| 25 | |
| 26 | When you are parsing or processing strings from some external source, you frequently want to pass a piece of text to a procedure for specialized processing. The canonical way to do this is to pass a `std::string`, but that has certain drawbacks: |
| 27 | |
| 28 | 1) If you are processing a buffer of text (say an HTTP response or the contents of a file), then you have to create the string from the text you want to pass, which involves memory allocation and copying of data. |
| 29 | |
| 30 | 2) If a routine receives a constant `std::string` and wants to pass a portion of that string to another routine, then it must create a new string from that substring. |
| 31 | |
| 32 | 3) If a routine receives a constant `std::string` and wants to return a portion of that string, then it must create a new string to return. |
| 33 | |
| 34 | `string_ref` is designed to solve these efficiency problems. A `string_ref` is a read-only reference to a contiguous sequence of characters, and provides much of the functionality of `std::string`. A `string_ref` is cheap to create, copy and pass by value, because it does not actually own the storage that it points to. |
| 35 | |
| 36 | A `string_ref` is implemented as a small struct that contains a pointer to the start of the character data and a count. |
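
The pointer-plus-count layout can be pictured with a minimal sketch. The type `tiny_ref` below is hypothetical and is not the Boost implementation; it only assumes the two members described above, and its `substr` clamps out-of-range arguments as a simplification.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical illustration only; this is not boost::string_ref.
struct tiny_ref {
    const char* ptr;   // start of the character data (not owned)
    std::size_t len;   // number of characters referred to

    tiny_ref(const char* s) : ptr(s), len(std::strlen(s)) {}
    tiny_ref(const char* s, std::size_t n) : ptr(s), len(n) {}
    tiny_ref(const std::string& s) : ptr(s.data()), len(s.size()) {}

    // Narrowing the view adjusts only the pointer and the count;
    // no characters are copied and nothing is allocated.
    // Simplification: out-of-range arguments are clamped.
    tiny_ref substr(std::size_t pos, std::size_t n) const {
        if (pos > len) pos = len;
        if (n > len - pos) n = len - pos;
        return tiny_ref(ptr + pos, n);
    }
};
```

Copying a `tiny_ref` copies one pointer and one integer, which is why passing such a type by value is cheap regardless of how long the referenced text is.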
| 37 | |
| 38 | `string_ref` acts as a container; it includes all the methods that you would expect in a container, including iteration support, `operator []`, `at` and `size`. It can be used with any of the iterator-based algorithms in the STL, as long as you don't need to change the underlying data (`sort` and `remove`, for example, will not work). |
| 39 | |
| 40 | Besides generic container functionality, `string_ref` provides a subset of the interface of `std::string`. This makes it easy to replace parameters of type `const std::string &` with `boost::string_ref`. Like `std::string`, `string_ref` has a static member variable named `npos` to denote the result of failed searches, and to mean "the end". |
| 41 | |
| 42 | Because a `string_ref` does not own the data that it "points to", it introduces lifetime issues into code that uses it. The programmer must ensure that the data that a `string_ref` refers to exists as long as the `string_ref` does. |
| 43 | |
| 44 | [endsect] |
| 45 | |
| 46 | |
| 47 | [/===============] |
| 48 | [section Examples] |
| 49 | [/===============] |
| 50 | |
| 51 | Integrating `string_ref` into your code is fairly simple. Wherever you pass a `const std::string &` or `std::string` as a parameter, that's a candidate for passing a `boost::string_ref`. |
| 52 | |
| 53 | std::string extract_part ( const std::string &bar ) { |
| 54 | return bar.substr ( 2, 3 ); |
| 55 | } |
| 56 | |
| 57 | if ( extract_part ( "ABCDEFG" ).front() == 'C' ) { /* do something */ } |
| 58 | |
| 59 | Let's figure out what happens in this (contrived) example. |
| 60 | |
| 61 | First, a temporary string is created from the string literal `"ABCDEFG"`, and it is passed (by reference) to the routine `extract_part`. Then a second string is created in the call `std::string::substr` and returned to `extract_part` (this copy may be elided by RVO). Then `extract_part` returns that string back to the caller (again this copy may be elided). The first temporary string is deallocated, and `front` is called on the second string, and then it is deallocated as well. |
| 62 | |
| 63 | Two `std::string`s are created, and two copy operations are performed (unless they are elided). That's (potentially) four memory allocations and deallocations, and the associated copying of data. |
| 64 | |
| 65 | Now let's look at the same code with `string_ref`: |
| 66 | |
| 67 | boost::string_ref extract_part ( boost::string_ref bar ) { |
| 68 | return bar.substr ( 2, 3 ); |
| 69 | } |
| 70 | |
| 71 | if ( extract_part ( "ABCDEFG" ).front() == 'C' ) { /* do something */ } |
| 72 | |
| 73 | No memory allocations. No copying of character data. No changes to the code other than the types. There are two `string_ref`s created, and two `string_ref`s copied, but those are cheap operations. |
| 74 | |
| 75 | [endsect] |
| 76 | |
| 77 | |
| 78 | [/=================] |
| 79 | [section:reference Reference ] |
| 80 | [/=================] |
| 81 | |
| 82 | The header file "string_ref.hpp" defines a template `boost::basic_string_ref`, and four specializations, for `char`, `wchar_t`, `char16_t` and `char32_t`. |
| 83 | |
| 84 | `#include <boost/utility/string_ref.hpp>` |
| 85 | |
| 86 | Construction and copying: |
| 87 | |
| 88 | BOOST_CONSTEXPR basic_string_ref (); // Constructs an empty string_ref |
| 89 | BOOST_CONSTEXPR basic_string_ref(const charT* str); // Constructs from a null-terminated string |
| 90 | BOOST_CONSTEXPR basic_string_ref(const charT* str, size_type len); // Constructs from a pointer, length pair |
| 91 | template<typename Allocator> |
| 92 | basic_string_ref(const std::basic_string<charT, traits, Allocator>& str); // Constructs from a std::string |
| 93 | basic_string_ref (const basic_string_ref &rhs); |
| 94 | basic_string_ref& operator=(const basic_string_ref &rhs); |
| 95 | |
| 96 | `string_ref` defines neither a move constructor nor a move-assignment operator, because copying a `string_ref` is just as cheap as moving one. |
| 97 | |
| 98 | Basic container-like functions: |
| 99 | |
| 100 | BOOST_CONSTEXPR size_type size() const ; |
| 101 | BOOST_CONSTEXPR size_type length() const ; |
| 102 | BOOST_CONSTEXPR size_type max_size() const ; |
| 103 | BOOST_CONSTEXPR bool empty() const ; |
| 104 | |
| 105 | // All iterators are const_iterators |
| 106 | BOOST_CONSTEXPR const_iterator begin() const ; |
| 107 | BOOST_CONSTEXPR const_iterator cbegin() const ; |
| 108 | BOOST_CONSTEXPR const_iterator end() const ; |
| 109 | BOOST_CONSTEXPR const_iterator cend() const ; |
| 110 | const_reverse_iterator rbegin() const ; |
| 111 | const_reverse_iterator crbegin() const ; |
| 112 | const_reverse_iterator rend() const ; |
| 113 | const_reverse_iterator crend() const ; |
| 114 | |
| 115 | Access to the individual elements (all of which are const): |
| 116 | |
| 117 | BOOST_CONSTEXPR const charT& operator[](size_type pos) const ; |
| 118 | const charT& at(size_t pos) const ; |
| 119 | BOOST_CONSTEXPR const charT& front() const ; |
| 120 | BOOST_CONSTEXPR const charT& back() const ; |
| 121 | BOOST_CONSTEXPR const charT* data() const ; |
| 122 | |
| 123 | Modifying the `string_ref` (but not the underlying data): |
| 124 | |
| 125 | void clear(); |
| 126 | void remove_prefix(size_type n); |
| 127 | void remove_suffix(size_type n); |
| 128 | |
| 129 | Searching: |
| 130 | |
| 131 | size_type find(basic_string_ref s) const ; |
| 132 | size_type find(charT c) const ; |
| 133 | size_type rfind(basic_string_ref s) const ; |
| 134 | size_type rfind(charT c) const ; |
| 135 | size_type find_first_of(charT c) const ; |
| 136 | size_type find_last_of (charT c) const ; |
| 137 | |
| 138 | size_type find_first_of(basic_string_ref s) const ; |
| 139 | size_type find_last_of(basic_string_ref s) const ; |
| 140 | size_type find_first_not_of(basic_string_ref s) const ; |
| 141 | size_type find_first_not_of(charT c) const ; |
| 142 | size_type find_last_not_of(basic_string_ref s) const ; |
| 143 | size_type find_last_not_of(charT c) const ; |
| 144 | |
| 145 | String-like operations: |
| 146 | |
| 147 | BOOST_CONSTEXPR basic_string_ref substr(size_type pos, size_type n=npos) const ; // Creates a new string_ref |
| 148 | bool starts_with(charT c) const ; |
| 149 | bool starts_with(basic_string_ref x) const ; |
| 150 | bool ends_with(charT c) const ; |
| 151 | bool ends_with(basic_string_ref x) const ; |
| 152 | |
| 153 | [endsect] |
| 154 | |
| 155 | [/===============] |
| 156 | [section History] |
| 157 | [/===============] |
| 158 | |
| 159 | [heading boost 1.53] |
| 160 | * Introduced |
| 161 | |
| 162 | |
| 163 | [endsect] |
| 164 | |
| 165 | |
| 166 | |
| 167 | |