UtfString
|
Contains and manages an encoding-neutral Unicode string. More...
#include <UnicodeString.h>
Classes | |
class | const_iterator |
An iterator that iterates through the code points in a Unicode string, but allowing only access to constant code points. More... | |
class | const_reverse_iterator |
An iterator that iterates through the code points in a Unicode string in reverse order, but allowing only access to constant code points. More... | |
class | iterator |
An iterator that iterates through the code points in a Unicode string. More... | |
class | reverse_iterator |
An iterator that iterates through the code points in a Unicode string in reverse order. More... | |
Public Member Functions | |
UnicodeString () | |
The default constructor. | |
UnicodeString (const UnicodeString &unicodeString) | |
Initializes this string with another UnicodeString object. More... | |
UnicodeString (const Utf16String &utf16String) | |
Initializes this string with a UTF-16 string. More... | |
UnicodeString (const Utf8String &utf8String) | |
Initializes this string with a UTF-8 string. More... | |
virtual | ~UnicodeString () |
The class destructor. | |
UnicodeString & | append (const UnicodeString &unicodeString) |
Appends the contents of another string to this string. More... | |
UnicodeString & | append (const Utf16String &utf16String) |
Appends the contents of another string to this string. More... | |
UnicodeString & | append (const Utf8String &utf8String) |
Appends the contents of another string to this string. More... | |
UnicodeString & | append (const UnicodeChar &unicodeCharacter) |
Appends a Unicode character to this string. More... | |
UnicodeString & | append (const UnicodeCharReference &unicodeCharacterReference) |
Appends a Unicode character to this string. More... | |
UnicodeString & | assign (const UnicodeString &unicodeString) |
Assigns the contents of another string to this string, replacing the current contents of this string. More... | |
UnicodeString & | assign (const Utf16String &utf16String) |
Assigns the contents of another string to this string, replacing the current contents of this string. More... | |
UnicodeString & | assign (const Utf8String &utf8String) |
Assigns the contents of another string to this string, replacing the current contents of this string. More... | |
UnicodeString & | assign (const UnicodeChar &unicodeCharacter) |
Assigns a Unicode character to this string. More... | |
UnicodeString & | assign (const UnicodeCharReference &unicodeCharacterReference) |
Assigns a Unicode character to this string. More... | |
UnicodeCharReference | at (size_t index) |
Returns a reference to the character found at the specified character index. More... | |
const UnicodeChar | at (size_t index) const |
Returns a reference to the character found at the specified character index. More... | |
iterator | begin () |
Returns an iterator pointing to the first character of a string. More... | |
const_iterator | begin () const |
Returns a constant iterator pointing to the first character of a string. More... | |
void | clear () |
Clears out the string, leaving it an empty string. | |
int | compare (const UnicodeString &unicodeString) const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other. More... | |
int | compare (const Utf16String &utf16String) const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other. More... | |
int | compare (const Utf8String &utf8String) const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other. More... | |
bool | empty () const |
Indicates whether this is an empty string. More... | |
iterator | end () |
Returns an iterator pointing to the location succeeding the last character in a string. More... | |
const_iterator | end () const |
Returns a constant iterator pointing to the location succeeding the last character in a string. More... | |
UnicodeString::iterator | erase (const UnicodeString::iterator &firstPosition, const UnicodeString::iterator &lastPosition) |
Removes a range of characters from this string. More... | |
UnicodeString::iterator | erase (const UnicodeString::iterator &position) |
Removes a character from this string. More... | |
UnicodeString & | erase (const size_t offset=0, const size_t count=npos) |
Removes a range of characters from this string. More... | |
size_t | find (const UnicodeString &searchString, size_t offset=0) |
Searches this string for a specific substring. More... | |
size_t | find (const Utf8String &searchString, size_t offset=0) |
Searches this string for a specific substring. More... | |
size_t | find (const Utf16String &searchString, size_t offset=0) |
Searches this string for a specific substring. More... | |
size_t | find_first_not_of (const UnicodeString &searchString, size_t offset=0) |
Searches this string for the first character that is not found in a given string. More... | |
size_t | find_first_not_of (const Utf8String &searchString, size_t offset=0) |
Searches this string for the first character that is not found in a given string. More... | |
size_t | find_first_not_of (const Utf16String &searchString, size_t offset=0) |
Searches this string for the first character that is not found in a given string. More... | |
size_t | find_first_of (const UnicodeString &searchString, size_t offset=0) |
Searches this string for the first character that is found in a given string. More... | |
size_t | find_first_of (const Utf8String &searchString, size_t offset=0) |
Searches this string for the first character that is found in a given string. More... | |
size_t | find_first_of (const Utf16String &searchString, size_t offset=0) |
Searches this string for the first character that is found in a given string. More... | |
size_t | find_last_not_of (const UnicodeString &searchString, size_t offset=npos) |
Searches this string for the last character that is not found in a given string. More... | |
size_t | find_last_not_of (const Utf8String &searchString, size_t offset=npos) |
Searches this string for the last character that is not found in a given string. More... | |
size_t | find_last_not_of (const Utf16String &searchString, size_t offset=npos) |
Searches this string for the last character that is not found in a given string. More... | |
size_t | find_last_of (const UnicodeString &searchString, size_t offset=npos) |
Searches this string for the last character that is found in a given string. More... | |
size_t | find_last_of (const Utf8String &searchString, size_t offset=npos) |
Searches this string for the last character that is found in a given string. More... | |
size_t | find_last_of (const Utf16String &searchString, size_t offset=npos) |
Searches this string for the last character that is found in a given string. More... | |
UnicodeString & | insert (const size_t index, const UnicodeString &unicodeString) |
Inserts the contents of another string into this string at a specified index. More... | |
UnicodeString & | insert (const size_t index, const Utf8String &utf8String) |
Inserts the contents of another string into this string at a specified index. More... | |
UnicodeString & | insert (const size_t index, const Utf16String &utf16String) |
Inserts the contents of another string into this string at a specified index. More... | |
UnicodeString & | insert (const size_t index, const UnicodeChar &unicodeCharacter) |
Inserts a character into this string at a specified index. More... | |
bool | is_valid () const |
Indicates whether this string is a valid Unicode string. More... | |
const UtfEncoding | internal_encoding () const |
Indicates the internal encoding used by this string. More... | |
size_t | length () const |
Returns the number of code points in this string. More... | |
void | push_back (const UnicodeChar &character) |
Appends a character to the end of this string. More... | |
reverse_iterator | rbegin () |
Returns an iterator pointing to the first character of a reversed string, which corresponds to the last character of the normal string. More... | |
const_reverse_iterator | rbegin () const |
Returns a constant iterator pointing to the first character of a reversed string, which corresponds to the last character of a normal string. More... | |
reverse_iterator | rend () |
Returns an iterator pointing to the location succeeding the last character in a reversed string, which corresponds to the location preceding the first character in a normal string. More... | |
const_reverse_iterator | rend () const |
Returns a constant iterator pointing to the location succeeding the last character in a reversed string, which corresponds to the location preceding the first character in a normal string. More... | |
UnicodeString & | replace (const size_t position, const size_t count, const UnicodeString &replacementString) |
Removes a section of this string and replaces it with the contents of another string. More... | |
UnicodeString & | replace (const size_t position, const size_t count, const Utf8String &replacementString) |
Removes a section of this string and replaces it with the contents of another string. More... | |
UnicodeString & | replace (const size_t position, const size_t count, const Utf16String &replacementString) |
Removes a section of this string and replaces it with the contents of another string. More... | |
UnicodeString & | replace (const size_t position, const size_t count, const size_t characterCount, const UnicodeChar &character) |
Replaces the characters in a section of this string with the given character. More... | |
UnicodeString & | replace (UnicodeString::iterator beginIterator, UnicodeString::iterator endIterator, const UnicodeString &replacementString) |
Removes a section of this string and replaces it with the contents of another string. More... | |
UnicodeString & | replace (UnicodeString::iterator beginIterator, UnicodeString::iterator endIterator, const size_t characterCount, const UnicodeChar &character) |
Replaces the characters in a section of this string with the given character. More... | |
size_t | rfind (const UnicodeString &searchString, size_t offset=npos) |
Searches this string backward for a specific substring. More... | |
size_t | rfind (const Utf8String &searchString, size_t offset=npos) |
Searches this string backward for a specific substring. More... | |
size_t | rfind (const Utf16String &searchString, size_t offset=npos) |
Searches this string backward for a specific substring. More... | |
size_t | size () const |
Returns the number of code points in this string. More... | |
UnicodeString | substr (const size_t offset=0, const size_t count=npos) |
Returns a substring of this string. More... | |
void | swap (UnicodeString &unicodeString) |
Swaps the contents of this string with those of another string. More... | |
void | swap (Utf8String &utf8String) |
Swaps the contents of this string with those of another string. More... | |
void | swap (Utf16String &utf16String) |
Swaps the contents of this string with those of another string. More... | |
UnicodeCharReference | operator[] (const size_t index) |
Returns the character found at the specified character index. More... | |
const UnicodeChar | operator[] (const size_t index) const |
Returns the character found at the specified character index. More... | |
UnicodeString & | operator= (const UnicodeString &unicodeString) |
Assigns the contents of a UnicodeString object to this object. More... | |
UnicodeString & | operator= (const Utf8String &utf8String) |
Assigns the contents of a Utf8String object to this object. More... | |
UnicodeString & | operator= (const Utf16String &utf16String) |
Assigns the contents of a Utf16String object to this object. More... | |
bool | operator== (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the two strings are the same. More... | |
bool | operator== (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the two strings are the same. More... | |
bool | operator== (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the two strings are the same. More... | |
bool | operator!= (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the two strings are different. More... | |
bool | operator!= (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the two strings are different. More... | |
bool | operator!= (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the two strings are different. More... | |
bool | operator< (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the value of this string is less than the value of the other string. More... | |
bool | operator< (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is less than the value of the other string. More... | |
bool | operator< (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is less than the value of the other string. More... | |
bool | operator<= (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the value of this string is less than or equal to the value of the other string. More... | |
bool | operator<= (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is less than or equal to the value of the other string. More... | |
bool | operator<= (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is less than or equal to the value of the other string. More... | |
bool | operator> (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the value of this string is greater than the value of the other string. More... | |
bool | operator> (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is greater than the value of the other string. More... | |
bool | operator> (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is greater than the value of the other string. More... | |
bool | operator>= (const UnicodeString &otherString) const |
Compares the value of this string to the value of another string and tests whether the value of this string is greater than or equal to the value of the other string. More... | |
bool | operator>= (const Utf8String &utf8String) const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is greater than or equal to the value of the other string. More... | |
bool | operator>= (const Utf16String &utf16String) const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is greater than or equal to the value of the other string. More... | |
operator Utf16String () const | |
Converts this object to a Utf16String object. More... | |
operator Utf8String () const | |
Converts this object to a Utf8String object. More... | |
Static Public Member Functions | |
static bool | IsWhitespace (const UnicodeChar &unicodeCharacter) |
Indicates whether a Unicode character is a whitespace character. More... | |
Static Public Attributes | |
static const size_t | npos |
An unsigned integral value initialized to -1 (the largest value a size_t can hold). Search functions return it to indicate "not found"; when passed as a count argument, it means "all remaining characters". | |
Friends | |
std::istream & | operator>> (std::istream &inputStream, UnicodeString &unicodeString) |
This operator reads a stream of bytes into a UnicodeString. More... | |
std::ostream & | operator<< (std::ostream &outputStream, const UnicodeString &unicodeString) |
This operator converts the contents of a UnicodeString to a stream of bytes. More... | |
std::wistream & | operator>> (std::wistream &inputStream, UnicodeString &unicodeString) |
This operator converts a wide stream of 16-bit values to a UTF-16 string and stores the UTF-16 string inside unicodeString. More... | |
std::wostream & | operator<< (std::wostream &outputStream, const UnicodeString &unicodeString) |
This operator converts a Unicode string to a wide stream of 16-bit values. More... | |
Contains and manages an encoding-neutral Unicode string.
This class is intended to be used in situations when the encoding of a Unicode string is unknown at compile time. UnicodeString encapsulates an encoding-specific string and abstracts away code units, concentrating on code points instead.
UnicodeString is less efficient than the encoding-specific string classes (Utf8String and Utf16String), and due to the need to be encoding-neutral, contains less functionality than the encoding-specific strings. If you won't know the exact encoding until runtime, use UnicodeString; otherwise, use Utf8String or Utf16String.
A UnicodeString can always be converted to a Utf8String or Utf16String. So if a string is coming from a source with an unknown encoding, such as a file, use UnicodeString at first and then convert to a Utf8String or Utf16String for use in the rest of the application. UnicodeString is most useful in library APIs or any other widely-used code, so functions can return a UnicodeString object instead of implementing separate functions for each encoding.
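One common way an encoding becomes known only at run time is a byte-order mark (BOM) at the start of a file. The sketch below is not part of UtfString and makes no claim about how the library itself detects encodings; `DetectedEncoding` and `detect_bom` are hypothetical names used for illustration. It recognizes only the UTF-8 and UTF-16 marks and does not attempt to handle UTF-32 or data without a BOM.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not the library's UtfEncoding.
enum class DetectedEncoding { Utf8, Utf16LittleEndian, Utf16BigEndian, Unknown };

// Inspects the first bytes of a buffer for a byte-order mark (BOM).
// Only the UTF-8 and UTF-16 marks are recognized; anything else yields Unknown.
inline DetectedEncoding detect_bom(const std::string& bytes) {
    const auto byte_at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && byte_at(0) == 0xEF && byte_at(1) == 0xBB && byte_at(2) == 0xBF)
        return DetectedEncoding::Utf8;
    if (bytes.size() >= 2 && byte_at(0) == 0xFF && byte_at(1) == 0xFE)
        return DetectedEncoding::Utf16LittleEndian;
    if (bytes.size() >= 2 && byte_at(0) == 0xFE && byte_at(1) == 0xFF)
        return DetectedEncoding::Utf16BigEndian;
    return DetectedEncoding::Unknown;
}
```

Code that receives such a buffer cannot pick Utf8String or Utf16String at compile time, which is the situation the class description has in mind.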
UtfString::UnicodeString::UnicodeString | ( | const UnicodeString & | unicodeString) |
Initializes this string with another UnicodeString object.
[in] | unicodeString | The UnicodeString object to use to initialize this object |
UtfString::UnicodeString::UnicodeString | ( | const Utf16String & | utf16String) |
Initializes this string with a UTF-16 string.
[in] | utf16String | A UTF-16 string used to initialize this object |
UtfString::UnicodeString::UnicodeString | ( | const Utf8String & | utf8String) |
Initializes this string with a UTF-8 string.
[in] | utf8String | A UTF-8 string used to initialize this object |
UnicodeString& UtfString::UnicodeString::append | ( | const UnicodeString & | unicodeString) |
Appends the contents of another string to this string.
[in] | unicodeString | A Unicode string to be appended |
UnicodeString& UtfString::UnicodeString::append | ( | const Utf16String & | utf16String) |
Appends the contents of another string to this string.
[in] | utf16String | A string of 16-bit code units to be appended |
UnicodeString& UtfString::UnicodeString::append | ( | const Utf8String & | utf8String) |
Appends the contents of another string to this string.
[in] | utf8String | A UTF-8 string to be appended. The string is assumed to be a valid UTF-8 string. |
UnicodeString& UtfString::UnicodeString::append | ( | const UnicodeChar & | unicodeCharacter) |
Appends a Unicode character to this string.
[in] | unicodeCharacter | A Unicode character to be appended. The character is assumed to be a valid Unicode character |
UnicodeString& UtfString::UnicodeString::append | ( | const UnicodeCharReference & | unicodeCharacterReference) |
Appends a Unicode character to this string.
[in] | unicodeCharacterReference | A reference to a Unicode character to be appended. The character is assumed to be a valid Unicode character |
UnicodeString& UtfString::UnicodeString::assign | ( | const UnicodeString & | unicodeString) |
Assigns the contents of another string to this string, replacing the current contents of this string.
[in] | unicodeString | A Unicode string to be assigned |
UnicodeString& UtfString::UnicodeString::assign | ( | const Utf16String & | utf16String) |
Assigns the contents of another string to this string, replacing the current contents of this string.
[in] | utf16String | A string of 16-bit code units to be assigned |
UnicodeString& UtfString::UnicodeString::assign | ( | const Utf8String & | utf8String) |
Assigns the contents of another string to this string, replacing the current contents of this string.
[in] | utf8String | A UTF-8 string to be assigned. The string is assumed to be a valid UTF-8 string. |
UnicodeString& UtfString::UnicodeString::assign | ( | const UnicodeChar & | unicodeCharacter) |
Assigns a Unicode character to this string.
[in] | unicodeCharacter | A Unicode character to be assigned. The character is assumed to be a valid Unicode character |
UnicodeString& UtfString::UnicodeString::assign | ( | const UnicodeCharReference & | unicodeCharacterReference) |
Assigns a Unicode character to this string.
[in] | unicodeCharacterReference | A reference to the Unicode character to be assigned. The character is assumed to be a valid Unicode character |
UnicodeCharReference UtfString::UnicodeString::at | ( | size_t | index) |
Returns a reference to the character found at the specified character index.
This function checks the validity of the index and throws an out_of_range exception when the given index doesn't correspond to a character within the string. Note that operator[] is a faster way to access a specific character, but it doesn't check for index validity.
Unicode strings use variable-length encodings. Whereas accessing a character at a particular index is O(1) for fixed-length encodings, accessing a character in a Unicode string is O(1) in the best case and O(n) in the worst case.
So if you wish to iterate through the characters in this string, use the standard iterators instead of an indexer. The standard iterators will be far more efficient.
[in] | index | The index of a character in the string |
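The indexing cost described for at() is easiest to see on raw UTF-8. The sketch below is not part of UtfString: it works on a plain std::string of UTF-8 bytes, assumes well-formed input, and uses hypothetical helper names (`is_lead_byte`, `code_point_offset`, `count_code_points`). Locating the character at a given index means walking past every preceding character, whereas a single forward pass visits each character once.

```cpp
#include <cstddef>
#include <string>

// True if the byte starts a new code point, i.e. it is not a
// 10xxxxxx continuation byte.
inline bool is_lead_byte(unsigned char byte) {
    return (byte & 0xC0) != 0x80;
}

// Returns the byte offset of the code point at `index`, or std::string::npos
// if the string holds too few code points.
// Cost is O(n): every preceding byte must be inspected.
inline std::size_t code_point_offset(const std::string& utf8, std::size_t index) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        if (is_lead_byte(static_cast<unsigned char>(utf8[i]))) {
            if (seen == index) return i;
            ++seen;
        }
    }
    return std::string::npos;
}

// Counts code points in one linear pass, the way an iterator-based loop would.
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if (is_lead_byte(byte)) ++count;
    }
    return count;
}
```

Calling an index lookup like `code_point_offset` once per character makes a full traversal quadratic, while an iterator-style pass like `count_code_points` stays linear.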
const UnicodeChar UtfString::UnicodeString::at | ( | size_t | index) | const |
Returns a reference to the character found at the specified character index.
This function checks the validity of the index and throws an out_of_range exception when the given index doesn't correspond to a character within the string. Note that operator[] is a faster way to access a specific character, but it doesn't check for index validity.
Unicode strings use variable-length encodings. Whereas accessing a character at a particular index is O(1) for fixed-length encodings, accessing a character in a Unicode string is O(1) in the best case and O(n) in the worst case.
So if you wish to iterate through the characters in this string, use the standard iterators instead of an indexer. The standard iterators will be far more efficient.
[in] | index | The index of a character in the string |
iterator UtfString::UnicodeString::begin | ( | ) |
Returns an iterator pointing to the first character of a string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
const_iterator UtfString::UnicodeString::begin | ( | ) | const |
Returns a constant iterator pointing to the first character of a string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
int UtfString::UnicodeString::compare | ( | const UnicodeString & | unicodeString) | const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other.
If this string is the same as the parameter string, then the two strings are considered equal. If the strings are different, then one is considered to be less than the other. The strings are compared "alphabetically", and the string that comes first in that "alphabetical" order is considered to be less than the other.
Note that "alphabetical" order is used in quotations because it isn't truly alphabetical. Different languages have different symbols and may have complex rules for the ordering of characters. This class does not attempt to address those issues, but instead compares code points based on their Unicode value. So any particular Latin code point will be considered to be less than any particular Cyrillic code point, because the Cyrillic code points have higher Unicode values. Within the English language, the code points are numbered so that they will be compared according to the rules of the language. This may or may not be the case for code points used by other languages.
If language- or locale-specific comparison is necessary, it would be better to use the ICU library.
[in] | unicodeString | A string to be compared to this string |
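The code point ordering described above can be tried without the library. The sketch below uses std::u32string, whose compare() orders strings by the numeric value of each UTF-32 code unit; it is an illustration of the ordering rule, not UtfString's implementation, and `compare_code_points` is a hypothetical name.

```cpp
#include <string>

// Orders two strings by the numeric value of their code points.
// Returns a negative value, zero, or a positive value, in the same
// spirit as the compare() functions documented here.
inline int compare_code_points(const std::u32string& lhs, const std::u32string& rhs) {
    return lhs.compare(rhs);
}
```

With this rule, `U"zebra"` sorts before any string of Cyrillic letters, and `U"Zebra"` sorts before `U"apple"` because every uppercase Latin letter has a lower code point than every lowercase one. That second case shows why the result is only loosely "alphabetical", even for English text in mixed case.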
int UtfString::UnicodeString::compare | ( | const Utf16String & | utf16String) | const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other.
If this string is the same as the parameter string, then the two strings are considered equal. If the strings are different, then one is considered to be less than the other. The strings are compared "alphabetically", and the string that comes first in that "alphabetical" order is considered to be less than the other.
Note that "alphabetical" order is used in quotations because it isn't truly alphabetical. Different languages have different symbols and may have complex rules for the ordering of characters. This class does not attempt to address those issues, but instead compares code points based on their Unicode value. So any particular Latin code point will be considered to be less than any particular Cyrillic code point, because the Cyrillic code points have higher Unicode values. Within the English language, the code points are numbered so that they will be compared according to the rules of the language. This may or may not be the case for code points used by other languages.
If language- or locale-specific comparison is necessary, it would be better to use the ICU library.
[in] | utf16String | A string to be compared to this string |
int UtfString::UnicodeString::compare | ( | const Utf8String & | utf8String) | const |
Compares the code points in this string with the code points in another string to determine whether both are equal or whether one is less than the other.
If this string is the same as the parameter string, then the two strings are considered equal. If the strings are different, then one is considered to be less than the other. The strings are compared "alphabetically", and the string that comes first in that "alphabetical" order is considered to be less than the other.
Note that "alphabetical" order is used in quotations because it isn't truly alphabetical. Different languages have different symbols and may have complex rules for the ordering of characters. This class does not attempt to address those issues, but instead compares code points based on their Unicode value. So any particular Latin code point will be considered to be less than any particular Cyrillic code point, because the Cyrillic code points have higher Unicode values. Within the English language, the code points are numbered so that they will be compared according to the rules of the language. This may or may not be the case for code points used by other languages.
If language- or locale-specific comparison is necessary, it would be better to use the ICU library.
[in] | utf8String | A string to be compared to this string |
bool UtfString::UnicodeString::empty | ( | ) | const |
Indicates whether this is an empty string.
An empty UnicodeString is one that has no internal data: it is a completely empty container, ready to be assigned an encoding-specific string.
iterator UtfString::UnicodeString::end | ( | ) |
Returns an iterator pointing to the location succeeding the last character in a string.
The iterator returned by this function is usually used to test whether an iterator has reached the end of a string. The iterator returned by this function should never be dereferenced, as it does not point to a part of the string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
const_iterator UtfString::UnicodeString::end | ( | ) | const |
Returns a constant iterator pointing to the location succeeding the last character in a string.
The iterator returned by this function is usually used to test whether an iterator has reached the end of a string. The iterator returned by this function should never be dereferenced, as it does not point to a part of the string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
UnicodeString::iterator UtfString::UnicodeString::erase | ( | const UnicodeString::iterator & | firstPosition, |
const UnicodeString::iterator & | lastPosition | ||
) |
Removes a range of characters from this string.
[in] | firstPosition | An iterator pointing to the first character of the range to be removed |
[in] | lastPosition | An iterator pointing to the position one past the last character of the range to be removed |
UnicodeString::iterator UtfString::UnicodeString::erase | ( | const UnicodeString::iterator & | position) |
Removes a character from this string.
[in] | position | An iterator pointing to the character to be removed |
UnicodeString& UtfString::UnicodeString::erase | ( | const size_t | offset = 0 , |
const size_t | count = npos |
||
) |
Removes a range of characters from this string.
This function will only cause characters to be removed up to the end of the string, so an overly large count parameter value will not cause problems.
This function assumes that offset <= length().
[in] | offset | The offset describing the index location of the first character to be removed |
[in] | count | The maximum number of characters to be removed |
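The clamping behaviour and the offset precondition can be sketched with std::u32string standing in for the library type. `erase_range` is a hypothetical helper, and the assert is this sketch's way of stating the documented assumption; how UnicodeString itself reacts to an out-of-range offset is not specified here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: removes up to `count` code points starting at `offset`.
inline std::u32string& erase_range(std::u32string& text,
                                   std::size_t offset = 0,
                                   std::size_t count = std::u32string::npos) {
    // Mirrors the documented assumption that offset <= length().
    assert(offset <= text.size());
    // Clamp so that an overly large count (including npos) stops at the end.
    const std::size_t removable = std::min(count, text.size() - offset);
    text.erase(offset, removable);
    return text;
}
```

Calling `erase_range(text, 5, 100)` on an eleven-character string removes only the six characters that actually follow index 5, and calling it with no arguments clears the string.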
size_t UtfString::UnicodeString::find | ( | const UnicodeString & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for a specific substring.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find | ( | const Utf8String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for a specific substring.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find | ( | const Utf16String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for a specific substring.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find_first_not_of | ( | const UnicodeString & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is not found in a given string.
Note that if searchString is not a valid Unicode string, this function will still work, but the result may turn up an unexpected code point.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to begin |
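A typical use of this kind of search is skipping leading characters that belong to an unwanted set. The sketch below shows the idea with std::u32string, whose find_first_not_of follows the same convention of returning npos when nothing qualifies; `trim_left` is a hypothetical helper, not a UtfString function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading characters that appear in `unwanted`.
inline std::u32string trim_left(const std::u32string& text,
                                const std::u32string& unwanted = U" \t\n\r") {
    const std::size_t first_kept = text.find_first_not_of(unwanted);
    // npos means every character was in `unwanted`, so nothing is left.
    if (first_kept == std::u32string::npos) return std::u32string();
    return text.substr(first_kept);
}
```

The npos check matters: passing npos straight to substr() as an offset would be out of range for a string that consists only of unwanted characters.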
size_t UtfString::UnicodeString::find_first_not_of | ( | const Utf8String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is not found in a given string.
Note that if searchString is not a valid UTF-8 string, this function will still work, but the result may turn up an unexpected code point.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find_first_not_of | ( | const Utf16String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is not found in a given string.
Note that if searchString is not a valid UTF-16 string, this function will still work, but the result may turn up an unexpected code point.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find_first_of | ( | const UnicodeString & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to begin |
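The difference between the two kinds of search can be illustrated with the standard library, whose `find` and `find_first_of` follow the same naming. This is only a sketch: `std::u32string` stands in for UnicodeString, and `find_exact` and `find_any_of` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Exact-substring search: all of needle must occur contiguously in text.
std::size_t find_exact(const std::u32string& text, const std::u32string& needle)
{
    return text.find(needle);
}

// Character-set search: any single character of charset counts as a match.
std::size_t find_any_of(const std::u32string& text, const std::u32string& charset)
{
    return text.find_first_of(charset);
}
```

Searching "hello world" for "ow" fails as an exact substring, but succeeds as a character set because the text contains an 'o'.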
size_t UtfString::UnicodeString::find_first_of | ( | const Utf8String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find_first_of | ( | const Utf16String & | searchString, |
size_t | offset = 0 |
||
) |
Searches this string for the first character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::find_last_not_of | ( | const UnicodeString & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is not found in a given string.
Note that if searchString is not a valid Unicode string, this function will still work, but the result may turn up an unexpected code point.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to finish |
size_t UtfString::UnicodeString::find_last_not_of | ( | const Utf8String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is not found in a given string.
Note that if searchString is not a valid UTF-8 string, this function will still work, but the result may turn up an unexpected code point.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to finish |
size_t UtfString::UnicodeString::find_last_not_of | ( | const Utf16String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is not found in a given string.
Note that if searchString is not a valid UTF-16 string, this function will still work, but the result may turn up an unexpected code point. For example, if the search string contains only the second code unit of a two-code-unit code point, that code point in the string being searched may still be the character identified by the search result, because even though the second code unit was in the search string, the first code unit of that code point was not. This is because there are numerous code points that could have that second code unit, and there is no way to distinguish between them if we are only given one code unit.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
[in] | searchString | The string containing the characters that are to be excluded in the search. |
[in] | offset | The index of the string at which the search is to finish |
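The point about a lone second code unit can be made concrete with the standard UTF-16 surrogate-pair arithmetic. The sketch below is independent of the library; `to_surrogate_pair` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Encodes a supplementary-plane code point (U+10000..U+10FFFF) as a UTF-16
// surrogate pair: {first code unit, second code unit}.
std::pair<char16_t, char16_t> to_surrogate_pair(char32_t codePoint)
{
    assert(codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    const std::uint32_t v = static_cast<std::uint32_t>(codePoint) - 0x10000;
    // The upper 10 bits select the first unit, the lower 10 bits the second.
    const char16_t first  = static_cast<char16_t>(0xD800 + (v >> 10));
    const char16_t second = static_cast<char16_t>(0xDC00 + (v & 0x3FF));
    return {first, second};
}
```

U+1F600, U+1FA00 and U+10200 all end in the same second code unit, 0xDE00, and differ only in the first. A search string that contains 0xDE00 alone therefore cannot identify which of these code points was meant.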
size_t UtfString::UnicodeString::find_last_of | ( | const UnicodeString & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to finish |
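The role of the offset in a backward search can be sketched with `std::u32string::find_last_of`, which treats its position argument as the last index considered, inclusive. Whether UnicodeString treats the offset as inclusive is not stated above, so that detail is an assumption here. `last_of_up_to` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the last character of text, at or before lastIndex,
// that also appears in charset; npos if there is no such character.
std::size_t last_of_up_to(const std::u32string& text,
                          const std::u32string& charset,
                          std::size_t lastIndex = std::u32string::npos)
{
    return text.find_last_of(charset, lastIndex);
}
```

For the text "a,b;c,d", the default offset finds the comma at index 5, while an offset of 4 excludes it and finds the semicolon at index 3.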
size_t UtfString::UnicodeString::find_last_of | ( | const Utf8String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to finish |
size_t UtfString::UnicodeString::find_last_of | ( | const Utf16String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string for the last character that is found in a given string.
This function differs from find() in that find() searches for an exact occurrence of the search string, whereas this function searches for any one of the characters found in the search string.
Please note that the offset in this function controls the index where the search ends, and not where it begins.
This function assumes is_valid() is true and searchString.is_valid() is true.
[in] | searchString | The string containing the characters that are to be searched for |
[in] | offset | The index of the string at which the search is to finish |
UnicodeString& UtfString::UnicodeString::insert | ( | const size_t | index, |
const UnicodeString & | unicodeString | ||
) |
Inserts the contents of another string into this string at a specified index.
[in] | index | The index in this string where the parameter string is to be inserted |
[in] | unicodeString | A Unicode string to be inserted |
Note that text can be inserted at the end of the string by specifying an index one past the end of the string.
This function assumes index <= length(), is_valid() == true, and unicodeString.is_valid() == true.
UnicodeString& UtfString::UnicodeString::insert | ( | const size_t | index, |
const Utf8String & | utf8String | ||
) |
Inserts the contents of another string into this string at a specified index.
[in] | index | The index in this string where the parameter string is to be inserted |
[in] | utf8String | A UTF-8 string to be inserted |
Note that text can be inserted at the end of the string by specifying an index one past the end of the string.
This function assumes index <= length(), is_valid() == true, and that utf8String is a valid UTF-8 string.
UnicodeString& UtfString::UnicodeString::insert | ( | const size_t | index, |
const Utf16String & | utf16String | ||
) |
Inserts the contents of another string into this string at a specified index.
[in] | index | The index in this string where the parameter string is to be inserted |
[in] | utf16String | A UTF-16 string to be inserted |
Note that text can be inserted at the end of the string by specifying an index one past the end of the string.
This function assumes index <= length(), is_valid() == true, and that utf16String is a valid UTF-16 string.
UnicodeString& UtfString::UnicodeString::insert | ( | const size_t | index, |
const UnicodeChar & | unicodeCharacter | ||
) |
Inserts a character into this string at a specified index.
[in] | index | The index in this string where the character is to be inserted |
[in] | unicodeCharacter | A Unicode character to be inserted. The character is assumed to be a valid Unicode character |
Note that a character can be inserted at the end of the string by specifying an index one past the end of the string.
This function assumes unicodeCharacter is a valid character and that index <= length().
const UtfEncoding UtfString::UnicodeString::internal_encoding | ( | ) | const |
Indicates the internal encoding used by this string.
The internal encoding of a UnicodeString depends on what data is used to initialize the string or what data is assigned to the string when it is empty. If a UnicodeString is initialized with a UTF-8 string, the internal encoding will be UTF-8. In this case, any operations involving a UTF-16 string will result in the UTF-16 string being converted to UTF-8 internally. The opposite is the case when a UnicodeString is initialized with a UTF-16 string. This is done to keep encoding conversions to a minimum. If an application is dealing primarily with one encoding, and a string in that encoding is put in a UnicodeString, we avoid the conversion to a specific internal encoding and then the conversion back to the original encoding.
bool UtfString::UnicodeString::is_valid | ( | ) | const |
Indicates whether this string is a valid Unicode string.
A valid Unicode string is a string whose encapsulated encoding-specific string consists entirely of valid code units.
An empty string is considered to be a valid Unicode string.
|
static |
Indicates whether a Unicode character is a whitespace character.
This function tests for the standard ASCII whitespace characters (tab, space, carriage return, line feed), and the characters that the Unicode standard defines as being separator characters.
An empty character is not considered to be a whitespace character.
[in] | unicodeCharacter | The character to be examined |
size_t UtfString::UnicodeString::length | ( | ) | const |
Returns the number of code points in this string.
Use this function if you're interested in how many characters are in a string.
This function does not check for validity, so it may return an incorrect result if is_valid() is false.
This function runs in O(N) time, since it must iterate through the code units to determine how many code points there are. Counting each code point is an extremely quick operation, but because every code point in the string must be visited, be mindful of performance when making heavy use of this function on long strings in performance-sensitive code.
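To see why counting code points requires a full pass over the data, consider the UTF-8 case. The sketch below is not the library's code; `count_code_points` is a hypothetical helper that operates on a plain `std::string` assumed to hold valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the code points in a UTF-8 string that is assumed to be valid.
// Continuation bytes have the bit pattern 10xxxxxx; every other byte starts
// a new code point, so counting the non-continuation bytes gives the total.
std::size_t count_code_points(const std::string& utf8)
{
    std::size_t count = 0;
    for (const char byte : utf8) {
        if ((static_cast<unsigned char>(byte) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

Each byte is inspected exactly once, which makes the cost linear in the length of the encoded string.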
UtfString::UnicodeString::operator Utf16String | ( | ) | const |
Converts this object to a Utf16String object.
This operator assumes that if this string is non-empty, it is a valid Unicode string.
If this object does not contain a string, an empty Utf16String will be returned.
UtfString::UnicodeString::operator Utf8String | ( | ) | const |
Converts this object to a Utf8String object.
This operator assumes that if this string is non-empty, it is a valid Unicode string.
If this object does not contain a string, an empty Utf8String will be returned.
bool UtfString::UnicodeString::operator!= | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the two strings are different.
[in] | otherString | The string to be compared with this string |
bool UtfString::UnicodeString::operator!= | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the two strings are different.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator!= | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the two strings are different.
[in] | utf16String | The UTF-16 string to be compared with this string |
bool UtfString::UnicodeString::operator< | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the value of this string is less than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | otherString | The string to be compared with this string |
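Ordering by code point value can be illustrated with `std::u32string`, whose comparison operators compare the numeric value of each element. This is a stand-in sketch rather than the library's comparison code, and `less_by_code_point` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Orders two strings by the numeric values of their code points, comparing
// element by element as std::u32string's operator< does.
bool less_by_code_point(const std::u32string& left, const std::u32string& right)
{
    return left < right;
}
```

Under this ordering "Zebra" sorts before "apple", because 'Z' (U+005A) has a lower value than 'a' (U+0061). Likewise "zebra" sorts before "éclair", because 'é' (U+00E9) has a higher value than every ASCII letter.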
bool UtfString::UnicodeString::operator< | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is less than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator< | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is less than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf16String | The UTF-16 string to be compared with this string |
bool UtfString::UnicodeString::operator<= | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the value of this string is less than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | otherString | The string to be compared with this string |
bool UtfString::UnicodeString::operator<= | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is less than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator<= | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is less than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf16String | The UTF-16 string to be compared with this string |
UnicodeString& UtfString::UnicodeString::operator= | ( | const UnicodeString & | unicodeString) |
Assigns the contents of a UnicodeString object to this object.
[in] | unicodeString | The UnicodeString object whose contents are to be assigned to this object |
UnicodeString& UtfString::UnicodeString::operator= | ( | const Utf8String & | utf8String) |
Assigns the contents of a Utf8String object to this object.
[in] | utf8String | The Utf8String object whose contents are to be assigned to this object |
UnicodeString& UtfString::UnicodeString::operator= | ( | const Utf16String & | utf16String) |
Assigns the contents of a Utf16String object to this object.
[in] | utf16String | The Utf16String object whose contents are to be assigned to this object |
bool UtfString::UnicodeString::operator== | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the two strings are the same.
[in] | otherString | The string to be compared with this string |
bool UtfString::UnicodeString::operator== | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the two strings are the same.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator== | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the two strings are the same.
[in] | utf16String | The UTF-16 string to be compared with this string |
bool UtfString::UnicodeString::operator> | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the value of this string is greater than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | otherString | The string to be compared with this string |
bool UtfString::UnicodeString::operator> | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is greater than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator> | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is greater than the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf16String | The UTF-16 string to be compared with this string |
bool UtfString::UnicodeString::operator>= | ( | const UnicodeString & | otherString) | const |
Compares the value of this string to the value of another string and tests whether the value of this string is greater than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | otherString | The string to be compared with this string |
bool UtfString::UnicodeString::operator>= | ( | const Utf8String & | utf8String) | const |
Compares the value of this string to the value of a UTF-8 string and tests whether the value of this string is greater than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf8String | The UTF-8 string to be compared with this string |
bool UtfString::UnicodeString::operator>= | ( | const Utf16String & | utf16String) | const |
Compares the value of this string to the value of a UTF-16 string and tests whether the value of this string is greater than or equal to the value of the other string.
The values of each string are determined by the Unicode values of the characters. This is similar to comparing strings in alphabetical order, except that the character order is determined by the Unicode values and not by the ordering of any particular alphabet.
In practice, this works out to be the same as alphabetical ordering for English-language strings that use a single letter case, but may not be for strings in other languages.
[in] | utf16String | The UTF-16 string to be compared with this string |
UnicodeCharReference UtfString::UnicodeString::operator[] | ( | const size_t | index) |
Returns the character found at the specified character index.
This operator does not check for the validity of the index, so it assumes that index is valid. What happens when the index is invalid is undefined. If you want the index parameter to be validated, use the at() function instead.
Unicode strings use variable-length encodings. Whereas accessing a character at a particular index is O(1) for fixed-length encodings, accessing a character in a Unicode string is O(1) in the best case and O(n) in the worst case.
If you wish to iterate through the characters in this string, use the standard iterators instead of an indexer. The standard iterators will be far more efficient.
[in] | index | The index identifying the character to be retrieved |
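The cost of indexing into a variable-length encoding can be sketched for UTF-8 as follows. This is an illustration of the general problem, not the library's lookup code; `byte_offset_of` is a hypothetical helper that works on a `std::string` assumed to hold valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the byte offset at which the code point with the given index
// starts. The scan begins at the front of the string, so the cost grows
// with index.
std::size_t byte_offset_of(const std::string& utf8, std::size_t index)
{
    std::size_t seen = 0;  // number of code points passed so far
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        // Any byte that is not a 10xxxxxx continuation byte starts a code point.
        const bool startsCodePoint =
            (static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80;
        if (startsCodePoint) {
            if (seen == index) {
                return i;
            }
            ++seen;
        }
    }
    assert(seen == index);  // only the one-past-the-end index may reach here
    return utf8.size();
}
```

Calling such a lookup once per character inside a loop repeats the scan from the start each time, which is quadratic overall. A single forward pass with an iterator visits each code unit only once.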
const UnicodeChar UtfString::UnicodeString::operator[] | ( | const size_t | index) | const |
Returns the character found at the specified character index.
This operator does not check for the validity of the index, so it assumes that index is valid. What happens when the index is invalid is undefined. If you want the index parameter to be validated, use the at() function instead.
Unicode strings use variable-length encodings. Whereas accessing a character at a particular index is O(1) for fixed-length encodings, accessing a character in a Unicode string is O(1) in the best case and O(n) in the worst case.
If you wish to iterate through the characters in this string, use the standard iterators instead of an indexer. The standard iterators will be far more efficient.
[in] | index | The index identifying the character to be retrieved |
void UtfString::UnicodeString::push_back | ( | const UnicodeChar & | character) |
Appends a character to the end of this string.
This function is the equivalent of calling insert(length(), character) or append(character).
[in] | character | The character to be appended to the end of this string |
reverse_iterator UtfString::UnicodeString::rbegin | ( | ) |
Returns an iterator pointing to the first character of a reversed string, which corresponds to the last character of the normal string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
const_reverse_iterator UtfString::UnicodeString::rbegin | ( | ) | const |
Returns a constant iterator pointing to the first character of a reversed string, which corresponds to the last character of a normal string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
reverse_iterator UtfString::UnicodeString::rend | ( | ) |
Returns an iterator pointing to the location succeeding the last character in a reversed string, which corresponds to the location preceding the first character in a normal string.
The iterator returned by this function is usually used to test whether an iterator has reached the end of a string. The iterator returned by this function should never be dereferenced, as it does not point to a part of the string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
const_reverse_iterator UtfString::UnicodeString::rend | ( | ) | const |
Returns a constant iterator pointing to the location succeeding the last character in a reversed string, which corresponds to the location preceding the first character in a normal string.
The iterator returned by this function is usually used to test whether an iterator has reached the end of a string. The iterator returned by this function should never be dereferenced, as it does not point to a part of the string.
If the UnicodeString is empty, an empty iterator is created. See the iterator class description for more information about empty iterators.
UnicodeString& UtfString::UnicodeString::replace | ( | const size_t | position, |
const size_t | count, | ||
const UnicodeString & | replacementString | ||
) |
Removes a section of this string and replaces it with the contents of another string.
Note that if position is one index past the end of the string, replacementString will simply be appended to the end of the string.
This function assumes position <= length().
[in] | position | The index in the string identifying the beginning of the string section to be removed |
[in] | count | The maximum number of characters to be removed from this string |
[in] | replacementString | The string whose contents are to replace the section being removed |
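The position and count rules above can be sketched with `std::u32string` as a stand-in for UnicodeString. `replace_range` is a hypothetical helper name, and the explicit clamping of count is an assumption based on the word "maximum" in the parameter description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Removes up to count characters starting at position and puts replacement
// in their place. position == length() degenerates into an append.
std::u32string replace_range(std::u32string text,
                             std::size_t position,
                             std::size_t count,
                             const std::u32string& replacement)
{
    // Precondition taken from the description above: position <= length().
    assert(position <= text.size());

    // count is a maximum: never remove past the end of the string.
    const std::size_t removable = std::min(count, text.size() - position);
    text.replace(position, removable, replacement);
    return text;
}
```

A position equal to the length appends, and a count of zero turns the call into a plain insertion.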
UnicodeString& UtfString::UnicodeString::replace | ( | const size_t | position, |
const size_t | count, | ||
const Utf8String & | replacementString | ||
) |
Removes a section of this string and replaces it with the contents of another string.
Note that if position is one index past the end of the string, replacementString will simply be appended to the end of the string.
This function assumes position <= length().
[in] | position | The index in the string identifying the beginning of the string section to be removed |
[in] | count | The maximum number of characters to be removed from this string |
[in] | replacementString | The string whose contents are to replace the section being removed |
UnicodeString& UtfString::UnicodeString::replace | ( | const size_t | position, |
const size_t | count, | ||
const Utf16String & | replacementString | ||
) |
Removes a section of this string and replaces it with the contents of another string.
Note that if position is one index past the end of the string, replacementString will simply be appended to the end of the string.
This function assumes position <= length().
[in] | position | The index in the string identifying the beginning of the string section to be removed |
[in] | count | The maximum number of characters to be removed from this string |
[in] | replacementString | The string whose contents are to replace the section being removed |
UnicodeString& UtfString::UnicodeString::replace | ( | const size_t | position, |
const size_t | count, | ||
const size_t | characterCount, | ||
const UnicodeChar & | character | ||
) |
Replaces the characters in a section of this string with the given character.
This function assumes position <= length().
[in] | position | The index in the string identifying the first character to be replaced |
[in] | count | The maximum number of characters to be replaced |
[in] | characterCount | The number of times the character is to be repeated in the replaced section |
[in] | character | The character to replace the characters in the identified section of this string |
UnicodeString& UtfString::UnicodeString::replace | ( | UnicodeString::iterator | beginIterator, |
UnicodeString::iterator | endIterator, | ||
const UnicodeString & | replacementString | ||
) |
Removes a section of this string and replaces it with the contents of another string.
This function replaces the section of the string from beginIterator to endIterator - 1, where endIterator is pointing at a position one past the end of the section to be replaced.
If endIterator points to a position before beginIterator, endIterator is ignored and the entire string from beginIterator to the end of the string is replaced. If beginIterator points to the same position as endIterator, replacementString is simply inserted at that position and nothing in this string is removed.
If this string is an empty string, then neither of the iterators passed in can be valid non-empty iterators. In this case, the empty string is replaced with the contents of replacementString.
This function assumes that beginIterator and endIterator are iterators for this string. This function also assumes that beginIterator and endIterator are both valid iterators when this string is non-empty or that beginIterator and endIterator are both empty iterators when this string is empty.
[in] | beginIterator | An iterator pointing to the first character of the string section to be replaced |
[in] | endIterator | An iterator pointing to the position one past the last character of the string section to be replaced |
[in] | replacementString | The string whose contents are to replace the section being removed |
UnicodeString& UtfString::UnicodeString::replace | ( | UnicodeString::iterator | beginIterator, |
UnicodeString::iterator | endIterator, | ||
const size_t | characterCount, | ||
const UnicodeChar & | character | ||
) |
Replaces the characters in a section of this string with the given character.
This function replaces the section of the string from beginIterator to endIterator - 1, where endIterator is pointing at a position one past the end of the section to be replaced.
If endIterator points to a position before beginIterator, endIterator is ignored and the entire string from beginIterator to the end of the string is replaced. If beginIterator points to the same position as endIterator, the new characters are simply inserted at that position and nothing in this string is removed.
If this string is an empty string, then neither of the iterators passed in can be valid non-empty iterators. In this case, the empty string is replaced with characterCount copies of character.
This function assumes that beginIterator and endIterator are iterators for this string. This function also assumes that beginIterator and endIterator are both valid iterators when this string is non-empty or that beginIterator and endIterator are both empty iterators when this string is empty.
[in] | beginIterator | An iterator pointing to the first character of the string section to be replaced |
[in] | endIterator | An iterator pointing to the position one past the last character of the string section to be replaced |
[in] | characterCount | The number of times the character is to be repeated in the replaced section |
[in] | character | The character to replace the characters in the identified section of this string |
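Note that the replaced section and the inserted run need not have the same length: the section is removed and characterCount copies of character take its place. A minimal sketch of that behavior, again modeled on `std::u32string` with a hypothetical helper name rather than the real UtfString code:

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of the fill overload; not the UtfString implementation.
std::u32string& replace_range_with_fill(std::u32string& text,
                                        std::u32string::iterator beginIterator,
                                        std::u32string::iterator endIterator,
                                        std::size_t characterCount,
                                        char32_t character)
{
    // Same rule as the string overload: an end iterator before the
    // begin iterator means "replace through the end of the string".
    if (endIterator < beginIterator) {
        endIterator = text.end();
    }
    // The removed section is replaced by characterCount copies of character,
    // so the string may grow or shrink.
    return text.replace(beginIterator, endIterator, characterCount, character);
}
```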
size_t UtfString::UnicodeString::rfind | ( | const UnicodeString & | searchString, |
size_t | offset = npos |
||
) |
Searches this string backward for a specific substring.
Note this does not look at the characters in reverse order like iterating through a string with a reverse iterator. It looks at the characters in forward order just like the find() function, but starts at the end of the string and works backward toward the beginning.
This function assumes that searchString.is_valid() is true.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
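The distinction between searching backward and matching in reverse can be illustrated with `std::string::rfind`, which follows the same convention. This is an analogy only: whether UnicodeString::rfind treats offset exactly as `std::string::rfind` does is an assumption, and the wrapper name below is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Analogy using the standard library, not UtfString itself.
// Returns the index of the last occurrence of searchString that starts
// at or before offset, or std::string::npos when there is none.
std::size_t last_match_at_or_before(const std::string& text,
                                    const std::string& searchString,
                                    std::size_t offset = std::string::npos)
{
    // The candidate start positions are visited from the end toward the
    // beginning, but each candidate is compared in forward order.
    return text.rfind(searchString, offset);
}
```

In `abcabc`, searching for `abc` finds the occurrence at index 3, while searching for `cba` finds nothing, because the substring itself is never reversed.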
size_t UtfString::UnicodeString::rfind | ( | const Utf8String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string backward for a specific substring.
Note this does not look at the characters in reverse order like iterating through a string with a reverse iterator. It looks at the characters in forward order just like the find() function, but starts at the end of the string and works backward toward the beginning.
This function assumes that searchString.is_valid() is true.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::rfind | ( | const Utf16String & | searchString, |
size_t | offset = npos |
||
) |
Searches this string backward for a specific substring.
Note this does not look at the characters in reverse order like iterating through a string with a reverse iterator. It looks at the characters in forward order just like the find() function, but starts at the end of the string and works backward toward the beginning.
This function assumes that searchString.is_valid() is true.
[in] | searchString | The substring to be found in this string |
[in] | offset | The index of the string at which the search is to begin |
size_t UtfString::UnicodeString::size | ( | ) | const |
Returns the number of code points in this string.
This function is exactly the same as the length() function. We just include it here because it is a standard function in most string and STL classes.
This function has O(N) performance, because it must iterate through the code units to determine how many code points there are. Counting each code point is an extremely quick operation, but because every code point in the string must be visited, be mindful of the cost when calling this function heavily on long strings in performance-sensitive code.
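To see why the count is linear, consider UTF-8 data: a code point occupies one to four bytes, so the only way to count code points is to walk the bytes. The sketch below is an illustration under that assumption and is not the UtfString implementation; the function name is hypothetical and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in well-formed UTF-8 by
// counting every byte that is not a continuation byte.
std::size_t count_code_points_utf8(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes look like 10xxxxxx; any other byte
        // starts a new code point.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

Each step is a single mask and compare, but the loop still touches every byte, so when the length is needed repeatedly (for example as a loop bound) it is usually worth storing it in a local variable once.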
UnicodeString UtfString::UnicodeString::substr | ( | const size_t | offset = 0 , |
const size_t | count = npos |
||
) |
Returns a substring of this string.
The offset parameter indicates which character in the string will become the first character of the substring and the count parameter indicates how many characters will be copied into the substring. If the value of count would cause characters beyond the end of this string to be copied, only characters from the offset to the end of the string will be copied.
This function assumes that offset < length().
[in] | offset | The string offset indicating the first character of the substring |
[in] | count | The number of characters to be copied into the substring |
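The clamping rule for count can be sketched as follows. The model uses `std::u32string` so that one element equals one code point; the function name is hypothetical, and the assertion stands in for the documented precondition rather than reflecting how UtfString itself reacts to a bad offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the documented substr() behavior.
std::u32string substr_clamped(const std::u32string& text,
                              std::size_t offset = 0,
                              std::size_t count = std::u32string::npos)
{
    // Documented precondition: offset < length().
    assert(offset < text.size());

    // Never copy past the end of the string: the number of characters
    // copied is the smaller of count and what remains after offset.
    const std::size_t available = text.size() - offset;
    const std::size_t copied = std::min(count, available);

    return std::u32string(text.begin() + offset,
                          text.begin() + offset + copied);
}
```

Under this model, asking for 100 characters from offset 3 of `unicode` simply yields `code`.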
void UtfString::UnicodeString::swap | ( | UnicodeString & | unicodeString) |
Swaps the contents of this string with those of another string.
This function assumes that is_valid() is true and unicodeString.is_valid() is true.
[in] | unicodeString | The string whose contents are to be swapped with the contents of this string |
void UtfString::UnicodeString::swap | ( | Utf8String & | utf8String) |
Swaps the contents of this string with those of another string.
This function assumes that is_valid() is true and utf8String.is_valid() is true.
[in] | utf8String | The string whose contents are to be swapped with the contents of this string |
void UtfString::UnicodeString::swap | ( | Utf16String & | utf16String) |
Swaps the contents of this string with those of another string.
This function assumes that is_valid() is true and utf16String.is_valid() is true.
[in] | utf16String | The string whose contents are to be swapped with the contents of this string |
|
friend |
This operator converts the contents of a UnicodeString to a stream of bytes.
This function writes to the stream according to the internal encoding of the UnicodeString object. If the internal encoding is UTF-8, the contents are written as a stream of UTF-8 characters, and if the internal encoding is UTF-16, they are written as a stream of UTF-16 characters. If the string does not have an internal encoding (uninitialized), nothing is written to the stream.
[in] | outputStream | The output stream to which the contents of the string are to be written |
[in] | unicodeString | The UnicodeString to be written to the output stream |
|
friend |
This operator converts a Unicode string to a wide stream of 16-bit values.
No checks for validity are done, so the resulting UTF-16 stream may or may not contain a valid UTF-16 string.
If the UnicodeString contains a UTF-8 string, the function converts it to a UTF-16 string first. The behavior when the UTF-8 string is invalid is undefined, so make sure the UTF-8 string is valid before doing this.
[in] | outputStream | The wide output stream to which the contents of the UTF-16 string are to be written |
[in] | unicodeString | The Unicode string to be written to the output stream |
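Because the conversion is undefined for invalid UTF-8, the bytes can be checked before they reach the operator. The library's own is_valid() is the natural choice where it is available; the function below is only a self-contained sketch of what such a check involves, written against a raw `std::string` with a hypothetical name, and is not taken from UtfString.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-alone check for well-formed UTF-8.
bool is_well_formed_utf8(const std::string& bytes)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const char32_t minimum[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t size = bytes.size();
    std::size_t i = 0;
    while (i < size) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t length = 0;
        char32_t codePoint = 0;

        if (lead < 0x80)                { length = 1; codePoint = lead; }
        else if ((lead & 0xE0) == 0xC0) { length = 2; codePoint = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { length = 3; codePoint = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { length = 4; codePoint = lead & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (length > size - i) return false;  // sequence is truncated

        for (std::size_t k = 1; k < length; ++k) {
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80) return false;  // not a continuation byte
            codePoint = (codePoint << 6) | (next & 0x3F);
        }

        if (codePoint < minimum[length]) return false;                 // overlong form
        if (codePoint >= 0xD800 && codePoint <= 0xDFFF) return false;  // surrogate
        if (codePoint > 0x10FFFF) return false;                        // out of range

        i += length;
    }
    return true;
}
```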
|
friend |
This operator reads a stream of bytes into a UnicodeString.
This function clears the contents of the string before anything is read. This function reads the stream according to the internal encoding of the UnicodeString object. If the internal encoding is UTF-8, the stream is read as a stream of UTF-8 characters, and if the internal encoding is UTF-16, the stream is read as a stream of UTF-16 characters. If the string does not have an internal encoding (uninitialized), the stream is assumed to be a stream of UTF-8 characters.
[in] | inputStream | The input stream of bytes to be read |
[in] | unicodeString | The string object in which the stream contents will be stored |
|
friend |
This operator converts a wide stream of 16-bit values to a UTF-16 string and stores the UTF-16 string inside unicodeString.
This function clears the contents of unicodeString before the stream is converted. In addition this function assumes that the stream being converted is of the same endianness as the machine on which this function was compiled.
This function always assumes that a wide stream contains a UTF-16 string, and not a UTF-8 string.
[in] | inputStream | The wide input stream containing 16-bit values to be converted to a UTF-16 string |
[in] | unicodeString | The string object into which the converted UTF-16 string will be stored |
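The endianness assumption matters when the 16-bit values were produced on a machine with the opposite byte order: they would have to be byte-swapped before being handed to this operator. A minimal sketch of such a swap follows; the helper names are hypothetical and nothing here is part of UtfString.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: exchanges the two bytes of one 16-bit code unit.
std::uint16_t swap_bytes(std::uint16_t unit)
{
    return static_cast<std::uint16_t>((unit << 8) | (unit >> 8));
}

// Hypothetical helper: returns a copy of the code units with every
// unit byte-swapped, converting between big- and little-endian order.
std::u16string swap_endianness(std::u16string units)
{
    for (char16_t& unit : units) {
        unit = static_cast<char16_t>(
            swap_bytes(static_cast<std::uint16_t>(unit)));
    }
    return units;
}
```

One common way to notice the mismatch is the byte order mark: U+FEFF read with the wrong byte order appears as 0xFFFE, which is not a valid character.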