UtfString
Provides a reference to an encoding-neutral Unicode character embedded in an encoding-neutral Unicode string. More...
#include <UnicodeCharReference.h>
Public Member Functions | |
UnicodeCharReference (const UnicodeCharReference &unicodeCharReference) | |
Initializes an instance of UnicodeCharReference using another UnicodeCharReference instance. More... | |
UnicodeCharReference (const Utf8CharReference &utf8CharReference) | |
Initializes an instance of UnicodeCharReference using a Utf8CharReference instance. More... | |
UnicodeCharReference (const Utf16CharReference &utf16CharReference) | |
Initializes an instance of UnicodeCharReference using a Utf16CharReference instance. More... | |
virtual | ~UnicodeCharReference () |
The destructor. More... | |
bool | operator== (const UnicodeCharReference &otherCharacterReference) const |
Compares the value of this character reference to the value of another character reference and tests whether the two character values are the same. More... | |
bool | operator== (const Utf8CharReference &characterReference) const |
Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are the same. More... | |
bool | operator== (const Utf16CharReference &characterReference) const |
Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are the same. More... | |
bool | operator== (const UnicodeChar &character) const |
Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are the same. More... | |
bool | operator== (const Utf8Char &character) const |
Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are the same. More... | |
bool | operator== (const Utf16Char &character) const |
Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are the same. More... | |
bool | operator!= (const UnicodeCharReference &otherCharacterReference) const |
Compares the value of this character reference to the value of another character reference and tests whether the two character values are different. More... | |
bool | operator!= (const Utf8CharReference &characterReference) const |
Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are different. More... | |
bool | operator!= (const Utf16CharReference &characterReference) const |
Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are different. More... | |
bool | operator!= (const UnicodeChar &character) const |
Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are different. More... | |
bool | operator!= (const Utf8Char &character) const |
Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are different. More... | |
bool | operator!= (const Utf16Char &character) const |
Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are different. More... | |
UnicodeCharReference & | operator= (const UnicodeCharReference &characterReference) |
Assigns the contents of a UnicodeCharReference object to this object. More... | |
UnicodeCharReference & | operator= (const Utf8CharReference &characterReference) |
Assigns the contents of a Utf8CharReference object to this object. More... | |
UnicodeCharReference & | operator= (const Utf16CharReference &characterReference) |
Assigns the contents of a Utf16CharReference object to this object. More... | |
UnicodeCharReference & | operator= (const UnicodeChar &character) |
Assigns the contents of a UnicodeChar object to this object. More... | |
UnicodeCharReference & | operator= (const Utf8Char &character) |
Assigns the contents of a Utf8Char object to this object. More... | |
UnicodeCharReference & | operator= (const Utf16Char &character) |
Assigns the contents of a Utf16Char object to this object. More... | |
operator UnicodeChar () const | |
Converts this object to a UnicodeChar object. More... | |
operator Utf8Char () const | |
Converts this object to a Utf8Char object. More... | |
operator Utf16Char () const | |
Converts this object to a Utf16Char object. More... | |
operator Utf8CharReference () const | |
Converts this object to a Utf8CharReference object. More... | |
operator Utf16CharReference () const | |
Converts this object to a Utf16CharReference object. More... | |
void | assign_reference (const UnicodeCharReference &otherCharacterReference) |
Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference. More... | |
const UtfEncoding | internal_encoding () const |
Indicates the internal encoding used by this character reference. More... | |
bool | is_valid () const |
Indicates whether the character value of this reference is a valid Unicode character. More... | |
bool | reference_equal (const UnicodeCharReference &otherCharacterReference) |
Compares this character reference to another character reference and tests whether the two references refer to the exact same character. More... | |
bool | reference_not_equal (const UnicodeCharReference &otherCharacterReference) |
Compares this character reference to another character reference and tests whether the two references refer to different characters. More... | |
UInt32 | to_utf_32 () const |
Converts the character value of this character reference to a UTF-32 code point. More... | |
Provides a reference to an encoding-neutral Unicode character embedded in an encoding-neutral Unicode string.
Since a UnicodeString provides an interface that hides an encoding-specific string, a character is not directly pulled out of the UnicodeString, but is constructed from the underlying encoding-specific string. This makes it impossible for a string indexer to return a plain character reference that can be assigned to.
So this class acts as a reference to a character inside a string, so that if this character reference is altered, the contents of the string are altered as well. This allows us to have a read/write indexer in UnicodeString instead of a read-only indexer.
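The proxy idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's implementation: `SketchString`, `SketchCharRef`, and `encode` are hypothetical names, the storage is assumed to be UTF-8, and the sketch handles only code points up to U+07FF (one or two code units) to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

class SketchString;

// Hypothetical proxy: remembers which character of which string it refers to.
class SketchCharRef {
public:
    SketchCharRef(SketchString& owner, std::size_t index)
        : owner_(owner), index_(index) {}
    // Writing through the proxy re-encodes the character inside the string.
    SketchCharRef& operator=(std::uint32_t codePoint);
    // Reading through the proxy decodes the character from the string.
    operator std::uint32_t() const;
private:
    SketchString& owner_;
    std::size_t index_;
};

// Hypothetical string that hides its UTF-8 storage behind a character indexer.
class SketchString {
public:
    explicit SketchString(std::string utf8) : bytes_(std::move(utf8)) {}
    // Read/write indexer: returns a proxy instead of a raw reference.
    SketchCharRef operator[](std::size_t index) { return SketchCharRef(*this, index); }
    const std::string& bytes() const { return bytes_; }
private:
    friend class SketchCharRef;
    // Sketch limitation: only 1- and 2-unit sequences are recognized.
    static std::size_t length_of(unsigned char lead) { return lead < 0x80 ? 1 : 2; }
    // Walks the code units to find where the index-th character starts.
    std::size_t offset_of(std::size_t index) const {
        std::size_t offset = 0;
        while (index-- > 0) {
            offset += length_of(static_cast<unsigned char>(bytes_[offset]));
        }
        return offset;
    }
    std::string bytes_;
};

// Encodes a code point below U+0800 as one or two UTF-8 code units.
inline std::string encode(std::uint32_t codePoint) {
    std::string out;
    if (codePoint < 0x80) {
        out += static_cast<char>(codePoint);
    } else {
        out += static_cast<char>(0xC0 | (codePoint >> 6));
        out += static_cast<char>(0x80 | (codePoint & 0x3F));
    }
    return out;
}

SketchCharRef& SketchCharRef::operator=(std::uint32_t codePoint) {
    const std::size_t offset = owner_.offset_of(index_);
    const std::size_t oldLength =
        SketchString::length_of(static_cast<unsigned char>(owner_.bytes_[offset]));
    owner_.bytes_.replace(offset, oldLength, encode(codePoint));
    return *this;
}

SketchCharRef::operator std::uint32_t() const {
    const std::size_t offset = owner_.offset_of(index_);
    const unsigned char lead = static_cast<unsigned char>(owner_.bytes_[offset]);
    if (lead < 0x80) {
        return lead;
    }
    const unsigned char trail = static_cast<unsigned char>(owner_.bytes_[offset + 1]);
    return ((lead & 0x1Fu) << 6) | (trail & 0x3Fu);
}
```

Because the replacement character may need a different number of code units than the one it replaces, a raw reference into the storage could not support this assignment; the proxy can, since it rewrites the affected span of the string.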
UnicodeCharReference encapsulates an encoding-specific character reference. For example, if UnicodeString encapsulates a Utf8String, UnicodeCharReference will encapsulate a Utf8CharReference. UnicodeCharReference can only be cast to the same type as the encapsulated character reference. To find the underlying encoding, use the internal_encoding() member function.
Character references may become invalid if the contents of the string being referred to is modified in any way. If a string is modified by adding or removing characters, the associated character reference may then refer to another character or to somewhere outside the string. Attempting to use a character reference that no longer refers to a valid character will most likely cause a runtime error.
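The effect of modifying a string while a reference is outstanding can be illustrated with a small sketch. It assumes, purely for illustration, that a reference tracks its target by position; the source does not say how the real class tracks it, and `IndexRefSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical position-based reference; the real class may work differently.
struct IndexRefSketch {
    const std::u32string* owner;
    std::size_t index;

    // True while the stored position still falls inside the string.
    bool in_range() const { return index < owner->size(); }

    // Reads the character currently at the stored position.
    char32_t value() const {
        assert(in_range());
        return (*owner)[index];
    }
};
```

With a string `U"abc"` and a reference at position 1, the reference reads `b`. After one leading character is erased it silently reads `c`, and after a second erase the position lies outside the string, which is the situation the paragraph above warns about.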
Since this is an encoding-neutral character reference, individual code units cannot be retrieved or set. You can only set an entire character (using a UTF-8 or UTF-16 character string, or the 32-bit code point value) or retrieve the character's 32-bit code point value. You can, however, cast the UnicodeCharReference to its underlying encoding to retrieve and set the code units directly. You cannot cast a UnicodeCharReference to a character reference of a different encoding.
The reason for this is that UnicodeCharReference encapsulates an encoding-specific reference to an encoding-specific string (which was encapsulated by the UnicodeString). It would be complex and rather inefficient to handle translating code units to and from the code units referred to by an encoding-specific reference, and would require significant changes to Utf8CharReference and Utf16CharReference. It may be technically possible, but it's simply not worth it. If you need to address code units at all in a character reference, it's far easier and more efficient to use an encoding-specific string and encoding-specific character reference instead of UnicodeString and UnicodeCharReference.
Note that if you attempt to cast a UnicodeCharReference to a character reference of a different type than the encapsulated character reference, you will get either a crash or an assertion failure. Use the internal_encoding() member function to prevent this.
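The guard described above amounts to checking the encoding before converting. The following is a minimal stand-alone sketch of that pattern, not the library's code: every type name ending in `Sketch` is hypothetical, and the accessors return a null pointer on a mismatch, which is a choice made for this sketch rather than the documented crash-or-assert behavior. It assumes a C++17 compiler for `std::variant`.

```cpp
#include <cassert>
#include <variant>

// Hypothetical stand-ins for the encoding-specific reference types.
struct Utf8RefSketch  { unsigned char firstUnit; };
struct Utf16RefSketch { char16_t firstUnit; };

enum class EncodingSketch { Utf8, Utf16 };

// Hypothetical encoding-neutral wrapper holding exactly one of the two.
class NeutralRefSketch {
public:
    NeutralRefSketch(Utf8RefSketch ref) : inner_(ref) {}
    NeutralRefSketch(Utf16RefSketch ref) : inner_(ref) {}

    // The encoding is fixed by whichever reference initialized the wrapper.
    EncodingSketch internal_encoding() const {
        return std::holds_alternative<Utf8RefSketch>(inner_)
                   ? EncodingSketch::Utf8
                   : EncodingSketch::Utf16;
    }

    // Checked accessors: nullptr when the requested encoding does not match.
    const Utf8RefSketch* as_utf8() const { return std::get_if<Utf8RefSketch>(&inner_); }
    const Utf16RefSketch* as_utf16() const { return std::get_if<Utf16RefSketch>(&inner_); }

private:
    std::variant<Utf8RefSketch, Utf16RefSketch> inner_;
};
```

A caller would branch on `internal_encoding()` first and only then request the matching encoding-specific reference.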
UnicodeChar and UnicodeCharReference objects can be assigned to each other or converted from one to the other.
UtfString::UnicodeCharReference::UnicodeCharReference | ( | const UnicodeCharReference & | unicodeCharReference) |
Initializes an instance of UnicodeCharReference using another UnicodeCharReference instance.
[in] | unicodeCharReference | The Unicode character reference to use in initializing this object |
UtfString::UnicodeCharReference::UnicodeCharReference | ( | const Utf8CharReference & | utf8CharReference) |
Initializes an instance of UnicodeCharReference using a Utf8CharReference instance.
[in] | utf8CharReference | The UTF-8 character reference to use in initializing this object |
UtfString::UnicodeCharReference::UnicodeCharReference | ( | const Utf16CharReference & | utf16CharReference) |
Initializes an instance of UnicodeCharReference using a Utf16CharReference instance.
[in] | utf16CharReference | The UTF-16 character reference to use in initializing this object |
virtual UtfString::UnicodeCharReference::~UnicodeCharReference | ( | ) |
The destructor.
The destructor will clean up the encoding-specific string contained within this object.
void UtfString::UnicodeCharReference::assign_reference | ( | const UnicodeCharReference & | otherCharacterReference) |
Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference.
Note that this function does not copy the value of the other character reference and assign it to the value of this character reference. To do that, use the assignment operator.
[in] | otherCharacterReference | The reference to be assigned to this reference. |
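The difference between rebinding a reference and copying a value through it can be sketched as follows. `RebindableRefSketch` is a hypothetical stand-in that tracks its target as a pointer plus an index into a `std::u32string`; the real class is not assumed to work this way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference type used only to contrast the two operations.
class RebindableRefSketch {
public:
    RebindableRefSketch(std::u32string& owner, std::size_t index)
        : owner_(&owner), index_(index) {}
    RebindableRefSketch(const RebindableRefSketch&) = default;

    // Value assignment: copies the other character INTO the referenced slot.
    RebindableRefSketch& operator=(const RebindableRefSketch& other) {
        (*owner_)[index_] = other.value();
        return *this;
    }

    // Value assignment from a plain character.
    RebindableRefSketch& operator=(char32_t value) {
        (*owner_)[index_] = value;
        return *this;
    }

    // Rebinding: this reference now refers to the other reference's character.
    void assign_reference(const RebindableRefSketch& other) {
        owner_ = other.owner_;
        index_ = other.index_;
    }

    char32_t value() const { return (*owner_)[index_]; }

private:
    std::u32string* owner_;
    std::size_t index_;
};
```

In this sketch `a = b` changes the string that `a` refers to and leaves `a` pointing where it was, whereas `a.assign_reference(b)` changes no string at all and makes later writes through `a` land in the string that `b` refers to.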
const UtfEncoding UtfString::UnicodeCharReference::internal_encoding | ( | ) | const |
Indicates the internal encoding used by this character reference.
The internal encoding of a UnicodeCharReference depends on what data is used to initialize the character reference. If a UnicodeCharReference is initialized with a UTF-8 character reference, the internal encoding will be UTF-8; if it is initialized with a UTF-16 character reference, the internal encoding will be UTF-16. This is done to keep encoding conversions to a minimum. If an application deals primarily with one encoding, and a character in that encoding is put in a UnicodeString, this avoids converting to a fixed internal encoding and then converting back to the original encoding.
bool UtfString::UnicodeCharReference::is_valid | ( | ) | const |
Indicates whether the character value of this reference is a valid Unicode character.
If this object is able to convert the code units given to it during initialization to a Unicode character, this character will be considered valid.
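The source does not spell out the exact validity test. As a hedged illustration, one necessary condition for any decoded value to be a Unicode character is that it is a Unicode scalar value: no greater than U+10FFFF and outside the surrogate range U+D800 to U+DFFF. The helper name below is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the value is a Unicode scalar value.
constexpr bool is_unicode_scalar_value(std::uint32_t codePoint) {
    const bool inRange = codePoint <= 0x10FFFFu;
    const bool isSurrogate = codePoint >= 0xD800u && codePoint <= 0xDFFFu;
    return inRange && !isSurrogate;
}
```

A full validity check would also have to confirm that the code units themselves form a well-formed sequence in the internal encoding, which this sketch does not attempt.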
UtfString::UnicodeCharReference::operator UnicodeChar | ( | ) | const |
Converts this object to a UnicodeChar object.
UtfString::UnicodeCharReference::operator Utf16Char | ( | ) | const |
Converts this object to a Utf16Char object.
This operator assumes that this character is a valid Unicode character.
UtfString::UnicodeCharReference::operator Utf16CharReference | ( | ) | const |
Converts this object to a Utf16CharReference object.
This conversion operator will only work if the underlying encoding is UTF-16; otherwise, an assertion failure or crash will result. This operator assumes that this reference has an internal encoding of ENCODING_UTF16.
UtfString::UnicodeCharReference::operator Utf8Char | ( | ) | const |
Converts this object to a Utf8Char object.
This operator assumes that this character is a valid Unicode character.
UtfString::UnicodeCharReference::operator Utf8CharReference | ( | ) | const |
Converts this object to a Utf8CharReference object.
This conversion operator will only work if the underlying encoding is UTF-8; otherwise, an assertion failure or crash will result. This operator assumes that this reference has an internal encoding of ENCODING_UTF8.
bool UtfString::UnicodeCharReference::operator!= | ( | const UnicodeCharReference & | otherCharacterReference) | const |
Compares the value of this character reference to the value of another character reference and tests whether the two character values are different.
This comparison tests whether the character values of the two references differ. It does not test whether the references refer to different characters. To test for that type of reference inequality, use the reference_not_equal() function.
[in] | otherCharacterReference | The character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator!= | ( | const Utf8CharReference & | characterReference) | const |
Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are different.
This comparison tests whether the character values of the two references differ. It does not test whether the references refer to different characters. To test for that type of reference inequality, use the reference_not_equal() function.
[in] | characterReference | The UTF-8 character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator!= | ( | const Utf16CharReference & | characterReference) | const |
Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are different.
This comparison tests whether the character values of the two references differ. It does not test whether the references refer to different characters. To test for that type of reference inequality, use the reference_not_equal() function.
[in] | characterReference | The UTF-16 character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator!= | ( | const UnicodeChar & | character) | const |
Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are different.
[in] | character | The Unicode character to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator!= | ( | const Utf8Char & | character) | const |
Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are different.
[in] | character | The UTF-8 character to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator!= | ( | const Utf16Char & | character) | const |
Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are different.
[in] | character | The UTF-16 character to be compared with this character reference |
UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const UnicodeCharReference & | characterReference) |
Assigns the contents of a UnicodeCharReference object to this object.
The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.
[in] | characterReference | The UnicodeCharReference object whose contents are to be assigned to this object |
UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const Utf8CharReference & | characterReference) |
Assigns the contents of a Utf8CharReference object to this object.
The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.
[in] | characterReference | The Utf8CharReference object whose contents are to be assigned to this object |
UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const Utf16CharReference & | characterReference) |
Assigns the contents of a Utf16CharReference object to this object.
The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.
[in] | characterReference | The Utf16CharReference object whose contents are to be assigned to this object |
UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const UnicodeChar & | character) |
Assigns the contents of a UnicodeChar object to this object.
The contents of the UnicodeChar object are copied to this object: they are not shared.
[in] | character | The UnicodeChar object whose contents are to be assigned to this object |
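UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const Utf8Char & | character) |
Assigns the contents of a Utf8Char object to this object.
UnicodeCharReference& UtfString::UnicodeCharReference::operator= | ( | const Utf16Char & | character) |
Assigns the contents of a Utf16Char object to this object.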
bool UtfString::UnicodeCharReference::operator== | ( | const UnicodeCharReference & | otherCharacterReference) | const |
Compares the value of this character reference to the value of another character reference and tests whether the two character values are the same.
This comparison tests whether the character values of the two references are equal. It does not test whether the references refer to the same character in the same string. To test for that type of reference equality, use the reference_equal() function.
[in] | otherCharacterReference | The character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator== | ( | const Utf8CharReference & | characterReference) | const |
Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are the same.
This comparison tests whether the character values of the two references are equal. It does not test whether the references refer to the same character in the same string. To test for that type of reference equality, use the reference_equal() function.
[in] | characterReference | The UTF-8 character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator== | ( | const Utf16CharReference & | characterReference) | const |
Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are the same.
This comparison tests whether the character values of the two references are equal. It does not test whether the references refer to the same character in the same string. To test for that type of reference equality, use the reference_equal() function.
[in] | characterReference | The UTF-16 character reference to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator== | ( | const UnicodeChar & | character) | const |
Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are the same.
[in] | character | The Unicode character to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator== | ( | const Utf8Char & | character) | const |
Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are the same.
[in] | character | The UTF-8 character to be compared with this character reference |
bool UtfString::UnicodeCharReference::operator== | ( | const Utf16Char & | character) | const |
Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are the same.
[in] | character | The UTF-16 character to be compared with this character reference |
bool UtfString::UnicodeCharReference::reference_equal | ( | const UnicodeCharReference & | otherCharacterReference) |
Compares this character reference to another character reference and tests whether the two references refer to the exact same character.
This comparison tests whether the two references refer to the exact same character in the same string. It does not test whether the character values are equal. To test for that type of equality, use the == operator.
[in] | otherCharacterReference | The character reference to be compared with this character reference |
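The two notions of equality can be contrasted with a short sketch. `PositionRefSketch` is a hypothetical type that identifies a character by its owning string and position; it only mirrors the member names documented here and is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference type that separates value and identity comparison.
struct PositionRefSketch {
    const std::u32string* owner;
    std::size_t index;

    char32_t value() const { return (*owner)[index]; }

    // Value comparison: are the referenced characters the same code point?
    bool operator==(const PositionRefSketch& other) const {
        return value() == other.value();
    }
    bool operator!=(const PositionRefSketch& other) const {
        return value() != other.value();
    }

    // Identity comparison: same position in the same string?
    bool reference_equal(const PositionRefSketch& other) const {
        return owner == other.owner && index == other.index;
    }
    bool reference_not_equal(const PositionRefSketch& other) const {
        return !reference_equal(other);
    }
};
```

For the string `U"aba"`, references to the first and last characters compare equal by value but are not reference-equal, while two references to the first character satisfy both tests.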
bool UtfString::UnicodeCharReference::reference_not_equal | ( | const UnicodeCharReference & | otherCharacterReference) |
Compares this character reference to another character reference and tests whether the two references refer to different characters.
This comparison tests whether the two references refer to different characters. It compares the references themselves, not the character values. To test for value inequality, use the != operator.
[in] | otherCharacterReference | The character reference to be compared with this character reference |
UInt32 UtfString::UnicodeCharReference::to_utf_32 | ( | ) | const |
Converts the character value of this character reference to a UTF-32 code point.
This function assumes that is_valid() is true. If empty() is true, this function will return a value of 0xFFFFFFFF.
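To make the conversion concrete, here is a hedged sketch of decoding one UTF-8 sequence to a code point. The function name is hypothetical, the input is assumed to hold exactly one character's code units, and returning 0xFFFFFFFF for malformed input as well as empty input is a choice made for this sketch only. Overlong encodings and surrogate values are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical decoder: first UTF-8 sequence in `units` to a code point.
// Returns 0xFFFFFFFF when `units` is empty or the sequence is malformed.
std::uint32_t decode_first_utf8(const std::string& units) {
    const std::uint32_t sentinel = 0xFFFFFFFFu;
    if (units.empty()) {
        return sentinel;
    }
    auto unit = [&units](std::size_t i) {
        return static_cast<std::uint32_t>(static_cast<unsigned char>(units[i]));
    };

    const std::uint32_t lead = unit(0);
    std::size_t trailing = 0;
    std::uint32_t codePoint = 0;
    if (lead < 0x80) {                    // 0xxxxxxx: one unit
        codePoint = lead;
    } else if ((lead & 0xE0) == 0xC0) {   // 110xxxxx: two units
        trailing = 1;
        codePoint = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {   // 1110xxxx: three units
        trailing = 2;
        codePoint = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {   // 11110xxx: four units
        trailing = 3;
        codePoint = lead & 0x07;
    } else {
        return sentinel;                  // stray continuation or invalid lead
    }

    if (units.size() < trailing + 1) {
        return sentinel;                  // truncated sequence
    }
    for (std::size_t i = 1; i <= trailing; ++i) {
        const std::uint32_t next = unit(i);
        if ((next & 0xC0) != 0x80) {
            return sentinel;              // not a continuation unit
        }
        codePoint = (codePoint << 6) | (next & 0x3F);
    }
    return codePoint;
}
```

A reference whose internal encoding is UTF-16 would instead need to combine surrogate pairs, which this sketch does not cover.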