UtfString
UtfString::UnicodeCharReference Class Reference

Provides a reference to an encoding-neutral Unicode character embedded in an encoding-neutral Unicode string.

#include <UnicodeCharReference.h>

Public Member Functions

 UnicodeCharReference (const UnicodeCharReference &unicodeCharReference)
 Initializes an instance of UnicodeCharReference using another UnicodeCharReference instance.

 UnicodeCharReference (const Utf8CharReference &utf8CharReference)
 Initializes an instance of UnicodeCharReference using a Utf8CharReference instance.

 UnicodeCharReference (const Utf16CharReference &utf16CharReference)
 Initializes an instance of UnicodeCharReference using a Utf16CharReference instance.

virtual ~UnicodeCharReference ()
 The destructor.
 
bool operator== (const UnicodeCharReference &otherCharacterReference) const
 Compares the value of this character reference to the value of another character reference and tests whether the two character values are the same.

bool operator== (const Utf8CharReference &characterReference) const
 Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are the same.

bool operator== (const Utf16CharReference &characterReference) const
 Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are the same.

bool operator== (const UnicodeChar &character) const
 Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are the same.

bool operator== (const Utf8Char &character) const
 Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are the same.

bool operator== (const Utf16Char &character) const
 Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are the same.

bool operator!= (const UnicodeCharReference &otherCharacterReference) const
 Compares the value of this character reference to the value of another character reference and tests whether the two character values are different.

bool operator!= (const Utf8CharReference &characterReference) const
 Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are different.

bool operator!= (const Utf16CharReference &characterReference) const
 Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are different.

bool operator!= (const UnicodeChar &character) const
 Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are different.

bool operator!= (const Utf8Char &character) const
 Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are different.

bool operator!= (const Utf16Char &character) const
 Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are different.
 
UnicodeCharReference & operator= (const UnicodeCharReference &characterReference)
 Assigns the contents of a UnicodeCharReference object to this object.

UnicodeCharReference & operator= (const Utf8CharReference &characterReference)
 Assigns the contents of a Utf8CharReference object to this object.

UnicodeCharReference & operator= (const Utf16CharReference &characterReference)
 Assigns the contents of a Utf16CharReference object to this object.

UnicodeCharReference & operator= (const UnicodeChar &character)
 Assigns the contents of a UnicodeChar object to this object.

UnicodeCharReference & operator= (const Utf8Char &character)
 Assigns the contents of a Utf8Char object to this object.

UnicodeCharReference & operator= (const Utf16Char &character)
 Assigns the contents of a Utf16Char object to this object.
 
 operator UnicodeChar () const
 Converts this object to a UnicodeChar object.

 operator Utf8Char () const
 Converts this object to a Utf8Char object.

 operator Utf16Char () const
 Converts this object to a Utf16Char object.

 operator Utf8CharReference () const
 Converts this object to a Utf8CharReference object.

 operator Utf16CharReference () const
 Converts this object to a Utf16CharReference object.

void assign_reference (const UnicodeCharReference &otherCharacterReference)
 Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference.

const UtfEncoding internal_encoding () const
 Indicates the internal encoding used by this character reference.

bool is_valid () const
 Indicates whether the character value of this reference is a valid Unicode character.

bool reference_equal (const UnicodeCharReference &otherCharacterReference)
 Compares this character reference to another character reference and tests whether the two references refer to the exact same character.

bool reference_not_equal (const UnicodeCharReference &otherCharacterReference)
 Compares this character reference to another character reference and tests whether the two references refer to different characters.

UInt32 to_utf_32 () const
 Converts the character value of this character reference to a UTF-32 code point.
 

Detailed Description

Provides a reference to an encoding-neutral Unicode character embedded in an encoding-neutral Unicode string.

Since a UnicodeString provides an interface that hides an encoding-specific string, a character is not pulled directly out of the UnicodeString but is constructed from the underlying encoding-specific string. This makes it impossible for a string indexer to return a direct reference to a character that can be assigned to.

This class therefore acts as a reference to a character inside a string: if the character reference is altered, the contents of the string are altered as well. This gives UnicodeString a read/write indexer instead of a read-only one.
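The idea of a character reference that writes back into its string can be sketched with a small proxy class. This is a minimal illustration only: ToyString and ToyCharRef are hypothetical names, not the UtfString classes, and the toy stores fixed-width UTF-32 units to stay short, whereas the real classes wrap a UTF-8 or UTF-16 string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class ToyString;  // hypothetical stand-in for a string that hides its storage

// Hypothetical proxy: remembers which string and which position it refers to.
class ToyCharRef {
public:
    ToyCharRef(ToyString &owner, std::size_t index) : owner_(&owner), index_(index) {}

    // Writing through the proxy changes the owning string.
    ToyCharRef &operator=(char32_t value);

    // Reading through the proxy builds the character value on demand.
    operator char32_t() const;

private:
    ToyString *owner_;
    std::size_t index_;
};

class ToyString {
public:
    explicit ToyString(std::u32string text) : text_(std::move(text)) {}

    // The indexer returns a proxy object rather than a plain C++ reference.
    ToyCharRef operator[](std::size_t index) { return ToyCharRef(*this, index); }

    const std::u32string &text() const { return text_; }

private:
    friend class ToyCharRef;
    std::u32string text_;
};

inline ToyCharRef &ToyCharRef::operator=(char32_t value) {
    owner_->text_[index_] = value;
    return *this;
}

inline ToyCharRef::operator char32_t() const {
    return owner_->text_[index_];
}
```

With these toy types, `s[0] = U'b';` modifies the string `s` itself, and `char32_t c = s[1];` reads a value out of it, which is the read/write behavior described above.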

UnicodeCharReference encapsulates an encoding-specific character reference. For example, if UnicodeString encapsulates a Utf8String, UnicodeCharReference will encapsulate a Utf8CharReference. UnicodeCharReference can only be cast to the same type as the encapsulated character reference. To find the underlying encoding, use the internal_encoding() member function.

Character references may become invalid if the contents of the string being referred to are modified in any way. If a string is modified by adding or removing characters, an associated character reference may then refer to another character or to somewhere outside the string. Attempting to use a character reference that no longer refers to a valid character will most likely cause a runtime error.

Since this is an encoding-neutral character reference, individual code units cannot be retrieved or set. You can only set an entire character (using a UTF-8 or UTF-16 character string, or the 32-bit code point value) or retrieve the character's 32-bit code point value. You can, however, cast the UnicodeCharReference to its underlying encoding to retrieve and set the code units directly.

You cannot cast a UnicodeCharReference to a character reference of a different encoding. The reason for this is that UnicodeCharReference encapsulates an encoding-specific reference to an encoding-specific string (which was encapsulated by the UnicodeString). It would be complex and rather inefficient to translate code units to and from the code units referred to by an encoding-specific reference, and doing so would require significant changes to Utf8CharReference and Utf16CharReference. It may be technically possible, but it's simply not worth it. If you need to address code units at all in a character reference, it's far easier and more efficient to use an encoding-specific string and encoding-specific character reference instead of UnicodeString and UnicodeCharReference.

Note that if you attempt to cast a UnicodeCharReference to a character reference of a different type than the encapsulated character reference, you will get either a crash or an assertion failure. Check the result of the internal_encoding() member function before casting to prevent this.
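The check-before-cast pattern can be sketched as follows. Everything here is a hypothetical stand-in (ToyUnicodeRef, ToyUtf8Ref, ToyUtf16Ref and the TOY_ENCODING_* values are not the library's types or constants), and the sketch assumes C++17 for std::variant; it only shows why testing the internal encoding first keeps the cast safe.

```cpp
#include <cassert>
#include <variant>

// Hypothetical encoding tags and encoding-specific references.
enum ToyEncoding { TOY_ENCODING_UTF8, TOY_ENCODING_UTF16 };
struct ToyUtf8Ref  { unsigned char *units; };
struct ToyUtf16Ref { char16_t *units; };

// Hypothetical encoding-neutral reference wrapping exactly one of the two.
class ToyUnicodeRef {
public:
    ToyUnicodeRef(ToyUtf8Ref ref) : ref_(ref) {}
    ToyUnicodeRef(ToyUtf16Ref ref) : ref_(ref) {}

    ToyEncoding internal_encoding() const {
        return std::holds_alternative<ToyUtf8Ref>(ref_) ? TOY_ENCODING_UTF8
                                                        : TOY_ENCODING_UTF16;
    }

    // Each cast is only meaningful for the matching internal encoding.
    operator ToyUtf8Ref() const {
        assert(internal_encoding() == TOY_ENCODING_UTF8);
        return std::get<ToyUtf8Ref>(ref_);
    }
    operator ToyUtf16Ref() const {
        assert(internal_encoding() == TOY_ENCODING_UTF16);
        return std::get<ToyUtf16Ref>(ref_);
    }

private:
    std::variant<ToyUtf8Ref, ToyUtf16Ref> ref_;
};

// Reads the first code unit, choosing the cast from the internal encoding.
inline unsigned first_code_unit(const ToyUnicodeRef &ref) {
    if (ref.internal_encoding() == TOY_ENCODING_UTF8) {
        ToyUtf8Ref utf8 = ref;   // safe: encoding was checked
        return utf8.units[0];
    }
    ToyUtf16Ref utf16 = ref;     // safe: only two encodings exist in this toy
    return utf16.units[0];
}
```

Callers that branch on the encoding first never reach the assertion inside the mismatched conversion operator.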

UnicodeChar and UnicodeCharReference objects can be assigned to each other or converted from one to the other.

Constructor & Destructor Documentation

UtfString::UnicodeCharReference::UnicodeCharReference ( const UnicodeCharReference & unicodeCharReference)

Initializes an instance of UnicodeCharReference using another UnicodeCharReference instance.

Parameters
[in] unicodeCharReference  The Unicode character reference to use in initializing this object

UtfString::UnicodeCharReference::UnicodeCharReference ( const Utf8CharReference & utf8CharReference)

Initializes an instance of UnicodeCharReference using a Utf8CharReference instance.

Parameters
[in] utf8CharReference  The UTF-8 character reference to use in initializing this object

UtfString::UnicodeCharReference::UnicodeCharReference ( const Utf16CharReference & utf16CharReference)

Initializes an instance of UnicodeCharReference using a Utf16CharReference instance.

Parameters
[in] utf16CharReference  The UTF-16 character reference to use in initializing this object

virtual UtfString::UnicodeCharReference::~UnicodeCharReference ( )

The destructor.

The destructor will clean up the encoding-specific string contained within this object.

Member Function Documentation

void UtfString::UnicodeCharReference::assign_reference ( const UnicodeCharReference & otherCharacterReference)

Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference.

Note that this function does not copy the value of the other character reference and assign it to the value of this character reference. To do that, use the assignment operator.

Parameters
[in] otherCharacterReference  The reference to be assigned to this reference.
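
The difference between rebinding a reference and assigning a value, and likewise between reference equality and value equality, can be illustrated with a toy type. ToyRef is a hypothetical stand-in that points into a std::u32string; it is not the library's implementation, only a sketch of the semantics described here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference: a (string, index) pair.
struct ToyRef {
    std::u32string *owner;
    std::size_t index;

    // Value assignment: copies the other character's value into the
    // character this reference points at. The reference itself is unchanged.
    ToyRef &operator=(const ToyRef &other) {
        (*owner)[index] = (*other.owner)[other.index];
        return *this;
    }

    // Rebinding: this reference now points at the same character as the other.
    // No character value is copied and no string is modified.
    void assign_reference(const ToyRef &other) {
        owner = other.owner;
        index = other.index;
    }

    // Value equality: compares the characters referred to.
    bool operator==(const ToyRef &other) const {
        return (*owner)[index] == (*other.owner)[other.index];
    }

    // Reference equality: same position in the same string.
    bool reference_equal(const ToyRef &other) const {
        return owner == other.owner && index == other.index;
    }
};
```

Given two ToyRef objects pointing at positions 0 and 2 of the same string, `a = b` changes the string and leaves them value-equal but not reference-equal, while `a.assign_reference(b)` leaves the string alone and makes them reference-equal.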
const UtfEncoding UtfString::UnicodeCharReference::internal_encoding ( ) const

Indicates the internal encoding used by this character reference.

The internal encoding of a UnicodeCharReference depends on the data used to initialize it. If a UnicodeCharReference is initialized with a UTF-8 character reference, the internal encoding will be UTF-8. Likewise, if it is initialized with a UTF-16 character reference, the internal encoding will be UTF-16. This is done to keep encoding conversions to a minimum. If an application is dealing primarily with one encoding, and a character in that encoding is put in a UnicodeString, we avoid the conversion to a specific internal encoding and then the conversion back to the original encoding.

bool UtfString::UnicodeCharReference::is_valid ( ) const

Indicates whether the character value of this reference is a valid Unicode character.

If this object is able to convert the code units given to it during initialization to a Unicode character, this character will be considered valid.

Returns
true if the code units in this character represent a valid Unicode character, otherwise false

UtfString::UnicodeCharReference::operator UnicodeChar ( ) const

Converts this object to a UnicodeChar object.

See Also
UnicodeChar::is_valid()

UtfString::UnicodeCharReference::operator Utf16Char ( ) const

Converts this object to a Utf16Char object.

This operator assumes that this character is a valid Unicode character.

See Also
UnicodeCharReference::is_valid()

UtfString::UnicodeCharReference::operator Utf16CharReference ( ) const

Converts this object to a Utf16CharReference object.

This conversion operator will only work if the underlying encoding is UTF-16; otherwise, an assertion failure or crash will result. This operator assumes that this reference has an internal encoding of ENCODING_UTF16.

See Also
UnicodeCharReference::internal_encoding()

UtfString::UnicodeCharReference::operator Utf8Char ( ) const

Converts this object to a Utf8Char object.

This operator assumes that this character is a valid Unicode character.

See Also
UnicodeCharReference::is_valid()

UtfString::UnicodeCharReference::operator Utf8CharReference ( ) const

Converts this object to a Utf8CharReference object.

This conversion operator will only work if the underlying encoding is UTF-8; otherwise, an assertion failure or crash will result. This operator assumes that this reference has an internal encoding of ENCODING_UTF8.

See Also
UnicodeCharReference::internal_encoding()

bool UtfString::UnicodeCharReference::operator!= ( const UnicodeCharReference & otherCharacterReference) const

Compares the value of this character reference to the value of another character reference and tests whether the two character values are different.

This operator compares the character values of the two references, not the references themselves. To test whether the two references refer to different characters, use the reference_not_equal() function.

Parameters
[in] otherCharacterReference  The character reference to be compared with this character reference
Returns
true if the two character values are different, otherwise false

bool UtfString::UnicodeCharReference::operator!= ( const Utf8CharReference & characterReference) const

Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are different.

This operator compares the character values of the two references, not the references themselves. To test whether the two references refer to different characters, use the reference_not_equal() function.

Parameters
[in] characterReference  The UTF-8 character reference to be compared with this character reference
Returns
true if the two character Unicode values are different, otherwise false

bool UtfString::UnicodeCharReference::operator!= ( const Utf16CharReference & characterReference) const

Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are different.

This operator compares the character values of the two references, not the references themselves. To test whether the two references refer to different characters, use the reference_not_equal() function.

Parameters
[in] characterReference  The UTF-16 character reference to be compared with this character reference
Returns
true if the two character Unicode values are different, otherwise false

bool UtfString::UnicodeCharReference::operator!= ( const UnicodeChar & character) const

Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are different.

Parameters
[in] character  The Unicode character to be compared with this character reference
Returns
true if the two character Unicode values are different, otherwise false

bool UtfString::UnicodeCharReference::operator!= ( const Utf8Char & character) const

Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are different.

Parameters
[in] character  The UTF-8 character to be compared with this character reference
Returns
true if the two character Unicode values are different, otherwise false

bool UtfString::UnicodeCharReference::operator!= ( const Utf16Char & character) const

Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are different.

Parameters
[in] character  The UTF-16 character to be compared with this character reference
Returns
true if the two character Unicode values are different, otherwise false

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const UnicodeCharReference & characterReference)

Assigns the contents of a UnicodeCharReference object to this object.

The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.

See Also
UnicodeCharReference::assign_reference()
Parameters
[in] characterReference  The UnicodeCharReference object whose contents are to be assigned to this object
Returns
A reference to this object

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const Utf8CharReference & characterReference)

Assigns the contents of a Utf8CharReference object to this object.

The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.

See Also
UnicodeCharReference::assign_reference()
Parameters
[in] characterReference  The Utf8CharReference object whose contents are to be assigned to this object
Returns
A reference to this object

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const Utf16CharReference & characterReference)

Assigns the contents of a Utf16CharReference object to this object.

The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.

See Also
UnicodeCharReference::assign_reference()
Parameters
[in] characterReference  The Utf16CharReference object whose contents are to be assigned to this object
Returns
A reference to this object

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const UnicodeChar & character)

Assigns the contents of a UnicodeChar object to this object.

The contents of the UnicodeChar object are copied to this object: they are not shared.

Parameters
[in] character  The UnicodeChar object whose contents are to be assigned to this object
Returns
A reference to this object

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const Utf8Char & character)

Assigns the contents of a Utf8Char object to this object.

Parameters
[in] character  The Utf8Char object whose contents are to be assigned to this object
Returns
A reference to this object

UnicodeCharReference& UtfString::UnicodeCharReference::operator= ( const Utf16Char & character)

Assigns the contents of a Utf16Char object to this object.

Parameters
[in] character  The Utf16Char object whose contents are to be assigned to this object
Returns
A reference to this object

bool UtfString::UnicodeCharReference::operator== ( const UnicodeCharReference & otherCharacterReference) const

Compares the value of this character reference to the value of another character reference and tests whether the two character values are the same.

This operator compares the character values of the two references. It does not check whether the references refer to the same character in the same string; to test for that kind of reference equality, use the reference_equal() function.

Parameters
[in] otherCharacterReference  The character reference to be compared with this character reference
Returns
true if the two character values are the same, otherwise false

bool UtfString::UnicodeCharReference::operator== ( const Utf8CharReference & characterReference) const

Compares the value of this character reference to the value of a UTF-8 character reference and tests whether the two character Unicode values are the same.

This operator compares the character values of the two references. It does not check whether the references refer to the same character in the same string; to test for that kind of reference equality, use the reference_equal() function.

Parameters
[in] characterReference  The UTF-8 character reference to be compared with this character reference
Returns
true if the two character Unicode values are the same, otherwise false

bool UtfString::UnicodeCharReference::operator== ( const Utf16CharReference & characterReference) const

Compares the value of this character reference to the value of a UTF-16 character reference and tests whether the two character Unicode values are the same.

This operator compares the character values of the two references. It does not check whether the references refer to the same character in the same string; to test for that kind of reference equality, use the reference_equal() function.

Parameters
[in] characterReference  The UTF-16 character reference to be compared with this character reference
Returns
true if the two character Unicode values are the same, otherwise false

bool UtfString::UnicodeCharReference::operator== ( const UnicodeChar & character) const

Compares the value of this character reference to the value of a Unicode character and tests whether the two character Unicode values are the same.

Parameters
[in] character  The Unicode character to be compared with this character reference
Returns
true if the two character Unicode values are the same, otherwise false

bool UtfString::UnicodeCharReference::operator== ( const Utf8Char & character) const

Compares the value of this character reference to the value of a UTF-8 character and tests whether the two character Unicode values are the same.

Parameters
[in] character  The UTF-8 character to be compared with this character reference
Returns
true if the two character Unicode values are the same, otherwise false

bool UtfString::UnicodeCharReference::operator== ( const Utf16Char & character) const

Compares the value of this character reference to the value of a UTF-16 character and tests whether the two character Unicode values are the same.

Parameters
[in] character  The UTF-16 character to be compared with this character reference
Returns
true if the two character Unicode values are the same, otherwise false

bool UtfString::UnicodeCharReference::reference_equal ( const UnicodeCharReference & otherCharacterReference)

Compares this character reference to another character reference and tests whether the two references refer to the exact same character.

This function checks whether the two references refer to the exact same character in the same string. It does not check whether the character values are equal; to test for value equality, use the == operator.

Parameters
[in] otherCharacterReference  The character reference to be compared with this character reference
Returns
true if this character reference refers to the exact same character as the other reference

bool UtfString::UnicodeCharReference::reference_not_equal ( const UnicodeCharReference & otherCharacterReference)

Compares this character reference to another character reference and tests whether the two references refer to different characters.

This function checks whether the two references refer to different characters. It compares the references themselves, not the character values. To test for value inequality, use the != operator.

Parameters
[in] otherCharacterReference  The character reference to be compared with this character reference
Returns
true if this character reference refers to a different character than the other reference

UInt32 UtfString::UnicodeCharReference::to_utf_32 ( ) const

Converts the character value of this character reference to a UTF-32 code point.

This function assumes that is_valid() is true. If empty() is true, this function will return a value of 0xFFFFFFFF.

Returns
This character as a UTF-32 code unit
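
For readers unfamiliar with how a UTF-32 code point relates to the underlying code units, the standard arithmetic can be sketched as below. The helper names are hypothetical and are not part of UtfString; the sketch assumes well-formed input and performs no validation, so it says nothing about how to_utf_32() itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: combines a UTF-16 surrogate pair into a code point.
// Assumes high is in 0xD800..0xDBFF and low is in 0xDC00..0xDFFF.
inline std::uint32_t code_point_from_utf16(char16_t high, char16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

// Hypothetical helper: decodes one well-formed UTF-8 sequence of
// count code units (1 to 4) into a code point.
inline std::uint32_t code_point_from_utf8(const unsigned char *units,
                                          std::size_t count) {
    if (count == 1) {
        return units[0];  // single-unit sequences are plain ASCII
    }
    // The lead unit contributes its low (7 - count) bits.
    std::uint32_t value = units[0] & (0xFFu >> (count + 1));
    // Each continuation unit contributes its low 6 bits.
    for (std::size_t i = 1; i < count; ++i) {
        value = (value << 6) | (units[i] & 0x3Fu);
    }
    return value;
}
```

For example, the UTF-8 units F0 9F 98 80 and the UTF-16 surrogate pair D83D DE00 both decode to the code point 0x1F600.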

The documentation for this class was generated from the following file: