UtfString
UtfString::Utf8CharReference Class Reference

Provides a reference to a UTF-8 character embedded in a UTF-8 string.

#include <Utf8CharReference.h>

Public Member Functions

 Utf8CharReference (std::string &codeUnitString, std::string::iterator &basicStringIterator, size_t codeUnitCount)
 Initializes an instance of Utf8CharReference using an iterator pointing to some code units and a count of the code units that comprise the character.
 
 Utf8CharReference (const Utf8CharReference &otherCharacterReference)
 This copy constructor initializes an instance of Utf8CharReference that is an exact copy of another Utf8CharReference object.
 
bool operator== (const Utf8CharReference &otherCharacterReference) const
 Compares the value of this character reference to the value of another character reference and tests whether the two references contain the same character values.
 
bool operator!= (const Utf8CharReference &otherCharacterReference) const
 Compares the value of this character reference to the value of another character reference and tests whether the two references contain different character values.
 
bool operator== (const Utf8Char &character) const
 Compares the value of this character reference to the value of a character and tests whether both contain the same character values.
 
bool operator!= (const Utf8Char &character) const
 Compares the value of this character reference to the value of a character and tests whether both contain different character values.
 
Utf8CharReference & operator= (const Utf8CharReference &otherCharacterReference)
 Assigns the contents of another Utf8CharReference object to this object.
 
Utf8CharReference & operator= (const Utf8Char &utf8Char)
 Assigns the contents of a Utf8Char object to this object.
 
char & operator[] (const size_t index)
 Returns the code unit found at the specified index.
 
const char & operator[] (const size_t index) const
 Returns the code unit found at the specified index.
 
 operator Utf8Char () const
 Converts this object to a Utf8Char object.
 
 operator Utf16Char () const
 Converts this object to a Utf16Char object.
 
void assign_reference (const Utf8CharReference &otherCharacterReference)
 Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference.
 
bool is_valid () const
 Indicates whether this character is a valid UTF-8 character.
 
bool reference_equal (const Utf8CharReference &otherCharacterReference)
 Compares this character reference to another character reference and tests whether the two references refer to the exact same character.
 
bool reference_not_equal (const Utf8CharReference &otherCharacterReference)
 Compares this character reference to another character reference and tests whether the two references refer to different characters.
 
UInt32 to_utf_32 () const
 Converts this character to a UTF-32 code unit.
 
size_t size () const
 Returns the number of code units in this character.
 

Friends

std::istream & operator>> (std::istream &inputStream, Utf8CharReference &utf8CharReference)
 This operator converts a stream of 8-bit values to a UTF-8 character, and assigns it to a character reference.
 
std::ostream & operator<< (std::ostream &outputStream, const Utf8CharReference &utf8CharReference)
 This operator converts the character referred to by a UTF-8 character reference to a stream of 8-bit values.
 

Detailed Description

Provides a reference to a UTF-8 character embedded in a UTF-8 string.

Since a Utf8String provides an interface that hides the individual code units, a character is not pulled directly out of the Utf8String but is constructed from the underlying code units. This makes it impossible for a string indexer to return a character reference that can be assigned to.

This class therefore acts as a reference to a character inside a string: if the character reference is altered, the contents of the string are altered as well. This allows Utf8String to provide a read/write indexer instead of a read-only one.

Character references may become invalid if the contents of the referenced string are modified in any way. If a string is modified by adding or removing characters, an associated character reference may then refer to another character or to somewhere outside the string. Attempting to use a character reference that no longer refers to a valid character will most likely cause a runtime error.

The only way to add or remove code units in a character reference is to assign it the value of another character reference (using the assignment operator) or the value of a character. The code units can be read and written individually, but the number of code units cannot be changed through individual manipulation. Attempting to access a nonexistent code unit results in an assertion failure in a debug build and undefined behavior in a non-debug build.

Utf8Char and Utf8CharReference objects can be assigned to each other or converted from one to the other.
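
The write-through behavior described above can be illustrated with a minimal sketch. The class below is a hypothetical, simplified stand-in (CharRefSketch and its members are assumptions made for illustration, not the library's implementation): it tracks a byte offset and a code unit count into a std::string, and assigning to it rewrites the string in place.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-in for a character reference (not the
// library's code). It refers to `count` code units starting at byte
// `offset` of a std::string.
class CharRefSketch {
public:
    CharRefSketch(std::string &codeUnits, std::size_t offset, std::size_t count)
        : str_(&codeUnits), offset_(offset), count_(count) {}

    // Assigning an encoded character rewrites the referenced bytes in place;
    // the string grows or shrinks if the new character has a different length.
    CharRefSketch &operator=(const std::string &encodedChar) {
        str_->replace(offset_, count_, encodedChar);
        count_ = encodedChar.size();
        return *this;
    }

    // Read/write access to one code unit; the count cannot change this way.
    char &operator[](std::size_t index) {
        assert(index < count_);
        return (*str_)[offset_ + index];
    }

    std::string value() const { return str_->substr(offset_, count_); }
    std::size_t size() const { return count_; }

private:
    std::string *str_;
    std::size_t offset_;
    std::size_t count_;
};
```

In this sketch, replacing a two-unit character with a three-unit one shifts every later byte of the string, which is why any other reference into the same string may no longer refer to the character it was created for.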

Constructor & Destructor Documentation

UtfString::Utf8CharReference::Utf8CharReference ( std::string &  codeUnitString,
std::string::iterator &  basicStringIterator,
size_t  codeUnitCount 
)

Initializes an instance of Utf8CharReference using an iterator pointing to some code units and a count of the code units that comprise the character.

This constructor assumes that codeUnitCount does not exceed the maximum number of code units allowed by the UTF-8 encoding.

Parameters
[in]  codeUnitString       The string containing the code units being referenced by this character reference
[in]  basicStringIterator  An iterator pointing to the code units to be stored in this character reference
[in]  codeUnitCount        The number of code units that comprise this character

UtfString::Utf8CharReference::Utf8CharReference ( const Utf8CharReference & otherCharacterReference)

This copy constructor initializes an instance of Utf8CharReference that is an exact copy of another Utf8CharReference object.

Parameters
[in]  otherCharacterReference  The Utf8CharReference object that is to be exactly copied

Member Function Documentation

void UtfString::Utf8CharReference::assign_reference ( const Utf8CharReference & otherCharacterReference)

Assigns another character reference to this character reference, causing this character reference to refer to the exact same character as the other reference.

Note that this function does not copy the value of the other character reference and assign it to the value of this character reference. To do that, use the assignment operator.

Parameters
[in]  otherCharacterReference  The reference to be assigned to this reference.
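
The difference between rebinding a reference and assigning a value can be sketched as follows. MiniRef is a hypothetical miniature written for this illustration (it is not the library's class, and it handles single-byte characters only); it only mirrors the documented contrast between the assignment operator and assign_reference().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical miniature reference (not the library's class); it handles
// single-byte characters only, to keep the two operations easy to compare.
struct MiniRef {
    std::string *str;
    std::size_t offset;

    // Value assignment: copy the other reference's character into the slot
    // this reference points at. The target string changes; the binding does not.
    MiniRef &operator=(const MiniRef &other) {
        assert(offset < str->size() && other.offset < other.str->size());
        (*str)[offset] = (*other.str)[other.offset];
        return *this;
    }

    // Rebinding: point at the same slot as the other reference.
    // No string contents are modified.
    void assign_reference(const MiniRef &other) {
        str = other.str;
        offset = other.offset;
    }
};
```

With two strings "abc" and "xyz", value-assigning a reference to the last character of the second string into a reference to the first character of the first string changes the first string to "zbc", whereas assign_reference() leaves both strings untouched and only changes what the reference points at.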
bool UtfString::Utf8CharReference::is_valid ( ) const

Indicates whether this character is a valid UTF-8 character.

Returns
true if the code units in this character represent a valid UTF-8 character, otherwise false
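
The exact criteria applied by is_valid() are not stated here. As a rough sketch of what validating a single UTF-8 character usually involves, the hypothetical free function below (is_valid_utf8_char is an assumed name, and the rules are those of RFC 3629 rather than anything confirmed for this library) checks the lead byte against the length, the continuation bytes, overlong forms, surrogates, and the upper code point limit.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Sketch of a validity check for one UTF-8 encoded character, following the
// RFC 3629 rules. Not the library's implementation.
bool is_valid_utf8_char(const std::string &units) {
    const std::size_t n = units.size();
    if (n < 1 || n > 4) return false;

    // The lead byte determines how many code units the character must have.
    const unsigned char lead = static_cast<unsigned char>(units[0]);
    std::uint32_t cp = 0;
    std::size_t expected = 0;
    if (lead < 0x80)                { expected = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { expected = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { expected = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { expected = 4; cp = lead & 0x07; }
    else return false;  // stray continuation byte or invalid lead byte
    if (expected != n) return false;

    // Every following byte must look like 10xxxxxx.
    for (std::size_t i = 1; i < n; ++i) {
        const unsigned char c = static_cast<unsigned char>(units[i]);
        if ((c & 0xC0) != 0x80) return false;
        cp = (cp << 6) | (c & 0x3F);
    }

    // Reject overlong encodings, UTF-16 surrogates, and out-of-range values.
    static const std::uint32_t minForLength[5] = {0, 0, 0x80, 0x800, 0x10000};
    if (cp < minForLength[n]) return false;
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;
    return cp <= 0x10FFFF;
}
```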
UtfString::Utf8CharReference::operator Utf16Char ( ) const

Converts this object to a Utf16Char object.

This operator assumes that this character is a valid UTF-8 character.

See Also
Utf8CharReference::is_valid()
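
How the library performs this conversion is not shown here. A common approach is to decode the UTF-8 character to a code point and then encode that code point as UTF-16. The second step is sketched below; to_utf16_units is a hypothetical helper name, and the function assumes its input is a valid Unicode scalar value, which matches the validity assumption stated above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes a code point as one or two UTF-16 code units.
// Assumes codePoint is a valid scalar value (not a surrogate, <= 0x10FFFF).
std::vector<std::uint16_t> to_utf16_units(std::uint32_t codePoint) {
    assert(codePoint <= 0x10FFFF);
    assert(codePoint < 0xD800 || codePoint > 0xDFFF);

    std::vector<std::uint16_t> units;
    if (codePoint < 0x10000) {
        // Basic Multilingual Plane: a single code unit.
        units.push_back(static_cast<std::uint16_t>(codePoint));
    } else {
        // Supplementary planes: split 20 bits across a surrogate pair.
        const std::uint32_t v = codePoint - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)));
    }
    return units;
}
```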
bool UtfString::Utf8CharReference::operator!= ( const Utf8CharReference & otherCharacterReference) const

Compares the value of this character reference to the value of another character reference and tests whether the two references contain different character values.

This operator compares the character values of the two references and reports whether they differ. It does not check whether the two references refer to different characters. To test for that type of reference inequality, use the reference_not_equal() function.

Parameters
[in]  otherCharacterReference  The character reference to be compared with this character reference
Returns
true if the two character values are different, otherwise false

bool UtfString::Utf8CharReference::operator!= ( const Utf8Char & character) const

Compares the value of this character reference to the value of a character and tests whether both contain different character values.

Parameters
[in]  character  The character to be compared with this character reference
Returns
true if the two character values are different, otherwise false

Utf8CharReference& UtfString::Utf8CharReference::operator= ( const Utf8CharReference & otherCharacterReference)

Assigns the contents of another Utf8CharReference object to this object.

The contents of the other reference are assigned to this reference. This does not cause this reference to refer to the same character as the other reference. To have this reference refer to the exact same character as another reference, use the assign_reference() function.

Note that when the contents of this object are changed, the change is propagated to the referenced Utf8String object.

Parameters
[in]  otherCharacterReference  The other Utf8CharReference object whose contents are to be assigned to this object
Returns
A reference to this object

Utf8CharReference& UtfString::Utf8CharReference::operator= ( const Utf8Char & utf8Char)

Assigns the contents of a Utf8Char object to this object.

Note that when the contents of this object are changed, the change is propagated to the referenced Utf8String object.

Parameters
[in]  utf8Char  The Utf8Char object whose contents are to be assigned to this object
Returns
A reference to this object

bool UtfString::Utf8CharReference::operator== ( const Utf8CharReference & otherCharacterReference) const

Compares the value of this character reference to the value of another character reference and tests whether the two references contain the same character values.

This operator compares the character values of the two references and reports whether they are equal. It does not check whether the references refer to the same character in the same string. To test for that type of reference equality, use the reference_equal() function.

Parameters
[in]  otherCharacterReference  The character reference to be compared with this character reference
Returns
true if the two character values are the same, otherwise false

bool UtfString::Utf8CharReference::operator== ( const Utf8Char & character) const

Compares the value of this character reference to the value of a character and tests whether both contain the same character values.

Parameters
[in]  character  The character to be compared with this character reference
Returns
true if the two character values are the same, otherwise false

char& UtfString::Utf8CharReference::operator[] ( const size_t  index)

Returns the code unit found at the specified index.

This operator does not check for the validity of the index, so it assumes that index is less than the maximum number of code units allowed by the UTF-8 encoding, and that index < size().

Parameters
[in]  index  The index identifying the code unit to be retrieved
Returns
The code unit found at the specified index

const char& UtfString::Utf8CharReference::operator[] ( const size_t  index) const

Returns the code unit found at the specified index.

This operator does not check for the validity of the index, so it assumes that index is less than the maximum number of code units allowed by the UTF-8 encoding, and that index < size().

Parameters
[in]  index  The index identifying the code unit to be retrieved
Returns
The code unit found at the specified index

bool UtfString::Utf8CharReference::reference_equal ( const Utf8CharReference & otherCharacterReference)

Compares this character reference to another character reference and tests whether the two references refer to the exact same character.

This function checks whether the two references refer to the exact same character in the same Utf8String. It does not check whether the character values are equal. To test for value equality, use the == operator.

Parameters
[in]  otherCharacterReference  The character reference to be compared with this character reference
Returns
true if this character reference refers to the exact same character as the other reference, otherwise false

bool UtfString::Utf8CharReference::reference_not_equal ( const Utf8CharReference & otherCharacterReference)

Compares this character reference to another character reference and tests whether the two references refer to different characters.

This function checks whether the two references refer to different characters. It compares the references themselves, not the character values. To test for value inequality, use the != operator.

Parameters
[in]  otherCharacterReference  The character reference to be compared with this character reference
Returns
true if this character reference refers to a different character than the other reference, otherwise false
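
The two notions of equality documented for this class (value equality through == and !=, reference equality through reference_equal() and reference_not_equal()) can be contrasted in a small sketch. SlotRef and the two free functions are hypothetical names used only for illustration, and the sketch identifies a character by its string and byte offset, which is an assumption about how identity could be defined.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reference to `count` code units at `offset` within a string.
struct SlotRef {
    const std::string *str;
    std::size_t offset;
    std::size_t count;

    std::string value() const { return str->substr(offset, count); }
};

// Value equality: do both references hold the same code units?
bool value_equal(const SlotRef &a, const SlotRef &b) {
    return a.value() == b.value();
}

// Reference equality: do both references point at the same place
// in the same string?
bool reference_equal(const SlotRef &a, const SlotRef &b) {
    return a.str == b.str && a.offset == b.offset;
}
```

For the string "aba", references to the first and last characters are value-equal but not reference-equal, while a copy of the first reference is equal to it in both senses.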
UInt32 UtfString::Utf8CharReference::to_utf_32 ( ) const

Converts this character to a UTF-32 code unit.

This function assumes that size() is from 1 to 4.

Returns
This character as a UTF-32 code unit
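
As a sketch of what such a conversion involves, the hypothetical function below (utf8_to_utf32 is an assumed name, not the library's code) decodes one to four UTF-8 code units into a 32-bit value. Like the documented function, it assumes the size is from 1 to 4, and it performs no validity checks beyond that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decodes a single UTF-8 character (1 to 4 code units) to a UTF-32 value.
// Assumes the input is well formed; only the length is asserted.
std::uint32_t utf8_to_utf32(const std::string &units) {
    const std::size_t n = units.size();
    assert(n >= 1 && n <= 4);

    // Payload bits carried by the lead byte for each possible length.
    static const unsigned char leadMask[5] = {0x00, 0x7F, 0x1F, 0x0F, 0x07};

    std::uint32_t cp = static_cast<unsigned char>(units[0]) & leadMask[n];
    for (std::size_t i = 1; i < n; ++i) {
        // Each continuation byte contributes its low six bits.
        cp = (cp << 6) | (static_cast<unsigned char>(units[i]) & 0x3F);
    }
    return cp;
}
```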

Friends And Related Function Documentation

std::ostream& operator<< ( std::ostream &  outputStream,
const Utf8CharReference &  utf8CharReference 
)
friend

This operator converts the character referred to by a UTF-8 character reference to a stream of 8-bit values.

No checks for validity are done, so the resulting UTF-8 stream may or may not contain a valid UTF-8 character.

Parameters
[in]  outputStream       The output stream to which the contents of the UTF-8 character are to be written
[in]  utf8CharReference  A reference to the UTF-8 character to be written to the output stream

std::istream& operator>> ( std::istream &  inputStream,
Utf8CharReference &  utf8CharReference 
)
friend

This operator converts a stream of 8-bit values to a UTF-8 character, and assigns it to a character reference.

This function clears the contents of utf8CharReference before the stream is converted. In addition, this function assumes that the stream being converted is of the same endianness as the machine on which this function was compiled.

Parameters
[in]  inputStream        The input stream containing 8-bit values to be converted to a UTF-8 character
[in]  utf8CharReference  The character reference object to which the converted UTF-8 character will be assigned
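
The library's extraction logic is not reproduced here. One plausible way to read a single UTF-8 character from a byte stream is to inspect the lead byte, derive the expected length from it, and then read that many continuation bytes. The sketch below does this into a plain std::string; read_utf8_char, utf8_length_from_lead, and first_utf8_char are hypothetical helpers, and the failure handling (setting failbit, leaving the output empty) is an assumption.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Number of code units implied by a UTF-8 lead byte, or 0 if the byte
// cannot start a character.
std::size_t utf8_length_from_lead(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

// Reads one UTF-8 character into `out`, clearing `out` first.
// On any failure `out` is left empty and the stream is in a failed state.
std::istream &read_utf8_char(std::istream &in, std::string &out) {
    out.clear();
    char c;
    if (!in.get(c)) return in;
    const std::size_t n = utf8_length_from_lead(static_cast<unsigned char>(c));
    if (n == 0) {
        in.setstate(std::ios_base::failbit);
        return in;
    }
    out.push_back(c);
    for (std::size_t i = 1; i < n; ++i) {
        if (!in.get(c)) {  // stream ended in the middle of a character
            out.clear();
            return in;
        }
        out.push_back(c);
    }
    return in;
}

// Convenience wrapper: extracts the first character of an in-memory buffer.
std::string first_utf8_char(const std::string &bytes) {
    std::istringstream in(bytes);
    std::string ch;
    read_utf8_char(in, ch);
    return ch;
}
```

Like the documented stream operators, this sketch does not verify that the continuation bytes are well formed; a validity check would have to be applied to the result separately.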

The documentation for this class was generated from the following file: