GLib Reference Manual | |||
---|---|---|---|
gboolean g_unichar_isupper (gunichar c); |
Determines if a character is uppercase.
c : | a unicode character |
Returns : | TRUE if the character is uppercase. |
gboolean g_unichar_isxdigit (gunichar c); |
Determines if a character is a hexadecimal digit.
c : | a unicode character. |
Returns : | TRUE if the character is a hexadecimal digit. |
gboolean g_unichar_istitle (gunichar c); |
Determines if a character is titlecase. Some characters in Unicode which are composites, such as the DZ digraph, have three case variants instead of just two. The titlecase form is used at the beginning of a word where only the first letter is capitalized. The titlecase form of the DZ digraph is U+01F2 LATIN CAPITAL LETTER D WITH SMALL LETTER Z.
c : | a unicode character |
Returns : | TRUE if the character is titlecase. |
gboolean g_unichar_isdefined (gunichar c); |
Determines if a given character is assigned in the Unicode standard.
c : | a unicode character |
Returns : | TRUE if the character has an assigned value. |
gboolean g_unichar_iswide (gunichar c); |
Determines if a character is typically rendered in a double-width cell.
c : | a unicode character |
Returns : | TRUE if the character is wide. |
gunichar g_unichar_toupper (gunichar c); |
Convert a character to uppercase.
c : | a unicode character |
Returns : | the result of converting c to uppercase. If c is not a lowercase or titlecase character, c is returned unchanged. |
gunichar g_unichar_tolower (gunichar c); |
Convert a character to lowercase.
c : | a unicode character. |
Returns : | the result of converting c to lowercase. If c is not an uppercase or titlecase character, c is returned unchanged. |
gunichar g_unichar_totitle (gunichar c); |
Convert a character to titlecase.
c : | a unicode character |
Returns : | the result of converting c to titlecase. If c is not an uppercase or lowercase character, c is returned unchanged. |
gint g_unichar_xdigit_value (gunichar c); |
Determines the numeric value of a character as a hexadecimal digit.
c : | a unicode character |
Returns : | If c is a hex digit (according to g_unichar_isxdigit()), its numeric value. Otherwise, -1. |
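The return convention described above (digit value, or -1 for a non-digit) can be illustrated with a small standalone sketch. The helper name `sketch_xdigit_value` is hypothetical and it only covers ASCII digits; it is not GLib's implementation, and g_unichar_xdigit_value() may accept more characters than this.

```c
#include <stdint.h>

/* Hypothetical, ASCII-only illustration of the "value or -1" convention.
 * Not GLib code: it ignores any non-ASCII hex digits. */
int sketch_xdigit_value(uint32_t c)
{
    if (c >= '0' && c <= '9')
        return (int)(c - '0');
    if (c >= 'a' && c <= 'f')
        return (int)(c - 'a') + 10;
    if (c >= 'A' && c <= 'F')
        return (int)(c - 'A') + 10;
    return -1; /* not a hexadecimal digit */
}
```

A caller would typically test the result against -1 before using it as a digit value.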
typedef enum { G_UNICODE_CONTROL, G_UNICODE_FORMAT, G_UNICODE_UNASSIGNED, G_UNICODE_PRIVATE_USE, G_UNICODE_SURROGATE, G_UNICODE_LOWERCASE_LETTER, G_UNICODE_MODIFIER_LETTER, G_UNICODE_OTHER_LETTER, G_UNICODE_TITLECASE_LETTER, G_UNICODE_UPPERCASE_LETTER, G_UNICODE_COMBINING_MARK, G_UNICODE_ENCLOSING_MARK, G_UNICODE_NON_SPACING_MARK, G_UNICODE_DECIMAL_NUMBER, G_UNICODE_LETTER_NUMBER, G_UNICODE_OTHER_NUMBER, G_UNICODE_CONNECT_PUNCTUATION, G_UNICODE_DASH_PUNCTUATION, G_UNICODE_CLOSE_PUNCTUATION, G_UNICODE_FINAL_PUNCTUATION, G_UNICODE_INITIAL_PUNCTUATION, G_UNICODE_OTHER_PUNCTUATION, G_UNICODE_OPEN_PUNCTUATION, G_UNICODE_CURRENCY_SYMBOL, G_UNICODE_MODIFIER_SYMBOL, G_UNICODE_MATH_SYMBOL, G_UNICODE_OTHER_SYMBOL, G_UNICODE_LINE_SEPARATOR, G_UNICODE_PARAGRAPH_SEPARATOR, G_UNICODE_SPACE_SEPARATOR } GUnicodeType; |
GUnicodeType g_unichar_type (gunichar c); |
Classifies a unicode character by type.
c : | a unicode character |
Returns : | the type of the character. |
void g_unicode_canonical_ordering (gunichar *string, size_t len); |
string : | |
len : |
gunichar* g_unicode_canonical_decomposition (gunichar ch, size_t *result_len); |
ch : | |
result_len : | |
Returns : |
gunichar g_utf8_get_char (const gchar *p); |
Convert a sequence of bytes encoded as UTF-8 to a unicode character.
p : | a pointer to a unicode character encoded as UTF-8 |
Returns : | the resulting character or (gunichar)-1 if p does not point to a valid UTF-8 encoded unicode character |
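As a rough illustration of what decoding one UTF-8 sequence involves, here is a standalone sketch. The name `sketch_utf8_get_char` is hypothetical, the code handles only sequences of one to four bytes, and it does not reject overlong encodings; it is an assumption-laden sketch, not the GLib implementation.

```c
#include <stdint.h>

/* Hypothetical sketch: decode the UTF-8 sequence starting at p.
 * Returns the code point, or (uint32_t)-1 if the lead byte or a
 * continuation byte is malformed. Handles 1- to 4-byte sequences only
 * and does not check for overlong encodings. */
uint32_t sketch_utf8_get_char(const char *p)
{
    const unsigned char *s = (const unsigned char *)p;
    uint32_t c;
    int extra, i;

    if (s[0] < 0x80)                { c = s[0];        extra = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; extra = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; extra = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; extra = 3; }
    else return (uint32_t)-1;       /* stray continuation or invalid lead byte */

    for (i = 1; i <= extra; i++) {
        /* Each continuation byte must look like 10xxxxxx; a nul stops here too. */
        if ((s[i] & 0xC0) != 0x80)
            return (uint32_t)-1;
        c = (c << 6) | (s[i] & 0x3F);
    }
    return c;
}
```

Because a nul byte never matches the continuation pattern, the loop does not read past the terminator of a nul-terminated string.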
gchar* g_utf8_offset_to_pointer (const gchar *str, gint offset); |
Converts from an integer character offset to a pointer to a position within the string.
str : | a UTF-8 encoded string |
offset : | a character offset within the string. |
Returns : | the resulting pointer |
gint g_utf8_pointer_to_offset (const gchar *str, const gchar *pos); |
Converts from a pointer to a position within a string to an integer character offset.
str : | a UTF-8 encoded string |
pos : | a pointer to a position within str |
Returns : | the resulting character offset |
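The relationship between character offsets and byte pointers can be sketched without GLib as follows. Both helper names are hypothetical, and stopping at the nul terminator in the first helper is an assumption made for safety in this sketch rather than documented behaviour.

```c
#include <stddef.h>

/* Hypothetical sketch: advance `offset` characters from str.
 * A character is one lead byte plus any 10xxxxxx continuation bytes.
 * Assumption: the walk stops early at the nul terminator. */
const char *sketch_offset_to_pointer(const char *str, int offset)
{
    const char *p = str;

    while (offset-- > 0 && *p != '\0') {
        p++;                                        /* skip the lead byte */
        while (((unsigned char)*p & 0xC0) == 0x80)  /* skip continuation bytes */
            p++;
    }
    return p;
}

/* Hypothetical sketch: count the characters that start in [str, pos). */
int sketch_pointer_to_offset(const char *str, const char *pos)
{
    int n = 0;
    const char *p;

    for (p = str; p < pos; p++)
        if (((unsigned char)*p & 0xC0) != 0x80)     /* count only lead bytes */
            n++;
    return n;
}
```

The two helpers are inverses of each other as long as the pointer refers to the start of a character.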
gchar* g_utf8_prev_char (const gchar *p); |
Find the previous UTF-8 character in the string before p.
p does not have to be at the beginning of a UTF-8 character. No check is made to see if the character found is actually valid other than it starts with an appropriate byte. If p might be the first character of the string, you must use g_utf8_find_prev_char instead.
p : | a pointer to a position within a UTF-8 encoded string |
Returns : | a pointer to the found character. |
gchar* g_utf8_find_next_char (const gchar *p, const gchar *end); |
Find the start of the next UTF-8 character in the string after p.
p does not have to be at the beginning of a UTF-8 character. No check is made to see if the character found is actually valid other than it starts with an appropriate byte.
p : | a pointer to a position within a UTF-8 encoded string |
end : | a pointer to the end of the string, or NULL to indicate that the string is NULL terminated, in which case the returned value will be |
Returns : | a pointer to the found character or NULL |
gchar* g_utf8_find_prev_char (const gchar *str, const gchar *p); |
Given a position p within a UTF-8 encoded string str, find the start of the previous UTF-8 character starting before p. Returns NULL if no UTF-8 characters are present in str before p.
p does not have to be at the beginning of a UTF-8 character. No check is made to see if the character found is actually valid other than it starts with an appropriate byte.
str : | pointer to the beginning of a UTF-8 string |
p : | pointer to some position within str |
Returns : | a pointer to the found character or NULL. |
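The backward scan described above can be sketched in a few lines of standalone C. The name `sketch_find_prev_char` is hypothetical; the sketch only looks for a byte that is not a continuation byte, mirroring the statement that no further validity check is made.

```c
#include <stddef.h>

/* Hypothetical sketch: step back from p, never moving before str,
 * until a byte that is not a 10xxxxxx continuation byte is found.
 * Returns NULL if no character starts in [str, p). */
const char *sketch_find_prev_char(const char *str, const char *p)
{
    while (p > str) {
        --p;
        if (((unsigned char)*p & 0xC0) != 0x80)
            return p;   /* lead byte or ASCII byte: start of a character */
    }
    return NULL;
}
```

Checking `p > str` before each decrement is what makes this form safe when p may point at the first character of the string.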
gint g_utf8_strlen (const gchar *p, gint max); |
p : | pointer to the start of a UTF-8 string. |
max : | the maximum number of bytes to examine. If max is less than 0, then the string is assumed to be nul-terminated. |
Returns : | the length of the string in characters |
gchar* g_utf8_strncpy (gchar *dest, const gchar *src, size_t n); |
dest : | |
src : | |
n : | |
Returns : |
gchar* g_utf8_strchr (const gchar *p, gunichar c); |
Find the leftmost occurrence of the given ISO 10646 character in a UTF-8 string.
p : | a nul-terminated UTF-8 string |
c : | an ISO 10646 character |
Returns : | NULL if the string does not contain the character, otherwise, a pointer to the start of the leftmost occurrence of the character in the string. |
gchar* g_utf8_strrchr (const gchar *p, gunichar c); |
Find the rightmost occurrence of the given ISO 10646 character in a UTF-8 string.
p : | a nul-terminated UTF-8 string |
c : | an ISO 10646 character |
Returns : | NULL if the string does not contain the character, otherwise, a pointer to the start of the rightmost occurrence of the character in the string. |
gboolean g_utf8_validate (const gchar *str, gint max_len, const gchar **end); |
Validates UTF-8 encoded text. str is the text to validate; if str is nul-terminated, then max_len can be -1, otherwise max_len should be the number of bytes to validate. If end is non-NULL, then the end of the valid range will be stored there (i.e. the address of the first invalid byte if some bytes were invalid, or the end of the text being validated otherwise).
Returns TRUE if all of str was valid. Many GLib and GTK+ routines require valid UTF-8 as input, so data read from a file or the network should be checked with g_utf8_validate() before doing anything else with it.
str : | a pointer to character data |
max_len : | max bytes to validate, or -1 to go until nul |
end : | return location for end of valid data |
Returns : | TRUE if the text was valid UTF-8. |
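To make the roles of max_len and end concrete, the following standalone sketch shows one way such a check could be written. The name `sketch_utf8_validate` is hypothetical, and the rules it applies (sequences of at most four bytes, no overlong forms, no surrogates, nothing above U+10FFFF) are assumptions based on modern UTF-8, not a description of what g_utf8_validate() actually enforces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: returns 1 if the text is well-formed UTF-8,
 * 0 otherwise. If end is non-NULL it receives the end of the valid
 * prefix (the start of the first bad sequence, or the end of the text). */
int sketch_utf8_validate(const char *str, int max_len, const char **end)
{
    const unsigned char *s = (const unsigned char *)str;
    size_t i = 0;
    int ok = 1;

    while (max_len < 0 ? s[i] != '\0' : i < (size_t)max_len) {
        uint32_t c, min;
        size_t extra, k;

        if (s[i] < 0x80)                { c = s[i];        extra = 0; min = 0; }
        else if ((s[i] & 0xE0) == 0xC0) { c = s[i] & 0x1F; extra = 1; min = 0x80; }
        else if ((s[i] & 0xF0) == 0xE0) { c = s[i] & 0x0F; extra = 2; min = 0x800; }
        else if ((s[i] & 0xF8) == 0xF0) { c = s[i] & 0x07; extra = 3; min = 0x10000; }
        else { ok = 0; break; }

        /* The whole sequence must fit inside the max_len window. */
        if (max_len >= 0 && i + extra >= (size_t)max_len) { ok = 0; break; }

        for (k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) { ok = 0; break; }
            c = (c << 6) | (s[i + k] & 0x3F);
        }
        if (!ok)
            break;

        /* Reject overlong forms, surrogates, and out-of-range values. */
        if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
            ok = 0;
            break;
        }
        i += extra + 1;
    }

    if (end != NULL)
        *end = str + i;
    return ok;
}
```

On failure, the bytes between str and the stored end pointer are still usable, which allows a caller to keep the valid prefix and discard or report the rest.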
gunichar* g_utf8_to_ucs4 (const gchar *str, gint len); |
Convert a string from UTF-8 to a 32-bit fixed width representation as UCS-4.
str : | a UTF-8 encoded string |
len : | the length of str |
Returns : | a pointer to a newly allocated UCS-4 string. This value must be freed with g_free() |
gchar* g_ucs4_to_utf8 (const gunichar *str, gint len); |
Convert a string from a 32-bit fixed width representation as UCS-4 to UTF-8.
str : | a UCS-4 encoded string |
len : | the length of str |
Returns : | a pointer to a newly allocated UTF-8 string. This value must be freed with g_free() |
gint g_unichar_to_utf8 (gunichar c, char *outbuf); |
Convert a single character to UTF-8.
c : | an ISO 10646 character code |
outbuf : | output buffer, must have at least 6 bytes of space. If NULL, the length will be computed and returned and nothing will be written to outbuf. |
Returns : | number of bytes written |
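The 6-byte buffer requirement follows from the original UTF-8 scheme, in which a 31-bit value can take up to six bytes. The sketch below illustrates that encoding, including the NULL-buffer case that only computes the length. The name `sketch_unichar_to_utf8` is hypothetical, and the code assumes c is below 2^31.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: encode c (assumed < 2^31) as UTF-8 using the
 * original 1- to 6-byte scheme. If outbuf is NULL, only the length is
 * computed. No nul terminator is written. */
int sketch_unichar_to_utf8(uint32_t c, char *outbuf)
{
    unsigned char first;
    int len, i;

    if (c < 0x80)           { first = 0x00; len = 1; }
    else if (c < 0x800)     { first = 0xC0; len = 2; }
    else if (c < 0x10000)   { first = 0xE0; len = 3; }
    else if (c < 0x200000)  { first = 0xF0; len = 4; }
    else if (c < 0x4000000) { first = 0xF8; len = 5; }
    else                    { first = 0xFC; len = 6; }

    if (outbuf != NULL) {
        /* Fill continuation bytes from the end, six bits at a time. */
        for (i = len - 1; i > 0; i--) {
            outbuf[i] = (char)((c & 0x3F) | 0x80);
            c >>= 6;
        }
        outbuf[0] = (char)(c | first);  /* remaining bits plus length marker */
    }
    return len;
}
```

Calling the function once with a NULL buffer and once with a real buffer is a simple way to size an allocation before encoding.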