UTF8 functions for working with strings


A Utf8 buffer object can be created with:

Lib.Sys.Utf8.new(size)

Allocate a new Utf8 buffer, optionally reserving size bytes.

size - size in bytes to reserve. If nil, no space is reserved for the buffer in advance


The created Utf8 object has the following methods:

addChar(c)

Add the given UTF8 character code to the buffer.

c - utf8 char code (int)

toString()

Returns the buffer converted to a String.
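
As a rough illustration of what such a buffer does, the following pure-Lua sketch turns each character code into 1 to 4 bytes and appends them. The names encodeCodePoint and newBuffer are hypothetical and not part of Lib.Sys.Utf8; this is not the library's implementation, only the encoding arithmetic it presumably relies on.

```lua
-- Hypothetical helper (not part of Lib.Sys.Utf8):
-- encode a single code point as a UTF8 byte sequence.
local function encodeCodePoint(c)
    if c < 0x80 then
        return string.char(c)
    elseif c < 0x800 then
        return string.char(0xC0 + math.floor(c / 0x40),
                           0x80 + c % 0x40)
    elseif c < 0x10000 then
        return string.char(0xE0 + math.floor(c / 0x1000),
                           0x80 + math.floor(c / 0x40) % 0x40,
                           0x80 + c % 0x40)
    else
        return string.char(0xF0 + math.floor(c / 0x40000),
                           0x80 + math.floor(c / 0x1000) % 0x40,
                           0x80 + math.floor(c / 0x40) % 0x40,
                           0x80 + c % 0x40)
    end
end

-- Hypothetical stand-in for a Utf8 buffer object.
local function newBuffer()
    local parts = {}
    return {
        addChar = function(c) parts[#parts + 1] = encodeCodePoint(c) end,
        toString = function() return table.concat(parts) end,
    }
end

local buf = newBuffer()
buf.addChar(8021)  -- U+1F55, encoded as three bytes
buf.addChar(955)   -- U+03BB, encoded as two bytes
local result = buf.toString()
print(#result)     -- byte length of the result
```

The sketch does not reject codes outside the Unicode range; whether addChar does so is not stated here.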



Static methods:

Lib.Sys.Utf8.encode(s)

Encode the input ISO string s into the corresponding UTF8 one.

s - ISO string

Lib.Sys.Utf8.decode(s)

Decode a UTF8 string s back to an ISO string. An exception is thrown if s contains a UTF8 character that the decoder does not support.

s - UTF8 string
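
The conversion itself can be sketched in pure Lua, assuming that "ISO" means ISO-8859-1 (Latin-1), which this page does not state explicitly. The helper names latin1ToUtf8 and utf8ToLatin1 are hypothetical; the sketch only shows why decoding can fail while encoding cannot.

```lua
-- Hypothetical helper: every Latin-1 byte above 127 becomes a two-byte UTF8 sequence.
local function latin1ToUtf8(s)
    return (s:gsub("[\128-\255]", function(ch)
        local b = ch:byte()
        return string.char(0xC0 + math.floor(b / 0x40), 0x80 + b % 0x40)
    end))
end

-- Hypothetical helper: only code points up to 255 fit back into Latin-1,
-- so any lead byte other than 0xC2 or 0xC3 raises an error.
local function utf8ToLatin1(s)
    local out = {}
    local i = 1
    while i <= #s do
        local b = s:byte(i)
        if b < 0x80 then
            out[#out + 1] = string.char(b)
            i = i + 1
        elseif b == 0xC2 or b == 0xC3 then
            local b2 = s:byte(i + 1)
            if not b2 or b2 < 0x80 or b2 > 0xBF then
                error("malformed UTF8 sequence at byte " .. i)
            end
            out[#out + 1] = string.char((b - 0xC0) * 0x40 + (b2 - 0x80))
            i = i + 2
        else
            error("character at byte " .. i .. " is not representable in ISO-8859-1")
        end
    end
    return table.concat(out)
end

local latin = "caf\233"                  -- "café" in Latin-1
local encoded = latin1ToUtf8(latin)      -- the last character now takes two bytes
local decoded = utf8ToLatin1(encoded)    -- round trip back to Latin-1
local ok = pcall(utf8ToLatin1, "\206\187")  -- U+03BB has no Latin-1 equivalent
print(#latin, #encoded, decoded == latin, ok)
```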

Lib.Sys.Utf8.iter(s, chars)

Call the chars function for each UTF8 char of the string s.

s - UTF8 string

chars - function for iteration
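
What the iteration amounts to can be sketched in pure Lua as follows. The sketch assumes that the callback receives the numeric character code (consistent with addChar taking an int, though this page does not say so) and that s is valid UTF8. The name iterCodePoints is hypothetical.

```lua
-- Hypothetical helper: call fn once per UTF8 character with its numeric code.
-- Assumes s is well-formed UTF8; no validation is performed.
local function iterCodePoints(s, fn)
    local i = 1
    while i <= #s do
        local b = s:byte(i)
        local code, extra
        if b < 0x80 then
            code, extra = b, 0          -- single-byte (ASCII) character
        elseif b < 0xE0 then
            code, extra = b - 0xC0, 1   -- lead byte of a two-byte sequence
        elseif b < 0xF0 then
            code, extra = b - 0xE0, 2   -- lead byte of a three-byte sequence
        else
            code, extra = b - 0xF0, 3   -- lead byte of a four-byte sequence
        end
        -- Each continuation byte contributes six more bits.
        for k = 1, extra do
            code = code * 0x40 + (s:byte(i + k) - 0x80)
        end
        fn(code)
        i = i + extra + 1
    end
end

local codes = {}
iterCodePoints("\225\189\149\206\187a", function(code)
    codes[#codes + 1] = code
end)
print(table.concat(codes, " "))
```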

Lib.Sys.Utf8.charCodeAt(s, index)

Returns the character code at position index of the UTF8 string s.

If index is negative or exceeds Utf8.length(s), nil is returned.

s - UTF8 string

index - char position in s string

Lib.Sys.Utf8.validate(s)

Tells if the String s is correctly encoded as UTF8.

s - any string
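
One possible set of checks behind such a validation is sketched below in pure Lua. How strict Lib.Sys.Utf8.validate really is (for example about overlong encodings or surrogates) is not stated on this page, so the rules here are an assumption. The name isValidUtf8 is hypothetical.

```lua
-- Hypothetical helper: strict structural check of a byte string.
local function isValidUtf8(s)
    local i, n = 1, #s
    while i <= n do
        local b = s:byte(i)
        local code, extra, min
        if b < 0x80 then
            code, extra, min = b, 0, 0
        elseif b >= 0xC2 and b <= 0xDF then
            code, extra, min = b - 0xC0, 1, 0x80
        elseif b >= 0xE0 and b <= 0xEF then
            code, extra, min = b - 0xE0, 2, 0x800
        elseif b >= 0xF0 and b <= 0xF4 then
            code, extra, min = b - 0xF0, 3, 0x10000
        else
            return false  -- stray continuation byte or invalid lead byte
        end
        if i + extra > n then
            return false  -- sequence is cut off at the end of the string
        end
        for k = 1, extra do
            local cb = s:byte(i + k)
            if cb < 0x80 or cb > 0xBF then
                return false  -- not a continuation byte
            end
            code = code * 0x40 + (cb - 0x80)
        end
        if code < min or code > 0x10FFFF then
            return false  -- overlong encoding or beyond the Unicode range
        end
        if code >= 0xD800 and code <= 0xDFFF then
            return false  -- surrogate code points are not valid in UTF8
        end
        i = i + extra + 1
    end
    return true
end

print(isValidUtf8("\225\189\149"))  -- complete three-byte sequence
print(isValidUtf8("\225\189"))      -- same sequence with the last byte missing
```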

Lib.Sys.Utf8.length(s)

Returns the number of UTF8 chars of the string s.

s - UTF8 string
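
The difference between this character count and the byte count returned by string.len can be illustrated with a small pure-Lua sketch: every byte that is not a continuation byte (0x80 to 0xBF) starts a new character. The name utf8Length is hypothetical, and the sketch assumes valid UTF8 input.

```lua
-- Hypothetical helper: count UTF8 characters by counting non-continuation bytes.
local function utf8Length(s)
    local count = 0
    for i = 1, #s do
        local b = s:byte(i)
        if b < 0x80 or b > 0xBF then
            count = count + 1
        end
    end
    return count
end

-- The bytes of "ὕαλον", written as escapes so the sketch does not depend on file encoding.
local s = "\225\189\149\206\177\206\187\206\191\206\189"
local byteCount = #s
local charCount = utf8Length(s)
print(byteCount, charCount)
```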

Lib.Sys.Utf8.compare(a, b)

Compare two UTF8 strings a and b, character by character.

a - UTF8 string

b - UTF8 string

Lib.Sys.Utf8.sub(s, pos, len)

Returns len characters of this string s, starting at position pos.

If len is omitted, all characters from position pos to the end of this string are included.

If pos is negative, its value is calculated from the end of the string as Utf8.length(s) + pos. If this yields a negative value, 0 is used instead.

If the calculated position + len exceeds Utf8.length(s), the characters from that position to the end of the string are returned.

If len is negative, the result is unspecified.

s - UTF8 string

pos - char position

len - requested length
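
The position rules above can be written out as a short pure-Lua sketch. The name normalizeRange is hypothetical; it works on character counts only and returns the effective start position and length. Clamping a pos beyond the end of the string is an assumption, since this page does not cover that case, and a negative len is left unhandled because its result is unspecified.

```lua
-- Hypothetical helper: apply the documented pos/len rules to a string of
-- `length` characters and return the effective position and length.
local function normalizeRange(length, pos, len)
    if pos < 0 then
        pos = length + pos          -- count from the end of the string
        if pos < 0 then
            pos = 0                 -- still negative: start at the beginning
        end
    end
    if pos > length then
        pos = length                -- assumption: not covered by the description
    end
    if len == nil or pos + len > length then
        len = length - pos          -- take everything up to the end
    end
    return pos, len
end

print(normalizeRange(5, 1, 2))    -- plain case
print(normalizeRange(5, -2))      -- negative pos, len omitted
print(normalizeRange(5, -10, 3))  -- negative pos that underflows
print(normalizeRange(5, 3, 10))   -- len reaching past the end
```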


Examples:


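local Utf8 = Lib.Sys.Utf8  -- short alias for the module


local s = "ὕαλον"

print(s)

print(string.len(s))  -- length in bytes, not in characters


print(Utf8.validate(s))  -- checks that s is correctly encoded UTF8

print(Utf8.compare(s, "ὕαλον"))  -- compares s with an identical string

print(Utf8.length(s))  -- length in UTF8 characters


Utf8.iter(s, function(char)  -- called once for each UTF8 character of s

    print(char)

end)


print(Utf8.charCodeAt(s, 0))  -- code of the character at position 0

print(Utf8.sub(s, 1, 2))  -- two characters starting at position 1


local utf8 = Lib.Sys.Utf8.new(nil)  -- nil: no space is reserved in advance

utf8.addChar(8021)  -- 8021 is U+1F55 (ὕ)

utf8.addChar(955)  -- 955 is U+03BB (λ)

print(utf8.toString())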
