Function utf8ByteSequenceLength
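Given the first byte of a UTF-8 encoded codepoint,
returns a number from 1 to 4 indicating the total length of the encoded sequence in bytes.
If the byte does not match the form of a UTF-8 start byte (for example, a continuation byte in the range 0x80...0xBF), returns error.Utf8InvalidStartByte.
Only the bit pattern of the first byte is checked; the bytes that follow are not validated.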
Prototype
pub fn utf8ByteSequenceLength(first_byte: u8) !u3
Parameters
first_byte: u8
Source
pub fn utf8ByteSequenceLength(first_byte: u8) !u3 {
    // The switch is optimized much better than a "smart" approach using @clz
    return switch (first_byte) {
        0b0000_0000...0b0111_1111 => 1,
        0b1100_0000...0b1101_1111 => 2,
        0b1110_0000...0b1110_1111 => 3,
        0b1111_0000...0b1111_0111 => 4,
        else => error.Utf8InvalidStartByte,
    };
}
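Example
A minimal sketch of one way this function might be used: stepping through a byte slice one encoded sequence at a time to count codepoints. The helper countCodepoints and the error name TruncatedSequence are hypothetical and not part of the standard library. The sketch assumes the function is reached as std.unicode.utf8ByteSequenceLength, and it does not validate continuation bytes.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): counts codepoints in `bytes` by
/// jumping from start byte to start byte. Continuation bytes are skipped
/// without being validated.
fn countCodepoints(bytes: []const u8) !usize {
    var count: usize = 0;
    var i: usize = 0;
    while (i < bytes.len) {
        // Fails with error.Utf8InvalidStartByte if bytes[i] is not a start byte.
        const len = try std.unicode.utf8ByteSequenceLength(bytes[i]);
        // Hypothetical error: the final sequence runs past the end of the slice.
        if (i + len > bytes.len) return error.TruncatedSequence;
        i += len;
        count += 1;
    }
    return count;
}
```

For fully validated decoding, a stricter check of the continuation bytes would still be needed; this sketch relies only on the length reported for each start byte.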