.NET stores strings as UTF-16, but many systems expect or rely on UTF-8.
The new u8 suffix, applied to a string literal, produces the text as a UTF-8 byte sequence, specifically a ReadOnlySpan<byte>. This means the literal can be passed directly to APIs that expect UTF-8, with no conversion at runtime.
Code
C#
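// With the u8 suffix: the literal is already UTF-8 encoded.
byte[] bytes = "Hello"u8.ToArray();      // copy into a new array
ReadOnlySpan<byte> span = "Hello"u8;     // use the bytes in place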
C#
// Without the u8 suffix: the same UTF-8 bytes for "Hello", written out by hand.
byte[] asBytes = new byte[] { 72, 101, 108, 108, 111 };
ReadOnlySpan<byte> asSpan = new byte[] { 72, 101, 108, 108, 111 };
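To illustrate passing a u8 literal to an API that expects UTF-8, here is a minimal sketch. It assumes C# 11 or later and uses a MemoryStream as a stand-in for whatever UTF-8 consumer you actually have (a socket, a file, an HTTP body). The variable names are hypothetical and not from the original example. It also encodes the same text with Encoding.UTF8.GetBytes, the runtime conversion the suffix lets you skip, and checks that both routes yield the same bytes.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-in for any API that consumes UTF-8 bytes (assumption for illustration).
using var stream = new MemoryStream();

// Stream.Write has an overload taking ReadOnlySpan<byte>,
// so the u8 literal is passed as-is.
stream.Write("Hello"u8);

// The runtime alternative: encode a UTF-16 string to UTF-8 bytes.
byte[] encoded = Encoding.UTF8.GetBytes("Hello");

// Both routes should produce identical bytes.
Console.WriteLine(stream.ToArray().AsSpan().SequenceEqual(encoded)); // True
```

Because the literal is a ReadOnlySpan<byte>, it works with any overload that takes a span. Call ToArray() only when an API requires a byte[].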