UTF-8 string literals (C# 11.0, performance)

Allow literal strings to be stored as UTF-8 instead of UTF-16.

.NET stores strings in UTF-16 format; however, many systems expect or rely upon UTF-8.

The new u8 suffix tells the compiler to store a string literal as a UTF-8 byte sequence; the resulting expression has the type ReadOnlySpan<byte>. This means that the literal can be passed to APIs that expect UTF-8 without needing to be converted at run time.
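As a rough illustration of the point above, the following sketch compares a run-time conversion with a u8 literal and then writes the literal to a stream. It assumes C# 11 on .NET 7 or later and top-level statements; the variable names and the choice of MemoryStream as the "API that expects UTF-8" are illustrative only.

```csharp
using System;
using System.IO;
using System.Text;

// Without u8: the UTF-16 string is encoded to UTF-8 at run time.
byte[] converted = Encoding.UTF8.GetBytes("Hello");

// With u8: the compiler emits the UTF-8 bytes, so no run-time encoding is needed.
ReadOnlySpan<byte> literal = "Hello"u8;

// Both approaches yield the same byte sequence.
bool same = literal.SequenceEqual(converted);
Console.WriteLine(same); // True

// Stream.Write has an overload that takes ReadOnlySpan<byte>,
// so the literal can be written directly.
using var stream = new MemoryStream();
stream.Write(literal);
stream.Write(" world"u8);

// Decode the written bytes again to confirm they are valid UTF-8.
string roundTrip = Encoding.UTF8.GetString(stream.ToArray());
Console.WriteLine(roundTrip); // Hello world
```

Any API that accepts ReadOnlySpan<byte> could stand in for the stream here; the point is only that no Encoding call is needed on the u8 path.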

Code

C#
// C# 11: the u8 suffix yields the UTF-8 bytes of the literal.
byte[] bytes = "Hello"u8.ToArray();

ReadOnlySpan<byte> span = "Hello"u8;
C#
// Equivalent without the u8 suffix: the UTF-8 byte values of "Hello" written out by hand.
byte[] asBytes = new byte[] { 72, 101, 108, 108, 111 };

ReadOnlySpan<byte> asSpan = new byte[] { 72, 101, 108, 108, 111 };
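
The two listings above are meant to produce the same bytes. The sketch below checks that, and adds one non-ASCII case to show that the byte count of a u8 literal can differ from the character count of the string. It again assumes C# 11 on .NET 7 or later; the "caf\u00e9" sample text is a made-up example, not taken from the original.

```csharp
using System;

// The u8 literal from the first listing...
ReadOnlySpan<byte> span = "Hello"u8;
// ...and the hand-written byte values from the second listing.
byte[] asBytes = new byte[] { 72, 101, 108, 108, 111 };

bool identical = span.SequenceEqual(asBytes);
Console.WriteLine(identical); // True

// Outside ASCII, a single character can need more than one byte.
// "\u00e9" (e with acute accent) encodes to the two bytes 0xC3 0xA9 in UTF-8.
int charCount = "caf\u00e9".Length;   // 4 UTF-16 code units
int byteCount = "caf\u00e9"u8.Length; // 5 UTF-8 bytes
Console.WriteLine($"{charCount} chars, {byteCount} bytes"); // 4 chars, 5 bytes
```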
