BOM Gone? #553

Description

@r2d2Proton

As best I can tell, Loader.cs, StorageFileLoader.cs, and LottieCompositionReader.cs in Lottie-Windows make an effort to process different UTF and non-UTF files:

public static LottieComposition? ReadLottieCompositionFromJsonStream(Stream stream, Options options, out IReadOnlyList<(string Code, string Description)> issues)
{
    ReadStreamToUTF8(stream, out var utf8Text);
    return ReadLottieCompositionFromJson(utf8Text, options, out issues);
}

static void ReadStreamToUTF8(Stream stream, out ReadOnlySpan<byte> utf8Text)
{
    // This buffer size is chosen to be about 50% larger than
    // the average file size in our corpus, so most of the time
    // we don't need to reallocate and copy.
    var buffer = new byte[150000];
    var bytesRead = stream.Read(buffer, 0, buffer.Length);
    var spaceLeftInBuffer = buffer.Length - bytesRead;

    while (spaceLeftInBuffer == 0)
    {
        // Might be more to read. Expand the buffer.
        var newBuffer = new byte[buffer.Length * 2];
        spaceLeftInBuffer = buffer.Length;
        var totalBytesRead = buffer.Length;
        Array.Copy(buffer, 0, newBuffer, 0, totalBytesRead);
        buffer = newBuffer;
        bytesRead = stream.Read(buffer, totalBytesRead, buffer.Length - totalBytesRead);
        spaceLeftInBuffer -= bytesRead;
    }

    utf8Text = new ReadOnlySpan<byte>(buffer);
    NormalizeTextToUTF8(ref utf8Text);
}

static void NormalizeTextToUTF8(ref ReadOnlySpan<byte> text)
{
    if (text.Length >= 1)
    {
        switch (text[0])
        {
            case 0xEF:
                // Possibly start of UTF8 BOM.
                if (text.Length >= 3 && text[1] == 0xBB && text[2] == 0xBF)
                {
                    // UTF8 BOM. Step over the UTF8 BOM.
                    text = text.Slice(3, text.Length - 3);
                }
                break;  
        }
    }
}

As best I can tell, when loading UTF-8 files with:

var filePicker = new FileOpenPicker{};
StorageFile? file = await filePicker.PickSingleFileAsync();

The BOM has already been consumed by some earlier function before this code is called, so the buffer begins directly at the "{" of the JSON.
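One way to confirm this is to probe the first three bytes of the stream before it is handed to the reader. The following is only a minimal sketch: `BomProbe` and both of its methods are hypothetical names, not part of Lottie-Windows, and the stream is rewound only when it is seekable.

```csharp
using System;
using System.IO;

static class BomProbe
{
    // Hypothetical helper: true when the data starts with the UTF-8 BOM (EF BB BF).
    public static bool StartsWithUtf8Bom(ReadOnlySpan<byte> data) =>
        data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;

    // Reads up to three bytes from the current position, then restores the
    // position if the stream supports seeking.
    public static bool StreamStartsWithUtf8Bom(Stream stream)
    {
        var head = new byte[3];
        var start = stream.CanSeek ? stream.Position : 0;
        var read = 0;

        // A single Read call may return fewer bytes than requested, so loop.
        while (read < head.Length)
        {
            var n = stream.Read(head, read, head.Length - read);
            if (n == 0)
            {
                break; // End of stream.
            }

            read += n;
        }

        if (stream.CanSeek)
        {
            stream.Position = start;
        }

        return StartsWithUtf8Bom(head.AsSpan(0, read));
    }
}
```

Calling `BomProbe.StreamStartsWithUtf8Bom(stream)` right after the file is opened, and again just before `ReadStreamToUTF8`, would show whether the BOM is missing from the file itself or is dropped somewhere in between.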

Simplified version:

static void ReadStreamToUTF8(Stream stream, out ReadOnlySpan<byte> utf8Text)
{
    // Size the buffer from the stream length so no reallocation is needed.
    // This assumes a seekable stream; stream.Length throws otherwise.
    var buffer = new byte[stream.Length];
    var bytesRead = stream.Read(buffer, 0, buffer.Length);
    // Expose only the bytes actually read; Read may return fewer than requested.
    utf8Text = new ReadOnlySpan<byte>(buffer, 0, bytesRead);
    NormalizeTextToUTF8(ref utf8Text);
}
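The simplified version relies on `stream.Length`, which throws `NotSupportedException` on streams that cannot seek. As a hedged alternative sketch, the whole stream can be copied into a `MemoryStream` instead. `StreamBytes.ReadAllBytes` is a hypothetical helper name, not something from Lottie-Windows.

```csharp
using System;
using System.IO;

static class StreamBytes
{
    // Hypothetical helper: reads from the current position to the end of the
    // stream without needing stream.Length, so it also works when the stream
    // is not seekable.
    public static byte[] ReadAllBytes(Stream stream)
    {
        using (var copy = new MemoryStream())
        {
            stream.CopyTo(copy);
            return copy.ToArray();
        }
    }
}
```

The result could then be wrapped with `new ReadOnlySpan<byte>(bytes)` and passed to `NormalizeTextToUTF8`. The trade-off is one extra copy in `ToArray()`, in exchange for not having to guess a buffer size.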
