Fix drop_start on JavaScript for multi-byte strings #925

Open
jtdowney wants to merge 1 commit into gleam-lang:main from jtdowney:fix-drop-start-js

Conversation

@jtdowney
Member

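`string_byte_slice` used `String.prototype.slice`, which operates on UTF-16 code units, but it is called with UTF-8 byte offsets computed by `byte_size`. For any string containing multi-byte characters the two kinds of offset disagree, so the wrong range was returned. The fix encodes the string to UTF-8, slices the byte array, and decodes the result back to a string.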
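A minimal sketch of the mismatch and of the fix described above. The function names `byteSliceBuggy` and `byteSliceFixed` and the `(string, index, length)` parameter shape are assumptions for illustration, not the exact code in this PR.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Old behaviour (hypothetical name): the offsets are UTF-8 byte offsets,
// but String.prototype.slice interprets them as UTF-16 code unit indices.
function byteSliceBuggy(string, index, length) {
  return string.slice(index, index + length);
}

// Fixed behaviour (hypothetical name): encode to UTF-8, slice the bytes,
// then decode the slice back into a string.
function byteSliceFixed(string, index, length) {
  const bytes = encoder.encode(string);
  return decoder.decode(bytes.slice(index, index + length));
}

// "h\u00e9llo" is 5 UTF-16 code units but 6 UTF-8 bytes,
// because U+00E9 takes two bytes in UTF-8.
const input = "h\u00e9llo";
const totalBytes = encoder.encode(input).length; // 6
const prefixBytes = encoder.encode("h\u00e9").length; // 3

console.log(byteSliceBuggy(input, prefixBytes, totalBytes - prefixBytes)); // "lo"
console.log(byteSliceFixed(input, prefixBytes, totalBytes - prefixBytes)); // "llo"
```

With byte offsets, dropping the two-grapheme prefix should leave `"llo"`; the code-unit slice starts one position too late and returns `"lo"`.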

This is a simple fix, but it has a downside: the string argument to `drop_start` is now encoded to UTF-8 twice, once for `byte_size` and again for `string_byte_slice`. A better path might be to lift `drop_start` to a native implementation in JS.
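One way such a native implementation could look, as a rough sketch only. The name `dropStartNative` is hypothetical, and the sketch assumes `Intl.Segmenter` is available in the target runtime. It stays in UTF-16 code units throughout, so it never encodes to UTF-8.

```javascript
// Hypothetical native drop_start: drop the first `numGraphemes` grapheme
// clusters by finding the code unit index where the remainder begins.
function dropStartNative(string, numGraphemes) {
  if (numGraphemes <= 0) return string;

  // Assumption: Intl.Segmenter exists in this runtime.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

  let remaining = numGraphemes;
  for (const { index } of segmenter.segment(string)) {
    // `index` is the UTF-16 code unit offset of this grapheme, which is
    // exactly the unit String.prototype.slice expects.
    if (remaining === 0) return string.slice(index);
    remaining -= 1;
  }

  // The string has no more graphemes than were asked to be dropped.
  return "";
}

console.log(dropStartNative("h\u00e9llo", 2)); // "llo"
console.log(dropStartNative("\u{1F44D}ab", 1)); // "ab"
```

The trade-off is that the slicing logic moves into the JavaScript FFI instead of being shared with the other target, in exchange for avoiding both UTF-8 encoding passes.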

Closes #924

`string_byte_slice` used `String.prototype.slice`, which operates on
UTF-16 code units, but it is called with UTF-8 byte offsets from
`byte_size`. Encode to UTF-8, slice the byte array, then decode back.

Development

Successfully merging this pull request may close these issues.

string.drop_start behaves wrong on the JS target
