From e4f7af595afe69581c99bf732e8a63fcfea49d64 Mon Sep 17 00:00:00 2001 From: Hongbo Zhang Date: Thu, 14 May 2020 14:02:01 +0800 Subject: [PATCH 1/3] new post about lazy encoding --- .../compiler/a-story-of-lazy-encoding.mdx | 122 ++++++++++++++++++ 1 file changed, 122 insertions(+) create mode 100644 _blogposts/compiler/a-story-of-lazy-encoding.mdx diff --git a/_blogposts/compiler/a-story-of-lazy-encoding.mdx b/_blogposts/compiler/a-story-of-lazy-encoding.mdx new file mode 100644 index 00000000..a2116a53 --- /dev/null +++ b/_blogposts/compiler/a-story-of-lazy-encoding.mdx @@ -0,0 +1,122 @@ +--- +author: hongbo +date: "2020-05-15 18:00" +previewImg: +category: compiler +title: New Lazy Encoding in BuckleScript +description: | + Highlights of our newest changes to the internal representation of exceptions + and how it will provide better stacktraces to our users. +canonical: https://bucklescript.github.io/blog/2020/05/06/exception-encoding +--- + +--- +title: Enhanced lazy encoding in BuckleScript +--- + + + +Recently we made some significant improvements with our new encoding for lazy values, and we find it so exciting that we want to highlight the changes. The new encoding generates very idiomatic JS output like hand-written code. + +For people who are not familiar with lazy evaluation, it is documented here: https://ocaml.org/releases/4.10/htmlman/expr.html#sss:expr-lazy. + +# What's the difference? + +Take this code snippet, for example: + +```reasonml +let lazy1 = lazy { + "Hello, lazy" -> Js.log; + 1 +}; // create a lazy value + +let lazy2 = lazy 3 ; // artifical lazy values for demo purpose + +Js.log2 (lazy1, lazy2); // logging the lazy values + +let (lazy la, lazy lb) = (lazy1, lazy2); // pattern match to force evaluation + +Js.log2 (la, lb); // logging forced values +``` + +Running this code in node, the output is as below: +```bash +lazy_demo$node src/lazy_demo.bs.js +[ [Function], tag: 246 ] 3 # logging the output of two lazy blocks +Hello, lazy # lazy1, laz2 evaluated forced by pattern match, hence logging +1 3 #logging the evaluated lazy block +``` + +With the new encoding, the output is as below: +```bash +{ RE_LAZY_DONE: false, value: [Function: value] } { RE_LAZY_DONE: true, value: 3 } # logging block one with new encoding +Hello, lazy +1 3 +``` + +As you can see, with the new encoding, no magic tags like 246 appear, and the lazy status is clearly marked via `RE_LAZY_DONE: (true | false) `. + +More than that, the generated code quality is also improved. In the old mode, the generated JS code was like this: + +```js +var lazy1 = Caml_obj.caml_lazy_make((function (param) { + console.log("Hello, lazy"); + return 1; + })); + +console.log(lazy1, 3); + +var la = CamlinternalLazy.force(lazy1); + +var lb = CamlinternalLazy.force(3); + +console.log(la, lb); + +var lazy2 = 3; +``` + +In the new mode, it is simplified: +```js +var lazy1 = { + RE_LAZY_DONE: false, + value: (function () { // closure now is uncurried arity-0 function + console.log("Hello, lazy"); + return 1; + }) +}; + +var lazy2 = { + RE_LAZY_DONE: true, + value: 3 +}; + +console.log(lazy1, lazy2); + +var la = CamlinternalLazy.force(lazy1); + +var lb = CamlinternalLazy.force(lazy2); + +console.log(la, lb); +``` + +## What changes did we make? + +In native, the encoding of lazy values is rather complicated: + +- It is an array, which is not friendly for debugging in JS context. +- It has some special tags which are not meaningful, for example, magic number 246, in JS context. +- It tries to unbox lazy values with the help of native GC. However, such complexity does not pay off in JS since the JSVM does not expose its GC semantics. + +So in the master, our encoding scheme is much simplified to take advantage of JS as much as possible: + +- The encoding is uniform; it is always an object of two key value pairs. One is `RE_LAZY_DONE` to mark its status, +the other is either a closure or an evaluated value. + +- The compiler optimization still kicks in at compile time: if it knows a lazy value is already evaluated or does not need to be evaluated, it will promote its status to be 'done'. However, unlike native, unboxing is not happening. This makes sense since the most interesting unboxing scenario happens in runtime instead of compile time where it is impossible in JSVM. + + +With the new encoding, `lazy` has a much nicer sugar, and we encourage you to use it whenever it is convenient! + +# Caveats: + +Don't rely on the special name `RE_LAZY_DONE` for JS interop; we may change it to a symbol in the future. From d6eefae5a56736e925fb0195066db9bd84010c1e Mon Sep 17 00:00:00 2001 From: Hongbo Zhang Date: Thu, 14 May 2020 20:58:34 +0800 Subject: [PATCH 2/3] Apply suggestions from code review Co-authored-by: Patrick Stapfer --- .../compiler/a-story-of-lazy-encoding.mdx | 26 +++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/_blogposts/compiler/a-story-of-lazy-encoding.mdx b/_blogposts/compiler/a-story-of-lazy-encoding.mdx index a2116a53..179a172d 100644 --- a/_blogposts/compiler/a-story-of-lazy-encoding.mdx +++ b/_blogposts/compiler/a-story-of-lazy-encoding.mdx @@ -14,15 +14,15 @@ canonical: https://bucklescript.github.io/blog/2020/05/06/exception-encoding title: Enhanced lazy encoding in BuckleScript --- - +## Introduction Recently we made some significant improvements with our new encoding for lazy values, and we find it so exciting that we want to highlight the changes. The new encoding generates very idiomatic JS output like hand-written code. For people who are not familiar with lazy evaluation, it is documented here: https://ocaml.org/releases/4.10/htmlman/expr.html#sss:expr-lazy. -# What's the difference? +## Comparison between the old and new lazy encoding -Take this code snippet, for example: +Let's take an example, and see how the old encoding of a lazy value would look like: ```reasonml let lazy1 = lazy { @@ -39,7 +39,7 @@ let (lazy la, lazy lb) = (lazy1, lazy2); // pattern match to force evaluation Js.log2 (la, lb); // logging forced values ``` -Running this code in node, the output is as below: +When compiled and run in `node`, the runtime representation of our lazy values will look something like: ```bash lazy_demo$node src/lazy_demo.bs.js [ [Function], tag: 246 ] 3 # logging the output of two lazy blocks @@ -47,7 +47,7 @@ Hello, lazy # lazy1, laz2 evaluated forced by pattern match, hence logging 1 3 #logging the evaluated lazy block ``` -With the new encoding, the output is as below: +With the new encoding, the output of the same example code would look like this: ```bash { RE_LAZY_DONE: false, value: [Function: value] } { RE_LAZY_DONE: true, value: 3 } # logging block one with new encoding Hello, lazy @@ -56,7 +56,7 @@ Hello, lazy As you can see, with the new encoding, no magic tags like 246 appear, and the lazy status is clearly marked via `RE_LAZY_DONE: (true | false) `. -More than that, the generated code quality is also improved. In the old mode, the generated JS code was like this: +In fact, the code quality of our generated `bs.js` files has also improved. Going back to our old version, the generated JS would look like this: ```js var lazy1 = Caml_obj.caml_lazy_make((function (param) { @@ -75,7 +75,7 @@ console.log(la, lb); var lazy2 = 3; ``` -In the new mode, it is simplified: +In our new version with all the new changes to the lazy encoding, the output is way more simplified: ```js var lazy1 = { RE_LAZY_DONE: false, @@ -101,22 +101,22 @@ console.log(la, lb); ## What changes did we make? -In native, the encoding of lazy values is rather complicated: +In the native runtime environment, the encoding of lazy values is rather complicated: - It is an array, which is not friendly for debugging in JS context. - It has some special tags which are not meaningful, for example, magic number 246, in JS context. -- It tries to unbox lazy values with the help of native GC. However, such complexity does not pay off in JS since the JSVM does not expose its GC semantics. +- It tries to unbox lazy values with the help of the native garbage collector (GC). However, this behavior does not make sense in a JS runtime environment since the JSVM does not expose its GC semantics. Keeping that behavior would only introduce more complexity for the JS side. -So in the master, our encoding scheme is much simplified to take advantage of JS as much as possible: +So in our current master branch, we drastically simplified our lazy encoding scheme to optimize for the JS runtime as much as possible: - The encoding is uniform; it is always an object of two key value pairs. One is `RE_LAZY_DONE` to mark its status, the other is either a closure or an evaluated value. -- The compiler optimization still kicks in at compile time: if it knows a lazy value is already evaluated or does not need to be evaluated, it will promote its status to be 'done'. However, unlike native, unboxing is not happening. This makes sense since the most interesting unboxing scenario happens in runtime instead of compile time where it is impossible in JSVM. +- The compiler optimization still kicks in at compile time: if it knows a lazy value is already evaluated or does not need to be evaluated, it will promote its status to be 'done'. However, unlike in a native environment, unboxing is not happening. This makes sense since the most interesting unboxing scenarios only happen during runtime and not during compile time (a scenario which is impossible in the JSVM). -With the new encoding, `lazy` has a much nicer sugar, and we encourage you to use it whenever it is convenient! +With the new encoding, `lazy` is now way more viable for JS usage, so we encourage our users to use it whenever it is convenient! We will also provide some more details about the `lazy` values in our documentation soon. -# Caveats: +## Caveats: Don't rely on the special name `RE_LAZY_DONE` for JS interop; we may change it to a symbol in the future. From c63843cfecd5a35634372225089b8de3306f3389 Mon Sep 17 00:00:00 2001 From: Hongbo Zhang Date: Thu, 14 May 2020 21:03:00 +0800 Subject: [PATCH 3/3] address other comments --- _blogposts/compiler/a-story-of-lazy-encoding.mdx | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/_blogposts/compiler/a-story-of-lazy-encoding.mdx b/_blogposts/compiler/a-story-of-lazy-encoding.mdx index 179a172d..d226fffe 100644 --- a/_blogposts/compiler/a-story-of-lazy-encoding.mdx +++ b/_blogposts/compiler/a-story-of-lazy-encoding.mdx @@ -5,14 +5,10 @@ previewImg: category: compiler title: New Lazy Encoding in BuckleScript description: | - Highlights of our newest changes to the internal representation of exceptions - and how it will provide better stacktraces to our users. -canonical: https://bucklescript.github.io/blog/2020/05/06/exception-encoding + Highlights of our newest changes to the internal representation of lazy values + and how it will benefit our users. --- ---- -title: Enhanced lazy encoding in BuckleScript ---- ## Introduction