Commit 897488f

authored
docs: Comprehensive tgpu.unroll docs (#2334)
1 parent 1e3456c commit 897488f

2 files changed

Lines changed: 215 additions & 5 deletions


apps/typegpu-docs/src/content/docs/fundamentals/utils.mdx

Lines changed: 212 additions & 2 deletions
@@ -296,7 +296,7 @@ const processNeighbors = (cell: d.v2i) => {
296296

297297
## *tgpu.unroll*
298298

299-
For code with small, fixed iteration counts, you can use `tgpu.unroll` to unroll loops at compile time. This eliminates branch prediction overhead and can significantly improve performance.
299+
For code with small, fixed iteration counts, you can use `tgpu.unroll` to unroll loops at compile time.
300300

301301
### Usage
302302

@@ -387,5 +387,215 @@ fn fbm(pos: vec3f) -> f32 {
387387

388388
:::note
389389
- There are no constraints on how large a loop can be for unrolling. We will always try to unroll it, and if we can't, you'll receive an error.
390-
- You cannot use `continue` or `break` inside loop that you intend to unroll later.
390+
- You cannot use `continue` or `break` inside a loop that you intend to unroll later.
391391
:::
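The `continue` restriction in the note above can often be worked around by inverting the skip condition into a guard. The sketch below is plain TypeScript rather than TypeGPU code, and the function names are hypothetical. It only shows that the two loop shapes compute the same result; it assumes that an `if` inside the loop body is acceptable to `tgpu.unroll`, which this diff does not confirm.

```typescript
// Hypothetical example: sum only the odd entries of a small, fixed-size array.

// Shape that relies on `continue`. Per the note above, a loop written
// like this cannot be passed through tgpu.unroll.
function sumOddWithContinue(values: readonly number[]): number {
  let total = 0;
  for (const value of values) {
    if (value % 2 === 0) {
      continue;
    }
    total += value;
  }
  return total;
}

// Equivalent shape: the skip condition is inverted into a guard,
// so the body no longer needs `continue`.
function sumOddWithGuard(values: readonly number[]): number {
  let total = 0;
  for (const value of values) {
    if (value % 2 !== 0) {
      total += value;
    }
  }
  return total;
}
```

A `break` has no equally mechanical rewrite, because every unrolled iteration is emitted regardless; a flag checked by each iteration is one possible substitute.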
392+
393+
:::tip
394+
You can unroll loops conditionally, for example:
395+
```ts
396+
// A value determined based on the user's environment
397+
const shouldUnroll = true;
398+
399+
const f = () => {
400+
'use gpu';
401+
const arr = [1, 2, 3];
402+
let r = d.i32(0);
403+
for (const foo of shouldUnroll ? tgpu.unroll(arr) : arr) {
404+
r += foo;
405+
}
406+
};
407+
```
408+
:::
409+
410+
### Types of iterables (more technical section)
411+
412+
**Array expression of primitive elements:**
413+
```ts
414+
for (const foo of tgpu.unroll([1, 2, 3])) {
415+
result += foo;
416+
}
417+
```
418+
419+
Generates:
420+
```wgsl
421+
// unrolled iteration #0
422+
{
423+
result += 1u;
424+
}
425+
// unrolled iteration #1
426+
{
427+
result += 2u;
428+
}
429+
// unrolled iteration #2
430+
{
431+
result += 3u;
432+
}
433+
```
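To make the shape of the generated code easier to picture, here is a toy model of unrolling over primitive literals. It is a minimal sketch in plain TypeScript, not TypeGPU's implementation: `emitUnrolled` and its `body` callback are hypothetical names, and the two-space indentation is an assumption.

```typescript
// Toy model only: emits one block per literal, substituting the literal
// into the loop body text. Not a TypeGPU API.
function emitUnrolled(
  literals: readonly string[],
  body: (literal: string) => string,
): string {
  return literals
    .map((literal, i) =>
      [`// unrolled iteration #${i}`, '{', `  ${body(literal)}`, '}'].join('\n'),
    )
    .join('\n');
}

// Mirrors the example above: each element is inlined as a literal.
const wgsl = emitUnrolled(['1u', '2u', '3u'], (x) => `result += ${x};`);
console.log(wgsl);
```

Because the elements are known literals, no array needs to exist in the generated shader at all.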
434+
:::note
435+
`std.range` falls into the primitive array expression category.
436+
:::
437+
438+
**Array expression of complex elements:**
439+
```ts
440+
const b1 = Boid({ pos: d.vec2i(1), vel: d.vec2f(1) });
441+
const b2 = Boid({ pos: d.vec2i(2), vel: d.vec2f(2) });
442+
443+
for (const foo of tgpu.unroll([b1, b2])) {
444+
const boo = foo;
445+
res = res.add(foo.vel);
446+
boo.pos = d.vec2i();
447+
}
448+
```
449+
450+
Generates:
451+
```wgsl
452+
// unrolled iteration #0
453+
{
454+
let boo = (&b1);
455+
res = (res + b1.vel);
456+
(*boo).pos = vec2i();
457+
}
458+
// unrolled iteration #1
459+
{
460+
let boo = (&b2);
461+
res = (res + b2.vel);
462+
(*boo).pos = vec2i();
463+
}
464+
```
465+
466+
:::caution
467+
In the example above, inlining the `Boid` constructor calls into the array expression (instead of referencing `b1` and `b2`) would cause an error. Elements of a complex type must be stored in variables before unrolling.
468+
:::
469+
470+
**Vector**
471+
```ts
472+
const v = d.vec3f(1, 2, 3);
473+
let res = d.f32(0);
474+
for (const foo of tgpu.unroll(d.vec3f(v))) {
475+
res += foo;
476+
}
477+
```
478+
479+
Generates:
480+
```wgsl
481+
// unrolled iteration #0
482+
{
483+
res = res + v[0u];
484+
}
485+
// unrolled iteration #1
486+
{
487+
res = res + v[1u];
488+
}
489+
// unrolled iteration #2
490+
{
491+
res = res + v[2u];
492+
}
493+
```
494+
495+
**Array expression of struct field names**
496+
```ts
497+
const values = { a: 1, b: 2, c: 3 };
498+
const keys = Object.keys(values) as (keyof typeof values)[];
499+
500+
const f = () => {
501+
'use gpu';
502+
let result = d.u32(0);
503+
for (const key of tgpu.unroll(keys)) {
504+
result += values[key];
505+
}
506+
return result;
507+
};
508+
```
509+
510+
Generates:
511+
```wgsl
512+
// ...
513+
514+
// unrolled iteration #0
515+
{
516+
result += 1u;
517+
}
518+
// unrolled iteration #1
519+
{
520+
result += 2u;
521+
}
522+
// unrolled iteration #2
523+
{
524+
result += 3u;
525+
}
526+
527+
// ...
528+
```
529+
530+
:::note
531+
As you can see, `tgpu.unroll` will handle external arrays as long as they are known at compile time.
532+
:::
533+
534+
**Iterable stored in a variable**
535+
536+
In this case, we fall back to array access:
537+
```ts
538+
const arr = [1, 2, 3];
539+
let res = d.f32(0);
540+
for (const foo of tgpu.unroll(arr)) {
541+
res += foo;
542+
}
543+
```
544+
545+
Generates:
546+
```wgsl
547+
var arr = array<i32, 3>(1, 2, 3);
548+
var res = 0f;
549+
// unrolled iteration #0
550+
{
551+
res += f32(arr[0u]);
552+
}
553+
// unrolled iteration #1
554+
{
555+
res += f32(arr[1u]);
556+
}
557+
// unrolled iteration #2
558+
{
559+
res += f32(arr[2u]);
560+
}
561+
```
562+
563+
**Buffer, `const`, `accessor`, `comptime`, `lazy`, etc.**
564+
565+
As long as we know the length at compile time, we will unroll the loop.
566+
```ts
567+
const buffer = root.createUniform(d.arrayOf(d.u32, 3));
568+
const acc = tgpu.accessor(d.arrayOf(d.u32, 3), buffer);
569+
570+
const f = () => {
571+
'use gpu';
572+
let result = d.u32(0);
573+
for (const foo of tgpu.unroll(acc.$)) {
574+
result += foo;
575+
}
576+
577+
return result;
578+
};
579+
```
580+
581+
Generates:
582+
```wgsl
583+
@group(0) @binding(0) var<uniform> buffer: array<u32, 3>;
584+
585+
fn f() -> u32 {
586+
var result = 0u;
587+
// unrolled iteration #0
588+
{
589+
result += buffer[0u];
590+
}
591+
// unrolled iteration #1
592+
{
593+
result += buffer[1u];
594+
}
595+
// unrolled iteration #2
596+
{
597+
result += buffer[2u];
598+
}
599+
return result;
600+
}
601+
```
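When the elements themselves are not known and only the length is, the generated code indexes into the iterable instead of inlining values. The following is a minimal sketch of that idea in plain TypeScript; `emitIndexedUnroll` is a hypothetical helper and not part of TypeGPU.

```typescript
// Toy model only: emits one block per index of a fixed-length iterable,
// referring to each element by index access. Not a TypeGPU API.
function emitIndexedUnroll(
  name: string,
  length: number,
  body: (element: string) => string,
): string {
  const blocks: string[] = [];
  for (let i = 0; i < length; i++) {
    // The element expression is `name[iu]`, matching the output above.
    const element = `${name}[${i}u]`;
    blocks.push(`// unrolled iteration #${i}\n{\n  ${body(element)}\n}`);
  }
  return blocks.join('\n');
}

// Only the name and the compile-time length (3) are needed.
const unrolled = emitIndexedUnroll('buffer', 3, (el) => `result += ${el};`);
console.log(unrolled);
```

This is why a compile-time length is the key requirement: the number of emitted blocks has to be fixed before the shader is generated.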

packages/typegpu/tests/unroll.test.ts

Lines changed: 3 additions & 3 deletions
@@ -205,13 +205,13 @@ describe('tgpu.unroll', () => {
205205

206206
it('unrolls array expression of struct field names - (simple)', () => {
207207
const values = { a: 1, b: 2, c: 3 };
208-
const list = Object.keys(values) as (keyof typeof values)[];
208+
const keys = Object.keys(values) as (keyof typeof values)[];
209209

210210
const f = () => {
211211
'use gpu';
212212
let result = d.u32(0);
213-
for (const prop of tgpu.unroll(list)) {
214-
result += values[prop];
213+
for (const key of tgpu.unroll(keys)) {
214+
result += values[key];
215215
}
216216
return result;
217217
};
