[Bug] Ark 0.4.1: multi_gpu_tutorial.py run error #207

@shenyanmei2020

Describe the bug
With ARK 0.4.1, running multi_gpu_tutorial.py fails at sched_default.cc line 393 (in configure_gpu_buf) and tensor.cc line 246 (in update_pads). The error is as follows:
invalid padding detected. This is likely caused because one GPU buffer is used by multiple operators that require different padding. A possible workaround is to let each operator use a different buffer by creating a new tensor rather than overwriting an existing tensor op name:send.
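
To make the workaround in the message concrete, here is a toy model of the situation it describes. This is only a sketch and not ARK code: `Buffer` and `require_padding` are hypothetical names, and it assumes the simplest possible rule, namely that every operator sharing a buffer must agree on a single padding value. The real check in update_pads may be more permissive, which is part of what this issue asks about.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Toy stand-in for a GPU buffer (hypothetical, not ARK's data structure)."""
    name: str
    pad: Optional[int] = None  # padding fixed by the first operator that uses the buffer


def require_padding(buf: Buffer, op_name: str, pad: int) -> None:
    """Record the padding an operator needs; fail if it conflicts with an earlier one.

    Assumption: operators sharing a buffer must request identical padding.
    """
    if buf.pad is None:
        buf.pad = pad
    elif buf.pad != pad:
        raise ValueError(
            f"invalid padding detected on buffer {buf.name!r} (op name: {op_name})"
        )


# One buffer reused by two operators with different padding needs: conflict.
shared = Buffer("shared")
require_padding(shared, "matmul", 64)
try:
    require_padding(shared, "send", 1)
    conflict = False
except ValueError:
    conflict = True

# Workaround from the error message: give each operator its own buffer.
compute_buf = Buffer("compute_out")
send_buf = Buffer("send_in")
require_padding(compute_buf, "matmul", 64)
require_padding(send_buf, "send", 1)
```

In this model the conflict disappears as soon as the send operator reads from its own tensor instead of one that another operator has already padded, which is what the message suggests trying in the tutorial script.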

To Reproduce
Run multi_gpu_tutorial.py with ARK 0.4.1.

Expected behavior

  1. Please explain why this error occurs.
  2. What relationship must "ldims, type_bytes, tile" of ref_tensor and this_tensor satisfy in update_pads?

System (please complete the following information):

  • ark0.4.1
  • OS: [e.g. Ubuntu18.04]
  • GPU [A100]
  • Networking Environment [Single-node, Multi-gpu]

