Serve Deberta using FasterTransformer in Triton #691

@sfc-gh-zhwang

Hi,
Is there a tutorial we can follow to serve a DeBERTa model with FasterTransformer in Triton?
I think the steps would be:

  1. Convert a DeBERTa-v2 model to the FasterTransformer format;
  2. Dump the weights;
  3. Load them in the Triton Inference Server.

However, I only see step 1 covered, and only with a TensorFlow example.