Support broader work tasks/models using Lambda Docker container #346

Description

@twelch

Need

Right now geoprocessing functions are limited to the JavaScript ecosystem. It should be possible to run a Docker container instead, so that a function can use other runtimes and tools.

Solution

  • User adds a Dockerfile to the project source code.
    • The Dockerfile builds on an Amazon base image that can be run in a Lambda.
    • The entry point is a JavaScript function, not unlike GeoprocessingHandler.ts, that acts as the Lambda handler. It receives the input payload, calls the underlying user-provided geoprocessing function, and returns the result.
  • User registers a geoprocessing function in geoprocessing.json and includes a reference to the Dockerfile it requires.
    • The geoprocessing function works much as it does now, except that it runs inside the Docker container as the entry point function. It includes user-created code that calls out to the tools with the necessary input, most likely using a shell exec command. Input may need to be prepped/transformed going in, and output coming out.
    • It may also make sense to be able to invoke the container Lambda as a worker, so that it can scale better.
  • On deploy, CDK will use lambda.DockerImageFunction to publish the Dockerfiles as images and make them available to run as Lambda functions.
  • The result S3 bucket can be used to store one or more results. S3 metadata gets returned as the result, with temporary pre-signed URLs to access the results.
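
A minimal sketch of the entry point idea, for discussion only. The `GeoprocessingFunction` type, the `makeHandler` wrapper, the response envelope, and the `sumArea` example are all hypothetical names made up for illustration; none of them are taken from GeoprocessingHandler.ts.

```typescript
// Hypothetical shapes for illustration; not the framework's actual types.
type GeoprocessingFunction<In, Out> = (input: In) => Promise<Out>;

interface HandlerResponse<Out> {
  statusCode: number;
  body: { data?: Out; error?: string };
}

// Wraps a user-provided geoprocessing function in a Lambda-style async handler:
// receive the payload, call the function, return the result or an error envelope.
function makeHandler<In, Out>(
  fn: GeoprocessingFunction<In, Out>
): (event: In) => Promise<HandlerResponse<Out>> {
  return async (event: In) => {
    try {
      const data = await fn(event);
      return { statusCode: 200, body: { data } };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { statusCode: 500, body: { error: message } };
    }
  };
}

// Hypothetical user function: sums the areas it is given.
const sumArea: GeoprocessingFunction<{ areas: number[] }, { total: number }> =
  async (input) => ({ total: input.areas.reduce((a, b) => a + b, 0) });

// This is what the container image would expose as its Lambda handler.
const handler = makeHandler(sumArea);
```

Keeping the wrapper separate from the user function means the same function could later be invoked as a worker without changing user code.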

For example, to run an R model:

  • The Dockerfile would install the necessary R environment and code.
  • The geoprocessing function would call out to run the model, passing it input and getting output back.
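
The "call out to run the model" step might be sketched as below. It assumes the tool reads JSON on stdin and prints JSON on stdout, which is a contract invented for this sketch rather than anything the proposal specifies. `runTool` and `model.R` are hypothetical names. The synchronous `execFileSync` from Node's `child_process` is used for brevity; an async variant may be preferable in practice.

```typescript
import { execFileSync } from "node:child_process";

// Runs an external tool, sending `input` as JSON on stdin and parsing JSON
// from stdout. The JSON-over-stdio contract is an assumption of this sketch.
function runTool<In, Out>(command: string, args: string[], input: In): Out {
  const stdout = execFileSync(command, args, {
    input: JSON.stringify(input), // prep/transform input going in
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // leave room for large model output
  });
  return JSON.parse(stdout) as Out; // transform output coming out
}

// Inside the container, a geoprocessing function might call something like:
//   runTool("Rscript", ["model.R"], { sketchId: 123 })
// where model.R is a hypothetical script installed by the Dockerfile.
```

`execFileSync` throws if the tool exits with a non-zero status, so a failing model run surfaces as an error the handler can report.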

See UploadHandlerLambdaStack for an example of how this is done for SeaSketch.

Challenges

  • Testing environment. How will smoke tests work? Should the local environment build Dockerfiles into images and run them?
  • Logging and plumbing for reporting status, including progress and errors.
  • Generating pre-signed URLs to read S3 bucket items.

Limitations

  • Limited to what can be installed and run on top of a base Amazon Linux image.

Questions

  • Does this need to work for preprocessor functions also?
