This repository was archived by the owner on May 23, 2024. It is now read-only.

sagemaker.tensorflow.serving.Model with input_handler is much slower than keras.model on GPU instance #213

@biyer19

Description

I am trying to follow this notebook to deploy an image processing model to a SageMaker endpoint on an ml.g4dn.xlarge instance, and found that adding image preprocessing through an entry point script makes inference much slower. Please consider the two setups below. Both use the same TensorFlow SavedModel and the same base64-encoded image(s).

Setup 1:

  • SageMaker notebook instance, ml.g4dn.xlarge. Load the model using reconstructed_model = keras.models.load_model()
  • Decode the JPEG image and preprocess it into NumPy arrays
  • Call reconstructed_model.predict(). This call returns in ~300-400 ms
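
For a like-for-like comparison with the endpoint, the local predict() latency could be measured roughly as follows. This is only a sketch: `time_calls` is a hypothetical helper, and `fake_predict` is a stand-in for `reconstructed_model.predict` so the snippet runs without TensorFlow installed. The warm-up call is left untimed on the assumption that the first call on a GPU includes one-time initialization.

```python
import statistics
import time


def time_calls(fn, *args, warmup=1, runs=5):
    """Return per-call latencies in milliseconds for fn(*args)."""
    for _ in range(warmup):
        fn(*args)  # warm-up calls are executed but not timed
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def fake_predict(batch):
    # Stand-in for reconstructed_model.predict(batch); swap in the real call.
    time.sleep(0.01)
    return [0.0 for _ in batch]


latencies = time_calls(fake_predict, [1, 2, 3], warmup=1, runs=3)
print(f"median latency: {statistics.median(latencies):.1f} ms")
```

The same helper could wrap uncompiled_predictor.predict() in Setup 2, which would make the two numbers directly comparable.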

Setup 2:

  • SageMaker notebook instance, ml.g4dn.xlarge. Upload the model artifacts to S3
  • Create inference.py to decode the JPEG image and preprocess it into NumPy arrays
  • Create the model and deploy it:
    sm_model = TensorFlowModel(model_data=model_data, entry_point='inference.py', source_dir='src', framework_version="2.4.1", env={"SAGEMAKER_REQUIREMENTS": "requirements.txt"}, role=role)
    uncompiled_predictor = sm_model.deploy(initial_instance_count=1, instance_type='ml.g4dn.xlarge', endpoint_name='g4dn-xlarge-endpoint')
  • Call uncompiled_predictor.predict(). This takes ~11-12 seconds to return.
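
For context, an inference.py for the SageMaker TensorFlow Serving container generally has the shape sketched below. This is not the script from this issue: the request format (`image_b64`) and the `preprocess` helper are assumptions made for illustration, and the placeholder preprocessing just scales a few raw bytes instead of decoding a JPEG, so that the sketch runs with the standard library only.

```python
import base64
import json


def preprocess(jpeg_bytes):
    # Placeholder (assumption): the real script would decode the JPEG and
    # resize/normalize it. Here the first four bytes are scaled to [0, 1].
    return [b / 255.0 for b in jpeg_bytes[:4]]


def input_handler(data, context):
    """Turn the incoming request into a TensorFlow Serving REST payload."""
    if context.request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {context.request_content_type}")
    request = json.loads(data.read().decode("utf-8"))
    # Assumed request shape: {"image_b64": "<base64-encoded JPEG>"}.
    jpeg_bytes = base64.b64decode(request["image_b64"])
    return json.dumps({"instances": [preprocess(jpeg_bytes)]})


def output_handler(response, context):
    """Pass the TensorFlow Serving response back to the client."""
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, context.accept_header
```

Whatever input_handler returns is sent to the model server as the request body, so the time between the two handlers includes transferring and parsing that body as well as the model execution itself.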

From the CloudWatch logs, the majority of the time (~8 seconds) is spent after input_handler returns and before output_handler is invoked. The logs also indicate that the GPU is being used.
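
One way to narrow this gap down further would be to log how long each handler takes and how large the payload handed to the model server is. The sketch below is hypothetical: `timed` is an illustrative decorator, and the stand-in `input_handler` body serializes an assumed 224x224x3 image as nested JSON lists purely to produce a payload of realistic size.

```python
import functools
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("inference")


def timed(fn):
    """Log the wall-clock duration of fn and the length of str/bytes results."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        size = len(result) if isinstance(result, (str, bytes)) else None
        log.info("%s took %.1f ms, result length=%s", fn.__name__, elapsed_ms, size)
        return result
    return wrapper


@timed
def input_handler(data, context):
    # Stand-in body (assumption): a 224x224x3 image as nested JSON lists.
    image = [[[0.5] * 3 for _ in range(224)] for _ in range(224)]
    return json.dumps({"instances": [image]})


payload = input_handler(None, None)
```

If the logged payload turns out to be large, that would at least be worth ruling out as a contributor before attributing the whole interval to model execution.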

Screenshots or logs
CloudWatch screenshot

System information
A description of your system. Please provide:

  • Toolkit version: 11.0
  • Framework version: 2.4.1
  • Python version: 3.7
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): N
