torch-serving

A simple, limited-scope model server for JIT-compiled PyTorch models.

The key idea is that you should be able to spin up a simple HTTP service that just works and handles inbound requests to JIT-compiled models. A few things matter for making this work well:

Model caching - store deserialized torch::jit::Module pointers in a (modification-thread-safe) LRU cache so we don't pay the deserialization cost on every round trip.
JSON marshaling for tensor types.
Speed!
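
To make the caching idea concrete, here is a minimal sketch of a thread-safe LRU cache. It is written in Python purely for illustration; the server itself is C++ and caches torch::jit::Module pointers, and its actual implementation may differ. The class name LRUCache and the loader callback are hypothetical: loader stands in for whatever deserializes a model from disk.

```python
import threading
from collections import OrderedDict


class LRUCache:
    """Illustrative thread-safe LRU cache (hypothetical; not the server's code)."""

    def __init__(self, capacity, loader):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._loader = loader          # called on a miss, e.g. to deserialize a model
        self._entries = OrderedDict()  # ordered from least to most recently used
        self._lock = threading.Lock()

    def get(self, key):
        # One lock guards lookups and modifications. Loading while holding the
        # lock keeps the sketch simple, at the cost of serializing misses.
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)  # mark as most recently used
                return self._entries[key]
            value = self._loader(key)
            self._entries[key] = value
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # evict least recently used
            return value
```

With a capacity of 2, requesting models a, b, a, c evicts b (the least recently used entry), so the next request for b pays the load cost again while a repeated request for c does not.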

Building the project

The build is suboptimal - it would be great to make the CMake config better.

Clone the repo (and submodule) with:

git clone --recursive https://github.com/lukedeo/torch-serving

We expect you to have libtorch unpacked somewhere (it can be downloaded from the PyTorch website), along with CMake and a C++11-compliant compiler.

Alternatively, you can build against the libtorch that ships with a Python installation of torch. We provide a script (scripts/libtorch-root) that locates the path to libtorch in your Python installation.
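
The contents of the provided script are not reproduced here. As a rough sketch of one way such a lookup could work, the hypothetical helper below (find_package_root is not part of this repo) asks Python where a package is installed without importing it:

```python
import importlib.util
import os


def find_package_root(package="torch"):
    """Return the install directory of `package`, or None if it is not installed.

    Hypothetical helper for illustration; scripts/libtorch-root may work differently.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    # For a regular package this is the directory containing __init__.py.
    return os.path.abspath(list(spec.submodule_search_locations)[0])
```

If torch is installed, the directory returned by find_package_root() should be usable as the CMAKE_PREFIX_PATH value, under the assumption that the pip or conda package ships its CMake config files beneath that directory.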

Suspected compatible PyTorch versions: torch>=1.4.0,<=1.6.0

Run:

mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
# OR
cmake -DCMAKE_PREFIX_PATH=$(../scripts/libtorch-root) ..
make

The executable will be apps/torch-serving. Go ahead and move that up a directory:

mv apps/torch-serving .. && cd ..

Run the tests (in the parent directory) with:

build/tests/test-torch-serving

Saving a JIT model (from Python)

Now, ensure you have Python & torch==1.4.0 installed, and run

python example/create_model.py

which creates a dumb big model with interesting inputs and outputs (i.e., List[torch.Tensor], etc.) and runs a JIT trace, saving to model-example.pt.

Running the server & making a request.

From the repo directory (after following the build), just run ./torch-serving.

In another terminal, run a request through (maybe pipe through jq if you've got that installed)!

curl -X POST \
    --data @example/post-data.json \
    "localhost:8888/serve?servable_identifier=model-example.pt"

which should output:

{
  "type": "generic_dict",
  "value": {
    "out": [
      [
        {
          "data_type": "float32",
          "shape": [1, 3],
          "type": "tensor",
          "value": [19.27, 5.28, 3.72]
        },
        {
          "data_type": "float32",
          "shape": [1, 3],
          "type": "tensor",
          "value": [60, 33, 33]
        }
      ],
      {
        "type": "string",
        "value": "Hello!"
      },
      {
        "data_type": "int64",
        "type": "scalar",
        "value": 30
      }
    ]
  }
}
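
The same request can be made from Python. The sketch below mirrors the curl command using only the standard library; the helper names build_request and serve are hypothetical, and the host, port, and servable_identifier query parameter are taken from the example above. No Content-Type header is set explicitly, since the curl example does not set one either.

```python
import json
import urllib.parse
import urllib.request


def build_request(servable_identifier, payload, host="localhost", port=8888):
    """Build a POST to /serve equivalent to the curl command above."""
    query = urllib.parse.urlencode({"servable_identifier": servable_identifier})
    url = f"http://{host}:{port}/serve?{query}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def serve(servable_identifier, payload, **kwargs):
    """Send the request and return the parsed JSON response."""
    request = build_request(servable_identifier, payload, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To reproduce the example, load example/post-data.json with json.load and pass the result as payload to serve("model-example.pt", payload) while the server is running.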

Note that we represent tensors unraveled (flattened) and specify a shape alongside the values. To recover the original tensor, use torch.tensor(unraveled_tensor).reshape(shape).
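
If torch is not available on the client, the response can still be decoded with plain Python. The sketch below walks a response and rebuilds each flattened tensor as nested lists in row-major order. The function names reshape and decode are hypothetical, and only the type tags that appear in the example output (generic_dict, tensor, string, scalar) are handled; other tags the server may emit are not covered.

```python
import math


def reshape(flat, shape):
    """Rebuild nested lists of the given shape from a flat, row-major list."""
    if math.prod(shape) != len(flat):
        raise ValueError(f"cannot reshape {len(flat)} values into shape {shape}")
    if not shape:
        return flat[0]  # zero-dimensional: a single value
    if len(shape) == 1:
        return list(flat)
    step = math.prod(shape[1:])  # number of values in each sub-block
    return [reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def decode(node):
    """Recursively convert a response node into plain Python values."""
    if isinstance(node, list):
        return [decode(item) for item in node]
    kind = node["type"]
    if kind == "tensor":
        return reshape(node["value"], node["shape"])
    if kind == "generic_dict":
        return {key: decode(value) for key, value in node["value"].items()}
    if kind in ("string", "scalar"):
        return node["value"]
    raise ValueError(f"unhandled type tag: {kind!r}")
```

Applied to the example output, decode returns a dict whose "out" entry holds a list of two 1x3 nested lists, the string "Hello!", and the integer 30.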

TODOs

CI (Someone feel like setting up GH Actions?)
Documentation
Better build
Support type: image from JSON (base64 encoding)
Wire up the cache invalidation probability to the API.
