Locally deploy the model using a server implementation and expose its methods as endpoints.


usage: mlem serve [options] model [subtype]

MODEL      Model to create service from  [required]
[SUBTYPE]  Server type. Choices: ['fastapi', 'heroku', 'rmq']  [default: ]


An MLEM model can be served via a server implementation (e.g. fastapi) with its methods exposed as API endpoints. This makes it easy to run requests (inference and others) against the served model.

For the commonly used fastapi server implementation, the OpenAPI spec is available at the /docs endpoint.

HTTP requests to the model server can be made either with the corresponding built-in client or with common HTTP clients, such as the curl and httpie CLIs or the requests Python library.
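
For example, with the server from the example below listening on port 3000, a prediction request can be sent with curl. The request body is model-specific, so the payload here is only illustrative; the exact schema for each endpoint is published at /docs:

$ curl http://0.0.0.0:3000/predict \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{"data": {"values": [{"feature": 0}]}}'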


  • -p, --project TEXT: Path to MLEM project [default: (none)]
  • --rev TEXT: Repo revision to use [default: (none)]
  • -l, --load TEXT: File to load server config from
  • -c, --conf TEXT: Options for server in format field.name=value
  • -f, --file_conf TEXT: File with options for server in format field.name=path_to_config
  • --help: Show this message and exit.
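
For instance, server options can be set with repeated -c flags (mymodel here is a placeholder for a saved MLEM model, and host/port are assumed fastapi server options):

$ mlem serve mymodel fastapi -c host=0.0.0.0 -c port=8080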

Example: FastAPI HTTP server

Easily serve a model from a remote GitHub repository on a local FastAPI HTTP server:

$ mlem serve https://github.com/iterative/example-mlem-get-started/rf fastapi --conf port=3000
Starting fastapi server...
🖇️ Adding route for /predict
🖇️ Adding route for /predict_proba
🖇️ Adding route for /sklearn_predict
🖇️ Adding route for /sklearn_predict_proba
Checkout openapi docs at <http://0.0.0.0:3000/docs>
INFO:     Started server process [6083]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
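
With the server running, the model can also be queried from Python with the requests library. The sketch below assumes the server above is listening on port 3000; the payload shape depends on the model's signature and is illustrative only, so consult the /docs endpoint for the actual schema.

import requests

# Assumes the FastAPI server started above is listening on port 3000.
# The payload is illustrative; the model-specific request schema is
# published at http://0.0.0.0:3000/docs.
payload = {"data": {"values": [{"feature": 0}]}}

response = requests.post("http://0.0.0.0:3000/predict", json=payload)
response.raise_for_status()
print(response.json())  # predictions returned by the served model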
