mlem.api.save()

Saves a given object to a given path.

def save(
    obj: Any,
    path: Union[str, os.PathLike],
    project: Optional[str] = None,
    sample_data=None,
    fs: Optional[AbstractFileSystem] = None,
    params: Dict[str, str] = None,
    preprocess: Union[Any, Dict[str, Any]] = None,
    postprocess: Union[Any, Dict[str, Any]] = None,
) -> MlemObject

Usage:

from mlem.api import save

save(obj, path)

Description

Saves a given object to a given path. The path can belong to different file systems (e.g., S3). The function saves the object as a MLEM Object and returns it.

Models often need preprocessing applied to their input and postprocessing applied to their output; the preprocess and postprocess arguments cover this. You can think of the saved model as running postprocess(model(preprocess(x))). See the examples below.
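
The composition order can be illustrated without MLEM at all. The following is a minimal sketch in plain Python; compose, to_length, is_long and label are hypothetical stand-ins invented for this illustration, not part of the MLEM API, and the sketch does not show how MLEM wires the processors internally.

```python
from typing import Any, Callable

Step = Callable[[Any], Any]


def compose(preprocess: Step, model: Step, postprocess: Step) -> Step:
    """Return a callable that runs postprocess(model(preprocess(x)))."""

    def pipeline(x: Any) -> Any:
        return postprocess(model(preprocess(x)))

    return pipeline


# Hypothetical stand-ins for a preprocessor, a model and a postprocessor.
def to_length(word: str) -> int:
    # "Preprocess": turn raw input into the feature the model expects.
    return len(word)


def is_long(length: int) -> bool:
    # "Model": a toy rule in place of a trained estimator.
    return length > 5


def label(flag: bool) -> str:
    # "Postprocess": turn the raw prediction into a readable answer.
    return "long" if flag else "short"


predict = compose(to_length, is_long, label)
print(predict("Gagarin"))  # long
```

The caller passes raw input and receives the final, human-readable result; the intermediate representations stay inside the pipeline.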

Parameters

  • obj (required) - Object to dump
  • path (required) - Path to save the object to. If it is not located on LocalFileSystem, then it should be a URI, or the fs argument should be provided
  • project (optional) - Path to the MLEM project
  • sample_data (optional) - If the object is a model or function, you can provide an input data sample, so MLEM will include its schema in the model's metadata
  • fs (optional) - FileSystem for the path argument
  • params (optional) - arbitrary params for object
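  • preprocess (optional) - Function or MLEM model applied to the input before the model; can also be a dictionary mapping model method names to preprocessors
  • postprocess (optional) - Function or MLEM model applied to the model's output; can also be a dictionary mapping model method names to postprocessors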

Returns

MlemObject: the saved object as a MLEM Object

Exceptions

  • MlemObjectNotFound - Raised if the MLEM object cannot be found

Example: Save a trained model with MLEM

import os
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from pandas import DataFrame
from mlem.api import save

train, target = load_iris(return_X_y=True)
train = DataFrame(train)
train.columns = train.columns.astype(str)
model = DecisionTreeClassifier().fit(train, target)
path = os.path.join(os.getcwd(), "saved-model")

save(model, path, sample_data=train)

Example: use pre- and post-processors

preprocess and postprocess can be functions or MLEM models:

import mlem.api


def apply_emdedding(word):
    # apply embedding
    ...
    return embedding


def return_classname(prediction):
    if len(prediction.shape) > 1:
        return "A surname" if prediction[0][0] < prediction[0][1] else "Not a surname"
    return "A surname" if prediction[0] else "Not a surname"


mlem.api.save(
    classify_word,  # trained on a dataset created by applying `apply_emdedding`
    "surname_classifier",
    preprocess=apply_emdedding,
    postprocess=return_classname,
    sample_data="Gagarin",
)

If you need different pre- and post-processors for different model methods, you can specify them with dictionaries (let's assume classify_word is a sklearn model and has two methods: predict and predict_proba):

mlem.api.save(
    classify_word,  # trained on a dataset created by applying `apply_emdedding`
    "surname_classifier",
    preprocess={
        "predict": apply_emdedding,
        "predict_proba": apply_emdedding,
    },
    postprocess={
        "predict": lambda p: "A surname" if p[0] else "Not a surname",
        "predict_proba": lambda p: "A surname" if p[0][0] < p[0][1] else "Not a surname",
    },
    sample_data="Gagarin",
)
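
To make the dictionary form more concrete, here is a rough sketch in plain Python of how a processor argument that is either a single callable or a dictionary keyed by method name could be resolved per method. This is not MLEM's implementation: resolve, call_method and ToyClassifier are hypothetical names, processors are assumed to be plain callables, and falling back to an identity function for a method missing from the dictionary is an assumption made for this illustration.

```python
from typing import Any, Callable, Dict, Optional, Union

Processor = Callable[[Any], Any]
ProcessorSpec = Optional[Union[Processor, Dict[str, Processor]]]


def identity(x: Any) -> Any:
    return x


def resolve(spec: ProcessorSpec, method: str) -> Processor:
    """Pick the processor for one method (hypothetical helper)."""
    if spec is None:
        return identity
    if isinstance(spec, dict):
        # Assumption: methods absent from the dict get no processing.
        return spec.get(method, identity)
    # A single callable applies to every method.
    return spec


class ToyClassifier:
    """Stand-in for a trained model with predict and predict_proba."""

    def predict(self, length: int) -> list:
        return [length > 5]

    def predict_proba(self, length: int) -> list:
        p = 0.9 if length > 5 else 0.1
        return [[1 - p, p]]


def call_method(model, method, x, preprocess=None, postprocess=None):
    """Run postprocess(model.<method>(preprocess(x))) for the given method."""
    pre = resolve(preprocess, method)
    post = resolve(postprocess, method)
    return post(getattr(model, method)(pre(x)))


postprocess = {
    "predict": lambda p: "A surname" if p[0] else "Not a surname",
    "predict_proba": lambda p: "A surname" if p[0][0] < p[0][1] else "Not a surname",
}

model = ToyClassifier()
label = call_method(model, "predict", "Gagarin", preprocess=len, postprocess=postprocess)
proba_label = call_method(model, "predict_proba", "Gagarin", preprocess=len, postprocess=postprocess)
print(label, "|", proba_label)  # A surname | A surname
```

Here the same preprocessor (len) is shared by both methods, while each method gets its own postprocessor because predict and predict_proba return differently shaped outputs.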