Quick Start

See it in action

Run the code in this guide on Google Colab:

Launch on Colab

Alternatively, download and run the notebook locally:

$ pip install jupyter
$ git clone http://github.com/bentoml/bentoml
$ jupyter notebook bentoml/guides/quick-start/bentoml-quick-start-guide.ipynb

Quick start walkthrough

Defining a prediction service with BentoML:

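import bentoml
from bentoml.handlers import DataframeHandler
from bentoml.artifact import SklearnModelArtifact

# Declare the pip dependencies and the model artifact bundled with this service
@bentoml.env(pip_dependencies=["scikit-learn"])
@bentoml.artifacts([SklearnModelArtifact('model')])
class IrisClassifier(bentoml.BentoService):

    # DataframeHandler declares the expected input format of this API
    @bentoml.api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)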

You can add multiple bentoml.api endpoints to a BentoService. The DataframeHandler here tells BentoML the expected input format of this API.

The bentoml.env decorator lets users specify the dependencies and environment settings for this prediction service, and bentoml.artifacts describes the trained models to be bundled with it. In addition to SklearnModelArtifact, BentoML also provides PytorchModelArtifact, KerasModelArtifact, FastaiModelArtifact, XgboostModelArtifact, and others.

Next, train a classifier model on the Iris dataset and pack the trained model with the IrisClassifier BentoService defined above:

from sklearn import svm
from sklearn import datasets

clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)

# Create an iris classifier service with the newly trained model
iris_classifier_service = IrisClassifier.pack(model=clf)

# Save the entire prediction service to a file bundle
saved_path = iris_classifier_service.save()

You’ve just created a BentoML bundle. It is a versioned file archive containing the BentoService you defined, including the trained model artifacts, pre-processing code, dependencies, and configurations.

Model serving via REST API

Now you can start a REST API server from the saved BentoML bundle on the command line:

bentoml serve {saved_path}

If you are running this on your local machine, open the API server’s address in your browser to try out its web UI for debugging and testing. You can also send a prediction request with curl from the command line:

curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \
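
The same request can also be sent from Python. The sketch below uses only the standard library. The helper names build_predict_request and send_predict_request are hypothetical, the endpoint URL is left as a parameter because it depends on where your server is running, and the code assumes the server replies with a JSON body.

```python
import json
import urllib.request


def build_predict_request(url, rows):
    """Build a POST request that carries `rows` as a JSON list of feature rows."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(url, rows, timeout=10.0):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_predict_request(url, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling send_predict_request(api_url, [[5.1, 3.5, 1.4, 0.2]]) sends the same payload as the curl command, where api_url is the prediction endpoint of your own server.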

Model serving via Command Line Interface

Load the saved BentoML bundle directly from the command line to run inference:

bentoml predict {saved_path} --input='[[5.1, 3.5, 1.4, 0.2]]'

# alternatively:
bentoml predict {saved_path} --input='./iris_test_data.csv'
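
If you do not have a test file at hand, one way to create iris_test_data.csv is sketched below. The column names are hypothetical and the header row is an assumption, since the guide does not specify the CSV layout; adjust both to match what your service expects. The two rows are illustrative measurements only.

```python
import csv

# Hypothetical column names; the guide does not define a header for this file.
FEATURE_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def write_test_csv(path, rows, header=FEATURE_NAMES):
    """Write one header line followed by one line per feature row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


# Two illustrative rows with four features each
write_test_csv("iris_test_data.csv", [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2]])
```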

Distribute BentoML Bundle as PyPI package

A BentoML bundle is pip-installable and can be distributed directly as a PyPI package:

pip install {saved_path}
# The name of your BentoService class becomes the package name
import IrisClassifier

installed_svc = IrisClassifier.load()
installed_svc.predict([[5.1, 3.5, 1.4, 0.2]])

This allows users to upload their BentoService to pypi.org as a public Python package, or to their organization’s private PyPI index, to share it with other developers.

!cd {saved_path} && python setup.py sdist upload


You will need to configure a “.pypirc” file before uploading to a PyPI index. You can find more information about distributing Python packages at: https://docs.python.org/3.7/distributing/index.html#distributing-index
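
As a rough illustration of what a “.pypirc” file contains, the sketch below renders a minimal one with Python’s configparser. The helper name render_pypirc is hypothetical, and the repository URL and username are example values to replace with those of your own index. The password is deliberately left out; check your index’s documentation for how to supply credentials.

```python
import configparser
import io


def render_pypirc(repository_name, repository_url, username):
    """Return the text of a minimal .pypirc with a single index server."""
    config = configparser.ConfigParser()
    # [distutils] lists the index servers that have their own sections
    config["distutils"] = {"index-servers": repository_name}
    config[repository_name] = {
        "repository": repository_url,
        "username": username,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Example values only; substitute the details of your own index
print(render_pypirc("pypi", "https://upload.pypi.org/legacy/", "__token__"))
```

Saving the printed text as .pypirc in your home directory is the usual placement for this file.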

Run REST API server with Docker

A BentoML bundle is structured to work as a Docker build context, so you can build a Docker image for this API server by using the bundle directory as the build context:

docker build -t my_api_server {saved_path}

docker run -p 5000:5000 my_api_server


You will need to install Docker before running this. Follow the instructions at: https://docs.docker.com/install
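
Once the container is running, a quick way to confirm that the published port accepts connections is a plain TCP check. The sketch below uses only the standard library. The helper name is_port_open is hypothetical, and the check only shows that something is listening on the port, not that the API itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

For the docker run command above, is_port_open("localhost", 5000) should return True once the server has started.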

Learning More?

Interested in learning more about BentoML? Check out the Examples in the BentoML GitHub repository.

Be sure to join the BentoML Slack channel <http://bit.ly/2N5IpbB> to hear about the latest development updates.