Model

In our system, each "Model" points to an executable code repository (repo) at a specific version, identified by the repo's commit SHA. Users also need to specify the GPU memory required to run the Model. Additionally, they have the option to make the Model publicly visible to all users.

When creating a Model, users are required to fill in an "entry_point_function" field. This specifies a filename and a function name within the repo, defining how the Model is run when users call the API for execution.

TODO: Add an image of a Model page in console

Example 1 - blip-large

In this example model, the model repo points to a code repo - https://github.com/ClustroAI/blip-large . The entry point is the default model_invoke.py/invoke: a function named invoke in model_invoke.py that receives the input and returns the result.
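As a rough sketch of what such an entry point can look like (the actual code lives in the repo linked above; `caption_image` here is a hypothetical stand-in for the real BLIP captioning call):

```python
import json

def caption_image(image_url, text_prompt=None):
    # Hypothetical stand-in for the real BLIP captioning logic, which
    # loads Salesforce/blip-image-captioning-large and runs conditional
    # (with a text prompt) or unconditional captioning.
    if text_prompt:
        return f"{text_prompt} (caption for {image_url})"
    return f"a caption for {image_url}"

def invoke(input_text):
    # input_text is either a bare image URL or a stringified JSON object.
    # Try to parse it as JSON first; fall back to treating it as a URL.
    try:
        payload = json.loads(input_text)
        return caption_image(payload["image_url"], payload.get("text"))
    except (json.JSONDecodeError, TypeError):
        return caption_image(input_text)
```

The only contract the platform relies on is the invoke(input_text) signature; everything inside the function is up to the model author.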

This model normally needs around 2 GB of GPU memory to run, so we set required_gpu_memory to 5 GB. The job can then be distributed to low-GPU workers, such as an NVIDIA RTX 3060, for lower cost.

When calling the invoke API, users pass in the input data. If the input is more than a plain text prompt, it can be defined as stringified JSON. In this example, the code accepts either a bare image URL or a JSON string combining an image URL with a text prompt; it simply checks whether the input can be parsed as JSON. The input format is therefore completely up to the model author.
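Both input shapes can be built on the client side like this (the field names image_url and text follow this repo's convention; adjust them to whatever the model's invoke function expects):

```python
import json

# Form 1: a bare image URL, used for unconditional captioning.
plain_input = "https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg"

# Form 2: a stringified JSON object that adds a text prompt
# for conditional captioning.
json_input = json.dumps({
    "image_url": "https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg",
    "text": "a photography of",
})
```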

Example 2 - stable-diffusion-xl-1-0

In this example of stable-diffusion-xl-1.0, the model type is defined in our API as "text_to_image." Unlike the "text_to_text" model type, which is designed for generating textual content, "text_to_image" is tailored for models that produce large content, such as images.

For "text_to_text" models, the invoke() function in the code repository directly returns the generated text. For "text_to_image" models, however, invoke() should save the generated image to a local file and return the file name.

Once the file name is returned, our service agent will automatically upload the image from the worker machine to our Content Delivery Network (CDN). The URL of the generated content is then returned to the user.
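A minimal sketch of that text_to_image contract (the image generation itself is stubbed out here with a hypothetical generate_image_bytes helper; the real code runs Stable Diffusion XL):

```python
def generate_image_bytes(prompt):
    # Hypothetical stand-in for the actual Stable Diffusion XL pipeline.
    return b"placeholder image bytes for: " + prompt.encode()

def invoke(input_text):
    # For text_to_image models, invoke() must save the generated image
    # to a local file and return the file name. The service agent then
    # uploads that file to the CDN and returns its URL to the caller.
    file_name = "generated_image.png"
    with open(file_name, "wb") as f:
        f.write(generate_image_bytes(input_text))
    return file_name
```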

For more details, check out the code repo - https://github.com/ClustroAI/stable-diffusion-xl/blob/main/model_invoke.py

An example input for this model is shown in the example_input field of its entry under Example Public Models below.

When creating a new Model via our console or API, here are the parameters:

Required Parameters:

  • name: Model name. It can only consist of a combination of uppercase and lowercase letters and numbers

  • model_type: One of text_to_text, text_to_image, or text_to_blob

  • model_code_repo_url: Model repo URL

  • model_code_version: Commit SHA to use

Optional Parameters:

  • entry_point_function: The file and function to execute, in filename/function format (default: model_invoke.py/invoke)

  • runtime_docker_image: The runtime Docker environment (only nvidia/cuda:11.6.2-runtime-ubuntu20.04 is supported currently)

  • example_input: An example input for the invoke function, helpful for understanding what parameters to pass, especially for public models.

  • description: A description of the model (up to 1000 words).

  • visibility: Either public or private. If public, the model will be visible to all users and others can create InferenceJobs for it. Otherwise, only the owner can create InferenceJobs for it.

  • required_gpu_memory: The maximum GPU memory the model needs, in GB. It is advisable to set this below 24 GB, as most workers run on consumer-grade hardware.
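Putting the parameters together, a create-Model request body might look like the following sketch. The field values are taken from the blip-large example in this document; the exact API endpoint or SDK call is not shown here, so only the payload is illustrated:

```python
import re

# Illustrative create-Model request body; values come from the
# blip-large public model shown below.
create_model_payload = {
    # Required parameters
    "name": "blip-large",
    "model_type": "text_to_text",
    "model_code_repo_url": "https://github.com/ClustroAI/blip-large",
    "model_code_version": "a85a6a6ca75b022aa007e27f1449e3bdc15f9f61",
    # Optional parameters
    "entry_point_function": "model_invoke.py/invoke",
    "runtime_docker_image": "nvidia/cuda:11.6.2-runtime-ubuntu20.04",
    "example_input": '{"input": "https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg"}',
    "visibility": "public",
    "required_gpu_memory": 5,
}

# The docs state names use letters and numbers; the example model names
# also contain hyphens, so the hyphen is included here as an assumption.
NAME_RE = re.compile(r"^[A-Za-z0-9-]+$")
assert NAME_RE.match(create_model_payload["name"])
```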

Example Public Models:

{
  "model_code_repo_url": "https://github.com/ClustroAI/stable-diffusion-xl.git",
  "created_at": "Mon, 03 Jul 2023 05:11:03 GMT",
  "description": "Stable Diffusion XL. It takes 15~20 seconds for an inference on RTX4090.",
  "example_input": "{\"input\": \"A majestic lion jumping from a big stone at night\"}",
  "id": "93d45b1c-62a7-47cd-babb-a5a1a7fe18ef",
  "entry_point_function": "model_invoke.py/invoke",
  "model_type": "text_to_image",
  "name": "stable-diffusion-xl-1-0",
  "runtime_docker_image": "nvidia/cuda:11.6.2-runtime-ubuntu20.04",
  "updated_at": "Tue, 29 Aug 2023 04:56:25 GMT",
  "user_id": "aab9ff07-3c0b-4584-b5ec-f1a5b288f6e3",
  "model_code_version": "d6e8c143e9d5b9f35d430dffaf7b53c3e2e20588",
  "visibility": "public",
  "required_gpu_memory": 25,
  "default_inference_job": "4324fb1c-52b7-47da-babb-b7b1b7fe18rg",
  "username": "clustrodemousername",
  "model_image_url":"https://cdn.clustro.ai/static/blip-large.png"
}
{
  "model_code_repo_url": "https://github.com/ClustroAI/blip-large",
  "created_at": "Thu, 17 Aug 2023 22:06:25 GMT",
  "description": "Model for conditional and un-conditional image captioning. Refer to the hugging face model Salesforce/blip-image-captioning-large. URL: https://huggingface.co/Salesforce/blip-image-captioning-large",
  "example_input": "{\"input\": \"https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg\"}",
  "id": "c10501b6-e774-4096-83eb-6be38c2d82d9",
  "entry_point_function": "model_invoke.py/invoke",
  "model_type": "text_to_text",
  "name": "blip-large",
  "runtime_docker_image": "nvidia/cuda:11.6.2-runtime-ubuntu20.04",
  "updated_at": "Tue, 29 Aug 2023 04:37:56 GMT",
  "user_id": "aab9ff07-3c0b-4584-b5ec-f1a5b288f6e3",
  "model_code_version": "a85a6a6ca75b022aa007e27f1449e3bdc15f9f61",
  "visibility": "public",
  "default_inference_job": "4324fb1c-52b7-47da-babb-b7b1b7fe18rg",
  "username": "clustrodemousername",
  "model_image_url":"https://cdn.clustro.ai/static/blip-large.png"
}
{
  "model_code_repo_url": "https://github.com/ClustroAI/clip-vit-large",
  "created_at": "Thu, 17 Aug 2023 22:45:12 GMT",
  "description": "Zero-Shot Image Classification model from openai/clip-vit-large-patch14 . HuggingFace link: https://huggingface.co/openai/clip-vit-large-patch14",
  "example_input": "{\"input\": \"{\\\"image_url\\\": \\\"http://images.cocodataset.org/val2017/000000039769.jpg\\\", \\\"text\\\": \\\"a photo of dog, a photo of cat\\\"}\"}",
  "id": "d4041542-6d5a-4fd4-b9f1-0fa217698cb8",
  "entry_point_function": "model_invoke.py/invoke",
  "model_type": "text_to_text",
  "name": "clip-vit-large",
  "runtime_docker_image": "nvidia/cuda:11.6.2-runtime-ubuntu20.04",
  "updated_at": "Tue, 29 Aug 2023 06:14:42 GMT",
  "user_id": "aab9ff07-3c0b-4584-b5ec-f1a5b288f6e3",
  "model_code_version": "9b18da8f1229043f32192c804bda6fdbc294c12d",
  "visibility": "public",
  "default_inference_job": "4324fb1c-52b7-47da-babb-b7b1b7fe18rg",
  "username": "clustrodemousername",
  "model_image_url":"https://cdn.clustro.ai/static/blip-large.png"
}
{
  "model_code_repo_url": "https://github.com/ClustroAI/falcon7b-instruct",
  "created_at": "Tue, 18 Jul 2023 04:57:25 GMT",
  "description": "Falcon-7B-Instruct is a 7B parameters causal decoder-only model built by TII based on Falcon-7B and finetuned on a mixture of chat/instruct datasets. It is made available under the Apache 2.0 license. Reference - https://huggingface.co/tiiuae/falcon-7b-instruct",
  "example_input": "{\"input\": \"{\\\"prompt\\\": \\\"This is an essay about the universe\\\", \\\"max_length\\\": \\\"100\\\", \\\"do_sample\\\": \\\"False\\\"}\"}",
  "id": "e6c9ae22-a14c-4ed6-a577-c4000e1b4580",
  "entry_point_function": "model_invoke.py/invoke",
  "model_type": "text_to_text",
  "name": "falcon7b-instruct",
  "runtime_docker_image": "nvidia/cuda:11.6.2-runtime-ubuntu20.04",
  "updated_at": "Tue, 29 Aug 2023 05:00:02 GMT",
  "user_id": "aab9ff07-3c0b-4584-b5ec-f1a5b288f6e3",
  "model_code_version": "842b0f5934f7cba93405eec1e429bed1e5f2fbb5",
  "visibility": "public",
  "default_inference_job": "4324fb1c-52b7-47da-babb-b7b1b7fe18rg",
  "username": "clustrodemousername",
  "model_image_url":"https://cdn.clustro.ai/static/blip-large.png"
}
