InferenceEngineDetail
Inference Engine Object
id string
The unique identifier for the inference engine
Example:
engine-12345
name string
Example:
test-inference-engine
status string
Possible values: [Creating, Active, Inactive]
Example:
Active
base_model string
Example:
Llama-3.2-3B-Instruct
task_type string
Example:
text-generation
server_type string
Example:
on-demand
deployments object[]
Array [
  id string
  The unique identifier for the deployment
  Example:
  dp-12345
  name string
  The name of the deployment
  Example:
  deployment-1
  status string
  The current status of the deployment
  Possible values: [Pending, Deploying, Deployed, Failed, Cancelled, Terminated, Deactivated, Reactivating, TimedOut]
  Example:
  Deployed
  created_at integer
  The timestamp when the deployment was created
  Example:
  1633036800
]
base_model_option string
Example:
pre-trained
last_access_timestamp integer
Example:
1633036800
resource_management_config object
  oneOf
  - Inactive Timeout
  - Schedule
  management_type string
  Example:
  inactive_timeout
  management_config object
    inactive_duration integer
    Period of inactivity
    Example:
    3600
created_at integer
The timestamp when the inference engine was created
Example:
1633036800

InferenceEngineDetail
{
  "id": "engine-12345",
  "name": "test-inference-engine",
  "status": "Active",
  "base_model": "Llama-3.2-3B-Instruct",
  "task_type": "text-generation",
  "server_type": "on-demand",
  "deployments": [
    {
      "id": "dp-12345",
      "name": "deployment-1",
      "status": "Deployed",
      "created_at": 1633036800
    }
  ],
  "base_model_option": "pre-trained",
  "last_access_timestamp": 1633036800,
  "resource_management_config": {
    "management_type": "inactive_timeout",
    "management_config": {
      "inactive_duration": 3600
    }
  },
  "created_at": 1633036800
}
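
A client might consume this object roughly as follows. This is a minimal Python sketch, not part of the API: the helper names (to_utc, summarize_engine) are hypothetical, the timestamps are assumed to be Unix epoch seconds (the schema only says "integer"), and only the Inactive Timeout variant of resource_management_config is handled because the Schedule variant's fields are not shown here.

```python
import json
from datetime import datetime, timezone

# Status values copied from the schema above.
ENGINE_STATUSES = {"Creating", "Active", "Inactive"}
DEPLOYMENT_STATUSES = {
    "Pending", "Deploying", "Deployed", "Failed", "Cancelled",
    "Terminated", "Deactivated", "Reactivating", "TimedOut",
}

# The example response from this page, in compact form.
EXAMPLE_JSON = """
{"id": "engine-12345", "name": "test-inference-engine", "status": "Active",
 "base_model": "Llama-3.2-3B-Instruct", "task_type": "text-generation",
 "server_type": "on-demand",
 "deployments": [{"id": "dp-12345", "name": "deployment-1",
                  "status": "Deployed", "created_at": 1633036800}],
 "base_model_option": "pre-trained", "last_access_timestamp": 1633036800,
 "resource_management_config": {
     "management_type": "inactive_timeout",
     "management_config": {"inactive_duration": 3600}},
 "created_at": 1633036800}
"""


def to_utc(ts):
    # Assumption: timestamps are Unix epoch seconds.
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def summarize_engine(payload):
    """Check the enum fields and return a small summary dict."""
    if payload["status"] not in ENGINE_STATUSES:
        raise ValueError(f"unexpected engine status: {payload['status']}")

    deployments = payload.get("deployments", [])
    for dep in deployments:
        if dep["status"] not in DEPLOYMENT_STATUSES:
            raise ValueError(f"unexpected deployment status: {dep['status']}")

    # Only the inactive_timeout variant is read; anything else is left as None.
    config = payload.get("resource_management_config") or {}
    inactive_duration = None
    if config.get("management_type") == "inactive_timeout":
        inactive_duration = config["management_config"]["inactive_duration"]

    return {
        "id": payload["id"],
        "status": payload["status"],
        "created_at": to_utc(payload["created_at"]),
        "deployed": [d["id"] for d in deployments if d["status"] == "Deployed"],
        "inactive_duration": inactive_duration,
    }


if __name__ == "__main__":
    print(summarize_engine(json.loads(EXAMPLE_JSON)))
```

The sketch branches on management_type so that a Schedule configuration is skipped rather than misread; the unit of inactive_duration is not stated in the schema, so the value is passed through unchanged.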