Experiment (Zero-shot Deployment)
Experiment is the fastest way to try the Emissary platform. In under a minute, you can spin up a customized model and call it from your application through a ready-made API endpoint — no training data, no tuning, no infrastructure setup required.
Every experiment starts as a zero-shot model built from the class definitions you provide. You can begin sending traffic immediately, then iterate by uploading data, labeling, and re-training as your use case matures.
Experiment currently supports six modes:
| Mode | What it does |
|---|---|
| judge | Evaluate outputs against a single criterion (LLM-as-Judge) |
| decision | Classify inputs into one of several mutually exclusive categories |
| routing | Route requests to the right downstream model, agent, or workflow |
| tool_calling | Pick the right tool and extract its arguments from a user query |
| regression | Score inputs on a continuous numeric scale |
| ner | Extract named entities (people, places, custom types) from text |
Common Workflow
Each mode follows the same two-step pattern:
- Create the experiment by POSTing to /v1/experiments with a mode and a list of classes. You receive an id and latest_version.
- Call the model at the mode's inference endpoint using the format EXPERIMENT_ID/VERSION as the model parameter.
All requests require your API key in the X-API-Key header.
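As a rough illustration of the two-step pattern, the sketch below wraps the request headers and the model identifier in two small helpers. The helper names auth_headers and model_ref are hypothetical and not part of the Emissary API; only the X-API-Key header and the EXPERIMENT_ID/VERSION format come from the description above.

```python
def auth_headers(api_key: str) -> dict:
    """Headers sent with every request (X-API-Key, as described above)."""
    return {"Content-Type": "application/json", "X-API-Key": api_key}


def model_ref(experiment: dict) -> str:
    """Build the EXPERIMENT_ID/VERSION string from a create-experiment response."""
    return f"{experiment['id']}/{experiment['latest_version']}"


# A create-experiment response shaped like the ones shown in this guide.
created = {"id": "ex-ahejacuehandheha", "latest_version": "0.0.0"}
print(model_ref(created))
```

Keeping the identifier logic in one place means a newly published version only changes the dictionary you pass in, not every call site.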
judge — LLM-as-Judge
Score a model's output against a single quality criterion. The model returns a probability that the criterion holds, which you can threshold or log for offline analysis.
Use it for: automated evaluation of LLM outputs in CI, online quality monitoring of a production assistant (helpfulness, safety, groundedness), or filtering low-quality generations from a synthetic dataset.
Create the experiment:
Python:
import json
import requests

response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyJudge',
        'mode': 'judge',
        'classes': [
            {'name': 'helpful', 'description': 'Is the response to the query helpful?'}
        ]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyJudge",
    "mode": "judge",
    "classes": [
      {"name": "helpful", "description": "Is the response to the query helpful?"}
    ]
  }'

Response:
{"id": "ex-ahejacuehandheha", "latest_version": "0.0.0"}
Call the model:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/classification',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandheha/0.0.0',  # EXPERIMENT_ID/VERSION
        'input': "User: Explain quantum computing in simple terms.\nAssistant: I don't know, Google it.",
        'data_format': 'probs'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/classification \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandheha/0.0.0",
    "input": "User: Explain quantum computing in simple terms.\nAssistant: I don'\''t know, Google it.",
    "data_format": "probs"
  }'

Response:
{
  "id": "classify-3c52592c7a404f97aa494861a79db220",
  "model": "ex-ahejacuehandheha/0.0.0",
  "data": [{"index": 0, "probs": {"helpful": 0}}],
  "created": 1779906329
}
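One way to act on the judge output is to compare the returned probability with a cutoff. The sketch below assumes the response shape shown above; the function name criterion_passes and the 0.5 default threshold are illustrative choices, not platform defaults.

```python
def criterion_passes(response: dict, criterion: str, threshold: float = 0.5) -> bool:
    """Return True when the judged probability meets the threshold."""
    probs = response["data"][0]["probs"]
    return probs[criterion] >= threshold


# Judge response copied from the example above.
sample = {
    "id": "classify-3c52592c7a404f97aa494861a79db220",
    "model": "ex-ahejacuehandheha/0.0.0",
    "data": [{"index": 0, "probs": {"helpful": 0}}],
    "created": 1779906329,
}
print(criterion_passes(sample, "helpful"))  # False: probability 0 is below 0.5
```

In a CI check you would likely tune the threshold per criterion rather than reuse a single value.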
decision
Classify an input into exactly one of several mutually exclusive labels. The model returns a probability distribution across all classes that sums to 1.
Use it for: sentiment analysis, intent detection, content moderation, support ticket triage, or any task where each input belongs to one and only one category.
Create the experiment:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyDecision',
        'mode': 'decision',
        'classes': [
            {'name': 'positive', 'description': 'The text expresses a positive sentiment, satisfaction, or approval.'},
            {'name': 'negative', 'description': 'The text expresses a negative sentiment, dissatisfaction, or complaint.'},
            {'name': 'neutral', 'description': 'The text is factual or does not express a strong opinion.'}
        ]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyDecision",
    "mode": "decision",
    "classes": [
      {"name": "positive", "description": "The text expresses a positive sentiment, satisfaction, or approval."},
      {"name": "negative", "description": "The text expresses a negative sentiment, dissatisfaction, or complaint."},
      {"name": "neutral", "description": "The text is factual or does not express a strong opinion."}
    ]
  }'

Response:
{"id": "ex-ahejacuehandhehe", "latest_version": "0.0.0"}
Call the model:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/classification',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandhehe/0.0.0',
        'input': 'This product is absolutely wonderful!',
        'data_format': 'probs'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/classification \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandhehe/0.0.0",
    "input": "This product is absolutely wonderful!",
    "data_format": "probs"
  }'

Response:
{
  "id": "classify-eac7451c54174ebc8da47be77dfe9581",
  "model": "ex-ahejacuehandhehe/0.0.0",
  "data": [{
    "index": 0,
    "probs": {
      "negative": 0.017637422308325768,
      "neutral": 0.024233082309365273,
      "positive": 0.9581295251846313
    }
  }],
  "created": 1779907078
}
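Because the classes are mutually exclusive, a common next step is to reduce the distribution to its most likely label. This is a minimal sketch that assumes the response shape shown above; top_label is a hypothetical helper name.

```python
from typing import Tuple


def top_label(response: dict) -> Tuple[str, float]:
    """Return the most probable class and its probability."""
    probs = response["data"][0]["probs"]
    label = max(probs, key=probs.get)
    return label, probs[label]


# Probabilities copied from the example response above.
sample = {
    "data": [{
        "index": 0,
        "probs": {
            "negative": 0.017637422308325768,
            "neutral": 0.024233082309365273,
            "positive": 0.9581295251846313,
        },
    }]
}
label, probability = top_label(sample)
print(label)  # positive
```

Returning the probability alongside the label lets callers flag low-confidence predictions for review instead of trusting every top pick.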
routing
Routing is a specialized form of classification for picking the right downstream destination for each request. It behaves like decision but is optimized for routing patterns where each class represents a target model, agent, or pipeline.
Use it for: sending simple queries to a fast/cheap model and complex ones to a larger one, dispatching tasks across specialized agents (research, coding, creative), or routing customer messages to the correct team.
Create the experiment:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyRouter',
        'mode': 'routing',
        'classes': [
            {'name': 'simple_task', 'description': 'Simple, straightforward queries that can be handled by a small, fast model.'},
            {'name': 'complex_task', 'description': 'Complex queries requiring reasoning, code generation, or multi-step analysis.'},
            {'name': 'creative_task', 'description': 'Creative writing, brainstorming, or content generation tasks.'}
        ]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyRouter",
    "mode": "routing",
    "classes": [
      {"name": "simple_task", "description": "Simple, straightforward queries that can be handled by a small, fast model."},
      {"name": "complex_task", "description": "Complex queries requiring reasoning, code generation, or multi-step analysis."},
      {"name": "creative_task", "description": "Creative writing, brainstorming, or content generation tasks."}
    ]
  }'

Response:
{"id": "ex-ahejacuehandheheee", "latest_version": "0.0.0"}
Call the model:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/classification',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandheheee/0.0.0',
        'input': 'Write a poem about autumn leaves',
        'data_format': 'probs'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/classification \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandheheee/0.0.0",
    "input": "Write a poem about autumn leaves",
    "data_format": "probs"
  }'

Response:
{
  "id": "classify-23d895d52c534bfca2c24e5492d3321d",
  "model": "ex-ahejacuehandheheee/0.0.0",
  "data": [{
    "index": 0,
    "probs": {
      "complex_task": 0.011172994039952755,
      "creative_task": 0.9710367918014526,
      "simple_task": 0.017790259793400764
    }
  }],
  "created": 1779907385
}
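On the application side, the routing probabilities can drive a simple dispatch table. The sketch below is one possible arrangement, not a prescribed integration: the handler functions, the route helper, the fallback class, and the 0.6 confidence floor are all assumptions made for illustration.

```python
from typing import Callable, Dict


def route(query: str, response: dict, handlers: Dict[str, Callable[[str], str]],
          fallback: str, min_confidence: float = 0.6) -> str:
    """Send the query to the handler for the most probable class.

    If the top probability is below min_confidence, use the fallback class.
    """
    probs = response["data"][0]["probs"]
    target = max(probs, key=probs.get)
    if probs[target] < min_confidence:
        target = fallback
    return handlers[target](query)


# Hypothetical destinations; real handlers would call a model or agent.
handlers = {
    "simple_task": lambda q: f"small-model: {q}",
    "complex_task": lambda q: f"large-model: {q}",
    "creative_task": lambda q: f"creative-model: {q}",
}

# Probabilities copied from the example response above.
sample = {
    "data": [{
        "index": 0,
        "probs": {
            "complex_task": 0.011172994039952755,
            "creative_task": 0.9710367918014526,
            "simple_task": 0.017790259793400764,
        },
    }]
}
print(route("Write a poem about autumn leaves", sample, handlers, fallback="complex_task"))
```

Falling back to the most capable destination on uncertain inputs trades some cost for safety; a cheaper default may suit other workloads.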
tool_calling
Select the right tool for a user query and extract the arguments needed to call it. Each class is a tool defined with a JSON Schema for its parameters; the model returns both the tool's probability and a structured generation payload ready to pass to your function.
Use it for: building lightweight, low-latency function-calling agents without a general-purpose LLM in the hot path — database query dispatchers, API gateways, voice assistants, or any system where you need both routing and structured argument extraction in one call.
Create the experiment:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyToolCalling',
        'mode': 'tool_calling',
        'classes': [
            {
                'name': 'sql.execute',
                'description': 'Execute SQL queries based on user-defined parameters like SQL keyword, table name, column names, and conditions.',
                'parameters': {
                    'type': 'object',
                    'properties': {
                        'sql_keyword': {'type': 'string', 'enum': ['SELECT', 'INSERT', 'UPDATE', 'DELETE', 'CREATE']},
                        'table_name': {'type': 'string'},
                        'columns': {'type': 'array', 'items': {'type': 'string'}}
                    },
                    'required': ['sql_keyword', 'table_name']
                }
            },
            {
                'name': 'Movies_3_FindMovies',
                'description': 'Retrieves a list of movies based on the director, genre, and cast specified by the user.',
                'parameters': {
                    'type': 'object',
                    'properties': {
                        'directed_by': {'type': 'string'},
                        'genre': {'type': 'string', 'enum': ['Fantasy', 'Mystery', 'Thriller', 'Comedy', 'Drama', 'Action']},
                        'cast': {'type': 'string'}
                    },
                    'required': []
                }
            },
            {
                'name': 'Weather_1_GetWeather',
                'description': 'Retrieves the weather forecast for a specified city on a particular date.',
                'parameters': {
                    'type': 'object',
                    'properties': {
                        'city': {'type': 'string'},
                        'date': {'type': 'string'}
                    },
                    'required': ['city']
                }
            }
        ]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyToolCalling",
    "mode": "tool_calling",
    "classes": [
      {
        "name": "sql.execute",
        "description": "Execute SQL queries based on user-defined parameters like SQL keyword, table name, column names, and conditions.",
        "parameters": {
          "type": "object",
          "properties": {
            "sql_keyword": {"type": "string", "enum": ["SELECT", "INSERT", "UPDATE", "DELETE", "CREATE"]},
            "table_name": {"type": "string"},
            "columns": {"type": "array", "items": {"type": "string"}}
          },
          "required": ["sql_keyword", "table_name"]
        }
      },
      {
        "name": "Movies_3_FindMovies",
        "description": "Retrieves a list of movies based on the director, genre, and cast specified by the user.",
        "parameters": {
          "type": "object",
          "properties": {
            "directed_by": {"type": "string"},
            "genre": {"type": "string", "enum": ["Fantasy", "Mystery", "Thriller", "Comedy", "Drama", "Action"]},
            "cast": {"type": "string"}
          },
          "required": []
        }
      },
      {
        "name": "Weather_1_GetWeather",
        "description": "Retrieves the weather forecast for a specified city on a particular date.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"},
            "date": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    ]
  }'

Response:
{"id": "ex-ahejacuehandhehedfe", "latest_version": "0.0.0"}
Call the model:
Tool calling uses the /v1/classify-generate endpoint, which returns both the selected tool and its arguments.
Python:
response = requests.post(
    'https://api.withemissary.com/v1/classify-generate',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandhehedfe/0.0.0',
        'input': 'Find action movies directed by Christopher Nolan'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/classify-generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandhehedfe/0.0.0",
    "input": "Find action movies directed by Christopher Nolan"
  }'

Response:
{
  "id": "classify-6829e3d6987641a1badd9b16d8098ef1",
  "model": "ex-ahejacuehandhehedfe/0.0.0",
  "data": [{
    "index": 0,
    "generation": [{
      "directed_by": "Christopher Nolan",
      "genre": "Action"
    }],
    "probs": {
      "Movies_3_FindMovies": 0.9871819019317627,
      "NO_TOOL": 0.001537600066512823,
      "Weather_1_GetWeather": 0.007378291338682175,
      "sql.execute": 0.0039021370466798544
    }
  }],
  "created": 1779908032
}
Note: A built-in NO_TOOL class is added automatically so the model can decline to call any tool when none applies.
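A classify-generate response can be turned into a local function call roughly as follows. This sketch assumes that generation[0] holds the arguments for the highest-probability tool, which matches the example response but is not stated as a guarantee. The find_movies function, the tool registry, and dispatch_tool_call are hypothetical names.

```python
from typing import Any, Callable, Dict, Optional


def find_movies(directed_by: Optional[str] = None,
                genre: Optional[str] = None,
                cast: Optional[str] = None) -> str:
    """Hypothetical local implementation of the Movies_3_FindMovies tool."""
    return f"search(director={directed_by}, genre={genre}, cast={cast})"


def dispatch_tool_call(response: dict, tools: Dict[str, Callable[..., Any]]) -> Any:
    """Call the highest-probability tool, or return None for NO_TOOL."""
    item = response["data"][0]
    probs = item["probs"]
    tool_name = max(probs, key=probs.get)
    if tool_name == "NO_TOOL":
        return None  # the model declined to call any tool
    # Assumption: generation[0] carries the arguments for the selected tool.
    arguments = item["generation"][0]
    return tools[tool_name](**arguments)


# Response body copied from the example above (id and created omitted).
sample = {
    "data": [{
        "index": 0,
        "generation": [{"directed_by": "Christopher Nolan", "genre": "Action"}],
        "probs": {
            "Movies_3_FindMovies": 0.9871819019317627,
            "NO_TOOL": 0.001537600066512823,
            "Weather_1_GetWeather": 0.007378291338682175,
            "sql.execute": 0.0039021370466798544,
        },
    }]
}
print(dispatch_tool_call(sample, {"Movies_3_FindMovies": find_movies}))
```

Checking for NO_TOOL before reading the arguments keeps the dispatcher safe when no tool applies to the query.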
regression
Score inputs on a continuous scale rather than choosing a discrete label. You define each scale by its min/max range and what the endpoints mean; the model returns a single numeric value, reported in the logits field of the response.
Use it for: scoring sentiment intensity, response quality on a 1–5 scale, urgency or risk levels, readability, toxicity severity — anything where "how much" matters more than "which category."
Create the experiment:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyRegressor',
        'mode': 'regression',
        'classes': [{
            'name': 'valence',
            'description': 'How positive the emotional tone is.',
            'min_value': 0,
            'max_value': 1,
            'low_description': 'very negative emotional tone',
            'high_description': 'very positive emotional tone',
            'higher_means': 'more positive emotional tone'
        }]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyRegressor",
    "mode": "regression",
    "classes": [{
      "name": "valence",
      "description": "How positive the emotional tone is.",
      "min_value": 0,
      "max_value": 1,
      "low_description": "very negative emotional tone",
      "high_description": "very positive emotional tone",
      "higher_means": "more positive emotional tone"
    }]
  }'

Response:
{"id": "ex-ahejacuehandhehedfedaew", "latest_version": "0.0.0"}
Call the model:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/regression',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandhehedfedaew/0.0.0',
        'input': 'I absolutely love this product — best purchase I have made all year!'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/regression \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandhehedfedaew/0.0.0",
    "input": "I absolutely love this product — best purchase I have made all year!"
  }'

Response:
{
  "id": "classify-56abbf04ce9b4fc793461a21538eabcb",
  "model": "ex-ahejacuehandhehedfedaew/0.0.0",
  "data": [{"index": 0, "logits": 0.9821727275848389}],
  "created": 1779908552
}
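To use the score downstream, you might read it from the response, clamp it to the range you configured, and map it to coarse buckets. The sketch below assumes the logits value is already expressed on the min_value to max_value scale, as the example suggests; read_score, bucket, and the three bucket labels are hypothetical.

```python
def read_score(response: dict, min_value: float, max_value: float) -> float:
    """Extract the score and clamp it to the configured range."""
    raw = response["data"][0]["logits"]
    return max(min_value, min(max_value, raw))


def bucket(score: float, min_value: float, max_value: float,
           labels=("low", "medium", "high")) -> str:
    """Map a score to one of several equal-width buckets."""
    position = (score - min_value) / (max_value - min_value)
    index = min(int(position * len(labels)), len(labels) - 1)
    return labels[index]


# Response body copied from the example above (id and created omitted).
sample = {"data": [{"index": 0, "logits": 0.9821727275848389}]}
score = read_score(sample, min_value=0, max_value=1)
print(score, bucket(score, 0, 1))
```

Clamping is a defensive choice here: it keeps bucket indices valid even if a value ever falls slightly outside the declared range.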
ner — Text Extraction
Extract structured entities from unstructured text. Unlike a classifier, NER returns spans of text grouped by type, and you fully define what counts as an entity through the class descriptions — so you're not limited to a fixed taxonomy like PERSON/ORG/LOC.
Use it for: pulling structured data from emails, contracts, or chat messages; extracting PII for redaction; lifting custom entities like ticker symbols, drug names, product SKUs, or contract clauses from your domain.
Create the experiment:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/experiments',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'name': 'MyExtractor',
        'mode': 'ner',
        'classes': [
            {'name': 'PERSON', 'description': 'human names'},
            {'name': 'ORGANIZATION', 'description': 'companies and institutions'},
            {'name': 'LOCATION', 'description': 'cities and places'}
        ]
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/experiments \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "name": "MyExtractor",
    "mode": "ner",
    "classes": [
      {"name": "PERSON", "description": "human names"},
      {"name": "ORGANIZATION", "description": "companies and institutions"},
      {"name": "LOCATION", "description": "cities and places"}
    ]
  }'

Response:
{"id": "ex-ahejacuehandhehedqeef", "latest_version": "0.0.0"}
Call the model:
Python:
response = requests.post(
    'https://api.withemissary.com/v1/ner',
    headers={
        'Content-Type': 'application/json',
        'X-API-Key': YOUR_API_KEY
    },
    data=json.dumps({
        'model': 'ex-ahejacuehandhehedqeef/0.0.0',
        'input': 'Sundar Pichai announced that Google will open a new research lab in Zurich next year.'
    })
)
print(response.json())

cURL:
curl https://api.withemissary.com/v1/ner \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $YOUR_API_KEY" \
  -d '{
    "model": "ex-ahejacuehandhehedqeef/0.0.0",
    "input": "Sundar Pichai announced that Google will open a new research lab in Zurich next year."
  }'

Response:
{
  "id": "ner-8cf67c5c3ff94550a930874cc261edbe",
  "model": "ex-ahejacuehandhehedqeef/0.0.0",
  "entities": {
    "LOCATION": [{"entity": "Zurich"}],
    "ORGANIZATION": [{"entity": "Google"}],
    "PERSON": [{"entity": "Sundar Pichai"}]
  },
  "created": 1779908815
}
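For the redaction use case, the extracted entities can be substituted back into the original text. Because the example response contains entity strings but no character offsets, this sketch falls back to plain string replacement, which is an assumption and will also replace any other occurrence of the same string. The redact helper is hypothetical.

```python
def redact(text: str, response: dict) -> str:
    """Replace each extracted entity with a [TYPE] placeholder."""
    spans = [
        (item["entity"], label)
        for label, items in response["entities"].items()
        for item in items
    ]
    # Replace longer entities first so a short entity never splits a longer one.
    for entity, label in sorted(spans, key=lambda span: len(span[0]), reverse=True):
        text = text.replace(entity, f"[{label}]")
    return text


# Entities copied from the example response above.
sample = {
    "entities": {
        "LOCATION": [{"entity": "Zurich"}],
        "ORGANIZATION": [{"entity": "Google"}],
        "PERSON": [{"entity": "Sundar Pichai"}],
    }
}
sentence = "Sundar Pichai announced that Google will open a new research lab in Zurich next year."
print(redact(sentence, sample))
# [PERSON] announced that [ORGANIZATION] will open a new research lab in [LOCATION] next year.
```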
Endpoint Reference
| Mode | Endpoint |
|---|---|
| judge, decision, routing | POST /v1/classification |
| tool_calling | POST /v1/classify-generate |
| regression | POST /v1/regression |
| ner | POST /v1/ner |
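If your application works with several modes, the table above can be encoded as a lookup so that call sites never hard-code paths. This is a small sketch; the ENDPOINTS dictionary and inference_url helper are hypothetical names, while the paths and base URL are taken from the examples in this guide.

```python
BASE_URL = "https://api.withemissary.com"

# Mode-to-endpoint mapping, mirroring the reference table.
ENDPOINTS = {
    "judge": "/v1/classification",
    "decision": "/v1/classification",
    "routing": "/v1/classification",
    "tool_calling": "/v1/classify-generate",
    "regression": "/v1/regression",
    "ner": "/v1/ner",
}


def inference_url(mode: str) -> str:
    """Return the full inference URL for a mode, or raise ValueError."""
    try:
        return BASE_URL + ENDPOINTS[mode]
    except KeyError:
        raise ValueError(f"unsupported mode: {mode!r}") from None


print(inference_url("tool_calling"))
# https://api.withemissary.com/v1/classify-generate
```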
Next Steps
The zero-shot model created here is just the starting point. Once you've validated the mode and class definitions against real traffic, you can:
- Upload labeled examples to fine-tune the model for higher accuracy
- Publish new versions and roll them out incrementally
- Monitor inference logs in the dashboard to spot edge cases
Head to the Datasets and Training sections of the docs to continue.