EbbotGPT API
Unlock the Power of Advanced Language Models with Ebbot GPT API
Welcome to the Ebbot GPT API documentation. Our API offers seamless integration of service-optimized Large Language Models (LLMs) and Knowledge Management tools into your applications. With Ebbot GPT, you can harness sophisticated text generation capabilities and enhance your systems with cutting-edge AI technology. Explore our documentation to learn how to integrate and leverage the full potential of Ebbot GPT in your projects.
If you just want the OpenAPI documentation, you can find it here:
1. API Key
To make requests to the API, you will need an API key. To obtain one, please request access by filling out this form.
2. Tenant
A tenant is required for many requests. Tenants help separate token usage and datasets between different users. Typically, you’ll only need one tenant, but multiple tenants can be used if your business case requires it. Tenants are tied to your API key, ensuring your data remains secure.
3. Making a Request
ExternalId: The ID you chose when creating the tenant.
Auth: Authentication is done with a bearer token in the authorization header. Example:
"Authorization: Bearer MyAPIKey123"
3.1 List Models
To list the available models:
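The request itself is a plain GET with the bearer token from section 3. Here is a minimal sketch in Python (the base URL and the /models path are assumptions; check the OpenAPI documentation for the exact values):

```python
import urllib.request

# Placeholder host -- replace with the base URL from the OpenAPI documentation.
BASE_URL = "https://<your-ebbot-gpt-host>"
API_KEY = "MyAPIKey123"

def build_list_models_request() -> urllib.request.Request:
    """Build (but do not send) a GET request for the assumed /models endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {API_KEY}"},
        method="GET",
    )

req = build_list_models_request()
# To actually send it: urllib.request.urlopen(req), given a real host and key.
```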
3.2 Create a tenant
Tenants are automatically created when a request with an unknown tenant is made. If you want to create a tenant manually, use this:
(Replace externalId with the ID you want to use.)
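A sketch of the request body in Python, assuming the endpoint accepts a JSON object with an externalId field:

```python
import json

def tenant_payload(external_id: str) -> str:
    """Serialize the tenant-creation body; externalId is the ID you choose."""
    return json.dumps({"externalId": external_id})

body = tenant_payload("my-first-tenant")
```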
3.3 List tenants
To list the tenants you have created:
A response could look like this:
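As an illustration only (the field names below are hypothetical, not taken from the actual API), iterating over such a response in Python could look like:

```python
import json

# Hypothetical response body -- the real shape comes from the API itself.
raw = '{"tenants": [{"externalId": "my-first-tenant"}]}'
tenants = json.loads(raw)["tenants"]
for tenant in tenants:
    print(tenant["externalId"])
```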
3.4 Creating a tenant API key
You can create an API key for a tenant if you wish. With this key, the tenant can list models, create completions, and get usage. To create one:
A response could look like this:
Make sure you save the API key as it will only be shown once!
If you want to add rate-limit rules to the key, you can pass these as query parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| concurrent | integer | How many requests can be made at the same time. |
| tokenAllowance | integer | How many tokens can be used per minute. |
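Since these are query parameters, they can be appended with a standard URL encoder. The key-creation path in this sketch is an assumption; only the parameter names come from the table above:

```python
from urllib.parse import urlencode

# Documented rate-limit rules, encoded as query parameters.
params = urlencode({"concurrent": 5, "tokenAllowance": 10000})
# The path below is illustrative -- check the OpenAPI documentation.
url = f"https://<your-ebbot-gpt-host>/tenants/my-first-tenant/keys?{params}"
```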
3.5 List tenant API keys
If you want to list the API keys a tenant has you can use:
A response might look like this:
3.6 Revoke a tenant API key
To revoke a tenant API key:
3.7 Creating a chat completion
This is where the magic happens: getting a response from the bot. Here is a simple request with the bare necessities:
A response might look like this:
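Assembling the bare-necessities body can be sketched like this (the model ID is a placeholder; use one returned by the list-models call):

```python
import json

# Minimal chat-completion body: a model plus the conversation history.
payload = {
    "model": "<model-id-from-list-models>",  # placeholder, not a real model ID
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
body = json.dumps(payload)
```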
Request parameters
The following table describes the parameters you can use when configuring your request. They control different aspects of response generation and overall API behavior, from randomness in responses to advanced retrieval mechanisms; understanding them will help you tailor the output to your needs and leverage all available features.
| Parameter | Type | Description |
| --- | --- | --- |
| top_p | float | Top-p (nucleus) filtering selects the words with the highest probability that cumulatively reach a set threshold, keeping text generation diverse. (Alternative to temperature) |
| temperature | float | Controls the randomness of the response. Higher values make the output more random. |
| seed | integer | Attempts to make the answer deterministic. |
| maxTokens | integer | Maximum number of tokens used in the answer. |
| contentGuard | boolean | Flags inappropriate content from the user and the bot (e.g., sexual content, violence). |
| contentGuardConfig | object | Configuration for the content guard. |
| contentGuardConfig.categories | array | The categories that content guard should block (e.g., "Privacy", "Hate"). Each entry must be one of our predefined categories in order to work. |
| promptGuard | boolean | Flags if the user is trying to alter the bot's behavior. |
| rag | object | Retrieval Augmented Generation (embedder). |
| rag.output | string | Placeholder for the word that will be replaced by the response from the embedder in the system prompt (e.g., {documents}). |
| rag.dataSetId | string (UUID) | The ID of the dataset that should be used. |
| rag.returnSources | boolean | Whether the sources of the datasets should be returned. |
| rag.searchDefinitions | array | List of search definitions. |
| rag.references | boolean | Whether references should be added. This allows the model to add reference tags to the response. |
| rag.searchDefinitions.rnrK | integer | Retrieve and rerank: first pulls relevant information from a large dataset, then sorts it by importance or relevance. (Experimental) |
| rag.searchDefinitions.topK | number | Selects the top 'K' items from a list based on the highest relevance scores. |
| rag.searchDefinitions.numberOfMessages | integer | Number of messages to use from the conversation. |
| rag.searchDefinitions.filter | string | Filters based on specific roles (user, assistant, both, or const). |
| rag.searchDefinitions.searchString | string | If you want to search for a specific string instead of a whole message. |
| model | string | The model to be used for the request. |
| messages | array | The conversation history. |
| messages[].content | string | The content of the message. |
| messages[].role | enum | Who sent the message (e.g., system, user, assistant, tool). |
| messages[].reasoning_content | string | Agent reasoning. |
| messages[].tool_calls | array | Contains the tools used by the agent. |
| messages[].refrences | array | References to RAG documents. The rag.references option has to be turned on. |
| messages[].refrences[].id | string | The ID of the reference. |
| messages[].refrences[].position | object | Contains the position of the reference. |
| messages[].refrences[].position.start | integer | The start position of the reference. |
| messages[].refrences[].position.end | integer | The end position of the reference. |
| messages[].refrences[].content | string | The content between the tags of the reference. |
| messages[].tool_call_id | string | The ID of the tool call. |
| messages[].tool_calls[].id | string | The tool ID. |
| messages[].tool_calls[].type | enum | The type of the tool. This can only be 'function'. |
| messages[].tool_calls[].function | object | Information about the tool called. |
| messages[].tool_calls[].function.name | string | The name of the tool called. |
| messages[].tool_calls[].function.arguments | string | The arguments of the tool called. |
| response | object | Settings for the response type. |
| response.type | string | Either 'polled' or 'immediate', depending on whether you are making high- or low-priority requests. |
| response_format | object | Settings for structured output. |
| response_format.type | enum | The type of structured output. It can be 'json_schema' or 'text'. |
| response_format.json_schema | object | The definition of the JSON schema that the model will follow. |
| response_format.json_schema.description | string | A description of what the JSON schema represents. |
| response_format.json_schema.name | string | The name of the schema. |
| response_format.json_schema.schema | object (json) | The schema the model will follow. |
| response_format.json_schema.strict | boolean | Whether the schema should be followed strictly. |
| repairURLs | boolean | Looks through the generated response for URLs and replaces them with URLs from the system prompt, using a similarity check to find the best match. |
| mcp | string | The URL of an MCP server. |
| tools | array | Where you define tools that the model can use. |
| tools.type | enum | This can only be 'function'. |
| tools.function | object | The definition of the tool call. |
| tools.function.description | string | The description of the tool. Write a good description for best results. |
| tools.function.name | string | The name of the tool. Write a good name for best results. |
| tools.function.parameters | object (json) | The parameters of the tool. |
| tools.function.strict | boolean | Whether the parameters should be followed strictly. |
Once you have a completion, append the bot's response to the array of messages and continue the conversation with your next user input. It could look like this:
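A minimal sketch of that loop in Python (the reply content is invented for illustration):

```python
# Conversation so far.
messages = [{"role": "user", "content": "Hello!"}]

# Append the assistant's reply from the completion response (content invented here).
messages.append({"role": "assistant", "content": "Hi! How can I help?"})

# Then append the next user turn and send the whole array again.
messages.append({"role": "user", "content": "What can you do?"})
```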
4. Get token usage
To get the token usage between two dates:
The response might look something like this:
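The date range is passed alongside the request; the parameter names in this sketch are assumptions, not confirmed by this page, so verify them against the OpenAPI documentation:

```python
from urllib.parse import urlencode

# Hypothetical parameter names for the date range -- verify before use.
query = urlencode({"from": "2024-01-01", "to": "2024-01-31"})
```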