This page explains how to use the Chat API, covering its functionality, parameters, example responses, and the underlying pipelines. The Chat API enables real-time interaction with your AI Assistant, allowing you to send messages and receive responses based on data in your Catalog.
# Overview
The Chat API facilitates the Retrieval-Augmented Generation (RAG) process, letting users send messages and receive personalized, context-aware replies. It uses two preset pipelines (sketched in code after this list):

- Convert to Standalone Question: Reformulates user questions into standalone queries for accurate document retrieval from a Vector Database.
  - Process:
    - Passes in the conversation history and the user's question.
    - Generates a revised query based on this input.
    - Retrieves relevant chunks from the Catalog using the revised query.
  - For more details, refer to the pipeline: Convert to Standalone Question.
- Reply to User Question: Uses retrieved chunks and conversation history to generate accurate, contextually aware responses.
  - Process:
    - Combines the standalone question with the retrieved and formatted chunks.
    - Generates a response using this combined input and the conversation history.
  - For more details, refer to the pipeline: Reply to User Question.
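Conceptually, the two pipelines chain together as follows. This is a minimal, hypothetical sketch of the flow, not the actual Instill pipelines or SDK; the helper functions are stand-ins for the steps described above.

```python
# Hypothetical sketch of the two-pipeline RAG flow described above.
# The helpers are stand-ins for each step, not real Instill SDK calls.

def revise_question(history: list[str], question: str) -> str:
    """Stand-in for Convert to Standalone Question: fold history into the query."""
    return question if not history else f"{question} (context: {history[-1]})"

def retrieve_chunks(query: str, top_k: int) -> list[str]:
    """Stand-in for similarity search against the Catalog's Vector Database."""
    return [f"chunk {i} matching {query!r}" for i in range(top_k)]

def generate_reply(history: list[str], question: str, context: str) -> str:
    """Stand-in for Reply to User Question: LLM answer grounded in the chunks."""
    return f"Answer to {question!r}, grounded in:\n{context}"

def chat(history: list[str], question: str, top_k: int = 5) -> str:
    standalone = revise_question(history, question)    # pipeline 1
    chunks = retrieve_chunks(standalone, top_k=top_k)  # retrieval from the Catalog
    context = "\n\n".join(chunks)
    return generate_reply(history, standalone, context)  # pipeline 2

print(chat(["Earlier we discussed Catalogs."], "How do I upload files?"))
```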
Note: The current pipelines use default prompts designed for general-purpose scenarios. Future updates will allow prompt and pipeline customization.
When using Instill Cloud, credits are consumed based on the tokens processed during retrieval and response generation. Longer conversations or higher `topK` values will consume more credits.
# Chat via API
Note that the `NAMESPACE_ID` and `APP_ID` path parameters must be replaced with the app owner's ID and the app ID, respectively.
# Body Parameters

- `catalogId` (string, required): The ID of the Catalog you want to interact with.
- `conversationUid` (string, required): The unique identifier of the conversation.
- `message` (string, required): The user's message or question.
- `topK` (integer, optional): The number of top relevant chunks to retrieve from the Catalog. We recommend setting it to 11-20 to give the LLM more context when answering the question.
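For illustration, here is a minimal Python sketch of a chat request using the `requests` library. The base URL, route, and `INSTILL_API_TOKEN` environment variable are assumptions inferred from the path parameters above; verify them against the API reference and substitute your own IDs and token.

```python
import os

import requests

# Assumed base URL and route; verify both against the API reference.
NAMESPACE_ID = "your-namespace-id"  # the app owner's ID
APP_ID = "your-app-id"              # the app ID
url = (
    "https://api.instill.tech"      # assumed base URL
    f"/v1alpha/namespaces/{NAMESPACE_ID}/apps/{APP_ID}/chat"  # assumed route
)

payload = {
    "catalogId": "your-catalog-id",
    "conversationUid": "existing-conversation-uid",  # must already exist
    "message": "What does the AI Assistant do?",
    "topK": 15,  # 11-20 recommended for more context
}

resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {os.environ['INSTILL_API_TOKEN']}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["outputs"][0]["message"])
```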
Notes:

- Ensure that the `conversationUid` corresponds to an existing conversation. You can create a conversation using the Create Conversation API.
- The Chat API supports streaming responses, allowing for real-time interaction and an improved user experience (see the sketch after these notes).
- If the request fails, you will receive an error response detailing the issue. Common errors include invalid credentials, incorrect parameters, or exceeding quota limits.
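The notes above say streaming is supported but do not pin down the wire format, so the following is only a sketch of one common pattern: consuming a line-delimited stream with `requests`. The URL is the same assumed route as in the previous example; adapt the parsing to the actual format documented in the API reference.

```python
import os

import requests

# Same assumed route as above; the line-delimited format is also an assumption.
url = (
    "https://api.instill.tech/v1alpha/namespaces/"
    "your-namespace-id/apps/your-app-id/chat"
)

with requests.post(
    url,
    json={
        "catalogId": "your-catalog-id",
        "conversationUid": "existing-conversation-uid",
        "message": "Summarize the uploaded files.",
    },
    headers={"Authorization": f"Bearer {os.environ['INSTILL_API_TOKEN']}"},
    stream=True,  # tell requests not to buffer the whole body
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:  # skip keep-alive blank lines
            print(line, flush=True)  # each line carries part of the reply
```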
# Example Response
A successful response will return a JSON object containing the AI Assistant's reply and the relevant chunks used.
{ "outputs": [ { "message": "The AI Assistant acts as a consumer application for the Catalog. Before creating an AI Assistant, users must first create a Catalog in Artifact and upload and process files. Once this is done, the AI Assistant can interact with the processed files in the Catalog, allowing users to ask questions related to the Catalog files through natural language." } ], "chunks": [ { "chunkUid": "chunk-uid-1", "similarityScore": 0.95, "textContent": "...The AI Assistant integrates with the Catalog to provide natural language interactions and RAG-based responses for enhanced user experience...", "sourceFile": "instill_app.pdf", "chunkMetadata": { "chunkUid": "chunk-uid-1", "retrievable": true, "startPos": 100, "endPos": 500, "tokens": 100, "createTime": "2024-10-14T12:00:00Z", "originalFileUid": "file-uid-1" } } ]}
# Output Description

- `outputs`: An array containing the AI Assistant's response.
  - `message` (string): The AI Assistant's reply to the user's message.
- `chunks`: An array of relevant chunks retrieved from the Catalog.
  - `chunkUid` (string): The unique identifier of the chunk.
  - `similarityScore` (float): The similarity score between the chunk and the user's message.
  - `textContent` (string): The content of the chunk.
  - `sourceFile` (string): The name of the source file the chunk originated from.
  - `chunkMetadata`: Additional metadata about the chunk.
    - `retrievable` (boolean): Indicates if the chunk is retrievable.
    - `startPos` (integer): The starting character position of the chunk in the source file.
    - `endPos` (integer): The ending character position of the chunk in the source file.
    - `tokens` (integer): The number of tokens in the chunk.
    - `createTime` (string): The timestamp when the chunk was created.
    - `originalFileUid` (string): The unique identifier of the original file.
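To make the field layout concrete, here is a small sketch that extracts the reply and lists the supporting chunks from a decoded response shaped like the example above. The literal below is an abbreviated copy of that example, standing in for `resp.json()`.

```python
# Abbreviated copy of the example response above, as a decoded Python dict.
response = {
    "outputs": [{"message": "The AI Assistant acts as a consumer application ..."}],
    "chunks": [
        {
            "chunkUid": "chunk-uid-1",
            "similarityScore": 0.95,
            "textContent": "...The AI Assistant integrates with the Catalog...",
            "sourceFile": "instill_app.pdf",
        }
    ],
}

# The assistant's reply is the message of the first output.
print(response["outputs"][0]["message"])

# List the supporting chunks as citations, most similar first.
for chunk in sorted(
    response["chunks"], key=lambda c: c["similarityScore"], reverse=True
):
    print(f'[{chunk["similarityScore"]:.2f}] {chunk["sourceFile"]}: '
          f'{chunk["textContent"][:60]}...')
```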
# Chat via Console
To chat using the Instill Console, follow these steps:
- Launch Instill Console on Instill Cloud.
- Navigate to the Applications page using the navigation bar.
- Click on the AI Assistant app you wish to interact with.
- Ensure you have uploaded and processed files in the Catalog. This allows the AI Assistant to retrieve relevant information.
- In the top-right dropdown menu, select the Catalog you want to converse with.
- Set an appropriate `topK` value (between 1 and 20) to control the number of relevant chunks retrieved.
- Type your question or message in the input field at the bottom of the chat interface.
- Press the Send button to submit your message.
- The AI Assistant's streaming response will appear, along with any citations and retrieved text chunks used to generate the reply.
Notes:
- Adjusting the `topK` parameter allows you to balance between the relevance of retrieved chunks and the performance of your application. A higher `topK` may provide more context for the LLM but can increase processing time and credit consumption.
- Keep an eye on your Instill Credits, especially when deploying applications at scale. Optimize your usage to prevent unexpected costs.
- Stay tuned for future updates that will allow pipeline customization, enabling you to tailor the AI Assistant's behavior to your specific use cases.
- Ensure that your application complies with data privacy regulations, especially when handling sensitive user data. Implement appropriate security measures to protect user information.