This page shows you how to conduct a smart search in a Catalog using a text prompt.
This API performs semantic search by embedding the user query with the preset Indexing Embed pipeline used by the Process Files operation. The query embedding is then compared to the embeddings of the chunks in the specified Catalog to find and return the most contextually similar chunks.
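The comparison step can be pictured with a small sketch. It assumes cosine similarity as the metric and uses made-up three-dimensional vectors; the source does not state which metric or embedding size the service actually uses, and the names `cosine_similarity` and `top_k_chunks` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunk_embeddings, k=5):
    """Return the k (chunk_id, score) pairs with the highest similarity."""
    scored = [
        (chunk_id, cosine_similarity(query_embedding, embedding))
        for chunk_id, embedding in chunk_embeddings.items()
    ]
    # Highest score first, then keep only the first k entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values, for illustration only).
chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.7, 0.7, 0.0],
    "chunk-c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]
results = top_k_chunks(query, chunks, k=2)
```

Here `results` holds the two chunks whose vectors point in the direction closest to the query, which is the sense in which the returned chunks are "contextually similar".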
#Retrieve Chunks via API
The Retrieve Chunks API allows you to perform a semantic search within a Catalog by providing a text prompt. This operation returns the most contextually similar chunks based on the provided text.
Replace NAMESPACE_ID with the Catalog owner's ID (namespace) and CATALOG_ID with the identifier of the Catalog you are searching.
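As a rough sketch of how the two placeholders fit into a request, the snippet below builds a URL and headers in Python. The route in `RETRIEVE_PATH` and the base URL `https://api.example.com` are assumptions made for illustration and are not taken from this page; check the API reference for the actual endpoint. The helper names are hypothetical.

```python
from urllib.parse import quote

# Assumption: illustrative route only; confirm the real path in the API reference.
RETRIEVE_PATH = "/v1alpha/namespaces/{namespace_id}/catalogs/{catalog_id}/chunks/retrieve"


def build_retrieve_url(base_url, namespace_id, catalog_id):
    """Substitute NAMESPACE_ID and CATALOG_ID into the (assumed) route."""
    path = RETRIEVE_PATH.format(
        namespace_id=quote(namespace_id, safe=""),
        catalog_id=quote(catalog_id, safe=""),
    )
    return base_url.rstrip("/") + path


def build_headers(api_token):
    """Headers for a JSON request authenticated with a Bearer token."""
    return {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }


# Placeholder values; replace them with your own host, IDs, and token.
url = build_retrieve_url("https://api.example.com", "my-namespace", "my-catalog")
headers = build_headers("MY_API_TOKEN")
```

Percent-encoding the IDs with `quote` keeps the path valid even if an identifier contains characters such as a slash or a space.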
#Body Parameters
- textPrompt (string, required): The text prompt to search for in the Catalog.
- topK (integer, optional): Specifies the number of similar chunks to return. Defaults to 5.
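A minimal sketch of assembling the request body from these two parameters follows. The function name `build_retrieve_body` is hypothetical, and the check that topK is a positive integer is an assumption; the source only says it is an optional integer.

```python
import json


def build_retrieve_body(text_prompt, top_k=None):
    """Serialize the request body; topK is left out so the server default applies."""
    if not isinstance(text_prompt, str) or not text_prompt.strip():
        raise ValueError("textPrompt is required and must be a non-empty string")
    body = {"textPrompt": text_prompt}
    if top_k is not None:
        # Assumption: only positive integers make sense for topK.
        if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
            raise ValueError("topK must be a positive integer")
        body["topK"] = top_k
    return json.dumps(body)


default_body = build_retrieve_body("What is Instill Core?")
custom_body = build_retrieve_body("What is Instill Core?", top_k=3)
```

Omitting topK from the payload, rather than sending 5 explicitly, leaves the default to the server.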
#Example Response
A successful response will return a list of similar chunks found in the Catalog:
{
  "similarChunks": [
    {
      "chunkUid": "ba30f524-889c-4dc7-82a2-33a8f7be2d47",
      "similarityScore": 0.95,
      "textContent": "Instill Core is a full-stack AI solution to accelerate AI development...",
      "sourceFile": "core-intro.txt"
    },
    {
      "chunkUid": "757ab6d9-e5b4-482e-8017-5582b578e57a",
      "similarityScore": 0.90,
      "textContent": "Transform unstructured data into a knowledge base with a unified format...",
      "sourceFile": "catalog-intro.pdf"
    }
  ]
}
#Output Description
- similarChunks (array of objects): An array where each object represents a similar chunk found in the Catalog.
  - chunkUid (string): The unique identifier of the chunk.
  - similarityScore (number): The similarity score between the input text prompt and the chunk content. Scores range from 0 to 1, with higher scores indicating greater relevance.
  - textContent (string): The content of the similar chunk.
  - sourceFile (string): The name of the source file from which the chunk was extracted.
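One way to consume these fields on the client side is sketched below. The `SimilarChunk` dataclass and `parse_similar_chunks` function are hypothetical names, not part of any SDK; the sample payload reuses the values from the example response.

```python
import json
from dataclasses import dataclass


@dataclass
class SimilarChunk:
    chunk_uid: str
    similarity_score: float
    text_content: str
    source_file: str


def parse_similar_chunks(response_text):
    """Convert the JSON response body into a list of SimilarChunk objects."""
    payload = json.loads(response_text)
    return [
        SimilarChunk(
            chunk_uid=item["chunkUid"],
            similarity_score=float(item["similarityScore"]),
            text_content=item["textContent"],
            source_file=item["sourceFile"],
        )
        for item in payload.get("similarChunks", [])
    ]


sample_response = """
{
  "similarChunks": [
    {"chunkUid": "ba30f524-889c-4dc7-82a2-33a8f7be2d47", "similarityScore": 0.95,
     "textContent": "Instill Core is a full-stack AI solution to accelerate AI development...",
     "sourceFile": "core-intro.txt"},
    {"chunkUid": "757ab6d9-e5b4-482e-8017-5582b578e57a", "similarityScore": 0.90,
     "textContent": "Transform unstructured data into a knowledge base with a unified format...",
     "sourceFile": "catalog-intro.pdf"}
  ]
}
"""
parsed_chunks = parse_similar_chunks(sample_response)
```

A missing `similarChunks` key is treated as an empty result here, which is a design choice of this sketch rather than documented API behavior.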
Notes:
- Ensure that the Authorization header contains a valid API token with the Bearer prefix.
- Adjust the topK parameter based on how many context chunks you want to retrieve for your search. If omitted, it defaults to 5.
- The API performs semantic search using embeddings, so the results will be based on contextual similarity rather than exact keyword matches.
#Error Responses
- 401 Unauthorized: Returned when the client credentials are not valid. Ensure your API token is correct and has the necessary permissions.
- default: An unexpected error response. The response will include an rpcStatus object with details about the error.
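A client might handle these two cases along the following lines. This is a sketch under assumptions: `RetrieveChunksError` and `check_response` are hypothetical names, and the `message` field is assumed from the common google.rpc.Status shape (code, message, details), which this page does not spell out.

```python
import json


class RetrieveChunksError(Exception):
    """Raised for any non-2xx response from the retrieve call."""

    def __init__(self, status_code, message):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_response(status_code, body_text):
    """Return None on success; raise RetrieveChunksError otherwise."""
    if 200 <= status_code < 300:
        return
    if status_code == 401:
        raise RetrieveChunksError(
            401, "invalid credentials; check the API token and its permissions"
        )
    try:
        # Assumption: the rpcStatus object carries a human-readable "message".
        message = json.loads(body_text).get("message", body_text)
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object; fall back to the raw text.
        message = body_text
    raise RetrieveChunksError(status_code, message)
```

Calling `check_response(status, body)` before parsing the payload keeps error handling in one place and surfaces the server's message when one is available.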