Instill Types

Pipeline defines several Instill Types as data type identifiers. These types simplify the creation of pipelines by eliminating the complexity of converting unstructured data formats. The supported Instill Types include primitive data types such as boolean, string, integer, number, and json, as well as unstructured data types like file, document, image, audio, and video, along with array data types.

WARNING

In the variable section, the format field is used to specify the Instill Type of the variable. The format field will be changed to type in the future.

These Instill Types enable users to efficiently build pipelines that manage unstructured data in ETL workflows.

#Instill Types

Pipeline derives its Instill Types from JSON primitive types and MIME types (IANA media types).

#Primitive Data Types

  • string
  • number
  • integer
  • boolean
  • json
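As a minimal sketch, the primitive types above could be declared in a variable section as follows. The variable names and titles are hypothetical, and the `type` key follows the examples on this page; depending on your version, the field may still be named `format`, as the warning above notes.

```yaml
variable:
  question:            # hypothetical variable name
    title: Question
    type: string
  temperature:         # hypothetical variable name
    title: Temperature
    type: number
  max-results:         # hypothetical variable name
    title: Max Results
    type: integer
  verbose:             # hypothetical variable name
    title: Verbose
    type: boolean
  metadata:            # hypothetical variable name
    title: Metadata
    type: json
```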

#Unstructured Data Types

MIME types are defined as <type>/<subtype>, and Pipeline extends this to categorize unstructured data into five types:

  • image: e.g., image/jpeg
  • video: e.g., video/h264
  • audio: e.g., audio/wav
  • document: e.g., text/html, application/pdf
  • file: e.g., application/octet-stream
    • Note that file is a generic type for any file; the other types (image, video, audio, and document) are specialized file types with format-specific handling capabilities.
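The difference between the generic file type and the specialized types can be sketched in a variable section like the one below. This is an illustrative fragment only: the variable names and titles are hypothetical, and the `type` key follows the examples on this page.

```yaml
variable:
  attachment:          # hypothetical variable name
    title: Any File
    type: file         # generic type: accepts any file, e.g., application/octet-stream
  photo:               # hypothetical variable name
    title: A Photo
    type: image        # specialized type, e.g., image/jpeg
  report:              # hypothetical variable name
    title: A Report
    type: document     # specialized type, e.g., application/pdf or text/html
```

Choosing a specialized type where possible lets the pipeline apply the format handling described for that type, while file remains the fallback for anything else.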

#Auto-Conversion

With Instill Types, users don't need to handle type conversion manually. For instance, if a pipeline accepts a PNG image and a component requires JPEG, the two can be connected directly because both share the same Instill Type, image.

Example:


variable:
  image:
    title: A PNG Image
    type: image # User uploads an image
component:
  ai-0:
    type: openai
    task: TASK_TEXT_GENERATION
    input:
      prompt: What is in the image?
      prompt-images:
        - ${variable.image} # Component requires input in image type

Auto-conversion not only works within the same type but also supports cross-type conversions. For example, Pipeline can automatically convert a PDF document into text/markdown type.

Example:


variable:
  pdf:
    title: A PDF Document
    type: document # User uploads a PDF file
component:
  openai-0:
    type: openai
    task: TASK_TEXT_GENERATION
    input:
      prompt: ${variable.pdf} # Component requires text input

Supported Cross-Type Conversions:

  • document → text

This feature allows users to focus on building business models without worrying about data type conversions, resulting in cleaner, more efficient pipelines.

#Attribute Extraction

Beyond type representation, Instill Types provide attribute extraction capabilities. Supported attributes include:

| Type      | Attribute     | Description                         |
|-----------|---------------|-------------------------------------|
| All Files | :file-size    | Size of the file in bytes           |
| All Files | :filename     | Name of the file                    |
| All Files | :content-type | Content type of the file            |
| All Files | :data-uri     | Data URI representation of the file |
| All Files | :base64       | Base64 representation of the file   |
| Image     | :width        | Width of the image                  |
| Image     | :height       | Height of the image                 |
| Video     | :duration     | Duration of the video               |
| Video     | :width        | Width of the video                  |
| Video     | :height       | Height of the video                 |
| Video     | :frame-rate   | Frame rate of the video             |
| Audio     | :duration     | Duration of the audio               |
| Audio     | :sample-rate  | Sample rate of the audio            |

Example: Accessing the width and height of an image:


variable:
  image:
    title: A PNG Image
    type: image # User uploads an image in PNG type
output:
  bounding-boxes:
    title: Bounding Boxes
    value: ${ai-0.output.bounding-boxes}
  height:
    title: Image Height
    value: ${variable.image:height} # Get the image height
  width:
    title: Image Width
    value: ${variable.image:width} # Get the image width
component:
  ai-0:
    type: instill-model
    task: TASK_DETECTION
    input:
      image: ${variable.image}
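
The same `${variable.<name>:<attribute>}` syntax should carry over to the other attributes listed in the table. The fragment below is a sketch for a video input that combines attributes available on all files with video-specific ones. The variable name `clip` and the output names are hypothetical, and the fragment shows only the variable and output sections rather than a complete recipe.

```yaml
variable:
  clip:                # hypothetical variable name
    title: A Video Clip
    type: video
output:
  filename:
    title: File Name
    value: ${variable.clip:filename}    # attribute available on all files
  size:
    title: File Size in Bytes
    value: ${variable.clip:file-size}   # attribute available on all files
  duration:
    title: Video Duration
    value: ${variable.clip:duration}    # video-specific attribute
  frame-rate:
    title: Frame Rate
    value: ${variable.clip:frame-rate}  # video-specific attribute
```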