
Discover The Logic Behind Arc Max: Boost Your Browsing with AI

A Step-by-Step Tutorial for Summarizing and Asking Questions on Webpages

May 15, 2024

Tutorial

Application

#Introduction

This tutorial explores two powerful use cases of Instill Cloud pipelines that mimic the functionality of Arc Max's webpage interaction tools: "Ask on Page" and "5 Second Previews". Both pipelines utilize the same components to enhance webpage interactivity and provide instant information retrieval.

#Overview of Pipelines

Both the "Ask on Page" and "5 Second Previews" pipelines scrape a webpage and pass its content to a language model, each for a different purpose:

  1. Ask on Page: This pipeline fetches webpage content and processes inquiries about that content, delivering concise answers.

  2. 5 Second Previews: This pipeline generates brief summaries of webpage content, highlighting key information in a matter of seconds.

#How to Use the Pipelines

To use these pipelines, simply provide a URL of a webpage you want to explore. Here’s how you can leverage each pipeline effectively:

  • Ask on Page: Input a URL along with a specific question about the page's content. For example, input https://www.instill.tech and ask "What's the product?". Click 'Run' to receive an answer derived directly from the page.
Try it out:

Ask on Page

Answer questions related to a webpage.

The product is Instill Cloud, a no-code/low-code AI platform.

Reference on page → "Meet Instill Cloud, a no-code/low-code platform that accelerates AI application development by 10x."

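  • 5 Second Previews: Input a URL and click 'Run' to receive a short summary and a list of key points from the page.
Try it out: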

Webpage Summary

Enter a webpage, generate a preview.

Detailed guide on hiking the Seven Sisters Cliffs Walk in the UK

🚶 Trail Overview: 20 km walk, Seaford to Eastbourne, great views, no maps needed.

🚉 Directions & Tips: Accessible by train, check tides, avoid bad weather, bring good shoes.

📷 Trail Highlights: Scenic cliffs, iconic Cuckmere Haven beach, perfect for photography.

❤️ Personal Experience: Great for soul searching and bonding with friends during tough times.

🍦 End of Trail: Ends in Eastbourne with amenities, food, and opportunities to relax.

#Outputs Explained

  • Ask on Page Output:
    • Webpage Content: Excerpts from the webpage.
    • Answer: Direct answers to your questions, with references to specific parts of the page.
  • 5 Second Previews Output:
    • Webpage Content: Brief content snippets.
    • Summary: A comprehensive guide or overview of the main content.
    • Previews: Key points from the page, summarized with relevant keywords and emojis for a quick scan.

#Steps to Build the Pipeline

#1. Create a New Pipeline

  • Owner: Choose the owner (individual or organization).
  • Name: Enter the pipeline name in lowercase.
  • Description: Provide an optional description.
  • Visibility: Choose public or private visibility.
  • Action: Click 'Create'.
Step 1 : Create a New Pipeline

#2. Configure Trigger Component

  • Webpage URL Field (For both use cases):
    • Click 'Add field' in the Trigger Component.
    • Select 'Short text'.
    • Title: Enter Website URL. Edit the automatically generated Key if necessary (keys must be lowercase).
    • Click 'Save' to save the configuration.
  • Question Field (For Ask on Page use case):
    • Repeat the steps above, using Question as the Title.
Step 2 : Configure Trigger Component
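
As a rough illustration of the trigger configuration above, the two fields can be pictured as small records with a title, a key, and a type. This is a minimal sketch, not Instill Cloud's internal format: the helper `make_field` and the dictionary layout are hypothetical, and the keys `webpage` and `question` are assumed so that they match the `${trigger.webpage}` and `${trigger.question}` references used in the next step.

```python
from typing import Dict, List


def make_field(title: str, key: str, field_type: str = "Short text") -> Dict[str, str]:
    """Describe one trigger field (hypothetical layout); reject non-lowercase keys."""
    if key != key.lower():
        raise ValueError(f"key must be lowercase: {key!r}")
    return {"title": title, "key": key, "type": field_type}


# Keys are assumptions chosen to match the ${trigger.webpage} and
# ${trigger.question} references used when adding components.
TRIGGER_FIELDS: List[Dict[str, str]] = [
    make_field("Website URL", "webpage"),   # both use cases
    make_field("Question", "question"),     # Ask on Page only
]

for field in TRIGGER_FIELDS:
    print(f"{field['title']} -> ${{trigger.{field['key']}}}")
```

The point of the sketch is that the Key, not the Title, is what later references use, so the two must agree.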

#3. Add Components

  • Website Scraper:
    • Task: Choose 'Scrape Website'.
    • Query: Reference the URL from the Trigger Component by typing ${trigger.webpage}, where webpage is the Key of the URL field set in Step 2.
    • Max Number of Pages: Enter 1, for example, or reference another input field if the value should be configurable.
    • Configuration: Ensure 'Include Link Text' is enabled.
  • OpenAI for Text Generation:
    • Task: Select 'Text Generation'.
    • Model: Choose the model to be used, such as 'gpt-4-1106-preview'.
    • Prompt: Use references to integrate scraped content. For example:
      • Input for Ask on Page use case:
        Question: ${trigger.question}
        Title: ${web-scraper.output.pages[0].title}
        Link: ${web-scraper.output.pages[0].link}
        Text content: ${web-scraper.output.pages[0].link-text}
      • Input for 5 Second Previews use case:
        ${web-scraper.output.pages[0].link-text}
    • Response Format: Select 'text'.
    • Additional Configuration: Set system messages specific to the task.
    • API Key: Reference the OpenAI API key previously set under "Settings" > "Secrets" as ${secrets.openai-instill-api-key}.
Step 3 : Add Components
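
To make the ${...} reference syntax in the prompt above more concrete, here is a minimal Python sketch that substitutes references with values from nested sample data. It only illustrates how a path such as web-scraper.output.pages[0].link-text can be read; it is not how Instill Cloud resolves references internally. The function names `lookup` and `render`, and the sample title, are assumptions made for this example.

```python
import re
from typing import Any, Dict

REFERENCE = re.compile(r"\$\{([^}]+)\}")          # matches ${...}
PATH_PART = re.compile(r"([^\[\].]+)|\[(\d+)\]")  # matches a name or an [index]


def lookup(path: str, context: Dict[str, Any]) -> Any:
    """Walk a dotted path such as 'a.b[0].c' through nested dicts and lists."""
    value: Any = context
    for name, index in PATH_PART.findall(path):
        value = value[int(index)] if index else value[name]
    return value


def render(template: str, context: Dict[str, Any]) -> str:
    """Replace every ${path} in the template with the value found at that path."""
    return REFERENCE.sub(lambda match: str(lookup(match.group(1), context)), template)


# Sample data standing in for the trigger input and the scraper output.
example_context = {
    "trigger": {"question": "What's the product?"},
    "web-scraper": {
        "output": {
            "pages": [
                {
                    "title": "Example title",  # placeholder, not the real page title
                    "link": "https://www.instill.tech",
                    "link-text": "Meet Instill Cloud, a no-code/low-code platform.",
                }
            ]
        }
    },
}

ASK_PROMPT = (
    "Question: ${trigger.question}\n"
    "Title: ${web-scraper.output.pages[0].title}\n"
    "Link: ${web-scraper.output.pages[0].link}\n"
    "Text content: ${web-scraper.output.pages[0].link-text}"
)

print(render(ASK_PROMPT, example_context))
```

Each reference reads as component ID, then output, then the field path, with [0] selecting the first scraped page.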

#4. Configure Output in Response Component

  • General Output:
    • To return the content scraped from the webpage by the Website Scraper component, add a 'Webpage Text Content' field with Title Webpage Content, Key webpage-content and Value ${web-scraper.output.pages[0].link-text}.
  • Specific Outputs:
    • Ask on Page: To reference the output from the OpenAI Component (component ID ask), add an 'Answer' field with Title Answer, Key answer and Value ${ask.output.texts}.
    • 5 Second Previews: To reference the output from the OpenAI Component (component ID summarizer), add a 'Summary' field with Title Summary, Key summary and Value ${summarizer.output.texts}.
Step 4 : Configure Output in Response Component
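
The field mapping above can be sketched in Python as the response each pipeline might return. The keys webpage-content, answer, and summary come from the configuration above. The helper name `build_response`, the use-case labels, and the assumption that texts is a list of strings are illustrative and not taken from Instill's documentation.

```python
from typing import Any, Dict, List


def build_response(page_text: str, texts: List[str], use_case: str) -> Dict[str, Any]:
    """Assemble response fields under the keys configured in the Response Component."""
    # Key of the use-case-specific output field (labels on the left are hypothetical).
    output_keys = {"ask-on-page": "answer", "5-second-previews": "summary"}
    if use_case not in output_keys:
        raise ValueError(f"unknown use case: {use_case!r}")
    return {
        # Value of ${web-scraper.output.pages[0].link-text}
        "webpage-content": page_text,
        # Value of ${ask.output.texts} or ${summarizer.output.texts}
        output_keys[use_case]: texts,
    }


example = build_response(
    "Meet Instill Cloud, a no-code/low-code platform.",
    ["The product is Instill Cloud, a no-code/low-code AI platform."],
    "ask-on-page",
)
print(example["answer"][0])
```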

#5. Finalization and Usage

  • Save: Confirm all entries and save the pipeline.
  • Test: Fill in required inputs and run the pipeline to test.
  • Share: Share the pipeline URL or publish it to Instill Cloud for broader access.
  • Version Control: Use the release function for version control.
  • Integration: Integrate the pipeline into your applications using the API endpoint, accessible via Toolkit or Python SDK.
Step 5 : Finalization and Usage
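
For the integration step, the following is a minimal sketch of preparing an HTTP request to a pipeline using only the Python standard library. Everything specific here is an assumption: the URL and token are placeholders, and the payload shape {"inputs": [...]} and the Bearer header are guesses for illustration. Copy the actual endpoint, headers, and body from the pipeline's Toolkit before sending anything.

```python
import json
import urllib.request
from typing import Dict

# Placeholders only: replace with the values shown in your pipeline's Toolkit.
PIPELINE_URL = "https://example.invalid/your-pipeline/trigger"  # hypothetical endpoint
API_TOKEN = "YOUR_API_TOKEN"                                    # hypothetical token


def build_trigger_request(
    inputs: Dict[str, str],
    url: str = PIPELINE_URL,
    token: str = API_TOKEN,
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the trigger inputs."""
    body = json.dumps({"inputs": [inputs]}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_trigger_request(
    {"webpage": "https://www.instill.tech", "question": "What's the product?"}
)
print(request.get_method(), request.full_url)
# To send it once the real URL and token are in place:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the payload before any network call is made.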

Both pipelines rely on the same two components: the Website Scraper extracts the page content, and the OpenAI component turns that content into an answer or a summary.

Feel free to experiment with these pipelines on Instill Cloud to see how they can transform your interaction with web content, making information retrieval both efficient and enjoyable.
