Home

Awesome

Azure Cognitive Semantic Search | Large documents | OpenAI enrichment

About

Azure Cognitive Search offers Semantic Search (preview), which uses context and meaning to enhance search results. This includes features like semantic re-ranking, captions and highlights, and semantic answers for more relevant and useful search results.Hoever it is challenging to get best Semantic query results on large documents (> 4/5 pages).

The accelerator enhances the results of Semantic queries on large documents by dividing the document into smaller chunks of 4000 characters (which can be customized), and creating a separate search document for each chunk within the search index. This allows the Semantic search engine to process smaller portions of information instead of the entire document, which can be difficult to handle.

To create multiple search document records within a search index, a two-step process is followed. Firstly, the documents are ingested into index 1 and the OCR skill is applied to extract text from images in the document. Then, the document is exported to an Azure Storage knowledgestore as a jsonArray as shown in example below.

In the second step, the exported jsonArray is ingested into a new search index 2, with each array item serving as a separate search document!

The search index is enhanced through the implementation of OpenAI Prompt custom skill, which is achieved via an Azure Function. The chunking of document provides advantage of retriving only the a portion of the larger document as part of query results, which then can be utilised for interactive follow-up prompts by the end user.

This accelerator utilizes the SharePoint Online (preview) connector to ingest data from SharePoint online. Additionally, it can function with other Cognitive Search supported data sources, including Azure Storage. This will require some alterations to both index definitions.

Features

Azure Services

How to deploy?

  1. Provision necessary Azure services

  2. Deploy Azure Function *

  3. Create Cognitive Search artefacts

    <br> * Code included in this repo.

Keys usage

Postman / POST requests

Azure Function

Looking for OpenAI + Redis Semantic Search Demo?

Looking for Azure OpenAI Lecture Generation (text-to-speech) Demo?

Other important information