What is Schema-Aware Semantic Search?

Schema-Aware Semantic Search with FactEngine

Victor Morgante


Schema-Aware Semantic Search has been delivered in multiple ways; this article focuses on how it is achieved using FactEngine.

Let us have a look at what Schema-Aware Semantic Search is.

The term, “Semantic Search”, generally means:

Returning results for a natural language query over a knowledge store, where the results are selected based on the meaning, gist or context of that query.

The content of the knowledge store may vary depending on the implementation of semantic search.

For example, the knowledge store could be:

  1. Unstructured data: e.g. documents (the contents of PDF files, word-processor or text-based documents), or such things as emails, Tweets or social media exchanges; or
  2. Structured data: e.g. a knowledge graph, product database, or any other database; or
  3. Random facts that are in some way structured: e.g. as in attempts to amalgamate structured and unstructured data in burgeoning, experimental ‘neural databases’.

NB: FactEngine operates over structured data. Where unstructured data and/or random facts have their metadata stored in a run-of-the-mill database as structured data, we call that ‘structured unstructured data’.

Semantic search over structured data may be classified as “Schema-Aware Semantic Search”.

Schema-Aware Semantic Search

If natural language queries are formulated with the person or machine asking the question having some awareness of the schema over which the query operates, then that is Schema-Aware Semantic Search. The ‘Semantic Search’ component comes from adopting natural language processing/understanding (NLP/NLU) techniques to transform the natural language query into a form that a computer can use to query a database that conforms to that schema.
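To make the idea of "transforming a natural language query into a form that a computer can use" concrete, here is a minimal sketch in Python. It is not how FactEngine works: a single regular expression stands in for real NLP/NLU, and the schema, table names, column names and sample data are all hypothetical, chosen only for illustration. What it does show is the schema-aware part: the question is only accepted if its nouns resolve to tables and columns that actually exist in the schema.

```python
import re
import sqlite3

# Hypothetical toy schema: table name -> column names (illustrative assumption).
SCHEMA = {
    "customer": ["id", "name", "city"],
    "product": ["id", "name", "category"],
}

# A regular expression stands in for NLP/NLU here; it accepts only questions
# shaped like "Show me the <noun> whose <column> is <value>".
PATTERN = re.compile(
    r"show (?:me )?(?:the )?(\w+) whose (\w+) is (.+)", re.IGNORECASE
)


def resolve_table(word):
    """Map a noun from the question to a table name, allowing a plural 's'."""
    word = word.lower()
    candidates = [word]
    if word.endswith("s"):
        candidates.append(word[:-1])
    for candidate in candidates:
        if candidate in SCHEMA:
            return candidate
    raise ValueError(f"no table in the schema matches '{word}'")


def to_sql(question):
    """Translate a narrow class of questions into a parameterised SQL query."""
    match = PATTERN.fullmatch(question.strip().rstrip("?."))
    if match is None:
        raise ValueError("question does not fit the supported pattern")
    noun, column, value = match.groups()
    table = resolve_table(noun)
    column = column.lower()
    if column not in SCHEMA[table]:
        raise ValueError(f"'{table}' has no column '{column}'")
    # Table and column names come from the schema, never from raw user text.
    return f"SELECT name FROM {table} WHERE {column} = ?", (value.strip(),)


def run_demo():
    """Run one question against an in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (name, city) VALUES (?, ?)",
        [("Alice", "London"), ("Bob", "Berlin"), ("Carol", "London")],
    )
    sql, params = to_sql("Show me the customers whose city is London")
    rows = conn.execute(sql + " ORDER BY name", params).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

Calling run_demo() translates the question into a parameterised SELECT over the customer table and runs it. A question that mentions a column the schema does not contain is rejected before any SQL is produced, which is the point of being schema-aware.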

Let’s look at an example:

Imagine a sales order database held in an SQL, graph or TypeDB database, and let us pick the famous Northwind database, supplied in various forms for such purposes by Microsoft.

Now imagine the following semantic query:

Show me the customers that ordered products in 2016 that are sold by