Home

Awesome

GraphAware Graph-Aided Search - RETIRED

GraphAware Graph-Aided Search Has Been Retired

As of May 2021, this repository has been retired.

ElasticSearch Plugin providing integration with Neo4j

GraphAware Graph-Aided Search is an enterprise-grade bi-directional integration between Neo4j and Elasticsearch. It consists of two independent modules plus test suites. Both modules can be used independently or together to achieve full integration.

The first module is a plugin for Neo4j (more precisely, a GraphAware Transaction-Driven Runtime Module), which can be configured to transparently and asynchronously replicate data from Neo4j to ElasticSearch.

The second module (this module) is a plugin for Elasticsearch that can query the Neo4j graph database during a search query to enrich the result (boost the score) by results that are more efficiently calculated in a graph database, e.g. recommendations.

Both modules are now open-source production-ready for everyone. They are also officially supported by GraphAware for <a href="http://graphaware.com/enterprise/" target="_blank">GraphAware Enterprise</a> subscribers.

Feature Overview: Graph-Aided Search

This module is a plugin for Elasticsearch that enables users to improve search results by boosting or filtering them using data stored in the Neo4j graph database. After performing a search in Elasticsearch, just before returning the results to the user, this plugin requests additional information from Neo4j via its REST API in order to boost or filter the results.

Two main features are exposed by the plugin:

Detailed workflow:

  1. Intercept and parse any "Search query" and try to find the GraphAidedSearch extension parameter;
  2. Process the query extension identifying the type of the extension (boosting or a filter), and instantiate the related class;
  3. Perform the operation required to boost or filter by calling the Neo4j REST API (or a Neo4j extension like Graphaware Recommendation Engine, passing all necessary information, e.g. Cypher query, target user, etc...;
  4. Return the filtered/boosted result set back to the user;

overview


Usage: Installation

Install Graph-Aided Search Binary

Elasticsearch 2.2.2:

$ $ES_HOME/bin/plugin install com.graphaware.es/graph-aided-search/2.2.2.0

Elasticsearch 2.3.1:

$ $ES_HOME/bin/plugin install com.graphaware.es/graph-aided-search/2.3.2.0

Build from source

$ git clone git@github.com:graphaware/graph-aided-search.git
$ mvn clean package
$ $ES_HOME/bin/plugin install file:///path/to/project/graph-aided-search/target/releases/graph-aided-search-2.X.X.0.zip

Start elasticsearch

Configuration

Then configure indexes with the url of Neo4j. This can be done in two ways. First:

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.hostname=http://localhost:7474
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.enable=true

indexname as your index name. e.g. curl -XPUT http://localhost:9200/neo4j-index-node/_settings?index.gas.enable=true

If the Neo4j Rest Api is protected by Basic Authentication confire username and password for neo4j in the following way:

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.user=neo4j
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.password=password

If the neo4j server supports bolt it can be enable and managed using the following configuration

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.boltHostname=bolt://localhost:7687
$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.neo4j.bolt.secure=false (default is true)

Since bolt is still not a stabel release, the default protocol is http, to enable it add the following line in the booster or filter configuration:

"protocol": "bolt"

Second, you can use also template to configure settings in the index:

    curl -XPOST http://localhost:9200/_template/template_gas -d \
    '{
      "template": "*",
      "settings": {
        "index.gas.neo4j.hostname": "http://localhost:7474",
        "index.gas.neo4j.boltHostname": "bolt://localhost:7687",
        "index.gas.enable": true,
        "index.gas.neo4j.user": "neo4j",
        "index.gas.neo4j.password": "password"
      }
    }'

Disable Plugin

$ curl -XPUT http://localhost:9200/indexname/_settings?index.gas.enable=false

Queries will continue to work even with Graph-Aided-Search-specific elements, e.g. "gas-boost" and "gas-filter".

Usage: Search Phase

The integration with a pre-existing search query is seamless, since the plugin only requires the addition of new elements into the query.

Booster example

Boosters allow to change the score by an external score source. This could be a recommender, a Cypher query, or any custom booster provider. A simple Elasticsearch query could have the following structure:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    }';

In this case all the Elasticsearch result hits will have a relevancy score value of 1. If you would like to boost these results according to user interest computed by Graphaware Recommendation Plugin on top of Neo4j, you would change the query in the following way.

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-booster" :{
          "name": "SearchResultNeo4jBooster",
          "target": "2",
          "maxResultSize": 10,
          "keyProperty": "objectId",
          "neo4j.endpoint": "/graphaware/recommendation/movie/filter/"
       }
  }';

The gas-booster clause identifies the type of operation, in this case it defines a boost operation. The name parameter is mandatory and allows to specify the Booster class. The remaining parameters depend on the type of the booster. In the following paragraph the available boosters are described.

SearchResultNeo4jBooster

This booster uses Neo4j through custom REST APIs available as plugins for the database. In this case, the name value must be set to SearchResultNeo4jBooster.

The following parameters are available for this booster:

Information about the list of IDs that should be boosted as well as the target is passed to the API running atop Neo4j. The REST API should expose a POST endpoint that accepts the following parameters:

Example Call:

http://localhost:7474/graphaware/recommendation/movie/filter/2

Parameters:
limit=2147483647&from=0&keyProperty=objectId&ids=99,166,486,478,270,172,73,84,351,120

This component supposes that the results are a json array with the following structure.

[
  {
    "nodeId": 1212,
    "objectId": "270",
    "score": 3
  },
  {
    "nodeId": 1041,
    "objectId": "99",
    "score": 1
  },
  {
    "nodeId": 1420,
    "objectId": "478",
    "score": 1
  },
  {
    "nodeId": 1428,
    "objectId": "486",
    "score": 1
  }
]

SearchResultCypherBooster

This booster uses Neo4j through custom REST APIs available as plugins for the database. In this case the name value must be set to SearchResultCypherBooster.

The following parameters are available for this booster:

The Elasticsearch result hits ids are passed as Cypher query parameter as a List of strings named items.

Example Use:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-booster" :{
          "name": "SearchResultCypherBooster",
          "query": "MATCH (input:User) WHERE id(input) = 2
                    MATCH p=(input)-[r:RATED]->(movie)<-[r2:RATED]-(other)
                    WITH other, collect(p) as paths
                    WITH other, reduce(x=0, p in paths | x + reduce(i=0, r in rels(p) | i+r.rating)) as score
                    WITH other, score
                    ORDER BY score DESC
                    MATCH (other)-[:RATED]->(reco)
                    RETURN reco.objectId as id, score
                    LIMIT 500",
          "maxResultSize": 1000,
          "scoreName": "score",
          "identifier": "id"
       }
  }';

Filter Example

Filters allow to filter the results using information stored in the graph. For example, you can filter movies based on what the user's friends have seen. If you would like to filter results according to a user's friends evaluation, it is possible to change the Elasticsearch query as follows:

  curl -X POST http://localhost:9200/neo4j-index/Movie/_search -d '{
    "query" : {
        "match_all" : {}
    },
    "gas-filter" :{
          "name": "SearchResultCypherFilter",
          "query": "MATCH (input:User) WHERE id(input) = 2
                   MATCH (input)-[f:FRIEND_OF]->(friend)-[r:RATED]->(movie)
                   WHERE r.rate > 3
                   RETURN movie.uuid as id",
          "exclude": false
       }
  }';

The gas-filter clause identifies the type of the operation; in this case a filter operation. The name parameter is mandatory and allows to specify the Filter class. The remaining parameters depends on the type of filter. In the following paragraph the available filters are described.

SearchResultCypherFilter

This filter allows to filter results using a Cypher query on Neo4j. In this case the name value must be set to SearchResultCypherFilter.

The following parameters are available for this filter:

Customize the plugin

The plugin allows to implement custom boosters and filters. In order to implement a booster, SearchResultBooster must be implemented and it needs to have the following annotation:

@SearchBooster(name = "MyCustomBooster")

Moreover, it should be in the package com.graphaware.es.gas.

In order to implement a filter, SearchResultFilter must be implemented and it needs to have the following annotation:

@SearchFilter(name = "MyCustomFilter")

Also in this case, it should be in the package com.graphaware.es.gas.

Version Matrix

The following version are currently supported

Version (this project)Elasticsearch
master2.4.4
2.3.2.22.3.2
2.3.1.02.3.1
2.2.2.02.2.2

Issues/Questions

Please file an issue.

License

Copyright (c) 2016 GraphAware

GraphAware is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.