ElasticSearch Cookbook (Second Edition)

Managing nested objects

There is a special type of embedded object: the nested object. It resolves a problem related to the Lucene indexing architecture, in which all the fields of the embedded objects are viewed as belonging to a single object. During a search in Lucene, it is therefore not possible to distinguish between the values of different embedded objects in the same multivalued array.

If we consider the previous order example, it's not possible to tell which item name goes with which quantity in a single query, because Lucene puts them all in the same Lucene document. To keep them apart, we need to index the items as separate documents and join them to their parent. This whole process is managed by nested objects and nested queries.
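
To make the problem concrete, here is a minimal Python sketch that imitates, with plain dictionaries, the difference between flattening an embedded object array into multivalued fields and keeping one small document per item. It does not call ElasticSearch or Lucene; the sample order and every function name are hypothetical and only model the behavior described above.

```python
# Toy model only: plain Python, no ElasticSearch or Lucene involved.
# The sample order and all function names are hypothetical.

order = {
    "id": "1234",
    "item": [
        {"name": "tshirt", "quantity": 10},
        {"name": "jeans", "quantity": 2},
    ],
}


def flatten_as_object(doc, field):
    """Imitate an embedded object: one multivalued field per sub-field."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def split_as_nested(doc, field):
    """Imitate a nested object: one separate small document per array entry."""
    return [{f"{field}.{k}": v for k, v in obj.items()} for obj in doc[field]]


def matches_flat(flat, name, quantity):
    # Each condition is checked against the whole multivalued field,
    # so values from different items can satisfy the query together.
    return name in flat["item.name"] and quantity in flat["item.quantity"]


def matches_nested(nested_docs, name, quantity):
    # Both conditions must hold inside the same per-item document.
    return any(
        d["item.name"] == name and d["item.quantity"] == quantity
        for d in nested_docs
    )


flat = flatten_as_object(order, "item")
nested_docs = split_as_nested(order, "item")

# The order contains no "tshirt" with quantity 2, yet the flat form matches.
print(matches_flat(flat, "tshirt", 2))           # True (a false positive)
print(matches_nested(nested_docs, "tshirt", 2))  # False
print(matches_nested(nested_docs, "jeans", 2))   # True
```

The false positive in the flat form is the situation that nested objects are meant to avoid.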

Getting ready

You need a working ElasticSearch cluster.

How to do it...

A nested object is defined as a standard object with the type nested.

From the example in the Mapping an object recipe in this chapter, we can change the type from object to nested as follows:

{
  "order" : {
    "properties" : {
      "id" : {"type" : "string", "store" : "yes", "index":"not_analyzed"},
      "date" : {"type" : "date", "store" : "no", "index":"not_analyzed"},
      "customer_id" : {"type" : "string", "store" : "yes", "index":"not_analyzed"},
      "sent" : {"type" : "boolean", "store" : "no", "index":"not_analyzed"},
      "item" : {
        "type" : "nested",
        "properties" : {
          "name" : {"type" : "string", "store" : "no","index":"analyzed"},
          "quantity" : {"type" : "integer", "store" : "no","index":"not_analyzed"},
          "vat" : {"type" : "double", "store" : "no","index":"not_analyzed"}
        }
      }
    }
  }
}

How it works...

When a document is indexed, if an embedded object is marked as nested, it's extracted from the original document and indexed in a new external document.

In the preceding example, we reused the mapping from the previous recipe, Mapping an object, but changed the type of item from object to nested. No other action is required to convert an embedded object to a nested one.

Nested objects are special Lucene documents that are saved in the same block of data as their parents. This approach allows faster joining with the parent document.

Nested objects are not searchable with standard queries, but only with nested ones. They are not shown in standard query results.
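
As an illustration of what a nested query can look like, the following Python sketch builds a search body that requires an item name and a quantity to match inside the same nested item. The field names come from the mapping above; the helper name nested_item_query and the search values are hypothetical, and the exact query syntax accepted may vary with the ElasticSearch version in use.

```python
import json


def nested_item_query(name, quantity):
    """Build a search body with a nested query on the "item" path.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    return {
        "query": {
            "nested": {
                "path": "item",  # the nested field defined in the mapping
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"item.name": name}},
                            {"term": {"item.quantity": quantity}},
                        ]
                    }
                },
            }
        }
    }


body = nested_item_query("tshirt", 2)
print(json.dumps(body, indent=2))
```

The printed JSON is meant to be used as the body of a search request against the index that holds the order documents.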

The lifecycle of nested objects is tied to that of their parents; deleting or updating a parent automatically deletes or updates all its nested children. When a parent is changed, ElasticSearch does the following:

  • Marks the old document as deleted
  • Marks all its nested documents as deleted
  • Indexes the new version of the document
  • Indexes all its nested documents
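
The four steps above can be sketched with a toy model in which a segment is just a Python list and every entry carries a deleted flag. This is not how ElasticSearch is implemented; the names index_order, update_order and live are hypothetical, and writing the nested entries just before their parent is an assumption of this sketch about how a block is laid out.

```python
# Toy model only: a "segment" is a list of entries with a deleted flag.
# All names here are hypothetical and nothing talks to ElasticSearch.

def index_order(segment, order):
    """Append the nested item entries and their parent as one contiguous block."""
    for item in order["item"]:
        segment.append({"kind": "nested", "order_id": order["id"],
                        "data": item, "deleted": False})
    parent = {k: v for k, v in order.items() if k != "item"}
    segment.append({"kind": "parent", "order_id": order["id"],
                    "data": parent, "deleted": False})


def update_order(segment, order):
    """Replace an order: old entries are only flagged, new ones are appended."""
    # Mark the old parent and all its nested entries as deleted.
    for entry in segment:
        if entry["order_id"] == order["id"]:
            entry["deleted"] = True
    # Index the new version together with all its nested entries.
    index_order(segment, order)


def live(segment):
    """Return the entries that are not flagged as deleted."""
    return [entry for entry in segment if not entry["deleted"]]


segment = []
index_order(segment, {
    "id": "1234", "sent": False,
    "item": [{"name": "tshirt", "quantity": 10},
             {"name": "jeans", "quantity": 2}],
})
update_order(segment, {
    "id": "1234", "sent": True,
    "item": [{"name": "tshirt", "quantity": 10}],
})

print(len(segment))        # 5: three old entries flagged, two new ones appended
print(len(live(segment)))  # 2: the new parent and its single nested item
```

The point of the model is that even a small change to the parent rewrites the whole block, so orders with many items are more expensive to update.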

There's more...

Sometimes, it is necessary to propagate information about nested objects to their parents or their root objects, mainly to build simpler queries about their parents. To achieve this goal, the following two special properties of nested objects can be used:

  • include_in_parent: This allows you to automatically add the nested fields to the immediate parent
  • include_in_root: This adds the nested objects' fields to the root object

These settings add to data redundancy, but they reduce the complexity of some queries, improving performance.
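
As a sketch of where these properties go, the following Python snippet rebuilds the item part of the mapping used in this recipe and optionally switches on one or both flags. The helper name nested_item_mapping is hypothetical; the field definitions are copied from the mapping above, and the flags are set as booleans next to the nested type.

```python
import json


def nested_item_mapping(include_in_parent=False, include_in_root=False):
    """Return the mapping of the nested "item" field (hypothetical helper)."""
    mapping = {
        "type": "nested",
        "properties": {
            "name": {"type": "string", "store": "no", "index": "analyzed"},
            "quantity": {"type": "integer", "store": "no",
                         "index": "not_analyzed"},
            "vat": {"type": "double", "store": "no", "index": "not_analyzed"},
        },
    }
    # The flags are added only when requested, so the default output
    # is the same item mapping shown earlier in the recipe.
    if include_in_parent:
        mapping["include_in_parent"] = True
    if include_in_root:
        mapping["include_in_root"] = True
    return mapping


item_mapping = nested_item_mapping(include_in_parent=True)
print(json.dumps({"item": item_mapping}, indent=2))
```

With a flag enabled, the item fields are also copied to the enclosing object, so a standard query can reach them at the cost of the redundancy mentioned above.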

See also

  • The Managing a child document recipe in this chapter