ownCloud

20. File Search Query Language

Context and Problem Statement

From the user's perspective, the search interface is a single form field in which the user enters one or more search terms. The minimum expectation is that the search returns file names and links to files that:

  • have a file name that contains at least one of the search terms
  • contain at least one of the search terms in the file contents
  • have metadata that equals or contains at least one of the search terms

Decision Drivers

  • The standard user should not be bothered by a query syntax
  • The power user should also be able to narrow their search with an efficient and flexible syntax
  • We need to consider different backend technologies which we need to access through an abstraction layer
  • Using different indexing systems should lead to a slightly different feature set without changing the syntax completely

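Considered Options

  • Keyword Query Language
  • Simplified Query
  • Lucene Query Language
  • Solr Query Language
  • Elasticsearch Query Language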

Decision Outcome

Chosen option: KQL - Keyword Query Language, because it enables advanced search across all platforms.

Positive Consequences

  • We can use the same query language in all clients

Negative Consequences

  • We need to build and maintain a backend connector

Pros and Cons of the Options

Keyword Query Language

The Keyword Query Language (KQL) is used by Microsoft SharePoint and other Microsoft services. It is built from very simple query elements: free-text keywords, property restrictions and operators.

  • Good, because we can fulfill all our current needs
  • Good, because it is very similar to the query language used in iOS
  • Good, because it supports date time keywords like “today”, “this week” and more
  • Good, because it can easily be extended with “shortcuts”, e.g. for document types like :presentation, which combine multiple MIME types
  • Good, because it is successfully implemented and used in similar use cases
  • Good, because it gives our clients the freedom to always use the same query language across all platforms
  • Good, because the Microsoft Graph API uses it, which should make a future transition easier
  • Bad, because we need to build and maintain a connector to different search backends (bleve, elasticsearch or others)
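
To make the idea of property restrictions and type “shortcuts” more concrete, here is a minimal sketch in Go. It is not the project's parser and it is not a complete KQL implementation: it only splits a query into free-text terms and `property:value` restrictions, and it does not handle boolean operators. The names `Restriction`, `parse`, `expandMediaType`, the `mediatype` property and the contents of `typeShortcuts` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Restriction is one parsed query element (hypothetical type).
// Property is empty for free-text terms.
type Restriction struct {
	Property string
	Value    string
}

// typeShortcuts is an illustrative shortcut table that maps one
// document type keyword to several MIME types.
var typeShortcuts = map[string][]string{
	"presentation": {
		"application/vnd.ms-powerpoint",
		"application/vnd.openxmlformats-officedocument.presentationml.presentation",
		"application/vnd.oasis.opendocument.presentation",
	},
}

// splitQuery splits on spaces but keeps double-quoted phrases together.
// The quote characters themselves are dropped.
func splitQuery(q string) []string {
	var tokens []string
	var cur strings.Builder
	inQuotes := false
	for _, r := range q {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case r == ' ' && !inQuotes:
			if cur.Len() > 0 {
				tokens = append(tokens, cur.String())
				cur.Reset()
			}
		default:
			cur.WriteRune(r)
		}
	}
	if cur.Len() > 0 {
		tokens = append(tokens, cur.String())
	}
	return tokens
}

// parse turns a query string into free-text terms and property restrictions.
// Boolean operators (AND, OR, NOT) are out of scope for this sketch.
func parse(q string) []Restriction {
	var out []Restriction
	for _, tok := range splitQuery(q) {
		parts := strings.SplitN(tok, ":", 2)
		if len(parts) == 2 && parts[0] != "" && parts[1] != "" {
			out = append(out, Restriction{Property: strings.ToLower(parts[0]), Value: parts[1]})
			continue
		}
		out = append(out, Restriction{Value: tok})
	}
	return out
}

// expandMediaType resolves a shortcut value to its MIME types. Values that
// are not shortcuts are returned unchanged; other properties yield nil.
func expandMediaType(r Restriction) []string {
	if r.Property != "mediatype" {
		return nil
	}
	if types, ok := typeShortcuts[strings.ToLower(r.Value)]; ok {
		return types
	}
	return []string{r.Value}
}

func main() {
	for _, r := range parse(`budget name:"annual report" mediatype:presentation`) {
		fmt.Printf("property=%q value=%q mime=%v\n", r.Property, r.Value, expandMediaType(r))
	}
}
```

A backend connector would then translate each `Restriction` into the native query of the chosen index (bleve, Elasticsearch or others), which is the maintenance cost named in the last bullet above.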

Simplified Query

Implement a very simple search approach: Return all files which contain at least one of the keywords in their name, path, alias or selected metadata.

  • Good, because that covers 80% of the users' needs
  • Good, because it is very straightforward
  • Good, because it is a suitable solution for GA
  • Bad, because it is below the industry standard
  • Bad, because it only provides one search query
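
The simplified approach can be sketched in a few lines of Go. This is an illustration under assumptions, not the actual implementation: the `File` type, its fields, and the functions `matchesAny` and `simpleSearch` are hypothetical, matching is a plain case-insensitive substring test, and file contents are not searched.

```go
package main

import (
	"fmt"
	"strings"
)

// File is a hypothetical, minimal view of an indexed file.
type File struct {
	Name     string
	Path     string
	Alias    string
	Metadata map[string]string
}

// matchesAny reports whether at least one keyword occurs (case-insensitively)
// in the file's name, path, alias, or any metadata value.
func matchesAny(f File, keywords []string) bool {
	fields := []string{f.Name, f.Path, f.Alias}
	for _, v := range f.Metadata {
		fields = append(fields, v)
	}
	for _, kw := range keywords {
		kw = strings.ToLower(kw)
		for _, field := range fields {
			if strings.Contains(strings.ToLower(field), kw) {
				return true
			}
		}
	}
	return false
}

// simpleSearch splits the raw input on whitespace and returns every file
// that matches at least one of the resulting keywords.
func simpleSearch(files []File, input string) []File {
	keywords := strings.Fields(input)
	var hits []File
	for _, f := range files {
		if matchesAny(f, keywords) {
			hits = append(hits, f)
		}
	}
	return hits
}

func main() {
	files := []File{
		{Name: "budget.xlsx", Path: "/finance/2023/budget.xlsx"},
		{Name: "notes.md", Path: "/personal/notes.md", Metadata: map[string]string{"tag": "Budget"}},
		{Name: "photo.jpg", Path: "/pictures/photo.jpg"},
	}
	for _, f := range simpleSearch(files, "budget invoice") {
		fmt.Println(f.Path)
	}
}
```

The sketch also shows the limitation named above: every input is interpreted the same way, so the user has no means to require all terms or to restrict a term to one field.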

Lucene Query Language

The Lucene Query Parser syntax supports advanced queries like term, phrase, wildcard, fuzzy search, proximity search, regular expressions, boosting, boolean operators and grouping. It is a well-known query syntax used by the Apache Lucene project. Popular platforms such as Wikipedia use Lucene-based search engines; Solr is a search server built on top of Lucene, not its successor.

  • Good, because it is a well documented and powerful syntax
  • Good, because it is very close to the Elasticsearch and the Solr syntax which enhances compatibility
  • Bad, because there is no powerful and well-tested query parser available for Go
  • Bad, because it adds complexity and fulfilling all the different query use-cases can be an “uphill battle”
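
For orientation, the query types listed above look roughly as follows in the classic Lucene query parser syntax. The Go snippet only collects example strings; it does not parse or execute them, and the field name `title` and the search terms are made up for illustration.

```go
package main

import "fmt"

// Example pairs a query feature with an example query string.
type Example struct {
	Feature string
	Query   string
}

// luceneExamples lists one illustrative query per feature.
func luceneExamples() []Example {
	return []Example{
		{"term", `budget`},
		{"fielded term", `title:budget`},
		{"phrase", `"annual report"`},
		{"wildcard", `repo*`},
		{"fuzzy", `roam~`},
		{"proximity", `"annual report"~10`},
		{"regular expression", `/[mb]oat/`},
		{"boosting", `budget^4 report`},
		{"boolean operators", `budget AND NOT draft`},
		{"grouping", `(budget OR invoice) AND report`},
	}
}

func main() {
	for _, e := range luceneExamples() {
		fmt.Printf("%-20s %s\n", e.Feature, e.Query)
	}
}
```

Each of these forms would need a counterpart in every supported backend, which gives a sense of the complexity mentioned in the last bullet above.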

Solr Query Language

Solr is highly reliable, scalable and fault-tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world’s largest internet sites.

  • Good, because it is a well documented and powerful syntax
  • Good, because it is very close to the Elasticsearch and the Lucene syntax which enhances compatibility
  • Good, because it has a strong community with large resources and knowledge
  • Bad, because it adds complexity and fulfilling all the different query use-cases can be an “uphill battle”

Elasticsearch Query Language

Elasticsearch provides a full Query DSL (Domain Specific Language) based on JSON to define queries. Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses: leaf query clauses, which look for a value in a field, and compound query clauses, which combine other leaf or compound clauses. Like Solr, Elasticsearch is built on Apache Lucene; it is not a successor of Solr.

  • Good, because it is a well documented and powerful syntax
  • Good, because it is very close to the Lucene and the Solr syntax which enhances compatibility
  • Good, because there is a stable and well-tested Go client which brings a query builder
  • Good, because it could be used as the query language which supports different search backends by just implementing what is needed for our use-case
  • Bad, because it adds complexity and fulfilling all the different query use-cases can be an “uphill battle”