Advanced Help.
Overview.
CDP's query engine relies largely on Lucene, an industry-standard search engine from Apache. This help screen draws heavily on its documentation.
Terms & Phrases. Queries are broken into terms and operators. Terms may be individual words or multiple words in double quotes (Africa or "African Studies"). Terms may be combined using operators (discussed later).
Documents and Fields. When you search, you are searching for documents. Each document is divided into fields. You search a particular field by typing the field name, followed by a colon,
followed by the search term, with no spaces between the field name, the colon, and the term. The term itself may contain spaces if it is properly quoted (e.g. content:"Harold Washington", description:mayor).
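As a rough illustration of the field:term rule, here is a minimal Python sketch that assembles such a clause. The helper name field_query is hypothetical and not part of CDP or Lucene; it only shows where the colon and the quotes go.

```python
def field_query(field: str, term: str) -> str:
    """Build a 'field:term' clause, quoting the term if it contains whitespace."""
    if any(ch.isspace() for ch in term):
        term = f'"{term}"'
    # No spaces are allowed between the field name, the colon, and the term.
    return f"{field}:{term}"


print(field_query("content", "Harold Washington"))  # content:"Harold Washington"
print(field_query("description", "mayor"))          # description:mayor
```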
CDP defines 5 fields (and more may be forthcoming):
content (default): Full content of the document (includes description, owner, keywords);
description: Description of the document;
owner: Document that owns the current document (e.g. a table owns variable groups, groups own fields);
name: Short name of the document (may not always be useful, can be a variable name); and
keywords: Any additional terms associated with a document (such as synonyms).
Additional fields: Two more fields are supported in the syntax but are not yet implemented: children, to return the 'owned' document from any matches to 'child' documents, and geocode, to determine which type of map is returned or where to center and zoom the map that is found.
Term Modifiers: wildcards and fuzzy searches. In the previous
examples, terms must be matched exactly. If you type 'Mayor', documents that contain "Mayoral" will
not be matched. To remedy this, you can specify wildcards: Mayor* matches any word that starts with
Mayor and is followed by 0..n other characters. While "*" matches 0..n characters, "?" matches exactly
one character. For instance, te?t matches tent, test and text (and telt, for that matter!).
?ent matches bent, dent, gent, lent, etc.
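To get a feel for how * and ? behave, the sketch below uses Python's standard fnmatch module as a stand-in. This is only an approximation of the wildcard rules described above, not the way the search engine itself evaluates them, and the word list is made up for the example.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, words):
    """Return the words matching a pattern where * is 0..n chars and ? is exactly one."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]


# Hypothetical sample vocabulary, for illustration only.
words = ["mayor", "mayoral", "mayors", "major", "tent", "test", "text", "tet"]

print(wildcard_matches("mayor*", words))  # ['mayor', 'mayoral', 'mayors']
print(wildcard_matches("te?t", words))    # ['tent', 'test', 'text']
```

Note that "tet" is not matched by te?t, because ? must stand for exactly one character.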
You can also use wildcards at the beginning or in the middle of a term. For instance,
ma*r matches mayor, major, marr. *ma*r matches Mayor, Major, Marr (like before) but also Almador,
amateur, Maarnaout. As an additional refinement, you can add a 'fuzzy' modifier (the tilde, ~) to
the end of the term to find words that are similar in spelling to a particular word: 'mayor~' returns
matches with marrou, mayro, major, myron. As you can see, this is less than exact (it is, in fact,
fuzzy :-) but can help you find words if they are misspelled in the documentation. The tilde (~), if
used with a multi-word phrase, has a different meaning. When a number N is provided, it matches not
just the exact phrase but any document in which the words are found within N words of each other. For instance,
"Chicago Mayor"~5 only returns documents that contain both chicago and mayor where these words are
within 5 words of each other.
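The two uses of the tilde can be pictured with the small Python sketch below. It assumes that 'similar in spelling' can be approximated by plain edit distance and that 'within N words' means a difference of at most N token positions. Both are simplifications for illustration; the engine's actual scoring and distance rules may differ, and the function names are hypothetical.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance: number of single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def within_n_words(text, first, second, n):
    """True if both words occur in text at token positions at most n apart."""
    tokens = re.findall(r"\w+", text.lower())
    pos_first = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_second = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(abs(i - j) <= n for i in pos_first for j in pos_second)


print(edit_distance("mayor", "major"))  # 1
print(edit_distance("mayor", "mayro"))  # 2

sentence = "The mayor of the city of Chicago spoke"  # made-up sample text
print(within_n_words(sentence, "Chicago", "Mayor", 5))  # True
print(within_n_words(sentence, "Chicago", "Mayor", 3))  # False
```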
Logic/Boolean operators: a.k.a. AND, OR and NOT.
You can combine terms and phrases through simple logic operators. By default, any document with
either of the terms will be returned (though documents with both will have higher scores). If
you place a plus sign in front of a term, the engine will require that the term be found in all
returned documents. If you place a minus sign in front of a term, documents containing that term will be
excluded. For instance, +Mayor +Special returns just the special mayoral elections.
+description:Mayor -description:General returns only primary elections. You can achieve the same
with a combination of (all-caps) AND and NOT. Our previous examples become Mayor AND Special
and description:Mayor AND NOT description:General. You can also specify 'OR', but because it is the
default conjunction, including it has no effect on most queries. You can also use the programmerly
&& for AND and ! for NOT.
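The plus and minus rules can be pictured with the following Python sketch. It is a simplified model for illustration, not the engine's implementation: scoring is ignored, and the function name and the three sample documents are invented.

```python
def matches(doc_terms, required=(), prohibited=(), optional=()):
    """Apply +term (required), -term (prohibited) and plain (optional) terms to one document."""
    terms = {t.lower() for t in doc_terms}
    if any(t.lower() not in terms for t in required):
        return False  # a +term is missing
    if any(t.lower() in terms for t in prohibited):
        return False  # a -term is present
    if not required:
        # With no required terms, at least one optional term must be present.
        return any(t.lower() in terms for t in optional)
    return True


# Hypothetical documents, each reduced to a list of terms.
docs = {
    "special": ["mayor", "special", "election"],
    "general": ["mayor", "general", "election"],
    "primary": ["mayor", "primary", "election"],
}

# +Mayor +Special
print([k for k, v in docs.items() if matches(v, required=["Mayor", "Special"])])
# ['special']

# +Mayor -General
print([k for k, v in docs.items() if matches(v, required=["Mayor"], prohibited=["General"])])
# ['special', 'primary']
```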
Grouping.
You can build much more complicated boolean logic by using parentheses to group clauses. For instance, (mayor* OR clerk) AND "San
Diego" finds records that contain either a word starting with mayor or the word clerk, but only if they also contain the phrase San Diego.
You may also group within a field by placing parentheses after the field name, e.g., description:(+mayor +"San Diego").
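One way to read the grouped query (mayor* OR clerk) AND "San Diego" is as an ordinary boolean expression, as in this Python sketch. The function name and the sample texts are made up, and the tokenizing is deliberately naive compared with what a real search engine does.

```python
import re
from fnmatch import fnmatchcase


def grouped_match(text):
    """Evaluate (mayor* OR clerk) AND "San Diego" against a piece of text."""
    tokens = re.findall(r"\w+", text.lower())
    has_mayor_prefix = any(fnmatchcase(t, "mayor*") for t in tokens)
    has_clerk = "clerk" in tokens
    # A phrase means the two words appear next to each other, in order.
    has_phrase = any(tokens[i:i + 2] == ["san", "diego"] for i in range(len(tokens) - 1))
    # The parentheses in the query correspond to the parentheses here.
    return (has_mayor_prefix or has_clerk) and has_phrase


print(grouped_match("San Diego mayoral election results"))  # True
print(grouped_match("City clerk of San Diego"))             # True
print(grouped_match("Chicago mayoral election results"))    # False
print(grouped_match("San Diego school board"))              # False
```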
Escaping characters. If you need to use
any of the special characters defined by the query parser, you must 'escape' them by placing a
backslash in front of the character. For instance, -1 will exclude all documents that contain '1' rather than
find those that contain -1. To search for -1 itself, you would have to type \-1.
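If you build queries in a program, the escaping can be automated. The Python sketch below assumes the set of special characters listed in Lucene's query parser documentation (+ - & | ! ( ) { } [ ] ^ " ~ * ? : \ /); check that list against the version CDP actually uses. The helper name escape_term is hypothetical.

```python
# Assumed special characters, taken from Lucene's query parser syntax documentation.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term):
    """Put a backslash in front of every special character in the term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


print(escape_term("-1"))     # \-1
print(escape_term("mayor"))  # mayor
```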