Newzbin:V3 Search
v3 Search Queries
The way search query strings are handled has been completely overhauled in v3 to be more powerful and intuitive, and to be usable by Watchdog as well as the main site search engine.
The Basics
Queries consist of a set of terms; generally, these will be words separated by spaces. Terms can match in any order, and by default they are all required to achieve a match (they are ANDed together).
Phrases
Sometimes you want to search for a set of words in the order they're specified, maybe including additional punctuation and even spaces. The usual way to do this is to enclose them in quotes:
"This is a phrase term" 'and so is this'
If you want to include quotes within a phrase search that are the same as those enclosing the phrase, escape them with a backslash, like this:
"This is a \"phrase\" term" 'and so \'is\' this'
Additionally, you can escape things outside phrases. Escaped characters are used verbatim instead of having any special meaning. For example, the following two queries are identical:
"This is a phrase term"
This\ is\ a\ phrase\ term
Boolean Operators
By default terms behave as though you'd typed AND between them. If you want to search for one term or another, OR them:
"one term" OR "another term" OR a bunch of other terms
If you want to negate part of a search, you can use NOT, or prefix the very start of a term with "-":
"one term" NOT "another term" OR "one term" -"another term"
Note that NOT only binds to the very next term; if you add anything to the end of the above example it will NOT be negated. Either specify NOT again, or put multiple terms you want to NOT together into a group (see the next section).
If you want to search for these words, quote or escape them:
"and" \and "or" \or "not" \not \-foo "-foo"
We plan on making boolean operators uppercase-only in future so you don't need to do this.
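The quoting and escaping rules above can be illustrated with a small tokenizer. This is only a sketch in Python, not the site's actual parser: the function name tokenize, the ("OP"/"TERM", text) token shape, and the treatment of operators as case-insensitive are assumptions made for the illustration, and groups and anchors are not handled here.

```python
OPERATORS = {"AND", "OR", "NOT"}


def tokenize(query):
    """Split a query into (kind, text) pairs, where kind is "OP" or "TERM".

    A word only counts as an operator if no part of it was quoted or escaped.
    (Hypothetical helper for illustration; not the real Newzbin parser.)
    """
    tokens = []
    chars, literal, started, quote = [], False, False, None

    def flush():
        # Emit the term built so far, then reset the per-term state.
        nonlocal chars, literal, started
        if started:
            text = "".join(chars)
            if not literal and text.upper() in OPERATORS:
                tokens.append(("OP", text.upper()))
            else:
                tokens.append(("TERM", text))
        chars, literal, started = [], False, False

    i = 0
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query):
            chars.append(query[i + 1])    # escaped character, taken verbatim
            literal = started = True
            i += 2
            continue
        if quote:
            if ch == quote:
                quote = None              # closing quote
            else:
                chars.append(ch)
        elif ch in "\"'":
            quote = ch                    # opening quote starts a phrase
            literal = started = True
        elif ch.isspace():
            flush()
        elif ch == "-" and not started:
            tokens.append(("OP", "NOT"))  # bare "-" at the very start of a term
        else:
            chars.append(ch)
            started = True
        i += 1
    flush()
    return tokens


print(tokenize(r'"and" \and or \-foo -foo'))
```

In this sketch the quoted and escaped forms of "and" both come out as plain terms, the bare "or" comes out as an operator, and only the unescaped leading "-" is turned into a NOT.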
Groups
The new parser tries to use sensible rules when deciding how to bind terms to operators, but you can be more explicit in how you want your query to be built by grouping terms together:
"one term" NOT (useless terms OR "also useless")
Without the brackets, as
"one term" NOT useless terms OR "also useless"
this query would be equivalent to:
("one term" AND terms AND NOT useless) OR ("also useless")
i.e. only the term "useless" is negated, everything else is still required, and the OR splits it into two entirely distinct queries.
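The binding rules described here (NOT takes only the next term, adjacent terms are ANDed, OR splits the query) can be sketched as a tiny recursive-descent parser. This is an illustration written for this page, not the engine's real code: the function parse, the nested-tuple output, and the pre-split token list (operators and brackets as bare strings, everything else a term) are all assumptions, and malformed input is not handled.

```python
def parse(tokens):
    """Parse a list of tokens into a nested tuple tree.

    Assumed precedence, lowest to highest: OR, AND (explicit or implied), NOT.
    Hypothetical sketch; malformed queries simply raise IndexError.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        branches = [parse_and()]
        while peek() == "OR":
            advance()
            branches.append(parse_and())
        return branches[0] if len(branches) == 1 else ("OR", *branches)

    def parse_and():
        parts = [parse_not()]
        # Keep collecting terms until an OR, a closing bracket, or the end.
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":
                advance()
            parts.append(parse_not())
        return parts[0] if len(parts) == 1 else ("AND", *parts)

    def parse_not():
        if peek() == "NOT":
            advance()
            return ("NOT", parse_not())   # binds to the very next term or group
        return parse_atom()

    def parse_atom():
        tok = advance()
        if tok == "(":
            node = parse_or()
            advance()                     # consume the closing ")"
            return node
        return tok

    return parse_or()


ungrouped = ["one term", "NOT", "useless", "terms", "OR", "also useless"]
grouped = ["one term", "NOT", "(", "useless", "terms", "OR", "also useless", ")"]
print(parse(ungrouped))
print(parse(grouped))
```

For the ungrouped token list the tree has OR at the top and only "useless" under NOT, which mirrors the expansion shown above; with the brackets, the whole group sits under NOT.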
Anchors
Sometimes you'll want to search for a term that's at the start or end of a title or subject, or more rarely find an exact title or subject. Anchors allow you to do this.
To specify that a term must appear at the start of the result, put ^ before it:
^start OR ^"start"
To anchor to the right, put $ at the end:
end$ OR "end"$
For an exact match, simply use both:
^"An exact match"$
Again, if you want to search for ^ or $ verbatim, quote or escape them. If you want to negate an anchored term using "-", the - must appear before the anchor:
-^"An exact match"$
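As a rough illustration of what the anchors mean, the following Python sketch checks a single term against a title. The function name matches_anchored is made up for this example, case-insensitive comparison is an assumption (this page does not say how case is treated), and escaped ^ or $ characters and the leading - are deliberately not handled.

```python
def matches_anchored(term, title):
    """Check one (possibly anchored) term against a title.

    Hypothetical helper: compares case-insensitively, which is an assumption.
    """
    at_start = term.startswith("^")
    at_end = term.endswith("$")
    text = term[1:] if at_start else term
    text = text[:-1] if at_end else text
    text = text.strip("\"'").lower()      # drop surrounding quotes, if any
    title = title.lower()
    if at_start and at_end:
        return title == text              # exact match
    if at_start:
        return title.startswith(text)
    if at_end:
        return title.endswith(text)
    return text in title                  # unanchored: match anywhere


print(matches_anchored('^"An exact match"$', "An exact match"))
print(matches_anchored('^"An exact match"$', "An exact match, part 2"))
```

Using both anchors reduces to an equality test, which is why the second call above does not match even though the title begins with the phrase.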
v3 Attribute Searches
The first thing to remember is that all attribute search terms are fully supported by the boolean engine, so the boolean and grouping tricks you learnt above apply here too.
At first, there was a separate search box for attribute queries; this is now being dropped, and you specify attribute queries in the main search box along with your other search terms.
Syntax
All attribute options follow this syntax:
Attr:[attr_type][=~][attr_bit]
Attr: is a fixed prefix that must be specified so the engine knows you're about to give an attribute option. It can be shortened to any of a:, at:, att: or attr:, and is case-insensitive.
attr_type is the attribute type, for example Language or Video Format. Types are case-insensitive, and you can remove spaces, so VideoFormat and videoformat will work too. What's more, you only need to specify enough to make a unique match, so VideoF will do just fine (just Video would also match Video Genre, so it isn't unique).
The character in the middle specifies whether you want an exact match, or just that the given bit has to be included.
- = means that the attribute should be exactly this bit.
  - Attr:Lang=English will return reports that are only English and nothing else.
- ~ means you want the given bit, but don't mind the presence of others.
  - Attr:Lang~English will return reports that are English, but may also have other Language bits set.
The attr_bit is the value you're actually interested in matching, for example DivX for Video Format. Again, this can be shortened, is case-insensitive, and spaces can be removed.
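The syntax rules above (prefix, unique shortening, = versus ~) can be sketched in Python as follows. Everything named here is hypothetical: the ATTRIBUTES table is made-up sample data (the real attribute list is not given on this page), and parse_attr, resolve and bit_matches are illustrative helpers, not part of any Newzbin API.

```python
import re

# Made-up sample data for illustration only; not the real attribute list.
ATTRIBUTES = {
    "Language": ["English", "French", "German"],
    "Video Format": ["DivX", "DVD", "XviD"],
    "Video Genre": ["Action", "Comedy"],
}

# Prefix (a:, at:, att: or attr:, any case), then type, = or ~, then bit.
ATTR_RE = re.compile(r"^(?:a|at|att|attr):([^=~]+)([=~])(.+)$", re.IGNORECASE)


def normalise(name):
    """Ignore case and spaces, as the syntax rules allow."""
    return name.replace(" ", "").lower()


def resolve(prefix, candidates):
    """Return the single candidate starting with prefix, or raise ValueError."""
    wanted = normalise(prefix)
    hits = [c for c in candidates if normalise(c).startswith(wanted)]
    if len(hits) != 1:
        raise ValueError(f"{prefix!r} matches {len(hits)} names, expected exactly 1")
    return hits[0]


def parse_attr(term):
    """Parse 'Attr:type=bit' or 'Attr:type~bit' into (type, operator, bit)."""
    m = ATTR_RE.match(term)
    if m is None:
        raise ValueError(f"not an attribute term: {term!r}")
    attr_type = resolve(m.group(1), ATTRIBUTES)
    attr_bit = resolve(m.group(3), ATTRIBUTES[attr_type])
    return attr_type, m.group(2), attr_bit


def bit_matches(operator, wanted, report_bits):
    """'=' requires exactly this bit; '~' only requires that it is present."""
    if operator == "=":
        return set(report_bits) == {wanted}
    return wanted in report_bits


print(parse_attr("Attr:VideoF~DVD"))
print(bit_matches("=", "English", ["English", "French"]))
print(bit_matches("~", "English", ["English", "French"]))
```

With this sample data, Attr:Video~DVD is rejected because Video could mean either Video Format or Video Genre, which is the uniqueness rule described above.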
Examples
You want English DVDs that may have other languages too:
- Attr:VideoF~DVD Attr:Language~English
- you could put AND in between them, but it's the default so can be omitted.
You want English DVDs that have not come from a CAM or Screener:
- Attr:VideoF~DVD Attr:Language~English -Attr:VideoS~CAM -Attr:VideoS~Screener
- note the use of - to deduct something you don't want.
You want Music Videos, but not live ones:
- Attr:Music~Video NOT Attr:Music~Live
- note the use of NOT. You could similarly prefix the second Attr with - instead.
You want to browse Music, but don't want Music Videos:
- -Attr:Music~Video
Since all the words can be shortened so long as they're unique, you can search for English stuff with as little as:
- a:l=e for items that are only English
- a:l~e for items that are English amongst other things (or indeed, have English as the only language bit set)
- neat huh?
If you have any better examples you think we could add here, please let us know.
List of Attributes
We'll do this soon.