-rw-r--r-- | vendor/plugins/acts_as_xapian/README.txt | 139 |
1 files changed, 80 insertions, 59 deletions
diff --git a/vendor/plugins/acts_as_xapian/README.txt b/vendor/plugins/acts_as_xapian/README.txt
index 0ccab02ba..84fa14f5c 100644
--- a/vendor/plugins/acts_as_xapian/README.txt
+++ b/vendor/plugins/acts_as_xapian/README.txt
@@ -1,18 +1,19 @@
 Contents
 ========
 
-a. Introduction to acts_as_xapian
-b. Comparison to acts_as_solr (as on 24 April 2008)
-c. Documentation - indexing
-d. Documentation - querying
+* a. Introduction to acts_as_xapian
+* b. Installation
+* c. Comparison to acts_as_solr (as on 24 April 2008)
+* d. Documentation - indexing
+* e. Documentation - querying
 
 
 a. Introduction to acts_as_xapian
 =================================
 
-Xapian is a full text search engine library, which has Ruby bindings.
-acts_as_xapian adds support for it to Rails. It is an alternative to
-acts_as_lucene or acts_as_ferret.
+"Xapian":http://www.xapian.org is a full text search engine library, which has
+Ruby bindings. acts_as_xapian adds support for it to Rails. It is an
+alternative to acts_as_lucene or acts_as_ferret.
 
 Xapian is an *offline indexing* search library - only one process can have the
 Xapian database open for writing at once, and others that try meanwhile are
@@ -23,12 +24,22 @@ Instead, there is a ActsAsXapianJob model which stores which models need
 updating or deleting in the search index. A rake task 'xapian:update_index'
 then performs the updates since last change. Run it on a cron job, or similar.
 
-Xapian 1.0.5 and associated Ruby bindings are required.
+Xapian 1.0.5 and associated Ruby bindings are required. In Debian or Ubuntu
+install the packages libxapian15 and libxapian-ruby1.8.
 
 Email francis@mysociety.org with patches.
 
 
-b. Comparison to acts_as_solr (as on 24 April 2008)
+b. Installation
+===============
+
+Retrieve the plugin directly from the git version control system by running
+this command within your Rails app.
+
+    git clone git://github.com/frabcus/acts_as_xapian.git vendor/plugins/acts_as_xapian
+
+
+c. Comparison to acts_as_solr (as on 24 April 2008)
 =============================
 
 * Offline indexing only mode - which is a minus if you want changes
@@ -55,7 +66,7 @@ returning them mixed up together by relevancy. This is like multi_solr_search,
 only it is the default mode of operation and is properly supported.
 
 * No daemons - However, if you have more than one web server, you'll need to
-work out how to use Xapian's remote backend http://xapian.org/docs/remote.html.
+work out how to use "Xapian's remote backend":http://xapian.org/docs/remote.html.
 
 * One layer - full-powered Xapian is called directly from the Ruby, without
 Solr getting in the way whenever you want to use a new feature from Lucene.
@@ -63,47 +74,53 @@ Solr getting in the way whenever you want to use a new feature from Lucene.
 * No Java - an advantage if you're more used to working in the rest of the
 open source world. acts_as_xapian, it's pure Ruby and C++.
 
-* Xapian's awesome email list - the kids over at xapian-discuss are super
-helpful. Useful if you need to extend and improve acts_as_xapian. The
+* Xapian's awesome email list - the kids over at
+"xapian-discuss":http://lists.xapian.org/mailman/listinfo/xapian-discuss
+are super helpful. Useful if you need to extend and improve acts_as_xapian. The
 Ruby bindings are mature and well maintained as part of Xapian.
-http://lists.xapian.org/mailman/listinfo/xapian-discuss
 
 
-c. Documentation - indexing
+d. Documentation - indexing
 ===========================
 
-1. Put acts_as_xapian in your models that need search indexing.
+1. Put acts_as_xapian in your models that need search indexing. e.g.
 
-e.g. acts_as_xapian :texts => [ :name, :short_name ],
+    acts_as_xapian :texts => [ :name, :short_name ],
         :values => [ [ :created_at, 0, "created_at", :date ] ],
         :terms => [ [ :variety, 'V', "variety" ] ]
 
 Options must include:
-:texts, an array of fields for indexing with full text search
-    e.g. :texts => [ :title, :body ]
-:values, things which have a range of values for indexing, or for collapsing.
-    Specify an array quadruple of [ field, identifier, prefix, type ] where
-    - number is an arbitary numeric identifier for use in the Xapian database
-    - prefix is the part to use in search queries that goes before the :
-    - type can be any of :string, :number or :date
-    e.g. :values => [ [ :created_at, 0, "created_at" ], [ :size, 1, "size"] ]
-:terms, things which come after a : in search queries. Specify an array
-    triple of [ field, char, prefix ] where
-    - char is an arbitary single upper case char used in the Xapian database
-    - prefix is the part to use in search queries that goes before the :
-    e.g. :terms => [ [ :variety, 'V', "variety" ] ]
+
+* :texts, an array of fields for indexing with full text search.
+e.g. :texts => [ :title, :body ]
+
+* :values, things which have a range of values for sorting, or for collapsing.
+Specify an array quadruple of [ field, identifier, prefix, type ] where
+** number is an arbitary numeric identifier for use in the Xapian database
+** prefix is the part to use in search queries that goes before the :
+** type can be any of :string, :number or :date
+
+e.g. :values => [ [ :created_at, 0, "created_at" ], [ :size, 1, "size"] ]
+
+* :terms, things which come after a : in search queries. Specify an array
+triple of [ field, char, prefix ] where
+** char is an arbitary single upper case char used in the Xapian database
+** prefix is the part to use in search queries that goes before the :
+
+e.g. :terms => [ [ :variety, 'V', "variety" ] ]
+
 A 'field' is a symbol referring to either an attribute or a function which
 returns the text, date or number to index. Both 'number' and 'char' must be
 the same for the same prefix in different models.
 
 Alternatively,
-:instead_index, a field which refers to another model that should be reindexed
+* :instead_index, a field which refers to another model that should be reindexed
 instead of this one.
 
 Options may include:
-:eager_load, added as an :include clause when looking up search results in
+* :eager_load, added as an :include clause when looking up search results in
 database
-:if, either an attribute or a function which if returns false means the
+* :if, either an attribute or a function which if returns false means the
 object isn't indexed
 
 2. Make and run this database migration to create the ActsAsXapianJob model.
@@ -113,12 +130,10 @@ object isn't indexed
             create_table :acts_as_xapian_jobs do |t|
                 t.column :model, :string, :null => false
                 t.column :model_id, :integer, :null => false
-
                 t.column :action, :string, :null => false
             end
             add_index :acts_as_xapian_jobs, [:model, :model_id], :unique => true
         end
-
         def self.down
             remove_table :acts_as_xapian_jobs
         end
@@ -131,40 +146,46 @@ development/test/production dir in acts_as_xapian/xapiandbs.
 4. Then from a cron job or a daemon, or by hand regularly!, call 'rake xapian:update_index'
 
 
-d. Documentation - querying
+e. Documentation - querying
 ===========================
 
 If you just want to test indexing is working, you'll find this rake task
-useful (it has more options, see lib/tasks/xapian.rake)
-    rake xapian:query models="PublicBody User" query="moo"
+useful (it has more options, see tasks/xapian.rake)
 
-To perform a query call ActsAsXapian::Search.new. This takes in turn:
-    model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
-    query_string - Google like syntax, see below
-And then a hash of options:
-    :offset - Offset of first result
-    :limit - Number of results per page
-    :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
-    :sort_by_ascending - Default true, set to false for descending sort
-    :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
+    rake xapian:query models="PublicBody User" query="moo"
+
+To perform a query from code call ActsAsXapian::Search.new. This takes in turn:
+* model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
+* query_string - Google like syntax, see below
 
-Google like query syntax is as described in http://www.xapian.org/docs/queryparser.html
+And then a hash of options:
+* :offset - Offset of first result
+* :limit - Number of results per page
+* :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
+* :sort_by_ascending - Default true, set to false for descending sort
+* :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
+
+Google like query syntax is as described in
+    "Xapian::QueryParser Syntax":http://www.xapian.org/docs/queryparser.html
 Queries can include prefix:value parts, according to what you indexed in the
 acts_as_xapian part above. You can also say things like model:InfoRequestEvent
 to constrain by model in more complex ways than the :model parameter, or
 modelid:InfoRequestEvent-100 to only find one specific object.
 
 Returns an ActsAsXapian::Search object. Useful methods are:
-    description - a techy one, to check how the query has been parsed
-    matches_estimated - a guesstimate at the total number of hits
-    spelling_correction - the corrected query string if there is a correction, otherwise nil
-    words_to_highlight - list of words for you to highlight, perhaps with TextHelper::highlight
-    results - an array of hashes containing:
-        :model - your Rails model, this is what you most want!
-        :weight - relevancy measure
-        :percent - the weight as a %, 0 meaning the item did not match the query at all
-        :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
-
-
-
+* description - a techy one, to check how the query has been parsed
+* matches_estimated - a guesstimate at the total number of hits
+* spelling_correction - the corrected query string if there is a correction, otherwise nil
+* words_to_highlight - list of words for you to highlight, perhaps with TextHelper::highlight
+* results - an array of hashes each containing:
+** :model - your Rails model, this is what you most want!
+** :weight - relevancy measure
+** :percent - the weight as a %, 0 meaning the item did not match the query at all
+** :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
+
+
+For more details about anything, see source code in lib/acts_as_xapian.rb -
+please though do patch this file if there is documentation missing / wrong.
+It's called README.txt and is in git, using Textile formatting. The wiki page
+is just copied from the README.txt file.
 
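
For readers of this patch, the querying interface documented in section e of the README can be sketched roughly as follows. This is an illustration under assumptions, not code from the plugin: `xapian_page_options` is a hypothetical helper, the query value `small` is made up, and `PublicBody` / `User` are simply the model names the README's own rake example uses. The `ActsAsXapian::Search.new` argument order and the result keys are taken from the README text only.

```ruby
# Sketch only: assumes a Rails app with the acts_as_xapian plugin installed
# and PublicBody / User models indexed as in the README's examples.

# Hypothetical helper (not part of the plugin): turns a 1-based page number
# into the :offset / :limit options described in section e of the README.
def xapian_page_options(page, per_page, sort_by_prefix = nil)
  options = { :offset => (page - 1) * per_page, :limit => per_page }
  if sort_by_prefix
    options[:sort_by_prefix] = sort_by_prefix
    options[:sort_by_ascending] = false # descending sort; README says default is true
  end
  options
end

options = xapian_page_options(2, 25, "created_at")

# Only run the search where the plugin is actually loaded.
if defined?(ActsAsXapian)
  # "variety" is the example :terms prefix from section d; "small" is invented.
  search = ActsAsXapian::Search.new([PublicBody, User], "moo variety:small", options)
  puts "About #{search.matches_estimated} hits"
  search.results.each do |result|
    puts "#{result[:percent]}% #{result[:model].class.name}"
  end
end
```

Keeping the paging arithmetic in a small helper means it can be checked without a Xapian database; the search itself only runs inside an app where the plugin is present.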