-rw-r--r-- vendor/plugins/acts_as_xapian/README | 1
-rw-r--r-- vendor/plugins/acts_as_xapian/README.txt | 170
-rw-r--r-- vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb | 160
3 files changed, 174 insertions, 157 deletions
diff --git a/vendor/plugins/acts_as_xapian/README b/vendor/plugins/acts_as_xapian/README
deleted file mode 100644
index b851372c8..000000000
--- a/vendor/plugins/acts_as_xapian/README
+++ /dev/null
@@ -1 +0,0 @@
-See extensive comments at top of lib/acts_as_xapian.rb for documentation.
diff --git a/vendor/plugins/acts_as_xapian/README.txt b/vendor/plugins/acts_as_xapian/README.txt
new file mode 100644
index 000000000..0ccab02ba
--- /dev/null
+++ b/vendor/plugins/acts_as_xapian/README.txt
@@ -0,0 +1,170 @@
+Contents
+========
+
+a. Introduction to acts_as_xapian
+b. Comparison to acts_as_solr (as on 24 April 2008)
+c. Documentation - indexing
+d. Documentation - querying
+
+
+a. Introduction to acts_as_xapian
+=================================
+
+Xapian is a full text search engine library, which has Ruby bindings.
+acts_as_xapian adds support for it to Rails. It is an alternative to
+acts_as_lucene or acts_as_ferret.
+
+Xapian is an *offline indexing* search library - only one process can have the
+Xapian database open for writing at once, and others that try meanwhile are
+unceremoniously kicked out. For this reason, acts_as_xapian does not support
+immediate writing to the database when your models change.
+
+Instead, there is an ActsAsXapianJob model which records which models need
+updating or deleting in the search index. A rake task 'xapian:update_index'
+then performs the pending updates. Run it from a cron job, or similar.
+
+Xapian 1.0.5 and associated Ruby bindings are required.
+
+Email francis@mysociety.org with patches.
+
+
+b. Comparison to acts_as_solr (as on 24 April 2008)
+===================================================
+
+* Offline indexing only mode - which is a minus if you want changes
+immediately reflected in the search index, and a plus if you were going to
+have to implement your own offline indexing anyway.
+
+* Collapsing - the equivalent of SQL's "group by". You can specify a field
+to collapse on, and only the most relevant result from each value of that
+field is returned, along with a count of how many there are in total.
+acts_as_solr doesn't have this.
+
+* No highlighting - Xapian can't return you text highlighted with a search
+query. You can try and make do with TextHelper::highlight (combined with
+words_to_highlight below). I found the highlighting in acts_as_solr didn't
+really understand the query anyway.
+
+* Date range searching - maybe this works in acts_as_solr, but I never found
+out how.
+
+* Spelling correction - "did you mean?" built in and just works.
+
+* Multiple models - acts_as_xapian searches multiple models if you like,
+returning them mixed up together by relevancy. This is like multi_solr_search,
+only it is the default mode of operation and is properly supported.
+
+* No daemons - However, if you have more than one web server, you'll need to
+work out how to use Xapian's remote backend http://xapian.org/docs/remote.html.
+
+* One layer - full-powered Xapian is called directly from the Ruby, without
+Solr getting in the way whenever you want to use a new feature from Lucene.
+
+* No Java - an advantage if you're more used to working in the rest of the
+open source world. acts_as_xapian is pure Ruby and C++.
+
+* Xapian's awesome email list - the kids over at xapian-discuss are super
+helpful. Useful if you need to extend and improve acts_as_xapian. The
+Ruby bindings are mature and well maintained as part of Xapian.
+http://lists.xapian.org/mailman/listinfo/xapian-discuss
+
+
+c. Documentation - indexing
+===========================
+
+1. Put acts_as_xapian in your models that need search indexing.
+
+e.g. acts_as_xapian :texts => [ :name, :short_name ],
+ :values => [ [ :created_at, 0, "created_at", :date ] ],
+ :terms => [ [ :variety, 'V', "variety" ] ]
+
+Options must include:
+:texts, an array of fields for indexing with full text search
+ e.g. :texts => [ :title, :body ]
+:values, things which have a range of values for indexing, or for collapsing.
+ Specify an array quadruple of [ field, number, prefix, type ] where
+ - number is an arbitrary numeric identifier for use in the Xapian database
+ - prefix is the part to use in search queries that goes before the :
+ - type can be any of :string, :number or :date
+ e.g. :values => [ [ :created_at, 0, "created_at", :date ], [ :size, 1, "size", :number ] ]
+:terms, things which come after a : in search queries. Specify an array
+ triple of [ field, char, prefix ] where
+ - char is an arbitrary single upper case char used in the Xapian database
+ - prefix is the part to use in search queries that goes before the :
+ e.g. :terms => [ [ :variety, 'V', "variety" ] ]
+A 'field' is a symbol referring to either an attribute or a function which
+returns the text, date or number to index. Both 'number' and 'char' must be
+the same for the same prefix in different models.
+
+Alternatively,
+:instead_index, a field which refers to another model that should be reindexed
+ instead of this one.
+
+Options may include:
+:eager_load, added as an :include clause when looking up search results in
+database
+:if, either an attribute or a function which, if it returns false, means the
+object isn't indexed
+
+2. Make and run this database migration to create the ActsAsXapianJob model.
+
+    class ActsAsXapianMigration < ActiveRecord::Migration
+      def self.up
+        create_table :acts_as_xapian_jobs do |t|
+          t.column :model, :string, :null => false
+          t.column :model_id, :integer, :null => false
+
+          t.column :action, :string, :null => false
+        end
+        add_index :acts_as_xapian_jobs, [:model, :model_id], :unique => true
+      end
+
+      def self.down
+        drop_table :acts_as_xapian_jobs
+      end
+    end
+
+3. Call 'rake xapian:rebuild_index models="ModelName1 ModelName2"' to build the index
+the first time (you must specify all your indexed models). It's put in a
+development/test/production dir in acts_as_xapian/xapiandbs.
+
+4. Then call 'rake xapian:update_index' regularly, from a cron job or a daemon, or by hand.
+
+
+d. Documentation - querying
+===========================
+
+If you just want to test indexing is working, you'll find this rake task
+useful (it has more options, see lib/tasks/xapian.rake)
+ rake xapian:query models="PublicBody User" query="moo"
+
+To perform a query call ActsAsXapian::Search.new. This takes in turn:
+ model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
+ query_string - Google like syntax, see below
+And then a hash of options:
+ :offset - Offset of first result
+ :limit - Number of results per page
+ :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
+ :sort_by_ascending - Default true, set to false for descending sort
+ :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
+
+Google like query syntax is as described in http://www.xapian.org/docs/queryparser.html
+Queries can include prefix:value parts, according to what you indexed in the
+acts_as_xapian part above. You can also say things like model:InfoRequestEvent
+to constrain by model in more complex ways than the model_classes parameter, or
+modelid:InfoRequestEvent-100 to only find one specific object.
+
+Returns an ActsAsXapian::Search object. Useful methods are:
+ description - a techy one, to check how the query has been parsed
+ matches_estimated - a guesstimate at the total number of hits
+ spelling_correction - the corrected query string if there is a correction, otherwise nil
+ words_to_highlight - list of words for you to highlight, perhaps with TextHelper::highlight
+ results - an array of hashes containing:
+ :model - your Rails model, this is what you most want!
+ :weight - relevancy measure
+ :percent - the weight as a %, 0 meaning the item did not match the query at all
+ :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
+
+
+
+
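The :offset and :limit options documented in README.txt above map naturally onto page numbers. What follows is a minimal sketch, not part of this commit: xapian_page_options is a hypothetical helper that does not exist in acts_as_xapian, and the Search.new call appears only in comments because it needs a Rails app with the plugin loaded and an index already built.

```ruby
# Hypothetical helper (NOT part of acts_as_xapian): converts a 1-based page
# number into the :offset and :limit options described in README.txt.
def xapian_page_options(page, per_page = 25)
  page = 1 if page < 1 # treat anything below the first page as page 1
  { :offset => (page - 1) * per_page, :limit => per_page }
end

# Inside a Rails app with the plugin loaded, the hash would be passed as the
# third argument, after model_classes and query_string (PublicBody is the
# example model used in README.txt):
#
#   search = ActsAsXapian::Search.new([PublicBody], "moo",
#                                     xapian_page_options(2, 10))
#   search.results.each do |result|
#     puts "#{result[:percent]}% #{result[:model].inspect}"
#   end
```

Keeping the arithmetic in one small function means the same page size is used for the query and for any pager links built from matches_estimated.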
diff --git a/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb b/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb
index 3983f5c19..976a5df19 100644
--- a/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb
+++ b/vendor/plugins/acts_as_xapian/lib/acts_as_xapian.rb
@@ -4,150 +4,13 @@
# Copyright (c) 2008 UK Citizens Online Democracy. All rights reserved.
# Email: francis@mysociety.org; WWW: http://www.mysociety.org/
#
-# $Id: acts_as_xapian.rb,v 1.20 2008-05-15 10:00:06 francis Exp $
+# $Id: acts_as_xapian.rb,v 1.21 2008-05-15 11:20:47 francis Exp $
# Documentation
# =============
#
-# Xapian is a full text search engine library, which has Ruby bindings.
-# acts_as_xapian adds support for it to Rails. It is an alternative to
-# acts_as_lucene or acts_as_ferret.
-#
-# Xapian is an *offline indexing* search library - only one process can have
-# the Xapian database open for writing at once, and others that try meanwhile
-# are unceremoniously kicked out. For this reason, acts_as_xapian does not
-# support immediate writing to the database when your models change.
-#
-# Instead, there is a ActsAsXapianJob model which stores which models need
-# updating or deleting in the search index. A rake task 'xapian:update_index'
-# then performs the updates since last change. Run it on a cron job, or
-# similar.
-#
-# Xapian 1.0.5 and associated Ruby bindings are required.
-#
-# Email francis@mysociety.org with patches.
-#
-#
-# Comparison to acts_as_solr (as on 24 April 2008)
-# ==========================
-#
-# * Offline indexing only mode - which is a minus if you want changes
-# immediately reflected in the search index, and a plus if you were going to
-# have to implement your own offline indexing anyway.
-#
-# * Collapsing - the equivalent of SQL's "group by". You can specify a field
-# to collapse on, and only the most relevant result from each value of that
-# field is returned. Along with a count of how many there are in total.
-# acts_as_solr doesn't have this.
-#
-# * No highlighting - Xapian can't return you text highlighted with a search
-# query. You can try and make do with TextHelper::highlight (combined with
-# words_to_highlight below). I found the highlighting in acts_as_solr didn't
-# really understand the query anyway.
-#
-# * Date range searching - maybe this works in acts_as_solr, but I never found
-# out how.
-#
-# * Spelling correction - "did you mean?" built in and just works.
-#
-# * Multiple models - acts_as_xapian searches multiple models if you like,
-# returning them mixed up together by relevancy. This is like multi_solr_search,
-# only it is the default mode of operation and is properly supported.
-#
-# * No daemons - However, if you have more than one web server, you'll need to
-# work out how to use Xapian's remote backend http://xapian.org/docs/remote.html.
-#
-# * One layer - full-powered Xapian is called directly from the Ruby, without
-# Solr getting in the way whenever you want to use a new feature from Lucene.
-#
-# * No Java - an advantage if you're more used to working in the rest of the
-# open source world. acts_as_xapian, it's pure Ruby and C++.
-#
-# * Xapian's awesome email list - the kids over at xapian-discuss are super
-# helpful. Useful if you need to extend and improve acts_as_xapian. The
-# Ruby bindings are mature and well maintained as part of Xapian.
-# http://lists.xapian.org/mailman/listinfo/xapian-discuss
-#
-#
-# Indexing
-# ========
-#
-# 1. Put acts_as_xapian in your models that need search indexing.
-#
-# e.g. acts_as_xapian :texts => [ :name, :short_name ],
-# :values => [ [ :created_at, 0, "created_at", :date ] ],
-# :terms => [ [ :variety, 'V', "variety" ] ]
-#
-# Options must include:
-# :texts, an array of fields for indexing with full text search
-# e.g. :texts => [ :title, :body ]
-# :values, things which have a range of values for indexing, or for collapsing.
-# Specify an array quadruple of [ field, identifier, prefix, type ] where
-# - number is an arbitary numeric identifier for use in the Xapian database
-# - prefix is the part to use in search queries that goes before the :
-# - type can be any of :string, :number or :date
-# e.g. :values => [ [ :created_at, 0, "created_at" ], [ :size, 1, "size"] ]
-# :terms, things which come after a : in search queries. Specify an array
-# triple of [ field, char, prefix ] where
-# - char is an arbitary single upper case char used in the Xapian database
-# - prefix is the part to use in search queries that goes before the :
-# e.g. :terms => [ [ :variety, 'V', "variety" ] ]
-# A 'field' is a symbol referring to either an attribute or a function which
-# returns the text, date or number to index. Both 'number' and 'char' must be
-# the same for the same prefix in different models.
-#
-# Alternatively,
-# :instead_index, a field which refers to another model that should be reindexed
-# instead of this one.
-#
-# Options may include:
-# :eager_load, added as an :include clause when looking up search results in
-# database
-# :if, either an attribute or a function which if returns false means the
-# object isn't indexed
-#
-# 2. Make and run the migration to create the ActsAsXapianJob model, code below
-# (search for ActsAsXapianJob).
-#
-# 3. Call 'rake xapian::rebuild_index models="ModelName1 ModelName2"' to build the index
-# the first time (you must specify all your indexed models). It's put in a
-# development/test/production dir in acts_as_xapian/xapiandbs.
-#
-# 4. Then from a cron job or a daemon, or by hand regularly!, call 'rake xapian:update_index'
-#
-#
-# Querying
-# ========
-#
-# If you just want to test indexing is working, you'll find this rake task
-# useful (it has more options, see lib/tasks/xapian.rake)
-# rake xapian:query models="PublicBody User" query="moo"
-#
-# To perform a query call ActsAsXapian::Search.new. This takes in turn:
-# model_classes - list of models to search, e.g. [PublicBody, InfoRequestEvent]
-# query_string - Google like syntax, see below
-# And then a hash of options:
-# :offset - Offset of first result
-# :limit - Number of results per page
-# :sort_by_prefix - Optionally, prefix of value to sort by, otherwise sort by relevance
-# :sort_by_ascending - Default true, set to false for descending sort
-# :collapse_by_prefix - Optionally, prefix of value to collapse by (i.e. only return most relevant result from group)
-#
-# Google like query syntax is as described in http://www.xapian.org/docs/queryparser.html
-# Queries can include prefix:value parts, according to what you indexed in the
-# acts_as_xapian part above. You can also say things like model:InfoRequestEvent
-# to constrain by model in more complex ways than the :model parameter, or
-# modelid:InfoRequestEvent-100 to only find one specific object.
-#
-# Returns an ActsAsXapian::Search object. Useful methods are:
-# description - a techy one, to check how the query has been parsed
-# matches_estimated - a guesstimate at the total number of hits
-# spelling_correction - the corrected query string if there is a correction, otherwise nil
-# results - an array of hashes containing:
-# :model - your Rails model, this is what you most want!
-# :weight - relevancy measure
-# :percent - the weight as a %, 0 meaning the item did not match the query at all
-# :collapse_count - number of results with the same prefix, if you specified collapse_by_prefix
+# See ../README.txt for documentation. Please update that file as you
+# change this code.
require 'xapian'
@@ -427,22 +290,7 @@ module ActsAsXapian
######################################################################
# Index
- # Offline indexing job queue model, create with this migration:
- # class ActsAsXapianMigration < ActiveRecord::Migration
- # def self.up
- # create_table :acts_as_xapian_jobs do |t|
- # t.column :model, :string, :null => false
- # t.column :model_id, :integer, :null => false
- #
- # t.column :action, :string, :null => false
- # end
- # add_index :acts_as_xapian_jobs, [:model, :model_id], :unique => true
- # end
- #
- # def self.down
- # remove_table :acts_as_xapian_jobs
- # end
- # end
+ # Offline indexing job queue model, create with migration in ../README.txt
class ActsAsXapianJob < ActiveRecord::Base
end
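
README.txt states that 'number' and 'char' must be the same for the same prefix in different models. That rule can be checked mechanically before rebuilding the index. The sketch below is an illustration only: conflicting_prefixes is a hypothetical function, not something the plugin provides, and it assumes the option hashes are collected by hand in the same shape that is passed to acts_as_xapian. Prefixes of :values and :terms are checked separately, which is also an assumption.

```ruby
# Hypothetical checker (NOT part of acts_as_xapian). Takes a hash of
# model name => acts_as_xapian options and returns the prefixes that are
# bound to different identifiers (number or char) in different models.
def conflicting_prefixes(options_by_model)
  seen = {}      # [kind, prefix] => identifier first seen for that prefix
  conflicts = []
  options_by_model.each_value do |options|
    [:values, :terms].each do |kind|
      (options[kind] || []).each do |entry|
        identifier, prefix = entry[1], entry[2]
        key = [kind, prefix]
        if seen.key?(key) && seen[key] != identifier
          conflicts << prefix
        else
          seen[key] = identifier
        end
      end
    end
  end
  conflicts.uniq
end

# Example option hashes, modelled on the README.txt example.
public_body = { :texts  => [:name, :short_name],
                :values => [[:created_at, 0, "created_at", :date]],
                :terms  => [[:variety, 'V', "variety"]] }
user        = { :texts  => [:name],
                :values => [[:created_at, 1, "created_at", :date]] }

# "created_at" uses number 0 in one model and 1 in the other, so it is reported.
puts conflicting_prefixes({ "PublicBody" => public_body, "User" => user }).inspect
```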