- Fix for Redis list `inject`.
- Fix for exact partial `load` interface.
- Automatic segmenter handles indexes with symbol keys.
- Several semi-internal APIs have changed (`Backend#load`, for example).
- Experimental: Add boolean option `force_update` to `Picky::Index#add` and `Picky::Category#add`. Default is `false` – the index will only be partially updated. Index parts which already have the passed "thing" stored will not be updated by default. This impacts ordering (if id 1 was already in, it will not be unshifted to the ids). If `Picky::Results#order_by` is used, then ordering will not be impacted.
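
The default-versus-forced behaviour described above can be pictured with a plain array of ids. This is a minimal sketch and not Picky's implementation: `add_id` is a hypothetical helper, and the array stands in for one index part.

```ruby
# Hypothetical helper: adds an id to the front of an ids list.
# Without force_update, an id that is already stored keeps its position.
def add_id(ids, id, force_update: false)
  if ids.include?(id)
    return ids unless force_update
    ids.delete(id) # forced: remove the old entry so it can move to the front
  end
  ids.unshift(id)
end

p add_id([3, 1, 2], 1)                     # id 1 stays where it was
p add_id([3, 1, 2], 1, force_update: true) # id 1 is unshifted to the front
```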
- Do not set default encodings anymore.
- Handle LoadError.
- Add experimental explicit `Picky::Indexes.optimize_memory` and `Picky::Index#optimize_memory` calls.
- Also use Google hashes for partial indexes when `optimize :no_dump` is specified on an index.
- Use Google hashes when `optimize :no_dump` is specified on an index. The index can then not be dumped/loaded from file anymore.
- Experimental: Require 'google_hash' to enable Picky to use denser hashes.
- Using multiple different stemmers is now possible.
- Stemming is now done per-category, and is not defined anymore on `Search`, although that's still an option.
- Currently, only a single type of stemmer is possible to use.
- Symbol keys can now be used with an OR query.
- Symbol keys can now be used with facets.
- Picky can now use Symbols internally. Only use with Ruby 2.2.0.
- Use both `Index#symbol_keys(bool)` and `Search#symbol_keys` to enable.
- Change default range character from - to …. This will likely cause breakage. So I hope you read this. Brown M&Ms.
- Fix bug when accessing category name.
- Optimization: Stop sorting allocations whose results won't be shown.
- Adds `Results#sort_by` to Picky results. Allows arbitrary sorting.
- Searching for similarity "hullo~" will also return results for "hullo" (the string itself).
- Add option `format` to `Category#id`. E.g. `id :number, :format => :to_i`. Tells Picky to convert the `number` attribute to an Integer. Same functionality as `Category#key_format`.
- Add method `id` to category. E.g. `id :number`. Tells Picky to use method `number` instead of the default `id`.
- Rename "aux" folder to "tools".
- Still `require 'strscan'`.
- Explicitly allow `FalseClass` and `String` instances on `Index#stopwords` and `Search#stopwords`.
- Explicitly allow `FalseClass` instances on `Index#remove_characters` and `Search#removes_characters`.
- Added `Index#static` option, which will cause the realtime index not to be used when using a source on an index. We recommend using this when you do not often change index data, but load the index once and then run the engine for searches.
- Fix rspec matcher loading.
- Loosen gem restrictions on activesupport.
- Only add have_categories matcher to client/spec helpers if RSpec::Matcher exists.
- Various small bugfixes and refactorings.
- Fixes made to the `Picky::Splitters::Automatic` algorithm.
- Updated Picky Javascript: The success callback gets (data, query) instead of (data). The query is the query of the request that resulted in the success callback.
- Fix order of categories in example.
- Avoid Rails 4 logger deprecation warning (thanks @albandiguer).
- Explicitly require 'strscan'.
- Picky uses `StringScanner` instead of `String#split` to reduce `String` usage. If only a single word is used in a query, no new strings are created (formerly 3), reducing the number of GC runs (and enhancing performance). If two words are used, two new ones are created instead of 5, and so on. Thanks to @kasparschiess and @zmoazeni.
- Picky core has been slightly optimised.
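
To illustrate the `StringScanner` approach mentioned above, here is a minimal sketch of walking over the words of a query. `each_word` is a hypothetical helper; it makes no claim about Picky's actual tokenizer or its allocation counts.

```ruby
require 'strscan'

# Hypothetical helper: yields each whitespace-separated word in text.
def each_word(text)
  return enum_for(:each_word, text) unless block_given?

  scanner = StringScanner.new(text)
  until scanner.eos?
    scanner.skip(/\s+/)         # step over any whitespace
    word = scanner.scan(/\S+/)  # read the next run of non-whitespace
    yield word if word
  end
end

each_word("hello  world") { |word| puts word }
```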
- Many code paths minimized: speedup of roughly 20% over last version (in standard cases).
- Default lambda for Tokenizer#rejects_token_if is &:empty? instead of &:blank?.
- Changes in how SQLite client is initialized. Method SQLite::Basic#lazily_initialize_client replaced by SQLite::Basic#db.
- Token#partial? is calculated at create time, not dynamically.
- Internal API changes in Allocation (indirection removed).
- More informative error messages.
- Fix for parallel tokenizing when option `tokenize` is set to false on a category.
- Added OR mode to Picky search terms, e.g. `hello world|florian`. This will find results for "hello world" and "hello florian". This works with similarity, partial, etc. For example `hello text:wor*|text:flarian~`. (Note that you must NOT use `remove_characters` to remove "|".)
- `key_format` is now explicitly needed unless the keys do not need to be converted.
- Search facets do not report duplicate entries under 1000 facets.
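
The OR mode described above can be read as expanding a query into its alternatives. The following is a sketch of that reading only; `expand_or_query` is a hypothetical helper and Picky does not necessarily work this way internally.

```ruby
# Hypothetical helper: expands "hello world|florian" into every
# combination of the "|"-separated alternatives of each word.
def expand_or_query(query)
  alternatives = query.split(' ').map { |word| word.split('|') }
  first, *rest = alternatives
  return [] unless first

  first.product(*rest).map { |words| words.join(' ') }
end

p expand_or_query("hello world|florian")
```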
- The `from` category option accepts anything that can be called. For example, `category :authors, :from => lambda { |book| book.authors.map(&:name).join(" ") }`.
- Key format `:split` can be used, in case of a string id. This results in an array being stored as IDs.
- Arrays can be used as IDs if you use the in-memory backend.
- Completely rewritten and more standard C code compilation/inclusion.
- Add `tokenize` option on category. Set to `false` when you already pre-tokenize the category (default is `true`).
- The `#only` and `#ignore` options on `Search` now work as expected (`Array`s describe order of categories in allocations – so `[:title, :author]` will also match `[:title, :title, :author, :author, :author]`).
- Removed `#to_json` from `Hash` and `Allocation`.
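
One way to picture the ordering rule for `#only` above is to collapse consecutive repeats of a category before comparing. This is an illustrative sketch under that assumption; `matches_order?` is a hypothetical helper, not part of Picky.

```ruby
# Hypothetical helper: an allocation matches a pattern if, after collapsing
# consecutive repeats of the same category, the category order is identical.
def matches_order?(allocation, pattern)
  collapsed = allocation.chunk_while { |a, b| a == b }.map(&:first)
  collapsed == pattern
end

p matches_order?([:title, :title, :author, :author, :author], [:title, :author])
p matches_order?([:author, :title], [:title, :author])
```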
- Warn on erroneous category options, eg. :weights instead of :weight (thanks @rogerbraun).
- Make use of Redis' SCRIPT LOAD feature.
- Handle script flush in Redis backend (if SCRIPT FLUSH is called, EVALSHA will raise an error).
- Fix: Redis searching, script reuse.
- Fix: Redis searching, realtime mode (thanks for pushing @andi and @rogerbraun).
- Removed i18n from client Gemfile.
- Fix: Directory creation (thanks @rogerbraun).
- Fix: use PUT instead of POST for replace action (thanks @rogerbraun).
- Added `Picky::Client#to_s` method.
- Cleaned JS source code, fixed header allocation information.
- Reimplemented `Search#ignore` for single categories (see 4.12.5 and use a single category name symbol).
- Removed generator file duplication.
- Added experimental options `Search#only` and `Search#ignore`. Example: `people = Search.new(people_index) do only [:first_name, :last_name], [:last_name, :first_name] end`
- Removed `rake index` from generated client (thanks @kschiess!).
- Added `CharacterSubstituters::Polish` (thanks @prami!) – substitutes various Polish characters with their ASCII counterparts.
- Extracted index/search into separate files, like most people seem to like it.
- Reverted last change, and uses `.find_all_by_id` instead of `.find_by_id` (thanks to @prami).
- `Picky::Convenience#populate_with` offers a new option `finder_method` where you can set the object finding method. It will be given an array of ids and the options given to `#populate_with` (minus `up_to` and `finder_method`), thanks @joho!
- Breaking: By default, `#populate_with` uses `.find_by_id` instead of `.find` on the given (model class) instance. This will simply continue to work if you use `ActiveRecord`.
- Experimental feature: Automatic input splitting. Use when you can't use e.g. space. Initialize as `Picky::Splitters::Automatic.new(index_category)`. Offers the method `#split(text) # => ['split', 'text']`. This means you can use this for the option `splits_text_on` instead of a `Regexp`.
- Removed unnecessary jQuery History.js adapter file.
- Experimental feature: Ranges can now be customized. Pass a category an option `ranging: CustomRanger`. That class has to initialize like `Range.new(min, max)`, but can offer a specialized `#inject` method which can yield a custom order. (Alternatively, implement `#each` and `include Enumerable`.) Thanks to @andykitchen for this one.
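
A custom ranger following the "implement `#each` and `include Enumerable`" route might look like the sketch below. `EvenRanger` is a hypothetical class for illustration; whether Picky hands over the bounds as strings or integers is an assumption here, so the sketch accepts both.

```ruby
# Hypothetical ranger: initializes like Range.new(min, max),
# but only walks over the even numbers in the range.
class EvenRanger
  include Enumerable # provides #inject, #map, #to_a, ... on top of #each

  def initialize(min, max)
    # Assumption: bounds may arrive as strings or integers.
    @min = Integer(min)
    @max = Integer(max)
  end

  def each
    return enum_for(:each) unless block_given?
    (@min..@max).each { |number| yield number if number.even? }
  end
end

p EvenRanger.new(2000, 2008).to_a
```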
- Experimental feature: Range query over natural ranges, i.e. numeric or alphabetical.
- Examples: `2000-2008`, `year:2000-2008` (adding a qualifier is recommended, faster and usually known). Though: Use `year:200*` if you want fixed ranges `2000-2009`, `2010-2019`, etc.
- Be clever in your use of ranges. If they are not flexibly choosable by the user, don't use them. Also be wary in this initial version of huge ranges: `0-1000000` is a bad idea. If your range encompasses all values, simply don't use a range query.
- Fix: Include a server Gemfile with the generator (thanks @mbajur for noticing!).
- If you have a search with multiple indexes, it will now map the same qualifier to different categories in multiple indexes. Example, if you search for "name:bla", then the name qualifier will be mapped to the respective category on each index. We do not recommend to use the same category names on different indexes if they are used in the same search.
- Removed: Method `Picky::Search#only`. We will reinstate it in version 5+.
- Picky Loggers now also accept Ruby Logger instances instead of just IO instances: `Picky.logger = Picky::Loggers::Concise.new Logger.new('log/some.log')`. `Picky.logger = Picky::Loggers::Concise.new`
- Same three logging types still available: `Picky::Loggers::Silent`, `Picky::Loggers::Concise`, `Picky::Loggers::Verbose`.
- Picky now outputs all warnings/info to the logger set in `Picky.logger=` (available via `Picky.logger`).
- Procrastinate gem is optional (add when you wish parallel indexing).
- Added `Picky::Category#source=`. Each category can have a different source, mostly used for different sorting.
- Removed `PICKY_ROOT` and did not replace it.
- Removed `PICKY_ROOT` and replaced it with `Picky.root = "absolute path"` and `Picky.root # => Current Picky root directory (used for indexes).`
- By default, we use Yajl if it is available, via `MultiJson.use :yajl if defined? ::Yajl`.
- Use `MultiJson.use :your_prefered_adapter` explicitly to change the adapter.
- Experimental update: Stemming.
- New tokenizer (indexing/searching) option: `stems_with`. Give it a thing that responds to `#stem(text) # => stemmed_text`.
- See https://github.com/floere/picky/blob/master/server/spec/functional/stemming_spec.rb for examples.
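
As a rough illustration of "a thing that responds to `#stem(text)`", here is a deliberately crude stemmer object. `SuffixStemmer` and its suffix list are hypothetical; a real setup would more likely pass an existing stemming library.

```ruby
# Hypothetical, crude stemmer: strips one common English suffix.
# It only demonstrates the #stem(text) => stemmed_text interface.
class SuffixStemmer
  SUFFIXES = %w[ing ed s].freeze

  def stem(text)
    # Only strip when at least three characters remain afterwards.
    suffix = SUFFIXES.find { |s| text.end_with?(s) && text.length > s.length + 2 }
    suffix ? text[0...-suffix.length] : text
  end
end

stemmer = SuffixStemmer.new
p stemmer.stem("walking")
```

An instance of such a class would then be what gets handed to the `stems_with` option.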
- More helpful tokenizer error message.
- Fixes issues with Clang on OSX (thanks Andy Kitchen!).
- If you add categories dynamically after using the index in a search, you have to call `search.remap_categories` to have the new category's qualifiers registered.
- The spermy operator (~>) in the gemspec is not consistent over all version granularities.
- Also allow activesupport 4 (Note: Untested, assuming semantics didn't change and version change is Rails related).
- Many small internal improvements.
- Important Note: This release changes the file location of the prepared indexes! If you rely on this not changing, you need to adapt your scripts!
- Prepared files had a double "prepared" and an unnecessary "index" in the file name. They do not anymore. For example, `prepared_keywords_index.prepared.txt` has changed to `keywords.prepared.txt`, the tokenized index for the keywords category.
- Facets API now returns counts rather than weights.
- Facets API changed option `more_than` into `at_least` – give if you need facets with at least a certain count.
- Facets API added option `counts` (`true`/`false`) – `facets` methods will return a hash with counts if `true` or not given, i.e. `nil`, and an array if `false`.
- `Search#facets` now does not access the partial index, but always the exact index.
- `Search#facets` performance improved.
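
The `at_least` and `counts` options described above can be sketched over a plain list of category values. This is an illustration of the return shapes only; the free-standing `facets` method and its input are hypothetical and not Picky's API.

```ruby
# Hypothetical stand-in: counts how often each value occurs, keeps
# values with at least `at_least` occurrences, and returns either a
# hash with counts (counts true or not given) or an array (counts false).
def facets(values, at_least: 1, counts: nil)
  tally = Hash.new(0)
  values.each { |value| tally[value] += 1 }
  kept = tally.select { |_value, count| count >= at_least }
  counts == false ? kept.keys : kept
end

brands = %w[mammut salewa mammut]
p facets(brands)
p facets(brands, at_least: 2, counts: false)
```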
- Experimental simple facets support.
- Added `Index#facets(:category_name, options = {})` with `options` `:more_than` (a minimum weight a facet needs to have to be included). Will return keys and weights.
- Added `Search#facets(:category_name, options = {})` with `options` `:filter` (a query to filter with, e.g. `'brand:mammut'`), and `more_than` (a minimum weight, see above).
- Note – if your data is very dirty (i.e. many facets that occur only once), consider using a minimum to speed up the facets query!
- Usage – `products.facets :brand_name, filter: 'category:boots', more_than: 0` (will return all `brand_name` facets filtered by `'category:boots'` that have more weight than `0`).
- Added category option `weight`. The `weight` option now takes a number and adds that to the default logarithmic weighing. E.g. `weight: +6` (very strong positive weight) or `weight: -0.5` (slightly negative weight). This results in a higher/lower score.
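
The effect of the `weight` number can be sketched as an addition on top of a logarithmic weight. Assumptions: the default weight is taken to be the natural logarithm of the number of ids for a token, rounded to three decimals; `weight_for` is a hypothetical helper, not Picky's method.

```ruby
# Hypothetical helper: logarithmic weight plus a fixed bonus.
# Assumption: the default weight is Math.log(number of ids).
def weight_for(ids_count, bonus = 0)
  raise ArgumentError, 'ids_count must be positive' if ids_count < 1
  (Math.log(ids_count) + bonus).round(3)
end

p weight_for(10)       # default logarithmic weight
p weight_for(10, +6)   # very strong positive weight
p weight_for(10, -0.5) # slightly negative weight
```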
- Clang can now compile Picky.
- Much better error message in case Picky can't be compiled.
- Clean compilation on gem install.
- Removed code, making Picky approximately 10% faster.
- Check for the existence of `RbConfig` before compiling.
- New experimental statistics interface. Run `picky stats` to get the usage.
- Fix for `multi_json` gem usage.
- Picky now uses the `multi_json` gem.
- Implements a suggestion by David Lowenfels which enables a Picky user to set the CC environment variable to define which C compiler is used.
- Fix for bug introduced in 4.4.0, `unique` option works correctly with offset.
Unique option on search instance. This will remove each result id in allocations if they have appeared in preceding allocations. What does this mean?
Example: You search for `"picky search"`. And you find it in two allocations, `name, type` and `name, name`. Let's say Picky finds ids `[1, 2, 3]` in `name, type` and then `[2, 3, 4]` in `name, name`. Picky will then remove `2 and 3` from `name, name` because they have been found in `name, type` already.
Usually this is used when you only want a list of unique ids in the results.
- Added `unique: truey/falsy` option on `Search#search`. Use like this: `search_instance.search 'query', 20, 0, unique: true`.
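
The example above can be replayed with plain arrays. This is a minimal sketch of the described behaviour, not Picky's code; `uniquify` is a hypothetical helper and each inner array stands for the ids of one allocation.

```ruby
require 'set'

# Hypothetical helper: removes from each allocation the ids that
# already appeared in a preceding allocation.
def uniquify(allocations)
  seen = Set.new
  allocations.map do |ids|
    fresh = ids.reject { |id| seen.include?(id) }
    seen.merge(ids)
    fresh
  end
end

# ids for allocation "name, type", then for "name, name"
p uniquify([[1, 2, 3], [2, 3, 4]])
```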
This version lets you define control characters on tokens, like so (shows how, and the default):
- `Picky::Query::Token.partial_character = '\*'` for searching partially.
- `Picky::Query::Token.no_partial_character = '"'` for not searching partially.
- `Picky::Query::Token.similar_character = '~'` for searching similar strings.
- `Picky::Query::Token.no_similar_character = '"'` for not searching similar strings.
- `Picky::Query::Token.qualifier_text_delimiter = ':'` for telling qualifier and string apart (`title:sometitle`).
- `Picky::Query::Token.qualifiers_delimiter = ','` for telling qualifiers apart (`title,author:bla`).
The first four are going to be interpolated into `%r`, so escape the character like you would in a regexp. The last two are used in `String#split`, so doing this is not necessary.
So, for example, if you set
Picky::Query::Token.partial_character = '…'
Picky::Query::Token.qualifier_text_delimiter = '?'
Picky::Query::Token.qualifiers_delimiter = '|'
Then you can search like so:
something.search("title|author?wittgenstei…")
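
How such a token falls apart with custom control characters can be sketched as follows. `parse_token` is a hypothetical helper that uses plain string operations; it only covers qualifiers and the partial marker and is not how Picky parses tokens internally.

```ruby
# Hypothetical helper: splits a token into qualifiers and text and
# detects a trailing partial marker, using configurable characters.
def parse_token(token, partial_character: '*',
                qualifier_text_delimiter: ':',
                qualifiers_delimiter: ',')
  qualifiers, text = token.split(qualifier_text_delimiter, 2)
  if text.nil? # no qualifier given, the whole token is the text
    text = qualifiers.to_s
    qualifiers = nil
  end
  partial = text.end_with?(partial_character)
  text = text.chomp(partial_character) if partial
  {
    qualifiers: qualifiers ? qualifiers.split(qualifiers_delimiter) : [],
    text: text,
    partial: partial
  }
end

p parse_token("title|author?wittgenstei…",
              partial_character: '…',
              qualifier_text_delimiter: '?',
              qualifiers_delimiter: '|')
```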
- Sinatra index actions now return more sensible HTTP status codes.
- Gorgeous new design (thanks tvandervossen!)
- Completely overhauled Picky JavaScript.
- Require fileutils regardless of the Ruby version Picky is run in.
- Require fileutils in case we run Picky on MacRuby (thanks overbryd).
- Use the "standard" way to detect the Ruby engine used for MacRuby.
- Experimental extensions to get Picky run on MacRuby 0.12.
- Unfortunately, we don't have the resources to always run the tests – please use with caution.
- Redesigned how Picky logs: Picky itself logs its index handling (tokenizing/dumping/loading) using one of its built-in loggers. Set a logger after requiring 'picky' like this: `Picky.logger = Picky::Loggers::Verbose.new(STDOUT) # or any IO`. Default is `Picky::Loggers::Concise.new(STDOUT)` aka `Picky::Loggers::Default`. Also an option is `Picky::Loggers::Silent`. This closes issue 70.
- Note: Logging searches is your job (see generated examples on how to do this).
- `Picky::Results#ids(only = nil)` returns the amount of ids originally requested, except if an `only` amount gets passed in (then that amount is used).
- `Picky::Client::ActiveRecord.configure(options)` added as an alias of `new` (thanks auastro!).
- `require 'picky/sinatra/index_actions'` is not necessary anymore to load the index actions. They are required automatically with `require 'picky/sinatra'`.
- Encode data part in JSON.
- Experimental ActiveRecord 3.0+ integration release. See below.
- ActiveRecord models can now use `extend Picky::Client::ActiveRecord.new(*attributes_to_send, options = {})` to have the model send updates/deletes back to the Picky server. Note that error handling is not yet built in. The server needs to be up and running.
- The Sinatra style server can now `extend Picky::Sinatra::IndexActions` to install index updating POST/DELETE methods on the "/" path (Note: Currently needs a `require 'picky/sinatra/index_actions'` beforehand).
- Fixed bug #57, multicategory selections in the Javascript user interface.
- Experimental release of `only` option for searches. Does the same as `only:that_category`, but implicitly, in the search. E.g. `only :cat1, :cat2`.
- Default amount of similar tokens is now set to 3 instead of 10 for phonetic similarities.
- Server uses PICKY_ENV environment variable before RUBY_ENV and then RACK_ENV.
- Fix for realtime indexing when using specific options.
- Customized `weight` and `similarity` do not need the `saved?` method anymore.
- No changes from 4.0.0pre7.
- BREAKING The `tokenizer` option for a category has been renamed to `indexing`, to conform with the methods for the index and the sinatra app.
- BREAKING Internal `Similarity#encoded` method has been renamed to `#encode`.
- Similarity API fixed.
- Only use 0.01s for checking the log file instead of 0.1.
- Overhauled statistics interface. Use `picky statistics log/search.log` to start it.
- BREAKING Reverting customizeable backends from version 3.3.2. They are no longer available. Please use simple subclassing to achieve funky backends.
- BREAKING SQLite `self_indexed` and Redis `immediate` option is now called `realtime`, as changes go directly through to the actual backends, in "realtime".
- The `Index#source` block is now evaluated every time an indexer runs.
- BREAKING Removed Picky classic application. Please use Picky e.g. in a Sinatra app.
- BREAKING Removed Picky classic sources. Please use a source with the #each method.
- BREAKING Option `weights` for the `Picky::Index#category` method has been renamed `weight` to conform with the other methods.
- BREAKING Picky does not require the text gem anymore by default. Only when you use phonetic similarity. It will tell you what it needs.
- BREAKING Added the PICKY_ENVIRONMENT in front of the Redis key namespace to differentiate the various environments.
- BREAKING Removed `rake routes` since only the classic server was able to provide it.
- BREAKING Removed the classic server from the generators.
- Explicitly uses `Yajl::Encoder#encode` for JSON encoding.
- Fixed cases where even when no similarity was defined on a category, similar results were still found.
- Rake task `index` now points to task `index:parallel` by default. Call `rake:serial` to index serially.
- Indexer calls `reconnect!` on sources that support it.
- Location/Volumetric/Geosearch rewritten.
- BREAKING `Picky::Indexes.index` does not index in parallel anymore.
- BREAKING Renamed `Picky::Indexes.index_for_tests` to `Picky::Indexes.index`.
- If you want to explicitly run parallel indexing programmatically, use `Picky::Indexes.index Picky::Scheduler.new(parallel: true)` or `Picky::Indexes[:index_name].index Picky::Scheduler.new(parallel: true)`.
- BREAKING Renamed `Picky::Wrappers::Category::ExactFirst` to `Picky::Results::ExactFirst`. Extend instead of wrap: `index.extend Results::ExactFirst` or `category.extend Results::ExactFirst`. If an index is extended, each category of the index will be extended.
- BREAKING `Picky::Indexes.reload` has been renamed to `Picky::Indexes.load`.
- BREAKING `index.reload` has been renamed to `index.load`.
- BREAKING `category.reload` has been renamed to `category.load`.
- BREAKING Removed all `define_...` methods on indexes.
- Using the `procrastinate` gem to parallelize indexing.
- Indexing call structure cleaned up. Improves performance by about 40%.
- Fixed integration specs for the generated "all in one" server/client.
- Changed method calls to adapt to above changes.
- Semantics for terminate_early(n) are to calculate n more allocations than necessary. A n of 0 means that only exactly the number of necessary allocations for the ids is calculated.
- Fix for terminate_early with offsets in 3.6.12 (thanks niko!).
- Fix for exact first matching (thanks geelen!).
- `Picky::Search` option `terminate_early(integer)` or `terminate_early(with_extra_allocations: integer)` introduces early termination. If in your interface you only need the ids and no total, then this is the option for you. Calling `terminate_early` without parameters will use 0 as the default.
- Fix for exact first matching (thanks geelen!).
- Fix for bad performance bug introduced somewhere in 2.4.
- Backends rewritten to support realtime indexes (SQLite, Redis). Memory already supported it (needs call to `Index#build_realtime_mapping` after loading if dumped+loaded). File backend will not support realtime index in the near future.
- Experimental, use at your own peril: Method to build the realtime index, explicitly: `Index#build_realtime_mapping`.
- script/console command minified in the generation and moved to the server.
- The generated client will now use the raw JS file from Github (http://github.com/floere/picky/issues/46).
- BREAKING Renamed the undocumented `Tokenizer#maximum_tokens(integer)` to `Tokenizer#max_words(integer)`. Restricts the amount of words that the tokenizer lets through to the core search engine.
- Added `Search#max_allocations(integer)` to restrict number of allocations that are actually calculated (to avoid combinatorial and UI explosions).
- Added `<<` and `unshift` on `Index` and `Category`. The `unshift` method behaves like the `add` method when that one is called without a second parameter. Use like `index << Thing.new(1, 'some text', 'some other text')`.
- Existence of a source is only checked when really needed. Will fail hard if there is none, with a (hopefully) useful error message.
- Experimental #build_realtime_mapping method to rebuild the realtime mapping helper after a dump/load.
- Fix and regression spec for a Redis backend bug introduced in 3.6.5.
- Exact-first wrapper for experimental purposes.
- Removed active record, redis, mysql dependencies from picky.gemspec.
- From Redis 2.6.0 on, Picky will be around 65% faster with Redis as a backend.
- Fixed Javascript. See #47.
- Weights now only saved up to the third position after the decimal point.
- SQLite backend has been renamed from `Sqlite` to `SQLite`.
- Backends can be switched dynamically (use `index.backend = new_backend`). Used for performance tests.
- Removed sqlite3 from gemspec to enable Heroku compatibility. Please add it in your Gemfile if you need it or simply install the gem separately.
This release includes BREAKING changes. See below.
- This version tries to reduce maintenance complexity and prepare for 4.0.
- BREAKING In your code, replace any occurrences of `Indexes.reload`, `Indexes#reload`, `Index#reload`, `Category#reload` with the equivalent `load` method.
- Renamed `load_from_cache` to `load` on `Indexes`, `Index`, `Category`.
- Removed `rake check` and related methods with no replacement. Please tell us if you miss it.
- Removed `Index#backup`, `Index#restore` and related methods on `Category` etc. with no replacements. Please tell us if you miss them.
- Fix for the problem that `#remove(id)` didn't remove when a different key_format than the standard one was defined (Thanks niko!).
- Fix for using `Rack::Harakiri` in an example project. (Ok, time for bed)
- Fix for using dynamic weights and then deleting something from it.
- Changed the way the internal backend is dumped to json or marshalled.
- `generate_from` methods have been removed from all generators as they are not used anymore.
- Added the option of having dynamic weights calculation. Use this if you don't need weights based on the amount of indexed ids per token. This does not generate an index in the backend (Redis or file), but calculates the weight at runtime. Examples: Always return the default 0.0, `category :text, weights: Picky::Weights::Constant.new` or always return 3.14, `category :text1, weights: Picky::Weights::Constant.new(3.14)` or calculate a weight at runtime, based on the size of the str_or_sym we are looking for, `category :text1, weights: Picky::Weights::Dynamic.new { |str_or_sym| str_or_sym.size }`. We recommend using search boosts to boost specific category combinations.
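
The difference between a constant and a runtime-calculated weight can be sketched with two small stand-in classes. `ConstantWeight`, `DynamicWeight` and the `weight_for` method name are hypothetical illustrations; they are not the `Picky::Weights` classes and make no claim about their interface.

```ruby
# Hypothetical stand-in: always answers with the same weight.
class ConstantWeight
  def initialize(value = 0.0)
    @value = value
  end

  def weight_for(_str_or_sym)
    @value
  end
end

# Hypothetical stand-in: calculates the weight at lookup time
# by calling the block it was given.
class DynamicWeight
  def initialize(&calculation)
    @calculation = calculation
  end

  def weight_for(str_or_sym)
    @calculation.call(str_or_sym)
  end
end

p ConstantWeight.new(3.14).weight_for(:anything)
p DynamicWeight.new { |str_or_sym| str_or_sym.size }.weight_for(:hello)
```

Nothing is stored per token in either case, which matches the note above that no weights index is generated in the backend.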
- Internally, tokens are held as strings. This helps dealing with memory issues when using realtime indexes. This might make Picky's memory usage a bit higher than before. However, when using realtime indexes, the memory usage will be much improved.
- Complete internal rewrite of how indexing is handled.
- Performance fix for problem introduced in 3.4.3.
- Fixed a bug where ids occurred multiple times for an indexed token in the same index bundle (thanks M. Below for finding the bug). This did not impact on the search results, just the stored index files.
- Intermittent service release to test internal String-based indexes.
- Method `populate_with` keeps the ids by default. Use `clear_ids` on the results if you want to remove them.
- Fixing issue 38. Possibly caused by a problem described here.
- Internal interface for generators changed. The generators are now used directly, e.g.: `Picky::Generators::Partial::Substring.new(from: 1).generate_from inverted_index_hash`. No change on your part is necessary if you didn't use `Picky::Generators::{Partial,Weights,Similarity}Generator`.
- Experimental exchangeable backend change: Redis now passes `bundle, client` into the lambda, instead of `client, bundle`. E.g. `inverted: ->(bundle, client) { Picky::Backends::Redis::List.new(client, "#{bundle.identifier}:inverted") }`
- Fix for `Partial::None`, introduced in 3.3.0.
- ActiveRecord is not loaded anymore by default, as only few users use the Picky db source (if you do, Picky will try to require it and tell you if it can't).
- It is now possible to explicitly dump an index, using `index.dump`. This is useful with realtime indexes.
- Added a new partial option, `Postfix`, with an option, `from`. With `from: -4` and a word like `octopus`, it will generate partials `[:octopus, :octopu, :octop, :octo]` (until -4). New default option is `Postfix.new(from: -3)`, not `Substring.new(from: -3, to: -1)` anymore. The two options are identical in function.
- Only Picky's tokenizers call `to_s` on data anymore. This means that you can write tokenizers that work on whatever kind of object you like. The Picky standard tokenizers themselves ensure that they get to work with a string.
- Fix for `Substring` partialization, when negative `from` and `to` options are used at the same time.
- Experimental exchangeable backends.
- RSpec 1 has been updated to RSpec 2.
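
The `Postfix` example above (`from: -4` on `octopus`) can be reproduced with a few lines of plain Ruby. This sketch is only an illustration of the described result; `postfix_partials` is a hypothetical helper, and the treatment of positive `from` values is an assumption.

```ruby
# Hypothetical helper: generates partials by cutting characters off
# the end of the word, down to the position given by `from`.
# A negative `from` counts from the end of the word (-1 is the last character).
def postfix_partials(word, from: -3)
  text = word.to_s
  return [] if text.empty?

  shortest = from < 0 ? text.length + from + 1 : from
  shortest = [[shortest, 1].max, text.length].min # clamp to 1..length
  text.length.downto(shortest).map { |length| text[0, length].to_sym }
end

p postfix_partials(:octopus, from: -4)
```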
This release includes BREAKING changes. See below.
- Removed bundler specific code from Picky. You can now decide yourself if you want it. Opens the possibility to just run Picky in a script to try ideas etc. (see example gist: https://gist.github.com/1315618).
- The generated Sinatra server does not use bundler anymore. Classic servers (might) still need it. You can add it back in by adding the following code in `app.rb`, right after `require 'picky'`:

      begin
        require 'bundler'
      rescue LoadError => e
        require 'rubygems'
        require 'bundler'
      end
      Bundler.setup PICKY_ENVIRONMENT
      Bundler.require

- `picky generate` will not display the error backtrace part anymore.
- Runtime indexing (`remove`, `add`, `replace`) now possible on a single category. Please use e.g. `index[:category_name].add some_object_with_id_and_category_name_method`.
- See last release. This release adds support for similarity searches on a realtime index.
- Please only use realtime indexing for experimental purposes.
- This release holds an experimental release of realtime indexing for 3.2: An index now supports `#add(object_responding_to_id_and_categories)`, `#remove(id_of_added_object)`, `#replace(object_responding_to_id_and_categories)`. Replace is simply remove+add. Replacing a non-existent object behaves like an add. I suggest using solely `replace`. Notes: Only works in single-process, single-threaded servers. Does not persist. Only yet works when starting from an empty index, e.g. `source []`.
- Please only use realtime indexing for experimental purposes.
- Rewrite of "rake index" – Picky will only fork processes if there is the capability to fork (i.e. not Windows), or if there is more than one processor available.
- Possible solution to Issue 32. The issue is possibly related to http://redmine.ruby-lang.org/issues/5003. (Windows users, please use the next version, 3.1.9)
- Fixed scrolling after "More Results". Will scroll to the top of the newly added results, instead of to the last header of the newly added results. Get the new minified version here: https://github.com/floere/picky/tree/master/client/javascripts.
- Javascripts fixed. Get the new minified version here: https://github.com/floere/picky/tree/master/client/javascripts.
- Number of cores for OS Lion correctly reported.
- New Search block option: `ignore_unassigned_tokens(truey/falsy)`. Default is false. If true, will ignore tokens that cannot be assigned to any category. If you search for example for `"Picky Garblegarblegarble"`, and `"Garblegarblegarble"` isn't in any index, then it will return results as if `"Garblegarblegarble"` hadn't been there. In this case, it will just return something like searchengine:"picky".
- Don't fork if there's just one index to be processed.
- Added `#ignore` option to `Search` definition block. Calling `ignore :name` will ignore tokens in allocations that are mapped to the name category. Example: You search for "David Hasselhoff". If Picky maps this to allocations `[ [:first_name, :name], [:first_name, :movie_title] ]`, only `[ [:first_name], [:first_name, :movie_title] ]` will survive. The `Hasselhoff - name` match will simply be ignored.
- The `before` Javascript callback option given to the `PickyClient` has changed signature and how it is called. Old was `before(params, query)`, and the returned params changed the params. This did not allow changing the `query` in the callback. New is `before(query, params)` and the returned `query` replaces the query given as parameter. This allows changing the query before sending it off. The params can be changed as well, using `params['option'] = value;`.
- `rake index` does not fork anymore if there's just one index to be indexed.
- Experimental `Picky::Partial::Infix` partial generator. Use to find all possible substrings inside words. Options are `min`, `max`, both take negative and/or positive values. Negative values indicate length up to length - X. E.g. `min: 3, max: -1 # :hello => [:hello, :hell, :ello, :hel, :ell, :llo]`
- Experimental `Picky::Backends::File` file backend. Use in index definition block as follows: `backend Picky::Backends::File.new`. Use if you don't want Picky to use as much memory. Performance penalty applies.
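
The `Infix` example above (`min: 3, max: -1` on `:hello`) can be reproduced with the sketch below. `infix_partials` is a hypothetical helper; reading a negative value `-X` as "word length minus X plus one" is an assumption chosen so that the example from the changelog comes out.

```ruby
# Hypothetical helper: generates all substrings of the word whose
# length lies between min and max, longest first.
# Assumption: a negative value -X means (word length - X + 1).
def infix_partials(word, min: 1, max: -1)
  text = word.to_s
  resolve = ->(n) { n < 0 ? text.length + n + 1 : n }
  shortest = [resolve.call(min), 1].max
  longest  = [resolve.call(max), text.length].min

  longest.downto(shortest).flat_map do |length|
    (0..text.length - length).map { |start| text[start, length].to_sym }
  end
end

p infix_partials(:hello, min: 3, max: -1)
```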
This release includes BREAKING changes. See below.
- Exchangeable backends. New index definition: `Indexes::Memory` and `Indexes::Redis` are now unified in `Index`. So use `index = Picky::Index.new(name)` from now on. (See next point)
- A new option has been added to the index, `backend`. It takes a backend instance, making the backend exchangeable. The default is the memory backend, which you do not need to set. If you want a Redis backend, use as follows: `index = Index.new(name) { backend Picky::Backends::Redis.new }`. If you want to explicitly set the memory backend: `index = Index.new(name) { backend Picky::Backends::Memory.new }`.
- Unified tokenizers. Method `#tokenize(text)` now returns `[ ["token", "token", "token"], ["Original", "Original", "Original"] ]`. So your own tokenizer only needs to adhere to this interface and can be passed to the index/search using the `indexing`/`searching` method.
- Removed tokenizer option `removes_characters_after_splitting: /some regexp/` (without replacement).
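
A custom tokenizer adhering to the unified `#tokenize(text)` interface described above could be as small as the following sketch. `WhitespaceTokenizer` is a hypothetical class; downcasing as the only normalisation step is an arbitrary choice for illustration.

```ruby
# Hypothetical tokenizer: returns [tokens, originals], where the tokens
# are the downcased words and the originals are the words as given.
class WhitespaceTokenizer
  def tokenize(text)
    originals = text.to_s.split
    tokens = originals.map(&:downcase)
    [tokens, originals]
  end
end

p WhitespaceTokenizer.new.tokenize("Hello Picky World")
```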
- Fixed & integration tested rake tasks (Thanks rogerbraun!)
This release includes BREAKING changes. See below. (Here we start with this style of BREAKING notation)
- BREAKING Removed method `Picky::Convenience#allocations_size`. Use `#allocations.size`.
- BREAKING Removed `Results#to_log`. `Results#to_s` returns a log worthy string now.
- See changes in pre versions for complete changelog on 3.0.
- Renamed Picky::Result#serialize -> Picky::Result#to_hash.
- Added an All-In-One (Client + Server) Sinatra web app. This proves useful when wishing to use Picky on Heroku.
- Gemfile referred to version ~> 2.0 instead of = 3.0.0.pre2.
- Breaking: Index::Memory and Index::Redis do not accept options anymore. Define options in the block or on the resulting instances: `some_index = Indexes::Memory.new(:some_name) do source ... key_format ... category ... category ... category ... result_identifier ... end`
- Breaking: PickyLog removed. In the classic server, use `Picky.logger = Logger.new 'log/search.log'` if you want to log (uses SomeLogger#info). In the Sinatra server, use `MyLogger = Logger.new 'log/search.log' ... get '/path' do result = ... MyLogger.info result.to_log(params[:query]) result.to_json end` if you want to log.
- Breaking: app/logging.rb not loaded anymore. You have to require it yourself if you want that.
- A missing source is only noticed when it is used (such as in indexing). This makes it possible to set a source at a later time.
- Note: The key_format is not saved in the index configuration anymore.
- New example server, sinatra_server. The new default, very flexible.
- Breaking: Method `#take_snapshot` removed from Indexes/Index/Category (not needed anymore).
- Breaking: Users need to reindex when installing this version (index "index" now identified by "inverted" to be more clear).
- Rake tasks rewritten to be simpler and clearer. Most notably, `index:specific[index,category]` is now just `index[index,category]` (both optional).
- Reindexing now possible in running server, also for ActiveRecord Arel sources.
- More verbose indexing output with file locations.
- Taking data snapshots improved.
- Fix for e.g. `picky search localhost:8080/books` if highline gem is missing (thanks tonini!).
- Breaking: `Indexes#find` method has been removed. Use `Indexes[index_name]` and `Indexes[index_name][category_name]`.
- Breaking: `Index#index!`, `Index#cache!`, `Category#index!`, `Category#cache!` have been removed. Use `Indexes.index` (combines `index!` and `cache!`), or `Indexes[books].index`, or `Indexes[books][title].index`.
- Get Indexes/Categories using the `#[]` method. E.g. `Indexes[:books]` to get the `:books` index, and `Indexes[:books][:author]` to get the `:author` category of the `:books` index.
- `Indexes`, `Indexes[:some_index]`, and `Indexes[:some_index][:some_category]` now all support the following methods:
  - `#index` (just index: prepare data and cache data)
  - `#reload` (just reload the cached data into the server, no effect on Redis indexes)
  - `#reindex` (index and reload one category after another)

  Note: `#reload` and `#reindex` only make sense in a running server with memory indexes.

  Examples:
  - `Indexes.index` (index all indexes, randomly)
  - `Indexes[:some_index].reindex` (reindex that index)
  - `Indexes[:some_index][:some_category].reload` (just reload that category)
- Fixed: Redis indexing. Old values are now removed on reindexing.
- Minor changes.
- Searches can now search in multiple qualifiers, separating them by a ",". E.g. name,street:tyne.
- Searches will no longer search in all categories (fields) if a qualifier has been mistyped. So, namme:peter will not search in all categories, but instead return an empty result if category namme does not exist.
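
The qualifier behaviour described in the two entries above can be sketched in plain Ruby. Both method names are hypothetical, and one detail is an assumption: when several qualifiers are given and only some exist, this sketch keeps the existing ones. The changelog only states the single-qualifier case (`namme:peter` returns an empty result).

```ruby
# Hypothetical helper: splits "name,street:tyne" into qualifiers and text.
def parse_qualified(token)
  qualifiers, text = token.split(':', 2)
  return [[], token] if text.nil? # no qualifier given
  [qualifiers.split(',').map(&:to_sym), text]
end

# Returns the categories a token would be searched in.
# Without qualifiers: all known categories.
# With qualifiers: only those that exist (assumption for mixed cases).
def categories_for(token, known)
  qualifiers, _text = parse_qualified(token)
  return known if qualifiers.empty?
  qualifiers & known
end

known = [:name, :street, :city]

categories_for('name,street:tyne', known) # => [:name, :street]
categories_for('namme:peter', known)      # => [] (mistyped qualifier, empty result)
categories_for('peter', known)            # => [:name, :street, :city]
```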
- Fixed: Indexing a single category where a `#each` source was used using `rake index:specific[index,category]` raised an error.
- Live interface for picky-live gem fixed.
- Fixes Redis indexing.
- Requires activesupport (thanks stanley!).
- Added a configuration option `key_format` for index, categories. It sets the format that this index'/category's keys are in. Use as you would with `source`, as either method in the index block, as index parameter, or category parameter.
- The client is now finally really data driven by the server, see next changes.
- Added two options for the `PickyClient`, `fullResults` and `liveResults`. They designate how many results should be rendered. Defaults are for full: 20, and for live: 0.
- The `Convenience#ids` method now by default returns all ids returned from the server.
- The second param of `Convenience#populate_with` is not the amount of populated ids anymore. Instead it populates all returned ids by default. If you want less, pass in the `up_to` option. So, e.g. `results.populate_with :up_to => 20`.
- Integration specs in the server are now easy. In your specs, `require 'picky-client/spec'`. Example: `it { books.search('alan').ids.should == [259, 307, 449] }`.
- Added integration specs that use the above tests & matchers to the generated example app.
- Added `Picky::TestClient` which can be used in the server for integration specs. Use `Picky::TestClient.new(YourPickyApp, :path => '/your_search_url')`, then `test_client.search('bla', :ids => 12, :offset => 0).ids.should == [1,3,4]` or `test_client.search('blu bli').should have_categories(['title', 'author'], ['title', 'title'])` to test category result combinations and order.
- Very simple geo search that works best in temperate areas. If you're just looking for results that are close to yours, give it a go. Use `#geo_categories(lat, lng, radius_in_kilometers, options = {})`.
- (BREAKING CHANGE) Since I prefer the block style configuration for indexes, the source is now an optional parameter. Picky will tell you if you still use the old style. New is that you can define the source of an index in the block, e.g.: `Index::Memory.new(:some_index) do source Sources::CSV.new(...) end`
- Sources can now be anything that responds to #each and that returns objects that respond to #id. (That means you can just pass in an array, or MongoMapper or ActiveRecord's `Book.order('updated_at DESC')` or similar)
- The app/application.rb API has gotten a few aliases: `default_indexing` and `default_querying` can now be called with `indexing` or `searching`.
- Each index can now have its own indexing. Use e.g. `Index::Memory.new(:some_index) do indexing removes_characters: /[^a-z]/i end`.
- Each `Search` can now have its own "searching", e.g.: `Search.new(some_index) do searching removes_characters: /[^a-z]/i end`
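
The source contract mentioned above (anything that responds to #each and yields objects that respond to #id) is plain duck typing. The following sketch shows what that means for an ordinary array; `Book` and `usable_source?` are hypothetical names introduced for illustration and are not part of Picky.

```ruby
# Hypothetical record type; anything with an #id method would do.
Book = Struct.new(:id, :title)

# Hypothetical check mirroring the described contract:
# the source responds to #each, and every yielded object responds to #id.
# It walks the whole source, so it is only meant for small examples.
def usable_source?(source)
  return false unless source.respond_to?(:each)
  source.each { |thing| return false unless thing.respond_to?(:id) }
  true
end

books = [Book.new(1, 'Dune'), Book.new(2, 'Emma')]

usable_source?(books)      # => true
usable_source?(%w[a b c])  # => false (strings have no #id)
usable_source?(42)         # => false (no #each)
```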
- Added option for collaborators (on the Picky server) of setting the performance ratio if the performance specs fail too often. Just add a `spec/performance_ratio.rb` file with the content `module Picky; PerformanceRatio = x.xx end`. Less than 1.0 is more benign, more than 1.0 is harsher.
- Improved `rake search <url> [<result id amount>]` with better description and error handling.
- `rake search <url>`, a simple experimental terminal search interface.
- Tokenizing completely rewritten. It works now almost the same in indexing and in querying, with the exception of downcasing (or not, for case sensitive searches).
- Indexing and querying now don't downcase anymore right at the beginning of processing, but rather after text preprocessing. For you this means that you need to use case insensitive regexps `/…/i` in the config if you need a case sensitive search (get it?).
- `default_indexing` and `default_querying` offer a new option, `case_sensitive`, which is by default `false`. Set it in indexing and querying to `true` to have your search be case sensitive (usually it is a good idea to set them both to the same case sensitivity). Watch the regexp options – possibly best if you set them to case insensitive `/…/i`.
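
Why the regexp options matter once downcasing happens after preprocessing can be seen in a small stand-alone sketch. The method `preprocess` is hypothetical and only imitates the order of steps described above (remove characters first, downcase afterwards); the option names are borrowed from the changelog.

```ruby
# Hypothetical pipeline: characters are removed first,
# downcasing happens afterwards (unless case_sensitive is set).
def preprocess(text, removes_characters:, case_sensitive: false)
  cleaned = text.gsub(removes_characters, '')
  case_sensitive ? cleaned : cleaned.downcase
end

# Without /i the uppercase "P" is removed before it can be downcased.
preprocess('Peter', removes_characters: /[^a-z]/)
# => "eter"

# With /i the "P" survives preprocessing and is downcased afterwards.
preprocess('Peter', removes_characters: /[^a-z]/i)
# => "peter"

# Case sensitive: the original casing is kept.
preprocess('Peter', removes_characters: /[^a-z]/i, case_sensitive: true)
# => "Peter"
```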
- Let's go live, wohoo! :) See the prerelease history notes for all changes.
- Renamed `Similarity::DoubleLevenshtone` (aka `Similarity::Phonetic`) to `Similarity::DoubleMetaphone` (BREAKING: Cannot use `Similarity::Phonetic` anymore).
- Added `Similarity::Soundex`.
- Added `Similarity::Metaphone`.
- Asterisks are correctly escaped before saved in the browser history.
- you: Give feedback, thanks! :)
- New major version number – see reasons for API change: http://florianhanke.com/blog/2011/03/16/pickys-adolescence.html.
- (Breaking change) `Query::Full` and `Query::Live` have been replaced by just `Search`. So what you now do is `route /something/ => Search.new(index1, index2, ..., options)`.
- Pass in the `ids` param to define the amount of result ids you'd like. This is how you'd do it with curl: `curl 'localhost:8080/books?query=test&ids=20'`. 20 ids is the default.
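
The same request URL as in the curl example can be assembled with Ruby's standard library. This is only a sketch: `search_uri` is a hypothetical helper, and host, port and path are copied from the curl example above rather than from any Picky API. No request is sent.

```ruby
require 'uri'

# Hypothetical helper: builds the search URL with the `query` and `ids` params.
# Host, port and path are taken from the curl example.
def search_uri(query, ids: 20)
  URI::HTTP.build(
    host:  'localhost',
    port:  8080,
    path:  '/books',
    query: URI.encode_www_form(query: query, ids: ids)
  )
end

search_uri('test').to_s
# => "http://localhost:8080/books?query=test&ids=20"

# A live-style query that wants no ids back:
search_uri('test', ids: 0).to_s
# => "http://localhost:8080/books?query=test&ids=0"
```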
- (Breaking change) `Picky::Client::Full` and `Picky::Client::Live` have been replaced by `Picky::Client`. New option: `ids`. Pass in to define the amount of `ids` you'd like. For a live query you need none, so pass in 0. (20 is the default in the server)
- See client changes above. Replace `Picky::Client::Full` and `Picky::Client::Live` with just a single `Picky::Client` instance with the same options as before (but just a single URL on the server as described above).
- Added `rake javascripts`, `rake update` to the client and client project generator which copies the javascripts from the client gem into your directory. (If you have an old generated project, add `require 'picky-client/tasks'; Picky::Tasks::Javascripts.new` in your `Rakefile`)
- See server changes above. Replace `Query::Full` and `Query::Live` instance pairs by just a single `Search` instance.
- Not breaking the web anymore ;) Using history.js instead of address.js to do away with the hash bang.
- `rake stats` and `rake analyze`. Get information about your app.
- When indexing from the database, the intermediary snapshot table is now called `"picky_#{index.identifier}_index"` instead of `"#{index.identifier}_type_index"` to be clearer that it is Picky creating these tables, and what it is. You can remove the ..._type_index tables.
- The database source now uses mostly AR adapter methods to make it more agnostic.
- Picky now traverses more cleanly over your database data. (Thanks Jason Botwick!)
- Redis backend.
- The Redis backend uses db 15.
- The mysql gem is used by default.
- Fix for non-working picky command line interface. (Thanks Jason Botwick!)
- Redis backend prototype.
- `rake index:specific[index]` or `rake index:specific[index,category]` to index just a specific index or category.
- Postgres source better handled.
- The `choices` option is now localized. If you have generated a new Picky project with 1.4.0, please do localize your `choices` like so: `choices:{ (formats here) }` => `choices:{en:{ (formats here) }}` and whatever locales you'd like to use.
- Latest Javascript PickyClient object includes the option to format the choices better, option `group: [['author', 'title', 'subjects'], ['publisher']]` lets you group certain categories together while option `choices: { 'title': { format: "<strong>%1$s</strong>", filter: function(text) { return text.toUpperCase(); }, ignoreSingle: false } }` lets you define how each combination is handled in detail. Again, hard to explain, easy to see. (see issue for details, closes issue 9).
- Added a `wrapResults` option where you can define wrapper HTML bits that are wrapped around each allocation group of `<li>` results. The default is: `wrapResults: '<ol class="results"></ol>'`.
- Headers are now contracted, this means no more "written by florian and written by hanke", but "written by florian hanke". (closes issue 10)
- Split #interface method into => #input, #results, so that users can wrap each with custom elements. Don't forget to wrap into a div#picky.
- Example now constricts the Picky interface width using a div.content. Please use a wrapper div to constrict div#picky.
- Cleanup of Javascript code, inclusion of formerly external javascripts (`scrollTo`, `timer`, `jQuery 1.5`).
- Interface HTML structure refactor. Results should now be li-s. Result groups (combinations/allocations, around the result li-s) are each inside an ol.results. Please check your CSS files if they need to be adapted to the new structure.
- Cleanup of CSS, much more flexible and specific.
- In the generated Sinatra client, queries can be passed in through the query param q. Example: http://www.mysearch.com/?q=example
- In the generated sinatra client, the back/forward buttons work via jquery.address plugin. Closes github issue 6.
- Server now sends the similar word instead of the original in similarity tokens (semelor~ -> similar), even if that means that the original way of writing is not preserved (SEmElOr~ -> similar). We're trying to help people have good searches, so there.
- Fixed description in the "picky" command. Also now shows optional parameters more clearly.
- Ability to handle string/symbol keys (for future key/value store data sources).
- Live interface uses select instead of sleep in the master process.
- Offers a new routing API, an interface that permits changing parameters in the running server. Use `route %r{/admin} => Live::Interface.new`.
- The statistics server is now called "Clam", a chain smoking friend of Picky's.
- A new Gem "picky-live" that offers a live interface into the Picky server, provided you have a route for it. It is called "Suckerfish", and is one of Picky's friends, too.
- `default_indexing` (in the application.rb) provides a new option `reject_token_if => some_lambda`, e.g.: `reject_token_if: lambda { |token| token.nil? || token == :hello }` where you can define which tokens go into the index, and which do not. Default lambda is: `&:empty?`. This means that only non-empty tokens are saved in the index. You could, for example, not save tokens that have length < 2 (since they might be too small for your purposes). Note that tokens are passed into the hash as symbols.
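
How a `reject_token_if` lambda filters symbol tokens can be tried out without Picky. The method `indexable_tokens` below is a hypothetical stand-in for the indexing step; the default (`&:empty?`) and the two lambdas follow the description above.

```ruby
# Hypothetical stand-in for the indexing step:
# every token for which the lambda returns true is dropped.
# The default mirrors the documented default, &:empty?.
def indexable_tokens(tokens, reject_token_if: :empty?.to_proc)
  tokens.reject(&reject_token_if)
end

tokens = [:hello, :a, :"", :picky]

# Default: only empty tokens are rejected.
indexable_tokens(tokens)
# => [:hello, :a, :picky]

# The lambda from the changelog entry.
indexable_tokens(tokens, reject_token_if: lambda { |token| token.nil? || token == :hello })
# => [:a, :"", :picky]

# Rejecting tokens shorter than two characters.
indexable_tokens(tokens, reject_token_if: ->(token) { token.length < 2 })
# => [:hello, :picky]
```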
- Fixed a bug where the last line in the log file was counted once a second time after reloading the stats.
- Slight interface redesign.
- Fixed a bug where the partial strategy `Partial::None` was not correctly used: A query like `Peter` did not return results even if "Peter" could be found using quotes: "Peter" (FYI, double quotes force Picky to use the exact index instead of the partial one. While, conversely, the asterisk `*` forces Picky to use the partial index instead of the exact one).
- Statistics server handles logfile reading in a cleaner way when the gem has been installed by root.
- (BETA) New statistics gem for Picky. Run `picky stats path/to/your/search.log [port]` to start a statistics server. Go to http://localhost:4567 after running the command to take a look.
- (BREAKING) Picky::Client::Base.search(:query => 'bla') has changed to Picky::Client::Base.search('bla'), as the query itself is not optional. The rest of the options is still passed in as a Hash through the second parameter.
- Redefined API for 1.1.6 beta feature, ranged search.
- API for #define_ranged_category.
- Enabled beta feature "low/high limited range search", see API RDoc (IndexAPI class).
- Passing in a similarity search (e.g. with text "hello") will never return "hello" as a similar token.
- Removed unnecessary jquery-1.3.2 from client, since it wasn't referenced anyway.
- The CouchDB source now uses a little trick/hack to make its ids work in Picky. They are translated into decimal numbers from its hex string representations. Recalculate using #to_s(16) before getting objects from CouchDB in a webapp.
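
The id translation described above (hex string to decimal number and back via `#to_s(16)`) looks roughly like this in plain Ruby. Both helper names are hypothetical. One assumption is added: plain `#to_s(16)` drops leading zeros, so the sketch pads back to a fixed width of 32 characters, which only holds if your CouchDB ids are 32-character hex strings.

```ruby
# Hypothetical helper: hex string id -> decimal number usable as a Picky id.
def couch_id_to_decimal(hex_id)
  Integer(hex_id, 16)
end

# Hypothetical helper: decimal number -> hex string id.
# Assumption: ids have a fixed width (32 by default), so leading zeros
# lost by Integer#to_s(16) are restored with rjust.
def decimal_to_couch_id(decimal, width: 32)
  decimal.to_s(16).rjust(width, '0')
end

hex_id  = '0a1b2c3d4e5f60718293a4b5c6d7e8f9'
decimal = couch_id_to_decimal(hex_id)

decimal_to_couch_id(decimal) == hex_id # => true
decimal.to_s(16) == hex_id             # => false (leading zero is lost)
```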
- Added generator for empty unicorn projects, use `picky generate empty_unicorn_project <project_name>` to generate one.
- Removed generator projects that have been moved to picky-generators. Gems now much smaller :)
- Generators extracted into picky-generators gem.
- Generators and example projects for both server and client.
- Lots of API RDoc.
- Yaaaay! Finally :)
- Fixed cased file name (led to problems under Linux, thanks Bernd Schoeller)
- New :from option. Assume you have a source `Sources::CSV.new(:title, file:'some_file.csv')` but you want the category to be called differently. Use the from option as follows: `define_category(:similar_title, :from => :title)`.
- CSV source uses `FasterCSV`, passing through all its options (`col_sep`, `row_sep` et cetera).
- More understandable output for rake try, rake try:index, rake try:query.
- Fixed a bug where the default qualifier definition (like the author in the query author:tolkien) for categories were ignored.
- API change in application.rb: Use #define_category instead of #category on an index. (category still possible, but deprecated)
- Internal rewrite.
- Rake task index:check will check if all necessary index files are generated. (Nice to use before restarting.)
- Better error reporting in Rake tasks. Task naming improved.
- Internal cleanup.
- Major API and internals rewrite. See generated project for help.
- Source CouchDB added (thanks to github.com/stanley).
- Typo fixed (thanks to github.com/stanley).
- Helpful configuration page in the client at /configure.
- Phonetic similarity (e.g. lyterature~) available through Similarity::Phonetic.new(4), see example.
- :weights option for queries also ok in the form: { [:cat1, :cat2] => 4 }, where 4 is any weight.
- (BREAKING) Total rewrite/exploration of the Application API. Stay on 0.9.4 if you don't want to update right now.
- Character substitution now configurable. Default is no character substitution.
- rake routes: Shows all current URL paths, and if they are processable fast.
- Fixed: Querying parameters are not ignored anymore.
- Fixed result_hash.entries to return the right amount of entries.
- The result_hash#entries now takes a block and replaces the e.g. AR instances with e.g rendered results.
- Locale handling fixed. Uses the locale of the HTML tag by default.
- Delicious missing gem notice if www-delicious gem is missing.
- Partial::Subtoken renamed to Partial::Substring. Options: down_to -> from, starting_at -> to
- Index bundle file handling extracted into specific Index::Files backend.
- Jump to 0.9.0 to work on API, release 1.0.0 soon.
- Partial indexing now only down to -3, e.g. florian -> partial: floria, flori, flor. If you want down_to the first character (florian, floria, flori, flor, flo, fl, f), use: field(:some_field_name, :partial => Partial::Subtoken.new(:down_to => 1))
- Sources::Delicious.new(user, pass) for indexing your delicious posts.
- indexing and querying config now done on tokenizer instances.
- Generator gives more informative NoGeneratorError message.
- Uses json (index, index weights) and marshal (similarity index) to dump indexes.
- Generator is more helpful (thanks to github.com/kschiess)
- Generator for a Sinatra project. (picky-client sinatra project_name <- Note: Changed to picky generate sinatra_client project_name)
- Helpful generator. (thanks to github.com/kschiess)
- Indexing output, output in general cleaned up.
- Better info after generating a new project (thanks kschiess).
- Indexer now uses json for the dump files (much faster, slightly larger, thanks to github.com/niko).
- JS files rewritten.
- Explicit index buffering: Indexer hits filesystem only seldom.
- Internal rename from full index to exact index (visible in index filenames).
- Solr Indexing removed until someone needs it. Then we'll talk cash. Just kidding.
- Improved Gemfile.
- Umlaut handling (i.e. character substitution) now pluggable.
- Apps finalization now handled through Ruby callback (thanks to github.com/severin).
- Fix for negative partial index values (:partial => Partial::Subtoken.new(:down_to => -3))
- Only uses JSON to encode results.
- Only uses JSON for full and partial queries.
- Application interface rewrite. See a freshly created project (using picky project <- Note: Renamed picky generate unicorn_server ). Application#add_index.
- Cleanup. Frontend example.
- Application#add_index instead of Application#type.
- Simplified scaffolding.
- Gem compiles on install. Do not compile on run.
- Removed unnecessary gem dependencies (thanks to niko).
- Added CSV to the possible Sources. Sources::CSV.new(:title, :author, :isbn, :file => 'data/books.csv').
- Renamed all instances of SEARCH_* constants to PICKY_*. (Uses RACK_ENV)
- config.ru, unicorn.ru now top level in newly created project (more standard).
- Port now defined in unicorn.ru (use listen 'host:port').
- Enriched callbacks in the JS interface definition (before, success, after).
- Interface now created using Picky::Helper.interface or .cached_interface (if you only have a single language in your app).
- C-Code cleaned up, removed warnings.
- Newly created application better documented.
- Initial project. Server (picky) and basic frontend client (picky-client) available.