
Commit

updated readme
Crenshinibon committed Sep 19, 2013
1 parent f03a2a4 commit 057cd14
Showing 1 changed file, README.md, with 11 additions and 12 deletions.
Spomet is the contraction of Spotting Meteors.

Test it [here](http://spomet.meteor.com/ "Spomet hosted at meteor.com")

It is a quite simple and limited fulltext search engine for [Meteor](http://meteor.com "Home of Meteor"). Despite its simplicity, it is sufficient for my purposes (and possibly for many others'). It should be easy to include in your Meteor project: take the [spomet package](https://github.com/Crenshinibon/spomet/tree/master/packages/spomet "Spomet package") from this GitHub repository's *packages* folder and put it into your app's *packages* folder.

This repository is itself a Meteor app and serves as an example of how to actually use Spomet.

Include the search box in your template:
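The actual inclusion snippet is not shown in this diff. A minimal sketch, assuming the package provides a template partial (the name `spometSearch` is a guess, not confirmed by the source):

```html
<template name="page">
  <!-- hypothetical partial name; check the spomet package for the real one -->
  {{> spometSearch}}
</template>
```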

Access the results through the CurrentSearch collection:

Spomet.Results()
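A hypothetical usage sketch follows. The result fields `path` and `score` are assumptions, and `Spomet.Results()` is replaced by a stub here so the sketch stands alone:

```javascript
// Stub standing in for Spomet.Results(), which would normally return
// the current search's result set.
function Results() {
  return [
    { path: 'doc-1', score: 0.82 },
    { path: 'doc-2', score: 0.41 }
  ];
}

// Order the hits by descending score and format them for display.
var hits = Results().sort(function (a, b) { return b.score - a.score; });
var lines = hits.map(function (h) { return h.path + ' (' + h.score + ')'; });
```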

Add documents to the search by calling the method *add* with a *Findable* instance:

Part of the identifier, relative to the base. Useful to identify parts of the base document, e.g. attribute identifiers of the stored document.
* base
The base path. E.g. the id of the document, whose text should be indexed.
* type
The document's type. Might be useful to distinguish between different types of documents.
* rev
A revision number to support multiple versions of a document.
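Putting the fields together, the data handed to *add* might look like this. The exact constructor and field names beyond those listed above are assumptions; a plain object is used so the sketch stands alone:

```javascript
// The fields of a Findable, as described in the list above.
var findable = {
  path: 'title',        // part of the identifier, relative to the base
  base: 'docs/4711',    // e.g. the id of the document being indexed
  type: 'blog-post',    // distinguishes different kinds of documents
  rev: 2,               // revision number, supporting multiple versions
  text: 'Spotting meteors with Meteor' // hypothetical field for the indexed text
};

// In the app this would roughly be passed along like:
// Meteor.call('add', new Spomet.Findable(findable));
```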


Technology
==========

The current implementation uses four simple indexes. They are supposed to balance precision and recall. There haven't been any tests yet, so future updates might fine-tune the parameters and introduce further indexes.

Currently there is a 3-gram based index, a simple word index, a custom index and a wordgroup index, where wordgroups are groups of two words.
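As an illustration (not the package's actual code), character 3-grams and two-word groups of a text can be computed like this:

```javascript
// Split a text into overlapping character 3-grams.
function trigrams(text) {
  var t = text.toLowerCase();
  var grams = [];
  for (var i = 0; i + 3 <= t.length; i++) {
    grams.push(t.slice(i, i + 3));
  }
  return grams;
}

// Group consecutive words into pairs ("wordgroups" of two words).
function wordgroups(text) {
  var words = text.toLowerCase().split(/\s+/).filter(Boolean);
  var groups = [];
  for (var i = 0; i + 2 <= words.length; i++) {
    groups.push(words[i] + ' ' + words[i + 1]);
  }
  return groups;
}
```

Indexing a document under all of its 3-grams makes substring and typo-tolerant matches cheap, while word and wordgroup indexes reward exact word and phrase hits, which is the precision/recall balance mentioned above.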

Future enhancements might include stemming, either algorithm-based (e.g. Porter) or lexicon-based, as well as phonetic matching.

Furthermore, the implementation is not very efficient, I fear. There is plenty of room to optimize certain aspects.

For example, a small client-side subset of the most commonly used index terms could accelerate the search.

The server process handles the heavy lifting of indexing the documents, so when there are many documents to include, the server will stall. A future enhancement might be a separate indexing process (deployable on a different host). Client-side indexing might not be doable because of security considerations.

Tests
=====
Run the tests from the project's root folder with:

laika --compilers coffee:coffee-script

Note: when you run all tests at once, there might be some spurious errors indicating a curly-brace problem.

Warning
=======

This package is in its very early stages. While it should allow for some basic usage, some essential things might still be missing.

There is, of course, no guarantee that it functions correctly, and I'm not liable for any consequences resulting from the use of this software.

