RDig

RDig provides an HTTP crawler and content extraction utilities to help build a site search for web sites or intranets. Internally, Ferret is used for full-text indexing. After you create a config file for your site, the index can be built with a single call to rdig.

RDig depends on Ferret (>= 0.10.0) and, for parsing HTML, on either Hpricot (>= 0.4) or the RubyfulSoup library (>= 1.0.4). As I know of no way to specify such an OR dependency in a gem specification, the gem depends on Hpricot. If this is a problem for you, install the gem with --force and then manually run +gem install rubyful_soup+.

basic usage

Index creation

create a config file based on the template in doc/examples
to create an index:
```
rdig -c CONFIGFILE
```
to run a query against the index (just to try it out)
```
rdig -c CONFIGFILE -q 'your query'
```
this will dump the first 10 search results to STDOUT

Handle search in your application:

```
require 'rdig'
require 'rdig_config'   # load your config file here
search_results = RDig.searcher.search(query)
```

see RDig::Search::Searcher for more information.

usage in rails

add to config/environment.rb:
```
require 'rdig'
require 'rdig_config'
```
place rdig_config.rb into the config/ directory.
build the index:
```
rdig -c config/rdig_config.rb
```

in your controller that handles the search form:

```
search_results = RDig.searcher.search(params[:query])
@results = search_results[:list]
@hitcount = search_results[:hitcount]
```
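
A search form is often submitted empty, so it can help to guard the call above. The following is only a sketch: run_search and StubSearcher are hypothetical names that are not part of RDig, and the stub stands in for RDig.searcher so the example runs without an index. It relies only on the :list and :hitcount keys shown above.

```ruby
# Hypothetical helper, not part of RDig: wraps the search call shown above
# so that a blank query never reaches the index.
def run_search(searcher, query)
  query = query.to_s.strip
  return { :list => [], :hitcount => 0 } if query.empty?

  results = searcher.search(query)
  { :list => results[:list], :hitcount => results[:hitcount] }
end

# Stand-in for RDig.searcher (an assumption made for illustration only).
class StubSearcher
  def search(query)
    { :list => ["result for #{query}"], :hitcount => 1 }
  end
end

empty = run_search(StubSearcher.new, "   ")    # blank query: no search is run
found = run_search(StubSearcher.new, "ferret") # delegates to the searcher
```

In a controller, the same helper would presumably be called as run_search(RDig.searcher, params[:query]).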

search result paging

Use the :first_doc and :num_docs options to implement paging through search results. :num_docs is 10 by default, so without these options only the first 10 results will be retrieved.
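
As a rough sketch of how a page number could be turned into these options: paging_options is a hypothetical helper, not part of RDig, and it assumes that :first_doc is a zero-based offset into the result list.

```ruby
# Hypothetical helper, not part of RDig: converts a 1-based page number into
# the :first_doc / :num_docs options described above.
# Assumption: :first_doc is a zero-based offset.
def paging_options(page, per_page = 10)
  page = [page.to_i, 1].max   # treat nil, "" and values below 1 as page 1
  { :first_doc => (page - 1) * per_page, :num_docs => per_page }
end

first_page = paging_options(nil)  # offset 0, 10 results
third_page = paging_options(3)    # offset 20, 10 results
```

Assuming the searcher accepts an options hash as its second argument, the result could be passed along as RDig.searcher.search(params[:query], paging_options(params[:page])).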

sample configuration

Taken from doc/examples/config.rb. The tag_selector properties are called with a BeautifulSoup instance as their parameter. See the RubyfulSoup site for more information about this library. You can also have a look at the html_content_extractor unit test.

:include:doc/examples/config.rb
