SiSU Search
Ralph Amissah
Rights: Copyright © Ralph Amissah 2007, part of SiSU documentation, License GPL 3
SiSU Search
1. SiSU Search - Introduction
SiSU output can easily and conveniently be indexed by a number of standalone indexing tools, such as Lucene or Hyperestraier.
2. SQL
2.1 populating SQL type databases
SiSU feeds documents prepared in SiSU markup into SQL databases, PostgreSQL [1] and/or SQLite [2], together with information related to document structure.
It is of course possible to add further structures.
3. Postgresql
3.1 Name
SiSU - Structured information, Serialized Units - a document publishing system, postgresql dependency package
3.2 Description
3.3 Synopsis
sisu -D --pg --[instruction] [filename/wildcard if required]
3.4 Commands
3.4.1 create and destroy database
sisu -D --createdb
sisu -D --create
sisu -D --Dropall
sisu -D --recreate
3.4.2 import and remove documents
sisu -D --import -v [filename/wildcard]
sisu -D --update -v [filename/wildcard]
sisu -D --remove -v [filename/wildcard]
4. Sqlite
4.1 Name
SiSU - Structured information, Serialized Units - a document publishing system.
4.2 Description
4.3 Synopsis
sisu -d --(sqlite|pg) --[instruction] [filename/wildcard if required]
4.4 Commands
4.4.1 create and destroy database
sisu -d --createdb
sisu -d --create
sisu -d --dropall
sisu -d --recreate
4.4.2 import and remove documents
sisu -d --import -v [filename/wildcard]
sisu -d --update -v [filename/wildcard]
sisu -d --remove -v [filename/wildcard]
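The PostgreSQL and SQLite command sets listed in sections 3.4 and 4.4 differ only in the leading flag (-D versus -d). As a minimal sketch, assuming that a first-time setup means creating the database, then its tables, then importing documents, the sequence can be wrapped in a small shell function. The name sisu_db_plan and the sample filename are hypothetical and not part of SiSU; the function only prints the commands, so nothing is run until the output has been reviewed.

```shell
#!/bin/sh
# Print (not run) the sisu commands for a first-time database setup.
# Assumption: database first, then tables, then document import.
# sisu_db_plan is a hypothetical helper, not part of SiSU.
sisu_db_plan() {
  backend=${1:-}
  [ $# -gt 0 ] && shift
  case "$backend" in
    pg)     flag=-D ;;   # PostgreSQL commands in this manual use -D
    sqlite) flag=-d ;;   # SQLite commands in this manual use -d
    *) echo "usage: sisu_db_plan pg|sqlite [file ...]" >&2; return 1 ;;
  esac
  echo "sisu $flag --createdb"
  echo "sisu $flag --create"
  # One import line per document named on the command line.
  for f in "$@"; do
    echo "sisu $flag --import -v $f"
  done
}
```

For example, `sisu_db_plan sqlite my_document.sst` would print the three commands for the SQLite backend, which can then be run by hand.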
5. Introduction
5.1 Search - database frontend sample, utilising database and SiSU features, including object citation numbering (backend currently PostgreSQL)
Sample search frontend [3]. A small database and sample query front-end (search form) that makes use of the citation system, object citation numbering, to demonstrate functionality. [4]
Note you may set results either to show the documents matched and the object number locations within each matched document that meet the search criteria; or to display the names of the documents matched along with the objects (paragraphs) that meet the search criteria. [5]
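The two result modes can be illustrated without a database. The following is only a sketch, assuming search hits are available as tab-separated lines of document name, object number and object text; that format, the sample data and the function names are hypothetical and are not what the SiSU search form itself uses.

```shell
#!/bin/sh
# Hypothetical hit list: document <TAB> object number <TAB> object text.
hits() {
  printf 'doc_a\t3\tfree software licence\n'
  printf 'doc_a\t17\tsoftware freedom\n'
  printf 'doc_b\t5\tsoftware patents\n'
}

# Mode 1: each matched document with the object numbers where matches occur.
index_mode() {
  awk -F '\t' '
    !($1 in seen) { order[++n] = $1; seen[$1] = 1; locs[$1] = $2; next }
    { locs[$1] = locs[$1] ", " $2 }
    END { for (i = 1; i <= n; i++) print order[i] ": " locs[order[i]] }'
}

# Mode 2: each matched document name along with the matching objects.
# Assumes the hits for one document arrive on consecutive lines.
text_mode() {
  awk -F '\t' '
    $1 != last { print $1; last = $1 }
    { print "  [" $2 "] " $3 }'
}
```

Running `hits | index_mode` lists each document once with its object numbers, while `hits | text_mode` prints the matching paragraphs under each document name.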
sisu -F --webserv-webrick
The following is feedback on the setup on a machine provided by the help command:
5.2 Search Form
sisu -F
sisu -F --webserv-webrick
sisu -Fv
sisu -W
The generated search form must be copied manually to the webserver directory as instructed.
6. Hyperestraier
/usr/share/doc/sisu/sisu_markup/sisu_hyperestraier/index.html
NOTE: the examples that follow assume that sisu output is placed in the directory /home/ralph/sisu_www
(A) to generate the index within the webserver directory to be indexed:
the following are examples that will need to be tailored according to your needs:
you may use the 'find' command together with 'egrep' to limit indexing to particular document collection directories within the web server directory:
find /home/ralph/sisu_www -type f | egrep '/home/ralph/sisu_www/sisu/.+\.html$' | estcmd gather -sd casket -
Check which directories in the webserver/output directory (~/sisu_www or elsewhere depending on configuration) you wish to include in the search index.
find /home/ralph/sisu_www -type f | egrep '/sisu_www/(sisu|bookmarks)/.+\.html$' | egrep -v '(doc|concordance)\.html$' | estcmd gather -sd casket -
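Before handing a file list to estcmd it can help to check what the filter selects. The sketch below builds a throwaway directory tree and runs the same kind of find and egrep pipeline over it, without calling estcmd. The directory layout and file names are invented for illustration, grep -E is used as the equivalent of egrep, and the dot before html is escaped so that it matches only a literal dot.

```shell
#!/bin/sh
# Build a throwaway tree that mimics a sisu output directory (names are made up).
root=$(mktemp -d)
mkdir -p "$root/sisu_www/sisu/book" "$root/sisu_www/bookmarks" "$root/sisu_www/other"
touch "$root/sisu_www/sisu/book/toc.html" \
      "$root/sisu_www/sisu/book/doc.html" \
      "$root/sisu_www/sisu/book/concordance.html" \
      "$root/sisu_www/bookmarks/links.html" \
      "$root/sisu_www/other/page.html" \
      "$root/sisu_www/sisu/book/plain.txt"

# List the files the filter would pass on to the indexer.
select_files() {
  find "$1" -type f \
    | grep -E '/sisu_www/(sisu|bookmarks)/.+\.html$' \
    | grep -E -v '(doc|concordance)\.html$' \
    | sort
}

selected=$(select_files "$root/sisu_www")
printf '%s\n' "$selected"
rm -rf "$root"
```

Only the HTML files under the sisu and bookmarks directories that are not named doc.html or concordance.html should be listed. Once the list looks right, the same filter can be piped to `estcmd gather -sd casket -`.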
from your current document preparation/markup directory, you would construct a rune along the following lines:
(i) copy estseek.cgi to your cgi directory and set file permissions to 755:
sudo cp -v /usr/share/hyperestraier/estseek.* /usr/lib/cgi-bin
(ii) edit estseek.conf, with attention to the lines starting 'indexname:' and 'replace:':
replace: ^file:///home/ralph/sisu_www{!}
and try opening the URL: <http://localhost:8081/cgi-bin/estseek.cgi>
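Step (i) mentions setting file permissions to 755, which a cp command alone does not do. A minimal sketch of the copy and permission steps as a reusable function follows; install_estseek is a hypothetical helper name, and the source and destination directories are passed as arguments because they vary between systems. When the destination is a system directory such as /usr/lib/cgi-bin, the function would need to be run with sufficient privileges.

```shell
#!/bin/sh
# Copy the estseek files into a cgi directory and make the CGI executable.
# install_estseek is a hypothetical helper; adjust both paths to your system.
install_estseek() {
  src=$1   # e.g. /usr/share/hyperestraier
  dest=$2  # e.g. /usr/lib/cgi-bin
  cp "$src"/estseek.* "$dest"/ || return 1
  chmod 755 "$dest"/estseek.cgi
}
```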
Endnotes
1. <http://www.postgresql.org/>
2. <http://www.hwaci.com/sw/sqlite/>
4. (which could be extended further with the current back-end). As regards scaling of the database, it is as scalable as the database (here PostgreSQL) and hardware allow.
5. Of this feature, when demonstrated to an IBM software innovations evaluator in 2004, he said, to paraphrase: this could be of interest to us. We have large document management systems; you can search hundreds of thousands of documents and we can tell you which documents meet your search criteria, but there is no way we can tell you, without opening each document, where within each your matches are found.
Document Information (metadata)
<http://www.jus.uio.no/sisu/sisu_manual/sisu_search/sisu_manifest.html>
Dublin Core (DC)
DC tags included with this document are provided here.
DC Title: SiSU - Search
DC Creator: Ralph Amissah
DC Rights: Copyright (C) Ralph Amissah 2007, part of SiSU documentation, License GPL 3
DC Type: information
DC Date created: 2002-08-28
DC Date issued: 2002-08-28
DC Date available: 2002-08-28
DC Date modified: 2007-09-16
DC Date: 2007-09-16
Version Information
Sourcefile: sisu_search._sst
Filetype: SiSU text insert 0.58
Sourcefile Digest, MD5(sisu_search._sst)= c085c2eb6d68f1b7d50435f673ede407
Skin_Digest: MD5(/home/ralph/grotto/theatre/dbld/sisu-dev/sisu/data/doc/sisu/sisu_markup_samples/sisu_manual/_sisu/skin/doc/skin_sisu_manual.rb)= 20fc43cf3eb6590bc3399a1aef65c5a9
Generated
Document (metaverse) last generated: Mon Sep 24 15:36:03 +0100 2007
Generated by: SiSU 0.59.0 of 2007w38/0 (2007-09-23)
Ruby version: ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]