% SiSU insert 2.0

@title: SiSU
 :subtitle: SQL and Search

@creator:
 :author: Amissah, Ralph

@date:
 :created: 2002-08-28
 :issued: 2002-08-28
 :available: 2002-08-28
 :published: 2007-09-16
 :modified: 2011-02-07

@rights:
 :copyright: Copyright (C) Ralph Amissah 2007
 :license: GPL 3 (part of SiSU documentation)

@classify:
 :subject: ebook, epublishing, electronic book, electronic publishing, electronic document, electronic citation, data structure, citation systems, search

:A~? @title @creator

:B~? SiSU Search

:C~? Search

1~search_sql SQL
={ SiSU sql; SiSU search }

2~ Populate the database
={ SiSU search:populate database }

To populate the SQL database, run sisu against a sisu markup file with one of the following sets of flags:

``` code
sisu --sqlite filename.sst
```

creates an sqlite3 database containing searchable content of just the sisu markup document selected

``` code
sisu --sqlite --update filename.sst
```

creates an sqlite3 database containing searchable content of marked up document(s) selected by the user from a common directory

``` code
sisu --pg --update filename.sst
```

fills a postgresql database with searchable content of marked up document(s) selected by the user from a common directory

For postgresql, the first time the command is run in a given directory the user will be prompted to create the requisite database. At the time of writing, the prompt sisu provides is as follows:

``` code
no connection with pg database established, you may need to run:
    createdb "SiSU.7a.current"
  after that don't forget to run:
    sisu --pg --createall
  before attempting to populate the database
```

The named database that sisu expects to find must exist and, if necessary, be created using postgresql tools. If the database exists but the database tables do not, sisu will attempt to create the tables it needs, the equivalent of the #{sisu --pg --createall}# command mentioned in the prompt.

Once this is done, the sql database is populated and ready to be queried.

2~ SQL type databases

SiSU feeds sisu markup documents, together with information related to document structure, into SQL type databases: PostgreSQL~{ http://www.postgresql.org/ \\ http://advocacy.postgresql.org/ \\ http://en.wikipedia.org/wiki/Postgresql }~ and/or SQLite~{ http://www.hwaci.com/sw/sqlite/ \\ http://en.wikipedia.org/wiki/Sqlite }~.

This is one of the more interesting output forms, as all the structural data of the documents is retained (though it can be ignored by the user of the database should they so choose). All site texts/documents are (currently) streamed to four tables:

_1* one containing semantic (and other) headers, including title, author, subject (the Dublin Core...);

_1* another containing the substantive texts by individual "paragraph" (or object), along with structural information, each paragraph being identifiable by its paragraph number (if it has one, which almost all of them do), and the substantive text of each paragraph quite naturally being searchable (both in formatted and clean text versions for searching);

_1* a third containing endnotes cross-referenced back to the paragraph from which they are referenced (both in formatted and clean text versions for searching); and

_1* a fourth, with a one-to-one relation to the headers table, containing full text versions of output, e.g. pdf, html, xml, and ascii.

There is of course the possibility to add further structures.

At this level SiSU loads a relational database with documents chunked into objects (their smallest logical, structurally constituent parts) as text objects, each with its object citation number and all other structural information needed to construct the document. Text is stored (at this text object level) with and without elementary markup tagging, the stripped version being there to make searching easier.
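
The arrangement described above can be illustrated with a minimal sketch using Python and an in-memory SQLite database. It is not the schema sisu itself creates: the table names, the column names, the #{strip_markup}# helper and the #{load_document}# function are all assumptions made for illustration. It shows four tables along the lines described, and text objects stored both with and without elementary markup.

```python
import re
import sqlite3

# Hypothetical schema: names are illustrative, not those sisu generates.
SCHEMA = """
CREATE TABLE documents (          -- headers (title, author, subject ...)
    doc_id  INTEGER PRIMARY KEY,
    title   TEXT,
    author  TEXT,
    subject TEXT
);
CREATE TABLE objects (            -- one row per text object ("paragraph")
    doc_id  INTEGER REFERENCES documents(doc_id),
    ocn     INTEGER,              -- object citation number
    body    TEXT,                 -- text with elementary markup
    clean   TEXT                  -- same text, markup stripped, for searching
);
CREATE TABLE endnotes (           -- endnotes, tied back to their object
    doc_id  INTEGER REFERENCES documents(doc_id),
    ocn     INTEGER,
    body    TEXT,
    clean   TEXT
);
CREATE TABLE fulltext_output (    -- one-to-one with documents
    doc_id    INTEGER PRIMARY KEY REFERENCES documents(doc_id),
    plaintext TEXT,
    html      TEXT
);
"""


def strip_markup(text):
    """Remove a small, illustrative subset of inline markers such as *{ }*."""
    return re.sub(r"[*!/_]\{|\}[*!/_]", "", text)


def load_document(conn, title, author, paragraphs):
    """Store a document's header and its numbered text objects."""
    cur = conn.execute(
        "INSERT INTO documents (title, author) VALUES (?, ?)", (title, author)
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO objects (doc_id, ocn, body, clean) VALUES (?, ?, ?, ?)",
        [
            (doc_id, ocn, text, strip_markup(text))
            for ocn, text in enumerate(paragraphs, start=1)
        ],
    )
    return doc_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    load_document(conn, "Demo", "Anon", ["A *{bold}* claim.", "Plain text."])
    for row in conn.execute("SELECT ocn, body, clean FROM objects ORDER BY ocn"):
        print(row)
```

Keeping the stripped text in its own column means a search never has to allow for markup characters, while the formatted column remains available for display.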

Being able to search a relational database at an object level with the SiSU citation system is an effective way of locating content generated by SiSU. Because the individual text objects of a document are stored (and indexed) together with their object numbers, and all versions of the document have the same numbering, complex searches can be tailored to return just the locations of the search results, which are relevant for all available output formats, with live links to the precise locations in the database or in html/xml documents. Alternatively, the structural information provided makes it possible to search the full contents of the database and return the headings under which the search content appears, or to search only headings, etc. (as the Dublin Core is incorporated it is easy to make use of that as well).
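
As a rough sketch of such a search, the following Python example queries a small in-memory SQLite database and returns only locations, that is, document title and object number. The tables, columns, sample rows and the #{build_demo_db}# and #{search}# functions are hypothetical and are not taken from the database sisu builds. The optional flag shows how a search might be restricted to headings.

```python
import sqlite3


def build_demo_db():
    """Create a tiny, hypothetical object store with three text objects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE objects (
            doc_id     INTEGER REFERENCES documents(doc_id),
            ocn        INTEGER,   -- object citation number
            is_heading INTEGER,   -- 1 for headings, 0 for other objects
            clean      TEXT       -- markup-free text used for searching
        );
    """)
    conn.execute("INSERT INTO documents VALUES (1, 'Demo Text')")
    conn.executemany(
        "INSERT INTO objects VALUES (1, ?, ?, ?)",
        [
            (1, 1, "Search and citation"),
            (2, 0, "Objects are numbered so that a search can cite them."),
            (3, 0, "The numbering is the same in every output format."),
        ],
    )
    return conn


def search(conn, term, headings_only=False):
    """Return (title, ocn) pairs for objects whose clean text contains term."""
    sql = (
        "SELECT d.title, o.ocn "
        "FROM objects AS o JOIN documents AS d USING (doc_id) "
        "WHERE o.clean LIKE ?"
    )
    if headings_only:
        sql += " AND o.is_heading = 1"
    sql += " ORDER BY d.title, o.ocn"
    return conn.execute(sql, (f"%{term}%",)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(search(conn, "search"))
    print(search(conn, "search", headings_only=True))
```

Because the object number is the same in every output format, a result such as a title with an object number can be turned into a link for whichever format the reader prefers; how that link is constructed is left out here, as it depends on the site's layout.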