                                                                       index.txt
                                                                      markup.txt
%% Summary ---------------------------------------------------------------------

SiSU takes a plain text (UTF-8) document and abstracts it into its structure
and its smaller constituent parts, objects. This abstraction is then used in
subsequent processing to reconstitute the document in disparate document
representations, many quite different from one another, e.g. HTML, ODF and
LaTeX, and to populate SQL databases.

SiSU identifies the document structure (headings and their levels) and pulls
the document apart into its constituent parts, objects (paragraphs, headings,
tables, images etc.), assigning an object number to each object that is
substantive content.
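
As a rough illustration of this abstraction (not SiSU's own implementation),
the Python sketch below splits a marked up body into numbered objects. It
assumes that objects are separated by blank lines, that a heading begins with
one of the level markers described further on (:A~ to :C~, 1~ to 3~), and that
every non-empty block counts as substantive content. The names split_objects
and SAMPLE are hypothetical.

```python
import re

# Assumption: a heading line starts with a level marker such as ":A~" or "1~".
HEADING = re.compile(r"^(:[A-C]|[1-3])~\s+(.*)$")


def split_objects(body):
    """Split body text into (object_number, kind, text) tuples.

    Blocks are separated by blank lines. Every non-empty block is treated as
    substantive and numbered in sequence, which is a simplification.
    """
    objects = []
    for block in re.split(r"\n\s*\n", body.strip()):
        # Join wrapped lines so each object is a single string.
        text = " ".join(line.strip() for line in block.splitlines())
        if not text:
            continue
        kind = "heading" if HEADING.match(text) else "paragraph"
        objects.append((len(objects) + 1, kind, text))
    return objects


SAMPLE = """:A~ Sample Title

1~ First Chapter

A first paragraph,
wrapped over two lines.

A second paragraph.
"""

for number, kind, text in split_objects(SAMPLE):
    print(number, kind, text)
```

Numbering objects rather than pages gives each representation of the document
the same reference points, however differently the output is laid out.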

From the marked up document SiSU needs to be able to determine the document's
structure and the objects that the document contains.

The first line of a SiSU marked up document can identify the document as SiSU
markup with:

%% Identify SiSU Document ------------------------------------------------------

% SiSU

%% The Basic SiSU Markup Document ----------------------------------------------

SiSU documents are divided into two parts, (i) the document header and (ii)
substantive content.

(i) the document header, which contains (a) metadata and (b) processing
instructions, if any. Document headers take the form of a tag and the related
information. The document header metadata should contain at least:

@title:

@creator:
  :author:

Processing instructions are grouped under the @make: tag. In the absence of any
processing instructions, program (or configured) defaults are used.
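
The header layout described above (a tag, optionally followed by indented
sub-tags) can be read with a few lines of Python. This is a minimal sketch and
not how SiSU itself parses headers: the function names, the "text" key used
for a value given on the tag line, and the treatment of lines starting with %
as comments are all assumptions made for the example.

```python
import re

TAG = re.compile(r"^@(\w+):\s*(.*)$")        # e.g. "@title: Some Title"
SUBTAG = re.compile(r"^\s+:(\w+):\s*(.*)$")  # e.g. "  :author: Some Name"


def parse_header(lines):
    """Collect header tags and their sub-tags into a nested dict."""
    header = {}
    current = None
    for line in lines:
        if not line.strip() or line.startswith("%"):
            continue  # assumption: blank lines and % lines carry no metadata
        m = TAG.match(line)
        if m:
            current = m.group(1)
            header[current] = {}
            if m.group(2):
                # "text" is a hypothetical key for a value on the tag line.
                header[current]["text"] = m.group(2)
            continue
        m = SUBTAG.match(line)
        if m and current is not None:
            header[current][m.group(1)] = m.group(2)
    return header


def missing_minimum(header):
    """List the minimum metadata named in the text that is absent."""
    missing = []
    if not header.get("title", {}).get("text"):
        missing.append("@title:")
    if not header.get("creator", {}).get("author"):
        missing.append("@creator: :author:")
    return missing


SAMPLE = """% SiSU

@title: A Sample Document

@creator:
  :author: A. Writer
"""

header = parse_header(SAMPLE.splitlines())
print(header)
print(missing_minimum(header))
```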

(ii) for the substantive content the document structure must be defined; here
structure equates to the headings and their relative levels (this can be done
either by explicit markup where each heading occurs, or in the @make: section
of the document header, or both).

The basic document objects are headings and paragraphs. Paragraphs are
identified automatically, and headings must be defined (with respect to
document structure), so SiSU is able to determine the basic objects without
anything further.

sample_1.sst

%% Document Structure (heading levels) -----------------------------------------

Document structure (heading levels) is determined from information provided in
the markup of the document. There are two ways to identify document structure:
(i) manual markup of headings with their level; (ii) in the SiSU header, under
@make: :heading:, provide a regex, in the manner understood by SiSU, that
identifies what to look for in headings of various levels.
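
To give a feel for the second approach, the Python sketch below matches lines
against one pattern per heading level. The patterns and the names
LEVEL_PATTERNS and detect_level are hypothetical and only show the idea; they
do not reproduce the syntax SiSU expects under @make: :heading:.

```python
import re

# Hypothetical patterns, one per heading level; NOT SiSU's @make: syntax.
LEVEL_PATTERNS = {
    "1": re.compile(r"^Chapter \d+\b"),
    "2": re.compile(r"^Section \d+\.\d+\b"),
}


def detect_level(line):
    """Return the level whose pattern matches the line, or None."""
    for level, pattern in LEVEL_PATTERNS.items():
        if pattern.match(line):
            return level
    return None


for line in ["Chapter 1 Beginnings", "Section 1.1 Scope", "Ordinary text."]:
    print(detect_level(line), line)
```

Pattern-based detection suits documents whose headings already follow a
regular form, since no marker has to be added where each heading occurs.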

Document structure is the different headings in a document, and their relative
levels.

There are two sets of document level markers: :A~ with an optional :B~ and
:C~, and beneath those 1~, 2~ and 3~.

In the first set of document level markers the document title is the top level
of the hierarchy; beneath it come book titles, if the document contains more
than one book, followed by sections.
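
The explicit markers can be picked out of a document mechanically. The Python
sketch below is an illustration only: it recognises just the six markers named
above, treats the order :A~, :B~, :C~, 1~, 2~, 3~ as depth 0 to 5, and assumes
that a heading is a single line beginning with its marker. The names LEVELS,
MARKER and outline are hypothetical.

```python
import re

# Markers named in the text, ordered from the top of the hierarchy down.
LEVELS = [":A", ":B", ":C", "1", "2", "3"]
MARKER = re.compile(r"^(:[A-C]|[1-3])~\s+(.+)$")


def outline(lines):
    """Return (depth, marker, title) for each explicitly marked heading."""
    found = []
    for line in lines:
        m = MARKER.match(line)
        if m:
            marker, title = m.group(1), m.group(2).strip()
            found.append((LEVELS.index(marker), marker, title))
    return found


SAMPLE = [
    ":A~ Collected Works",
    ":B~ Book One",
    "1~ Chapter One",
    "Some paragraph text.",
    "2~ A Subsection",
]

# Print the headings indented by their depth in the hierarchy.
for depth, marker, title in outline(SAMPLE):
    print("  " * depth + f"{marker}~ {title}")
```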

%% Document Objects (paragraphs, headings, tables, verse etc.) -----------------

Document objects are units of text that are identified, stored and processed as
a block. The most usual document objects would be paragraphs and headings. A
more complete list of objects includes: paragraphs; headings; tables; code
blocks; verse (the poem is identified, but each verse is an object); grouped
text...