%% Summary ---------------------------------------------------------------------
SiSU takes an ASCII (UTF-8) document and abstracts the document into its
structure and smaller constituent parts, objects. This abstraction is then used
in subsequent processing to reconstitute the document into disparate document
representations, many quite different from one another, e.g. HTML, ODF and
LaTeX, and to populate SQL databases.
SiSU identifies the document structure (headings and their levels) and pulls
the document apart into its constituent parts, objects (paragraphs, headings,
tables, images etc.), to each of which it assigns an object number if it has
substantive content.
From the marked up document SiSU needs to be able to determine a document's
structure, and the objects that the document contains.
The first line of a SiSU marked up document can identify itself with
%% Identify SiSU Document ------------------------------------------------------
% SiSU
%% The Basic SiSU Markup Document ----------------------------------------------
SiSU documents are divided into two parts:
(i) the document header, which contains (a) metadata and (b) processing
instructions, if any; document headers take the form of a tag and the related
information
(ii) the substantive content
%% Minimum (SiSU) Markup Requirements ------------------------------------------
minimum requirements for a SiSU document
(i)(a) document header metadata, should contain at least:
@title:
@creator:
:author:
(i)(b) processing instructions (grouped under @make:) are not required as there
are defaults that will be used
(ii) the substantive document structure must be defined; here structure equates
to the headings and their relative levels. This can be done either by explicit
markup where each heading occurs, or in the header @make: section of the
document, or both
(iii) the basic document objects are headings and paragraphs. Paragraphs are
identified automatically, and headings must be defined, so sisu is able to
determine the basic objects without anything further
sample_1.sst
%% Document Structure ----------------------------------------------------------
Document structure (heading levels) is determined from information provided in
the markup of the document. There are two ways to identify document structure:
(i) manual markup of headings with their level; (ii) in the sisu header, under
@make: :heading: provide a regex, in the manner understood by sisu, that
identifies what to look for in headings of various levels.
Document structure is the different headings in a document, and their relative
levels.
There are two sets of document level markers :A~ and an optional :B~ & :C~ and
beneath that 1~, 2~, 3~.
In the first set of document level markers the document title is the top
level in the hierarchy; beneath that come book titles, if the document contains
more than one book, followed by sections
%% Document Objects ------------------------------------------------------------
Document objects are units of text that are identified, stored and processed as
a block. The most usual document objects would be paragraphs and headings. A
more complete list of objects includes: paragraphs; headings; tables; code
blocks; verse (the poem is identified, but each verse is an object); grouped
text...
%% The Gory Details ------------------------------------------------------------
%% comments --------------------------------------------------------------------
Comments in sisu are a percentage sign at the start of a line followed by a
space and then the comment
% this would be a comment
%% headings --------------------------------------------------------------------
There are two sets of document heading markers :A~ and an optional :B~ & :C~ and
beneath that 1~, 2~, 3~. These markers are placed at the start of the
line/paragraph, and followed by the heading
There is usually one :A~ top level heading, which is the document title,
sometimes including the author. This is such a common occurrence that there is
a shortcut where metadata headers are provided for @title: and @creator
:author:, instead of rewriting the title and author's name, you may write :A~
@title @author
If you have a document/manuscript that has subsections above the level of
chapter, such as multiple books, parts or sections, two additional top level
headings are available :B~ and :C~
At the main division level, usually the chapter, heading level 1~ begins,
followed by 2~ and 3~ if the chapter has subheadings. Because the html and epub
segmented output breaks level 1~ into separate files, it is possible to provide
the filename, e.g. 1~prologue Prologue
Where names are provided following the heading tilde, these become tagged
points within the document which can, where the output format permits, be
(hyper-)linked to.
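As a rough illustration of the heading markers described above, the following Python sketch recognises a heading line and splits it into marker, optional name and heading text. It is not how sisu itself parses markup; the names Heading, HEADING_RE and parse_heading are hypothetical, and the pattern covers only the markers mentioned in this section (:A~ to :C~ and 1~ to 3~), with names limited to [a-z0-9_].

```python
import re
from typing import NamedTuple, Optional


class Heading(NamedTuple):
    marker: str  # ":A", ":B", ":C", "1", "2" or "3"
    name: str    # optional name following the tilde, "" if absent
    text: str    # the heading text itself


# Assumption: a heading is a marker, a tilde, an optional name, then text.
HEADING_RE = re.compile(r"^(:[A-C]|[1-3])~([a-z0-9_]*)\s+(.*)$")


def parse_heading(line: str) -> Optional[Heading]:
    """Return the parts of a heading line, or None for any other line."""
    match = HEADING_RE.match(line)
    if match is None:
        return None
    marker, name, text = match.groups()
    return Heading(marker, name, text.strip())
```

With this sketch, parse_heading("1~prologue Prologue") gives marker "1", name "prologue" and text "Prologue", while an ordinary paragraph gives None.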
%% font effect, modified font including emphasis -------------------------------
Within normal text it is possible to modify the font effect of a word or
phrase using the following markers:
!{ bold }!
/{ italics }/
_{ underscore }_
*{ emphasis }* (how emphasis is represented in output text can be defined in
the header @make: section of a document or in the sisurc.yml config file, and
this may be as bold, italics or underscore, the default being bold)
^{ superscript }^
,{ subscript },
+{ inserted text }+
-{ strikethrough }-
An exclamation mark followed by an underscore at the start of a line will bold
that line until the first line-break
!_ this line would be bold
It is also possible to define in the header section under the @make: section
which words or patterns should automatically be made bold or italics.
@make:
:bold: /Gnu|Linux|Debian|Fedora|Ruby|SiSU/
:italics: /inter alia/
%% indent ----------------------------------------------------------------------
_1 a paragraph that is indented one level
_2 a paragraph that is indented two steps
%% bullet ----------------------------------------------------------------------
_* bulleted text
_1* bulleted indented text
%% auto-numbering -------------------------------------------------------------
Some auto-numbering occurs in the building of sisu documents, either by default
or when requested through configuration options
%% auto-numbering document objects ---------------------------------------------
Document objects are automatically given sequential object numbers, object
citation numbers (ocn). If there is text that for some reason should not be
regarded as a substantive object, it is possible to prevent an object number
being given by adding ~# to the end of the object (paragraph/heading, etc.)
A variation, -#, is used for headings that are added to provide document
structure and that should, where possible, not be included in output. A heading
marked with -# is un-numbered and may be excluded from document outputs.
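The numbering rule above can be sketched in Python as follows. This is only an illustration under simple assumptions, not sisu's own code: each object is taken to be a single string, the hypothetical function assign_ocn treats a trailing ~# or -# as "do not number", and it does not attempt to exclude -# headings from output.

```python
def assign_ocn(objects):
    """Pair each object with its object citation number (ocn).

    Objects ending in ~# or -# get None instead of a number, and the
    marker is stripped from the returned text.
    """
    numbered = []
    ocn = 0
    for obj in objects:
        text = obj.rstrip()
        if text.endswith("~#") or text.endswith("-#"):
            # Not a substantive object: leave it un-numbered.
            numbered.append((None, text[:-2].rstrip()))
        else:
            ocn += 1
            numbered.append((ocn, text))
    return numbered
```

Note that un-numbered objects do not consume a number, so the object following them continues the sequence.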
%% auto-numbering headings -----------------------------------------------------
Note auto-numbering of headings may be specified in the header @make: :num_top:
by providing the heading level from which numbering is to start, this is
usually at the chapter level (1~).
@make:
:num_top: 1
numbering continues three levels down, level 1 being numbered 1, 2, 3 ...
level 2: 1.1, 1.2, 1.3 and so on
level 3: 1.1.1, 1.1.2, 1.1.3
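A minimal Python sketch of the numbering scheme just described, assuming :num_top: 1 so that numbering starts at heading level 1~ and runs three levels down. The function name number_headings is hypothetical, and the sketch assumes heading levels are not skipped (a 3~ heading always follows a 2~ heading); what sisu does otherwise is not covered here.

```python
def number_headings(levels):
    """Return dotted numbers for a sequence of heading levels (1, 2 or 3)."""
    counters = [0, 0, 0]
    numbers = []
    for level in levels:
        counters[level - 1] += 1
        # A new heading resets the counters of all deeper levels.
        for deeper in range(level, len(counters)):
            counters[deeper] = 0
        numbers.append(".".join(str(n) for n in counters[:level]))
    return numbers
```

For the level sequence 1, 2, 2, 3, 1, 2 this gives 1, 1.1, 1.2, 1.2.1, 2, 2.1.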
It is also possible to make an auto-numbered list
# numbered list                numbered list 1., 2., 3., etc.
_# indented lettered list      sub-level of previous list number a., b., c., d., etc.
%% line breaks -----------------------------------------------------------------
<:br> line break
In paper/page oriented outputs, such as LaTeX/pdf, the following are available
<:pb> page or column break
<:pn> new page
In the header section, under @make: :breaks:, new and break set a new page or
a page break at the levels indicated, e.g.
@make:
:breaks: new=C; break=1
%% footnotes / endnotes --------------------------------------------------------
This paragraph contains a footnote~{ a footnote or endnote }~ which would be
automatically numbered
Footnotes and endnotes are marked up at the location where they would be
indicated within a text. They are automatically numbered. The output type
determines whether footnotes or endnotes will be produced
In addition to regular footnotes/endnotes there are numbered asterisk and plus
sign series, and unnumbered asterisk footnotes.
normal text ~[* editor's notes, numbered asterisk footnote/endnote series ]~ continues
normal text ~[+ editor's notes, numbered plus sign footnote/endnote series ]~ continues
normal text ~{* unnumbered asterisk footnote/endnote, insert multiple asterisks if required }~ continues
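To make the automatic numbering of regular footnotes concrete, here is a small Python sketch. It is an illustration only: the function number_footnotes and the "[n]" reference style are assumptions made for the example, not sisu output, and the asterisk and plus sign series are deliberately left untouched.

```python
import re

# Regular footnotes only: ~{ text }~, excluding the ~{* ... }~ form.
FOOTNOTE_RE = re.compile(r"~\{(?![*+])\s*(.*?)\s*\}~")


def number_footnotes(paragraph):
    """Replace each regular footnote with [n] and collect the note texts."""
    notes = []

    def replace(match):
        notes.append(match.group(1))
        return "[%d]" % len(notes)

    body = FOOTNOTE_RE.sub(replace, paragraph)
    return body, notes
```

Applied to the example paragraph at the start of this section, the footnote markup is replaced by [1] and the note text "a footnote or endnote" is returned separately.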
%% tag points ------------------------------------------------------------------
Tag points are markers within the document which may be used within the
document for (internal document) linking where the output format permits. Tag
names should use alphanumeric characters and underscores [a-z0-9_]+.
There are different types of tag point, some automatically provided by sisu,
such as each ocn (object citation number)
Manual tags may be provided either:
(a) with headings where a name is added to the heading level after the tilde:
1~prefix [heading]
(b) a tag marker can be added to a paragraph using an asterisk tilde and the
name *~tag_marker
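The naming rule and the paragraph tag marker can be checked with a couple of regular expressions. The Python below is a sketch with hypothetical helper names (is_valid_tag_name, find_tag_markers); it assumes only what is stated above, namely that names match [a-z0-9_]+ and that a paragraph tag is written *~name.

```python
import re

TAG_NAME_RE = re.compile(r"[a-z0-9_]+")
TAG_MARKER_RE = re.compile(r"\*~([a-z0-9_]+)")


def is_valid_tag_name(name):
    """True if the whole name consists of the characters [a-z0-9_]."""
    return TAG_NAME_RE.fullmatch(name) is not None


def find_tag_markers(paragraph):
    """Return the names of all *~name tag markers in a paragraph."""
    return TAG_MARKER_RE.findall(paragraph)
```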
%% links and urls --------------------------------------------------------------
Urls found within text are marked up automatically and, where the output type
permits, are automatically hyperlinked to themselves and decorated with angled
braces (unless contained in a code block, or escaped by a preceding underscore).
To link text or an image to a url the markup is as follows
{ this is the linked section of text}http://url.org
Where it is wished to include the url for the linked text in a footnote, the
long form of markup would be:
{ SiSU }http://www.jus.uio.no/sisu/ ~{ http://www.jus.uio.no/sisu/ }~
A short form is provided for achieving the same:
{~^ SiSU }http://www.jus.uio.no/sisu/
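The relationship between the short and long forms is a plain text substitution, which the following Python sketch spells out. The function name expand_short_links is hypothetical and the pattern is an assumption for illustration (it takes the url to be everything up to the next whitespace); it is not taken from sisu.

```python
import re

# Short form: {~^ linked text }url
SHORT_LINK_RE = re.compile(r"\{~\^\s*(.*?)\s*\}(\S+)")


def expand_short_links(text):
    """Rewrite {~^ text }url as { text }url ~{ url }~."""

    def replace(match):
        label, url = match.group(1), match.group(2)
        return "{ %s }%s ~{ %s }~" % (label, url, url)

    return SHORT_LINK_RE.sub(replace, text)
```

Applied to the short form above, this returns the long form shown before it; the same substitution applies when the linked item is an image.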
%% images ----------------------------------------------------------------------
Images are placed in the _sisu/image directory beneath the location of the
document to be processed
The following are examples of links to images
{ tux.png 64x80 }image
{tux.png 64x80 "Gnu/Linux - a better way" }http://www.jus.uio.no/sisu/
{GnuDebianLinuxRubyBetterWay.png 100x101 "Way Better - with Gnu/Linux, Debian and Ruby" }http://www.jus.uio.no/sisu/
The 64x80 in the first example is the image dimension, (width x height). This
may be omitted if imagemagick or graphicsmagick are installed, as they will
determine the image dimensions
As with other linked text, the following markup
{~^ ruby_logo.png "Ruby" }http://www.ruby-lang.org/en/
maps to
{ ruby_logo.png "Ruby" }http://www.ruby-lang.org/en/ ~{ http://www.ruby-lang.org/en/ }~
%% grouped text ----------------------------------------------------------------
%% group -----------------------------------------------------------------------
The start and end of text that is grouped are tagged. Grouped text retains its
line breaks, and is treated as a unit, getting a single ocn
group{
License: GPL 3 or later:
SiSU, a framework for document structuring, publishing and search
Copyright (C) Ralph Amissah
This program is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program. If not, see .
If you have Internet connection, the latest version of the GPL should be
available at these locations:
}group
%% poem ------------------------------------------------------------------------
The start and end of a poem are tagged. Each verse in a poem is given an object
number. Verses retain their line breaks.
poem{
verse here
declare
another verse
here
}poem
%% table -----------------------------------------------------------------------
{table~h 24; 12; 12; 12; 12; 12; 12;}
                                |Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July 2004|June 2006
Contributors*                   |       10|      472|    2,188|    9,653|   25,011|   48,721
Active contributors**           |        9|      212|      846|    3,228|    8,442|   16,945
Very active contributors***     |        0|       31|      190|      692|    1,639|    3,016
No. of English language articles|       25|   16,000|  101,000|  190,000|  320,000|  630,000
No. of articles, all languages  |       25|   19,000|  138,000|  490,000|  862,000|1,600,000
table{ c3; 40; 30; 30;
This is a table
this would become column two of row one
column three of row one is here
And here begins another row
column two of row two
column three of row two, and so on
}table
The {table~h ...} form, with cells separated by |, may be easier to work with in cases where there is not much information in each column
%% code ------------------------------------------------------------------------
code{
code lines here
and so on
}code
%% headers ---------------------------------------------------------------------
%% header metadata -------------------------------------------------------------
@title: SiSU
:subtitle: Markup
@creator:
:author: Amissah, Ralph
@rights:
:copyright: Copyright (C) Ralph Amissah 2007, part of SiSU documentation
:license: GPL 3 or later
@classify:
:type: information
:topic_register: electronic documents:SiSU:document:markup;SiSU:document:markup;SiSU:document:markup;SiSU:manual:markup;electronic documents:SiSU:manual:markup
:subject: ebook, epublishing, electronic book, electronic publishing, electronic document, electronic citation, data structure, citation systems, search
@date:
:created: 2002-08-28
:issued: 2002-08-28
:available: 2002-08-28
:published: 2008-05-22
:modified: 2010-05-25
@links:
{ SiSU Manual }http://www.jus.uio.no/sisu/sisu_manual/
{ Book Samples and Markup Examples }http://www.jus.uio.no/SiSU/examples.html
{ SiSU @ Wikipedia }http://en.wikipedia.org/wiki/SiSU
{ SiSU Download }http://www.jus.uio.no/sisu/SiSU/download.html
{ SiSU Changelog }http://www.jus.uio.no/sisu/SiSU/changelog.html
%% header processing instructions, @make: --------------------------------------
Some document processing parameters can be set in the @make: section of the header
@make:
:num_top: 1
:breaks: new=C; break=1
:skin: skin_sisu_manual
:bold: /Gnu|Debian|Ruby|SiSU/
%% config, sisurc.yml ----------------------------------------------------------
There are default configuration settings within the program many of which can
be modified in sisurc.yml
sisurc.yml can be located in one of the following locations:
/etc/sisu/sisurc.yml ~/.sisu/sisurc.yml and ./_sisu/sisurc.yml
if sisu versions 1 and 2 are both in use, the v2 sisurc.yml can be placed at the
following locations:
/etc/sisu/v2/sisurc.yml ~/.sisu/v2/sisurc.yml and ./_sisu/v2/sisurc.yml
# Name: SiSU - Simple information Structuring Universe
# Author: Ralph@Amissah.com
# Description: Site wide environment defaults set here
# system environment info / resource configuration file, for sisu
# License: GPL v3 or later
# site environment configuration file
# this file should be configured and live in
# /etc/sisu #per environment settings, overridden by:
# ~/.sisu #per user settings, overridden by:
# ./_sisu #per local markup directory settings
#% #image source directory, main path and subdirectories
#image:
# path: 'sisu_working'
# public: '_sisu/image'
#% presentation/web directory, main path and subdirectories (most subdirectories are created automatically based on markup directory name)
#webserv:
# url_root: 'http://www.your.url' #without dir stub
# path: '/var/www' #either (i) / [full path from root] or (ii) ~/ [home] or (iii) ./ [pwd] or (iv) will be made from home
# images: '_sisu/image'
# man: 'man'
# cgi: '/usr/lib/cgi-bin'
# feed: 'feed'
# sqlite: 'sisu/sqlite'
# webrick_url: true
#show_output_on: 'filesystem' #for -v and -u url information, alternatives: 'filesystem','webserver','remote_webserver','local:8111','localhost','localhost:8080','webrick','path'
#show_output_on: 'local:8111'
#webserv_cgi:
# host: localhost
# base_path: ~
# port: '8081'
# user: ~
show_output_on: 'filesystem_url'
#texinfo display output
#texinfo:
# stub: 'texinfo'
##% processing directories, main path and subdirectories (appended to $HOME), using defaults set in sysenv
#processing:
# path: '~'
# dir: '.sisu_processing~'
# metaverse: 'metaverse'
# tune: 'tune'
# latex: 'tex'
# texinfo: 'texinfo'
# concord_max: 400000
#% flag - set (non-default) processing flag shortcuts -1, -2 etc. (here adding colour and verbosity as default)
flag:
  color: true # making colour default -c is toggle, and will now toggle colour off
  default: '-NhwepoabxXyYv' # -m run by default; includes verbose
  i: '-hwpoay' # -m run by default
  ii: '-NhwepoabxXy' # -m run by default
  iii: '-NhwepoabxXyY' # -m run by default
  iv: '-NhwepoabxXYDy --update' # -m run by default
  v: '-NhwepoabxXYDyv --update' # -m run by default; includes verbose
#% papersize, (LaTeX/pdf) available values: A4, US_letter, book_b5, book_a5, US_legal
default:
  papersize: 'A4,letter'
#text_wrap: 78
#emphasis: 'bold' #make *{emphasis}* 'bold', 'italics' or 'underscore', default if not configured is 'bold'
#digest: 'sha' #sha is sha256, default is md5
#multilingual: false
#language_file: 2
#language: 'English'
#% markup, make *{emphasis}* 'bold' or 'italics', default if not configured is 'bold'
#% settings used by ssh scp
#remote:
# -
# user: '[usrname]'
# host: '[remote.hostname]'
# path: '.' #no trailing slash eg 'sisu/www'
# -
# user: '[usrname]'
# host: '[remote.hostname]'
# path: '.' #no trailing slash eg 'sisu/www'
#% webrick information
#webrick:
# port: '8081'
#% sql database info, postgresql and sqlite
#db:
# share_source: false # boolean, default is false
# postgresql:
# port: # '[port (default is 5432)]'
# host: # '[if not localhost, provide host tcp/ip address or domain name]'
# user: # '[(if different from user) provide username]'
# password: # '[password if required]'
# sqlite:
# path: ~ # './sisu_sqlite.db'
# port: "**"
#% possible values: ~, true, false, or a command instruction, e.g. editor: 'gvim -c :R -c :S'
# a program is ignored only if its value is set to false; absence or nil (~) does not remove the program, as sisu should operate without an rc file
# (i.e. in the case of ~ the setting is ignored and the hard coded defaults within the program are used)
# on value true system defaults are used; to change, specify e.g. editor
permission_set:
  zap: false
  css_modify: false
#  remote_base_site: true
program_set:
  rmagick: false
# wc: true
# editor: true
# postgresql: true
# sqlite: true
# tidy: true
# rexml: true
# pdflatex: true
#program_select:
# editor: 'gvim -c :R -c :S'
# pdf_viewer: 'evince'
# web_browser: 'firefox' #'iceweasel' #'epiphany' #'galeon' #'konqueror' #'kazehakase'
# console_www_browser: 'links2' #'elinks'
# odf_viewer: 'oowriter' #'abiword'
# xml_viewer: 'xml-viewer'
# man: 'nroff -man' #'groff -man -Tascii' # 'nroff -man'
#promo: sisu_icon, sisu, sisu_search_libre, open_society, fsf, ruby
#search:
# sisu:
# flag: true
## action: http://localhost:8081/cgi-bin/sisu_pgsql.cgi
# action: http://search.sisudoc.org
# db: sisu
# title: sample search form
# hyperestraier:
# flag: true
# action: http://search.sisudoc.org/cgi-bin/estseek.cgi?
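The comments at the top of the sample file say that settings in /etc/sisu are overridden by ~/.sisu, which is in turn overridden by ./_sisu. The Python sketch below shows one way to pick the most specific sisurc.yml under that rule. It is an assumption-laden illustration: the function names are hypothetical, the sketch selects a single file rather than merging settings (whether sisu merges them is not stated here), and the exists parameter is there only so the lookup can be tried without touching the filesystem.

```python
from pathlib import Path


def candidate_sisurc_paths(markup_dir, home, etc="/etc/sisu"):
    """List possible sisurc.yml locations, most specific first."""
    return [
        Path(markup_dir) / "_sisu" / "sisurc.yml",  # per markup directory
        Path(home) / ".sisu" / "sisurc.yml",        # per user
        Path(etc) / "sisurc.yml",                   # per environment
    ]


def find_sisurc(markup_dir, home, etc="/etc/sisu", exists=Path.is_file):
    """Return the first candidate for which exists(path) is true, else None."""
    for path in candidate_sisurc_paths(markup_dir, home, etc):
        if exists(path):
            return path
    return None
```

Called as find_sisurc(".", Path.home()), it checks the current markup directory first, then the user's home directory, then /etc/sisu.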
%% skin ------------------------------------------------------------------------
skins can be used to modify the appearance of output
[provide information]
%% css ------------------------------------------------------------------------
the default css may be replaced for the site or particular processing
directories
[provide information]