Diffstat (limited to 'data/doc/sisu/org/sisu.org')
-rw-r--r-- data/doc/sisu/org/sisu.org 853
1 files changed, 853 insertions, 0 deletions
diff --git a/data/doc/sisu/org/sisu.org b/data/doc/sisu/org/sisu.org
new file mode 100644
index 00000000..fdcb3eaa
--- /dev/null
+++ b/data/doc/sisu/org/sisu.org
@@ -0,0 +1,853 @@
+#+PRIORITIES: A F E
+#+OPTIONS: ^:nil _:nil
+(emacs:evil mode gifts a "vim" of enticing "alternative" powers! ;)
+(vim, my _editor_ of choice also in the emacs environment :)
+
+* What is SiSU?
+
+SiSU produces multiple output formats, with a nod to the strengths of each, and
+makes it easy to cite text across those output formats.
+
+** debian/control desc
+
+documents - structuring, publishing in multiple formats and search
+ SiSU is a lightweight markup based, command line oriented, document
+ structuring, publishing and search, static content tool for document
+ collections.
+ .
+ With minimal preparation of a plain-text (UTF-8) file, using sisu markup syntax
+ in your text editor of choice, SiSU can generate various document formats, most
+ of which share a common object numbering system for locating content, including
+ plain text, HTML, XHTML, XML, EPUB, OpenDocument text (ODF:ODT), LaTeX, PDF
+ files, and populate an SQL database with objects (roughly paragraph-sized
+ chunks) so searches may be performed and matches returned with that degree of
+ granularity. Think of being able to finely match text in documents, using
+ common object numbers, across different output formats and across languages if
+ you have translations of the same document. For search, your criteria are met
+ by these documents at these locations within each document (equally relevant
+ across different output formats and languages). To be clear (if obvious) page
+ numbers provide none of this functionality. Object numbering is particularly
+ suitable for "published" works (finalized texts as opposed to works that are
+ frequently changed or updated) for which it provides a fixed means of reference
+ of content. Document outputs can also share provided semantic meta-data.
+ .
+ SiSU also provides concordance files, document content certificates and
+ manifests of generated output and the means to make book indexes that make use
+ of its object numbering.
+ .
+ Syntax highlighting and folding (outlining) files are provided for the Vim and
+ Emacs editors.
+ .
+ Dependencies for various features are taken care of in sisu related packages.
+ The package sisu-complete installs the whole of SiSU.
+ .
+ Additional document markup samples are provided in the package
+ sisu-markup-samples which is found in the non-free archive. The license for
+ the substantive content of each marked up document provided is that provided
+ by its author or original publisher.
+ .
+ SiSU uses utf-8 & parses left to right. Currently supported languages:
+ am bg bn br ca cs cy da de el en eo es et eu fi fr ga gl he hi hr hy ia is it
+ ja ko la lo lt lv ml mr nl nn no oc pl pt pt_BR ro ru sa se sk sl sq sr sv ta
+ te th tk tr uk ur us vi zh (see XeTeX polyglossia & cjk)
+ .
+ SiSU works well under po4a translation management, for which an administrative
+ sample Rakefile is provided with sisu_manual under markup-samples.
+
+** take two
+
+SiSU may be regarded as an open access document publishing platform, applicable
+to a modest but substantial domain of documents (typically law and literature,
+but also some forms of technical writing), that is tasked to address certain
+challenges I identified as being of interest to me over the years in open
+publishing.
+
+The idea and implementation may be of interest, as some of the issues
+encountered, and that it seeks to address, are known and common to such
+endeavors. Amongst them:
+
+ * how do you ensure what you do now can be read in decades?
+ * how do you keep up with new and changing technologies?
+ * do you select a canonical format to represent your documents, if so
+ what?
+ * how do you reliably cite (locate) material in different document
+ representations?
+ * how do you deal with multilingual texts?
+ * what of search?
+ * how are documents contributed to the collection?
+
+(these questions are selected to help describe the direction of efforts with
+regard to sisu).
+
+My Dabblings in the Domain of Open Publishing
+---------------------------------------------
+
+The system is called SiSU. It is an offshoot of my early efforts at finding out
+what to make of the web, which started at the University of Tromsø in 1993 (an
+early law website Ananse/ International Trade Law Project / Lex Mercatoria). I
+have worked on SiSU continually since 1997 and it has been open source since
+2005 (under a license called GPL3+), though I remain its developer.
+
+In working in this field I have had to address some of the common issues.
+
+So how do you ensure what you do now can be read in decades to come? There are
+alternative solutions: (i) stick with a widely used and not overly complicated,
+well documented open standard, for which the likes of ODF is an excellent
+choice; (ii) alternatively go for the most basic representation of a document
+that meets your needs, in my case based on UTF-8 text and some markup tags,
+fairly easily parsable by the human eye; as long as UTF-8 is in use it will
+always be possible to extract the information.
+
+How do you keep up with new and changing technologies? Here my solution has
+been to generate new versions of the substantive content so as to always have
+the latest document representations available e.g. HTML has changed a lot over
+the years, different specifications come out for various formats including ODF,
+electronic readers have become an important viewing alternative, introducing
+the open reader format EPUB. Output representations are generated from source
+documents. Different open document file formats can be produced and databases
+and search engines populated. (The source documents and interpreter are all
+that are required to re-create site content. Source documents can be made
+public or retained privately). The strict separation of a simple source
+document from the output produced, means that with updates to SiSU (the
+interpreter/processor/generator), outputs can be updated technically as
+necessary, and new output formats added when needed. Amongst the output formats
+currently supported are HTML, LaTeX generated PDFs (A4, letter, other;
+landscape, portrait), EPUB, Open Document Format text. Returning to HTML as an
+example, it has changed a lot over the years I have worked with it; this way of
+working has meant it is possible to keep producing current versions of HTML,
+retaining the original substantive document... and new formats have been added
+as desired. There is no attempt to make output in different document
+formats/ representations look alike, let alone identical. Rather the attempt is
+to optimize output for the particular document filetype (there is no reason
+why an EPUB document should look or behave like an open document text or that a
+PDF should look like HTML output; rather PDF is optimized for paper viewing,
+HTML for screen etc.). Wherever possible features associated with the
+particular output type are taken advantage of. This freedom is made possible to
+a large extent by the answer to the question that follows.
+
+How do you reliably cite (locate) material in different document
+representations? The traditional answer has been to have a canonical
+publication, and resulting fixed page numbers. This was not a viable solution
+for HTML (which changes from one viewer to another and with selectable font
+faces & size etc.); nor is it otherwise ideal in an electronic age with the
+possibility of presenting/interacting with material/documents in so many
+different ways. Why be so restricted? Here my solution has been "object
+citation numbering". What the various generated document formats have in
+common is a shared object numbering system that identifies the location of text
+and that is available for citation purposes. Object numbers are: sequential
+numbers assigned to each identified object in a document. Objects are logical
+units of text (or equivalent parts of a document), usually paragraphs, but also
+document headings, tables, images, in a poem a verse etc. [In an electronic
+publishing age are page numbers the best we can come up with? Change font
+type, font size, page orientation, paper size (sometimes even the viewer) and
+where are you with them? And paper though a favorite medium of mine is no
+longer the sole (or sometimes primary) means of interacting with documents/text
+or of sharing knowledge]
+
+What object numbers mean (unlike page numbers) is e.g.
+
+ * if you cite text in any format, the cited text can be reliably located
+ in any other document format type. Cite HTML and the reader can choose to
+ view in EPUB or PDF (the PDFs being an independent output, generated by
+ book publishing software XeTeX/LaTeX).
+
+ * if you do a search, you can be given a result "index" indicating that your
+ search criteria are met by these documents, and at these specific locations
+ within each document, and the "index" is relevant not only for content
+ within the database, but for all document formats.
+
+ * if you have a translated text prepared for sisu, then your citations are
+ relevant across languages e.g. you can specify exactly where in a Chinese
+ document text is to be found.
+
+ * generated document index references & concordance list references etc. are
+ relevant across all output formats.
+
+What of search? For search, see the implications of object numbers for search
+mentioned above. The system currently loads an SQL server (Postgresql) with
+object sized text chunks. It could just as well populate an analytical engine
+with larger sections or chapters of text for analytical purposes (such as the
+currently popular Elasticsearch), whilst availing itself also of the concept of
+objects and object numbers in search results.
+
+How do you deal with multilingual texts? If you have translated text prepared
+for sisu, then your citations are relevant across languages. Object numbers
+also provide an easy way to compare, discuss text (translations) across
+languages. Text found/cited in one language has the same object number in its
+translations: a given paragraph will be the same in another language, just
+change the language code. (Documents are prepared in UTF-8; current language
+restrictions arise from the use of LaTeX tools, Polyglossia & CJK (Chinese,
+Japanese & Korean), and from the fact that sisu parses left to right.)
+
+How are materials prepared for contribution to the collection? (a) The easiest
+solution if the system allows is for submission in the format in which work is
+authored, usually a word processor, for which odf may be a decent selection.
+(b) I have stuck with enhanced plaintext, UTF-8 with minimal markup. Source
+documents are prepared in UTF-8 text, with a minimalist native markup to
+indicate the document structure (headings and their relative levels),
+footnotes, and other document "features". This markup is easily parsable to the
+human eye, and plays well with version control systems. Documents are prepared
+in a text editor. Front ends, such as markup assistants in a word processor that
+can save to sisu text format, or other tools, whilst possible, do not exist. [(c)
+yet another form of submission for collaborative work are wikis which have
+shown their strength in efforts such as Wikipedia.]
+
+The system has proven to be a good testing ground for ideas and is flexible and
+extensible. (things that could usefully be done: apart from a front end for
+simpler user interaction; feed text to an analytical search engine, like
+Elasticsearch/Lucene; it still needs a bibliography parser (auto-generation of
+a bibliography from footnotes); and it might be useful to allow rough
+auto-translation of documents on the fly by passing text through a translator
+(such as Google translate)).
+
+In any event, my resulting technical opinions (in my modest domain of
+action) may be regarded as encapsulated within SiSU
+[http://www.sisudoc.org/]
+
+http://www.sisudoc.org/
+http://www.jus.uio.no/sisu/
+
+git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream
+http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary
+(there may be additional commits in the upstream branch)
+git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream
+
+git clone git://git.sisudoc.org/git/doc/sisu-markup-samples.git --branch upstream
+git clone --depth 1 git://git.sisudoc.org/git/doc/sisu-markup-samples.git --branch upstream
+
+Development work is on Linux and the easiest way to install it is through the
+Debian Linux package as this takes care of optional external dependencies such
+as XeTeX for PDF output and Postgresql or Sqlite for search.
+
+** multiple document formats
+
+Text can be represented in multiple output formats with different
+characteristics that are (or may be) regarded as strengths/advantages and
+therefore preferred in different contexts.
+
+Given the different strengths and characteristics of various output formats, it
+makes little sense to try too hard to make different representations of a
+document look the same. More interesting is to have document representations that
+take advantage of each given output's strengths. As valuable if not more so is
+the ability to cite, find, discuss text with ease, across the different output
+formats.
+
+For citation across output formats, SiSU uses object citation numbers.
+
+** document structure and document objects
+
+SiSU breaks marked up text into document structure and objects.
+
+Document structure being the document heading hierarchy (having separated out
+the document header).
+
+*** What are document objects?
+An object is an identified meaningful unit of a document, most commonly a
+paragraph of text, but also for example a table, code block, verse or image.
+
+SiSU tracks these substantive document units as document objects (and their
+relationship to the document structure).
+
+** object citation numbers
+
+*** What are object citation numbers?
+
+An object citation number is a sequential number assigned to a document object.
+
+In sisu, output documents share this common object numbering system (dubbed
+"object citation numbering" (ocn)) that is meaningful (machine & human readable)
+across various digital outputs whether paper, screen, or database oriented,
+(PDF, html, XML, EPUB, sqlite, postgresql), and across multilingual content if
+prepared appropriately. This numbering system can be used to reference content
+across output types.
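+
+A minimal sketch of the idea in Ruby, not SiSU's actual parser. It assumes,
+for illustration only, that objects are blocks of text separated by blank
+lines; the names DocObject and number_objects are hypothetical.
+
```ruby
# Hypothetical sketch: assign sequential object numbers to a plain-text document.
# Assumption: objects are separated by blank lines (SiSU's real rules are richer).
DocObject = Struct.new(:ocn, :text)

def number_objects(source)
  source.split(/\n\s*\n/)          # break the text at blank lines
        .map(&:strip)
        .reject(&:empty?)
        .each_with_index
        .map { |text, i| DocObject.new(i + 1, text) } # numbering starts at 1
end

sample = "First paragraph.\n\nSecond paragraph,\nwrapped.\n\nThird."
objects = number_objects(sample)
objects.each { |o| puts "#{o.ocn}: #{o.text.tr("\n", ' ')}" }
```
+
+Because the number is attached to the object rather than to a page, the same
+number can be emitted into every output format generated from the source.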
+
+*** Why might I want object citation numbering?
+
+The ability to cite and quickly locate text can be invaluable if not essential.
+ (whether for instruction or discussion).
+
+In this digital & Internet age we have multiple ways to represent documents and
+multiple document output formats as options with different characteristics,
+strengths/advantages etc. We need a way to cite text that works and is relevant
+independent of the document format used.
+
+I want to discuss (cite) html text; how do I do this?
+How do I refer to / cite / discuss text in html?
+Issue: html may be viewed online or printed, it is not tied to paper (as
+e.g. pdf) and prints differently depending on selected font face and font size.
+
+I want to discuss (cite) text that is available in multiple formats (e.g. pdf,
+epub, html) without having to worry about the output format that is referred
+to.
+How do I refer to / discuss text that is available in more than one format,
+uncertain of what format is preferred, used or available to my colleagues?
+e.g. html and epub or pdf have rather different text representations, how do I
+discuss ...
+
+I would like to have a book index that is relevant (can be used) across multiple
+output formats (e.g. pdf, epub, html)
+
+How do I make a book index (or a concordance file) that works across multiple
+output formats?
+
+I would like to have search results indicating where in a document matches are
+found and I would like it to be relevant across available output formats (e.g.
+pdf, epub, html)
+How do I get search results for locations of text within each relevant document?
+
+I would like to be able to discuss a text that has been translated ...
+How do I find text across languages?
+Where I have a nicely translated document, how do I point to or discuss with my
+foreign language counterpart some detail of the text, or, how do I point my
+foreign language counterpart to the text I would like to bring to his
+attention?
+
+** "Granular" Search
+
+Of interest is the ease of streaming documents to a relational database, at an
+object (roughly paragraph) level and the potential for increased precision in
+the presentation of matches that results thereby. The ability to serialize
+html, LaTeX, XML, SQL, (whatever) is also inherent in / incidental to the
+design.
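+
+The effect of object-level search can be sketched in Ruby with an in-memory
+stand-in for the database table. This is an illustration only: the Row
+structure, the search method and the sample documents are all hypothetical,
+and SiSU itself populates an SQL database rather than a Ruby array.
+
```ruby
# Hypothetical in-memory stand-in for a table holding one row per object.
Row = Struct.new(:doc, :ocn, :text)

# Return a hash of document name => object numbers whose text contains the term.
def search(rows, term)
  needle = term.downcase
  rows.select { |r| r.text.downcase.include?(needle) }
      .group_by(&:doc)
      .transform_values { |hits| hits.map(&:ocn) }
end

rows = [
  Row.new("doc_a", 1, "On copyright and culture"),
  Row.new("doc_a", 2, "A paragraph about software"),
  Row.new("doc_a", 3, "Copyright terms have grown longer"),
  Row.new("doc_b", 1, "An unrelated heading"),
  Row.new("doc_b", 2, "Notes on copyright law")
]

search(rows, "copyright").each do |doc, ocns|
  puts "#{doc}: objects #{ocns.join(', ')}"
end
```
+
+The result is a list of documents together with the object numbers at which
+matches occur, which is the kind of "index" described above.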
+
+** Summary
+SiSU information Structuring Universe
+Structured information, Serialized Units <www.sisudoc.org> or
+<www.jus.uio.no/sisu/> software for electronic texts, document collections,
+books, digital libraries, and search, with "atomic search" and text positioning
+system (shared text citation numbering: "ocn")
+outputs include: plaintext, html, XHTML, XML, ODF (OpenDocument), EPUB, LaTeX,
+PDF, SQL (PostgreSQL and SQLite)
+
+** SiSU Short Description
+
+SiSU is a comprehensive future-resilient electronic document management system.
+Built-in search capabilities allow you to search across multiple documents and
+highlight matches in an easy-to-follow format. A paragraph numbering system
+allows you to cite your electronic documents in a consistent manner across
+multiple file formats. Multiple format outputs allow you to display your
+documents in plain text, PDF (portrait and landscape), OpenDocument format,
+HTML, or e-book reading format (EPUB). Word mapping allows you to easily create
+word indexes for your documents. Future-resilient flexibility allows you to
+quickly adapt your documents to newer output formats as needed. All these and
+many other features are achieved with little or no additional work on your
+documents, by marking up the documents with a very simple markup language and
+leaving the SiSU engine to handle the heavy-lifting processing.
+
+Potential users of SiSU include individual authors who want to publish their
+books or articles electronically to reach a broad audience, web publishers who
+want to provide multiple channels of access to their electronic documents, or
+any organizations which centrally manage a medium or large set of electronic
+documents, especially governmental organizations which may prefer to keep their
+documents in easily accessible yet non-proprietary formats.
+
+SiSU is an Open Source project initiated and led by Ralph Amissah
+<ralph.amissah@gmail.com>, who can be contacted via the mailing list
+<http://lists.sisudoc.org/listinfo/sisu> at <sisu@lists.sisudoc.org>. SiSU is
+licensed under the GNU General Public License.
+
+*** notes
+
+For less markup than the most elementary HTML you can have more. SiSU -
+Structured information, Serialized Units for electronic documents, is an
+information structuring, transforming, publishing and search framework with the
+following features:
+
+(i) markup syntax: (a) simpler than html, (b) mnemonic, influenced by
+mail/messaging/wiki markup practices, (c) human readable, and easily writable,
+
+(ii) (a) minimal markup requirement, (b) single file marked up for multiple outputs,
+
+ * documents are prepared in a single UTF-8 file using a minimalistic mnemonic
+syntax. Typical literature, documents like "War and Peace" require almost no
+markup, and most of the headers are optional.
+
+ * markup is easily readable/parsed by the human eye, (basic markup is simpler
+and more sparse than the most basic html), [this may also be converted to XML
+representations of the same input/source document].
+
+ * markup defines document structure (this may be done once in a header
+pattern-match description, or for heading levels individually); basic text
+attributes (bold, italics, underscore, strike-through etc.) as required; and
+semantic information related to the document (header information, extended
+beyond the Dublin core and easily further extended as required); the headers
+may also contain processing instructions.
+
+(iii) (a) multiple output formats, including amongst others: plaintext (UTF-8);
+html; (structured) XML; ODF (Open Document text); EPUB; LaTeX; PDF (via LaTeX);
+SQL type databases (currently PostgreSQL and SQLite). SiSU produces:
+concordance files; document content certificates (md5 or sha256 digests of
+headings, paragraphs, images etc.) and html manifests (and sitemaps of
+content). (b) takes advantage of the strengths implicit in these very different
+output types, (e.g. PDFs produced using typesetting of LaTeX, databases
+populated with documents at an individual object/paragraph level, making
+possible granular search (and related possibilities))
+
+(iv) outputs share a common numbering system (dubbed "object citation
+numbering" (ocn)) that is meaningful (to man and machine) across various
+digital outputs whether paper, screen, or database oriented, (PDF, html, XML,
+EPUB, sqlite, postgresql), this numbering system can be used to reference
+content.
+
+(v) SQL databases are populated at an object level (roughly headings,
+paragraphs, verse, tables) and become searchable with that degree of
+granularity, the output information provides the object/paragraph numbers which
+are relevant across all generated outputs; it is also possible to look at just
+the matching paragraphs of the documents in the database; [output indexing also
+works well with search indexing tools like Hyper Estraier].
+
+(vi) use of semantic meta-tags in headers permits the addition of semantic
+information on documents, (the available fields are easily extended)
+
+(vii) creates organised directory/file structure for (file-system) output,
+easily mapped with its clearly defined structure, with all text objects
+numbered, you know in advance where in each document output type, a bit of text
+will be found (e.g. from an SQL search, you know where to go to find the
+prepared html output or PDF etc.)... there is more; easy directory management
+and document associations, the document preparation (sub-)directory may be used
+to determine output (sub-)directory, the skin used, and the SQL database used,
+
+(viii) "Concordance file" wordmap, consisting of all the words in a document
+and their (text/ object) locations within the text, (and the possibility of
+adding vocabularies),
+
+(ix) document content certification and comparison considerations: (a) the
+document and each object within it stamped with an sha256 hash making it
+possible to easily check or guarantee that the substantive content of a document
+is unchanged, (b) version control, documents integrated with time based source
+control system, default RCS or CVS with use of $Id$ tag, which SiSU checks
+
+(x) SiSU's minimalist markup makes for meaningful "diffing" of the substantive
+content of markup-files,
+
+(xi) easily skinnable, document appearance on a project/site wide, directory
+wide, or document instance level easily controlled/changed,
+
+(xii) in many cases a regular expression may be used (once in the document
+header) to define all or part of a document's structure obviating or reducing
+the need to provide structural markup within the document,
+
+(xiii) prepared files may be batch processed, documents produced are static files
+so this needs to be done only once but may be repeated for various reasons as
+desired (updated content, addition of new output formats, updated technology
+document presentations/representations)
+
+(xiv) possible to pre-process, which permits: the easy creation of standard
+form documents, and templates/term-sheets, or; building of composite documents
+(master documents) from other sisu marked up documents, or marked up parts,
+i.e. import documents or parts of text into a main document should this be
+desired
+
+(xv) there is a considerable degree of future-resilience, output representations
+are "upgradeable", and new document formats may be added: (a) modular, (thanks
+in no small part to Ruby) another output format required, write another
+module.... (b) easy to update output formats (eg html, XHTML, LaTeX/PDF
+produced can be updated in program and run against whole document set), (c)
+easy to add, modify, or have alternative syntax rules for input, should you
+need to,
+
+(xvi) scalability, dependent on your file-system (ext3, Reiserfs, XFS,
+whatever) and on the relational database used (currently Postgresql and
+SQLite), and your hardware,
+
+(xvii) only marked up files need be backed up, to secure the larger document
+set produced,
+
+(xviii) document management,
+
+(xix) Syntax highlighting for SiSU markup is available for a number of text
+editors.
+
+(xx) remote operations: (a) run SiSU on a remote server, (having prepared sisu
+markup documents locally or on that server, i.e. this solution where sisu is
+installed on the remote server, would work whatever type of machine you chose
+to prepare your markup documents on), (b) generated document outputs may be
+posted by sisu to remote sites (using rsync/scp) (c) document source (plaintext
+utf-8) if shared on the net may be identified by its url and processed locally
+to produce the different document outputs.
+
+(xxi) document source may be bundled together (automatically) with associated
+documents (multiple language versions or master document with inclusions) and
+images and sent as a zip file called a sisupod, if shared on the net these too
+may be processed locally to produce the desired document outputs, these may be
+downloaded, shared as email attachments, or processed by running sisu against
+them, either using a url or the filename.
+
+(xxii) for basic document generation, the only software dependency is Ruby, and
+a few standard Unix tools (this covers plaintext, html, XML, ODF, EPUB, LaTeX).
+To use a database you of course need that, and to convert the LaTeX generated
+to PDF, a LaTeX processor like tetex or texlive.
+
+as a developer's tool it is flexible and extensible
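+
+The "concordance file" of item (viii) can be pictured as a map from each word
+to the object numbers in which it occurs. The Ruby below is a rough sketch
+under that reading, not SiSU's implementation; the concordance method and the
+sample objects are hypothetical.
+
```ruby
# Hypothetical sketch: build a word => [object numbers] map.
# Input is a hash of object number => object text.
def concordance(objects)
  index = Hash.new { |hash, word| hash[word] = [] }
  objects.each do |ocn, text|
    # Record each distinct word once per object, ignoring case.
    text.downcase.scan(/[[:alpha:]]+/).uniq.each { |word| index[word] << ocn }
  end
  index.sort.to_h # alphabetical order, as in a printed word index
end

objects = { 1 => "The quick fox", 2 => "The lazy dog", 3 => "A quick dog" }
concordance(objects).each { |word, ocns| puts "#{word}: #{ocns.join(', ')}" }
```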
+
+** description
+
+SiSU ("SiSU information Structuring Universe" or "Structured information,
+Serialized Units") [1] is a Unix command line oriented framework for document
+structuring, publishing and search, featuring minimalistic markup, multiple
+standard outputs, a common citation system, and granular search. Using markup
+applied to a document, SiSU can produce plain text, HTML, XHTML, XML,
+OpenDocument, LaTeX or PDF files, and populate an SQL database with objects [2]
+(equating generally to paragraph-sized chunks) so searches may be performed and
+matches returned with that degree of granularity (e.g. your search criteria are
+met by these documents and at these locations within each document). Document
+output formats share a common object numbering system for locating content.
+This is particularly suitable for "published" works (finalized texts as opposed
+to works that are frequently changed or updated) for which it provides a fixed
+means of reference of content.
+
+*** How it works
+
+SiSU markup is fairly minimalistic; it consists of: a (largely optional)
+document header, made up of information about the document (such as when it was
+published, who authored it, and granting what rights) and any processing
+instructions; and markup within text which is related to document structure and
+typeface. SiSU must be able to discern the structure of a document, (text
+headings and their levels in relation to each other), either from information
+provided in the instruction header or from markup within the text (or from a
+combination of both). Processing is done against an abstraction of the document
+comprising information on the document's structure and its objects [2], which
+the program serializes (providing the object numbers) and which are assigned
+hash sum values based on their content. This abstraction of information about
+document structure, objects, (and hash sums), provides considerable flexibility
+in representing documents in different ways and for different purposes (e.g.
+search, document layout, publishing, content certification, concordance etc.),
+and makes it possible to take advantage of some of the strengths of established
+ways of representing documents, (or indeed to create new ones).
+
+[1] also chosen for the meaning of the Finnish term "sisu".
+
+[2] objects include: headings, paragraphs, verse, tables, images, but not
+footnotes/endnotes which are numbered separately and tied to the object from
+which they are referenced.
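+
+To illustrate how per-object hash sums support content certification, here is
+a small Ruby sketch. It assumes each object is hashed on its plain text with
+SHA-256; exactly what SiSU feeds into its digests is not stated here, and the
+names certify and changed_objects are hypothetical.
+
```ruby
require "digest"

# Hypothetical sketch: pair each object number with a digest of the object's text.
def certify(objects)
  objects.each_with_index.map do |text, i|
    { ocn: i + 1, sha256: Digest::SHA256.hexdigest(text) }
  end
end

# Report the object numbers whose digests differ between two certificates.
# Objects missing from the second certificate are reported as changed.
def changed_objects(before, after)
  before.zip(after)
        .reject { |b, a| a && b[:sha256] == a[:sha256] }
        .map { |b, _| b[:ocn] }
end

original = ["A heading", "First paragraph.", "Second paragraph."]
revised  = ["A heading", "First paragraph, edited.", "Second paragraph."]

puts changed_objects(certify(original), certify(revised)).inspect
```
+
+Comparing digests object by object shows where a document's substantive
+content has changed, not merely that it has.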
+
+More information on SiSU provided at: <www.sisudoc.org/sisu/SiSU>
+
+SiSU was developed in relation to legal documents, and is strong across a wide
+variety of texts (law, literature...(humanities, law and part of the social
+sciences)). SiSU handles images but is not suitable for formulae/ statistics,
+or for technical writing at this time.
+
+SiSU has been developed and has been in use for several years. Requirements to
+cover a wide range of documents within its use domain have been explored.
+
+<ralph@amissah.com>
+<ralph.amissah@gmail.com>
+<sisu@lists.sisudoc.org>
+<http://lists.sisudoc.org/listinfo/sisu>
+2010
+w3 since October 3 1993
+* Finding SiSU
+** source
+http://git.sisudoc.org/gitweb/
+
+*** sisu
+sisu git repo:
+http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary
+
+**** most recent source without repo history
+git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream
+**** full clone
+git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream
+
+*** sisu-markup-samples git repo:
+http://git.sisudoc.org/gitweb/?p=doc/sisu-markup-samples.git;a=summary
+
+** mailing list
+sisu at lists.sisudoc.org
+http://lists.sisudoc.org/listinfo/sisu
+
+** irc oftc #sisu
+
+** home pages
+ <http://www.sisudoc.org/>
+ <http://search.sisudoc.org/>
+ <http://www.jus.uio.no/sisu>
+
+* Installation
+
+** where you take responsibility for having the correct dependencies
+
+Provided you have *Ruby*, *SiSU* can be run.
+
+SiSU should be run from the directory containing your sisu marked up document
+set.
+
+This works fine so long as you already have sisu external dependencies in
+place. For many operations such as html, epub, odt this is likely to be fine.
+Note however, that additional external package dependencies, such as texlive
+(for pdfs), sqlite3 or postgresql (for search) should you desire to use them
+are not taken care of for you.
+
+*** run off the source tarball without installation
+
+RUN OFF SOURCE PACKAGE DIRECTORY TREE (WITHOUT INSTALLING)
+..........................................................
+
+**** 1. Obtain the latest sisu source
+
+using git:
+
+http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary
+http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=log
+
+ git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream
+ git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream
+
+or, identify latest available source:
+
+https://packages.debian.org/sid/sisu
+http://packages.qa.debian.org/s/sisu.html
+http://qa.debian.org/developer.php?login=sisu@lists.sisudoc.org
+
+http://sisudoc.org/sisu/archive/pool/main/s/sisu/
+
+and download the:
+
+ sisu_5.4.5.orig.tar.xz
+
+using debian tool dget:
+
+The dget tool is included within the devscripts package
+https://packages.debian.org/search?keywords=devscripts
+to install dget install devscripts:
+
+ apt-get install devscripts
+
+and then you can get it from Debian:
+ dget -xu http://ftp.fi.debian.org/debian/pool/main/s/sisu/sisu_5.4.5-1.dsc
+
+or off sisu repos
+ dget -x http://www.jus.uio.no/sisu/archive/pool/main/s/sisu/sisu_5.4.5-1.dsc
+or
+ dget -x http://sisudoc.org/sisu/archive/pool/main/s/sisu/sisu_5.4.5-1.dsc
+
+**** 2. Unpack the source
+
+Provided you have *Ruby*, *SiSU* can be run without installation straight from
+the source package directory tree.
+
+Run ruby against the full path to bin/sisu (in the unzipped source package
+directory tree). SiSU should be run from the directory containing your sisu
+marked up document set.
+
+ ruby ~/sisu-5.4.5/bin/sisu --html -v document_name.sst
+
+This works fine so long as you already have sisu external dependencies in
+place. For many operations such as html, epub, odt this is likely to be fine.
+Note however, that additional external package dependencies, such as texlive
+(for pdfs), sqlite3 or postgresql (for search) should you desire to use them
+are not taken care of for you.
+
+*** gem install (with rake)
+
+(i) create the gemspec; (ii) build the gem (from the gemspec); (iii) install
+the gem
+
+Provided you have ruby & rake, this can be done with the single command:
+
+ rake gem_create_build_install
+
+to build and install sisu v5 & sisu v6, alias gemcbi
+
+separate gems are made/installed for sisu v5 & sisu v6 contained in source.
+
+to build and install sisu v5, alias gem5cbi:
+
+ rake gem_create_build_install_stable
+
+to build and install sisu v6, alias gem6cbi:
+
+ rake gem_create_build_install_unstable
+
+for individual steps (create, build, install) see rake options, rake -T
+
+to specify sisu version for sisu installed via gem
+
+ gem search sisu
+
+ sisu _5.4.5_ --version
+
+ sisu _6.0.11_ --version
+
+to uninstall sisu installed via gem
+
+ sudo gem uninstall --verbose sisu
+
+For a list of alternative actions you may type:
+
+ rake help
+
+ rake -T
+
+Rake: <http://rake.rubyforge.org/> <http://rubyforge.org/frs/?group_id=50>
+
+*** installation with setup.rb
+
+this is a three step process; in the root directory of the unpacked *SiSU*
+source type (running the install step as root):
+
+ruby setup.rb config
+ruby setup.rb setup
+#[as root:]
+ruby setup.rb install
+
+further information:
+<http://i.loveruby.net/en/projects/setup/>
+<http://i.loveruby.net/en/projects/setup/doc/usage.html>
+
+ ruby setup.rb config && ruby setup.rb setup && sudo ruby setup.rb install
+
+** Debian install
+
+*SiSU* is available off the *Debian* archives. It should be necessary only to
+run, as root, using apt-get:
+
+ apt-get update
+
+ apt-get install sisu-complete
+
+(all sisu dependencies should be taken care of)
+
+If there are newer versions of *SiSU* upstream, they will be available by
+adding the following to your sources list /etc/apt/sources.list
+
+#/etc/apt/sources.list
+
+deb http://www.jus.uio.no/sisu/archive unstable main non-free
+deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
+
+The non-free section is for sisu markup samples provided, which contain
+authored works the substantive text of which cannot be changed, and which as a
+result do not meet the debian free software guidelines.
+
+*SiSU* is developed on *Debian*, and packages are available for *Debian* that
+take care of the dependencies encountered on installation.
+
+The package is divided into the following components:
+
+ *sisu*, the base code, (the main package on which the others depend), without
+ any dependencies other than ruby (and for convenience the ruby webrick web
+ server), this generates a number of types of output on its own, other
+ packages provide additional functionality, and have their dependencies
+
+ *sisu-complete*, a dummy package that installs the whole of greater sisu as
+ described below, apart from sisu-examples
+
+ *sisu-pdf*, dependencies used by sisu to produce pdf from the /LaTeX/ it
+ generates
+
+ *sisu-postgresql*, dependencies used by sisu to populate postgresql database
+ (further configuration is necessary)
+
+ *sisu-sqlite*, dependencies used by sisu to populate sqlite database
+
+ *sisu-markup-samples*, sisu markup samples and other miscellany (under
+ *Debian* Free Software Guidelines non-free)
+
+ *SiSU* is available off Debian Unstable and Testing:
+ <http://packages.debian.org/cgi-bin/search_packages.pl?searchon=names&subword=1&version=all&release=all&keywords=sisu>
+ Install it using apt-get, aptitude or alternative *Debian* install tools.
+
+** Arch Linux
+
+* sisu markup :sisu:markup:
+
+** sisu markup
+
+#% structure - headings, levels
+ * headings (A-D, 1-3)
+ * inline
+ 'A~ ' NOTE title level
+ 'B~ ' NOTE optional
+ 'C~ ' NOTE optional
+ 'D~ ' NOTE optional
+ '1~ ' NOTE chapter level
+ '2~ ' NOTE optional
+ '3~ ' NOTE optional
+ '4~ ' NOTE optional :consider:
+ * node
+ * parent
+ * children
+
+#% font face NOTE open & close marks, inline within paragraph
+ * emphasize '*{ ... }*' NOTE configure whether bold italics or underscore, default bold
+ * bold '!{ ... }!'
+ * italics '/{ ... }/'
+ * underscore '_{ ... }_'
+ * superscript '^{ ... }^'
+ * subscript ',{ ... },'
+ * strike '-{ ... }-'
+ * add '+{ ... }+'
+ * monospace '#{ ... }#'
+#% para NOTE paragraph controls are at the start of a paragraph
+ * a para is a block of text separated from others by an empty line
+ * indent
+ * default, all '_1 ' up to '_9 '
+ * first line hang '_1_0 '
+ * first line indent further '_0_1 '
+ * bullet
+ [levels 1-6]
+ '_* '
+ '_1* '
+ '_2* '
+ * numbered list
+ [levels 1-3]
+ '# '
+
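+The inline and paragraph markers listed above might be combined as in the
+sketch below, which writes a small fragment to a scratch file. The file name
+and sample wording are illustrative assumptions, and the fragment is not a
+complete sisu document (it has no document header or headings).
+
```shell
#!/bin/sh
# Hypothetical scratch file; any path would do.
sample="${TMPDIR:-/tmp}/markup_sketch.sst"

# Quoted 'EOF' keeps the shell from expanding anything inside the sample.
cat > "$sample" <<'EOF'
Emphasis is written *{like this}*, bold !{like this}!, italics /{like this}/.~{ A footnote is placed inline, where its reference should appear. }~

_1 A paragraph indented one level.

_* a bullet

_1* a bullet using the '_1* ' marker

# a numbered list item
EOF

echo "wrote $sample"
```
+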
+#% blocks NOTE text blocks that are not to be treated in the way that ordinary paragraphs would be
+ * code
+ * [type of markup if any]
+ * poem
+ * group
+ * alt
+ * tables
+#% boxes
+ NOTE grouped text with code block type color & possibly default image, warning, tip, red, blue etc. decide [NB N/A not implemented]
+
+#% notes NOTE inline within paragraph at the location where the note reference is to occur
+ * footnotes '~{ ... }~'
+ * [bibliography] [NB N/A not implemented]
+
+#% links, linking
+ * links - external, web, url
+ * links - internal
+
+#% images [multimedia?]
+ * images
+ * [base64 inline] [N/A not implemented]
+
+#% object numbers
+ * ocn (object numbers)
+ automatically attributed to substantive objects, paragraphs, tables, blocks, verse (unless exclude marker provided)
+
+#% contents
+ * toc (table of contents)
+ autogenerated from structure/headings information
+ * index (book index)
+ built from hints given on a new line of text following a paragraph and starting with ={}; has identifying rules for main and subsidiary text
+
+#% breaks
+ * line break ' \\ ' inline
+ * page break, column break ' -\\- ' start of line, breaks a column, starts a new column, if using columns, else breaks the page, starts a new page.
+ * page break, page new ' =\\= ' start of line, breaks the page, starts a new page.
+ * horizontal '-..-' start of line, rule page (break) line across page (dividing paragraphs)
+
+#% book type index
+
+#% comment
+ * comment
+
+#% misc
+ * term & definition
+
+** syntax highlighting
+
+*** vim
+data/sisu/conf/editor-syntax-etc/vim/
+data/sisu/conf/editor-syntax-etc/vim/syntax/sisu.vim
+
+*** emacs
+data/sisu/conf/editor-syntax-etc/emacs/
+data/sisu/conf/editor-syntax-etc/emacs/sisu-mode.el
+
+* todo
+sisu_todo.org