Diffstat (limited to 'data/doc/manuals_generated/sisu_manual/sisu_description/scroll.xhtml')
-rw-r--r-- | data/doc/manuals_generated/sisu_manual/sisu_description/scroll.xhtml | 2520 |
1 files changed, 2520 insertions, 0 deletions
diff --git a/data/doc/manuals_generated/sisu_manual/sisu_description/scroll.xhtml b/data/doc/manuals_generated/sisu_manual/sisu_description/scroll.xhtml new file mode 100644 index 00000000..a4f681bf --- /dev/null +++ b/data/doc/manuals_generated/sisu_manual/sisu_description/scroll.xhtml @@ -0,0 +1,2520 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<?xml-stylesheet type="text/css" href="../_sisu/css/xhtml.css"?> +<!-- Document processing information: + * Generated by: SiSU 0.59.0 of 2007w38/0 (2007-09-23) + * Ruby version: ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux] + * + * Last Generated on: Sun Sep 23 04:12:20 +0100 2007 + * SiSU http://www.jus.uio.no/sisu +--> + +<document> +<head> + <meta http-equiv="Content-Type" content="text/html;charset=utf-8" /> + <meta>Title:</meta> + <title class="dc"> + SiSU - SiSU information Structuring Universe / Structured information, Serialized Units - Description + </title> + <br /> + <meta>Creator:</meta> + <creator class="dc"> + Ralph Amissah + </creator> + <br /> + <meta>Rights:</meta> + <rights class="dc"> + Copyright (C) Ralph Amissah 2007, part of SiSU documentation, License GPL 3 + </rights> + <br /> + <meta>Type:</meta> + <type class="dc"> + information + </type> + <br /> + <meta>Subject:</meta> + <subject class="dc"> + ebook, epublishing, electronic book, electronic publishing, electronic document, electronic citation, data structure, citation systems, search + </subject> + <br /> + <meta>Date created:</meta> + <date_created class="extra"> + 2002-11-12 + </date_created> + <br /> + <meta>Date issued:</meta> + <date_issued class="extra"> + 2002-11-12 + </date_issued> + <br /> + <meta>Date available:</meta> + <date_available class="extra"> + 2002-11-12 + </date_available> + <br /> + <meta>Date modified:</meta> + <date_modified class="extra"> + 2007-08-30 + </date_modified> + <br /> + <meta>Date:</meta> + <date class="dc"> + 2007-08-30 + </date> + <br /> +</head> +<body> +<object id="1"> + <text 
class="h1"> + SiSU - SiSU information Structuring Universe / Structured information, +Serialized Units - Description,<br /> Ralph Amissah + </text> + <ocn>1</ocn> +</object> +<object id="2"> + <text class="h2"> + SiSU an attempt to describe + </text> + <ocn>2</ocn> +</object> +<object id="3"> + <text class="h4"> + 1. Description + </text> + <ocn>3</ocn> +</object> +<object id="4"> + <text class="h5"> + 1.1 Outline + </text> + <ocn>4</ocn> +</object> +<object id="5"> + <text class="norm"> + <b>SiSU</b> is a flexible document preparation, generation publishing +and search system.<en>1</en> + </text> + <endnote notenumber="1"> + 1. This information was first placed on the web 12 November 2002; with +predating material taken from <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/lm.information/toc.html">http://www.jus.uio.no/lm/lm.information/toc.html</link>> +part of a site started and developed since 1993. See document metadata +section <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/metadata.html">http://www.jus.uio.no/sisu/SiSU/metadata.html</link>> +for information on this version. Dates related to the development of +<b>SiSU</b> are mostly contained within the Chronology section of this +document, e.g. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sisu_chronology">http://www.jus.uio.no/sisu/sisu_chronology</link>> + </endnote> + <ocn>5</ocn> +</object> +<object id="6"> + <text class="norm"> + <b>SiSU</b> ("<b>SiSU</b> information Structuring Universe" or +"Structured information, Serialized Units"),<en>2</en> is a Unix +command line oriented framework for document structuring, publishing +and search. Featuring minimalistic markup, multiple standard outputs, a +common citation system, and granular search. + </text> + <endnote notenumber="2"> + 2. 
also chosen for the meaning of the Finnish term "sisu". + </endnote> + <ocn>6</ocn> +</object> +<object id="7"> + <text class="norm"> + Using markup applied to a document, <b>SiSU</b> can produce plain text, +HTML, XHTML, XML, OpenDocument, LaTeX or PDF files, and populate an SQL +database with objects<en>3</en> (equating generally to paragraph-sized +chunks) so searches may be performed and matches returned with that +degree of granularity (e.g. your search criteria is met by these +documents and at these locations within each document). Document output +formats share a common object numbering system for locating content. +This is particularly suitable for "published" works (finalized texts as +opposed to works that are frequently changed or updated) for which it +provides a fixed means of reference of content. + </text> + <endnote notenumber="3"> + 3. objects include: headings, paragraphs, verse, tables, images, but not +footnotes/endnotes which are numbered separately and tied to the object +from which they are referenced. + </endnote> + <ocn>7</ocn> +</object> +<object id="8"> + <text class="norm"> + <b>SiSU</b> is the data/information structuring and transforming tool, +that has resulted from work on one of the oldest law web projects. It +makes possible the one time, simple human readable markup of documents, +that <b>SiSU</b> can then publish in various forms, suitable for +paper<en>4</en>, web<en>5</en> and relational database<en>6</en> +presentations, retaining common data-structure and meta-information +across the output/presentation formats. Several requirements of legal +and scholarly publication on the web have been addressed, including the +age old need to be able to reliably cite/pinpoint text within a +document, to easily make footnotes/endnotes, to allow for semantic +document meta-tagging, and to keep required markup to a minimum. These +and other features of interest are listed and described below. 
A few +points are worth making early (and will be repeated a number of times): + </text> + <endnote notenumber="4"> + 4. pdf via LaTeX or lout + </endnote> + <endnote notenumber="5"> + 5. currently html (two forms of html presentation one based on css the +other on tables), and <i>PHP</i>; potentially structured XML + </endnote> + <endnote notenumber="6"> + 6. any SQL - currently PostgreSQL and <i>sqlite</i> (for portability, +testing and development) + </endnote> + <ocn>8</ocn> +</object> +<object id="9"> + <text class="indent1"> + (i) The <b>SiSU</b> document generator was the first to place +material on the web with a system that makes possible citation across +different document types, with paragraph, or rather object citation +numbering<en>7</en> a text positioning system, available for the +pinpointing of text, 1997, a simple idea from which much benefit, and +<b>SiSU</b> remains today, to the best of my knowledge, the only +multiple format e-book/ electronic-document system on the web that +gives you this possibility (including for relational databases). + </text> + <endnote notenumber="7"> + 7. previously called "text object numbering" + </endnote> + <ocn>9</ocn> +</object> +<object id="10"> + <text class="indent1"> + (ii) Markup is done once for the multiple formats produced. + </text> + <ocn>10</ocn> +</object> +<object id="11"> + <text class="indent1"> + (iii) Markup is simple, and human readable (with a little +practice), in almost all cases there is less and simpler markup +required than basic html. In any event the markup required is very much +simpler than the html, LaTeX, [lout], structured XML, ODF +(OpenDocument), PostgreSQL or SQLite feed etc. that you can have +<b>SiSU</b> generate for you. + </text> + <ocn>11</ocn> +</object> +<object id="12"> + <text class="indent1"> + (iv) <b>SiSU</b> is a batch processor, dealing with as many files +as you need to generate at a time. 
+ </text> + <ocn>12</ocn> +</object> +<object id="13"> + <text class="indent1"> + (v) Scalability is dependent on your file system (in my case +Reiserfs), the database (currently Postgresql and/or SQLite) and your +hardware. + </text> + <ocn>13</ocn> +</object> +<object id="14"> + <text class="norm"> + <b>SiSU</b> Sabaki<en>8</en> (or just <b>SiSU</b>) is the provisional +name given to the software described here that helps structure +documents for web and other publication. The name <b>SiSU</b> is a +loose anagram for something along the lines of <b><i>"SiSU is +structuring unit"</i></b>, or <i>"<b>SiSU</b>, information structuring +unit"</i> or the more descriptive <i>"Structured information, +Serialized Units"</i> or <b><i>"simple - information structuring +unit"</i></b> or the more descriptive <i>"Structured information, +Serialized Units"</i> or what it may be directed towards +<i>"<b>semantic</b> and <b>information structuring universe</b>" +</i>,<en>9</en> tongue in cheek, only just. Guess I'll get away with +<b><i>"Simple - information Structuring Universe"</i></b>. <b>SiSU</b> +is also a Finnish word roughly meaning guts, inner strength and +perseverance.<en>10</en> + </text> + <endnote notenumber="8"> + 8. <b>SiSU</b> Sabaki, release version. Pre-release version <b>SiSU</b> +Scribe, and version prior to that <b>SiSU</b> nicknamed Scribbler. +Pre-release versions go back several years. Both Scribbler and Scribe +(still maintained) made system calls to <b>SiSU</b>'s various parts, +instead of using libraries. + </endnote> + <endnote notenumber="9"> + 9. A little universe it may be, but semantic you may have a hard time +getting away with, given the meaning the word has taken on with markup. +On a document wide basis semantic information may be provided, which +can be really useful, (and meaningful, especially) if you have a large +document set, and use this with rss feeds or in an sql database etc. 
On +a markup level, I have little inclination to add semantic markup +formally beyond references, title, author [Dublin Core entities? +addresses?] etc. Actually this deserves a bit of thought possibly use +letter tags (including letter alias/synonyms for font faces) to create +a small set of default semantic tags, with the possibility for per +document adjustments. Will seek to permit XML entity tagging, within +<b>SiSU</b> markup and have that ignored/removed by the parts of the +program that have no use for it. + </endnote> + <endnote notenumber="10"> + 10. "Sisu refers not to the courage of optimism, but to a concept of +life that says, 'I may not win, but I will gladly give my life for what +I believe.'" Aini Rajanen, Of Finnish Ways, 1981, p. 10.<br /> +<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.humanlanguages.com/finnishenglish/rlfs.htm">http://www.humanlanguages.com/finnishenglish/rlfs.htm</link>> +<br /> "Every Finn has his own pet definition. To me, sisu means +patience without passion. But there are many varieties of sisu. Sisu +can be a sudden outburst or it can be the kind that lasts. A man can +have both kinds. It is outside reason. It is something in the soul. It +comes from oneself. For instance, it makes a soldier do things because +he himself must, not because he has been told." Paavo Nurmi<br /> +<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://personalweb.smcvt.edu/tmatikainen/finnishtraditions.htm">http://personalweb.smcvt.edu/tmatikainen/finnishtraditions.htm</link>> + </endnote> + <ocn>14</ocn> +</object> +<object id="15"> + <text class="norm"> + <b>SiSU</b> was born of the need to find a way, with minimal effort, +and for as wide a range of document types as possible, to produce high +quality publishing output in a variety of document formats. 
As such it +was necessary to find a simple document representation that would work +across a large number of document types, and the most convenient way(s) +to produce acceptable output formats. The project leading to this +program was started in 1993 (together with the trade law project now +known as Lex Mercatoria) as an investigation of how to +effectively/efficiently place documents on the web. The unified +document handling, together with features such as paragraph numbering, +endnote handling and tables... appeared in 1996/97. <b>SiSU</b> was +originally written in Perl,<en>11</en> and converted to <b>Ruby</b>, +<en>12</en> in 2000, one of the most impressive programming languages +in existence! In its current form it has been written to run on the +<b>Gnu</b> /Linux platform, and in particular on <b>Debian</b>, +<en>13</en> taking advantage of many of the wonderful projects that are +available there. + </text> + <endnote notenumber="11"> + 11. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.perl.org/">http://www.perl.org/</link>> + </endnote> + <endnote notenumber="12"> + 12. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.ruby-lang.org/en/">http://www.ruby-lang.org/en/</link>> + </endnote> + <endnote notenumber="13"> + 13. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.debian.org/">http://www.debian.org/</link>> + </endnote> + <ocn>15</ocn> +</object> +<object id="16"> + <text class="norm"> + <b>SiSU</b> markup is based on requiring the minimum markup needed to +determine the structure of a document. (This can be as little as saying +in a header to look for the word Book at a specified level and the word +Chapter at another level). <b>SiSU</b> then breaks a document into its +smallest parts (at a heading, and paragraph level) while retaining all +structural information. 
This break up of the document and information +on its structure is taken advantage of in the transformations made in +generating the very different output types that can be created, and in +providing as much as can be for what each output type is best at doing, +e.g. LaTeX (professional document typesetting, easy conversion to pdf +or Postscript), XML (in this case, structural representation), ODF +(OpenDocument [experimental]), SQL (e.g. document search; representing +constituent parts of documents based on their structure, headings, +chapters, paragraphs as required; user control).<en>14</en> + </text> + <endnote notenumber="14"> + 14. where explicit structure is provided through the use of tagging +headings, it could be reduced (still) further, for example by reducing +the number of characters used to identify heading levels; but in many +cases even that information is not required as regular expressions can +be used to extract the implicit structure. + </endnote> + <ocn>16</ocn> +</object> +<object id="17"> + <text class="norm"> + From markup that is simpler and more sparse than html you get: + </text> + <ocn>17</ocn> +</object> +<object id="18"> + <text class="indent_bullet"> + far greater output possibilities, including html, XML, ODF +(OpenDocument), LaTeX (pdf), and SQL; + </text> + <ocn>18</ocn> +</object> +<object id="19"> + <text class="indent_bullet"> + the advantages implicit in the very different output possibilities; + </text> + <ocn>19</ocn> +</object> +<object id="20"> + <text class="indent_bullet"> + a common citation system (for all outputs - including the relational +database, search results are relevant for all outputs); + </text> + <ocn>20</ocn> +</object> +<object id="21"> + <text class="norm"> + For more see the short summary of features provided below. 
+ </text> + <ocn>21</ocn> +</object> +<object id="22"> + <text class="norm"> + <b>SiSU</b> processes files with minimal tagging to produce various +document outputs including html, LaTeX or lout (which is converted to +pdf) and if required loads the structured information into an SQL +database (PostgreSQL and SQLite have been used for this). <b>SiSU</b> +produces an intermediate processing format.<en>15</en> + </text> + <endnote notenumber="15"> + 15. This proved to be the easiest way to develop syntax, changes could +be made, or alternatives provided for the markup syntax whilst the +intermediate markup syntax was largely held constant. There is actually +an optional second intermediate markup format in YAML <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.yaml.org/">http://www.yaml.org/</link>> + </endnote> + <ocn>22</ocn> +</object> +<object id="23"> + <text class="norm"> + <b>SiSU</b> is used in constructing Lex Mercatoria <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://lexmercatoria.org/">http://lexmercatoria.org/</link>> +or <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/">http://www.jus.uio.no/lm/</link>> +(one of the oldest law web sites), and considerable thought went into +producing output that would be suitable for legal and academic writings +(that do not have formulae) given the limitations of html, and +publication in a wide variety of "formats", in particular in relation +to the convenient and accurate citation of text. However, the +construction of Lex Mercatoria uses only a fraction of the features +available from <b>SiSU</b> today, <i>vis</i> generation of flat file +structures, rather than in addition the building of ("granular") SQL +database content, (at an object level with relevant relational tables, +and other outputs also available). 
+ </text> + <ocn>23</ocn> +</object> +<object id="24"> + <text class="h5"> + 1.2 Short summary of features + </text> + <ocn>24</ocn> +</object> +<object id="25"> + <text class="norm"> + <b>(i)</b> markup syntax: (a) simpler than html, (b) mnemonic, +influenced by mail/messaging/wiki markup practices, (c) human readable, +and easily writable, + </text> + <ocn>25</ocn> +</object> +<object id="26"> + <text class="norm"> + <b>(ii)</b> (a) minimal markup requirement, (b) single file marked up +for multiple outputs, + </text> + <ocn>26</ocn> +</object> +<object id="27"> + <text class="norm"> + notes: + </text> + <ocn>27</ocn> +</object> +<object id="28"> + <text class="norm"> + * documents are prepared in a single UTF-8 file using a minimalistic +mnemonic syntax. Typical literature, documents like "War and Peace" +require almost no markup, and most of the headers are optional. + </text> + <ocn>28</ocn> +</object> +<object id="29"> + <text class="norm"> + * markup is easily readable/parsed by the human eye, (basic markup is +simpler and more sparse than the most basic html), [this may also be +converted to XML representations of the same input/source document]. + </text> + <ocn>29</ocn> +</object> +<object id="30"> + <text class="norm"> + * markup defines document structure (this may be done once in a header +pattern-match description, or for heading levels individually); basic +text attributes (bold, italics, underscore, strike-through etc.) as +required; and semantic information related to the document (header +information, extended beyond the Dublin core and easily further +extended as required); the headers may also contain processing +instructions. 
+ </text> + <ocn>30</ocn> +</object> +<object id="31"> + <text class="norm"> + <b>(iii)</b> (a) multiple outputs primarily industry established and +institutionally accepted open standard formats, include amongst others: +plaintext (UTF-8); html; (structured) XML; ODF (Open Document text); +LaTeX; PDF (via LaTeX); SQL type databases (currently PostgreSQL and +SQLite). Also produces: concordance files; document content +certificates (md5 or sha256 digests of headings, paragraphs, images +etc.) and html manifests (and sitemaps of content). (b) takes advantage +of the strengths implicit in these very different output types, (e.g. +PDFs produced using typesetting of LaTeX, databases populated with +documents at an individual object/paragraph level, making possible +granular search (and related possibilities)) + </text> + <ocn>31</ocn> +</object> +<object id="32"> + <text class="norm"> + <b>(iv)</b> outputs share a common numbering system (dubbed "object +citation numbering" (ocn)) that is meaningful (to man and machine) +across various digital outputs whether paper, screen, or database +oriented, (PDF, html, XML, sqlite, postgresql), this numbering system +can be used to reference content. + </text> + <ocn>32</ocn> +</object> +<object id="33"> + <text class="norm"> + <b>(v)</b> SQL databases are populated at an object level (roughly +headings, paragraphs, verse, tables) and become searchable with that +degree of granularity, the output information provides the +object/paragraph numbers which are relevant across all generated +outputs; it is also possible to look at just the matching paragraphs of +the documents in the database; [output indexing also works well with +search indexing tools like hyperestraier]. 
+ </text> + <ocn>33</ocn> +</object> +<object id="34"> + <text class="norm"> + <b>(vi)</b> use of semantic meta-tags in headers permit the addition of +semantic information on documents, (the available fields are easily +extended) + </text> + <ocn>34</ocn> +</object> +<object id="35"> + <text class="norm"> + <b>(vii)</b> creates organised directory/file structure for +(file-system) output, easily mapped with its clearly defined structure, +with all text objects numbered, you know in advance where in each +document output type, a bit of text will be found (e.g. from an SQL +search, you know where to go to find the prepared html output or PDF +etc.)... there is more; easy directory management and document +associations, the document preparation (sub-)directory may be used to +determine output (sub-)directory, the skin used, and the SQL database +used, + </text> + <ocn>35</ocn> +</object> +<object id="36"> + <text class="norm"> + <b>(viii)</b> "Concordance file" wordmap, consisting of all the words +in a document and their (text/ object) locations within the text, (and +the possibility of adding vocabularies), + </text> + <ocn>36</ocn> +</object> +<object id="37"> + <text class="norm"> + <b>(ix)</b> document content certification and comparison +considerations: (a) the document and each object within it stamped with +an md5 hash making it possible to easily check or guarantee that the +substantive content of a document is unchanged, (b)version control, +documents integrated with time based source control system, default RCS +or CVS with use of $Id: sisu_description.sst,v 1.25 2007/08/23 12:22:36 +ralph Exp $ tag, which <b>SiSU</b> checks + </text> + <ocn>37</ocn> +</object> +<object id="38"> + <text class="norm"> + <b>(x)</b> <b>SiSU</b>'s minimalist markup makes for meaningful +"diffing" of the substantive content of markup-files, + </text> + <ocn>38</ocn> +</object> +<object id="39"> + <text class="norm"> + <b>(xi)</b> easily skinnable, document appearance on a 
project/site +wide, directory wide, or document instance level easily +controlled/changed, + </text> + <ocn>39</ocn> +</object> +<object id="40"> + <text class="norm"> + <b>(xii)</b> in many cases a regular expression may be used (once in +the document header) to define all or part of a document's structure +obviating or reducing the need to provide structural markup within the +document, + </text> + <ocn>40</ocn> +</object> +<object id="41"> + <text class="norm"> + <b>(xiii)</b> prepared files may be batch processed, documents produced +are static files so this needs to be done only once but may be repeated +for various reasons as desired (updated content, addition of new output +formats, updated technology document presentations/representations) + </text> + <ocn>41</ocn> +</object> +<object id="42"> + <text class="norm"> + <b>(xiv)</b> possible to pre-process, which permits: the easy creation +of standard form documents, and templates/term-sheets; or building of +composite documents (master documents) from other sisu marked up +documents, or marked up parts, i.e. import documents or parts of text +into a main document should this be desired + </text> + <ocn>42</ocn> +</object> +<object id="43"> + <text class="norm"> + there is a considerable degree of future-proofing, output +representations are "upgradeable", and new document formats may be +added. + </text> + <ocn>43</ocn> +</object> +<object id="44"> + <text class="norm"> + <b>(xv)</b> there is a considerable degree of future-proofing, output +representations are "upgradeable", and new document formats may be +added: (a) modular, (thanks in no small part to <b>Ruby</b>) another +output format required, write another module.... 
(b) easy to update +output formats (e.g. html, XHTML, LaTeX/PDF produced can be updated in +the program and run against the whole document set), (c) easy to add, modify, +or have alternative syntax rules for input, should you need to, + </text> + <ocn>44</ocn> +</object> +<object id="45"> + <text class="norm"> + <b>(xvi)</b> scalability, dependent on your file-system (ext3, +Reiserfs, XFS, whatever) and on the relational database used (currently +PostgreSQL and SQLite), and your hardware, + </text> + <ocn>45</ocn> +</object> +<object id="46"> + <text class="norm"> + <b>(xvii)</b> only marked up files need be backed up, to secure the +larger document set produced, + </text> + <ocn>46</ocn> +</object> +<object id="47"> + <text class="norm"> + <b>(xviii)</b> document management, + </text> + <ocn>47</ocn> +</object> +<object id="48"> + <text class="norm"> + <b>(xix)</b> Syntax highlighting for <b>SiSU</b> markup is available +for a number of text editors. + </text> + <ocn>48</ocn> +</object> +<object id="49"> + <text class="norm"> + <b>(xx)</b> remote operations: (a) run <b>SiSU</b> on a remote server, +(having prepared sisu markup documents locally or on that server, i.e. +this solution where sisu is installed on the remote server, would work +whatever type of machine you chose to prepare your markup documents +on), (b) generated document outputs may be posted by sisu to remote +sites (using rsync/scp), (c) document source (plaintext utf-8) if shared +on the net may be identified by its url and processed locally to +produce the different document outputs. 
+ </text> + <ocn>49</ocn> +</object> +<object id="50"> + <text class="norm"> + <b>(xxi)</b> document source may be bundled together (automatically) +with associated documents (multiple language versions or master +document with inclusions) and images and sent as a zip file called a +sisupod, if shared on the net these too may be processed locally to +produce the desired document outputs, these may be downloaded, shared +as email attachments, or processed by running sisu against them, either +using a url or the filename. + </text> + <ocn>50</ocn> +</object> +<object id="51"> + <text class="norm"> + <b>(xxii)</b> for basic document generation, the only software +dependency is <b>Ruby</b>, and a few standard Unix tools (this covers +plaintext, html, XML, ODF, LaTeX). To use a database you of course need +that, and to convert the LaTeX generated to PDF, a LaTeX processor like +tetex or texlive. + </text> + <ocn>51</ocn> +</object> +<object id="52"> + <text class="norm"> + as a developer's tool it is flexible and extensible + </text> + <ocn>52</ocn> +</object> +<object id="53"> + <text class="norm"> + <b>SiSU</b> was developed in relation to legal documents, and is strong +across a wide variety of texts (law, literature...). <b>SiSU</b> +handles images but is not suitable for formulae/ statistics, or for +technical writing at this time. + </text> + <ocn>53</ocn> +</object> +<object id="54"> + <text class="norm"> + <b>SiSU</b> has been developed and has been in use for several years. +Requirements to cover a wide range of documents within its use domain +have been explored. + </text> + <ocn>54</ocn> +</object> +<object id="55"> + <text class="norm"> + Some modules are more mature than others, the most mature being Html +and LaTeX / pdf. PostgreSQL and search functions are useable and +together with <i>ocn</i> unique (to the best of my knowledge). The XML +output document set is "well formed" but largely proof of concept. 
+ </text> + <ocn>55</ocn> +</object> +<object id="56"> + <text class="h5"> + 1.3 How it works + </text> + <ocn>56</ocn> +</object> +<object id="57"> + <text class="norm"> + <b>SiSU</b> markup is fairly minimalistic, it consists of: a (largely +optional) document header, made up of information about the document +(such as when it was published, who authored it, and granting what +rights) and any processing instructions; and markup within text which +is related to document structure and typeface. <b>SiSU</b> must be able +to discern the structure of a document, (text headings and their levels +in relation to each other), either from information provided in the +instruction header or from markup within the text (or from a +combination of both). Processing is done against an abstraction of the +document comprising of information on the document's structure and its +objects,<en>16</en> which the program serializes (providing the object +numbers) and which are assigned hash sum values based on their content. +This abstraction of information about document structure, objects, (and +hash sums), provides considerable flexibility in representing documents +different ways and for different purposes (e.g. search, document +layout, publishing, content certification, concordance etc.), and makes +it possible to take advantage of some of the strengths of established +ways of representing documents, (or indeed to create new ones). + </text> + <endnote notenumber="16"> + 16. objects include: headings, paragraphs, verse, tables, images, but +not footnotes/endnotes which are numbered separately and tied to the +object from which they are referenced. + </endnote> + <ocn>57</ocn> +</object> +<object id="58"> + <text class="h5"> + 1.4 Simple markup + </text> + <ocn>58</ocn> +</object> +<object id="59"> + <text class="norm"> + <b>SiSU</b> markup is based on requiring the minimum markup needed to +determine the structure of a document. 
(This can be as little as saying +in a header to look for the word Book at a specified level and the word +Chapter at another level). <b>SiSU</b> then breaks a document into its +smallest parts (at a heading, and paragraph level) while retaining all +structural information. This break up of the document and information +on its structure is taken advantage of in the transformations made in +generating the very different output types that can be created, and in +providing as much as can be for what each output type is best at doing, +e.g. LaTeX (professional document typesetting, easy conversion to pdf +or Postscript), XML (in this case, structural representation), ODF +(OpenDocument), SQL (e.g. document search; representing constituent +parts of documents based on their structure, headings, chapters, +paragraphs as required; user control).<en>17</en> + </text> + <endnote notenumber="17"> + 17. where explicit structure is provided through the use of tagging +headings, it could be reduced (still) further, for example by reducing +the number of characters used to identify heading levels; but in many +cases even that information is not required as regular expressions can +be used to extract the implicit structure. + </endnote> + <ocn>59</ocn> +</object> +<object id="60"> + <text class="h6"> + 1.4.1 Sparse markup requirement, try to get the most out of markup + </text> + <ocn>60</ocn> +</object> +<object id="61"> + <text class="norm"> + One of its strengths is that very small amounts of initial tagging is +required for the program to generate its output. 
+ </text> + <ocn>61</ocn> +</object> +<object id="62"> + <text class="norm"> + This is a basic markup example: + </text> + <ocn>62</ocn> +</object> +<object id="63"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst"> +basic markup example, text file - an international convention </link> +<en>18</en> + </text> + <endnote notenumber="18"> + 18. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst">http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst</link>> +output provided as example in the next section + </endnote> + <ocn>63</ocn> +</object> +<object id="64"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html"> +view basic markup, as it would be highlighted by vim editor </link> +<en>19</en> + </text> + <endnote notenumber="19"> + 19. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html">http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html</link>> +as it would appear with syntax highlighting (by vim) + </endnote> + <ocn>64</ocn> +</object> +<object id="65"> + <text class="norm"> + Emphasis has been on simplicity and minimalism in markup requirements. +Design philosophy is to try keep the amount of markup required low, for +whatever has been determined to be acceptable output.<en>20</en> + </text> + <endnote notenumber="20"> + 20. 
seems there are several "smart ASCIIs" available, primarily for +ascii to html conversion, that make this, and reasonable looking ascii +their goal<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://webseitz.fluxent.com/wiki/SmartAscii">http://webseitz.fluxent.com/wiki/SmartAscii</link>> +<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://daringfireball.net/projects/markdown/">http://daringfireball.net/projects/markdown/</link>> +<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.textism.com/tools/textile/">http://www.textism.com/tools/textile/</link>> + </endnote> + <ocn>65</ocn> +</object> +<object id="66"> + <text class="norm"> + <b>SiSU</b>'s markup is more minimalistic and simpler than (the +equivalent) html and for it, you get considerably more than just html, +as this preparation gives you all available output formats, upon +request. + </text> + <ocn>66</ocn> +</object> +<object id="67"> + <text class="h6"> + 1.4.2 Single markup file provides multiple output formats + </text> + <ocn>67</ocn> +</object> +<object id="68"> + <text class="norm"> + For each document, there is only one (input, minimalistically marked +up) file from which all the available output types are +generated.<en>21</en> + </text> + <endnote notenumber="21"> + 21. These include richly laid out and linked html (table or css +variants), <i>PHP</i>, LaTeX (from which pdf portrait and landscape +documents are produced), texinfo (for info files etc.), and PostgreSQL +and/or SQLite. And the opportunity to fairly easily build additional +modules, such as XML. See the examples provided in this document. + </endnote> + <ocn>68</ocn> +</object> +<object id="69"> + <text class="norm"> + Eg. 
the markup example: + </text> + <ocn>69</ocn> +</object> +<object id="70"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst"> +original text file - an international convention </link> <en>22</en> + </text> + <endnote notenumber="22"> + 22. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst">http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst</link>> + </endnote> + <ocn>70</ocn> +</object> +<object id="71"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html"> +view as syntax would be highlighted by vim editor </link> <en>23</en> + </text> + <endnote notenumber="23"> + 23. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html">http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html</link>> + </endnote> + <ocn>71</ocn> +</object> +<object id="72"> + <text class="norm"> + Produces the following output: + </text> + <ocn>72</ocn> +</object> +<object id="73"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/toc.html"> +Segmented html version of document </link> <en>24</en> + </text> + <endnote notenumber="24"> + 24. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/toc.html">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/toc.html</link>> + </endnote> + <ocn>73</ocn> +</object> +<object id="74"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/doc.html"> +Full length html document </link> <en>25</en> + </text> + <endnote notenumber="25"> + 25. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/doc.html">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/doc.html</link>> + </endnote> + <ocn>74</ocn> +</object> +<object id="75"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/landscape.pdf"> +pdf landscape version of document </link> <en>26</en> + </text> + <endnote notenumber="26"> + 26. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/landscape.pdf">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/landscape.pdf</link>> + </endnote> + <ocn>75</ocn> +</object> +<object id="76"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/portrait.pdf"> +pdf portrait version of document </link> <en>27</en> + </text> + <endnote notenumber="27"> + 27. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/portrait.pdf">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/portrait.pdf</link>> + </endnote> + <ocn>76</ocn> +</object> +<object id="77"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/plain.txt"> +clean tex ascii version of document </link> <en>28</en> + </text> + <endnote notenumber="28"> + 28. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/plain.txt">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/plain.txt</link>> + </endnote> + <ocn>77</ocn> +</object> +<object id="78"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/sax.xml"> +<i>xml</i> sax version of document </link> <en>29</en> + </text> + <endnote notenumber="29"> + 29. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/sax.xml">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/sax.xml</link>> + </endnote> + <ocn>78</ocn> +</object> +<object id="79"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/dom.xml"> +<i>xml</i> dom version of document </link> <en>30</en> + </text> + <endnote notenumber="30"> + 30. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/dom.xml">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/dom.xml</link>> + </endnote> + <ocn>79</ocn> +</object> +<object id="80"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/concordance.html"> +Concordance </link> <en>31</en> + </text> + <endnote notenumber="31"> + 31. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/concordance.html">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/concordance.html</link>> + </endnote> + <ocn>80</ocn> +</object> +<object id="81"> + <text class="norm"> + (and in addition to these: PostgreSQL, SQLite, texinfo and +<del>YAML</del> <en>32</en> versions if desired) + </text> + <endnote notenumber="32"> + 32. discontinued for the time being + </endnote> + <ocn>81</ocn> +</object> +<object id="82"> + <text class="h6"> + 1.4.3 Syntax relatively easy to read and remember + </text> + <ocn>82</ocn> +</object> +<object id="83"> + <text class="norm"> + Syntax is kept simple and mnemonic.<en>33</en> + </text> + <endnote notenumber="33"> + 33. 
<b>SiSU</b> markup syntax, an incomplete summary: <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/sisu_markup_table/doc.html#h200306">http://www.jus.uio.no/sisu/sisu_markup_table/doc.html#h200306</link>> +<br /> Visual check of elementary font face modifiers: <b>bold</b> +<b>bold</b> <em>emphasis</em> <i>italics</i> <u>underscore</u> +<del>strikethrough</del> <sup>superscript</sup> <sub>subscript</sub> + </endnote> + <ocn>83</ocn> +</object> +<object id="84"> + <text class="h6"> + 1.4.4 Kept simple by having a limited publishing feature set, and +features identified as most important are available across several +document types + </text> + <ocn>84</ocn> +</object> +<object id="85"> + <text class="norm"> + To keep <b>SiSU</b> markup sparse and simple, <b>SiSU</b> deliberately +provides a limited publishing feature set, including: indent levels; +bold; italics; superscript; subscript; simple tables; images; tables of +contents; and endnotes. In most cases these are available across the +different output formats. + </text> + <ocn>85</ocn> +</object> +<object id="86"> + <text class="norm"> + The publishing feature set may be expanded as required. + </text> + <ocn>86</ocn> +</object> +<object id="87"> + <text class="h5"> + 1.5 Designed with usability in mind + </text> + <ocn>87</ocn> +</object> +<object id="88"> + <text class="norm"> + Output is designed to be uniform, easy to read, navigate and cite. + </text> + <ocn>88</ocn> +</object> +<object id="89"> + <text class="h5"> + 1.6 Code separate from content + </text> + <ocn>89</ocn> +</object> +<object id="90"> + <text class="norm"> + Code<en>34</en> is separated from content. This means that when changes +are desired in the output presentation, the code that produces them, +and not the marked up text data set (which could be thousands of +documents) is modified. 
Separating code from content makes large scale +changes to output appearance trivial, and permits the easy addition of +new output modules. + </text> + <endnote notenumber="34"> + 34. the program that generates the documents + </endnote> + <ocn>90</ocn> +</object> +<object id="91"> + <text class="h5"> + 1.7 Object citation numbering, a text or object positioning / citation +system - "paragraph" (or text object) numbering, that remains the same and +usable across all output formats by people and machine + + </text> + <ocn>91</ocn> +</object> +<object id="92"> + <text class="norm"> + Object citation numbering is a simple object (text) positioning and +citation system that is human relevant and machine usable, used by +<b>SiSU</b> for all manner of presentations, and that is available for +use in all text mappings. It is based on the automated sequential +numbering of objects (roughly paragraphs, (headings, tables, verse) or +other blocks of text or images etc.). The text positioning system (in +which I claim copyright) is invaluable for publishing requiring the +citing of text across multiple output formats, and for the general mapping +of text within a document: + </text> + <ocn>92</ocn> +</object> +<object id="93"> + <text class="indent_bullet"> + in html, html not being easily citeable (change font size, or use a +different browser and the page on which specific text appears has +changed), and + </text> + <ocn>93</ocn> +</object> +<object id="94"> + <text class="indent_bullet"> + across multiple formats being common to all output formats +html/xml/pdf/sql output, + </text> + <ocn>94</ocn> +</object> +<object id="95"> + <text class="indent_bullet"> + the results of an sql search can just be "live" citation references to +the documents in which the text is found, <link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/1.html#search"> much like +an index (see image examples provided). 
</link> <en>35</en> + </text> + <endnote notenumber="35"> + 35. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/1.html#search">http://www.jus.uio.no/sisu/SiSU/1.html#search</link>> + </endnote> + <ocn>95</ocn> +</object> +<object id="96"> + <text class="norm"> + I claim copyright on the system I use which is the most basic of all, +numbering all text in headings and paragraphs sequentially (with tables +and images being treated as a single paragraph) and only +footnotes/endnotes not following this numbering, as their position in +text is not strictly determined, (a change from footnotes to endnotes +would change their numbering), footnotes instead "belong" to the +paragraph from which they are referenced, and have sequential numbers +of their own. + </text> + <ocn>96</ocn> +</object> +<object id="97"> + <text class="norm"> + <b>SiSU</b> has a paragraph numbering system, that remains the same +regardless of the output format. This provides an effective means of +citation, pinpointing text accurately in all output formats, using the +same reference. This is particularly useful where text has to be +located across different output formats - for example once html is +printed the number of pages and pages on which given text is found will +vary depending on the browser, its settings the font size setting etc. +Similarly <b>SiSU</b> produces pdf in different forms, eg. on the +example site Lex Mercatoria as portrait and landscape documents - here +too page numbering varies, but paragraph numbering is the same, <i>vis +a vis</i> all versions of the text (portrait and landscape pdf and the +html versions of the text, and as stored (with "paragraphs" as records) +to the PostgreSQL or SQLite database). + </text> + <ocn>97</ocn> +</object> +<object id="98"> + <text class="norm"> + These numbers are placed in the text margins and are intended to be +independent of and not to interfere with authors tagging. 
[The citation +system (object citation numbering system, automated "paragraph +numbering") which is automatically generated and is common and +identical across all document formats] The paragraph numbering system +is more accurately described as a (text) object numbering system, as +headings are also numbered... all headings and paragraphs are numbered +sequentially. Endnotes are automatically numbered independently and +rather "belong" to the paragraph from which they are referenced, as an +endnote does not (necessarily) form a part of a document's sequence, +(they may be produced as either endnotes or footnotes (or both +depending on what output you choose to look at - if you take the +segmented html version document provided as an example, you will find +that the endnotes are placed both at the end of each section, and in a +separate section of their own called endnotes, and these are +hyper-linked)). An attractive feature of providing citation numbering +in this way is that it is independent of the document structure... it +remains the same regardless of what is done about the document +structure. + </text> + <ocn>98</ocn> +</object> +<object id="99"> + <text class="norm"> + The rules have been kept very simple: unique incremental object +citation numbers are assigned to headings, paragraphs, verse, tables +and images. It is possible to manually override this feature on a per +heading or comment basis, though this should be used only exceptionally; it +may be of use where there is a substantive text, and a +minor comment added by the publisher that should not be mapped as part of the +text. + </text> + <ocn>99</ocn> +</object> +<object id="100"> + <text class="norm"> + The object citation number markers contain additional numbering +information with regard to the document structure, that can be used for +alternative presentations, including such detail as the type of object +(heading, paragraph, table, image, etc.), numbered sequentially. 
+ </text> + <ocn>100</ocn> +</object> +<object id="101"> + <text class="norm"> + An advantage is that the numbering remains the same regardless of +document structure. + </text> + <ocn>101</ocn> +</object> +<object id="102"> + <text class="norm"> + Text object ("paragraph") numbering is the same for all output versions +of the same document, vis html, pdf, pgsql, yaml etc. + </text> + <ocn>102</ocn> +</object> +<object id="103"> + <text class="norm"> + In the relational database, as individual text objects of a document are +stored (and indexed) together with object numbers, and all versions of +the document have the same numbering, the results of searches may be +tailored just to provide the location of the search result in all +available document formats. + </text> + <ocn>103</ocn> +</object> +<object id="104"> + <text class="norm"> + <i> Note: there is a bug in the released behaviour of object citation +numbering, (not certain when it was introduced) tables should be +numbered, ie each table gets an ocn, required amongst other things for +relational database. This will be corrected in a future release. +Citation numbering of existing documents that contain tables will +change. </i> + </text> + <ocn>104</ocn> +</object> +<object id="105"> + <text class="h5"> + 1.8 Handling of Dublin Core meta-tags making use of the Resource +Description Framework + </text> + <ocn>105</ocn> +</object> +<object id="106"> + <text class="norm"> + <b>SiSU</b> is able to use meta tags based on the Dublin +Core<en>36</en> and Resource Description Framework<en>37</en> + </text> + <endnote notenumber="36"> + 36. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://dublincore.org/">http://dublincore.org/</link>> + </endnote> + <endnote notenumber="37"> + 37. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.w3.org/RDF/">http://www.w3.org/RDF/</link>> + </endnote> + <ocn>106</ocn> +</object> +<object id="107"> + <text class="norm"> + This is a means of providing semantic information about a +document, both as computer processable meta-tags, and as human readable +information that may be of value for classification purposes. + </text> + <ocn>107</ocn> +</object> +<object id="108"> + <text class="norm"> + This information is provided both in html metatags, and (where +available) under the section titled "Document Information - MetaData", +near the end of a document, for example in the segmented html version +of this text at: <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/metadata.html">http://www.jus.uio.no/sisu/SiSU/metadata.html</link>> + </text> + <ocn>108</ocn> +</object> +<object id="109"> + <text class="h5"> + 1.9 Easy directory management + </text> + <ocn>109</ocn> +</object> +<object id="110"> + <text class="norm"> + 1. Directory file association, skins and special image management, made +simpler.<en>38</en> + </text> + <endnote notenumber="38"> + 38. Previously, directory associations for file output were set +up in the configuration file. The present system is a more natural way +to work, requiring less configuration. + </endnote> + <ocn>110</ocn> +</object> +<object id="111"> + <text class="norm"> + The last part of the name of the work directory in which markup is +being done, or rather from where <b>SiSU</b> is run in order to +generate document output, is used in determining the sub-directory name +for output files, that is created in the document output directory. +This provides a rather easy way to associate documents e.g. of a given +subject, or by owner. 
+ </text> + <ocn>111</ocn> +</object> +<object id="112"> + <ocn>112</ocn> + <text class="code"> +    /www/docs<br />   /intellectual_property<br />   /arbitration<br />   /contract_law<br /><br />   /www/docs<br />   /ralph<br />   /sisu     + </text> +</object> +<object id="113"> + <text class="norm"> + all are placed in their own directories within the directory structure +created. Similar rules are used in the creation of sql type databases +(though they can be overridden). + </text> + <ocn>113</ocn> +</object> +<object id="114"> + <text class="norm"> + There are a couple of further associations with these directories. + </text> + <ocn>114</ocn> +</object> +<object id="115"> + <text class="norm"> + Directory wide skins. + </text> + <ocn>115</ocn> +</object> +<object id="116"> + <text class="norm"> + Directory specific images. + </text> + <ocn>116</ocn> +</object> +<object id="117"> + <text class="norm"> + 2. If there is a "directory skin", that is a skin of the same name as +the directory, it is used in the generation of the documents within it, +rather than the default skin, unless the document has a specific skin +associated with it. + </text> + <ocn>117</ocn> +</object> +<object id="118"> + <text class="indent1"> + a. default skin (always available) + </text> + <ocn>118</ocn> +</object> +<object id="119"> + <text class="indent1"> + b. directory skin (precedence over default if exists) + </text> + <ocn>119</ocn> +</object> +<object id="120"> + <text class="indent1"> + c. document skin (takes precedence wherever document requests a +specific skin) + </text> + <ocn>120</ocn> +</object> +<object id="121"> + <text class="norm"> + Skins are defined in the document skin directory and if a directory +association is desired a softlink made to the relevant skin. 
Skins +(directory association auto load) auto load skin if a directory skin +exists of same name as directory stub, (and there is no specific doc +skin) + </text> + <ocn>121</ocn> +</object> +<object id="122"> + <text class="norm"> + 3. If the working directory has within it a sub-directory called +image_local, the images within that directory are used for references +to images, that are not part of the default site build. + </text> + <ocn>122</ocn> +</object> +<object id="123"> + <text class="h5"> + 1.10 Document Version Control Information + </text> + <ocn>123</ocn> +</object> +<object id="124"> + <text class="norm"> + The possibility of citing an exact document version. + </text> + <ocn>124</ocn> +</object> +<object id="125"> + <text class="norm"> + Permits the inclusion of document version control information to the +document body and metatags.<en>39</en> This provides a much more +certain method of referring to the exact version of a particular +document, (assuming that the document is from a trusted source, that +will retain earlier versions of a document).<en>40</en> + </text> + <endnote notenumber="39"> + 39. from a version control system such as CVS + </endnote> + <endnote notenumber="40"> + 40. The version control system must be run, so the version number is +obtained, prior to the <b>SiSU</b> document generation, and subsequent +posting of the document. 
+ </endnote> + <ocn>125</ocn> +</object> +<object id="126"> + <text class="norm"> + This information (where available) is provided under the section of the +document titled "Document Information - MetaData", near the end of a +document, for example in the segmented html version of this text at: +<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/metadata.html">http://www.jus.uio.no/sisu/SiSU/metadata.html</link>> + </text> + <ocn>126</ocn> +</object> +<object id="127"> + <text class="h5"> + 1.11 Table of contents + </text> + <ocn>127</ocn> +</object> +<object id="128"> + <text class="norm"> + <b>SiSU</b> produces a rudimentary table of contents based on +document headings. + </text> + <ocn>128</ocn> +</object> +<object id="129"> + <text class="h5"> + 1.12 Auto-numbering of headings + </text> + <ocn>129</ocn> +</object> +<object id="130"> + <text class="norm"> + Headings can be automatically numbered (and automatically named for +hyper-linking). + </text> + <ocn>130</ocn> +</object> +<object id="131"> + <text class="h5"> + 1.13 Numbering and cross-hyperlinking of endnotes + </text> + <ocn>131</ocn> +</object> +<object id="132"> + <text class="norm"> + <b>SiSU</b> can automatically number footnotes/endnotes. This is the +default operation where no number is provided. + </text> + <ocn>132</ocn> +</object> +<object id="133"> + <text class="norm"> + Footnotes/endnotes may also be manually numbered. Where a number, or +numbers are provided for a footnote/endnote, this does not increment +the automatic footnote/endnote number counter. + </text> + <ocn>133</ocn> +</object> +<object id="134"> + <text class="norm"> + In the html output footnotes/endnotes are cross-hyper-linked (to their +reference point and vice versa). In the pdf output footnotes are linked +from their reference point only. 
+ </text> + <ocn>134</ocn> +</object> +<object id="135"> + <text class="h5"> + 1.14 "Skinnable" + </text> + <ocn>135</ocn> +</object> +<object id="136"> + <text class="norm"> + <b>SiSU</b> is skinnable, on a site-wide, directory-wide and per +document basis, so different looking versions of things may be produced +with little difficulty. There is a default skin which may be modified, +as the background site skin, and each working directory may have a skin +associated with it, as may each individual document. The hierarchy of +application is document, directory, then site... ie if a document skin +exists it gets precedence. + </text> + <ocn>136</ocn> +</object> +<object id="137"> + <text class="norm"> + Whilst it is skinnable, the default output styles are selected to work +across the widest possible range of document types. + </text> + <ocn>137</ocn> +</object> +<object id="138"> + <text class="h5"> + 1.15 Multiple Outputs + </text> + <ocn>138</ocn> +</object> +<object id="139"> + <text class="norm"> + From markup that is simpler and more sparse than html you get: + </text> + <ocn>139</ocn> +</object> +<object id="140"> + <text class="indent_bullet"> + far greater output possibilities, including multiple html types, XML +(different structured types), LaTeX (pdf landscape, portrait), and SQL +(Postgresql or SQLite or other); + </text> + <ocn>140</ocn> +</object> +<object id="141"> + <text class="indent_bullet"> + the advantages implicit in these very different output +possibilities;<en>41</en> + </text> + <endnote notenumber="41"> + 41. e.g. LaTeX (professional document typesetting, easy conversion to +pdf or Postscript), XML (in this case, structural representation), SQL +(e.g. 
document set searches; representation of the constituent parts of +documents based on their structure, headings, chapters, paragraphs as +desired; control of use) + </endnote> + <ocn>141</ocn> +</object> +<object id="142"> + <text class="indent_bullet"> + a common citation system + </text> + <ocn>142</ocn> +</object> +<object id="143"> + <text class="norm"> + As many output formats/presentations as one cares to write modules for +- several types of html (e.g. structure based on css, or structure +based on tables); <i>LaTeX/pdf</i> and <i>Lout/pdf</i>; pgsql other +databases easily added; yaml... + </text> + <ocn>143</ocn> +</object> +<object id="144"> + <text class="h6"> + 1.15.1 html - several presentations: full length & segmented; css +& table based + </text> + <ocn>144</ocn> +</object> +<object id="145"> + <text class="norm"> + Most documents are produced in single and segmented html versions, +described below: + </text> + <ocn>145</ocn> +</object> +<object id="146"> + <text class="norm"> + <b>The Scroll (full length text presentations)</b> + </text> + <ocn>146</ocn> +</object> +<object id="147"> + <text class="norm"> + The full length of the text in a single scrollable document.<en>42</en> +As a rule the files they are saved in are named: <i>doc</i> or more +precisely <i>doc.html</i> + </text> + <endnote notenumber="42"> + 42. 
CISG <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/un_contracts_international_sale_of_goods_convention_1980/doc">http://www.jus.uio.no/lm/un_contracts_international_sale_of_goods_convention_1980/doc</link>> +<br /> The Unidroit Contract Principles <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/unidroit.contract.principles.1994/doc">http://www.jus.uio.no/lm/unidroit.contract.principles.1994/doc</link>> +or <br /> The Autonomous Contract <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/autonomous.contract.2000.amissah/doc">http://www.jus.uio.no/lm/autonomous.contract.2000.amissah/doc</link>> + </endnote> + <ocn>147</ocn> +</object> +<object id="148"> + <text class="norm"> + For various reasons texts may only be provided in this form (such as +this one which is short), though most are also provided as segmented +texts. + </text> + <ocn>148</ocn> +</object> +<object id="149"> + <text class="norm"> + "Scroll" is a reference to the historical scroll, a single long +document/ parchment, and also no doubt to what you will have to do to +get to the bottom of the text.<en>43</en> + </text> + <endnote notenumber="43"> + 43. Scrolling is not however necessarily confined to full length +documents as you will have to scroll to get to the bottom of any long +segment (eg. chapter) of a segmented text. + </endnote> + <ocn>149</ocn> +</object> +<object id="150"> + <text class="norm"> + <b>The Segmented Text</b> + </text> + <ocn>150</ocn> +</object> +<object id="151"> + <text class="norm"> + The text divided into segments (such as articles or chapters depending +on the text)<en>44</en> As a rule the files they are saved in are +named: <i>toc</i> and <i>index</i> or more precisely <i>toc.html</i> +and <i>index.html</i> + </text> + <endnote notenumber="44"> + 44. 
CISG <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980">http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980</link>> +<br /> The Unidroit Principles <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/unidroit.contract.principles.1994">http://www.jus.uio.no/lm/unidroit.contract.principles.1994</link>> +<br /> The Autonomous Contract <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/the.autonomous.contract.2000.amissah">http://www.jus.uio.no/sisu/the.autonomous.contract.2000.amissah</link>> +or <br /> WTA 1994 <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/wta.1994">http://www.jus.uio.no/lm/wta.1994</link>> + </endnote> + <ocn>151</ocn> +</object> +<object id="152"> + <text class="norm"> + If you know exactly what you are looking for, loading a segment of text +is faster (the segments being smaller). Occasionally longer documents +such as the WTA 1994 <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/lm/wta.1994/toc">http://www.jus.uio.no/lm/wta.1994/toc</link>> +are only provided in segmented form. 
+ </text> + <ocn>152</ocn> +</object> +<object id="153"> + <text class="norm"> + <b>Cascading Style Sheet, and Table based html</b> + </text> + <ocn>153</ocn> +</object> +<object id="154"> + <text class="norm"> + <b>SiSU</b> outputs html; the two standard forms currently available are: + </text> + <ocn>154</ocn> +</object> +<object id="155"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/SiSU/toc.html"> css based +</link> + </text> + <ocn>155</ocn> +</object> +<object id="156"> + <text class="norm"> + and + </text> + <ocn>156</ocn> +</object> +<object id="157"> + <text class="norm"> + table based [largely discontinued]<en>45</en> + </text> + <endnote notenumber="45"> + 45. The formatting possibility still exists in the code tree, but maintenance has +been largely discontinued. + </endnote> + <ocn>157</ocn> +</object> +<object id="158"> + <text class="norm"> + <b>The html is tested across several browsers</b> + </text> + <ocn>158</ocn> +</object> +<object id="159"> + <text class="norm"> + I would like to remind you that there are other excellent browsers out there, +many of which have long supported practical features like tabbed browsing. + </text> + <ocn>159</ocn> +</object> +<object id="160"> + <text class="norm"> + The html is tested across several browsers, including: + </text> + <ocn>160</ocn> +</object> +<object id="161"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.mozilla.org/products/firefox/"> <b>Firefox</b> +(Mozilla-Firefox) </link> <en>46</en> + </text> + <endnote notenumber="46"> + 46. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.mozilla.org/products/firefox/">http://www.mozilla.org/products/firefox/</link>> + </endnote> + <ocn>161</ocn> +</object> +<object id="162"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://kazehakase.sourceforge.jp/"> Kazehakase </link> +<en>47</en> + </text> + <endnote notenumber="47"> + 47. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://kazehakase.sourceforge.jp/">http://kazehakase.sourceforge.jp/</link>> + </endnote> + <ocn>162</ocn> +</object> +<object id="163"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.konqueror.org/"> Konqueror </link> <en>48</en> + </text> + <endnote notenumber="48"> + 48. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.konqueror.org/">http://www.konqueror.org/</link>> + </endnote> + <ocn>163</ocn> +</object> +<object id="164"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.mozilla.org/"> Mozilla </link> <en>49</en> + </text> + <endnote notenumber="49"> + 49. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.mozilla.org/">http://www.mozilla.org/</link>> + </endnote> + <ocn>164</ocn> +</object> +<object id="165"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.microsoft.com/windows/ie/default.asp"> MS +Internet Explorer </link> <en>50</en> + </text> + <endnote notenumber="50"> + 50. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.microsoft.com/windows/ie/default.asp">http://www.microsoft.com/windows/ie/default.asp</link>> + </endnote> + <ocn>165</ocn> +</object> +<object id="166"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://home.netscape.com/comprod/mirror/client_download.html"> +Netscape </link> <en>51</en> + </text> + <endnote notenumber="51"> + 51. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://home.netscape.com/comprod/mirror/client_download.html">http://home.netscape.com/comprod/mirror/client_download.html</link>> + </endnote> + <ocn>166</ocn> +</object> +<object id="167"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.opera.com/"> Opera </link> <en>52</en> + </text> + <endnote notenumber="52"> + 52. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.opera.com/">http://www.opera.com/</link>> + </endnote> + <ocn>167</ocn> +</object> +<object id="168"> + <text class="norm"> + Also lighter weight graphical browsers: + </text> + <ocn>168</ocn> +</object> +<object id="169"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.dillo.org/"> Dillo </link> <en>53</en> + </text> + <endnote notenumber="53"> + 53. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.dillo.org/">http://www.dillo.org/</link>> + </endnote> + <ocn>169</ocn> +</object> +<object id="170"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.gnome.org/projects/epiphany/"> <b>Epiphany</b> +</link> <en>54</en> + </text> + <endnote notenumber="54"> + 54. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.gnome.org/projects/epiphany/">http://www.gnome.org/projects/epiphany/</link>> + </endnote> + <ocn>170</ocn> +</object> +<object id="171"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://galeon.sourceforge.net/"> <b>Galeon</b> </link> +<en>55</en> + </text> + <endnote notenumber="55"> + 55. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://galeon.sourceforge.net/">http://galeon.sourceforge.net/</link>> + </endnote> + <ocn>171</ocn> +</object> +<object id="172"> + <text class="norm"> + And for console/text browsing: + </text> + <ocn>172</ocn> +</object> +<object id="173"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://elinks.or.cz/"> <b>elinks</b> </link> <en>56</en> + </text> + <endnote notenumber="56"> + 56. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://elinks.or.cz/">http://elinks.or.cz/</link>> + </endnote> + <ocn>173</ocn> +</object> +<object id="174"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://links.twibright.com/"> <b>links2</b> </link> +<en>57</en> + </text> + <endnote notenumber="57"> + 57. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://links.twibright.com/">http://links.twibright.com/</link>> + </endnote> + <ocn>174</ocn> +</object> +<object id="175"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://w3m.sourceforge.net/"> <b>w3m</b> </link> +<en>58</en> + </text> + <endnote notenumber="58"> + 58. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://w3m.sourceforge.net/">http://w3m.sourceforge.net/</link>> + </endnote> + <ocn>175</ocn> +</object> +<object id="176"> + <text class="norm"> + The html tables output is rendered more accurately across a wider +variety of browsers, including older versions (than the html css output). + </text> + <ocn>176</ocn> +</object> +<object id="177"> + <text class="h6"> + 1.15.2 XML + </text> + <ocn>177</ocn> +</object> +<object id="178"> + <text class="norm"> + <b>SiSU</b> generates well formed XML in multiple versions: an XML +SAX version with a flat/shallow structure, and an XML DOM version with a +deeper (embedded) structure. There is also a released working xhtml +module. Examples of SAX and DOM versions are provided within this +document. + </text> + <ocn>178</ocn> +</object> +<object id="179"> + <text class="h6"> + 1.15.3 ODT:ODF, Open Document Format - ISO/IEC 26300:2006 + </text> + <ocn>179</ocn> +</object> +<object id="180"> + <text class="norm"> + <b>SiSU</b> generates Open Document Format output. + </text> + <ocn>180</ocn> +</object> +<object id="181"> + <text class="h6"> + 1.15.4 PDF - portrait and landscape, (through the generation of LaTeX +output which is then transformed to pdf) + </text> + <ocn>181</ocn> +</object> +<object id="182"> + <text class="norm"> + <b>SiSU</b> outputs LaTeX if required, which is easily transformed to +PDF.<en>59</en> PDF documents are generated on the site from the same +source files and <b>Ruby</b> program that produce html. Landscape +oriented pdfs have been introduced, providing easier screen viewing; they are also +paper saving, being currently formatted to have fewer pages than +their portrait equivalents. + </text> + <endnote notenumber="59"> + 59. 
LaTeX and pdf features introduced 18<sup>th</sup> June 2001; +landscape and portrait pdfs introduced 7<sup>th</sup> October 2001; +Lout is a more recent addition, 22<sup>nd</sup> April 2003 + </endnote> + <ocn>182</ocn> +</object> +<object id="183"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.adobe.com/products/acrobat/readstep2.html"> +Adobe Reader </link> <en>60</en> + </text> + <endnote notenumber="60"> + 60. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.adobe.com/products/acrobat/readstep2.html">http://www.adobe.com/products/acrobat/readstep2.html</link>> + </endnote> + <ocn>183</ocn> +</object> +<object id="184"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.gnome.org/projects/evince/"> <b>Evince</b> +</link> <en>61</en> + </text> + <endnote notenumber="61"> + 61. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.gnome.org/projects/evince/">http://www.gnome.org/projects/evince/</link>> + </endnote> + <ocn>184</ocn> +</object> +<object id="185"> + <text class="indent_bullet"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.foolabs.com/xpdf/"> xpdf </link> <en>62</en> + </text> + <endnote notenumber="62"> + 62. 
<<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.foolabs.com/xpdf/">http://www.foolabs.com/xpdf/</link>> + </endnote> + <ocn>185</ocn> +</object> +<object id="186"> + <text class="h6"> + 1.15.5 Search - loading/populating of relational database while +retaining document structure information, object citation numbering and +other features (currently PostgreSQL and/or SQLite) + </text> + <ocn>186</ocn> +</object> +<object id="187"> + <text class="norm"> + <b>SiSU</b> (from the same markup input file) automatically feeds into +PostgreSQL<en>63</en> and/or SQLite<en>64</en> database (could be any +other of the better relational databases)<en>65</en> - together with +all additional information related to document structure, and the +alternative ways in which it is generated on the site retained. As +regards scaling of the database, it is as scalable as the database +(here Postgresql or SQLite) and hardware allow. I will prune the images +later. + </text> + <endnote notenumber="63"> + 63. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.postgresql.org/">http://www.postgresql.org/</link>> +<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://advocacy.postgresql.org/">http://advocacy.postgresql.org/</link>> +<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://en.wikipedia.org/wiki/Postgresql">http://en.wikipedia.org/wiki/Postgresql</link>> + </endnote> + <endnote notenumber="64"> + 64. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.hwaci.com/sw/sqlite/">http://www.hwaci.com/sw/sqlite/</link>> +<br /> <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://en.wikipedia.org/wiki/Sqlite">http://en.wikipedia.org/wiki/Sqlite</link>> + </endnote> + <endnote notenumber="65"> + 65. 
Relational database features retaining document structure and +citation introduced 15<sup>th</sup> July 2002 + </endnote> + <ocn>187</ocn> +</object> +<object id="188"> + <text class="norm"> + This is one of the more interesting output forms, as all the structural +data for the documents are retained (though can be ignored by the user +of the database should they so choose). All site texts/documents are +(currently) streamed to four pgsql database tables: + </text> + <ocn>188</ocn> +</object> +<object id="189"> + <text class="indent_bullet1"> + one containing semantic (and other) headers, including, title, +author, subject, (the Dublin Core...); + </text> + <ocn>189</ocn> +</object> +<object id="190"> + <text class="indent_bullet1"> + another the substantive texts by individual "paragraph" (or +object) - along with structural information, each paragraph being +identifiable by its paragraph number (if it has one which almost all of +them do), and the substantive text of each paragraph quite naturally +being searchable (both in formatted and clean text versions for +searching); and + </text> + <ocn>190</ocn> +</object> +<object id="191"> + <text class="indent_bullet1"> + a third containing endnotes cross-referenced back to the +paragraph from which they are referenced (both in formatted and clean +text versions for searching). + </text> + <ocn>191</ocn> +</object> +<object id="192"> + <text class="indent_bullet1"> + a fourth table with a one to one relation with the headers table +contains full text versions of output, eg. pdf, html, xml, and ascii. + </text> + <ocn>192</ocn> +</object> +<object id="193"> + <text class="norm"> + There is of course the possibility to add further structures. 
+ </text> + <ocn>193</ocn> +</object> +<object id="194"> + <text class="norm"> + At this level <b>SiSU</b> loads a relational database with documents +broken into their smallest logical structural constituent parts, as +text objects, with their object citation number and all other +structural information needed to construct the structured document. +Text is stored (at this text object level) with and without elementary +markup tagging, the stripped version being there to facilitate +searching. + </text> + <ocn>194</ocn> +</object> +<object id="195"> + <text class="norm"> + Because the document structure of sites created is clearly defined, and +the text object citation system is available for all forms of output, +it is possible to search the sql database, and either read results from +that database, or just as simply map the results to the html output, +which has richer text markup. + </text> + <ocn>195</ocn> +</object> +<object id="196"> + <text class="norm"> + The combination of the <b>SiSU</b> citation system with a relational +database is pretty powerful, giving rise to several possibilities. As +individual text objects of a document are stored (and indexed) together +with object numbers, and all versions of the document have the same +numbering, complex searches can be tailored to return just the +locations of the search results relevant for all available output +formats, with live links to the precise locations in the database or in +html/xml documents; or, the structural information provided makes it +possible to search the full contents of the database and have headings +in which search content appears, or to search only headings etc. (as +the Dublin Core is incorporated it is easy to make use of that as +well). + </text> + <ocn>196</ocn> +</object> +<object id="197"> + <text class="norm"> + This is a larger scale project, (with development of the front +end largely ignored), though the "infrastructure" has been in place +since 2002. 
+ </text> + <ocn>197</ocn> +</object> +<object id="198"> + <text class="h6"> + 1.15.6 Search - database frontend sample, utilising database and SiSU +features, including object citation numbering (backend currently +PostgreSQL) + </text> + <ocn>198</ocn> +</object> +<object id="199"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org"> Sample search frontend </link> +<en>66</en> A small database and sample query front-end (search form) +that makes use of the citation system, <u>object citation numbering</u>, +to demonstrate functionality.<en>67</en> + </text> + <endnote notenumber="66"> + 66. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://search.sisudoc.org">http://search.sisudoc.org</link>> + </endnote> + <endnote notenumber="67"> + 67. (which could be extended further with current back-end). As regards +scaling of the database, it is as scalable as the database (here +Postgresql) and hardware allow. + </endnote> + <ocn>199</ocn> +</object> +<object id="200"> + <text class="norm"> + <b>SiSU</b> can provide information on which documents are matched and +at what locations within each document the matches are found. These +results are relevant across all outputs using object citation +numbering, which includes html, XML, LaTeX, PDF and indeed the SQL +database. You can then refer to one of the other outputs or in the SQL +database expand the text within the matched objects (paragraphs) in the +documents matched. + </text> + <ocn>200</ocn> +</object> +<object id="201"> + <text class="norm"> + (further work needs to be done on the sample search form, which is +rudimentary and only passes simple booleans correctly at present to the +SQL engine) + </text> + <ocn>201</ocn> +</object> +<object id="202"> + <text class="norm"> + A few canned searches, showing object numbers. 
Search for: + </text> + <ocn>202</ocn> +</object> +<object id="203"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=Linux%2BOR%2BDebian&lang=En&db=SiSU_sisu&view=index&a=1"> +English documents matching Linux OR Debian </link> + </text> + <ocn>203</ocn> +</object> +<object id="204"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=GPL%2BOR%2BRichard%2BStallman&lang=En&db=SiSU_sisu&view=index&a=1"> +GPL OR Richard Stallman </link> + </text> + <ocn>204</ocn> +</object> +<object id="205"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=invention%2BOR%2Binnovation&lang=En&db=SiSU_sisu&view=index&a=1"> +invention OR innovation in English language </link> + </text> + <ocn>205</ocn> +</object> +<object id="206"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=copyright&lang=En&db=SiSU_sisu&view=index&a=1"> +copyright in English language documents </link> + </text> + <ocn>206</ocn> +</object> +<object id="207"> + <text class="norm"> + Note that the searches done in this form are case sensitive. 
+ </text> + <ocn>207</ocn> +</object> +<object id="208"> + <text class="norm"> + Expand those same searches, showing the matching text in each document: + </text> + <ocn>208</ocn> +</object> +<object id="209"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=Linux%2BOR%2BDebian&lang=En&db=SiSU_sisu&view=text&a=1"> +English documents matching Linux OR Debian </link> + </text> + <ocn>209</ocn> +</object> +<object id="210"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=GPL%2BOR%2BRichard%2BStallman&lang=En&db=SiSU_sisu&view=text&a=1"> +GPL OR Richard Stallman </link> + </text> + <ocn>210</ocn> +</object> +<object id="211"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=invention%2BOR%2Binnovation&lang=En&db=SiSU_sisu&view=text&a=1"> +invention OR innovation in English language </link> + </text> + <ocn>211</ocn> +</object> +<object id="212"> + <text class="norm"> + <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://search.sisudoc.org?s1=copyright&lang=En&db=SiSU_sisu&view=text&a=1"> +copyright in English language documents </link> + </text> + <ocn>212</ocn> +</object> +<object id="213"> + <text class="norm"> + Note you may set results either for documents matched and object number +locations within each matched document meeting the search criteria; or +display the names of the documents matched along with the objects +(paragraphs) that meet the search criteria.<en>68</en> + </text> + <endnote notenumber="68"> + 68. of this feature when demonstrated to an IBM software innovations +evaluator in 2004 he said to paraphrase: this could be of interest to +us. 
We have large document management systems; you can search hundreds +of thousands of documents and we can tell you which documents meet your +search criteria, but there is no way we can tell you, without opening +each document, where within each your matches are found. + </endnote> + <ocn>213</ocn> +</object> +<object id="214"> + <text class="norm"> + <b>OCN index mode,</b> (object citation number) the numbers displayed +are relevant (and may be used to reference the match) in any sisu +generated rendition of the text;<en>69</en> the links provided are to +the locations of matches within the html generated by <b>SiSU</b>. + </text> + <endnote notenumber="69"> + 69. OCN are provided for HTML, XML, pdf ... though currently omitted in +plain-text and opendocument format output + </endnote> + <ocn>214</ocn> +</object> +<object id="215"> + <text class="norm"> + <b>Paragraph mode,</b> you may alternatively display the text of each +paragraph in which the match was made; again the object/paragraph +numbers are relevant to any <b>SiSU</b> generated/published text. + </text> + <ocn>215</ocn> +</object> +<object id="216"> + <text class="norm"> + Several options for output - select database to search, show results in +index view (links to locations within text), show results with text, +echo search in form, show what was searched, create and show a "canned +url" for search, show available search fields. Also shows counters: the +number of documents in which found and the number of locations within +documents where found. [could consider sorting by document with most +occurrences of the search result]. + </text> + <ocn>216</ocn> +</object> +<object id="217"> + <text class="norm"> + Earlier version of the search frontend - Simple search, results with +files in which search found, and locations where found within files. 
+ </text> + <ocn>217</ocn> +</object> +<object id="218"> + <text class="norm"> + Simple search, results with files in which search found, and text +object (paragraph or endnote) where found within files. + </text> + <ocn>218</ocn> +</object> +<object id="219"> + <text class="h6"> + 1.15.7 Other forms + </text> + <ocn>219</ocn> +</object> +<object id="220"> + <text class="norm"> + There are other forms as well: YAML file, <b>Ruby</b> Marshal dumps, +document pre-processing (processing of documents prior to the steps +described here, to produce input suitable for the program). Snap in a +new module as required/desired; well formed XML, no problem. + </text> + <ocn>220</ocn> +</object> +<object id="221"> + <text class="h5"> + 1.16 Concordance / Word Map or rudimentary index + </text> + <ocn>221</ocn> +</object> +<object id="222"> + <text class="norm"> + Concordance /WordMaps:<en>70</en> <b>SiSU</b> produces a rudimentary +index based on the words within the text, making use of paragraph +numbers to identify text locations. This is generated in html and +hyper-linked, but identifies these word locations in the other document +formats. Though it is possible to search using a search engine, this is +a means for browsing an alphabetical list of words which may suggest +other useful content. + </text> + <endnote notenumber="70"> + 70. Concordance/ WordMaps introduced 15<sup>th</sup> August 2002 + </endnote> + <ocn>222</ocn> +</object> +<object id="223"> + <text class="h5"> + 1.17 Managed (document) directory, database, or site structure + </text> + <ocn>223</ocn> +</object> +<object id="224"> + <text class="norm"> + <b>SiSU</b> builds the web site (or more generically provides a +suitable directory structure) - placing various output texts in the +hierarchy of the web-site (or db), which (for directories) is a +sub-directory with the name of the text file. 
+ </text> + <ocn>224</ocn> +</object> +<object id="225"> + <text class="h5"> + 1.18 Batch processing + </text> + <ocn>225</ocn> +</object> +<object id="226"> + <text class="norm"> + <b>SiSU</b> is a batch processing tool, handling and transforming +multiple (or individual) documents (in many ways) with a single +instruction. + </text> + <ocn>226</ocn> +</object> +<object id="227"> + <text class="h5"> + 1.19 Integration to superior Gnu/Linux and Unix tools + </text> + <ocn>227</ocn> +</object> +<object id="228"> + <text class="norm"> + As will have been noted from the above description of <b>SiSU</b>, it +makes use of existing programs found on <b>Gnu</b>/Linux and Unix; +those already mentioned include the LaTeX to pdf converters and +the database PostgreSQL or SQLite. + </text> + <ocn>228</ocn> +</object> +<object id="229"> + <text class="h6"> + 1.19.1 Backup and version control + </text> + <ocn>229</ocn> +</object> +<object id="230"> + <text class="norm"> + Unix provides many tools for version control. For documents Subversion, +CVS and even the old RCS are useful for the per-document histories they +provide. + </text> + <ocn>230</ocn> +</object> +<object id="231"> + <text class="norm"> + For writing code superior (more recent) version control systems exist. +These can also be used for documents though they tend to take stamps of +changes across the repository as a whole, rather than for each +individual file that is tracked, (as CVS and RCS do). My personal +preference is for distributed systems such as Git, Mercurial or Darcs, +of which I use Git for both code and documents. + </text> + <ocn>231</ocn> +</object> +<object id="232"> + <text class="norm"> + Several backup tools exist. At the base level I tend to use rdiff. 
+ </text> + <ocn>232</ocn> +</object> +<object id="233"> + <text class="h6"> + 1.19.2 Editor support + </text> + <ocn>233</ocn> +</object> +<object id="234"> + <text class="norm"> + <b>SiSU</b> documents are prepared / marked up in utf-8 text; <u>you are +free to use the text editor of your choice.</u> + </text> + <ocn>234</ocn> +</object> +<object id="235"> + <text class="norm"> + Syntax highlighting is provided for a number of editors, amongst them +Vim, Kwrite, Kate, Gedit and diakonos. These may be found with +configuration instructions at <<link +xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.jus.uio.no/sisu/syntax_highlight">http://www.jus.uio.no/sisu/syntax_highlight</link>>. +<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" +xlink:href="http://www.vim.org/"> Vim </link> <en>71</en> as of version +7 has built in syntax highlighting for <b>SiSU</b>. + </text> + <endnote notenumber="71"> + 71. <<link xmlns:xlink="http://www.w3.org/1999/xlink" +xlink:type="simple" +xlink:href="http://www.vim.org/">http://www.vim.org/</link>> + </endnote> + <ocn>235</ocn> +</object> +<object id="236"> + <text class="h5"> + 1.20 Modular design, need something new add a module + </text> + <ocn>236</ocn> +</object> +<object id="237"> + <text class="norm"> + Need a new output format that does not already exist? Write a new +module. + </text> + <ocn>237</ocn> +</object> +<object id="238"> + <text class="norm"> + Prefer a new input syntax? You could write a new syntax matching the +existing design, though my personal preference is some uniformity in +entry appearance. If necessary, it has been fairly easy to extend the +design parameters. It is intended to incorporate some additional basic +semantic tagging, (book, article, author etc.). However, keeping the +requirements for input minimal and relatively simple has been a design +goal. 
+ </text> + <ocn>238</ocn> +</object> +<object id="0"> + <text class="h4"> + Endnotes + </text> + <ocn>0</ocn> +</object> +</body> +</document> |