From 0e6fc15ada3c5d9a86b227163f35a54993b32529 Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Tue, 2 Dec 2008 23:54:23 -0500
Subject: sisu harvest, introduce module along with header syntax addition &
 modification

* sisu markup, additional header and new format rule:

  * @creator: / @author: header field, introduced author name format rules
    for more usable metadata harvesting: surname comma other names,
    additional authors separated by semi-colon

  * param added meta-tag, @topic_register: formatting: topic levels are
    separated from sub-levels by a colon; a semi-colon separates main
    topics; if there are multiple topics at lowest sub-level, a pipe can be
    used to create multiple headings

* harvest module, harvests metadata from document set
  currently extracts:
  (i) authors and their writings from document set;
  (ii) topics and associated writings from document set (topics use
  topic_register header).

  harvest (when run against documents common to a directory of a site)
  extracts metadata and organises the documents on a site by author and
  topic information provided (there is a new "topic_register" header, with
  formatting rules similar to those of the book index), results are placed
  in [output_path]/sisu_site_metadata.

    sisu --harvest *.sst

  * by author (see change in param @creator: / @author: header field)

  * by topic / subject index (see addition in param of @topic_register:
    header field)

  initially there should be example samples here:
    http://www.jus.uio.no/sisu/sisu_site_metadata/harvest_authors.html
    http://www.jus.uio.no/sisu/sisu_site_metadata/harvest_topics.html
  together with updated markup source files

  The authors and their writings list will be made to take on a more
  bibliographical form, with the use of additional fields as required.
  (concept example, suitable for medium sized sites [to remove size
  constraint: implement SQL equivalent])

  make feature more robust

* css, for harvest output added

* remote placement of sisu_site_metadata (output produced by metadata
  harvest)

* sisu markup, update document samples accordingly

* tidy copyright marks in program headers, remove repetition of dates

[version bump because formatting rule introduced to author / creator header
- where new site metadata harvest feature is used, (at present changes
should not be noticed except when using metadata harvest)]
---
 data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

(limited to 'data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst')

diff --git a/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst b/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst
index 09a8f427..292bfe13 100644
--- a/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst
+++ b/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_markup.sst
@@ -103,7 +103,7 @@ With SiSU installed sample skins may be found in: /usr/share/doc/sisu/sisu_marku
 
 1~headers Markup of Headers
 
-Headers consist of semantic meta-data about a document, which can be used by any output module of the program; and may in addition include extra processing instructions.
+Headers contain either: semantic meta-data about a document, which can be used by any output module of the program, or; processing instructions.
 
 Note: the first line of a document may include information on the markup version used in the form of a comment.
Comments are a percentage mark at the start of a paragraph (and as the first character in a line of text) followed by a space and the comment:
 
@@ -125,7 +125,9 @@ code{
 
 @subtitle: Markup
 
-@creator: Ralph Amissah
+@creator: Amissah, Ralph
+
+% note formatting on author / creator field, surname comma then other names, if more than one author separate by semi-colon
 
 @rights: Copyright (C) Ralph Amissah 2007, part of SiSU documentation, License GPL 3
 
@@ -133,6 +135,10 @@ code{
 
 @subject: ebook, epublishing, electronic book, electronic publishing, electronic document, electronic citation, data structure, citation systems, search
 
+@topic_register: text markup language; application:text processing;output:html|xml|latex|pdf|sql
+
+% note formatting for topic_register topic levels are separated by a colon, a semi-colon separates main topics
+
 @date.created: 2002-08-28
 
 @date.issued: 2002-08-28
--
cgit v1.2.3
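
Illustration of the two header format rules introduced by this commit
(surname comma other names, authors separated by semi-colon; topic levels
separated by colon, main topics by semi-colon, multiple lowest-level
headings by pipe). This is only a minimal sketch in Python: it is not the
harvest module from this commit, and the function names parse_creator and
parse_topic_register are hypothetical, chosen for the example. How the real
module treats edge cases such as a name without a comma is not stated in
the commit message, so the handling below is an assumption.

```python
def parse_creator(value):
    """Split a @creator: / @author: value into (surname, other_names) pairs.

    Assumed rule (from the commit message): surname, comma, other names;
    additional authors separated by semi-colons.
    """
    authors = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Assumption: an entry without a comma is treated as surname only.
        surname, _, other = entry.partition(",")
        authors.append((surname.strip(), other.strip()))
    return authors


def parse_topic_register(value):
    """Expand a @topic_register: value into a list of topic paths (tuples).

    Assumed rule (from the commit message): semi-colons separate main
    topics, colons separate a level from its sub-level, and pipes at the
    lowest level create multiple headings under the same parent.
    """
    paths = []
    for topic in value.split(";"):
        levels = [level.strip() for level in topic.split(":") if level.strip()]
        if not levels:
            continue
        *parents, last = levels
        for heading in last.split("|"):
            heading = heading.strip()
            if heading:
                paths.append(tuple(parents) + (heading,))
    return paths


# Values taken from the sample document changed in the diff above.
print(parse_creator("Amissah, Ralph"))
# [('Amissah', 'Ralph')]
print(parse_topic_register("application:text processing;output:html|xml"))
# [('application', 'text processing'), ('output', 'html'), ('output', 'xml')]
```

Returning one full path per heading keeps the pipe expansion explicit, which
is convenient if the paths are later grouped into an index by topic.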