From a72e66db913de3a2e508080c8b1fc8d1342a899b Mon Sep 17 00:00:00 2001
From: Ralph Amissah <ralph@amissah.com>
Date: Tue, 25 Sep 2007 23:23:03 +0100
Subject: remove generated output from main package

---
 .../sisu_manual/sisu_description/plain.txt         | 1566 --------------------
 1 file changed, 1566 deletions(-)
 delete mode 100644 data/doc/manuals_generated/sisu_manual/sisu_description/plain.txt

diff --git a/data/doc/manuals_generated/sisu_manual/sisu_description/plain.txt b/data/doc/manuals_generated/sisu_manual/sisu_description/plain.txt
deleted file mode 100644
index a2a490e2..00000000
--- a/data/doc/manuals_generated/sisu_manual/sisu_description/plain.txt
+++ /dev/null
@@ -1,1566 +0,0 @@
-SISU - DESCRIPTION,
-RALPH AMISSAH
-**********************************
-
-SISU AN ATTEMPT TO DESCRIBE
-===========================
-
-1. DESCRIPTION
---------------
-
-1.1 OUTLINE
-...........
-
-*SiSU* is a flexible document preparation, generation, publishing and search
-system.[^1]
-
-
-- [1]: This information was first placed on the web 12 November 2002; with
- predating material taken from
- <http://www.jus.uio.no/lm/lm.information/toc.html> part of a site started and
- developed since 1993. See document metadata section
- <http://www.jus.uio.no/sisu/SiSU/metadata.html> for information on this
- version. Dates related to the development of *SiSU* are mostly contained
- within the Chronology section of this document, e.g.
- <http://www.jus.uio.no/sisu/sisu_chronology>
-
-*SiSU* ("*SiSU* information Structuring Universe" or "Structured information,
-Serialized Units"),[^2] is a Unix command line oriented framework for document
-structuring, publishing and search, featuring minimalistic markup, multiple
-standard outputs, a common citation system, and granular search.
-
-
-- [2]: also chosen for the meaning of the Finnish term "sisu".
-
-Using markup applied to a document, *SiSU* can produce plain text, HTML, XHTML,
-XML, OpenDocument, LaTeX or PDF files, and populate an SQL database with
-objects[^3] (equating generally to paragraph-sized chunks) so searches may be
-performed and matches returned with that degree of granularity (e.g. your
-search criteria are met by these documents and at these locations within each
-document). Document output formats share a common object numbering system for
-locating content. This is particularly suitable for "published" works
-(finalized texts as opposed to works that are frequently changed or updated)
-for which it provides a fixed means of reference of content.
-
-
-- [3]: objects include: headings, paragraphs, verse, tables, images, but not
- footnotes/endnotes which are numbered separately and tied to the object from
- which they are referenced.
-
-*SiSU* is the data/information structuring and transforming tool, that has
-resulted from work on one of the oldest law web projects. It makes possible the
-one time, simple human readable markup of documents, that *SiSU* can then
-publish in various forms, suitable for paper[^4], web[^5] and relational
-database[^6] presentations, retaining common data-structure and
-meta-information across the output/presentation formats. Several requirements
-of legal and scholarly publication on the web have been addressed, including
-the age old need to be able to reliably cite/pinpoint text within a document,
-to easily make footnotes/endnotes, to allow for semantic document meta-tagging,
-and to keep required markup to a minimum. These and other features of interest
-are listed and described below. A few points are worth making early (and will
-be repeated a number of times):
-
-
-- [4]: pdf via LaTeX or lout
-
-- [5]: currently html (two forms of html presentation one based on css the other on
- tables), and /PHP/; potentially structured XML
-
-- [6]: any SQL - currently PostgreSQL and /sqlite/ (for portability, testing and
- development)
-
-  (i) The *SiSU* document generator was the first to place material on the web
-  with a system that makes possible citation across different document types,
-  using paragraph, or rather object citation numbering,[^7] a text positioning
-  system available for the pinpointing of text (1997). It is a simple idea of
-  much benefit, and *SiSU* remains today, to the best of my knowledge, the only
-  multiple format e-book/ electronic-document system on the web that gives you
-  this possibility (including for relational databases).
-
-
-- [7]: previously called "text object numbering"
-
-  (ii) Markup is done once for the multiple formats produced.
-
-
-  (iii) Markup is simple, and human readable (with a little practice), in
-  almost all cases there is less and simpler markup required than basic html.
-  In any event the markup required is very much simpler than the html, LaTeX,
-  [lout], structured XML, ODF (OpenDocument), PostgreSQL or SQLite feed etc.
-  that you can have *SiSU* generate for you.
-
-
-  (iv) *SiSU* is a batch processor, dealing with as many files as you need to
-  generate at a time.
-
-
-  (v) Scalability is dependent on your file system (in my case Reiserfs), the
-  database (currently Postgresql and/or SQLite) and your hardware.
-
-
-*SiSU* Sabaki[^8] (or just *SiSU*) is the provisional name given to the
-software described here that helps structure documents for web and other
-publication. The name *SiSU* is a loose acronym for something along the
-lines of */"SiSU is structuring unit"/*, or
-/"*SiSU*, information structuring unit"/, or
-*/"simple - information structuring unit"/*, or the more descriptive
-/"Structured information, Serialized Units"/, or, for what it may be directed
-towards, /"*semantic* and *information structuring universe*"/,[^9] tongue in
-cheek, only just. Guess I'll get away with */"Simple - information Structuring
-Universe"/*. *SiSU* is also a Finnish word roughly meaning guts, inner strength
-and perseverance.[^10]
-
-
-- [8]: *SiSU* Sabaki, release version. Pre-release version *SiSU* Scribe, and
- version prior to that *SiSU* nicknamed Scribbler. Pre-release versions go back
- several years. Both Scribbler and Scribe (still maintained) made system calls
- to *SiSU*'s various parts, instead of using libraries.
-
-- [9]: A little universe it may be, but semantic you may have a hard time getting
- away with, given the meaning the word has taken on with markup. On a document
- wide basis semantic information may be provided, which can be really useful,
- (and meaningful, especially) if you have a large document set, and use this
- with rss feeds or in an sql database etc. On a markup level, I have little
- inclination to add semantic markup formally beyond references, title, author
- [Dublin Core entities? addresses?] etc. Actually this deserves a bit of
- thought possibly use letter tags (including letter alias/synonyms for font
- faces) to create a small set of default semantic tags, with the possibility
- for per document adjustments. Will seek to permit XML entity tagging, within
- *SiSU* markup and have that ignored/removed by the parts of the program that
- have no use for it.
-
-- [10]: "Sisu refers not to the courage of optimism, but to a concept of life that
- says, 'I may not win, but I will gladly give my life for what I believe.'"
- Aini Rajanen, Of Finnish Ways, 1981, p. 10.
-
-- <http://www.humanlanguages.com/finnishenglish/rlfs.htm>
-
-- "Every Finn has his own pet definition. To me, sisu means patience without
- passion. But there are many varieties of sisu. Sisu can be a sudden outburst
- or it can be the kind that lasts. A man can have both kinds. It is outside
- reason. It is something in the soul. It comes from oneself. For instance, it
- makes a soldier do things because he himself must, not because he has been
- told." Paavo Nurmi
-
-- <http://personalweb.smcvt.edu/tmatikainen/finnishtraditions.htm>
-
-*SiSU* was born of the need to find a way, with minimal effort, and for as wide
-a range of document types as possible, to produce high quality publishing
-output in a variety of document formats. As such it was necessary to find a
-simple document representation that would work across a large number of
-document types, and the most convenient way(s) to produce acceptable output
-formats. The project leading to this program was started in 1993 (together with
-the trade law project now known as Lex Mercatoria) as an investigation of how
-to effectively/efficiently place documents on the web. The unified document
-handling, together with features such as paragraph numbering, endnote handling
-and tables... appeared in 1996/97. *SiSU* was originally written in Perl,[^11]
-and converted to *Ruby*, [^12] in 2000, one of the most impressive programming
-languages in existence! In its current form it has been written to run on the
-*Gnu* /Linux platform, and in particular on *Debian*, [^13] taking advantage of
-many of the wonderful projects that are available there.
-
-
-- [11]: <http://www.perl.org/>
-
-- [12]: <http://www.ruby-lang.org/en/>
-
-- [13]: <http://www.debian.org/>
-
-*SiSU* markup is based on requiring the minimum markup needed to determine the
-structure of a document. (This can be as little as saying in a header to look
-for the word Book at a specified level and the word Chapter at another level).
-*SiSU* then breaks a document into its smallest parts (at a heading, and
-paragraph level) while retaining all structural information. This break up of
-the document and information on its structure is taken advantage of in the
-transformations made in generating the very different output types that can be
-created, and in providing as much as can be for what each output type is best
-at doing, e.g. LaTeX (professional document typesetting, easy conversion to pdf
-or Postscript), XML (in this case, structural representation), ODF
-(OpenDocument [experimental]), SQL (e.g. document search; representing
-constituent parts of documents based on their structure, headings, chapters,
-paragraphs as required; user control).[^14]
-
-
-- [14]: where explicit structure is provided through the use of tagging headings,
- it could be reduced (still) further, for example by reducing the number of
- characters used to identify heading levels; but in many cases even that
- information is not required as regular expressions can be used to extract the
- implicit structure.
-
-From markup that is simpler and more sparse than html you get:
-
-
-* far greater output possibilities, including html, XML, ODF (OpenDocument),
-LaTeX (pdf), and SQL;
-
-
-* the advantages implicit in the very different output possibilities;
-
-
-* a common citation system (for all outputs - including the relational
-database, search results are relevant for all outputs);
-
-
-For more see the short summary of features provided below.
-
-
-*SiSU* processes files with minimal tagging to produce various document outputs
-including html, LaTeX or lout (which is converted to pdf) and if required loads
-the structured information into an SQL database (PostgreSQL and SQLite have
-been used for this). *SiSU* produces an intermediate processing format.[^15]
-
-
-- [15]: This proved to be the easiest way to develop syntax, changes could be made,
- or alternatives provided for the markup syntax whilst the intermediate markup
- syntax was largely held constant. There is actually an optional second
- intermediate markup format in YAML <http://www.yaml.org/>
-
-*SiSU* is used in constructing Lex Mercatoria <http://lexmercatoria.org/> or
-<http://www.jus.uio.no/lm/> (one of the oldest law web sites), and considerable
-thought went into producing output that would be suitable for legal and
-academic writings (that do not have formulae) given the limitations of html,
-and publication in a wide variety of "formats", in particular in relation to
-the convenient and accurate citation of text. However, the construction of Lex
-Mercatoria uses only a fraction of the features available from *SiSU* today,
-/vis/ generation of flat file structures, rather than in addition the building
-of ("granular") SQL database content, (at an object level with relevant
-relational tables, and other outputs also available).
-
-
-1.2 SHORT SUMMARY OF FEATURES
-.............................
-
-*(i)* markup syntax: (a) simpler than html, (b) mnemonic, influenced by
-mail/messaging/wiki markup practices, (c) human readable, and easily writable,
-
-
-*(ii)* (a) minimal markup requirement, (b) single file marked up for multiple
-outputs,
-
-
-notes:
-
-
-* documents are prepared in a single UTF-8 file using a minimalistic mnemonic
-syntax. Typical literature, documents like "War and Peace" require almost no
-markup, and most of the headers are optional.
-
-
-* markup is easily readable/parsed by the human eye, (basic markup is simpler
-and more sparse than the most basic html), [this may also be converted to XML
-representations of the same input/source document].
-
-
-* markup defines document structure (this may be done once in a header
-pattern-match description, or for heading levels individually); basic text
-attributes (bold, italics, underscore, strike-through etc.) as required; and
-semantic information related to the document (header information, extended
-beyond the Dublin core and easily further extended as required); the headers
-may also contain processing instructions.
-
-
-*(iii)* (a) multiple outputs primarily industry established and institutionally
-accepted open standard formats, include amongst others: plaintext (UTF-8);
-html; (structured) XML; ODF (OpenDocument text); LaTeX; PDF (via LaTeX); SQL
-type databases (currently PostgreSQL and SQLite). Also produces: concordance
-files; document content certificates (md5 or sha256 digests of headings,
-paragraphs, images etc.) and html manifests (and sitemaps of content). (b)
-takes advantage of the strengths implicit in these very different output types,
-(e.g. PDFs produced using typesetting of LaTeX, databases populated with
-documents at an individual object/paragraph level, making possible granular
-search (and related possibilities))
-
-
-*(iv)* outputs share a common numbering system (dubbed "object citation
-numbering" (ocn)) that is meaningful (to man and machine) across various
-digital outputs whether paper, screen, or database oriented, (PDF, html, XML,
-sqlite, postgresql), this numbering system can be used to reference content.
-
-
-*(v)* SQL databases are populated at an object level (roughly headings,
-paragraphs, verse, tables) and become searchable with that degree of
-granularity, the output information provides the object/paragraph numbers which
-are relevant across all generated outputs; it is also possible to look at just
-the matching paragraphs of the documents in the database; [output indexing also
-works well with search indexing tools like hyperestraier].
-
-
-*(vi)* use of semantic meta-tags in headers permits the addition of semantic
-information on documents, (the available fields are easily extended)
-
-
-*(vii)* creates organised directory/file structure for (file-system) output,
-easily mapped with its clearly defined structure, with all text objects
-numbered, you know in advance where in each document output type, a bit of text
-will be found (e.g. from an SQL search, you know where to go to find the
-prepared html output or PDF etc.)... there is more; easy directory management
-and document associations, the document preparation (sub-)directory may be used
-to determine output (sub-)directory, the skin used, and the SQL database used,
-
-
-*(viii)* "Concordance file" wordmap, consisting of all the words in a document
-and their (text/ object) locations within the text, (and the possibility of
-adding vocabularies),
-
-
-*(ix)* document content certification and comparison considerations: (a) the
-document and each object within it stamped with an md5 hash making it possible
-to easily check or guarantee that the substantive content of a document is
-unchanged, (b) version control, documents integrated with time based source
-control system, default RCS or CVS with use of $Id: sisu_description.sst,v 1.25
-2007/08/23 12:22:36 ralph Exp $ tag, which *SiSU* checks
-
-
-*(x)* *SiSU*'s minimalist markup makes for meaningful "diffing" of the
-substantive content of markup-files,
-
-
-*(xi)* easily skinnable, document appearance on a project/site wide, directory
-wide, or document instance level easily controlled/changed,
-
-
-*(xii)* in many cases a regular expression may be used (once in the document
-header) to define all or part of a document's structure obviating or reducing
-the need to provide structural markup within the document,
-
-
-*(xiii)* prepared files may be batch processed, documents produced are static
-files so this needs to be done only once but may be repeated for various
-reasons as desired (updated content, addition of new output formats, updated
-technology document presentations/representations)
-
-
-*(xiv)* possible to pre-process, which permits: the easy creation of standard
-form documents, and templates/term-sheets, or; building of composite documents
-(master documents) from other sisu marked up documents, or marked up parts,
-i.e. import documents or parts of text into a main document should this be
-desired
-
-
-there is a considerable degree of future-proofing, output representations are
-"upgradeable", and new document formats may be added.
-
-
-*(xv)* there is a considerable degree of future-proofing, output
-representations are "upgradeable", and new document formats may be added: (a)
-modular, (thanks in no small part to *Ruby*) another output format required,
-write another module.... (b) easy to update output formats (e.g. html, XHTML,
-LaTeX/PDF produced can be updated in program and run against whole document
-set), (c) easy to add, modify, or have alternative syntax rules for input,
-should you need to,
-
-
-*(xvi)* scalability, dependent on your file-system (ext3, Reiserfs, XFS,
-whatever) and on the relational database used (currently Postgresql and
-SQLite), and your hardware,
-
-
-*(xvii)* only marked up files need be backed up, to secure the larger document
-set produced,
-
-
-*(xviii)* document management,
-
-
-*(xix)* Syntax highlighting for *SiSU* markup is available for a number of text
-editors.
-
-
-*(xx)* remote operations: (a) run *SiSU* on a remote server, (having prepared
-sisu markup documents locally or on that server, i.e. this solution where sisu
-is installed on the remote server, would work whatever type of machine you
-chose to prepare your markup documents on), (b) generated document outputs may
-be posted by sisu to remote sites (using rsync/scp), (c) document source
-(plaintext utf-8) if shared on the net may be identified by its url and
-processed locally to produce the different document outputs.
-
-
-*(xxi)* document source may be bundled together (automatically) with associated
-documents (multiple language versions or master document with inclusions) and
-images and sent as a zip file called a sisupod, if shared on the net these too
-may be processed locally to produce the desired document outputs, these may be
-downloaded, shared as email attachments, or processed by running sisu against
-them, either using a url or the filename.
-
-
-*(xxii)* for basic document generation, the only software dependency is *Ruby*,
-and a few standard Unix tools (this covers plaintext, html, XML, ODF, LaTeX).
-To use a database you of course need that, and to convert the LaTeX generated
-to PDF, a LaTeX processor like tetex or texlive.
-
-
-as a developer's tool it is flexible and extensible
-
-
-*SiSU* was developed in relation to legal documents, and is strong across a
-wide variety of texts (law, literature...). *SiSU* handles images but is not
-suitable for formulae/ statistics, or for technical writing at this time.
-
-
-*SiSU* has been developed and has been in use for several years. Requirements
-to cover a wide range of documents within its use domain have been explored.
-
-
-Some modules are more mature than others, the most mature being Html and LaTeX
-/ pdf. PostgreSQL and search functions are useable and together with /ocn/
-unique (to the best of my knowledge). The XML output document set is "well
-formed" but largely proof of concept.
-
-
-1.3 HOW IT WORKS
-................
-
-*SiSU* markup is fairly minimalistic, it consists of: a (largely optional)
-document header, made up of information about the document (such as when it was
-published, who authored it, and granting what rights) and any processing
-instructions; and markup within text which is related to document structure and
-typeface. *SiSU* must be able to discern the structure of a document, (text
-headings and their levels in relation to each other), either from information
-provided in the instruction header or from markup within the text (or from a
-combination of both). Processing is done against an abstraction of the document
-comprising information on the document's structure and its objects,[^16]
-which the program serializes (providing the object numbers) and which are
-assigned hash sum values based on their content. This abstraction of
-information about document structure, objects, (and hash sums), provides
-considerable flexibility in representing documents in different ways and for
-different purposes (e.g. search, document layout, publishing, content
-certification, concordance etc.), and makes it possible to take advantage of
-some of the strengths of established ways of representing documents, (or indeed
-to create new ones).
-
-
-- [16]: objects include: headings, paragraphs, verse, tables, images, but not
- footnotes/endnotes which are numbered separately and tied to the object from
- which they are referenced.
-
-1.4 SIMPLE MARKUP
-.................
-
-*SiSU* markup is based on requiring the minimum markup needed to determine the
-structure of a document. (This can be as little as saying in a header to look
-for the word Book at a specified level and the word Chapter at another level).
-*SiSU* then breaks a document into its smallest parts (at a heading, and
-paragraph level) while retaining all structural information. This break up of
-the document and information on its structure is taken advantage of in the
-transformations made in generating the very different output types that can be
-created, and in providing as much as can be for what each output type is best
-at doing, e.g. LaTeX (professional document typesetting, easy conversion to pdf
-or Postscript), XML (in this case, structural representation), ODF
-(OpenDocument), SQL (e.g. document search; representing constituent parts of
-documents based on their structure, headings, chapters, paragraphs as required;
-user control).[^17]
-
-
-- [17]: where explicit structure is provided through the use of tagging headings,
- it could be reduced (still) further, for example by reducing the number of
- characters used to identify heading levels; but in many cases even that
- information is not required as regular expressions can be used to extract the
- implicit structure.
-
-1.4.1 SPARSE MARKUP REQUIREMENT, TRY TO GET THE MOST OUT OF MARKUP
-..................................................................
-
-One of its strengths is that very little initial tagging is required
-for the program to generate its output.
-
-
-This is a basic markup example:
-
-
-* basic markup example, text file - an international convention [link:]
-<http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst>
-[^18]
-
-
-- [18]: <http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst>
- output provided as example in the next section
-
-* view basic markup, as it would be highlighted by vim editor [link:]
-<http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html>
-[^19]
-
-
-- [19]: <http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html>
- as it would appear with syntax highlighting (by vim)
-
-Emphasis has been on simplicity and minimalism in markup requirements. Design
-philosophy is to try to keep the amount of markup required low, for whatever has
-been determined to be acceptable output.[^20]
-
-
-- [20]: seems there are several "smart ASCIIs" available, primarily for ascii to
- html conversion, that make this, and reasonable looking ascii their goal
-
-- <http://webseitz.fluxent.com/wiki/SmartAscii>
-
-- <http://daringfireball.net/projects/markdown/>
-
-- <http://www.textism.com/tools/textile/>
-
-*SiSU*'s markup is more minimalistic and simpler than (the equivalent) html and
-for it, you get considerably more than just html, as this preparation gives you
-all available output formats, upon request.
-
-
-1.4.2 SINGLE MARKUP FILE PROVIDES MULTIPLE OUTPUT FORMATS
-.........................................................
-
-For each document, there is only one (input, minimalistically marked up) file
-from which all the available output types are generated.[^21]
-
-
-- [21]: These include richly laid out and linked html (table or css variants),
- /PHP/, LaTeX (from which pdf portrait and landscape documents are produced),
- texinfo (for info files etc.), and PostgreSQL and/or SQLite. And the
- opportunity to fairly easily build additional modules, such as XML. See the
- examples provided in this document.
-
-Eg. the markup example:
-
-
-* original text file - an international convention [link:]
-<http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst>
-[^22]
-
-
-- [22]: <http://www.jus.uio.no/sisu/sample/markup/un_contracts_international_sale_of_goods_convention_1980.sst>
-
-* view as syntax would be highlighted by vim editor [link:]
-<http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html>
-[^23]
-
-
-- [23]: <http://www.jus.uio.no/sisu/sample/syntax/un_contracts_international_sale_of_goods_convention_1980.sst.html>
-
-Produces the following output:
-
-
-* Segmented html version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/toc.html>
-[^24]
-
-
-- [24]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/toc.html>
-
-* Full length html document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/doc.html>
-[^25]
-
-
-- [25]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/doc.html>
-
-* pdf landscape version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/landscape.pdf>
-[^26]
-
-
-- [26]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/landscape.pdf>
-
-* pdf portrait version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/portrait.pdf>
-[^27]
-
-
-- [27]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/portrait.pdf>
-
-* clean text ascii version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/plain.txt>
-[^28]
-
-
-- [28]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/plain.txt>
-
-* /xml/ sax version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/sax.xml>
-[^29]
-
-
-- [29]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/sax.xml>
-
-* /xml/ dom version of document [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/dom.xml>
-[^30]
-
-
-- [30]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/dom.xml>
-
-* Concordance [link:]
-<http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/concordance.html>
-[^31]
-
-
-- [31]: <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980/concordance.html>
-
-(and in addition to these: PostgreSQL, SQLite, texinfo and <del>YAML</del>
-[^32] versions if desired)
-
-
-- [32]: discontinued for the time being
-
-1.4.3 SYNTAX RELATIVELY EASY TO READ AND REMEMBER
-.................................................
-
-Syntax is kept simple and mnemonic.[^33]
-
-
-- [33]: *SiSU* markup syntax, an incomplete summary:
- <http://www.jus.uio.no/sisu/sisu_markup_table/doc.html#h200306>
-
-- Visual check of elementary font face modifiers: *bold* *bold*
- <em>emphasis</em> /italics/ _underscore_ <del>strikethrough</del>
- ^superscript^ [subscript]
-
-1.4.4 KEPT SIMPLE BY HAVING A LIMITED PUBLISHING FEATURE SET, AND FEATURES
-IDENTIFIED AS MOST IMPORTANT, ARE AVAILABLE ACROSS SEVERAL DOCUMENT TYPES
-..............................................................................
-
-To keep *SiSU* markup sparse and simple *SiSU* deliberately provides a limited
-publishing feature set, including: indent levels; bold; italics; superscript;
-subscript; simple tables; images; tables of contents and; endnotes. Which in
-most cases are available across the different output formats.
-
-
-The publishing feature set may be expanded as required.
-
-
-1.5 DESIGNED WITH USABILITY IN MIND
-...................................
-
-Output is designed to be uniform, easy to read, navigate and cite.
-
-
-1.6 CODE SEPARATE FROM CONTENT
-..............................
-
-Code[^34] is separated from content. This means that when changes are desired
-in the output presentation, the code that produces them, and not the marked up
-text data set (which could be thousands of documents) is modified. Separating
-code from content makes large scale changes to output appearance trivial, and
-permits the easy addition of new output modules.
-
-
-- [34]: the program that generates the documents
-
-1.7 OBJECT CITATION NUMBERING, A TEXT OR OBJECT POSITIONING / CITATION SYSTEM -
-"PARAGRAPH" (OR TEXT OBJECT) NUMBERING, THAT REMAINS SAME AND USABLE ACROSS ALL
-OUTPUT FORMATS BY PEOPLE AND MACHINE
-..............................................................................
-
-Object citation numbering is a simple object (text) positioning and citation
-system that is human relevant and machine useable, used by *SiSU* for all
-manner of presentations, and that is available for use in all text mappings. It
-is based on the automated sequential numbering of objects (roughly paragraphs,
-(headings, tables, verse) or other blocks of text or images etc.). The text
-positioning system (in which I claim copyright) is invaluable for publishing
-requiring the citing of text across multiple output formats, and for the general
-mapping of text within a document:
-
-
-* in html, html not being easily citeable (change font size, or use a different
-browser and the page on which specific text appears has changed), and
-
-
-* across multiple formats being common to all output formats html/xml/pdf/sql
-output,
-
-
-* the results of an sql search can just be "live" citation references to the
-documents in which the text is found, much like an index (see image examples
-provided). [link:] <http://www.jus.uio.no/sisu/SiSU/1.html#search> [^35]
-
-
-- [35]: <http://www.jus.uio.no/sisu/SiSU/1.html#search>
-
-I claim copyright on the system I use which is the most basic of all, numbering
-all text in headings and paragraphs sequentially (with tables and images being
-treated as a single paragraph) and only footnotes/endnotes not following this
-numbering, as their position in text is not strictly determined, (a change from
-footnotes to endnotes would change their numbering), footnotes instead "belong"
-to the paragraph from which they are referenced, and have sequential numbers of
-their own.
-
-
-*SiSU* has a paragraph numbering system, that remains the same regardless of
-the output format. This provides an effective means of citation, pinpointing
-text accurately in all output formats, using the same reference. This is
-particularly useful where text has to be located across different output
-formats - for example once html is printed, the number of pages and the pages
-on which given text is found will vary depending on the browser, its settings,
-the font size etc. Similarly *SiSU* produces pdf in different forms, eg. on
-the example site Lex Mercatoria as portrait and landscape documents - here too
-page numbering varies, but paragraph numbering is the same, /vis a vis/ all
-versions of the text (portrait and landscape pdf and the html versions of the
-text, and as stored (with "paragraphs" as records) to the PostgreSQL or SQLite
-database).
-
-
-These numbers are placed in the text margins and are intended to be independent
-of and not to interfere with authors' tagging. The citation system (object
-citation numbering, or automated "paragraph numbering") is automatically
-generated and is common and identical across all document formats. The
-paragraph numbering system is more accurately described as a (text) object
-numbering system, as headings are also numbered: all headings and paragraphs
-are numbered sequentially. Endnotes are automatically numbered independently
-and rather "belong" to the paragraph from which they are referenced, as an
-endnote does not (necessarily) form a part of a document's sequence. They may
-be produced as either endnotes or footnotes, or both, depending on what output
-you choose to look at: in the segmented html version of the document provided
-as an example, the endnotes are placed both at the end of each section and in
-a separate section of their own called endnotes, and these are hyper-linked.
-An attractive feature of providing citation numbering in this way is that it
-is independent of the document structure: it remains the same regardless of
-what is done about the document structure.
-
-
-The rules have been kept very simple: unique incremental object citation
-numbers are assigned to headings, paragraphs, verse, tables and images. It is
-possible to manually override this feature on a per heading or comment basis,
-though this should be used exceptionally. It may be of use where there is a
-substantive text and a minor comment added by the publisher that should not
-be mapped as part of the text.
-
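The numbering rules above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the code *SiSU* itself uses:
the class name, the kind labels and the "comment" kind (standing in for an
object excluded from numbering) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Kinds of object that receive an object citation number (per the rules above).
NUMBERED_KINDS = {"heading", "paragraph", "verse", "table", "image"}


@dataclass
class TextObject:
    kind: str                      # hypothetical labels, e.g. "heading"
    text: str
    ocn: Optional[int] = None      # stays None for objects that are not mapped
    endnotes: List = field(default_factory=list)


def assign_ocn(objects):
    """Number objects sequentially; endnotes use a separate counter."""
    next_ocn = 1
    next_note = 1
    for obj in objects:
        if obj.kind in NUMBERED_KINDS:
            obj.ocn = next_ocn
            next_ocn += 1
        # Endnotes "belong" to the object that references them and never
        # consume an object citation number.
        numbered_notes = []
        for note_text in obj.endnotes:
            numbered_notes.append((next_note, note_text))
            next_note += 1
        obj.endnotes = numbered_notes
    return objects


doc = [
    TextObject("heading", "Outline"),
    TextObject("paragraph", "First paragraph.", endnotes=["a note"]),
    TextObject("comment", "Publisher's remark."),  # assumed: excluded
    TextObject("table", "(table contents)"),
    TextObject("paragraph", "Second paragraph.", endnotes=["another note"]),
]
assign_ocn(doc)
```

Because the numbers depend only on the sequence of objects, they come out the
same whichever output format is later produced from the list.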
-
-The object citation number markers contain additional numbering information
-with regard to the document structure, that can be used for alternative
-presentations, including such detail as the type of object (heading, paragraph,
-table, image, etc.), numbered sequentially.
-
-
-An advantage is that the numbering remains the same regardless of document
-structure.
-
-
-Text object ("paragraph") numbering is the same for all output versions of the
-same document, viz. html, pdf, pgsql, yaml etc.
-
-
-In the relational database, as individual text objects of a document are
-stored (and indexed) together with object numbers, and all versions of the
-document have the same numbering, the results of searches may be tailored just
-to provide the location of the search result in all available document formats.
-
-
-/ Note: there is a bug in the released behaviour of object citation numbering
-(not certain when it was introduced): tables should be numbered, ie each table
-gets an ocn, required amongst other things for the relational database. This
-will be corrected in a future release. Citation numbering of existing
-documents that contain tables will change. /
-
-
-1.8 HANDLING OF DUBLIN CORE META-TAGS MAKING USE OF THE RESOURCE DESCRIPTION
-FRAMEWORK
-..............................................................................
-
-*SiSU* is able to use meta tags based on the Dublin Core[^36] and Resource
-Description Framework.[^37]
-
-
-- [36]: <http://dublincore.org/>
-
-- [37]: <http://www.w3.org/RDF/>
-
-This provides the means of providing semantic information about a document,
-both as computer processable meta-tags, and as human readable information that
-may be of value for classification purposes.
-
-
-This information is provided both in html metatags, and (where available) under
-the section titled "Document Information - MetaData", near the end of a
-document, for example in the segmented html version of this text at:
-<http://www.jus.uio.no/sisu/SiSU/metadata.html>
-
-
-1.9 EASY DIRECTORY MANAGEMENT
-.............................
-
-1. Directory file association, skins and special image management, made
-simpler.[^38]
-
-
-- [38]: Previously, directory associations for file output were set up in the
- configuration file. The present system is a more natural way to work,
- requiring less configuration.
-
-The last part of the name of the work directory in which markup is being done,
-or rather from where *SiSU* is run in order to generate document output, is
-used in determining the sub-directory name for output files, that is created in
-the document output directory. This provides a rather easy way to associate
-documents e.g. of a given subject, or by owner.
-
-
-
-  /www/docs
-      /intellectual_property
-      /arbitration
-      /contract_law
-  /www/docs
-      /ralph
-      /sisu
-
-all are placed in their own directories within the directory structure created.
-Similar rules are used in the creation of sql type databases (though they can
-be overridden).
-
-
-There are a couple of further associations with these directories.
-
-
-Directory wide skins.
-
-
-Directory specific images.
-
-
-2. If there is a "directory skin", that is a skin of the same name as the
-directory, it is used in the generation of the documents within it, rather than
-the default skin, unless the document has a specific skin associated with it.
-
-
-  a. default skin (always available)
-
-
-  b. directory skin (precedence over default if exists)
-
-
-  c. document skin (takes precedence wherever document requests a specific
-  skin)
-
-
-Skins are defined in the document skin directory, and if a directory
-association is desired a softlink is made to the relevant skin. A directory
-skin is loaded automatically if a skin exists with the same name as the
-directory stub (and the document has no specific skin).
-
-
-3. If the working directory has within it a sub-directory called image_local,
-the images within that directory are used for references to images, that are
-not part of the default site build.
-
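The directory association and skin precedence described in this section can
be sketched as follows. This is a minimal illustration, not *SiSU* code: the
function names, the set of available skins and the example paths are
assumptions made for the example.

```python
from pathlib import PurePosixPath


def output_subdir(working_dir):
    """Return the last part of the working directory name (the "stub")."""
    return PurePosixPath(working_dir).name


def select_skin(working_dir, available_skins, document_skin=None,
                default_skin="default"):
    """Pick a skin: document skin, then directory skin, then default."""
    if document_skin is not None:
        # c. a skin requested by the document always takes precedence
        return document_skin
    stub = output_subdir(working_dir)
    if stub in available_skins:
        # b. a skin with the same name as the directory stub
        return stub
    # a. the default skin is always available
    return default_skin


# Hypothetical working directories and skin names, for illustration only.
skins = {"arbitration"}
subdir = output_subdir("/home/user/work/arbitration")
directory_choice = select_skin("/home/user/work/arbitration", skins)
document_choice = select_skin("/home/user/work/arbitration", skins,
                              document_skin="skin_special")
fallback_choice = select_skin("/home/user/work/contract_law", skins)
```

Output for a document marked up in the first directory would then be placed
under a sub-directory named after `subdir` within the document output
directory.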
-
-1.10 DOCUMENT VERSION CONTROL INFORMATION
-.........................................
-
-The possibility of citing an exact document version.
-
-
-Permits the inclusion of document version control information to the document
-body and metatags.[^39] This provides a much more certain method of referring
-to the exact version of a particular document, (assuming that the document is
-from a trusted source, that will retain earlier versions of a document).[^40]
-
-
-- [39]: from a version control system such as CVS
-
-- [40]: The version control system must be run, so the version number is obtained,
- prior to the *SiSU* document generation, and subsequent posting of the
- document.
-
-This information (where available) is provided under the section of the
-document titled "Document Information - MetaData", near the end of a document,
-for example in the segmented html version of this text at:
-<http://www.jus.uio.no/sisu/SiSU/metadata.html>
-
-
-1.11 TABLE OF CONTENTS
-......................
-
-*SiSU* produces a rudimentary table of contents based on document headings.
-
-
-1.12 AUTO-NUMBERING OF HEADINGS
-...............................
-
-Headings can be automatically numbered, (and automatically named for
-hyper-linking).
-
-
-1.13 NUMBERING AND CROSS-HYPERLINKING OF ENDNOTES
-.................................................
-
-*SiSU* can automatically number footnotes/endnotes. This is the default
-operation where no number is provided.
-
-
-Footnotes/endnotes may also be manually numbered. Where a number, or numbers
-are provided for a footnote/endnote, this does not increment the automatic
-footnote/endnote number counter.
-
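The counter behaviour described above (automatic numbers where none is
given, manual numbers leaving the counter untouched) can be sketched like
this. The function name and the input representation are assumptions made for
the illustration, not *SiSU* internals.

```python
def number_notes(notes):
    """Number footnotes/endnotes.

    notes is a list of (manual_number_or_None, text) pairs. A note without a
    manual number receives the next automatic number; a manually numbered
    note keeps its number and does not increment the automatic counter.
    """
    counter = 0
    numbered = []
    for manual_number, text in notes:
        if manual_number is None:
            counter += 1
            numbered.append((counter, text))
        else:
            numbered.append((manual_number, text))
    return numbered


result = number_notes([
    (None, "first automatic note"),
    (10, "manually numbered note"),
    (None, "second automatic note"),
])
```

Here the two automatic notes are numbered 1 and 2, and the manual note keeps
the number 10.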
-
-In the html output footnotes/endnotes are cross-hyper-linked (to their
-reference point and vice versa). In the pdf output footnotes are linked from
-their reference point only.
-
-
-1.14 "SKINNABLE"
-................
-
-*SiSU* is skinnable, on a site-wide, directory-wide and per document basis, so
-different looking versions of things may be produced with little difficulty.
-There is a default skin which may be modified, as the background site skin, and
-each working directory may have a skin associated with it, as may each
-individual document. The hierarchy of application is document, directory, then
-site... ie if a document skin exists it gets precedence.
-
-
-Whilst it is skinnable, the default output styles are selected to work across
-the widest possible range of document types.
-
-
-1.15 MULTIPLE OUTPUTS
-.....................
-
-From markup that is simpler and more sparse than html you get:
-
-
-* far greater output possibilities, including multiple html types, XML
-(different structured types), LaTeX (pdf landscape, portrait), and SQL
-(Postgresql or SQLite or other);
-
-
-* the advantages implicit in these very different output possibilities;[^41]
-
-
-- [41]: e.g. LaTeX (professional document typesetting, easy conversion to pdf or
- Postscript), XML (in this case, structural representation), SQL (e.g. document
- set searches; representation of the constituent parts of documents based on
- their structure, headings, chapters, paragraphs as desired; control of use)
-
-* a common citation system
-
-
-As many output formats/presentations as one cares to write modules for -
-several types of html (e.g. structure based on css, or structure based on
-tables); /LaTeX/pdf/ and /Lout/pdf/; pgsql other databases easily added;
-yaml...
-
-
-1.15.1 HTML - SEVERAL PRESENTATIONS: FULL LENGTH & SEGMENTED; CSS & TABLE BASED
-..............................................................................
-
-Most documents are produced in single and segmented html versions, described
-below:
-
-
-*The Scroll (full length text presentations)*
-
-
-The full length of the text in a single scrollable document.[^42] As a rule the
-files they are saved in are named: /doc/ or more precisely /doc.html/
-
-
-- [42]: CISG
- <http://www.jus.uio.no/lm/un_contracts_international_sale_of_goods_convention_1980/doc>
-
-- The Unidroit Contract Principles
- <http://www.jus.uio.no/lm/unidroit.contract.principles.1994/doc> or
-
-- The Autonomous Contract
- <http://www.jus.uio.no/lm/autonomous.contract.2000.amissah/doc>
-
-For various reasons texts may only be provided in this form (such as this one
-which is short), though most are also provided as segmented texts.
-
-
-"Scroll" is a reference to the historical scroll, a single long document/
-parchment, and also no doubt to what you will have to do to get to the bottom
-of the text.[^43]
-
-
-- [43]: Scrolling is not however necessarily confined to full length documents as
- you will have to scroll to get to the bottom of any long segment (eg. chapter)
- of a segmented text.
-
-*The Segmented Text*
-
-
-The text divided into segments (such as articles or chapters depending on the
-text).[^44] As a rule the files they are saved in are named: /toc/ and /index/
-or more precisely /toc.html/ and /index.html/
-
-
-- [44]: CISG
- <http://www.jus.uio.no/sisu/un_contracts_international_sale_of_goods_convention_1980>
-
-- The Unidroit Principles
- <http://www.jus.uio.no/lm/unidroit.contract.principles.1994>
-
-- The Autonomous Contract
- <http://www.jus.uio.no/sisu/the.autonomous.contract.2000.amissah> or
-
-- WTA 1994 <http://www.jus.uio.no/lm/wta.1994>
-
-If you know exactly what you are looking for, loading a segment of text is
-faster (the segments being smaller). Occasionally longer documents such as the
-WTA 1994 <http://www.jus.uio.no/lm/wta.1994/toc> are only provided in segmented
-form.
-
-
-*Cascading Style Sheet, and Table based html*
-
-
-*SiSU* outputs html; the two current standard forms available are:
-
-
-css based [link:] <http://www.jus.uio.no/sisu/SiSU/toc.html>
-
-
-and
-
-
-table based [largely discontinued ][^45]
-
-
-- [45]: formatting possibility still exists in code tree but maintenance has been
- largely discontinued.
-
-*The html is tested across several browsers*
-
-
-I would like to remind you that there are other excellent browsers out there,
-many of which have long supported practical features like tabbing.
-
-
-The html is tested across several browsers, including:
-
-
-* *Firefox* (Mozilla-Firefox) [link:]
-<http://www.mozilla.org/products/firefox/> [^46]
-
-
-- [46]: <http://www.mozilla.org/products/firefox/>
-
-* Kazehakase [link:] <http://kazehakase.sourceforge.jp/> [^47]
-
-
-- [47]: <http://kazehakase.sourceforge.jp/>
-
-* Konqueror [link:] <http://www.konqueror.org/> [^48]
-
-
-- [48]: <http://www.konqueror.org/>
-
-* Mozilla [link:] <http://www.mozilla.org/> [^49]
-
-
-- [49]: <http://www.mozilla.org/>
-
-* MS Internet Explorer [link:]
-<http://www.microsoft.com/windows/ie/default.asp> [^50]
-
-
-- [50]: <http://www.microsoft.com/windows/ie/default.asp>
-
-* Netscape [link:]
-<http://home.netscape.com/comprod/mirror/client_download.html> [^51]
-
-
-- [51]: <http://home.netscape.com/comprod/mirror/client_download.html>
-
-* Opera [link:] <http://www.opera.com/> [^52]
-
-
-- [52]: <http://www.opera.com/>
-
-Also lighter weight graphical browsers:
-
-
-* Dillo [link:] <http://www.dillo.org/> [^53]
-
-
-- [53]: <http://www.dillo.org/>
-
-* *Epiphany* [link:] <http://www.gnome.org/projects/epiphany/> [^54]
-
-
-- [54]: <http://www.gnome.org/projects/epiphany/>
-
-* *Galeon* [link:] <http://galeon.sourceforge.net/> [^55]
-
-
-- [55]: <http://galeon.sourceforge.net/>
-
-And for console/text browsing:
-
-
-* *elinks* [link:] <http://elinks.or.cz/> [^56]
-
-
-- [56]: <http://elinks.or.cz/>
-
-* *links2* [link:] <http://links.twibright.com/> [^57]
-
-
-- [57]: <http://links.twibright.com/>
-
-* *w3m* [link:] <http://w3m.sourceforge.net/> [^58]
-
-
-- [58]: <http://w3m.sourceforge.net/>
-
-The html tables output is rendered more accurately across a wider variety of
-browsers, including older versions (than the html css output).
-
-
-1.15.2 XML
-..........
-
-*SiSU* generates well formed XML, and multiple versions. An XML SAX version
-with a flat/shallow structure, and XML DOM version with a deeper (embedded)
-structure. There is also a released working xhtml module. Examples of SAX and
-DOM versions are provided within this document.
-
-
-1.15.3 ODT:ODF, OPEN DOCUMENT FORMAT - ISO/IEC 26300:2006
-.........................................................
-
-*SiSU* generates Open Document Format output.
-
-
-1.15.4 PDF - PORTRAIT AND LANDSCAPE, (THROUGH THE GENERATION OF LATEX OUTPUT
-WHICH IS THEN TRANSFORMED TO PDF)
-..............................................................................
-
-*SiSU* outputs LaTeX if required, which is easily transformed to PDF.[^59] PDF
-documents are generated on the site from the same source files and *Ruby*
-program that produce html. Landscape oriented pdfs have been introduced,
-providing easier screen viewing; they are also paper saving, being currently
-formatted to have fewer pages than their portrait equivalents.
-
-
-- [59]: LaTeX and pdf features introduced 18^th^ June 2001. Landscape and portrait
- pdfs introduced 7^th^ October 2001. Lout is a more recent addition, 22^nd^
- April 2003.
-
-* Adobe Reader [link:] <http://www.adobe.com/products/acrobat/readstep2.html>
-[^60]
-
-
-- [60]: <http://www.adobe.com/products/acrobat/readstep2.html>
-
-* *Evince* [link:] <http://www.gnome.org/projects/evince/> [^61]
-
-
-- [61]: <http://www.gnome.org/projects/evince/>
-
-* xpdf [link:] <http://www.foolabs.com/xpdf/> [^62]
-
-
-- [62]: <http://www.foolabs.com/xpdf/>
-
-1.15.5 SEARCH - LOADING/POPULATING OF RELATIONAL DATABASE WHILE RETAINING
-DOCUMENT STRUCTURE INFORMATION, OBJECT CITATION NUMBERING AND OTHER FEATURES
-(CURRENTLY POSTGRESQL AND/OR SQLITE)
-..............................................................................
-
-*SiSU* (from the same markup input file) automatically feeds into
-PostgreSQL[^63] and/or SQLite[^64] database (could be any other of the better
-relational databases)[^65] - together with all additional information related
-to document structure, and the alternative ways in which it is generated on the
-site retained. As regards scaling of the database, it is as scalable as the
-database (here Postgresql or SQLite) and hardware allow. I will prune the
-images later.
-
-
-- [63]: <http://www.postgresql.org/>
-
-- <http://advocacy.postgresql.org/>
-
-- <http://en.wikipedia.org/wiki/Postgresql>
-
-- [64]: <http://www.hwaci.com/sw/sqlite/>
-
-- <http://en.wikipedia.org/wiki/Sqlite>
-
-- [65]: Relational database features retaining document structure and citation
- introduced 15^th^ July 2002
-
-This is one of the more interesting output forms, as all the structural data
-for the documents are retained (though can be ignored by the user of the
-database should they so choose). All site texts/documents are (currently)
-streamed to four pgsql database tables:
-
-
-  * one containing semantic (and other) headers, including, title, author,
-  subject, (the Dublin Core...);
-
-
-  * another the substantive texts by individual "paragraph" (or object) - along
-  with structural information, each paragraph being identifiable by its
-  paragraph number (if it has one which almost all of them do), and the
-  substantive text of each paragraph quite naturally being searchable (both in
-  formatted and clean text versions for searching); and
-
-
-  * a third containing endnotes cross-referenced back to the paragraph from
-  which they are referenced (both in formatted and clean text versions for
-  searching).
-
-
-  * a fourth table with a one to one relation with the headers table contains
-  full text versions of output, eg. pdf, html, xml, and ascii.
-
-
-There is of course the possibility to add further structures.
-
-
-At this level *SiSU* loads a relational database with documents broken into
-their smallest logical structurally constituent parts, as text objects, with
-their object citation number and all other structural information needed to
-construct the structured document. Text is stored (at this text object level)
-with and without elementary markup tagging, the stripped version being so as to
-facilitate ease of searching.
-
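As a rough illustration of storing text objects with their object citation
numbers, the sketch below uses Python's built-in sqlite3 module with an
in-memory database. It is deliberately simplified to two tables (headers and
text objects) rather than the four described above, and the table names,
column names and sample rows are hypothetical, not the schema *SiSU* creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    id      INTEGER PRIMARY KEY,
    title   TEXT,
    creator TEXT
);
CREATE TABLE text_objects (
    document_id INTEGER REFERENCES documents(id),
    ocn         INTEGER,   -- object citation number
    kind        TEXT,      -- heading, paragraph, table ...
    body        TEXT,      -- text with elementary markup
    clean       TEXT       -- markup stripped, used for searching
);
""")
conn.execute("INSERT INTO documents VALUES (1, 'Sample Document', 'A. Author')")
conn.executemany(
    "INSERT INTO text_objects VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "heading", "*Free Software*", "Free Software"),
        (1, 2, "paragraph", "The /GPL/ is a licence.", "The GPL is a licence."),
        (1, 3, "paragraph", "Nothing relevant here.", "Nothing relevant here."),
    ],
)


def search(term):
    """Return (title, ocn) pairs for objects whose clean text contains term."""
    return conn.execute(
        "SELECT d.title, t.ocn "
        "FROM text_objects t JOIN documents d ON d.id = t.document_id "
        "WHERE instr(t.clean, ?) > 0 "
        "ORDER BY d.title, t.ocn",
        (term,),
    ).fetchall()
```

A call such as `search("GPL")` returns the document title together with the
object number at which the match occurs, and that number can be used to
locate the same text in any other output of the document.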
-
-Because the document structure of sites created is clearly defined, and the
-text object citation system is available for all forms of output, it is
-possible to search the sql database, and either read results from that
-database, or just as simply map the results to the html output, which has
-richer text markup.
-
-
-The combination of the *SiSU* citation system with a relational database is
-pretty powerful, giving rise to several possibilities. As individual text
-objects of a document are stored (and indexed) together with object numbers,
-and all versions of the document have the same numbering, complex searches can
-be tailored to return just the locations of the search results relevant for
-all available output formats, with live links to the precise locations in the
-database or in html/xml documents; or, the structural information provided
-makes it possible to search the full contents of the database and return the
-headings in which search content appears, or to search only headings etc. (as
-the Dublin Core is incorporated it is easy to make use of that as well).
-
-
-This is a larger scale project (the front end has been largely ignored, with
-little development), though the "infrastructure" has been in place since 2002.
-
-
-1.15.6 SEARCH - DATABASE FRONTEND SAMPLE, UTILISING DATABASE AND SISU FEATURES,
-INCLUDING OBJECT CITATION NUMBERING (BACKEND CURRENTLY POSTGRESQL)
-..............................................................................
-
-Sample search frontend [link:] <http://search.sisudoc.org> [^66] A small
-database and sample query front-end (search form) that makes use of the
-citation system, _object citation numbering_, to demonstrate
-functionality.[^67]
-
-
-- [66]: <http://search.sisudoc.org>
-
-- [67]: (which could be extended further with current back-end). As regards scaling
- of the database, it is as scalable as the database (here Postgresql) and
- hardware allow.
-
-*SiSU* can provide information on which documents are matched and at what
-locations within each document the matches are found. These results are
-relevant across all outputs using object citation numbering, which includes
-html, XML, LaTeX, PDF and indeed the SQL database. You can then refer to one of
-the other outputs or in the SQL database expand the text within the matched
-objects (paragraphs) in the documents matched.
-
-
-(further work needs to be done on the sample search form, which is rudimentary
-and only passes simple booleans correctly at present to the SQL engine)
-
-
-A few canned searches, showing object numbers. Search for:
-
-
-English documents matching Linux OR Debian [link:]
-<http://search.sisudoc.org?s1=Linux%2BOR%2BDebian&lang=En&db=SiSU_sisu&view=index&a=1>
-
-
-GPL OR Richard Stallman [link:]
-<http://search.sisudoc.org?s1=GPL%2BOR%2BRichard%2BStallman&lang=En&db=SiSU_sisu&view=index&a=1>
-
-
-invention OR innovation in English language [link:]
-<http://search.sisudoc.org?s1=invention%2BOR%2Binnovation&lang=En&db=SiSU_sisu&view=index&a=1>
-
-
-copyright in English language documents [link:]
-<http://search.sisudoc.org?s1=copyright&lang=En&db=SiSU_sisu&view=index&a=1>
-
-
-Note that the searches done in this form are case sensitive.
-
-
-Expand those same searches, showing the matching text in each document:
-
-
-English documents matching Linux OR Debian [link:]
-<http://search.sisudoc.org?s1=Linux%2BOR%2BDebian&lang=En&db=SiSU_sisu&view=text&a=1>
-
-
-GPL OR Richard Stallman [link:]
-<http://search.sisudoc.org?s1=GPL%2BOR%2BRichard%2BStallman&lang=En&db=SiSU_sisu&view=text&a=1>
-
-
-invention OR innovation in English language [link:]
-<http://search.sisudoc.org?s1=invention%2BOR%2Binnovation&lang=En&db=SiSU_sisu&view=text&a=1>
-
-
-copyright in English language documents [link:]
-<http://search.sisudoc.org?s1=copyright&lang=En&db=SiSU_sisu&view=text&a=1>
-
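The canned search links above share one pattern, which can be reproduced
with Python's standard library. This is a sketch based only on the URLs shown
in this section: the function name is hypothetical, and the meaning of the
trailing `a=1` parameter is not explained here, so it is copied verbatim.

```python
from urllib.parse import quote

BASE_URL = "http://search.sisudoc.org"


def canned_search_url(query, view="index", lang="En", db="SiSU_sisu"):
    """Build a canned search URL of the form used in the examples above.

    Words are joined with "+", which is then percent-encoded as %2B,
    matching the links listed in this section.
    """
    s1 = quote("+".join(query.split()), safe="")
    return f"{BASE_URL}?s1={s1}&lang={lang}&db={db}&view={view}&a=1"


index_url = canned_search_url("Linux OR Debian")
text_url = canned_search_url("GPL OR Richard Stallman", view="text")
```

With `view="index"` the link lists object numbers only; with `view="text"` it
shows the matching text, as in the two groups of links above.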
-
-Note you may set results either for documents matched and object number
-locations within each matched document meeting the search criteria; or display
-the names of the documents matched along with the objects (paragraphs) that
-meet the search criteria.[^68]
-
-
-- [68]: When this feature was demonstrated to an IBM software innovations
- evaluator in 2004 he said, to paraphrase: this could be of interest to us. We
- have large document management systems, you can search hundreds of thousands
- of documents and we can tell you which documents meet your search criteria,
- but there is no way we can tell you without opening each document where
- within each your matches are found.
-
-*OCN index mode,* (object citation number) the numbers displayed are relevant
-(and may be used to reference the match) in any sisu generated rendition of the
-text;[^69] the links provided are to the locations of matches within the html
-generated by *SiSU*.
-
-
-- [69]: OCN are provided for HTML, XML, pdf ... though currently omitted in
- plain-text and opendocument format output
-
-*Paragraph mode,* you may alternatively display the text of each paragraph in
-which the match was made, again the object/paragraph numbers are relevant to
-any *SiSU* generated/published text.
-
-
-Several options for output - select database to search, show results in index
-view (links to locations within text), show results with text, echo search in
-form, show what was searched, create and show a "canned url" for search, show
-available search fields. Also shows counters for the number of documents in
-which matches are found and the number of locations within documents where
-found. [could consider sorting by document with most occurrences of the
-search result].
-
-
-Earlier version of the search frontend - Simple search, results with files in
-which search found, and locations where found within files.
-
-
-Simple search, results with files in which search found, and text object
-(paragraph or endnote) where found within files.
-
-
-1.15.7 OTHER FORMS
-..................
-
-There are other forms as well: YAML file, *Ruby* Marshal dumps, document
-pre-processing (processing of documents prior to the steps described here, to
-produce input suitable for the program). Snap in a new module as
-required/desired; well formed XML, no problem.
-
-
-1.16 CONCORDANCE / WORD MAP OR RUDIMENTARY INDEX
-................................................
-
-Concordance /WordMaps:[^70] *SiSU* produces a rudimentary index based on the
-words within the text, making use of paragraph numbers to identify text
-locations. This is generated in html and hyper-linked but identifies these
-words' locations in the other document formats. Though it is possible to search
-using a search engine, this is a means for browsing an alphabetical list of
-words which may suggest other useful content.
-
-
-- [70]: Concordance/ WordMaps introduced 15^th^ August 2002
-
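A word map of this kind can be sketched as a mapping from each word to the
object numbers in which it occurs. The function below is an assumption-laden
illustration (simple ASCII word matching, lower-casing, hypothetical sample
text), not the concordance module itself.

```python
import re
from collections import defaultdict


def build_concordance(objects):
    """Map each word to the sorted object numbers in which it appears.

    objects is an iterable of (ocn, text) pairs.
    """
    index = defaultdict(set)
    for ocn, text in objects:
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(ocn)
    # Alphabetical order, as in a browsable word list.
    return {word: sorted(ocns) for word, ocns in sorted(index.items())}


concordance = build_concordance([
    (1, "Contract law"),
    (2, "The contract is formed"),
    (3, "Law and practice"),
])
```

Since the entries point at object numbers rather than pages, the same list
identifies locations in every output format of the document.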
-1.17 MANAGED (DOCUMENT) DIRECTORY, DATABASE, OR SITE STRUCTURE
-..............................................................
-
-*SiSU* builds the web site (or more generically provides a suitable directory
-structure) - placing various output texts in the hierarchy of the web-site (or
-db), which (for directories) is a sub-directory with the name of the text file.
-
-
-1.18 BATCH PROCESSING
-.....................
-
-*SiSU* is a batch processing tool, handling and transforming multiple (or
-individual) documents (in many ways) with a single instruction.
-
-
-1.19 INTEGRATION TO SUPERIOR GNU/LINUX AND UNIX TOOLS
-.....................................................
-
-As should have been noted from the above description of *SiSU*, it makes use of
-existing programs found on *Gnu* /Linux and Unix; those already mentioned
-include the LaTeX to pdf converters and the databases PostgreSQL and SQLite.
-
-
-1.19.1 BACKUP AND VERSION CONTROL
-.................................
-
-Unix provides many tools for version control. For documents Subversion, CVS and
-even the old RCS are useful for the per-document histories they provide.
-
-
-For writing code, superior (more recent) version control systems exist. These
-can also be used for documents, though they tend to take stamps of changes
-across the repository as a whole, rather than for each individual file that is
-tracked (as CVS and RCS do). My personal preference is for distributed systems
-such as Git, Mercurial or Darcs, of which I use Git for both code and
-documents.
-
-
-Several backup tools exist. At the base level I tend to use rdiff.
-
-
-1.19.2 EDITOR SUPPORT
-.....................
-
-*SiSU* documents are prepared / marked up in utf-8 text _you are free to use
-the text editor of your choice._
-
-
-Syntax highlighting for a number of editors is provided, amongst them Vim,
-Kwrite, Kate, Gedit and diakonos. These may be found with configuration
-instructions at <http://www.jus.uio.no/sisu/syntax_highlight>. Vim [link:]
-<http://www.vim.org/> [^71] as of version 7 has built in syntax highlighting for
-*SiSU*.
-
-
-- [71]: <http://www.vim.org/>
-
-1.20 MODULAR DESIGN, NEED SOMETHING NEW ADD A MODULE
-....................................................
-
-Need a new output format that does not already exist? Write a new module.
-
-
-Prefer a new input syntax? You could write a new syntax matching the existing
-design, though my personal preference is some uniformity in entry appearance.
-Where necessary it has been fairly easy to extend the design parameters. It is
-intended to incorporate some additional basic semantic tagging (book, article,
-author etc.). However, keeping the requirements for input minimal and
-relatively simple has been a design goal.
-
-
-DOCUMENT INFORMATION (METADATA)
-*******************************
-
-METADATA
---------
-
-Document Manifest @
-<http://www.jus.uio.no/sisu/sisu_manual/sisu_description/sisu_manifest.html>
-
-
-*Dublin Core* (DC)
-
-
-/DC tags included with this document are provided here./
-
-
-DC Title: _SiSU - Description_
-
-
-DC Creator: _Ralph Amissah_
-
-
-DC Rights: _Copyright (C) Ralph Amissah 2007, part of SiSU documentation,
-License GPL 3_
-
-
-DC Type: _information_
-
-
-DC Date created: _2002-11-12_
-
-
-DC Date issued: _2002-11-12_
-
-
-DC Date available: _2002-11-12_
-
-
-DC Date modified: _2007-08-30_
-
-
-DC Date: _2007-08-30_
-
-
-*Version Information*
-
-
-Sourcefile: _sisu_description.sst_
-
-
-Filetype: _SiSU text 0.57_
-
-
-Sourcefile Digest, MD5(sisu_description.sst)=
-_b89ccdad9f6d9c2260d8d383d6b35ccc_
-
-
-Skin_Digest:
-MD5(/home/ralph/grotto/theatre/dbld/builds/sisu/sisu/data/doc/sisu/sisu_markup_samples/sisu_manual/_sisu/skin/doc/skin_sisu_manual.rb)=
-_20fc43cf3eb6590bc3399a1aef65c5a9_
-
-
-*Generated*
-
-
-Document (metaverse) last generated: _Tue Sep 25 02:54:06 +0100 2007_
-
-
-Generated by: _SiSU_ _0.59.1_ of 2007w39/2 (2007-09-25)
-
-
-Ruby version: _ ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]_
-
-
-
-==============================================================================
-
-	title:  SiSU - Description
-
-	creator:  Ralph Amissah
-
-	rights:  Copyright (C) Ralph Amissah 2007, part of SiSU documentation,
-               License GPL 3
-
-	type:  information
-
-	subject:  ebook, epublishing, electronic book, electronic publishing,
-               electronic document, electronic citation, data structure,
-               citation systems, search
-
-	date.created:  2002-11-12
-
-	date.issued:  2002-11-12
-
-	date.available:  2002-11-12
-
-	date.modified:  2007-08-30
-
-	date:  2007-08-30
-
-
-
-
-
-==============================================================================
-
-Other versions of this document:
-manifest:
-   http://www.jus.uio.no/sisu/sisu_description/sisu_manifest.html
-html:
-   http://www.jus.uio.no/sisu/sisu_description/toc.html
-pdf:
-   http://www.jus.uio.no/sisu/sisu_description/portrait.pdf
-   http://www.jus.uio.no/sisu/sisu_description/landscape.pdf
-plaintext (plain text):
-   http://www.jus.uio.no/sisu/sisu_description/plain.txt
-at:
-   http://www.jus.uio.no/sisu
-* Generated by: SiSU 0.59.1 of 2007w39/2 (2007-09-25)
-* Ruby version: ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]
-* Last Generated on: Tue Sep 25 02:54:08 +0100 2007
-* SiSU http://www.jus.uio.no/sisu
-- 
cgit v1.2.3