author | Ralph Amissah <ralph@amissah.com> | 2015-04-25 13:50:40 -0400 |
---|---|---|
committer | Ralph Amissah <ralph@amissah.com> | 2015-04-25 13:50:40 -0400 |
commit | aeb7754510ffacc207b17e5d7512ff5f14debcc3 (patch) | |
tree | e2f17e3ad2ee4e1ea81287febe5049ce6fb82147 /data/doc/sisu/org/sisu.org | |
parent | added patch jessie_bugfix_767761 (diff) | |
parent | version & changelog, tag for release (diff) |
Merge tag 'sisu_5.8.0' into debian/sid
SiSU 5.8.0
Conflicts:
.gitignore
data/doc/sisu/CHANGELOG_v5
data/doc/sisu/CHANGELOG_v6
data/sisu/v5/v/version.yml
data/sisu/v6/v/version.yml
setup/sisu_version.rb
Diffstat (limited to 'data/doc/sisu/org/sisu.org')
-rw-r--r-- | data/doc/sisu/org/sisu.org | 853 |
1 files changed, 853 insertions, 0 deletions
diff --git a/data/doc/sisu/org/sisu.org b/data/doc/sisu/org/sisu.org new file mode 100644 index 00000000..fdcb3eaa --- /dev/null +++ b/data/doc/sisu/org/sisu.org @@ -0,0 +1,853 @@ +#+PRIORITIES: A F E +#+OPTIONS: ^:nil _:nil +(emacs:evil mode gifts a "vim" of enticing "alternative" powers! ;) +(vim, my _editor_ of choice also in the emacs environment :) + +* What is SiSU? + +Multiple output formats with a nod to the strengths of each output format and +the ability to cite text easily across output formats. + +** debian/control desc + +documents - structuring, publishing in multiple formats and search + SiSU is a lightweight markup based, command line oriented, document + structuring, publishing and search, static content tool for document + collections. + . + With minimal preparation of a plain-text (UTF-8) file, using sisu markup syntax + in your text editor of choice, SiSU can generate various document formats, most + of which share a common object numbering system for locating content, including + plain text, HTML, XHTML, XML, EPUB, OpenDocument text (ODF:ODT), LaTeX, PDF + files, and populate an SQL database with objects (roughly paragraph-sized + chunks) so searches may be performed and matches returned with that degree of + granularity. Think of being able to finely match text in documents, using + common object numbers, across different output formats and across languages if + you have translations of the same document. For search, your criteria is met + by these documents at these locations within each document (equally relevant + across different output formats and languages). To be clear (if obvious) page + numbers provide none of this functionality. Object numbering is particularly + suitable for "published" works (finalized texts as opposed to works that are + frequently changed or updated) for which it provides a fixed means of reference + of content. Document outputs can also share provided semantic meta-data. + . 
+ SiSU also provides concordance files, document content certificates and + manifests of generated output and the means to make book indexes that make use + of its object numbering. + . + Syntax highlighting and folding (outlining) files are provided for the Vim and + Emacs editors. + . + Dependencies for various features are taken care of in sisu related packages. + The package sisu-complete installs the whole of SiSU. + . + Additional document markup samples are provided in the package + sisu-markup-samples which is found in the non-free archive. The license for + the substantive content of each marked up document is that provided + by the author or original publisher. + . + SiSU uses utf-8 & parses left to right. Currently supported languages: + am bg bn br ca cs cy da de el en eo es et eu fi fr ga gl he hi hr hy ia is it + ja ko la lo lt lv ml mr nl nn no oc pl pt pt_BR ro ru sa se sk sl sq sr sv ta + te th tk tr uk ur us vi zh (see XeTeX polyglossia & cjk) + . + SiSU works well under po4a translation management, for which an administrative + sample Rakefile is provided with sisu_manual under markup-samples. + +** take two + +SiSU may be regarded as an open access document publishing platform, applicable +to a modest but substantial domain of documents (typically law and literature, +but also some forms of technical writing), that is tasked to address certain +challenges I identified as being of interest to me over the years in open +publishing. + +The idea and implementation may be of interest to consider, as some of the +issues it encountered and seeks to address are known and common to such +endeavors. Amongst them: + + * how do you ensure what you do now can be read in decades? + * how do you keep up with new and changing technologies? + * do you select a canonical format to represent your documents, if so + what? + * how do you reliably cite (locate) material in different document + representations? 
+ * how do you deal with multilingual texts? + * what of search? + * how are documents contributed to the collection? + +(these questions are selected to help describe the direction of efforts with +regard to sisu). + +My Dabblings in the Domain of Open Publishing +--------------------------------------------- + +The system is called SiSU; it is an offshoot of my early efforts at finding out +what to make of the web, that started at the University of Tromsø in 1993 (an +early law website Ananse/ International Trade Law Project / Lex Mercatoria). I +have worked on SiSU continually since 1997 and it has been open source since 2005 +(under a license called GPL3+), though I remain its developer. + +In working in this field I have had to address some of the common issues. + +So how do you ensure what you do now can be read in decades to come? There are +alternative solutions. (i) stick with a widely used and not overly complicated +well documented open standard, and for that the likes of ODF are an excellent +choice; (ii) alternatively go for the most basic representation of a document +that meets your needs, in my case based on UTF-8 text and some markup tags, +fairly easily parsable by the human eye, and as long as UTF-8 is in use it will +always be possible to extract the information. + +How do you keep up with new and changing technologies? Here my solution has +been to generate new versions of the substantive content so as to always have +the latest document representations available; e.g. HTML has changed a lot over +the years, different specifications come out for various formats including ODF, +electronic readers have become an important viewing alternative, introducing +the open reader format EPUB. Output representations are generated from source +documents. Different open document file formats can be produced and databases +and search engines populated. (The source documents and interpreter are all +that are required to re-create site content. 
Source documents can be made +public or retained privately). The strict separation of a simple source +document from the output produced, means that with updates to SiSU (the +interpreter/processor/generator), outputs can be updated technically as +necessary, and new output formats added when needed. Amongst the output formats +currently supported are HTML, LaTeX generated Pdfs (A4, letter, other; +landscape, portrait), Epub, Open Document Format text. Returning to HTML as an +example, it has changed a lot over the years I have worked with it, this way of +working has meant it is possible to keep producing current versions of HTML, +retaining the original substantive document... and new formats have been added +as thought desired. There is no attempt to make output in different document +formats/ representations look alike let alone identical. Rather the attempt is +to optimize output for the particular document filetype, (there is no reason +why an epub document would look or behave like an open document text or that a +Pdf would look like HTML output; rather PDF is optimized for paper viewing, +HTML for screen etc.) Wherever possible features associated with the +particular output type are taken advantage of. This freedom is made possible to +a large extent by the answer to the question that follows. + +How do you reliably cite (locate) material in different document +representations? The traditional answer has been to have a canonical +publication, and resulting fixed page numbers. This was not a viable solution +for HTML (which changes from one viewer to another and with selectable font +faces & size etc.); nor is it otherwise ideal in an electronic age with the +possibility of presenting/interacting with material/documents in so many +different ways. Why be so restricted? Here my solution has been "object +citation numbering". 
What the various generated document formats have in +common is a shared object numbering system that identifies the location of text +and that is available for citation purposes. Object numbers are: sequential +numbers assigned to each identified object in a document. Objects are logical +units of text (or equivalent parts of a document), usually paragraphs, but also +document headings, tables, images, in a poem a verse etc. [In an electronic +publishing age are page numbers the best we can come up with? Change font +type, font size, page orientation, paper size (sometimes even the viewer) and +where are you with them? And paper though a favorite medium of mine is no +longer the sole (or sometimes primary) means of interacting with documents/text +or of sharing knowledge] + +What object numbers mean (unlike page numbers) is e.g. + + * if you cite text in any format, the resulting output can be reliably located + in any other document format type. Cite HTML and the reader can choose to + view in Epub or Pdf (the PDFs being an independent output, generated by + book publishing software XeTeX/LaTeX). + + * if you do a search, you can be given a result "index" indicating that your + search criteria is met by these documents, and at these specific locations + within each document, and the "index" is relevant not only for content + within the database, but for all document formats. + + * if you have a translated text prepared for sisu, then your citations are + relevant across languages e.g. you can specify exactly where in a Chinese + document text is to be found. + + * generated document index references & concordance list references etc. are + relevant across all output formats. + +What of search? For search, see the implications of object numbers for search +mentioned above. The system currently loads an SQL server (Postgresql) with +object sized text chunks. 
It could just as well populate an analytical engine +with larger sections or chapters of text for analytical purposes (such as the +currently popular Elasticsearch), whilst availing itself also of the concept of +objects and object numbers in search results. + +How do you deal with multilingual texts? If you have translated text prepared +for sisu, then your citations are relevant across languages. Object numbers +also provide an easy way to compare and discuss text (translations) across +languages. Text found/cited in one language has the same object number in its +translations; a given paragraph will have the same number in another language, just +change the language code. (Documents are prepared in UTF-8; current language +restrictions arise from the use of LaTeX tools, Polyglossia & CJK (Chinese, +Japanese & Korean), and from the fact that sisu parses left to right.) + +How are materials prepared for contribution to the collection? (a) The easiest +solution if the system allows is for submission in the format in which work is +authored, usually a word processor, for which ODF may be a decent selection. +(b) I have stuck with enhanced plaintext, UTF-8 with minimal markup. Source +documents are prepared in UTF-8 text, with a minimalist native markup to +indicate the document structure (headings and their relative levels), +footnotes, and other document "features". This markup is easily parsable by the +human eye, and plays well with version control systems. Documents are prepared +in a text editor. Front ends such as markup assistants in a word processor that +can save to sisu text format, or other tools, whilst possible, do not exist. [(c) +yet another form of submission for collaborative work are wikis, which have +shown their strength in efforts such as Wikipedia.] + +The system has proven to be a good testing ground for ideas and is flexible and +extensible. 
(Things that could usefully be done: apart from a front end for +simpler user interaction; feed text to an analytical search engine, like +Elasticsearch/Lucene; it still needs a bibliography parser (auto-generation of +a bibliography from footnotes); and it might be useful to allow rough auto +translation of documents on the fly by passing text through a translator (such as +Google Translate)). + +In any event, my resulting technical opinions (in my modest domain of +action) may be regarded as encapsulated within SiSU +[http://www.sisudoc.org/] + +http://www.sisudoc.org/ +http://www.jus.uio.no/sisu/ + +git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream +http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary +(there may be additional commits in the upstream branch) +git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream + +git clone git://git.sisudoc.org/git/doc/sisu-markup-samples.git --branch upstream +git clone --depth 1 git://git.sisudoc.org/git/doc/sisu-markup-samples.git --branch upstream +Development work is on Linux and the easiest way to install it is through the +Debian Linux package as this takes care of optional external dependencies such +as XeTeX for PDF output and Postgresql or Sqlite for search. + +** multiple document formats + +Text can be represented in multiple output formats with different +characteristics that are (or may be) regarded as strengths/advantages and +therefore preferred in different contexts. + +Given the different strengths and characteristics of various output formats, it +makes little sense to try too hard to make different representations of a +document look the same. More interesting is to have document representations that +take advantage of each output's strengths. As valuable if not more so is +the ability to cite, find and discuss text with ease, across the different output +formats. + +For citation across output formats, SiSU uses object citation numbers. 
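As a rough illustration of citing across output formats (a sketch only, not SiSU's own code), assume each generated output is modeled as a hash keyed by object number. The format names, the sample text and the helper `locate` are all made up for this example.

```ruby
# Hypothetical model: each output format maps object numbers to its own
# rendering of the same document object. Names and text are illustrative only.
OUTPUTS = {
  html:      { 1 => "<h1>Title</h1>", 2 => "<p>First paragraph.</p>" },
  plaintext: { 1 => "TITLE",          2 => "First paragraph." }
}.freeze

# Return the rendering of one object in every output that contains it.
def locate(outputs, ocn)
  outputs.each_with_object({}) do |(format, objects), found|
    found[format] = objects[ocn] if objects.key?(ocn)
  end
end

citation = locate(OUTPUTS, 2)
puts citation[:html]       # <p>First paragraph.</p>
puts citation[:plaintext]  # First paragraph.
```

The point of the sketch is that the citation is just the number 2; which representation the reader opens is left to them.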
+ +** document structure and document objects + +SiSU breaks marked up text into document structure and objects + +Document structure being the document heading hierarchy (having separated out +the document header). + +*** What are document objects? +An object is an identified meaningful unit of a document, most commonly a +paragraph of text, but also for example a table, code block, verse or image. + +SiSU tracks these substantive document units as document objects (and their +relationship to the document structure). + +** object citation numbers + +*** What are object citation numbers? + +An object citation number is a sequential number assigned to a document object. + +In sisu output documents share this common object numbering system (dubbed +"object citation numbering" (ocn)) that is meaningful (machine & human readable) +across various digital outputs whether paper, screen, or database oriented, +(PDF, html, XML, EPUB, sqlite, postgresql), and across multilingual content if +prepared appropriately. This numbering system can be used to reference content +across output types. + +*** Why might I want object citation numbering? + +The ability to cite and quickly locate text can be invaluable if not essential. + (whether for instruction or discussion). + +In this digital & Internet age we have multiple ways to represent documents and +multiple document output formats as options with different characteristics, +strengths/advantages etc. We need a way to cite text that works and is relevant +independent of the document format used. + +I want to discuss (cite) html text how do I do this? +how do I refer to / cite / discuss text in html? +Issue: html may be viewed online or printed, it is not tied to paper (as +e.g. pdf) and prints differently depending on selected font face and font size. + +I want to discuss (cite) text that is available in multiple formats (e.g. pdf, +epub, html) without having to worry about the output format that is referred +to. 
+How do I refer to / discuss text that is available in more than one format, +uncertain of what format is preferred, used or available to my colleagues? +e.g. html and epub or pdf have rather different text representations, how do I +discuss ... + +I would like to have a book index that is relevant (can be used) across multiple +output formats (e.g. pdf, epub, html) + +How do I make a book index (or a concordance file) that works across multiple +output formats? + +I would like to have search results indicating where in a document matches are +found and I would like it to be relevant across available output formats (e.g. +pdf, epub, html) +How do I get search results for locations of text within each relevant document + +I would like to be able to discuss a text that has been translated ... +how do I find text across languages? +Where I have a nicely translated document, how do I point to or discuss with my +foreign language counterpart some detail of the text, or, how do I point my +foreign language counterpart to the text I would like to bring to his +attention. + +** "Granular" Search + +Of interest is the ease of streaming documents to a relational database, at an +object (roughly paragraph) level and the potential for increased precision in +the presentation of matches that results thereby. The ability to serialize +html, LaTeX, XML, SQL, (whatever) is also inherent in / incidental to the +design. + +** Summary +SiSU information Structuring Universe +Structured information, Serialized Units <www.sisudoc.org> or +<www.jus.uio.no/sisu/> software for electronic texts, document collections, +books, digital libraries, and search, with "atomic search" and text positioning +system (shared text citation numbering: "ocn") +outputs include: plaintext, html, XHTML, XML, ODF (OpenDocument), EPUB, LaTeX, +PDF, SQL (PostgreSQL and SQLite) + +** SiSU Short Description + +SiSU is a comprehensive future-resilient electronic document management system. 
+Built-in search capabilities allow you to search across multiple documents and +highlight matches in an easy-to-follow format. Paragraph numbering system +allows you to cite your electronic documents in a consistent manner across +multiple file formats. Multiple format outputs allow you to display your +documents in plain text, PDF (portrait and horizontal), OpenDocument format, +HTML, or e-book reading format (EPUB). Word mapping allows you to easily create +word indexes for your documents. Future-resilient flexibility allows you to +quickly adapt your documents to newer output formats as needed. All these and +many other features are achieved with little or no additional work on your +documents - by marking up the documents with a super simplistic markup +language, leaving the SiSU engine to handle the heavy-lifting processing. + +Potential users of SiSU include individual authors who want to publish their +books or articles electronically to reach a broad audience, web publishers who +want to provide multiple channels of access to their electronic documents, or +any organizations which centrally manage a medium or large set of electronic +documents, especially governmental organizations which may prefer to keep their +documents in easily accessible yet non-proprietary formats. + +SiSU is an Open Source project initiated and led by Ralph Amissah +<ralph.amissah@gmail.com> and can be contacted via mailing list +<http://lists.sisudoc.org/listinfo/sisu> at <sisu@lists.sisudoc.org>. SiSU is +licensed under the GNU General Public License. + +*** notes + +For less markup than the most elementary HTML you can have more. 
SiSU - +Structured information, Serialized Units for electronic documents, is an +information structuring, transforming, publishing and search framework with the +following features: + +(i) markup syntax: (a) simpler than html, (b) mnemonic, influenced by +mail/messaging/wiki markup practices, (c) human readable, and easily writable, + +(ii) (a) minimal markup requirement, (b) single file marked up for multiple outputs, + + * documents are prepared in a single UTF-8 file using a minimalistic mnemonic +syntax. Typical literature, documents like "War and Peace" require almost no +markup, and most of the headers are optional. + + * markup is easily readable/parsed by the human eye, (basic markup is simpler +and more sparse than the most basic html), [this may also be converted to XML +representations of the same input/source document]. + + * markup defines document structure (this may be done once in a header +pattern-match description, or for heading levels individually); basic text +attributes (bold, italics, underscore, strike-through etc.) as required; and +semantic information related to the document (header information, extended +beyond the Dublin core and easily further extended as required); the headers +may also contain processing instructions. + +(iii) (a) multiple output formats, including amongst others: plaintext (UTF-8); +html; (structured) XML; ODF (Open Document text); EPUB; LaTeX; PDF (via LaTeX); +SQL type databases (currently PostgreSQL and SQLite). SiSU produces: +concordance files; document content certificates (md5 or sha256 digests of +headings, paragraphs, images etc.) and html manifests (and sitemaps of +content). (b) takes advantage of the strengths implicit in these very different +output types, (e.g. 
PDFs produced using LaTeX typesetting, databases +populated with documents at an individual object/paragraph level, making +possible granular search (and related possibilities)) + +(iv) outputs share a common numbering system (dubbed "object citation +numbering" (ocn)) that is meaningful (to man and machine) across various +digital outputs whether paper, screen, or database oriented, (PDF, html, XML, +EPUB, sqlite, postgresql); this numbering system can be used to reference +content. + +(v) SQL databases are populated at an object level (roughly headings, +paragraphs, verse, tables) and become searchable with that degree of +granularity; the output information provides the object/paragraph numbers which +are relevant across all generated outputs; it is also possible to look at just +the matching paragraphs of the documents in the database; [output indexing also +works well with search indexing tools like hyperestraier]. + +(vi) use of semantic meta-tags in headers permits the addition of semantic +information on documents, (the available fields are easily extended) + +(vii) creates organised directory/file structure for (file-system) output, +easily mapped with its clearly defined structure; with all text objects +numbered, you know in advance where in each document output type a bit of text +will be found (e.g. from an SQL search, you know where to go to find the +prepared html output or PDF etc.)... 
there is more; easy directory management +and document associations, the document preparation (sub-)directory may be used +to determine output (sub-)directory, the skin used, and the SQL database used, + +(viii) "Concordance file" wordmap, consisting of all the words in a document +and their (text/ object) locations within the text, (and the possibility of +adding vocabularies), + +(ix) document content certification and comparison considerations: (a) the +document and each object within it is stamped with a sha256 hash making it +possible to easily check or guarantee that the substantive content of a document +is unchanged, (b) version control, documents integrated with time based source +control system, default RCS or CVS with use of $Id$ tag, which SiSU checks + +(x) SiSU's minimalist markup makes for meaningful "diffing" of the substantive +content of markup-files, + +(xi) easily skinnable, document appearance on a project/site wide, directory +wide, or document instance level easily controlled/changed, + +(xii) in many cases a regular expression may be used (once in the document +header) to define all or part of a document's structure, obviating or reducing +the need to provide structural markup within the document, + +(xiii) prepared files may be batch processed; documents produced are static files +so this needs to be done only once but may be repeated for various reasons as +desired (updated content, addition of new output formats, updated technology +document presentations/representations) + +(xiv) possible to pre-process, which permits: the easy creation of standard +form documents, and templates/term-sheets, or; building of composite documents +(master documents) from other sisu marked up documents, or marked up parts, +i.e. import documents or parts of text into a main document should this be +desired 
+ +(xv) there is a considerable degree of future-resilience, output representations +are "upgradeable", and new document formats may be added: (a) modular, (thanks +in no small part to Ruby) another output format required, write another +module.... (b) easy to update output formats (eg html, XHTML, LaTeX/PDF +produced can be updated in program and run against whole document set), (c) +easy to add, modify, or have alternative syntax rules for input, should you +need to, + +(xvi) scalability, dependent on your file-system (ext3, Reiserfs, XFS, +whatever) and on the relational database used (currently Postgresql and +SQLite), and your hardware, + +(xvii) only marked up files need be backed up, to secure the larger document +set produced, + +(xviii) document management, + +(xix) Syntax highlighting for SiSU markup is available for a number of text +editors. + +(xx) remote operations: (a) run SiSU on a remote server, (having prepared sisu +markup documents locally or on that server, i.e. this solution where sisu is +installed on the remote server, would work whatever type of machine you chose +to prepare your markup documents on), (b) generated document outputs may be +posted by sisu to remote sites (using rsync/scp) (c) document source (plaintext +utf-8) if shared on the net may be identified by its url and processed locally +to produce the different document outputs. + +(xxi) document source may be bundled together (automatically) with associated +documents (multiple language versions or master document with inclusions) and +images and sent as a zip file called a sisupod, if shared on the net these too +may be processed locally to produce the desired document outputs, these may be +downloaded, shared as email attachments, or processed by running sisu against +them, either using a url or the filename. + +(xxii) for basic document generation, the only software dependency is Ruby, and +a few standard Unix tools (this covers plaintext, html, XML, ODF, EPUB, LaTeX). 
+To use a database you of course need that, and to convert the LaTeX generated +to PDF, a LaTeX processor like tetex or texlive. + +as a developers tool it is flexible and extensible + +** description + +SiSU ("SiSU information Structuring Universe" or "Structured information, +Serialized Units"),1 is a Unix command line oriented framework for document +structuring, publishing and search. Featuring minimalistic markup, multiple +standard outputs, a common citation system, and granular search. Using markup +applied to a document, SiSU can produce plain text, HTML, XHTML, XML, +OpenDocument, LaTeX or PDF files, and populate an SQL database with objects2 +(equating generally to paragraph-sized chunks) so searches may be performed and +matches returned with that degree of granularity (e.g. your search criteria is +met by these documents and at these locations within each document). Document +output formats share a common object numbering system for locating content. +This is particularly suitable for "published" works (finalized texts as opposed +to works that are frequently changed or updated) for which it provides a fixed +means of reference of content. How it works + +SiSU markup is fairly minimalistic, it consists of: a (largely optional) +document header, made up of information about the document (such as when it was +published, who authored it, and granting what rights) and any processing +instructions; and markup within text which is related to document structure and +typeface. SiSU must be able to discern the structure of a document, (text +headings and their levels in relation to each other), either from information +provided in the instruction header or from markup within the text (or from a +combination of both). 
Processing is done against an abstraction of the document +comprising information on the document's structure and its objects,2 which +the program serializes (providing the object numbers) and which are assigned +hash sum values based on their content. This abstraction of information about +document structure, objects, (and hash sums), provides considerable flexibility +in representing documents in different ways and for different purposes (e.g. +search, document layout, publishing, content certification, concordance etc.), +and makes it possible to take advantage of some of the strengths of established +ways of representing documents, (or indeed to create new ones). + +1. also chosen for the meaning of the Finnish term "sisu". + +2. objects include: headings, paragraphs, verse, tables, images, but not +footnotes/endnotes which are numbered separately and tied to the object from +which they are referenced. + +More information on SiSU is provided at: <www.sisudoc.org/sisu/SiSU> + +SiSU was developed in relation to legal documents, and is strong across a wide +variety of texts (law, literature...(humanities, law and part of the social +sciences)). SiSU handles images but is not suitable for formulae/ statistics, +or for technical writing at this time. + +SiSU has been developed and has been in use for several years. Requirements to +cover a wide range of documents within its use domain have been explored. 
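The serialization described in this section (objects numbered in sequence and given hash sums based on their content) might be sketched as follows. This is a simplified illustration under stated assumptions, not SiSU's implementation: it treats every blank-line-separated block as an object, ignores headers and footnotes, and the names `DocObject` and `serialize_objects` are invented here.

```ruby
require "digest/sha2"

# Hypothetical sketch: treat blank-line-separated blocks as document objects,
# number them sequentially and attach a SHA-256 digest of their content.
DocObject = Struct.new(:ocn, :text, :sha256)

def serialize_objects(source)
  blocks = source.split(/\n\s*\n/).map(&:strip).reject(&:empty?)
  blocks.each_with_index.map do |text, index|
    DocObject.new(index + 1, text, Digest::SHA256.hexdigest(text))
  end
end

sample = "A heading\n\nFirst paragraph.\n\nSecond paragraph."
serialize_objects(sample).each do |obj|
  # Print the object number, a shortened digest and the text.
  puts "#{obj.ocn} #{obj.sha256[0, 12]} #{obj.text}"
end
```

Because the digest depends only on an object's text, two versions of a document can be compared object by object to see which parts of the substantive content changed.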
+ +<ralph@amissah.com> +<ralph.amissah@gmail.com> +<sisu@lists.sisudoc.org> +<http://lists.sisudoc.org/listinfo/sisu> +2010 +w3 since October 3 1993 +* Finding SiSU +** source +http://git.sisudoc.org/gitweb/ + +*** sisu +sisu git repo: +http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary + +**** most recent source without repo history +git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream +**** full clone +git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream + +*** sisu-markup-samples git repo: +http://git.sisudoc.org/gitweb/?p=doc/sisu-markup-samples.git;a=summary + +** mailing list +sisu at lists.sisudoc.org +http://lists.sisudoc.org/listinfo/sisu + +** irc oftc #sisu + +** home pages + <http://www.sisudoc.org/> + <http://search.sisudoc.org/> + <http://www.jus.uio.no/sisu> + +* Installation + +** where you take responsibility for having the correct dependencies + +Provided you have *Ruby*, *SiSU* can be run. + +SiSU should be run from the directory containing your sisu marked up document +set. + +This works fine so long as you already have sisu external dependencies in +place. For many operations such as html, epub, odt this is likely to be fine. +Note however, that additional external package dependencies, such as texlive +(for pdfs), sqlite3 or postgresql (for search) should you desire to use them +are not taken care of for you. + +*** run off the source tarball without installation + +RUN OFF SOURCE PACKAGE DIRECTORY TREE (WITHOUT INSTALLING) +.......................................................... + +**** 1. 
Obtain the latest sisu source + +using git: + +http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=summary +http://git.sisudoc.org/gitweb/?p=code/sisu.git;a=log + + git clone git://git.sisudoc.org/git/code/sisu.git --branch upstream + git clone --depth 1 git://git.sisudoc.org/git/code/sisu.git --branch upstream + +or, identify latest available source: + +https://packages.debian.org/sid/sisu +http://packages.qa.debian.org/s/sisu.html +http://qa.debian.org/developer.php?login=sisu@lists.sisudoc.org + +http://sisudoc.org/sisu/archive/pool/main/s/sisu/ + +and download the: + + sisu_5.4.5.orig.tar.xz + +using debian tool dget: + +The dget tool is included within the devscripts package +https://packages.debian.org/search?keywords=devscripts +to install dget install devscripts: + + apt-get install devscripts + +and then you can get it from Debian: + dget -xu http://ftp.fi.debian.org/debian/pool/main/s/sisu/sisu_5.4.5-1.dsc + +or off sisu repos + dget -x http://www.jus.uio.no/sisu/archive/pool/main/s/sisu/sisu_5.4.5-1.dsc +or + dget -x http://sisudoc.org/sisu/archive/pool/main/s/sisu/sisu_5.4.5-1.dsc + +**** 2. Unpack the source + +Provided you have *Ruby*, *SiSU* can be run without installation straight from +the source package directory tree. + +Run ruby against the full path to bin/sisu (in the unzipped source package +directory tree). SiSU should be run from the directory containing your sisu +marked up document set. + + ruby ~/sisu-5.4.5/bin/sisu --html -v document_name.sst + +This works fine so long as you already have sisu external dependencies in +place. For many operations such as html, epub, odt this is likely to be fine. +Note however, that additional external package dependencies, such as texlive +(for pdfs), sqlite3 or postgresql (for search) should you desire to use them +are not taken care of for you. 
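The run-from-source invocation above can also be assembled from a script. The helper below is a hypothetical convenience, not part of SiSU; only `bin/sisu`, `--html` and `-v` are taken from the text, and `/path/to/sisu-5.4.5` is a placeholder for wherever the source was unpacked.

```ruby
# Hypothetical helper: build the command line for running sisu straight from
# an unpacked source tree. The helper name and keyword arguments are
# assumptions made for this illustration.
def sisu_command(source_root, document, output_flag: "--html", verbose: true)
  command = ["ruby", File.join(source_root, "bin", "sisu"), output_flag]
  command << "-v" if verbose
  command << document
end

cmd = sisu_command("/path/to/sisu-5.4.5", "document_name.sst")
puts cmd.join(" ")  # ruby /path/to/sisu-5.4.5/bin/sisu --html -v document_name.sst

# To actually run it, change into the directory holding the marked up
# documents first, since sisu is meant to be run from there:
#   Dir.chdir(document_dir) { system(*cmd) }
```

Passing the command as an array to `system` avoids shell quoting issues, but it also means `~` is not expanded, so an absolute path should be given for the source tree.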
+
+*** gem install (with rake)
+
+(i) create the gemspec; (ii) build the gem (from the gemspec); (iii) install
+the gem
+
+Provided you have ruby & rake, this can be done with the single command:
+
+  rake gem_create_build_install
+
+to build and install sisu v5 & sisu v6, alias gemcbi
+
+separate gems are made/installed for sisu v5 & sisu v6 contained in source.
+
+to build and install sisu v5, alias gem5cbi:
+
+  rake gem_create_build_install_stable
+
+to build and install sisu v6, alias gem6cbi:
+
+  rake gem_create_build_install_unstable
+
+for individual steps (create, build, install) see the rake options: rake -T
+
+to specify the sisu version for sisu installed via gem:
+
+  gem search sisu
+
+  sisu _5.4.5_ --version
+
+  sisu _6.0.11_ --version
+
+to uninstall sisu installed via gem
+
+  sudo gem uninstall --verbose sisu
+
+For a list of alternative actions you may type:
+
+  rake help
+
+  rake -T
+
+Rake: <http://rake.rubyforge.org/> <http://rubyforge.org/frs/?group_id=50>
+
+*** installation with setup.rb
+
+this is a three step process, in the root directory of the unpacked *SiSU*
+source type (the install step as root):
+
+ruby setup.rb config
+ruby setup.rb setup
+#[as root:]
+ruby setup.rb install
+
+further information:
+<http://i.loveruby.net/en/projects/setup/>
+<http://i.loveruby.net/en/projects/setup/doc/usage.html>
+
+  ruby setup.rb config && ruby setup.rb setup && sudo ruby setup.rb install
+
+** Debian install
+
+*SiSU* is available off the *Debian* archives. It should be necessary only to
+run, as root, using apt-get:
+
+  apt-get update
+
+  apt-get install sisu-complete
+
+(all sisu dependencies should be taken care of)
+
+If there are newer versions of *SiSU* upstream, they will be available by
+adding the following to your sources list /etc/apt/sources.list
+
+#/etc/apt/sources.list
+
+deb http://www.jus.uio.no/sisu/archive unstable main non-free
+deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
+
+The non-free section is for sisu markup samples provided, which contain
+authored works the substantive text of which cannot be changed, and which as a
+result do not meet the debian free software guidelines.
+
+*SiSU* is developed on *Debian*, and packages are available for *Debian* that
+take care of the dependencies encountered on installation.
+
+The package is divided into the following components:
+
+  *sisu*, the base code, (the main package on which the others depend), without
+  any dependencies other than ruby (and for convenience the ruby webrick web
+  server), this generates a number of types of output on its own, other
+  packages provide additional functionality, and have their dependencies
+
+  *sisu-complete*, a dummy package that installs the whole of greater sisu as
+  described below, apart from sisu-examples
+
+  *sisu-pdf*, dependencies used by sisu to produce pdf from generated /LaTeX/
+
+  *sisu-postgresql*, dependencies used by sisu to populate postgresql database
+  (further configuration is necessary)
+
+  *sisu-sqlite*, dependencies used by sisu to populate sqlite database
+
+  *sisu-markup-samples*, sisu markup samples and other miscellany (under
+  *Debian* Free Software Guidelines non-free)
+
+  *SiSU* is available off Debian Unstable and Testing [link:
+  <http://packages.debian.org/cgi-bin/search_packages.pl?searchon=names&subword=1&version=all&release=all&keywords=sisu>]
+  [^1] install it using apt-get, aptitude or alternative *Debian* install tools.
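To illustrate how the component packages listed here combine, the sketch below builds an apt-get command line from a list of wanted components. It is an assumption-laden example, not a SiSU tool: sisu_apt_cmd is a hypothetical name, and it only prints the command, so nothing is installed. Installing sisu-complete instead remains the simplest route.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SiSU): print the apt-get command
# for the base package plus the chosen components (pdf, postgresql, sqlite).
sisu_apt_cmd() {
    pkgs="sisu"
    for component in "$@"; do
        case "$component" in
            # Component package names as given in the package list.
            pdf|postgresql|sqlite) pkgs="$pkgs sisu-$component" ;;
            *) echo "unknown component: $component" >&2; return 1 ;;
        esac
    done
    # Print only; run the printed command as root to install.
    echo "apt-get install $pkgs"
}
```

For example, sisu_apt_cmd pdf sqlite prints: apt-get install sisu sisu-pdf sisu-sqlite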
+
+** Arch Linux
+
+* sisu markup :sisu:markup:
+
+** sisu markup
+
+#% structure - headings, levels
+  * headings (A-D, 1-3)
+  * inline
+    'A~ ' NOTE title level
+    'B~ ' NOTE optional
+    'C~ ' NOTE optional
+    'D~ ' NOTE optional
+    '1~ ' NOTE chapter level
+    '2~ ' NOTE optional
+    '3~ ' NOTE optional
+    '4~ ' NOTE optional :consider:
+  * node
+  * parent
+  * children
+
+#% font face NOTE open & close marks, inline within paragraph
+  * emphasize '*{ ... }*' NOTE configure whether bold italics or underscore, default bold
+  * bold '!{ ... }!'
+  * italics '/{ ... }/'
+  * underscore '_{ ... }_'
+  * superscript '^{ ... }^'
+  * subscript ',{ ... },'
+  * strike '-{ ... }-'
+  * add '+{ ... }+'
+  * monospace '#{ ... }#'
+#% para NOTE paragraph controls are at the start of a paragraph
+  * a para is a block of text separated from others by an empty line
+  * indent
+    * default, all '_1 ' up to '_9 '
+    * first line hang '_1_0 '
+    * first line indent further '_0_1 '
+  * bullet
+    [levels 1-6]
+    '_* '
+    '_1* '
+    '_2* '
+  * numbered list
+    [levels 1-3]
+    '# '
+
+#% blocks NOTE text blocks that are not to be treated in the way that ordinary paragraphs would be
+  * code
+    * [type of markup if any]
+  * poem
+  * group
+  * alt
+  * tables
+#% boxes
+  NOTE grouped text with code block type color & possibly default image, warning, tip, red, blue etc. decide [NB N/A not implemented]
+
+#% notes NOTE inline within paragraph at the location where the note reference is to occur
+  * footnotes '~{ ... }~'
+  * [bibliography] [NB N/A not implemented]
+
+#% links, linking
+  * links - external, web, url
+  * links - internal
+
+#% images [multimedia?]
+  * images
+  * [base64 inline] [N/A not implemented]
+
+#% object numbers
+  * ocn (object numbers)
+    automatically attributed to substantive objects, paragraphs, tables, blocks, verse (unless exclude marker provided)
+
+#% contents
+  * toc (table of contents)
+    autogenerated from structure/headings information
+  * index (book index)
+    built from hints in newline text following a paragraph and starting with ={} has identifying rules for main and subsidiary text
+
+#% breaks
+  * line break ' \\ ' inline
+  * page break, column break ' -\\- ' start of line, breaks a column, starts a new column, if using columns, else breaks the page, starts a new page.
+  * page break, page new ' =\\= ' start of line, breaks the page, starts a new page.
+  * horizontal '-..-' start of line, rule page (break) line across page (dividing paragraphs)
+
+#% book type index
+
+#% comment
+  * comment
+
+#% misc
+  * term & definition
+
+** syntax highlighting
+
+*** vim
+data/sisu/conf/editor-syntax-etc/vim/
+data/sisu/conf/editor-syntax-etc/vim/syntax/sisu.vim
+
+*** emacs
+data/sisu/conf/editor-syntax-etc/emacs/
+data/sisu/conf/editor-syntax-etc/emacs/sisu-mode.el
+
+* todo
+sisu_todo.org