Txt2tags User Guide
Aurelio, %%date(%c)

= About this document =

"Hi! I'm the txt2tags manual document. Here you'll find all
available information about the txt2tags text conversion tool.
My updated version can be found at http://txt2tags.sf.net/userguide/
For more information and recent releases, please visit
[the txt2tags website http://txt2tags.sf.net]. Enjoy!"

========================================================================

= About txt2tags =

This chapter is a txt2tags overview that introduces the program's
purpose and features.

------------------------------------------------------------------------

== What is it? ==

Txt2tags is a text formatting and conversion tool.

Txt2tags converts a plain text file with little marks to any of the
supported targets:

- HTML document
- SGML document
- LaTeX document
- UNIX man page
- MoinMoin page
- Magic Point presentation
- PageMaker 6.0 document

------------------------------------------------------------------------

== Why should I use it? ==

You'll find txt2tags really useful if you:

- need to publish documents in different formats
- need to keep documents in different formats up to date
- write technical documents or manuals
- don't know how to write a document in a certain format
- don't have a specific editor for a certain format
- want to use a simple text editor to update your documents

And the main motivation is:

- save time, writing **contents** and forgetting about **formatting**

------------------------------------------------------------------------

== Why is it a good choice among other tools? ==

Txt2tags grows in a very straightforward way, following basic
concepts. These are the highlights:

| //Source file readable // | Txt2tags marks are very simple, almost natural.
| //Target document readable// | Like the source file, the target document is also readable, with indentation and short lines.
| //Marks consistent // | Txt2tags marks are unique enough to fit all kinds of documents and not be confused with the document contents.
| //Rules consistent // | Like the marks, the rules that apply to them are tied to each other; there are no "exceptions" or "special cases".
| //Simple structures // | All the supported formatting is **simple**, with no extra options or complicated behaviour modifiers. A mark is just a mark, with no options at all.
| //Easy to learn // | With simple marks and a readable source, the txt2tags learning curve is user friendly.
| //Nice examples // | The **sample files** included in the package give real-life examples of simple and over-complicated documents written in the txt2tags format.
| //Valuable Tools // | The **syntax files** included in the package (for the vim and emacs editors) help you write documents with no syntax errors.
| //Three user interfaces // | There is a **Graphical Tk interface** that is very user friendly, a **Web interface** to use it remotely or on the intranet, and a **Command Line interface** for power users and scripting.
| //Scripting // | With the full-featured command line mode, an experienced user can **automate** tasks and do **post-editing** on the converted files.
| //Download and run / Multi-platform// | Txt2tags is a single **Python script**. There is no need to compile it or download extra modules, so it runs nicely on *NIX, Linux, Windows and Macintosh machines.
| //Frequent Updates // | The program has a mailing list with active users who suggest corrections and improvements. The author himself is an extensive user at home and at work, so development won't stop any time soon.

------------------------------------------------------------------------

== Do I have to pay for it? ==

**Absolutely NO!**

It's free, GPL, open source, public domain, ////. You can copy, use,
modify, sell, release as yours. Software politics/copyright is not one
of the author's major concerns.
========================================================================

= Detailed info about txt2tags =

In this section the program features are covered in detail, answering
the questions you may have about them.

------------------------------------------------------------------------

== Supported Formatting Structures ==

The following is a list of all the structures supported by txt2tags.

- header (document title, author name, date)
- section title
- paragraphs
- font beautifiers
  - bold
  - italic
  - bold-italic
  - underline
- preformatted font (verbatim)
  - preformatted inside paragraph
  - preformatted line
  - preformatted area (multiline)
- quoted area
- link
  - URL/internet links
  - e-mail links
  - local links
  - named links
- lists
  - bulleted list
  - numbered list
  - definition list
- horizontal separator line
- image (with smart alignment)
- table (with or without border)
- special mark for raw text (no parsing)
- special macro for current date
- comments (for self notes, TODO, FIXME)

------------------------------------------------------------------------

== Supported Targets ==

= **SGML**:
It is a common document format that has powerful
[sgmltools http://www.sgmltools.org] conversion applications.
From a single sgml file you can generate html, pdf, ps, info, latex,
lyx, rtf and xml documents. The sgml2* tools also generate an automatic
TOC and break sections into subpages (sgml2html).
txt2tags generates SGML files in the linuxdoc system type, ready to be
converted with the sgml2* tools without any extra catalog files or any
annoying SGML requirements.

= **HTML**:
Everybody knows what HTML is. (hint: internet)
txt2tags generates clean HTML documents that look pretty and have
readable source. It DOES NOT use CSS, javascript, frames or other
futile formatting techniques, which aren't required for simple, techie
documents.

= **LATEX**:
TODO.

= **PM6**:
I guess you didn't know, but Adobe PageMaker 6.0 has its own tagged language!
You can define styles, color tables and beautifiers, and most of the
PageMaker mouse-clicking features are also available in its tagged
language. You just need to access the "Import tagged text" menu item.
Just for the record, it's an tag format.
txt2tags generates all the tags and already defines an extensive,
working header, setting paragraph styles and formatting. This is the
hard part.
**GOTCHA:** No line breaks! A paragraph must be one single line.
Author's note:
//My entire Portuguese [regular expression book http://guia-er.sf.net]//
//was written in vi, converted to PageMaker with txt2tags and went to//
//press.//

= **MGP**:
[Magic Point http://www.mew.org/mgp] is a very handy presentation tool
(hint: Microsoft PowerPoint) that uses a tagged language to define all
the screens. So you can do complex presentations in vi/emacs/notepad.
txt2tags generates a ready-to-use .mgp file, defining all the necessary
headers for fonts and appearance definitions, as well as ISO-8859
accents support.
**HOTSPOT 1:** the .mgp file created by txt2tags uses the XFree86 Type1
fonts! So you do not need to carry TrueType font files with your
presentation.
**HOTSPOT 2:** the color definitions for fonts are clean, so even on a
poor color palette system (such as `startx -- -bpp 8`), the
presentation will look pretty!
The key is: convert and use. No adaptation or requirements needed.

= **MAN**:
UNIX man pages have endured over the years. Document formats come and
go, and there they are, unbeatable.
There are other tools to generate man documents, but txt2tags has one
advantage: one source, multi targets. So the same man page contents can
be converted to an HTML page, a Magic Point presentation, etc.

= **MOIN**:
You don't know what [MoinMoin http://moin.sourceforge.net] is?
It is a [WikiWiki http://www.c2.com/cgi/wiki]!
Moin syntax is kinda boring when you need to keep
`{{{'''''adding braces and quotes'''''}}}`, so txt2tags comes with the
simplified marks and unified solution: one source, multi targets.

= **TXT**:
TXT is text. The only true formatting type.
Although txt2tags marks are very intuitive and discreet, you can remove
them by converting the file to pure TXT.
The titles are underlined, and the text is basically left as it is in
the source.

------------------------------------------------------------------------

== Target status for supported structures ==

|| structure | txt | html | sgml | tex | mgp | pm6 | moin | man |
| headers | Y | Y | Y | Y | Y | N | N | Y |
| section title | Y | Y | Y | Y | Y | Y | Y | Y |
| paragraphs | Y | Y | Y | Y | Y | Y | Y | Y |
| bold | - | Y | Y | Y | Y | Y | Y | Y |
| italic | - | Y | Y | Y | Y | Y | Y | Y |
| bold-italic | - | Y | Y | Y | Y | Y | Y | Y |
| underline | - | Y | - | Y | Y | Y | ? | - |
| preformatted | - | Y | Y | Y | Y | Y | Y | - |
| preformatted line | - | Y | Y | Y | Y | Y | Y | Y |
| preformatted area | - | Y | Y | Y | Y | Y | Y | Y |
| quoted area | Y | Y | Y | Y | Y | Y | ? | N |
| internet links | - | Y | Y | - | - | - | Y | - |
| e-mail links | - | Y | Y | - | - | - | Y | - |
| local links | - | Y | Y | N | - | - | Y | - |
| named links | - | Y | Y | - | - | - | Y | - |
| bulleted list | Y | Y | Y | Y | Y | Y | Y | Y |
| numbered list | Y | Y | Y | Y | Y | Y | Y | N |
| definition list | Y | Y | ? | Y | N | N | N | Y |
| horizontal line | Y | Y | - | Y | Y | N | Y | - |
| image | - | Y | Y | N | Y | N | Y | - |
| table | N | Y | Y | Y | N | N | Y | N |

Legend:
---
  Y   supported
  N   not supported (may be in future releases)
  -   not supported (can't be done on this target)
  ?   not supported (not sure if it can be done or not)
---

========================================================================

= Download & Installation =

== 1. Download & Install Python ==

First of all, you must download and install the Python interpreter on
your system. If you already have it, just skip this step.
Python is one of the nicest programming languages out there. It works
on Windows, Linux, UNIX, Macintosh and others, and it can be downloaded
from the [Python web site http://www.python.org]. Installation hints
are found on the same site. If you are not sure whether you have
Python, open a console (tty, xterm, MSDOS) and type `python`. If it is
not installed, the system will tell you.

== 2. Download txt2tags ==

% mirrors?

The official location for the txt2tags distribution is the program
homepage, at http://txt2tags.sf.net/src.

All the program files are in the tarball (.tgz file), which can be
expanded by most compression utilities (including Winzip).

Just get the **latest** one (more recent date, higher version number).
The previous versions remain for historical purposes only.

== 3. Install txt2tags ==

As a single Python script, txt2tags needs no installation at all.

The only file needed to use the program is the txt2tags script. The
other files in the tarball are documentation, tools and sample files.

The fail-proof way to run txt2tags is calling Python with it:

--- prompt$ python txt2tags

If you want to "install" txt2tags on the system as a stand-alone
program, just copy (or link) the txt2tags script to a system PATH
directory and make sure the system knows how to run it.

= **UNIX/Linux**:
Make the script executable (`chmod +x txt2tags`) and copy it to a
$PATH directory (`cp txt2tags /usr/bin`)

= **Windows**:
Rename the script adding the .py extension (`ren txt2tags txt2tags.py`)
and copy it to a system PATH directory (`copy txt2tags.py C:\WINNT`)

Once that is done, you can create an icon on your desktop for it, if
you want to use the program's Graphical Interface.

========================================================================

= User Interfaces =

Txt2tags has three user interfaces. Now we will take a look at them.

------------------------------------------------------------------------

== Graphical Tk Interface ==

Since version 1.0, there is a nice Graphical Interface that works on
Linux, Windows and Mac (and others). It's pretty simple and easy to
use:

[gui-interface.jpg]

It can also dump the result to a window instead of writing it to disk,
so you can do quick tests before saving the target file:

[gui-interface-dump.jpg]

------------------------------------------------------------------------

== Web Interface ==

The Web Interface is up and running on the internet at
http://txt2tags.sf.net/online.php, so you can use and test the program
instantly, before downloading it.

% screenshot of http://txt2tags.sf.net/online.php
[web-interface.jpg]

You can also put this interface on the local intranet for common use,
avoiding the need to install txt2tags on every machine.

------------------------------------------------------------------------

== Command Line Interface ==

For command line power users, the --help should be enough:

---
usage: txt2tags -t [OPTIONS] file.t2t
       txt2tags -t html -s -l file.t2t

  -t, --type       target document type. actually supported:
                   txt, sgml, html, pm6, mgp, moin, man, tex
      --stdout     by default, the output is written to file.
                   with this option, STDOUT is used (no files written)
      --noheaders  suppress header, title and footer information
      --enumtitle  enumerate all title lines as 1, 1.1, 1.1.1, etc
      --maskemail  hide email from spam robots. x@y.z turns to
      --toc        add TOC (Table of Contents) to target document
      --toconly    print document TOC and exit
      --gui        invoke Graphical Tk Interface
  -h, --help       print this help information and exit
  -V, --version    print program version and exit

extra options for HTML target (needs sgml-tools):
      --split      split documents. values: 0, 1, 2 (default 0)
      --lang       document language (default english)
---

==== Examples ====

Assuming you have written a `file.t2t` marked file, let's have some
converting fun.

| **Convert to HTML** | `$ txt2tags -t html file.t2t`
| **The same, using redirection** | `$ txt2tags -t html --stdout file.t2t > file.html`
| | .
| **Including Table Of Contents** | `$ txt2tags -t html --toc file.t2t`
| **And also, numbering titles** | `$ txt2tags -t html --toc --enumtitle file.t2t`
| | .
| **Contents quick view** | `$ txt2tags --toconly file.t2t`
| **Maybe enumerate them?** | `$ txt2tags --toconly --enumtitle file.t2t`
| | .
| **Oneliners from STDIN** | `$ echo -e "\n**bold**" | txt2tags -t html --noheaders -`
| **Testing Mask Email feature** | `$ echo -e "\njohn.wayne@farwest.com" | txt2tags -t txt --maskemail --noheaders -`
| **Post-convert editing** | `$ txt2tags -t html --stdout file.t2t | sed "s/^/" > file.html`

========================================================================

= The .t2t document Areas =

Txt2tags marked files are divided into 3 areas. Each area has its own
rules and purpose. They are:

= //Headers Area//:
Place for Document Title, Author, Version and Date information. (optional)

= //Settings Area//:
Place for general Document Settings and Parser behaviour modifiers. (optional)

= //Body Area//:
Place for the Document Content. (required)

As the remarks above show, the first two Areas are optional, the
//Body Area// being the only required one. (//Note: The **Settings Area**//
//was introduced in txt2tags version 1.3//)

The areas are delimited by special rules, which will be seen ahead.
For now, this is a graphical representation of the areas in a document:

---
     ____________
    |            |
    |  HEADERS   |   1. First, the Headers
    |            |
    |  SETTINGS  |   2. Then the Settings
    |            |
    |    BODY    |   3. And finally the Document Body,
    |            |
    |    ...     |      which goes until the end
    |    ...     |
    |____________|
---

In short, this is how the areas are defined:

| **Headers** | First 3 lines of the file, or the first line blank for No Headers.
| **Settings** | Begins right after the Headers (4th or 2nd line) and ends when the //Body Area// starts.
| **Body** | The first valid text line (not comment or setting) after the //Headers Area//.

------------------------------------------------------------------------

== The Headers Area ==

Location:
- Fixed position: **First 3 lines** of the file. Period.
- Fixed position: **First line** of the file if it is blank.
  This means Empty Headers.

The Headers Area is the only one that has a fixed, line-oriented
position. It occupies the first three lines of the source file.

These lines are free-form; no particular kind of information is
required in any of them. But the following is recommended for most
documents:

- //line 1//: document title
- //line 2//: author name and/or email
- //line 3//: document date and/or version (nice place for `%%date`)

Keep in mind that the first 3 lines of the source document will be the
first 3 lines of the target document, set apart from the text body and
in high contrast to it (i.e. big letters, bold). If paging is allowed,
the headers will be alone and centered on the first page.

==== Fewer (or No) Header lines ====

Sometimes the user wants to specify fewer than three lines for headers,
giving just the document title and/or date information.

Just leave the 2nd and/or the 3rd line empty (blank) and that position
will not be placed in the target document. But keep in mind that even
when blank, these lines are still part of the headers, so the document
body must start **after** the 3rd line anyway.

The title is the only required header (the first line), but if you
leave it blank, you are saying that your document has **no headers**.
So the //Body Area// will begin right after, on the 2nd line. This is
useful together with the command line `--noheaders` option.

==== Straight to the point ====

In short: "//Headers are just **positions**, not contents//".
Place some text on the first line, and it will appear on the target's
first line. The same goes for the 2nd and 3rd header lines.
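
The Headers Area rules above can be sketched in a few lines of Python.
This is only an illustration of the positional logic described in this
section, not the program's actual code: the `split_headers` name and its
return shape are assumptions made up for the example.

```python
def split_headers(lines):
    """Split source lines into (headers, rest) by position only.

    Hypothetical helper for illustration; it is not a txt2tags function.
    """
    # A blank first line means "no headers": the rest starts on line 2.
    if not lines or not lines[0].strip():
        return [], lines[1:]
    # Otherwise the first three lines are always header positions,
    # even when the 2nd or the 3rd one is left blank.
    headers = [line.strip() for line in lines[:3]]
    headers += [""] * (3 - len(headers))  # pad files shorter than 3 lines
    return headers, lines[3:]


sample = ["My nice doc Title", "", "Last Updated: today", "Body starts here."]
headers, rest = split_headers(sample)
print(headers)  # title, empty author position, date
print(rest)     # everything after the 3rd line
```

Note that the blank 2nd line is kept as an empty position, so the body
still starts after the 3rd line, as explained above.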

------------------------------------------------------------------------

== The Settings Area ==

Location:
- Begins right after the Headers Area
  - Begins on the **4th line** of the file if **Headers** were specified
  - Begins on the **2nd line** of the file if **No Headers** were specified
- Ends when the Body Area starts
  - Ends at the first line that is not a Setting, Blank or Comment line

The Settings Area is optional, and an average English-writing user
should live fine with txt2tags without even knowing it exists. The
primary use of this area is to define settings that affect the
program's behaviour.

==== So, how to set something? What's the syntax? ====

Setting lines are //special comment lines//, marked by a leading
identifier ("!") that makes them different from plain comments. The
syntax is as simple as a variable assignment, composed of a keyword and
a value, separated from each other by the canonical separator colon
(":"). Example:

--- %! keyword : value

The exclamation mark must be placed together with the comment char
("%!"), with no spaces between them. The spaces around //keyword// and
the separator are optional, and both //keyword// and //value// are case
insensitive (case doesn't matter).

==== What can I set? Which are the valid keywords? ====

For now, the only setting available is //Encoding//. It's needed by
non-English writers, who use accented letters and other locale-specific
details, so the target document //Character Set// must be customized
(if allowed).

A real-life example is:

--- %! Encoding: iso-8859-1

which specifies the //latin// charset.

The valid values for the Encoding setting are the same charset names
valid for HTML documents, like //iso-8859-1// and //koi8-r//. If you're
not sure which encoding you should use,
[this complete (and long!) list http://www.iana.org/assignments/character-sets]
should help.

The LaTeX target uses alias names for encodings. This is not a problem
for the user, because txt2tags translates the names internally.
Some examples:

|| txt2tags/HTML | > | LaTeX |
| windows-1250 | >>> | cp1250 |
| windows-1252 | >>> | cp1252 |
| ibm850 | >>> | cp850 |
| ibm852 | >>> | cp852 |
| iso-8859-1 | >>> | latin1 |
| iso-8859-2 | >>> | latin2 |
| koi8-r | >>> | koi8-r |

If the value is unknown to txt2tags, it will be passed "as is",
allowing the user to specify custom encodings.

==== Some rules about Settings ====

- Settings are valid only inside the Settings Area, and will be treated
  as a plain comment if found in the document Headers or Body.

- If the same keyword appears more than once in the Settings Area,
  the last one found will be used.

- A setting line with an invalid keyword will be considered a plain
  comment line.

----------------------------------------------------------------

== The Body Area ==

Location:
- Begins on the first valid text line of the file
  - Headers, Settings and Comments are **not** valid text lines
- Ends at the end of the file (EOF)

Well, the body is anything outside Headers and Settings.

The body holds the document contents and all the formatting and
structures txt2tags can recognize. Inside the body you can also put
comments for //TODOs// and self notes.

You can use the `--noheaders` command line option to convert only the
document body, suppressing the headers. This is useful to set your own
headers in a separate file, then join the converted body to it.

----------------------------------------------------------------

== Full Example ==

---
My nice doc Title
Mr. John Doe
Last Updated: %%date(%c)
%! Encoding: iso-8859-1
Hi! This is my test document.
Its content will end here.
---

================================================================

= Even more detailed info about txt2tags =

== Marks (RULES) ==

All marks and syntax used by txt2tags are detailed in a
[separate RULES file ../RULES].

-----------------------------------------------------------------------

== The %%date macro ==

The `%%date` macro, called alone, returns the current date in the ISO
//yyyymmdd// format.
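
As a rough illustration of the bare macro, the sketch below replaces
every `%%date` that is not followed by a parenthesis with a //yyyymmdd//
stamp. It assumes the date comes from Python's `time.strftime`; the
`expand_bare_date` name and the optional `now` argument are hypothetical
and exist only so the example can be run with a fixed date.

```python
import re
import time


def expand_bare_date(text, now=None):
    """Expand each bare %%date into the ISO yyyymmdd form.

    Hypothetical helper for illustration; not the txt2tags implementation.
    """
    when = now if now is not None else time.localtime()
    stamp = time.strftime("%Y%m%d", when)
    # The negative lookahead leaves the %%date(...) form untouched.
    return re.sub(r"%%date(?!\()", stamp, text)


# Fixed moment: 2002, Jan 31, 15:00
fixed = time.strptime("2002-01-31 15:00:00", "%Y-%m-%d %H:%M:%S")
print(expand_bare_date("Last update: %%date", fixed))  # Last update: 20020131
```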

Optional formatting can be specified using the `%%date(format-string)`
form. This //format-string// is made of plain text plus formatting
directives, each one a percent sign % followed by an identification
character.

The following is a list of some commonly used directives. The full list
can be found at http://www.python.org/doc/current/lib/module-time.html.

|| Directive | Description |
| %a | Locale's abbreviated weekday name.
| %A | Locale's full weekday name.
| %b | Locale's abbreviated month name.
| %B | Locale's full month name.
| %c | Locale's appropriate date and time representation.
| %d | Day of the month as a decimal number [01,31].
| %H | Hour (24-hour clock) as a decimal number [00,23].
| %I | Hour (12-hour clock) as a decimal number [01,12].
| %m | Month as a decimal number [01,12].
| %M | Minute as a decimal number [00,59].
| %p | Locale's equivalent of either AM or PM.
| %S | Second as a decimal number [00,61]. (1)
| %x | Locale's appropriate date representation.
| %X | Locale's appropriate time representation.
| %y | Year without century as a decimal number [00,99].
| %Y | Year with century as a decimal number.
| %% | A literal "%" character.

==== Examples ====

|| `%%date(format)` | Results for: 2002, Jan 31, 15:00 |
| Last Update: %c | Last Update: Thu Jan 31 15:00:00 2002
| %Y-%m-%d | 2002-01-31
| %I:%M %p | 03:00 PM
| Today is %A, on %B. | Today is Thursday, on January.

=======================================================================

= Txt2tags HISTORY =

In July 2001, the first public release of txt2tags (v0.1) was launched.
But its origins date back more than a year before that...

This chapter briefly illustrates the tool's development, from its very
first draft to the current series.

== 1999 January: Pre-History ==

From the author:

//"My very first attempts at a text conversion tool began back //
//in 1999, as a very simple and limited Bourne Shell script that //
//converted marked text to an HTML page. Yes, Yet-Another txt2html //
//tool. Everyone everywhere must already have written one of these...//
//In short, it just recognized simple marks such as `*bold*`, //
//`/italic/`, `_under_`, and escaped the classic `< & >` HTML //
//special characters. Not impressive, but hey! I was young ;)" //

== 1999 June: Still Pre-History ==

% ts/antigos/txt2sgml-bash/txt2sgml 25/06/1999

The author wants to speak some more:

//"Some months passed, and a big SGML hype arrived at the company //
//where I was working (Conectiva). So the txt2html turned into a //
//txt2sgml script. I was really trying to learn about SED* at //
//that moment, so txt2sgml was a 110-line Bourne Shell script //
//with lots of SED code."//

* **SED:** UNIX Stream EDitor - an automatic text editing tool

This improved SGML version supported more structures, such as lists and
preformatted text. In the following sample file, you can see the
origins of the txt2tags marks:

---
* This was a bold line (BOLD line oriented? well...)

--
- bullet list was very similar to txt2tags list
- but with these -- to begin and close a list
--

=----------------------
Preformatted text was delimited by the =-- pattern.
The other ------- was just cosmetic.
=----------------------
---

Still not impressive, but the big step is coming...

== 2000 August: Not Pre-History anymore ==

% verde666.org/sed/programas/txt2*/txt2sgml.sed 20000816 - 20010514

TODO (txt2sgml.sed)

== 2001 July: Debut of 0.x series (World Release) ==

TODO

== 2002 September: Debut of 1.x series ==

= **Announce**:
This release starts my //1.x series//. After more than a year of
almost-monthly updates, the //0.x series// provided me with a nice set
of features, such as Command Line and Web interfaces, TOC handling,
numbering of titles and lists, STDIN/STDOUT facilities, vim/emacs
syntax files and seven supported target formats.
For the incoming //1.x series//, I'll try to spread myself out,
providing a nice GUI, extensive documentation, a mailing list, a user
base, full Unix/Windows/Mac compatibility and more targets (such as
tex, rtf and xhtml).
In this 1.0 release I'm already at full speed ahead, with a new suit
(Graphical Tk Interface) and compatibility with Unix/Windows/Mac,
handling line breaks and other platform-specific issues. Fortunately,
my master can now reach Linux, Windows 2000, Cygwin and MacOS 8.6
systems for testing me.

========================================================================

The End. ([see source userguide.t2t])

% txt2tags is *not* indent oriented as other tools
% no marks are based on indentation

% = TODO =

% == Split and language features (only for HTML target) ==
%
% ---
% usage: txt2tags -t html --split --lang file.t2t
%
%   --split   split documents. values: 0, 1, 2 (default 0)
%   --lang    document language (default english)
% ---
%
% For those who have [sgml-tools http://www.sgmltools.org] installed and
% running on the system, when generating HTML documents, the split and
% language features are available. First a SGML text is generated and the
% sgml2html binary is called.
%
% From sgml2html man page:
% | 0 | don't split |
% | 1 | split by major sections |
% | 2 | split by subsections |
%
%
% ------------------------------------------------------------------------
%
% == EXTRA I ==
%
% Cool Vim and Emacs syntax files sync'ed with all the rules.
% See `txt2tags.vim` and `txt2tags-mode.el` on the `extras` dir.
%
% BONUS: there's also a `pagemaker.vim` for the .pm6 files.
%
% ------------------------------------------------------------------------
%
% == EXTRA II ==
%
% For those who have the (¹) package installed, the (²) target
% is available!
% The package is found here: (³)
%
% | ¹ | ² | ³ |
% | sgml-tools | ps | http://www.sgmltools.org |
% | ghostscript | pdf | http://www.cs.wisc.edu/~ghost |
%
%
% And remember, from SGML you can convert to lots of types, see sgml2* on
% the sgml-tools package.
%
% PostScript Example:
% --- prompt$ sgml2latex -o ps --language=english file.sgml
%
% PDF Example:
% --- prompt$ ps2pdf file.ps
%
%
% ------------------------------------------------------------------------

% read and borrow from! http://www.zope.org/Documentation/Articles/STX

% common chars as ', ", #, ?, ~, ^, {, }, (, ), <, >, $ and & are not used as marks.
% here is the list of all special chars:
% ! - + = : * _ / [ ] | ` \t

% why not use alignment as PRE
% - title line DOESNT support beautifiers, link, macro, etc (like PRE)
% why not ZIP? windows .ZIP file ? (winzip opens .gz anyway...)

% sgml article not supported:
% --------------------------
% image as definition list term (warning)
% image inside table (warning)
% image that points to a link (error - tags as link label-)
% -- string inside comments (warning)
% open sect3 when inside sect1 (warning)

% how to insert a literal ' | ' on tables? `|`

% #TODO what closes quote: pre, blank
% #TODO what closes lists: 2 blanks
% #TODO what closes tables: anything not table line
% #TODO ensure correct link detection (strict regex, no generalizations)

% palm : HTML to PalmDOC format OnLine: http://pilot.screwdriver.net/
% .CHM: windows help
% include ps and pdf (tips on how to generate them from sgml/tex)
% talk about the escapes/gotchas of each language
% TODO conversion natively, without the need for other tex2html or something

%macos - macpython
%mac X
%other users

% vim:tw=72 foldmethod=syntax