CodeMirror XML4TEI

A jQuery plugin to build a TEI editor with CodeMirror

https://github.com/orazionelson/CodeMirrorXML4TEI

Try the editor with the interfaces for teiHeader and text

1) Intro

CodeMirror is a text editor that can be integrated into a web application. The project is open source and provides plugins that extend the basic editor, along with tools for writing code in different programming languages.

One of the most interesting plugin integration demos is at https://codemirror.net/demo/xmlcomplete.html. It shows how to integrate an autocompletion tool with the plugin for writing XML.

Other CodeMirror extensions useful for building an XML editor are

With these extensions enabled, the CodeMirror trigger for an XML editor looks like this:

CodeMirror.fromTextArea(document.getElementById("code"), {
	mode: "xml",
	styleActiveLine: true,
	lineNumbers: true,
	foldGutter: true,
	matchTags: {bothTags: true},
	gutters: ["CodeMirror-linenumbers", "CodeMirror-foldgutter"],
	extraKeys: {
		"'>'": completeAfter,
		"'/'": completeIfAfterLt,
		"' '": completeIfInTag,
		"'='": completeIfInTag,
		"Ctrl-Space": "autocomplete",
		"Ctrl-Q": function(cm){ cm.foldCode(cm.getCursor()); },
		"Ctrl-M": "toMatchingTag"
	},
	hintOptions: {schemaInfo: tags}
});

The last trigger option (hintOptions) points to a variable written in JSON (here called tags) that defines the XML structure CodeMirror's autocompletion will suggest while you edit the file.

Let's imagine we want an editor for marking up the prose paragraphs (p) in the TEI body, within which we want to highlight personal names (persName) and place names (placeName).

Each person (persName) is identified by two attributes: a role, chosen from a predefined list (a controlled vocabulary), and a free-text key. It can also contain a forename, a surname and a placeName.

Each place (placeName) can contain a settlement, a region or a country.

This schema can be written in JSON like this:

var tags = {
        "!top": ["p"],        
        p: {
          children: ["persName", "placeName"]
        },
        persName: {
          attrs: {
            key: null,
            role: ["re","barone","conte"]
          },
          children: ["forename", "surname", "placeName"]
        },
        placeName: {
          children: ["settlement", "region", "country"]
        }
      };
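
To make the shape of this object concrete, here is a minimal sketch of how a helper might read it. The function names allowedChildren and allowedValues are hypothetical and are not part of CodeMirror. They only illustrate that children lists the permitted child tags and that attrs maps each attribute either to a list of values or to null for free text.

```javascript
// Same schema as above, repeated so the sketch is self-contained.
var tags = {
  "!top": ["p"],
  p: { children: ["persName", "placeName"] },
  persName: {
    attrs: { key: null, role: ["re", "barone", "conte"] },
    children: ["forename", "surname", "placeName"]
  },
  placeName: { children: ["settlement", "region", "country"] }
};

// Hypothetical helper: tags that may appear inside `tagName`.
// With no tag name it returns the root candidates listed under "!top".
// Simplification for this sketch: unknown tags yield an empty list.
function allowedChildren(schema, tagName) {
  if (!tagName) return schema["!top"] || [];
  var def = schema[tagName];
  return def && def.children ? def.children : [];
}

// Hypothetical helper: suggested values for an attribute.
// Returns null when the attribute takes free text or is not defined.
function allowedValues(schema, tagName, attrName) {
  var def = schema[tagName];
  if (!def || !def.attrs || !(attrName in def.attrs)) return null;
  return def.attrs[attrName];
}
```

For instance, allowedValues(tags, "persName", "role") gives the controlled vocabulary for role, while the key attribute gives null because it accepts free text.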

Here is an example of how the editor will guide you in writing XML.

Preview

This is a good starting point for developing a slightly more complex editor that:

  • defines the schema in XML rather than JSON (XML is easier to manage),
  • separates the teiHeader area from the text area,
  • validates the document,
  • works on prettified code,
  • introduces a management system for notes, which in TEI are not particularly user friendly,
  • Bonus: bundles a help panel that calls up the TEI Guidelines.

The editor is designed as a jQuery plugin in four files: xml4tei.js, xml4teiSchema2json.js, xml4teiNte.js and xml4teiHelp.js.

2) XML to write XML

See: xml4tei.js and xml4teiSchema2json.js.

Neither JSON nor XML was designed to be written or read by humans, but XML is easier to manage, so we will define our schema in XML and then translate it into JSON according to the CodeMirror specifications.

On GitHub there is a script that, with a few changes, is useful for our editor:

https://github.com/sergeyt/jQuery-xml2json/blob/master/src/xml2json.js

This script has been adapted to fit the CodeMirror specs (which, for example, require that the root tag be preceded by an exclamation point, which cannot be done in XML).

Without having to fight with quotes, double quotes and parentheses, the XML schema for writing XML-TEI just follows a few simple rules:

  • a tag is defined by its name;
  • attributes are defined within the tag: the content of each attribute is the list of its possible values, and an empty attribute accepts free text;
  • nested in the tag, the <children> element defines the allowed children; if the tag does not allow children, insert an empty <children/>.

So a structure similar to the JSON one shown above looks like this:

file: cm-tei-schema-sample.xml


	text
	
		body
	
	<body>
		docDate
		div
	</body>
    
        date
    
	
		<children/>
	
    
p

persName placeName

forename surname settlement region country

3) Separate the teiHeader part from the text, which means a different schema for each textarea

See: xml4tei.js.

To have one textarea with a schema for the teiHeader and another for the text, we can use the HTML5 data-* attributes to tell the trigger which schema to import.

	<textarea data-xmlschema="PATH/SCHEME_FILE_NAME.xml" rows="8" name="teiHeader" id="teiHeader" class="tei-editor">
	</textarea>

We use jQuery to instantiate the editor through a class (.tei-editor), so the trigger works for every textarea that carries that class.

	$('.tei-editor').each(function(index, myeditor) {      
		var xmlschema=$(this).data('xmlschema');
		var editor = CodeMirror.fromTextArea(myeditor, {
		.
		.
		.
		.
			hintOptions: {schemaInfo: $.fn.getCmJson(xmlschema)}
		});
	});
	     

4) Validate the document

See: xml4tei.js.

Then we need a function to validate our TEI document. The function, called "validate", is in the main plugin (xml4tei.js) and works in this way:

  • a button with class .validate triggers the function
  • the function joins the two pieces of XML into one TEI document
  • an ajax call points to a PHP script (validator.php) that validates the XML against the TEI RelaxNG Schema
	/* Validator */
	var validator = function() {
		$(".validate").on('click', function(e) {
			e.preventDefault();

			// Build the document from every editor panel
			var doc = [];
			$(".tei-editor").each(function(index, myeditor) {
				var val = getCodeMirrorNative(this).getValue();
				doc.push(val);
			});

			var txtdoc = doc.join("");

			var tei = '\n' + txtdoc + '\n';
			//console.log(tei);

			// Load validator.php with ajax
			$.ajax({
				type: "GET",
				async: true,
				cache: false,
				//data: {xml: tei},
				url: 'validator.php?xml=' + encodeURIComponent(tei),
				processData: false,
				contentType: 'text/xml',
				beforeSend: function() {
					$(".loader").show();
					$("#validation_response").empty();
				},
				success: function(response) {
					$(".loader").hide();
					$("#validation_response").append(response);
				},
				error: function(msg) { console.log(msg); }
			});
		});
	};
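
The "build the document" and "call the validator" steps can be isolated in two small pure functions, which makes them easier to check outside the browser. This is only a sketch: the function names are hypothetical, and the TEI root element with the standard TEI namespace is an assumption about how the two panels are joined, not something taken from xml4tei.js.

```javascript
// Assumption: the panels are wrapped in a <TEI> root element that carries
// the standard TEI namespace. The real plugin may build the root differently.
var TEI_NS = "http://www.tei-c.org/ns/1.0";

// Hypothetical helper: join the content of every panel into one document.
function buildTeiDocument(parts) {
  return '<TEI xmlns="' + TEI_NS + '">\n' + parts.join("") + '\n</TEI>';
}

// Hypothetical helper: URL for the GET request to validator.php.
// encodeURIComponent keeps characters such as <, >, & and # from
// breaking the query string.
function buildValidatorUrl(tei) {
  return "validator.php?xml=" + encodeURIComponent(tei);
}

var tei = buildTeiDocument(["<teiHeader/>", "<text><body><p>Hi</p></body></text>"]);
var url = buildValidatorUrl(tei);
```

Because the whole document travels in the query string, long documents may run into URL length limits. Sending the XML in the body of a POST request would avoid that, provided validator.php is changed to read it from there.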
		
		

One more problem remains before the validator is complete: the two panels are independent, but the line numbering must be continuous across them.

To do this we collect the two CodeMirror instances in an array, then make the line numbering of the second panel depend on the line count of the first, both at startup and during typing.

So the trigger now looks like this:

	var editors=[];
	$('.tei-editor').each(function(index, myeditor) {
		var xmlschema=$(this).data('xmlschema');
		var editor = CodeMirror.fromTextArea(myeditor, {
			mode: "xml",
			...
			...
			...
			hintOptions: {schemaInfo : $.fn.getCmJson(xmlschema)}
		});

		editors.push(editor);
	});

At the beginning you can see the creation of the array, and at the end each instance being pushed into it.

The function that makes the line numbers dynamic looks like this:

function manageLines(editors){
	// At startup, continue the numbering from the end of the first panel
	var hlines=editors[0].lineCount();
	var hhlines=hlines+1;
	editors[1].setOption('firstLineNumber', hhlines);

	// Keep the second panel in sync while the first one is edited
	editors[0].on('change', function(){
		var lc=editors[0].lineCount();
		var llc=lc+1;
		editors[1].setOption('firstLineNumber', llc);
	});
}
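
The arithmetic behind manageLines extends to any number of panels: each panel starts one line after the last line of all the panels before it. A minimal sketch, with a hypothetical function name, that works on plain line counts instead of CodeMirror instances:

```javascript
// Given the line count of each panel, in order, return the value that
// each panel should use for its firstLineNumber option.
function firstLineNumbers(lineCounts) {
  var result = [];
  var next = 1; // the first panel always starts at line 1
  for (var i = 0; i < lineCounts.length; i++) {
    result.push(next);
    next += lineCounts[i];
  }
  return result;
}

// Two panels: a 12-line teiHeader followed by the text panel.
var starts = firstLineNumbers([12, 40]);
// starts[1] is 13, the same value manageLines computes as lineCount() + 1
```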

Just call the function after the trigger.

	var editors=[];
	$('.tei-editor').each(function(index, myeditor) {
	    var xmlschema=$(this).data('xmlschema');
	    ....
	    editors.push(editor);
	    });	
	manageLines(editors);
	

5) Prettify the code

See: xml4tei.js.

It is useful to prettify the code before loading it. CodeMirror has some built-in methods, but I think it is better to do this with XSLT through JavaScript.

Another function, found on Stack Overflow, will be useful for our script:

		//Found on
		//https://stackoverflow.com/questions/376373/pretty-printing-xml-with-javascript
		var prettifyXml = function(sourceXml)
		{
		    var xmlDoc = new DOMParser().parseFromString(sourceXml, 'application/xml');
		    var xsltDoc = new DOMParser().parseFromString([
		        // describes how we want to modify the XML - indent everything
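		        '<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">',
		        '  <xsl:strip-space elements="*"/>',
		        '  <xsl:template match="para[content-style][not(text())]">', // change to just text() to strip space in text nodes
		        '    <xsl:value-of select="normalize-space(.)"/>',
		        '  </xsl:template>',
		        '  <xsl:template match="node()|@*">',
		        '    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>',
		        '  </xsl:template>',
		        '  <xsl:output indent="yes"/>',
		        '</xsl:stylesheet>',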
		    ].join('\n'), 'application/xml');
		
		    var xsltProcessor = new XSLTProcessor();    
		    xsltProcessor.importStylesheet(xsltDoc);
		    var resultDoc = xsltProcessor.transformToDocument(xmlDoc);
		    var resultXml = new XMLSerializer().serializeToString(resultDoc);
		    return resultXml;
		}; 

6) The NTE, a notetaking environment

See: xml4teiNte.js.

The NTE is one of the hardest parts to design. According to the TEI specs, text notes can be in-line with the document or detached from it (like classical end notes). I don't like in-line notes because I believe that, despite everything, the document must keep a minimum of human readability, and in-line notes weigh heavily on it.

The prototype of our document with notes will look like this:


 ...
  <body>
    

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas commodo vel velit a porttitor. Quisque cursus tellus felis, sit amet tempor justo commodo eget. Nulla facilisi. Sed felis turpis, vestibulum nec elit feugiat, pellentesque tempor arcu.

...

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas commodo vel velit a porttitor. Quisque cursus tellus felis, sit amet tempor justo commodo eget.

...
</body>

On the other hand, putting the notes at the bottom of the text element of the TEI document gets in the way of editing: the author would find themselves constantly scrolling through the document to add or delete a note.

Therefore, the note must appear in a modal window while being positioned at the end of the text, with the reference and the note linked by the target attribute, which acts as a unique ID. This attribute is randomly generated by the JavaScript.

The design must also prevent a reference to a note from being duplicated, while still allowing it to be moved to another part of the text without losing the internal connection.

The script must, of course, also provide a way to delete a note and its reference.
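
The requirements above (a random ID, a link between reference and note, no duplicated references, deletion of both ends) can be sketched with plain string handling. Everything here is an assumption made for illustration: the function names are hypothetical, the reference is written as a ref element with a target attribute, the note as a note element with an xml:id, and xml4teiNte.js may do all of this differently.

```javascript
// Random ID. XML IDs cannot start with a digit, so prefix a letter.
// Not guaranteed unique: a real implementation should check for collisions.
function makeNoteId() {
  return "n" + Math.random().toString(36).slice(2, 10);
}

// Markup for the two ends of the link (the element choice is an assumption).
function refMarkup(id) { return '<ref target="#' + id + '"/>'; }
function noteMarkup(id, text) {
  return '<note xml:id="' + id + '">' + text + '</note>';
}

// Count how many references point at a note, to detect duplicates.
function countRefs(xml, id) {
  return xml.split(refMarkup(id)).length - 1;
}

// Delete a note together with its reference.
// Assumes notes are not nested inside one another.
function deleteNote(xml, id) {
  var start = xml.indexOf('<note xml:id="' + id + '"');
  if (start !== -1) {
    var end = xml.indexOf("</note>", start);
    if (end !== -1) {
      xml = xml.slice(0, start) + xml.slice(end + "</note>".length);
    }
  }
  return xml.split(refMarkup(id)).join("");
}

var id = makeNoteId();
var doc = "<body><p>Lorem" + refMarkup(id) + " ipsum</p>" +
          noteMarkup(id, "A note") + "</body>";
var cleaned = deleteNote(doc, id);
```

A duplicate check before pasting could then be as simple as refusing the operation when countRefs would exceed 1 for the same ID.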

While we are using the editor, notes don't need to be numbered: the numbering (seriality) can be created when the document is transformed (with XSLT, JavaScript or any parsing library) into its final view outside the editor. You can also use the type attribute to get different numbering styles.

For example:

<note xml:id="yuiwjjui6" type="integer">  //Seriality: 1,2,3...
or
<note xml:id="yuiwjjui6" type="alpha">  //Seriality: a,b,c...
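
As a sketch of how the type attribute could drive the numbering at transformation time, the function below turns a note's position into a label. The name noteLabel is hypothetical, and only the two type values shown above are handled.

```javascript
// Convert a 1-based note position into a label for the final view.
// "alpha" gives a, b, c ... z, aa, ab ...
// Any other type falls back to integers (a choice made for this sketch).
function noteLabel(position, type) {
  if (type !== "alpha") return String(position);
  var label = "";
  var n = position;
  while (n > 0) {
    n -= 1; // shift to 0-based so that position 26 maps to "z"
    label = String.fromCharCode(97 + (n % 26)) + label;
    n = Math.floor(n / 26);
  }
  return label;
}
```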

7) A fancy interactive help

See: xml4teiHelp.js.

This plugin opens a Bootstrap modal window with a list of the tags chosen in the schemas. It also shows their allowed children and attributes. Moreover, it can retrieve the tag descriptions from the JSON version of the TEI Guidelines (either local or remote) and generates links to the HTML version of the Guidelines.

  • json Guidelines: https://www.tei-c.org/release/xml/tei/odd/p5subset.json
  • html Guidelines: http://www.tei-c.org/release/doc/tei-p5-doc/

The plugin has 3 options:

	langs: array,
	jsondriver : string,
	proxy : string,
	

langs: an array of language codes for the links to the HTML documentation.

langs : ['it','en','de','fr']

jsondriver: choose local if you want to get the tag descriptions from the JSON file teiresources/p5subset.json. It is faster, but you have to update the file by hand.
Choose proxy if you want to query the online Guidelines in JSON. The TEI server doesn't allow CORS, so you have to use a proxy.

jsondriver : "local|proxy"

proxy: the path to the CORS proxy server, the default is: https://cors-anywhere.herokuapp.com/

proxy : "https://cors-anywhere.herokuapp.com/"
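
The two ways of resolving the JSON Guidelines, and the per-language documentation links, can be sketched as follows. The function names are hypothetical. The ref-TAG.html pattern is how the published HTML Guidelines name their element reference pages, but whether the plugin builds its links exactly this way is an assumption.

```javascript
var JSON_GUIDELINES = "https://www.tei-c.org/release/xml/tei/odd/p5subset.json";
var HTML_GUIDELINES = "http://www.tei-c.org/release/doc/tei-p5-doc/";

// Where to read the tag descriptions from, depending on jsondriver.
function guidelinesJsonUrl(options) {
  if (options.jsondriver === "local") return "teiresources/p5subset.json";
  // "proxy": prefix the remote URL with the CORS proxy
  return options.proxy + JSON_GUIDELINES;
}

// One documentation link per configured language.
function guidelinesLinks(tagName, langs) {
  return langs.map(function (lang) {
    return HTML_GUIDELINES + lang + "/html/ref-" + tagName + ".html";
  });
}
```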

8) Call the script: options

The call is simply:

$.fn.xml4tei();

There are a few options:

//show/hide the buttons panel (default: true)
buttonsPanel : true|false 
//show/hide the Examples button (default: true)
examplesBtn : true|false 
//show/hide the Validator button (default: true)
validatorBtn : true|false
//show/hide the Help button (default: true)
helpBtn : true|false 

//Values passed to the help plug-in
//Languages for the Guidelines URLs (default: ['it','en']) 
helpLangs: array, e.g. ['it','en'],

//The source for the element definitions in the help
//(file: p5subset.json).
//If set to local, the script looks for it in the teiresources directory;
//if set to proxy, the script parses the online file:
//		http://www.tei-c.org/release/xml/tei/odd/p5subset.json
//fetching it through a proxy to get around cross-origin limits
helpJsondriver : "local|proxy",  
helpProxy : "https://cors-anywhere.herokuapp.com/"

//True if you want to load examples
//when the script loads. (default: false)
loadExamples: false|true		
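
A minimal sketch of how these options might be combined with their defaults. It assumes the options are passed as a single object, in the usual jQuery plugin style. The function name is hypothetical, and the helpJsondriver default is a guess, since the list above does not state one.

```javascript
// Defaults as listed above. helpJsondriver has no documented default,
// so "local" is an assumption made for this sketch.
var defaults = {
  buttonsPanel: true,
  examplesBtn: true,
  validatorBtn: true,
  helpBtn: true,
  helpLangs: ["it", "en"],
  helpJsondriver: "local",
  helpProxy: "https://cors-anywhere.herokuapp.com/",
  loadExamples: false
};

// Hypothetical helper: user options override the defaults,
// and the defaults object itself is left untouched.
function resolveOptions(userOptions) {
  return Object.assign({}, defaults, userOptions || {});
}

// Hide the Examples button and add German links to the help.
var settings = resolveOptions({
  examplesBtn: false,
  helpLangs: ["it", "en", "de"]
});
```

Under the same assumption, the corresponding call would be $.fn.xml4tei({ examplesBtn: false, helpLangs: ['it','en','de'] });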
	 	 
