Beginning with JAMWiki 0.6.5 the parser was modified to make unit testing significantly more robust. As a result it is now much easier to validate parser output against Mediawiki, and that comparison has exposed some disturbingly bad coverage in areas such as unbalanced tag handling (example: <b><i>unbalanced</b></i>).
It should be a goal of JAMWiki to produce output that is as close to Mediawiki as possible.
JAMWiki 0.6.5 began the work of improving the parser by treating parsed tags as a stack of tokens. This change allows significantly more flexibility, and should lead to the ability to handle unbalanced tags and other issues that Mediawiki currently deals with nicely.
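To illustrate why a stack helps with unbalanced tags, here is a minimal sketch. It is not JAMWiki's actual parser code: the class name TagStackSketch, the balance method, and the simplified tag pattern (no attributes) are all assumptions made for the example. A closing tag that matches something deeper in the stack first closes everything opened after it, and a closing tag with no match is dropped.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; these names are not taken from the JAMWiki code base.
class TagStackSketch {
    // Matches simple tags such as <b> or </i>; attributes are ignored for brevity.
    private static final Pattern TAG = Pattern.compile("<(/?)([a-zA-Z]+)>");

    static String balance(String input) {
        Deque<String> open = new ArrayDeque<>(); // currently open tags, innermost on top
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int pos = 0;
        while (m.find()) {
            out.append(input, pos, m.start()); // plain text between tags
            pos = m.end();
            String name = m.group(2).toLowerCase();
            if (m.group(1).isEmpty()) {
                // Opening tag: remember it and emit it.
                open.push(name);
                out.append('<').append(name).append('>');
            } else if (open.contains(name)) {
                // Closing tag: close anything opened after the match, then the match itself.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            }
            // A closing tag with no matching open tag is silently dropped.
        }
        out.append(input.substring(pos));
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints <b><i>unbalanced</i></b>
        System.out.println(balance("<b><i>unbalanced</b></i>"));
    }
}
```

Note that this sketch does not reopen tags that were force-closed, so it is only one possible recovery strategy and may not match Mediawiki's output in every case.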
Modifications over the past few days include the following:
<!-- old output -->
<ul><li>list item </li></ul>
<!-- new output, matches Mediawiki -->
<ul> <li>list item</li> </ul>
A long-standing issue with the JAMWiki parser should be resolved by revision 2146. This change means that JAMWiki can finally parse the following examples of bold / italic text correctly:
this is '''''bold''' and italic'' text
this is '''''italic'' and bold''' text
this is '''bold''''' followed by italic'' text
this is ''italic''''' followed by bold''' text
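The difficulty with these examples is that a run of five apostrophes does not say whether bold or italic is the outer tag until the first closing run is seen. One way to handle that, sketched below, is a small state machine that holds text back while the nesting order is undecided. This is an illustration only and is not the code from revision 2146: the class name QuoteSketch, the render method, and the state names are assumptions, and the sketch works on a single line with no other markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a quote state machine; not JAMWiki's actual parser code.
class QuoteSketch {
    private static final Pattern QUOTES = Pattern.compile("'+");

    static String render(String line) {
        StringBuilder out = new StringBuilder();
        StringBuilder held = new StringBuilder(); // text held while nesting order is undecided
        String state = ""; // one of "", "i", "b", "ib", "bi", "both"
        Matcher m = QUOTES.matcher(line);
        int pos = 0;
        while (m.find()) {
            String run = m.group();
            int n = run.length();
            // Only 2, 3 or 5 apostrophes act as markup; any surplus stays literal.
            int keep = (n == 1) ? 0 : (n == 4) ? 3 : Math.min(n, 5);
            String text = line.substring(pos, m.start()) + run.substring(0, n - keep);
            pos = m.end();
            ("both".equals(state) ? held : out).append(text);
            if (keep == 0) {
                continue;
            }
            if (state.equals("both")) {
                // The first closing run decides which tag was the outer one.
                if (keep == 2) {
                    out.append("<b><i>").append(held).append("</i>");
                    state = "b";
                } else if (keep == 3) {
                    out.append("<i><b>").append(held).append("</b>");
                    state = "i";
                } else {
                    out.append("<i><b>").append(held).append("</b></i>");
                    state = "";
                }
                held.setLength(0);
            } else if (keep == 2) {
                switch (state) {
                    case "i": out.append("</i>"); state = ""; break;
                    case "bi": out.append("</i>"); state = "b"; break;
                    case "ib": out.append("</b></i><b>"); state = "b"; break;
                    default: out.append("<i>"); state = state + "i";
                }
            } else if (keep == 3) {
                switch (state) {
                    case "b": out.append("</b>"); state = ""; break;
                    case "ib": out.append("</b>"); state = "i"; break;
                    case "bi": out.append("</i></b><i>"); state = "i"; break;
                    default: out.append("<b>"); state = state + "b";
                }
            } else {
                switch (state) {
                    case "b": out.append("</b><i>"); state = "i"; break;
                    case "i": out.append("</i><b>"); state = "b"; break;
                    case "bi": out.append("</i></b>"); state = ""; break;
                    case "ib": out.append("</b></i>"); state = ""; break;
                    default: state = "both"; // order unknown until the next run
                }
            }
        }
        ("both".equals(state) ? held : out).append(line.substring(pos));
        // Close anything still open at the end of the line.
        switch (state) {
            case "b": out.append("</b>"); break;
            case "i": out.append("</i>"); break;
            case "ib": out.append("</b></i>"); break;
            case "bi": out.append("</i></b>"); break;
            case "both":
                if (held.length() > 0) {
                    out.append("<b><i>").append(held).append("</i></b>");
                }
                break;
            default: break;
        }
        return out.toString();
    }
}
```

Under these assumptions, the first example renders as this is <i><b>bold</b> and italic</i> text, and the third as this is <b>bold</b><i> followed by italic</i> text.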
With all of the changes made to the parser thus far there have been some minor regressions - notably it seems that section edits are unnecessarily trimming newlines - but I haven't noticed any major issues. Remaining tasks include looking into whether paragraph parsing can be improved / simplified, adding better handling of unbalanced HTML tags, and addressing some of the reports of Mediawiki parsing differences that have been made on the Feedback page. Once that work is complete and enough unit tests have been created to verify that everything is working as expected the final 0.6.6 code should be ready for release. -- Ryan 05-Apr-2008 12:47 PDT
The current status of the parser changes is as follows:
The parser presently handles paragraphs during a "post-processor" parsing run, but I think that this parsing can be moved into the main "processor" run, which should eliminate some complexity and solve many problems. I don't know if that's something that can be implemented this weekend or whether it will take a couple of weeks, but that's the next item on my to-do list. -- Ryan 12-Apr-2008 18:04 PDT
I was up late last night and spent additional time on parser work today, so here's another update:
-- Ryan 13-Apr-2008 16:38 PDT
I suspect that parser updates will be a major focus area of mine during the 0.6.6 release cycle, and if the improvements are significant enough they could warrant bumping the next version to 0.7.0. These changes should also be good for outside developers who want to use the JAMWiki parser engine, as the engine should be made more robust. -- Ryan 23-Mar-2008 10:50 PDT