Current development on JAMWiki is primarily focused on maintenance rather than new features due to a lack of developer availability. If you are interested in working on JAMWiki please join the jamwiki-devel mailing list.

Tech comments:JAMWiki Design

Topic.topicContent

There are two content fields: Topic.topicContent and TopicVersion.versionContent. Does Topic.topicContent contain the current version? Mike 24-Jul-2007 05:21 PDT

Usually. That field is there as a convenience so that pages like view-topic-include.jsp don't need both a Topic and a TopicVersion object in order to display topic content. Most of the time topicContent will contain the latest version content, although when viewing topic history it would be set to the content of the version being viewed. -- Ryan 24-Jul-2007 09:17 PDT
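As a rough illustration of the convenience field described above, the sketch below uses two hypothetical classes, TopicSketch and TopicVersionSketch. They are simplified stand-ins, not JAMWiki's actual Topic and TopicVersion, and the showVersion() method is an assumption made for the example.

```java
// Hypothetical stand-in for a stored version: the content lives here.
final class TopicVersionSketch {
    final int versionId;
    final String versionContent;

    TopicVersionSketch(int versionId, String versionContent) {
        this.versionId = versionId;
        this.versionContent = versionContent;
    }
}

// Hypothetical stand-in for a topic that carries a copy of some version's
// content, so a view only needs this one object to render the page.
final class TopicSketch {
    final String name;
    private String topicContent;

    TopicSketch(String name, TopicVersionSketch currentVersion) {
        this.name = name;
        // Normally the convenience field mirrors the latest version.
        this.topicContent = currentVersion.versionContent;
    }

    String getTopicContent() {
        return topicContent;
    }

    // When browsing history, the field is set to the version being viewed.
    void showVersion(TopicVersionSketch version) {
        this.topicContent = version.versionContent;
    }
}
```

With this shape a page template reads getTopicContent() in both cases and does not need to know whether it is showing the latest version or an older one.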

Alternative parsers

Archived from the Feedback page:

Prior to starting work to support Mediawiki templates I started looking into whether another technology might make the parser simpler. The problem I'm running into with JFlex is that parsing is very linear, so if the parser matches something like ~~~~ there is no easy way to replace that with (for example) [[User:wrh2|Ryan]] 03-Oct-2006 14:57 PDT and then re-parse. This is an especially important issue for templates, where the code may need to perform multiple levels of parsing when faced with something like {{template1|foo={{template2|bar={{template3}}}}}}.

Thus far I've looked at:

  • JavaCC. The input file format is fairly difficult to follow, but this seems to offer the required functionality. Anyone have any experience? Are the input files really as difficult to work with as they appear to be?
  • Axel's parser. This might be a good option since it already includes a lot of useful functionality. The downside is that Axel would need to give an OK to integrate it into JAMWiki, the coding style is somewhat different from current JAMWiki code, and we'd need to figure out who would be the "official" maintainer of the code. If JAMWiki uses a parser that can't evolve rapidly (i.e. one that either Axel or I can change), then that would be a problem.
  • Antlr. I didn't look into this one too closely. Anyone have any experience with it?
  • Keep JFlex. JFlex can be made to work, I'm just wondering if other options might be preferable.
  • Something else?

If anyone has experience with these sorts of technology, feedback would be appreciated. -- Ryan 03-Oct-2006 16:31 PDT

Having looked into things a bit more, I think I'd lean towards adopting an approach similar to Axel's parser, or even to incorporate Axel's parser outright. The current JAMWiki parsing code is too complex and too inflexible. Axel's approach allows custom tags to be incorporated fairly easily, and is fairly easy to understand. I'm not sure if Axel wants to continue maintaining his parser as a separate project or not, so I'll wait to hear his thoughts. -- Ryan 04-Oct-2006 00:54 PDT
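The multi-level parsing problem raised in this thread can be sketched in a few lines. The example below is a minimal illustration, not JAMWiki's or Axel's parser: it repeatedly expands the innermost {{...}} call and then re-scans the result, which is one simple way to handle nesting. The class name TemplateExpanderSketch, the in-memory template map, and the {{{name}}} placeholder handling are all assumptions made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: expand nested template calls from the inside out.
final class TemplateExpanderSketch {
    // Matches a {{...}} call whose body contains no further braces,
    // i.e. the innermost call at the current pass.
    private static final Pattern INNERMOST = Pattern.compile("\\{\\{([^{}]*)\\}\\}");
    // Matches {{{name}}} placeholders that were not supplied by the caller.
    private static final Pattern UNUSED_PARAM = Pattern.compile("\\{\\{\\{[^{}]*\\}\\}\\}");
    // Guard against templates that include themselves.
    private static final int MAX_PASSES = 100;

    private final Map<String, String> templates;

    TemplateExpanderSketch(Map<String, String> templates) {
        this.templates = templates;
    }

    String expand(String text) {
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher m = INNERMOST.matcher(text);
            if (!m.find()) {
                return text; // nothing left to expand
            }
            // Replace one call, then re-scan the whole text on the next pass.
            text = text.substring(0, m.start()) + expandCall(m.group(1)) + text.substring(m.end());
        }
        throw new IllegalStateException("Too many expansion passes; possible template loop");
    }

    private String expandCall(String call) {
        String[] parts = call.split("\\|");
        String name = parts[0].trim();
        String body = templates.get(name);
        if (body == null) {
            return "[missing template: " + name + "]";
        }
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // positional parameters are ignored in this sketch
            }
            String param = parts[i].substring(0, eq).trim();
            body = body.replace("{{{" + param + "}}}", parts[i].substring(eq + 1));
        }
        // Drop placeholders that received no value.
        return UNUSED_PARAM.matcher(body).replaceAll("");
    }
}
```

Given templates template1 = "one[{{{foo}}}]", template2 = "two({{{bar}}})" and template3 = "three", expanding {{template1|foo={{template2|bar={{template3}}}}}} takes three passes. The sketch splits arguments on every "|", so it would mishandle values that themselves contain pipes, such as wiki links; a real parser needs a proper grammar for that.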

How does AnsiData/QueryHandler write Topic.content?

Moved from the Feedback page:

When saving new Topic content from an external resource (TopicSpaces) using the internal database, the topic content was saved and the page was created. After I moved to MySQL, the Topic itself gets saved, but not the content. Clicking on a recent changes link opens an edit page. Something changed.

I use this code

WikiBase.getDataHandler().writeTopic(topic, topicVersion, parserDocument, true, null);

which is driven by this code

ParserDocument parserDocument = Utilities.parseSave(parserInput, contents);

I do not see any reference to Topic.getTopicContent() except to drive Category setup. I do not see ParserDocument.getContent() anywhere. What am I missing? -- Jack 17-April-2007 13:44 PDST

I think you're looking for AnsiQueryHandler.insertTopicVersion() and topicVersion.getVersionContent(). The content is stored with the version, and the topic table then contains a pointer to the current version. And yes, there is probably a lot of room for code cleanups, documentation, and simplifications :) -- Ryan 17-Apr-2007 20:38 PDT
Thanks! What, however, differs between the MySQL and HSQLDB ways of handling a save? It sounds like I need to explicitly call saveVersion or something. -- Jack 18-Apr-2007 0729 PDT

I should point out that the code I use to save a topic also includes the following, all of which was adapted from org.jamwiki.servlets.EditServlet:

TopicVersion topicVersion = new TopicVersion(user, "TopicSpaces", "New TopicSpaces topic", contents);
topicVersion.setEditType(TopicVersion.EDIT_NORMAL);
-- Jack
Provided writeTopic() is used to write to the database then there shouldn't be any differences between databases - I test with Postgres, MySQL and HSQL on a regular basis and haven't encountered any problems. Do you have any error messages in the logs, or have any other files been modified? -- Ryan 18-Apr-2007 09:08 PDT
What I now know is that JAMWiki is properly writing the data. That I am not apparently properly retrieving the data was first thought to be associated with the toLower aspect of AnsiQueryHandler's topic fetching. I am now busy instrumenting code to see why I cannot retrieve that which appears to be properly saved. -- Jack
Provided lookupTopic() or lookupTopicVersion() are used to retrieve the topic then you should probably have all of the data you need available in the Topic and TopicVersion objects that are returned. If anything fails to get initialized please let me know and I can investigate, or alternatively if you find a bug feel free to fix it directly in Subversion (you have access, right?). I may have a little bit of time to investigate any issues tonight, but that will depend on how many active brain cells remain after getting off work. -- Ryan 18-Apr-2007 12:49 PDT
At this time, I am able to determine that TopicSpaces can see the saved data, but JAMWiki cannot, at least in terms of generating a URL to display it. That is to say, if I perform a search on the saved data, JAMWiki will find the proper Topic(s) that contain(s) that saved data, but it continues to think that the page itself needs editing; it will not open the page directly. A typical page it will not see carries the topic name: AirSubjectPropertyTypeDescription_0, which leads me to wonder if the trailing _0 is somehow changed by the wiki parser when defining the URL to use. Trailing dash numbers identify version numbers of certain content in TopicSpaces. Side note: I started writing this comment at the same time Ryan was just saving his comment above, so this final edit is added during content resolution. Yes, I got one of those big red notices that says the content needs resolution! As far as my instrumentation tells me, everything seems to be initialized properly. RecentChanges properly lists the topic names, but they display in red, which suggests that the parser (somehow) failed to find the content when setting up the RecentChanges page. As a final check, pasting the topic name into the browser bar confirms the "topic does not exist" even though search can find its content. More instrumentation coming up... -- Jack
Note that writeTopic() does more than just insert or update a record - for performance reasons it also updates cached values and the search index. If a topic is cached as "non-existent", it will show up as a red link until the cache record is updated. All of that code is consolidated in writeTopic() to make it easier to integrate, but if writeTopic() isn't used there will be problems. If records are created or inserted from another source then that source would need to perform the additional work that writeTopic() performs, but be warned that using anything other than writeTopic() may break with future JAMWiki versions. -- Ryan 18-Apr-2007 14:04 PDT
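The red-link behaviour described above follows from caching lookups that found nothing. The sketch below is a hypothetical illustration of that effect only; TopicStoreSketch and its methods are assumptions and do not reflect JAMWiki's actual cache or writeTopic() implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a store with a lookup cache that also remembers misses.
final class TopicStoreSketch {
    private final Map<String, String> database = new HashMap<>();
    // Optional.empty() means "looked up before and cached as non-existent".
    private final Map<String, Optional<String>> cache = new HashMap<>();

    Optional<String> lookup(String topicName) {
        // Only hits the database the first time a name is requested.
        return cache.computeIfAbsent(topicName, n -> Optional.ofNullable(database.get(n)));
    }

    // The supported write path: stores the record and refreshes the cache.
    void writeTopic(String topicName, String content) {
        database.put(topicName, content);
        cache.put(topicName, Optional.of(content));
    }

    // An external writer that touches only the database leaves stale cache entries.
    void insertBypassingCache(String topicName, String content) {
        database.put(topicName, content);
    }
}
```

If a topic is looked up before it exists and is then inserted with insertBypassingCache(), later lookups still report it as missing, which is the stale "non-existent" state that would render as a red link. Writing through writeTopic() keeps the two in step.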
Interesting observation -- loading this page tagSlotTypeDescription_0 gets this trace from AnsiDataHandler.lookupTopic: AnsiQueryHandler.lookupTopic topicName tagSlotTypeDescription 0
That is, something in the JAMWiki code is stripping out the underscore from the topicName. I'm tempted to think that's not playing nice, but maybe there's a reason for that? I'm wondering if I went to a dash rather than an underscore...Just tried that with tagSlotTypeDescription-0 and it appears to have left the dash in! This suggests a theoretical bug fix: change to dash from underscore. Believe I'll go try that... -- Jack
A conundrum: the JCR appears to replace '-' with '_' for a variety of internal reasons, meaning, I'm stuck with underscores. JAMWiki's Utilities.decodeFromURL replaces '_' with ' ', meaning, JAMWiki will not accept underscores. Open question: what necessitates removal of underscores in Topic names? -- Jack
The underscore removal is a Mediawiki-compatibility thing since they automatically convert underscores to spaces. -- Ryan 18-Apr-2007 19:11 PDT
It is not clear to me what you are saying here. Consider this valid Wikipedia URL: http://en.wikipedia.org/wiki/Bohm_Dialogue -- Jack 19-Apr 14:07 PDT
The topic name for that article is "Bohm Dialogue" - there is no way in Mediawiki to create a topic named "Bohm_Dialogue". The URL http://en.wikipedia.org/wiki/Bohm%20Dialogue will also go to that same topic. -- Ryan 19-Apr-2007 18:22 PDT
I have rewritten my code to avoid dashes or underscores. The problem has left the building! Plus which, I have a much better understanding of this aspect of the bowels of JAMWiki. Thanks. -- Jack 19-Apr-2007 PDT
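The underscore behaviour discussed in this thread can be reproduced with a small sketch. It is an assumption-laden illustration of Mediawiki-style name normalization, not the body of JAMWiki's Utilities.decodeFromURL; the class TopicNameSketch and the method fromUrlSegment are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical sketch: turn a URL path segment into a topic name.
final class TopicNameSketch {
    static String fromUrlSegment(String segment) {
        String decoded;
        try {
            // Percent-decoding turns %20 into a space.
            // Note that URLDecoder also turns '+' into a space.
            decoded = URLDecoder.decode(segment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        // Underscores and spaces are treated as the same character in topic names.
        return decoded.replace('_', ' ').trim();
    }
}
```

Under this rule Bohm_Dialogue and Bohm%20Dialogue both resolve to the topic "Bohm Dialogue", and tagSlotTypeDescription_0 becomes "tagSlotTypeDescription 0" as seen in the trace, while tagSlotTypeDescription-0 is left unchanged.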

Citations

Archived from the Feedback page:

I am looking at the code for WikiReference and how it appears to serve as a container for some content, where a vector of citations is passed to the parser. My goal is to implement something along the lines of Purple Numbers that grant fine-grained addressability to sentences, paragraphs, images, etc. Is there an explanation of how WikiReference objects could be extended to support that?

To be honest I haven't looked at the reference code in a while, so I'll need to refresh my memory to answer your question. I'm a bit fried from working at the moment, but I'll make a note to look through it some time in the next couple of days unless you figure out what you need without any additional input. Let me know if you come up with anything interesting! -- Ryan 06-Mar-2007 21:56 PST
Thanks! It seems to me that the trick, at least for me, is to wade through all the code to figure out when and where citations come into being, then decide if the WikiReference objects are appropriate to modify to include an identifier, if one doesn't already exist, such that a tweaked JSP code would write purple numbers usable for external direct reference. --Jack 07-Mar-2007 14:36 PST

Wiki / database integration

Archived from the Feedback page:

I'm developing a web based database front end that integrates with a wiki. Basically, the idea is that every record can have an associated wiki page. There's a one to many relationship between wiki pages and db records (each wiki page can be linked to from more than one record) and wiki titles can be included in db reports.

JAMWiki is used because amongst other things its db schema is simple to understand and integrate with. You can query the db storage directly to retrieve titles and content. However, somewhere along the way I'll probably look at the interface to choose which wiki page to link to a db record. What would be good would be if I could use the Lucene search engine in JAMWiki to search for content just like you do in JAMWiki itself. I've had a look and can call the JSP search results page by URL directly with a search string. However, for integration I'll need to use a modified template with a different design and probably some JavaScript. Is it / could it be possible to include a parameter in the request to use an alternative JSP results template?

Does anyone have any alternative ideas?

A video showing the product is here: http://www.gtportalbase.com/video/gtwp_section_leader.htm. The wiki stuff is about halfway through.

82.32.115.35 21-Apr-2007 12:29 PDT

Apologies for not responding to this issue sooner. It should be possible to re-work the search engine code to allow anyone to create an alternative search engine implementation by simply implementing the SearchEngine interface, although that's currently untested and incomplete in the existing code. In any case, it would be nice if someone who wanted to customize the search code could simply change a Special:Admin option, which would also allow JAMWiki to integrate more easily with sites that already have their own search engines. -- Ryan 06-May-2007 21:43 PDT
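To make the idea of a swappable search implementation concrete, here is a minimal sketch. The interface SearchBackendSketch is a hypothetical, much-reduced stand-in and is not JAMWiki's SearchEngine interface; the substring matcher and the SearchConfigSketch holder are likewise assumptions standing in for a Lucene-backed engine and a Special:Admin setting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, reduced search contract that alternative engines would implement.
interface SearchBackendSketch {
    void index(String topicName, String content);
    List<String> findTopics(String query);
}

// Trivial implementation: case-insensitive substring match over indexed content.
final class SubstringSearchSketch implements SearchBackendSketch {
    private final Map<String, String> contentByTopic = new LinkedHashMap<>();

    @Override
    public void index(String topicName, String content) {
        contentByTopic.put(topicName, content.toLowerCase(Locale.ROOT));
    }

    @Override
    public List<String> findTopics(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : contentByTopic.entrySet()) {
            if (entry.getValue().contains(needle)) {
                hits.add(entry.getKey()); // results keep indexing order
            }
        }
        return hits;
    }
}

// Stand-in for a configuration option that selects the active engine.
final class SearchConfigSketch {
    private static SearchBackendSketch backend = new SubstringSearchSketch();

    static void use(SearchBackendSketch replacement) {
        backend = replacement;
    }

    static SearchBackendSketch current() {
        return backend;
    }
}
```

Callers that only ever ask SearchConfigSketch.current() for the engine would not need to change when a site plugs in its own implementation, which is the integration property the comment above is after.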

Assuring Security by testing

Archived from the Feedback page:

Hi devs,

I've been investigating JAMWiki as part of my Bachelor's thesis "Application of security test tools in open source" at the Free University of Berlin (FU Berlin). Basically, I am looking for measures that have been taken to prevent security leaks and vulnerabilities, especially with security test tools that provide fuzzing capabilities for SQL injection, parameter tampering, path traversal etc.

So far, I have searched the repository and the homepage. Surefire runs JUnit test cases which are not designed to do any security testing. The homepage revealed some ideas about fuzzing (no longer available through Lucene) but no measures in this direction have been taken.

Are any measures taken whatsoever to assure security with testing tools, a special test plan or functional requirements?

Thanks in advance,

Michael

With respect to security, here are several general areas of focus within JAMWiki:
  • Several of the unit tests are designed to detect XSS vulnerabilities (originally suggested by NickJ). These tests can be found in the source code in the /jamwiki-core/src/test/resources/data/topics/ directory and contain "XSS" in the name.
  • All user passwords are encrypted using strong password encryption.
  • Prepared statements are used for all database queries to avoid SQL injection.
  • The Acegi security framework is used to control access to pages.
  • Javascript within wiki syntax is disabled by default.
If there are additional automated tools that can be used then that would definitely be of interest, and if legitimate holes are found then it would be a priority to close them. Hopefully that answers your question! -- Ryan 30-Apr-2008 07:59 PDT
Ryan,
thanks for your quick reply. I checked out the source from svn again and found the two test cases for XSS. They seem like a good start to me. Have you already been able to track down any holes with them?
The four security measures you take are great and remove most of the attack area. By the way, I assume you mean that the passwords are hashed with a one-way hash function such as SHA-1 rather than encrypted, don't you? Encrypting would mean that I could decrypt them again with a password.
That answers the question pretty well. I assume for now that there is no explicit security testing, with or without dedicated testing tools, apart from those XSS test resources.
Tools that may be of interest to you are Absinthe, Wfuzz, and some others reviewed here: http://www.tssci-security.com/archives/2007/11/24/2007-security-testing-tools-in-review/
Most of the test cases are there as a result of a previous bug report - the XSS unit tests resulted from bug reports from NickJ, who also does a lot of security testing on Mediawiki. And yes, the passwords are ciphered - sorry, I'm awful with terminology (my talent is in writing code - remembering names is something I fail miserably at). The default cipher for passwords is SHA-512, but for older JDKs and systems that don't support SHA-512 it falls back to SHA-1.
There isn't currently explicit security testing using the tools you've mentioned, but anyone interested in adding it is welcome to do so. Additionally, I've had fuzz-testing on my to-do list forever, but haven't found the time to implement anything. -- Ryan 30-Apr-2008 10:51 PDT
Ryan, thanks for the link over to NickJ. He has his own website with the fuzzer "mangleme". I'll try to run some tools on JAMWiki and will report to you in the next couple of weeks. Mike
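The digest fallback mentioned in this thread (SHA-512 where available, otherwise SHA-1) can be sketched with the standard java.security.MessageDigest API. This is an illustration under assumptions, not JAMWiki's password code: the class PasswordDigestSketch is hypothetical, the hex output format is a choice made for the example, and the thread does not say whether a salt is used, so none is shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: one-way password digest with an algorithm fallback.
final class PasswordDigestSketch {
    // Preferred algorithm first; SHA-1 is only used if SHA-512 is unavailable.
    private static final String[] ALGORITHMS = {"SHA-512", "SHA-1"};

    private static MessageDigest newDigest() {
        for (String algorithm : ALGORITHMS) {
            try {
                return MessageDigest.getInstance(algorithm);
            } catch (NoSuchAlgorithmException e) {
                // Not supported by this runtime; try the next algorithm.
            }
        }
        throw new IllegalStateException("No supported digest algorithm available");
    }

    // Returns the digest as lowercase hex; the input cannot be recovered from it.
    static String hash(String password) {
        byte[] bytes = newDigest().digest(password.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Checking a login then means hashing the submitted password and comparing the result with the stored value, which matches the hashing-versus-encrypting distinction raised above.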