Software and People

As I have mentioned elsewhere, I'm currently spending some time and effort trying to work out how to maximise the effectiveness of Wiki technology in education, using the developing features of my own Wiki implementation as a test bench.

One of the most significant things I have noticed every time I have tried Wiki technology in the classroom is the usage pattern. Wiki is a web technology, served by web servers to users' browsers over HTTP. There has been a great deal of analysis of internet usage patterns, which I won't go into here. However, a "rule of thumb" that I have successfully used for capacity planning on several projects is that traffic for a globally-accessible web site on a weekday is roughly twice what it is at a weekend, and traffic when the USA is awake is roughly twice what it is when the USA is asleep. All the sufficiently busy sites I have examined (such as the software developer resource site JavaRanch.com) seem to tend toward a fairly smooth usage curve.

All of this accumulated experience fails in the face of specific situations, though. Imagine I suggest to my students at 10:35 one morning that we do some work on a Wiki. If the students are at all interested in the course, it's pretty obvious that the great majority of them will try to access the same system within the next few minutes. No smooth curves here - just very sharp spikes. This usage pattern can have all sorts of consequences. For an effectively static site or web application, where the main activity of these visitors is to read or view content, the issue becomes one of allowing enough "headroom" to handle the expected peak load, or of somehow caching content so that preprocessed output can be served to multiple clients. For an application like a Wiki, which is very interactive (often with several users trying to edit the same information at once), it all becomes much more complicated. Caching is worse than useless - a student who is shown old page contents when he thinks he has changed something is quite likely to panic, repeatedly resubmit changes (making the usage spike even worse), or even abandon the whole exercise and miss out on a major part of the intended learning.

This usage pattern is also sufficiently far from the original context in which Wiki technology was designed that it puts pressure on some of the original design decisions. Many Wiki implementations (mine included) rely on experiential data from the original wiki at c2.com. In particular, long usage has shown that "edit clashes" (where two users try to submit differing versions of the same page to the system at once) are so rare that they can successfully be left to the individuals involved to sort out between themselves. It's clear that during the massive usage spikes found in a classroom setting this does not hold true.

During one internet research activity I ran for a class of about 20 students, I asked them to choose (as a group) a recent topical news story, then (as individuals or small teams) search the web and any news sites they could think of, and paste URLs to as many related pages as they could onto a Wiki page. Some aspects of this activity were very positive. Adding links to a Wiki page is simple and transparent, and pooling the results of all the groups' research helped avoid redundant effort and share knowledge of web research techniques in a very natural way.

What didn't work was the traditional Wiki laissez-faire approach to coordinating edit clashes. During the initial flurry of activity, where most of the students were "cherry picking" the obvious hits from the likes of Google and the BBC, at least half of the intended additions never made it to the list. Things began to settle down as the interval between new additions became longer, but the initial performance of the Wiki as a link-gatherer was not really acceptable.

There are several possibilities to remedy this sort of situation. The one that seems to have had the most discussion among the general Wiki community is the approach of simply "bouncing" edits which clash. A typical implementation of this approach might store some sort of "version number" with each page, and (when an edited page is submitted for update) check the version number in the edit submission against the current version of the page. If the stored page version is higher than the one in the edit submission, things have changed and the edit is invalid. The main problem with this approach is that the user whose edit was bounced has to remember what they changed and re-apply it to a page that now looks different.
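To make the "bouncing" idea concrete, here is a minimal sketch in Python. It is not how Friki or any other particular Wiki does it; the names VersionedWiki, Page and EditClash are invented for illustration, and pages live in a plain in-memory dictionary.

```python
from dataclasses import dataclass


class EditClash(Exception):
    """Raised when an edit was based on an out-of-date page version."""


@dataclass
class Page:
    text: str = ""
    version: int = 0


class VersionedWiki:
    """Hypothetical in-memory page store that bounces clashing edits."""

    def __init__(self):
        self.pages = {}

    def read(self, name):
        # Hand out the version along with the text so that the edit form
        # can send it back with the submission.
        page = self.pages.setdefault(name, Page())
        return page.text, page.version

    def submit(self, name, new_text, base_version):
        page = self.pages.setdefault(name, Page())
        # The stored version is higher than the one the edit was based on:
        # somebody else got there first, so the edit is bounced.
        if page.version > base_version:
            raise EditClash(
                f"{name} is now at version {page.version}, "
                f"but the edit was based on version {base_version}"
            )
        page.text = new_text
        page.version += 1
        return page.version
```

If two students both read version 0 of a page, the first submit succeeds and moves the page to version 1, while the second raises EditClash and leaves the student to redo the change by hand. A real server would also need to make the check and the update atomic.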

Another possibility is some sort of "smart merging", where the system calculates the actual differences between the submitted page and the one it was based on, and automatically applies those changes to the current version of the text. This would work great for a list of URLs, but could go very wrong with more subtle changes to the meaning and arrangement of written work.
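As a rough sketch of what this might look like for the link-gathering case, the Python below uses the standard difflib module to work out which lines a student added or removed, then replays those changes on the current page. This is a deliberately naive line-level merge, not a proper three-way merge, and the function name naive_merge is my own invention.

```python
import difflib


def naive_merge(base, submitted, current):
    """Replay the line-level changes between base and submitted onto current.

    Added lines are appended to the end of the current page and removed
    lines are dropped from it. Position within the page is ignored.
    """
    merged = current.splitlines()
    changes = difflib.ndiff(base.splitlines(), submitted.splitlines())
    for entry in changes:
        tag, line = entry[:2], entry[2:]
        if tag == "- " and line in merged:
            # The editor deleted this line, so delete it here too.
            merged.remove(line)
        elif tag == "+ " and line not in merged:
            # The editor added this line; skip it if someone else already did.
            merged.append(line)
        # Unchanged lines and difflib's "? " hint lines need no action.
    return "\n".join(merged)
```

For a list of URLs this behaves sensibly: two students who each add a different link to the same starting page both end up on the list. Because every addition lands at the end of the page, the same code applied to a paragraph of prose would scatter edited sentences away from their context, which is exactly the sort of thing that could go very wrong.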

A third feasible possibility is to somehow mark whole pages, or parts of pages, as "append only", and apply all incoming changes in a first-come, first-served manner. This, again, would be a good solution to something like link gathering, and even to sequential comments. This is effectively the approach taken by the comment facility in this blog software, for example.
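An append-only page is simple enough to sketch in a few lines of Python. The class name AppendOnlyPage is hypothetical, and a lock stands in for whatever serialisation a real Wiki server would use.

```python
import threading


class AppendOnlyPage:
    """Hypothetical page that only accepts additions, first come first served."""

    def __init__(self):
        self._lines = []
        self._lock = threading.Lock()

    def append(self, line):
        # The lock serialises simultaneous submissions, so each one is
        # simply added after whatever arrived before it.
        with self._lock:
            self._lines.append(line)
            return len(self._lines)

    def text(self):
        with self._lock:
            return "\n".join(self._lines)
```

Since nobody ever submits a whole page, there is no old version to clash with: each student sends only the new link, and every link that arrives is kept.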

More off-the-wall solutions include "branching" the page every time there is an edit clash - keeping several independent versions of the conflicting pages and requiring some sort of manual merging when activity has slowed, or even cloning the whole page base for each user or implementing some sort of two-phase-commit transactional protocol as used in big database systems.
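Of these, only the branching idea is small enough to sketch here. The Python below assumes the same kind of per-page version number as the "bouncing" approach, but parks a clashing edit as a branch instead of rejecting it. The class name BranchingPage is made up, and the manual merge step is left out entirely.

```python
class BranchingPage:
    """Hypothetical page that keeps clashing edits as branches for later merging."""

    def __init__(self, text=""):
        self.text = text
        self.version = 0
        # Each branch is (version the edit was based on, submitted text).
        self.branches = []

    def submit(self, new_text, base_version):
        if base_version < self.version:
            # Edit clash: keep the submission rather than losing it.
            self.branches.append((base_version, new_text))
            return False
        self.text = new_text
        self.version += 1
        return True
```

Nothing is lost during the initial flurry, but somebody still has to go through the branches list by hand once things have settled down.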

If anyone has any other solution suggestions, I'd love to hear them.

James Farmer at "incorporated subversion" has spent some time researching Wiki implementations. I'm chuffed to report that my Wiki implementation Friki gets a nod as being "nice", but obviously slightly saddened that it doesn't make his final shortlist.

I understand the reasons for his choices, and can see that Friki (in its current version, at least) would not be a particularly good fit for his needs. However, I would like to reiterate (on the off-chance that anyone reads this) that Friki is under continual development and I'm always willing to accept suggestions for improvements and new features. One of my major themes for the next few releases is to improve Friki to provide a better fit for educational uses, so if you have any suggestions (or feature envy from other software) in this area, please let me know.