A few weeks ago, 86 students of the University of Twente took part in my second experiment. The goal was to measure the (emotional) interaction experience of websites. To accomplish this, each student rated two different versions of a room search website.

The presented information (content) and the interaction path were the same for both websites. The websites differed on two other aspects. The first was presentation mode: one website was presented solely in text, while the other used as many images as possible to inform the participant about the rooms. The second was usability: for each presentation mode, a good usability version and a bad usability version were made. The websites were paired (visual-good usability with text-bad usability, and visual-bad usability with text-good usability), and each participant rated one pair. It soon became clear that the visual websites were hard to use, for a number of reasons. This meant some comparisons between the websites could not be made. In the end, however, the experiment was about testing the method for measuring interaction, not the websites.

The experiment could be performed online, and the same method for measuring emotions was used as in the first experiment. A screenshot of the website was shown for 10 seconds, after which the first (emotional) impression had to be rated. Then, two tasks had to be performed on the website in order to create an ‘interaction experience’. The same rating process was repeated once both tasks were completed. After that, the website had to be rated for beauty, goodness, and mental effort spent.

Scoring Cards
The scoring cards for the four websites are presented below.
website1.jpg
The Text, Good Usability website was rated fairly well after the screenshot, and even better after the interaction. The website scores average on beauty and high on goodness. The mental effort score is low, which is positive. The top-scoring emotion words do not change much for this website.
website3.jpg


The Text, Bad Usability website scored more in the negative octants than the Text, Good Usability (TU+) website, but it still scored rather positively in general. The scores for beauty and goodness were also a little lower for this text version. This was expected, as the layout was less clear and tidy than in the Good Usability version. The main functionality was the same, however, and so was the overall pattern: the emotion radar shifts to a more positive view after interaction, and the mental effort score is low.

website2.jpg
The first impression of the Visual, Good Usability website was very positive, as can be seen from the emotion radar after the screenshot. The second emotion radar is almost a mirror image of the first, shifting completely from positive to negative. Poor usability was probably the reason for this. It is also visible in the high mental effort rating and the low score for goodness. The emotion words also shift to negative, although some participants remain positive about the website.

website4.jpg
The Visual, Bad Usability website shows the same pattern as the Good Usability version, only more extreme. There is a higher score for mental effort and a lower score for goodness. The website is still regarded as beautiful, but the top-rated emotion words tell a story of disappointment and dissatisfaction.

Conclusions
The main goal of experiment 2 was to see whether differences could be measured in the emotional experience of a website before and after interaction. These differences were expected to show up in the distribution of ratings across the octants of Russell’s circular structure (translated into the Emotion Radar).
It was found that the interaction process was noticeable in the emotional experience of participants. The scoring of octants on the Emotion Radar differed for all websites. In a pre-test, neither of the visual websites was perceived as good on usability. This was clearly visible in the scoring cards. The websites were regarded as beautiful and scored high on the positive octants before interaction. After interaction, because of the poor usability, the negative octants were chosen most often. The differences were rather extreme for these websites and perhaps easier to measure. However, the differences in usability between the textual sites were more subtle, and the Emotion Radar was able to measure these nuances as well.
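To make the idea of a shift between octants a bit more concrete, here is a minimal Python sketch of how such a before/after comparison could be tallied. It is not the analysis used in the experiment. The octant labels, the split into positive and negative octants, and the function names are all assumptions made for illustration, and the two small lists in the usage note are invented placeholder ratings, not data from the study.

```python
from collections import Counter

# ASSUMPTION: hypothetical octant labels and a hypothetical positive/negative
# split. The actual labels used on the Emotion Radar are not given in this post.
POSITIVE_OCTANTS = {"excited", "happy", "content", "calm"}
NEGATIVE_OCTANTS = {"tense", "upset", "sad", "bored"}


def octant_shares(choices):
    """Return the fraction of ratings that fell into each chosen octant."""
    if not choices:
        raise ValueError("at least one rating is required")
    counts = Counter(choices)
    total = len(choices)
    return {octant: count / total for octant, count in counts.items()}


def positive_share(choices):
    """Return the fraction of ratings that fell into a positive octant."""
    if not choices:
        raise ValueError("at least one rating is required")
    positive = sum(1 for choice in choices if choice in POSITIVE_OCTANTS)
    return positive / len(choices)


def positive_shift(before, after):
    """Change in positive share from first impression to after interaction.

    A negative value means ratings moved toward the negative octants.
    """
    return positive_share(after) - positive_share(before)
```

With invented placeholder ratings such as `before = ["happy", "calm", "excited", "tense"]` and `after = ["upset", "tense", "sad", "calm"]`, `positive_shift(before, after)` is negative, which corresponds to the kind of positive-to-negative flip described for the visual websites.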

In a future post, I will go into more detail about the results, in particular Mahlke’s model and the interplay between non-instrumental and instrumental qualities, as found in the experiment.
