Using an empirical study to evaluate the feasibility of a new usability inspection technique for paper based prototypes of web applications
Journal of Software Engineering Research and Development volume 1, Article number: 2 (2013)
Usability is one of the most important factors determining the quality of Web applications, and it can be verified through usability inspection. This paper presents the Web Design Usability Evaluation (Web DUE) technique, which allows the identification of usability problems in low-fidelity prototypes (or mockups) of Web applications during the design phases of development. We also propose the Mockup Design Usability Evaluation (Mockup DUE) tool, which assists inspectors using the Web DUE technique.
In order to verify the feasibility of these technologies, we have performed two empirical studies. During the first study, we compared the effectiveness and efficiency indicators of the Web DUE technique with the ones of its predecessor, the Web Design Perspective (WDP) based usability inspection technique. Also, during the second study, experienced inspectors used the Mockup DUE tool and answered a questionnaire aiming at identifying improvement opportunities in its design.
The analysis of the quantitative data showed that the Web DUE technique allowed the identification of more usability problems in less time when compared to the WDP technique. Moreover, the qualitative data from the second empirical study provided information on the tool’s perceived ease of use, indicating that inspectors were satisfied and that they would use it to perform a usability inspection with the Web DUE technique.
These results showed that the DUE technologies could be applied in the identification of usability problems early in the design of Web applications. Thus, their use could enable the correction of such problems before the source code of the application is written.
In recent years Web applications have become very important (Oztekin et al., 2009). This type of application is currently the backbone of business and information exchange, and is used to present products and services to potential customers (Fernandez et al., 2011). In this context, usability plays a fundamental role since it affects both the acceptability and quality of Web applications (Matera et al., 2006).
The software development industry has invested in the development of a variety of Usability Inspection Methods (UIMs) to address Web usability issues (Matera et al., 2006). UIMs are procedures in which inspectors examine usability-related aspects of a user interface (Rocha and Baranauska, 2003). However, in (Rivero et al., 2013), we identified that there is a lack of UIMs that are able to find usability problems in early stages of the development process.
In this paper we present the proposal and empirical evaluation of the Web Design Usability Evaluation (Web DUE) technique and its tool support, the Mockup Design Usability Evaluation (Mockup DUE) tool. The Web DUE technique allows inspectors to identify usability problems in earlier stages of the development process by evaluating low-fidelity Web prototypes, or mockups. These mockups are images of what the software would look like after implementation. Furthermore, the Mockup DUE tool assists inspectors using the Web DUE technique by allowing them to: (a) interact with mockups as if they were a real application, and (b) use the Web DUE technique to find usability problems.
This paper is organized as follows. Related work on UIMs for Web applications presents the background knowledge of Usability Inspection Methods for Web applications and related work for this research. In The Web design usability evaluation technique we present the Web DUE technique proposal. Section First empirical study: evaluating the feasibility of the Web DUE technique shows how we carried out and analyzed the results of the feasibility study of the Web DUE technique. We then present the motivation and main functionalities of the Mockup DUE tool in Section The mockup design usability evaluation tool. Next, in Second empirical study: evaluating the mockup DUE tool, we show how we performed an empirical study to address the Mockup DUE tool’s perceived ease of use and the satisfaction of its users. Finally Conclusions and future work presents our conclusions and future work.
Related work on UIMs for Web applications
Usability Inspection Methods (UIMs) are evaluation methods in which experienced inspectors or the development team review the usability aspects of the software artifacts. In UIMs, the inspectors base their evaluation on guidelines that check the system’s level of achievement of usability attributes. The obtained results can be used to predict whether there will be a usability problem. The main advantage of UIMs is that they can lower the cost of finding usability problems since they do not need any special equipment or laboratory to be performed (Rocha and Baranauska, 2003).
Regarding UIMs for the Web, in our previous work (Rivero et al., 2013) we extended the systematic mapping by Fernandez et al. (2011) to draw conclusions about the state of the art of UIMs for Web applications. During this extension we verified that around 77% of the reviewed papers reported UIMs analyzing finished Web applications or at least functional prototypes. Moreover, the remaining papers described automated techniques in which HTML code was verified, or in which UIMs evaluated whether the application model met interaction rules within the Web domain.
There are some UIMs that could be applied in early stages of the development of Web applications (Rivero et al., 2013). Allen et al. (2006) present the Paper-Based Heuristic Evaluation, which was designed for assessing the degree of usability of mockups of medical Web applications. During the inspection process, the inspectors evaluate mockups or screenshots of the Web application using a set of usability heuristics. Furthermore, Molina and Troval (2009) suggest the Model Driven Engineering of Web Information Systems. This method proposes integrating usability requirements during the specification of Web applications and allows inspectors to check 50 metrics within the navigational models in order to evaluate whether they meet usability features. Finally, the Comprehensive Model for Web Sites Quality (Signore, 2005) uses five perspectives in order to evaluate the usability of Web applications: correctness, presentation, content, navigation and interaction. According to Signore (2005), the correctness perspective is directly related to code quality while the other perspectives are related to the users’ opinion. During the inspection process, a tool verifies correctness problems and then the inspectors relate these problems to the other perspectives to identify usability problems in software models.
Despite academia’s efforts in developing UIMs for Web applications, there is still room for improvement. When analyzing the current state of UIMs for the Web (Rivero and Conte, 2012a), we identified that emerging UIMs for Web applications should be able to: (a) find usability problems in the initial stages of the development process; (b) aid in both the identification and solution of usability problems; and (c) provide assistance by means of a tool to reduce the inspector’s effort.
The Web design usability evaluation technique
A. The Web DUE proposal
The Web Design Usability Evaluation (Web DUE) technique is an inspection method based on checklists that was proposed to meet the needs of the software development industry regarding the usability evaluation of Web applications. Consequently, it mainly focuses on identifying usability problems in early stages of the development process. Therefore, the Web DUE technique can be used to evaluate the usability of mockups. This feature is important as finding usability problems in earlier stages of the development process can lower the cost of correcting them (Rivero and Conte, 2012a).
The main innovation of the Web DUE technique is its ability to guide inspectors through the evaluation process by using specific “pieces” that compose the Web pages of Web applications. These “pieces” are called Web page zones (Fons et al., 2008) and they contain specific components within Web pages. Table 1 shows the list of the Web page zones used by the Web DUE technique, the suitability of including such zones in a Web page, and a brief description of their contents. In this table, “Mandatory” means that the zone is essential; “Optional” means that including/excluding the zone will not affect the provided functionality; and “Depends on the functionality” means that, depending on the purpose of the Web page, it might be necessary to include the zone. A more thorough description of each Web page zone can be found in (Rivero et al., 2013).
The main advantage of the use of Web page zones is that they can aid inspectors in evaluating only the elements that are present in the evaluated mockup. Another expected benefit of the use of Web page zones is that they can provide inspectors with a guideline to identify the elements within the application.
We based the Web DUE technique on the Web Design Perspective (WDP) based usability evaluation technique (Conte et al., 2009). This technique evolves the Heuristic Evaluation (Nielsen, 1992) in order to evaluate the usability of Web applications. Therefore, it relates the usability of Web applications to three main perspectives: concept, navigation and presentation. Using these perspectives and the heuristics from the Heuristic Evaluation, it creates pairs HxP by combining heuristics and perspectives when there is a relationship between them. For each pair HxP, the WDP provides a set of hints in order to aid inspectors in finding usability problems when evaluating Web applications. Furthermore, the WDP technique has proved feasible for the usability evaluation of Web applications in several empirical studies (Conte et al., 2009).
We extracted the hints from the WDP technique and related them to each of the Web page zones, creating usability verification items. These verification items check the usability properties of each Web page zone and are grouped in checklists. There is a checklist for each Web page zone. Inspectors use these checklists to verify whether the evaluated low-fidelity prototype meets usability principles. Table 2 shows part of the verification items for the Data Entry Web page zone. All the usability verification items from the Web DUE technique can be found in our previous work (Rivero et al., 2013).
B. Web DUE inspection process
The steps of the simplified inspection process of the Web DUE technique are shown in Figure 1. We have used these steps to evaluate the usability of a mockup which was created based on a page of the Journal and Event Management System (JEMS).
The first step for the identification of usability problems is to divide the paper based prototype into Web page zones. Figure 1 shows the identified zones within the mockup: System’s State zone, Data Entry zone and Navigation zone. After identifying the Web page zones, inspectors must check if the mockup meets all the items described within each of the checklists per zone. Table 3 shows some of the usability verification items that were not met by the prototyped Web page and their associated Web page zones.
Inspectors must point out in the mockups which components within each Web page zone did not meet the usability verification items. If we look at Figure 1 and Table 3 simultaneously, we can relate the nonconformity of the usability verification items in Table 3 with the augmented elements A, B and C in Figure 1. We will address each of the encountered usability problems as follows.
Regarding the System’s State zone, the usability verification item 01 indicated that, despite showing the actual state of the system, the prototype does not show it logically (see Figure 1 element A). In other words, the prototype does not show how the user reached that state.
Regarding the Data Entry zone, we identified the nonconformity 02 which indicates that the mockup does not request data in a logical way. Asking for the country’s state before informing the country is not coherent (see Figure 1 element B).
During the evaluation of the Navigation zone we encountered nonconformity 03, which indicates that the symbols used within the navigation zone are difficult to understand. A user would find it confusing that the “globe” symbol would lead to the JEMS portal (see Figure 1 element C).
We have shown the simplified inspection process of the Web DUE technique by evaluating a low-fidelity prototype of a Web page. In order to evaluate the entire Web application, all Web pages within the Web application must be evaluated. Furthermore, the number of mockups (number of sketched Web pages) must be enough to simulate a user task or use case in the Web application. Also, these mockups must provide enough detail (layout and elements) for the inspector to understand the overall design of the application and its interaction steps.
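The inspection steps described in this section can be sketched in code. The following is only an illustrative sketch, not part of the Web DUE technique itself: the names `VerificationItem`, `CHECKLISTS` and `inspect` are hypothetical, the item wording is paraphrased from the three nonconformities of the example (the official items are in Rivero et al., 2013), and the inspector's judgment is represented by a plain function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationItem:
    item_id: str
    zone: str
    text: str


# Hypothetical checklist excerpt, one checklist per Web page zone.
# The wording is paraphrased for illustration; it is NOT the official item text.
CHECKLISTS = {
    "System's State": [
        VerificationItem("01", "System's State", "The current state is shown logically."),
    ],
    "Data Entry": [
        VerificationItem("02", "Data Entry", "Data is requested in a logical order."),
    ],
    "Navigation": [
        VerificationItem("03", "Navigation", "Navigation symbols are easy to understand."),
    ],
    "Help": [
        VerificationItem("04", "Help", "Help is easy to find."),
    ],
}


def inspect(zones_in_mockup, item_is_met):
    """Return the verification items that the mockup does not meet.

    zones_in_mockup: zones the inspector identified in the mockup (step 1).
    item_is_met: callable standing in for the inspector's judgment (step 2).
    """
    nonconformities = []
    for zone in zones_in_mockup:
        # Only checklists of zones present in the mockup are consulted.
        for item in CHECKLISTS.get(zone, []):
            if not item_is_met(item):
                nonconformities.append(item)
    return nonconformities


# The example mockup has three zones; none of their items was met.
zones = ["System's State", "Data Entry", "Navigation"]
judgments = {"01": False, "02": False, "03": False}
found = inspect(zones, lambda item: judgments[item.item_id])
print([item.item_id for item in found])  # ['01', '02', '03']
```

Because the mockup contains no Help zone, the hypothetical item 04 is never consulted, which mirrors the idea that inspectors evaluate only the elements present in the mockup.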
First empirical study: evaluating the feasibility of the Web DUE technique
According to Shull et al. (2001), the first study that must be carried out to evaluate a new technology is a feasibility study. Therefore, we have designed and executed a feasibility study to verify if the Web DUE technique is feasible regarding the number of detected defects and the time spent. In this study we compared the Web DUE technique with the WDP technique (Conte et al., 2009). Despite the fact that the WDP is not a technique for the evaluation of paper based prototypes, it is the Web DUE’s predecessor, since the Web DUE is based on the WDP. Therefore, we believe it is reasonable to compare if the Web DUE presents better results than the WDP when evaluating mockups.
During this study we have controlled the following independent variables: the subjects’ experience, the evaluated Web application mockups, and the applied techniques. Furthermore, we have measured the following dependent variables: number of defects, number of false positives, time, effectiveness and efficiency per technique. These variables will be explained as they are cited throughout the text. Moreover, we gathered the subjects’ opinion to better understand the results from this empirical study.
A. Description of the empirical study
In Table 4, we present the goal of this feasibility study using the GQM paradigm (Basili and Rombach 1998). We have characterized the feasibility of the Web DUE technique by analyzing its effectiveness and efficiency indicators. For these indicators, we have used the same definition used in (Conte et al., 2009) and (Fernandez et al., 2010):
Effectiveness is the ratio between the number of detected problems and the total of existing problems.
Efficiency is the ratio between the number of detected problems and the time spent in finding them.
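As a minimal sketch, the two indicators can be computed as follows. The function names and the example numbers are hypothetical, and we assume time is measured in hours, which the definitions above do not prescribe.

```python
def effectiveness(detected: int, total_existing: int) -> float:
    """Ratio between detected problems and all existing (known) problems."""
    if total_existing <= 0:
        raise ValueError("total_existing must be positive")
    return detected / total_existing


def efficiency(detected: int, time_spent_hours: float) -> float:
    """Detected problems per unit of time (hours are an assumption here)."""
    if time_spent_hours <= 0:
        raise ValueError("time_spent_hours must be positive")
    return detected / time_spent_hours


# Hypothetical inspector: 20 problems found out of 80 known, in 2.5 hours.
print(effectiveness(20, 80))  # 0.25
print(efficiency(20, 2.5))    # 8.0
```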
We carried out the feasibility study in April 2012 with students of the Computer Science course from Federal University of Amazonas. There were a total of eight undergraduate and postgraduate students who agreed to participate. All subjects signed a consent form and filled out a characterization form. The characterization form addressed the subjects’ expertise concerning: (a) expertise in usability knowledge, (b) expertise in usability evaluation, and (c) expertise in application design. All subjects answered objective questions regarding their degree of knowledge and professional experience. The characterization data were analyzed and each subject was classified as having Low, Medium or High experience according to the provided information within his/her characterization form. For instance, in the characterization of the inspectors’ experience in Human Computer Interaction (HCI), we considered: (a) Low, if the subject had no practical experience in HCI and/or had not studied HCI; (b) Medium, if the subject had studied HCI, but had poor practical experience in usability evaluations; and (c) High, if the subject had studied HCI in books and class, and had participated in projects involving usability evaluation. In order to reduce the bias of having more experienced inspectors using one or another technique, we equally distributed the subjects into two teams. One team used the Web DUE technique and the other one the WDP technique. Table 5 shows both teams and the expertise of their members.
There were other participants in this empirical study: (a) the moderator of the inspection, who was responsible for planning and collecting the data during the empirical study; (b) the technique lecturers, who provided the training on each technique; and (c) the discrimination team, which was responsible for deciding which of the pointed out defects were real usability problems.
Each team used either the Web DUE technique or the WDP technique to evaluate the usability of a set of paper based prototypes. For each technique they received a set of checklists to report the encountered usability problems. The mockups were prepared based on a coupon website. Coupon websites offer social coupons, which are quickly emerging as a marketing tool for businesses and an attractive shopping tool for consumers. We decided to perform the inspection over this type of Website since its popularity is rising (Kumar, 2012). To aid inspectors in the navigation among the prototyped Web pages, we provided a navigation map indicating which page should be shown after pressing buttons. Figure 2 shows one of the prepared mockups (see Mockup 05) and its navigation map (see Navigation Map for Mockup 05). It is noteworthy that the mockups and navigation map are in Portuguese since the Web application had Brazilians as target users.
In order to use the Web DUE technique or the WDP technique, inspectors need prior training in usability inspections and in how to identify usability problems during the detection phase of the assigned technique. Therefore, to provide background in usability inspections, all subjects had lectures on: (a) usability, (b) examples of typical usability problems, and (c) how to apply an inspection method like the Heuristic Evaluation. Furthermore, before the inspection, each team was trained in the technique it was assigned. Consequently, each team had contact with just one of the techniques in order to avoid biased results. The main reason for this measure is that knowing both techniques could affect the overall performance and opinions of the inspectors.
Each subject had three days to complete the inspection and prepare a defect report which contained discrepancies (issues reported by the inspector that could be real defects or false-positives) and the time spent to perform the inspection, which was measured by the inspector himself. After finishing the inspection, each subject sent back his/her report and a follow-up questionnaire with comments regarding the Web DUE or the WDP technique.
All inspection reports were delivered on time and none of them was discarded. The moderator checked all discrepancies within the defect reports for incorrect or missing information and also gathered the discrepancies. Finally, he generated a new discrepancy report containing all discrepancies found, with duplicates removed. Note that both the collection activity and the discrepancy report were also verified by another researcher to avoid bias.
After the collection activity there was a discrimination meeting. This meeting was attended by the moderator and two other researchers who were not involved with the study. These researchers possessed good usability knowledge and prior experience in usability evaluation. For each reported discrepancy, the other researchers verified if it was a usability error by evaluating the paper based prototypes. It is noteworthy that the moderator, who was involved in the study, did not classify any of the discrepancies in order to reduce biased opinions. Considering all discrepancies, there were a total of 79 real defects. This number was used in the calculation of the effectiveness indicator as shown in the next subsection.
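The collection and discrimination activities can be pictured with the following sketch. It is a simplification under stated assumptions: in the study, duplicates were identified by the moderator's judgment, whereas here two discrepancies count as duplicates only when they refer to the same mockup and have the same description after trimming and lower-casing. The function names, report contents and verdicts are all hypothetical.

```python
def consolidate(reports):
    """Merge inspectors' reports, keeping one entry per distinct discrepancy.

    reports maps an inspector name to a list of (mockup, description) pairs.
    The result maps each distinct discrepancy to the inspectors who reported it.
    """
    merged = {}
    for inspector, discrepancies in reports.items():
        for mockup, description in discrepancies:
            # Naive duplicate rule (an assumption): same mockup, same normalized text.
            key = (mockup, description.strip().lower())
            merged.setdefault(key, []).append(inspector)
    return merged


def split_by_verdict(merged, is_real_defect):
    """Separate real defects from false positives using the discriminators' verdict."""
    real, false_positives = [], []
    for key in merged:
        (real if is_real_defect(key) else false_positives).append(key)
    return real, false_positives


# Hypothetical reports from two inspectors.
reports = {
    "Inspector 1": [("Mockup 05", "Button label is unclear"),
                    ("Mockup 02", "No feedback after saving")],
    "Inspector 2": [("Mockup 05", "button label is unclear "),
                    ("Mockup 03", "Font too small")],
}
merged = consolidate(reports)

# Hypothetical verdicts from the discrimination meeting.
verdicts = {
    ("Mockup 05", "button label is unclear"): True,
    ("Mockup 02", "no feedback after saving"): True,
    ("Mockup 03", "font too small"): False,
}
real, false_positives = split_by_verdict(merged, lambda key: verdicts[key])
print(len(merged), len(real), len(false_positives))  # 3 2 1
```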
B. Analysis of the results of the feasibility study
We have analyzed quantitative and qualitative data. We obtained the quantitative data from the discrepancies’ list resulting from the discrimination meeting, and the qualitative data from the answers to the questionnaires.
As mentioned in the previous subsection, the total number of known usability defects is 79. Table 5 presents both the results per inspector and the overall results per technique. Using these data, we applied the Mann–Whitney U test to perform the statistical analysis of the experiment results. This test is the non-parametric equivalent of Student’s t-test, and we used it because we had two groups to compare (inspection techniques), different participants in each condition, and no assumption about the data distribution. Furthermore, in this analysis, we used α = 0.10 due to the small sample used within this study (Dyba et al., 2006).
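To illustrate how such a comparison works for two groups of four inspectors, the sketch below computes the Mann–Whitney U statistic and an exact two-sided p-value by enumerating every way of splitting the pooled values into two groups. This is one possible formulation of the test, chosen because it needs no external library; we do not claim it is the exact procedure or software used in this study. The efficiency values are hypothetical and are not the study's data.

```python
from itertools import combinations


def rank_with_ties(values):
    """Return 1-based ranks for values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u_exact(group_a, group_b):
    """Return (U for group_a, exact two-sided p-value) for small samples."""
    pooled = list(group_a) + list(group_b)
    ranks = rank_with_ties(pooled)
    n_a, n_b = len(group_a), len(group_b)
    mean_u = n_a * n_b / 2  # expected U when the groups do not differ

    def u_for(indices):
        rank_sum = sum(ranks[i] for i in indices)
        return rank_sum - n_a * (n_a + 1) / 2

    observed = u_for(range(n_a))
    # Count the splits whose U lies at least as far from mean_u as the observed U.
    splits = list(combinations(range(n_a + n_b), n_a))
    extreme = sum(
        1 for s in splits
        if abs(u_for(s) - mean_u) >= abs(observed - mean_u) - 1e-12
    )
    return observed, extreme / len(splits)


# Hypothetical efficiency values (defects per hour) for two teams of four.
team_a = [9.0, 7.5, 6.0, 5.5]
team_b = [5.0, 4.5, 4.0, 6.5]
u, p = mann_whitney_u_exact(team_a, team_b)
alpha = 0.10
print(u, round(p, 4), p < alpha)  # 14.0 0.1143 False
```

With only four subjects per group there are just 70 possible splits, so the smallest achievable two-sided p-value is 2/70 ≈ 0.029. This makes concrete why a small sample limits statistical power.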
It is noteworthy that a thorough description of our hypotheses and their statistical evaluation is available at our previous work (Rivero and Conte, 2012b). Interested readers can refer to that paper for further information.
The boxplot graph comparing the effectiveness indicator is shown in Figure 3. When analyzing the graph, we can see that the median from the Web DUE group is slightly higher than the median from the WDP group. However, the comparison using the Mann–Whitney test showed that there was no significant difference between the groups (p = 0.885). These results suggest that the Web DUE technique and the WDP technique provided similar effectiveness when used to inspect paper based Web pages of a Coupon Website.
The boxplot graph comparing the efficiency indicator is shown in Figure 4. When analyzing the graph, we can see that the median from the Web DUE group is higher than the median from the WDP group. Therefore, the group that used the Web DUE technique was able to find more defects in less time than the group that used the WDP technique. Nevertheless, the comparison using the Mann–Whitney test showed that there was no significant difference between the groups (p = 0.149). These results suggest that: (a) the Web DUE technique and the WDP technique provided different efficiency when used to inspect paper based Web pages of a Coupon Website; and (b) this difference is not significant from a statistical point of view.
The qualitative data analysis began with the examination of the answers within the follow-up questionnaires of the subjects who used the Web DUE technique. Figure 5 shows the main questions within the questionnaires, which aimed to obtain information about the subjects’ overall opinion of the main components of the Web DUE technique. We also collected data regarding the adequacy and perceived ease of use of the technique.
Regarding the subjects’ opinion towards the use of Web Page zones, all inspectors agreed that it was easy to understand and identify all Web page zones within paper based Web applications. This is illustrated in the following quote.
“Yes, they were really clarifying; I had no trouble understanding them.” - Inspector 1.
Inspector 4 added that the inspection might be easier to carry out if the Web DUE provided an easy way to verify repeated Web page zones within a Web page. The lack of such a feature could have increased the time spent during the inspection.
“… In case of a data entry zone, a Web page could contain more than one data entry zone with different purposes. The technique could allow the verification of all these repeated zones in a particular way.” - Inspector 4.
Regarding the Web DUE’s verification items, Inspectors 1, 3 and 4 agreed that they were helpful when combined with the Web page zones (see quote from Inspector 4). However, Inspector 2 pointed out that some verification items could be used to evaluate more than one Web page zone (see quote from Inspector 2). This could mean that some verification items can be used to evaluate the same components, which confuses the inspectors; or that some verification items are too broad to allow the identification of specific features of the evaluated Web page zone. Furthermore, Inspector 3 stated that some items did not help him decide whether there was a usability problem or not (see quote from Inspector 3). This means that some verification items do not provide a clear description of which nonconformities must be found, or that the verification items are too subjective for the inspector to decide.
“…, the usability verification items contributed to find more discrepancies when analyzing a determined part of the page.” - Inspector 4.
“Data entry zones, for example, possess verification items that could be adequate for the evaluation of the help zone. This ambiguity made it difficult to find terms that defined the problems encountered during the inspection.” - Inspector 2.
“… However, some verification items judge features, which are not totally wrong or right.” - Inspector 3.
Regarding the subjects’ opinion towards the examples and explanations provided along with the verification items, all inspectors agreed they were easy to understand and that they also made it easier to understand the usability verification items. However, when asked about the technique’s adequacy, one inspector (Inspector 3) found the technique inadequate. Nevertheless, this inspector was not referring to the technique, but to the mapping process of the Web pages. All inspectors agreed that the technique could aid in finding usability problems in Web page prototypes. However, the fact that inspectors had to manually simulate the interaction between the user and the system made the inspection process very tiring. The following quotes illustrate the difficulty in simulating interaction.
“Applying the technique in mockups is really confusing when navigating through the pages. Mainly in bigger systems, with a higher number of pages. Furthermore, there is no interaction with the system.” - Inspector 3.
“It was too difficult to simulate using Web pages. Linking Web pages is complex even with the use of the navigation map.” - Inspector 3.
Regarding the technique’s perceived ease of use we found out that Inspector 4 found the technique very difficult. He argued that the Web DUE technique was too detailed and very repetitive (See quote from Inspector 4). The other inspectors did not stress that the technique was difficult, but that it was difficult to simulate the interaction as pointed out before.
“The technique did not seem easy at all. It was too detailed and repetitive during most of the evaluation. On the other hand, it includes a very complete view of the contents of the pages.” - Inspector 4.
Inspectors also stated advantages of, and suggestions for, the evaluation of paper based prototypes. According to Inspector 4, mockups allow inspectors to point out usability problems directly. Regarding suggestions for the evaluation of low fidelity prototypes, Inspector 3 indicated that colored mockups could make them more realistic and Inspector 2 said that the organization of the mockups is important for simulating the interaction between the system and the user.
“I found the technique appropriate because in the mockups we can scribble and then, we can relate what we drew in the checklists options.” - Inspector 4.
“For a more artistic/visual view of the site, it could be interesting to provide colored mockups.” - Inspector 3.
“The use of a graph explaining the interaction among pages, or the organization of the pages according to the activities that are being executed can help in understanding the interaction.” - Inspector 2.
The overall results show that the technique can be applied in the evaluation of paper based low-fidelity prototypes of Web applications. However, as the qualitative data indicate, there is still room for improvement. In the following subsection, we present the threats to validity of the feasibility study of the Web DUE technique.
Threats to validity
In this study we considered four main threats to internal validity, which is concerned with whether the treatment in fact causes the results (Wohlin et al., 2000). Regarding the first issue, training effect, there could be a risk if the quality of the training of the WDP technique had been inferior to the training of the Web DUE technique. However, we controlled this risk by using the same examples of encountered usability problems in both trainings. Furthermore, in order to mitigate the threat of the subjects’ knowledge affecting the results, we divided them into balanced groups according to their experience. Moreover, the inspector with the highest experience in inspections and usability was assigned to the WDP technique to avoid bias. To mitigate the bias of the subjects’ classification we used three criteria that were assessed through an objective questionnaire. Finally, despite asking the subjects to be very precise in their time measurement, we cannot guarantee that these measures were carefully obtained.
Regarding the external validity of this study, which is concerned with the generalization of the results (Wohlin et al., 2000), we will discuss four issues. As for the first issue, the use of students as inspectors, we can argue that since we were looking for inspectors with the same degree of usability knowledge and we balanced both teams, students could be used as subjects since none of them had previous experience with any of the techniques. Furthermore, even though we used an academic environment to carry out the feasibility study, we based the mockups and their interaction on a real Web application, which helps to resemble a real industry environment. Regarding the use of a coupon Website, we cannot guarantee that it is not a threat since there are many Web application categories (Kappel et al., 2006). Finally, using the WDP technique for comparison purposes is a validity threat. Nevertheless, since the Web DUE technique is based on the WDP technique, it is reasonable to check whether the Web DUE presents better results than the WDP when evaluating mockups.
According to Wohlin et al. (2000), the conclusion validity is concerned with the relationship between the treatment and the results. In this study, the biggest problem is the statistical power. Since the number of subjects is low, the data extracted from this study can only be considered indicators and not conclusive.
Finally, the criteria used to measure the feasibility of the technique can be considered a threat to the construct validity (relationship between the theory and the observation) if not properly chosen (Wohlin et al., 2000). However, as effectiveness and efficiency are two common criteria used for investigating the productivity of new techniques (Fernandez et al., 2010), this threat cannot be considered a risk to the validity of our results.
The mockup design usability evaluation tool
A. Motivation and features of the mockup DUE tool
The results of the first empirical study indicated that the fact that inspectors had to manually simulate the interaction between the user and the Web mockups made the inspection process very tiring. Furthermore, the inspectors cited the main advantages of using mockups for usability inspection:
Colored mockups can show how the Web application will look.
The inspectors can directly point out the encountered usability problems.
The inspectors can add notes or suggestions.
The Mockup DUE tool was conceived in order to: (a) automatically simulate the interaction among mockups, and (b) maintain the main features that made the use of mockups appropriate for usability inspection. To do so, we divided the inspection process into two main activities: (i) planning of the interaction, and (ii) detection. Using these activities the Mockup DUE tool supports the overall inspection process of the Web DUE technique.
During the planning stage, the moderators, who are preparing the mockups for inspection, can load their mockups into the tool and connect them by adding links (see Figure 1 stage 1). Furthermore, they can visualize and simulate the interaction steps by clicking on the mockups. This feature allows the creation of clickable mockups, which are close to real finished applications.
During the detection phase, inspectors use the previously mapped mockups and perform an inspection using the Web DUE technique. In this stage, the inspector mentally divides the mockups into Web page zones (see Figure 1 stage 2). For each of the identified zones, the inspector checks the usability verification items which are grouped and shown by the Mockup DUE tool (see Figure 1 stage 3). Moreover, the inspector can also interact with the mockups to verify the interaction. If he/she identifies a usability problem, he/she can add that problem and point it out in the mockup. Furthermore, the inspector can add notes with suggestions or considerations at any time.
In the next subsection we will show, through an example, how to use the Mockup DUE tool. It is noteworthy that all the screenshots from the Mockup DUE tool are in Portuguese since the first version of the Mockup DUE tool was originally developed in Brazil.
B. Using the mockup DUE tool to carry out an inspection
In order to carry out an inspection using the Mockup DUE tool, the inspectors must be familiar with the overall inspection process of the Web DUE technique, either by having used it without the tool or by being trained in it. In this subsection, we will discuss the main features of the Mockup DUE tool by analyzing some screenshots of its graphical user interface. Figure 6 shows screenshots in which we used the tool to map the interaction between Web mockups. In part 1 we loaded two mockups into the Mockup DUE tool, and marked them as: (a) Element A: a mockup that provides user information, and (b) Element B: a mockup of a page in which the user can edit his/her user information. When the user clicks on one of the mockups he/she added, the Mockup DUE tool shows it in real size in the visualization area (see Figure 6 Element C).
In Figure 6 part 2 we also show how to link mockups to simulate interaction. When the user clicks on the “add link” button (see Figure 6 Element D), the system asks the user which mockup will be shown once the link is clicked (see Figure 6 Element E). After entering this information, the user can position and resize the link (see Figure 6 Element F).
To check whether the links lead to the correct mockups, users can visualize the mapping of the mockups (see Figure 7) and interact with them. Users can click on the previously added links (Figure 7 part 1), and the tool will then take them to the related destination (Figure 7 part 2). This step is important because it lets the moderator of the inspection evaluate whether the simulated interaction will be presented to the inspectors as intended by the designers of the Web application.
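As an illustration only, the mapping and simulation steps described above can be sketched as a small data model. This is not the Mockup DUE tool's actual implementation: the class names (`Link`, `Mockup`, `Project`), the rectangular link shape, and the mockup names are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """A rectangular clickable region placed on a mockup (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int
    target: str  # name of the mockup shown once the link is clicked

    def contains(self, px: int, py: int) -> bool:
        """Return True if the point (px, py) falls inside the link region."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Mockup:
    name: str
    links: list[Link] = field(default_factory=list)


class Project:
    """A set of mapped mockups, keyed by name."""

    def __init__(self) -> None:
        self.mockups: dict[str, Mockup] = {}

    def add_mockup(self, name: str) -> Mockup:
        mockup = Mockup(name)
        self.mockups[name] = mockup
        return mockup

    def add_link(self, source: str, link: Link) -> None:
        # Refuse links that point at a mockup that was never loaded.
        if link.target not in self.mockups:
            raise KeyError(f"unknown target mockup: {link.target}")
        self.mockups[source].links.append(link)

    def click(self, current: str, px: int, py: int) -> str:
        """Simulate a click: return the mockup to show next.

        If the click hits no link, the current mockup stays on screen.
        """
        for link in self.mockups[current].links:
            if link.contains(px, py):
                return link.target
        return current


# Two mockups, as in the example above: one shows user information,
# the other lets the user edit it (names are placeholders).
project = Project()
project.add_mockup("user_info")
project.add_mockup("edit_user_info")
project.add_link("user_info",
                 Link(x=40, y=200, width=120, height=30, target="edit_user_info"))
```

In this sketch, a moderator checking the mapping would call `project.click("user_info", 50, 210)` and confirm that the returned mockup is the intended destination.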
In Figure 8 we show how to find usability problems using the Mockup DUE tool. Initially the inspector loads a project which consists of a set of previously mapped Web mockups. Then, the tool shows the first mockup to be evaluated, and next to it, the Web page zones and the usability verification items of the Web DUE technique. The inspector must select a Web page zone (see Figure 8 Element A) to load its respective usability verification items list (see one of the items from the list in Figure 8 Element B). In this stage, inspectors can simulate interaction by clicking on the links (see Figure 8 Element C).
When using the Mockup DUE tool, inspectors must verify whether the verification items from the Web DUE technique are met by the Web mockups. When a mockup violates a usability verification item, the inspector clicks on the “point error” button (see Figure 8 Element D). After that, the tool draws an empty red circle on the mockup. The inspector uses this circle to indicate which part of the mockup has the usability error, and can resize and position it according to his/her needs (see Figure 8 Element E). Furthermore, if the inspector needs to make comments or suggestions, he/she can use the “add note” button (see Figure 8 Element F) to place a note.
After finishing the inspection of the mockups, the Mockup DUE tool can be used to generate a report for further analysis. Such a report contains all the violated usability verification items and their location in the mockup. In the following section we will discuss the empirical study of the Mockup DUE tool and its results.
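To make the content of such a report concrete, the following sketch groups recorded problems by mockup and lists each violated verification item with the position of its marker circle. It is a minimal illustration under our own assumptions, not the report format produced by the Mockup DUE tool: the `Defect` fields, the zone names, and the verification item texts are all placeholders.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    """One usability problem pointed out by an inspector (hypothetical fields)."""
    mockup: str     # mockup on which the problem was marked
    zone: str       # Web page zone selected by the inspector
    item: str       # violated usability verification item
    x: int          # centre of the marker circle
    y: int
    radius: int
    note: str = ""  # optional comment or suggestion


def build_report(defects: list[Defect]) -> dict[str, list[Defect]]:
    """Group defects by mockup; order mockups by name and defects by zone, then item."""
    grouped: dict[str, list[Defect]] = defaultdict(list)
    for defect in defects:
        grouped[defect.mockup].append(defect)
    return {
        mockup: sorted(items, key=lambda d: (d.zone, d.item))
        for mockup, items in sorted(grouped.items())
    }


def format_report(report: dict[str, list[Defect]]) -> str:
    """Render the grouped defects as plain text."""
    lines = []
    for mockup, items in report.items():
        lines.append(f"Mockup: {mockup}")
        for d in items:
            line = f"  [{d.zone}] {d.item} at ({d.x}, {d.y}), r={d.radius}"
            if d.note:
                line += f" | note: {d.note}"
            lines.append(line)
    return "\n".join(lines)


# Placeholder inspection results; zone and item names are invented for the example.
defects = [
    Defect("edit_user_info", "data entry", "Required fields are marked", 310, 120, 25),
    Defect("user_info", "navigation", "Links are clearly identified", 80, 40, 20,
           note="Make the edit link more visible"),
    Defect("edit_user_info", "data entry", "Error messages are clear", 300, 260, 30),
]
report = build_report(defects)
print(format_report(report))
```

Keeping the circle coordinates with each defect is what would let a report point back to the exact part of the mockup that the inspector marked.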
Second empirical study: evaluating the mockup DUE tool
We have designed and executed an empirical study to evaluate the users’ satisfaction when using the Mockup DUE tool. In this study, we have performed a cooperative evaluation of the Mockup DUE tool. In a cooperative evaluation, design teams and users collaborate in order to evaluate a product and identify usability issues and their solutions (Rocha and Baranauska, 2003).
During this study we controlled the following independent variables: the subjects’ experience, the evaluated Web application mockups, and the applied tool. Furthermore, although it was not completely quantified, we gathered the subjects’ opinion as the dependent variable. These variables will be explained as they are cited throughout the text.
A. Description of the empirical study
The goal of this empirical study is shown in Table 6 using the GQM paradigm (Basili and Rombach 1998). We aim to evaluate the Mockup DUE tool by analyzing the user satisfaction indicator and the perceived ease of use it provides to its users. Therefore, we have analyzed the inspectors’ opinion during and after their experience with the tool.
We carried out the feasibility study in June 2012 with usability experts from the Federal University of Amazonas. A total of four postgraduate students agreed to participate. All subjects signed a consent form and filled out the same characterization form used in the first empirical study (see Section IV). The characterization data were analyzed and each subject was classified as having Low, Medium or High experience, according to the information provided in his/her characterization form. The results from the overall characterization (see Table 7) show that, in general, the inspectors possessed medium to high experience levels in HCI and usability inspections. Furthermore, the inspectors had no previous contact with the Mockup DUE tool, which avoided opinions biased by prior learning.
Each subject tested the main features of the Mockup DUE tool using two sets of Web mockups. The first set, which was used to test the mapping and simulation activities from the Mockup DUE tool, was based on the Google Web site. The second set was based on the JEMS system. We previously mapped the interaction among the mockups from the second set as the subjects would use it to test the activity of usability problem detection. It is noteworthy that we selected these mockups because of the degree of familiarity that subjects had with them. The first set was based on Google as it was necessary for the subjects to know the application because they would have to link its mockups, while the second set was based on the JEMS system as the subjects would not need any previous experience in the interaction provided by the application. Some inspectors, however, had used the JEMS system to submit their work for evaluation and therefore had a little experience with the application.
We also used the Morae usability testing software to obtain richer qualitative data from the cooperative evaluation. This software allows the researcher to record and visualize the user's reaction towards the tested software, and identify the interaction steps in which a problem can occur.
All subjects received a set of tasks to be performed using the Mockup DUE tool: (a) to map the interaction of Web mockups; (b) to simulate the interaction between the user and the Web application; and (c) to detect usability problems. The subjects were told that the purpose of the evaluation was to identify usability problems regarding the Mockup DUE tool. Furthermore, they were encouraged to say what they were thinking and suggest improvements so that the tool could be more easily used. During the cooperative evaluation, the moderator also asked questions regarding the system’s ease of use. Moreover, if the subject encountered any difficulty in using the tool, the moderator would take notes and try to identify the cause. Additionally, we recorded the cooperative evaluation using the Morae software for further analysis. Figure 9 shows a subject carrying out the cooperative evaluation of the Mockup DUE tool.
Each subject had as much time as he/she considered necessary to perform the tasks. Readers must note that the subjects did not perform a complete inspection over the evaluated mockups. As the goal of the study was to evaluate if inspectors could actually use the tool to perform an inspection, there was no need for performing a complete inspection. Finally, at the end of the cooperative evaluation, the subjects filled out a follow-up questionnaire with comments regarding the Mockup DUE tool.
We gathered three types of information: (a) the notes taken by the moderator during the execution of the cooperative evaluation; (b) the follow-up questionnaires with comments from the subjects and their overall satisfaction rating; and (c) the videos from the recording of the cooperative evaluations. In the next subsection we present the results of this empirical study and its qualitative analysis.
B. Qualitative analysis of the results of the empirical study
The data analysis began with the examination of the answers within the follow-up questionnaires. In these questionnaires we asked the subjects to rank their overall degree of satisfaction on a Visual Analogue Scale (VAS). A VAS is a measurement instrument that tries to determine a characteristic or attitude that is believed to range across a continuum of values and cannot easily be directly measured (Schaik and Ling, 2003). As shown in Figure 10, all subjects marked their overall satisfaction rate on the VAS after interacting and carrying out the inspection tasks with the Mockup DUE tool. In our VAS, 0 represents very displeased and 10 represents very satisfied. Furthermore, Figure 11 shows the follow-up questionnaire from this empirical study. We will now relate the results from the study to the subjects’ answers and our observations during the cooperative evaluation.
The results for Q1 (Is the tool easy to use?) and Q6 (Which were the aspects that made the tool easy or difficult to use?) showed that the inspectors’ opinions were divided. Inspectors 1 and 2 found the tool difficult to use, arguing that they needed more information in order to start using it. Inspectors 3 and 4 stated that overall the tool was easy to use, but that some changes were necessary to improve its ease of use. Below we present some quotes from the subjects in order to support these statements:
“The tool needs to improve its usability. I felt lost when I started to map the mockups and to start the inspection.” - Inspector 1.
“The tool should improve some of the interface elements. There are some ambiguities regarding the activities of the user and the activities of the inspection.” - Inspector 2.
“The tool is easy to use because the icons are intuitive… However, when I added a usability error, I couldn’t read the description…” - Inspector 3.
“It’s very neat, nice. But there are some details that made it confusing… When working with the links I had no information which mockup I had left…” - Inspector 4.
The answers for Q2 (Is the tool adequate for Web mockups inspections?) indicated that all inspectors agreed that the tool was very adequate (see quote from Inspector 3). Nevertheless, Inspector 4 argued that it was not clear whether it was possible to point out a usability problem that was not in the usability verification lists.
“I liked it very much and certainly it’s way better than just using Web mockups…” - Inspector 3.
“It is adequate. However, there are some details that confuse the user, for example: I thought I could only point out the usability problems from the lists.” - Inspector 4.
The results for Q3 (Is training necessary before using the tool?) showed that some inspectors needed more information before starting to use the tool (see quotes from Inspectors 1 and 2). However, Inspectors 3 and 4 argued that the activities were easy to perform without previous training, but that some usability problems needed to be corrected (see quote from Inspector 4).
“No, some hints should be offered when starting the activities.” - Inspector 1.
“Without the observer’s influence it could be very difficult to use the tool without training. Some icons, functionalities and modules possess names that are inadequate. … Some functionalities could be grouped differently to enhance user experience.” - Inspector 2.
“Yes, despite being a bit confusing and difficult to learn, it is possible to use without training. Maybe there could be a general description of the functionalities…” - Inspector 4.
Regarding questions Q4 (Are the offered functionalities intuitive and easy to perform?) and Q5 (Is the layout of the application easy to understand?), we identified that the layout had a direct effect on how easy it was to execute the functionalities. Inspectors stated that overall they were able to carry out all the activities. However, the location, size and description of the user interface elements made it difficult to execute the activities. Quotes from Inspectors 1 and 4 illustrate this fact.
“Some user interface elements should be improved… In the detection phase the items should be better organized, as well as the verification items and ‘add error’ buttons.” - Inspector 1.
“I understood everything I had contact with. However, I had to read the hints to understand the functionalities of the buttons… I believe some of the elements from the interface are not well located.” - Inspector 4.
Finally, the answers to question Q7 (Would it be possible to consider using the tool for Web mockups inspections?) showed that all inspectors would use the tool to carry out a usability inspection of Web mockups. However, most of them argued that, in order to provide a better interaction, the previously mentioned usability problems should be corrected (see quotes from Inspectors 2 and 3).
“Yes, but improve the words and terms, and some items and their location.” - Inspector 2.
“Of course I would use it again if the previously described errors were corrected.” - Inspector 3.
Overall, the average degree of the subjects’ satisfaction when using the tool was 7.7 (based on the VAS in Figure 10), which means that the inspectors were pleased when using the tool. We identified a relationship between the perceived ease of use and the degree of satisfaction of the inspectors. Inspectors 3 and 4 rated the Mockup DUE tool with an acceptable satisfaction rate. These inspectors found the tool easy to use right from the beginning and did not need any training. However, Inspectors 1 and 2 argued that the tool did not provide enough information for new users, which affected their satisfaction rate.
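For readers unfamiliar with how an average VAS rating is obtained, the sketch below converts the position of each subject's mark on the scale line into a score between 0 and 10 and averages the scores. It rests on assumptions of ours: the line length of 100 mm and the mark positions are placeholders, not the measurements or individual ratings collected in this study.

```python
from statistics import mean


def vas_score(mark_mm: float, line_mm: float = 100.0, scale_max: float = 10.0) -> float:
    """Convert a mark position on the VAS line into a score in [0, scale_max].

    mark_mm is the distance of the mark from the 'very displeased' end.
    line_mm is the total length of the printed line (assumed, not from the study).
    """
    if not 0 <= mark_mm <= line_mm:
        raise ValueError("mark must lie on the scale line")
    return scale_max * mark_mm / line_mm


# Placeholder mark positions in millimetres; these are NOT the study's data.
marks_mm = [50.0, 75.0, 100.0]

scores = [vas_score(m) for m in marks_mm]
average_satisfaction = mean(scores)
print(f"scores: {scores}, average: {average_satisfaction:.1f}")
```

With only a handful of subjects, a single low or high mark shifts the average noticeably, which is one reason to read such an average as an indicator rather than a conclusive result.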
In order to identify which user interface elements had a detrimental effect on the inspectors’ opinion of the tool, we analyzed the recordings and the notes taken during the cooperative evaluation. In Table 8 we list some of the elements that were pointed out and what should be corrected to improve the usability of the Mockup DUE tool.
Threats to validity
We considered five threats to validity: (a) students are probably not good substitutes for professional inspectors, (b) academic environments do not represent day-to-day experience in the industry, (c) whether the Google and JEMS Websites are representative of all Web applications, (d) the number of subjects involved in the study, and (e) whether the criteria used to measure the feasibility of the tool were properly chosen.
As for the first issue, the use of students as inspectors, we can argue that since the subjects possessed high levels of usability knowledge and usability inspection experience, they could be used as experienced inspectors. Furthermore, even though we used an academic environment to carry out the feasibility study, we based the mockups and the interaction between the system and the user on a real Web application, which helps approximate a real industry environment (threat b). Regarding the use of the Google and JEMS Websites, we cannot guarantee that it is not a threat since there are many Web application categories (Kappel et al., 2006).
In this study, the biggest problem is the number of subjects who participated in the cooperative evaluation. Since there were only four subjects, the data extracted from this study can only be considered indicators and not conclusive. Finally, as satisfaction and perceived ease of use are two common criteria used for investigating the productivity of new software, threat (e) cannot be considered a risk to the validity of the results (Tanaka and Rocha, 2011).
Conclusions and future work
This paper has described and evaluated the Web DUE technique and the Mockup DUE tool for low-fidelity prototypes of Web applications. Furthermore, the empirical evaluation of these technologies showed indicators of their feasibility. For instance, the Web DUE technique achieved better effectiveness and efficiency than its predecessor (the WDP technique) when evaluating the mockups of a coupon website. Moreover, the experienced inspectors who used the Mockup DUE tool stated that they would use it to carry out usability inspections within the context of Web mockups.
Regarding the evolution of the Web DUE technique, future work involves: (a) verifying which usability verification items can be combined within the Web page zones to reduce effort during the inspection; (b) analyzing and suggesting an alternative inspection process in order to reduce repetitive instructions; and (c) analyzing which usability verification items or examples/explanations are unclear or ambiguous in order to reduce the inspector’s effort and confusion.
Future work concerning the evolution of the Mockup DUE tool involves: (a) analyzing and suggesting a new organization for the supported activities; (b) modifying the user interface according to the suggestions from its feasibility study; and (c) preparing a help menu to aid inspectors that are not familiar with the tool in performing the mapping and inspection of Web mockups.
We also intend to perform new empirical studies to validate the obtained results and to better understand both the technique and the tool. In these studies we intend to use further resources to obtain richer data. For instance, we will use existing satisfaction questionnaires found in the literature for further qualitative data, and Likert scales so that it is possible to perform quantitative analyses regarding the proposed technologies. Furthermore, we will compare the Web DUE technique with a technique specifically created for the evaluation of low-fidelity prototypes.
We also believe it is important to test the performance of the Web DUE technique and the Mockup DUE tool in Web development processes which benefit from early Web artifacts with a high degree of expressiveness. Consequently, we intend to perform new empirical studies within the context of agile development and model-driven development to gather information about the impact of using the proposed technologies in such contexts. Finally, we intend to analyze the agreement rate among the evaluators and to what extent their experience levels affect these results.
References
Allen M, Currie L, Patel S, Cimino J: Heuristic evaluation of paper-based Web pages: a simplified inspection usability methodology. J Biomed Inform 2006, 39(4):412–423. 10.1016/j.jbi.2005.10.004
Basili V, Rombach H: The tame project: towards improvement-oriented software environments. IEEE Transactions on Software Engineering 1998, 14(6):758–773.
Conte T, Massollar J, Mendes E, Travassos G: Web usability inspection technique based on design perspectives. IET Software 2009, 3(2):106–123. 10.1049/iet-sen.2008.0021
Dyba T, Kampenes V, Sjoberg D: A systematic review of statistical power in software engineering experiments. Information and Software Technology 2006, 48(8):745–755. 10.1016/j.infsof.2005.08.009
Fernandez A, Abrahao S, Insfran E: Towards to the validation of a usability evaluation method for model-driven Web development. Proceedings of the IV International Symposium on Empirical Software Engineering and Measurement, USA; 2010:54–57.
Fernandez A, Insfran E, Abrahao S: Usability evaluation methods for the Web: a systematic mapping study. Information and Software Technology 2011, 53(8):789–817. 10.1016/j.infsof.2011.02.007
Fons J, Pelechano V, Pastor O, Valderas P, Torres V: Applying the OOWS model-driven approach for developing Web applications: The internet movie database case study. In Web Engineering: Modeling and Implementing Web Applications. Springer, USA; 2008.
Kappel G, Proll B, Reich S, Retschitzegger W: An Introduction to Web Engineering. In Web Engineering: The Discipline of Systematic Development of Web Applications. Wiley, USA; 2006.
Kumar V: Social coupons as a marketing strategy: a multifaceted perspective. Journal of the Academy of Marketing Science 2012, 40(1):120–136. 10.1007/s11747-011-0283-0
Matera M, Rizzo F, Carughi G: Web Usability: Principles and Evaluation Methods. In Web Engineering. Springer, USA; 2006.
Molina F, Toval A: Integrating usability requirements that can be evaluated in design time into Model Driven Engineering of Web Information Systems. Advances in Engineering Software 2009, 40(12):1306–1317. 10.1016/j.advengsoft.2009.01.018
Nielsen J: Finding usability problems through heuristic evaluation. Proceedings of the Computer Human Interaction 92, UK; 1992:373–380.
Oztekin A, Nikov A, Zaim S: UWIS: an assessment methodology for usability of Web-based information systems. Journal of Systems and Software 2009, 8(12):2038–2050.
Rivero L, Conte T: Using the Results from a Systematic Mapping Extension to Define a Usability Inspection Method for Web Applications. Proceedings of the 24th International Conference on Software Engineering and Knowledge Engineering, USA; 2012a:582–587.
Rivero L, Conte T: Using an Empirical Study to Evaluate the Feasibility of a New Usability Inspection Technique for Paper Based Prototypes of Web Applications. Proceedings of the 26th Brazilian Symposium on Software Engineering, Brazil; 2012b:81–90.
Rivero L, Barreto R, Conte T: Characterizing usability inspection methods through the analysis of a systematic mapping study extension. Latin-american Center for Informatics Studies Electronic Journal 2013, 16(1).
Rocha H, Baranauska M: Design and Evaluation of Human Computer Interfaces. Nied, Brazil (In Portuguese); 2003.
Schaik P, Ling J: Using on-line surveys to measure three key constructs of the quality of human-computer interaction in Web sites: psychometric properties and implications. International Journal of Human-Computer Studies 2003, 5(5):545–567.
Shull F, Carver J, Travassos G: An empirical methodology for introducing software processes. ACM SIGSOFT Software Engineering Notes 2001, 26(5):288–296. 10.1145/503271.503248
Signore O: A comprehensive model for Web sites quality. Proceedings of the 7th IEEE International Symposium on Web Site Evolution, Hungary; 2005:30–36.
Tanaka E, Rocha E: Evaluation of Web Accessibility Tools. Proceedings of the X Brazilian Symposium on Human Factors in Computing Systems and the V Latin American Conference on Human-Computer Interaction, Brazil; 2011:272–279.
Wohlin C, Runeson P, Host M, Ohlsson M, Regnell B, Wessl A: Experimentation in software engineering: an introduction. Kluwer Academic Publishers, USA; 2000.
Acknowledgements
We thank CAPES for the financial support granted to the first author of this paper. Furthermore, we thank Natasha Costa and Priscila Fernandes for their assistance during the discrimination meeting.
Cite this article
Rivero, L., Conte, T. Using an empirical study to evaluate the feasibility of a new usability inspection technique for paper based prototypes of web applications. J Softw Eng Res Dev 1, 2 (2013) doi:10.1186/2195-1721-1-2
Keywords
- Usability Evaluation
- Usability Problem
- Cooperative Evaluation
- Inspection Process
- Heuristic Evaluation