Figure 3 displays the extended GSE taxonomy proposed herein. Seven new dimensions were incorporated into the base taxonomy: “setting”, “software process type”, “software process distance”, “power distance”, “uncertainty avoidance”, “language distance” and “communication model”.
The relationships between the new and original dimensions are as follows:
- The dimension “GSE” is the parent of all the other dimensions.
- Classifications by means of the dimensions “software process type”, “power distance”, “uncertainty avoidance” and “language distance” are related to the category site of the dimension “setting”.
- Classifications by means of the dimensions “software process distance” and “communication model” are related to the category relationship of the dimension “setting”.
The root and the seven new dimensions are further detailed in Sections 4.1 (GSE), 4.2 (setting), 4.3 (software process type and software process distance), 4.4 (power distance and uncertainty avoidance), 4.5 (language distance) and 4.6 (communication model), respectively.
GSE
GSE is the root of the taxonomy. To support classification at both the site and the relationship-between-pair-of-sites granularity levels, project has been introduced at the root level instead of sourcing, which the original taxonomy places there. Herein we consider a project to be “a temporary endeavor undertaken to create a unique product, service, or result” (Institute 2013).
Setting
The dimensions of the extended taxonomy are formulated to classify GSE projects on both the site level (Site) and the relationship-between-pair-of-sites level (Relationship).
A site is defined as a unit composed of human resources that interact with other sites (nodes). We define a relationship as the relationship between two sites interacting in a project (edge).
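The node-and-edge view above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names `Site`, `Relationship` and `Project` are hypothetical, and the sketch assumes that every pair of sites in a project interacts, which need not hold in a real project.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Site:
    """A node: a unit of human resources taking part in the project."""
    name: str


@dataclass(frozen=True)
class Relationship:
    """An edge: an unordered pair of sites interacting in a project."""
    sites: frozenset

    @staticmethod
    def between(a: Site, b: Site) -> "Relationship":
        if a == b:
            raise ValueError("a relationship links two distinct sites")
        return Relationship(frozenset({a, b}))


@dataclass
class Project:
    """Root of a classification: the sites and their pairwise relationships."""
    name: str
    sites: list = field(default_factory=list)

    def relationships(self) -> list:
        # Assumption: every pair of sites interacts; drop pairs that do not.
        return [Relationship.between(a, b) for a, b in combinations(self.sites, 2)]


project = Project("example", [Site("X"), Site("Y"), Site("Z")])
print(len(project.relationships()))  # three sites give three unordered pairs
```

Site-level dimensions would then attach to `Site` objects and relationship-level dimensions to `Relationship` objects.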
Software process dimensions
The software development process type used (agile (Royce 1987), plan-driven (Sommerville 2010) or hybrid (Kuhrmann and Mendez Fernandez 2015; Vijayasarathy and Butler 2015)) is an aspect that impacts the conduct of a GSE project, e.g. the effort required to perform such projects (Britto et al. 2014, 2015). In addition, the way practices are incorporated into the sites’ routines (workflows) can also differ. Differences between the software processes used in different sites may lead to communication problems and loss of trust (Ramasubbu et al. 2011), which can, for example, affect the associated effort (Britto et al. 2014, 2015).
Therefore, we incorporated the dimensions software process type and software process distance to account for software process factors.
Software process type
Plan-driven software development may be viewed as too heavy and bureaucratic for certain types of projects, especially those whose requirements are unclear and uncertain (Fernandez and Fernandez 2008). The main criticism of plan-driven development is that many decisions taken early on must be reappraised later, since software development involves a great deal of uncertainty in the early stages of a project (Pfleeger 1999). Nevertheless, this approach allows organizational aspects to be planned earlier and fosters the discovery of potential problems before a project starts.
Agile methods are regarded as more suitable for projects with unclear and uncertain requirements, but they demand close collaboration between the customer and the development team (Beck and Andres 2004). Furthermore, organizations and customers may be more familiar with plan-driven approaches and may find it hard to trust and follow an agile-based approach (Gandomani and Zulzalil 2013). Pure agile-based software processes are also difficult to scale; they are better suited to small and medium-sized projects (Gandomani and Zulzalil 2013). Finally, existing empirical evidence suggests that agile practices are not readily applicable to GSE projects (Hossain et al. 2009; Jalali and Wohlin 2012).
A software process at the project level may be split, which enables distributed teams to combine practices from both agile and plan-driven approaches, thereby generating software process diversity (Ramasubbu et al. 2015). Software process diversity can help teams to address the limitations of pure agile or plan-driven software processes by combining the practices from each approach that fit each case (Ramasubbu et al. 2015), thus leading to hybrid processes (Kuhrmann and Mendez Fernandez 2015; Vijayasarathy and Butler 2015). For example, some organizations have a more plan-driven mentality about project management practices, while their teams remain mainly agile.
Considering the discussion above, we define the dimension “software process type” as having two categories: agile and plan-driven.
We did not include a hybrid category in this dimension, because one of the types of practice (agile or plan-driven) is expected to be more prevalent. Furthermore, most organizations would probably perceive themselves as using a hybrid approach; hence, it is more important to know which process type is most commonly used with respect to the objective of a study. For example, an organization may have agile teams that are managed in a more plan-driven way. If the main focus of the classification is the effort associated with software development in each site, the best classification would be agile, because the teams mainly use agile practices to develop software. However, if the main focus is the effort associated with coordination between sites, plan-driven would fit best, since management is more plan-driven than agile.
Software process distance
While software process diversity can help teams to overcome the limitations associated with “pure” software processes (pure agile or pure plan-driven), it may result in differences between the software processes of different sites. To account for this, we incorporated the dimension software process distance, which enables the classification of the distance between two sites in terms of the software processes used. This dimension has the following categories:
- Equal - The software processes of the sites are very similar, i.e. they use the same workflows, roles and practices to develop software.
- Similar - The workflows, roles and practices are not the same in both sites, but the sites use software processes that are based on the same type of software development practices (mainly agile or mainly plan-driven).
- Different - The sites neither use the same type of software development practices nor use the same workflows, roles and practices.
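The three categories above can be read as a decision rule over a pair of sites. The following is a minimal sketch of that reading, not a definitive procedure: the names `SiteProcess` and `process_distance` are hypothetical, and representing workflows, roles and practices as sets compared for exact equality is a simplifying assumption for "very similar".

```python
from dataclasses import dataclass
from enum import Enum


class ProcessType(Enum):
    AGILE = "agile"
    PLAN_DRIVEN = "plan-driven"


class ProcessDistance(Enum):
    EQUAL = "equal"
    SIMILAR = "similar"
    DIFFERENT = "different"


@dataclass(frozen=True)
class SiteProcess:
    """Hypothetical description of the software process used in one site."""
    process_type: ProcessType
    workflows: frozenset
    roles: frozenset
    practices: frozenset


def process_distance(a: SiteProcess, b: SiteProcess) -> ProcessDistance:
    same_type = a.process_type == b.process_type
    # Assumption: "the same" is modelled as exact set equality.
    same_details = (a.workflows, a.roles, a.practices) == (
        b.workflows, b.roles, b.practices)
    if same_type and same_details:
        return ProcessDistance.EQUAL
    if same_type:
        return ProcessDistance.SIMILAR
    # Different prevalent process type; the text does not cover sites that
    # differ in type yet share all details, so they also fall through here.
    return ProcessDistance.DIFFERENT


scrum = SiteProcess(ProcessType.AGILE, frozenset({"sprint"}),
                    frozenset({"product owner"}), frozenset({"daily meeting"}))
kanban = SiteProcess(ProcessType.AGILE, frozenset({"continuous flow"}),
                     frozenset({"service manager"}), frozenset({"WIP limits"}))
print(process_distance(scrum, kanban).value)  # similar
```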
Cultural factors
Both national and organizational cultures influence decision-making and the way development is conducted in a project. In GSE projects, the different cultures involved can negatively affect communication and trust between sites (Da Silva et al. 2010), which can lead, for example, to greater effort (Britto et al. 2014, 2015).
Culture is represented in our extended taxonomy using Hofstede’s national culture framework (Hofstede et al. 2010). From Hofstede’s framework, we adopted only two dimensions: the power distance index (PDI) and the uncertainty avoidance index (UAI). We adopted them because empirical evidence supports the influence of these two dimensions at the organizational level (Hofstede et al. 2010), which is the level at which projects are carried out.
Hofstede’s PDI and UAI are defined as follows:
- Power distance index (PDI) - Measures how people manage inequality in hierarchical relationships, e.g. between managers and subordinates. In nations with high PDI, employees depend more on managers to make decisions. In nations with low PDI, by contrast, the competences of employees are valued more highly than their hierarchical position.
- Uncertainty avoidance index (UAI) - Measures how people manage uncertainty, i.e. the extent to which they feel threatened by uncertain situations and try to avoid or mitigate them. Nations with strong uncertainty avoidance tend to have strict laws and rules, whereas nations with weak uncertainty avoidance have as few rules as possible, which makes their people more tolerant of uncertain situations.
Based on Hofstede’s framework, we designed the dimensions power distance and uncertainty avoidance to account for the cultural factors in our extended taxonomy.
Power distance
PDI is represented in our extended taxonomy by the dimension called power distance (PD), which has two categories: Small and Large.
Uncertainty avoidance
In our extended taxonomy, UAI is represented by the dimension called “uncertainty avoidance” (UA), which has two categories: Weak and Strong.
The threshold values used to differentiate sites with Small or Large PD and Weak or Strong UA were defined based on Hofstede et al.’s empirical study (Hofstede et al. 2010). To choose the proper UA and PD categories for a site, the UAI and PDI scores for the countries involved in the project under classification should be determined; the scores are available in Hofstede et al.’s book (Hofstede et al. 2010).
Note that the outcomes of the PD and UA classifications for sites should be compared. For example, consider a project with two sites, named X and Y. Site X is located in Germany and site Y is located in Brazil, i.e. X’s PD is Small and UA is Strong, while Y’s PD is Large and UA is Strong. In this example, there is no major concern about the impact of UA on the GSE project, since both sites are classified in the same category. However, PD could negatively impact the GSE project, since the sites are classified in different categories.
In some situations, companies source human resources from different countries to compose teams in one location. This means that the main national culture of a particular site is not necessarily the national culture of the country in which the site is located. For example, Ramasubbu and Balan (2007) report a project with two sites, one located in the USA and the other in India. Although the sites are located in countries with different PD and UA, the human resources of both sites were from India. In such a situation, the actual cultural distance between the two sites is expected to be zero.
Therefore, it is important to account for the predominant nationality of the human resources of a site to define the appropriate PD and UA.
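The comparison described above can be sketched as follows. This is an illustration only: `SiteCulture` and `differing_dimensions` are hypothetical names, and the PD and UA categories are taken as inputs rather than derived from PDI and UAI scores, because the threshold values are not reproduced in this section.

```python
from dataclasses import dataclass
from enum import Enum


class PowerDistance(Enum):
    SMALL = "small"
    LARGE = "large"


class UncertaintyAvoidance(Enum):
    WEAK = "weak"
    STRONG = "strong"


@dataclass(frozen=True)
class SiteCulture:
    """Cultural classification of one site (hypothetical record)."""
    site: str
    predominant_nationality: str  # may differ from the site's country
    pd: PowerDistance
    ua: UncertaintyAvoidance


def differing_dimensions(a: SiteCulture, b: SiteCulture) -> list:
    """Return the cultural dimensions in which two sites fall into different categories."""
    diffs = []
    if a.pd != b.pd:
        diffs.append("power distance")
    if a.ua != b.ua:
        diffs.append("uncertainty avoidance")
    return diffs


# The Germany/Brazil example from the text.
x = SiteCulture("X", "Germany", PowerDistance.SMALL, UncertaintyAvoidance.STRONG)
y = SiteCulture("Y", "Brazil", PowerDistance.LARGE, UncertaintyAvoidance.STRONG)
print(differing_dimensions(x, y))  # ['power distance']
```

Because the categories are keyed to the predominant nationality, two sites in different countries but staffed from the same country would receive identical categories and an empty result.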
Language distance
In a GSE project, it is very likely that the sites involved do not have the same native language, which may lead to misunderstandings between sites and generate delays in the entire project (Ågerfalk et al. 2005). Nowadays, English is the most commonly used language when there is a need for a lingua franca (Lutz 2009). Thus, instead of calculating the distance between sites’ languages, we incorporated a dimension named language distance, which classifies the distance between each site’s language and English.
This dimension has the following categories:
- No distance - When the mother language of a site is English, or no lingua franca is required. In the latter case there is no language distance in such a site, since people from both sites could communicate in their native tongue.
- Small - When 0 < Ld ≤ 0.4, the language distance of a site is considered small. This means that it is more likely that people from such a site have an acceptable level of proficiency in English, since it is relatively easy for them to learn it.
- Medium - When 0.4 < Ld ≤ 0.57, the language distance of a site is considered medium. This means that it is more likely that people from such a site struggle somewhat to learn English, which affects their proficiency. However, they can learn and speak English by applying more effort than people from the previous group.
- Large - When 0.57 < Ld ≤ 1, the language distance of a site is considered large. This means that it is more likely that people from such a site struggle even more to learn English. In general, those languages have almost no commonalities with English, which requires more effort to learn English.
In the aforementioned categories, Ld represents the distance between the language of a particular site and English (Chiswick and Miller 2004). According to Chiswick and Miller (2004), Ld can assume the following values: 0.33, 0.36, 0.4, 0.44, 0.5, 0.57, 0.67, 0.8 and 1.
The bigger the Ld value, the farther a particular language L is from English, which is also a measure of how difficult it is for people who speak L to learn to speak English. Thus, the larger the Ld, the higher the likelihood that the proficiency in English will not be very good (Chiswick and Miller 2004). The lower the level of proficiency in English (as lingua franca), the higher the probability of problems regarding the communication between sites (Herbsleb and Moitra 2001).
Note that this dimension of our taxonomy can be used only in projects that require no lingua franca to enable communication between sites (i.e. there is no language distance) or when the chosen lingua franca is English.
The first category of this dimension (No distance) was designed to represent sites that either have English as their mother tongue or take part in a project where no lingua franca is required, since the sites share the same mother tongue. The other three categories (Small, Medium and Large) were defined by dividing the language distance scale into three equal parts, each covering three of the nine possible Ld values, so that the categories adequately represent the existing spectrum of language distance values.
This dimension focuses on the language that is spoken the most in a site’s location. We did so because it would be very difficult to embrace the particularities of countries that have more than one official language. In addition, high proficiency in English is a prerequisite to allocate personnel to participate in many GSE projects, i.e. in these cases, the mother tongues of sites’ locations are not an issue.
Thus, when using this dimension to classify the language distance of each site, the language spoken the most by the site’s personnel should be identified and used as the basis for selecting the language distance category that fits best.
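The category boundaries of this dimension can be written as a small classification function. This is a sketch under stated assumptions: the function name and its parameters (`english_native`, `lingua_franca_required`) are hypothetical, and the Ld value for the site's most spoken language is assumed to have been looked up beforehand in Chiswick and Miller (2004).

```python
from enum import Enum
from typing import Optional


class LanguageDistance(Enum):
    NO_DISTANCE = "no distance"
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def classify_language_distance(
    ld: Optional[float] = None,
    *,
    english_native: bool = False,
    lingua_franca_required: bool = True,
) -> LanguageDistance:
    """Map a site's Ld score to a language distance category."""
    # English-speaking sites, or projects needing no lingua franca.
    if english_native or not lingua_franca_required:
        return LanguageDistance.NO_DISTANCE
    if ld is None or not 0 < ld <= 1:
        raise ValueError("Ld must satisfy 0 < Ld <= 1")
    if ld <= 0.4:
        return LanguageDistance.SMALL
    if ld <= 0.57:
        return LanguageDistance.MEDIUM
    return LanguageDistance.LARGE


# The nine Ld values listed in the text, three per category.
for value in (0.33, 0.36, 0.4, 0.44, 0.5, 0.57, 0.67, 0.8, 1.0):
    print(value, classify_language_distance(value).value)
```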
Communication model
In GSE projects, the communication between the distributed sites is often mediated via electronic communication media (Jaanu et al. 2012) and existing empirical evidence shows that mediated communication demands more effort (Ebert and De Neve 2001; DeLuca and Valacich 2005). Different electronic communication media types have different properties and capabilities to deal with geographic distance between sites in GSE projects.
Media synchronicity theory (MST) (Dennis et al. 2008) states that the most effective communication occurs when the communication media matches a given set of communication requirements (Dennis et al. 2008). The information to be transmitted can require more conveyance, i.e. processing and transmission of new information, or more convergence, i.e. a group should agree on something (Dennis et al. 2008).
In our extended taxonomy, we incorporated the communication factor through the dimension called communication model, which is based on the MST. The communication model dimension has the following categories:
- Low synchronicity communication model - The communication between sites is mainly mediated via asynchronous media, e.g. email and issue trackers (the most adequate media type for conveyance (Dennis et al. 2008)).
- High synchronicity communication model - The communication between sites is mainly mediated via synchronous media, e.g. video conference and instant messaging tools (the most adequate media type to achieve consensus (Dennis et al. 2008)).
- Balanced synchronicity communication model - The communication model encompasses both asynchronous and synchronous media types and each type is used for its most adequate purpose, i.e. synchronous media is used when there is need for fast feedback and consensus achievement, and asynchronous media is used when there is need to convey some message that should be formalized and consolidated before transmission.
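The pairing of communication needs with media types, and the three categories built on it, can be sketched as below. The sketch rests on an assumption that is not part of the taxonomy: the communication between a pair of sites is summarised as one main media type per need (conveyance or convergence). All names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Need(Enum):
    CONVEYANCE = "conveyance"    # processing and transmitting new information
    CONVERGENCE = "convergence"  # reaching agreement within a group


class Media(Enum):
    ASYNCHRONOUS = "asynchronous"  # e.g. email, issue trackers
    SYNCHRONOUS = "synchronous"    # e.g. video conference, instant messaging


class CommunicationModel(Enum):
    LOW = "low synchronicity"
    HIGH = "high synchronicity"
    BALANCED = "balanced synchronicity"


def best_media(need: Need) -> Media:
    """Media type the text names as most adequate for a communication need."""
    return Media.ASYNCHRONOUS if need is Need.CONVEYANCE else Media.SYNCHRONOUS


def classify_model(main_media: dict) -> Optional[CommunicationModel]:
    """Classify a relationship from the media mainly used for each need."""
    used = {main_media[need] for need in Need}
    if used == {Media.ASYNCHRONOUS}:
        return CommunicationModel.LOW
    if used == {Media.SYNCHRONOUS}:
        return CommunicationModel.HIGH
    if all(main_media[need] is best_media(need) for need in Need):
        return CommunicationModel.BALANCED
    # Both media types are used, but not for their most adequate purpose;
    # the categories in the text do not cover this case.
    return None


email_only = {Need.CONVEYANCE: Media.ASYNCHRONOUS, Need.CONVERGENCE: Media.ASYNCHRONOUS}
print(classify_model(email_only).value)  # low synchronicity
```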