We now discuss the study results shown in the previous section with regard to our research questions, thus explaining where and how this study has improved our knowledge of multi-language software development, cross-language linking, and accompanying tool support. We also indicate points where we feel practical efforts or future research would be beneficial.
Before we begin with the actual results, we discuss the metadata, or results from participant selection and demographic parts of the survey. As stated in the methods section, our selection was based on a form of snowballing and we thus refrain from generalizations. Based on the data we have gathered, however, it seems that the sample did include both a) knowledgeable respondents and b) a quite diverse set of participants.
First, with an average of 8 years of experience and 88% of participants indicating that their responsibilities include programming, we believe that the respondents indeed had knowledge about the questions posed in this questionnaire and that the results are thus to be considered meaningful.
Second, the results show that many different levels of experience, team sizes, and project lengths are found in the sample. The years of experience range from below 1 to 36 with 50% lying between 5 and 15 years (median 8); the team size ranges between 1 and 170 with 50% between 4 and 18.5 developers (median 7); and the length of the projects ranges from 1 month to 5 years with 50% between 6.5 and 24 months (median 12). Furthermore, respondents worked in at least 23 different companies. This indicates that we achieved a good spread of people and projects which makes this sample worth investigating.
When we look at the types of software developed, we see a more uniform picture. 66% of the reported projects were either web applications or client/server applications. Server-only applications followed behind with 16% of responses, with all others below 9%. Thus, the results of this survey should be interpreted with web, client/server, and server applications in mind — not, in particular, with desktop or mobile applications nor with embedded or operating systems.
The following three subsections discuss the answers to each of the three areas listed in the introduction, with two research questions each. We discuss threats to validity in Section 4.4.
On multi-language software development
In the following, we discuss the answers to research questions RQ1 and RQ2.
RQ1
Our first research question was How prevalent is multi-language programming and which languages are used? How many languages did developers work with?
Firstly, projects were reported to have an average of 7 languages (lower bound, since we counted DSL types instead of DSLs) with 50% of projects having between 6 and 8 languages. So, we conclude that multi-language programming is indeed prevalent.
We can also compare this figure with the number of languages in open source projects from a previous study (Mayer and Bauer 2015). There, languages were automatically retrieved from source files, with a median of 4 and 50% of projects having between 2 and 7 languages. It is thus interesting to see that more languages were reported in the current study, even if this count is a lower bound due to our asking for DSL types. This may be due to various factors.
First, the current data is from the memory of participants, whereas the other is extracted from code; however, we would have expected a result in the other direction here (i.e. fewer reported languages from memory). The more likely explanation is that the data set in the previous study included both a) unfinished projects and b) several smaller "toy" projects, which reduced the number of languages. It would be interesting to compare the current results with a survey of open-source personnel.
The languages used most in our sample were Java and JavaScript, by a wide margin. This means in turn that the results of this study should be mostly interpreted with regard to projects with these languages, since we do not know the effect of the use of different general purpose languages on the questions asked.
With regard to DSLs, we only asked for types. Here, it has become abundantly clear that nearly all projects used the top five categories of DSLs, namely languages for the user interface, (shell) scripting, building, querying, and configuring. It will be interesting in the future to look into these categories in more detail and separate out the individual languages in use.
Only 12 respondents reported having created their own languages. Thus, although creating custom DSLs has been an active research topic for some time now (Fowler 2011), this was not a common phenomenon in our sample of industrial respondents.
It is important to note here that about 66% of respondents indicated that their last project was either a web application (36%) or a client/server application (30%). This might also indicate why the number of languages was higher than in our previous study, as the use of DSLs in those types of systems is very common.
Finally, there is no clear answer with regard to the question of the use of a project’s languages by developers. Figure 7 has shown that basically all possible options from "just one language" to "all of the languages" were selected. In general, there seem to be different philosophies at work regarding the allocation of work to developers or, more generally, developer training. As we have seen in the comments to the question on benefits and problems, several respondents indicated problems precisely when not all team members were able to understand all languages, which is related to but not exactly the same question we asked here (which was about writing code in these languages). This seems like a natural step to follow up in future research.
To summarize our answer to RQ1: Multi-language programming is indeed common with an average of seven languages. Our sample set included mostly Java/JavaScript projects. Most projects used DSLs from the UI, configuration, shell scripting, querying, and building domains. Only 9% of respondents used custom languages. The developer-to-language ratio ranged freely from one language per developer to developers writing code in all present languages.
RQ2
Our second research question was What are the benefits and problems developers encountered in the use of multiple languages? Do developers feel that multi-language programming has/will increase or decrease over time?
Looking at the results from question five, respondents saw a benefit of the use of multi-language development in two areas. The first is a technical one, namely the translation of requirements into code. Thus, developers agree that certain languages are better suited to encode specific requirements than others. The second is related to a human issue, namely developer motivation, where a high number of respondents saw a benefit of using multiple languages.
On the other hand, a rather high number of respondents saw problems for the understandability of the system due to the use of multiple languages. This seems to conflict with the requirement translation answer above: Why do developers see multiple languages as beneficial for encoding requirements, but problematic for code understanding? We believe that the question of requirement translation was mostly interpreted regarding single languages, not the combination of languages. A single language can indeed be better suited to encode a problem; however, problems with understandability crop up when this code is subsequently combined with code in other languages. This is an interesting discrepancy and should be followed up with further research.
The second area marked out to be problematic is the changeability of the system, that is, performing later changes in the presence of the use of multiple languages — developers in our sample mostly agree that multi-language software development leads to challenges when changing code later on.
Two additional areas are seen as problematic, which are the management of the build (which must take into account more languages and their respective tools) as well as the required effort for developers.
Finally, there is no clear trend to be seen with regard to the architectural design and the initial (first) implementation of the system, nor with regard to the two technical issues of memory consumption and CPU performance. We assume that these questions could not be answered on this level of abstraction, which would also explain the large set of neutral responses. Thus, we should follow up here with splitting these questions up with regard to individual languages or language types.
To summarize, the top two areas seen as problematic are related to understandability and changeability of the system. Both are important areas for system maintenance. Here, it seems prudent to follow up either with practical support in the form of tools, or with further research on techniques to improve these two areas by design.
The second part of RQ2 is about trends in multi-language programming. Here, respondents mostly agree that there were fewer languages in the past and that there will be more languages in the future, which is probably as expected and shows the need for further addressing multi-language software development.
To summarize our answer to RQ2: MLSD is seen as beneficial for developer motivation and for translating requirements to code, and seen as problematic for program understanding, changes to the system, build management, and developer effort. For architectural design, initial implementation, memory consumption, and CPU performance there is no clear trend. Developers agree that fewer languages were used in the past and more will be used in the future.
On cross-language linking
In the following, we discuss the answers to research questions RQ3 and RQ4.
RQ3
Our third research question was In how many and which combinations of languages did developers encounter cross-language links? This question is interesting since answers to it are difficult to extract in a generic fashion from source code (compared to, for example, the language counts of RQ1): the language pairs and the linking mechanisms must be known before a link detection mechanism can be written, and there is a large number of possible link pairs and frameworks.
The results show that most links, as expected, occur between general-purpose languages and domain-specific languages. Since we already saw that Java and JavaScript are the top languages used in this sample, it is unsurprising to see them employed here as well, with languages from three of the five main DSL types as link targets (XML and .properties for configuration, HTML for UI, and SQL for querying) as shown in Table 3. As the table only shows the top 5, it is again unsurprising to see the "usual suspects" in GPL/GPL and DSL/DSL linking as well.
As we have mentioned above, the three questions on linking used free text only and were thus very tiresome to answer. We thus expect that many respondents only entered what immediately came to mind, and therefore the total numbers (3 link pairs per respondent) are probably below the actually occurring link pairs. 9 out of 139 respondents indicated that there were no cross-language links which seems surprising and might also be due to this issue. We took the reported language pairs at face value and did not take discrepancies with questions 1 and 2 into account.
Another issue we found here is that several respondents wrongly attributed languages to either general-purpose or domain-specific although these terms were explained in the survey. While this was easily corrected afterwards, it suggests that this distinction is not as well-known or clear-cut as we expected.
It is worth noting at this point that the questionnaire did not ask participants to distinguish between generated and non-generated code. Some code which includes cross-language links might in fact be generated (either in a GPL or in a DSL). This usually means that the code is only read, but not (manually) changed. Still, developers need to be able to understand it.
With the answers to these questions we have gained information on link pairs which, as mentioned above, is hard to extract automatically. The full list of 152 distinct link pairs is available on our web site. We thus have a starting point for further research including writing tool support. For future surveys, we recommend using this information to create multiple-choice answers for these questions as well.
To summarize our answer to RQ3: Developers encountered cross-language links in a total of 152 distinct language pairs with an average of 3 link pairs per project. The most common combinations were GPL/DSL links between Java and XML, SQL, HTML, and .properties, as well as JavaScript and HTML.
RQ4
Our fourth research question was Did problems with cross-language linking occur? If so, which, when, how frequent, and what was done to alleviate these problems? This question is at the heart of this study.
The first subquestion here is about whether problems with cross-language links occurred at all. Only about 8% of respondents reported no issues, so we can clearly state here that the overwhelming majority of respondents did encounter at least some issues with cross-language links.
Of the suggested issues with cross-language links, the one selected most often was problems resulting from changing cross-language identifiers (with 61% of respondents having had this problem in the last project). Another 46% stated that developers refrained from changing cross-language links for fear of breaking code. We added this answer based on previous experiences with industrial development without expecting that many developers would admit to having this issue. That they indeed did so is evidence, in our opinion, that cross-language links as they are now are indeed seen as being hard to keep track of.
While not changing identifiers prevents problems in the short run, it will lead to problems with understandability later on since necessary changes — such as renaming identifiers whose function has changed — are not carried out any longer. Indeed, the issue selected third most often (by 44%) is problems with understanding or explaining how the system worked due to cross-language links.
We conclude that cross-language links are seen as hard to keep track of and thus difficult to understand and communicate.
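To make this fragility concrete, the following minimal sketch checks one of the link pairs reported above, Java code referring to keys in a .properties file. It is an illustration only and not a tool used or evaluated in this study. The file contents, the helper names `defined_keys` and `find_dangling_keys`, and the assumption that links always appear as string literals passed to `getProperty` are all hypothetical; real linking mechanisms differ between frameworks.

```python
import re

# Hypothetical Java source: two lookups by string key.
JAVA_SOURCE = '''
String url = config.getProperty("db.url");
String user = config.getProperty("db.user");
'''

# Hypothetical .properties content after db.user was renamed to db.username
# on the configuration side only.
PROPERTIES_TEXT = '''
# database settings
db.url=jdbc:postgresql://localhost/app
db.username=admin
'''

# Assumption: every link is a string literal passed to getProperty(...).
KEY_USE = re.compile(r'getProperty\("([^"]+)"\)')


def defined_keys(properties_text):
    """Collect keys from simple 'key=value' or 'key:value' lines."""
    keys = set()
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles of .properties files.
        if not line or line.startswith(("#", "!")):
            continue
        key = re.split(r"[=:]", line, maxsplit=1)[0].strip()
        keys.add(key)
    return keys


def find_dangling_keys(java_source, properties_text):
    """Return keys used in the Java code that the properties text does not define."""
    used = KEY_USE.findall(java_source)
    known = defined_keys(properties_text)
    return sorted(set(used) - known)


if __name__ == "__main__":
    # Reports the key that the rename left behind on the Java side.
    print(find_dangling_keys(JAVA_SOURCE, PROPERTIES_TEXT))
```

In this sketch the Java code still compiles after the rename, and the broken link only shows up at run time unless a check of this kind is in place, which is the situation respondents described.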
The other three suggestions of problems (increased difficulties in build management, test writing, and configuration of required libraries or frameworks) were selected by 30 to 40 percent of respondents. They are thus of concern as well, but less so than the problems discussed above.
In question 11, we asked respondents when and how frequently problems with cross-language links occurred during the development of their systems. The two activities most affected were those in which the system was changed, either to implement new functionality or when refactoring. A smaller number of additional problems was detected (as it should be) during unit testing.
Part of the aim of this question was to test which activities would likely profit from (tool) support and whether cross-language linking problems occur late in development, making them harder to fix. Thus, the result is encouraging: Few respondents indicated that problems with cross-language links made it to user testing or were still present after release; instead, problems mostly occurred during programming when changing code. We conclude that future efforts to aid developers should be focused on these activities.
The last question here is about measures taken to alleviate problems with cross-language linking. It is interesting to see that despite 92% of respondents reporting problems with such links, only 20% used dedicated support tools and only 30% used dedicated tests to find such errors. About 9% reported that no measures were taken at all. It is unclear whether this is due to the fact that such tools simply do not exist for the relevant cross-language links, or whether they were not used for other reasons. This should be followed up in the future.
We also provided suggestions for “soft measures” as part of this question: 68% of respondents indicated that they took “special care” when changing cross-language identifiers. 12% reported avoidance of the use of multiple programming languages, and 21% reported avoidance of cross-language identifiers in general. 37% indicated that identifiers were not changed to avoid issues.
This again indicates that it seems difficult to feel comfortable with the presence of cross-language links and they are thus avoided when possible; if present, they are handled with the utmost care. This situation clearly needs additional attention both in practical efforts and in research.
To summarize our answer to RQ4: Problems with cross-language linking were reported by 92% of respondents. Most problems were related to changing cross-language identifiers and to program understanding, which suggests that cross-language links are seen as fragile and difficult to understand and communicate. These issues occurred mostly during activities in which code was changed by developers. Only about 20-30% of respondents used concrete measures against cross-language linking problems in the form of tools or test cases; many respondents indicated that they tried to avoid multiple languages, cross-language linking, or changing identifiers as far as possible.
On tool support
In the following, we discuss the answers to research questions RQ5 and RQ6.
RQ5
Our fifth research question was Was tool support available for dealing with cross-language identifiers? If so, which functions were available?
55 participants (nearly 40%) indicated that no tool support at all was used in their last completed project. Thus, tool support was in fact in use, but less so than we expected. In particular, the error marking functionality, which is the one directly relevant to problems, was selected by the lowest number of respondents (35 respondents, 25%), which mirrors the result in question 12 (where 27 respondents (20%) reported using dedicated tools for error detection).
It is important to note here that we simply asked whether the functionality was available in the project. Whether this means that tools do not exist at all or were simply not used is unclear. This should be investigated in the future; we suggest asking about concrete tools in combination with cross-linked language pairs.
To summarize our answer to RQ5: Tool support was available to about 60% of respondents. The functionality most available was highlighting (41%), followed by renaming, navigation, and finally error marking (25%).
RQ6
Our sixth and final research question was Is tool support considered important, and if so, which functions for handling cross-language identifiers are most important?
In this final question, respondents agreed that having more tool support would be beneficial. A total of 82% of respondents indicated that they see tool support in general as "very" or "rather" important. Thus, it seems clear that there is a need for further support. There is little difference between the individual functions in rating. The most important functionality to respondents is, as expected, support for error marking (87%). The least important was support for highlighting (67%), although it should be noted here that highlighting was the functionality most available to developers and thus may have received fewer votes.
To summarize our answer to RQ6: A large majority of respondents (82%) agree on the benefits of tool support for cross-language linking. The greatest benefit (87%) is expected from functionality to mark errors in cross-language links.
Closing remarks
In the previous subsections we have answered our research questions in great detail. We have found that MLSD and the use of cross-language linking are indeed prevalent, as are related problems. We see three areas in which we might improve the state of the art: Better tool support, easier cross-language linking by design, and more focus on the ability of developer teams to speak all languages in use in a system.
We attribute problems with understandability and changeability to the fragile and implicit nature of cross-language links as they exist today: it seems hard to keep track of such links. Two remedies suggest themselves for this issue. The first is tool support, which has already been suggested as part of this survey and has been met with broad agreement by developers: Better tool support for tracking and changing cross-language links may indeed alleviate many problems associated with these links, including developers' fear of changing code due to unknown cross-language effects, and may thus reestablish trust in the code base.
However, better tool support can also be seen as only handling the symptoms of the problem: If cross-language linking mechanisms were more robust in the first place, we would not have as great a need of tool support as we do now. Thus, secondly, we should investigate creating maintainable and understandable cross-language links by design. How specific cross-language linking mechanisms could be improved is a matter of future research. In the comments, several respondents mentioned the use of code generation tools with the benefit of cross-language linking information being stored in only one place. One respondent explicitly mentioned using a central database for this information which is a form of explicit interface specification. A first step forward might thus be made by drawing on the experience with interface specifications in general, that is, using more explicit and accountable links which are stored in well-known places.
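The idea of keeping cross-language linking information in only one place can be sketched as follows. This is a minimal illustration under our own assumptions, not the approach of any particular respondent: the specification `LINK_SPEC`, the generated class name `ConfigKeys`, and both helper functions are hypothetical. A single list of identifiers is used to render both a Java constants class and a matching .properties skeleton, so that a rename happens in one well-known place.

```python
# Hypothetical single specification: one entry per cross-language identifier,
# given as (Java constant name, key in the .properties file).
LINK_SPEC = [
    ("DB_URL", "db.url"),
    ("DB_USER", "db.user"),
]


def java_constants(spec, class_name="ConfigKeys"):
    """Render a Java class holding one string constant per shared identifier."""
    lines = [f"public final class {class_name} {{"]
    for const_name, key in spec:
        lines.append(f'    public static final String {const_name} = "{key}";')
    lines.append("}")
    return "\n".join(lines)


def properties_template(spec):
    """Render a .properties skeleton with the same keys and empty values."""
    return "\n".join(f"{key}=" for _, key in spec)


if __name__ == "__main__":
    print(java_constants(LINK_SPEC))
    print(properties_template(LINK_SPEC))
```

Java code would then refer to `ConfigKeys.DB_USER` rather than to a string literal, so that a key changed in the specification is picked up by both sides when the files are regenerated.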
Third, the qualitative data from the comments of developers on various questions show one potentially underinvestigated area, which is the knowledge of team members about the languages used in the project. This problem is related to but not the same as our question on the developer/language ratio, which was more about changing, not only understanding code. Several respondents have commented that various problems occur if not all team members speak all languages, not only for the code quality, but also for organizational issues such as finding replacements during a vacation or when developers leave a company. We suggest that this angle of the problem be followed up in the future as well.
Threats to validity
As with any empirical study, there may be threats to the validity and trustworthiness of our work (Kitchenham and Pfleeger 2002b). Since this is the first survey in this area we cannot compare to existing results or instruments. We thus discuss content validity as perceived by the authors, pre-testers, and the survey respondents (in the form of free text comments).
A general issue with questionnaires is the risk of participants not understanding the questions or (pre-made) answers. We have taken care to adapt the terms to the intended target audience, i.e. professional developers, not researchers. We have refrained from using overly technical terms by replacing them with simpler ones; where this was not possible we have explained the terms (such as "cross-language link") within the questionnaire: each section of the questionnaire was prefixed with a page of explanations. We have also provided examples where appropriate, for example, for the distinction between GPL and DSL. Furthermore, the questionnaire was run through a pre-test with five participants. We thus believe it unlikely that there were serious misunderstandings of the questions.
The question on benefits and problems of MLSD in this questionnaire was kept generic, i.e. participants were asked to answer this question regardless of concrete languages or language types. This was done on purpose as it was our aim to find out if there are trends in opinion when considering the MLSD phenomenon as a whole; however, we might have gained more insights had we asked this question multiple times for individual languages or types. A "neutral" option was provided so that participants did not have to choose, an option which was made use of in several instances as reported. While we believe that the results to this question are a meaningful first step, we recommend that future studies split this question by individual languages and types.
Most questions in this questionnaire included multiple-choice answers, i.e. we provided predefined answers instead of open ones. The main benefits of this approach are a) it is easier for respondents to answer, and b) the results are easier to analyze. However, this may lead to a bias since participants were able to select pre-fabricated opinions instead of having to provide their own ones. All options provided in the questionnaire were discussed among the authors and, where possible, with other developers, and thus stem from the experience of the authors and other software developers in the development of multi-language software.
It is worth noting at this point that open questions are problematic as well since having to provide free text increases the likelihood of participant dropouts. Furthermore, analysis of such data would again require categorization which is then performed by the researcher, not the respondent. To address this issue, we provided both suggestions and an additional free text field in all of these questions such that developers could add their own opinions. We investigated these qualitative answers manually, but mostly did not find a clustering of non-listed issues except in two cases: Several respondents mentioned additional problems if not all team members are able to understand all languages; and several respondents indicated the use of code generation, that is, using a single specification for cross-language links and using this specification to prevent problems. In a follow-up study, these two topics should thus be investigated further.
We finally come to the question of the ability to generalize. The selection of participants for this study was done in a snowball fashion, the reasons for which have to do with our inability to randomly select participants due to the inaccessibility of the population and questions regarding unsolicited e-mail. We have thus, in this report, refrained from using inferential statistics and have only reported our results on this sample. As we have indicated in the results section, we feel that the sample has a good diversity nevertheless and as such we feel that the results offer interesting insights into industrial development. Also, parts of this study are of an explorative nature, and we feel that our results directly suggest further research as indicated in the discussion section.
Even when reporting on the sample only, there is one possible bias which comes from the threat that multiple respondents in our survey may have reported on the same project. However, most companies in our sample offer customized development to clients and thus have many individual projects running at the same time. This problem also only affects the questions on the last completed project, not the questions on developers' opinions. The free text answers did not show any patterns to confirm this concern, and we assume that the overall diversity was high enough to counter this issue.