Mobile GUI Testing Fragility: A Study on Open-Source Android Applications
As several market analyses underline, Android has gained a very significant market share with respect to other mobile operating systems, reaching 86.2% in the second quarter of 2016.¹ Mobile devices nowadays offer their users a very wide range of applications, which have reached a complexity that just a few years ago was exclusive to high-end desktop computers.

One of the strengths of the Android operating system is the availability of several marketplaces, which allow developers to easily sell their applications or release them for free. Because of the huge number of apps available on such platforms, and the resulting competition, it is crucial for developers to make sure that their software works as promised to its users. In fact, applications that crash unexpectedly during normal execution, or that are hampered by bugs, are likely to be quickly abandoned by their users in favor of competitors [1] and to gather very negative feedback [2]. Mobile applications must also comply with a series of strict nonfunctional requirements, which are specific to a mobile and context-aware environment [3].

In such a scenario, testing mobile apps becomes a crucial practice. In particular, it is fundamental to test the graphical user interfaces (GUIs) of the apps, since most of the interaction with the final user is performed through them.

There is evidence that relevant players in the industry perform structured testing of their mobile applications, also leveraging the aid of automated tools (for instance, Alégroth and Feldt documented the long-term adoption of visual GUI testing practices at Spotify [4]). By contrast, several studies have shown that open-source mobile developers rarely adopt automated testing techniques in their projects. Kochhar et al. [5] found that, of the open-source projects (mined from F-Droid)² that they examined, just 14% featured any kind of scripted automated test classes; Linares-Vásquez et al. [6] found that the majority of an interviewed set of contributors to open-source projects relied only on the execution of manual test cases, even though a variety of automated testing tools (open source or not) are available.
Performing proper testing of Android apps presents a set of domain-specific challenges, principally due to the very fast pace of evolution of the operating system and to the vast number of possible configurations and features the apps must be compatible with. In addition, the development process for Android apps is typically very quick, and the need to make applications available to the public as soon as possible may deter developers from performing complex forms of testing. Muccini et al. [7] stress the differences between traditional software and Android applications when it comes to testing: the huge quantity of context events to which apps have to react properly, the diversity of devices with which the apps must be compatible, and the possible lack of resources on some devices.
Similar to what happens for web application testing, automated GUI testing of Android apps is also hampered by the fragility issue. For our purposes, we define a GUI test case as fragile if it requires interventions when the application evolves (i.e., between subsequent releases) due to modifications applied to the application under test (AUT). Being system-level tests, GUI test cases are affected by variations in the functionalities of the application (as happens for lower-level component tests), as well as by even small interventions in the appearance, definition, and arrangement of the GUI presented to the user.

Fragility is a significant issue for Android application testing, since a failing test may require in-depth investigation to find the causes of the failure, and entire test suites may need modifications due to even minor changes in the GUIs and in their definition. If that happens, developers may decide not to exercise any kind of structured scripted testing. In our previous work [8], we developed a small test suite (made of 11 test classes) for K9-Mail³, a popular, large open-source Android mail client, and tracked the modifications that the test classes needed in order to remain executable on different releases. We found that up to 75% of the tests we developed had to be modified because of modifications performed on the GUI of the app. When scripted test cases were obtained through Capture & Replay techniques, for some transitions between releases the entire test suite had to be rewritten.

In this work, we aimed at gathering information about the existing test suites featured by open-source Android applications. We extended the context of previous similar work (such as that by Kochhar et al. [5], who analyzed a set of about 600 open-source Android apps collected from F-Droid), considering all the projects hosted on GitHub that contained proper Android applications and that featured a history of releases, for a total of 18 930 projects. We identified six open-source Android GUI testing tools cited in the available literature that produce test classes in Java, and we searched for the presence of code written with those tools in the mined Android projects. In this way, we subdivided the projects into six subsets, according to the testing tool they featured. Then, change metrics about the evolution of the testing code produced with a given testing tool were computed for each project and averaged over the respective sets. In addition to its evolution, we measured the relevance of testing code with respect to the total production code of each project, in terms of quantitative comparisons of the respective amounts of lines of code. To estimate the fragility issue, we defined a set of metrics that can be obtained for each project by automated inspection of the source code. Thus, we can give a characterization and a quantification of the average fragility occurrence for each of the testing tools considered.
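To make the notion of fragility more concrete, the following is a minimal sketch of an Espresso-style GUI test (Espresso is a widely used Java GUI testing framework for Android, representative of the kind of tooling discussed here). All identifiers in the sketch (LoginActivity, R.id.username_field, the button and label texts) are hypothetical: the point is that the test is bound to concrete resource IDs and literal strings, so renaming a widget or rewording a label in a later release breaks the test even though the underlying functionality is unchanged.

import static android.support.test.espresso.Espresso.onView;
import static android.support.test.espresso.action.ViewActions.click;
import static android.support.test.espresso.action.ViewActions.typeText;
import static android.support.test.espresso.assertion.ViewAssertions.matches;
import static android.support.test.espresso.matcher.ViewMatchers.isDisplayed;
import static android.support.test.espresso.matcher.ViewMatchers.withId;
import static android.support.test.espresso.matcher.ViewMatchers.withText;

import android.support.test.rule.ActivityTestRule;
import org.junit.Rule;
import org.junit.Test;

public class LoginScreenTest {

    // LoginActivity is a hypothetical activity under test.
    @Rule
    public ActivityTestRule<LoginActivity> activityRule =
            new ActivityTestRule<>(LoginActivity.class);

    @Test
    public void loginShowsWelcomeMessage() {
        // Fragile bindings: concrete resource IDs and literal text.
        // Renaming R.id.username_field or rewording the welcome label
        // in a later release makes this test fail even though the
        // login functionality itself is untouched.
        onView(withId(R.id.username_field)).perform(typeText("alice"));
        onView(withId(R.id.login_button)).perform(click());
        onView(withText("Welcome, alice!")).check(matches(isDisplayed()));
    }
}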
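The tool-detection step mentioned above (searching the mined projects for code written with each framework) can be approximated by scanning a project's Java sources for characteristic import statements. The sketch below is a simplified illustration of that idea, not the exact procedure used in the study; the list of package prefixes is an assumption and may differ from the signals actually used.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Minimal sketch: classify a cloned project by the GUI testing
// frameworks whose imports appear in its Java sources.
public class TestingToolDetector {

    // Illustrative package prefixes; the study may have recognized
    // each tool through different or additional signals.
    private static final Map<String, String> TOOL_IMPORT_PREFIXES = Map.of(
            "Espresso", "android.support.test.espresso",
            "UI Automator", "android.support.test.uiautomator",
            "Robotium", "com.robotium.solo",
            "Appium", "io.appium.java_client",
            "Selendroid", "io.selendroid",
            "Robolectric", "org.robolectric");

    public static Set<String> detectTools(Path projectRoot) throws IOException {
        Set<String> found = new HashSet<>();
        try (Stream<Path> paths = Files.walk(projectRoot)) {
            List<Path> javaFiles = paths
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
            for (Path file : javaFiles) {
                String source = Files.readString(file);
                for (Map.Entry<String, String> tool : TOOL_IMPORT_PREFIXES.entrySet()) {
                    if (source.contains("import " + tool.getValue())) {
                        found.add(tool.getKey());
                    }
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java TestingToolDetector /path/to/cloned/project
        System.out.println(detectTools(Path.of(args[0])));
    }
}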