Mobile Application Coverage: The 30% Curse and Ways Forward
Testing, security analysis, and other dynamic quality assurance approaches rely on mechanisms that invoke the software under test, aiming to achieve high code coverage. A large number of the invocation mechanisms proposed in the literature, in particular for Android mobile applications, employ GUI-driven application exploration. However, studies show that even the most advanced GUI exploration techniques can cover only around 30% of a real-world application. This paper investigates “the remaining 70%”. By conducting a large-scale experiment in which two human experts thoroughly explored 61 benchmark apps and 42 popular apps from Google Play, we show that achieving substantially higher coverage for real-world applications is impractical even if we factor out known GUI-based exploration issues, such as the inability to provide semantic inputs and the right order of events. The main reasons preventing even human analysts from covering the entire application include application dependencies on remote servers and external resources, hard-to-reach app entry points, disabled and erroneous features, and software/hardware properties of the underlying device. Thus, future investment in GUI-based exploration strategies is unlikely to lead to substantial improvements in coverage. To chart possible ways forward and explore approaches to satisfy or bypass these “blockers”, we thoroughly analyze the code-level properties guarding them. Our analysis shows that a large fraction of the blockers could be bypassed with relatively simple beyond-GUI exploration techniques. We hope our study can inspire future work in this area; it also provides a realistic benchmark for evaluating such work.