RomaDroid: A Robust and Efficient Technique for Detecting Android App Clones Using a Tree Structure and Components of Each App’s Manifest File

Android apps span many categories, including entertainment, health and fitness, travel, education, and business. They can contain business logic, maintain sensitive personal information, and act as a bridge between IoT devices and cloud servers. Because attackers frequently copy a legitimate Android app and redistribute the plagiarized version for commercial or malicious purposes, many studies have sought to detect repackaged or cloned apps and make the Android ecosystem safer. An attacker might also apply code obfuscation to evade clone detection, so any detection technique must account for its effects. In this paper, we design and implement a tool called RomaDroid, which efficiently detects cloned apps based on features inherent in each app’s AndroidManifest.xml file. The manifest file is an XML document defined by tags and attributes, and it can be modeled as an ordered labeled tree. RomaDroid creates a string from the hierarchical tree structure of the tags and from the class names of the components associated with intent-filter tags in the manifest file, both of which are robust to code obfuscation. That is, we create one string from the manifest file of each of the two apps being compared and measure the similarity between the two strings with the longest common subsequence (LCS) algorithm. If the measured similarity exceeds a certain threshold, the two apps are determined to be a clone pair (or similar app pair). To validate RomaDroid, we perform various experiments with both non-obfuscated apps and their obfuscated versions generated by three obfuscation tools. The experimental results show that RomaDroid accurately detects cloned apps even when code obfuscation has been applied.
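
The comparison described above can be illustrated with a minimal Python sketch. It is not RomaDroid's implementation: the token format (`depth:tag` and `name:class`), the normalization `2 * LCS / (len(a) + len(b))`, the default threshold of 0.8, and all function names are assumptions chosen for illustration, since the abstract does not specify them. The sketch also works on a token sequence rather than a single character string and expects the manifest as decoded, plain-text XML.

```python
import xml.etree.ElementTree as ET

# Namespace that ElementTree attaches to android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def manifest_tokens(xml_text):
    """Preorder walk of the manifest tree (hypothetical encoding).

    Emits one "depth:tag" token per element, plus a "name:<class>" token
    for every component that directly contains an intent-filter.
    """
    root = ET.fromstring(xml_text)
    tokens = []

    def walk(node, depth):
        tokens.append(f"{depth}:{node.tag}")
        if node.find("intent-filter") is not None:
            name = node.get(ANDROID_NS + "name")
            if name:
                tokens.append(f"name:{name}")
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return tokens


def lcs_length(a, b):
    """Length of the longest common subsequence, using two DP rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 2 * LCS / (len(a) + len(b)), in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def is_clone_pair(xml_a, xml_b, threshold=0.8):
    """True if the similarity exceeds the (illustrative) threshold."""
    score = similarity(manifest_tokens(xml_a), manifest_tokens(xml_b))
    return score > threshold
```

Because the tokens come only from tag structure and from component class names declared in the manifest, renaming internal classes or methods in the app's code leaves this sequence unchanged, which is the intuition behind the robustness claim in the abstract.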
