Do androids dream of electric sheep? On privacy in the android supply chain

Julien Gamba

Author: Julien Gamba
Thesis supervisor: Narseo Vallina Rodriguez
Defended: at Universidad Carlos III de Madrid (Spain) in 2022
Language: English
Thesis committee: Douglas Leith (chair), Rubén Cuevas Rumín (secretary), Hamed Haddadi (member)
Doctoral program: Doctoral Program in Telematics Engineering, Universidad Carlos III de Madrid
Subjects:
- Technological sciences
  - Telecommunications technology
Links
- Open-access thesis at: e-Archivo
Abstract
- The Android operating system (OS) started as Android Inc., a California-based start-up founded in October 2003 by Andy Rubin, Rich Miner, Nick Sears, and Chris White. The founders’ initial goal was to create an OS for smart cameras based on the Linux kernel, but they later adapted it to smartphones. Google bought the company in 2005 and went on to develop the operating system.
  
  In 2007, Google, along with 34 other tech companies, unveiled the Open Handset Alliance (OHA), a consortium dedicated to developing open standards for mobile devices. The announced goal was to create “the first truly open and comprehensive platform for mobile devices”. A preview of the first Android Software Development Kit (SDK) was released a week after that announcement to attract developers. The first official version of Android was released in 2008, along with the first Android-powered smartphone, the HTC Dream.
  
  Android has since grown to be the most used OS, with at least three billion active devices as of May 2021. A major factor in this rapid adoption is Android’s open source model. Each component of the OS can be modified by a phone manufacturer or an original equipment manufacturer (OEM, a contractor that can produce a device on behalf of another company) before being installed on a device. Google encourages such customizations of the OS. In practice, manufacturers take advantage of the openness of the OS to distinguish themselves from their market competitors.
  
  Because of this openness, the supply chain of Android devices involves a large and diverse number of stakeholders, from the creation of the device to its manufacturing and distribution. While the actual number of stakeholders can vary depending on the complexity of the model and the commercial partnerships between them and other companies, the commercialization of most devices involves the following actors:
  
  • Chipset manufacturers: at the beginning of the supply chain are the chipset manufacturers such as Qualcomm or MediaTek. These companies are responsible for the manufacturing of essential electronic components and provide software (including apps and drivers) to interact with said components.
  
  • Device manufacturers: these actors are the most visible ones, as they include the brands known to the end user. Device manufacturers are the companies that actually assemble the components and load the firmware. Here I also include Original Equipment Manufacturers (OEM) and Original Design Manufacturers (ODM), which are contractors hired by phone vendors to externalize the manufacturing of the phones (an OEM manufactures a device based on the vendor’s design, while an ODM creates a design, potentially from scratch, and manufactures the devices).
  
  • App markets: certified devices can come with the Google apps suite pre-installed (e.g., Google Play Store, YouTube, Gmail). Only phones certified by the Google Android team can pre-install the Google Play Store. Devices can also pre-install alternative app marketplaces, such as the Amazon Appstore, or regional app markets (e.g., Chinese devices may come pre-installed with app stores from Baidu or Tencent).
  
  • Mobile Network Operators (MNO): MNOs can create strategic partnerships with vendors to sell devices to users at lower prices in exchange for a subscription to their services. In these cases, MNOs can pre-install apps to add value to the device or to ease access to MNO-related services (e.g., a companion app to keep track of data consumption and data plan).
  
  • Resellers and distributors: finally, at the end of the supply chain are resellers and distributors. This includes brick-and-mortar shops as well as online shops such as Amazon or eBay.
  
  These are the stakeholders of a typical Android device supply chain, but the actual supply chain of a given device can vary across brands. In fact, what makes the Android supply chain unique is the variable number of stakeholders that can be involved at any point, and its diversity: for instance, two copies of the same device model might have a different set of pre-installed apps depending on the country in which they were bought. Moreover, the supply chain can be dynamic, with extra apps installed without user interaction when the device first boots. Any of the stakeholders can also pre-install software from their partners, expose features to other apps on the device, or even change core Android components, which gives these stakeholders a privileged vantage point from which to gather information on the user. Indeed, pre-installed apps are trusted by the system by default, and can even be pre-granted permissions without user interaction. Once installed, they are very difficult for a user to remove, if removal is possible at all.
  
  Not all pre-installed software is wanted by users, and the term “bloatware” is often applied to such software. The process by which a particular set of apps ends up packaged together in the firmware of a device is not transparent, and various isolated cases reported over the last few years suggest that it lacks end-to-end control mechanisms to guarantee that shipped firmware is free from vulnerabilities or potentially malicious and unwanted apps. For example, at Black Hat USA 2017, Johnson et al. gave details of a powerful backdoor present in the firmware of several models of Android smartphones, including the popular BLU R1 HD. In response to this disclosure, Amazon removed BLU products from its Prime Exclusive line-up. A company named Shanghai Adups Technology Co. Ltd. was pinpointed as responsible for this incident. The same report also discussed how vulnerable core system services (e.g., the widely deployed MTKLogger component developed by the chipset manufacturer MediaTek) could be abused by co-located apps. The infamous Triada trojan has also been recently found embedded in the firmware of several low-cost Android smartphones. Other cases of malware found pre-installed include Loki (spyware and adware) and Slocker (ransomware), which were spotted in the firmware of various high-end phones.
  
  Android handsets also play a key role in the mass-scale data collection practices followed by many actors in the digital economy, including advertising and tracking companies. OnePlus has been under suspicion of collecting Personally Identifiable Information (PII) from users of its smartphones through exceedingly detailed analytics, and also deploying the capability to remotely root the phone. In July 2018 the New York Times revealed the existence of secret agreements between Facebook and device manufacturers such as Samsung to collect private data from users without their knowledge. This is currently under investigation by the US Federal authorities. Additionally, users from developing countries with lax data protection and privacy laws may be at an even greater risk. The Wall Street Journal has exposed the presence of a pre-installed app that sends users’ geographical location as well as device identifiers to GMobi, a mobile-advertising agency that engages in ad fraud activities. Recently, the European Commission publicly expressed concern about Chinese manufacturers like Huawei, alleging that they were required to cooperate with national intelligence services by installing backdoors on their devices. In March 2019, it was reported that hackers managed to hijack the update process of Asus computers to install malware. While this does not involve pre-installed apps, it is a prime example of the consequences of the size and lack of control over the agents forming the supply chain.
  
  To make sure that every device can properly run any app regardless of its level of customization, Google has set up a compatibility program, which states the minimum requirements that the modified OS must meet to stay compatible with standard Android apps. However, this compatibility program only sets software requirements and does not consider security and privacy implications for the end user. Google also created a certification for devices to assess their security and performance. Both phone vendors and ODMs can have their devices certified. This certification is mandatory in order to pre-install Google apps and the Google Play Store on a device. Unfortunately, there is little information available regarding the tests that Google actually performs before certifying devices, and it is not clear at which stage of manufacturing the tests are performed.
  
  Another vector for customization available to stakeholders of the supply chain is the Android permission system. The Android OS implements a permission-based mechanism to control how apps can access sensitive data and dangerous system features such as user contacts, the camera, location sensors, or the system settings. Coupled with other protection mechanisms such as process sandboxing, the permission system empowers users to control what sensitive resources are accessible to which apps. The Android Open Source Project (AOSP) defines a standard set of permissions that are supported by most Android devices. Any Google certified device must implement the whole set of AOSP permissions to guarantee their compatibility with the standard Android platform. A decade of research in the use, enforcement, and usability of AOSP permissions has revealed severe privacy and security shortcomings inherent to the Android permission model. Consequently, many vulnerabilities were fixed gradually across Android releases.
  
  Android’s permission model possesses an interesting, overlooked feature: its extensibility. By design, the Android framework allows any app developer to share features implemented in their software with other apps in a “controlled” way by defining custom permissions. Custom permissions therefore extend the capabilities offered by the Android OS, allow pre-installed apps to expose new features, and facilitate the flourishing of an open software ecosystem in which apps (and third-party libraries or SDKs) can share data and components with other developers. However, custom permissions also pose security and privacy risks, as they can be (ab)used, intentionally or by mistake, to circumvent the standard permission system and provide backdoored access to privileged data and features to apps that are otherwise not permitted to access them, in a way akin to how covert and side channels operate.
  
  The control and transparency mechanisms implemented by the Android operating system are insufficient to protect users from abusive or insecure implementations of custom permissions. Google recommends using the reverse domain name as the prefix of such permissions, and supplying a description of the custom functionality or data protected by the permission, but, in practice, these recommendations are not enforced. Consequently, it is not possible to automatically know what precise function or resource is protected by a custom permission, or how custom permissions are being integrated and used across Android apps. This lack of control and transparency also manifests at installation time, with profound implications for user awareness and control: unlike official AOSP permissions, custom permissions are not listed in the app stores, and end users cannot grant or deny apps access to them at runtime unless the developer willingly defines them with a dangerous protection level.
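
  As an illustration of what can be read from an app's manifest, the following minimal sketch lists the custom permissions a manifest defines, their protection level, and whether a description is supplied. It assumes the manifest has already been decoded to plain-text XML with textual protection levels; the manifest content, package name, and permission names are hypothetical and do not come from the thesis.

```python
import xml.etree.ElementTree as ET

# Namespace of android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical manifest, for illustration only.
EXAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.vendor.locator">
    <permission android:name="com.example.vendor.locator.READ_POSITION"
                android:protectionLevel="signature"
                android:description="@string/read_position_desc" />
    <permission android:name="READ_LOGS_EXTENDED" />
    <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
</manifest>
"""


def android_attr(element, name):
    """Return the android:<name> attribute of an element, or None if absent."""
    return element.get(f"{{{ANDROID_NS}}}{name}")


def custom_permission_definitions(manifest_xml):
    """List the <permission> definitions found in a decoded manifest."""
    root = ET.fromstring(manifest_xml.strip())
    definitions = []
    for element in root.findall("permission"):
        # A definition without an explicit level defaults to "normal".
        level = android_attr(element, "protectionLevel") or "normal"
        base_level = level.split("|")[0]
        definitions.append({
            "name": android_attr(element, "name"),
            "protection_level": level,
            "has_description": android_attr(element, "description") is not None,
            # Only a dangerous-level permission is prompted to the user at runtime.
            "user_grantable": base_level == "dangerous",
        })
    return definitions


if __name__ == "__main__":
    for definition in custom_permission_definitions(EXAMPLE_MANIFEST):
        print(definition)
```

  In this hypothetical manifest, the second definition has neither a description nor an explicit protection level, so nothing in the manifest indicates what it protects, and the user is never asked about it.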
  
  Despite more than a decade of research into the Android ecosystem, the ecosystem of pre-installed Android software and its associated privacy and security concerns have remained neglected by the research community.
  
  This ecosystem has remained largely unexplored due to the inherent difficulty of accessing such software at scale and across vendors. This state of affairs makes this work even more relevant, since i) these apps, typically unavailable on app stores, have mostly escaped the scrutiny of researchers and regulators; and ii) regular users are unaware of their presence on the device, which could imply a lack of consent to data collection and other activities. Similarly, the research literature on the evolution of the permission system and on custom permissions is notably sparse. As of now, no app analysis tool has been able to capture the asynchronous behavior of custom permissions. Prior work demonstrated, using proof-of-concept implementations, how custom permissions can enable permission re-delegation and confused deputy attacks. Yet our understanding of the Android custom permissions landscape has remained limited, particularly in terms of their prevalence, usage, and potential misuse.
  
  Research Questions and Objectives
  
  Analyzing the customization of Android devices at scale poses a number of challenges. As discussed previously, the openness of the Android ecosystem has made the supply chain more complex. A myriad of stakeholders can pre-install apps, each with their own business model and practices, thereby gaining privileged access to system resources and potentially to users’ personal data.
  
  Moreover, pre-installed apps differ from publicly available apps: while a publicly available app is standalone, developers of pre-installed apps know in advance the environment in which their app will run, i.e., the software and hardware specification of the device. Pre-installed apps can safely rely on specific libraries, or even on other apps that will also come pre-loaded on the device, for specific operations. As a consequence, pre-installed apps tend to use more features of the Android OS, such as shared user IDs, to pool resources with other apps, or custom permissions, to expose some of their components to a specific set of other apps on the device; such features are typically less common in publicly available apps. I discuss these aspects in depth in Chapter 2 (page 15). This can hinder the use of state-of-the-art static and dynamic analysis tools, as such tools expect a standalone entity to analyze and might miss inter-component and inter-app communication, or calls to functions defined by non-standard libraries that are not present in their emulated environment.
  
  RQ1: Exploring the system Android apps ecosystem
  
  The majority of system apps are not publicly available and have therefore escaped the scrutiny of the research community. This is especially worrying, as apps installed on system partitions hold a privileged position in the Android operating system. However, the state of the art has not produced any method to gather pre-installed apps directly from users’ devices, and has relied instead on crawling firmware images or on buying devices. Neither of those methods scales well. It is, therefore, necessary to design novel, scalable methods to gather system apps.
  
  There is also a dynamic component to the supply chain, which further complicates its analysis. Modern devices include mechanisms to install updates for system apps, usually in the form of a Firmware Over The Air (FOTA) app, which can update apps even if they are installed on a system partition, or install new apps on those partitions. This implies that a FOTA app can also install new system apps, possibly after the user starts using the device, with the same issues as pre-installed ones.
  
  The exploration of the modern Android supply chain is a necessary first step to uncover its stakeholders and the relationship between them. I design an innovative crowdsourcing method to collect pre-installed apps on users’ devices in a privacy-respecting manner, giving us an accurate overview of the ecosystem in the wild, and allowing us to study its main stakeholders.
  
  RQ2: Measuring the consequences on user privacy and security
  
  Given the scale of the Android supply chain, and the high number of third parties that can pre-install extra system apps, it is paramount to conduct a privacy and security analysis of these apps. I first use static and dynamic analysis to try to understand the purpose of these apps and highlight numerous potentially harmful behaviors in both high-end and low-end devices. I specifically focus on apps that can access the full, unfiltered system logs (which might contain private information) and manually analyze their code to understand in which circumstances they access them, and whether they upload them to the Internet.
  
  Another critical part of users’ security and privacy is the Android permission system. This system has evolved over time, and it is unclear what the impact is on users’ security and privacy. Specifically, I study how third-party apps make use of lesser-known features of the permission system to potentially make features available to other apps.
  
  Finally, custom permissions can open the door to privacy and security abuses. The aforementioned limitations make the automatic detection of privacy-invasive or malicious behaviors due to custom permissions challenging at best. I first evaluate the usability of state-of-the-art tools for the analysis of pre-installed apps and show that they are not suitable for this purpose. I then develop my own analysis tools targeted specifically at these apps. Once I have suitable tools for analysis, I conduct a large-scale privacy analysis of the pre-installed apps ecosystem.
  
  Contributions and Organization
  
  In this thesis, I answer the research questions discussed above and make several contributions to advance the state of the art.
  
  Exploration of the pre-installed apps ecosystem
  
  I first design a novel crowdsourcing method to create a dataset of pre-installed Android apps. I created an app, Firmware Scanner, freely available on Google Play, that scans the system partitions of Android phones and uploads pre-loaded apps to our server, along with metadata about the device (e.g., information about the brand and model of the device, or the MCC and MNC codes and country code of the SIM card). This metadata allows me to identify stakeholders of the supply chain and attribute customizations back to them. With this tool, I was able to gather 1,309,968 unique apps (according to their MD5 hash) from 33,915 unique devices (according to their build fingerprint), coming from 1,050 unique vendors. Because the app relies on crowdsourcing, it also captures dynamically installed system apps. The data comes from devices on every continent, giving us an unprecedented overview of this ecosystem, including regional customizations.
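
  The counting rules stated above (apps deduplicated by MD5 hash, devices by build fingerprint) can be sketched as follows. This is only an illustration under assumed inputs: the `Upload` record, its field names, and the sample values are hypothetical and are not taken from Firmware Scanner or its server.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Upload:
    """Hypothetical record for one app file uploaded by one device."""
    build_fingerprint: str  # identifies the device build
    vendor: str             # brand reported in the device metadata
    apk_bytes: bytes        # raw content of the pre-installed app


def summarize(uploads):
    """Count unique apps (by MD5), devices (by fingerprint), and vendors."""
    app_hashes, fingerprints, vendors = set(), set(), set()
    for upload in uploads:
        # Two files with the same MD5 digest are treated as the same app.
        app_hashes.add(hashlib.md5(upload.apk_bytes).hexdigest())
        fingerprints.add(upload.build_fingerprint)
        vendors.add(upload.vendor)
    return {
        "unique_apps": len(app_hashes),
        "unique_devices": len(fingerprints),
        "unique_vendors": len(vendors),
    }


if __name__ == "__main__":
    sample = [
        Upload("vendorA/model1/build1", "vendorA", b"dialer-v1"),
        Upload("vendorA/model1/build1", "vendorA", b"fota-v3"),
        Upload("vendorB/model9/build4", "vendorB", b"fota-v3"),
    ]
    print(summarize(sample))
```

  Keying apps on a content hash rather than on the package name means that two builds of the same package count as two apps, which is consistent with a dataset whose size is reported in unique hashes.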
  
  Armed with this dataset, I present in Chapter 5 (page 47) the first large-scale study of pre-installed software and the supply chain on a global scale. This dataset allows us to characterize the stakeholders involved in the supply chain, from device manufacturers and mobile network operators to third-party organizations like advertising and tracking services, and social network platforms. To do this, I mainly rely on the analysis of information available in the manifest of the app packages, their signing certificates, and the third-party libraries (TPLs) they embed. My analysis covers 1,200 unique developers associated with the major manufacturers, vendors, and Internet service companies. I also uncover a vast landscape of third-party services (11,665 unique third-party libraries) revolving around advertisement, analytics, and social networking services. This chapter also explores the relationships between these stakeholders by analyzing the custom permissions defined by hardware vendors, MNOs, third-party services, security firms, industry alliances, chipset manufacturers, and Internet browsers. Such permissions can potentially expose data and features to over-the-top apps and can be used to access privileged system resources and sensitive data in a way that circumvents the Android permission model. A manual inspection illuminates a complex supply chain that involves different stakeholders and commercial partnerships between handset vendors and online service providers. Overall, I show evidence that the supply chain around Android’s open source model lacks transparency and has facilitated potentially harmful behaviors and backdoored access to sensitive data and services without user consent or awareness.
  
  Privacy analysis of pre-installed Android apps
  
  In Chapter 5, I report numerous potentially harmful behaviors in pre-installed apps, coming from both first and third parties. I find that user tracking is prevalent in the pre-installed apps ecosystem and that some apps abuse their privileged position by requesting permissions usually reserved for system apps. In particular, I show in depth how some apps access the full, unfiltered system logs, and then either store them on the SD card of the device or even upload them to the Internet.
  
  I then present in Chapter 6 the evolution of the permission system over time, both in terms of the number of AOSP permissions and in terms of complexity. I show the vast number of flags that developers can use to slightly alter the permission granting algorithm, and I formalize their impact on that algorithm. Then, I show how these flags are used by pre-installed apps in the wild, including by third-party system apps, which can then make some features available to other apps on the device.
  
  I then focus on custom permissions, which could potentially be used to circumvent the AOSP permission system. The use of custom permissions is not limited to pre-installed apps, however, and there could in fact be collusion between such apps and publicly available ones. Therefore, I investigate the custom permissions ecosystem as a whole, including apps from any origin. In Chapter 7, I gather a dataset of 2.2 million pre-installed and publicly available apps from 8 different app stores, complemented with apps downloaded from the AndroZoo project. With this dataset, I present the first longitudinal and large-scale analysis of the usage of custom permissions in the Android ecosystem. I find that pre-installed and public apps both define and request a large number of custom permissions. Namely, 58% and 67% of pre-installed and public apps request at least one, and 26% and 4% define at least one, respectively. I find widespread violations of the naming recommendations set by Google for custom permissions: 45% of definitions do not follow them. For example, I find 722 custom permissions with the android.permission prefix, which is explicitly forbidden by the Android Compatibility Definition Document (CDD). Despite this prevalence, I find that custom permissions are virtually invisible to end users, and their purpose is mostly undocumented. While there is a description tag to describe the purpose and functionality of the custom permission, its usage is optional and I find that it is rarely used by developers (missing in 75% of the cases).
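
  A naming check of this kind could look like the sketch below. It is not the method used in the thesis: approximating the "reverse domain name" by the first two labels of the defining app's package name is my own simplifying assumption, and every package and permission name in the example is hypothetical.

```python
# Prefix that custom permissions must not use, according to the text above.
RESERVED_PREFIX = "android.permission."


def classify_permission_name(permission_name, package_name):
    """Classify a custom permission name against the naming recommendation.

    Assumption: the reverse domain name is approximated by the first two
    labels of the defining package (e.g. "com.example" for "com.example.app").
    """
    if permission_name.startswith(RESERVED_PREFIX):
        return "reserved-prefix"
    domain_prefix = ".".join(package_name.split(".")[:2]) + "."
    if permission_name.startswith(domain_prefix):
        return "follows-recommendation"
    return "no-domain-prefix"


def violation_rate(definitions):
    """Share of (permission_name, package_name) pairs that break the rule."""
    if not definitions:
        return 0.0
    violations = sum(
        1
        for name, package in definitions
        if classify_permission_name(name, package) != "follows-recommendation"
    )
    return violations / len(definitions)


if __name__ == "__main__":
    sample = [
        ("com.example.app.permission.SYNC", "com.example.app"),
        ("com.example.permission.SHARE", "com.example.app"),
        ("android.permission.READ_EXTRA", "com.example.app"),
        ("SHARE_DATA", "org.sample.tool"),
    ]
    for name, package in sample:
        print(name, "->", classify_permission_name(name, package))
    print("violation rate:", violation_rate(sample))
```

  A purely lexical check like this one only covers the naming recommendation. It says nothing about what the permission protects, which is the harder question.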
  
  This lack of transparency can lead to serious security and privacy problems: I show that custom permissions can facilitate access to permission-protected system resources to apps that lack those permissions without user awareness. However, there were no available tools to trace and understand the type of data or capability that is protected by a given custom permission, making it difficult to assess their risk from a user’s privacy and security perspective. To fill this gap, I present a novel method to triage apps that are potentially misusing custom permissions to access personal data, or perform other actions potentially detrimental to users’ privacy and security. My method relies on two custom-made tools: (1) PermissionTracer, a tool that reports potentially-dangerous custom permissions and detects potential cases of a privilege escalation attack in which an attacker can access permission-protected information using custom permissions; and (2) PermissionTainter, a static taint analysis tool that inspects the bytecode of apps that define custom permissions, to identify potential privacy leaks due to those permissions. Thanks to these tools, I identify several potentially harmful implementations where an attacker could access sensitive data such as the location, Wi-Fi MAC address, or contacts without requesting the corresponding AOSP permission.
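
  To make the privilege escalation scenario concrete, here is a deliberately coarse, manifest-level triage heuristic. It is not PermissionTracer and does not reproduce its logic: it simply flags pairs of apps where one app defines a custom permission below the signature level while holding AOSP permissions, and another app requests that custom permission without holding the same AOSP permissions. The `App` model and all names are hypothetical, and a flagged pair is only a candidate that would still need code-level confirmation.

```python
from dataclasses import dataclass, field

AOSP_PREFIX = "android.permission."


@dataclass
class App:
    """Hypothetical manifest-level view of an app."""
    package: str
    defines: dict = field(default_factory=dict)  # custom permission -> protection level
    requests: set = field(default_factory=set)   # every permission the app requests


def escalation_candidates(apps):
    """Return (requester, custom permission, missing AOSP permissions) triples."""
    candidates = []
    for definer in apps:
        definer_aosp = {p for p in definer.requests if p.startswith(AOSP_PREFIX)}
        for permission, level in definer.defines.items():
            # Signature-level permissions are limited to apps signed with the
            # same key, so this sketch does not flag them.
            if level.split("|")[0] == "signature":
                continue
            for requester in apps:
                if requester is definer or permission not in requester.requests:
                    continue
                missing = definer_aosp - requester.requests
                if missing:
                    candidates.append((requester.package, permission, sorted(missing)))
    return candidates


if __name__ == "__main__":
    locator = App(
        package="com.example.vendor.locator",
        defines={"com.example.vendor.locator.READ_POSITION": "normal"},
        requests={"android.permission.ACCESS_FINE_LOCATION"},
    )
    weather = App(
        package="org.sample.weather",
        requests={"com.example.vendor.locator.READ_POSITION"},
    )
    print(escalation_candidates([locator, weather]))
```

  In this example the weather app never requests the location permission, yet it may reach location data through the component guarded by the custom permission. Whether it actually does depends on what that component returns, which is the kind of question a bytecode-level taint analysis is meant to answer.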
  
  Finally, I conduct a small-scale survey of app developers who defined some of these custom permissions in order to understand their use case and rationale. My findings suggest that most developers lack a clear understanding of their purpose and functioning. As a result, custom permissions are often used due to poor software development practices or because they are required to define them in order to integrate third-party SDKs.
