HOME

TheInfoList



OR:

A device fingerprint or machine fingerprint is information collected about the software and hardware of a remote computing device for the purpose of identification. The information is usually assimilated into a brief identifier using a
fingerprinting algorithm In computer science, a fingerprinting algorithm is a procedure that maps an arbitrarily large data item (such as a computer file) to a much shorter bit string, its fingerprint, that uniquely identifies the original data for all practical purposes ...
. A browser fingerprint is information collected specifically by interaction with the
web browser A web browser is application software for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used on ...
of the device. Device fingerprints can be used to fully or partially identify individual devices even when persistent cookies (and zombie cookies) cannot be read or stored in the browser, the client
IP address An Internet Protocol address (IP address) is a numerical label such as that is connected to a computer network that uses the Internet Protocol for communication.. Updated by . An IP address serves two main functions: network interface ident ...
is hidden, or one switches to another browser on the same device. This may allow a service provider to detect and prevent identity theft and credit card fraud, but also to compile long-term records of individuals' browsing histories (and deliver targeted advertising or targeted
exploit Exploit means to take advantage of something (a person, situation, etc.) for one's own end, especially unethically or unjustifiably. Exploit can mean: * Exploitation of natural resources *Exploit (computer security) * Video game exploit *Exploita ...
s) even when they are attempting to avoid tracking – raising a major concern for
internet privacy Internet privacy involves the right or mandate of personal privacy concerning the storing, re-purposing, provision to third parties, and displaying of information pertaining to oneself via Internet. Internet privacy is a subset of data privacy. P ...
advocates.


History

Basic
web browser A web browser is application software for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used on ...
configuration information has long been collected by
web analytics Web analytics is the measurement, collection, analysis, and reporting of web data to understand and optimize web usage. Web analytics is not just a process for measuring web traffic but can be used as a tool for business and market research a ...
services in an effort to measure real human web traffic and discount various forms of click fraud. Since its introduction in the late 1990s, client-side scripting has gradually enabled the collection of an increasing amount of diverse information, with some
computer security Computer security, cybersecurity (cyber security), or information technology security (IT security) is the protection of computer systems and networks from attack by malicious actors that may result in unauthorized information disclosure, t ...
experts starting to complain about the ease of bulk parameter extraction offered by web browsers as early as 2003. In 2005, researchers at
University of California, San Diego The University of California, San Diego (UC San Diego or colloquially, UCSD) is a public university, public Land-grant university, land-grant research university in San Diego, California. Established in 1960 near the pre-existing Scripps Insti ...
showed how
TCP TCP may refer to: Science and technology * Transformer coupled plasma * Tool Center Point, see Robot end effector Computing * Transmission Control Protocol, a fundamental Internet standard * Telephony control protocol, a Bluetooth communication s ...
timestamps could be used to estimate the clock skew of a device, and consequently to remotely obtain a hardware fingerprint of the device. In 2010,
Electronic Frontier Foundation The Electronic Frontier Foundation (EFF) is an international non-profit digital rights group based in San Francisco, California. The foundation was formed on 10 July 1990 by John Gilmore, John Perry Barlow and Mitch Kapor to promote Internet ...
launched a website where visitors can test their browser fingerprint. After collecting a sample of 470161 fingerprints, they measured at least 18.1 bits of
entropy Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in diverse fields, from classical thermodyna ...
possible from browser fingerprinting, but that was before the advancements of canvas fingerprinting, which claims to add another 5.7 bits. In 2012, Keaton Mowery and Hovav Shacham, researchers at
University of California, San Diego The University of California, San Diego (UC San Diego or colloquially, UCSD) is a public university, public Land-grant university, land-grant research university in San Diego, California. Established in 1960 near the pre-existing Scripps Insti ...
, showed how the HTML5 canvas element could be used to create digital fingerprints of web browsers. In 2013, at least 0.4% of Alexa top 10,000 sites were found to use fingerprinting scripts provided by a few known third parties. In 2014, 5.5% of Alexa top 10,000 sites were found to use canvas fingerprinting scripts served by a total of 20 domains. The overwhelming majority (95%) of the scripts were served by AddThis, which started using canvas fingerprinting in January that year, without the knowledge of some of its clients. In 2015, a feature to protect against browser fingerprinting was introduced in Firefox version 41, but it has been since left in an experimental stage, not initiated by default.
The same year a feature named ''Enhanced Tracking Protection'' was introduced in Firefox version 42 to protect against tracking during private browsing by blocking scripts from third party domains found in the lists published by
Disconnect Mobile Disconnect is a partly open source browser extension and mobile app designed to stop non-consensual third party trackers, and providing private web search and private web browsing. On mobile, it is available for Android and iPhone. It was developed ...
. At WWDC 2018
Apple An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple trees are cultivated worldwide and are the most widely grown species in the genus '' Malus''. The tree originated in Central Asia, where its wild ances ...
announced that Safari on macOS Mojave "presents simplified system information when users browse the web, preventing them from being tracked based on their system configuration."
A 2018 study revealed that only one-third of browser fingerprints in a French database were unique, indicating that browser fingerprinting may become less effective as the number of users increases and web technologies convergently evolve to implement fewer distinguishing features. In 2019, starting from Firefox version 69, ''Enhanced Tracking Protection'' has been turned on by default for all users also during non-private browsing. The feature was first introduced to protect private browsing in 2015 and was then extended to standard browsing as an opt-in feature in 2018.


Diversity and stability

Motivation for the device fingerprint concept stems from the
forensic Forensic science, also known as criminalistics, is the application of science to criminal and civil laws, mainly—on the criminal side—during criminal investigation, as governed by the legal standards of admissible evidence and crimin ...
value of human fingerprints. In order to uniquely distinguish over time some devices through their fingerprints, the fingerprints must be both sufficiently diverse and sufficiently stable. In practice neither diversity nor stability is fully attainable, and improving one has a tendency to adversely impact the other. For example, the assimilation of an additional browser setting into the browser fingerprint would usually increase diversity, but it would also reduce stability, because if a user changes that setting, then the browser fingerprint would change as well. A certain degree of instability can be compensated by linking together fingerprints that, although partially different, might probably belong to the same device. This can be accomplished by a simple rule-based linking algorithm (which, for example, links together fingerprints that differ only for the browser version, if that increases with time) or machine learning algorithms.
Entropy Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in diverse fields, from classical thermodyna ...
is one of several ways to measure diversity.


Sources of identifying information

Applications that are locally installed on a device are allowed to gather a great amount of information about the software and the hardware of the device, often including unique identifiers such as the
MAC address A media access control address (MAC address) is a unique identifier assigned to a network interface controller (NIC) for use as a network address in communications within a network segment. This use is common in most IEEE 802 networking tec ...
and serial numbers assigned to the machine hardware. Indeed, programs that employ
digital rights management Digital rights management (DRM) is the management of legal access to digital content. Various tools or technological protection measures (TPM) such as access control technologies can restrict the use of proprietary hardware and copyrighted work ...
use this information for the very purpose of uniquely identifying the device. Even if they are not designed to gather and share identifying information, local applications might unwillingly expose identifying information to the remote parties with which they interact. The most prominent example is that of web browsers, which have been proved to expose diverse and stable information in such an amount to allow remote identification, see . Diverse and stable information can also be gathered below the application layer, by leveraging the protocols that are used to transmit data. Sorted by OSI model layer, some examples of such protocols are: * OSI Layer 7: SMB, FTP,
HTTP The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, ...
, Telnet,
TLS/SSL Transport Layer Security (TLS) is a cryptographic protocol designed to provide communications security over a computer network. The protocol is widely used in applications such as email, instant messaging, and voice over IP, but its use in securi ...
, DHCP * OSI Layer 5: SNMP, NetBIOS * OSI Layer 4:
TCP TCP may refer to: Science and technology * Transformer coupled plasma * Tool Center Point, see Robot end effector Computing * Transmission Control Protocol, a fundamental Internet standard * Telephony control protocol, a Bluetooth communication s ...
(see TCP/IP stack fingerprinting) * OSI Layer 3:
IPv4 Internet Protocol version 4 (IPv4) is the fourth version of the Internet Protocol (IP). It is one of the core protocols of standards-based internetworking methods in the Internet and other packet-switched networks. IPv4 was the first version d ...
,
IPv6 Internet Protocol version 6 (IPv6) is the most recent version of the Internet Protocol (IP), the communications protocol that provides an identification and location system for computers on networks and routes traffic across the Internet. I ...
, ICMP, IEEE 802.11 * OSI Layer 2: CDP Passive fingerprinting techniques merely require the fingerprinter to observe traffic originated from the target device, while active fingerprinting techniques require the fingerprinter to initiate connections to the target device. Techniques that require interaction with the target device over a connection initiated by the latter are sometimes addressed as semi-passive.


Browser fingerprint

The collection of a large amount of diverse and stable information from web browsers is possible for most part due to client-side scripting languages, which were introduced in the late '90s. Today there are several open-source browser fingerprinting libraries, such as FingerprintJS, ImprintJS, and ClientJS, where FingerprintJS is updated the most often and supersedes ImprintJS and ClientJS to a large extent.


Browser version

Browsers provide their name and version, together with some compatibility information, in the User-Agent request header. Being a statement freely given by the client, it should not be trusted when assessing its identity. Instead, the type and version of the browser can be inferred from the observation of quirks in its behavior: for example, the order and number of HTTP header fields is unique to each browser family and, most importantly, each browser family and version differs in its implementation of HTML5, CSS and
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of Website, websites use JavaScript on the Client (computing), client side ...
. Such differences can be remotely tested by using JavaScript. A Hamming distance comparison of parser behaviors has been shown to effectively fingerprint and differentiate a majority of browser versions.


Browser extensions

A combination of extensions or
plugins Plug-in, plug in or plugin may refer to: * Plug-in (computing) is a software component that adds a specific feature to an existing computer program. ** Audio plug-in, adds audio signal processing features ** Photoshop plugin, a piece of software t ...
unique to a browser can be added to a fingerprint directly. Extensions may also modify how any other browser attributes behave, adding additional complexity to the user's fingerprint.
Adobe Flash Adobe Flash (formerly Macromedia Flash and FutureSplash) is a multimedia software platform used for production of animations, rich web applications, desktop applications, mobile apps, mobile games, and embedded web browser video players. Fla ...
and
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mo ...
plugins were widely used to access user information before their deprecation.


Hardware properties

User agents may provide system hardware information, such as phone model, in the HTTP header. Properties about the user's
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
, screen size,
screen orientation Page orientation is the way in which a rectangular page is oriented for normal viewing. The two most common types of orientation are ''portrait'' and ''landscape''. The term "portrait orientation" comes from visual art terminology and describes ...
, and display aspect ratio can be also retrieved by using
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of Website, websites use JavaScript on the Client (computing), client side ...
to observe the result of CSS media queries.


Browsing history

The fingerprinter could determine which sites the browser had previously visited within a list it provided, by querying the list using JavaScript with the CSS selector . Typically, a list of 50 popular websites were sufficient to generate a unique user history profile, as well as provide information about the user's interests. However, browsers have since then mitigated this risk.


Font metrics

The letter bounding boxes differ between browsers based on anti-aliasing and font hinting configuration and can be measured by JavaScript.


Canvas and WebGL

Canvas fingerprinting uses the HTML5 canvas element, which is used by WebGL to render 2D and 3D graphics in a browser, to gain identifying information about the installed graphics driver, graphics card, or
graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mo ...
(GPU). Canvas-based techniques may also be used to identify installed
font In movable type, metal typesetting, a font is a particular #Characteristics, size, weight and style of a typeface. Each font is a matched set of type, with a piece (a "Sort (typesetting), sort") for each glyph. A typeface consists of a range of ...
s. Furthermore, if the user does not have a GPU,
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
information can be provided to the fingerprinter instead. A canvas fingerprinting script first draws text of specified font, size, and background color. The image of the text as rendered by the user's browser is then recovered by the ToDataURL Canvas API method. The hashed text-encoded data becomes the user's fingerprint. Canvas fingerprinting methods have been shown to produce 5.7 bits of entropy. Because the technique obtains information about the user's GPU, the information entropy gained is "orthogonal" to the entropy of previous browser fingerprint techniques such as screen resolution and JavaScript capabilities.


Hardware benchmarking

Benchmark tests can be used to determine whether a user's CPU utilizes AES-NI or Intel Turbo Boost by comparing the CPU time used to execute various simple or cryptographic algorithms. Specialized
APIs Apis or APIS may refer to: * Apis (deity), an ancient Egyptian god * Apis (Greek mythology), several different figures in Greek mythology * Apis (city), an ancient seaport town on the northern coast of Africa **Kom el-Hisn, a different Egyptian ci ...
can also be used, such as the Battery API, which constructs a short-term fingerprint based on the actual battery state of the device, or OscillatorNode, which can be invoked to produce a waveform based on user entropy. A device's hardware ID, which is a
cryptographic hash function A cryptographic hash function (CHF) is a hash algorithm (a map of an arbitrary binary string to a binary string with fixed size of n bits) that has special properties desirable for cryptography: * the probability of a particular n-bit output ...
specified by the device's vendor, can also be queried to construct a fingerprint.


Mitigation methods for browser fingerprinting

Different approaches exist to mitigate the effects of browser fingerprinting and improve users' privacy by preventing unwanted tracking, but there is no ultimate approach that can prevent fingerprinting while keeping the richness of a modern web browser.


Offering a simplified fingerprint

Users may attempt to reduce their fingerprintability by selecting a
web browser A web browser is application software for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used on ...
which minimizes the availability of identifying information, such as browser fonts, device ID, canvas element rendering, WebGL information, and
local IP address In Internet networking, a private network is a computer network that uses a private address space of IP addresses. These addresses are commonly used for local area networks (LANs) in residential, office, and enterprise environments. Both the IPv4 ...
. As of 2017 Microsoft Edge is considered to be the most fingerprintable browser, followed by Firefox and Google Chrome,
Internet Explorer Internet Explorer (formerly Microsoft Internet Explorer and Windows Internet Explorer, commonly abbreviated IE or MSIE) is a series of graphical user interface, graphical web browsers developed by Microsoft which was used in the Microsoft Wind ...
, and Safari. Among mobile browsers, Google Chrome and Opera Mini are most fingerprintable, followed by mobile Firefox, mobile Edge, and mobile Safari. Tor Browser disables fingerprintable features such as the canvas and WebGL API and notifies users of fingerprint attempts.


Offering a spoofed fingerprint

Spoofing some of the information exposed to the fingerprinter (e.g. the
user agent In computing, a user agent is any software, acting on behalf of a user, which "retrieves, renders and facilitates end-user interaction with Web content". A user agent is therefore a special kind of software agent. Some prominent examples of u ...
) may create a reduction in diversity, but the contrary could be also achieved if the spoofed information differentiates the user from all the others who do not use such a strategy more than the real browser information. Spoofing the information differently at each site visit, for example by perturbating the sound and canvas rendering with a small amount of random noise, allows a reduction of stability. This technique has been adopted by the Brave browser in 2020.


Blocking scripts

Blindly blocking client-side scripts served from third-party domains, and possibly also first-party domains (e.g. by disabling JavaScript or using NoScript) can sometimes render websites unusable. The preferred approach is to block only third-party domains that seem to track people, either because they're found on a blacklist of tracking domains (the approach followed by most ad blockers) or because the intention of tracking is inferred by past observations (the approach followed by Privacy Badger).


Using multiple browsers

Different browsers on the same machine would usually have different fingerprints, but if both browsers are not protected against fingerprinting, then the two fingerprints could be identified as originating from the same machine.


See also

*
Anonymous web browsing Private browsing is a privacy feature in some web browsers. When operating in such a mode, the browser creates a temporary session that is isolated from the browser's main session and user data. Browsing history is not saved, and local data as ...
* Browser security * Browser sniffing *
Evercookie Evercookie (also known as supercookie) is a JavaScript application programming interface (API) that identifies and reproduces intentionally deleted cookies on the clients' browser storage. It was created by Samy Kamkar in 2010 to demonstrate th ...
* Fingerprint (computing) *
Internet privacy Internet privacy involves the right or mandate of personal privacy concerning the storing, re-purposing, provision to third parties, and displaying of information pertaining to oneself via Internet. Internet privacy is a subset of data privacy. P ...
*
Web tracking Web tracking is the practice by which operators of websites and third parties collect, store and share information about visitors’ activities on the World Wide Web. Analysis of a user's behaviour may be used to provide content that enables the ...


References


Further reading

* * *


External links


Panopticlick
by the
Electronic Frontier Foundation The Electronic Frontier Foundation (EFF) is an international non-profit digital rights group based in San Francisco, California. The foundation was formed on 10 July 1990 by John Gilmore, John Perry Barlow and Mitch Kapor to promote Internet ...
, gathers some elements of a browser's device fingerprint and estimates how identifiable it makes the user
Am I Unique
by INRIA and INSA Rennes, implements fingerprinting techniques including collecting information through WebGL. {{DEFAULTSORT:Device Fingerprint Computer network security Internet privacy Internet fraud Fingerprinting algorithms Web analytics