Click Tracking

Click tracking is the collection of users' click behavior or navigational behavior in order to derive insights and fingerprint users. Click behavior is commonly tracked using server logs, which record click paths and clicked URLs (Uniform Resource Locators). Such a log is often presented in a standard format that includes information like the hostname, date, and username. As technology develops, however, new software allows for in-depth analysis of user click behavior using hypervideo tools. Given that the internet can be considered a risky environment, research strives to understand why users click certain links and not others. Research has also explored the user experience of privacy, both by anonymizing users' personal identification information individually and by improving how data collection consent forms are written and structured.

Click tracking is relevant in several fields, including human-computer interaction (HCI), software engineering, and advertising. Email tracking, link tracking, web analytics, and user research are related concepts and applications of click tracking. Click data is commonly used to improve the position of search engine results so that their order is more relevant to users' needs. Click tracking employs many modern techniques such as machine learning and data mining.


Tracking and recording technology

Tracking and recording technologies (TRTs) can be split into two categories: institutional TRTs and end-user TRTs. The two differ in who collects and stores the data, institutions or end users respectively. Examples of TRTs include radio frequency identification (RFID), credit cards, and store video cameras. Research suggests that individuals are concerned with privacy but are less concerned with how TRTs are used daily. This discrepancy has been attributed to the public not understanding how information about them is collected.

Another means of obtaining user input is eye tracking, or gaze tracking. Gaze-tracking technology is especially beneficial for those with motor disabilities. Systems that employ gaze tracking often try to mimic cursor and keyboard behavior. In this approach, the gaze-tracking system is separated into its own panel in the system interface, and the user experience is compromised because individuals have to switch between the panel and the other interface features. The experience is also difficult because users first have to imagine how to complete the task using keyboard and cursor features and then employ gaze, which causes tasks to take additional time. Hence, researchers created their own web browser called ''GazeTheWeb'' (GTW), with a research focus on the user experience, and improved the interface to incorporate gaze better.

Eye-movement tracking is also applied in usability testing when creating web applications. However, tracking user eye movements often requires a lab setting with appropriate equipment. Mouse and keyboard activity, by contrast, can be measured remotely, which makes it useful for usability testing. Algorithms can use mouse movements to predict and trace user eye movements. Such tracking in a remote environment is called a remote logging technique.

Browser fingerprinting is another means of identifying and tracking users. In this process, information about a user is collected from their web browser to create a browser fingerprint. A browser fingerprint contains information about a device, its operating system, its browser, and its configuration. HTTP headers, JavaScript, and browser plugins can be used to build a fingerprint. Browser fingerprints can change over time because of automatic software updates or adjustments to the user's browser preferences. Measures that increase privacy in this realm can reduce functionality by blocking features.
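The idea of combining browser attributes into a single fingerprint can be illustrated with a minimal sketch. The attribute names, the sample values, and the choice of SHA-256 over a canonical JSON serialization are all assumptions made for illustration; they are not taken from any particular fingerprinting tool.

```python
import hashlib
import json

# Hypothetical attribute names; real fingerprinting scripts collect many more signals.
FINGERPRINT_KEYS = (
    "user_agent", "accept_language", "platform", "screen", "timezone", "plugins",
)


def browser_fingerprint(attributes: dict) -> str:
    """Return a stable hex digest for a set of browser attributes."""
    # Missing attributes are recorded as None, so absence also contributes.
    selected = {key: attributes.get(key) for key in FINGERPRINT_KEYS}
    # Sorted keys and fixed separators give a canonical serialization.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative values only.
visit = {
    "user_agent": "ExampleBrowser/1.0",
    "accept_language": "en-US",
    "platform": "Linux x86_64",
    "screen": "1920x1080",
    "timezone": "UTC",
    "plugins": ["pdf-viewer"],
}
same_visit = dict(visit)
after_update = dict(visit, user_agent="ExampleBrowser/1.1")

fp_first = browser_fingerprint(visit)
fp_repeat = browser_fingerprint(same_visit)
fp_updated = browser_fingerprint(after_update)
```

In this sketch an unchanged configuration yields the same digest on every visit, while a browser update that alters the user agent string yields a different one, which is one reason fingerprints drift over time.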


Methods of click tracking

User browsing behavior is often tracked using server access logs, which contain patterns of clicked URLs, queries, and paths. More modern tracking software, however, uses JavaScript to track cursor behavior. The collected mouse data can be used to create videos, allowing user behavior to be replayed and easily analyzed. Hypermedia is used to create visualizations in which behavior like highlighting, hesitating, and selecting can be monitored. Technology that records such behavior can also be used to predict it. One of these monitoring tools, SMT2є, collects fifteen cursor features and uses fourteen of them to predict the outcome of the remaining one. This software also generates a log analysis that summarizes user cursor activity.

In a search session, users can be identified using cookies, the ''identd'' protocol, or their IP address. This information can then be stored in a database, and each time a user visits a web page again, their click behavior is appended to the database. ''DoubleClick Inc.'' is an example of a company that has such a database and partners with other companies to aid with their web mining. Cookies are carried in HTTP (Hypertext Transfer Protocol) messages. When a user clicks on a link, they are connected to the associated web server. The click is treated as a request, and the server “responds” by sending back information about the user; this information is a cookie. Cookies provide a “bookmark” for users’ sessions on a website, and they store user login information and the pages users visit on a website. This helps preserve the state of the session. If more than one server is involved, the information must be consistent across all of them, so it is transferred between servers. Data collected via cookies can be used to improve websites for all users, and it also aids user profiling for advertising.

When data mining techniques and statistical procedures are applied to understand web log data, the process is called log analysis or web usage mining. It helps determine patterns in users’ navigational behavior. Features that can be observed include how long users viewed pages, click path lengths, and the number of clicks. Web usage mining has three phases. First, the log data is “preprocessed” to identify the users and the content of their search sessions. Then, tools like association and clustering are applied to look for patterns. Lastly, these patterns are saved for further analysis. Association rule mining helps find “patterns, associations, and correlations” among the pages users visit in a search session. Sequential pattern discovery extends association rule mining by also accounting for time, such as the page views within an allotted time period.
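The first two phases of web usage mining can be sketched in a few lines. The example below is only an illustration under stated assumptions: the click records, the 30-minute inactivity timeout used to split sessions, the restriction to page pairs, and the function names `sessionize` and `pair_rules` are all hypothetical rather than part of any specific tool.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical preprocessed click log: (user_id, timestamp_seconds, page).
CLICKS = [
    ("u1", 0, "/home"), ("u1", 40, "/shoes"), ("u1", 95, "/cart"),
    ("u2", 10, "/home"), ("u2", 70, "/shoes"),
    ("u3", 5, "/home"), ("u3", 30, "/hats"),
    ("u1", 4000, "/home"), ("u1", 4030, "/shoes"), ("u1", 4100, "/cart"),
]
SESSION_GAP = 1800  # assumed inactivity timeout in seconds


def sessionize(clicks, gap=SESSION_GAP):
    """Phase 1: group each user's clicks into sessions split by inactivity."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        by_user[user].append((ts, page))
    sessions = []
    for events in by_user.values():
        current = [events[0][1]]
        for (prev_ts, _), (ts, page) in zip(events, events[1:]):
            if ts - prev_ts > gap:
                # A long pause closes the current session and starts a new one.
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions


def pair_rules(sessions, min_support=0.5):
    """Phase 2: association rules A -> B between pages seen in the same session.

    Returns {(A, B): (support, confidence)} for pairs meeting min_support.
    """
    n = len(sessions)
    page_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        pages = sorted(set(session))
        for page in pages:
            page_count[page] += 1
        for a, b in combinations(pages, 2):
            pair_count[(a, b)] += 1
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support >= min_support:
            rules[(a, b)] = (support, count / page_count[a])
            rules[(b, a)] = (support, count / page_count[b])
    return rules


sessions = sessionize(CLICKS)
rules = pair_rules(sessions)
```

With this sample data, the long pause in the clicks of `u1` produces two separate sessions, and every session containing `/cart` also contains `/shoes`, so the rule from `/cart` to `/shoes` has a confidence of 1.0. A sequential pattern miner would additionally keep the order and timing of the page views, which this sketch discards.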
Classification is a tool that allows pages to be assigned to groups that represent certain shared qualities. Examples of tools for conducting click analytics include the Google Analytics tool In-Page Analytics, ClickHeat, and Crazy Egg. These tools create a visualization from user click data on a webpage. ClickHeat and Crazy Egg show the density of user clicks using specific colors, and all of these tools allow webpage visitors to be categorized into groups by qualities like being a mobile user or using a particular browser. The data for specific groups can then be analyzed for further insight.


Click behaviour

One of the main factors users consider when clicking links is a link's position in a list of results. The closer a link is to the top, the more likely users are to select it. When users have a personal connection to a subject, they tend to click on articles about it more frequently. In news content, pictures, position, and the specific individuals featured weigh more heavily in users’ decisions, while the source of the news is deemed less important.

Click attitude and click intention play a large role in user click behavior. In one study in which research participants were presented with positive and negative insurance advertisement photographs, emotion was seen to have a positive association with click intention and click attitude. The researchers also observed that click attitude affects click intention, and that positive emotion has more of an impact on click attitude than negative emotion.

The internet can be considered a risky environment because of the abundance of cybersecurity attacks and the prevalence of malware. Hence, whenever individuals use the internet, they have to decide whether or not to click on various links. A 2018 study found that users tend to click on more URLs on websites they are familiar with; cybercriminals exploit this trait, and personal information can be compromised as a result. Trust is therefore also seen to increase click-through intention. When given Google Chrome warnings, people click through 70% of the time. They also tend to adjust default computer settings in the process. Users were also found to recognize malware risks better when there is a greater potential for their personal information to be revealed.


Relevance of search results

Pages that users view during a particular search session constitute click data. Such data can be used to improve search results in two ways: as explicit feedback and as implicit feedback. Explicit feedback is when users indicate which pages are relevant to their search query, while implicit feedback is when user behavior is interpreted to determine the relevance of results. User actions on a webpage that can be used in this interpretation include bookmarking, saving, or printing the page. By collecting click data from a few individuals, the relevance of results for given queries can be improved for all users. In a search session, a user's clicks indicate which documents they are more interested in, and therefore what is relevant to the search. The click data most useful for determining relevance is often the last web page viewed rather than all of the pages clicked on in a search session. Click data from outside search sessions can also be used to improve the accuracy of relevant results for users.

The search results for a given query are usually subject to positional bias, because users tend to select links at the top of result lists. A high position does not mean a result is the most relevant, however, since relevance can change over time. In a machine learning approach to improving the result order, human editors begin by supplying the algorithm with an original rank for each result. Live user click feedback, in the form of click-through rates (CTR) tracked in search sessions, can then be used to rerank the results. This improves the order of the results based on the relevance that users indicate in real time.

Click dwell time and click sequence information can also be used to improve the relevance of search results. Click dwell time is how long a user takes to return to the search engine results page (SERP) after clicking on a particular result, and it can indicate how satisfied the user is with that result. Eye-tracking research indicates that users exhibit an abundance of non-sequential viewing activity when looking at search results. Click models that assume “top-down” user click behavior cannot interpret the user process of revisiting pages.
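The reranking step described above, which starts from an editorial rank and adjusts it with tracked click-through rates, can be sketched as follows. This is not the method of any particular search engine: the rank-based prior, the smoothing weight `PRIOR_IMPRESSIONS`, the example URLs, and the click counts are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    editor_rank: int  # 1 = best, as supplied by human editors
    impressions: int = 0
    clicks: int = 0


PRIOR_IMPRESSIONS = 20  # assumed weight given to the editorial prior


def prior_ctr(editor_rank: int) -> float:
    """Assumed prior: expected CTR decays with the editorial rank."""
    return 1.0 / (editor_rank + 1)


def smoothed_ctr(result: Result) -> float:
    """Blend observed clicks with the editorial prior as pseudo-counts."""
    prior_clicks = prior_ctr(result.editor_rank) * PRIOR_IMPRESSIONS
    return (result.clicks + prior_clicks) / (result.impressions + PRIOR_IMPRESSIONS)


def rerank(results):
    """Order results by smoothed CTR, highest first."""
    return sorted(results, key=smoothed_ctr, reverse=True)


# Before any clicks are tracked, the order follows the editors' ranks.
initial = [
    Result("example.org/a", 1),
    Result("example.org/b", 2),
    Result("example.org/c", 3),
]
initial_order = [r.url for r in rerank(initial)]

# After live feedback, a frequently clicked result moves to the top.
observed = [
    Result("example.org/a", 1, impressions=200, clicks=20),
    Result("example.org/b", 2, impressions=200, clicks=80),
    Result("example.org/c", 3, impressions=200, clicks=10),
]
live_order = [r.url for r in rerank(observed)]
```

With no click data the order matches the editorial ranking, and as impressions accumulate the observed CTR dominates the prior. The sketch does not correct for positional bias or use dwell time; a fuller model would need to account for both.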


Extensions


Advertising

Supply-demand mismatch costs can be reduced through click tracking. Huang ''et al.'' define strategic customers as “forward looking” individuals who know that their clicks are being tracked and expect that companies will engage in appropriate business activities. In their study, the researchers used clickstream data from customers to observe their preferences and desired product quantities. Noisy clicks occur when customers click but do not actually buy the product, which leads to imperfect advance demand information (ADI).

Click tracking can be used in advertising, but the tool also has the potential to be used negatively. Publishers display advertisements on their websites and receive money depending on the amount of traffic, measured as a number of clicks, that they send to the advertiser's website. Click fraud is when publishers fake clicks to generate revenue for themselves. In the 2012 Fraud Detection in Mobile Advertising (FDMA) conference, competition teams were tasked with using data mining and machine learning techniques to identify “fraudulent publishers” in a given dataset. A successful algorithm is able to observe and use morning and night click traffic patterns. A high density of clicks between these main patterns is often an indicator of a fraudulent publisher.

Website content can be tailored to individual users on the basis of “user navigational behavior” and user interests, in a process called web personalization. Web personalization is useful in e-commerce. The process has distinct steps, the first of which is “user profiling.” In this step, the user is understood and characterized through their click behavior, preferences, and qualities. User profiling is followed by “log analysis and web usage mining.”


Email

Phishing is usually carried out through emails, and when a user clicks on a phishing email, their information is leaked to particular websites. Spear-phishing is a more “targeted” form of phishing in which user information is used to personalize emails and entice users to click. Some phishing emails also contain other links and attachments. Once these are clicked or downloaded, users’ privacy can be encroached upon. Lin ''et al.'' conducted a study to see which psychological “weapons of influence” and “life domains” affect users most in phishing attempts. They found that scarcity was the most influential weapon of influence and that the legal domain was the most influential life domain. Age is also an important factor in determining who is more susceptible to clicking on phishing attempts.

When a virus infects a computer, it finds email addresses and sends copies of itself to them. These emails usually contain an attachment and are sent to several individuals. This differs from normal user email account behavior, because users tend to have a particular network they communicate with regularly. Researchers studied how the Email Mining Toolkit (EMT) could be used to detect viruses by examining such user email account behavior, and found that quick, broad viral propagations were easier to detect than slow, gradual ones.

To learn which emails users have opened, email senders engage in email tracking. By merely opening an email, users can have their email addresses leaked to third parties, and if users click on links within the emails, their email address can be leaked to a larger number of third parties. In addition, each time a user opens an email sent to them, their information can be sent to a new third party beyond those that have already received their address. Many third-party email trackers are also involved in web tracking, leading to further user profiling.


Privacy

Privacy-protection models anonymize data after it has been sent to a server and stored in a database. Hence, users' personal identification information is still collected, and the collection process depends on users trusting such servers. Researchers have studied giving users control over what information is sent from their mobile devices. In the realm of trajectory data, they have also examined giving users control over how that information is represented in databases, and they created a system that allows for this approach. This approach gives users the potential to increase their privacy.

When user privacy is going to be encroached upon, consent forms are often distributed. The type of user activity required in these forms can affect how much information a user retains from the form. Karegar ''et al.'' compare the simple agree/disagree format with forms that incorporate checkboxes, drag and drop (DAD), and swipe features. When testing what information users would agree to disclose with each consent form format, the researchers observed that users presented with DAD forms had a greater number of eye fixations on the given consent form.

When a third party is associated with a first-party website or mobile application, a user's information is sent to the third party any time the user visits that website or application. Third-party tracking generates more privacy concerns than first-party tracking because it allows many website or application records about a particular user to be combined, yielding better user profiles. Binns ''et al.'' found that among 5000 popular websites, the top two websites alone had 2000 trackers. Of the 2000 embedded trackers, 253 were used in 25 other websites. The researchers evaluated the reach of third-party trackers based on their contact with users rather than with websites, so the more “popular” trackers were those that received information about the highest number of people rather than those whose code was embedded in the most first parties.
Google and Facebook were deemed the first and second largest web trackers, and Google and Twitter were deemed the first and second largest mobile trackers.


See also

* Click path
* Click analytics
* Web analytics
* Phishing
* Web log analysis software
* Web mining
* Search session

