diff --git a/templates/about.html b/templates/about.html
index b90a5bc..8f8117d 100644
--- a/templates/about.html
+++ b/templates/about.html
@@ -9,7 +9,7 @@
-
+
@@ -47,31 +47,31 @@

About

-

The purpose of the laundromat, how to use it effectively, and how to interpret the results +

The purpose of the Laundromat, how to use it effectively, and how to interpret the results

The Laundromat

-

The laundromat tool provides two functions: Content Similarity Search and Domain Forensics Matching: +

The Laundromat tool provides two functions: Content Similarity Search and Domain Forensics Matching:

  • Content Similarity Search attempts to detect URLs where a given text snippet occurs. It does not provide evidence of where that text originated or any relationship between two entities posting two similar texts. Detemination of a given text's provenance is outside the scope of this tool.
  • -
  • Domain Forensics Matching attempts to find aspects of a website which indicate what makes it +
  • Metadata Similarity Search attempts to find aspects of a website which indicate what makes it unique, give insight into its architecture/design, or show how its used/tracked. These indicators are compared for items with high degrees of similarity and matches are provided to - the user
  • + the user.

The Domain Forensics Comparison Corpus

-

Any URLs entered into the Domain Forensics Matching tool are compared against against a list of +

Any URLs entered into the Metadata Similarity Search tool are compared against a list of domains already processed by the tool. This corpus is sourced from a number of sources, including:

Inclusion in the corpus of comparison sites is neither an endorsement nor a criticism of a given website's point of view or their relationship to any other member of the corpus. It solely reflects what websites are of interest to OSINT researchers. If you'd like a website removed - from the list or have a potential list of new items to include, email pbenzoni (at) gmfus.org

+ from the list or have a potential list of new items to include, email info (at) securingdemocracy.org.

About the Indicator Tier System and Interpreting Results

-

Each indicator is associated with evidentiary tier and are subject to +

Each indicator is associated with an evidentiary tier and is subject to interpretation.

-

Tier 1 indicators: WHEN VALID are +

Tier 1 Indicators: WHEN VALID are typically unique or highly indicative of the provenance of a website. This includes unique IDs for verification purposes and web services like Google, Yandex, etc as well as site metadata like WHOIS information and certification, WHEN VALID, as DDOS protection services like Cloudflare and shared hosting services like Bluehost can provide spurious matches.

-

Tier 2 indicators: Tier 2 indicators, WHEN - VALID, offer a moderate level of certainty regarding the provenance of a +

Tier 2 Indicators: WHEN + VALID, these offer a moderate level of certainty regarding the provenance of a website. These are not as unique as Tier 1 indicators but provide valuable context. This tier includes IPs within the same subnet, matching meta tags, and commonalities in standard and custom - response headers

-

Tier 3: Tertiary Indicators - Tier 3 indicators, WHEN VALID, are + response headers.

+

Tier 3 Indicators: WHEN VALID, these are the least specific but can still support broader analyses when combined with higher-tier indicators. - These include shared CSS classes, UUIDs, and Content Management Systems

+ These include shared CSS classes, UUIDs, and Content Management Systems.

Interpreting Indicator Validity

Understanding the validity of indicators is crucial in the analysis of websites' provenance and connections. Indicators can range from high-confidence markers of direct relationships to spurious @@ -138,12 +137,12 @@

Interpreting Indicator Validity

Identifying that multiple websites are behind Cloudflare does not inherently indicate a connection beyond choosing a common, popular service for performance and security enhancements. All tier 1 and - 2 indicators should be scrutinized carefully to determine if a match is valid or spurious

+ 2 indicators should be scrutinized carefully to determine if a match is valid or spurious.

Example Investigation:

An analyst investigating a network of disinformation websites notices that several sites share a specific Facebook Pixel ID, indicating a potential link in their online marketing strategies. This Tier 1 indicator suggests a high-confidence connection. However, upon further investigation, - it's revealed that these sites also use Cloudflare for DDOS protection, sharing SSL certificates + it's revealed that these sites also use Cloudflare for DDOS protection, sharing SSL certificates and IP addresses with numerous unrelated sites. While the shared Facebook Pixel ID remains a strong indicator of connection, the shared certificates and IP addresses through Cloudflare are deemed spurious matches and the additional sites are discarded from the network. The analyst corroborates @@ -162,7 +161,7 @@

Enter the full URL of an article or webpage (e.g. https://tech.cnn.com/article-title.html or https://www.rt.com/russia/588284-darkening-prospects-ukraine-postwar/) - to automatically attempt to extract title and content

+ to automatically attempt to extract title and content.

This search allows users to specify the title and content (and apply boolean ANDs/ORs to the title and content). It also requires specifying a country and language to search in. As not all languages @@ -173,15 +172,15 @@

title or snippet which matches the provided inputs as determined by the Ratcliff/Obershelp algorithm..

-

Domain Forensics Matching

+

Metadata Similarity Search

This search, which will accept a list of one or more fully qualified domain names. (including a prepended https:// on each domain name). This will produce a list of - indicators and a list of sites which match (or are extremely similart to) those indicators. + indicators and a list of sites which match (or are extremely similar to) those indicators. Indicators, and thus matches, are broken into the three tiers described above.

Partners, Sponsors, Disclaimers

-

The Laundromat Tool is made possible with the support of European Media and Information Fund (EMIF). - The Information Laundromat Tool is built a partnership of the Alliance for Securing Democracy (ASD), +

The Laundromat Tool is made possible with the support of the European Media and Information Fund (EMIF). + The Information Laundromat Tool is built by a partnership of the Alliance for Securing Democracy (ASD), the Institute for Strategic Dialogue (ISD), and the University of Amsterdam (UvA) through the Digital Methods Institute.

@@ -623,7 +622,7 @@

Full Indicators List:

Disclaimers

Opinions Disclaimer

The sole responsibility for any content supported by the European Media and Information Fund lies - with the author(s) and it may not necessarily reflect the positions of the EMIF and the Fund + with the author(s) and it may not necessarily reflect the positions of the EMIF and the Fund's Partners, the Calouste Gulbenkian Foundation and the European University Institute.

GDPR Disclaimer

The Information Laundromat tool is committed to protecting and respecting your privacy in compliance @@ -903,7 +902,10 @@

.about-page a:hover {
    color: lightgray;
}
-
+ .main-page {
+     max-width: 1800px;
+     margin: 0 auto;
+ }
\ No newline at end of file
diff --git a/templates/index.html b/templates/index.html
index 96f07d5..765a0d1 100644
--- a/templates/index.html
+++ b/templates/index.html
@@ -491,7 +491,7 @@

Metadata Similarity

engines, databases, and plagiarism checkers to find similar texts. Enter a URL, the title, or content of an article to search for instances - of reposted & similar content on search engines, + of reposted and similar content on search engines, GDELT, and a plagiarism database. Searching by URL automatically parses the title and content, but may fail. Title and content can be specified using _title: and _content: @@ -595,7 +595,7 @@
Batch Content Search
url (full url, e.g. https://tech.cnn.com/article-title.html) OR titleQuery and contentQuery (text snippets). Download the - template. + template here.
@@ -622,7 +622,7 @@
Batch Content Search
{% else %}
-

Please log in to run batch searches.

+

Please log in or register to run batch searches. Contact us at info [at] securingdemocracy.org to obtain a registration code.

{% endif %}
@@ -763,7 +763,7 @@
Batch Metadata Search
{% else %}
-

Please log in to run batch searches.

+

Please log in or register to run batch searches. Contact us at info [at] securingdemocracy.org to obtain a registration code.

{% endif %}
@@ -848,7 +848,7 @@

Interpreting the Laundromat Results About page.

- Content Similarity - This tool compares headlines, content snippets, + Content Similarity: This tool compares headlines, content snippets, or URLs with search engines, databases, and plagiarism checkers to find similar texts. It filters out unrelated content and assigns a match score to gauge similarity. Scores of 50% or more typically mean a closer match, minimizing false positives. @@ -856,17 +856,17 @@

Interpreting the Laundromat Results
- +
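
Reviewer note on the match score wording touched in this diff: the About page says results are ranked by the Ratcliff/Obershelp algorithm, and the index page treats scores of 50% or more as closer matches. The sketch below illustrates that idea only. It assumes Python's standard-library `difflib.SequenceMatcher`, whose matching is closely related to Ratcliff/Obershelp "gestalt pattern matching"; the function names `match_score` and `is_probable_match`, the lowercasing step, and the way the threshold is applied are hypothetical and are not taken from the Laundromat codebase.

```python
from difflib import SequenceMatcher


def match_score(query: str, candidate: str) -> float:
    """Return a 0-100 similarity score between two text snippets.

    Hypothetical helper: uses difflib's SequenceMatcher, a close
    relative of the Ratcliff/Obershelp algorithm, not the tool's own code.
    """
    # autojunk=False disables difflib's popularity heuristic so long
    # snippets are compared character by character.
    matcher = SequenceMatcher(None, query.lower(), candidate.lower(), autojunk=False)
    return round(matcher.ratio() * 100, 1)


def is_probable_match(query: str, candidate: str, threshold: float = 50.0) -> bool:
    """Apply the 50% rule of thumb from the help text (threshold is an assumption)."""
    return match_score(query, candidate) >= threshold


if __name__ == "__main__":
    title = "Darkening prospects for Ukraine"
    repost = "Darkening prospects for Ukraine postwar"
    print(match_score(title, repost), is_probable_match(title, repost))
```

A score near 100 means the two snippets are nearly identical strings. As the About text stresses, a high score says nothing about where the text originated.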