MHTML, an initialism of "MIME encapsulation of aggregate HTML documents", is a Web archive file format used to combine, in a single computer file, the HTML code and its companion resources (such as images) that are represented by external hyperlinks in the web page's HTML code. The content of an MHTML file is encoded using the same techniques that were first developed for HTML email messages, using the MIME content type multipart/related. MHTML files use an .mhtml or .mht filename extension.
The first part of the file is an e-mail header. The second part is normally HTML code. Subsequent parts are additional resources identified by their original uniform resource locators (URLs) and usually encoded using base64 binary-to-text encoding. MHTML was proposed as an open standard, then circulated in a revised edition in 1999 as RFC 2557.
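The layout described above can be approximated with Python's standard email package. The following is a minimal sketch rather than a reference implementation: the function name build_mhtml, the subject text, and the shape of the resources argument are assumptions made for illustration, and real browsers write additional headers.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(page_url, html, images):
    """Return MHTML bytes for one HTML page plus its images.

    images is a list of (url, subtype, raw_bytes) tuples,
    e.g. ("https://example.com/a.png", "png", b"...").
    """
    # Top-level e-mail style header: multipart/related with a text/html root.
    archive = MIMEMultipart("related", type="text/html")
    archive["Subject"] = "Saved page"  # hypothetical title

    # Second part: the HTML itself, tagged with its original URL.
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url
    archive.attach(root)

    # Subsequent parts: resources identified by URL; MIMEImage
    # applies base64 transfer encoding by default.
    for url, subtype, data in images:
        part = MIMEImage(data, _subtype=subtype)
        part["Content-Location"] = url
        archive.attach(part)

    return archive.as_bytes()
```

Writing the returned bytes to a file with an .mhtml extension gives a file with the structure described here, although browsers may differ in how they render it.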
The .mhtml (Web archive) and .eml (email) filename extensions are interchangeable: either filename extension can be changed from one to the other. An .eml message can be sent by e-mail, and it can be displayed by an email client. An email message can be saved using a .mhtml or .mht filename extension and then opened for display in a web browser or for editing in other programs, including word processors and text editors.
Layout
The header of an MHTML file contains metadata such as a date and time stamp, page title, the source URL, and a unique randomized boundary string for separating resources contained within the file. The boundary string is defined at the beginning and used throughout the file.
From:
Snapshot-Content-Location: https://en.wikipedia.org/wiki/Smartphone
Subject: Smartphone - Wikipedia
Date: Sat, 24 Sep 2022 00:34:32 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
    type="text/html";
    boundary="----MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----"
The page resources then follow sequentially, starting with the page's rendered HTML source code. Each resource has its own metadata header, which specifies its MIME type and its original location.
------MultipartBoundary--GsIBda0vjy2AKIAIliwl7JMwezXDRjDAsLje9khd5l----
Content-Type: text/html
Content-ID:
Content-Transfer-Encoding: binary
Content-Location: https://en.wikipedia.org/wiki/Smartphone
The MHTML file ends with a closing boundary string (the boundary followed by two additional hyphens) that is not followed by any data.
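Because the layout is that of a MIME message, an MHTML file can be inspected with a generic MIME parser. The sketch below uses Python's standard email package to list each resource's MIME type, original location, and decoded size. The function name list_parts and the tiny SAMPLE archive (with the shortened boundary "----Example") are hypothetical and only illustrate the structure.

```python
import email

# Hypothetical two-part archive: an HTML page and one base64-encoded image.
SAMPLE = (
    b"Subject: Example\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; type="text/html"; boundary="----Example"\r\n'
    b"\r\n"
    b"------Example\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Location: https://example.com/\r\n"
    b"\r\n"
    b"<p>hi</p>\r\n"
    b"------Example\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: https://example.com/a.png\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"------Example--\r\n"  # closing boundary, no data after it
)


def list_parts(mhtml_bytes):
    """Return (mime_type, original_location, decoded_size) per resource."""
    msg = email.message_from_bytes(mhtml_bytes)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart/related container itself
        body = part.get_payload(decode=True)  # undoes base64 if present
        parts.append(
            (part.get_content_type(), part["Content-Location"], len(body))
        )
    return parts


for mime_type, location, size in list_parts(SAMPLE):
    print(mime_type, location, size)
```

The same function can be applied to the bytes of a saved .mhtml file, though files written by different browsers may carry extra or differently encoded parts.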
Browser support
Some browsers support the MHTML format, either directly or through third-party extensions, but the process for saving a web page along with its resources as an MHTML file is not standardized. As a result, a web page saved as an MHTML file using one browser may render differently in another.
Internet Explorer
Internet Explorer, starting with version 5.0, was the first browser to support reading and saving web pages and external resources to a single MHTML file.
Microsoft Edge
Since switching to the Chromium source code, Edge supports saving pages as MHTML.
Opera
Support for saving web pages as MHTML files was made available in the Opera 9.0 web browser. From Opera 9.50 through the rest of the Presto-based Opera product line (currently at Opera 12.16 as of 19 July 2013), the default format for saving pages is MHTML. The initial release of the new WebKit/Blink-based Opera (Opera 15) did not support MHTML, but subsequent releases (Opera 16 onwards) do.
MHTML can be enabled by typing "opera://flags#save-page-as-mhtml" in the address bar.
Google Chrome
Creating MHTML files in Google Chrome is enabled by default in version 86.
Yandex Browser
Creating MHTML (multipart/related) files in Yandex Browser is enabled by default in version 22.7.4.960 (July 2022).
Vivaldi
Like Google Chrome, the Chromium-based Vivaldi browser has been able to save web pages as MHTML files since the 2.3 release.
Reading and writing MHTML files is enabled by toggling the "vivaldi://flags/#save-page-as-mhtml" option.
Firefox
Mozilla Firefox does not support MHTML. Until the advent of version 57 ("Firefox Quantum"), MHT files could be read and written by installing a browser extension, such as Mozilla Archive Format or UnMHT.
Safari
As of version 3.1.1, Apple Inc.'s Safari web browser does not natively support the MHTML format. Instead, Safari supports the webarchive format, and the macOS version includes a print-to-PDF feature.
As with most other modern web browsers, support for MHTML files can be added to Safari via various third-party extensions.
Konqueror
As of version 3.5.7, KDE's Konqueror web browser does not support MHTML files. An extension project, mhtconv, can be used to allow saving and viewing of MHTML files.
ACCESS NetFront
NetFront 3.4 (on devices such as the Sony Ericsson K850) can view and save MHTML files.
Pale Moon
Pale Moon requires an extension to be installed to read and write MHT files. One freely available extension is MozArchiver, a fork of the Mozilla Archive Format extension.
GNOME Web
GNOME Web has supported reading and saving web pages in MHTML since version 3.14.1, released in September 2014.
MHT viewers
There are commercial software products for viewing MHTML files and converting them to other formats, such as PDF and ePub. Some HTML editor programs can view and edit MHTML files.
MIME type
The MIME type for MHTML is not well agreed upon. MIME types in use include:
* multipart/related
* application/x-mimearchive
* message/rfc822
Other apps
Problem Steps Recorder
Problem Steps Recorder for Windows can save its output to MHT format.
Save to Google Drive extension
The "Save to Google Drive" extension for Google Chrome can save pages as MHTML as one of its output formats.
Microsoft OneNote
Microsoft OneNote, starting with OneNote 2010, emails individual pages as .mht files.
Evernote
Evernote for Windows can export notes in MHT format, as an alternative to HTML or its own native .enex format.
Exploits
In May 2015, a researcher noted that attackers could build malicious documents by creating an MHT file, appending an MSO object at the end (MSO is a file format used by the Microsoft Outlook e-mail application), and renaming the resulting file with a .doc extension. The delivery method would be by spam emails.
In April 2019, a security researcher published details about an XML external entity (XXE) vulnerability that could be exploited when a user opens an MHT file. Since Windows is set by default to open all MHT files in Internet Explorer, the exploit could be triggered when a user double-clicked on a file that they received via email, instant messaging, or another vector, including a different browser.
See also
* data URI scheme
* Mozilla Archive Format
* Mpack (Unix)
* Webarchive
* Web ARChive
External links
* MHTML standard explained
* RFC 2557 (1999): MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
* RFC 2110 (1997, Obsolete): MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)