Web Archive (stylized by Apple as Web archive, extension .webarchive) is a web archive file format available on macOS and Windows for saving and reviewing complete web pages using the Safari web browser. The Web Archive format differs from a standalone HTML file because it also saves linked files such as images, CSS, and JavaScript. The Web Archive format is a concatenation of source files with filenames saved in the binary plist format using NSKeyedArchiver. Support for Web Archive documents was added in Safari 4 Beta on Windows and was included in subsequent versions until Safari for Windows was discontinued in 2012. Safari on iOS and iPadOS (iPhone and iPad) has supported Web Archive files since at least iOS 13. Previously, a third-party iOS app called Web Archive Viewer provided this functionality.
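
Because the container is a binary property list, generic plist tooling can inspect it. The following is a minimal sketch using Python's standard plistlib module. The key names (WebMainResource, WebSubresources, WebResourceURL, WebResourceMIMEType, WebResourceData) are assumptions based on commonly observed Safari output and are not stated in this article; the sample archive and its example.com URLs are hypothetical and built in memory purely for illustration.

```python
import plistlib

# Assumed key names (commonly observed in Safari output, not taken
# from this article).
MAIN_KEY = "WebMainResource"
SUB_KEY = "WebSubresources"


def list_resources(archive_bytes):
    """Return (url, mime_type, size_in_bytes) for each resource in the archive."""
    root = plistlib.loads(archive_bytes)  # detects binary or XML plists
    resources = [root[MAIN_KEY]] + list(root.get(SUB_KEY, []))
    return [
        (
            res.get("WebResourceURL"),
            res.get("WebResourceMIMEType"),
            len(res.get("WebResourceData", b"")),
        )
        for res in resources
    ]


def make_sample_archive():
    """Build a tiny, hypothetical archive in memory for demonstration."""
    root = {
        MAIN_KEY: {
            "WebResourceURL": "https://example.com/",
            "WebResourceMIMEType": "text/html",
            "WebResourceData": b"<html><body><img src='logo.png'></body></html>",
        },
        SUB_KEY: [
            {
                "WebResourceURL": "https://example.com/logo.png",
                "WebResourceMIMEType": "image/png",
                "WebResourceData": b"\x89PNG",  # placeholder bytes, not a real image
            }
        ],
    }
    return plistlib.dumps(root, fmt=plistlib.FMT_BINARY)


if __name__ == "__main__":
    for url, mime, size in list_resources(make_sample_archive()):
        print(url, mime, size)
```

To inspect a real file, the bytes read from a .webarchive file can be passed to list_resources in place of the sample; archives that contain nested frames may carry additional keys that this sketch ignores.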


Usage

* A version of the Web Archive format is used to bundle whole music albums and movies with extra content and menus inside iTunes LP and Extras.
* Web Archive files were automatically generated for ads submitted to Apple's iAd advertising platform.
* The WebKit framework's WebArchive class is used to simplify cutting and pasting of whole or partial web pages.


Vulnerability

In February 2013, a vulnerability in the Web Archive format was discovered and reported by Joe Vennix, a Metasploit Project developer. The exploit allows an attacker to send a user a crafted Web Archive containing code that accesses cookies, local files, and other data. Apple responded that it would not fix the bug, most likely because exploiting it requires the user to open the file.


Converting for other browsers

A Web Archive can be made viewable in other browsers, although the contents of a particular page may hinder the process. Doing so requires one of the free tools WebArchive Folderizer (for OS X 10.2 and higher) or WebArchive Extractor (for OS X 10.4.3 and higher). Web Archive files can also be converted to WARC using the National Library of Norway's Warchaeology set of tools.
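
The kind of extraction such tools perform can be approximated in a few lines. The sketch below is not the implementation of WebArchive Folderizer, WebArchive Extractor, or Warchaeology; it only illustrates unpacking the resources into a folder that any browser can open. It assumes the property-list key names WebMainResource, WebSubresources, WebResourceURL, and WebResourceData, which are not given in this article, and the function name extract_archive is hypothetical.

```python
import plistlib
import posixpath
import re
from pathlib import Path
from urllib.parse import urlsplit


def extract_archive(archive_bytes, out_dir):
    """Write every resource in a Web Archive to out_dir; return the file paths."""
    root = plistlib.loads(archive_bytes)
    # Assumed key names; see the note above the code.
    resources = [root["WebMainResource"]] + list(root.get("WebSubresources", []))

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for index, res in enumerate(resources):
        url_path = urlsplit(res.get("WebResourceURL", "")).path
        # Keep only the last path component and replace unusual characters,
        # so a crafted URL cannot place files outside out_dir.
        name = posixpath.basename(url_path) or "index.html"
        name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
        # The numeric prefix keeps names unique when two URLs share a basename.
        target = out / f"{index:03d}_{name}"
        target.write_bytes(res.get("WebResourceData", b""))
        written.append(target)
    return written
```

Links inside the extracted HTML still point at the original URLs, so a faithful conversion would also need to rewrite them to the local file names; that step is omitted here.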


Alternatives

MAFF is an open format (with a published specification) that enables saving of whole webpages in a single file. It is currently supported by Firefox, using an extension. Other web browsers use the MHTML format or do the equivalent by saving a directory of inline resources (usually images) alongside the HTML file, sometimes compressed, like the .war format used by Konqueror (tar+gzip or tar+bzip2). Safari does not support these alternative archive formats. For archiving entire websites, the Internet Archive has developed the Web ARChive (WARC) format, which was standardized by ISO. HTMLD (HTML Directory) is a NeXT-developed format for saving web pages and their dependencies in a bundle that may also be served by a web server. Chrome offers the "webpage, complete" format, which saves the page with a folder containing the required resources.


See also

* Web archiving – the general process of archiving web pages
* List of web archiving file formats – file formats for archiving web pages

