archive.today (formerly archive.is) is a web archiving website that saves snapshots on demand. It has support for JavaScript-heavy sites such as Google Maps and Twitter. Archive.today records two snapshots: one replicates the original webpage, including any functional live links; the other is a screenshot of the page.

History

Archive.today was founded in 2012. The site originally branded itself as archive.today, but changed the primary mirror to archive.is in May 2015. It began to deprecate the archive.is domain in favor of other mirrors in January 2019. In 2021, archive.today had saved about 500 million pages.

Features

Archive.today can capture individual pages in response to explicit user requests. Since its beginning, it has supported crawling pages with URLs containing the now-deprecated hash-bang fragment (#!). Archive.today records only text and images, excluding XML, RTF, spreadsheet (xls or ods) and other non-static content. However, videos from certain sites, such as X (formerly Twitter), are saved. It keeps track of the history of snapshots saved, requesting confirmation before adding a new snapshot of an already saved page.

Pages are captured at a browser width of 1,024 pixels. CSS is converted to inline CSS, removing responsive web design and selectors such as :hover and :active. Content generated using JavaScript during the crawling process appears in a frozen state.

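
The effect of capturing at a fixed width and inlining CSS can be illustrated with a toy model. The sketch below is not archive.today's code; the rule format, the function name freeze_rules, and the sample rules are all assumptions made for illustration. It shows why interaction selectors and responsive rules do not survive: a media query is resolved once at the capture width, and states such as :hover cannot be expressed in an inline style.

```python
CAPTURE_WIDTH = 1024  # browser width stated in the article, in pixels


def freeze_rules(rules, width=CAPTURE_WIDTH):
    """Return the (selector, style) pairs that survive a fixed-width capture.

    Each rule is a dict with "selector" and "style" keys, plus optional
    "min_width" / "max_width" keys standing in for a media query.
    This rule format is hypothetical and exists only for this example.
    """
    frozen = []
    for rule in rules:
        selector = rule["selector"]
        # Interaction states cannot be written in an inline style attribute.
        if ":hover" in selector or ":active" in selector:
            continue
        lo = rule.get("min_width", 0)
        hi = rule.get("max_width", float("inf"))
        # A media query is resolved once, at capture time, then discarded.
        if lo <= width <= hi:
            frozen.append((selector, rule["style"]))
    return frozen


sample_rules = [
    {"selector": "a", "style": "color: blue"},
    {"selector": "a:hover", "style": "color: red"},
    {"selector": "nav", "style": "display: none", "max_width": 600},
    {"selector": "nav", "style": "display: flex", "min_width": 601},
]
print(freeze_rules(sample_rules))
```

In this model only the plain link rule and the wide-screen navigation rule remain, so the snapshot looks the same at every window size and no longer reacts to the pointer.
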
HTML class names are preserved inside the old-class attribute. When text is selected, a JavaScript applet generates a URL fragment, shown in the browser's address bar, that automatically highlights that portion of the text when the page is visited again.

Web pages can be duplicated from archive.today to web.archive.org as a second-level backup, but archive.today does not save its snapshots in WARC format. The reverse, from web.archive.org to archive.today, is also possible, but the copy usually takes more time than a direct capture.

Historically, website owners had the option to opt out of the Wayback Machine through the use of the robots exclusion standard (robots.txt), and these exclusions were also applied retroactively. Archive.today does not obey robots.txt because it acts "as a direct agent of the human user." As of 2019, the Wayback Machine also no longer obeys robots.txt.

The search toolbar supports advanced keyword operators, using * as the wildcard character. A pair of quotation marks restricts the search to an exact sequence of keywords present in the title or in the body of the webpage, whereas the insite operator restricts it to a specific Internet domain. The search feature is backed by Google CustomSearch. If it delivers no results, archive.today attempts to use Yandex Search. When a dynamic list is saved, the archive.today search box shows only a result that links to the previous and following sections of the list (e.g. 20 links per page). The other saved web pages are filtered out, and can sometimes be found through one of their occurrences.

Once a web page is archived, it cannot be deleted directly by any Internet user. Advertisements and popups can be removed from archived pages, and links expanded, by asking the owner to do so on his blog.

While a page is being saved, a list of URLs for individual page elements is shown, along with their content sizes, HTTP statuses and MIME types. This list can only be viewed during the crawling process. Users can download archived pages as a ZIP file, except pages archived since archive.today changed its browser engine from PhantomJS to Chromium (non-headless).

In July 2013, Archive.today began supporting the API of the Memento Project.
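
The Memento protocol (RFC 7089) lets a client ask an archive for the snapshot closest to a given date by sending an Accept-Datetime header to a "TimeGate" endpoint. The following is a minimal sketch of how such a request could be prepared. The TimeGate base URL is a placeholder rather than archive.today's documented endpoint, and the function name memento_request is an assumption made for illustration; no network request is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical TimeGate base URL, used only for illustration.
TIMEGATE_BASE = "https://archive.example/timegate/"


def memento_request(target_url, when):
    """Return (timegate_url, headers) for a Memento datetime negotiation.

    `when` must be a timezone-aware datetime; it is converted to UTC and
    formatted as an HTTP date ending in "GMT", as Accept-Datetime expects.
    """
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    # Assumed convention: the original URL is appended to the TimeGate base.
    return TIMEGATE_BASE + target_url, headers


url, headers = memento_request(
    "http://example.com/", datetime(2013, 7, 4, tzinfo=timezone.utc)
)
print(url)
print(headers)
```

A TimeGate that implements the protocol would normally answer such a request with a redirect to the snapshot nearest the requested date.
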

Worldwide availability

Australia and New Zealand

In March 2019, the site was blocked for six months by several internet providers in Australia and New Zealand in the aftermath of the Christchurch mosque shootings, in an attempt to limit distribution of the footage of the attack.

China

According to GreatFire.org, archive.today has been blocked in mainland China, as have the mirrors archive.li, archive.fo and archive.ph.

Finland

On 21 July 2015, the operators blocked access to the service from all Finnish IP addresses, stating on Twitter that they did this in order to avoid escalating a dispute they allegedly had with the Finnish government.

Russia

In 2016, the Russian communications agency Roskomnadzor began blocking access to archive.is from Russia.

Cloudflare DNS availability

Starting in May 2018, Cloudflare's 1.1.1.1 DNS service would not resolve archive.today's web addresses, making the site inaccessible to users of the Cloudflare DNS service. Both organizations claimed the other was responsible for the issue. Cloudflare staff stated that the problem lay in archive.today's DNS infrastructure, as its authoritative nameservers returned invalid records when Cloudflare's network systems made requests to archive.today. Archive.today countered that the issue was due to Cloudflare's requests not being compliant with DNS standards, as Cloudflare does not send EDNS Client Subnet information in its DNS requests.
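
EDNS Client Subnet (described in RFC 7871) allows a recursive resolver to pass a truncated form of the client's address to an authoritative nameserver, so that the answer can be tailored to the client's approximate network location. As a rough illustration of what that information looks like, the sketch below truncates a client address to a network prefix. The function name ecs_prefix and the prefix lengths of 24 and 56 bits are assumptions chosen for the example, and the addresses are reserved documentation addresses.

```python
import ipaddress


def ecs_prefix(client_ip, v4_bits=24, v6_bits=56):
    """Return the network prefix a resolver might forward for a client.

    The host bits are zeroed, so only the approximate network is revealed.
    The default prefix lengths are illustrative assumptions.
    """
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False permits a host address and masks off the low bits.
    return ipaddress.ip_network(f"{addr}/{bits}", strict=False)


print(ecs_prefix("203.0.113.57"))
print(ecs_prefix("2001:db8:1234:5678::1"))
```

A resolver that omits this option, as Cloudflare's is described as doing, reveals only its own address to the authoritative server.
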

External links

* FAQ at Archive.today
* archive.today at Archive Team wiki
* "archive.today: On the trail of the mysterious guerrilla archivist of the Internet", Gyrovague, 5 August 2023