Anna's Archive is an open source search engine for shadow libraries that was launched by Anna shortly after law enforcement efforts to shut down Z-Library in 2022. The site aggregates records from major shadow libraries including Z-Library, Sci-Hub, and Library Genesis, among other sources. It calls itself "the largest truly open library in human history", and has said it aims to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form". It claims not to be responsible for downloads of copyrighted materials, since the site indexes metadata but does not directly host any files, instead linking to third-party downloads. However, it has faced government blocks and legal action from publishers and publishing trade associations for engaging in large-scale copyright infringement.


Origins

Anna's Archive emerged from the Pirate Library Mirror (PiLiMi) project, an anonymous effort to mirror shadow libraries that completed a full copy of Z-Library in September 2022. PiLiMi acknowledged that it "deliberately violated the copyright law in most countries", and its initial focus was on preservation rather than on making its data searchable. Days after US law enforcement seized several Z-Library domains and arrested its alleged operators in November 2022, PiLiMi member Anna (also known as Anna Archivist) launched Anna's Archive, which initially displayed results from Z-Library and Library Genesis.


Website and operations

Anna's Archive has been variously described as a search engine, a metasearch engine, and a shadow library itself. The site does not itself host any files, but it links to third-party downloads provided by anonymous partners. It also offers downloads through the IPFS protocol. Its source code is dedicated to the public domain under the CC0 license, and its data is released in bulk with torrent files so as to make it resilient to website takedowns. It operates three mirrors under different top-level domains, currently .li, .se, and .org. Anna's Archive includes 51,064,327 books and 98,551,617 papers, and its unified list of torrents totals roughly 1.1 petabytes in size. In March 2025, it averaged over 650,000 daily downloads, roughly 10 times the estimated daily distribution of the New York Public Library. It lists Library Genesis, Sci-Hub, Z-Library, the Internet Archive, DuXiu, MagzDB, Nexus/STC, and HathiTrust among its "source libraries"; Open Library, WorldCat, and Google Books are listed as metadata-only sources. Some of these datasets are already publicly accessible, while others are scraped or otherwise privately acquired for distribution.


Finances

High-speed downloads on Anna's Archive are only available to users with a paid membership, while nonmembers must use slower options with browser verification to prevent abuse by bots. It describes itself as a nonprofit, claiming that membership fees and donations are mostly spent on server infrastructure and that none are personally used by the site's operators. Memberships and monetary rewards are given to some volunteer contributors. Anna's Archive offers high-speed access to its full collection via SFTP to groups training large language models in exchange for large contributions of money or data. It said it provided such access to about 30 companies as of January 2025, primarily based in China, including both LLM companies and data brokers. DeepSeek's VL model was trained on data from the site.


Motivation

The project's stated goal is to continue the preservation and access missions of earlier initiatives like Z-Library and Sci-Hub. Anna cites programmer and information activist Aaron Swartz as a key inspiration for the project, and has said that they and other shadow librarians believe that "information wants to be free". The project's team identifies as "ideologues" who believe that preserving and hosting their collection of over 140 million copyrighted works is morally right, especially as traditional libraries face funding cuts and corporate digital archives are deemed untrustworthy for preserving humanity's heritage. The site recommends Swartz's writings as well as Stephen Witt's How Music Got Free and Michele Boldrin and David K. Levine's Against Intellectual Monopoly, which criticize existing copyright law and have been associated with the copyleft movement.

Anna reports that while most US-based AI companies were hesitant to use the archive's illegal collection for training data, Chinese firms "have enthusiastically embraced" it, with one of them being DeepSeek. This has led Anna to frame copyright reform as a matter of national security, arguing that Western countries risk falling behind in the AI race if they do not create legal carve-outs for the mass-preservation and use of texts for purposes like AI training.

Anna's Archive has been described as greatly expanding the ambitions of earlier shadow libraries with its vision of a "universal library" that preserves as many books as possible. It has also been placed in the context of an ascendant "culture of mistrust towards corporations, institutions, governments, and laws... that perhaps began with the financial collapse of 2008 and the Occupy Wall Street movements", one which led to the rise of decentralizing technologies like shadow libraries and cryptocurrency.


Site blocks and legal issues


United States

Since 2023, Anna's Archive domains have appeared in the annual Notorious Markets List of the Office of the United States Trade Representative, which highlights digital and physical markets allegedly involved in large-scale intellectual property infringement. These reports describe the site as related to Sci-Hub and Library Genesis. In response to a request for comment by the Office on its 2023 List, the Association of American Publishers identified Anna's Archive as an infringing site, and analyzed its cryptocurrency wallets to find that it had received over $29,000 in funds as of July 2023.


OCLC lawsuit

In October 2023, Anna's Archive was reported to have scraped the entirety of WorldCat, the world's largest bibliographic database, and made its proprietary data freely available, which Anna described as "a major milestone in mapping out all the books in the world". OCLC, WorldCat's maintainer, responded by filing a lawsuit against the site in an Ohio federal court in January 2024, claiming the scrape was achieved through cyberattacks on its servers. It sought over $5 million in total damages and an injunction to stop Anna's Archive from scraping or sharing its data. OCLC clarified that although its internal systems were not breached, it believed the site's actions legally constituted hacking. The only named defendant denied any involvement with Anna's Archive or the scrape. Technology writer Glyn Moody criticized the suit as "costly and pointless", saying it went against OCLC's stated mission of making information accessible.

In July 2024, in the wake of the suit, the .org mirror of Anna's Archive was replaced with a new .gs mirror to avoid falling under US jurisdiction; however, soon afterward, the .gs domain was suspended and the mirror reverted to the original .org domain. In March 2025, the court deferred judgement on aspects of the case to the Supreme Court of Ohio over concerns about its legal novelty, denying both a motion for default judgement from OCLC and a motion to dismiss from the named defendant. In April 2025, OCLC reached an agreement with the named defendant to drop her from the case, focusing instead on obtaining judgement against the site itself.


Nvidia lawsuit

In March 2024, a group of authors filed a lawsuit against Nvidia in a California federal court for allegedly training its generative AI platform NeMo on the Books3 dataset, which includes copyrighted data from several shadow libraries, Anna's Archive among them. In its response, the company disputed the characterization of these sites as shadow libraries, despite Anna's own use of the term.


Meta lawsuit

In February 2025, internal emails unsealed in a lawsuit against Meta in a California court, which alleges that the company trained its AI models on copyrighted works, revealed that the company had downloaded over 81 terabytes of data through Anna's Archive torrents, in addition to data previously downloaded from Library Genesis. The plaintiffs in the case, a group of authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, alleged that CEO Mark Zuckerberg personally authorized the use of shadow libraries. The company had argued that its use of copyrighted data in AI training constituted fair use.


Netherlands

In March 2024, the Rotterdam District Court ordered major internet service providers in the Netherlands to block Anna's Archive and Library Genesis at the request of advocacy group BREIN. The order was "dynamic", meaning that if the blocked sites changed domains or IP addresses in the future, ISPs would be obligated to update their blocks.


Italy

In January 2024, Italy's national communications agency ordered ISPs in the country to block Anna's Archive due to a copyright complaint by the Italian Publishers Association. An investigation by the country's Digital Services Directorate confirmed the presence of copyrighted material on the site and found that some of its servers were likely owned by a Ukrainian hosting provider, but failed to uncover the identity of its operators.


United Kingdom

In December 2024, the UK Publishers Association won an order from the High Court of Justice requiring major ISPs to block Anna's Archive and other copyright-infringing sites, extending a list of sites blocked since 2015 under section 97A of the Copyright, Designs and Patents Act. The Association said it identified over one million records of copyrighted books and journal articles on Anna's Archive domains.


Other issues

Anna's Archive was among Google Search's ten most reported domains for DMCA takedown as of June 2024. It has been one of the most targeted sites of Dutch anti-piracy service Link-Busters, which sends takedown notices to Google and other search engines on behalf of major publishers. In January 2025, the messaging app Telegram suspended the channel for Anna's Archive due to copyright infringement, despite the operators reportedly taking precautions to avoid infringing posts on the app. Z-Library's Telegram channel was suspended the same week, and neither was notified of the action. The removals were speculated to be linked to legal action by an Indian court.


External links

* Official blog
* PiLiMi team website (via Wayback Machine; live redirects t
* Datasets - Anna's Archive