Memcache

Memcached (pronounced variously /mɛmkæʃˈdiː/ ''mem-cash-dee'' or /ˈmɛmkæʃt/ ''mem-cashed'') is a general-purpose distributed memory-caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read. Memcached is free and open-source software, licensed under the Revised BSD license. Memcached runs on Unix-like operating systems (Linux and macOS) and on Microsoft Windows. It depends on the libevent library.

Memcached's APIs provide a very large hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order. Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.

Memcached has no internal mechanism to track cache misses. However, some third-party utilities provide this functionality.

Memcached was first developed by
Brad Fitzpatrick for his website LiveJournal, on May 22, 2003. It was originally written in Perl, then later rewritten in C by Anatoly Vorobey, then employed by LiveJournal. Memcached is now used by many other systems, including YouTube, Reddit, Facebook, Pinterest, Twitter, Wikipedia, and Method Studios. Google App Engine, Google Cloud Platform, Microsoft Azure, IBM Bluemix and Amazon Web Services also offer a Memcached service through an API.
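
The least recently used (LRU) purge order mentioned in the introduction can be illustrated with a small Python sketch. This is a toy model, not Memcached's actual implementation: the `LRUCache` class and its entry-count `capacity` are assumptions made for brevity, whereas a real server is limited by memory rather than by number of entries.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used first, newest last

    def get(self, key):
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # purge the least recently used entry


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recently used than "b"
cache.set("c", 3)  # the table is full, so "b" is purged
```

After the last call, a lookup of "b" misses while "a" and "c" are still present, which is why applications must always be prepared to fall back on the backing store.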


Software architecture

The system uses a client–server architecture. The servers maintain a key–value associative array; the clients populate this array and query it by key. Keys are up to 250 bytes long and values can be at most 1 megabyte in size.

Clients use client-side libraries to contact the servers which, by default, expose their service at port 11211. Both TCP and UDP are supported. Each client knows all servers; the servers do not communicate with each other. If a client wishes to set or read the value corresponding to a certain key, the client's library first computes a hash of the key to determine which server to use. This gives a simple form of sharding and a scalable shared-nothing architecture across the servers. The server computes a second hash of the key to determine where to store or read the corresponding value.

The servers keep the values in RAM (and, starting in 1.6.0, in auxiliary cache on disk using an external storage server option); if a server runs out of available memory or disk, it discards the oldest values. Therefore, clients must treat Memcached as a transitory cache; they cannot assume that data stored in Memcached is still there when they need it. Other databases, such as MemcacheDB and Couchbase Server, provide persistent storage while maintaining Memcached protocol compatibility.

If all client libraries use the same hashing algorithm to determine servers, then clients can read each other's cached data.

A typical deployment has several servers and many clients. However, it is possible to use Memcached on a single computer, acting simultaneously as client and server. The size of its hash table is often very large. It is limited to the available memory across all the servers in the cluster of servers in a data center. Where high-volume, wide-audience Web publishing requires it, this may stretch to many gigabytes. Memcached can be equally valuable for situations where either the number of requests for content is high, or the cost of generating a particular piece of content is high.

Applications with particularly high-demand caching needs can use a built-in proxy to define and configure complex client-server routes.
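
The client-side server selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the server addresses are hypothetical, and the CRC32-modulo scheme is only one possible choice, since real client libraries may use other hash functions or consistent hashing.

```python
import zlib

# Hypothetical server list; every client is assumed to be configured with the same one.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

MAX_KEY_BYTES = 250  # key length limit stated in the text


def pick_server(key, servers=SERVERS):
    """Map a key to one server using a hash of the key modulo the server count."""
    raw = key.encode("utf-8")
    if len(raw) > MAX_KEY_BYTES:
        raise ValueError("key exceeds 250 bytes")
    return servers[zlib.crc32(raw) % len(servers)]


for key in ("userrow:1", "userrow:2", "userrow:3"):
    print(key, "->", pick_server(key))
```

Because the mapping depends only on the key and the server list, two clients that share both will always contact the same server for a given key, without the servers ever talking to each other.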


Security

Most deployments of Memcached are within trusted networks where clients may freely connect to any server. However, sometimes Memcached is deployed in untrusted networks or where administrators want to exercise control over the clients that are connecting. For this purpose Memcached can be compiled with optional SASL authentication support. The SASL support requires the binary protocol.

A presentation at BlackHat USA 2010 revealed that a number of large public websites had left Memcached open to inspection, analysis, retrieval, and modification of data.

Even within a trusted organisation, the flat trust model of Memcached may have security implications. For efficient simplicity, all Memcached operations are treated equally. Clients with a valid need for access to low-security entries within the cache gain access to ''all'' entries within the cache, even when these are higher-security and that client has no justifiable need for them. If the cache key can be predicted, guessed or found by exhaustive searching, its cache entry may be retrieved.

Some attempt to isolate setting and reading data may be made in situations such as high-volume web publishing. A farm of outward-facing content servers has ''read'' access to Memcached containing published pages or page components, but no write access. Where new content is published (and is not yet in Memcached), a request is instead sent to content generation servers that are not publicly accessible to create the content unit and add it to Memcached. The content server then retries the retrieval and serves the content outwards.


Used as a DDoS attack vector

In February 2018, Cloudflare reported that misconfigured Memcached servers were used to launch DDoS attacks on a large scale. The Memcached protocol over UDP has a huge amplification factor, of more than 51000. Victims of the DDoS attacks include GitHub, which was flooded with 1.35 Tbit/s peak incoming traffic. This issue was mitigated in Memcached version 1.5.6, which disabled the UDP protocol by default.
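
An amplification factor is the ratio of the bytes a server sends to the victim to the bytes the attacker sends to the server. The short Python calculation below shows how a factor above 51000 can arise. The two packet sizes are illustrative assumptions chosen to land near the reported figure; they are not given in the text above.

```python
request_bytes = 15           # assumed size of the spoofed UDP request
response_bytes = 750 * 1024  # assumed size of the reply sent to the victim

amplification = response_bytes / request_bytes
print(f"amplification factor: {amplification:,.0f}x")
```

With these assumed sizes, each byte sent by the attacker results in roughly 51,200 bytes arriving at the victim.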


Example code

''Note that all functions described on this page are pseudocode only. Memcached calls and programming languages may vary based on the API used.''

Converting database or object creation queries to use Memcached is simple. Typically, when using straight database queries, example code would be as follows:

    function get_foo(int userid)
        data = db_select("SELECT * FROM users WHERE userid = ?", userid)
        return data

After conversion to Memcached, the same call might look like the following:

    function get_foo(int userid)
        /* first try the cache */
        data = memcached_fetch("userrow:" + userid)
        if not data
            /* not found : request database */
            data = db_select("SELECT * FROM users WHERE userid = ?", userid)
            /* then store in cache until next get */
            memcached_add("userrow:" + userid, data)
        end
        return data

The client would first check whether a Memcached value with the unique key "userrow:userid" exists, where userid is some number. If the result does not exist, it would select from the database as usual, and set the unique key using the Memcached API add function call.

However, if only this API call were modified, the server would end up fetching incorrect data following any database update actions: the Memcached "view" of the data would become out of date. Therefore, in addition to creating an "add" call, an update call would also be needed using the Memcached set function.

    function update_foo(int userid, string dbUpdateString)
        /* first update database */
        result = db_execute(dbUpdateString)
        if result
            /* database update successful : fetch data to be stored in cache */
            data = db_select("SELECT * FROM users WHERE userid = ?", userid)
            /* the previous line could also look like data = createDataFromDBString(dbUpdateString) */
            /* then store in cache until next get */
            memcached_set("userrow:" + userid, data)
        end

This call would update the currently cached data to match the new data in the database, assuming the database query succeeds.
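
For readers who prefer something runnable, the read and update paths can be sketched in Python. This is an illustration under stated assumptions, not a real client: `FakeMemcached` is a hypothetical dict-backed stand-in for a Memcached client, an in-memory SQLite table stands in for the database, and `update_foo` takes a new name rather than a raw SQL string.

```python
import sqlite3


class FakeMemcached:
    """Dict-backed stand-in for a Memcached client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def add(self, key, value):
        # Like "add": store only if the key is not already present.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def set(self, key, value):
        # Like "set": store unconditionally, overwriting any old value.
        self._data[key] = value
        return True


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
cache = FakeMemcached()


def get_foo(userid):
    key = f"userrow:{userid}"
    data = cache.get(key)         # first try the cache
    if data is None:              # miss: fall back to the database
        data = db.execute(
            "SELECT userid, name FROM users WHERE userid = ?", (userid,)
        ).fetchone()
        if data is not None:
            cache.add(key, data)  # store for later reads
    return data


def update_foo(userid, name):
    cur = db.execute("UPDATE users SET name = ? WHERE userid = ?", (name, userid))
    if cur.rowcount:              # a row changed: refresh the cached copy
        data = db.execute(
            "SELECT userid, name FROM users WHERE userid = ?", (userid,)
        ).fetchone()
        cache.set(f"userrow:{userid}", data)


print(get_foo(1))     # served from the database, then cached
update_foo(1, "bob")  # database and cache are updated together
print(get_foo(1))     # served from the cache with the new value
```

Without the `cache.set` call in `update_foo`, the second read would return the stale row that the first read had cached.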
An alternative approach would be to invalidate the cache with the Memcached delete function, so that subsequent fetches result in a cache miss. Similar action would need to be taken when database records were deleted, to maintain either a correct or incomplete cache.

Another cache-invalidation strategy is to store a random number in an agreed-upon cache entry and to incorporate this number into all keys that are used to store a particular kind of entry. To invalidate all such entries at once, change the random number. Existing entries (which were stored using the old number) will no longer be referenced and so will eventually expire or be recycled.

    function store_xyz_entry(int key, string value)
        /* Retrieve the random number - use zero if none exists yet.
         * The key-name used here is arbitrary. */
        seed = memcached_fetch(":xyz_seed:")
        if not seed
            seed = 0
        /* Build the key used to store the entry and store it.
         * The key-name used here is also arbitrary. Notice that the "seed" and the user's "key"
         * are stored as separate parts of the constructed hashKey string: ":xyz_data:(seed):(key)."
         * This is not mandatory, but is recommended. */
        string hashKey = sprintf(":xyz_data:%d:%d", seed, key)
        memcached_set(hashKey, value)

    /* "fetch_entry," not shown, follows identical logic to the above. */

    function invalidate_xyz_cache()
        existing_seed = memcached_fetch(":xyz_seed:")
        /* Coin a different random seed */
        do
            seed = rand()
        until seed != existing_seed
        /* Now store it in the agreed-upon place. All future requests will use this number.
         * Therefore, all existing entries become un-referenced and will eventually expire. */
        memcached_set(":xyz_seed:", seed)
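
The seed-based invalidation strategy can also be sketched as runnable Python. As an assumption for illustration, a plain dict stands in for the Memcached client, and the helper names `_data_key` and `fetch_xyz_entry` are hypothetical. Unlike a real server, the dict never evicts anything, so the orphaned entry simply stays in place.

```python
import random

cache = {}  # plain dict standing in for a Memcached client (assumption)

SEED_KEY = ":xyz_seed:"  # agreed-upon entry that holds the current seed


def _data_key(key):
    seed = cache.get(SEED_KEY, 0)  # use zero if no seed exists yet
    return f":xyz_data:{seed}:{key}"


def store_xyz_entry(key, value):
    cache[_data_key(key)] = value


def fetch_xyz_entry(key):
    return cache.get(_data_key(key))


def invalidate_xyz_cache():
    existing_seed = cache.get(SEED_KEY, 0)
    seed = existing_seed
    while seed == existing_seed:   # coin a different random seed
        seed = random.randrange(1, 2**31)
    cache[SEED_KEY] = seed         # all later keys are built from the new seed


store_xyz_entry(42, "hello")
before = fetch_xyz_entry(42)  # found under the old seed
invalidate_xyz_cache()
after = fetch_xyz_entry(42)   # the old entry is no longer referenced
print(before, after)
```

The design trades memory for speed: invalidation is a single write regardless of how many entries exist, at the cost of leaving unreferenced entries behind until the cache recycles them.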


Usage

* MySQL - directly supports the Memcached API as of version 5.6.
* Oracle Coherence - directly supports the Memcached API as of version 12.1.3.
* Infinispan - directly supports Memcached.


See also

* Amazon ElastiCache
* Aerospike
* Couchbase Server
* Redis
* Mnesia
* MemcacheDB
* Hazelcast
* Cassandra
* ScyllaDB
* Tarantool
* Ehcache
* Infinispan

