An oblivious RAM (ORAM) simulator is a compiler that transforms an algorithm in such a way that the resulting algorithm preserves the input-output behavior of the original algorithm, but the distribution of the memory access patterns of the transformed algorithm is independent of the memory access pattern of the original algorithm.
The use of ORAMs is motivated by the fact that an adversary can obtain nontrivial information about the execution of a program and the nature of the data that it is dealing with, just by observing the pattern in which various locations of memory are accessed during its execution. An adversary can get this information even if the data values are all encrypted.
This definition applies equally well to the setting of protected programs running on unprotected shared memory and to that of a client running a program on its own system while accessing previously stored data on a remote server. The concept was formulated by Oded Goldreich and Rafail Ostrovsky in 1996.
Definition
A Turing machine (TM), the mathematical abstraction of a real computer (program), is said to be oblivious if, for any two inputs of the same length, the motions of the tape heads remain the same. Pippenger and Fischer proved that every TM with running time $T(n)$ can be made oblivious and that the running time of the oblivious TM is $O(T(n) \log T(n))$. A more realistic model of computation is the RAM model. In the RAM model of computation, there is a CPU that can execute the basic mathematical, logical and control instructions. The CPU is also associated with a few registers and a physical random access memory, where it stores the operands of its instructions. In addition, the CPU has instructions to read the contents of a memory cell and to write a specific value to a memory cell. The definition of ORAMs captures a similar notion of obliviousness for memory accesses in this model.
Informally, an ORAM is an algorithm at the interface of a protected CPU and the physical RAM such that it acts like a RAM to the CPU by querying the physical RAM for the CPU, while hiding information about the actual memory access pattern of the CPU from the physical RAM. In other words, the distributions of memory accesses of any two programs that make the same number of memory accesses to the RAM are indistinguishable from each other. This description still makes sense if the CPU is replaced by a client with small storage and the physical RAM is replaced by a remote server with large storage capacity, where the data of the client resides.
The following is a formal definition of ORAMs.
Let $\Pi$ denote a program requiring a memory of size $n$ when executing on an input $x$. Suppose that $\Pi$ has instructions for basic mathematical and control operations in addition to two special instructions $\mathsf{read}(l)$ and $\mathsf{write}(l, v)$, where $\mathsf{read}(l)$ reads the value at location $l$ and $\mathsf{write}(l, v)$ writes the value $v$ to $l$. The sequence of memory cells accessed by a program $\Pi$ during its execution is called its memory access pattern and is denoted by $\tilde{\Pi}(n, x)$.
A polynomial-time algorithm $C$ is an Oblivious RAM (ORAM) compiler with computational overhead $c(\cdot)$ and memory overhead $m(\cdot)$ if $C$, given $n \in \mathbb{N}$ and a deterministic RAM program $\Pi$ with memory-size $n$, outputs a program $\Pi'$ with memory-size $m(n) \cdot n$ such that for any input $x$, the running-time of $\Pi'(n, x)$ is bounded by $c(n) \cdot T$, where $T$ is the running-time of $\Pi(n, x)$, and there exists a negligible function $\mu$ such that the following properties hold:
* Correctness: For any $n \in \mathbb{N}$ and any string $x \in \{0,1\}^*$, with probability at least $1 - \mu(n)$, $\Pi(n, x) = \Pi'(n, x)$.
* Obliviousness: For any two programs $\Pi_1, \Pi_2$, any $n \in \mathbb{N}$ and any two inputs $x_1, x_2 \in \{0,1\}^*$, if $|\tilde{\Pi}_1(n, x_1)| = |\tilde{\Pi}_2(n, x_2)|$, then $\tilde{\Pi}'_1(n, x_1)$ is $\mu$-close to $\tilde{\Pi}'_2(n, x_2)$ in statistical distance, where $\Pi'_1 = C(n, \Pi_1)$ and $\Pi'_2 = C(n, \Pi_2)$.
Note that the above definition uses the notion of statistical security. One can also have a similar definition for the notion of computational security.
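For reference, the standard notion of statistical distance used in the obliviousness property can be written out as below. The symbols $X$, $Y$ and $S$ are introduced here only for illustration: $X$ and $Y$ stand for the two access-pattern distributions being compared, $S$ for the finite set of possible access patterns, and $\mu$ for the negligible function from the definition.

```latex
\[
\Delta(X, Y) \;=\; \frac{1}{2} \sum_{a \in S} \bigl|\, \Pr[X = a] - \Pr[Y = a] \,\bigr|,
\qquad
X \text{ is } \mu\text{-close to } Y \;\iff\; \Delta(X, Y) \le \mu(n).
\]
```

Under computational security, the requirement is instead that no polynomial-time observer can tell the two distributions apart with more than negligible advantage.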
History of ORAMs
ORAMs were introduced by Goldreich and Ostrovsky, whose key motivation was software protection against an adversary who can observe the memory access pattern (but not the contents of the memory).
The main result of this work is that there exists an ORAM compiler that uses $O(n)$ server space and incurs a running time overhead of $O(\log^3 n)$ when making a program that uses $n$ memory cells oblivious. This work initiated a line of research on the construction of oblivious RAMs that continues to this day. Several attributes need to be considered when comparing ORAM constructions. The most important parameters of an ORAM construction are the amount of client storage, the amount of server storage and the time overhead of making one memory access. Based on these attributes, the construction of Kushilevitz et al. is the best known ORAM construction. It achieves $O(1)$ client storage, $O(n)$ server storage and $O(\log^2 n / \log \log n)$ access overhead.
Another important attribute of an ORAM construction is whether the access overhead is amortized or worst-case. Several of the earlier ORAM constructions have good amortized access overhead guarantees, but have $\Omega(n)$ worst-case access overheads. Some ORAM constructions achieve polylogarithmic worst-case computational overheads.
Some constructions were given in the random oracle model, where the client assumes access to an oracle that behaves like a random function and returns consistent answers for repeated queries. Their authors also noted that this oracle could be replaced by a pseudorandom function whose seed is a secret key stored by the client, if one assumes the existence of one-way functions. Other papers were aimed at removing this assumption completely. One of them also achieves an access overhead that is just a log-factor away from the best known ORAM access overhead.
While most of the earlier works focus on proving security computationally, more recent works use the stronger statistical notion of security.
One of the only known lower bounds on the access overhead of ORAMs is due to Goldreich et al. They show an $\Omega(\log n)$ lower bound for ORAM access overhead, where $n$ is the data size. There is also a conditional lower bound on the access overhead of ORAMs due to Boyle et al. that relates this quantity to the size of sorting networks.
ORAM constructions
Trivial construction
A trivial ORAM simulator construction, for each read or write operation, reads from and writes to every single element in the array, only performing a meaningful action for the address specified in that single operation. The trivial solution thus scans through the entire memory for each operation. This scheme incurs a time overhead of $\Omega(n)$ for each memory operation, where $n$ is the size of the memory.
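The linear scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TrivialORAM` and the `trace` list (which records what an observer of the physical memory would see) are hypothetical, and a real deployment would also re-encrypt every cell on write-back and avoid data-dependent branches.

```python
class TrivialORAM:
    """Linear-scan sketch: every logical access touches every cell in the same order."""

    def __init__(self, size):
        self.memory = [0] * size
        self.trace = []  # physical (operation, address) pairs visible to an observer

    def _access(self, op, addr, value=None):
        result = None
        for i in range(len(self.memory)):
            cell = self.memory[i]              # read every cell
            self.trace.append(("read", i))
            if i == addr:                      # only this cell is meaningfully used
                result = cell
                if op == "write":
                    cell = value
            self.memory[i] = cell              # write every cell back
            self.trace.append(("write", i))
        return result

    def read(self, addr):
        return self._access("read", addr)

    def write(self, addr, value):
        self._access("write", addr, value)
```

Two runs that make the same number of logical accesses produce identical physical traces, regardless of which addresses or operations were requested; the price is that each access costs time linear in the memory size.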
A simple ORAM scheme
A simple version of a statistically secure ORAM compiler constructed by Chung and Pass is described below, along with an overview of the proof of its correctness. The compiler, on input $n$ and a program $\Pi$ with memory requirement $n$, outputs an equivalent oblivious program $\Pi'$.
If the input program $\Pi$ uses $r$ registers, the output program $\Pi'$ will need $r + n/\alpha + \mathrm{poly}\log n$ registers, where $\alpha > 1$ is a parameter of the construction. $\Pi'$ uses $O(n\,\mathrm{poly}\log n)$ memory and its (worst-case) access overhead is $O(\mathrm{poly}\log n)$.
The ORAM compiler is very simple to describe. Suppose that the original program $\Pi$ has instructions for basic mathematical and control operations in addition to two special instructions $\mathsf{read}(l)$ and $\mathsf{write}(l, v)$, where $\mathsf{read}(l)$ reads the value at location $l$ and $\mathsf{write}(l, v)$ writes the value $v$ to $l$. The ORAM compiler, when constructing $\Pi'$, simply replaces each $\mathsf{read}$ and $\mathsf{write}$ instruction with the subroutines $\mathsf{Oread}$ and $\mathsf{Owrite}$ and keeps the rest of the program the same. This construction can be made to work even for memory requests arriving in an online fashion.
Memory organization of the oblivious program
The program $\Pi'$ stores a complete binary tree $T$ of depth $d = \log(n/\alpha)$ in its memory. Each node in $T$ is represented by a binary string of length at most $d$. The root is the empty string, denoted by $\lambda$. The left and right children of a node represented by the string $\gamma$ are $\gamma 0$ and $\gamma 1$ respectively. The program $\Pi'$ thinks of the memory of $\Pi$ as being partitioned into blocks, where each block is a contiguous sequence of memory cells of size $\alpha$. Thus, there are at most $\lceil n/\alpha \rceil$ blocks in total. In other words, the memory cell $l$ corresponds to block $b = \lfloor l/\alpha \rfloor$.
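The node labelling and the cell-to-block mapping described above can be illustrated with a short Python sketch. The function names and the parameter name `alpha` (the block size) are hypothetical and chosen here only for illustration; node labels are represented as Python strings of `'0'` and `'1'`, with the empty string as the root.

```python
def block_of(cell, alpha):
    """Block identifier of a memory cell when each block holds alpha consecutive cells."""
    return cell // alpha


def num_blocks(n, alpha):
    """Number of blocks needed to cover n memory cells (ceiling of n / alpha)."""
    return -(-n // alpha)


def children(node):
    """Labels of the left and right children of the node with the given label."""
    return node + "0", node + "1"


def path_to_leaf(leaf):
    """Labels of all nodes from the root (empty string) down to the given leaf."""
    return [leaf[:i] for i in range(len(leaf) + 1)]
```

For example, with `alpha = 4`, cell 13 falls in block 3, and the path to the leaf labelled `"101"` visits the nodes `""`, `"1"`, `"10"` and `"101"`.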
At any point in time, there is an association between the blocks and the leaves in $T$.
To keep track of this association, $\Pi'$ also stores a data structure called the position map, denoted by $Pos$, using $O(n/\alpha)$ registers. This data structure, for each block $b$, stores the leaf of $T$ associated with $b$ in $Pos(b)$.
Each node in $T$ contains an array with at most $K$ triples. Each triple is of the form $(b, Pos(b), v)$, where $b$ is a block identifier and $v$ is the contents of the block. Here, $K$ is a security parameter and is $O(\mathrm{poly}\log n)$.
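The memory layout described above (a position map plus a bounded array of triples at each tree node) can be sketched as a small Python class. This is a data-layout illustration only, not the oblivious access procedure. The class and method names are hypothetical, and two behaviours are assumptions made for the sketch rather than statements from the text: leaves are assigned uniformly at random at construction time, and a block may only be stored at a node on the path from the root to its assigned leaf.

```python
import random


class TreeLayout:
    """Position map plus per-node buckets of (block, leaf, contents) triples."""

    def __init__(self, num_blocks, depth, bucket_size, seed=None):
        self.depth = depth
        self.bucket_size = bucket_size  # capacity of each node's array
        self.rng = random.Random(seed)
        # Position map: block identifier -> leaf label (binary string of length depth).
        self.pos = {b: self._random_leaf() for b in range(num_blocks)}
        # Buckets are created lazily, keyed by node label ("" is the root).
        self.buckets = {}

    def _random_leaf(self):
        return "".join(self.rng.choice("01") for _ in range(self.depth))

    def put(self, node, block, contents):
        """Store a triple at `node`, which must lie on the path to the block's leaf."""
        leaf = self.pos[block]
        if not leaf.startswith(node):
            raise ValueError("node is not on the path to the block's leaf")
        bucket = self.buckets.setdefault(node, [])
        if len(bucket) >= self.bucket_size:
            raise OverflowError("bucket is full")
        bucket.append((block, leaf, contents))

    def find(self, block):
        """Scan each bucket on the root-to-leaf path; return the contents or None."""
        leaf = self.pos[block]
        for i in range(self.depth + 1):
            for b, _, contents in self.buckets.get(leaf[:i], []):
                if b == block:
                    return contents
        return None
```

Bounding each bucket by a fixed capacity is what makes overflow a possible failure event, which is why the capacity acts as a security parameter.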
Description of the oblivious program
The program $\Pi'$ starts by initializing its memory as well as registers to $\bot$. Describing the procedures $\mathsf{Owrite}$ and $\mathsf{Oread}$ is enough to complete the description of $\Pi'$. The sub-routine $\mathsf{Owrite}$ is given below. The inputs to the sub-routine are a memory location