A metadata removal tool or metadata scrubber is a type of privacy software built to protect the privacy of its users by removing potentially privacy-compromising metadata from files before they are shared with others, e.g., by sending them as e-mail attachments or by posting them on the Web.
Overview
Metadata can be found in many types of files such as documents, spreadsheets, presentations, images, and audio files. It can include information such as details on the file authors, file creation and modification dates, geographical location, document revision history, thumbnail images, and comments. Metadata may be added to files by users, but some metadata is often automatically added to files by authoring applications or by devices used to produce the files, without user intervention.
Since metadata is sometimes not clearly visible in authoring applications (depending on the application and its settings), there is a risk that the user will be unaware of its existence or will forget about it and, if the file is shared, private or confidential information will inadvertently be exposed. The purpose of metadata removal tools is to minimize the risk of such data leakage.
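As an illustration of how such hidden metadata can be made visible, the following minimal Python sketch lists the core properties stored in an Office Open XML file (such as a .docx or .xlsx file). It assumes the file is a ZIP package with its core properties in the `docProps/core.xml` part; the function name `list_core_properties` is hypothetical, and the sketch does not look at other places where metadata may be stored, such as comments or revision history.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed location of the core-properties part inside an OOXML package.
CORE_PROPS_PATH = "docProps/core.xml"


def list_core_properties(ooxml_file):
    """Return {property name: value} for the non-empty core properties.

    `ooxml_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(ooxml_file) as package:
        if CORE_PROPS_PATH not in package.namelist():
            return {}
        root = ET.fromstring(package.read(CORE_PROPS_PATH))

    found = {}
    for element in root:
        # ElementTree tags look like "{namespace}name"; keep only "name".
        local_name = element.tag.rsplit("}", 1)[-1]
        if element.text and element.text.strip():
            found[local_name] = element.text.strip()
    return found
```

Calling `list_core_properties("report.docx")` on a hypothetical file would return entries such as the creator and the last modifying user, if the authoring application recorded them.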
Existing metadata removal tools can be divided into four groups:
* Integral metadata removal tools, which are included in some applications, like the ''Document Inspector'' in Microsoft Office.
* Batch metadata removal tools, which can process multiple files.
* E-mail client add-ins, which are designed to remove metadata from e-mail attachments just before they are sent.
* Server-based systems, which are designed to automatically remove metadata at the network gateway.
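The batch approach can be sketched in Python as follows. This is an illustrative sketch only, not the method of any particular product: it assumes the inputs are Office Open XML packages, blanks only the `docProps/core.xml` part, and assumes that consuming applications accept an empty core-properties element. The names `scrub_core_properties` and `scrub_batch` are hypothetical, and other metadata (application properties, comments, revision history) is left untouched.

```python
import zipfile
from pathlib import Path

# Assumed location of the core-properties part inside an OOXML package.
CORE_PROPS_PATH = "docProps/core.xml"

# Replacement part with no author, date, or title elements.
EMPTY_CORE_PROPS = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b'<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
    b'package/2006/metadata/core-properties"/>'
)


def scrub_core_properties(src, dst):
    """Copy the package at `src` to `dst`, blanking its core properties."""
    with zipfile.ZipFile(src) as original, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as cleaned:
        for item in original.infolist():
            payload = original.read(item.filename)
            if item.filename == CORE_PROPS_PATH:
                payload = EMPTY_CORE_PROPS
            cleaned.writestr(item.filename, payload)


def scrub_batch(sources, out_dir):
    """Write a cleaned copy of every source file into `out_dir`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cleaned_paths = []
    for src in map(Path, sources):
        dst = out_dir / src.name
        scrub_core_properties(src, dst)
        cleaned_paths.append(dst)
    return cleaned_paths
```

Writing cleaned copies to a separate directory, rather than overwriting the originals, keeps the unmodified files available in case a cleaned copy turns out to be unusable.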
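To securely delete the metadata of a PDF file, it is important to linearize the PDF file afterwards; otherwise the changes are reversible and the metadata can be recovered. This is because PDF editors may save changes as incremental updates appended to the file, which can leave the original metadata in place.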
Metadata removal tools are also commonly used to reduce the overall sizes of files, particularly image files posted on the Web. For example, a small image on a website, which may contain metadata including a thumbnail image, can easily contain as much metadata as image data, so removing that metadata can halve the file size.
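The size effect can be illustrated with a minimal Python sketch that drops metadata segments from a JPEG byte stream. It is a simplified sketch under stated assumptions, not a complete JPEG parser: it assumes every segment before the start-of-scan marker carries a length field, and it removes only APP1 (typically Exif, including any embedded thumbnail, and XMP), APP13 (typically IPTC), and comment segments. Other application segments are kept because some of them can affect how the image is rendered. The function name `strip_jpeg_metadata` is hypothetical.

```python
import struct

SOI = b"\xff\xd8"   # start-of-image marker
SOS = 0xDA          # start-of-scan: compressed image data follows
# Segments dropped in this sketch: APP1, APP13, and COM (comments).
METADATA_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of the JPEG stream without the segments listed above."""
    if not data.startswith(SOI):
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos < len(data):
        if pos + 4 > len(data) or data[pos] != 0xFF:
            raise ValueError("malformed segment header")
        marker = data[pos + 1]
        if marker == SOS:
            # Copy the scan data and everything after it unchanged.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(data):
            raise ValueError("malformed segment length")
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

Comparing `len(data)` with `len(strip_jpeg_metadata(data))` shows how many bytes the removed segments occupied in a given file.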
See also
* Data loss prevention software
* Exif
* Sanitization (classified information)
External links
White Papers about the Risks of File Hidden Data & Metadata