Freedup

freedup is a program that scans directories or file lists for duplicate files. The file lists may be supplied through an input pipe or generated internally using find with user-provided options. Further options refine the search conditions, and others control the action taken, i.e. whether duplicates are only displayed or which kind of link is created under which circumstances. freedup first compares file sizes, then, for files of equal size, their MD5 signatures, and finally performs a byte-by-byte comparison for verification before taking any action. An interactive mode lets the user decide individually which files to link (soft or hard) or to delete.

Comparison that ignores metadata tags and comments is a unique feature of freedup. The file size and the start and end of the unique content are kept for later processing. When comparing sound files, the tags can be ignored, e.g. when one file is tagged with an ID3v1 tag while another sound file with identical music is tagged with ID3v2. This also works if a file was copied and the copy was retagged to fit into another album. The same applies to JPEG files (Exif) and MP4 movies. An auto mode instructs freedup to ignore all tags that it recognizes. The author will extend this function on demand, provided there is sufficient documentation on how to strip the tags.

freedup is written in POSIX-compliant C and is released under the GNU General Public License. Its complexity is stated as O(n log n) for full file comparison, which is performed only among files of equal length after the file list has been sorted by file size using qsort.
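
The size-first strategy described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not freedup's actual source: the `file_entry` record and the function names are hypothetical, and the intermediate MD5 step is omitted because the C standard library provides no MD5 routine. The sketch sorts entries by size with `qsort`, counts the entries that share a size with another entry, and compares two open streams byte by byte.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one scanned file; not freedup's actual struct. */
struct file_entry {
    const char *path;
    long size;
};

/* qsort comparator: orders entries by ascending file size. */
int compare_by_size(const void *a, const void *b)
{
    const struct file_entry *fa = a;
    const struct file_entry *fb = b;
    return (fa->size > fb->size) - (fa->size < fb->size);
}

/*
 * Sorts the entries by size and returns how many of them share their size
 * with at least one other entry. Only those need further comparison.
 */
size_t sort_and_count_candidates(struct file_entry *entries, size_t n)
{
    size_t candidates = 0;

    qsort(entries, n, sizeof entries[0], compare_by_size);
    for (size_t i = 0; i < n; i++) {
        int same_as_prev = i > 0 && entries[i - 1].size == entries[i].size;
        int same_as_next = i + 1 < n && entries[i + 1].size == entries[i].size;
        if (same_as_prev || same_as_next)
            candidates++;
    }
    return candidates;
}

/*
 * Byte-by-byte comparison of two open streams from their current positions.
 * Returns 1 if both hold identical bytes up to end of file, otherwise 0.
 */
int streams_identical(FILE *a, FILE *b)
{
    unsigned char buf_a[4096], buf_b[4096];

    for (;;) {
        size_t na = fread(buf_a, 1, sizeof buf_a, a);
        size_t nb = fread(buf_b, 1, sizeof buf_b, b);
        if (na != nb || memcmp(buf_a, buf_b, na) != 0)
            return 0;
        if (na == 0)
            return !ferror(a) && !ferror(b);
    }
}
```

Sorting first means that files of equal size end up adjacent, so the expensive content comparison is only attempted within each run of equal sizes.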

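The idea of keeping the start and end of the unique content while ignoring tags can be illustrated for ID3 as follows. This is a hedged sketch, not freedup's implementation: the `content_range` type and both function names are hypothetical, and the file is assumed to be fully loaded into memory. It relies on two properties of the ID3 formats: an ID3v1 tag is a fixed 128-byte block at the end of the file beginning with "TAG", and an ID3v2 tag begins with a 10-byte header ("ID3", two version bytes, one flag byte, and a 4-byte synchsafe size).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: byte range of the content outside the tags. */
struct content_range {
    size_t start;  /* offset of the first content byte */
    size_t end;    /* offset one past the last content byte */
};

/*
 * Finds the range of an in-memory file that lies between a leading
 * ID3v2 tag and a trailing ID3v1 tag, if either is present.
 */
struct content_range id3_content_range(const unsigned char *data, size_t len)
{
    struct content_range r = { 0, len };

    /* ID3v2: "ID3", 2 version bytes, 1 flag byte, 4-byte synchsafe size. */
    if (len >= 10 && memcmp(data, "ID3", 3) == 0) {
        size_t body = ((size_t)(data[6] & 0x7f) << 21) |
                      ((size_t)(data[7] & 0x7f) << 14) |
                      ((size_t)(data[8] & 0x7f) << 7) |
                      (size_t)(data[9] & 0x7f);
        size_t total = 10 + body;
        if (data[5] & 0x10)  /* footer flag (ID3v2.4) adds 10 bytes */
            total += 10;
        if (total <= len)
            r.start = total;
    }

    /* ID3v1: fixed 128-byte block at the end, starting with "TAG". */
    if (r.end - r.start >= 128 && memcmp(data + len - 128, "TAG", 3) == 0)
        r.end = len - 128;

    return r;
}

/* Returns 1 if two files carry the same bytes once ID3 tags are ignored. */
int same_audio_content(const unsigned char *a, size_t a_len,
                       const unsigned char *b, size_t b_len)
{
    struct content_range ra = id3_content_range(a, a_len);
    struct content_range rb = id3_content_range(b, b_len);
    size_t na = ra.end - ra.start;
    size_t nb = rb.end - rb.start;

    return na == nb && memcmp(a + ra.start, b + rb.start, na) == 0;
}
```

With such ranges in hand, two files whose total sizes differ only because of their tags can still be recognized as duplicates, since the comparison uses the content length rather than the file length.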

See also

* List of Unix commands
* Duplicate code


External links


Official site