Recoll is a desktop search tool that provides full-text search in a GUI with a few mandatory external dependencies. It runs on many Unix-like operating systems and is mostly independent of the desktop environment. Recoll has been ported to OS/2, and is planned for integration into the OS/2-based ArcaOS. Recoll was designed not to require a permanent daemon; on Linux systems, it can make use of inotify. Recoll updates its index at scheduled intervals (for example, through cron jobs), but if desired, the indexing task can run as a file-system monitoring daemon for real-time index updates.
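The difference between a periodic indexing pass and continuous monitoring can be illustrated with a small sketch. It is not Recoll's implementation: the function name `changed_files` and the `seen_mtimes` dictionary are hypothetical, and the sketch assumes that a periodic pass only needs to find files whose modification time differs from the one recorded on the previous pass.

```python
import os


def changed_files(root, seen_mtimes):
    """Return the files under `root` that are new or modified since the last pass.

    `seen_mtimes` maps a file path to the modification time recorded by the
    previous pass; it is updated in place. Both names are hypothetical.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            # A file is a candidate for (re)indexing if it was never seen
            # or if its modification time differs from the recorded one.
            if seen_mtimes.get(path) != mtime:
                seen_mtimes[path] = mtime
                changed.append(path)
    return sorted(changed)
```

A scheduler such as cron would run a pass like this at intervals, whereas a monitoring daemon would instead react to change notifications (for example from inotify) as they arrive.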


Features

* Qt GUI.
* Xapian backend.
* Indexes the contents of many document types: text, HTML, email stores of all kinds, OpenDocument, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, LyX, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.
* Recursively processes embedded documents (email attachments, zip archives) to arbitrary depths.
* Query facilities with Boolean searches, wildcards, phrases, proximity, and filters on file types and directory trees.
* GUI Boolean search build tool.
* Xesam query language support.
* Word stemming is performed at query time (you can switch stemming language after indexing).
* Multiple indexes are selectable at query time (e.g., personal + system indexes).
* Natively based on Unicode. Supports many languages and character sets, including good support for East Asian texts (CJK).
* MD5 document hashes for the elimination of duplicates in results.
* Batch and real-time indexing modes.
* Python API.
* GNOME Shell search provider, web interface, and Firefox history extensions.
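The idea behind using MD5 document hashes to eliminate duplicates from results can be sketched as follows. This is a minimal illustration using Python's standard `hashlib` module, not Recoll's own code; the function names and the `(path, content)` result shape are assumptions made for the example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def drop_duplicates(results):
    """Keep only the first result for each distinct document content.

    `results` is an iterable of (path, content_bytes) pairs; this shape is
    hypothetical and chosen only to keep the sketch self-contained.
    """
    seen = set()
    unique = []
    for path, content in results:
        digest = md5_hex(content)
        if digest not in seen:
            seen.add(digest)
            unique.append(path)
    return unique
```

Two files with identical bytes produce the same digest, so only the first of them survives in the returned list, while the order of the remaining results is preserved.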


File types supported


File types indexed natively

* Text.
* HTML.
* Maildir, MH, and mailbox (Mozilla, Thunderbird, and Evolution). To index local copies of IMAP mail, Evolution requires .cache to be removed from the skippedNames list in the Indexing preferences / Local Parameters pane of the GUI.
* Gaim and purple log files.
* Scribus files.
* Man pages (needs Groff).
* Mimehtml web archive format (support based on the mail filter).
* All the following need Python 3:
** Dia diagrams.
** Excel and PowerPoint (pre-Open XML).
** Tar archives. Tar file indexing is disabled by default, because tar archives don't typically contain the kind of documents that people search for, so it needs to be enabled explicitly with an "application/x-tar = execm rcltar" line in the "[index]" section of a $HOME/.recoll/mimeconf file.
** Zip archives.
** Konqueror web archive format (uses the tarfile Python standard library module).
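As an illustration of what the `tarfile` standard library module provides, the sketch below builds a small tar archive in memory and then walks its members, which is the kind of access an archive handler needs before it can pass each member on for indexing. It is not the code of Recoll's handlers; the helper names `build_tar` and `iter_members` are hypothetical.

```python
import io
import tarfile


def build_tar(files):
    """Build an in-memory tar archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def iter_members(archive_bytes):
    """Yield (name, content) for each regular file stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        for member in tar:
            # Skip directories, links, and other non-regular entries.
            if member.isfile():
                yield member.name, tar.extractfile(member).read()
```

Each yielded pair could then be handed to whatever handler matches the member's own file type.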


File types indexed with external helpers

* PDF files.
* MS-Word files.
* WordPerfect files.
* RTF files.
* Image and audio file tags.
* AbiWord files.
* FB2, EPUB, and CHM ebooks.
* KWord files.
* Microsoft Office traditional and Open XML files.
* OpenOffice files.
* SVG files.
* Okular annotations files.
* HWP files (without page numbering).


See also

* Desktop search
* List of desktop search engines

