HOME

TheInfoList



OR:

An information filtering system is a system that removes redundant or unwanted
information Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, ...
from an information stream using (semi)automated or computerized methods prior to presentation to a human user. Its main goal is the management of the
information overload Information overload (also known as infobesity, infoxication, information anxiety, and information explosion) is the difficulty in understanding an issue and effectively making decisions when one has too much information (TMI) about that issue, ...
and increment of the
semantic Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
signal-to-noise ratio Signal-to-noise ratio (SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. SNR is defined as the ratio of signal power to the noise power, often expressed in de ...
. To do this the user's profile is compared to some reference characteristics. These characteristics may originate from the information item (the content-based approach) or the user's social environment (the
collaborative filtering Collaborative filtering (CF) is a technique used by recommender systems.Francesco Ricci and Lior Rokach and Bracha ShapiraIntroduction to Recommender Systems Handbook Recommender Systems Handbook, Springer, 2011, pp. 1-35 Collaborative filtering ...
approach). Whereas in
information transmission Data transmission and data reception or, more broadly, data communication or digital communications is the transfer and reception of data in the form of a digital bitstream or a digitized analog signal transmitted over a point-to-point or ...
signal processing filters are used against
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituenc ...
-disrupting noise on the bit-level, the methods employed in information filtering act on the semantic level. The range of machine methods employed builds on the same principles as those for
information extraction Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most of the cases this activity concer ...
. A notable application can be found in the field of email
spam filter Email filtering is the processing of email to organize it according to specified criteria. The term can apply to the intervention of human intelligence, but most often refers to the automatic processing of messages at an SMTP server, possibly appl ...
s. Thus, it is not only the
information explosion The information explosion is the rapid increase in the amount of published information or data and the effects of this abundance. As the amount of available data grows, the problem of managing the information becomes more difficult, which can lead ...
that necessitates some form of filters, but also inadvertently or maliciously introduced
pseudo The prefix pseudo- (from Greek ψευδής, ''pseudes'', "false") is used to mark something that superficially appears to be (or behaves like) one thing, but is something else. Subject to context, ''pseudo'' may connote coincidence, imitation, ...
-information. On the presentation level, information filtering takes the form of user-preferences-based newsfeeds, etc.
Recommender system A recommender system, or a recommendation system (sometimes replacing 'system' with a synonym such as platform or engine), is a subclass of information filtering system that provide suggestions for items that are most pertinent to a particular ...
s and content discovery platforms are active information filtering systems that attempt to present to the user information items (
film A film also called a movie, motion picture, moving picture, picture, photoplay or (slang) flick is a work of visual art that simulates experiences and otherwise communicates ideas, stories, perceptions, feelings, beauty, or atmospher ...
,
television Television, sometimes shortened to TV, is a telecommunication medium for transmitting moving images and sound. The term can refer to a television set, or the medium of television transmission. Television is a mass medium for advertising, ...
,
music Music is generally defined as the art of arranging sound to create some combination of form, harmony, melody, rhythm or otherwise expressive content. Exact definitions of music vary considerably around the world, though it is an aspe ...
,
book A book is a medium for recording information in the form of writing or images, typically composed of many pages (made of papyrus, parchment, vellum, or paper) bound together and protected by a cover. The technical term for this physical ...
s,
news News is information about current events. This may be provided through many different media: word of mouth, printing, postal systems, broadcasting, electronic communication, or through the testimony of observers and witnesses to event ...
, web pages) the user is interested in. These systems add information items to the information flowing towards the user, as opposed to removing information items from the information flow towards the user. Recommender systems typically use
collaborative filtering Collaborative filtering (CF) is a technique used by recommender systems.Francesco Ricci and Lior Rokach and Bracha ShapiraIntroduction to Recommender Systems Handbook Recommender Systems Handbook, Springer, 2011, pp. 1-35 Collaborative filtering ...
approaches or a combination of the collaborative filtering and content-based filtering approaches, although content-based recommender systems do exist.


History

Before the advent of the
Internet The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a '' network of networks'' that consists of private, p ...
, there are already several methods of filtering information; for instance, governments may control and restrict the flow of information in a given country by means of formal or informal censorship. On the other hand, we are going to talk about information filters if we refer to newspaper editors and journalists when they provide a service that selects the most valuable information for their clients, readers of books, magazines, newspapers,
radio Radio is the technology of signaling and communicating using radio waves. Radio waves are electromagnetic waves of frequency between 30  hertz (Hz) and 300  gigahertz (GHz). They are generated by an electronic device called a tr ...
listeners and TV viewers. This filtering operation is also present in schools and universities where there is a selection of information to provide assistance based on academic criteria to customers of this service, the students. With the advent of the Internet it is possible that anyone can publish anything he wishes at a low-cost. In this way, it increases considerably the less useful information and consequently the quality information is disseminated. With this problem, it began to devise new filtering with which we can get the information required for each specific topic to easily and efficiently.


Operation

A filtering system of this style consists of several tools that help people find the most valuable information, so the limited time you can dedicate to read / listen / view, is correctly directed to the most interesting and valuable documents. These filters are also used to organize and structure information in a correct and understandable way, in addition to group messages on the mail addressed. These filters are essential in the results obtained of the
search engines A search engine is a software system designed to carry out web searches. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in ...
on the Internet. The functions of filtering improves every day to get downloading Web documents and more efficient messages.


Criterion

One of the criteria used in this step is whether the
knowledge Knowledge can be defined as awareness of facts or as practical skills, and may also refer to familiarity with objects or situations. Knowledge of facts, also called propositional knowledge, is often defined as true belief that is distin ...
is harmful or not, whether knowledge allows a better understanding with or without the concept. In this case the task of information filtering to reduce or eliminate the harmful information with knowledge.


Learning System

A system of learning content consists, in general rules, mainly of three basic stages: # First, a system that provides solutions to a defined set of tasks. # Subsequently, it undergoes assessment criteria which will measure the performance of the previous stage in relation to solutions of problems. # Acquisition module which its output obtained knowledge that are used in the system solver of the first stage.


Future

Currently the problem is not finding the best way to filter information, but the way that these systems require to learn independently the information needs of users. Not only because they automate the process of
filtering Filter, filtering or filters may refer to: Science and technology Computing * Filter (higher-order function), in functional programming * Filter (software), a computer program to process a data stream * Filter (video), a software component tha ...
but also the construction and adaptation of the filter. Some branches based on it, such as statistics, machine learning, pattern recognition and data mining, are the base for developing information filters that appear and adapt in base to experience. To carry out the learning process, part of the information has to be pre-filtered, which means there are positive and negative examples which we named training data, which can be generated by experts, or via
feedback Feedback occurs when outputs of a system are routed back as inputs as part of a chain of cause-and-effect that forms a circuit or loop. The system can then be said to ''feed back'' into itself. The notion of cause-and-effect has to be handled ...
from ordinary users.


Error

As data is entered, the system includes new rules; if we consider that this data can generalize the training data information, then we have to evaluate the system development and measure the system's ability to correctly predict the categories of new
information Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, ...
. This step is simplified by separating the training data in a new series called "test data" that we will use to measure the error rate. As a general rule it is important to distinguish between types of errors (false positives and false negatives). For example, in the case on an aggregator of content for children, it doesn't have the same gravity to allow the passage of information not suitable for them, that shows violence or pornography, than the mistake to discard some appropriated information. To improve the system to lower error rates and have these systems with learning capabilities similar to humans we require development of systems that simulate human cognitive abilities, such as
natural-language understanding Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension. Natural-language understanding is considered an A ...
, capturing meaning Common and other forms of advanced processing to achieve the semantics of information.


Fields of use

Nowadays, there are numerous techniques to develop information filters, some of these reach error rates lower than 10% in various experiments.{{Citation needed, date=December 2017 Among these techniques there are decision trees, support vector machines, neural networks, Bayesian networks, linear discriminants, logistic regression, etc.. At present, these techniques are used in different applications, not only in the web context, but in thematic issues as varied as voice recognition, classification of telescopic astronomy or evaluation of financial risk.


See also

*
Information literacy The Association of College & Research Libraries defines information literacy as a "set of integrated abilities encompassing the reflective discovery of information, the understanding of how information is produced and valued and the use of infor ...
*
Kalman filter For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estima ...
*
Filter bubble A filter bubble or ideological frame is a state of intellectual isolationTechnopediaDefinition – What does Filter Bubble mean?, Retrieved October 10, 2017, "....A filter bubble is the intellectual isolation, that can occur when websites make us ...
*
Information explosion The information explosion is the rapid increase in the amount of published information or data and the effects of this abundance. As the amount of available data grows, the problem of managing the information becomes more difficult, which can lead ...
*
Information overload Information overload (also known as infobesity, infoxication, information anxiety, and information explosion) is the difficulty in understanding an issue and effectively making decisions when one has too much information (TMI) about that issue, ...
*
Information society An information society is a society where the usage, creation, distribution, manipulation and integration of information is a significant activity. Its main drivers are information and communication technologies, which have resulted in rapid inf ...
*
Artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech ...


References

*Hanani, U., Shapira, B., Shoval, P. (2001) Information filtering: Overview of issues, research and systems. User Modeling and User-Adapted Interaction, 11, pp. 203–259. * http://www.infoworld.com/d/developer-world/human-information-filter-813


External links


Infoworld

IEEXplore
Information systems Social information processing Software by type Filters