Video content analysis or video content analytics (VCA), also known as video analysis or video analytics (VA), is the capability of automatically analyzing
video
to detect and determine temporal and spatial events.
This technical capability is used in a wide range of domains including entertainment (for example Kinect, an add-on peripheral for the Xbox 360 console), video retrieval and
video browsing
, health-care, retail, automotive, transport,
home automation
, flame and smoke detection, safety, and security (see the BSIA report "VCA usage increase in British Security").
The algorithms can be implemented as software on general-purpose machines, or as hardware in specialized video processing units.
Many different functionalities can be implemented in VCA. Video Motion Detection is one of the simpler forms where motion is detected with regard to a fixed background scene. More advanced functionalities include
video tracking and
egomotion estimation.
Based on the internal representation that VCA generates in the machine, it is possible to build other functionalities, such as
video summarization,
identification
,
behavior
analysis, or other forms of
situation awareness
.
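The video motion detection described above, where motion is detected with regard to a fixed background scene, can be illustrated with simple frame differencing. The following is a minimal sketch, not a production method: frames are assumed to be small grayscale images stored as nested lists, and the names `motion_mask`, `motion_detected`, `threshold` and `min_changed_pixels` are hypothetical, as are their default values.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_mask(background: Frame, frame: Frame, threshold: int = 25) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the fixed background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def motion_detected(background: Frame, frame: Frame,
                    threshold: int = 25, min_changed_pixels: int = 2) -> bool:
    """Report motion when enough pixels changed; both limits are illustrative values."""
    mask = motion_mask(background, frame, threshold)
    changed = sum(cell for row in mask for cell in row)
    return changed >= min_changed_pixels


background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
# Same scene with a bright 2x2 object and slight sensor noise elsewhere.
frame = [[12, 10, 10, 10],
         [10, 200, 210, 10],
         [10, 205, 198, 9]]

print(motion_detected(background, frame))       # True
print(motion_detected(background, background))  # False
```

The intensity threshold absorbs small sensor noise, and the pixel-count limit avoids triggering on isolated flickering pixels. Real systems typically also update the background model over time, which this sketch omits.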
VCA relies on good input video, so it is often combined with video enhancement technologies such as
video denoising,
image stabilization,
unsharp masking
, and
super-resolution.
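As an illustration of one of the enhancement steps listed above, unsharp masking sharpens an image by adding back the difference between the image and a blurred copy of itself. The sketch below is a simplified example under stated assumptions: it uses a 3x3 box blur instead of the Gaussian blur that is more common in practice, works on nested lists of grayscale values, and the function names and the `amount` parameter are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of 0-255 intensities


def box_blur(image: Image) -> Image:
    """3x3 mean filter; border pixels reuse the nearest valid neighbour (edge clamping)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def unsharp_mask(image: Image, amount: float = 1.0) -> Image:
    """sharpened = original + amount * (original - blurred), clipped to 0..255."""
    blurred = box_blur(image)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, blur_row)]
        for row, blur_row in zip(image, blurred)
    ]


# A vertical step edge: dark on the left, bright on the right.
edge = [[50.0, 50.0, 150.0, 150.0] for _ in range(3)]
sharpened = unsharp_mask(edge)
print([round(v, 1) for v in sharpened[1]])
```

On the step edge, the pixel on the dark side of the boundary becomes darker and the pixel on the bright side becomes brighter, while flat regions are left unchanged. This increased local contrast is what makes edges easier for later analysis stages to pick up.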
Functionalities
Several articles provide an overview of the modules involved in the development of video analytic applications (Nik Gagvani, "Introduction to Video Analytics"; Cheng Peng, "Video Analytics").
This is a list of known functionalities and a short description.
Commercial applications
VCA is a relatively new technology, with numerous companies releasing VCA-enhanced products in the mid-2000s. While there are many applications, the track record of different VCA solutions differs widely. Functionalities such as
motion detection
,
people counting and gun detection are available as
commercial off-the-shelf
products and are believed to have a decent track record (for example, even freeware such as dsprobotics Flowstone can handle movement and color analysis). In response to the
COVID-19 pandemic
, many software manufacturers have introduced new public health analytics like
face mask detection or
social distancing
tracking.
In many domains VCA is implemented on
CCTV
systems, either distributed on the cameras (at-the-edge) or centralized on dedicated processing systems. Video Analytics and Smart CCTV are commercial terms for VCA in the security domain. In the UK the
BSIA has developed an introductory guide for VCA in the security domain (British Industry VCA Guide: "262 An Introduction to Video Content Analysis Industry Guide"). In addition to video analytics, and to complement it, audio analytics can also be used.
Video management software manufacturers are constantly expanding the range of video analytics modules available. Suspect tracking technology, for example, makes it possible to follow all of a subject's movements: where they came from, and when, where, and how they moved. Within a particular surveillance system, the indexing technology locates people with similar features who were within the cameras' fields of view during a specific period of time. The system usually finds many different people with similar features and presents them as snapshots. The operator only needs to click on the images of the subjects to be tracked. Within a minute or so, it is possible to trace all the movements of a particular person, and even to create a step-by-step video of those movements.
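The indexing step described above, locating people with similar features within a time window, can be sketched as a similarity search over stored detections. This is only an illustrative sketch: the `Detection` record, the three-number appearance descriptor, the use of cosine similarity and the `min_similarity` threshold are all assumptions made for the example, not a description of any particular product.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    camera: str
    timestamp: float           # seconds since the start of the recording
    features: Sequence[float]  # hypothetical appearance descriptor


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_similar(index: List[Detection], query: Sequence[float],
                 start: float, end: float, min_similarity: float = 0.9) -> List[Detection]:
    """Return detections inside [start, end] that resemble the query, oldest first."""
    hits = [d for d in index
            if start <= d.timestamp <= end
            and cosine_similarity(d.features, query) >= min_similarity]
    return sorted(hits, key=lambda d: d.timestamp)


index = [
    Detection("lobby", 30.0, [0.9, 0.1, 0.2]),
    Detection("entrance", 12.0, [0.88, 0.12, 0.22]),
    Detection("lobby", 45.0, [0.1, 0.9, 0.3]),      # different-looking person
    Detection("car park", 400.0, [0.9, 0.1, 0.2]),  # outside the time window
]
track = find_similar(index, query=[0.9, 0.1, 0.2], start=0.0, end=120.0)
print([(d.camera, d.timestamp) for d in track])
```

The candidates are returned in time order, so they read as a path through the cameras. In the workflow described above, an operator would still confirm which snapshots actually show the subject before a track is assembled.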
Kinect
is an add-on peripheral for the Xbox 360 gaming console that uses VCA for part of the user input.
In the retail industry, VCA is used to track shoppers inside the store. In this way, a heatmap of the store can be obtained, which is useful for store design and marketing optimisation. Other applications include measuring dwell time in front of a product and item removed/left detection.
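As a rough illustration of how tracked shopper positions can be turned into a heatmap and dwell times, the sketch below bins position samples into floor cells and sums the time spent inside a zone. It assumes the tracker emits one (shopper id, x, y) sample per shopper at a fixed interval; the function names, the one-metre cell size and the one-second sample interval are hypothetical choices for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, float, float]          # (shopper id, x in metres, y in metres)
Zone = Tuple[float, float, float, float]   # axis-aligned rectangle (x0, y0, x1, y1)


def heatmap(samples: Iterable[Sample], cell_size: float = 1.0) -> Counter:
    """Count position samples per grid cell; higher counts mean busier floor areas."""
    counts: Counter = Counter()
    for _shopper, x, y in samples:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def dwell_seconds(samples: Iterable[Sample], zone: Zone,
                  sample_interval: float = 1.0) -> Dict[str, float]:
    """Time each shopper spent inside the zone, assuming evenly spaced samples."""
    x0, y0, x1, y1 = zone
    dwell: Dict[str, float] = {}
    for shopper, x, y in samples:
        if x0 <= x < x1 and y0 <= y < y1:
            dwell[shopper] = dwell.get(shopper, 0.0) + sample_interval
    return dwell


samples: List[Sample] = [
    ("a", 0.5, 0.5), ("a", 1.2, 0.4), ("a", 1.4, 0.6), ("a", 1.6, 0.5),
    ("b", 1.5, 0.5), ("b", 2.5, 1.5),
]
heat = heatmap(samples)
print(heat.most_common(1))                            # [((1, 0), 4)]
print(dwell_seconds(samples, (1.0, 0.0, 2.0, 1.0)))   # {'a': 3.0, 'b': 1.0}
```

The zone passed to `dwell_seconds` would typically be the floor area in front of a shelf or display, so the result approximates how long each shopper looked at the products there.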
The quality of VCA in the commercial setting is difficult to determine. It depends on many variables such as
use case
,
implementation
,
system configuration and
computing platform
. Typical methods to get an objective idea of the quality in commercial settings include independent
benchmarking (for example i-LIDS, a benchmarking initiative by the UK Home Office) and designated test locations.
VCA has been used for
crowd management purposes, notably at
The O2 Arena
in London and
The London Eye.
Law enforcement
Police and forensic scientists analyse
CCTV
video when investigating criminal activity. Police use software, such as
Kinesense
, which performs video content analysis to search long videos for key events and to find suspects. Surveys have shown that up to 75% of cases involve CCTV.
Academic research
Video content analysis is a subset of
computer vision
and thereby of
artificial intelligence
. Two major academic benchmark initiatives are
TRECVID (an academic benchmark initiative by NIST), which uses a small portion of i-LIDS video footage, and the PETS Benchmark Data (Performance Evaluation of Tracking and Surveillance, by the University of Reading).
They focus on functionalities such as tracking, left luggage detection and virtual fencing. Benchmark video datasets such as
UCF101 enable
action recognition research incorporating
temporal and
spatial
visual attention
with
convolutional neural network
and
long short-term memory
. Video analysis software is also being paired with footage from
body-worn and
dashboard cameras in order to more easily redact footage for public disclosure and to identify events and people in videos.
The
EU is funding an
FP7 project called P-REACT (see the P-REACT project website) to integrate video content analytics on embedded systems with police and transport security databases.
Artificial Intelligence
Artificial intelligence for video surveillance utilizes computer software
programs that analyze the audio and images from video surveillance cameras in order to recognize humans, vehicles, objects and events. Security contractors program the software to define restricted areas within the camera's view (such as a fenced-off area, or a parking lot but not the sidewalk or public street outside the lot) and the times of day (such as after the close of business) during which the property protected by the camera surveillance is restricted. The artificial intelligence
("A.I.") sends an alert if it detects a trespasser breaking the "rule" that no person is allowed in that area during that time of day.
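The rule described above, no person inside a restricted area during restricted hours, can be sketched as a point-in-polygon test combined with a time-window check. This is a minimal sketch under stated assumptions: the detector is assumed to supply an object label and a ground position for each detection, and the function names, the zone coordinates and the 18:00 to 07:00 window are hypothetical values chosen for illustration.

```python
from datetime import time
from typing import List, Tuple

Point = Tuple[float, float]


def inside_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_time_window(now: time, start: time, end: time) -> bool:
    """True inside [start, end); handles windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_alert(label: str, position: Point, now: time,
                 zone: List[Point], start: time, end: time) -> bool:
    """Alert only for a person inside the zone during the restricted hours."""
    return (label == "person"
            and in_time_window(now, start, end)
            and inside_polygon(position, zone))


# Hypothetical parking lot outline and closing hours.
parking_lot = [(0.0, 0.0), (10.0, 0.0), (10.0, 6.0), (0.0, 6.0)]
closed_from, closed_until = time(18, 0), time(7, 0)

print(should_alert("person", (4.0, 3.0), time(23, 30), parking_lot, closed_from, closed_until))   # True
print(should_alert("person", (12.0, 3.0), time(23, 30), parking_lot, closed_from, closed_until))  # False
print(should_alert("person", (4.0, 3.0), time(12, 0), parking_lot, closed_from, closed_until))    # False
```

The second call corresponds to someone on the sidewalk outside the lot, and the third to someone in the lot during business hours, so neither breaks the rule.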
See also
* Activity recognition
* Artificial intelligence for video surveillance
* Forensic video analysis
* Object co-segmentation
* Structure from motion
* Video browsing
* Video motion analysis
* Video processing