Visual temporal attention is a special case of visual attention that involves directing attention to a specific instant of time. Like its spatial counterpart, visual spatial attention, temporal attention modules have been widely implemented for video analytics in computer vision, where they improve performance and provide human-interpretable explanations of deep learning models. Just as a visual spatial attention mechanism allows human or computer vision systems to focus on the semantically more important regions in space, a visual temporal attention module enables machine learning algorithms to emphasize the critical video frames in video analytics tasks such as human action recognition. In convolutional neural network-based systems, the prioritization introduced by the attention mechanism is typically implemented as a linear weighting layer whose parameters are learned from labeled training data.
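The linear weighting described above can be illustrated with a minimal sketch. It assumes that each frame is already represented by a feature vector, that a single learned vector scores each frame, and that the scores are normalized with a softmax. The softmax step and the names `temporal_attention_pool` and `scoring_weights` are assumptions made for illustration and are not taken from any particular system.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention_pool(frame_features, scoring_weights):
    """Pool T per-frame feature vectors into one video-level vector.

    frame_features: list of T feature vectors, each a list of D floats.
    scoring_weights: hypothetical learned vector of D floats (assumption).
    """
    # One scalar relevance score per frame: a linear layer without bias.
    scores = [sum(w * x for w, x in zip(scoring_weights, frame))
              for frame in frame_features]
    # Normalize over time so the weights are non-negative and sum to 1.
    attention = softmax(scores)
    dim = len(frame_features[0])
    # Attention-weighted average of the frame features.
    pooled = [sum(a * frame[d] for a, frame in zip(attention, frame_features))
              for d in range(dim)]
    return pooled, attention

# Toy input: three frames with two-dimensional features (made-up values).
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
weights = [1.0, 1.0]
pooled, attention = temporal_attention_pool(frames, weights)
```

In this toy input the second frame receives the largest score, so it dominates the pooled representation. The attention weights can also be inspected directly, which is one way such modules lend themselves to human-interpretable explanations.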


Application in Action Recognition

Recent video segmentation algorithms often exploit both spatial and temporal attention mechanisms. Research in human action recognition has accelerated significantly since the introduction of powerful tools such as convolutional neural networks (CNNs). However, effective methods for incorporating temporal information into CNNs are still being actively explored. Motivated by the popular recurrent attention models in natural language processing, the Attention-aware Temporal Weighted CNN (ATW CNN) has been proposed for videos; it embeds a visual attention model into a temporal weighted multi-stream CNN. The attention model is implemented as temporal weighting, and it effectively boosts the recognition performance of video representations. In addition, each stream in the ATW CNN framework can be trained end to end, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Experimental results show that the ATW CNN attention mechanism contributes substantially to the performance gains by focusing on the more relevant video segments, which yield more discriminative snippets.
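The idea of learning temporal weights by stochastic gradient descent can be sketched on a toy problem. This is not the ATW CNN formulation itself: it assumes fixed per-snippet scores in place of CNN outputs, a binary label with a sigmoid output, and unconstrained weights. All names and data values (`train_temporal_weights`, `videos`, and so on) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fuse(temporal_weights, snippet_scores):
    """Video-level score: weighted sum of the per-snippet scores."""
    return sum(w * s for w, s in zip(temporal_weights, snippet_scores))

def mean_loss(temporal_weights, videos):
    """Mean binary cross-entropy over the toy dataset."""
    total = 0.0
    for snippet_scores, label in videos:
        p = sigmoid(fuse(temporal_weights, snippet_scores))
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(videos)

def train_temporal_weights(videos, num_snippets, learning_rate=0.1, epochs=200):
    """Fit one weight per snippet position with per-sample SGD."""
    weights = [1.0 / num_snippets] * num_snippets  # start from uniform weighting
    for _ in range(epochs):
        for snippet_scores, label in videos:
            p = sigmoid(fuse(weights, snippet_scores))
            error = p - label  # gradient of the loss w.r.t. the fused score
            # Chain rule: d(loss)/d(w_k) = error * s_k.
            weights = [w - learning_rate * error * s
                       for w, s in zip(weights, snippet_scores)]
    return weights

# Toy data (made up): three snippet scores per video and a binary label.
# Only the middle snippet carries a signal that matches the label.
videos = [
    ([0.5, 2.0, -0.3], 1),
    ([-0.4, 1.5, 0.2], 1),
    ([0.3, -1.8, 0.4], 0),
    ([-0.2, -2.2, -0.1], 0),
]
initial_weights = [1.0 / 3] * 3
learned_weights = train_temporal_weights(videos, num_snippets=3)
```

After training, the weight of the middle snippet is the largest of the three, mirroring the behavior described above in which the learned temporal weights emphasize the more relevant segments. In a full system the snippet scores would come from CNN streams that are trained jointly with the weights.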


See also

* Attention
* Visual spatial attention
* Action Recognition
* Video content analysis
* Convolutional neural network
* Computer vision


References

* "NIPS 2017". Interpretable ML Symposium. 2017-10-20. http://interpretable.ml/. Retrieved 2018-09-12.
* Zang, Jinliang; Wang, Le; Liu, Ziyi; Zhang, Qilin; Hua, Gang; Zheng, Nanning (2018). "Attention-Based Temporal Weighted Convolutional Neural Network for Action Recognition". IFIP Advances in Information and Communication Technology. Cham: Springer International Publishing. pp. 97–108. arXiv:1803.07179. doi:10.1007/978-3-319-92007-8_9. ISBN 978-3-319-92006-1. ISSN 1868-4238. S2CID 4058889.
* Wang, Le; Zang, Jinliang; Zhang, Qilin; Niu, Zhenxing; Hua, Gang; Zheng, Nanning (2018-06-21). "Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network". Sensors. MDPI AG. 18 (7): 1979. Bibcode:2018Senso..18.1979W. doi:10.3390/s18071979. ISSN 1424-8220. PMC 6069475. PMID 29933555. https://qilin-zhang.github.io/_pages/pdfs/sensors-18-01979-Action_Recognition_by_an_Attention-Aware_Temporal_Weighted_Convolutional_Neural_Network.pdf. Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.
* Center, UCF (2013-10-17). "UCF101 - Action Recognition Data Set". CRCV. http://crcv.ucf.edu/data/UCF101.php. Retrieved 2018-09-12.