A multidimensional approach to detect action scene in video data

L. N. Abdullah, Shahrul Azman Mohd Noah, T. M T Sembok, Khairuddin Omar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

There is a need to automatically extract video content for efficient access, understanding, browsing and retrieval of videos. Detecting and interpreting human presence, actions and activities is one of the most valuable functions in the proposed framework. The general objectives of this research are to analyze and process audio-video streams toward a robust audiovisual action recognition system by integrating, structuring and accessing multimodal information via a multidimensional retrieval and extraction model. The research also presents a method to characterize, detect, identify, and abstract actions by combining low-level and high-level features. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects), audio features and video for recognizing action. The model uses HMM and GMM to provide a framework for fusing these features and to represent the multidimensional structure of the framework. Compared with using a single source, either the visual or the audio track alone, the combined audio-visual information provides more reliable performance and allows the story content of movies to be understood in more detail. Several experiments were conducted, and the results were 74% using visual features only, 65% using audio features only, and 88% using combined audiovisual features. These results show an improvement in recognition when audio and visual cues are combined.

Original language: English
Title of host publication: Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008
Pages: 207-212
Number of pages: 6
Publication status: Published - 2008
Event: 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008 - Langkawi
Duration: 2 Apr 2008 to 4 Apr 2008

Other

Other: 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008
City: Langkawi
Period: 2/4/08 to 4/4/08

Keywords

  • Action recognition
  • Gaussian mixture model
  • Hidden Markov model
  • Multidimensional structure
  • Multimodal information

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Abdullah, L. N., Mohd Noah, S. A., Sembok, T. M. T., & Omar, K. (2008). A multidimensional approach to detect action scene in video data. In Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008 (pp. 207-212).

A multidimensional approach to detect action scene in video data. / Abdullah, L. N.; Mohd Noah, Shahrul Azman; Sembok, T. M T; Omar, Khairuddin.

Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008. 2008. p. 207-212.

Abdullah, LN, Mohd Noah, SA, Sembok, TMT & Omar, K 2008, A multidimensional approach to detect action scene in video data. in Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008. pp. 207-212, 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008, Langkawi, 2/4/08.
Abdullah LN, Mohd Noah SA, Sembok TMT, Omar K. A multidimensional approach to detect action scene in video data. In Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008. 2008. p. 207-212.
Abdullah, L. N. ; Mohd Noah, Shahrul Azman ; Sembok, T. M T ; Omar, Khairuddin. / A multidimensional approach to detect action scene in video data. Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008. 2008. pp. 207-212.
@inproceedings{d7cba878be4d43609ed6cbf4c177c9f7,
title = "A multidimensional approach to detect action scene in video data",
abstract = "There is a need to automatically extract video content for efficient access, understanding, browsing and retrieval of videos. Detecting and interpreting human presence, actions and activities is one of the most valuable functions in the proposed framework. The general objectives of this research are to analyze and process audio-video streams toward a robust audiovisual action recognition system by integrating, structuring and accessing multimodal information via a multidimensional retrieval and extraction model. The research also presents a method to characterize, detect, identify, and abstract actions by combining low-level and high-level features. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects), audio features and video for recognizing action. The model uses HMM and GMM to provide a framework for fusing these features and to represent the multidimensional structure of the framework. Compared with using a single source, either the visual or the audio track alone, the combined audio-visual information provides more reliable performance and allows the story content of movies to be understood in more detail. Several experiments were conducted, and the results were 74{\%} using visual features only, 65{\%} using audio features only, and 88{\%} using combined audiovisual features. These results show an improvement in recognition when audio and visual cues are combined.",
keywords = "Action recognition, Gaussian mixture model, Hidden Markov model, Multidimensional structure, Multimodal information",
author = "Abdullah, {L. N.} and {Mohd Noah}, {Shahrul Azman} and Sembok, {T. M T} and Omar, Khairuddin",
year = "2008",
language = "English",
isbn = "9780889867307",
pages = "207--212",
booktitle = "Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008",

}

TY - GEN

T1 - A multidimensional approach to detect action scene in video data

AU - Abdullah, L. N.

AU - Mohd Noah, Shahrul Azman

AU - Sembok, T. M T

AU - Omar, Khairuddin

PY - 2008

Y1 - 2008

AB - There is a need to automatically extract video content for efficient access, understanding, browsing and retrieval of videos. Detecting and interpreting human presence, actions and activities is one of the most valuable functions in the proposed framework. The general objectives of this research are to analyze and process audio-video streams toward a robust audiovisual action recognition system by integrating, structuring and accessing multimodal information via a multidimensional retrieval and extraction model. The research also presents a method to characterize, detect, identify, and abstract actions by combining low-level and high-level features. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects), audio features and video for recognizing action. The model uses HMM and GMM to provide a framework for fusing these features and to represent the multidimensional structure of the framework. Compared with using a single source, either the visual or the audio track alone, the combined audio-visual information provides more reliable performance and allows the story content of movies to be understood in more detail. Several experiments were conducted, and the results were 74% using visual features only, 65% using audio features only, and 88% using combined audiovisual features. These results show an improvement in recognition when audio and visual cues are combined.

KW - Action recognition

KW - Gaussian mixture model

KW - Hidden Markov model

KW - Multidimensional structure

KW - Multimodal information

UR - http://www.scopus.com/inward/record.url?scp=62649088897&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=62649088897&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:62649088897

SN - 9780889867307

SP - 207

EP - 212

BT - Proceedings of the 4th IASTED International Conference on Advances in Computer Science and Technology, ACST 2008

ER -