Hybrid handcrafted and learned feature framework for human action recognition

Chaolong Zhang, Yuanping Xu, Zhijie Xu, Jian Huang, Jun Lu

Research output: Contribution to journal > Article > peer-review

15 Citations (Scopus)


Recognising human actions in video is a challenging task in real-world settings. Dense trajectory (DT) models offer an accurate record of motion over time that is rich in dynamic information. However, DT models lack a mechanism for distinguishing dominant motions from secondary ones across separable frequency bands and directions. By contrast, deep learning-based methods show promise on this challenge but still suffer from limited capacity to handle complex temporal information, not to mention the huge datasets needed to guide training. To take advantage of semantically meaningful, "handcrafted" video features obtained through feature engineering, this study integrates the discrete wavelet transform (DWT) into the DT model to obtain more descriptive human action features. By exploring pre-trained dual-stream CNN-RNN models, learned features can be integrated with the handcrafted ones to satisfy stringent analytical requirements within the spatial-temporal domain. This hybrid feature framework generates efficient Fisher Vectors through a novel Bag of Temporal Features scheme and is capable of encoding video events whilst speeding up action recognition for real-world applications. Evaluation of the design has shown superior recognition performance over existing benchmark systems. It has also demonstrated promising applicability and extensibility for solving challenging real-world human action recognition problems.
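
As a rough illustration of how a wavelet transform can separate frequency bands and directions in trajectory motion, the sketch below applies a one-level Haar DWT to the per-frame displacements of a single tracked point. It is not the authors' implementation: the abstract does not specify the wavelet, the decomposition depth, or the descriptor layout, so the Haar wavelet, the function names `haar_dwt` and `trajectory_band_energies`, and the energy summary are all assumptions made for illustration.

```python
import numpy as np

def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        # Pad odd-length input by repeating the last sample (an assumption).
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)  # low-frequency band
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)  # high-frequency band
    return approx, detail

def trajectory_band_energies(points):
    """Summarise one trajectory by per-direction, per-band energy.

    points: (T, 2) array of tracked (x, y) positions, one row per frame.
    Returns {"x": (low_energy, high_energy), "y": (low_energy, high_energy)}.
    This descriptor is hypothetical, not the one used in the paper.
    """
    disp = np.diff(np.asarray(points, dtype=float), axis=0)  # per-frame displacement
    energies = {}
    for name, axis in (("x", 0), ("y", 1)):
        approx, detail = haar_dwt(disp[:, axis])
        energies[name] = (float(np.sum(approx ** 2)), float(np.sum(detail ** 2)))
    return energies

# A point moving steadily to the right: all energy falls in the low band of x.
steady = np.column_stack([np.arange(9.0), np.zeros(9)])
result = trajectory_band_energies(steady)
```

In this toy case the smooth horizontal motion concentrates in the low-frequency band of the x direction, while jittery motion would shift energy into the detail band; this is the kind of dominant-versus-secondary separation the abstract attributes to the DWT step.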

Original language: English
Pages (from-to): 12771-12787
Number of pages: 17
Journal: Applied Intelligence
Issue number: 11
Early online date: 12 Feb 2022
Publication status: Published - 1 Sep 2022

