Abstract
Accurately predicting future activities in egocentric (first-person) videos is a challenging yet essential task, requiring robust object recognition and reliable forecasting of action patterns. However, the few observable frames in such videos often lack critical semantic context, making long-term predictions particularly difficult. Traditional approaches, especially those based on recurrent neural networks, tend to suffer from cumulative error propagation over extended time steps, which degrades performance. To address these challenges, this paper introduces a novel framework, Virtual Frame-Augmented Guided Forecasting (VFGF), designed specifically for long-term egocentric activity prediction. The VFGF framework enhances semantic continuity by generating virtual frames and incorporating them into the observable sequence. These synthetic frames fill the temporal and contextual gaps caused by rapid changes in activity or environmental conditions. In addition, we propose a Feature Guidance Module that integrates anticipated activity-relevant features into the recursive prediction process, guiding the model toward more accurate and contextually coherent inferences. Extensive experiments on the EPIC-Kitchens dataset demonstrate that VFGF, with its interpolation-based temporal smoothing and feature-guided strategies, significantly improves long-term activity prediction accuracy. Specifically, VFGF achieves a state-of-the-art Top-5 accuracy of 44.11% at a 0.25 s prediction horizon. Moreover, it maintains competitive performance across a range of long-term forecasting intervals, highlighting its robustness and establishing a strong foundation for future research in egocentric activity prediction.
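The idea of inserting virtual frames into the observed sequence can be illustrated with a minimal sketch. The abstract does not specify how VFGF generates its virtual frames, so the code below assumes plain linear interpolation between consecutive per-frame feature vectors; the function name `insert_virtual_frames` and the parameter `per_gap` are hypothetical and are not taken from the paper.

```python
def insert_virtual_frames(features, per_gap=1):
    """Return a new sequence with `per_gap` interpolated vectors between
    each pair of consecutive observed feature vectors.

    Assumption: linear interpolation stands in for the paper's (unspecified)
    virtual-frame generator. `features` is a list of equal-length float lists.
    """
    if not features:
        return []
    augmented = [list(features[0])]
    for prev, nxt in zip(features, features[1:]):
        for k in range(1, per_gap + 1):
            t = k / (per_gap + 1)  # fractional position between prev and nxt
            augmented.append([(1 - t) * a + t * b for a, b in zip(prev, nxt)])
        augmented.append(list(nxt))  # keep the real observed frame
    return augmented


observed = [[0.0, 2.0], [2.0, 4.0]]
augmented = insert_virtual_frames(observed, per_gap=1)
```

With `n` observed frames, the augmented sequence has `n + (n - 1) * per_gap` entries, and the original frames are preserved in order. A learned generator could replace the interpolation step without changing this interface.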
| Original language | English (US) |
|---|---|
| Article number | 5644 |
| Journal | Sensors |
| Volume | 25 |
| Issue number | 18 |
| DOIs | |
| State | Published - Sep 2025 |
All Science Journal Classification (ASJC) codes
- Analytical Chemistry
- Information Systems
- Atomic and Molecular Physics, and Optics
- Biochemistry
- Instrumentation
- Electrical and Electronic Engineering
Title
VFGF: Virtual Frame-Augmented Guided Prediction Framework for Long-Term Egocentric Activity Forecasting