*: Equal contribution.
✉: Corresponding author.
Beijing Sport University
Accurate perception and analysis of human motion form the cornerstone of multimodal computing, sports medicine, and rehabilitation engineering. However, existing datasets are often limited to a single sensing modality or to specific simple actions, and lack comprehensive data that concurrently integrate kinematic, kinetic, and physiological information. In this work, we present MM-Motion, a multimodal dataset featuring 16 standardized actions performed by 33 subjects. The dataset comprises 933 IMU data files, 1,866 Kinect video files including over 260,000 RGB-D image pairs, 918 pressure insole data streams, and fundamental physiological metadata. Through hardware synchronization and data preprocessing, the dataset achieves high-precision temporal alignment at the decimillisecond level. All video sequences were evaluated by experts using a multi-dimensional joint-scoring scale to obtain fine-grained action quality scores. MM-Motion provides high-quality data support for personalized assessment and intervention in intelligent sports medicine. We establish a multi-task benchmark framework encompassing action recognition, quality assessment, and injury risk stratification, and use it to validate the high fidelity of the MM-Motion dataset. Furthermore, the experimental results demonstrate that multimodal fusion of kinematics and kinetics significantly improves the accuracy of ankle sprain risk prediction, underscoring the necessity of collecting diverse sensory data for complex physiological analysis.
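For readers who want to pair samples across modalities recorded at different rates, the following is a minimal sketch of nearest-timestamp matching. It is not the authors' preprocessing pipeline; the function name `nearest_indices`, the integer timestamp unit, and the tolerance values are assumptions made only for illustration.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def nearest_indices(
    ref_ts: Sequence[int],
    other_ts: Sequence[int],
    tolerance: int,
) -> List[Optional[int]]:
    """For each reference timestamp, return the index of the nearest
    sample in other_ts, or None if the gap exceeds the tolerance.

    Both sequences are assumed to be sorted in ascending order and to
    share one clock and one unit (hypothetically, 0.1 ms ticks).
    """
    if not other_ts:
        return [None] * len(ref_ts)
    matches: List[Optional[int]] = []
    for t in ref_ts:
        i = bisect_left(other_ts, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_ts)]
        best = min(candidates, key=lambda j: abs(other_ts[j] - t))
        matches.append(best if abs(other_ts[best] - t) <= tolerance else None)
    return matches


# Hypothetical example: a slower stream matched against a faster one.
slow = [0, 330, 660]
fast = [0, 100, 200, 300, 400, 500, 600, 700]
print(nearest_indices(slow, fast, tolerance=50))
```

Returning `None` for unmatched samples keeps gaps explicit rather than silently pairing frames that are too far apart in time.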
Data capture in MM-Motion. (a) Scene layout: three experimenters are involved; one controls the IMU and master-Kinect, one controls the sub-Kinect and pressure insole, and one demonstrates the movements. (b) Pose demonstration: movements being performed for data collection. (c) Data overview: the four modalities of captured data.
The figures above compare RGB images and IMU visualizations from six subjects (subject34 pose05, subject06 pose09, subject41 pose16, subject37 pose01, subject21 pose04, subject17 pose10), covering movements including single-leg jumping, deep squat, hurdle step, and straight lunge.
Below is a visual analysis system for three types of data: plantar pressure, IMU, and Kinect. A unified timeline control drives the time‑synchronized visualization of all three. The plantar pressure view shows a heatmap together with the center of pressure (CoP) trajectory. The IMU and Kinect views provide time‑series plots and rose plots for in‑depth analysis.
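As background for the CoP trajectory, the sketch below computes the center of pressure in the standard way, as the pressure-weighted mean of the sensor positions in each frame. The sensor layout, units, and function names here are hypothetical and do not describe the actual insole format used in MM-Motion.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def center_of_pressure(
    pressures: Sequence[float],
    positions: Sequence[Point],
) -> Optional[Point]:
    """Pressure-weighted mean of sensor positions for one frame.

    Returns None when the total pressure is zero (for example, when the
    foot is off the ground), because the CoP is undefined in that case.
    """
    if len(pressures) != len(positions):
        raise ValueError("pressures and positions must have the same length")
    total = sum(pressures)
    if total <= 0:
        return None
    x = sum(p * pos[0] for p, pos in zip(pressures, positions)) / total
    y = sum(p * pos[1] for p, pos in zip(pressures, positions)) / total
    return (x, y)


def cop_trajectory(
    frames: Sequence[Sequence[float]],
    positions: Sequence[Point],
) -> List[Optional[Point]]:
    """Apply center_of_pressure to every frame of one insole stream."""
    return [center_of_pressure(frame, positions) for frame in frames]


# Hypothetical 2x2 sensor layout (x, y), used only for illustration.
layout = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.0, 2.0)]
frames = [
    [1.0, 1.0, 1.0, 1.0],  # even load
    [0.0, 0.0, 2.0, 2.0],  # load shifted to the y = 2 row
    [0.0, 0.0, 0.0, 0.0],  # no contact
]
print(cop_trajectory(frames, layout))
```

Plotting the successive non-`None` points over the pressure heatmap yields a trajectory of the kind shown in the plantar pressure view.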
To download the MM-Motion dataset, please fill in the agreement and email Jiaqi Wu <2025240795@bsu.edu.cn> or Jianwei Li <jianwei@bsu.edu.cn> to request the download link. A demo of the MM-Motion dataset is also available at https://zenodo.org/records/19283407.
This work was supported by the Beijing Natural Science Foundation (No. 4262068) and the Innovation Project of Sports Medicine Science and Technology of the General Administration of Sport of China (General Project No. 2 of 2025). We sincerely thank Qian Kang and Chen Ziyao for their assistance during data collection.
@inproceedings{Wu2026MMMotion,
title={MM-Motion: A Multimodal Human Motion Dataset Supporting Action Understanding and Injury Risk Evaluation},
author={Wu, Jiaqi and Li, Jianwei and Ding, Ruiqi and Wang, Sixuan and Ran, Kehao and Cao, Rui},
booktitle={Proceedings of the 2026 ACM Multimedia Conference},
series={MM 2026},
year={2026},
pages={To Appear},
organization={ACM},
note={To appear}
}