| Issue | Natl Sci Open, Volume 5, Number 3, 2026 |
|---|---|
| Article Number | 20260016 |
| Number of page(s) | 39 |
| Section | Information Sciences |
| DOI | https://doi.org/10.1360/nso/20260016 |
| Published online | 29 April 2026 |
REVIEW
A survey on edge multimodal large models: Compression, inference acceleration, and applications
College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
* Corresponding authors: Yuanchao Shu; Jiming Chen
Received: 31 January 2026
Revised: 19 April 2026
Accepted: 21 April 2026
Abstract
Deploying multimodal large language models (MLLMs) at the network edge is critical for enabling low-latency, privacy-preserving multimodal intelligence. However, the substantial computational and memory demands of MLLMs present significant challenges for deployment on heterogeneous and resource-constrained edge devices. This survey systematically reviews existing approaches aimed at addressing these challenges. We categorize the literature along two complementary dimensions: model-level compression, which focuses on efficient architectural design and parameter reduction, and system-level inference acceleration, which emphasizes runtime optimizations such as scheduling and resource management. In addition, the survey examines the practical applications of edge-deployed MLLMs in domains such as cyber intelligence and embodied intelligence, and discusses emerging research directions, including edge-native model architectures, to further improve the trade-off between intelligence capability and resource efficiency.
Key words: edge computing / multimodal large language models / model compression / inference optimization / cyber intelligence / embodied intelligence
© The Author(s) 2026. Published by Science Press and EDP Sciences.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
