📄 Download IEEE MLMC 2026 Technical Program (PDF)

MLMC 2026 Technical Program

📅 July 27, 2026 (Monday)

14:00 – 20:00	Registration; Conference Lobby, Wuyishan Taiwei Hotel
18:30 – 20:30	Welcome Reception

📅 July 28, 2026 (Tuesday)

08:30 – 08:45	Arrival & Seating
08:45 – 09:00	Opening Ceremony + Group Photo; Welcome address by conference chairs
09:00 – 09:40	Title: From Perception to Principles: Quality Assessment for AI-Generated Visual Contents; Keynote speaker: Chang Wen Chen; Host: Hanli Wang; International Conference Hall, 3F
09:40 – 10:20	Title: Physics-Informed Generative Image Restoration: From Data to Models; Keynote speaker: Chia-Wen Lin; Host: Tiesong Zhao; International Conference Hall, 3F
10:20 – 10:40	Coffee Break
10:40 – 11:20	Title: Human Sustainability in AI Scene: Toward QoAI Experience; Keynote speaker: Patrick Le Callet; Host: Wei Gao; International Conference Hall, 3F
11:20 – 12:00	Title: Robust Multimodal Visual Cognition: From Precise Perception Engines to Interactive Reflective Agents; Keynote speaker: Zechao Li; Host: Ran Wang; International Conference Hall, 3F
12:00 – 14:00	Lunch Break
14:00 – 15:30	Oral Session 1: 3D Vision & Generation (9 papers, 10 min each); Conference Room 3, 3F; Session Chairs: Xuelin Shen and Jielian Lin
15:30 – 15:50	Coffee Break
15:50 – 17:20	Oral Session 2: ML Theory & Applications (9 papers, 10 min each); Conference Room 3, 3F; Session Chairs: Zhongyuan Guo and Liqun Lin
08:30 – 17:20	Poster Session (all day, parallel with other sessions); Corridor, 3F (outside the International Conference Hall); Session Chairs: Wenhui Wu and Hua Li
18:00 – 20:00	Dinner

📅 July 29, 2026 (Wednesday)

08:30 – 10:00	Oral Session 3: Image/Video Proc. & Enhancement (9 papers, 10 min each); Conference Room 3, 3F; Session Chairs: Yue Liu and Lingyu Zhu
10:00 – 10:20	Coffee Break
10:20 – 11:50	Oral Session 4: Visual Recognition & Understanding (9 papers, 10 min each); Conference Room 3, 3F; Session Chairs: Yifan Huang and Nanfeng Jiang
12:00 – 14:00	Lunch Break
14:00 – 17:30	Internal Meeting
19:00 – 21:00	Banquet

📅 July 30, 2026 (Thursday)

All day	Departure / Check-out

🎤 Oral Presenter Guidelines

In-Person Presentation
1. Please arrive at the presentation room 5 minutes before the session.
2. A computer will be provided; bring your presentation on a USB drive.
3. Each paper is allotted 10 minutes (including Q&A).

📌 Poster Presenter Guidelines

In-Person Presentation
1. Arrive at the poster area 5 minutes before the session.
2. A poster board will be provided. Aspect ratio: 1:2.5 (recommended size: 80 cm wide × 200 cm high).
3. Posters are displayed throughout July 28, in parallel with all sessions.

Keynote Speakers

Chang Wen Chen (陈长汶)

Chair Professor of Visual Computing, Interim Dean of the Faculty of Computer and Mathematical Sciences, The Hong Kong Polytechnic University, IEEE Fellow

🎤 From Perception to Principles: Quality Assessment for AI-Generated Visual Contents

Abstract: The rapid advancement of generative AI (GenAI) has transformed the creation of images and videos, enabling unprecedented levels of visual realism, diversity, and controllability. Yet one pressing issue is how to properly assess the quality of AI-generated visual content, as such new types of content pose significant challenges that extend well beyond the scope of traditional image and video quality assessment. Whereas conventional approaches have largely focused on perceptual distortions such as blur, noise, compression artifacts, and transmission degradation, assessing generated content calls for a broader evaluation framework, one that has to consider semantic fidelity, temporal coherence, physical plausibility, aesthetic quality, and exact alignment with human expectations and prompts. This keynote will examine the transition from traditional perception-driven quality assessment to a new paradigm for AI-generated visual content, in which human perceptual experience needs to be integrated with a principled understanding of semantics, motion, and the physical world. We shall highlight some recent efforts toward interpretable and comprehensive evaluation frameworks that assess not only visual fidelity, but also motion realism, multidimensional quality, relational semantic consistency, and physical plausibility. This talk will also discuss potential future directions for next-generation GenAI quality assessment, emphasizing the critical intersection of perception and principles as the foundation for designing trustworthy generative visual systems.

Biography: Prof. Chang Wen Chen is currently Chair Professor of Visual Computing at The Hong Kong Polytechnic University. Before his current position, he served as Dean of the School of Science and Engineering at The Chinese University of Hong Kong, Shenzhen, from 2017 to 2020, and concurrently as Deputy Director at Peng Cheng Laboratory from 2018 to 2021. Previously, he was an Empire Innovation Professor at the State University of New York at Buffalo (SUNY) from 2008 to 2021 and the Allan Henry Endowed Chair Professor at the Florida Institute of Technology from 2003 to 2007. He received his BS degree from the University of Science and Technology of China in 1983, his MS degree from the University of Southern California in 1986, and his PhD degree from the University of Illinois at Urbana-Champaign (UIUC) in 1992. He is currently Deputy Editor-in-Chief for IEEE Trans. Image Processing. He has also served as Editor-in-Chief for IEEE Trans. Multimedia (2014-2016) and for IEEE Trans. Circuits and Systems for Video Technology (2006-2009). Over a professional career spanning several decades, he has received many professional achievement awards, including eleven (11) Best Paper Awards or Best Student Paper Awards, the prestigious Alexander von Humboldt Award in 2010, the SUNY Chancellor’s Award for Excellence in Scholarship and Creative Activities in 2016, the UIUC ECE Distinguished Alumni Award in 2019, and the ACM SIGMM Outstanding Technical Achievement Award in 2024. He is an IEEE Fellow, a SPIE Fellow, and a Member of Academia Europaea.

Chia-Wen Lin (林嘉文)

Distinguished Professor, Department of Electrical Engineering, National Tsing Hua University, Deputy Director of NTHU AI Research Center, IEEE Fellow

🎤 Physics-Informed Generative Image Restoration: From Data to Models

Abstract: Images captured in real-world environments often suffer from diverse degradation patterns governed by underlying physical processes, such as motion blur, haze, rain, and low illumination, which lead to undesired contrast loss and appearance distortions. With the rapid development of deep generative image models, numerous image restoration methods have been proposed to effectively alleviate these degradations. As most deep restoration approaches rely on supervised learning, their performance strongly depends on the diversity and representativeness of the training data. However, for many real-world degradations—such as rain, haze, and motion blur—collecting paired degraded and clean images is expensive and often impractical, severely limiting the coverage of available training datasets. These data acquisition challenges restrict the effectiveness and generalization ability of existing restoration models. In this talk, we will present how physics-based models can be leveraged both to enrich training datasets with realistic degradation distributions and to be integrated into diffusion-based restoration frameworks. By incorporating physics-informed priors, the performance of generative restoration models can be substantially enhanced. Representative results will be demonstrated on image dehazing and deblurring tasks.

Biography: Prof. Chia-Wen Lin is currently a Distinguished Professor with the Department of Electrical Engineering, National Tsing Hua University (NTHU), Taiwan. He also serves as Deputy Director of NTHU AI Research Center. His research interests include image/video processing, computer vision, and video networking. Dr. Lin is an IEEE Fellow and has served on the IEEE Circuits and Systems Society (CASS) Fellow Evaluation Committee (2021–2023) and as a CASS BoG Member-at-Large (2022–2024). He was Steering Committee Chair of IEEE ICME (2020–2021), IEEE CASS Distinguished Lecturer (2018–2019), APSIPA Distinguished Lecturer (2023–2024), and President of the Chinese Image Processing and Pattern Recognition (IPPR) Association, Taiwan (2019–2020). He is currently Associate Editor-in-Chief for IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), and has served as Associate Editor of IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, IEEE TCSVT, and IEEE Multimedia. He served as TPC Chair of IEEE ICME in 2010, IEEE ICIP in 2019, and PCS in 2022, and as Conference Chair of IEEE VCIP in 2018 and PCS in 2024.

Patrick Le Callet

Full Professor, Polytech Nantes / Nantes Université, Senior Member of the Institut Universitaire de France, IEEE Fellow

🎤 Human Sustainability in AI Scene: Toward QoAI Experience

Abstract: This talk examines how AI is progressively challenging human sustainability. We use Quality of Experience (QoE) as an interpretative lens to highlight a new paradigm in AI evaluation. For an aging society, AI can be seen as a technological solution. In that context, I first trace the evolution from imitation learning to foundation models, and then to world models and embodied AI, where human behavior is increasingly abstracted into data, patterns, and simulated environments. Beyond human sustainability in that context, I also illustrate what QoE sciences can bring to Machine Experience. Second, I present the emergence of a token-based cognitive economy, and question its resource debt and the possible QoE metrics to mitigate it. Then I present the notion of AI psychological debt. AI does not only optimize tasks; it also reconfigures the structure of human experience and cognitive participation. Building on this, I propose QoAI (Quality of AI Experience) as an extended framework for AI systems evaluation. QoAI aims to assess not only efficiency and accuracy, but also human agency, cognitive effort, dependency formation, attentional integrity, and long-term cognitive resilience. The QoAI Experience Dashboard operationalizes these dimensions to make visible the hidden experiential costs of AI systems. Ultimately, the question is no longer only how well AI performs, but what kind of human cognition it sustains or erodes over time.

Biography: Prof. Patrick Le Callet is Full Professor at Polytech Nantes / Nantes Université and Senior Member of the Institut Universitaire de France. He leads interdisciplinary research on human perception, media processing, and cognitive computing. His work focuses on sustainable visual communication, quality of experience, and cognitive computing. A former scientific director of the "Ouest Industries Créatives" cluster, he has authored over 500 publications and 16 patents. Active on international editorial boards and in standardization bodies (VQEG, IEEE-SA), he is a co-recipient of a Technology & Engineering Emmy Award in 2020 for his contributions to perceptual metrics in video encoding.

Zechao Li (李泽超)

Professor and Dean, School of Computer Science and Engineering / School of Artificial Intelligence / School of Software, Nanjing University of Science and Technology

🎤 Robust Multimodal Visual Cognition: From Precise Perception Engines to Interactive Reflective Agents

Abstract: Building robust multimodal AI hinges on elevating static low-level perception into dynamic reasoning systems equipped with structured cognition and self-reflection. This talk summarizes our series of studies along this trajectory. First, to build a precise visual perception engine, we propose CTNet to resolve pixel-level ambiguity in complex scenes. Furthermore, through Singular Value Fine-tuning (SVF) and the VRP-SAM framework, we achieve open-world generalization with minimal parameter cost and visual reference prompts, establishing a powerful "visual specialist" foundation. However, Multimodal Large Language Models (MLLMs) frequently suffer from hallucinations due to the lack of precise descriptions of fine-grained visual attributes and object relations. To address this, we introduce the EDC framework, which leverages the aforementioned visual specialists to finely extract target attributes and transform them into high-quality image-text descriptions, significantly enhancing the visual cognition of MLLMs. Finally, targeting the pain point that large models often reason de novo and repeatedly make the same mistakes, we develop ViLoMem, a dual-stream memory framework that separately encodes logical reasoning errors and visual perception traps. This research achieves a technical leap from precise perception engines to the elimination of multimodal cognitive hallucinations, ultimately evolving into interactive reflective agents equipped with semantic memory and continual learning capabilities.

Biography: Prof. Zechao Li is a Professor and the Dean of the School of Computer Science and Engineering / School of Artificial Intelligence / School of Software at Nanjing University of Science and Technology. His research interests primarily focus on multimodal intelligent analysis and computer vision. He has led several prestigious national grants, including the National Science Fund for Distinguished Young Scholars, the National Science and Technology Major Project for New Generation AI, and Key Projects of the NSFC Joint Fund. Selected as a Young Top-Notch Talent of the National "Ten Thousand Talents Program," he has published over 60 papers in top-tier journals and conferences, such as IEEE TPAMI, IJCV, and CCF Rank-A venues. His major accolades include the First Prize of the Jiangsu Provincial Science and Technology Progress Award (2024, 1st contributor), the First Prize of the Natural Science Award from the Chinese Institute of Electronics (2022, 2nd contributor), and the First Prizes of the Jiangsu Provincial Science and Technology Award (2020 as 2nd contributor, and 2017 as 3rd contributor). Additionally, he received the Best Paper Awards at ACM MM Asia in both 2020 and 2024. Professor Li currently serves as an Associate Editor for renowned journals including IEEE TPAMI, IEEE TCSVT, IEEE TMM, and Pattern Recognition (PR), and previously served on the editorial boards of IEEE TNNLS and Information Sciences.

Download the full technical program by clicking the button at the top of this page.