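Talk title: Towards Multimodal Learning
Speaker: Dr. Wei Zhang (张薇), Associate Professor
Affiliation: University of Adelaide, Australia
Time: December 10, 2025, 10:00 a.m. to 12:00 noon
Venue: Lecture Hall 201, School of Mathematics and Statistics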
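Speaker bio: Dr. Wei Zhang is an Associate Professor in the School of Computer and Mathematical Sciences at the University of Adelaide, Australia, a senior member of the Australian Institute for Machine Learning, and an Australian Research Council early-career fellow. Her research spans natural language processing, multimodal learning, and distributed computing, with a focus on the evaluation of multimodal generation, knowledge extraction, and trustworthy AI. She has published more than 140 academic papers, which have been cited over 4,200 times. Her survey paper in ACM Transactions on Intelligent Systems and Technology was selected as one of the journal's five most outstanding papers since its founding. Dr. Zhang has served as an area chair and program committee member for leading conferences including ACML, ADMA, KDD, IJCAI, WWW, and ACL, and as a guest editor of special issues for the international journals Personal and Ubiquitous Computing, Future Internet, Natural Language Processing Journal, and Frontiers in Oral Health. Her honors include the SA Young Tall Poppy Award (South Australia, 2024), the Women of Colour in STEM Award (Australia), and the Barbara Kidman Women's Fellowship from the University of Adelaide.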
Abstract: Multimodal learning aims to build AI systems that integrate and reason over multiple data types, such as vision and language. While promising, current models often fail to genuinely fuse modalities and over-rely on text, a problem known as modality collapse. This talk diagnoses this issue in vision-language models, presents solutions for real-world challenges in medical imaging and federated learning, and introduces new benchmarks and methods for robust evaluation and schema-guided reasoning, advancing toward more trustworthy multimodal AI.