
School of Information Science and Engineering — Academic Lecture Notice: Model-Free Finite Horizon H-Infinity Control: An Off-Policy Approach with Double Minimax Q-learning

Source: School of Information Science and Engineering, 2025-08-25 12:14

Lecture Title: Model-Free Finite Horizon H-Infinity Control: An Off-Policy Approach with Double Minimax Q-learning

Speaker: Wen Yu (余文)

Time: 09:30-10:30, August 28, 2025

Venue: Room 228, Weixue Building (惟學樓), Lianhua Street Campus (蓮花街校區)

About the Speaker:

Wen Yu (余文) is a member of the Mexican Academy of Sciences, a full professor at the Instituto Politécnico Nacional, Mexico, and is ranked among the world's top 2% most highly cited scientists. He received his B.S. degree in Automation from Tsinghua University in 1990, and his M.S. and Ph.D. degrees in Automatic Control from Northeastern University in 1992 and 1995, respectively. From 1995 to 1996, he was a lecturer in the Department of Automatic Control at Northeastern University. Since 1996, he has been with the Instituto Politécnico Nacional. From 2002 to 2003, he held a research position at the Mexican Petroleum Institute. From 2006 to 2007, he was a senior visiting research fellow at Queen's University Belfast, UK; from 2009 to 2010, he was a visiting associate professor at the University of California, Santa Cruz, USA. Since 2006, he has also served as a visiting professor at Northeastern University.

He has published more than 500 academic papers, including over 200 journal papers, and has authored 8 monographs. He has supervised 38 doctoral theses and 40 master's theses. According to Google Scholar, his work has been cited more than 12,000 times, with an h-index of 52. He served as General Chair of the IEEE flagship conference SSCI 2023, and has served as an associate editor of journals including IEEE Transactions on Cybernetics, IEEE Transactions on Neural Networks and Learning Systems, Neurocomputing, Scientific Reports, and Intelligence & Robotics.

Abstract:

Finite horizon H-infinity control is essential for robust system design, particularly when guaranteed system performance is required over a specific time interval. Despite offering practical benefits over its infinite horizon counterpart, these model-based frameworks present complexities, notably the time-varying nature of the Difference Riccati Equation (DRE), which significantly complicates solutions for systems with unknown dynamics. This paper proposes a novel model-free method by leveraging off-policy reinforcement learning (RL), known for its superior data efficiency and flexibility compared to the traditional on-policy methods prevalent in the model-free H-infinity control literature. Recognizing the unique challenges of off-policy RL within the inherent minimax optimization problem of H-infinity control, we propose the Neural Network-based Double Minimax Q-learning (NN-DMQ) algorithm. This algorithm is specifically designed to handle the adversarial interaction between the controller and the worst-case disturbance, while also mitigating the bias introduced by Q-value overestimation, which can destabilize learning. A key theoretical contribution of this work is a rigorous convergence proof of the proposed Double Minimax Q-learning (DMQ) algorithm. This proof provides strong guarantees for the algorithm's stability and its capability to learn the optimal finite-horizon robust control and worst-case disturbance policies. Extensive experiments were performed to verify the effectiveness and robustness of our approach, demonstrating its applicability to challenging real-world control problems with unknown dynamics.
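The NN-DMQ algorithm itself is not reproduced in this notice. Purely as an illustrative sketch, the tabular toy below combines the two ingredients the abstract names: a finite-horizon minimax Bellman backup (the controller minimizes cost while the disturbance maximizes it) and double Q-learning's decoupled selection/evaluation, which curbs overestimation bias. The dynamics, cost table, and all names here are invented for illustration and are far simpler than the neural-network setting of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy zero-sum dynamic game standing in for the H-infinity setting:
# control u minimizes the cost, disturbance w maximizes it, horizon T.
T, nS, nU, nW = 3, 2, 2, 2
cost = np.array([[[1.0, 3.0], [0.0, 2.0]],   # cost[s][u][w]
                 [[2.0, 0.0], [4.0, 1.0]]])

def step(s, u, w):
    """Deterministic toy dynamics (invented for this sketch)."""
    return (s + u + w) % nS

def backward_induction():
    """Exact time-varying minimax Q-values, used as ground truth."""
    Q = np.zeros((T + 1, nS, nU, nW))        # Q[T] = 0: terminal cost
    for k in range(T - 1, -1, -1):
        for s in range(nS):
            for u in range(nU):
                for w in range(nW):
                    s2 = step(s, u, w)
                    # min over control of the worst-case disturbance value
                    Q[k, s, u, w] = cost[s, u, w] + Q[k + 1, s2].max(axis=1).min()
    return Q

def double_minimax_q(iters=30000, alpha=0.2):
    """Off-policy double minimax Q-learning on uniformly sampled transitions."""
    QA = np.zeros((T + 1, nS, nU, nW))
    QB = np.zeros((T + 1, nS, nU, nW))
    for _ in range(iters):
        k, s = int(rng.integers(T)), int(rng.integers(nS))
        u, w = int(rng.integers(nU)), int(rng.integers(nW))  # exploratory, off-policy
        s2 = step(s, u, w)
        # Double-Q trick: one table selects the minimax pair, the other evaluates it
        Qsel, Qeval = (QA, QB) if rng.random() < 0.5 else (QB, QA)
        u2 = int(Qsel[k + 1, s2].max(axis=1).argmin())
        w2 = int(Qsel[k + 1, s2, u2].argmax())
        target = cost[s, u, w] + Qeval[k + 1, s2, u2, w2]
        Qsel[k, s, u, w] += alpha * (target - Qsel[k, s, u, w])
    return 0.5 * (QA + QB)
```

Because the toy game is deterministic and every (stage, state, action) cell is visited, the averaged tables converge to the backward-induction solution; the double-Q decoupling matters in the stochastic, function-approximation regime the talk addresses.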

All faculty and students are welcome to attend!


School of Information Science and Engineering

August 25, 2025

(Editor: Li Han)