
School of Information Science and Engineering — Academic Lecture Notice: Model-Free Finite Horizon H-Infinity Control: An Off-Policy Approach with Double Minimax Q-learning

Source: School of Information Science and Engineering, 2025-08-25 12:14

Lecture Title: Model-Free Finite Horizon H-Infinity Control: An Off-Policy Approach with Double Minimax Q-learning

Speaker: Wen Yu (余文)

Time: 09:30-10:30, August 28, 2025

Venue: Room 228, Weixue Building (惟學樓), Lianhua Street Campus (蓮花街校區)

About the Speaker:

Wen Yu (余文) is a member of the Mexican Academy of Sciences, a full professor at the Instituto Politécnico Nacional, Mexico, and is ranked among the world's top 2% most highly cited scientists. He received his B.S. degree in Automation from Tsinghua University in 1990, and his M.S. and Ph.D. degrees in Automatic Control from Northeastern University in 1992 and 1995, respectively. From 1995 to 1996, he was a lecturer in the Department of Automatic Control at Northeastern University. Since 1996, he has been with the Instituto Politécnico Nacional. From 2002 to 2003, he held a research position at the Mexican Petroleum Institute. From 2006 to 2007, he was a senior visiting research fellow at Queen's University Belfast, UK; from 2009 to 2010, he was a visiting associate professor at the University of California, Santa Cruz, USA. Since 2006, he has also served as a visiting professor at Northeastern University.

He has published more than 500 academic papers, including over 200 journal papers, and has authored 8 monographs. He has supervised 38 doctoral theses and 40 master's theses. According to Google Scholar, his work has been cited more than 12,000 times, with an h-index of 52. He served as General Chair of the IEEE flagship conference SSCI 2023, and has served as an associate editor of journals including IEEE Transactions on Cybernetics, IEEE Transactions on Neural Networks and Learning Systems, Neurocomputing, Scientific Reports, and Intelligence & Robotics.

Abstract:

Finite horizon H-infinity control is essential for robust system design, particularly when guaranteed system performance is required over a specific time interval. Despite offering practical benefits over its infinite horizon counterpart, these model-based frameworks present complexities, notably the time-varying nature of the Difference Riccati Equation (DRE), which significantly complicates solutions for systems with unknown dynamics. This paper proposes a novel model-free method by leveraging off-policy reinforcement learning (RL), known for its superior data efficiency and flexibility compared to the traditional on-policy methods prevalent in the model-free H-infinity control literature. Recognizing the unique challenges of off-policy RL within the inherent minimax optimization problem of H-infinity control, we propose the Neural Network-based Double Minimax Q-learning (NN-DMQ) algorithm. This algorithm is specifically designed to handle the adversarial interaction between the controller and the worst-case disturbance, while also mitigating the bias introduced by Q-value overestimation, which can destabilize learning. A key theoretical contribution of this work is a rigorous convergence proof of the proposed Double Minimax Q-learning (DMQ) algorithm. This proof provides strong guarantees for the algorithm's stability and its capability to learn the optimal finite-horizon robust control and worst-case disturbance policies. Extensive experiments were performed to verify the effectiveness and robustness of our approach, demonstrating its applicability to challenging real-world control problems with unknown dynamics.
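The NN-DMQ algorithm itself is not reproduced in this notice. Purely as an illustrative sketch, the tabular toy below combines the two ingredients the abstract names: a finite-horizon minimax Bellman backup (the controller minimizes cost while the disturbance maximizes it) and double Q-learning's decoupled selection/evaluation, which curbs overestimation bias. The dynamics, cost table, and all names here are invented for illustration and are far simpler than the neural-network setting of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy zero-sum dynamic game standing in for the H-infinity setting:
# control u minimizes the cost, disturbance w maximizes it, horizon T.
T, nS, nU, nW = 3, 2, 2, 2
cost = np.array([[[1.0, 3.0], [0.0, 2.0]],   # cost[s][u][w]
                 [[2.0, 0.0], [4.0, 1.0]]])

def step(s, u, w):
    """Deterministic toy dynamics (invented for this sketch)."""
    return (s + u + w) % nS

def backward_induction():
    """Exact time-varying minimax Q-values, used as ground truth."""
    Q = np.zeros((T + 1, nS, nU, nW))        # Q[T] = 0: terminal cost
    for k in range(T - 1, -1, -1):
        for s in range(nS):
            for u in range(nU):
                for w in range(nW):
                    s2 = step(s, u, w)
                    # min over control of the worst-case disturbance value
                    Q[k, s, u, w] = cost[s, u, w] + Q[k + 1, s2].max(axis=1).min()
    return Q

def double_minimax_q(iters=30000, alpha=0.2):
    """Off-policy double minimax Q-learning on uniformly sampled transitions."""
    QA = np.zeros((T + 1, nS, nU, nW))
    QB = np.zeros((T + 1, nS, nU, nW))
    for _ in range(iters):
        k, s = int(rng.integers(T)), int(rng.integers(nS))
        u, w = int(rng.integers(nU)), int(rng.integers(nW))  # exploratory, off-policy
        s2 = step(s, u, w)
        # Double-Q trick: one table selects the minimax pair, the other evaluates it
        Qsel, Qeval = (QA, QB) if rng.random() < 0.5 else (QB, QA)
        u2 = int(Qsel[k + 1, s2].max(axis=1).argmin())
        w2 = int(Qsel[k + 1, s2, u2].argmax())
        target = cost[s, u, w] + Qeval[k + 1, s2, u2, w2]
        Qsel[k, s, u, w] += alpha * (target - Qsel[k, s, u, w])
    return 0.5 * (QA + QB)
```

Because the toy game is deterministic and every (stage, state, action) cell is visited, the averaged tables converge to the backward-induction solution; the double-Q decoupling matters in the stochastic, function-approximation regime the talk addresses.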

All faculty and students are welcome to attend!


School of Information Science and Engineering

August 25, 2025

(Editor: Li Han)