Paper Title
Reinforcement Learning-Based Cooperative P2P Power Trading between DC Nanogrid Clusters with Wind and PV Energy Resources
Paper Authors
Paper Abstract
In replacing fossil fuels with renewable energy resources for carbon neutrality, the unbalanced resource production of intermittent wind and photovoltaic (PV) power is a critical issue for peer-to-peer (P2P) power trading. To address this issue, a reinforcement learning (RL) technique is introduced in this paper. For RL, a graph convolutional network (GCN) and a bi-directional long short-term memory (Bi-LSTM) network are jointly applied to P2P power trading between nanogrid clusters, based on cooperative game theory. The flexible and reliable DC nanogrid is suitable for integrating renewable energy into a distribution system. Each local nanogrid cluster takes the position of a prosumer, focusing on power production and consumption simultaneously. For the power management of a nanogrid cluster, multi-objective optimization is applied to each local nanogrid cluster with Internet of Things (IoT) technology. Charging/discharging of an electric vehicle (EV) is executed considering the intermittent characteristics of wind and PV power production. RL algorithms, such as GCN-convolutional neural network (CNN) layers for a deep Q-learning network (DQN), GCN-LSTM layers for a deep recurrent Q-learning network (DRQN), GCN-Bi-LSTM layers for DRQN, and GCN-Bi-LSTM layers for proximal policy optimization (PPO), are used for simulations. Consequently, the cooperative P2P power trading system maximizes profit by considering the time-of-use (ToU) tariff-based electricity cost and the system marginal price (SMP), and minimizes the amount of grid power consumption. Power management of nanogrid clusters with P2P power trading is simulated on a distribution test feeder in real time, and the proposed GCN-Bi-LSTM-PPO technique achieves the lowest electricity cost among the compared RL algorithms, reducing the electricity cost by 36.7% on average over the nanogrid clusters.
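The abstract describes a policy network that combines GCN layers (spatial structure of the trading graph between nanogrid clusters) with a Bi-LSTM (temporal pattern of intermittent wind/PV production) and trains it with PPO. The sketch below is a minimal, illustrative PyTorch actor-critic of that general shape; it is not the authors' implementation, and all layer sizes, feature names, and the discrete action space are assumptions made for the example.

```python
# Minimal sketch (assumptions, not the paper's code) of a GCN + Bi-LSTM
# actor-critic backbone of the kind the abstract describes for PPO-based
# P2P trading. Node features could be per-cluster PV/wind output, load,
# and EV state of charge; the graph encodes links between clusters.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        # x: (..., nodes, in_dim); adj_norm: normalized adjacency (nodes, nodes)
        return torch.relu(adj_norm @ self.linear(x))

class GCNBiLSTMActorCritic(nn.Module):
    """GCN layers extract spatial features over the cluster graph; a Bi-LSTM
    summarizes the observation window; actor and critic heads serve PPO."""
    def __init__(self, node_feat_dim, n_nodes, n_actions, gcn_dim=32, lstm_dim=64):
        super().__init__()
        self.gcn1 = GCNLayer(node_feat_dim, gcn_dim)
        self.gcn2 = GCNLayer(gcn_dim, gcn_dim)
        self.bilstm = nn.LSTM(gcn_dim * n_nodes, lstm_dim,
                              batch_first=True, bidirectional=True)
        self.actor = nn.Linear(2 * lstm_dim, n_actions)   # trading-decision logits
        self.critic = nn.Linear(2 * lstm_dim, 1)          # state value for PPO

    def forward(self, x_seq, adj_norm):
        # x_seq: (batch, time, nodes, node_feat_dim)
        b, t, n, f = x_seq.shape
        h = self.gcn2(self.gcn1(x_seq, adj_norm), adj_norm)  # (b, t, n, gcn_dim)
        h = h.reshape(b, t, -1)                              # flatten nodes per step
        out, _ = self.bilstm(h)                              # (b, t, 2*lstm_dim)
        last = out[:, -1]                                    # summary of the window
        return self.actor(last), self.critic(last)

# Example forward pass: 4 nanogrid clusters observed over a 24-step window.
if __name__ == "__main__":
    n_nodes, feat, actions = 4, 6, 5
    adj = torch.eye(n_nodes) + torch.rand(n_nodes, n_nodes).round()
    adj = adj / adj.sum(dim=1, keepdim=True)   # simple row normalization
    net = GCNBiLSTMActorCritic(feat, n_nodes, actions)
    obs = torch.randn(2, 24, n_nodes, feat)
    logits, value = net(obs, adj)
    print(logits.shape, value.shape)           # (2, 5) and (2, 1)
```

In a PPO loop, the actor logits would parameterize the trading/charging action distribution for each cluster and the critic output would provide the baseline; the reward would combine the ToU tariff-based electricity cost, SMP revenue, and grid power consumption as described in the abstract.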