Paper Title
Reinforcement Learning-Based Cooperative P2P Power Trading between DC Nanogrid Clusters with Wind and PV Energy Resources
Paper Authors
Paper Abstract
In replacing fossil fuels with renewable energy resources for carbon neutrality, the unbalanced resource production of intermittent wind and photovoltaic (PV) power is a critical issue for peer-to-peer (P2P) power trading. To address this issue, a reinforcement learning (RL) technique is introduced in this paper. For RL, a graph convolutional network (GCN) and a bi-directional long short-term memory (Bi-LSTM) network are jointly applied to P2P power trading between nanogrid clusters, based on cooperative game theory. The flexible and reliable DC nanogrid is suitable for integrating renewable energy into a distribution system. Each local nanogrid cluster takes the position of a prosumer, focusing on power production and consumption simultaneously. For the power management of a nanogrid cluster, multi-objective optimization is applied to each local nanogrid cluster with Internet of Things (IoT) technology. Charging/discharging of an electric vehicle (EV) is executed considering the intermittent characteristics of wind and PV power production. RL algorithms, such as GCN-convolutional neural network (CNN) layers for a deep Q-learning network (DQN), GCN-LSTM layers for a deep recurrent Q-learning network (DRQN), GCN-Bi-LSTM layers for DRQN, and GCN-Bi-LSTM layers for proximal policy optimization (PPO), are used for simulations. Consequently, the cooperative P2P power trading system maximizes profit by considering the time-of-use (ToU) tariff-based electricity cost and the system marginal price (SMP), and minimizes the amount of grid power consumption. Power management of nanogrid clusters with P2P power trading is simulated on a distribution test feeder in real time, and the proposed GCN-Bi-LSTM-PPO technique achieves the lowest electricity cost among the compared RL algorithms, reducing the electricity cost by 36.7% on average over the nanogrid clusters.
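The abstract describes a policy network that combines GCN layers (spatial structure of the trading graph between nanogrid clusters) with a Bi-LSTM (temporal pattern of intermittent wind/PV production) and trains it with PPO. The sketch below is a minimal, illustrative PyTorch actor-critic of that general shape; it is not the authors' implementation, and all layer sizes, feature names, and the discrete action space are assumptions made for the example.

```python
# Minimal sketch (assumptions, not the paper's code) of a GCN + Bi-LSTM
# actor-critic backbone of the kind the abstract describes for PPO-based
# P2P trading. Node features could be per-cluster PV/wind output, load,
# and EV state of charge; the graph encodes links between clusters.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        # x: (..., nodes, in_dim); adj_norm: normalized adjacency (nodes, nodes)
        return torch.relu(adj_norm @ self.linear(x))

class GCNBiLSTMActorCritic(nn.Module):
    """GCN layers extract spatial features over the cluster graph; a Bi-LSTM
    summarizes the observation window; actor and critic heads serve PPO."""
    def __init__(self, node_feat_dim, n_nodes, n_actions, gcn_dim=32, lstm_dim=64):
        super().__init__()
        self.gcn1 = GCNLayer(node_feat_dim, gcn_dim)
        self.gcn2 = GCNLayer(gcn_dim, gcn_dim)
        self.bilstm = nn.LSTM(gcn_dim * n_nodes, lstm_dim,
                              batch_first=True, bidirectional=True)
        self.actor = nn.Linear(2 * lstm_dim, n_actions)   # trading-decision logits
        self.critic = nn.Linear(2 * lstm_dim, 1)          # state value for PPO

    def forward(self, x_seq, adj_norm):
        # x_seq: (batch, time, nodes, node_feat_dim)
        b, t, n, f = x_seq.shape
        h = self.gcn2(self.gcn1(x_seq, adj_norm), adj_norm)  # (b, t, n, gcn_dim)
        h = h.reshape(b, t, -1)                              # flatten nodes per step
        out, _ = self.bilstm(h)                              # (b, t, 2*lstm_dim)
        last = out[:, -1]                                    # summary of the window
        return self.actor(last), self.critic(last)

# Example forward pass: 4 nanogrid clusters observed over a 24-step window.
if __name__ == "__main__":
    n_nodes, feat, actions = 4, 6, 5
    adj = torch.eye(n_nodes) + torch.rand(n_nodes, n_nodes).round()
    adj = adj / adj.sum(dim=1, keepdim=True)   # simple row normalization
    net = GCNBiLSTMActorCritic(feat, n_nodes, actions)
    obs = torch.randn(2, 24, n_nodes, feat)
    logits, value = net(obs, adj)
    print(logits.shape, value.shape)           # (2, 5) and (2, 1)
```

In a PPO loop, the actor logits would parameterize the trading/charging action distribution for each cluster and the critic output would provide the baseline; the reward would combine the ToU tariff-based electricity cost, SMP revenue, and grid power consumption as described in the abstract.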