
Two-Agent DRL for Power Allocation and IRS Orientation in Dynamic NOMA-based OWC Networks

(2025)

Paper Information
arXiv ID: 2504.18937

Abstract

Intelligent reflecting surface (IRS) technology has been considered a promising solution in visible light communication (VLC) systems due to its potential to overcome the line-of-sight (LoS) blockage issue and enhance coverage. Moreover, integrating an IRS with a downlink non-orthogonal multiple access (NOMA) transmission technique for multiple users is a smart solution to achieve a high sum rate and improve system performance. In this paper, a dynamic IRS-assisted NOMA-VLC system is modeled, and an optimization problem is formulated to maximize sum energy efficiency (SEE) and fairness among multiple mobile users under power allocation and IRS mirror orientation constraints. Due to the non-convex nature of the optimization problem and the non-linearity of the constraints, conventional optimization methods are impractical for real-time solutions. Therefore, a two-agent deep reinforcement learning (DRL) algorithm is designed to optimize power allocation and IRS orientation based on centralized training with decentralized execution, providing fast, real-time solutions in dynamic environments. The results show the superior performance of the proposed DRL algorithm compared to standard DRL algorithms typically used for resource allocation in wireless communication. The results also show that the proposed algorithm achieves higher performance than deployments without IRS and with randomly oriented IRS elements.

Index Terms: Optical wireless communication (OWC), intelligent reflecting surface (IRS), non-orthogonal multiple access (NOMA), reinforcement learning (RL).

Summary

This paper presents a novel two-agent deep reinforcement learning (DRL) approach for optimizing power allocation and mirror orientation in intelligent reflecting surface (IRS)-assisted non-orthogonal multiple access (NOMA) visible light communication (VLC) networks. The study models a dynamic IRS-assisted system and formulates an optimization problem aimed at maximizing sum energy efficiency (SEE) and user fairness under power allocation and IRS orientation constraints. Because the problem is non-convex, conventional optimization methods are infeasible for real-time solutions. The authors therefore propose a two-agent DRL algorithm, based on centralized training with decentralized execution, that enables fast, adaptable optimization for mobile users in dynamic environments. Results indicate that the proposed algorithm significantly surpasses standard DRL baselines, achieving more effective resource management and improved system performance compared to deployments without IRS or with randomly oriented IRS elements. The paper concludes that DRL facilitates efficient real-time decision-making, which is crucial for maintaining high quality of service (QoS) in 6G networks with increasing user demands.
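The two-agent, centralized-training/decentralized-execution design described above can be illustrated with a minimal toy sketch (not the paper's implementation): two independent tabular learners, one choosing a discrete power-allocation level and one an IRS mirror orientation, are trained against a shared reward that stands in for the SEE objective. The action sets, reward shape, and hyperparameters below are all illustrative assumptions.

```python
import random

POWER_LEVELS = [0.2, 0.5, 1.0]   # assumed discrete power-allocation choices (W)
ORIENTATIONS = [0, 30, 60, 90]   # assumed discrete IRS mirror angles (degrees)

class Agent:
    """Tabular bandit-style learner; each agent keeps only its own action values."""
    def __init__(self, actions, eps=0.1):
        self.actions = actions
        self.q = {a: 0.0 for a in actions}   # running mean reward per action
        self.n = {a: 0 for a in actions}
        self.eps = eps

    def act(self):
        # Decentralized execution: the agent consults only its own table.
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Centralized training: both agents receive the same joint reward.
        self.n[action] += 1
        self.q[action] += (reward - self.q[action]) / self.n[action]

def joint_reward(power, angle):
    """Toy stand-in for the SEE objective: rewards good orientation, low power."""
    return (1.0 + angle / 90.0) - power

random.seed(0)
power_agent, irs_agent = Agent(POWER_LEVELS), Agent(ORIENTATIONS)
for _ in range(2000):
    p, a = power_agent.act(), irs_agent.act()
    r = joint_reward(p, a)
    power_agent.update(p, r)
    irs_agent.update(a, r)

best_power = max(power_agent.q, key=power_agent.q.get)
best_angle = max(irs_agent.q, key=irs_agent.q.get)
```

Under this toy reward, the agents converge on the lowest power level and the largest orientation angle; the paper's actual agents are deep networks acting on continuous channel-state observations, but the training/execution split is the same.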

Methods

This paper employs the following methods:

  • Deep Reinforcement Learning
  • Two-Agent DRL

Models Used

  • None specified

Datasets

The following datasets were used in this research:

  • None specified

Evaluation Metrics

  • Sum Energy Efficiency (SEE)
  • Fairness Index
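As a rough illustration of these two metrics (the paper's exact definitions may differ), SEE can be computed as the aggregate user rate divided by total consumed power, and fairness with the standard Jain's index; the per-user rates and power below are made-up values.

```python
def sum_energy_efficiency(rates_bits_per_s, total_power_watts):
    """Sum energy efficiency: aggregate achievable rate over total power (bits/Joule)."""
    return sum(rates_bits_per_s) / total_power_watts

def jains_fairness(rates):
    """Jain's fairness index: ranges from 1/N (one user takes all) to 1 (equal rates)."""
    n = len(rates)
    return sum(rates) ** 2 / (n * sum(r * r for r in rates))

rates = [2.0e6, 1.5e6, 1.0e6]   # hypothetical per-user rates (bit/s)
power = 0.5                      # hypothetical total consumed power (W)
see = sum_energy_efficiency(rates, power)   # 9.0e6 bits/Joule
fairness = jains_fairness(rates)            # ≈ 0.931
```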

Results

  • The proposed DRL algorithm outperformed standard DRL algorithms in optimizing resource allocation for IRS-assisted VLC systems.
  • Achieved a maximum sum energy efficiency of approximately 9.7 Mbits/Joule after around 500 training episodes.
  • Demonstrated significant improvements in system performance compared to scenarios without IRS.

Technical Requirements

  • Number of GPUs: None specified
  • GPU Type: None specified
  • Compute Requirements: None specified

Papers Using Similar Methods

External Resources