Reinforcement Learning Approach for Commodity Market Trading Strategy

Pei Yuin Wong, Ervin Gubin Moung, Ali Farzamnia, Farashazillah Yahya, Joe Henry Obit, Zaidatol Haslinda Abdullah Sani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Commodity price fluctuations significantly affect the cost of living. Moreover, many people are turning to fund investment and exploring commodity trading in addition to stocks and forex. However, most existing research on commodity price prediction uses datasets collected before Covid-19 and the Russo-Ukrainian War, and so does not cover those periods. The aim of this project is to develop trading strategy models that predict whether to buy or sell a commodity, and to evaluate the potential rewards and profits. The dataset contains daily historical prices of various commodities from 2000 until March 2022. In addition, a real-world dataset, the gold trading dataset from Nasdaq, is used to validate the best-performing trading models. The algorithms employed are reinforcement learning-based: Advantage Actor Critic (A2C) and Proximal Policy Optimization (PPO). Across six rounds of experiments, the A2C model in a forex environment, using 80% of the dataset for training and 20% for testing, achieved the best results, with a Sharpe ratio of 0.63, a Sortino ratio of 1.0, an Omega ratio of 1.24, and a Calmar ratio of 0.55. The best-performing trading models in Objective 2 and Objective 3 are similar but employ different window sizes. The window size specifies the number of timesteps that serve as reference points for the trading model when it determines the next trade. Different datasets may require different window sizes, which calls for further refinement: the window size must be tailored to the characteristics and volatility patterns of each dataset so that the model's predictive accuracy holds across varied market conditions and historical trends.
In conclusion, the best-performing trading model is the Advantage Actor Critic (A2C) model in a forex environment.
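
The abstract reports four risk-adjusted performance ratios but does not say how they were computed. The following is a minimal sketch of common textbook definitions, assuming daily simple returns, 252 trading days per year, and a zero risk-free rate and return threshold. All of these are assumptions, not details taken from the paper, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev

PERIODS_PER_YEAR = 252  # assumption: trading days used for annualization


def sharpe_ratio(returns, risk_free=0.0, periods=PERIODS_PER_YEAR):
    """Annualized mean excess return divided by its standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / pstdev(excess) * math.sqrt(periods)


def sortino_ratio(returns, target=0.0, periods=PERIODS_PER_YEAR):
    """Like Sharpe, but only returns below the target count as risk."""
    excess = [r - target for r in returns]
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    return mean(excess) / downside * math.sqrt(periods)


def omega_ratio(returns, threshold=0.0):
    """Sum of gains above the threshold divided by sum of losses below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    return gains / losses


def max_drawdown(returns):
    """Largest peak-to-trough fall of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def calmar_ratio(returns, periods=PERIODS_PER_YEAR):
    """Annualized compounded return divided by the absolute max drawdown."""
    total = math.prod(1.0 + r for r in returns)
    annual = total ** (periods / len(returns)) - 1.0
    return annual / abs(max_drawdown(returns))


# Hypothetical daily returns, for illustration only (not data from the paper).
sample = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]
for fn in (sharpe_ratio, sortino_ratio, omega_ratio, calmar_ratio):
    print(fn.__name__, round(fn(sample), 3))
```

Each function raises `ZeroDivisionError` when its risk term is zero (for example, a series with no losing day), so a real evaluation would need to handle that case. Libraries differ in details such as sample versus population standard deviation, so values from this sketch need not match the figures reported above.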

Original language: English
Title of host publication: Proceedings of the 13th National Technical Seminar on Unmanned System Technology 2023
Subtitle of host publication: NUSYS 2023
Editors: Zainah Md. Zain, Zool Hilmi Ismail, Huiping Li, Xianbo Xiang, Rama Rao Karri
Publisher: Springer Singapore
Pages: 181-191
Number of pages: 11
Volume: 1184
ISBN (Electronic): 9789819720279
ISBN (Print): 9789819720262, 9789819720293
Publication status: Published - 17 Sep 2024
Event: 13th National Technical Symposium on Unmanned System Technology - Penang, Malaysia
Duration: 2 Oct 2023 → 3 Oct 2023
Conference number: 13

Publication series

Name: Lecture Notes in Electrical Engineering
Publisher: Springer
Volume: 1184 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 13th National Technical Symposium on Unmanned System Technology
Abbreviated title: NUSYS 2023
Country/Territory: Malaysia
City: Penang
Period: 2/10/23 → 3/10/23
