Sentiment Analysis and Machine Learning-Based Stock Prediction for 10 KLSE Index Stocks

Keywords:

Stock market prediction, KLSE Index Stock, investor sentiment analysis, machine learning, correlation analysis

Abstract

Research Question: How can integrating localized investor sentiment from social media with historical stock data improve the predictive accuracy of machine learning models for Kuala Lumpur Stock Exchange (KLSE) index stocks?
Motivation: The novelty of this study lies in its use of a localized sentiment dataset extracted from the KLSE Screener, a platform specific to Malaysian investors. Compared with global datasets, this source better reflects regional investor behavior, language diversity, and local market dynamics, giving a more contextualized analysis of the Malaysian financial ecosystem.
Idea: The core idea is to develop a hybrid prediction model that integrates historical stock data with quantified investor sentiment scores. The central hypothesis is that this integration yields better forecasting performance than models using historical data alone. The dependent variable is the stock's closing price; the key independent variables are the historical price features and the sentiment score.
Data: Ten years of historical price data were analyzed for ten top KLSE index stocks and combined with one year of investor comments scraped from the KLSE Screener. Sentiment was measured using the Valence Aware Dictionary and sEntiment Reasoner (VADER) lexicon tool.
Method/Tools: After the data were preprocessed and normalized, six machine learning models were implemented and compared. Model performance was evaluated using several performance indicators.
Findings: XGBoost emerged as the best-performing model, with MSE = 0.0001 and R² = 0.99998, effectively capturing complex patterns between price and sentiment data. Correlation analysis revealed mixed, generally weak relationships between sentiment and price movements across stocks, indicating that investor sentiment in this context is often reactive rather than predictive. This finding adds nuance to the literature, which often assumes a straightforward positive correlation.
Contributions: The paper's contribution is twofold: it demonstrates the superiority of ensemble methods such as XGBoost for financial forecasting with sentiment data, and it provides a pioneering analysis using a localized Malaysian sentiment dataset, establishing a foundation for more regionally aware financial market research.
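
The two reported indicators (MSE and R²) and the sentiment-price correlation can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which correlation coefficient was used, so Pearson correlation is assumed here, and the function names and toy numbers are hypothetical, chosen only to make the definitions concrete.

```python
from math import sqrt

def mse(y_true, y_pred):
    """Mean squared error between actual and predicted closing prices."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def pearson(x, y):
    """Pearson correlation (an assumed choice; the abstract does not name one)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

if __name__ == "__main__":
    # Hypothetical normalized closing prices and model predictions.
    actual = [0.50, 0.52, 0.51, 0.55, 0.58]
    predicted = [0.49, 0.53, 0.51, 0.54, 0.59]
    # Hypothetical daily sentiment scores in [-1, 1] (VADER-style compound range).
    sentiment = [0.10, -0.20, 0.05, 0.30, 0.00]

    print("MSE:", mse(actual, predicted))
    print("R^2:", r_squared(actual, predicted))
    print("Sentiment vs. price correlation:", pearson(sentiment, actual))
```

On normalized prices, a very small MSE and an R² close to 1 go together, which is consistent with the figures reported for XGBoost. A correlation near zero between the sentiment series and the price series would correspond to the weak relationships described in the findings.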

Published

19-12-2025

How to Cite

Sentiment Analysis and Machine Learning-Based Stock Prediction for 10 KLSE Index Stocks. (2025). Capital Markets Review, 33(2), 75-93. https://mfa-cmr.com/cmr/article/view/266
