Pisolla, Anil and Moghul, Sameer Baig and Gaddam, Triveni and Iyengar, N Ch Sriman Narayana (2025) Salary Prediction Using TF-IDF and Ensemble Machine Learning: A Lightweight and Interpretable Approach. World Journal of Advanced Research and Reviews, 26 (2). pp. 4445-4453. ISSN 2581-9615
WJARR-2025-2102.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Salary prediction is more than a number: it informs decisions by job seekers, employers, and HR teams, shaping expectations and negotiations. Traditional models rely on structured data such as job titles, experience levels, and locations, but overlook job descriptions, which carry information about skills, responsibilities, and industry-specific language. This study bridges that gap by combining structured and unstructured data in a single model. TF-IDF extracts key terms from job descriptions and weights them to highlight the most informative ones, while structured data is preprocessed with one-hot encoding and feature scaling. An ensemble learning approach strengthens predictions: Random Forest captures patterns, XGBoost refines them, and Linear Regression serves as a baseline. A meta-model, such as Logistic Regression, weighs the base models' predictions to improve accuracy. Evaluated on Accuracy, Macro Average F1-score, and Weighted Average F1-score, the model outperforms standalone approaches. The results indicate that integrating TF-IDF with ensemble learning provides a more accurate, scalable, and interpretable salary prediction system suited to real-world applications.
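As an illustration of the TF-IDF weighting step mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the tokenizer, the IDF variant, or any normalization, so the whitespace tokenizer, the unsmoothed natural-log IDF, the function names `tokenize` and `tf_idf`, and the three sample job descriptions are all assumptions made for the example. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF by default and would give different numbers.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df), without smoothing.
    This is one common textbook variant, assumed here for illustration.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical job descriptions, invented for this example.
descriptions = [
    "senior python engineer building machine learning pipelines",
    "junior python developer maintaining web applications",
    "senior accountant preparing financial reports",
]

weights = tf_idf(descriptions)
for text, w in zip(descriptions, weights):
    ranked = sorted(w.items(), key=lambda kv: kv[1], reverse=True)
    print(text, "->", [term for term, _ in ranked[:3]])
```

In this toy corpus, a term that occurs in only one description (such as "machine") receives a higher weight than one shared across descriptions (such as "python"), and a term that occurred in every description would receive a weight of zero. That is the sense in which TF-IDF highlights distinguishing terms.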
Item Type: | Article |
---|---|
Official URL: | https://doi.org/10.30574/wjarr.2025.26.2.2102 |
Uncontrolled Keywords: | Salary Prediction; TF-IDF (Term Frequency-Inverse Document Frequency); Ensemble Learning; Machine Learning |
Depositing User: | Editor WJARR |
Date Deposited: | 20 Aug 2025 11:52 |
URI: | https://eprint.scholarsrepository.com/id/eprint/3751 |