Sharma, ML and Kumar, Sunil and Mittal, Rajveer and Rai, Shubhankar and Jain, Akshat and Gandhi, Anurag and Nagpal, Swayam and Ranjan, Anurag and Yadav, Riya and Mishra, Vatshank (2025) Building an LLM from Scratch. International Journal of Science and Research Archive, 15 (1). pp. 1426-1434. ISSN 2582-8185
IJSRA-2025-1140.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
In this work, the development of a basic large language model (LLM) has been presented, with a primary focus on the pre-training process and model architecture. A simplified transformer-based design has been implemented to demonstrate core LLM principles, with the incorporation of reinforcement learning techniques. Key components such as tokenization and training objectives have been discussed to provide a foundational understanding of LLM construction. Additionally, an overview of several established models, including GPT-2, LLaMA 3.1, and DeepSeek, has been provided to contextualize current advancements in the field. Through this comparative and explanatory approach, the essential building blocks of large-scale language models have been explored in a clear and accessible manner.
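The pre-training process named in the abstract is, in essence, next-token prediction over tokenized text with a transformer. The following is a minimal, illustrative sketch of that objective in PyTorch; it is not the authors' implementation, and all module names, sizes, and the toy data are assumptions made for illustration only.

```python
# Minimal sketch of transformer pre-training via next-token prediction.
# NOT the paper's code: model dimensions, names, and the random toy batch
# are illustrative assumptions.
import torch
import torch.nn as nn

class TinyTransformerLM(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, n_heads=4, n_layers=2, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)       # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)        # projects to vocabulary logits

    def forward(self, idx):
        # idx: (batch, seq_len) of token ids
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # causal mask so each position attends only to earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (batch, seq_len, vocab_size)

# Toy "tokenized" batch: random byte-level ids standing in for real pre-training data.
model = TinyTransformerLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
tokens = torch.randint(0, 256, (8, 65))          # (batch, seq_len + 1)
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one: predict the next token

logits = model(inputs)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
)
loss.backward()
optimizer.step()
```

In a full pipeline, the random batch above would be replaced by tokenized corpus text, and a post-training stage (e.g. with reinforcement learning, as mentioned in the abstract) would follow pre-training.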
| Item Type: | Article |
|---|---|
| Official URL: | https://doi.org/10.30574/ijsra.2025.15.1.1140 |
| Uncontrolled Keywords: | Pretraining; Introduction to the Neural Networks in LLM; Transformer Architecture; Post Training; Post Training with Reinforcement Learning |
| Depositing User: | Editor IJSRA |
| Date Deposited: | 22 Jul 2025 23:02 |
| Related URLs: | |
| URI: | https://eprint.scholarsrepository.com/id/eprint/1626 |