A Master's thesis project by John Oskar Holmen Skjeldrum & Peder Tanberg, researching GPT models for low-resource languages.
University: Copenhagen Business School | Program: Business Administration & Data Science
Research Questions:
- To what extent can a GPT-J model, trained on a corpus of Norwegian language data, compete with larger models on downstream tasks, given the low-resource characteristics of the Norwegian language?
- What challenges and limitations arise when fine-tuning the existing Norwegian GPT-J model with a limited instruction dataset and constrained time and computational resources, and how do these factors affect the model's performance?