Conclusion and outlook

The purpose of this notebook was to demonstrate that the subfield classification contained in the Dimensions DB can be reproduced with only moderate effort by fine-tuning a GPT-3 language model. The results were on par with, or even better than, those achieved by Dimensions, indicating that fine-tuning GPT-3 is a promising approach for this and similar tasks.
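For readers who want to try this themselves, a minimal sketch of such a fine-tuning workflow is shown below. It assumes the legacy `openai` Python library (version < 1.0, current when GPT-3 fine-tuning was available); the file name `train.jsonl`, the base model `ada`, the label, and the `\n\n###\n\n` prompt separator are illustrative choices following OpenAI's fine-tuning conventions, not necessarily the exact setup used in this notebook.

```python
import openai  # legacy openai < 1.0 client

openai.api_key = "sk-..."  # assumption: supply your own API key

# Training data is a JSONL file of prompt/completion pairs, one per abstract, e.g.
# {"prompt": "<abstract text>\n\n###\n\n", "completion": " Sociology"}
upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")

# Launch a fine-tune on a GPT-3 base model (ada is the cheapest)
job = openai.FineTune.create(training_file=upload["id"], model="ada")
print(job["id"])  # poll this job until the fine-tuned model is ready

# Once the job has finished, classify a new abstract with the resulting model
resp = openai.Completion.create(
    model="ada:ft-personal-2023-01-01",  # hypothetical fine-tuned model id
    prompt="<new abstract text>\n\n###\n\n",
    max_tokens=1,    # assumes each label was encoded as a single token
    temperature=0,   # deterministic output for classification
)
print(resp["choices"][0]["text"].strip())  # predicted subfield label
```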

Notably, the fine-tuning process required neither particular domain knowledge nor manual feature engineering, highlighting the effectiveness of the GPT-3 model in this regard.

However, one major downside of the technology is that it is neither open nor free. Moreover, the trained models cannot be shared between users, preventing the kinds of collaboration and re-use that are common in the humanities.

Although the subfield classification tested here is an interesting use case, historians of science in particular may find it even more compelling to study the presence of abstract concepts or methods in scientific texts. By identifying how these concepts emerge, spread, and go out of use, we can gain valuable insights into the modes of knowledge production in different fields of science. Since the level of abstraction involved in identifying such concepts or methods in texts is similar to that of subfield classification, this suggests that such aims should likewise be achievable with GPT-3 (see the sketch below).
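To make this concrete, the same prompt/completion format used for subfield classification could, hypothetically, be repurposed for concept detection by fine-tuning on abstracts labelled for whether a given method is present. Everything in the following sketch (labels, example texts, model id) is illustrative and not part of the study above:

```python
import openai  # legacy openai < 1.0 client

openai.api_key = "sk-..."  # assumption: supply your own API key

# Hypothetical training example for concept detection: the same JSONL format
# as for subfield classification, but the completion now encodes whether a
# specific method (here: factor analysis) is used in the text.
example = {
    "prompt": "We apply factor analysis to survey data on ...\n\n###\n\n",
    "completion": " yes",  # " yes" / " no"
}

# After fine-tuning on such pairs, querying works exactly as before:
resp = openai.Completion.create(
    model="ada:ft-concept-detector",  # hypothetical fine-tuned model id
    prompt="The interviews were coded inductively ...\n\n###\n\n",
    max_tokens=1,
    temperature=0,
)
print(resp["choices"][0]["text"].strip())  # expected here: "no"
```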

While there are reasons to be skeptical about the use of GPT-3 in research, we have demonstrated here that it also opens up new opportunities for research in the humanities: it makes tasks feasible that were previously unsolvable or prohibitively expensive. The purpose of this short study was thus to encourage researchers in the Digital Humanities to think creatively about how they can use GPT-3 in their own work.