Evaluation of natural language processing models to measure similarity between scenarios written in Spanish

Authors

  • Gabriela Perez, LIFIA
  • Catalina Mostaccio, LIFIA
  • Giuliana Maltempo, LIFIA
  • Leandro Antonelli, LIFIA

DOI:

https://doi.org/10.12957/cadinf.2024.87935

Abstract

Requirements engineering is a critical phase in software development: it seeks to understand and document system requirements from the early stages. Requirements specification typically involves close collaboration between customers and development teams. Customers contribute their expertise in the language of the domain, while developers use more technical, computational terms. Despite these differences, achieving mutual understanding is crucial.
One of the most widely used artifacts for this purpose is the scenario. In environments where multiple actors write scenarios, duplication is common, so mechanisms are needed to detect similar scenarios and prevent redundancy. In this paper we empirically evaluate several pre-trained Natural Language Processing models for analyzing the semantic similarity between scenarios written in Spanish, identifying words or phrases with equivalent meanings. The analysis is performed in Spanish as a contribution to the region.
Finally, we present a tool that facilitates the creation of new scenarios by identifying potential similarities with existing ones. The tool supports multiple models, allowing users to select the most appropriate one to accurately detect similar scenarios during the definition process.
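As a rough illustration of the kind of comparison the paper evaluates (this page does not name the specific models; the library and model below are illustrative assumptions, not the authors' setup), a pre-trained multilingual sentence encoder can score the semantic similarity of two Spanish scenarios:

    # Minimal sketch, assuming the sentence-transformers library; the
    # multilingual model named here is an illustrative choice, not
    # necessarily one of the models evaluated in the paper.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

    scenario_a = "El cliente registra un nuevo pedido en el sistema."
    scenario_b = "El usuario carga una nueva orden de compra."

    # Encode both scenarios into dense vectors and compare them
    # with cosine similarity.
    embeddings = model.encode([scenario_a, scenario_b], convert_to_tensor=True)
    similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

    # Scores near 1.0 suggest the new scenario may duplicate an existing one.
    print(f"Cosine similarity: {similarity:.3f}")

In a tool like the one described, such scores could be computed against the set of already-stored scenarios, surfacing the closest matches while the analyst writes a new one.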


Published

2025-01-17

How to Cite

Perez, G., Mostaccio, C., Maltempo, G., & Antonelli, L. (2025). Evaluation of natural language processing models to measure similarity between scenarios written in Spanish. Cadernos Do IME - Série Informática, 50. https://doi.org/10.12957/cadinf.2024.87935