Understanding requirements dependency in requirements prioritization: a systematic literature review

ABSTRACT

Requirement prioritization (RP) is a crucial task in managing requirements as it determines the order of implementation and, thus, the delivery of a software system. Improper RP may cause software project failures due to budget and schedule overruns as well as a low-quality product. Several factors influence RP, one of which is requirements dependency. Inappropriate handling of requirements dependencies can lead to software development failures. If a requirement that serves as a prerequisite for other requirements is given low priority, it affects the overall project completion time. Despite its importance, little is known about requirements dependency in RP, particularly its impacts, types, and techniques. This study, therefore, aims to understand the phenomenon by analyzing the existing literature. It addresses three objectives, namely, to investigate the impacts of requirements dependency on RP, to identify different types of requirements dependency, and to discover the techniques used for requirements dependency problems in RP. To fulfill the objectives, this study adopts the Systematic Literature Review (SLR) method. Applying the SLR protocol, this study selected forty primary articles, which comprise 58% journal papers, 32% conference proceedings, and 10% book sections. The results of data synthesis indicate that requirements dependency has significant impacts on RP, and that there are a number of requirements dependency types as well as techniques for addressing requirements dependency problems in RP. This research discovered various techniques, including Graphs for requirements dependency (RD) visualization, Machine Learning for handling large-scale RP, Decision Making for multi-criteria handling, and optimization techniques utilizing evolutionary algorithms.
The study also reveals that the existing techniques have encountered serious limitations in terms of scalability, time consumption, interdependencies of requirements, and limited types of requirement dependencies.
RP activities always involve two parties, namely developers and stakeholders. The two parties have different focuses and, thus, different priorities [11]. Stakeholders focus on urgency, needs, and business values [12], [13]. Although developers are concerned about project attributes such as effort [12] and cost [14], [13], they are also aware of internal constraints such as dependency between functions or requirements [12]. This is because requirements dependency is commonly found in software projects [2], [15]. It is thus risky to conduct RP without considering the dependency between requirements [16], [17]. For instance, giving high priority to requirements that depend on other requirements can increase the waiting time and delay the project [18]. This is because the dependent requirements have to wait for the prerequisite requirements to be completed before they can be implemented. In addition, requirements dependency also implies product complexity [2] and project risk [19]. The higher the dependency, the higher the complexity of the system and thus the higher its risk of failure [20].
Several studies have investigated RP concerning the criteria and techniques used in the process, such as Hujainah et al. [21], Tan and Mohamed [22], Falak Sher et al. [23], Muhammad Sufian et al. [24], Pitangueira et al. [25], Achimugu et al. [16], and Al Ta'ani and Razali [26]. However, none of the studies examine requirements dependency in depth. In fact, only 4 out of the 65 RP techniques consider requirements dependency [2]. Many studies on RP do not include requirements dependencies as one of the factors influencing priority sequencing. Little is known about requirements dependency in RP. Therefore, this study aims to explore further requirements dependency in RP by conducting a systematic literature review (SLR) in order to improve the understanding of the phenomenon. The objectives are:
• to investigate the impacts of requirements dependency on RP
• to identify different types of requirements dependency
• to discover the techniques used for requirements dependency problems in RP
The structure of the paper is as follows. Section 2 describes the methodology used in the review. Section 3 discusses the threats to validity. Section 4 presents the results and discussions. Section 5 concludes the study.

Method
This study adopts the Systematic Literature Review (SLR) method proposed by Kitchenham et al. [27]. Fig. 1 illustrates the review protocol used, which comprises five stages: identification, search strategy, study selection strategy, data retrieval, and result.

Fig. 1. Review Protocol
In the first stage, the research questions were constructed and aligned with the research motivation. The second stage determined the resources and the search strings based on the research questions. The third stage outlined the inclusion and exclusion criteria for screening the gathered articles, together with the Quality Assessment Criteria (QAC) process. The fourth stage finalized the collection of selected articles, from which the data synthesis was made. In the final stage, the results of the synthesis were obtained, which are presented in this paper.

Research Questions
The study aims to understand the relationships between requirements dependency and RP. Therefore, the following research questions (RQ) were constructed:
• RQ1: Does requirements dependency have impacts on RP?
• RQ2: What are the different types of requirements dependency?
• RQ3: What are the existing techniques used for requirements dependency problems in RP?

Search Strategy
The search strategy undertaken in this study began with determining the sources of scientific literature. Seven sources were used in the literature search, as listed in Table 1.
• Resources: The seven sources were selected because their contents are relevant to the subjects of this study and because they are commonly referred to by researchers in the field.
• Search Strings: Specifically, the keywords used for searching research articles in this study were 'requirement prioritization' or 'dependencies'. The search keywords for review papers were 'requirement prioritization' (AND/OR) 'literature review'.

Study Selection Criteria
The searches in the seven sources using the predefined keywords found 432 articles. These articles were first screened for suitability based on their titles and/or abstracts. As a result, only 133 articles were selected. Fig. 2 shows the distribution of the articles. Most articles are journal and conference papers.
To ensure only the most relevant articles would be selected, the 133 articles were further vetted through several subsequent stages, as shown in Fig. 3.
Inclusion and Exclusion Criteria: The inclusion criteria used to select the articles are as follows: 1) Articles are written in English; 2) Articles focus on requirements dependency in the RP domain; and 3) Articles are able to answer at least one of the research questions. The exclusion criteria are: 1) Articles are not written in English; 2) Articles are duplicates, that is, multiple copies of the same study; and 3) Articles do not answer any of the research questions. Each collected article was briefly read through its title, abstract, and content. Studies that did not address the research questions were excluded. Similarly, studies that were still in progress or not published by a publisher were not included. This study aims to gather findings that have been proven empirically. Therefore, review articles were excluded from the selection. If the same article was found in different sources, only one copy was chosen. The articles were published within the period of 2012 to 2022.
Percentage of collected articles based on titles and/or abstracts
Quality Assessment Criteria: Quality Assessment Criteria (QAC) were used to measure the quality of the gathered articles with respect to the objectives of the study. First, the articles shall cover requirements dependency and RP. Second, the articles shall be trustworthy. Eight questions were derived to represent the criteria, as listed in Table 2. Each question has three possible scores: Yes (1), Partially (0.5) and No (0). The weighted score for each study is the sum of the scores for the eight questions. The assessments were conducted by the authors, and the scores were determined by consensus. After the assessment, only forty articles were selected, namely those with a weighted score greater than 4.5.
A score of 4.5 was used as the baseline as it designates that the article has achieved more than 56% of the best score (4.5 out of 8). Table 3 presents the weighted scores for the forty selected articles. The highest score is 8 (five articles) and the lowest score is 5 (five articles), whereas the median score is 6 (twelve articles). This implies that the selected articles are well documented and thus contain well-conducted studies. This claim is particularly supported by the scores attained for Q1, Q7 and Q8, which are mainly 1 (Yes). The selected articles, however, only moderately cover the focus of this study, as the scores for Q2 to Q6 are mostly 0.5 (Partially). This indicates that the emphasis of the currently available studies on requirements dependency and RP is still lacking, albeit relevant. Only five articles (Score = 8) fulfill the quality criteria of the study entirely.
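The scoring rule described above can be sketched in a few lines. This is only an illustration of the arithmetic (Yes = 1, Partially = 0.5, No = 0, summed over eight questions, selected when the sum exceeds 4.5); the answers used in the example are hypothetical and are not taken from Table 2 or Table 3.

```python
# Illustrative sketch of the QAC scoring rule; the example answers are made up.
SCORE = {"yes": 1.0, "partially": 0.5, "no": 0.0}
THRESHOLD = 4.5  # baseline stated in the text; an article must score above it


def weighted_score(answers):
    """Sum the scores of the eight QAC answers for one article."""
    if len(answers) != 8:
        raise ValueError("expected answers to exactly eight QAC questions")
    return sum(SCORE[a] for a in answers)


def is_selected(answers):
    """Apply the selection rule: weighted score strictly greater than 4.5."""
    return weighted_score(answers) > THRESHOLD


# Hypothetical article: Q1, Q7 and Q8 answered "yes", Q2 to Q6 "partially".
example = ["yes"] + ["partially"] * 5 + ["yes", "yes"]
print(weighted_score(example), is_selected(example))  # 5.5 True
```

Note that an article scoring exactly 4.5 is not selected under this reading, because the rule is a strict inequality.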

Data Retrieval
The data retrieval consists of two activities, namely data collection and data synthesis. Data collection is the process of bringing together the selected articles, whereas data synthesis is a purposeful activity that extracts facts from the selected studies for answering the stated research questions [27].
• Data Collection: This stage gathered and consolidated the forty selected articles. The articles were then classified into three groups based on the three RQs: RQ1, RQ2 and RQ3. For example, the articles that discuss the impact of requirements dependency on RP were placed in the RQ1 group. The same applies to the articles that belong to RQ2 and RQ3. The articles that address more than one RQ were placed in each of the respective RQ groups.
• Data Synthesis: This stage extracted facts from the grouped articles in order to find the answers to the research questions. The facts were then analyzed and visualized. For example, the findings for RQ1 are presented as a chart that shows the frequency distribution of articles across RP factors. Similarly, the requirements dependency types for RQ2 are illustrated as a taxonomy graph, whereas the techniques for RQ3 are presented as a chart and a table. The visualization helps in explaining the results, thus providing a better understanding of the phenomenon.

Threats to Validity
The main challenge in an SLR is the validity of the study, which includes completeness, publication bias and data synthesis [27]. This study adopted the review protocol to overcome the completeness threat. The searches were conducted on various databases and the articles were screened using the predetermined quality criteria. Nevertheless, the searches were limited to publications from 2012 to 2022, and articles in other languages were excluded. English was chosen because of its status as an international language widely used in reputable journals. To avoid publication bias, only articles that contain empirically proven data were considered. Therefore, gray literature, such as studies still in progress, was not included. Gray literature was excluded so that future researchers can locate the reviewed literature easily. To mitigate the data synthesis threat, the QAC process was conducted. QAC identified and filtered reliable studies that could answer the research questions. Moreover, manual checks were carried out repeatedly on the extracted facts. The assessments were carried out objectively and by consensus among the authors to avoid inconsistencies. The authors read all the collected papers in full and provided scores based on the QAC questions.

Results and Discussion
This section describes the results of the analysis based on the forty selected articles. Fig. 2 shows that most of the articles selected by title and abstract are journal and conference papers (96%), whereas the rest are book sections, newspaper articles and reports (4%). After the QAC process, the distribution changes slightly, as shown in Fig. 4. Most articles are still journal and conference papers (92.5%), while the rest are book sections only (7.5%). As newspaper articles and reports generally lack scientific evidence and arguments, they could not be selected in this round. Fig. 5 shows that the topics have been studied consistently every year for the last eleven years, with at least three articles per year (median). Although the number is not high, it indicates that requirements dependency and RP are topics that the research community has continued to investigate in recent years. Since the number is smaller than for other topics in the requirements engineering field, more studies may be required to investigate the topics.

Does requirements dependency have impacts on RP (RQ1)?
Fig. 6 indicates that requirements dependency is a critical factor that is of concern to many studies with regard to RP. The articles emphasize that requirements dependency becomes more challenging, particularly for large-scale systems [28], [35], [48]. Improper handling of requirements dependency may cause inefficiency [29], project delays [54], and redesign and rework [52], as dependencies among requirements are commonly found in software projects [64]. If a requirement that is a prerequisite to other requirements is given a low priority, it affects the completion time of the whole project [18], [40]. The prerequisite requirements, therefore, need to be given a higher priority. Requirements dependency also determines the complexity of relationships between requirements and thus contributes to erroneous or redundant results [30] and implies a higher requirement implementation risk [44], [55]. In the requirement prioritization process, there are two different perspectives, those of the stakeholders and of the developers. On the client side, priorities depend on urgency, needs, and business value.

Distribution of Publication
On the developer side, the priority is influenced by more technical aspects of the system development process, namely requirements dependencies [29], [36]. The qualitative research conducted by Al Ta'ani and Razali [26] obtained the same result, indicating that analysts and system developers considered dependency an important factor in requirement prioritization. Fig. 6 shows that cost and risk are also relatively significant in RP. In general, cost and risk are implicitly influenced by the complexity of requirements, among others. The higher the complexity, the higher the cost and the risk of implementing the requirements [20]. As discussed earlier, requirements dependency causes requirements complexity [2]. This further highlights, indirectly, the impacts of requirements dependency on RP.
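The argument that a prerequisite requirement should receive at least the priority of the requirements that depend on it can be illustrated with a small sketch. The code below is not a technique from the reviewed articles; it is an assumed, minimal approach in which each prerequisite inherits the highest priority among its dependents and requirements are then ordered so that no requirement precedes its prerequisites. The requirement identifiers and priority values are hypothetical.

```python
import heapq


def effective_priority(priority, prerequisites):
    """Raise each prerequisite to at least the priority of its dependents."""
    eff = dict(priority)
    changed = True
    while changed:  # repeat until priorities stop propagating
        changed = False
        for req, pres in prerequisites.items():
            for p in pres:
                if eff[p] < eff[req]:
                    eff[p] = eff[req]
                    changed = True
    return eff


def dependency_aware_order(priority, prerequisites):
    """Order requirements by priority without violating any prerequisite."""
    eff = effective_priority(priority, prerequisites)
    remaining = {r: set(prerequisites.get(r, ())) for r in priority}
    dependents = {r: [] for r in priority}
    for req, pres in remaining.items():
        for p in pres:
            dependents[p].append(req)
    # Max-heap (via negated priority) of requirements that are ready to start.
    ready = [(-eff[r], r) for r, pres in remaining.items() if not pres]
    heapq.heapify(ready)
    order = []
    while ready:
        _, req = heapq.heappop(ready)
        order.append(req)
        for d in dependents[req]:
            remaining[d].discard(req)
            if not remaining[d]:
                heapq.heappush(ready, (-eff[d], d))
    if len(order) != len(priority):
        raise ValueError("cyclic dependency among requirements")
    return order


# Hypothetical data: R1 is the most urgent but depends on low-priority R2.
priority = {"R1": 9, "R2": 3, "R3": 5}
prerequisites = {"R1": {"R2"}}
print(dependency_aware_order(priority, prerequisites))  # ['R2', 'R1', 'R3']
```

Without the inheritance step, R2 would be scheduled last and R1 would have to wait for it, which is the delay scenario described in the text.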

What are the different types of requirements dependency (RQ2)?
RQ2 focuses on extracting the types of requirements dependency. In general, there are two main classifications of requirements dependency, proposed by [65] and [66]. As illustrated in Table 4, the former classifies dependency into three groups [65]: Functional, Value-related, and Time-related. The Functional group consists of Combination, Implication, and Exclusion. Combination refers to requirements that must be implemented together, and Implication refers to requirements that must wait for other requirements to complete. Exclusion is the opposite of Combination, comprising requirements that cannot be applied together as they conflict with each other. On the other hand, Value-related consists of Revenue-based and Cost-based; Revenue-based are requirements that can affect income, whereas Cost-based are requirements that can affect costs. The last group is Time-related, which covers requirements that need to be implemented based on the time stated in the project schedule. The latter classifies dependency into three types, as shown in Table 5.
In addition to the above classifications, there are also articles that mention other types of requirements dependency indirectly or in isolation. The articles mostly use different terms, even though they refer to the same kinds of dependency. In order to avoid redundancy and inconsistency, this study joins similar types together and assigns coherent terms that best represent the classifications. Fig. 7 illustrates the taxonomy view of requirements dependency categories synthesized from the various classifications proposed in the selected articles. Overall, there are two requirements dependency categories: Internal and External. Internal dependency refers to interior attributes of the system that cause interdependencies among its requirements. External dependency refers to exterior attributes that affect or influence the requirements of the system. Internal dependency has two subcategories, namely, Functional and Structural. External dependency is divided into four subcategories, which consist of Time-related, Value-related, Human Resource, and Business Process.
For the Functional subcategory in the Internal category, there are three types of dependency, including:
• Combination [42], [65], [66] is a pair of requirements that must be applied together. Other similar terms used are Coupling [7], Concurrence [35], Requires [54], [55], and Constrain [18], [49]. This type has two subtypes, namely Complete [29] and Limited [29]. In a Complete dependency, a requirement depends on another requirement completely, while in a Limited dependency it depends on another requirement only partly.
• Exclusion [42] is a pair of conflicting requirements, which cannot be applied together. Other similar terms used are Conflicts [35] and Contradict [43].
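To make the Combination and Exclusion types concrete, the following sketch checks a candidate set of requirements against both kinds of constraint. It is a hypothetical helper written for illustration, not part of any reviewed technique, and it assumes that Combination is symmetric (either both requirements are selected or neither is).

```python
def violations(selected, combination, exclusion):
    """List the Combination and Exclusion constraints broken by a selection.

    selected: set of requirement ids chosen for implementation
    combination: pairs of requirements that must be applied together
    exclusion: pairs of requirements that cannot be applied together
    """
    problems = []
    for a, b in combination:
        if (a in selected) != (b in selected):  # only one of the pair chosen
            problems.append(("combination", a, b))
    for a, b in exclusion:
        if a in selected and b in selected:     # both conflicting ones chosen
            problems.append(("exclusion", a, b))
    return problems


# Hypothetical constraints: R1 and R2 go together, R3 and R4 conflict.
combination = [("R1", "R2")]
exclusion = [("R3", "R4")]
print(violations({"R1", "R3", "R4"}, combination, exclusion))
print(violations({"R1", "R2", "R3"}, combination, exclusion))  # []
```

The first call reports both a broken Combination (R1 without R2) and a broken Exclusion (R3 together with R4).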
Likewise, there are three types for the Structural [66] subcategory, namely:
• Direct [29], [41] means that requirements depend directly on other requirements. For example, X depends on Y directly.
• Indirect [29] means that requirements depend on other requirements indirectly. For example, X depends on Z, while Z depends on Y. This shows X depends on Y but through Z.
• Refines [35], [67] means that requirements of higher levels are explained by a number of requirements of lower levels. Another term used for this type is hierarchy [31].
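The distinction between Direct and Indirect dependency can be sketched as a reachability computation over the dependency relation. The function below is an illustrative assumption rather than a method from the cited articles: it treats the dependencies given in the input as Direct and everything else reachable through them as Indirect. A requirement reached both directly and through another requirement is reported as Direct only.

```python
def direct_and_indirect(depends_on):
    """Split each requirement's dependencies into direct and indirect ones.

    depends_on: dict mapping a requirement id to the set of ids it
    depends on directly.
    """
    result = {}
    for req in depends_on:
        direct = set(depends_on.get(req, ()))
        seen = set()
        stack = list(direct)
        while stack:  # depth-first walk over everything reachable from req
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(depends_on.get(node, ()))
        seen.discard(req)  # ignore a requirement reaching itself via a cycle
        result[req] = {"direct": direct, "indirect": seen - direct}
    return result


# The example from the text: X depends on Z, while Z depends on Y.
deps = {"X": {"Z"}, "Z": {"Y"}, "Y": set()}
info = direct_and_indirect(deps)
print(info["X"])  # {'direct': {'Z'}, 'indirect': {'Y'}}
```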
On the other hand, the External category consists of Time-related, Value-related, Human Resource, and Business Process. There is only one type in the Time-related subcategory, namely Time-based [42], [65]. This means a requirement that needs to be implemented based on the time stated in the project schedule. Meanwhile, in the Value-related subcategory, the two types comprise:
• Cost-based [28], [42], [65], [58], [57] means a requirement that can affect cost. Other terms found are Contribution [35] and Cost-related [58].
There are two types in the Human Resource subcategory, namely:
• Dependencies due to Downstream Activities [52] imply requirements whose implementation considers optimizing existing human resources.
• Team-based Dependencies [52] concern avoiding multiple teams having to work on the same or on dependent requirements.
The last subcategory of the External category is Business Process, which has three types as follows:
• Inter-domain Dependencies [52] indicate requirements whose implementation depends on requirements across business sectors.
• Intra-domain Dependencies [52] indicate requirements whose implementation depends on certain business processes.

What are the existing techniques used for requirements dependency problems in RP? (RQ3)
There are various techniques proposed in the selected articles for solving requirements dependency problems in RP. All the techniques used in the selected studies (based on QAC) were analyzed, clustered, and examined in terms of their processes. In general, each group of techniques targets a specific kind of problem: Decision Making is used to address RP with multiple criteria, Evolutionary Algorithm for computational optimization, Fuzzy Logic to handle uncertainty factors, NLP for automated identification of RP and requirements dependency (RD) based on human language, Machine Learning for automatic determination of RP based on datasets, and Graph-based approaches for mapping RD within groups of requirements.
The most commonly used techniques found in the selected articles are Decision Making, including Collaborative requirement prioritization method [12], Utility-based prioritization [68], Majority Voting Goal-Based (MVGB) [37], Analytic Hierarchy Process (AHP) [18], [50] and Hierarchical Dependencies [31]. One of the Multi-Criteria-Decision-Making techniques is AHP. AHP has excellent accuracy since pairwise comparison is able to provide decisions that are accurate and worth considering [69]. However, pairwise comparison is time-consuming for large scale projects [37].
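The trade-off noted above, accurate pairwise judgments against a rapidly growing number of comparisons, can be illustrated with a short sketch. The code below approximates AHP priority weights with the row geometric mean method, a common simplification of the principal eigenvector calculation; it is not the procedure of any specific cited study, and the comparison matrix is hypothetical.

```python
import math


def comparisons_needed(n):
    """Number of pairwise comparisons for n requirements: n(n-1)/2."""
    return n * (n - 1) // 2


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method.

    matrix[i][j] states how much more important requirement i is than
    requirement j, with matrix[j][i] == 1 / matrix[i][j] and ones on the
    diagonal.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent judgments: R1 is twice as important
# as R2 and four times as important as R3.
judgments = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
print([round(w, 3) for w in ahp_weights(judgments)])  # [0.571, 0.286, 0.143]
print(comparisons_needed(3), comparisons_needed(100))  # 3 4950
```

Three requirements need only 3 comparisons, whereas 100 requirements need 4950, which is why pairwise comparison becomes time-consuming for large-scale projects.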
The second-highest technique is Evolutionary Algorithm (EA), which comprises the Least-Squares-Based Random Genetic Algorithm [30], Hybrid Enriched Genetic Revamped Integer Linear Programming [42], Multi-objective Evolutionary Algorithms (MOEAs) [15], MOSAs [58], Interactive Genetic Algorithm (IGA) [51] and Early Mutation Testing [32]. The most widely used EA technique is the Genetic Algorithm (GA). This technique aims to reduce computation time. It can be combined with other techniques that are able to provide better accuracy. In the selected articles, EA is only used in simulation cases. Thus, it needs to be proven in industry settings.
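As a rough illustration of how a Genetic Algorithm can take dependencies into account, the sketch below selects a subset of requirements that maximizes total value within a budget while rejecting any selection that includes a requirement without its prerequisites. This is a deliberately simple, generic GA (truncation selection, one-point crossover, single-bit mutation) and not a reimplementation of any algorithm cited above; the values, costs, budget, and parameter defaults are all hypothetical.

```python
import random


def fitness(bits, value, cost, prereq, budget):
    """Total value of the selection, or -1 if it breaks budget or dependencies."""
    chosen = {i for i, b in enumerate(bits) if b}
    if sum(cost[i] for i in chosen) > budget:
        return -1
    if any(p not in chosen for i in chosen for p in prereq.get(i, ())):
        return -1
    return sum(value[i] for i in chosen)


def genetic_search(value, cost, prereq, budget, generations=60, pop_size=30, seed=0):
    """Search for a high-value feasible selection with a basic GA."""
    rng = random.Random(seed)
    n = len(value)

    def score(bits):
        return fitness(bits, value, cost, prereq, budget)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    population.append([0] * n)  # the empty selection is always feasible
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)                # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children            # parents survive (elitism)
    return max(population, key=score)


# Hypothetical instance: requirement 0 needs requirement 1 as a prerequisite.
value = [10, 6, 8, 4]
cost = [5, 4, 6, 3]
prereq = {0: [1]}
best = genetic_search(value, cost, prereq, budget=12)
print(best, fitness(best, value, cost, prereq, 12))
```

Because the best half of each generation is carried over and the empty selection is seeded into the initial population, the returned selection is always feasible, although a GA gives no guarantee that it is optimal.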
The next category is Fuzzy Logic. There are five techniques in this category, namely the Hierarchical Fuzzy Inference System (HFIS) [7], Fuzzy Inference System (FIS) [45], Fuzzy Clustering [62], Rough Set Theory [63], and Tensor and Fuzzy Graphs [28]. Fuzzy Logic is used to help in the decision-making process. Each stakeholder's perception of the value of a requirement is different, which is mainly based on interests and knowledge. Fuzzy Logic can be used to solve uncertainty problems due to human judgment.
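One simple way to see how fuzzy values absorb differences in stakeholder judgment is to map linguistic ratings to triangular fuzzy numbers, average them, and defuzzify the result. The sketch below does exactly that and nothing more; it is not the HFIS or FIS of the cited works, and the linguistic terms and their numeric ranges are assumptions chosen for illustration.

```python
# Hypothetical linguistic scale: each term is a triangular fuzzy number
# (lower bound, most likely value, upper bound) on a 0..1 importance scale.
TERMS = {
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
}


def aggregate(ratings):
    """Component-wise mean of the triangular fuzzy numbers for the ratings."""
    triples = [TERMS[r] for r in ratings]
    n = len(triples)
    return tuple(sum(t[k] for t in triples) / n for k in range(3))


def defuzzify(tfn):
    """Centroid of a triangular fuzzy number (l, m, u): (l + m + u) / 3."""
    return sum(tfn) / 3.0


# Three hypothetical stakeholders rate the same requirement differently.
crisp = defuzzify(aggregate(["high", "medium", "high"]))
print(round(crisp, 3))  # 0.667
```

The crisp value can then be used as one input to the ranking, while the width of the aggregated triangle indicates how much the stakeholders disagree.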
Previous studies also use Natural Language Processing (NLP) for requirements dependency in RP, such as Satisfiability Modulo Theories (SMT) [48] and SNIPR [70]. SNIPR completes SMT. NLP is used as the input for both techniques. Requirements are clustered using NLP and combined with weighting dependencies. The ranking process is combined with GA [48] and AHP [70]. NLP is quite helpful in filtering requirements, thus minimizing redundancy and similarity. Nevertheless, NLP still needs to be explored further for detecting dependencies between requirements, so that costs and time can be further optimized, especially for large-scale projects.
Other existing techniques are Graph and Matrix. They are used to visualize and calculate relation weights. The Matrix can be applied separately [34], [71], [44], [38] or in conjunction with Graph [43], [57]. Graphs are composed of nodes, which represent requirements, and edges as relations. Matrix, on the other hand, consists of rows and columns, with cells showing relations between requirements. Because of the visual representation, both techniques make it easy to view dependencies among requirements. However, the techniques would consume time and cost for large-scale requirements.
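The relationship between the Graph and Matrix views can be sketched as follows: a list of weighted edges (the graph) is converted into a square matrix whose cells hold relation weights, and column sums show how heavily other requirements rely on each requirement. The identifiers and weights are hypothetical, and the column-sum measure is an illustrative choice rather than a metric defined in the cited articles.

```python
def build_matrix(requirements, relations):
    """Build a weighted dependency matrix from (source, target, weight) edges.

    Cell [i][j] holds the weight of "requirement i depends on requirement j".
    """
    index = {r: i for i, r in enumerate(requirements)}
    matrix = [[0.0] * len(requirements) for _ in requirements]
    for source, target, weight in relations:
        matrix[index[source]][index[target]] = weight
    return matrix


def dependency_load(requirements, matrix):
    """Column sums: total weight with which others depend on each requirement."""
    return {r: sum(row[j] for row in matrix) for j, r in enumerate(requirements)}


# Hypothetical graph: nodes are requirements, edges are weighted relations.
reqs = ["R1", "R2", "R3"]
edges = [("R1", "R2", 0.8), ("R3", "R2", 0.5), ("R3", "R1", 0.2)]
matrix = build_matrix(reqs, edges)
load = dependency_load(reqs, matrix)
print(matrix)
print(load)
```

The matrix has one cell per ordered pair of requirements, so its size grows quadratically with the number of requirements, which is consistent with the time and cost concern raised for large-scale requirements.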
Machine learning (ML) has been introduced to automate the process of RP. There are five ML methods for requirements dependency in RP, namely CDBR [29], DRank [35], Active Learning [60], Supervised Classification Technique [55], and Interactive Next Release Problem (iNRP) [39]. First, CDBR exploits the Particle Swarm Optimization (PSO) method [29]. The technique minimizes conflicts between stakeholders and developers using a variety of population sizes, between 10 and 50 sets of requirements. In terms of computational time and complexity, CDBR shows excellent results compared to AHP. Second, DRank uses the RankBoost algorithm for learning and calculating requirements dependencies [35]. Graphs are used to represent dependencies between requirements. The DRank method generates two types of graph: the first represents contributions, and the second represents business rules. In time-consumption tests, DRank is superior to the AHP and CBRank methods [35]. Third, iNRP uses Least Median Square (LMS) and Multilayer Perceptron (MLP) techniques [39]. In general, ML could be used to reduce interactions with practitioners. It provides better computational efficiency with significant scalability. However, the challenge is the availability of datasets and the selection of techniques that fit the project's characteristics.
Fig. 8 shows the distribution of techniques used to address requirements dependency problems in RP, based on their technical bases. Most techniques seem to employ Decision Making and Evolutionary Algorithm technical bases.

Fig. 8. Techniques of handling requirements dependency in requirements prioritisation
The brief explanation of each technique in terms of process and limitations is presented in Table 6. Based on the synthesis presented in the table, each technique has its own designated approaches in solving requirements dependency problems in RP. It can be seen that most techniques to date only handle small numbers of requirements and cover limited types of dependency. They in fact have yet to be tested using real cases with large data sets. Some are incomplete and shallow, covering trivial aspects of the matter. Stakeholders have different perceptions in assessing the importance of requirements due to diverse backgrounds and knowledge. The Fuzzy Logic techniques are widely used to solve the problem of uncertainties among stakeholders in determining priorities [7], [28], [45].
Decision Making techniques help RP by comparing requirements [12], [18], [31], [33], [37], [50], [54]. The techniques are accurate, but the comparisons become complex when they involve a large number of requirements. The techniques are therefore not suitable for large-scale projects. ML can help resolve the issue of large data. However, ML first requires training data based on a generated knowledge base containing patterns or rules. Graph [38], [40], [43] and Matrix [34], [43], [44], [71] can help visualize the dependencies between requirements. As these techniques only focus on visualization, they usually have to be combined with other techniques because RP needs to consider the values of interest between requirements. NLP works by reading and recognizing patterns. Although promising, a common obstacle in applying this technique is the inconsistency in the written requirements [48], [70]. As such, reading the patterns is difficult. Recognizing discrete patterns is also not straightforward, as dependencies vary. EA is a metaheuristics-based technique [30], [42], [47], [51], [72]. Its advantage is the gene selection, which means only the best requirements can survive. EA can be combined with another method that exploits a genetic algorithm to reduce the number of pairs of elicited requirements [51]. Only pairs that allow the disambiguation of equally ranked or differently ranked requirements are elicited. However, this technique has yet to be applied in real projects to truly prove its practicality.

Technique: Early mutation testing [32]
Technical base: Evolutionary Algorithm
Description: The mutation process is involved in the modification of software artifact (e.g., CS) by injecting artificial faults. Each mutated version is called a mutant.

Technique: Requirements Change Analysis [34]
Technical base: Matrix
Description: A method based on the changes that change themselves, which are initiated at higher levels.
Process:
• Analyzing the change using functions
• Identifying the difficult changing and
• Identifying the dependencies using a matrix
Limitation: Need to identify the effort to implement a requirement change and to apply the method to a more complex case study.

Technique: DRank [35]

Technique: Dependency-aware software release planning (DA-SRP) [57]
Technical base: Graph
Description: Dependency aware software release planning (DA-SRP) maximizes the overall value of an optimal subset of features while considering the influences of value-related dependencies extracted from user preferences.
Process:
• The process starts with identification of value-related dependencies from collected user preferences.
• Identified value-related dependencies will be modeled using the algebraic structure of fuzzy graphs
• the resulting model is referred to as the Feature Dependency Graph (FDG) of the system
• Finally, perform dependency-aware release planning to find an optimal configuration of the features using the proposed integer programming model.

Combination is the most explored. Across the dependency types, DRank (ML) is the most applied technique. However, DRank is used so far to address Internal-Functional dependency only.
Most of the techniques proposed by the authors are still at the research stage and have not been applied to solve real-world industry problems. One technique that has been applied in real-world cases is DA-SRP [57]. This technique is used in the development of industrial software called PMS-II. The case study involves 23 features considering user preferences within a certain budget. The preference matrix for PMS-II is constructed based on user preferences. The calculation of dependency strength and quality related to values is then performed using fuzzy membership functions. This technique results in optimal feature selection. Another technique is Integrating Active Learning with Ontology-Based Retrieval [60]. This technique is applied to two industrial datasets, namely Siemens Austria and Blackline Safety Corp Canada. Both companies have collected software requirements and manually determined RD. The application of the proposed technique in the research reduces efforts with good accuracy, achieving 86% accuracy in the second company.
Based on these findings, several preliminary interpretations can be made. One possible explanation for why most techniques address Internal dependency is that such dependency is definite and structured. Thus, it is more straightforward to tackle. On the other hand, External dependency involves vague and diverse elements that rely heavily on the nature of the project. The elements in fact vary across projects and are therefore not easy to determine. This also helps to explain why, among External dependency types, only Cost-based and Time-based are addressed by the techniques. This is because cost and time are the most objective variables in projects. Another challenge in handling external dependencies is that they involve conditions beyond the control of system developers. External dependencies can be addressed by involving stakeholders as business owners or parties related to the system requirements being developed. Considerations from these stakeholders can be used as an important factor in determining the priority sequence.

Conclusion
This study has provided an understanding of requirements dependency in RP in terms of its impacts, types and techniques, based on a review of forty selected articles. The results show that requirements dependency has significant impacts on RP. Ignoring requirements dependency during RP could delay product release and increase project cost as well as project risk. The different types of requirements dependency have also been identified. There are at least 14 types, which can be clustered into two categories: Internal and External. Each type has different characteristics and thus requires different techniques. There are 28 techniques that are capable of handling requirements dependency problems in RP. These techniques are derived from various technical bases, including Fuzzy Logic, Decision Making, Evolutionary Algorithm, Matrix, Machine Learning, Graph, and Natural Language Processing.
Some limitations and gaps are observed in the reviewed articles, which require further research. Most techniques focus on Internal dependency rather than External. Within Internal dependency, Functional is investigated more than Structural. With regard to practicality, most techniques to date are still being tested in laboratory settings with small data sets and cover limited types of dependency. Their scalability and efficiency in handling large-scale requirements are thus arguable. Future studies should apply RP techniques that consider various types of dependency in large-scale software development with correspondingly large sets of requirements. As RP plays an important role in ensuring the success of a software project, effective and yet practical solutions are necessary. In prioritizing requirements that involve multiple factors and a large number of requirements, combining Multicriteria Decision Making techniques with Machine Learning can be beneficial. However, it requires adjustments based on business