Effort Estimation
Several methods have been used to analyse effort data, but the reference technique has always been classic regression. Regression, however, presupposes the form of the relationship between the variables, so it becomes necessary to use other techniques that search the space of non-linear relationships. Some works in the field have built models (expressed as equations) based on size, which is the factor that most affects the cost (effort) of a project [Dol00],[KT85]. The equation that relates size and effort can be adjusted for environmental factors such as productivity, tools, and the complexity of the product, among others. The analyst usually adjusts the equations to fit the real data.

From this perspective, different equation patterns have emerged [Dol00],[Hu97], but none of them has produced enough evidence to be considered the definitive cost function, if one exists at all. Nevertheless, an estimation equation has to satisfy one requirement: the model should reliably estimate the majority of the real values. So far it has not been possible to obtain an equation, set of equations, or pattern of equations that satisfies this premise, and therefore there is no reference parameter for comparison. It can thus be assumed that equations alone are not a good tool for obtaining an optimal prediction.
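As an illustration of how an analyst might adjust a size-effort equation to data, the following minimal sketch fits the form effort = a * size^b by least squares in log-log space. The power-law form, the function names `fit_power_law` and `predict_effort`, and the data are assumptions made for this example; the data are synthetic and are not taken from [Dol00] or any other cited work.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares on the logs.

    Taking logarithms gives log(effort) = log(a) + b * log(size),
    which is a straight line in log-log space.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope = exponent of the power law
    a = math.exp(mean_y - b * mean_x)    # intercept, mapped back from log space
    return a, b


def predict_effort(size, a, b):
    """Estimate effort for a given size with the fitted coefficients."""
    return a * size ** b


# Synthetic, noise-free data generated from effort = 2 * size**1.5
# (hypothetical values, for illustration only).
sizes = [10.0, 20.0, 40.0, 80.0]
efforts = [2.0 * s ** 1.5 for s in sizes]

a, b = fit_power_law(sizes, efforts)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"estimated effort for size 50: {predict_effort(50.0, a, b):.1f}")
```

With real project data the fit would not be exact, and the choice of functional form is itself part of the problem described above: a different pattern (linear, quadratic, logarithmic) may fit a given data set better.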
Instances and best known solutions for those instances:

Estimating the effort invested in the development of software projects becomes a difficult problem when appropriate models are not available. Unfortunately, this is still the situation today, because software development companies do not keep the necessary records. Years of investigation are required to gather the volume of information needed to make predictions with good reliability and a low error margin. The available data sets are not ideal: they are small, they record few variables, and they depend on the particular circumstances of each company. The quality of the predictions can improve if more appropriate data sets become available and the methods are studied in more depth.

Data sets are provided below. Each set contains information about a certain number of software development projects. For each project there are two variables: the independent variable, which is the size of the generated code (measured in lines of code or function points), and the dependent variable, which is the effort (time) invested in developing the project. The "Size" and "Effort" columns show the measure used. The "Projects" column shows the number of projects in the data set.
Here we present some results extracted from [RGH04] and [Dol00]. Part of the data analysis was done with a tool called WEKA, which includes methods such as KNN, linear regression, neural networks and K*. The experiments with the KNN method used values of 3 and 4 for the constant k (and so are named knn-3 and knn4 in the tables). Neural networks (NN) used the backpropagation algorithm with 20 neurons in one hidden layer and 500 training epochs. LR stands for linear regression and AR for arithmetic regression. The tools used in [Dol00] were approximation by square, cubic and logarithmic functions (named "Curve" in the tables) and genetic programming (GP).

To measure the predictive capacity of the methods, two well-known measures have been used: PRED and MMRE. The prediction at level l, PRED(l), is the quotient between the number of cases in which the estimated value is within the limit l of the real value and the total number of cases. MMRE is the Mean Magnitude of Relative Error. The criterion for considering a model a good one is MMRE < 0.25.

Table 1: Obtained predictions with 25% of PRED
Table 2: Mean Magnitude of Relative Error
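The PRED(l) and MMRE measures described in the text can be sketched as follows. This is a minimal illustration, assuming the usual reading in which the limit l is applied to the relative error |actual - estimated| / actual; the function names and the sample values are hypothetical and are not results from [RGH04] or [Dol00].

```python
def mre(actual, estimated):
    """Magnitude of relative error for a single project."""
    return abs(actual - estimated) / actual


def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error over all projects."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals, estimates, level=0.25):
    """Fraction of projects whose relative error is at most `level`."""
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return hits / len(actuals)


# Hypothetical real and estimated effort values, for illustration only.
actuals = [100.0, 200.0, 400.0, 800.0]
estimates = [110.0, 150.0, 600.0, 800.0]

mmre_value = mmre(actuals, estimates)
pred_25 = pred(actuals, estimates, level=0.25)

print(f"MMRE     = {mmre_value:.4f}")
print(f"PRED(25) = {pred_25:.2f}")
print("acceptable by the MMRE < 0.25 criterion:", mmre_value < 0.25)
```

Note that the two measures can disagree: a model may have a low MMRE because most of its errors are small, while a few large errors still keep PRED(25) below 1.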
Related Papers:

[RGH04] M. Rodríguez, I. Galván, J.C. Hernández, P. Isasi, "An Estimate of the Necessary Effort in the Development of Software Projects", Proceedings Workshop on Intelligent Technologies for Software Engineering (WITSE04), pp. 309-319.

[Dol00] J.J. Dolado, "A validation of Component-based method for software size estimation", IEEE Transactions on Software Engineering, 26 (10) (2000), pp. 61-72.

[Hu97] Q. Hu, "Evaluating alternative software functions", IEEE Transactions on Software Engineering, 23 (6) (1997), pp. 379-387.

[KT85] B.A. Kitchenham, N.R. Taylor, "Software projects development cost estimation", Journal of Systems and Software, 5 (1985), pp. 267-278.