TL;DR: The background and service model of cloud computing, the product of the fusion of traditional computing technology and network technology, is introduced and the existing issues in cloud computing such as security, privacy, reliability and so on are introduced.
Abstract: Cloud computing, a rapidly developing information technology, has attracted worldwide attention. Cloud computing is Internet-based computing, whereby shared resources, software and information are provided to computers and devices on demand, like the electricity grid [1]. Cloud computing is the product of the fusion of traditional computing technology and network technology, such as grid computing, distributed computing, parallel computing and so on. It aims to construct a system with powerful computing capability from a large number of relatively low-cost computing entities, and to use advanced business models such as SaaS (Software as a Service), PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) to deliver that computing capacity to end users. This article introduces the background and service model of cloud computing. It also introduces the existing issues in cloud computing, such as security, privacy, reliability and so on. Solutions to these issues are also proposed.
TL;DR: It is demonstrated formally that decision trees can be seriously hurt by the curse of dimensionality in a sense that is a bit different from other nonparametric statistical methods, but most importantly that they cannot generalize to variations not seen in the training set.
Abstract: The family of decision tree learning algorithms is among the most widespread and studied. Motivated by the desire to develop learning algorithms that can generalize when learning highly varying functions such as those presumably needed to achieve artificial intelligence, we study some theoretical limitations of decision trees. We demonstrate formally that they can be seriously hurt by the curse of dimensionality in a sense that is a bit different from other nonparametric statistical methods, but most importantly, that they cannot generalize to variations not seen in the training set. This is because a decision tree creates a partition of the input space and needs at least one example in each of the regions associated with a leaf to make a sensible prediction in that region. A better understanding of the fundamental reasons for this limitation suggests that one should use forests or even deeper architectures instead of trees, which provide a form of distributed representation and can generalize to variations not encountered in the training data.
TL;DR: A novel feature selection method based on part-of-speech and HowNet, which chooses the words with larger amount of information by different part- of-speech, and expands the semantic features of these words based on HowNet so that the short text has more useful features.
Abstract: Feature selection is a process that extracts, from the original feature set, the feature subsets that are most representative of the original meaning. It greatly reduces text processing time and increases accuracy by removing some data outliers. With the rapid development of Web 2.0 and the further evolution of the Internet, short text such as micro-blog posts plays an important role in people's daily life. However, existing feature selection methods cannot effectively extract features from such short text, which greatly reduces the classification and clustering performance on short text. In this regard, we propose a novel feature selection method based on part-of-speech and HowNet. According to the composition of the text, we choose the words that carry more information for each part-of-speech, and then expand the semantic features of these words based on HowNet, so that the short text has more useful features. We use a test data set collected from Sina micro-blog and adopt the micro average and macro average of the F1-measure to evaluate the effect on short text classification. The results show that the short text features selected by our method are informative and yield good classification results.
TL;DR: A novel multi- microgrids distributed control oriented hierarchical and distributed multi-Agent system (MAS) architecture was constructed and designed that was composed of multi-microgrids management Agent, microgrid control Agent and local Agent.
Abstract: According to the architecture and the characteristics of the multi-microgrids control system, a novel hierarchical and distributed multi-Agent system (MAS) architecture oriented to multi-microgrids distributed control was constructed and designed. The MAS is composed of three Agent types: the multi-microgrids management Agent, the microgrid control Agent and the local Agent. It uses CORBA technology as the communication mode for the whole system. Finally, the main functions of each Agent are designed, and the communication model of the multi-Agent system is analyzed.
TL;DR: In this method, Particle Swarm Optimization (PSO) is introduced into the reconnaissance UAV path planning algorithm, and target value, effective reconnaissance path and other factors that impact UAV path planning are included in the objective function of PSO.
Abstract: This paper presents a method of fixed-point reconnaissance path planning for Unmanned Aerial Vehicles (UAVs). In this method, Particle Swarm Optimization (PSO) is introduced into the reconnaissance UAV path planning algorithm, and target value, effective reconnaissance path and other factors that impact UAV path planning are included in the objective function of PSO. The optimal reconnaissance path is obtained through PSO optimization. Finally, a simulation is carried out and satisfactory results are achieved.
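The PSO search loop that this abstract relies on can be sketched as follows. This is a minimal, generic illustration and not the paper's planner: the function name `pso_minimize`, the coefficients `w`, `c1`, `c2`, and the stand-in `toy_objective` (squared distance to a fixed point) are all assumptions made for the example. A real planner would replace the objective with one that scores target value and effective reconnaissance path.

```python
import random

def toy_objective(x):
    # Hypothetical stand-in cost: squared distance to the point (3, -2).
    return (x[0] - 3.0) ** 2 + (x[1] + 2.0) ** 2

def pso_minimize(objective, dim, bounds, n_particles=30, iters=200, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights (assumed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull to personal best, pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the particle to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

if __name__ == "__main__":
    print(pso_minimize(toy_objective, dim=2, bounds=(-10.0, 10.0)))
```

Fixing the random seed keeps the run reproducible; in practice the swarm size and iteration count would be tuned to the planning problem.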
TL;DR: This work proposed an adaptive selection scheme and an adaptive ranks clone scheme by the online discovered solutions in different ranks by exploiting the dynamic information of the online antibody population to solve multi‐objective optimization problems.
Abstract: Artificial immune systems (AIS) are computational systems inspired by the principles and processes of the vertebrate immune system. AIS-based algorithms typically exploit the immune system's characteristics of learning and adaptability to solve complicated problems. Although several AIS-based algorithms have been proposed to solve multi-objective optimization problems (MOPs), little attention has been paid to adaptively using the solutions discovered online. Here, we propose an adaptive selection scheme and an adaptive ranks clone scheme that use the online discovered solutions in different ranks. Accordingly, the dynamic information of the online antibody population is efficiently exploited, which is beneficial to the search process. Furthermore, it is widely accepted that one-off deletion cannot obtain excellent diversity in the final population; therefore, a k-nearest neighbor list (where k is the number of objectives) is established and maintained to eliminate solutions from the archive population. The k-nearest neighbors of each antibody are found and stored in a list memory. Once the antibody with the minimal product of k-nearest neighbor distances is deleted, the neighborhood relations of the remaining antibodies in the list memory are updated. Finally, the proposed algorithm is tested on 10 well-known and frequently used multi-objective problems and two many-objective problems with 4, 6, and 8 objectives. Compared with five other state-of-the-art multi-objective algorithms, namely NSGA-II, SPEA2, IBEA, HYPE, and NNIA, our method achieves comparable results in terms of convergence, diversity metrics, and computational time.
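The archive truncation step described in this abstract can be illustrated with a small sketch. It assumes Euclidean distance in objective space and, for brevity, recomputes every neighbor list after each deletion rather than maintaining an incrementally updated list memory as the abstract describes. The name `prune_archive` is hypothetical.

```python
import math

def prune_archive(objs, size, k):
    """Delete solutions one at a time until `size` remain.

    Each round removes the solution whose product of distances to its
    k nearest neighbors is smallest, i.e. the most crowded one.
    """
    archive = list(objs)
    while len(archive) > size:
        def crowding(i):
            dists = sorted(
                math.dist(archive[i], archive[j])
                for j in range(len(archive)) if j != i
            )
            return math.prod(dists[:k])
        most_crowded = min(range(len(archive)), key=crowding)
        del archive[most_crowded]
    return archive

if __name__ == "__main__":
    front = [(0.0, 1.0), (0.5, 0.5), (0.51, 0.49), (1.0, 0.0)]
    print(prune_archive(front, size=3, k=2))
```

Deleting one solution per round, instead of all surplus solutions at once, is what lets the remaining neighbors be re-evaluated before the next deletion.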
TL;DR: DIPKIP may be applied as a decision support system, which, under the supervision of a KM expert, can provide useful and practical proposals to senior management for the improvement of KM, leading to flexibility, cost savings, and greater competitiveness.
TL;DR: The applicability of neural network models for better prediction of reliability in a realistic environment is explored and an assessment method of software reliability growth using connectionist model is presented.
Abstract: Software reliability assessment has been a vital factor in quantitatively characterizing the quality of any software product during the testing phase. Over the years, many analytical models have been proposed for modeling software reliability growth trends, with different predictive capabilities at different phases of testing. Yet a single model that can be applied for accurate prediction in all circumstances has still to be developed. Here we explore the applicability of neural network models for better prediction of reliability in a realistic environment and present an assessment method of software reliability growth using a connectionist model. We apply the feed-forward back-propagation algorithm and discuss the related issues of network architecture, method of data representation and some unrealistic assumptions incorporated in software reliability models. The model has been applied to different failure data sets collected from several standard software projects. A numerical example is given to illustrate the results, revealing significant improvement from using artificial neural networks over conventional statistical models based on NHPP.
TL;DR: This paper describes the application of Principal Component Analysis for fault detection and diagnosis (FDD) in a real plant, and demonstrates that it is possible to detect and identify faults.
Abstract: This paper describes the application of Principal Component Analysis (PCA) for fault detection and diagnosis (FDD) in a real plant. PCA is a linear dimensionality reduction technique. In order to diagnose the faults, the PCA approach includes one PCA model for each system behavior, i.e., a PCA model for normal operating conditions and a PCA model for each faulty situation. The data set is generated in closed loop. The method of fault detection and diagnosis is based on the definition of minimum thresholds. These are calculated from the Q statistics and levels of significance. The outputs of the PCA models (in this case the Q statistics) are compared with their minimum thresholds, with and without faults. The only model that does not violate its threshold indicates the actual system situation, i.e., identifies the fault. Finally, this technique is applied to a two-tank system, and it is demonstrated that it is possible to detect and identify faults.
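To make the Q statistic concrete, the sketch below fits a one-component PCA model and computes Q as the squared residual left after projecting a sample onto that component. Several things here are assumptions for illustration only: the helper names `fit_pca_1` and `q_statistic`, the use of power iteration to find the component, the toy training data, and the threshold, which is simply the largest Q seen on the training data rather than a value derived from a level of significance as in the abstract.

```python
import math

def fit_pca_1(data, iters=200):
    """Return (mean, first principal direction) using power iteration."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return mean, v

def q_statistic(x, mean, v):
    """Squared reconstruction error of x under the one-component model."""
    xc = [xi - mi for xi, mi in zip(x, mean)]
    score = sum(a * b for a, b in zip(xc, v))
    return sum((a - score * b) ** 2 for a, b in zip(xc, v))

# Toy "normal operation" data lying close to the line y = 2x.
train = [(float(x), 2.0 * x + (0.1 if x % 2 == 0 else -0.1)) for x in range(10)]
mean, direction = fit_pca_1(train)
# Assumed empirical threshold: the largest Q observed on the training data.
threshold = max(q_statistic(p, mean, direction) for p in train)

def is_faulty(sample):
    return q_statistic(sample, mean, direction) > threshold

if __name__ == "__main__":
    print(is_faulty((4.0, 8.05)), is_faulty((5.0, 0.0)))
```

For diagnosis in the sense of the abstract, one such model and threshold would be kept per known behavior, and the model whose threshold is not violated would name the current situation.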
TL;DR: A new multiagent learning system, called ISABEL, is proposed that provides each student who is using a specific device with a device agent able to autonomously monitor the student's behavior when accessing e‐learning Web sites.
Abstract: Personalization is becoming a key issue in designing effective e-learning systems and, in this context, a promising solution is represented by software agents. Usually, these systems provide the student with a student agent that interacts with a site agent associated with each e-learning site. However, in the presence of a large number of students and e-learning sites, the tasks of the agents are often onerous, even more so if the student agents run on devices with limited resources. To address this problem, we propose a new multiagent learning system, called ISABEL. Our system provides each student who is using a specific device with a device agent able to autonomously monitor the student's behavior when accessing e-learning Web sites. Each site is associated, in turn, with a teacher agent. When a student visits an e-learning site, the teacher agent collaborates with some tutor agents associated with the student to provide the student with useful recommendations. We present both theoretical and experimental results to show that this distributed approach introduces significant advantages in the quality and efficiency of the recommendation activity with respect to the performance of other past recommenders.
TL;DR: This method is better than traditional fuzzy clustering algorithms at handling datasets that are 'curved', elongated or those which contain clusters of different dispersion, and has been implemented in Matlab and C++ and is available at http://www.ox.ac.uk/cmb/difFUZZY.
Abstract: Soft (fuzzy) clustering techniques are often used in the study of high-dimensional datasets, such as microarray and other high-throughput bioinformatics data. The most widely used method is the fuzzy C-means (FCM) algorithm, but it can present difficulties when dealing with some datasets. A fuzzy clustering algorithm, DifFUZZY, which utilises concepts from diffusion processes in graphs and is applicable to a larger class of clustering problems than other fuzzy clustering algorithms, is developed. Examples of datasets (synthetic and real) for which this method outperforms other frequently used algorithms are presented, including two benchmark biological datasets, a genetic expression dataset and a dataset that contains taxonomic measurements. This method is better than traditional fuzzy clustering algorithms at handling datasets that are 'curved', elongated or those which contain clusters of different dispersion. The algorithm has been implemented in Matlab and C++ and is available at http://www.maths.ox.ac.uk/cmb/difFUZZY.
TL;DR: Multi-tenancy based access control model (MTACM) was designed to embed the security duty separation principle in cloud and was a two granule level access control mechanism, one was tenant granule for CSP to compartmentalize different customers, the other was application granule to control the access to their own applications.
Abstract: Though cloud computing has many advantages, it still faces a major challenge in security and privacy. This problem is also an obstacle to cloud adoption, since no one is willing to run a business in facilities over which he has no control. Moreover, since cloud computing is a multi-tenancy IT service mode, there should be a capability to compartmentalize different customers in cloud facilities; therefore, security duty separation between the CSP and customers must be supported in the cloud. However, this security duty separation is not common in traditional security mechanisms. A multi-tenancy based access control model (MTACM) was designed to embed the security duty separation principle in the cloud; it is a two-granule-level access control mechanism, with a tenant granule for the CSP to compartmentalize different customers and an application granule for customers to control access to their own applications. MTACM is technically and practically feasible. A prototype introduced in this paper shows that MTACM performs well.
TL;DR: The experimental results show that the method can ensure that the cloud knows neither the user's real face data nor the private face matching identification result, keeping the user's face data secure, and a credible, efficient, low-complexity method to guarantee cloud computing security is developed.
Abstract: This paper studies a method for addressing cloud computing security issues using private face recognition. The method has three parts: the user part provides face images; the cloud initialization part holds a face subspace and a templates database; the cloud private matching identification part contains the core algorithm of the method, which compares two encrypted numbers under double-encrypted conditions. The experimental results show that the method can ensure that the cloud knows neither the user's real face data nor the private face matching identification result, keeping the user's face data secure. We thus develop a credible, efficient, low-complexity method to guarantee cloud computing security.
TL;DR: Experiments based on the standard UCI database show that the proposed method can produce high-purity clustering results and eliminate the sensitivity of the k-means algorithm to the initial centers to some extent.
Abstract: The traditional k-means algorithm is sensitive to the initial start centers. To solve this problem, this paper proposes a new method for finding the initial centers that reduces the sensitivity of the k-means algorithm to them. The algorithm first computes the density of the area to which each data object belongs; it then finds k data objects that belong to high-density areas and uses them as the initial start centers. Experiments based on the standard UCI database show that the proposed method can produce high-purity clustering results and eliminate the sensitivity to the initial centers to some extent.
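One way to picture the two steps in this abstract (estimate density, then pick k objects from high-density areas) is the sketch below. The abstract does not say how density is measured or how the k objects are kept apart, so two assumptions are made and should be read as such: density is the number of other points within a fixed `radius`, and a candidate is skipped if it lies within `radius` of a center that was already chosen. The function name `density_init` is hypothetical.

```python
import math

def density_init(points, k, radius):
    """Choose up to k initial centers from the densest neighborhoods."""
    n = len(points)
    # Density of point i: how many other points fall within `radius` of it.
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: density[i], reverse=True)
    centers = []
    for i in order:
        # Assumed separation rule: skip points too close to a chosen center.
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers

if __name__ == "__main__":
    data = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
            (5, 5), (5.1, 5), (5, 5.1), (9, 0)]
    print(density_init(data, k=2, radius=0.5))
```

The returned centers would then seed an ordinary k-means run. If the data has fewer than k well-separated dense regions for the given radius, the function returns fewer than k centers.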
TL;DR: The hierarchical clustering method can assist the case-selecting and can improve the efficiency and effect of the tax inspection.
Abstract: Nowadays, some enterprises have many different ways of evading tax, which poses a puzzle for tax inspection. It has become extremely important for tax inspectors to carry out their work rapidly and accurately. Traditional inspection case-selecting is mainly based on reported information. Judging the characteristics of unscrupulous taxpayers in this way depends largely on the past experience and intuition of professional inspectors. This paper uses hierarchical clustering in tax inspection case-selecting. First, this paper describes the theory of clustering. Second, it analyses the index data of 30 enterprises using hierarchical clustering and obtains the analysis result. Finally, the result is compared with the known taxation cases. We conclude that the hierarchical clustering method can assist case-selecting and can improve the efficiency and effect of tax inspection.
TL;DR: A new clustering method which is applied to a Gaussian smoothed image using bootstrapping approach of feature weighting and found that the proposed scheme provides a better clustering performance for brain MRI lateral ventricular compartments segmentation.
Abstract: In this paper we introduce a new clustering method and apply it to brain magnetic resonance imaging (MRI) lateral ventricular compartments segmentation. The method uses Gaussian smoothing to enable fuzzy c-mean (FCM) to create both a more homogeneous clustering result and reduce effect caused by noise. With the objective of finding the optimal clustering results, we present a weighted clustering scheme which is applied to a Gaussian smoothed image using bootstrapping approach of feature weighting. The scheme is called weighted FCM with Gaussian smoothing (WGFCM). In addition to the observations on the clustering results of the MR images, we use validity functions and clustering centroids to evaluate the clustering results. Compared with the standard FCM with or without Gaussian smoothing, we found that the proposed scheme provides a better clustering performance for brain MRI lateral ventricular compartments segmentation.
TL;DR: This work derives the knowledge of complexity reduction from partial solutions and provides algorithms for automated dimension reduction in RL and proposes the cascading decomposition algorithm based on the spectral analysis on a normalized graph Laplacian to decompose a problem into several subproblems and then conduct parameter relevance analysis on each subproblem to perform dynamic state abstraction.
Abstract: High dimensionality of state representation is a major limitation for scale-up in reinforcement learning (RL). This work derives the knowledge of complexity reduction from partial solutions and provides algorithms for automated dimension reduction in RL. We propose the cascading decomposition algorithm based on the spectral analysis on a normalized graph Laplacian to decompose a problem into several subproblems and then conduct parameter relevance analysis on each subproblem to perform dynamic state abstraction. The elimination of irrelevant parameters projects the original state space into the one with lower dimension in which some subtasks are projected onto the same shared subtasks. The framework could identify irrelevant parameters based on performed action sequences and thus relieve the problem of high dimensionality in learning process. We evaluate the framework with experiments and show that the dimension reduction approach could indeed make some infeasible problem to become learnable.
TL;DR: An artificial bee colony (ABC) algorithm approach is introduced in PID tuning and on-line tuning as a novel technique for optimum adaptive control in a non-Lyapunov way.
Abstract: This paper is concerned with the PID controller, which is an extremely important type of controller. However, the optimal PID parameters are difficult to determine, and performance is highly sensitive to the initial guess of the solution. An artificial bee colony (ABC) algorithm approach is introduced in PID tuning and on-line tuning as a novel technique for optimum adaptive control in a non-Lyapunov way. Resolving the controller {u(k)} and its parameters {k_P, k_i, k_d} is converted into the minimization of a series of multi-modal nonnegative functions, whose minima can be determined by ABC. The details of applying the proposed method are given, and the experiments show that the proposed strategy is effective and robust.
TL;DR: Key steps in the instructional design process are proposed and the whole process begins with writing the course overview and re-designing the lesson plan to be launched onto a learning management system.
Abstract: In recent years, with the advances of the Internet and e-learning technologies, a blended mode of learning, which effectively combines the traditional face-to-face learning and e-learning, has evolved. Yet, this blended learning mode is not widely adopted in higher education. One major reason is that teachers are not familiar with the practices of designing courses under the blended learning environment. This paper investigates the instructional design practices for blended learning. Key steps in the instructional design process are proposed. The whole process begins with writing the course overview and re-designing the lesson plan. Following a blended learning course template, the face-to-face learning elements and e-learning elements are specified. Lesson materials and resources are then developed as learning objects to be launched onto a learning management system. Continuous review is required.
TL;DR: The two main factors that determine citizen trust in e-government, based on the technology acceptance model (TAM), are identified, and the relation between these determinants and trust in e-government is presented.
Abstract: E-governments are increasingly becoming a familiar fixture. Nations across the world are realizing the importance of e-government, and the main objective of most e-governments is to better serve citizens. However, citizens' likelihood of using e-government is low. Lack of trust has been recognized as one of the main barriers to citizens engaging in e-government, involving trust in the internet and trust in the government. This paper identifies these two main factors that determine citizen trust in e-government based on the technology acceptance model (TAM), and it reviews relevant studies that investigate the elements of e-government trust. The relation between determinants and trust in e-government is presented. The role of TAM in the development of trust in e-government is explained in detail.
TL;DR: This research uses a survey instrument to gather data from 27 Malaysian based software firms to identify the most practiced requirements engineering activities and the least practiced requirements Engineering activities in the software firms.
Abstract: Requirement Engineering (RE) phase has been regarded as one of the important phases in the development process. Inadequate engineering of requirements can lead to more expensive errors in the later software development phases. Even though there are many methods and techniques which have been proposed in the literatures, many of these methods and techniques have not been widely practiced in the industry. To be able to rectify the situation, the assessment of the current practice is crucial. The main goal of this work is to investigate the software engineering practices especially the requirements engineering practices in the Malaysian software industry. Many of the practicing software developers are the product of the local educational institutions. The findings may help the industries to plan for enhancements in the requirement engineering practices. This research uses a survey instrument to gather data from 27 Malaysian based software firms. The main contribution of this research is the identification of the most practiced requirements engineering activities and the least practiced requirements engineering activities in the software firms.
TL;DR: Using the method of traffic data analysis can effectively improve the level of road traffic safety management and reduce the likelihood of future road traffic accidents.
Abstract: Aiming at the frequent occurrence of road traffic accidents, a method of analyzing the causes of road traffic accidents based on data mining is put forward. First, the related attributes and causes of road traffic accidents are analyzed. Then two basic theories of data mining are introduced: rough set theory and the theory of association rules. Finally, the method of analyzing the causes of road traffic accidents based on data mining is proposed. Using this method of traffic data analysis can effectively improve the level of road traffic safety management.
TL;DR: A quick and accurate measurement of the surface temperature for an object was realized and the development of infrared temperature measurement is helpful to non-contact, quickly and accurately measuring moving and high temperature objects.
Abstract: The development of infrared temperature measurement is helpful for non-contact, quick and accurate measurement of moving and high-temperature objects. We develop an infrared thermometer for high-temperature objects under the premise of high measuring accuracy and low cost. A quick and accurate measurement of the surface temperature of an object was realized.
TL;DR: This article presents a novel model for providing safe electronic marketplaces: Commodity Trunits, a system that considers trust as a tradable commodity, and demonstrates that a market operator can manage the trunit marketplace to ensure sustainability.
Abstract: In large electronic marketplaces populated by buying and selling agents, it is difficult to judge trustworthiness. A variety of systems have been proposed to help traders to find trustworthy partners by learning to discount or disregard disreputable parties. In this article, we present a novel model for providing safe electronic marketplaces: Commodity Trunits, a system that considers trust as a tradable commodity. In this system, sellers require units of trust (trunits) to participate in transactions, and risk losing trunits if they act dishonestly. Sellers can purchase trunits when needed, and sell excess quantities. We demonstrate that under Commodity Trunits, rational sellers will choose to be honest, since this is the profit maximizing strategy. We also show that Commodity Trunits provides protection from a number of vulnerabilities common in existing trust and reputation systems, e.g., the important exit problem, where sellers can cheat without fear of repercussions if they intend to leave the market. We then present a simulation that validates the system by demonstrating that a market operator can manage the trunit marketplace to ensure sustainability. We conclude with a discussion of the value of Commodity Trunits as a method for promoting trust in electronic marketplaces.
TL;DR: An improved data clustering algorithm with a new initialization method based on finding a set of medians extracted from a dimension with maximum variances that could be adapted by web search engine developers for more efficient web search result clustering is shown.
Abstract: This paper formulates, simulates and assesses an improved data clustering algorithm for mining web documents, with a view to preserving their conceptual similarities and eliminating the problem of speed while increasing accuracy. The improved data clustering algorithm was formulated using the concept of the K-means algorithm. Real and artificial datasets were used to test the proposed and existing algorithms. The proposed algorithm was simulated using the fuzzy logic and statistical toolboxes in Matlab 7.0. The simulated results were compared with the existing data clustering algorithm using accuracy, response time, adjusted Rand index and entropy as performance parameters. The result is an improved data clustering algorithm with a new initialization method based on finding a set of medians extracted from the dimension with maximum variance. The simulation showed that the accuracy is at its peak when the number of clusters is 3 and reduces as the number of clusters increases. The proposed clustering algorithm showed an accuracy of 89.3%, while the existing algorithm had an accuracy of 88.9%. The entropy was stable for both algorithms, with a value of 0.2485 at k = 3. It also decreased as the number of clusters increased, until the number of clusters reached eight, where it increased slightly. The adjusted Rand index values varied from 0 to 1 for both clustering algorithms. When the number of clusters was five, the existing method achieved an adjusted Rand index value of 53%, compared with 63.7% for the proposed method. In addition, the response time decreased from 0.0451 seconds to 0.0439 seconds when the number of clusters was three. This showed that the proposed data clustering algorithm reduced response time by 2.7% compared with K-means data clustering. This study has shown that the proposed data clustering algorithm could be adapted by web search engine developers for more efficient web search result clustering.
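The initialization idea named in this abstract, medians taken along the dimension with maximum variance, could look roughly like the sketch below. The abstract gives no further detail, so the partitioning rule is an assumption: points are sorted along the highest-variance dimension, cut into k contiguous groups of near-equal size, and the middle point of each group becomes an initial center. The function name `median_init` is hypothetical, and the input is assumed to contain at least k points.

```python
import statistics

def median_init(points, k):
    """Pick k initial centers as group medians along the max-variance axis."""
    dims = len(points[0])
    # Find the dimension along which the data is most spread out.
    variances = [statistics.pvariance([p[d] for p in points]) for d in range(dims)]
    d_max = max(range(dims), key=lambda d: variances[d])
    ordered = sorted(points, key=lambda p: p[d_max])
    n = len(ordered)
    centers = []
    for j in range(k):
        # Contiguous slice j of the sorted data (assumed partitioning rule).
        chunk = ordered[j * n // k:(j + 1) * n // k]
        centers.append(chunk[len(chunk) // 2])
    return centers

if __name__ == "__main__":
    data = [(1, 0), (2, 0), (3, 1), (10, 0), (11, 1), (12, 0)]
    print(median_init(data, k=2))
```

Because the centers are actual data points chosen deterministically, repeated runs of K-means seeded this way start from the same configuration, unlike random initialization.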
TL;DR: Results clearly show that IPL-EPSO is very competent in solving the UC problem in comparison to other existing methods using PSO or IPL, and presents not only the advantage of intelligence algorithm for addressing NP-hard problem, but also the guiding effects of human knowledge, reducing the complexity of computation.
Abstract: This paper proposes a new approach to solving the ramp rate constrained unit commitment (RUC) problem by improving the method of particle swarm optimization, namely improved priority list and enhanced particle swarm optimization (IPL-EPSO). The IPL-EPSO proposed in this paper is a combination of improved priority list (IPL) and enhanced particle swarm optimization (EPSO), which decomposes the UC problem into two sub-optimization problems and solves them separately. The IPL is applied to solve the unit scheduling problem, considering the power balance constraint, system reserve constraint, start-up/shut-down ramp rate constraint, operation ramp rate constraint and minimum up/down-time constraint, and the EPSO is then used to solve the ramp rate constrained economic dispatch (RED) problem, in order to provide specific solutions satisfying the power balance constraint and ramp rate constraint. Such cooperation exploits not only the advantage of intelligent algorithms for addressing NP-hard problems, but also the guiding effect of human knowledge, reducing the complexity of computation. The problem formulation, representation, parameter testing and final simulation results for 10, 20 and 40 generator-scheduling problems are presented. Results clearly show that IPL-EPSO is very competent in solving the UC problem in comparison to other existing methods using PSO or IPL.
TL;DR: This paper proposes a hybrid software architecture evaluation method (SAEM) for the feature driven development (FDD) agile process model, combining the quality attribute workshop, the architecture tradeoff analysis method (ATAM) and active review for intermediate designs (ARID).
Abstract: The software development industry suffers from delays in project completion caused by the heavy documentation requirements of traditional process models. To overcome these delays, agile process models are gaining wide acceptance and popularity in the industry. The appeal of these models is lightweight documentation and intensive communication. Because these models emphasize rapid development, there is an ever-increasing need for architecture evaluation. No single software architecture evaluation method (SAEM) capable of preserving agility currently exists. In this paper, we propose a hybrid SAEM for the feature driven development (FDD) agile process model. The proposed SAEM is a hybrid of the quality attribute workshop (QAW), the architecture tradeoff analysis method (ATAM) and active review for intermediate designs (ARID).
TL;DR: A fire detection platform based on the probabilistic neural network (PNN) method is shown to detect fire faster and more accurately and to reject nuisance disturbances from fluorescent lights or alcohol burners, which makes it promising for practical applications.
Abstract: To reduce false alarms and alarm failures in fire detection systems, a fire detection method based on multi-sensor data fusion was studied. A probabilistic neural network (PNN) data fusion algorithm was employed to detect fire based on texture features from the fire scene. Temperature and smoke concentration information were each processed separately by a trend algorithm. The results of these three fire detection algorithms were combined through decision-level data fusion to accomplish fire detection and automatic fire alarm. It has been demonstrated that a fire detection platform based on this method can detect fire faster and more accurately and can reject nuisance disturbances from fluorescent lights or alcohol burners, which makes it promising for practical applications.
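For readers unfamiliar with PNNs, the classifier can be sketched as a Parzen-window estimator: each class score is the average Gaussian kernel response between the input and that class's training patterns, and the highest score wins. This is a generic textbook-style sketch, not the paper's implementation. The two-dimensional feature vectors, the class labels and the smoothing parameter `sigma` are hypothetical, and the trend algorithms and decision-level fusion are not shown.

```python
import math

# Sketch of a probabilistic neural network (PNN) classifier.
# Feature values and sigma below are made-up illustrative numbers.

def pnn_classify(x, train, sigma=0.5):
    """Return (best_label, scores) for input x.

    train maps each class label to a list of training feature vectors.
    """
    scores = {}
    for label, patterns in train.items():
        total = 0.0
        for p in patterns:
            sq_dist = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))  # pattern layer
        scores[label] = total / len(patterns)                 # summation layer
    best = max(scores, key=scores.get)                        # decision layer
    return best, scores


train = {
    "fire": [[0.9, 0.8], [0.8, 0.9]],
    "no_fire": [[0.1, 0.2], [0.2, 0.1]],
}
label, scores = pnn_classify([0.85, 0.75], train)
print(label)
```

A smaller `sigma` makes the decision depend more strongly on the nearest training patterns; in practice it would be tuned on held-out data.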
TL;DR: This study comparatively assesses five classification techniques, namely multilayer perceptron and probabilistic neural networks, nearest neighbour classifiers, multi-class support vector machines and classification trees, for fold recognition on a reference set of proteins that are organised in 27 folds and described by 125-dimensional vectors of sequence-derived features.
Abstract: Fold recognition based on sequence-derived features is a complex multi-class classification problem. In the current study, we comparatively assess five different classification techniques, namely multilayer perceptron and probabilistic neural networks, nearest neighbour classifiers, multi-class support vector machines and classification trees, for fold recognition on a reference set of proteins that are organised in 27 folds and described by 125-dimensional vectors of sequence-derived features. We evaluate all classifiers in terms of total accuracy, mutual information coefficient, sensitivity and specificity using ten-fold cross-validation. A polynomial support vector machine and a multilayer perceptron with one hidden layer of 88 nodes performed better than the other classifiers and achieved satisfactory multi-class classification accuracies (42.8% and 42.1%, respectively), given the complexity of the problem and the similar classification performance reported by other researchers.
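The evaluation protocol above can be sketched in a few lines: split the sample indices into ten folds, then compute total accuracy and one-vs-rest sensitivity and specificity from the predictions. This is a minimal sketch with hypothetical helper names and toy labels. The interleaved fold assignment is an assumption, since the abstract does not say how folds were formed, and the mutual information coefficient is omitted.

```python
# Sketch of k-fold splitting and multi-class evaluation metrics.
# Labels and predictions below are toy data, not results from the study.

def k_fold_indices(n, k=10):
    """Yield (train_indices, test_indices) for k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def total_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity_specificity(y_true, y_pred, label):
    """One-vs-rest sensitivity and specificity for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
acc = total_accuracy(y_true, y_pred)
sens_a, spec_a = sensitivity_specificity(y_true, y_pred, "a")
splits = list(k_fold_indices(20, k=10))
```

In a full experiment each classifier would be trained on the `train` indices and scored on the `test` indices of every split, with the metrics aggregated over the ten folds.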
TL;DR: The results of the evaluation are analyzed to highlight the gaps and deficiencies in current GSD-based RE process models and to emphasize the importance of a well-defined RE process model for GSD that covers the majority, if not all, of the GSD issues.
Abstract: Requirements Engineering (RE) is the first activity of the Software Development Life Cycle (SDLC) and plays a vital role in the success or failure of any project. RE is difficult enough in a collocated setting, and the difficulty is exacerbated in distributed settings by the inherent and induced problems of Global Software Development (GSD), which include communication issues, strategic issues, cultural issues, knowledge management, technical issues and time zone differences. A proper and well-defined RE process model for the GSD setting is therefore required. This paper reports the GSD-based RE process models found in the literature. These models are then evaluated on the basis of their coverage of RE activities and GSD issues. Finally, the results of the evaluation are analyzed to highlight the gaps and deficiencies in current GSD-based RE process models and to emphasize the importance of a well-defined RE process model for GSD that covers the majority, if not all, of the GSD issues.