Past Students


The following are students who have graduated under my supervision or co-supervision.

Project Students


Jonathan Coffey BSc Hons (Computer Science)
2020
Stellenbosch University
A Visualisation Tool for Population-based, Meta-heuristic Algorithms
Many real-world optimisation problems have been solved using population-based computational intelligence (CI) algorithms. These problems produce varying search landscapes in high dimensions that can cause disruption in many of these CI algorithms. A visualisation tool that depicts the search landscapes of higher dimensional optimisation problems, and how a set of population-based algorithms traverse the solutions in these search landscapes, can greatly improve the understanding of the algorithms, whether the user is a novice or experienced in the field of CI. This project will develop such a visualisation tool.
James Faure B.Eng (Industrial Engineering)
2020
Stellenbosch University
Image Classification and Recognition of X-rays Used to Label Teeth and Teeth Abnormalities in Dental Analysis
Analysis of an X-ray for any dentist can be time consuming and subject to human error. This project will develop machine learning technologies to automate analysis of dental X-rays for the purpose of determining whether there are any abnormalities with any of the teeth. The resulting model can be implemented on an app platform which can easily be accessed by orthodontic radiologists, and which will be especially useful to those in rural areas where no dentists are available.
Michaela Govender B.Eng (Industrial Engineering)
2020
Stellenbosch University
Adaptive Polynomial Approximation Approach to Data Streams
This study will develop a new approach to modeling non-stationary data. The approach will make use of polynomial approximation, where the coefficients of a standard polynomial will be determined using particle swarm optimization (PSO). A quantum-inspired PSO algorithm will be used to allow the polynomial to adapt to non-stationary environments. The effectiveness of this approach will then be evaluated.
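For readers unfamiliar with the mechanics, the sketch below illustrates the core idea of the static case (an illustration only, not this project's implementation): a global-best PSO searches directly over the coefficient vector of a fixed-degree polynomial, with mean squared error on sampled data as the fitness function. The target data and all parameter values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(-1, 1, 50)
y = 2.0 * x**3 - x + 0.5 + rng.normal(0, 0.05, x.shape)  # assumed target data

def fitness(coeffs):
    # Mean squared error between the candidate polynomial and the data.
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

n_particles, dim, iters = 30, 4, 200   # dim = number of coefficients
w, c1, c2 = 0.729, 1.494, 1.494        # commonly used inertia/acceleration values

pos = rng.uniform(-3, 3, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest = pos.copy()
pbest_f = np.array([fitness(p) for p in pos])
gbest = pbest[np.argmin(pbest_f)].copy()

for _ in range(iters):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([fitness(p) for p in pos])
    improved = f < pbest_f
    pbest[improved] = pos[improved]
    pbest_f[improved] = f[improved]
    gbest = pbest[np.argmin(pbest_f)].copy()

print("fitted coefficients (highest degree first):", np.round(gbest, 3))
```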
Neil Lubbe B.Eng (Industrial Engineering)
2020
Stellenbosch University
Using A Predictive Data Analytics Model to Predict The Ageing Potential of South African Red Wines
Tasting wine remains a subjective experience and, furthermore, perceiving the ageing potential of a wine is an equally subjective experience. The unique arrangement of chemical compounds in a wine influences its tasting notes as well as the perception of its ageing potential. Furthermore, the development of these chemical compounds over time is responsible for the successful (or unsuccessful) ageing of the wine. It must, however, be noted that the tasting notes of a wine are not necessarily correlated with its chemical composition, nor is it known whether the one can predict the other. Fortunately, experienced winemakers tend to agree on which wines have aged well and which have not. Hence, the purpose of this project is to determine whether the actual/realized ageing potential of a wine can be predicted by utilizing the initial tasting perception and/or the initial chemical composition of the wine.
Annika Nel BSc Hons (Computer Science)
2020
Stellenbosch University
A Set-Based Approach to Training Support Vector Machines
Training a support vector machine (SVM) is a constrained optimisation problem which involves constructing an optimal class-separating hyperplane. This has traditionally been achieved using a vector-based Lagrange multiplier approach. The objective of this project is to redefine the training process as a set-based optimisation problem and to construct an SVM using set-based particle swarm optimisation. Such an implementation could provide several advantages over existing approaches. For example, unlike the vector-based approach, the dimension of candidate solutions would not have to be specified, and different candidate solutions may be of different sizes. Furthermore, a set-based solution would reduce the computational cost of training an SVM, resulting in a more scalable and efficient classifier.
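For context, the vector-based Lagrange multiplier approach mentioned above solves the standard SVM dual problem, while the set-based reformulation instead searches for the subset of training instances that act as support vectors. The dual, in its usual kernelised form, is:

```latex
\max_{\boldsymbol{\alpha}} \; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(\mathbf{x}_i, \mathbf{x}_j)
\quad \text{subject to} \quad
\sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C,
```

where the training instances with nonzero multipliers are the support vectors; it is exactly this subset that a set-based optimiser can search for directly.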
Stefan van Deventer BSc Hons (Computer Science)
2020
Stellenbosch University
Learning to Trade from Zero-Knowledge
This research involves the implementation and analysis of a competitive co-evolutionary particle swarm optimization (PSO) algorithm to develop trading strategies from zero knowledge. The competitive co-evolutionary PSO will be used to train a neural network (NN) to produce trading decisions. The focus will be on South African stocks as traded on the Johannesburg Stock Exchange (JSE). The aim is to evaluate different particle swarm optimization and competitive co-evolutionary strategies to see whether it is feasible to use the resulting neural network traders in the real world. As part of the analysis, issues such as the saturation of the neural network hidden units due to using PSO as the training algorithm will be addressed. Because trading is ultimately a dynamic optimization problem, a quantum PSO will then be implemented to determine whether the performance of the neural network trader can be improved.
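In this setting each particle encodes a full weight vector of the neural network, and the standard inertia-weight PSO update drives the search:

```latex
v_{ij}(t+1) = w\,v_{ij}(t)
  + c_1 r_{1j}(t)\,[y_{ij}(t) - x_{ij}(t)]
  + c_2 r_{2j}(t)\,[\hat{y}_j(t) - x_{ij}(t)],
\qquad
x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1),
```

where x_ij is the j-th weight held by particle i, y_ij its personal best, and the hatted term the global best position.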
JP van Zyl BSc Hons (Computer Science)
2020
Stellenbosch University
Set-based Particle Swarm Optimization for Polynomial Approximation
This project aims to develop a set-based particle swarm optimization algorithm to find optimal polynomials to approximate continuous mappings.
Donovan Edeling B.Eng (Industrial Engineering)
2021
Stellenbosch University
Set-based Particle Swarm Optimization for Polynomial Approximation
Swarm intelligence techniques that mimic the foraging behaviour of a swarm of birds have been successful on various static optimisation problems. Many real-world optimisation problems, however, are dynamic, and optimisation methods are needed that are capable of continuously adapting the solution in these changing environments. The main aim of this project is to develop and evaluate a new approach to approximate polynomials in dynamic environments, using set-based particle swarm optimisation (SBPSO). Polynomial approximation has been successful in static environments, where fixed data sets are used. The project will modify current techniques to address the problems experienced when using static approximation in dynamic environments. The use of quantum particles will be investigated to ensure diversity is maintained during the course of an algorithm run. The performance of the algorithm will be compared to currently available approaches.
Charl Herbst B.Eng (Industrial Engineering)
2021
Stellenbosch University
Neural Networks with Adaptive Activation Functions
Adaptive activation functions are activation functions that have been modified by introducing a trainable parameter into the function. Training a neural network with the additional parameter results in a dynamic optimization problem, which can be optimized to improve the performance of the neural network as the parameter dynamically changes the topology of the loss function. The gradient and orientation of the learned decision boundary can more accurately approximate the gradient of the true decision boundary by changing the parameter of the adaptive activation function. This will result in a reduction of misclassifications and improve the generalization capability of the neural network. This project will develop a neural network with adaptive activation functions (NNAAF) using a standard PSO algorithm. A fitness landscape analysis will be conducted to investigate how the parameter(s) influence the characteristics of the neural network error landscape. The training statistics of the NNAAF model will be compared to those of a model trained with static activation functions. As a result of the dynamic formulation of the problem, a PSO algorithm developed for dynamic environments, such as the quantum PSO (QPSO) algorithm, will also be implemented. The performance of the QPSO algorithm will then be tested under the presence of concept drift.
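As a simple example of the kind of function involved (an assumed illustration, not necessarily the exact functions studied in this project), a sigmoid with a trainable slope parameter lambda is

```latex
f(x; \lambda) = \frac{1}{1 + e^{-\lambda x}},
```

where lambda is optimised alongside the weights, so every change to lambda reshapes the error landscape seen by the remaining parameters; this is what makes the training problem dynamic.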
Reynard Marx B.Eng (Industrial Engineering)
2021
Stellenbosch University
Prediction and Analysis of Exoplanet Features Using Self-Organizing Maps
The aim of this project is to train a self-organizing map (SOM) on exoplanet data. The SOM will then be used to identify clusters of similar exoplanets, predict values for features in the dataset, and impute missing values. Exoplanet datasets tend to be sparsely populated due to limitations in measurement techniques. SOMs tend to perform well even on sparsely populated datasets, hence the utilisation of SOMs in this research. Research on exoplanets is currently a very exciting area, mainly because of the potential of finding habitable worlds. The use of a SOM to cluster exoplanets together will allow us to identify which exoplanets are most like Earth. This can be one indication that these exoplanets should be investigated further for signs of habitability.
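For reference, a SOM is trained by repeatedly moving the weight vector of each map unit j towards the presented instance, in proportion to the unit's neighbourhood distance from the best-matching unit b:

```latex
\mathbf{w}_j(t+1) = \mathbf{w}_j(t)
  + \eta(t)\, h_{b,j}(t)\, [\mathbf{x}(t) - \mathbf{w}_j(t)],
\qquad
h_{b,j}(t) = \exp\!\left(-\frac{\lVert \mathbf{r}_b - \mathbf{r}_j \rVert^2}{2\sigma(t)^2}\right),
```

with learning rate and neighbourhood width both decaying over time. Imputation then follows naturally: a missing feature of an instance can be read off the corresponding component of its best-matching unit.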
Ross Nayler B.Eng (Industrial Engineering)
2021
Stellenbosch University
Incremental Feature Learning
The aim of this project is to develop a dynamic particle swarm optimization (PSO) algorithm for incremental feature learning (IFL) that can be used to train a neural network (NN). IFL refers to problems where the number of descriptive features increases over time. IFL can also refer to incrementally adding features from a current set of available features. As new features are added to the model, the search landscape dimensionality will change; thus IFL is essentially a dynamic optimization problem (DOP). Feature importance will be ranked using the Boruta feature selection (BFS) algorithm. Many current machine learning approaches come with a substantial computational cost, and the goal of this project is to produce an accurate IFL model that has a lower computational cost. The performance of the model will be compared to the performance of a model that has been trained on all available features.
Robert Power B.Eng (Industrial Engineering)
2022
Stellenbosch University
Dynamic Radial Basis Function Neural Networks
A radial basis function neural network (RBFNN) is a neural network (NN) in which the activation functions of the hidden units are radial basis functions. The use of such activation functions provides the NN with the ability to generate multivariate non-linear mappings. The purpose of this thesis is to resolve a common issue associated with the construction of RBFNNs, namely selecting the optimal number of hidden units. An investigation will be conducted into whether a particle swarm optimization (PSO) algorithm can be used to dynamically adjust the number of hidden units, the means and standard deviations that define the hidden-unit kernels, and the hidden-output weights. If this investigation confirms this notion, the resultant NN can be classified as a dynamic RBFNN. An empirical comparison will then be conducted between this dynamic RBFNN and other concept drift robust training approaches and algorithms.
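For reference, with Gaussian basis functions the output of an RBFNN takes the form

```latex
y_k(\mathbf{x}) = \sum_{j=1}^{J} w_{jk}
  \exp\!\left(-\frac{\lVert \mathbf{x} - \boldsymbol{\mu}_j \rVert^2}{2\sigma_j^2}\right),
```

where J is the number of hidden units; the centres, the widths, the hidden-output weights, and J itself are exactly the quantities the PSO would adjust in the dynamic variant described above.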
Weka Steyn B.Eng (Industrial Engineering)
2022
Stellenbosch University
Improved Multi-Guide Particle Swarm Optimization for Many-Objective Optimization
The project serves to improve the scalability of the multi-guide particle swarm optimization (MGPSO) algorithm by incorporating previous research from other many-objective optimization algorithms, such as KnEA and NSGA-II, without drastically increasing the computational complexity of the algorithm. Different archive update strategies are explored, as well as the potential implementation of secondary convergence mechanisms such as knee points. Utilizing sub-objective dominance as an additional criterion for solution dominance, due to the high number of non-dominated solutions present as the number of objectives increases, is also considered. The results are compared to those of the standard MGPSO algorithm and other state-of-the-art many-objective optimization algorithms.
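The dominance relation at issue is standard Pareto dominance: for a minimisation problem with objectives f_1, ..., f_m, a solution x dominates y iff

```latex
f_k(\mathbf{x}) \le f_k(\mathbf{y}) \;\; \forall k \in \{1,\dots,m\}
\quad \text{and} \quad
\exists k : f_k(\mathbf{x}) < f_k(\mathbf{y}).
```

As m grows, almost all candidate solutions become mutually non-dominated under this relation, which is what motivates the additional sub-objective criterion mentioned above.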
D Ferreri B.Eng (Industrial Engineering)
2023
Stellenbosch University
Dynamic Radial Basis Function Neural Networks
This project explores the effectiveness of a dynamic Quantum Particle Swarm Optimization (DQPSO) algorithm in training Radial Basis Function Neural Networks (RBFNNs) across various stationary datasets. The study hypothesizes that DQPSO can improve training efficiency and accuracy of RBFNNs compared to traditional training methods. Using datasets such as Iris, Wisconsin Breast Cancer, Wine, Heart Disease, and Adult Income, the research conducts a thorough empirical analysis to compare the performance of DQPSO-RBFNN with other RBFNN models. Key metrics such as training and generalization accuracy, along with training time, were evaluated to assess each model's effectiveness. The findings reveal that while the DQPSO-RBFNN shows superior training accuracy across all datasets, it also requires significantly more computational time, especially in complex datasets. This highlights a crucial trade-off between computational efficiency and accuracy. In contrast, simpler algorithms like LVQ1-SGD-RBFNN and PSO-RBFNN offer a more balanced performance, especially in terms of computational efficiency. The results from this study provide valuable insights for the application of advanced machine learning algorithms in practical scenarios, emphasizing the need to consider both accuracy and computational resources in model selection. The research contributes to the broader understanding of optimizing neural network training methods and opens avenues for future studies in algorithm efficiency and applicability in diverse data environments.
MA Kurrimboccus BSc Hons (Computer Science)
2023
Stellenbosch University
A hyper-heuristic framework for the CIFY library
This project aims to develop a generic hyper-heuristic framework for CIFY, a Python library of computational intelligence algorithms.
JR Lee BSc Hons (Computer Science)
2023
Stellenbosch University
A Comparative Study of Nature-Inspired Algorithms for Prioritized Foraging
The study investigates the efficiency of three distinct swarm foraging algorithms (a swarm-based baseline model, desert ant, and honey bee) in prioritized foraging scenarios. These algorithms are evaluated across various environments, including uniform, Gaussian, and clustered distributions of prioritized objects. The performance of the models is evaluated with respect to efficiency, item prioritization, and energy consumption. The baseline model shows rapid adaptability but falls short in item prioritization. The desert ant model offers balanced performance, excelling in foraging rate but lacking in item differentiation. In contrast, the honey bee model demonstrates superior item prioritization, although it struggles to maintain a high foraging rate. The study identifies several challenges in swarm intelligence, such as parameter tuning and adaptability, that warrant further research. Overall, the findings provide valuable insights into the effectiveness of swarm foraging algorithms, particularly in environments requiring prioritized foraging.
H Raubenheimer BSc Hons (Computer Science)
2023
Stellenbosch University
A Division of Labour Approach to Optimal Traffic Light Scheduling
Traffic light scheduling is a critical aspect of traffic management, with many recently developed solutions incorporating computational intelligence methodologies. This report presents a traffic light scheduling algorithm based on a task allocation model that simulates the division of labour among insects in a colony. The developed algorithm switches the green light based on a probability calculated every second from the traffic volume around the traffic light. Application of this algorithm to several benchmark simulated traffic scenarios shows superior performance compared to five other traffic scheduling algorithms.
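A common response-threshold formulation from division-of-labour models (given here as an assumed illustration of the kind of probability involved, not necessarily the exact rule used in the report) lets a stimulus s, such as queued traffic volume, trigger the task with probability

```latex
P(s) = \frac{s^{n}}{s^{n} + \theta^{n}},
```

where theta is the individual's threshold and n (often n = 2) controls the steepness of the response; phases with high accumulated demand thereby become increasingly likely to claim the green light.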
G Steyn BSc Hons (Computer Science)
2023
Stellenbosch University
Adaptive Model Trees
Model trees are a type of decision tree model used to predict continuous numerical target features, and traditionally employ linear multivariate regression models. Modern computational intelligence implementations of model trees provide a number of benefits that make model trees more robust to non-static data environments. Concept drift is a challenge in such environments, whereby a model trained on prior training instances becomes invalidated by new unseen data, causing the predictive performance to degrade. A non-linear model tree algorithm which uses genetic programming and particle swarm optimization is proposed and tested, with the objective of developing a model tree implementation that can be trained on non-stationary data and is relatively robust to concept drift.
R Vivier BSc Hons (Computer Science)
2023
Stellenbosch University
Automated Analysis of Metaheuristics
Metaheuristics, i.e. nature-inspired optimization algorithms, present challenges in the evaluation and comparison of performance due to the stochastic characteristics of the algorithms. Sound comparisons necessitate extensive evaluations across multiple dimensions, benchmark problems, performance measures, and independent runs. In this report, an automated system for the statistical analysis of metaheuristic algorithms is presented. The objective of the system is to facilitate robust and comprehensive algorithm evaluation through various statistical measures and visual representations. A full-stack application was engineered through the incorporation of a web-based front end and the utilisation of non-parametric statistical tests for the evaluation of algorithm performance. Features include a maintainable archive of performance results and statistics, and options for user-specific comparison requests. The report provides background on metaheuristics and non-parametric testing, as well as detailed discussions on the system architecture, the methodological approaches adopted, and the practical implications of the automated analysis and ranking system. The report also includes insights into potential future developments and enhancements of the system, for a comprehensive perspective on metaheuristic algorithm evaluation.

Masters Students (Structured)


The following are structured masters students who completed a mini-dissertation under my supervision.

Zowa, G M.BA
2003
University of Pretoria
Technical Survival: Strategic Technology Migration for Next Generation Networks Telecommunication Business
Van Wyk, D M.IT
2003
University of Pretoria
A Genetic Programming Approach to Normalizing Databases
Rossouw Landman MEng
2021
Stellenbosch University
Comparison of Unsupervised Machine Learning Models for Identification of Financial Time Series Regimes and Regime Changes
Financial stock data has been studied extensively over many years with the objective of generating the best possible return on an investment. It is known that financial markets move through periods where securities are increasing in value (bull markets) and periods where these securities decrease in value (bear markets). These periods that exhibit similarities over different time frames are often referred to as regimes, which are not necessarily limited to bull and bear regimes, but include any sequences of data that experience correlated trends. Regime extraction and detection of regime shift changes in financial time series data can be of great value to an investor. An understanding of when these financial regimes will change, and towards what type of regime the financial market is tending, can help improve investment decisions and strengthen financial portfolios. This research deals with reviewing and comparing the viability of different regime shift detection algorithms when applied to multivariate financial time series data. The selected algorithms are applied to different stocks from the Johannesburg Stock Exchange (JSE), where the algorithms' performances are compared with respect to regime shift detection accuracy and profitability of regimes in selected investment strategies.
Judene Simonis MEng
2021
Stellenbosch University
Comparison of machine learning models on different financial time series
The efficient market hypothesis implies that shrewd market predictions are not profitable because each asset remains correctly priced by the weighted intelligence of the market participants. Several companies have shown that the efficient market hypothesis is invalid. Consequently, a considerable amount of research has been conducted to understand the performance and behaviour exhibited by financial markets, as such insights would prove valuable in the quest to identify which products will provide a positive future return. Recent advancements in artificial intelligence have presented researchers with exciting opportunities to develop models for forecasting financial markets.
This dissertation investigated the capabilities of different machine learning models to forecast the future percentage change of various assets in financial markets. The financial time series (FTS) data employed are the S&P 500 index, the US 10 year bond yield, the USD/ZAR currency pair, gold futures and Bitcoin. Only the closing price data for each FTS was used. The different machine learning (ML) models investigated are linear regression, autoregressive integrated moving average, support vector regression (SVR), multilayer perceptron (MLP), recurrent neural network, long short term memory and gated recurrent unit. This dissertation uses an empirical procedure to facilitate the formatting, transformation, and modelling of the various FTS data sets on the ML models of interest. Two validation techniques are also investigated, namely single out-of-sample validation and walk-forward validation. The performance capabilities of the models are then evaluated with the mean square error (MSE) and accuracy metrics. Within the context of FTS forecasting, the accuracy metric refers to the ratio of correct guesses about whether the price movement increased or decreased to the total number of guesses. An accuracy that is one percentage point above 50% is considered substantial when forecasting FTS, because a 1% edge on the market can result in a higher average return that outperforms the market.
For the individual analysis of the single out-of-sample and walk-forward validation techniques, the linear regression model was the best ML model for all FTS, because it is the most parsimonious model. The concept of a parsimonious model was disregarded when comparing and contrasting the two validation techniques. The ML models applying walk-forward validation performed the best in terms of MSE on the S&P 500 index and US 10 year bond yield. The SVR model obtained the highest accuracy of 52.94% on the S&P 500 index, and the MLP model obtained the highest accuracy of 51.26% on the US 10 year bond yield. The ML models applying single out-of-sample validation performed the best in terms of MSE on the USD/ZAR currency pair, gold futures and Bitcoin. The MLP model obtained the highest accuracy of 51.77% and 53.51% for the USD/ZAR currency pair and gold futures, respectively. The linear regression model obtained the highest accuracy of 55.04% for Bitcoin.
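The sketch below illustrates, on assumed synthetic data and with a deliberately naive "model", how walk-forward validation differs from a single out-of-sample split: the model is refit on an expanding window and evaluated on each subsequent block, so every test block is forecast by a model that has never seen it.

```python
import numpy as np

def walk_forward_splits(n_samples, initial_train, test_size):
    # Yield (train, test) index arrays for an expanding training window.
    start = initial_train
    while start + test_size <= n_samples:
        yield np.arange(0, start), np.arange(start, start + test_size)
        start += test_size

# Assumed synthetic price series for illustration only.
prices = np.cumsum(np.random.default_rng(0).normal(0, 1, 500)) + 100
returns = np.diff(prices) / prices[:-1]

for train_idx, test_idx in walk_forward_splits(len(returns), 250, 50):
    train, test = returns[train_idx], returns[test_idx]
    prediction = np.sign(train.mean())          # naive directional forecast
    accuracy = np.mean(np.sign(test) == prediction)
    print(f"train size={len(train):3d}  directional accuracy={accuracy:.2%}")
```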
Jeandre de Bruyn MEng
2022
Stellenbosch University
Comparison of Machine Learning Models for the Classification of Fluorescent Microscopy Images
The lasting health consequences of a COVID-19 infection, referred to as Long COVID, can be severe and debilitating for the individual afflicted. Symptoms of Long COVID include fatigue and brain fog. These symptoms are caused by microclots that form in the bloodstream and are not broken up by the body. Microclots in the bloodstream can entangle with other proteins and can limit oxygen exchange. This inhibition of the oxygen exchange process can cause most of the symptoms experienced with Long COVID.
Diagnosis and identification of individuals suffering from Long COVID is the first step in any process that aims to help alleviate the symptoms of the individual, or cure them. Current identification processes are manual and as such limited by the amount of manpower applied to the task. Automating parts of the process with machine learning can greatly speed up this process and allow more efficient use of manpower.
The purpose of this research assignment is to investigate whether machine learning algorithms can be used to classify fluorescent microscopy images as being indicative of Long COVID or not. This is done by training models on, and predicting from, features extracted from fluorescent microscopy images using computer vision techniques. A comparison between the performance of the machine learning algorithms used in this research assignment is also explored. It was found that logistic regression is a good choice as a classifier, with strong performance in the classification of both the positive and negative classes.
Morne du Plessis MEng
2022
Stellenbosch University
An Extension of the CRISP-DM Framework to Incorporate Change Management to Improve the Adoption of Digital Projects
Digital transformation brings technology such as artificial intelligence (AI) into the core operations of businesses, increasing their revenue while reducing their costs. AI deployments tripled in 2019, having grown by 270% in just four years. However, digital transformation is a challenging task to complete successfully. A total of 45% of large digital projects run over budget, while only 44% of digital projects ever achieve the predicted value. The primary reason for these failures can be attributed to the human aspects of these projects. Examples of these human aspects are the difficulty of access to software, the lack of understanding of technology, and the lack of knowledge to operate the technology. The continued success of digital transformation requires both technical and change management drivers to be in place before, during, and after AI implementations.
The project starts by describing digital projects. Digital projects, which include data science and AI, have an extremely low success rate, with change management as a fundamental barrier to the success of these projects.
To address the change management challenges, five different change management models are compared, from which a generalised change management model is constructed. From the literature, it is concluded that the CRISP-DM framework is one of the most widely used analytics models for implementing digital projects. Using the generalised change management model, the change management gaps within the CRISP-DM framework are identified. An extended CRISP-DM framework is constructed by filling the identified gaps in the original CRISP-DM framework with the tasks in the generalised change management model. The fourth objective details the extended CRISP-DM framework. Thereafter, the extended CRISP-DM framework is validated against a real-world case study. The validation shows that the extended CRISP-DM framework indicates change management improvement areas which would most likely have improved the adoption of the project.
For this research project, success ultimately lies in the ability of the developed framework to provide an effective way to guide data specialists through tasks that will ease the challenges of digital transformation. All the objectives of this research assignment are achieved. The validation shows that use of the extended framework by a data specialist has the potential to improve the success rate of digital projects at a lower risk of failure.
Tristan Mckechnie MEng
2022
Stellenbosch University
Feature engineering approaches for financial time series forecasting using machine learning
This research assignment investigates feature engineering methods for financial time series forecasting using machine learning. The goal of the work is to investigate methods that overcome some time series characteristics which make forecasting difficult. The challenging characteristics are noise and non-stationarity. A literature review is conducted to identify suitable feature engineering methods and machine learning approaches for financial time series forecasting. A case study is developed to test the identified feature engineering methods with an empirical machine learning process. Multiple machine learning models are tested. To understand the benefit of the feature engineering methods, the forecasting results are compared with and without the application of the feature engineering methods.
Several feature engineering methods are identified: differencing and log-transforms are two methods investigated to address non-stationarity, while moving averages, exponentially weighted moving averages, and Fourier and wavelet transforms are all methods investigated to reduce noise. The feature engineering methods are implemented as preprocessing steps prior to training machine learning models for a supervised learning problem. The supervised learning problem is to forecast a single day-ahead asset price, given ten days of previous prices. Four machine learning models commonly used for financial time series forecasting are investigated, namely linear regression, support vector regression (SVR), multilayer perceptron (MLP), and long short-term memory (LSTM) neural networks. The work investigates the feature engineering methods and machine learning models for four univariate time series signals.
The results of the investigation found that no feature engineering method is universally helpful in improving forecasting results. For the SVR, MLP and LSTM models, denoising or smoothing the signals did improve their performance, but the best denoising or smoothing technique varies depending on the dataset used. Differencing and log-transforms caused the models to forecast a constant value near the mean of expected daily price returns, which, when inverted back to the price domain, causes poor regression evaluation metrics but good directional accuracy.
The findings of this research assignment are that the investigated feature engineering methods may improve forecasting performance for financial time series, but that the gains are not large. There appears to be limited improvement to be gained from feature engineering of past price data to predict future prices, at least for the investigated feature engineering methods. It is therefore recommended that future work focus on finding alternative data sources with predictive power for the financial time series.
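The transforms named above are all one-liners in practice; the following sketch (with an assumed synthetic price series) shows differencing and log-returns for stationarity, and a simple moving average for denoising.

```python
import numpy as np

# Assumed synthetic geometric-random-walk prices for illustration only.
prices = 100 * np.exp(np.cumsum(np.random.default_rng(1).normal(0, 0.01, 300)))

diffed = np.diff(prices)               # first-order differencing
log_returns = np.diff(np.log(prices))  # log-transform followed by differencing

def moving_average(x, window):
    # Simple denoising filter: average over a sliding window.
    return np.convolve(x, np.ones(window) / window, mode="valid")

smoothed = moving_average(prices, 10)
print(diffed[:3], log_returns[:3], smoothed[:3], sep="\n")
```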
Berna Coetzee MEng
2023
Stellenbosch University
Decision Support Guidelines for Selecting Modern Business Intelligence Platforms in Manufacturing to Support Business Decision Making
Globally, the generation of data is increasing rapidly, and the increasing competitiveness of global markets constantly challenges the business world due to globalisation. Companies rely on sophisticated technology to manage and make decisions in this dynamic business environment and ever-evolving market. Executives are under constant strain to maximise profits from new offerings and operational efficiencies, and to improve customer and employee experience.
As digitalisation in the manufacturing industry increases, the role of data analytics and business intelligence (BI) in decision-making is growing significantly. Manufacturers generate abundant structured and unstructured business information throughout the product lifecycle that can be used to achieve their business objectives. However, the manufacturing industry is amongst the laggard sectors pertaining to digitisation, and often lacks the technological and organisational foundations required to implement data tools as part of its ecosystem.
BI provides business insights to better understand a company's large amounts of data, operations and customers. This, in turn, can contribute to better decision-making and consequently improve results and profit. Rationalisation of the technologies, tools and techniques can be challenging. The selection of an appropriate tool can be time-consuming, complex and overwhelming due to the wide variety of available BI software products, each claiming that their solution offers distinctive and business-essential features.
This research assignment aims to address the need for a useful approach to BI tool evaluation and selection by identifying guidelines to support decision-makers in selecting BI tools. A thematic analysis approach was used to collect, analyse and interpret the information from semi-structured interviews with professionals from the manufacturing industry. The research gauged respondents’ views on the utilisation of BI, the data challenges experienced in manufacturing, the essential criteria BI tools should fulfil, and the approaches followed in practice to select software.
The research revealed that BI plays a significant role in decision-making and the prioritisation of tasks in manufacturing. The results showed that respondents valued different BI criteria requirements and decision-making processes. The findings and insights gleaned from the literature review were used to propose guidelines that support manufacturers in their decision. The guidelines elucidate the dimensions to evaluate and provide a nine-step selection process to compare BI software.
Rijk de Wet MEng
2023
Stellenbosch University
Set-based Particle Swarm Optimization for Medoids-based Clustering of Stationary and Non-Stationary Data
Data clustering is the grouping of data instances so that similar instances are placed in the same group or cluster. Clustering has a wide range of applications and is a highly studied field of data science and computational intelligence. In particular, population-based algorithms such as particle swarm optimization (PSO) have been shown to be effective at data clustering.
Set-based particle swarm optimization (SBPSO) is a generic set-based variant of PSO that substitutes the vector-based mechanisms of PSO with set theory. SBPSO is designed for problems that can be formulated as sets of elements, and its aim is to find the optimal subset of elements from the optimization problem universe. When applied to clustering, SBPSO searches for an optimal set of medoids from the dataset through the optimization of an internal cluster validation criterion.
In this research assignment, SBPSO is used to cluster fifteen datasets with diverse characteristics such as dimensionality, cluster counts, cluster sizes, and the presence of outliers. The SBPSO hyperparameters are tuned for optimal clustering performance on these datasets, which is compared in depth to the performance of seven other tuned clustering algorithms. Then, a sensitivity analysis of the SBPSO hyperparameters is performed to determine the effect that variation in these hyperparameters has on swarm diversity and other measures, to enable future research into the clustering of non-stationary data with SBPSO.
It was found that SBPSO is a viable clustering algorithm. SBPSO ranked third among the algorithms evaluated, although it appeared less effective on datasets with more clusters. A significant trade-off between swarm diversity and clustering ability was discovered, and the hyperparameters that control this trade-off were determined. Strategies to address these shortcomings were suggested.
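A minimal sketch of the medoids-based objective follows, using total within-cluster distance as one simple internal criterion (the assignment itself tunes more sophisticated validation indices): a candidate solution is a set of instance indices serving as medoids, and every point is assigned to its nearest medoid.

```python
import numpy as np

def medoid_fitness(data, medoid_idx):
    # Score a candidate medoid set: sum of distances from each instance
    # to its nearest medoid (lower is better).
    medoids = data[list(medoid_idx)]
    d = np.linalg.norm(data[:, None, :] - medoids[None, :, :], axis=2)
    return d.min(axis=1).sum()

rng = np.random.default_rng(7)
data = rng.normal(0, 1, (100, 2))       # assumed toy dataset
print(medoid_fitness(data, {3, 42, 77}))
```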
Bernard Hesse MEng
2023
Stellenbosch University
A Bagging Approach to Training Neural Networks using Metaheuristics
Stochastic gradient descent has become the go-to algorithm for training neural networks. As neural network architectures and the datasets used to train them grow larger, so does the computational cost of training. Metaheuristics have successfully been used to train neural networks. Furthermore, metaheuristics are more robust to noisy objective functions.
This research assignment investigates whether metaheuristics, specifically genetic algorithms, differential evolution, evolutionary programming and particle swarm optimisation, can be used to train an artificial neural network with a subsample of the training set. Different bagging training approaches with a reduction in training data are put forward, and the performances of the trained neural networks are evaluated. These performances are compared against those of a neural network trained with stochastic gradient descent, and of neural networks trained with the metaheuristic algorithms using the entire training dataset.
The evaluation of the performance of the artificial neural networks compares the validation accuracy and the generalisation factor to detect whether overfitting occurs. The research assignment also answers the question of whether overfitting is reduced when the suggested training methods are used. The results indicate that a sub-sample of the training set can be used per iteration or generation of the metaheuristic algorithm when training a neural network, with similar accuracy and similar or better overfitting performance compared to when training is performed using the complete training set. The best performance was achieved with a bagging strategy using the same sample size for each class to classify.
Derick Jacobs MEng
2023
Stellenbosch University
A Dynamic Optimization Approach to Active Learning in Neural Networks
Artificial neural networks are popular predictive models which have a broad range of applications. Artificial neural networks have been of great interest in the field of machine learning and, as a result, have received substantial research effort to improve their predictive performance. Active learning is a strategy that aims to improve the performance of artificial neural networks through an active selection of training instances. The motivation for the research assignment is to determine whether there is an improvement in predictive performance when a model is trained only on instances that the model deems informative. Through the continuous selection of informative training sets, the training times of these networks can also be reduced.
The training process of artificial neural networks can be seen as an optimisation problem that uses a learning algorithm to determine an optimal set of network parameters. Backpropagation is a popular learning algorithm which computes the derivatives of the loss function and uses the gradient descent algorithm to make appropriate parameter updates. Metaheuristic optimisation algorithms, such as particle swarm optimisation, have been shown to be efficient as neural network training algorithms.
The training process is assumed to be static under fixed set learning, a process in which the model randomly samples instances from a training set that remains fixed during the training process. However, under an active training strategy, the training set continuously changes and therefore should be modelled as a dynamic optimisation problem.
This study investigates whether the performance of active learners can be improved if dynamic metaheuristics are used as learning algorithms. Different training strategies were implemented in the investigation, including a sensitivity analysis selective learning algorithm and the accelerated learning by active sample selection algorithm. The analysis utilised different learning algorithms, which included backpropagation, static particle swarm optimisation, and dynamic variations of the particle swarm optimisation algorithm. These training strategies were applied to seven benchmark classification datasets obtained from the UCI repository.
Improved performance in the generalisation factor is produced for three of the seven classification problems in which a dynamic metaheuristic is used in an active learning setting. Although these improvements are observed, all training configurations generally achieved similar performance. The conclusion drawn from the study is that it is not definitive that dynamic metaheuristics improve the performance of active learners, because performance improvements are not consistent across all classification problems and evaluation metrics.
Shaun Joubert MEng
2023
Stellenbosch University
Rule Extraction from Financial Time Series
The ability to predict future events is very important in scientific fields. Data mining tools extract relationships among features and feature values, and how these relationships map to the target concept. The main goal is to extract knowledge and understand trends. The resulting rule set can then be used for prediction purposes. For many real-world applications, the actual values of a time series are irrelevant. The shape of the time series can also be used to predict future events. Unfortunately, most research efforts related to this area have had limited success. Rule induction and rule extraction techniques are often unsuccessful for real-valued time series analysis due to the lack of systematic effort to find relevant trends in the data. In this study, rule induction and rule extraction methods are applied to data describing trends in financial time series data. The purpose of this study is to explore the benefits of rule extraction and rule induction, specifically on financial time series. A review of rule extraction and rule induction approaches is conducted as a first step. Thereafter, a rule extraction and rule induction framework is developed and evaluated. The most important finding of this study was the importance of balanced data: performance was significantly better when excessive class distributions were minimised, while the differences in predictive performance between the different rule extraction and rule induction algorithms were not statistically significant.
Piet Kloppers MEng
2023
Stellenbosch University
Binning Continuous-Valued Features using Meta-Heuristics
The success of any machine learning model implementation is heavily dependent on the quality of the input data. Discretization, a widely used data preprocessing step, partitions continuous-valued features into bins, which transforms the data into discrete-valued features. Not only does discretization improve the interpretability of a data set, but it also provides the opportunity to implement machine learning models which require discrete input data.
This report proposes a new discretization algorithm that partitions the continuous-valued features of multivariate classification problems into bins through the use of swarm intelligence. The particle swarm optimization algorithm is utilized to find the bin boundary values of each continuous-valued feature which lead to the optimal classification performance of classification models. The classification accuracy of the naïve Bayes classifier, the C4.5 decision tree classifier and the one-rule classifier, resulting from the implementation of the discretizers, is used as the evaluation measure in this report. The performance of the proposed method is compared with equal width binning, equal frequency binning and the evolutionary cut-points selection for discretization algorithm, on different data sets that have mixed data types.
The proposed discretizer is outperformed by the evolutionary cut-points selection for discretization algorithm when paired with the C4.5 decision tree classifier. Similarly, the equal width binning discretizer outperforms the proposed discretizer when paired with the C4.5 decision tree.
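For reference, the equal width binning baseline named above is straightforward; the sketch below (with assumed data) partitions a feature's observed range into k bins of equal span.

```python
import numpy as np

def equal_width_bins(feature, k):
    # k + 1 equally spaced edges over the observed range; passing the
    # k - 1 inner edges to digitize yields bin indices 0 .. k - 1.
    edges = np.linspace(feature.min(), feature.max(), k + 1)
    return np.digitize(feature, edges[1:-1])

x = np.random.default_rng(3).normal(50, 10, 1000)   # assumed feature values
print(np.bincount(equal_width_bins(x, 5)))          # instances per bin
```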
Robert Mokakatlela MEng
2023
Stellenbosch University
Course Recommendation Based on Content Affinity with Browsing Behaviour
A recommender, or recommendation system (RS), filters and provides relevant content to a user based on many factors, such as their historic behaviour during interactions with a particular system or software. An RS is aimed at improving user experience and overcoming issues such as the distressing search problem experienced on massive open online course (MOOC) platforms. One such online platform is Physioplus, whose subscribers generally have very specific educational needs and thus can greatly benefit from targeted responses when interacting with the system. It can therefore be argued that an enhanced course recommender engine possesses great potential to increase Physioplus subscriber satisfaction and thus reduce cancellations. The current search feature in Physioplus has some limitations, as it uses keywords, static course recommendations, and elastic site search without considering historic user site visits.
The purpose of this study is to build a better course recommender system for Physioplus. The recommender takes a user's recent Physiopedia browsing history and provides the user with a tailored, rank-ordered list of those courses that are most relevant to their entire content history. The content of a user's browsing history is highly correlated with the content of the most relevant courses for that user. The recommender is built using collaborative filtering (CF), with both item-based and user-based approaches. Natural language processing and neighbourhood similarity methods are used to complement collaborative filtering in achieving quality recommendations.
The course recommender system in this study uses training and testing datasets from a real-world Physioplus system to assess the overall performance of the proposed approach. The experimental evaluation is measured by comparing recommended versus completed courses. The results show that the proposed RS achieved a recall score of 76% and an accuracy rate of 53% in the offline experiment. The assumption is that the performance metric scores will improve once the proposed RS is integrated with the existing Physioplus production system. All in all, the proposed RS can play an essential role in assisting users with relevant courses.
Gerhard Mulder MEng
2023
Stellenbosch University
A Genetic Algorithm Approach to Tree Bucking using Mechanical Harvester Data
Crosscutting of trees into timber logs is known as bucking. The logs are mainly used for producing saw logs at a mill. The logs have different values based on the length of the log and the small end diameter of the log. Maximisation of the value of the logs bucked from a tree can be viewed as an optimisation problem. This problem has been researched in the literature, with most solutions using dynamic programming. This research assignment solves the problem using a metaheuristic approach, specifically a genetic algorithm. The main research question is whether an existing bucking, on a series of stands in a forest, could have been done more optimally. The dataset used to solve the problem comes from the bucking outputs of two mechanical harvesters. The value of a log is obtained by multiplying the volume of the log by the value per cubic metre of the log class to which the log belongs; the value of a tree is the sum of the values of its logs. It was found that the genetic algorithm outperformed the existing bucking, in terms of value. The research method first solved the problem for a randomly selected set of trees with dynamic programming, comparing it to the solutions obtained from the genetic algorithm. It was found that the genetic algorithm obtained very similar optimal bucking values for the trees. Secondly, a genetic algorithm uses hyperparameters, namely population size, probability of crossover and probability of mutation. The hyperparameters were estimated using a particle swarm optimisation algorithm wrapped around the genetic algorithm, applied to a randomly selected set of trees. The hyperparameters found were used to optimise the total value of each of the five stands. The total value of the optimised stands outperformed the value of the existing bucking by a large margin.
Aveer Nannoolal MEng
2023
Stellenbosch University
Financial Time Series Modelling using Gramian Angular Summation Fields
Gramian angular summation fields (GASF) and Markov transition fields (MTF) have been developed as an approach to encode time series into images, which allows the use of techniques from computer vision for time series classification and imputation. These techniques have been evaluated on a number of different time series problems. This research assignment applies GASF and MTF to financial time series.
As a first step, a suitable financial time series is collected from a real-world system and analyzed. The data quality is assessed to identify data quality issues to be addressed. The cleaned financial time series is encoded into images and validated using an appropriate technique to determine whether a logical mapping between the time series and image planes exists. The financial time series is analyzed to determine its characteristics. These characteristics are used to guide the formulation of a modeling problem. The modeling problem compares the usefulness of the GASF and MTF approaches against conventional time series modeling and analysis techniques.
The four models considered for the formulated modeling problem consist of time series and image modeling approaches. The results from the experiment indicate that the time series approaches are better suited to this specific modeling problem. The GASF and MTF approaches do, however, provide promising outcomes when used in a combinatorial fashion. The use of a combination of GASF and MTF images allows a model to learn better features when combined with sequence-based approaches, which improves model performance.
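For reference, the GASF encoding rescales a series x_1, ..., x_N to [-1, 1], maps each value to an angle, and forms an N x N image from pairwise angular sums:

```latex
\tilde{x}_i = \frac{(x_i - \max(\mathbf{x})) + (x_i - \min(\mathbf{x}))}{\max(\mathbf{x}) - \min(\mathbf{x})},
\qquad
\phi_i = \arccos(\tilde{x}_i),
\qquad
\mathrm{GASF}_{ij} = \cos(\phi_i + \phi_j).
```

The resulting image preserves temporal dependency along its diagonal, which is what lets convolutional models treat a time series as a picture.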
AL Theron MEng
2023
Stellenbosch University
Metaheuristics for Training Deep Neural Networks
Presently, artificial neural networks (ANNs) are popular among researchers as well as in commercial settings. The use of ANNs continues to expand into different fields. The increase in interest in ANNs has led researchers to explore various new and innovative ways to improve the performance of ANNs. One such way is to explore the use of metaheuristics in the training of ANNs.
This research assignment theoretically and empirically compares the use of metaheuristics as an alternative to the traditional training algorithm, i.e. backpropagation with stochastic gradient descent (SGD), to train deep neural networks (DNNs). Three specific metaheuristics are considered, namely particle swarm optimisation (PSO), the genetic algorithm (GA) and differential evolution (DE). An in-depth analysis of SGD is conducted to highlight some potential disadvantages which might occur in the training process. The field of metaheuristics is explored as a source of alternative training algorithms, with specific emphasis placed on the three specified metaheuristics. Five different experiments are conducted to empirically compare the backpropagation SGD training algorithm with the PSO, GA and DE training algorithms. The experiments are conducted on an image dataset. The DNN used in the experiments is a convolutional neural network (CNN). The results conclude that SGD performs better than the metaheuristics considered. Potential future work is also discussed based on the findings of this research.
Charley van der Linde MEng
2023
Stellenbosch University
A Review and Analysis of Imputation Approaches
Missing data is a common and major challenge which almost all data practitioners and researchers face, and which greatly affects the accuracy of any decision-making process. Data mining requires that the data be prepared, cleaned, transformed, and reduced in order to ensure that the integrity of the dataset is maintained. Missing data is found and addressed within the data cleaning process, during which the user needs to decide how to handle the missing data so as not to introduce significant bias into the dataset. Current methods of handling missing data include deletion and imputation methods.
This research assignment investigates the performance of different imputation methods, specifically statistical and machine learning imputation methods. The statistical imputation methods investigated are mean, hot deck, regression, maximum likelihood, Markov chain Monte Carlo (MCMC), multiple imputation by chained equations, and expectation-maximization with bootstrapping imputation. The machine learning methods investigated are k-nearest neighbour (kNN), k-means, and self-organizing map imputation. This research uses an empirical procedure to facilitate the formatting and transformation of the data, and the implementation of the imputation methods. Two experiments are conducted, namely one in which the imputation methods are evaluated on datasets which are clean, and another in which the imputation methods are evaluated on datasets which contain outliers. The performance achieved in both experiments is evaluated using the root mean squared error, mean absolute error, percent bias, and predictive accuracy.
For both experiments, it is found that MCMC imputation resulted in the best performance out of all 10 imputation methods, with an overall accuracy of 75.71%. kNN imputation resulted in the second highest overall accuracy of 69.85%; however, it introduced a large percent bias into the imputed dataset. This research concludes that single statistical imputation methods (mean, hot deck, and regression imputation) should not be used to replace missing data in any situation, while multiple imputation methods are shown to have consistent performance. MCMC imputation in particular performs the best out of all 10 imputation methods in this research, producing a high accuracy and low bias in the imputed dataset. The performance of MCMC imputation, along with its ease of use, makes it a suitable choice when dealing with missing data.
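As a hedged illustration of one of the machine learning imputers compared above, scikit-learn's kNN imputer fills each missing entry from the corresponding values of the nearest complete rows; the toy matrix here is assumed for the example.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Assumed toy matrix with missing entries.
X = np.array([[1.0, 2.0, np.nan],
              [3.0, 4.0, 3.0],
              [np.nan, 6.0, 5.0],
              [8.0, 8.0, 7.0]])

imputer = KNNImputer(n_neighbors=2)   # impute from the 2 nearest rows
print(imputer.fit_transform(X))
```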
Arneau van der Merwe MEng
2023
Stellenbosch University
Diversity Preservation for Decomposition Particle Swarm Optimization as Feed-forward Neural Network Training Algorithm under the Presence of Concept Drift
Time series forecasting is an important area of research that lends itself to various fields in which it is practically applied. The importance of time series forecasting has led to much research in efforts to improve the accuracy of predictions. The use of artificial neural networks for time series forecasting has grown, especially with the development of simple recurrent neural networks (SRNNs). SRNNs have been shown to handle temporal sequences efficiently. Specialised architectures for SRNNs increase the computational cost due to the increase in the number of weights that require optimisation during training. The training process of neural networks can, however, be rephrased as an optimisation problem. Recent work has shown how specialised dynamic particle swarm optimisation (PSO) algorithms can replace traditional backpropagation as a learning algorithm for feed-forward neural networks (FFNNs).
Dynamic PSO algorithms used to train FFNNs have been shown to outperform SRNNs using traditional backpropagation. Due to the increased dimensionality of larger problems, various cooperative PSO algorithms have been developed to address the credit assignment problem as well as to better cope with variable dependency; one such PSO variant is the decomposition cooperative particle swarm optimisation (DCPSO) algorithm. One limitation of using PSO variants for training in dynamic environments is that, as the particles in a swarm converge in a specific region, the swarm diversity decays, making it difficult to adapt to environmental changes. Dynamic PSO algorithms have been successfully used in the sub-swarms of DCPSOs. However, these dynamic DCPSO algorithms have been shown to struggle under specific classes of dynamism. The preservation of swarm diversity is therefore directly linked to the ability to adapt in the presence of concept drift. This research project proposes various diversity preservation techniques to promote swarm diversity throughout various environmental changes. The diversity preservation techniques investigated are the use of random decomposition for the dynamic DCPSO, and a diversity-based penalty function for regularisation.
For this purpose, experiments were conducted on five well-known non-stationary forecasting problems under various classes of dynamism. Results obtained on two implementations of the DCPSO using the proposed diversity preservation techniques showed success in promoting swarm diversity. Two main implementations of the DCPSO were investigated, namely dynamic and static sub-swarms. When a static PSO algorithm was used for the sub-swarms of the DCPSO, the diversity preservation showed a significant impact. The proposed diversity preservation techniques also significantly affected swarm diversity for the DCPSO using the quantum particle swarm optimisation algorithm (QSO) as sub-swarms. The use of the diversity-based penalty function for regularisation showed superior performance on the training and generalisation error for the dynamic DCPSO; still, it did not show a statistically significant effect on preserving swarm diversity. The use of static PSO algorithms as sub-swarms for the DCPSO showed that random decomposition ranked high across the various experiments, while swarm diversity was significantly impacted. The proposed diversity preservation techniques for the dynamic DCPSO algorithms showed a trade-off between diversity preservation and performance.
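The quantum particle mechanism underlying such sub-swarms can be sketched as follows (an illustrative sketch under assumed parameters, not this assignment's implementation): rather than applying a velocity update, a quantum particle is re-sampled uniformly inside a hypersphere of radius r_cloud centred on the global best, which keeps diversity from collapsing around a stale optimum.

```python
import numpy as np

def sample_quantum_position(gbest, r_cloud, rng):
    # Draw a point uniformly from the ball of radius r_cloud around gbest:
    # a uniform direction, then a radius with density proportional to r^(d-1).
    dim = gbest.shape[0]
    direction = rng.normal(0, 1, dim)
    direction /= np.linalg.norm(direction)
    radius = r_cloud * rng.random() ** (1.0 / dim)
    return gbest + radius * direction

rng = np.random.default_rng(5)
gbest = np.zeros(10)                  # assumed current best weight vector
print(sample_quantum_position(gbest, r_cloud=0.5, rng=rng))
```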
Jozandri Versfeld MEng
2023
Stellenbosch University
Crawler Detection Decision Support: A Neural Network with Particle Swarm Optimisation Approach Website crawlers are widely used to retrieve information for search engines. The concept of website crawlers was first introduced in the early nineties. Website crawling entails the deployment of automated crawling algorithms that crawl websites with the purpose of collecting and storing information about the state of other websites. Website crawlers are categorized as good website crawlers or bad website crawlers. Good website crawlers are used by search engines and do not cause harm when crawling websites. Bad website crawlers crawl websites with malicious intent and could potentially cause harm to websites or website owners. Traffic indicators on websites are inflated if website crawlers are incorrectly identified. In some cases, bad crawlers are used to intentionally crash websites. The consequences of bad website crawlers highlight the importance of successfully distinguishing human users from website crawler sessions in website traffic classification.
The focus of this research assignment is to design and implement artificial neural network algorithms capable of successfully classifying website traffic as a human user, good website crawler session, or bad website crawler session. The artificial neural network algorithms are trained with particle swarm optimizers and are validated in case studies.
First, the website traffic classification problem is considered in a stationary environment and is treated as a standard classification problem. For the standard classification problem, an artificial neural network trained with particle swarm optimization is applied. The constraints associated with this initial problem assume that the behavioural characteristics of humans and the behavioural characteristics of web crawlers remain constant over a period of time. Thereafter, the classification problem is considered in a non-stationary environment. The dynamic classification problem exhibits concept drift due to the assumption that website crawlers change behavioural characteristics over time. To solve the dynamic classification problem, artificial neural networks are formulated and optimized with quantum-inspired particle swarm optimisation. Results demonstrate that the artificial neural networks optimised with particle swarms classify website traffic in both stationary and non-stationary environments with reasonable success.
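The underlying idea of training a network's weights with a particle swarm can be sketched as follows. This is a minimal, self-contained example with made-up session features, three traffic classes (human, good crawler, bad crawler), a tiny fixed architecture, and textbook PSO constants; it is not the assignment's actual model.

    import numpy as np

    rng = np.random.default_rng(0)

    # Tiny fixed architecture: 4 assumed session features -> 5 hidden -> 3 classes.
    DIM = 4 * 5 + 5 + 5 * 3 + 3   # total number of weights and biases

    def forward(w, X):
        W1 = w[:20].reshape(4, 5); b1 = w[20:25]
        W2 = w[25:40].reshape(5, 3); b2 = w[40:43]
        return np.tanh(X @ W1 + b1) @ W2 + b2   # logits per class

    def loss(w, X, y):
        logits = forward(w, X)
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

    X = rng.normal(size=(200, 4))     # stand-in session features
    y = rng.integers(0, 3, size=200)  # stand-in class labels

    # Standard gbest PSO over the flattened weight vector.
    pos = rng.normal(size=(30, DIM)); vel = np.zeros_like(pos)
    pbest = pos.copy(); pbest_f = np.array([loss(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(100):
        r1, r2 = rng.random((2, 30, DIM))
        vel = 0.72 * vel + 1.49 * r1 * (pbest - pos) + 1.49 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([loss(p, X, y) for p in pos])
        better = f < pbest_f
        pbest[better] = pos[better]; pbest_f[better] = f[better]
        gbest = pbest[pbest_f.argmin()].copy()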
Andrew Boyes MEng
2024
Stellenbosch University
Intelli-bone: Automated Fracture Detection and Classification in Radiographs using Transfer Learning Suspected fractures are one of the most common reasons for patients to visit the emergency department (ED) in hospitals. Radiographs, the primary diagnostic tool for suspected fractures, are often assessed by emergency healthcare professionals without specialised orthopaedic expertise. This limitation leads to a high number of diagnostic errors in EDs, with incorrectly diagnosed fractures accounting for over 80% of reported diagnostic mistakes. Given this problem with fracture diagnostics, there is an opportunity to use artificial intelligence (AI) to assist with the diagnosis of fractures. Successful implementation of an AI system that correctly locates and classifies fractures would lead to more accurate prognosis and treatment advice.
The selected fracture classification system for this research assignment is the Arbeitsgemeinschaft für Osteosynthesefragen / Orthopaedic Trauma Association (AO/OTA) classification. The object detection models selected in this research to evaluate whether AI can be used for accurate location and classification of fractures according to the AO/OTA classification are the faster region-based convolutional neural network (Faster R-CNN), you only look once version 8 nano (YOLOv8n), you only look once version 8 large (YOLOv8l), and RetinaNet.
A secondary problem that this research assignment addresses is that of data scarcity. Deep learning algorithms require large amounts of data to achieve exceptional performance. The target dataset in this research assignment, the distal radius dataset (DIRAD), only consists of 776 images, where roughly half of the images contain fractures. The technique applied to overcome the data scarcity problem is transfer learning. With transfer learning, the object detection models are pretrained on larger datasets such as the common objects in context (COCO) and the Graz Paediatric Wrist Digital X-rays (GRAZPEDWRI-DX) dataset before being trained on the target dataset.
This research assignment shows that pretraining of object detectors on larger datasets leads to superior performance on scarce datasets. Furthermore, pretraining an object detection model on a large dataset from a similar domain to perform a similar task, such as GRAZPEDWRI-DX, leads to even better results. The pretraining of the Faster R-CNN, YOLOv8n, YOLOv8l, and RetinaNet on the GRAZPEDWRI-DX improved mean average precision at an intersection-over-union threshold of 0.5 (mAP50) by an average of 33.6% compared to the same models trained from randomly initialised weights. The best-performing model, namely the YOLOv8l, achieved a mAP50 of 59.7% on the DIRAD dataset.
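As a rough sketch of the transfer learning recipe described here, using the ultralytics package's YOLO interface (the dataset configuration file dirad.yaml is hypothetical):

    from ultralytics import YOLO

    # Transfer learning: start from COCO-pretrained weights instead of random
    # initialisation; 'yolov8n.pt' is the publicly available nano checkpoint.
    model = YOLO("yolov8n.pt")

    # Fine-tune on the scarce target dataset. 'dirad.yaml' is a hypothetical
    # dataset config pointing at the DIRAD images and AO/OTA class labels.
    model.train(data="dirad.yaml", epochs=100, imgsz=640)

    metrics = model.val()   # validation reports mAP50 among other metrics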
Donovan Broughton MEng
2024
Stellenbosch University
Evolving encapsulated neural network blocks using a genetic algorithm In recent years, artificial intelligence, with its subfields of deep learning and evolutionary computation, has experienced remarkable growth. This expansion can be attributed to the increased availability of computational power and the potential value these domains offer. Consequently, this growth has fueled intensified research and attention, presenting the challenge of staying current with the rapid advancements. Furthermore, the advent of deep learning has led to the ever-increasing size and complexity of neural networks, pushing the boundaries of computational capabilities. This project investigates the viability of utilising a genetic-based evolutionary algorithm to automate the discovery of subnetworks within convolutional neural networks (CNNs), referred to as blocks, for image classification. Inspired by architectural elements in well-known CNNs like ResNet and GoogLeNet, these blocks are designed to be reusable, repeatable and modular.
The first part of this project entailed the development of a framework to represent CNN architectures, which drew inspiration from the concept of neuroevolution of augmenting topologies (NEAT). This developed representation framework was used to define the composition and layout of CNN architectures. Next, a genetic algorithm was adapted to fit within the framework, thus enabling the evolution of CNN blocks using various evolutionary operators, including mutation, speciation and crossover. The representation framework and genetic algorithm were combined to evolve a population of 100 CNN blocks over 30 generations. Throughout the evolution process, the search was guided by the measured quality of the blocks, defined by a fitness function that was designed to balance complexity and performance. Five repetitions of the experiment were performed and compared to randomly generated blocks to assess the overall success of this approach. Additionally, the performance of the evolved blocks was evaluated against manually designed blocks such as ResNet and GoogLeNet’s Inception.
The results of the comparison between the genetic algorithm and random procedures demonstrated the effectiveness of the genetic algorithm in producing high-quality solutions under the fitness evaluation. The distribution of the evolutionary operators across the population also showed how these subprocedures can be used to control the search effectively. Furthermore, a small sample of the best-performing evolved blocks proved to be highly competitive when compared to manually designed counterparts, namely ResNet and Inception.
This study validates the concept of using evolutionary algorithms for neural network block generation and emphasises their ability to rival manually designed networks. The findings suggest that evolutionary computation successfully automates the discovery of competitive blocks within CNN architectures, offering new avenues for neuroevolution and overcoming limitations in the manual design processes.
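A toy sketch of the evolutionary loop described above is given below. The block encoding, the proxy accuracy function (standing in for actually training each block), and the complexity weight are all assumptions for illustration; speciation and the NEAT-inspired representation are omitted.

    import random

    random.seed(0)

    LAYERS = ["conv3x3", "conv1x1", "pool", "skip"]   # toy gene alphabet

    def random_block():
        return [(random.choice(LAYERS), random.choice([16, 32, 64]))
                for _ in range(random.randint(2, 6))]

    def proxy_accuracy(block):
        # Stand-in for actually training and evaluating the block on images.
        return sum(0.1 for layer, _ in block if layer != "pool") + 0.05 * random.random()

    def complexity(block):
        return sum(channels for _, channels in block)

    def fitness(block):
        # Balance performance against complexity; the 1e-3 weight is an assumption.
        return proxy_accuracy(block) - 1e-3 * complexity(block)

    def mutate(block):
        block = list(block)
        i = random.randrange(len(block))
        block[i] = (random.choice(LAYERS), random.choice([16, 32, 64]))
        return block

    def crossover(a, b):
        cut = random.randrange(1, min(len(a), len(b)))
        return a[:cut] + b[cut:]

    population = [random_block() for _ in range(100)]
    for generation in range(30):
        population.sort(key=fitness, reverse=True)
        parents = population[:20]                      # truncation selection
        population = parents + [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(80)
        ]

    best = max(population, key=fitness)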
Ashail Maharaj MEng
2024
Stellenbosch University
Review of Big Data Clustering Methods In an era defined by the challenges of processing vast and complex datasets, the study delves into the evolving landscape of big data clustering. It introduces a novel taxonomy categorizing clustering models into four distinct groups, offering a roadmap for understanding their scalability and efficiency in the face of increasing data volume and complexity. The essence of this research lies in its pursuit to critically review, analyze, and evaluate various clustering models, focusing on their suitability and adaptability in handling big data, characterized by the four Vs, i.e. velocity, variety, volume, and veracity. The aim is to discern the operational dynamics of diverse clustering models, considering the findings of prior literature, which have demonstrated varying degrees of performance of these models based on selected metrics. The methodology is firmly rooted in the execution of a series of experiments on chosen clustering methods, metrics, and datasets. This empirical method is crucial to extrapolate how each model fares across different metrics and datasets, offering a comparative perspective on their performance. Subsequent to the experimental phase, an extensive analysis was conducted, breaking down the selected approaches into their algorithmic components. This decomposition is pivotal to identify the origins of gains, losses, or trade-offs in performance, allowing for an in-depth understanding of why certain models outperform others concerning given metrics and datasets. Insights from this research highlighted the scalability and efficiency of models like parallel k-means and mini batch k-means, both theoretically and empirically, marking them as exemplary for large-scale applications. Conversely, it unveiled the computational constraints of models like selective sampling-based scalable sparse subspace clustering (S5C) and purity weighted consensus clustering (PWCC), showing their limitations in scaling to big data. Acknowledging the limitations imposed by the resource constraints of Google Colab Pro+, the study presents the constraints faced during the evaluation process.
The culmination of this project is marked by a comprehensive performance summary, offering key insights into the strengths and weaknesses of the models investigated and proffering informed advice on the contextual utilization of each model. It lays the foundation for a centralized database for clustering research, aiming to fill existing knowledge gaps and facilitate optimal model discovery tailored to specific needs and infrastructural capabilities.
In conclusion, this research explores and analyses the field of big data clustering, uncovering the potential and bottlenecks of various models, and offers valuable insights and recommendations, all while reconciling theoretical complexities with empirical validations.
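As a pointer to the kind of model the study singles out, mini-batch k-means is available in scikit-learn; a minimal run on synthetic stand-in data might look as follows.

    from sklearn.cluster import MiniBatchKMeans
    from sklearn.datasets import make_blobs

    # Synthetic stand-in for a large dataset; real big-data runs would stream it.
    X, _ = make_blobs(n_samples=100_000, centers=8, n_features=16, random_state=0)

    # Mini-batch k-means updates centroids from small random batches rather than
    # the full dataset, trading a little clustering quality for large speed gains.
    mbk = MiniBatchKMeans(n_clusters=8, batch_size=1024, random_state=0)
    labels = mbk.fit_predict(X)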
Santeshan Naidoo MEng
2024
Stellenbosch University
Few-shot learning for passive acoustic monitoring of endangered species The Hainan gibbon is a primate from the Chinese island-province of Hainan. The population of this primate has been in decline because of poaching, and is now facing extinction. Bioacoustics is a field concerned with the acquisition and study of animal sounds. Passive acoustic monitoring is an important step in data capture, and often captures months of data. Due to the low population numbers of endangered species, experts spend a large amount of time on the analysis and identification of bioacoustic signatures.
Machine learning can be used to automate the bioacoustic identification of species, which would reduce analysis costs and time. Unfortunately, many machine learning algorithms require large amounts of data to perform reliably. Few-shot learning is a loosely defined area of machine learning that aims to solve the limited-data problem with unique approaches.
This assignment explores the viability of accurate, image-based classification models when subject to low data volumes. Audio data is converted to spectrograms and used in image analysis. A Siamese framework, which has roots in convolutional neural networks (CNN), is the foundation of the few-shot learning approach. Within this CNN-based framework, contrastive-loss and triplet-loss architectures, data augmentation techniques, transfer learning methods, and reduced image resolution datasets are investigated.
The results indicate that the triplet-loss architecture produces the most accurate models, with excellent precision, recall, and F1-score statistics. The triplet-loss models prefer lower resolution images, which reduce computation time and cost. Importantly, the performance of the triplet-loss models is not affected by low data volumes. On the other hand, contrastive-loss models show significant performance degradation on lower data volumes. Overall, the triplet-loss “base CNN” model is the recommended network. This network achieves an accuracy of 99.08% and F1-score of 0.995. The Siamese framework has demonstrated a strong ability to identify the bioacoustic signature of the Hainan gibbon. Recommendations are provided for further research in this domain.
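The triplet-loss idea at the heart of the recommended models can be sketched in a few lines of PyTorch; the embedding network below is an illustrative stand-in, not the assignment's "base CNN".

    import torch
    import torch.nn as nn

    # Illustrative embedding network for spectrogram patches: it maps a
    # 1-channel spectrogram to a 64-dimensional vector.
    embed = nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, 64),
    )

    triplet = nn.TripletMarginLoss(margin=1.0)

    # Anchor and positive come from the same class (e.g. gibbon calls); the
    # negative is any other sound. Random tensors stand in for real batches.
    anchor, positive, negative = (torch.randn(8, 1, 64, 64) for _ in range(3))
    loss = triplet(embed(anchor), embed(positive), embed(negative))
    loss.backward()   # embeddings are pulled together/pushed apart by the margin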
Jaco Olivier MEng
2024
Stellenbosch University
Comparison of machine learning models on financial time series data The efficient market hypothesis states that financial markets are efficient and that investors can therefore not make excess profits consistently, because all public information is instantly reflected in the share price. Academia and investors have shown that the efficient market hypothesis does not always hold true and that market prices can be exploited when the right financial trading and price models are used to model the relationship in the underlying data. This research assignment focuses on the development of multiple machine learning models, in combination with a financial trading strategy that utilises a mixture of technical indicators, to compare the performance of different machine learning algorithms on financial time series data.
The financial time series data collected for this research assignment were 10 years of minute ticker data. Two foreign exchange rate data sets, the USD/ZAR and ZAR/JPY exchange rates, were used. The other three data sets collected were the S&P 500 index, the FTSE 100 index, and the Brent crude oil index. The first step was to analyse the quality of the data sets. After the quality had been assessed, a trading strategy and financial trading model were used to combine the 20-period moving average, the relative strength index, and the average directional index for the labelling process. Twelve machine learning models were developed to forecast the financial time series data sets. These were the baseline logistic regression, support vector machine, k-nearest neighbour, decision tree, random forest, Elman recurrent neural network, Jordan recurrent neural network, Jordan-Elman recurrent neural network, long short-term memory neural network, time-delay neural network, resilient back propagation feed-forward neural network, and particle swarm optimisation feed-forward neural network.
The results from the experiments indicate that the support vector machine performed the best out of all the machine learning models considered. The baseline logistic regression model outperformed all the remaining models. The random forest and resilient back propagation feed-forward neural network models performed third and fourth best. These two models had higher recall scores than most models, but their accuracy scores were significantly lower than those of the baseline and support vector machine models. The recurrent neural network models performed very poorly. Specifically, the Elman and Jordan-Elman models had the poorest performance of the models investigated. It was determined that the non-neural-network machine learning models were less computationally complex and less dependent on a balanced data set than the neural network models.
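A hedged sketch of the indicator-based labelling step is shown below, computing the 20-period moving average and a standard RSI with pandas; the assignment's actual rule combining these with the average directional index is not reproduced here, so the final labelling line is purely illustrative.

    import numpy as np
    import pandas as pd

    def rsi(close: pd.Series, period: int = 14) -> pd.Series:
        """Standard relative strength index from average gains and losses."""
        delta = close.diff()
        gain = delta.clip(lower=0).rolling(period).mean()
        loss = (-delta.clip(upper=0)).rolling(period).mean()
        return 100 - 100 / (1 + gain / loss)

    # Stand-in price series; the assignment used minute ticker data.
    close = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=1000)) + 100)
    ma20 = close.rolling(20).mean()

    # Purely illustrative rule: long when price is above its 20-period moving
    # average and RSI is not overbought (the ADX component is omitted).
    label = ((close > ma20) & (rsi(close) < 70)).astype(int)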
Pow Ching Shane MEng
2024
Stellenbosch University
Comparison of Transfer Learning Performance between Models Trained on ImageNet and Domain Specific Datasets Convolutional neural networks (CNNs) have found increased use in the field of visual machine learning for tasks such as object classification and object detection. To reduce the amount of relevant input data needed for training a CNN, the technique of transfer learning is used. In transfer learning, the initial weights of a model before training are taken from a different, fully trained model. Transfer learning has proven to be useful when little relevant input data is available. A commonly used initial weight set comes from a model trained on the publicly available ImageNet dataset. This research assignment aims to compare the efficacy of using a publicly available dataset to that of a domain-specific dataset within the automotive industry.
This assignment was conducted in partnership with an automotive company, which provided the data necessary for creating a custom dataset. After the data quality issues were identified and addressed, a large-scale dataset was created. The model architecture used in this project was a residual network architecture which featured 50 layers. Several new models were initialised using the weights obtained from training the same model architecture on the large-scale dataset. Each of these models has a counterpart sharing the same architecture, but initialised with weights from the ImageNet dataset. Each model was trained on a small subset of the data, after which it was used to classify the full dataset from which the subset was drawn. The performance of models trained on the same subset is compared by assessing the accuracy, loss, and number of training epochs.
The results indicate that, despite the custom large-scale dataset containing fewer images, models initialised with weights based on the custom dataset outperformed models using ImageNet initialisation. Where possible, creating a transfer learning dataset from domain-specific data is preferable to using the public ImageNet dataset: as shown here in the automotive industry, and previously in the medical domain, higher levels of accuracy can be achieved.
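In torchvision, the two initialisation schemes compared here might be set up as follows; automotive_resnet50.pt is a hypothetical checkpoint for the custom-dataset counterpart, and the ten-class head is an assumption.

    import torch
    from torchvision.models import resnet50, ResNet50_Weights

    # Counterpart A: ImageNet initialisation.
    model_imagenet = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)

    # Counterpart B: weights from a model previously trained on the custom
    # automotive dataset; 'automotive_resnet50.pt' is a hypothetical checkpoint.
    model_domain = resnet50(weights=None)
    model_domain.load_state_dict(torch.load("automotive_resnet50.pt"))

    # Swap in a classification head for the target subset (10 classes assumed),
    # then fine-tune both models identically and compare accuracy, loss, epochs.
    for model in (model_imagenet, model_domain):
        model.fc = torch.nn.Linear(model.fc.in_features, 10)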
Garett Sidwell MEng
2024
Stellenbosch University
Investigating Sales Forecasting in the Formal Liquor Market using Deep Learning Techniques This research assignment focuses on forecasting sales in the liquor industry, examining the effectiveness of deep learning techniques and a stacked ensemble approach. Time-series forecasting is a widely used technique in various fields such as economics, finance, and operations research. A thorough literature review was conducted to gain an in-depth understanding of the topic and to survey existing solutions in the field. The study involved a detailed analysis of the datasets to understand the inherent structures of the series. Evaluation metrics and various algorithms were used to assess the effectiveness of time-series forecasting techniques.
The research assignment found that deep learning techniques and ensemble theory can successfully be applied to forecast sales in the liquor industry. A stacked ensemble approach was effective in improving the overall performance. The findings have the potential to significantly improve current implementations of time-series forecasting, while reducing the computational complexity and expenses associated with granular forecasting models. The research assignment concludes that deep learning and ensemble models offer a promising avenue for efficient and accurate sales forecasting in the liquor industry, being more time-efficient and computationally less complex than traditional methods.
Matt Swanevelder MEng
2024
Stellenbosch University
Automated Localisation and Classification of Trauma Implants in Leg X-rays through Deep Learning Revision surgery often requires orthopaedic surgeons to pre-operatively identify failed implants in order to reduce the complexity and cost of the surgery. Surgeons typically examine the X-rays of a patient for pre-operative implant identification, even though this method is time-consuming and occasionally unsuccessful. This study investigates the use of deep learning to automate the identification of trauma implants in leg X-rays. The investigation assesses the performance of various object detection and classification models on a dataset of trauma implants, aiming to identify the optimal deep learning solution. Challenges related to this research include limited data, imbalanced class distributions, and the presence of multiple implants in the X-ray images.
The results of the investigation indicate that the optimal deep learning solution is a two-model pipeline that employs a you only look once (YOLO) object detection model and a densely connected convolutional neural network (DenseNet) classification model. The DenseNet classification model classifies the trauma implants localised by the YOLO object detection model. The proposed pipeline achieves a mean average precision (intersection-over-union threshold of 0.5) of 0.967 for implant localisation and an accuracy of 73.7% for implant classification. The results of the study show that deep learning models are capable of identifying trauma implants. Additionally, the study offers a deep learning solution that can be utilised in future research related to identifying trauma implants.
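A sketch of such a two-model pipeline, assuming hypothetical checkpoint files and an assumed twelve implant classes, could look as follows (using the ultralytics YOLO interface and torchvision's DenseNet-121):

    import torch
    from PIL import Image
    from torchvision import transforms
    from torchvision.models import densenet121
    from ultralytics import YOLO

    detector = YOLO("implant_yolo.pt")          # hypothetical trained localiser
    classifier = densenet121(num_classes=12)    # 12 implant classes assumed
    classifier.load_state_dict(torch.load("implant_densenet.pt"))  # hypothetical
    classifier.eval()

    prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

    image = Image.open("leg_xray.png").convert("RGB")
    for box in detector(image)[0].boxes.xyxy:   # one box per localised implant
        crop = image.crop(tuple(int(v) for v in box.tolist()))
        with torch.no_grad():
            implant_class = classifier(prep(crop).unsqueeze(0)).argmax(1).item()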

Masters Students (Research)


Student Degree Thesis Title Abstract
Adejumo, A MSc
1999
University of Pretoria
Active Learning Algorithms for Multi-layer Feedforward Neural Networks Backpropagation has played a vital role in the resurgence of interest in artificial neural networks. Ever since, much research effort has concentrated on finding ways to improve its performance. Active learning has emerged as an efficient alternative to improve the performance of multilayer feedforward neural networks. The learner is given active control over the information to include in the training set, and in doing so, the generalization accuracy is improved and the computational cost and complexity of the network are reduced compared to training on a fixed set of data. While much research effort has been invested in designing new active learning approaches, an elaborate comparison of active learning approaches is still lacking. The objective of this research study is to compare and criticise active learning approaches and to propose a new selective learning algorithm. This thesis presents a comparison of four selected active learning algorithms, concentrating on one type of application, namely function and time series approximation.
Rodic, D MSc
2000
University of Pretoria
Hybrid Exhaustive Search for Knowledge Discovery The topic of this thesis is knowledge discovery and artificial intelligence based knowledge discovery algorithms. The knowledge discovery process and associated problems are discussed, followed by an overview of three classes of artificial intelligence based knowledge discovery algorithms. Typical representatives of each of these classes are presented and discussed in greater detail. Then a new knowledge discovery algorithm, called the Hybrid Classifier System (HCS), is presented. The guiding concept behind the new algorithm was simplicity. The new knowledge discovery algorithm is loosely based on schemata theory. It is evaluated against one of the discussed algorithms from each class, namely: CN2, C4.5, BRAINNE and BGP. Results are discussed and compared. The comparison was done using a benchmark of classification problems. These results show that the new knowledge discovery algorithm performs satisfactorily, yielding accurate, crisp rule sets. Probably the main strength of the HCS algorithm is its simplicity, so it can be the foundation for many possible future extensions. Some of the possible extensions of the new proposed algorithm are suggested in the final part of this thesis.
Ismail, A MSc
2002
University of Pretoria
Training and Optimization of Product Unit Neural Networks Product units in the hidden layer of multilayer neural networks provide a powerful mechanism for neural networks to efficiently learn higher-order combinations of inputs. Training product unit neural networks using local optimization algorithms is difficult due to an increased number of local minima and increased chances of network paralysis. This thesis discusses the problems with using local optimization, especially gradient descent, to train product unit neural networks, and shows that particle swarm optimization, genetic algorithms and leapfrog are efficient alternatives to successfully train product unit neural networks. Architecture selection, i.e. pruning, of product unit neural networks is also studied and applied to determine near optimal neural network architectures that are used in the comparative studies.
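For readers unfamiliar with product units, the following sketch shows how a single unit computes a learnable higher-order combination of its inputs, in contrast to a summation unit's weighted sum:

    import numpy as np

    def product_unit(x: np.ndarray, w: np.ndarray) -> float:
        """A product unit computes prod_i x_i ** w_i, a learnable higher-order
        combination of its inputs (a summation unit computes sum_i w_i * x_i).
        The multiplicative form riddles the error surface with local minima,
        which is why global optimisers such as PSO suit these networks."""
        return float(np.prod(np.power(x, w)))

    # One unit captures the higher-order term x1^2 * sqrt(x2) directly:
    print(product_unit(np.array([3.0, 4.0]), np.array([2.0, 0.5])))  # 18.0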
Brits, R MSc
2003
University of Pretoria
Niching Strategies for Particle Swarm Optimization Evolutionary algorithms and swarm intelligence techniques have been shown to successfully solve optimization problems where the goal is to find a single optimal solution. In multimodal domains where the goal is to locate multiple solutions in a single search space, these techniques fail. Niching algorithms extend existing global optimization algorithms to locate and maintain multiple solutions concurrently. This thesis develops strategies that utilize the unique characteristics of the particle swarm optimization algorithm to perform niching. Shrinking topological neighbourhoods and optimization with multiple subswarms are used to identify and stably maintain niches. Solving systems of equations and multimodal functions are used to demonstrate the effectiveness of the new algorithms.
Potgieter, G MSc
2003
University of Pretoria
Mining Continuous Classes using Evolutionary Computing Data mining is the term given to knowledge discovery paradigms that attempt to infer knowledge, in the form of rules, from structured data using machine learning algorithms. Specifically, data mining attempts to infer rules that are accurate, crisp, comprehensible and interesting. There are not many data mining algorithms for mining continuous classes. This thesis develops a new approach for mining continuous classes. The approach is based on a genetic program, which utilises an efficient genetic algorithm approach to evolve the non-linear regressions described by the leaf nodes of individuals in the genetic program's population. The approach also optimises the learning process by using an efficient, fast data clustering algorithm to reduce the training pattern search space. Experimental results from both algorithms are compared with results obtained from a neural network. The experimental results of the genetic program are also compared against a commercial data mining package (Cubist). These results indicate that the genetic algorithm technique is substantially faster than the neural network, and produces comparable accuracy. The genetic program produces substantially less complex rules than those of both the neural network and Cubist.
Paquet, U MSc
2003
University of Pretoria
Training Support Vector Machines With Particle Swarms Particle swarms can easily be used to optimise a function with a set of linear equality constraints, by restricting the swarm's movement to the constrained search space. A linear particle swarm optimiser and a converging linear particle swarm optimiser are developed to optimise linear equality-constrained functions. It is shown that if the entire swarm of particles is initialised to consist of only feasible solutions, then the swarm can optimise the constrained objective function without ever again considering the set of constraints. The converging linear particle swarm optimiser overcomes the linear particle swarm optimiser's possibility of premature convergence. Training a support vector machine requires solving a constrained quadratic programming problem, and the converging linear particle swarm optimiser ideally fits the needs of an optimisation method for support vector machine training. Particle swarms are intuitive and easy to implement, and are presented as an alternative to current support vector machine training methods.
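The key observation behind the linear particle swarm optimiser can be demonstrated in a few lines: differences of feasible points lie in the null space of the constraint matrix, so if the random coefficients are scalars per particle (rather than per dimension) and the swarm starts feasible, every update stays feasible. The constants below are textbook PSO values, used here only for the demonstration.

    import numpy as np

    rng = np.random.default_rng(1)

    # Feasible set {x : A x = b}; differences of feasible points lie in null(A).
    A = np.array([[1.0, 1.0, 1.0]]); b = np.array([3.0])

    def random_feasible():
        x = rng.normal(size=3)
        return x + A.T @ np.linalg.solve(A @ A.T, b - A @ x)  # project onto A x = b

    pos = np.array([random_feasible() for _ in range(20)])
    vel = np.zeros_like(pos)               # the zero vector is in null(A)
    pbest = pos.copy(); gbest = pos[0].copy()

    # Scalar random coefficients per particle (not per dimension) keep each
    # velocity a linear combination of null-space vectors.
    r1, r2 = rng.random((2, 20, 1))
    vel = 0.72 * vel + 1.49 * r1 * (pbest - pos) + 1.49 * r2 * (gbest - pos)
    pos = pos + vel
    print(np.allclose(A @ pos.T, b[:, None]))  # True: the swarm is still feasible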
Graaff, A MSc
2004
University of Pretoria
The Artificial Immune System with Evolved Lymphocytes The main purpose of the natural immune system is to protect the body against any unwanted foreign cells that could infect the body and lead to devastating results. The natural immune system has different lymphocytes to detect and destroy these unwanted foreign patterns. The natural immune system can be modeled into an artificial immune system that can be used to detect any unwanted patterns in a non-biological environment. One of the main tasks of an immune system is to learn the structure of these unwanted patterns for a faster response to future foreign patterns with the same or similar structure. The artificial immune system (AIS) can therefore be seen as a pattern recognition system. The AIS contains artificial lymphocytes (ALCs) that classify any pattern either as part of a predetermined set of patterns or not. In the immune system, lymphocytes have different states: Immature, Mature, Memory or Annihilated. Lymphocytes in the annihilated state need to be removed from the active set of ALCs. The process of moving from one state to the next needs to be controlled in an efficient manner. This dissertation presents an AIS for detection of unwanted patterns with a dynamic active set of ALCs and proposes a threshold function to determine the state of an ALC. The AIS in the dissertation uses evolutionary computation techniques to evolve an optimal set of lymphocytes for better detection of unwanted patterns and removes ALCs in the annihilated state from the active set of ALCs.
Franken, N MSc
2004
University of Pretoria
PSO-Based Coevolutionary Game Learning This study aims to thoroughly investigate the exact nature of PSO-based coevolution techniques, with specific application to both the zero sum (Tic-Tac-Toe and Checkers) and non-zero sum (Iterated Prisoner's Dilemma) game domains. The results of the study illustrate the impact on performance of specific PSO parameter choices, neighbourhood information sharing structures and game-specific representations. A number of enhancements to the study of game learning are also presented, including new additions to the areas of benchmarking and coevolution.
Nel, G MSc
2005
University of Pretoria
A Memetic Genetic Program for Knowledge Discovery Local search algorithms have proved to be effective in refining solutions that have been found by other algorithms. Evolutionary algorithms, in particular global search algorithms, have been shown to be successful in producing approximate solutions for optimisation and classification problems in acceptable computation times. A relatively new class of methods, memetic algorithms, uses local search to refine the approximate solutions produced by global search algorithms. This thesis develops such a memetic algorithm. The global search algorithm used as part of the new memetic algorithm is a genetic program that implements the building block hypothesis by building simplistic decision trees representing valid solutions, and gradually increases the complexity of the trees. The specific building block hypothesis implementation is known as the building block approach to genetic programming, BGP. The effectiveness and efficiency of the new memetic algorithm, which combines the BGP algorithm with a local search algorithm, is demonstrated.
Peer, ES MSc
2005
University of Pretoria
Serendipitous Software Framework for Facilitating Collaboration in Computational Intelligence A major flaw in the academic system, particularly pertaining to computer science, is that it rewards specialisation. The highly competitive quest for new scientific developments, or rather the quest for a better reputation and more funding, forces researchers to specialise in their own fields, leaving them little time to properly explore what others are doing, sometimes even within their own field of interest. Even the peer review process, which should provide the necessary balance, fails to achieve much diversity, since reviews are typically performed by persons who are again specialists in the particular field of the work. Further, software implementations are rarely reviewed, with the consequence that untenable results are published. Unfortunately, these factors contribute to an environment which is not conducive to collaboration, a cornerstone of academia -- building on the work of others.
This work takes a step back and examines the general landscape of computational intelligence from a broad perspective, drawing on multiple disciplines to formulate a collaborative software platform, which is flexible enough to support the needs of this diverse research community. Interestingly, this project did not set out with these goals in mind, rather it evolved, over time, from something more specialised into the general framework described in this dissertation. Design patterns are studied as a means to manage the complexity of the computational intelligence paradigm in a flexible software implementation. Further, this dissertation demonstrates that releasing research software under an open source license eliminates some of the deficiencies of the academic process, while preserving, and even improving, the ability to build a reputation and pursue funding.
Two software packages have been produced as products of this research: i) CILib, an open source library of computational intelligence algorithms; and ii) CiClops, which is a virtual laboratory for performing experiments that scale over multiple workstations. Together, these software packages are intended to improve the quality of research output and facilitate collaboration by sharing a repository of simulation data, statistical analysis tools and a single software implementation.
Du Plessis, J MSc
2006
University of Pretoria
ACODV: Ant Colony Optimisation Distance Vector Routing in Ad Hoc Networks A mobile ad hoc network is a collection of wireless mobile devices which dynamically form a temporary network, without using any existing network infrastructure or centralised administration. Each node in the network effectively becomes a router, and forwards packets towards the packet’s destination node. Ad hoc networks are characterized by frequently changing network topology, multi-hop wireless connections and the need for dynamic, efficient routing protocols. This work considers the routing problem in a network of uniquely addressable sensors. These networks are encountered in many industrial applications, where the aim is to relay information from a collection of data gathering devices deployed over an area to central points. The routing problem in such networks is characterised by: the overarching requirement for low power consumption, as battery powered sensors may be required to operate for years without battery replacement; an emphasis on reliable communication as opposed to real-time communication, as it is more important for packets to arrive reliably than to arrive quickly; and very scarce processing and memory resources, as these sensors are often implemented on small low-power microprocessors. This work provides overviews of routing protocols in ad hoc networks, swarm intelligence, and swarm intelligence applied to ad hoc routing. Various mechanisms that are commonly encountered in ad hoc routing are experimentally evaluated under situations as close to real-life as possible. Where possible, enhancements to the mechanisms are suggested and evaluated. Finally, a routing protocol suitable for such low-power sensor networks is defined and benchmarked in various scenarios against the Ad hoc On-Demand Distance Vector (AODV) algorithm.
Messerschmidt, L MSc
2006
University of Pretoria
Using Particle Swarm Optimization to Evolve Two-Player Game Agents Computer game-playing agents are almost as old as computers themselves, and people have been developing agents since the 1950’s. Unfortunately the techniques for game-playing agents have remained basically the same for almost half a century – an eternity in computer time. Recently developed approaches have shown that it is possible to develop game playing agents with the help of learning algorithms. This study is based on the concept of algorithms that learn how to play board games from zero initial knowledge about playing strategies. A coevolutionary approach, where a neural network is used to assess desirability of leaf nodes in a game tree, and evolutionary algorithms are used to train neural networks in competition, is overviewed. This thesis then presents an alternative approach in which particle swarm optimization (PSO) is used to train the neural networks. Different variations of the PSO are implemented and compared. The results of the PSO approaches are also compared with that of an evolutionary programming approach. The performance of the PSO algorithms is investigated for different values of the PSO control parameters. This study shows that the PSO approach can be applied successfully to train game-playing agents.
Grobler, H MSc
2006
University of Pretoria
Surface Defect Detection by means of a Structural Light System During the production of a motor vehicle, damage may be sustained by the components of the vehicle at various stages of the production process. As the vehicle being assembled progresses through the production process, the cost of correcting such damage increases significantly. This document describes a Surface Defect Detection System (SDDS) that was developed to detect damage to the outer panels of a vehicle, specifically the roof panel.
The system makes use of a structural light configuration consisting of a laser and camera, together with appropriate image processing tools. The structural light system operates and is mounted on an industrial robot, that forms part of an existing motor vehicle production plant. Although the developed system is targeted specifically towards the detection of bumps (high spots) and dents (low spots) on the surface of the roof of the BMW 3-series, the implementation is sufficiently generic as to allow other panels to be scanned. The image processing software is implemented as the Surface Defect Detection System Framework (SDDSF).
Duminy, W MSc
2007
University of Pretoria
A Learning Framework for Zero-Knowledge Game Playing Agents The subjects of perfect information games, machine learning and computational intelligence combine in an experiment that investigates a method to build the skill of a game-playing agent from zero game knowledge. The skill of a playing agent is determined by two aspects: the first is the quantity and quality of the knowledge it uses, and the second is its search capacity. This thesis introduces a novel representation language that combines symbols and numeric elements to capture game knowledge. Insofar as search is concerned, an extension to an existing knowledge-based search method is developed. Empirical tests show an improvement over alpha-beta, especially in learning conditions where the knowledge may be weak. Current machine learning techniques as applied to game agents are reviewed. From these techniques a learning framework is established. The data-mining algorithm, ID3, and the computational intelligence technique, Particle Swarm Optimisation (PSO), form the key learning components of this framework. The classification trees produced by ID3 are subjected to new post-pruning processes specifically defined for the mentioned representation language. Different combinations of these pruning processes are tested and a dominant combination is chosen for use in the learning framework. As an extension to PSO, tournaments are introduced as a relative fitness function. A variety of alternative tournament methods are described and some experiments are conducted to evaluate these. The final design decisions are incorporated into the learning framework configuration, and learning experiments are conducted on Checkers and some variations of Checkers. These experiments show that learning has occurred, but also highlight the need for further development and experimentation. Some ideas in this regard conclude the thesis.
Naicker, C MSc
2007
University of Pretoria
Derating NichePSO The search for multiple solutions is applicable to many fields (engineering, science, economics, and others). Multiple solutions allow for human judgement to select the best solution from a group of solutions that best match the search criteria. Finding multiple solutions to an optimisation problem has proven difficult. Evolutionary computation (EC) and, more recently, particle swarm optimisation (PSO) algorithms have been used in this field to locate and maintain multiple solutions with fair success. This thesis develops and empirically analyses a new method to find multiple solutions within a convoluted search space. The method is a hybrid of the NichePSO and the sequential niche technique (SNT). The original SNT was developed using a genetic algorithm (GA). It included restrictions such as knowing or approximating the number of solutions that exist. A further pitfall of the SNT is that it introduces false optima after modifying the search space, thereby reducing the accuracy of the solutions. However, this can be resolved with a local search in the unmodified search space. Other sequential niching algorithms require that the search be repeated sequentially until all solutions are found, without considering what was learned in previous iterations, resulting in a blind and wasteful search. The NichePSO has been shown to be more accurate than GA based algorithms. It does not require knowledge of the number of solutions in the search space prior to the search process. However, the NichePSO does not scale well for problems with many optima. The method developed in this thesis, referred to as the derating NichePSO, combines the SNT with the NichePSO. The main objective of the derating NichePSO is to eliminate the inaccuracy of the SNT and to improve the scalability of the NichePSO. The derating NichePSO is compared to the NichePSO, deterministic crowding [23] and the original SNT using various multimodal functions. The performance of the derating NichePSO is analysed and it is shown that the derating NichePSO is more accurate than the SNT and more scalable than the NichePSO.
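The sequential-niche ingredient of the hybrid can be illustrated with a simple derating function that penalises fitness near previously found solutions, pushing subsequent search runs toward unexplored regions; the linear derating shape, radius, and penalty height below are illustrative assumptions (minimisation is assumed).

    import numpy as np

    def derated_fitness(f, x, found_optima, radius=0.5, height=10.0):
        """Penalise raw fitness near previously located solutions so that the
        next sequential run is pushed toward unexplored regions (minimisation
        assumed; the linear shape, radius and height are illustrative)."""
        g = f(x)
        for s in found_optima:
            d = np.linalg.norm(x - s)
            if d < radius:
                g += height * (radius - d) / radius  # penalty fades with distance
        return g

    f = lambda x: float(np.cos(4 * np.pi * x[0]) ** 2)  # many optima in [0, 1]
    print(derated_fitness(f, np.array([0.13]), [np.array([0.12])]))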
Pun, J MSc
2007
University of Pretoria
Gesture Recognition with Application in Music Arrangement This thesis studies the interaction with music synthesis systems using hand gestures. Traditionally users of such systems were limited to input devices such as buttons, pedals, faders, and joysticks. The use of gestures allows the user to interact with the system in a more intuitive way. Without the constraint of input devices, the user can simultaneously control more elements within the music composition, thus increasing the level of the system’s responsiveness to the musician’s creative thoughts. A working system of this concept is implemented, employing computer vision and machine intelligence techniques to recognise the user’s gestures.
Van der Stockt, SAG MSc
2008
University of Pretoria
A Generic Neural Network Framework using Design Patterns Designing object-oriented software is hard, and designing reusable object-oriented software is even harder. This task is even more daunting for a developer of computational intelligence applications, as optimising one design objective tends to make others inefficient or even impossible. Classic examples in computer science include ‘storage vs. time’ and ‘simplicity vs. flexibility.’ Neural network requirements are by their very nature very tightly coupled – a required design change in one area of an existing application tends to have severe effects in other areas, making the change impossible or inefficient. Often this situation leads to a major redesign of the system and in many cases a completely rewritten application. Many commercial and open-source packages do exist, but these cannot always be extended to support input from other fields of computational intelligence due to proprietary reasons or failing to fully take all design requirements into consideration.
Design patterns make a science out of writing software that is modular, extensible and efficient as well as easy to read and understand. The essence of a design pattern is to avoid repeatedly solving the same design problem from scratch by reusing a solution that solves the core problem. This pattern or template for the solution has well-understood prerequisites, structure, properties, behaviour and consequences. CILib is a framework that allows developers to develop new computational intelligence applications quickly and efficiently. Flexibility, reusability and clear separation between components are maximised through the use of design patterns. Reliability is also ensured as the framework is open source and thus has many people that collaborate to ensure that the framework is well designed and error free.
This dissertation discusses the design and implementation of a generic neural network framework that allows users to design, implement and use any possible neural network models and algorithms in such a way that they can reuse and be reused by any other computational intelligence algorithm in the rest of the framework, or any external applications. This is achieved by using object-oriented design patterns in the design of the framework.
Zablocki, F MSc
2008
University of Pretoria
Multiple Sequence Alignment using Particle Swarm Optimization The recent advent of bioinformatics has given rise to the central and recurrent problem of optimally aligning biological sequences. Many techniques have been proposed in an attempt to solve this complex problem with varying degrees of success. This thesis investigates the application of a computational intelligence technique known as particle swarm optimization (PSO) to the multiple sequence alignment (MSA) problem. Firstly, the performance of the standard PSO (S-PSO) and its characteristics are fully analyzed. Secondly, a scalability study is conducted that aims at expanding the S-PSO’s application to complex MSAs, as well as studying the behaviour of three other kinds of PSOs on the same problems. Experimental results show that the PSO is efficient in solving the MSA problem and compares favourably with the well-known CLUSTAL X and T-COFFEE.
Poggiolini, M MSc
2009
University of Pretoria
The Feature Detection Rule and its Application within the Negative Selection Algorithm The negative selection algorithm developed by Forrest et al. was inspired by the manner in which T-cell lymphocytes mature within the thymus before being released into the blood system. The resultant T-cell lymphocytes, which are then released into the blood, exhibit an interesting characteristic: they are only activated by non-self cells that invade the human body. The work presented in this thesis examines the current body of research on the negative selection theory and introduces a new affinity threshold function, called the feature-detection rule. The feature-detection rule utilises the inter-relationship between both adjacent and non-adjacent features within a particular problem domain to determine if an artificial lymphocyte is activated by a particular antigen. The performance of the feature-detection rule is contrasted with traditional affinity-matching functions currently employed within negative selection theory, most notably the r-chunks rule (which subsumes the r-contiguous bits rule) and the Hamming-distance rule. The performance is characterised by considering the detection rate, false-alarm rate, degree of generalisation and degree of overfitting. The thesis shows that the feature-detection rule is superior to the r-chunks rule and the Hamming-distance rule, in that the feature-detection rule requires a much smaller number of detectors to achieve higher detection rates and lower false-alarm rates. The thesis additionally argues that the way in which permutation masks are currently applied within negative selection theory is incorrect and counterproductive, while placing the feature-detection rule within the spectrum of affinity-matching functions currently employed by artificial immune-system (AIS) researchers.
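For context, the r-chunks rule that the feature-detection rule is compared against can be stated in one line: a detector is a window of r symbols at a fixed position, and it is activated when the antigen carries the same symbols there. A minimal sketch:

    def r_chunks_match(detector: str, antigen: str, start: int, r: int) -> bool:
        """An r-chunks detector is a window of r symbols anchored at `start`;
        it is activated when the antigen carries the same symbols there."""
        return antigen[start:start + r] == detector

    print(r_chunks_match("101", "0001010", 2, 3))  # False: antigen[2:5] == "010"
    print(r_chunks_match("101", "0010100", 2, 3))  # True:  antigen[2:5] == "101"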
Grobler, J MEng
2009
University of Pretoria
Particle Swarm Optimization and Differential Evolution for Multi-Objective Multiple Machine Scheduling Production scheduling is one of the most important issues in the planning and operation of manufacturing systems. Customers increasingly expect to receive the right product at the right price at the right time. Various problems experienced in manufacturing, for example low machine utilization and excessive work-in-process, can be attributed directly to inadequate scheduling. In this dissertation a production scheduling algorithm is developed for Optimatix, a South African-based company specializing in supply chain optimization. To address the complex requirements of the customer, the problem was modeled as a flexible job shop scheduling problem with sequence-dependent set-up times, auxiliary resources and production down time.
The algorithm development process focused on investigating the application of both particle swarm optimization (PSO) and differential evolution (DE) to production scheduling environments characterized by multiple machines and multiple objectives. Alternative problem representations, algorithm variations and multi-objective optimization strategies were evaluated to obtain an algorithm which performs well against both existing rule-based algorithms and an existing complex flexible job shop scheduling solution strategy.
Finally, the generality of the priority-based algorithm was evaluated by applying it to the scheduling of production and maintenance activities at Centurion Ice Cream and Sweets. The production environment was modeled as a multi-objective uniform parallel machine shop problem with sequence-dependent set-up times and unavailability intervals. A self-adaptive modified vector evaluated DE algorithm was developed and compared to classical PSO and DE vector evaluated algorithms. Promising results were obtained with respect to the suitability of the algorithms for solving a range of multi-objective multiple machine scheduling problems.
Neethling, M MSc
2009
University of Pretoria
Using SetPSO to Determine RNA Secondary Structure RNA secondary structure prediction is an important field in Bioinformatics. A number of different approaches have been developed to simplify the determination of RNA molecule structures. RNA is a nucleic acid found in living organisms which fulfils a number of important roles in living cells. Knowledge of its structure is crucial in the understanding of its function. Determining RNA secondary structure computationally, rather than by physical means, has the advantage of being a quicker and cheaper method. This dissertation introduces a new Set-based Particle Swarm Optimisation algorithm, known as SetPSO for short, to optimise the structure of an RNA molecule, using an advanced thermodynamic model. Structure prediction is modelled as an energy minimisation problem.
Particle swarm optimisation is a simple but effective stochastic optimisation technique developed by Kennedy and Eberhart. This simple technique was adapted to work with variable-length particles, which consist of a set of elements rather than a vector of real numbers. The effectiveness of this structure prediction approach was compared to that of a dynamic programming algorithm called mfold. It was found that SetPSO can be used as a combinatorial optimisation technique which can be applied to the problem of RNA secondary structure prediction. This research also included an investigation into the behaviour of the new SetPSO optimisation algorithm. Further study needs to be conducted to evaluate the performance of SetPSO on different combinatorial and set-based optimisation problems.
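The set-based flavour of the algorithm can be conveyed with a toy update in which a particle's "velocity" is a pair of element sets to add and remove; the sampling rule below is an illustrative simplification, not the actual SetPSO operators.

    import random

    random.seed(0)

    def set_velocity(position, best, sample_rate=0.5):
        """Toy set-based 'velocity': elements to add (drawn from what the best
        position has and this one lacks) and elements to remove (vice versa)."""
        add = {e for e in (best - position) if random.random() < sample_rate}
        remove = {e for e in (position - best) if random.random() < sample_rate}
        return add, remove

    def apply_velocity(position, velocity):
        add, remove = velocity
        return (position | add) - remove

    position = {1, 2, 3}; personal_best = {2, 3, 5, 7}
    position = apply_velocity(position, set_velocity(position, personal_best))
    print(position)  # some of {5, 7} gained, element 1 possibly dropped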
Hauptfleisch, A MSc
2010
University of Pretoria
Automatic Road Extraction from High Resolution Satellite Imagery using Spectral Classification Methods Road networks play an important role in a number of geospatial applications, such as cartographic, infrastructure planning and traffic routing software. Automatic and semi-automatic road network extraction techniques have significantly increased the extraction rate of road networks. Automated processes still yield some erroneous and incomplete results, and costly human intervention is still required to evaluate results and correct errors. With the aim of improving the accuracy of road extraction systems, three objectives are defined in this thesis: firstly, the study seeks to develop a flexible semi-automated road extraction system, capable of extracting roads from QuickBird satellite imagery. The second objective is to integrate a variety of algorithms within the road network extraction system. The benefits of using each of these algorithms within the proposed road extraction system are illustrated. Finally, a fully automated system is proposed by incorporating a number of the algorithms investigated throughout the thesis.
Louis, A MSc
2010
University of Pretoria
Unsupervised Discovery of Relations for Analysis of Textual Data in Digital Forensics This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to aid analysts to more quickly, efficiently and accurately analyse data to reveal truly useful information.
Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data.
A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts’ exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches.
The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects.
Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts. The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics as against the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics.
Barla-Szabo, D MSc
2010
University of Pretoria
A Study of Gradient Based Particle Swarm Optimisers Gradient-based optimisers are a natural way to solve optimisation problems, and have long been used for their efficacy in exploiting the search space. Particle swarm optimisers (PSOs), when using reasonable algorithm parameters, are considered to have good exploration characteristics.
This thesis proposes a specific way of constructing hybrid gradient PSOs. Heterogeneous, hybrid gradient PSOs are constructed by allowing the gradient algorithm to optimise local best particles, while the PSO algorithm governs the behaviour of the rest of the swarm. This approach allows the distinct algorithms to concentrate on performing the separate tasks of exploration and exploitation. Two new PSOs, the Gradient Descent PSO (GDPSO), which combines the Gradient Descent and PSO algorithms, and the LeapFrog PSO, which combines the LeapFrog and PSO algorithms, are introduced. The GDPSO represents arguably the simplest hybrid gradient PSO possible, while the LeapFrog PSO incorporates the more sophisticated LFOP1(b) algorithm, exhibiting a heuristic algorithm design and dynamic time step adjustment mechanism. The strong tendency of these hybrids to prematurely converge is examined, and it is shown that by modifying algorithm parameters and delaying the introduction of gradient information, it is possible to retain the strong exploration capabilities of the original PSO algorithm while also benefiting from the exploitation of the gradient algorithms.
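To make the division of labour concrete, a minimal sketch follows: the swarm performs the usual inertia/cognitive/social update while each neighbourhood best is refined by a plain gradient-descent step standing in for the GD or LeapFrog components. The function grad_f, the step size eta, and the parameter values are illustrative assumptions, not the thesis's implementations.

```python
import numpy as np

def hybrid_gradient_pso_step(x, v, pbest, lbest, grad_f,
                             w=0.7298, c1=1.4962, c2=1.4962, eta=0.01):
    """One iteration of a hybrid gradient PSO: the swarm follows the usual
    velocity/position update (exploration), while each neighbourhood best
    position is refined by a plain gradient-descent step (exploitation)."""
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (lbest - x)
    x = x + v
    lbest = lbest - eta * grad_f(lbest)  # gradient algorithm optimises the local bests
    return x, v, lbest
```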
Dean, E MSc
2010
University of Pretoria
Computer Aided Identification of Biological Specimens using Self-Organizing Maps For scientific or socio-economic reasons it is often necessary or desirable that biological material be identified. Given that there are an estimated 10 million living organisms on Earth, the identification of biological material can be problematic. Consequently the services of taxonomist specialists are often required. However, if such expertise is not readily available it is necessary to attempt an identification using an alternative method. Some of these alternative methods are unsatisfactory or can lead to a wrong identification. One of the most common problems encountered when identifying specimens is that important diagnostic features are often not easily observed, or may even be completely absent. A number of techniques can be used to try to overcome this problem, one of which, the Self Organizing Map (or SOM), is a particularly appealing technique because of its ability to handle missing data. This thesis explores the use of SOMs as a technique for the identification of indigenous trees of the Acacia species in KwaZulu-Natal, South Africa. The ability of the SOM technique to perform exploratory data analysis through data clustering is utilized and assessed, as is its usefulness for visualizing the results of the analysis of numerical, multivariate botanical data sets. The SOM’s ability to investigate, discover and interpret relationships within these data sets is examined, and the technique’s ability to identify tree species successfully is tested. These data sets are also tested using the C5 and CN2 classification techniques. Results from both these techniques are compared with the results obtained by using a SOM commercial package. These results indicate that the application of the SOM to the problem of biological identification could provide the start of the long-awaited breakthrough in computerized identification that biologists have eagerly been seeking.
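The SOM's appeal for this task rests on the fact that a best-matching unit can be found using only the observed components of a specimen's feature vector, which is what lets the technique tolerate missing diagnostic features. The sketch below illustrates that general mechanism under stated assumptions; it is not the commercial SOM package used in the study.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Find the SOM neuron whose weight vector is closest to x, measuring
    distance only over the observed (non-NaN) components of x;
    weights has shape (rows, cols, dim), x has shape (dim,)."""
    mask = ~np.isnan(x)                    # observed components only
    diffs = weights[:, :, mask] - x[mask]  # broadcast over the neuron grid
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return np.unravel_index(np.argmin(dists), dists.shape)
```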
Papacostantis, E MSc
2010
University of Pretoria
Competitive Co-Evolution of Trend Reversal Indicators using Particle Swarm Optimisation Computational Intelligence has found a challenging testbed for various paradigms in the financial sector. Extensive research has resulted in numerous financial applications using neural networks and evolutionary computation, mainly genetic algorithms and genetic programming. More recent advances in the field of computational intelligence have not yet been applied as extensively or have not become available in the public domain, due to the confidentiality requirements of financial institutions.
This study investigates how co-evolution together with the combination of particle swarm optimisation and neural networks could be used to discover competitive security trading agents that could enable the timing of buying and selling securities to maximise net profit and minimise risk over time. The investigated model attempts to identify security trend reversals with the help of technical analysis methodologies.
Technical market indicators provide the necessary market data to the agents and reflect information such as supply, demand, momentum, volatility, trend, sentiment and retracement. All this is derived from the security price alone, which is one of the strengths of technical analysis and the reason for its use in this study. The model proposed in this thesis evolves trading strategies within a single population of competing agents, where each agent is represented by a neural network. The population is governed by a competitive co-evolutionary particle swarm optimisation algorithm, with the objective of optimising the weights of the neural networks. A standard feed forward neural network architecture is used, which functions as a market trend reversal confidence indicator. Ultimately, the neural network becomes an amalgamation of the technical market indicators used as inputs, and hence is capable of detecting trend reversals. Timely trading actions are derived from the confidence output, by buying and short selling securities when the price is expected to rise or fall respectively.
No expert trading knowledge is presented to the model, only the technical market indicator data. The co-evolutionary particle swarm optimisation model facilitates the discovery of favourable technical market indicator interpretations, starting with zero knowledge. A competitive fitness function is defined that allows the evaluation of each solution relative to other solutions, based on predefined performance metric objectives. The relative fitness function in this study considers net profit and the Sharpe ratio as a risk measure.
For the purposes of this study, the stock prices of eight large market capitalisation companies were chosen. Two benchmarks were used to evaluate the discovered trading agents, consisting of a Bollinger Bands/Relative Strength Index rule-based strategy and the popular buy-and-hold strategy. The agents that were discovered from the proposed hybrid computational intelligence model outperformed both benchmarks by producing higher returns for in-sample and out-of-sample data at a low risk. This indicates that the introduced model is effective in finding favourable strategies, based on observed historical security price data. Transaction costs were considered in the evaluation of the computational intelligence agents, making this a feasible model for a real-world application.
Smit, M MSc
2010
University of Pretoria
Interactive Narrative Generation using Computational Verb Theory Interactive narrative extends traditional story-telling techniques by enabling previously passive observers to become active participants in the narrative events that unfold. A variety of approaches have attempted to construct such interactive narrative spaces and reconcile the goals of interactivity and dramatic story-telling. With the advent of the linguistic variable in 1972, a means was established for modelling natural language words and phrases mathematically and computationally. Over the past decade, the computational verb, first introduced in 1997, has been developed as a mathematical model of natural language verbs.
Anguelov, B MSc
2012
University of Pretoria
Video Game Pathfinding and Improvements to Discrete Search on Grid-based Maps The most basic requirement for any computer controlled game agent in a video game is to be able to successfully navigate the game environment. Pathfinding is an essential component of any agent navigation system. Pathfinding is, at the simplest level, a search technique for finding a route between two points in an environment. The real-time multi-agent nature of video games places extremely tight constraints on the pathfinding problem. This study aims to provide the first complete review of the current state of video game pathfinding, both with regard to the graph search algorithms employed and the implications of pathfinding within dynamic game environments. Furthermore, this thesis presents novel work in the form of a domain specific search algorithm for use on grid-based game maps: the spatial grid A* algorithm, which is shown to offer significant improvements over A* within the intended domain.
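For orientation, the baseline that grid-specific algorithms such as the spatial grid A* improve upon is the classic A* search. A compact, illustrative A* on a 4-connected grid with the Manhattan heuristic follows; it is a textbook sketch, not the thesis's optimised implementation.

```python
import heapq

def astar(grid, start, goal):
    """Textbook A* on a 4-connected grid (0 = free, 1 = blocked) using the
    Manhattan distance as an admissible heuristic; returns the list of
    cells from start to goal, or None if no route exists."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start)]      # (f, g, cell)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        f, g, cur = heapq.heappop(open_set)
        if cur == goal:                    # reconstruct the route
            path = [cur]
            while cur in came_from:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        if g > g_cost.get(cur, float("inf")):
            continue                       # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < g_cost.get(nxt, float("inf"))):
                g_cost[nxt] = g + 1
                came_from[nxt] = cur
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt))
    return None
```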
Duhain, JGOL MSc
2012
University of Pretoria
Particle Swarm Optimization in Dynamically Changing Environments Real-world optimisation problems are often dynamic in nature. Recently, much research has been done to apply particle swarm optimisation (PSO) to dynamic environments (DEs). However, these research efforts generally focused on optimising one variation of the PSO algorithm for one type of DE. The aim of this work is to develop a more comprehensive view of PSO for DEs. This thesis studies different schemes of characterising and taxonomising DEs, performance measures used to quantify the performance of optimisation algorithms applied to DEs, various adaptations of PSO to apply PSO to DEs, and the effectiveness of these approaches on different DE types.
The standard PSO algorithm has shown limitations when applied to DEs. To overcome these limitations, the standard PSO can be modified using personal best re-evaluation, change detection and response, diversity maintenance, or swarm sub-division and parallel tracking of optima. To investigate the strengths and weaknesses of these approaches, a representative sample of algorithms, namely, the standard PSO, re-evaluating PSO, reinitialising PSO, atomic PSO (APSO), quantum swarm optimisation (QSO), multi-swarm, and self-adapting multi-swarm (SAMS), are empirically analysed. These algorithms are analysed on a range of DE test cases, and their ability to detect and track optima is evaluated using performance measures designed for DEs. The experiments show that QSO, multi-swarm and reinitialising PSO provide the best results. However, the most effective approach to use depends on the dimensionality, modality and type of the DEs, as well as on the objective of the algorithm. A number of observations are also made regarding the behaviour of the swarms, and the influence of certain control parameters of the algorithms evaluated.
Pampara, G MSc
2012
University of Pretoria
Angle Modulated Population based Algorithms to Solve Binary Problems Recently, continuous-valued optimization problems have received a great amount of focus, resulting in optimization algorithms which are very efficient within the continuous-valued space. Many optimization problems are, however, defined within the binary-valued problem space. These continuous-valued optimization algorithms cannot operate directly on a binary-valued problem representation without algorithm adaptations, because the mathematics used within these algorithms generally fails within a binary problem space. Unfortunately, such adaptations may alter the behavior of the algorithm, potentially degrading the performance of the original continuous-valued optimization algorithm. Additionally, binary representations present complications with respect to increasing problem dimensionality, interdependencies between dimensions, and a loss of precision.
This research investigates the possibility of applying continuous-valued optimization algorithms to solve binary-valued problems, without requiring algorithm adaptation. This is achieved through the application of a mapping technique, known as angle modulation. Angle modulation effectively addresses most of the problems associated with the use of a binary representation by abstracting a binary problem into a four-dimensional continuous-valued space, from which a binary solution is then obtained. The abstraction is obtained as a bit-generating function produced by a continuous-valued algorithm. A binary solution is then obtained by sampling the bit-generating function.
This thesis proposes a number of population-based angle-modulated continuous-valued algorithms to solve binary-valued problems. These algorithms are then compared to binary algorithm counterparts, using a suite of benchmark functions. Empirical analysis will show that the angle-modulated continuous-valued algorithms are viable alternatives to binary optimization algorithms.
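The four-dimensional continuous space arises because the bit-generating function commonly used in the angle-modulation literature is a sine/cosine composition with four coefficients. Below is a hedged sketch of the decoding step, assuming that standard generating function rather than any variant specific to this thesis.

```python
import numpy as np

def angle_modulated_bits(a, b, c, d, n_bits):
    """Decode a 4-dimensional continuous solution (a, b, c, d) into a bit
    string via the angle-modulation generating function
        g(x) = sin(2*pi*(x - a) * b * cos(2*pi*c*(x - a))) + d,
    sampled at evenly spaced points: bit_i = 1 if g(x_i) > 0, else 0."""
    x = np.arange(n_bits, dtype=float)  # evenly spaced sample points
    g = np.sin(2 * np.pi * (x - a) * b * np.cos(2 * np.pi * c * (x - a))) + d
    return (g > 0).astype(int)
```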
Rakitianskaia, AS MSc
2012
University of Pretoria
Using Particle Swarm Optimisation to Train Feedforward Neural Networks in Dynamic Environments The feedforward neural network (NN) is a mathematical model capable of representing any non-linear relationship between input and output data. It has been successfully applied to a wide variety of classification and function approximation problems. Various neural network training algorithms were developed, including the particle swarm optimiser (PSO), which was shown to outperform the standard back propagation training algorithm on a selection of problems. However, it was usually assumed that the environment in which a NN operates is static. Such an assumption is often not valid for real life problems, and the training algorithms have to be adapted accordingly. Various dynamic versions of the PSO have already been developed. This work investigates the applicability of dynamic PSO algorithms to NN training in dynamic environments, and compares the performance of dynamic PSO algorithms to the performance of back propagation. Three popular dynamic PSO variants are considered. The extent of adaptive properties of back propagation and dynamic PSO under different kinds of dynamic environments is determined. Dynamic PSO is shown to be a viable alternative to back propagation, especially in environments exhibiting infrequent gradual changes.
Morkel, T MSc
2012
University of Pretoria
Image Steganography Applications for Secure Communication To securely communicate information between parties or locations is not an easy task considering the possible attacks or unintentional changes that can occur during communication. Encryption is often used to protect secret information from unauthorised access. Encryption, however, is not inconspicuous and the observable exchange of encrypted information between two parties can provide a potential attacker with information on the sender and receiver(s). The presence of encrypted information can also entice a potential attacker to launch an attack on the secure communication.
This dissertation investigates and discusses the use of image steganography, a technology for hiding information in other information, to facilitate secure communication. Communication is divided into three categories: secure self-communication, one-to-one communication and one-to-many communication, depending on the number of receivers. In this dissertation, applications that make use of image steganography are implemented for each of the secure communication categories. For self-communication, image steganography is used to hide one-time passwords (OTPs) in images that are stored on a mobile device. For one-to-one communication, a decryptor program that forms part of an encryption protocol is embedded in an image using image steganography, and for one-to-many communication, a secret message is divided into pieces and different pieces are embedded in different images. The image steganography applications for each of the secure communication categories are discussed along with the advantages and disadvantages that the applications have over more conventional secure communication technologies. An additional image steganography application is proposed that determines whether information is modified during communication.
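Although the dissertation is concerned with applications rather than any particular embedding, the simplest illustration of hiding information inside an image is least-significant-bit (LSB) embedding. The sketch below is a generic example of the idea, not the embedding scheme used in the dissertation.

```python
import numpy as np

def embed_lsb(pixels, message_bits):
    """Hide a bit string in the least significant bits of the pixel values:
    only the lowest bit of each byte changes, so the stego image is
    visually indistinguishable from the cover image."""
    bits = np.asarray(message_bits, dtype=pixels.dtype)
    flat = pixels.flatten()  # flatten() returns a copy, leaving the cover intact
    if bits.size > flat.size:
        raise ValueError("message too long for this cover image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(pixels.shape)

def extract_lsb(stego_pixels, n_bits):
    """Recover the first n_bits embedded by embed_lsb."""
    return stego_pixels.flatten()[:n_bits] & 1
```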
Engelbrecht, N MEng
2013
University of Pretoria
Analysis of RED Packet Loss Performance in a Simulated IP WAN The Internet supports a diverse number of applications, which have different requirements for a number of services. Next generation networks provide high speed connectivity between hosts, which leaves the service provider to configure network devices appropriately, in order to maximize network performance. Service provider settings are based on best recommendation parameters, which give an opportunity to optimize these settings even further. This dissertation focuses on a packet discarding algorithm, known as random early detection (RED), to determine parameters which will maximize utilization.
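For context, RED drops packets probabilistically before the queue overflows, based on an exponentially weighted moving average of the queue size; the parameters tuned in studies of this kind are the two thresholds and the maximum drop probability. A sketch of the standard rule follows; the variable names are the conventional ones from the RED literature, not settings from the dissertation.

```python
def update_avg_queue(avg_q, q, w_q=0.002):
    """Exponentially weighted moving average of the instantaneous queue size q."""
    return (1 - w_q) * avg_q + w_q * q

def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Random early detection drop rule: never drop below min_th, always
    drop above max_th, and in between drop with a probability that rises
    linearly from 0 to max_p."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)
```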
Scheepers, C MSc
2014
University of Pretoria
Co-evolution of Neuro-Controllers to Train Multi-Agent Teams from Zero Knowledge After the historic chess match between Deep Blue and Garry Kasparov, many researchers considered the game of chess solved and moved on to the more complex game of soccer. Artificial intelligence research has shifted focus to creating artificial players capable of mimicking the task of playing soccer. A new training algorithm is presented in this thesis for training teams of players from zero knowledge, evaluated on a simplified version of the game of soccer. The new algorithm makes use of the charged particle swarm optimiser as a neural network trainer in a coevolutionary training environment. To counter the lack of domain information, a new relative fitness measure based on the FIFA league-ranking system was developed. The function provides a granular relative performance measure for competitive training. Gameplay strategies that resulted from the trained players are evaluated. It was found that the algorithm successfully trains teams of agents to play in a cooperative manner. Techniques developed in this study may also be widely applied to various other artificial intelligence fields.
Cleghorn, CW MSc
2014
University of Pretoria
A Generalized Theoretical Deterministic Particle Swarm Model Particle swarm optimization (PSO) is a well known population-based search algorithm, originally developed by Kennedy and Eberhart in 1995. The PSO has been utilized in a variety of application domains, providing a wealth of empirical evidence for its effectiveness as an optimizer. The PSO itself has undergone many alterations subsequent to its inception, some of which are fundamental to the PSO’s core behavior, while others have been more application specific. The fundamental alterations to the PSO have to a large extent been a result of theoretical analysis of the long-term trajectories of the PSO’s particles. The most obvious example is the need for velocity clamping in the original PSO. While there were empirical findings that suggested that each particle’s velocity was increasing at a rapid rate, it was only once a solid theoretical study was performed that the reason for the velocity explosion was understood. There has been a large amount of theoretical research done on the PSO, both for the deterministic model, and more recently for the stochastic model.
This thesis presents an extension to the theoretical deterministic PSO model. Under the extended model, conditions for particle convergence to a point are derived. At present all theoretical PSO research is done under the stagnation assumption, in some form or another. The analysis done under the stagnation assumption is one where the personal best and neighborhood best are assumed to be non-changing. While analysis under the stagnation assumption is very informative, it could never provide a complete description of a PSO’s behavior. Furthermore, the assumption implicitly removes the notion of a social network structure from the analysis. The model used in this thesis greatly weakens the stagnation assumption, by instead assuming that each particle’s personal best and neighborhood best can occupy an arbitrarily large number of unique positions. Empirical results are presented to support the theoretical findings.
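For readers unfamiliar with the deterministic model: with the stochastic coefficients held fixed, the particle position obeys the linear recurrence x(t+1) = (1 + w - c) x(t) - w x(t-1) + c p, where c = c1 + c2 and p is the weighted average of the personal and neighbourhood bests. Standard stability analysis of this recurrence yields the classic convergence region sketched below; the thesis's extended model derives more general conditions than this textbook case.

```python
def converges_deterministic(w, c1, c2):
    """Classic convergence region of the deterministic PSO particle
    recurrence x(t+1) = (1 + w - c) x(t) - w x(t-1) + c p, with c = c1 + c2:
    both roots of the characteristic polynomial lie inside the unit circle
    exactly when |w| < 1 and 0 < c < 2 * (1 + w)."""
    c = c1 + c2
    return abs(w) < 1 and 0 < c < 2 * (1 + w)
```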
Georgieva, KS MSc
2015
University of Pretoria
A Computational Intelligence Approach to Clustering of Temporal Data Temporal data is common in real-world datasets. Analysis of such data, for example by means of clustering algorithms, can be difficult due to its dynamic behaviour. There are various types of changes that may occur to clusters in a dataset. Firstly, data patterns can migrate between clusters, shrinking or expanding the clusters. Additionally, entire clusters may move around the search space. Lastly, clusters can split and merge.
Data clustering, which is the process of grouping similar objects, is one approach to determine relationships among data patterns, but data clustering approaches can face limitations when applied to temporal data, such as difficulty tracking the moving clusters. This research aims to analyse the ability of particle swarm optimisation (PSO) and differential evolution (DE) algorithms to cluster temporal data. These algorithms experience two weaknesses when applied to temporal data. The first weakness is the loss of diversity, which refers to the fact that the population of the algorithm converges, becoming less diverse and, therefore, limiting the algorithm’s exploration capabilities. The second weakness, outdated memory, is only experienced by the PSO and refers to the previous personal best solutions found by the particles becoming obsolete as the environment changes. A data clustering algorithm that addresses these two weaknesses is necessary to cluster temporal data.
This research describes various adaptations of PSO and DE algorithms for the purpose of clustering temporal data. The algorithms proposed aim to address the loss of diversity and outdated memory problems experienced by PSO and DE algorithms. These problems are addressed by combining approaches previously used for the purpose of dealing with temporal or dynamic data, such as repulsion and anti-convergence, with PSO and DE approaches used to cluster data. Six PSO algorithms are introduced in this research, namely the data clustering particle swarm optimisation (DCPSO), reinitialising data clustering particle swarm optimisation (RDCPSO), cooperative data clustering particle swarm optimisation (CDCPSO), multi-swarm data clustering particle swarm optimisation (MDCPSO), cooperative multi-swarm data clustering particle swarm optimisation (CMDCPSO), and elitist cooperative multi-swarm data clustering particle swarm optimisation (eCMDCPSO). Additionally, four DE algorithms are introduced, namely the data clustering differential evolution (DCDE), re-initialising data clustering differential evolution (RDCDE), dynamic data clustering differential evolution (DCDynDE), and cooperative dynamic data clustering differential evolution (CDCDynDE).
The PSO and DE algorithms introduced require prior knowledge of the total number of clusters in the dataset. The total number of clusters in a real-world dataset, however, is not always known. For this reason, the best performing PSO and best performing DE are compared. The CDCDynDE is selected as the winning algorithm, which is then adapted to determine the optimal number of clusters dynamically. The resulting algorithm is the k-independent cooperative data clustering differential evolution (KCDCDynDE) algorithm, which was compared against the local network neighbourhood artificial immune system (LNNAIS) algorithm, which is an artificial immune system (AIS) designed to cluster temporal data and determine the total number of clusters dynamically. It was determined that the KCDCDynDE performed the clustering task well for problems with frequently changing data, high dimensions, and pattern and cluster data migration types.
Stallmann, CF MSc
2015
University of Pretoria
Digital Audio Restoration of Gramophone Records Gramophones were the main audio recording medium for more than seven decades and regained widespread popularity over the past few years. Being an analogue storage medium, gramophone records are subject to distortions caused by scratches, dust particles, high temperatures, excessive playback and other noise induced by mishandling the record. Due to the early adoption of the compact disc and other digital audio mediums, most research to reduce the noise on gramophone records focused on physical improvements such as the enhancements of turntable hardware, amelioration of the record material or advances through better record cutting techniques. Comparatively little research has been conducted to digitally filter and reconstruct distorted gramophone recordings.
This thesis provides an extensive analysis on the digital detection and reconstruction of noise in gramophone audio signals distorted by scratches. The ability to approximate audio signals was examined through an empirical analysis of different polynomials and time series models. The investigated polynomials include the standard, Fourier, Newton, Lagrange, Hermite, osculating and piecewise polynomials. Experiments were also conducted by applying autoregressive, moving average and heteroskedasticity models, such as the AR, MA, ARMA, ARIMA, ARCH and GARCH models. In addition, different variants of an artificial neural network were tested and compared to the time series models. Noise detection was performed using algorithms based on the standard score, median absolute deviation, Mahalanobis distance, nearest neighbour, mean absolute spectral deviation and the absolute predictive deviation method. The reconstruction process employed the examined polynomials and models and also considered adjacent window, mirroring window, nearest neighbour, similarity, Lanczos and cosine interpolation.
The detection and reconstruction algorithms were benchmarked using a dataset of 800 songs from eight different genres. Simulations were conducted using artificially generated and real gramophone noise. The algorithms were compared according to their detection and reconstruction accuracy, the computational time needed and the tradeoff between the accuracy and time.
Empirical analysis showed that the highest noise detection accuracy was achieved with the absolute predictive deviation using an ARIMA model. The predictive outlier detector employing a Jordan simple recurrent artificial neural network was most efficient by achieving the best detection rate in a limited timespan. It was also found that the artificial neural networks reconstructed the audio signals more accurately than the other interpolation algorithms. The AR model was most efficient by achieving the best tradeoff between the execution time and the interpolation error.
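To give a flavour of the detectors compared above, the median absolute deviation approach flags a sample as noise when it deviates from the local median by more than a few robust standard deviations. A hedged sketch follows, with an illustrative window size and threshold rather than the thesis's tuned values.

```python
import numpy as np

def mad_outliers(signal, window=513, k=6.0):
    """Flag samples that deviate from the local median by more than k robust
    standard deviations, where 1.4826 * MAD estimates the standard
    deviation of Gaussian data; a simple per-sample sliding window."""
    flags = np.zeros(len(signal), dtype=bool)
    half = window // 2
    for i in range(len(signal)):
        seg = signal[max(0, i - half): i + half + 1]
        med = np.median(seg)
        mad = 1.4826 * np.median(np.abs(seg - med)) + 1e-12
        flags[i] = abs(signal[i] - med) > k * mad
    return flags
```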
Van Wyk, AB MSc
2015
University of Pretoria
An Analysis of Overfitting in Particle Swarm Optimized Neural Networks The phenomenon of overfitting, where a feed-forward neural network (FFNN) overtrains on training data at the cost of generalisation accuracy, is known to be specific to the training algorithm used. This study investigates overfitting within the context of particle swarm optimised (PSO) FFNNs. Two of the most widely used PSO algorithms are compared in terms of FFNN accuracy and a description of the overfitting behaviour is established. Each of the PSO components are in turn investigated to determine their effect on FFNN overfitting. A study of the maximum velocity (Vmax) parameter is performed and it is found that smaller Vmax values are optimal for FFNN training. The analysis is extended to the inertia and acceleration coefficient parameters, where it is shown that specific interactions among the parameters have a dominant effect on the resultant FFNN accuracy and may be used to reduce overfitting. Further, the significant effect of the swarm size on network accuracy is also shown, with a critical range being identified for the swarm size for effective training. The study is concluded with an investigation into the effect of the different activation functions. Given strong empirical evidence, a hypothesis is made stating that the gradient of the activation function significantly affects the convergence of the PSO. Lastly, the PSO is shown to be a very effective algorithm for the training of self-adaptive FFNNs, capable of learning from unscaled data.
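The parameters examined in this study all act in one place, the standard PSO update. A minimal sketch showing where Vmax, the inertia weight, and the acceleration coefficients enter; the default values shown are common choices from the PSO literature, not the study's tuned settings.

```python
import numpy as np

def pso_update(x, v, pbest, gbest, w=0.7298, c1=1.4962, c2=1.4962, v_max=1.0):
    """Standard PSO update showing the studied parameters: the inertia
    weight w, the acceleration coefficients c1 and c2, and Vmax, which
    clamps every velocity component to [-v_max, v_max]."""
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v = np.clip(v, -v_max, v_max)  # smaller Vmax values were found best for FFNN training
    return x + v, v
```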
Langeveld, J MSc
2015
University of Pretoria
Set-Based Particle Swarm Optimization Particle swarm optimization (PSO) algorithms have been successfully applied to discrete-valued optimization problems. However, in many cases the algorithms have been tailored specifically for the problem at hand. This study proposes a generic set-based particle swarm optimization algorithm, called SBPSO, for use on discrete-valued optimization problems that can be formulated as set-based problems. The performance of the SBPSO is then evaluated on two different discrete optimization problems: the multidimensional knapsack problem (MKP) and the feature selection problem (FSP) from machine learning. In both cases, the SBPSO is compared to three other discrete PSO algorithms from the literature. On the MKP, the SBPSO is shown to outperform, with statistical significance, the other algorithms. On the FSP and using a k-nearest neighbor classifier, the SBPSO is shown to outperform, with statistical significance, the other algorithms. When a Gaussian Naive Bayes or a J48 decision tree classifier is used, no algorithm could be shown to outperform the others on the FSP.
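To convey what a set-based update means, the following deliberately loose illustration treats positions as sets: elements present in the personal and neighbourhood bests attract, while a small exploration probability adds or removes other elements. This is only the flavour of the approach; the actual SBPSO defines velocities as sets of operation pairs with their own algebra.

```python
import random

def set_based_step(position, pbest, gbest, universe, beta=0.5, gamma=0.1):
    """A deliberately simplified set-based move (not the SBPSO operators):
    keep the current elements, add missing elements from the personal and
    neighbourhood bests with probability beta (attraction), and add or drop
    other elements with a small probability gamma (exploration)."""
    attract = (pbest | gbest) - position
    explore = universe - (position | pbest | gbest)
    new_pos = set(position)
    new_pos |= {e for e in attract if random.random() < beta}
    new_pos |= {e for e in explore if random.random() < gamma}
    new_pos -= {e for e in position if random.random() < gamma}
    return new_pos
```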
Klazar, R MSc
2016
University of Pretoria
Ant-Inspired Strategies for Opportunistic Load Balancing in the Distributed Computation of Solutions to Embarrassingly Parallel Problems Computational science is a practice that requires a large amount of computing time. One means of providing the required computing time is to construct a distributed computing system that utilises the ordinary desktop computers found within an organisation. However, when the constituent computers do not all perform computations at the same speed, the overall completion time of a project involving the execution of tasks by all of the computers in the system becomes dependent on the performance of the slowest computer in the network. This study proposes two ant-inspired algorithms for dynamic task allocation that aim to overcome the aforementioned dependency. A procedure for tuning the free parameters of the algorithms is specified and the algorithms are evaluated for their viability in terms of their effect on the overall completion time of tasks as well as their usage of bandwidth in the network.
Naidoo, T MSc
2017
University of Pretoria
Reconstruction and Analysis of Holographic Images for Lens-less Microscopy The digital in-line holographic configuration is motivated by the goal of developing a portable, cost-effective sensor system for pre-screening patient blood samples. The theory of holography is explained from the foundational concepts in scalar diffraction theory all the way through to the implementation of reconstruction algorithms. Methods for the enhancement of holographic reconstructions are described. The algorithms that perform an automated count of the reconstructed objects are described and demonstrated. Simulated and experimental results are provided. Together, the lens-free holographic microscopy of micro-sized particles along with the application of image processing techniques for the automated detection and counting of objects of interest, provide a component towards realising a sensor system that can be used for pre-screening patient blood samples.
Van Heerden, WS MSc
2017
University of Pretoria
Self-Organizing Feature Maps for Exploratory Data Analysis and Data Mining: A Practical Perspective The self-organizing feature map (SOM) is an unsupervised machine learning approach that offers extremely useful data modeling and visualization abilities. The approach has been successfully employed in order to solve a wide variety of problems. Exploratory data analysis (EDA) and data mining (DM) are both fields that attempt to algorithmically extract insightful knowledge from a data set, with a greater or lesser level of human assistance. This work overviews the SOM algorithm, with a focus on EDA and DM applications. Supervised and unsupervised neuron labeling methods for SOMs, which are an integral part of most SOM-based EDA and DM exercises, are investigated. Existing SOM-based EDA and DM approaches are also critically compared. A novel unsupervised neuron labeling approach, namely unsupervised weight-based cluster labeling, is introduced. This research also proposes HybridSOM, a novel hybrid framework for DM-oriented rule extraction, which combines a SOM with an arbitrary rule extraction algorithm. Two empirical investigations are reported. Firstly, existing supervised neuron labeling techniques are experimentally compared. Secondly, the HybridSOM framework is empirically compared to several existing DM rule extractors.
Leonard, BJ MSc
2017
University of Pretoria
Critical Analysis of Angle Modulated Particle Swarm Optimisers This dissertation presents an analysis of the angle modulated particle swarm optimisation (AMPSO) algorithm. AMPSO is a technique that enables one to solve binary optimisation problems with particle swarm optimisation (PSO), without any modifications to the PSO algorithm. While AMPSO has been successfully applied to a range of optimisation problems, there is little to no understanding of how and why the algorithm might fail. The work presented here includes in-depth theoretical and empirical analyses of the AMPSO algorithm in an attempt to understand it better. Where problems are identified, they are supported by theoretical and/or empirical evidence. Furthermore, suggestions are made as to how the identified issues could be overcome. In particular, the generating function is identified as the main cause for concern. The generating function in AMPSO is responsible for generating binary solutions. However, it is shown that the increasing frequency of the generating function hinders the algorithm’s ability to effectively exploit the search space. The problem is addressed by introducing methods to construct different generating functions, and to quantify the quality of arbitrary generating functions. In addition to this, a number of other problems are identified and addressed in various ways. The work concludes with an empirical analysis that aims to identify which of the various suggestions made throughout this dissertation hold substantial promise for further research.
Kyle Erwin MSc
2021
Stellenbosch University
Set-based Particle Swarm Optimization for Portfolio Optimization Portfolio optimization is a complex problem, not only in the depth of the topics it covers but also in breadth. It is the process of determining which assets to include in a portfolio while simultaneously maximizing profit and minimizing risk. Portfolio optimization is rich with interesting research not only by researchers in finance, but also in computer science. The overlap of these fields has led to an increase in the use of meta-heuristics to make intelligent investment decisions. This thesis conducts a thorough investigation into the current state of evolutionary and swarm intelligence algorithms for portfolio optimization. The investigation showed that these algorithms suffer from stability issues for larger portfolio optimization problems. A new approach using set-based particle swarm optimization (SBPSO) is proposed to reduce the dimensionality, and therefore complexity, of portfolio optimization problems. The results show that SBPSO is capable of obtaining good-quality solutions while being relatively fast. New set-based diversity measures are developed in order to better understand the exploration and exploitation behaviour of SBPSO, and set-based algorithms in general. It is shown that SBPSO fails to converge to a single solution and uses an inadequate process to determine the contribution of each asset to the portfolio. Based on these findings, improvements are made to the proposed SBPSO approach that yield significant gains in performance. The first multi-objective adaptation of SBPSO is also developed and is shown to scale to larger portfolio problems better than the multi-guided particle swarm optimization (MGPSO) algorithm, with lower levels of risk.
Cian Steenkamp MSc
2021
Stellenbosch University
Multi-guide Particle Swarm Optimization for Many-objective Optimization Problems The scalability of the multi-guide particle swarm optimization (MGPSO) algorithm, with respect to the number of objectives for a problem, is investigated. Two MGPSO algorithm adaptations are proposed; that is, the partial-dominance multi-guide particle swarm optimization (PMGPSO) algorithm and the knee-point driven multi-guide particle swarm optimization (KnMGPSO) algorithm. As a sub-objective, the effect of different archive balance coefficient update strategies for the MGPSO, the PMGPSO, and the KnMGPSO algorithms is investigated. The proposed algorithms attempt to address the scalability limitations associated with a certain component of the MGPSO algorithm. This study does not consider scalability with respect to the number of decision variables. This study assumes a static search space; that is, where the number of objectives remains fixed throughout the optimization. This study also assumes that each objective remains static throughout the search process. This study also considers only problems with boundary constraints. The results indicate that the MGPSO algorithm scaled to many-objectives competitively compared to other state-of-the-art many-objective optimization algorithms. The results were unexpected because the MGPSO algorithm uses the Pareto-dominance relation, which is known to degrade as the number of objectives increases. The proposed PMGPSO and KnMGPSO algorithms also scaled competitively, however, these algorithms were not superior to the MGPSO algorithm. The investigated dynamic archive balance coefficient update strategies did not improve the performance of the MGPSO, the PMGPSO, or the KnMGPSO algorithms.
James Faure MEng
2023
Stellenbosch University
An Intelligent System for X-ray Analysis and Patient Diagnosis In South African public health, orthodontic radiology has deficiencies in both capacity and expertise. A patient arrives at a hospital to receive treatment for oral anomalies and diseases. After waiting in a long queue, the patient receives an X-ray of their oral area. Between screening and a clinic doctor, a maxillofacial radiologist is tasked to perform a diagnostic procedure where oral anomalies and diseases are identified. A report is then sent to the dentist so the patient can be prescribed treatment. In South Africa, the radiologist is overloaded with too many X-rays to be analysed. A lack of diagnosis or a misdiagnosis is often fatal for the patient. The goal of the thesis is to demonstrate medical imaging diagnosis, using a dental X-ray dataset.
Deep learning is a subfield of artificial intelligence that uses complex architectures to solve large, complex problems. Computer vision is a process of deriving information from visual inputs. Deep learning has accelerated computer vision by analysing image datasets to identify predictable features. The number of class labels that a deep learning model can predict is dependent on the number of images that are annotated. Annotation of large datasets is time-consuming and complex, especially in the cases where domain experts are needed. In the case of dentistry, over 50 tooth anomalies and hundreds of diseases are detectable in the oral area. To create one model to predict all the classes would be infeasible from a human resource perspective due to the extensive labelling process. It is more efficient to train a variety of smaller models, with each model specialized in a limited number of related diseases or anomalies. The models are then engineered into a multi-component system that is used to successfully diagnose a patient. The research field is referred to as machine learning engineering, where development is focused on the system as a whole rather than deep learning model architecture alone. The researcher must use engineering principles to account for hardware, software development, machine learning model training and deployment, and computer vision techniques.
The results of the thesis provide evidence that deep learning can successfully be used to make predictions on dental X-rays. The problem of scarce data is solved by grouping a minimum number of classes per model and hence training several different models. The models are then deployed in series and parallel using a queuing system for communication. The trained deep learning models are deployed into a highly scalable system that can successfully diagnose an X-ray in a quicker time than a radiologist.
Refiloe Shabe MEng
2023
Stellenbosch University
Incremental Reinforcement Learning for Dynamic Portfolio Optimisation Portfolio optimisation is a decision-making problem that involves allocation of a certain fund across different financial assets, with the objective of maximising profit and minimising risk, simultaneously. Portfolio optimisation is a difficult problem to analyse. There is a wide range of research on various portfolio optimisation approaches in finance and computational intelligence. The two fields overlap. Thus, the use of meta-heuristics to make intelligent investment decisions is a result of the intersection of finance and computational intelligence. Meta-heuristic approaches have formulated the portfolio optimisation problem as a static optimisation problem and successfully obtained optimal portfolios. However, in the real world, investment decision-making is a dynamic problem that involves daily trading. Therefore, it is more representative of real-world investments to formulate the portfolio optimisation problem as a dynamic optimisation problem. This thesis explores a reinforcement learning approach to formulate a dynamic investment strategy. The concept of reinforcement learning has improved the development of multistage stochastic optimisation; a primary component in sequential portfolio optimisation. A recurrent form of a reinforcement learning algorithm called proximal policy optimisation (PPO), which allocates portfolios based on historic asset prices, is presented. The results provide conclusive support for the ability of PPO to identify good-quality portfolios. The results also show that the strategy becomes outdated over time as it fails to perform as well during the COVID-19 pandemic. Based on this finding, the recurrent PPO approach was improved in order to take into account the presence of concept drift caused by pandemics and potential financial contagions. The approach was adapted to incrementally learn the financial market as the portfolio optimisation process takes place. The incremental recurrent PPO algorithm is shown to be able to adapt to drastic changes in the market and obtain optimal portfolios.
Werner van der Merwe MEng
2023
Stellenbosch University
A Genetic Algorithm Based Model Tree Forest This thesis presents an ensemble approach that reduces the high variance error exhibited by model trees that comprise multivariate nonlinear models and increases the overall robustness of model trees. The ensemble approach is conceptualised, tuned for, and evaluated against competing regression ensemble models on ten separate benchmarking datasets. The ensemble, referred to as the model tree forest (MTF), incorporates a hybrid genetic algorithm approach to construct structurally optimal polynomial expressions (GASOPE) within the leaf nodes of greedily induced model trees that form the base learners of the ensemble. Bootstrap aggregation, together with the implementation of randomised feature space splits during tree induction, sufficiently decorrelates the base learners within the ensemble. Thereby, the variance error of MTF is reduced compared to that of a single model tree, whilst the favourable low bias error of model trees is retained. The multivariate nonlinear models that predict the output enable MTF to produce approximations of highly nonlinear data. The addition of ensembling methods passively combats overfitting brought forth by the increased model complexity, compared to the previous implementation of GASOPE within a tree structure, which exhibits overfitting in specific cases. MTF produced a similar predictive accuracy to the random forest method and outperformed an artificial feed-forward neural network ensemble, an ensemble of M5 model trees, and ensembled support vector regression models. However, the computational cost of the MTF induction algorithm is up to four orders of magnitude greater than that of the random forest.
JP van Zyl MSc
2023
Stellenbosch University
Rule Induction with Swarm Intelligence Rule induction is the process by which explainable mappings are created between a set of input data instances and a set of labels for the input instances. This process can be seen as an extension of traditional classification algorithms, because rule induction algorithms perform classification but have the added property of being transparent when making inferences. Popular algorithms in existing literature tend to use antiquated approaches to induce rule sets. The existing approaches tend to be greedy in nature and do not provide a platform for algorithm expansion or improvement.
This thesis investigates a new approach to rule induction using a set-based particle swarm optimisation algorithm. The investigation starts with a comprehensive review of the relevant literature, after which the novel algorithm is proposed and compared with popular rule induction algorithms.
After the establishment of the capabilities and validity of the set-based particle swarm optimisation rule induction algorithm, the effect of the objective function on the algorithm is investigated. The objective function is tested with 12 existing performance evaluation metrics in order to understand how the performance of the algorithm can be improved. These 12 existing metrics are then used as inspiration for the proposal of 11 new performance evaluation metrics which are also tested as part of the objective function effect analysis. The effect of varying distributions of the values of the target class is also examined.
This thesis also investigates the reformulation of the rule induction problem as a multi-objective optimisation problem and applies the newly developed multi-guide set-based particle swarm optimisation algorithm to the multi-objective formulation of rule induction. The performance of rule induction as a multi-objective problem is evaluated by examining how the trade-off between the defined objective functions affects performance for different datasets. The existing metrics and newly proposed metrics tested in the single-objective formulation of the rule induction problem are also tested in the multi-objective formulation.
RD Lang MSc
2024
Stellenbosch University
Landscape Analysis-based Automated Algorithm Selection The algorithm selection problem, which was developed by Rice, refers to the challenge of choosing the best algorithm available to solve a specific problem. The framework put forth by Rice includes four spaces: the problem space, the algorithm space, the characteristic space, and the performance space. In this thesis, the focus is on the problem space of continuous-valued single-objective, boundary-constrained optimisation problems. Landscape analysis, which is a set of techniques that utilises mathematical and statistical methods to characterise the properties of optimisation problems, can be used to describe the characteristic space. Selection of the most effective algorithm for optimisation problems is a complex task, because metaheuristics have varying strengths. Therefore, a data-driven approach that utilises landscape analysis techniques is employed in this thesis to create an automated algorithm selector.
This thesis proposes enhancements to the characteristic space by critically evaluating the reliability of the methods used in generating landscape analysis measures. Additionally, a new benchmark suite, which is more representative of the problem space than commonly used benchmark suites in the literature, is proposed. An investigation into the impact on the performance of hybrid metaheuristics created by combining the sampling algorithms used to calculate landscape analysis features with standard metaheuristics, which are used to solve the optimisation problem, is conducted. By combining the proposed improvements to the characteristic and problem spaces of the algorithm selection framework, this thesis concludes with a landscape analysis-based automated algorithm selection model that outperforms the current automated algorithm selection models found in the literature in terms of performance and generalisability. Furthermore, this thesis explores the use of automated machine learning and hybrid metaheuristics in automated algorithm selection.
B Strelitz MSc
2024
Stellenbosch University
Particle Swarm Optimization for Constrained Multimodal Function Optimization This thesis investigates the efficiency of particle swarm optimization (PSO) algorithms at finding many feasible global optima for constrained multimodal optimization problems. The proposed approach is the niching migratory multi-swarm optimizer with Deb's comparison criteria (NMMSO-DCC) algorithm. The NMMSO-DCC algorithm uses the same core architecture as the niching migratory multi-swarm optimization (NMMSO) algorithm, but uses Deb's comparison criteria as a constraint handling method. Deb's comparison criteria allow the NMMSO-DCC algorithm to find many feasible global optima for constrained multimodal optimization problems (CMMOPs), whereas the NMMSO algorithm was designed only to find global optima for boundary constrained multimodal optimization problems (MMOPs). The NMMSO algorithm is one of the state-of-the-art multimodal optimization algorithms, but cannot be used when constraints are placed on the objective function. Thus, the proposed algorithm addresses the inability of the NMMSO algorithm to solve constrained multimodal optimization problems. This study assumes that the objective function to be optimized remains static throughout the search process. This study also assumes that the constraints placed upon the objective function remain static during the search process. All benchmark problems in this study contain boundary constraints. The results indicate that the NMMSO-DCC performs competitively compared to other state-of-the-art constrained multimodal optimization algorithms. The results in terms of success rate are particularly convincing, whereas NMMSO-DCC struggled more with respect to the peak ratio. This means that although the NMMSO-DCC algorithm is able to locate all global optima within a given tolerance level in some of the independent runs, it struggles to do so consistently across multiple independent runs.
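Deb's comparison criteria, the constraint-handling rule grafted onto NMMSO here, reduce to three cases. A minimal sketch for a minimisation problem follows, where viol denotes the total constraint violation of a solution (zero means feasible).

```python
def deb_better(f_a, viol_a, f_b, viol_b):
    """Deb's comparison criteria (minimisation): a feasible solution beats
    an infeasible one; between feasible solutions the lower objective wins;
    between infeasible solutions the smaller constraint violation wins."""
    if viol_a == 0 and viol_b == 0:
        return f_a < f_b
    if viol_a == 0 or viol_b == 0:
        return viol_a == 0
    return viol_a < viol_b
```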

Doctoral Students


Student Information Degree Thesis Title Abstract
Van den Bergh, F PhD
2002
University of Pretoria
An Analysis of Particle Swarm Optimizers Many scientific, engineering and economic problems involve the optimisation of a set of parameters. These problems include examples like minimising the losses in a power grid by finding the optimal configuration of the components, or training a neural network to recognise images of people’s faces. Numerous optimisation algorithms have been proposed to solve these problems, with varying degrees of success. The Particle Swarm Optimiser (PSO) is a relatively new technique that has been empirically shown to perform well on many of these optimisation problems. This thesis presents a theoretical model that can be used to describe the long-term behaviour of the algorithm. An enhanced version of the Particle Swarm Optimiser is constructed and shown to have guaranteed convergence on local minima. This algorithm is extended further, resulting in an algorithm with guaranteed convergence on global minima. A model for constructing cooperative PSO algorithms is developed, resulting in the introduction of two new PSO-based algorithms. Empirical results are presented to support the theoretical properties predicted by the various models, using synthetic benchmark functions to investigate specific properties. The various PSO-based algorithms are then applied to the task of training neural networks, corroborating the results obtained on the synthetic benchmark functions.
Omran, M PhD
2005
University of Pretoria
Particle Swarm Optimization Methods for Pattern Recognition and Image Processing The objective of pattern recognition is to classify objects into different categories and classes. It is a fundamental component of artificial intelligence and computer vision. This thesis investigates the application of an efficient optimization method, known as Particle Swarm Optimization (PSO), to the field of pattern recognition and image processing. First a clustering method that is based on PSO is proposed. The application of the proposed clustering algorithm to the problem of unsupervised classification and segmentation of images is investigated. A new automatic image generation tool tailored specifically for the verification and comparison of various unsupervised image classification algorithms is then developed. A dynamic clustering algorithm which automatically determines the "optimum" number of clusters and simultaneously clusters the data set with minimal user interference is then developed. Finally, PSO-based approaches are proposed to tackle the color image quantization and spectral unmixing problems. In all the proposed approaches, the influence of PSO parameters on the performance of the proposed algorithms is evaluated.
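In PSO-based clustering of this kind, each particle typically encodes a complete set of cluster centroids and is scored by a quantization error. Below is a hedged sketch of that commonly used fitness, an illustration of the general approach rather than necessarily this thesis's exact formulation.

```python
import numpy as np

def quantization_error(centroids, data):
    """Fitness of a candidate clustering: assign every pattern to its
    nearest centroid, then average, over the non-empty clusters, the mean
    distance between each centroid and its assigned patterns.
    centroids has shape (K, d); data has shape (n, d)."""
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    per_cluster = [dists[assign == k, k].mean()
                   for k in range(len(centroids)) if (assign == k).any()]
    return float(np.mean(per_cluster))
```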
Rodic, D PhD
2005
University of Pretoria
Intelligent Distributed Agent Based Architecture This thesis presents work done on the development of a multi-agent system architecture that facilitates coordination and a novel social networks based approach to coordination. The field of multi-agent system research is undergoing tremendous expansion and it would be impossible to address all the issues related to the field. Instead, this thesis focuses on the coordination aspect of multi-agent systems. The architecture presented here is named the INtelligent Distributed Agent Based Architecture, INDABA.
INDABA, as a hybrid agent architecture, combines the subsymbolic knowledge representation layered architecture with a symbolic layer that allows for deliberative reasoning and learning. INDABA also introduces a layer that facilitates coordination in a society of agents, namely the interaction layer.
The new approach to coordination was inspired by social networks, as observed in higher mammalian societies. Two social relationships were explored, namely kinship and trust. Coordination is achieved through team selection. Using characteristics of social networks, such as learning and the ability to deal with uncertainties, the best team is selected for task execution.
The experiments conducted for the purpose of this thesis were done on three levels. Firstly, an abstract simulated environment was created where a society of a large number of agents could be observed. Secondly, experiments were done in a more realistic simulated robot environment. The last set of experiments was done in a real-world environment, with the implementation of INDABA in embodied mobile agents (robots). The experiments have confirmed the applicability of INDABA as an agent architecture, as well as the validity of the social networks coordination approach.
Khan, S PhD
2009
University of Pretoria
Design and Analysis of Iterative Heuristics for Topology Design of Distributed Local Area Networks Topology design of distributed local area networks (DLANs) can be classified as an NP-hard problem. Intelligent algorithms, such as evolutionary and swarm intelligence techniques, are candidate approaches to address this problem and to produce desirable solutions. DLAN topology design consists of several conflicting objectives such as minimization of cost, minimization of network delay, minimization of the number of hops between two nodes, and maximization of reliability. It is possible to combine these objectives in a single-objective function, provided that the trade-offs among these objectives are adhered to.
This thesis proposes a strategy and a new aggregation operator based on fuzzy logic to combine the four objectives in a single-objective function. The thesis also investigates the use of a number of evolutionary algorithms such as stochastic evolution, simulated evolution, and simulated annealing. A number of hybrid variants of the above algorithms are also proposed. Furthermore, the applicability of swarm intelligence techniques such as ant colony optimization and particle swarm optimization to topology design has been investigated. All proposed techniques have been evaluated empirically with respect to their algorithm parameters. Results suggest that simulated annealing produced the best results among all proposed algorithms. In addition, the hybrid variants of simulated annealing, simulated evolution, and stochastic evolution generated better results than their respective basic algorithms. Moreover, a comparison of ant colony optimization and particle swarm optimization shows that the latter generated better results than the former.
Graaff, A PhD
2011
University of Pretoria
A Local Network Neighbourhood Artificial Immune System As ever more information becomes available online and data becomes a permanent part of any business, the true value of the large amounts of stored data lies in the discovery of hidden and unknown relations, connections, or traits in the data. The acquisition of these hidden relations can influence strategic decisions which have an impact on the success of a business. Data clustering is one of many methods to partition data into different groups in such a way that data patterns within the same group share some common trait compared to patterns across different groups. This thesis proposes a new artificial immune model for the problem of data clustering. The new model is inspired by the network theory of immunology and differs from its network-based predecessor models in its formation of artificial lymphocyte networks. The proposed model is first applied to data clustering problems in stationary environments. Two different techniques are then proposed that enhance the proposed artificial immune model to dynamically determine the number of clusters in a data set with minimal to no user interference. A technique to generate synthetic data sets for data clustering in non-stationary environments is then proposed. Lastly, the original proposed artificial immune model and the enhanced version that dynamically determines the number of clusters are applied to the generated synthetic non-stationary data clustering problems. The influence of the parameters on clustering performance is investigated for all versions of the proposed artificial immune model and supported by empirical results and statistical hypothesis tests.
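The following heavily simplified Python sketch conveys the flavour of network-based artificial immune clustering (stimulation, cloning with mutation, and network suppression); the thresholds, rates, and update rules are illustrative assumptions and do not reproduce the proposed model.

import random, math

def dist(a, b):
    return math.dist(a, b)

def cluster(data, n_alcs=5, epochs=20, net_thresh=1.0, mut=0.3):
    # Initialise artificial lymphocytes (ALCs) on randomly chosen patterns.
    alcs = random.sample(data, n_alcs)
    for _ in range(epochs):
        for antigen in data:
            # The most stimulated ALC is cloned and mutated towards the antigen.
            best = min(alcs, key=lambda c: dist(c, antigen))
            clone = tuple(x + random.gauss(0, mut) * (y - x)
                          for x, y in zip(best, antigen))
            alcs.append(clone)
        # Network suppression: keep one representative of near-identical ALCs.
        kept = []
        for c in alcs:
            if all(dist(c, k) > net_thresh / 2 for k in kept):
                kept.append(c)
        alcs = kept
    # ALCs within the network threshold of each other form one network/cluster.
    return alcs

data = [(random.gauss(0, .3), random.gauss(0, .3)) for _ in range(30)] + \
       [(random.gauss(5, .3), random.gauss(5, .3)) for _ in range(30)]
print(len(cluster(data)))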
Schoeman, IL PhD
2011
University of Pretoria
Niching in Particle Swarm Optimization Optimization forms an intrinsic part of the design and implementation of modern systems, such as industrial systems, communication networks, and the configuration of electric or electronic components. Population-based single-solution optimization algorithms, such as particle swarm optimization (PSO), have been shown to perform well on problems where a number of optimal or suboptimal solutions exist. However, some applications require algorithms that locate all or most of these optimal and suboptimal solutions. Such algorithms are known as niching or speciation algorithms.
Several techniques have been proposed to extend the PSO paradigm so that multiple optima can be located and maintained within a convoluted search space. A significant number of these implementations are subswarm-based, that is, portions of the swarm are optimized separately. Niches are formed to contain these subswarms, a process that often requires user-specified parameters, as the sizes and placement of the niches are unknown. This thesis presents a new niching technique that uses the vector dot product of the social and cognitive direction vectors to determine niche boundaries. Thus, a separate niche radius is calculated for each niche, a process that requires minimal knowledge of the search space. This strategy differs from other techniques using niche radii, where a niche radius is either required to be set in advance, or calculated from the distances between particles.
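A minimal sketch of the dot-product idea is given below, assuming plain tuples for positions: if the cognitive direction (towards the personal best) and the social direction (towards the niche best) oppose each other, the particle is likely being pulled across a niche boundary.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def same_niche(position, personal_best, niche_best):
    cognitive = tuple(p - x for p, x in zip(personal_best, position))
    social = tuple(g - x for g, x in zip(niche_best, position))
    # A non-negative dot product suggests both components pull the particle
    # towards the same optimum, i.e. it belongs inside this niche.
    return dot(cognitive, social) >= 0

print(same_niche((0.0, 0.0), (1.0, 1.0), (2.0, 1.5)))    # True: aligned pulls
print(same_niche((0.0, 0.0), (1.0, 1.0), (-2.0, -1.5)))  # False: opposing pulls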
The development of the algorithm is traced and tested extensively using synthetic benchmark functions. Empirical results are reported using a variety of settings. An analysis of the algorithms is presented as well as a scalability study. The performance of the algorithm is also compared to that of the two other wll-known PSO niching algorithms. To conclude, the vector-based PSO is extended to locate and track multiple optima in dynamic environments.
Lutu, PEN PhD
2011
University of Pretoria
Dataset Selection for Aggregate Model Implementation in Predictive Data Mining This thesis addresses the problem of dataset selection for predictive data mining. Dataset selection was studied in the context of aggregate modeling for classification. The central argument of this thesis is that, for predictive data mining, it is possible to systematically select many dataset samples and employ different approaches (different from current practice) to feature selection, training dataset selection, and model construction. When a large amount of information in a large dataset is utilised in the modeling process, the resulting models will have a high level of predictive performance and should be more reliable. Aggregate classification models, also known as ensemble classifiers, have been shown to provide a high level of predictive accuracy on small datasets. Such models are known to achieve a reduction in the bias and variance components of the prediction error of a model. The research for this thesis was aimed at the design of aggregate models and the selection of training datasets from large amounts of available data. The objectives for the model design and dataset selection were to reduce the bias and variance components of the prediction error for the aggregate models.
Large datasets obtained from the UCI KDD Archive were used in the experiments. Two classification algorithms, See5 for classification tree modeling and K-Nearest Neighbour, were used in the experiments. The two methods of aggregate modeling that were studied are One-Vs-All (OVA) and positive-Vs-negative (pVn) modeling. While OVA is an existing method that has been used for small datasets, pVn is a new method of aggregate modeling proposed in this thesis. Methods for feature selection from large datasets, and methods for training dataset selection from large datasets, for OVA and pVn aggregate modeling, were studied.
The feature selection experiments revealed that the use of many samples, robust measures of correlation, and validation procedures results in the reliable selection of relevant features for classification. A new algorithm for feature subset search, based on the decision rule-based approach to heuristic search, was designed, and the performance of this algorithm was compared to two existing algorithms for feature subset search. The experimental results revealed that the new algorithm makes better decisions for feature subset search. The information provided by a confusion matrix was used as a basis for the design of OVA and pVn base models which are combined into one aggregate model. A new construct called a confusion graph was used in conjunction with new algorithms for the design of pVn base models. A new algorithm for combining base model predictions and resolving conflicting predictions was designed and implemented. Experiments to study the performance of the OVA and pVn aggregate models revealed that the aggregate models provide a high level of predictive accuracy compared to single models. Finally, theoretical models to depict the relationships between the factors that influence feature selection and training dataset selection for aggregate models are proposed, based on the experimental results.
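As an illustration of the OVA decomposition only (the pVn models and the thesis's conflict-resolution algorithm are not reproduced), the following sketch uses scikit-learn's standard one-vs-rest wrapper around a k-nearest-neighbour base model.

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One binary k-NN base model is trained per class; predictions are combined
# by taking the class whose base model is most confident.
ova = OneVsRestClassifier(KNeighborsClassifier(n_neighbors=5)).fit(X_tr, y_tr)
print(ova.score(X_te, y_te))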
Constantinou, D PhD
2011
University of Pretoria
Ant Colony Optimisation Algorithms for Solving Multi-Objective Power-Aware Metrics for Mobile Ad Hoc Networks A mobile ad hoc network (MANET) is an infrastructure-less multi-hop network where each node communicates with other nodes directly or indirectly through intermediate nodes. Thus, all nodes in a MANET basically function as mobile routers participating in some routing protocol required for deciding on and maintaining routes. Since MANETs are infrastructure-less, self-organizing, rapidly deployable wireless networks, they are highly suitable for applications such as military tactical operations, search and rescue missions, disaster relief operations, and target tracking. Building such ad hoc networks poses a significant technical challenge because of energy constraints, specifically in relation to the application of wireless network protocols. As a result of its highly dynamic and distributed nature, the routing layer within the wireless network protocol stack presents one of the key technical challenges in MANETs. In particular, energy-efficient routing may be the most important design criterion for MANETs, since mobile nodes are powered by batteries with limited capacity and variable recharge frequency, according to application demand. In order to conserve power, it is essential that a routing protocol be designed to guarantee data delivery even when most of the nodes are asleep and not forwarding packets to other nodes. Load distribution constitutes another important approach to the optimisation of active communication energy. Load distribution enables the maximisation of the network lifetime by facilitating the avoidance of over-utilised nodes when a route is in the process of being selected.
Routing algorithms for mobile networks that attempt to optimise routes while retaining a small message overhead and maximising the network lifetime have been put forward. However, certain of these routing protocols have proved to have a negative impact on node and network lifetimes by inadvertently over-utilising the energy resources of a small set of nodes in favour of others. The conservation of power and careful sharing of the cost of routing packets would ensure an increase in both node and network lifetimes.
This thesis proposes an ant colony optimisation (ACO) approach to simultaneously optimise five power-aware metrics that result in energy-efficient routes, and to maximise the MANET’s lifetime, while taking into consideration a realistic mobility model. By using ACO algorithms, a set of optimal solutions – the Pareto-optimal set – is found. This thesis proposes five algorithms to solve the multi-objective problem in the routing domain. The first two algorithms, namely the energy efficiency for a mobile network using a multi-objective, ant colony optimisation, multi-pheromone (EEMACOMP) algorithm and the energy efficiency for a mobile network using a multi-objective, ant colony optimisation, multi-heuristic (EEMACOMH) algorithm, are both adaptations of multi-objective ant colony optimisation (MOACO) algorithms based on the ant colony system (ACS) algorithm. The new algorithms are constructive, which means that in every iteration every ant builds a complete solution. In order to guide the transition from one state to another, the algorithms use pheromone and heuristic information. The last algorithm implemented, namely the energy efficiency for a mobile network using a multi-objective, ant colony optimisation, multi-colony (EEMACOMC) algorithm, uses a multiple colony ACO algorithm.
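For illustration, the following sketch shows the pheromone-and-heuristic transition rule on which such ACO routing algorithms are built; multi-pheromone and multi-heuristic variants keep one such table per objective. The alpha and beta values and the energy-based heuristic below are assumptions.

import random

def next_hop(neighbours, pheromone, heuristic, alpha=1.0, beta=2.0):
    # Probability of moving to a neighbour is proportional to
    # pheromone^alpha * heuristic^beta (roulette-wheel selection).
    scores = [(pheromone[n] ** alpha) * (heuristic[n] ** beta)
              for n in neighbours]
    total = sum(scores)
    r, acc = random.uniform(0, total), 0.0
    for n, s in zip(neighbours, scores):
        acc += s
        if r <= acc:
            return n
    return neighbours[-1]

neighbours = ["B", "C", "D"]
pheromone = {"B": 0.5, "C": 1.2, "D": 0.8}   # learned desirability
heuristic = {"B": 0.9, "C": 0.4, "D": 0.7}   # e.g. residual node energy
print(next_hop(neighbours, pheromone, heuristic))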
Du Plessis, MC PhD
2012
University of Pretoria
Adaptive Multi-Population Differential Evolution for Dynamic Environments Dynamic optimisation problems are problems where the search space does not remain constant over time. Evolutionary algorithms aimed at static optimisation problems often fail to effectively optimise dynamic problems. The main reason for this is that the algorithms converge to a single optimum in the search space, and then lack the necessary diversity to locate new optima once the environment changes.
Many approaches to adapting traditional evolutionary algorithms to dynamic environments are available in the literature, but differential evolution (DE) has been investigated as a base algorithm by only a few researchers. This thesis reports on adaptations of existing DE-based optimisation algorithms for dynamic environments.
A novel approach, which evolves DE sub-populations based on performance in order to discover optima in a dynamic environment earlier, is proposed. It is shown that this approach reduces the average error in a wide range of benchmark instances.
A second approach, which is shown to improve the location of individual optima in the search space, is combined with the first approach to form a new DE-based algorithm for dynamic optimisation problems. The algorithm is further adapted to dynamically spawn and remove sub-populations, which is shown to be an effective strategy on benchmark problems where the number of optima is unknown or fluctuates over time. Finally, approaches to self-adapting DE control parameters are incorporated into the newly created algorithms. Experimental evidence is presented to show that, apart from reducing the number of parameters to fine-tune, a benefit in terms of lower error values is found when employing self-adaptive control parameters.
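For reference, a basic DE/rand/1/bin generation step, the building block such dynamic multi-population variants extend, is sketched below; the sub-population management and re-diversification logic of the thesis is omitted, and the parameter values are illustrative.

import random

def de_step(pop, fitness, F=0.5, CR=0.9):
    # One generation of DE/rand/1/bin for minimisation.
    new_pop = []
    for i, x in enumerate(pop):
        a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
        j_rand = random.randrange(len(x))  # guarantee at least one mutated gene
        trial = [a[k] + F * (b[k] - c[k])
                 if (random.random() < CR or k == j_rand) else x[k]
                 for k in range(len(x))]
        new_pop.append(trial if fitness(trial) <= fitness(x) else x)
    return new_pop

sphere = lambda x: sum(v * v for v in x)
pop = [[random.uniform(-5, 5) for _ in range(10)] for _ in range(20)]
for _ in range(100):
    pop = de_step(pop, sphere)
print(min(sphere(x) for x in pop))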
Helbig, M PhD
2012
University of Pretoria
Solving Dynamic Multi-Objective Optimisation Problems using Vector Evaluated Particle Swarm Optimisation Most optimisation problems in everyday life are not static in nature, have multiple objectives, and have at least two objectives that are in conflict with one another. However, most research focusses on either static multi-objective optimisation (MOO) or dynamic single-objective optimisation (DSOO). Furthermore, most research on dynamic multi-objective optimisation (DMOO) focusses on evolutionary algorithms (EAs), and only a few particle swarm optimisation (PSO) algorithms exist. This thesis proposes a multi-swarm PSO algorithm, dynamic Vector Evaluated Particle Swarm Optimisation (DVEPSO), to solve dynamic multi-objective optimisation problems (DMOOPs). In order to determine whether an algorithm solves DMOOPs efficiently, functions are required that resemble real-world DMOOPs, called benchmark functions, as well as functions that quantify the performance of the algorithm, called performance measures. However, one major problem in the field of DMOO is a lack of standard benchmark functions and performance measures. To address this problem, an overview of the current literature is provided, and shortcomings of current DMOO benchmark functions and performance measures are discussed. In addition, new DMOOPs are introduced to address the identified shortcomings of current benchmark functions. The optimisation process of DVEPSO is directed by guides; therefore, various guide update approaches are investigated. Furthermore, a sensitivity analysis of DVEPSO is conducted to determine the influence of various parameters on the performance of DVEPSO. The investigated parameters include approaches to manage boundary constraint violations, approaches to share knowledge between the sub-swarms, and responses to changes in the environment that are applied to either the particles of the sub-swarms or the non-dominated solutions stored in the archive. From these experiments the best DVEPSO configuration is determined and compared against four state-of-the-art DMOO algorithms.
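A compact sketch of the underlying vector evaluated PSO idea follows: each sub-swarm optimises one objective, and the social guide of a sub-swarm is drawn from another sub-swarm's best solution, so knowledge is shared across objectives. The parameter values are illustrative, and the change-detection and response mechanisms of DVEPSO are omitted.

import random

def pso_step(swarm, guide, f, w=0.7, c1=1.4, c2=1.4):
    # Standard PSO update, with the social component pulled from 'guide'.
    for p in swarm:
        for k in range(len(p["x"])):
            p["v"][k] = (w * p["v"][k]
                         + c1 * random.random() * (p["pbest"][k] - p["x"][k])
                         + c2 * random.random() * (guide[k] - p["x"][k]))
            p["x"][k] += p["v"][k]
        if f(p["x"]) < f(p["pbest"]):
            p["pbest"] = p["x"][:]

def make_swarm(n, dim):
    return [{"x": [random.uniform(-5, 5) for _ in range(dim)],
             "v": [0.0] * dim, "pbest": None} for _ in range(n)]

f1 = lambda x: sum(v * v for v in x)            # objective 1
f2 = lambda x: sum((v - 2) ** 2 for v in x)     # objective 2
s1, s2 = make_swarm(10, 2), make_swarm(10, 2)
for p in s1 + s2:
    p["pbest"] = p["x"][:]
for _ in range(50):
    g1 = min(s1, key=lambda p: f1(p["pbest"]))["pbest"]
    g2 = min(s2, key=lambda p: f2(p["pbest"]))["pbest"]
    pso_step(s1, g2, f1)   # swarm 1 guided by swarm 2's best
    pso_step(s2, g1, f2)   # swarm 2 guided by swarm 1's best
print(g1, g2)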
Malan, KM PhD
2014
University of Pretoria
Characterising Continuous Optimisation Problems for Particle Swarm Optimisation Performance Prediction Real-world optimisation problems are often very complex. Population-based metaheuristics, such as evolutionary algorithms and particle swarm optimisation (PSO) algorithms, have been successful in solving many of these problems, but it is well known that they sometimes fail. Over the last few decades the focus of research in the field has been largely on the algorithmic side with relatively little attention being paid to the study of the problems. Questions such as ‘Which algorithm will most accurately solve my problem?’ or ‘Which algorithm will most quickly produce a reasonable answer to my problem?’ remain unanswered.
This thesis contributes to the understanding of optimisation problems and what makes them hard for algorithms, in particular PSO algorithms. Fitness landscape analysis techniques are developed to characterise continuous optimisation problems, and it is shown that this characterisation can be used to predict PSO failure. An essential feature of this approach is that multiple problem characteristics are analysed together, moving away from the idea of a single measure of problem hardness. The resulting prediction models not only lead to a better understanding of the algorithms themselves, but also take the field a step closer towards the goal of informed decision-making, where the most appropriate algorithm is chosen to solve any new complex problem.
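As an example of the kind of problem characteristic such prediction models consume, the sketch below computes fitness distance correlation (Jones and Forrest, 1995), one widely used landscape measure; the thesis combines multiple measures rather than relying on any single one.

import math, random

def fdc(samples, fitnesses, optimum):
    # Pearson correlation between fitness and distance to the known optimum.
    d = [math.dist(s, optimum) for s in samples]
    n = len(d)
    mf, md = sum(fitnesses) / n, sum(d) / n
    cov = sum((f - mf) * (x - md) for f, x in zip(fitnesses, d)) / n
    sf = math.sqrt(sum((f - mf) ** 2 for f in fitnesses) / n)
    sd = math.sqrt(sum((x - md) ** 2 for x in d) / n)
    return cov / (sf * sd)

sphere = lambda x: sum(v * v for v in x)
samples = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(500)]
# For minimisation, FDC near 1 indicates fitness decreases towards the
# optimum, i.e. an 'easy' landscape for distance-guided search.
print(fdc(samples, [sphere(s) for s in samples], optimum=[0.0, 0.0]))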
Grobler, J PhD(Eng)
2015
University of Pretoria
The Heterogeneous Meta-hyper-heuristic: From Low Level Heuristics to Low Level Meta-heuristics Meta-heuristics have already been used extensively for the successful solution of a wide range of real-world problems. A few industrial engineering examples include inventory optimization, production scheduling, and vehicle routing, all areas which have a significant impact on the economic success of society. Unfortunately, it is not always easy to predict which meta-heuristic, from the multitude of algorithms available, will be best suited to a specific problem. Furthermore, many algorithm development options exist with regard to operator selection and parameter setting. Within this context, the idea of working towards a higher level of automation in algorithm design was born. Hyper-heuristics promote the design of more generally applicable search methodologies and tend to focus on performing relatively well on a large set of different types of problems.
This thesis develops a heterogeneous meta-hyper-heuristic algorithm (HMHH) for single-objective unconstrained continuous optimization problems. The algorithm development process focused on investigating the use of meta-heuristics as low level heuristics in a hyper-heuristic context. This strategy is in stark contrast to the problem-specific low level heuristics traditionally employed in a hyper-heuristic framework. Alternative low level meta-heuristics, entity-to-algorithm allocation strategies, and strategies for incorporating local search into the HMHH algorithm were evaluated to obtain an algorithm which performs well against both its constituent low level meta-heuristics and four state-of-the-art multi-method algorithms.
Finally, the impact of diversity management on the HMHH algorithm was investigated. Hyper-heuristics lend themselves to two types of diversity management, namely solution space diversity (SSD) management and heuristic space diversity (HSD) management. The concept of heuristic space diversity was introduced and a quantitative metric was defined to measure heuristic space diversity. An empirical evaluation of various solution space diversity and heuristic space diversity intervention mechanisms showed that the systematic control of heuristic space diversity has a significant impact on hyper-heuristic performance.
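An illustrative, entropy-based reading of heuristic space diversity is sketched below: how evenly the entities (candidate solutions) are currently allocated across the low level meta-heuristics. The metric defined in the thesis may differ; this sketch only conveys the concept.

import math
from collections import Counter

def heuristic_space_diversity(allocation, n_heuristics):
    # Normalised entropy of the entity-to-heuristic allocation:
    # 0 means all entities use one heuristic, 1 means a uniform spread.
    counts = Counter(allocation)
    n = len(allocation)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n_heuristics)

# 12 entities allocated across 4 low level meta-heuristics.
print(heuristic_space_diversity([0, 0, 1, 2, 3, 3, 3, 1, 2, 0, 1, 2], 4))
print(heuristic_space_diversity([0] * 12, 4))  # all entities on one heuristic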
Cleghorn, CW PhD
2017
University of Pretoria
Particle Swarm Optimization: Empirical and Theoretical Stability Analysis Particle swarm optimization (PSO) is a well-known stochastic population-based search algorithm, originally developed by Kennedy and Eberhart in 1995. Given PSO’s success at solving numerous real-world problems, a large number of PSO variants have been proposed. However, unlike the original PSO, most variants currently have little to no existing theoretical results. This lack of a theoretical underpinning makes it difficult, if not impossible, for practitioners to make informed decisions about the algorithmic setup. This thesis focuses on the criteria needed for particle stability or, as it is often referred to, particle convergence.
While new PSO variants are proposed at a rapid rate, the theoretical analysis often takes substantially longer to emerge, if at all. In some situations the theoretical analysis is not performed, as the mathematical models needed to represent the PSO variants become too complex or contain intractable subproblems. It is for this reason that a rapid means of determining approximate stability criteria, one that does not require complex mathematical modeling, is needed. This thesis presents an empirical approach for determining the stability criteria of PSO variants. The approach is designed to provide a real-world depiction of particle stability by imposing no simplifying assumptions on the underlying PSO variant being investigated. This approach is utilized to identify a number of previously unknown stability criteria. This thesis also contains novel theoretical derivations of the stability criteria for both the fully informed PSO and the unified PSO. The theoretical models are then empirically validated utilizing the aforementioned empirical approach in an assumption-free context.
The thesis closes with a substantial theoretical extension of current PSO stability research. It is common practice within existing theoretical PSO research to assume that, in the simplest case, the personal and neighborhood best positions are stagnant. However, in this thesis, stability criteria are derived under a mathematical model whereby the personal best and neighborhood best positions are treated as convergent sequences of random variables. It is also proved that, in order to derive stability criteria, no weaker assumption on the behavior of the personal and neighborhood best positions can be made. The theoretical extension presented caters for a large range of PSO variants.
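For context, the sketch below checks the widely cited order-2 stability region for the canonical PSO derived under stagnation assumptions (Poli, 2009): with inertia weight w and c = c1 + c2, particle variance converges when |w| < 1 and 0 < c < 24(1 - w^2)/(7 - 5w). The thesis derives analogous criteria for the fully informed and unified PSO and relaxes the stagnation assumption.

def order2_stable(w, c1, c2):
    # Order-2 (variance) stability region for the canonical PSO under
    # stagnation assumptions (Poli, 2009).
    c = c1 + c2
    return abs(w) < 1 and 0 < c < 24 * (1 - w ** 2) / (7 - 5 * w)

print(order2_stable(0.7298, 1.49618, 1.49618))  # classic parameters: stable
print(order2_stable(0.9, 2.0, 2.0))             # too aggressive: unstable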
Gary Pampara PhD (Industrial Engineering)
2021
Stellenbosch University
Particle Swarm Optimisation for Dynamically Constrained, Dynamic Optimisation Problems Real-world problems are usually time-dependent, changing as time passes, and it is therefore important to be able to optimise problems that change over time. A natural extension of this class is problems that change over time but may or may not have valid solutions at any given moment, once other factors of the optimisation problem are considered. Relating this class of dynamic, constrained optimisation problems back to the real world invites a comparison with the stock market. Stocks are bought and sold on the stock market every day, and at far greater velocities than a few decades ago. When a prospector attempts to acquire stocks, a number of constraints are immediately present. Firstly, the prospector may only have a fixed budget from which to buy stocks. Secondly, the stock market is a dynamic environment, with the prices of stocks fluctuating frequently. Lastly, constraints exist on stock purchasing in the stock market itself, such as a limited supply of stocks. When the prospector attempts to purchase stocks, a balance must be struck between the volatility of the dynamic environment (influencing the number of stocks the prospector can afford to purchase) and the actual availability of stocks (which constrains the number of any stock that may be purchased). Dynamic constrained optimisation problems capture exactly this combination of problem space changes and possible constraint changes.
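A toy example in the spirit of the stock analogy is sketched below: both the objective (prices) and the constraints (budget relative to fluctuating cost, limited supply) depend on time t, so the feasibility of a fixed portfolio must be re-checked as t advances. All functions and values are illustrative assumptions.

import math

def price(stock, t):
    # Fluctuating prices make the problem space change over time.
    return 10 + 3 * math.sin(0.1 * t + stock)

def feasible(portfolio, t, budget=80, supply=8):
    # Constraints also change over time: the same portfolio may drift in
    # and out of feasibility as prices move.
    cost = sum(qty * price(s, t) for s, qty in portfolio.items())
    return cost <= budget and all(q <= supply for q in portfolio.values())

portfolio = {0: 4, 1: 3}
for t in (0, 20, 40):
    print(t, feasible(portfolio, t))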
Taiwo Omomule PhD (Computer Science)
2022
Stellenbosch University
Heterogeneous Mixtures of Experts This research considers the problem highlighted by the no-free-lunch theorem, which states that no single machine learning algorithm performs best on all problems, due to algorithms having different inductive biases. A further problem is that combinations of experts of the same type, referred to as mixtures of homogeneous experts, do not capitalize on the different inductive biases of different machine learning algorithms. Research has shown that mixtures of homogeneous experts deliver improved accuracy compared to that of the base experts in the mixture. However, the predictive power of a homogeneous mixture of experts is still limited by the inductive bias of the algorithm that makes up the mixture. Therefore, this research proposes the development of mixtures of heterogeneous experts, combining different machine learning algorithms to take advantage of their respective strengths and to reduce the adverse effects of their individual inductive biases.
A set of different machine learning algorithms is selected to develop four different types of mixtures of experts. Empirical analyses are performed using nonparametric statistical tests to compare the generalization performance of the ensembles. The comparison investigates the performance of the homogeneous and heterogeneous ensembles in a number of modelling studies on a set of classification and regression problems, using selected performance measures. The problems represent varying levels of complexity and varying characteristics, in order to determine for which characteristics and complexities the heterogeneous ensembles outperform homogeneous ensembles. For classification problems, the empirical results across six modelling studies indicate that heterogeneous ensembles generate improved predictive performance compared to the developed homogeneous ensembles, by taking advantage of the different inductive biases of the different base experts in the ensembles. Specifically, the heterogeneous ensembles developed using different machine learning algorithms, with the same and different configurations, showed superiority over the other heterogeneous ensembles and the homogeneous ensembles developed in this study. These ensembles achieved the best and second-best overall accuracy rank across the classification datasets in each modelling study.
For regression problems, the heterogeneous ensembles outperformed the homogeneous ensembles across five modelling studies, although a random forest algorithm achieved competitive generalization performance compared to that of the heterogeneous ensembles. Based on the average ranks, the heterogeneous ensembles developed using different machine learning algorithms, where the base members consist of the same and different configurations, still produced better predictive performance than a number of heterogeneous ensembles and homogeneous ensembles across the modelling studies.
Therefore, the implementation of a mixture of heterogeneous experts removes the need for the computationally expensive process of finding the best performing homogeneous ensemble. The heterogeneous ensembles of different machine learning algorithms are consistently the most or one of the most accurate ensembles across all classification and regression problems. This is attributed to the advantage of capitalizing on the inductive biases of the different machine learning algorithms and the different configurations of the base members in the ensembles.
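A small illustration of the idea using standard scikit-learn components follows: base experts with different inductive biases are combined by soft voting. The ensembles studied in the thesis are larger, include regression counterparts, and use their own combination schemes, none of which are reproduced here.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
mixture = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=5000)),
                ("knn", KNeighborsClassifier()),
                ("tree", DecisionTreeClassifier(max_depth=5))],
    voting="soft")  # average the experts' predicted class probabilities
print(cross_val_score(mixture, X, y, cv=5).mean())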
Werner Mostert PhD (Computer Science)
2023
Stellenbosch University
Landscape Aware Algorithm Selection for Feature Selection Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms is common within the field; however, relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide which feature selection algorithm to use for a specific instance of a machine learning problem.
This thesis contributes towards a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is empirically shown, emphasising the potential value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a novel performance metric for feature selection algorithms is presented. The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problem instances. The insights obtained via landscape analysis are used with other meta-features of datasets and the BFI performance measure to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides a human-interpretable predictive model of the best feature selection algorithm for a specific dataset and classification problem.
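A schematic sketch of such a selector is given below: a meta-model maps dataset meta-features (including landscape measures) to the feature selection algorithm with the best observed BFI. The meta-features, labels, and random data are placeholders, not results from the thesis.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Rows: problem instances; columns: meta-features such as landscape measures.
meta_features = rng.random((40, 6))
# Labels: index of the best-performing feature selection algorithm per
# instance, as determined by a performance metric such as BFI.
best_algorithm = rng.integers(0, 3, size=40)

selector = RandomForestClassifier(random_state=0).fit(meta_features, best_algorithm)
new_instance = rng.random((1, 6))
print("recommended FS algorithm:", selector.predict(new_instance)[0])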