Current Students


The following are postgraduate students currently under my supervision or co-supervision.

Project Students


Computer Science Honours

Student Project Title Abstract
Kian Anderson Bone fracture prediction from limited X-ray images
Benjamin Botes Active learning in neural network ensembles
Banjamin Grunewald A music score and tutor app
Kadhan Olivier Automated identification of leg implants from X-ray images
Ben van Duivenbooden Knowledge-based decision tree induction using genetic programming

Engineering Skripsies

Student Project Title Abstract

Masters Students


Computer Science

Student Thesis Title Abstract
Martin Brink Constrained Set-based Particle Swarm Optimization This research will develop approaches to allow the set-based particle swarm optimization algorithm to be applied to solve constrained set-based combinatorial optimization problems.
Heinrich Cilliers Adaptive Gaussian Mixture Models A Gaussian mixture model (GMM) is used in unsupervised learning to represent clusters in a dataset as a mixture of Gaussian distributions. GMMs are usually fitted using the Expectation-Maximization (EM) algorithm, which is prone to yielding sub-optimal solutions. Additionally, the EM algorithm fits GMMs to stationary data and requires the number of clusters to be specified beforehand. This study aims to propose, evaluate and compare various approaches to fitting a GMM to stationary and non-stationary data, as well as dynamically determining the optimal number of Gaussians using particle swarm optimization.
Phiwokuhle Dube Active Learning and Mini-batch Learning as Dynamic Optimization Problems Particle swarm optimization algorithms will be developed to train neural networks where the training set changes during training as a consequence of active learning and mini-batch sampling.
Chimbioputo Fabiano Set-based PSO for Community Detection in Social Networks This work will develop a set-based particle swarm optimization algorithm to detect communities in social networks.
Ignazio Ferreira Neural Network Ensembles and Concept Drift This research develops an approach to train a neural network ensemble under the presence of concept drift. Particle swarm optimization algorithms developed for solving dynamic optimization problems will be used to train each member of the ensemble and to adapt learned decision boundaries as concept drift is experienced. A multi-modal particle swarm optimization algorithm will be developed to ensure that ensemble members are situated on different local minima of the neural network landscape. Different mechanisms to ensure diversity in ensemble member decision making will also be investigated.
Mnkandla Fidelis Training Support Vector Machines under Concept Drift Particle swarm optimization algorithms will be developed and analyzed to train support vector machines under the presence of concept drift.
Lauren Hayward TBD
Christoff Jordaan Incremental Feature Learning Incremental feature learning approaches incrementally add features to a predictive model. For neural networks, this results in a neural network architecture and search landscape that changes as additional features are added. This research will develop particle swarm optimization approaches to train neural networks with incrementally added features, starting from the most important features.
Surani Laubscher Particle Swarm Optimization for Knapsack Problems The knapsack problem is one of the most studied problems in combinatorial optimization, with many variations of the original problem and many real-life applications. The general goal is to pack a set of items, given values and sizes, into a container with maximum capacity. This research will start with an in-depth review of the knapsack problem and its variations, including multi-modal, multi-objective and dynamic knapsack problems. The study will then proceed to illustrate to what extent particle swarm optimization algorithms can be applied to solve as many as possible of these problems. The study will likely also develop new particle swarm optimization algorithms, and will compare the particle swarm optimization implementations with other state-of-the-art algorithms.
Jin Ree Lee Nature-inspired Approaches to Prioritized Foraging in Dynamic Environments
Francois Nel Cryptocurrency Forecasting using Neural Networks Trained with Cooperative Quantum Particle Swarm Optimisation
Gary Steyn Adaptive Model Trees for Classification under the Presence of Concept Drift
Nkululeo Thangelane Nature-inspired Transformer Models

Engineering

Student Thesis Title Abstract
Samantha Downing Identification and Classification of Bird Species from Video
Takudzwa Masunungure Time Series Forecasting without Recurrent Connections
Medelace Mojalefa Meta-heuristics for deep learning

Data Science

Student Thesis Title Abstract
Frederik Andersen Set-based Particle Swarm Optimization for Training Support Vector Machines
Francois Conradie Active Learning for Ensemble Learning
Hein Cooke Hot Rolled Shape Classifier
Abdulrehman Dandekar A Framework for Digitization of Legacy Medical Records
Jean De Smidt Development of Machine Learning Approaches to Aid in the Automated Analysis of Medical Images
Keamogetswe Dladla Evolving Oblique Trees
Jacques du Plessis Knowledge-biased Decision Tree Induction using Genetic Programming
Nkosinathi Gule Hierarchical Convolutional Predictive Models for Bone Fracture Detection and Classification from X-rays
Frans Jordaan Multi-modal Particle Swarm Optimization Algorithms for Large-Scale Multi-Modal Optimization Problems
Isaya Kawana Advancing Energy Forecasting: Evaluating and Enhancing Large Language Models for Individualized Consumption Prediction
Jaco Kemp Road Mapping and Classification
Kabelo Kholoane Behavioral Scorecards Development and Machine Learning
Carl Kirstein Critical Review of Remaining Useful Life Prediction Models
Nicholas Minnie Transformer Models for Bone Fracture Detection and Classification from X-rays
Shafeeq Mollagee Methods for Converting Classified Image Outputs to Geocoded Road Vector Segments
Wihan Mouton Incremental Class Learning using Dynamic Meta-heuristics
Nemadandila Maduvha Intrusion detection using machine learning
Andre Nel Three-dimensional Reconstruction of Bones from Two-dimensional X-rays
Gert Peens Fracture Prediction from Limited X-ray Images
Thakhani Ramaru Ensemble of k-Nearest Neighbour Algorithms
Kabelo Sekwadi Automated Identification of Leg Implants from X-rays
Jason September Measures of Predictive Model Confidence
Nadine Smal To Go Deep, Or Not To Go Deep
Jacomine Smit Predicting the Spread of Anti-Microbacterial Organisms in South African Private Healthcare Facilities using Machine Learning Techniques
Frances Steyl Automated Identification of Musical Instrument from Sound
Louzanne Swart The Role of RNA Decay in Plasmodium Falciparum's Transcriptomics and Proteomics
Lauren Trinder-Smith A Critical Review and Analysis of Particle Swarm Optimization Approaches for Multi- and Many-Objective Optimization
Shinay van Wyk Semi-supervised Bone Fracture Detection and Classification
Hendrik Wait Determination of Usage of Surgical Instruments and Material using Video Analysis

Doctoral Students


Computer Science

Student Thesis Title Abstract
Adekoya Adekunle Multi-Objective Optimization For Dynamic Incremental Machine Learning Algorithms Due to data streams becoming more prevalent, research to improve the understanding, analysis and processing of big data streams is very active. The main goal of this research is to improve prediction and decision-making based on data streams. However, many of these data streams are generated and processed in environments that are characterized by uncertainty, such as temporal changes to the statistical properties of the data stream. A number of research studies are ongoing on how to handle the uncertainty around data streams.
As a result of the foregoing, this research aims to investigate the efficacy of evolutionary and swarm-based multi-objective optimization techniques to develop machine learning predictive models for data streams. An important consideration for developing these predictive models is the presence of concept drift, where the statistical distribution of data and/or target variables may change with time. The consequences of concept drift include degradation in performance, and changes in the optimality of the resulting model architecture.
This research will formulate machine learning in the presence of concept drift as a dynamic multi-objective optimization problem, where the objectives are to optimize prediction accuracy and to optimize model architecture (in order to prevent overfitting and underfitting). Both objectives are dynamic, due to the consequences of concept drift.
Multi-objective machine learning predictive models for data streams will be developed and extensively evaluated. These predictive models will then be combined into a heterogeneous ensemble model, and the performance of this ensemble will be evaluated in comparison with the individual machine learning models.
Kyle Erwin Classification Dataset Complexity Measures and Correlation to Predictive Model Performance
Godfree Hoko Analysis of Boosting Concepts in Deep Learning Architectures for Imbalanced Classification
Amani Saad Differential Evolution and Optimal Population Sizes Parameter control is a significant topic in the design of evolutionary algorithms (EAs). The performance of EAs is greatly affected by the selection of control parameters. Therefore, optimal selection of control parameter values is a particularly noteworthy research field. One common control parameter among all EAs is the population size. Differential Evolution (DE) is sensitive to its control parameters, which are the crossover rate, the scale factor and the population size. Although population size is an important control parameter that significantly influences the performance of DE, the volume of work dedicated to it indicates that this aspect is still under-investigated. A number of empirical studies have advised that the population size should be related to the problem dimensionality. Based on these empirical studies, a general perception has prevailed within the DE research community that the size of a DE population should be set to 10 times the dimension of the problem. However, the conclusions derived from these studies were based on a very limited benchmark suite containing only a few benchmark functions, and hence are not suitable for all problem instances. Also, the common method of increasing the population size gradually to achieve better performance is subjective. A clear incremental strategy was not defined. Instead, rules of thumb were suggested as a user guide. The main objective of this research is to empirically analyze DE with respect to optimal population sizes, and to derive correlations between optimal population size and fitness landscape characteristics. The impact of different population sizes on search behavior will also be investigated.
Jean-Pierre van Zyl A Set-based Particle Swarm Optimization Approach to AutoML

Engineering

Student Thesis Title Abstract
Emmanuel Buabin Noncommutative Time Series Feature Extraction with Banach Lie Algebra This thesis focuses on the conceptualization, design and implementation of an algebraic evolutionary time series feature extractor. Specifically, a mathematical theory that constitutes 1) a specialized Banach/Hilbert space, 2) a specialized Banach Lie related algebra and 3) a specialized body of mechanics (quantum motivated) is motivated for the overarching goal of algebraic time series feature data production, machine learning framework modelling and other interactive concept modelling. The time series feature extractor, equipped with novel algebraic evolutionary (swarm) time series feature learning procedures, is adopted for feature extraction on produced (algebraic) time series datasets, within a specific time series problem context. To ascertain performance levels, experiments are varied across different parameters.
Eldon Burger Removal of Confounding Features from Convolutional Neural Networks A convolutional neural network (CNN) can be used as a computer-aided diagnostic tool to diagnose one or more diseases from medical images. When the prediction of a CNN is based on confounding features as opposed to the causal features of the diseases, the CNN can incorrectly diagnose a patient. Traditional methods that have been designed to remove confounding features from a predictive model are either not compatible with CNNs or result in a significant reduction in the performance of CNNs. This study will identify, develop, and compare methods to remove confounding features from CNNs.
Timothy Carolus Control Parameter Importance and Stability Analysis of Population-based Algorithms A common problem in the design of optimization algorithms is ensuring that the sequence of solutions converges. This problem becomes more difficult for population-based meta-heuristics. One such group of iterative algorithms are swarm intelligence based algorithms, such as particle swarm optimization (PSO). Stability conditions have been derived on the control parameters of a class of such population-based algorithms, where the position updates can be reformulated in a specific recurrence relation. This research will investigate a number of swarm intelligence based algorithms and work towards reformulation of their position updates in the standard recurrence relation. From this, stability conditions will be derived for these algorithms, to provide guidance on how values for control parameters should be initialized to guarantee that an equilibrium state will be reached. Furthermore, an analysis of the control parameter importance within this region is carried out using functional analysis of variance. This study is applied to both single objective and multi-objective optimization algorithms.
Haroon Gool TBD
Michael Kgopa Set-based Meta-heuristics
Chucknorris Madamombe Review and Analysis of Swarm Based Algorithms for Optimization Due to their powerful and resourceful performance for solving difficult optimization problems, swarm-based algorithms have been of much interest to many researchers in the scientific domain. All these swarm-based algorithms have been inspired by the natural behavior of swarms of biological organisms, e.g. animals, birds, bacteria, insects, fish and amphibians. It has been shown that these organisms provide a unique set of characteristics that can be used to design new swarm algorithms. Thus, the fascinating activities that are observed on a day-to-day basis in nature have been used as the basis for the formulation of new techniques for solving sophisticated problems in real life. A surfeit of swarm-based algorithms have been proposed since 1992 when they were first published. These swarm-based algorithms have been successfully used to solve sophisticated real-life optimization problems. Even though each of these swarm-based algorithms is supported by an analogy from nature, based on some nature-inspired metaphor, their mathematical/algorithmic models are very similar or at least share significant overlap.
The initial phase of the proposed study will be to conduct an extensive literature review on the available swarm-based algorithms. A total of 80 swarm-based algorithms will be listed. The study will be further narrowed down to review only the most popular algorithms based on Google Scholar citation counts. Only the 65 most popular swarm-based algorithms will be reviewed. The review will cover the background (source of inspiration) of each swarm-based algorithm, the mathematical model as well as the algorithmic model of each swarm-based algorithm. The major focus of the proposed study is to examine the mathematical models of each algorithm and to draw out similarities and differences from these swarm-based algorithms. The descriptions of these swarm-based algorithms will be as extensive as possible.
The main goal of this research is to identify and categorize swarm-based algorithms for optimization based on different views such as nature-inspired view, application class view, optimization problems class view, computational complexity view and mathematical/algorithmic model view. A critical review of swarm-based algorithms will be done with reference to these different views. The critical review will develop a taxonomization based on the different views.
The second goal of this research is to conduct an extensive empirical analysis of these algorithms on a large benchmark suite of continuous-valued, single-objective, static, boundary constrained optimization problems. The goal of the empirical analysis is to conduct a control parameter sensitivity analysis from which best values of the control parameters can be derived. The other goal of the empirical analysis is to identify the best algorithm(s) for specific optimization problem classes based on different performance criteria. The computational complexity, i.e. the actual execution time as well as the asymptotic complexity analysis, of each algorithm will be examined.
Kondwani Magamba Crop management using predictive data analytics and leaf venation networks Malawi has an estimated population of 18.6 million as per 2019 reports. It is expected that the population will double by 2038. This increase poses a threat not only to sustainable food production but also to food security, and this may impact on the country's drive to achieve one of the sustainable development goals of the United Nations (UN), Goal 3. Agricultural production in Malawi is hindered by many factors, including crop pests and diseases, inability to predict crop yield reliably and lack of information about meteorological conditions, soil properties and land cover.
There is therefore a need to overcome the identified challenges, as Malawi's economy is predominantly agriculture-based: agriculture makes up about 30% of the country's Gross Domestic Product and employs over 64% of the national workforce.
The goal of this study is to use machine learning (ML) techniques to develop models for crop yield prediction, and for disease, crop quality and crop species recognition. The study will achieve its objectives by using ML to study leaf venation networks of Irish and sweet potatoes.
Noma Mkwananzi Fitness Landscape Analysis of Neural Networks for Regression Problems The aim of training an artificial neural network (ANN) is to determine a set of weights that minimizes the error rate. An optimization algorithm is used to train the ANN, that is, to adjust the weights of the ANN. An understanding of ANN search landscapes may help to inform the choice of optimization algorithm, and even of ANN architecture, for regression problems. Fitness landscape analysis is one approach that can be used. A study on fitness landscape analysis has been carried out to understand the characteristics of the search space of neural networks for classification problems. This research is an extension of that study. It will focus on regression problems and seeks to determine whether the landscape properties of regression problems differ from those of classification problems.
Aveer Nannoolal Heterogeneous Ensemble for Financial Time Series Modelling
Robert Nshimirimana Optimization of Digital Radiography Using Multi-objective Particle Swarm Optimization Radiography is a 2-D transmission imaging technique that is extensively used for non-destructive investigation of materials. The integrity of the investigation depends on the quality of the image which is obtained by arranging the radiography system parameters in such a way that it approaches a compromised optimum. A manual optimization is time consuming, labour intensive, and prone to human error. This research aims to develop an automated radiography system optimizer based on multi-objective particle swarm optimization to provide scanning or design parameters in the form of a set of Pareto optimal solutions for a radiography system.
Daniel von Eschwege Self-Adaptive Meta-Heuristics using Cultural Algorithms Cultural algorithms (CA) are evolutionary algorithms which maintain a belief space in parallel with a population space. The population space represents a set of candidate solutions to the optimization problem, and the belief space maintains a set of beliefs about where in the search landscape an optimum resides. Any population-based metaheuristic can be used in the population space to find an optimal solution to the relevant optimization problem. The belief space is a collection of "beliefs" formed by a few individuals in the population as to where in the search space these individuals believe the optimal solution can be found.
Meta-heuristics have control parameters, with different control parameter configurations resulting in different levels of performance. Control parameter configurations are also very problem dependent, and usually require computationally expensive parameter tuning prior to solving the problem, which has to be repeated for each new problem. Conversely, self-adaptive algorithms adjust control parameter values during the optimization process. Considering particle swarm optimization (PSO) specifically, several self-adaptive, but inefficient approaches have been developed.
This research will develop a CA approach to search for the optimal PSO control parameter values used in the population space, by defining a belief space to represent the control parameter space. The belief space will indicate parts of the control parameter space where the best performing individuals believe the best control parameter values can be found. Each individual will then sample values for its control parameters from this belief space. Different strategies to update and utilize the belief space will be developed to prevent premature convergence in both the belief and population spaces. The result aims to be a more efficient self-adaptive PSO algorithm. The approach will be extensively empirically analyzed.
Zander Wessels A Walk-Forward Multi-Factor Machine Learning Investment Process The investment management industry is going through a paradigm shift: from biased and expensive human-centric investment decision making, to unbiased, scalable, adaptive, and testable algorithmic investment decision making at lower costs. This shift is being driven by cutting-edge machine learning algorithms, large amounts of structured and unstructured data, and processing power. Thus, the goal of this thesis is to propose an online collective intelligence framework where online machine learning algorithms and fundamental financial models can develop different views on the securities and assets in question. After the algorithms have voted on which assets they believe will go up or down in the future, portfolios can be constructed using heuristic algorithms, e.g. PSO. Because these models are unbiased and behave reliably, they can be simulated robustly through time. These simulations account for survivorship bias, lookahead bias, transaction costs, market impacts, liquidity risk, and risk management.

Post-doctoral Fellows

Student Thesis Title Abstract