ORCID Profile
0000-0002-8387-5739
Current Organisation
Massey University
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Conservation and Biodiversity | Environmental Science and Management | Applied Statistics | Environmental management | Ecology | Wildlife and Habitat Management | Aboriginal and Torres Strait Islander Environmental Knowledge | Natural Resource Management | Ecological impacts of climate change and ecological adaptation | Conservation and biodiversity | Modelling and simulation | Community Ecology | Global Change Biology
Ecosystem Assessment and Management at Regional or Larger Scales | Flora, Fauna and Biodiversity at Regional or Larger Scales | Ecosystem Adaptation to Climate Change | Expanding Knowledge in the Environmental Sciences
Publisher: Cold Spring Harbor Laboratory
Date: 28-06-2018
DOI: 10.1101/357798
Abstract: When applied to structured data, conventional random cross-validation techniques can lead to underestimation of prediction error, and may result in inappropriate model selection. We present the R package blockCV, a new toolbox for cross-validation of species distribution modelling. The package can generate spatially or environmentally separated folds. It includes tools to measure spatial autocorrelation ranges in candidate covariates, providing the user with insights into the spatial structure in these data. It also offers interactive graphical capabilities for creating spatial blocks and exploring data folds. Package blockCV enables modellers to more easily implement a range of evaluation approaches. It will help the modelling community learn more about the impacts of evaluation approaches on our understanding of predictive performance of species distribution models.
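As a rough illustration of what "spatially separated folds" means, the sketch below assigns points to square blocks and then deals whole blocks out to folds. It is not the blockCV API (blockCV is an R package); the function name `spatial_block_folds`, its parameters and the toy coordinates are all hypothetical, and the block size is assumed to be chosen by the user.

```python
import math
import random


def spatial_block_folds(points, block_size, k, seed=0):
    """Assign each (x, y) point to one of k folds using square spatial blocks.

    Hypothetical helper for illustration only; not part of blockCV.
    """
    # Map every point to the grid cell (block) that contains it.
    block_of = [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in points
    ]
    # Shuffle the distinct blocks, then deal them out to folds in turn,
    # so that all points in one block always share a fold.
    blocks = sorted(set(block_of))
    rng = random.Random(seed)
    rng.shuffle(blocks)
    fold_of_block = {block: i % k for i, block in enumerate(blocks)}
    return [fold_of_block[block] for block in block_of]


# Toy coordinates: three pairs of neighbouring points.
points = [(0.5, 0.2), (0.7, 0.9), (3.1, 0.4), (3.8, 0.1), (6.2, 5.5), (6.9, 5.1)]
folds = spatial_block_folds(points, block_size=2.0, k=3)
```

Because neighbouring points land in the same block, they are never split between training and testing data, which is the property that random cross-validation lacks.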
Publisher: Public Library of Science (PLoS)
Date: 30-07-2014
Publisher: Wiley
Date: 22-01-2019
DOI: 10.1111/DDI.12892
Publisher: Wiley
Date: 02-04-2021
Abstract: Line‐transect distance sampling is widely used to estimate population densities using distances of observed targets from transect lines to model detectability. When the target taxa are high density, the frequent measuring of distances may make the method seem impractical. We present a method that improves the efficiency of distance sampling when the target species occurs at high density. Only a proportion of targets are measured to model the detection function, and the time saved on the survey is then used to cover a longer total length of transect and accrue a larger ‘count only’ sample. This approach can improve the precision of the population density estimate when the cost of measuring the distance to a detected target is more than half the cost of walking to the next target. We find the optimal proportion of distances to measure that minimises the variance of the density estimate for a fixed survey budget. We quantify how much this optimised strategy increases the precision of the density estimate compared with conventional line‐transect distance sampling. We then use simulated distance sampling data to test our expressions, and illustrate circumstances under which the optimised approach would be beneficial using distance sampling data on high‐density plants. The simulations indicate that the optimised method delivers benefits in precision, but the magnitude of the benefit is lower than predicted from our expressions, which are based on an asymptotic approximation of the variance. We apply an adjustment to the predicted benefit equation to account for this difference, and show that, in all three plant case studies, the optimised approach could improve the precision gained from a distance sampling survey between 20% and 50%. This new approach could broaden the ecological contexts in which distance sampling is applied, to include estimation of densities of abundant taxa where plots are conventionally used. 
The method may have interesting applications for other survey types, including multispecies surveys or those using cues or signs that occur at high density.
Publisher: Wiley
Date: 22-02-2015
DOI: 10.1111/DDI.12311
Publisher: Wiley
Date: 17-09-2014
Publisher: Wiley
Date: 29-11-2013
DOI: 10.1111/GEB.12138
Abstract: Species often remain undetected at sites where they are present. However, the impact of imperfect detection on species distribution models (SDMs) is not fully appreciated. In this paper we evaluate the influence of imperfect detection on the calibration and discrimination capacity of SDMs. We compare the performance of three types of SDMs: (1) a technique based on presence–absence data, (2) a technique based on presence–background data, and (3) a technique based on detection/non‐detection data that accounts for imperfect detection. We use simulations to evaluate the impacts of imperfect detection in SDMs. This allows us to assess model performance with respect to the true objective of the models: the estimation of species distributions. We study a range of scenarios of occupancy and detection based on ecologically plausible environmental relationships and identify the circumstances in which imperfect detection affects model calibration and discrimination. We show that imperfect detection can substantially reduce the inferential and predictive accuracy of presence–absence and presence–background methods that do not account for detectability. While calibration is always affected, the influence on discrimination depends on the relationship of detectability and environmental variables. The performance of a model should be assessed with respect to its objectives. Comparative studies that intend to assess the performance of an SDM by evaluating its ability to predict detections rather than presences fail to reveal the benefits of accounting for detectability. Disregarding imperfect detection can have severe consequences for SDM performance, and hence for the estimation of species distributions. To date, this issue has been largely ignored in the SDM literature. Simultaneously modelling occupancy and detection does not necessarily require a greater sampling effort, but rather that data are collected so that they are informative about detectability. 
We recommend that consideration of imperfect detection become standard practice for species distribution modelling.
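A small worked example may help show why ignoring detectability biases occupancy estimates. The sketch below is not the authors' simulation; it assumes a constant per-visit detection probability, independent visits and no false positives, and the function names and input values are hypothetical.

```python
def prob_detected_at_least_once(p, visits):
    """Probability of one or more detections at an occupied site,
    assuming independent visits with constant detection probability p."""
    return 1.0 - (1.0 - p) ** visits


def naive_occupancy(psi, p, visits):
    """Expected proportion of sites with at least one detection,
    which is what a method ignoring detectability treats as occupancy."""
    return psi * prob_detected_at_least_once(p, visits)


# Illustrative values only: true occupancy 0.6, detection 0.3, two visits.
naive = naive_occupancy(psi=0.6, p=0.3, visits=2)
```

Under these assumptions the apparent occupancy is the true occupancy scaled down by the chance of detecting the species at least once, so it approaches the true value only as detection probability or the number of visits increases.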
Publisher: Wiley
Date: 19-06-2021
DOI: 10.1111/GCB.15723
Abstract: Predictions of species' current and future ranges are needed to effectively manage species under environmental change. Species ranges are typically estimated using correlative species distribution models (SDMs), which have been criticized for their static nature. In contrast, dynamic occupancy models (DOMs) explicitly describe temporal changes in species’ occupancy via colonization and local extinction probabilities, estimated from time series of occurrence data. Yet, tests of whether these models improve predictive accuracy under current or future conditions are rare. Using a long‐term data set on 69 Swiss birds, we tested whether DOMs improve the predictions of distribution changes over time compared to SDMs. We evaluated the accuracy of spatial predictions and their ability to detect population trends. We also explored how predictions differed when we accounted for imperfect detection and parameterized models using calibration data sets of different time series lengths. All model types had high spatial predictive performance when assessed across all sites (mean AUC 0.8), with flexible machine learning SDM algorithms outperforming parametric static and DOMs. However, none of the models performed well at identifying sites where range changes are likely to occur. In terms of estimating population trends, DOMs performed best, particularly for species with strong population changes and when fit with sufficient data, while static SDMs performed very poorly. Overall, our study highlights the importance of considering what aspects of performance matter most when selecting a modelling method for a particular application and the need for further research to improve model utility. While DOMs show promise for capturing range dynamics and inferring population trends when fitted with sufficient data, computational constraints on variable selection and model fitting can lead to reduced spatial accuracy of predictions, an area warranting more attention.
Publisher: Wiley
Date: 08-11-2021
Publisher: Wiley
Date: 16-11-2021
DOI: 10.1002/ECM.1486
Abstract: Species distribution modeling (SDM) is widely used in ecology and conservation. Currently, the most available data for SDM are species presence‐only records (available through digital databases). There have been many studies comparing the performance of alternative algorithms for modeling presence‐only data. Among these, a 2006 paper from Elith and colleagues has been particularly influential in the field, partly because they used several novel methods (at the time) on a global data set that included independent presence–absence records for model evaluation. Since its publication, some of the algorithms have been further developed and new ones have emerged. In this paper, we explore patterns in predictive performance across methods, by reanalyzing the same data set (225 species from six different regions) using updated modeling knowledge and practices. We apply well‐established methods such as generalized additive models and MaxEnt, alongside others that have received attention more recently, including regularized regressions, point‐process weighted regressions, random forests, XGBoost, support vector machines, and the ensemble modeling framework biomod. All the methods we use include background samples (a sample of environments in the landscape) for model fitting. We explore impacts of using weights on the presence and background points in model fitting. We introduce new ways of evaluating models fitted to these data, using the area under the precision‐recall gain curve, and focusing on the rank of results. We find that the way models are fitted matters. The top method was an ensemble of tuned individual models. In contrast, ensembles built using the biomod framework with default parameters performed no better than single moderate performing models. 
Similarly, the second top performing method was a random forest parameterized to deal with many background samples (contrasted to relatively few presence records), which substantially outperformed other random forest implementations. We find that, in general, nonparametric techniques with the capability of controlling for model complexity outperformed traditional regression methods, with MaxEnt and boosted regression trees still among the top performing models. All the data and code with working examples are provided to make this study fully reproducible.
Publisher: Wiley
Date: 27-10-2021
DOI: 10.1111/ECOG.05615
Abstract: The random forest (RF) algorithm is an ensemble of classification or regression trees and is widely used, including for species distribution modelling (SDM). Many researchers use implementations of RF in the R programming language with default parameters to analyse species presence‐only data together with ‘background' samples. However, there is good evidence that RF with default parameters does not perform well for such ‘presence‐background' modelling. This is often attributed to the disparity between the number of presence and background samples, also known as 'class imbalance', and several solutions have been proposed. Here, we first set the context: the background sample should be large enough to represent all environments in the region. We then aim to understand the drivers of poor performance of RF when models are fitted to presence‐only species data alongside background samples. We show that 'class overlap' (where both classes occur in the same environment) is an important driver of poor performance, alongside class imbalance. Class overlap can even degrade performance for presence–absence data. We explain, test and evaluate suggested solutions. Using simulated and real presence‐background data, we compare performance of default RF with other weighting and sampling approaches. Our results demonstrate clear evidence of improvement in the performance of RFs when techniques that explicitly manage imbalance are used. We show that these either limit or enforce tree depth. Without compromising the environmental representativeness of the sampled background, we identify approaches to fitting RF that ameliorate the effects of imbalance and overlap and allow excellent predictive performance. Understanding the problems of RF in presence‐background modelling allows new insights into how best to fit models, and should guide future efforts to best deal with such data.
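To make the idea of a sampling approach to class imbalance concrete, the sketch below draws a balanced bootstrap for a single tree: presences are resampled with replacement and an equal number of background points is drawn from the full background set. This is one common technique and an assumption here, since the abstract does not say which sampling scheme the authors used; the function name `balanced_bootstrap` and the toy data are hypothetical.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree: a bootstrap of the presences (label 1)
    plus an equally sized draw, with replacement, of background rows (label 0).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    presence = [i for i, y in enumerate(labels) if y == 1]
    background = [i for i, y in enumerate(labels) if y == 0]
    n = len(presence)
    presence_draw = [rng.choice(presence) for _ in range(n)]
    background_draw = [rng.choice(background) for _ in range(n)]
    return presence_draw + background_draw


# Toy data: 5 presences against 1000 background points.
labels = [1] * 5 + [0] * 1000
rng = random.Random(42)
sample = balanced_bootstrap(labels, rng)
n_presence = sum(labels[i] for i in sample)
```

Each tree sees a balanced sample, while repeating the draw across many trees still lets the forest as a whole use the full background set, so the environmental representativeness of the background is not reduced.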
Publisher: Cold Spring Harbor Laboratory
Date: 17-11-2020
DOI: 10.1101/2020.11.16.384164
Abstract: The Random Forest (RF) algorithm is an ensemble of classification or regression trees, and is a widely used and high-performing machine learning technique. It is increasingly used for species distribution modelling (SDM). Many researchers use implementations of RF in the R programming language with default parameters to analyse species presence-only data together with background samples. However, there is good evidence that RF with default parameters does not perform well with such species “presence-background” data. This is often attributed to the typical disparity between the number of presence and background samples, also known as class imbalance, and several solutions have been proposed. Here, we first set the context: the background sample should be large enough to represent all environments in the region. We then aim to understand the drivers of poor performance of RF with presence-background data, and explain, test and evaluate suggested solutions. Using simulated and real species data, we compare performance of default RF with other weighting and sampling approaches. We show that class overlap is an important driver of poor performance, alongside class imbalance. The results demonstrate clear evidence of improvement in the performance of RFs when class imbalance is explicitly managed by sampling methods or when the overfitting commonly associated with overlapping classes is avoided by forcing shallow trees. Presence-background data is a particular version of class imbalance in which class overlap is highly likely and extreme imbalance exists. Without compromising the environmental representativeness of the sampled background, we show several approaches to fitting RF that ameliorate the effects of imbalance and overlap, and allow excellent predictive performance. Understanding the problems of RF in presence-background data allows new insights into how best to fit models, and should guide future efforts to best deal with such data.
Publisher: Inter-Research Science Center
Date: 16-04-2010
DOI: 10.3354/ESR00274
Publisher: Wiley
Date: 27-05-2020
DOI: 10.1111/ACV.12601
Publisher: Wiley
Date: 02-12-2020
DOI: 10.1111/ECOG.05250
Abstract: Accurately predicting species ranges is a primary goal of ecology. Demographic distribution models (DDMs), which correlate underlying vital rates (e.g. survival and reproduction) with environmental conditions, can potentially predict species ranges through time and space. However, tests of DDM accuracy across wide ranges of species' life histories are surprisingly lacking. Using simulations of 1.5 million hypothetical species' range dynamics, we evaluated when DDMs accurately predicted future ranges, to provide clear guidelines for the use of this emerging approach. We limited our study to deterministic demographic models ignoring density dependence, since these models are the most commonly used in the literature. We found that density‐independent DDMs overpredicted extinction if populations were near carrying capacity in the locations where demographic data were available. However, DDMs accurately predicted species ranges if demographic data were limited to sites with mean initial abundance less than one half of carrying capacity. Additionally, the DDMs required demographic data from at least 25 sites, over a short time‐interval ( 10 time‐steps), as populations initially below carrying capacity can saturate in long‐term studies. For species with demographic data from many low density sites, DDMs predicted occurrence more accurately than correlative species distribution models (SDMs) in locations where the species eventually persisted, but not where the species went extinct. These results were insensitive to differences in simulated dispersal, levels of environmental stochasticity, the effects of the environmental variables and the functional forms of density dependence. Our findings suggest that deterministic, density‐independent DDMs are appropriate for applications where locating all possible sites the species might occur in is prioritized over reducing false presence predictions in absent sites. 
This makes DDMs a promising tool for mapping invasion risk. However, demographic data are often collected at sites where a species is abundant. Density‐independent DDMs are inappropriate in this case.
Publisher: Wiley
Date: 11-04-2014
DOI: 10.1002/ECE3.1056
Publisher: Wiley
Date: 18-08-2023
DOI: 10.1111/ECOG.06048
Abstract: Ecological models used to forecast range change (range change models, RCMs) have recently diversified to account for a greater number of ecological and observational processes in pursuit of more accurate and realistic predictions. Theory suggests that process‐explicit RCMs should generate more robust forecasts, particularly under novel environmental conditions. RCMs accounting for processes are generally more complex and data hungry, and so, require extra effort to build. Thus, it is necessary to understand when the effort of building a more realistic model is likely to generate more reliable forecasts. Here, we review the literature to explore whether process‐explicit models have been tested through benchmarking their temporal predictive performance (i.e. their predictive performance when transferred in time) and model transferability (i.e. their ability to keep their predictive performance when transferred to generate predictions into a different time) against simpler models, and highlight the gaps between the rapid development of process‐explicit RCMs and the testing of their potential improvements. We found that, out of five ecological processes (dispersal, demography, physiology, evolution, species interactions) and two observational processes (sampling bias, imperfect detection) that may influence reliability of forecasts, only the effects of dispersal, demography and imperfect detection have been benchmarked using temporally‐independent datasets. Only nine out of twenty‐nine process‐explicit model types have been tested to assess whether accounting for processes improves temporal predictive performance. We found no benchmarks assessing model transferability. We discuss potential reasons for the lack of empirical validation of process‐explicit models. 
Considering these findings, we propose an expanded research agenda to properly test the performance of process‐explicit RCMs, and highlight some opportunities to fill the gaps by suggesting models to be benchmarked using existing historical datasets.
Publisher: Wiley
Date: 06-2020
DOI: 10.1111/ECOG.04960
Publisher: Queensland University of Technology
Date: 16-08-2021
DOI: 10.5204/SSJ.1762
Abstract: The first year at university is always challenging, but particularly in 2020 when COVID-19 triggered lockdowns and a rapid shift to online learning. This mixed methods study tracked the wellbeing and engagement of 60 new students in an undergraduate teacher education program at an Australian university throughout the first trimester of 2020. Follow-up focus groups with 14 students used interview and photo elicitation to explore how COVID-19 influenced wellbeing and engagement. Quantitative results demonstrate both student wellbeing and student engagement dipped strongly at the start of lockdown but recovered towards the end of the trimester. Focus group findings illustrate the diversity of experience in terms of student access to time and space to study, their ability to sustain relationships online, and the cumulative stress of COVID-19. The findings lead to recommendations for supporting this cohort and for future research.
Publisher: Elsevier BV
Date: 02-2015
Publisher: Wiley
Date: 08-01-2015
DOI: 10.1111/GEB.12268
Publisher: Elsevier BV
Date: 06-2013
Publisher: Wiley
Date: 08-11-2018
Publisher: Wiley
Date: 27-01-2023
DOI: 10.1111/GEB.13639
Abstract: Aim: To assess whether flexible species distribution models that perform well at nearby testing locations still perform strongly when evaluated on spatially separated testing data. Location: Australian Wet Tropics (AWT), Ontario, Canada (CAN), north‐east New South Wales, Australia (NSW), New Zealand (NZ), five countries of South America (SA), and Switzerland (SWI). Time period: Most species data were collected between 1950 and 2000. Major taxa studied: Birds, mammals, plants and reptiles. Methods: We compared 10 species distribution modelling methods with varying flexibility in terms of the allowed complexity of their fitted functions [boosted regression trees (BRT), generalized additive model (GAM), multivariate adaptive regression splines (MARS), maximum entropy (MaxEnt), support vector machine (SVM), variants of generalized linear model (GLM) and random forest (RF), and an Ensemble model]. We used established practices for model selection to avoid overfitting, including parameter tuning in learning methods. Models were trained on presence–background data for 171 species and tested on presence–absence data. Training and testing data were separated using both random and spatial partitioning, the latter based on 75‐km blocks. We calculated the average performance and mean rank of the methods (focussing on the area under the receiver operating characteristic and precision‐recall gain curves, and correlation) and assessed the statistical significance of the differences between them. Results: The ranking of methods did not change when evaluated on spatially separated testing data. Methods with the strongest predictive performance were nonparametric methods known to be flexible. An ensemble formed by averaging predictions of five pre‐selected modelling methods was the best model in both random and spatial partitioning, followed by MaxEnt and a variant of random forest. 
Whilst some modellers expect methods limited to simple smooth functions to predict better spatially separated data, we found no evidence of that using blocks of 75 km. We conclude that flexible models that are tuned well enough to avoid overfitting are effective at predicting to spatially distinct areas.
Publisher: Wiley
Date: 03-11-2019
Publisher: Wiley
Date: 16-08-2010
Publisher: Wiley
Date: 25-03-2021
DOI: 10.1002/ECE3.7206
Publisher: Wiley
Date: 27-01-2020
DOI: 10.1111/ECOG.04890
Start Date: 02-2016
End Date: 12-2019
Amount: $360,000.00
Funder: Australian Research Council
Start Date: 01-2018
End Date: 12-2021
Amount: $396,250.00
Funder: Australian Research Council
Start Date: 2019
End Date: 12-2024
Amount: $490,233.00
Funder: Australian Research Council
Start Date: 05-2023
End Date: 05-2026
Amount: $654,671.00
Funder: Australian Research Council