Increasing anthropogenic impact and global change effects on natural ecosystems have prompted the development of less expensive and more efficient bioassessment methodologies. One promising approach is the integration of DNA metabarcoding in environmental monitoring. A critical step in this process is the inference of ecological quality (EQ) status from identified molecular bioindicator signatures that mirror environmental classification based on standard macroinvertebrate surveys. The most promising approaches to infer EQ from biotic indices (BI) are supervised machine learning (SML) and the calculation of indicator values (IndVal). In this study, we compared the performance of both approaches using DNA metabarcodes of bacteria and ciliates as bioindicators, obtained from 152 samples collected from seven Norwegian salmon farms. Results from standard macroinvertebrate monitoring of the same samples were used as a reference to compare the accuracy of both approaches. First, SML outperformed the IndVal approach to infer EQ from eDNA metabarcodes. The Random Forest (RF) algorithm appeared to be less sensitive to noisy data (a typical feature of massive environmental sequence data sets) and to uneven data coverage across EQ classes (a typical feature of environmental compliance monitoring schemes) than a widely used method to infer IndVals for the calculation of a BI. Second, bacterial eDNA metabarcodes allowed for a more accurate EQ assessment than ciliate eDNA metabarcodes. For the implementation of DNA metabarcoding into routine monitoring programs to assess ecological quality around salmon aquaculture cages, we therefore recommend bacterial DNA metabarcodes in combination with SML to classify EQ categories based on molecular signatures.
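To make the indicator value concept mentioned above concrete, the following minimal sketch computes an IndVal per EQ class for a single taxon. It assumes the commonly cited formulation in which IndVal is the product of specificity (the taxon's mean abundance in a class relative to the sum of its class means) and fidelity (the fraction of samples in a class where the taxon occurs), scaled to 100. The abstract does not state which variant was used, so the formula, the function name `indval`, and the toy read counts are illustrative assumptions only.

```python
from collections import defaultdict


def indval(abundances, groups):
    """Return {class label: IndVal in [0, 100]} for one taxon.

    abundances: per-sample read counts for the taxon (hypothetical data).
    groups: per-sample class labels, e.g. EQ categories.
    """
    # Collect the taxon's counts per class.
    by_group = defaultdict(list)
    for count, label in zip(abundances, groups):
        by_group[label].append(count)

    # Mean abundance of the taxon within each class.
    mean_abund = {g: sum(v) / len(v) for g, v in by_group.items()}
    total = sum(mean_abund.values())

    result = {}
    for g, counts in by_group.items():
        # Specificity: share of the summed class means found in this class.
        specificity = mean_abund[g] / total if total > 0 else 0.0
        # Fidelity: fraction of this class's samples where the taxon occurs.
        fidelity = sum(1 for c in counts if c > 0) / len(counts)
        result[g] = 100.0 * specificity * fidelity
    return result


# Toy example: a taxon found only, and always, in "good" samples.
scores = indval([10, 20, 0, 0], ["good", "good", "bad", "bad"])
```

A taxon restricted to one class and present in all of its samples reaches the maximum value of 100 for that class. A score of this kind depends on per-class means and occurrence frequencies, which suggests why classes represented by few samples can yield unstable indicator values.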