
HiRisk-Detector: an early warning algorithm for high-risk variants of SARS-CoV-2

Accurate early warning methods for high-risk variants of global emerging infectious diseases

November 11, 2024

Since the outbreak of COVID-19, SARS-CoV-2 has continued to evolve. The emergence of new high-risk variants, which may bypass the protection offered by current vaccines and antibodies, requires ongoing adjustments to control strategies to mitigate potential dangers. Therefore, accurate and timely early warnings for high-risk variants are crucial for epidemic control.

With the support of the ANSO collaborative research project “Study on accurate early warning methods for high-risk variants of global emerging infectious diseases”, the team led by Prof. SONG Shuhui at the China National Center for Bioinformation (CNCB) developed HiRisk-Detector, a machine-learning algorithm for the early detection and warning of high-risk SARS-CoV-2 variants. The algorithm relies on publicly available whole-genome sequences of SARS-CoV-2 and provides a technical basis for precise monitoring and control of COVID-19 worldwide. The research article, “Machine learning early detection of SARS-CoV-2 high-risk variants”, was published online in the journal Advanced Science on October 14, 2024.

The work builds on two resources the team established previously: the Resource for Coronavirus 2019 (RCoV19) and McAN, a haplotype network construction algorithm for genomic big data. By constructing continuous temporal haplotype networks, extracting features from them, and testing various machine learning models to find the optimal feature combination, the team developed HiRisk-Detector for monitoring and early warning of high-risk variants. Using more than 7.6 million high-quality, complete SARS-CoV-2 genome sequences and their metadata, the team validated the effectiveness, robustness, and generalization capability of HiRisk-Detector.
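To give a feel for what a temporal feature over haplotypes might look like, here is a minimal Python sketch. It is not the McAN or HiRisk-Detector implementation: the record layout, the function names (`iso_week`, `weekly_counts`, `growth_features`), the weekly window, and the smoothed growth ratio are all assumptions made for illustration, and the sample records are toy data.

```python
from collections import Counter, defaultdict
from datetime import date


def iso_week(day):
    """Return the (ISO year, ISO week) pair for a date."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def weekly_counts(records):
    """Count sequences per haplotype per ISO week.

    records: iterable of (haplotype_id, collection_date) pairs
    (a hypothetical layout, not the RCoV19 metadata schema).
    """
    counts = defaultdict(Counter)
    for haplotype, day in records:
        counts[haplotype][iso_week(day)] += 1
    return counts


def growth_features(counts, prev_week, curr_week):
    """Illustrative features: counts in two weeks and a smoothed ratio.

    The +1 smoothing avoids division by zero for new haplotypes.
    """
    features = {}
    for haplotype, per_week in counts.items():
        prev = per_week.get(prev_week, 0)
        curr = per_week.get(curr_week, 0)
        features[haplotype] = {
            "prev": prev,
            "curr": curr,
            "growth": (curr + 1) / (prev + 1),
        }
    return features


# Toy records: two made-up haplotypes sampled over two consecutive weeks.
records = [
    ("H1", date(2021, 11, 1)),
    ("H1", date(2021, 11, 8)),
    ("H1", date(2021, 11, 9)),
    ("H1", date(2021, 11, 10)),
    ("H2", date(2021, 11, 2)),
    ("H2", date(2021, 11, 3)),
    ("H2", date(2021, 11, 9)),
]
counts = weekly_counts(records)
features = growth_features(
    counts, iso_week(date(2021, 11, 1)), iso_week(date(2021, 11, 8))
)
```

In this toy data, H1 grows from one to three sequences between the two weeks while H2 shrinks from two to one, so H1 receives the larger growth value. A real pipeline would derive many such features from the haplotype network itself and feed them to a classifier.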

First, retrospective studies showed that HiRisk-Detector issued warnings for all 13 high-risk variants reported by the World Health Organization (WHO), on average 27 days ahead of WHO's official announcements, demonstrating effective early warning capability. Second, even when sequencing intensity was reduced to a quarter of its actual value, warnings were delayed by only 3.8 days, which indicates that the algorithm is robust to sparser sampling. Finally, HiRisk-Detector also applies to risk warnings for Omicron sub-lineages, with performance metrics such as ROC-AUC and PR-AUC exceeding 0.92, showing that the algorithm generalizes beyond the originally designated variants. Overall, HiRisk-Detector enables automated early warning of high-risk variants and holds significant value for the prevention and control of emerging infectious diseases.
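For readers unfamiliar with the metrics quoted above, the following Python sketch shows how a mean warning lead time, ROC-AUC, and PR-AUC (here approximated by average precision) can be computed. It uses only the standard library, the function names are hypothetical, and all dates, labels, and scores are toy values rather than data from the study.

```python
from datetime import date


def mean_lead_days(pairs):
    """Mean number of days by which a warning precedes an announcement.

    pairs: list of (warning_date, announcement_date).
    """
    return sum((announced - warned).days for warned, announced in pairs) / len(pairs)


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a common estimate of the area under the PR curve.

    Assumes no tied scores; tied scores would need grouped handling.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


# Toy inputs, invented for illustration only.
toy_pairs = [
    (date(2021, 1, 1), date(2021, 1, 31)),  # 30 days of lead time
    (date(2021, 6, 1), date(2021, 6, 21)),  # 20 days of lead time
]
toy_labels = [1, 0, 1, 0, 0]  # 1 = high-risk, 0 = not high-risk
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]  # hypothetical model risk scores

lead = mean_lead_days(toy_pairs)
auc = roc_auc(toy_labels, toy_scores)
ap = average_precision(toy_labels, toy_scores)
```

On these toy inputs the mean lead time is 25 days, and both ROC-AUC and average precision come to 5/6. PR-AUC is usually the more informative of the two when high-risk variants are rare among all candidates, because it is not inflated by the large number of true negatives.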

The source code for HiRisk-Detector has been publicly released on BioCode* and GitHub*, where users can download it for free. The high-risk variant warning results generated by HiRisk-Detector can also be visually tracked through the high-risk variant warning module of RCoV19*.

*Link to BioCode: https://ngdc.cncb.ac.cn/biocode/tools/BT007386
*Link to GitHub: https://github.com/Theory-Lun/HiRiskPredictor
*Link to the high-risk variant warning module of RCoV19: https://ngdc.cncb.ac.cn/ncov/monitoring/risk
*Link to the publication: https://doi.org/10.1002/advs.202405058



Figure: Illustration of the HiRisk-Detector algorithm

Contributors: SONG Shuhui, China National Center for Bioinformation, songshh@big.ac.cn
