PhytoAFP: In Silico Approaches for Designing Plant-Derived Antifungal Peptides

Authors

Tyagi, Atul
Roy, Sudeep
Singh, Sanjay
Semwal, Manoj
Shasany, Ajit
Sharma, Ashok
Provazník, Valentýna

Publisher

MDPI

Abstract

Emerging infectious diseases (EIDs) caused by fungi are a serious problem in humans and plant species and a severe threat to food security worldwide. In this work, we developed a support vector machine (SVM)-based model to design and predict therapeutic plant-derived antifungal peptides (PhytoAFP). Residue composition analysis shows a preference for the amino acids C, G, K, R, and S. Position preference analysis shows that residues G, K, R, and A dominate the N-terminus, while residues N, S, C, and G are preferred at the C-terminus. Motif analysis reveals motifs such as NYVF, NYVFP, YVFP, NYVFPA, and VFPA. We developed two models using various input features, such as mono-, di-, and tripeptide composition as well as binary, hybrid, and physicochemical properties, based on methods applied to the main dataset. The TPC-based monopeptide composition model achieved the highest accuracy, 94.4%, with a Matthews correlation coefficient (MCC) of 0.89. The second-best model, based on dipeptides, achieved an accuracy of 94.28% with an MCC of 0.89 on the training dataset.
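
The composition features and the MCC named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `kmer_composition` and `matthews_corrcoef`, the percentage normalization, and the use of the motif NYVFPA as a toy input are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=1):
    """Percentage composition of every k-mer over the 20 standard residues.

    k=1 gives a 20-dimensional monopeptide vector, k=2 a 400-dimensional
    dipeptide vector, k=3 an 8000-dimensional tripeptide vector.
    (Hypothetical helper, not taken from the paper.)
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping windows of length k
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # windows with non-standard residues are skipped
            counts[window] += 1
    return [100.0 * counts[kmer] / total for kmer in kmers]


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    mono = kmer_composition("NYVFPA", k=1)  # toy input: a motif from the abstract
    di = kmer_composition("NYVFPA", k=2)
    print(len(mono), len(di))
```

Vectors of this kind could serve as input to an SVM classifier, for example scikit-learn's `SVC`; the kernel and parameters used in the paper are not stated in this record, so none are assumed here.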

Citation

Antibiotics-Basel. 2021, vol. 10, issue 7, p. 1-12.
https://www.mdpi.com/2079-6382/10/7/815

Document type

Peer-reviewed

Document version

Published version

Language of document

en

Creative Commons license

Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International