Introduction

Great news!

We're thrilled to see PhaBOX becoming an increasingly valuable tool for phage sequence identification and characterization. Since our launch, we've received over 10,000 submissions and 300 citations. Thank you for your trust and for being part of our community!

To meet the growing demand, we’re excited to share some new updates designed to enhance your experience:

  • 🎉 Generalized to all kinds of viruses, with faster speed! We are calling it PhaBOX2 to keep continuity with the original name :).
  • 🎉 More comprehensive taxonomy classification (latest ICTV 2024) with the complete taxonomic lineage.
  • 🎉 Genus-level clustering for potential new genera (genus-level vOTUs).
  • 🎉 A protein annotation function.
  • 🎉 A contamination and prophage detection module.
  • 🎉 More user-friendly commands.

The web server is still being upgraded; please be patient.

The following functions will be coming soon!

  • More flexible host prediction options (e.g., CRISPR-only prediction, CRISPR detection in MAGs).
  • A marker search module for building phylogenetic trees.

We’re also upgrading our hardware for faster speeds and upgrading our GitHub resources for large-scale data processing.

Your feedback is crucial to us. We’d love to hear any suggestions or ideas you have to further improve our pipelines.

A server for identifying and characterizing phage contigs in metagenomic data

Bacteriophages (phages) are viruses that infect bacteria. As key players in microbial communities, they can regulate the composition and function of the microbiome by infecting their bacterial hosts and mediating gene transfer. Recently, metagenomic sequencing, which can sequence all genetic material from various microbiomes, has become popular for new phage discovery. However, accurate and comprehensive detection of phages from metagenomic data remains challenging: the high diversity and abundance of phages, together with the limited number of reference genomes, make it difficult to recruit phage fragments from metagenomic data.

This server, named PhaBOX, aims to provide one-stop phage identification and analysis. PhaBOX integrates our previously published tools PhaMer, PhaTYP, PhaGCN, and CHERRY for phage identification, lifestyle prediction, taxonomy classification, and host prediction, respectively. All of these tools combine the strengths of reference-based methods and deep learning models to learn different sequence similarity features, including protein organization, sequence homology, and protein-protein associations.

The default mode of PhaBOX is to run all the analysis programs (see the above paragraph for the program names) for users. We optimized the functions in these programs to save computational resources and time. Meanwhile, PhaBOX has a modular design: users who have a specific goal for analyzing their phage contigs can select and run only the programs of interest rather than the end-to-end pipeline.
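As a rough illustration of this modular design, the following Python sketch selects which programs to run for a given set of analysis goals. The task-to-program mapping comes from the description above; the names TOOLS and select_tools are hypothetical and used only for this example. They are not part of the PhaBOX code base or its command-line interface.

```python
from typing import Iterable, List, Optional

# Task -> program mapping, as described in the text above.
# The dictionary name and the task labels are assumptions for this sketch.
TOOLS = {
    "identification": "PhaMer",
    "lifestyle": "PhaTYP",
    "taxonomy": "PhaGCN",
    "host": "CHERRY",
}


def select_tools(tasks: Optional[Iterable[str]] = None) -> List[str]:
    """Return the programs to run, in the order they are listed in TOOLS.

    With no tasks given, every program is selected, which mirrors the
    default end-to-end mode. Otherwise only the requested ones are kept.
    """
    if tasks is None:
        return list(TOOLS.values())
    requested = set(tasks)
    unknown = requested - set(TOOLS)
    if unknown:
        raise ValueError(f"unknown task(s): {sorted(unknown)}")
    return [tool for task, tool in TOOLS.items() if task in requested]


if __name__ == "__main__":
    print(select_tools())                     # default: all four programs
    print(select_tools(["host", "lifestyle"]))  # only the programs of interest
```

Keeping the mapping in a single dictionary means the default mode and the selective mode share one source of truth, so adding a program only requires one new entry.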

To help users understand the prediction and analysis results, PhaBOX provides the important evidence and features behind these predictions. For each predicted phage contig, we visualize the essential components used by PhaBOX, such as the similarity-based relationships between the contig and other phages, the proteins predicted on the contig, and protein homology, to show the evidence supporting each prediction.

The diagrammatic illustration of PhaBOX is shown below.


Reminder:

The following browsers are supported/tested by this website:

  • Windows: Chrome, Firefox, Edge
  • Mac: Chrome, Firefox, Safari
  • Linux: Chrome, Firefox