Seq-Here

Version: 0.1.0

A fast toolkit for bioinformatics sequence processing, written in Rust.


Features

Lightning Fast

Built in Rust, with parallel processing and memory-mapped files for high performance.

Versatile Formats

Supports FASTA, FASTQ, GFF/GTF, and other common bioinformatics formats.

Powerful Extraction

Extract sequences or segments based on IDs, specific positions, or annotation features with precise control over what you need.

Getting Started

Command-line Application

Use the command-line app for quick sequence processing:

# Install from crates.io
cargo install seq-here

# Get basic information from sequence files
seq-here info fa sample.fasta

Extract Sequences by ID

# Extract the sequence with a specific ID
seq-here extract segment sample.fasta --str GhID000001 -o output.fasta

# Extract sequences with IDs from a file
seq-here extract segment sample.fasta --file ids.txt -o output.fasta

# Extract a specific region of a sequence (positions 100-200)
seq-here extract segment sample.fasta --str GhID000001 --start 100 --end 200

Extract Annotated Features

# Extract all annotated features
seq-here extract explain --seq sample.fasta --gff annotation.gff -o output_dir

# Extract only specific feature types
seq-here extract explain --seq sample.fasta --gff annotation.gff --type CDS,gene -o output_dir

Process Files

# Combine multiple files
seq-here process combine file1.fasta,file2.fasta -o combined.fasta
seq-here process combine file_folder/

Library Crate

Use the library crate in your Rust project:

1. Add Dependency

# Cargo.toml
[dependencies]
seq-here = "0.0.6"

2. Extract Sequences by ID

// Import required modules
use seq_here::extract::ExtractSegment;
use std::path::PathBuf;

fn extract_sequences() {
    let input_files = vec![PathBuf::from("sample.fasta")];
    let output_file = PathBuf::from("output.fasta");
    
    // Extract complete sequence
    ExtractSegment::extract_id(
        input_files.clone(),
        "SEQUENCE_ID".to_string(),
        output_file.clone(),
        None,  // start position (optional)
        None   // end position (optional)
    );
    
    // Extract specific segment (positions 10-50)
    ExtractSegment::extract_id(
        input_files,
        "SEQUENCE_ID".to_string(),
        output_file,
        Some(10),  // start position
        Some(50)   // end position
    );
}
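
To make the idea of ID-based and position-based extraction concrete, here is a minimal standalone sketch using only the standard library. It is not the seq-here implementation: `extract_segment` is a hypothetical helper, it works on an in-memory string rather than files, and it assumes positions are 1-based and inclusive, which this page does not confirm.

```rust
/// Hypothetical helper for illustration only (not part of the seq-here API).
/// Finds the FASTA record whose ID equals `id` and returns its sequence,
/// optionally restricted to `start..=end`.
/// Assumption: positions are 1-based and inclusive; sequences are ASCII.
fn extract_segment(
    fasta: &str,
    id: &str,
    start: Option<usize>,
    end: Option<usize>,
) -> Option<String> {
    let mut seq = String::new();
    let mut in_record = false;
    let mut found = false;

    for line in fasta.lines() {
        if let Some(header) = line.strip_prefix('>') {
            if found {
                break; // the target record has been fully read
            }
            // Treat the first whitespace-delimited token of the header as the ID.
            in_record = header.split_whitespace().next() == Some(id);
            found = in_record;
        } else if in_record {
            // Sequence data may be wrapped over several lines; join them.
            seq.push_str(line.trim());
        }
    }
    if !found {
        return None;
    }

    // Clamp the requested range to the sequence length.
    let len = seq.len();
    let start = start.unwrap_or(1).max(1);
    let end = end.unwrap_or(len).min(len);
    if start > end {
        return Some(String::new());
    }
    Some(seq[start - 1..end].to_string())
}
```

Passing `None` for both positions returns the whole record, mirroring the optional start and end arguments in the example above. A missing ID yields `None` rather than an empty sequence, so callers can tell the two cases apart.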

3. Extract Features from Annotation Files

// Import annotation processing module
use seq_here::extract::ExtractExplain;
use std::path::PathBuf;

fn extract_annotations() {
    // Prepare input files
    let seq_files = vec![PathBuf::from("genome.fasta")];
    let anno_files = vec![PathBuf::from("annotation.gff")];
    let output_dir = PathBuf::from("output_directory");
    
    // Filter for specific feature types (e.g., CDS, gene)
    let feature_types = Some(vec!["CDS".to_string(), "gene".to_string()]);
    
    // Execute extraction
    ExtractExplain::extract(
        seq_files,
        anno_files,
        output_dir,
        feature_types  // Use None to extract all features
    );
}
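
Filtering by feature type can be pictured as follows. This is a rough standalone sketch, not the seq-here implementation: `Feature`, `parse_gff_line`, and `filter_features` are hypothetical names introduced here. It relies only on the standard GFF3 layout of nine tab-separated columns, where column 3 is the feature type and columns 4 and 5 are 1-based, inclusive start and end coordinates.

```rust
/// One parsed annotation line, keeping only the columns this sketch needs.
/// Hypothetical type for illustration (not part of the seq-here API).
#[derive(Debug, PartialEq)]
struct Feature {
    seqid: String,
    kind: String,  // feature type, e.g. "gene" or "CDS"
    start: usize,  // 1-based, inclusive (GFF3 convention)
    end: usize,    // 1-based, inclusive
    strand: char,
}

/// Parses a single GFF line; returns None for comments, blank lines,
/// and lines that do not have nine columns or numeric coordinates.
fn parse_gff_line(line: &str) -> Option<Feature> {
    if line.starts_with('#') || line.trim().is_empty() {
        return None;
    }
    let cols: Vec<&str> = line.split('\t').collect();
    if cols.len() < 9 {
        return None;
    }
    Some(Feature {
        seqid: cols[0].to_string(),
        kind: cols[2].to_string(),
        start: cols[3].parse().ok()?,
        end: cols[4].parse().ok()?,
        strand: cols[6].chars().next()?,
    })
}

/// Keeps features whose type is listed in `types`; `None` keeps everything.
fn filter_features(gff: &str, types: Option<&[&str]>) -> Vec<Feature> {
    gff.lines()
        .filter_map(parse_gff_line)
        .filter(|f| types.map_or(true, |ts| ts.contains(&f.kind.as_str())))
        .collect()
}
```

Each retained feature carries a sequence ID and a coordinate range, which is the information needed to cut the matching region out of the sequence file.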

Contact Us

For questions or suggestions, contact us at zhixiaovo@gmail.com.