graph_conv-rotate_h128_l3_edge_prediction

This model was trained using Napistu-Torch, a PyTorch framework for training graph neural networks on biological pathway networks.

The dataset used for training is the 8-source "Octopus" human consensus network, which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.

Task

This model performs edge prediction on biological pathway networks. Given node embeddings, the model predicts the likelihood of edges (interactions) between biological entities such as genes, proteins, and metabolites. This is useful for:

  • Discovering novel biological interactions
  • Validating experimentally observed interactions
  • Completing incomplete pathway databases
  • Predicting functional relationships between genes/proteins

The model learns to score potential edges based on learned embeddings of source and target nodes, optionally incorporating relation types for relation-aware prediction.

Model Description

  • Encoder
    • Type: graph_conv
    • Hidden Channels: 128
    • Number of Layers: 3
    • Dropout: 0.2
    • Edge Encoder: ✓ (dim=32)
  • Head
    • Type: rotate
    • Relation-Aware: ✓
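
The card does not spell out how the rotate head turns embeddings into an edge score. Assuming it follows the published RotatE formulation (each relation is a per-dimension rotation in the complex plane, and an edge scores higher when the rotated source lands close to the target), a minimal standalone sketch looks like the following. The function name `rotate_score` and its arguments are illustrative only and are not part of the Napistu-Torch API; the margin term used in the original RotatE paper is omitted.

```python
import cmath
import math


def rotate_score(head, tail, phases):
    """Generic RotatE-style score (illustrative, not Napistu-Torch code).

    head, tail: lists of complex numbers (source and target embeddings).
    phases: list of rotation angles in radians, one per dimension
            (the relation embedding).
    Returns the negative sum of per-dimension distances between the
    rotated head and the tail, so values closer to 0 mean a better match.
    """
    if not (len(head) == len(tail) == len(phases)):
        raise ValueError("head, tail and phases must have the same length")
    distance = 0.0
    for h, t, theta in zip(head, tail, phases):
        rotation = cmath.exp(1j * theta)  # unit-modulus complex number
        distance += abs(h * rotation - t)
    return -distance


# Toy example: a quarter-turn relation maps 1 -> i and i -> -1.
head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi / 2]
matching_tail = [0 + 1j, -1 + 0j]
unrelated_tail = [1 + 0j, 0 + 1j]

print(rotate_score(head, matching_tail, phases))   # close to 0
print(rotate_score(head, unrelated_tail, phases))  # clearly negative
```

In a real model the complex embedding would typically be derived from the encoder's real-valued output (for example by splitting it into real and imaginary halves); how this model does so is defined by the library and the repository's config.json, not by this sketch.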

Training Date: 2026-01-01

For detailed experiment and training settings see this repository's config.json file.

Performance

Metric                              Value
Validation relation-weighted AUC    0.8105
Test relation-weighted AUC          0.8193
Validation AUC                      0.7859
Test AUC                            0.8089
Validation AP                       0.7918
Test AP                             0.8126
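
For readers less familiar with these metrics, the sketch below computes plain AUC and AP from scores for true (positive) and sampled (negative) edges. It is a generic, dependency-free illustration with made-up scores, not the evaluation code used to produce the numbers above. The card does not define how the relation-weighted variant is weighted, so the sketch does not attempt it.

```python
def roc_auc(pos_scores, neg_scores):
    """Probability that a random positive edge outscores a random negative edge.

    Ties count as half a win. Quadratic in the number of edges, which is
    fine for illustration but not for large evaluation sets.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def average_precision(pos_scores, neg_scores):
    """Mean of the precision values measured at the rank of each positive edge."""
    if not pos_scores:
        raise ValueError("need at least one positive score")
    labelled = [(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores]
    ranked = sorted(labelled, key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(pos_scores)


# Made-up scores for three true edges and three sampled non-edges.
example_pos = [0.9, 0.8, 0.4]
example_neg = [0.7, 0.3, 0.2]
print(roc_auc(example_pos, example_neg))
print(average_precision(example_pos, example_neg))
```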

Usage

1. Setup Environment

To reproduce the environment used for training, run the following commands:

pip install torch==2.8.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install 'napistu==0.8.5'
pip install 'napistu-torch[pyg,lightning]==0.3.4'
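
After installing, it can be useful to confirm that the environment matches the pins above. The following is a small optional check built on the standard library's `importlib.metadata`; the helper name `find_mismatches` and the `PINNED` table are illustrative and not part of Napistu-Torch.

```python
from importlib import metadata

# Versions pinned in the install commands above.
PINNED = {
    "torch": "2.8.0",
    "napistu": "0.8.5",
    "napistu-torch": "0.3.4",
}


def find_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, installed)} for every pin that is not met.

    `installed` is None when the package is missing. Local build tags such
    as "+cpu" are ignored when comparing versions.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    if not problems:
        print("All pinned packages match.")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed}")
```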

2. Setup Data Store

First, download the Octopus consensus network data to create a local NapistuDataStore:

from napistu_torch.load.gcs import gcs_model_to_store

# Download data and create store
napistu_data_store = gcs_model_to_store(
    napistu_data_dir="path/to/napistu_data",
    store_dir="path/to/store",
    asset_name="human_consensus",
    # Pin to stable version for reproducibility
    asset_version="20250923"
)

3. Load Pretrained Model from HuggingFace Hub

from napistu_torch.ml.hugging_face import HFModelLoader

# Load checkpoint
loader = HFModelLoader("seanhacks/relation_prediction_rotate_128e")
checkpoint = loader.load_checkpoint()

# Load config to reproduce experiment
experiment_config = loader.load_config()

4. Use Pretrained Model for Training

You can use this pretrained model as initialization for training via the CLI:

# Create a training config that uses the pretrained model
cat > my_config.yaml << EOF
name: my_finetuned_model

model:
  use_pretrained_model: true
  pretrained_model_source: huggingface
  pretrained_model_path: seanhacks/relation_prediction_rotate_128e
  pretrained_model_freeze_encoder_weights: false  # Allow fine-tuning

data:
  sbml_dfs_path: path/to/sbml_dfs.pkl
  napistu_graph_path: path/to/graph.pkl
  napistu_data_name: edge_prediction

training:
  epochs: 100
  lr: 0.001
EOF

# Train with pretrained weights
napistu-torch train my_config.yaml

Citation

If you use this model, please cite:

@software{napistu_torch,
  title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
  author = {Hackett, Sean R.},
  url = {https://github.com/napistu/Napistu-Torch},
  year = {2025},
  note = {Model: graph_conv-rotate_h128_l3_edge_prediction}
}

License

MIT License - See LICENSE for details.
