---
title: "run PhyloNet on Philippines *Dicaeum* RADseq"
format:
html:
code-fold: show
code-tools: true
toc: true
toc-title: Document Contents
number-sections: true
embed-resources: true
---
### PhyloNet approach
- I am going to be following the tutorial and info outlined in this link: https://phylogenomics.rice.edu/media/bookChapter.pdf
- To start, I will set up the input for the maximum-pseudo-likelihood approach, the details of which can be found here: https://phylogenomics.rice.edu/html/commands/InferNetwork_MPL.html
### set up InferNetwork_MPL input
- The PhyloNet command 'InferNetwork_MPL' takes as input a set of gene trees and uses a maximum pseudo-likelihood approach to infer a species tree given number of reticulations with inferred branch lengths and inheritance probabilities. Despite the fact that the full likelihood of each proposed tree is not computed, it has been shown to perform quite reliably, and is computationally tractable.
- I will begin by setting up a nexus full of the 2590 ML gene trees generated as input for ASTRAL, for input to PhyloNet. Branch lengths must be remmoved before input, so we will do that here.
quarto-executable-code-5450563D
```r
#make infer_MPL input
#load libraries
library(phytools)
library(ape)
library(vcfR)
#read in gene trees inferred by iqtree2
x<-phytools::read.newick("~/Desktop/phil.dicaeum/astral/ml_best.trees")
#remove branch lengths
for (i in 1:length(x)){
x[[i]]$edge.length<-NULL
}
#remove outgroup
for (i in 1:length(x)){
x[[i]]<-drop.tip(x[[i]],c("niKU28413","niKU28414"))
}
#write to file
#write.tree(x, file="~/Desktop/phil.dicaeum/astral/mlbesttrees.nobrlens.nooutgroup.nex")
```
- I then edited this output nexus file called 'mlbesttrees.nobrlens.nooutgroup.nex' in a text editor, to create an input file that looks like this:
```{bash, eval=FALSE}
#NEXUS
BEGIN TREES;
Tree gt1=((C,((B,D),A)),E);
Tree gt2=(B,(D,(C,(A,E))));
......
Tree gt2590=(D,((B,E),(C,A)));
END;
BEGIN PHYLONET;
InferNetwork_MPL (all) 1 -a <luzon:hyp18070,hyp20218,hyp20213,hypFMNH454,hyp25637,hyp25921,hyp26984,hyp462070,hyp25675,hyp454950,hyp25880,hyp19638,hyp25868,hyp17969,hyp25672,hyp26975,hyp25670,hyp20193; zamboanga:hyp18191,hyp18159,hyp19177,hyp19178,hyp18193,hyp29945,hyp29951; mindanao:hyp14037,hyp14079,hyp14065,hyp14120,hyp14075,hyp1271,hyp1273,hyp1275,hyp3274,hyp3208,hyp3158,hyp2253,hyp3095,hyp3314,hyp357608,hyp357615,hyp357612,hyp2229,hyp2067,hyp28329,hyp28361,hyp28376,hyp2208,hyp28294,hyp1956,hyp20921,hyp28416,hyp27450,hyp19046,hyp19136,hyp28676,hyp31636,hyp31644> -pl 15;
END;
```
- Where all gene trees are specified with a unique name, and the command at the bottom instructs the program to use (all) input gene trees, infer 1 reticulation edge, -a provides a 'taxa map' of individuals into species, and -pl tells the program the number of cores to use for the analysis. I will save this nexus file as 'mpl.1retic.nex', and move it to the KU cluster.
### run InferNetwork_MPL
I can now run this analysis on the KU cluster using the following bash script:
```{bash, eval=FALSE}
#!/bin/sh
#
#SBATCH --job-name=phylonet # Job Name
#SBATCH --nodes=1 # 40 nodes
#SBATCH --ntasks-per-node=15 # 40 CPU allocation per Task
#SBATCH --partition=bi # Name of the Slurm partition used
#SBATCH --chdir=/home/d669d153/work/phil.dicaeum/phylonet/inferMPL # Set working d$
#SBATCH --mem-per-cpu=5gb # memory requested
#SBATCH --time=5000
module load java
java -jar /home/d669d153/work/PhyloNet_3.8.0.jar mpl.1retic.nex
```
### InferNetwork_MPL output
quarto-executable-code-5450563D
```r
library(tanggle)
z <- read.evonet(text = "((((mindanao:1.0)#H1:1.0::0.494,zamboanga:1.0):0.3244889262249451,luzon:1.0):0.035718031146395546,#H1:1.0::0.506);")
#Plot an explicit network:
ggevonet(z, layout = "rectangular") + geom_tiplab() + geom_nodelab()
ggevonet(z, layout = "slanted") + geom_tiplab() + geom_nodelab()
```