This function takes a vcfR object as input and returns a vcfR object filtered to retain only SNPs that are more than a specified distance apart on each scaffold. On each scaffold, the function automatically retains the first SNP, then keeps the next SNP that lies more than the specified distance away, and repeats until it reaches the end of the scaffold/chromosome. The function scales well with an increasing number of SNPs but poorly with an increasing number of scaffolds/chromosomes. For this reason, a built-in progress bar lets you monitor potentially long-running executions with many scaffolds. This type of filtering is often used to reduce linkage among input SNPs, especially as input to downstream programs such as STRUCTURE, which require unlinked SNPs.
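
The thinning procedure described above can be sketched in base R. This is only an illustration of the greedy idea, not the package's actual implementation: the function name `distance_thin_sketch` and its `chrom`/`pos` vector inputs are hypothetical, and it assumes that distance is measured from the most recently retained SNP and that "greater than" means strictly greater.

```r
# Hypothetical sketch (not SNPfiltR code): greedy distance thinning.
# chrom: character vector of scaffold names, one per SNP
# pos:   numeric vector of base-pair positions, one per SNP
# Returns a logical vector marking which SNPs would be retained.
distance_thin_sketch <- function(chrom, pos, min.distance) {
  keep <- logical(length(pos))
  for (scaffold in unique(chrom)) {
    idx <- which(chrom == scaffold)
    idx <- idx[order(pos[idx])]  # visit SNPs in positional order
    last_kept <- -Inf            # guarantees the first SNP is retained
    for (i in idx) {
      # Assumption: keep a SNP only if it is strictly farther than
      # min.distance from the most recently retained SNP.
      if (pos[i] - last_kept > min.distance) {
        keep[i] <- TRUE
        last_kept <- pos[i]
      }
    }
  }
  keep
}

chrom <- c("s1", "s1", "s1", "s1", "s2", "s2")
pos   <- c(100, 600, 1200, 2500, 50, 900)
keep  <- distance_thin_sketch(chrom, pos, min.distance = 1000)
pos[keep]
```

The outer loop runs once per scaffold, which is consistent with the note above that run time grows with the number of scaffolds rather than the number of SNPs.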

distance_thin(vcfR, min.distance = NULL)

Arguments

vcfR

a vcfR object

min.distance

a numeric value representing the smallest distance (in base-pairs) allowed between SNPs after distance thinning

Value

A vcfR object identical to the input, except that SNPs separated by less than the specified distance have been removed
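
One way to sanity-check a thinned result is to compute the smallest gap between neighbouring SNPs on each scaffold. The helper below is a hypothetical base R sketch, not part of SNPfiltR. It works on plain scaffold and position vectors; for a real vcfR object these could presumably be extracted with `vcfR::getCHROM()` and `vcfR::getPOS()`.

```r
# Hypothetical helper (not SNPfiltR code): smallest distance between
# neighbouring SNPs on each scaffold. Scaffolds with a single SNP
# have no neighbouring pair, so they return NA.
min_gap_by_scaffold <- function(chrom, pos) {
  tapply(pos, chrom, function(p) {
    if (length(p) < 2) return(NA_real_)
    min(diff(sort(p)))
  })
}

# Toy positions standing in for a thinned data set
chrom <- c("s1", "s1", "s1", "s2")
pos   <- c(100, 1200, 2500, 50)
gaps  <- min_gap_by_scaffold(chrom, pos)

# After thinning with min.distance = 1000, every gap should exceed 1000
all(gaps > 1000, na.rm = TRUE)
```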

Examples

distance_thin(vcfR = SNPfiltR::vcfR.example, min.distance = 1000)
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |======================================================================| 100%
#> 367 out of 500 input SNPs were not located within 1000 base-pairs of another SNP and were retained despite filtering
#> ***** Object of Class vcfR *****
#> 20 samples
#> 1 CHROMs
#> 367 variants
#> Object size: 0.5 Mb
#> 41.61 percent missing data
#> *****        *****         *****