<div dir="ltr"><div>Hi Jarrett, <br></div><div><br></div><div>The code below doesn't work as subst.model, transi.rate, and transiv.rate have not been defined. Could these be the problems you are running into? Also, you may wish to include a link to the previous discussion so people can understand what's going on.</div><div><br></div><div>Best,<br></div><div>Zhian<br></div><div><br><div class="gmail_quote"><div dir="ltr">On Thu, Sep 27, 2018 at 11:00 AM <<a href="mailto:adegenet-forum-request@lists.r-forge.r-project.org">adegenet-forum-request@lists.r-forge.r-project.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Send adegenet-forum mailing list submissions to<br>

        <a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br>

<br>

To subscribe or unsubscribe via the World Wide Web, visit<br>

        <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" rel="noreferrer" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum</a><br>

<br>

or, via email, send a message with subject or body 'help' to<br>

        <a href="mailto:adegenet-forum-request@lists.r-forge.r-project.org" target="_blank">adegenet-forum-request@lists.r-forge.r-project.org</a><br>

<br>

You can reach the person managing the list at<br>

        <a href="mailto:adegenet-forum-owner@lists.r-forge.r-project.org" target="_blank">adegenet-forum-owner@lists.r-forge.r-project.org</a><br>

<br>

When replying, please edit your Subject line so it is more specific<br>

than "Re: Contents of adegenet-forum digest..."<br>

<br>

<br>

Today's Topics:<br>

<br>

   1. Nucleotide substitution model used by     haploGen()/SeqTrack()<br>

      (Jarrett Phillips)<br>

<br>

<br>

----------------------------------------------------------------------<br>

<br>

Message: 1<br>

Date: Wed, 26 Sep 2018 15:46:07 +0000<br>

From: Jarrett Phillips <<a href="mailto:jphill01@uoguelph.ca" target="_blank">jphill01@uoguelph.ca</a>><br>

To: "<a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a>"<br>

        <<a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a>><br>

Subject: [adegenet-forum] Nucleotide substitution model used by<br>

        haploGen()/SeqTrack()<br>

Message-ID:<br>

        <<a href="mailto:YQXPR01MB069458BFD63BE601F9505BCAD5150@YQXPR01MB0694.CANPRD01.PROD.OUTLOOK.COM" target="_blank">YQXPR01MB069458BFD63BE601F9505BCAD5150@YQXPR01MB0694.CANPRD01.PROD.OUTLOOK.COM</a>><br>

<br>

Content-Type: text/plain; charset="iso-8859-1"<br>

<br>

Dear Thibaut,<br>

<br>

<br>

This question stems from  ones I asked in August.<br>

<br>

<br>

I've been experimenting with the R code from haploGen() and include it below (with my own slight modifications for my specific problem). Hopefully you can lend some advice as to the issue I am facing (see below).<br>

<br>

<br>

Note here I refer to a haplotype as a unique DNA sequence.<br>

<br>

<br>

<br>

library(pegas)<br>

<br>

<br>

set.seed(17)<br>

<br>

<br>

sim.seqs <- TRUE<br>

<br>

length.seqs <- 500<br>

<br>

num.seqs <- 100 # number of DNA sequences<br>

<br>

<br>

<br>

if (sim.seqs == TRUE) {<br>

<br>

<br>

<br>

      nucl <- as.DNAbin(c('a','c','g','t'))<br>

<br>

<br>

     res <- sample(nucl, size = length.seqs, replace = TRUE, prob = rep(0.25, 4))<br>

<br>

<br>

     if (subst.model == "K80") {<br>

<br>

<br>

<br>

        transi.set <- list('a' = as.DNAbin('g'),<br>

<br>

                           'c' = as.DNAbin('t'),<br>

<br>

                           'g' = as.DNAbin('a'),<br>

<br>

                           't' = as.DNAbin('c'))<br>

<br>

        transv.set <- list('a' = as.DNAbin(c('c', 't')),<br>

<br>

                           'c' = as.DNAbin(c('a', 'g')),<br>

<br>

                           'g' = as.DNAbin(c('c', 't')),<br>

<br>

                           't' = as.DNAbin(c('a', 'g')))<br>

<br>

<br>

<br>

        transi <- function(res) {<br>

<br>

          unlist(transi.set[as.character(res)])<br>

<br>

        }<br>

<br>

<br>

<br>

        transv <- function(res) {<br>

<br>

          sapply(transv.set[as.character(res)], sample, 1)<br>

<br>

        }<br>

<br>

<br>

<br>

        duplicate.seq <- function(res) {<br>

<br>

          num.transi <- rbinom(n = 1, size = length.seqs, prob = transi.rate) # total number of transitions<br>

<br>

          if (num.transi > 0) {<br>

<br>

            idx <- sample(length.seqs, size = num.transi, replace = FALSE)<br>

<br>

            res[idx] <- transi(res[idx])<br>

<br>

          }<br>

<br>

<br>

<br>

          num.transv <- rbinom(n = 1, size = length.seqs, prob = transv.rate) # total number of transversions<br>

<br>

          if (num.transv > 0) {<br>

<br>

            idx <- sample(length.seqs, size = num.transv, replace = FALSE)<br>

<br>

            res[idx] <- transv(res[idx])<br>

<br>

          }<br>

<br>

          res<br>

<br>

          }<br>

<br>

        }<br>

<br>

<br>

<br>

      res <- matrix(replicate(num.seqs, duplicate.seq(res)), byrow = TRUE, nrow = num.seqs)<br>

<br>

<br>

<br>

      class(res) <- "DNAbin"<br>

<br>

<br>

<br>

      # write.dna(res, file = "seqs.fas", format = "fasta")<br>

<br>

<br>

<br>

      h <- sort(haplotype(res), decreasing = TRUE, what = "frequencies")<br>

<br>

      rownames(h) <- 1:nrow(h)<br>

<br>

<br>

<br>

    }<br>

<br>

<br>

## Output<br>

<br>

<br>

h<br>

<br>

<br>

Haplotypes extracted from: res<br>

<br>

<br>

    Number of haplotypes: 5<br>

<br>

         Sequence length: 500<br>

<br>

<br>

Haplotype labels and frequencies:<br>

<br>

<br>

  1  2  3  4  5<br>

<br>

96  1  1  1  1<br>

<br>

<br>

<br>

If the code is run multiple times (without the seed), the same pattern emerges: h is always skewed toward the most dominant haplotype.<br>

<br>

<br>

Is there a way that you've implemented in haploGen() (and that I have potentially glossed over) that gives different trends each time the code is run?<br>

<br>

<br>

For example, I want to be able to run the code (without seed) and get something like<br>

<br>

<br>

h<br>

<br>

<br>

Haplotypes extracted from: res<br>

<br>

<br>

    Number of haplotypes: 5<br>

<br>

         Sequence length: 500<br>

<br>

<br>

Haplotype labels and frequencies:<br>

<br>

<br>

  1    2    3    4   5<br>

<br>

35  15  25  20  5<br>

<br>

<br>

<br>

haploGen() does exactly this. Each time haploGen() is run, a different distribution for h  is outputted. I just want to emulate what you have done, but am stuck at this roadblock and I m uncertain of what is causing this issue.<br>

<br>

<br>

Note: I am just wanting to generalize your approach to generate DNA sequences (not necessarily for outbreak modelling).<br>

<br>

<br>

Can you please shed some light?<br>

<br>

<br>

<br>

Thank you in advance.<br>

<br>

<br>

<br>

Sincerely,<br>

<br>

<br>

Jarrett Phillips<br>

-------------- next part --------------<br>

An HTML attachment was scrubbed...<br>

URL: <<a href="http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20180926/408416d5/attachment-0001.html" rel="noreferrer" target="_blank">http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20180926/408416d5/attachment-0001.html</a>><br>

<br>

------------------------------<br>

<br>

Subject: Digest Footer<br>

<br>

_______________________________________________<br>

adegenet-forum mailing list<br>

<a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br>

<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" rel="noreferrer" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum</a><br>

<br>

------------------------------<br>

<br>

End of adegenet-forum Digest, Vol 121, Issue 3<br>

**********************************************<br>

</blockquote></div></div></div>