Phylogenetic Trees Case Study Analysis
The performance of building and exploring the B&B tree depends mainly on four operators: branching, bounding, selection, and pruning. In the B&B algorithm, if the lower bound of a node is greater than the cost of the best solution found so far, that node can be discarded from the search. This pruning allows the B&B approach to significantly reduce the number of nodes that are explored.
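As a concrete illustration (not taken from the cited work), the four operators can be sketched on a toy minimization problem; the assignment-problem setting, cost matrix, bound, and all names below are illustrative assumptions:

```python
def branch_and_bound_assignment(cost):
    """Minimize sum(cost[i][p[i]]) over permutations p of columns.
    Branching assigns a column to the next row; bounding adds, for each
    unassigned row, its cheapest unused column; pruning discards nodes
    whose lower bound cannot beat the incumbent."""
    n = len(cost)
    best_cost, best_perm = float("inf"), None

    def lower_bound(i, used, partial):
        # partial cost plus an optimistic estimate for the remaining rows
        return partial + sum(
            min(cost[r][c] for c in range(n) if c not in used)
            for r in range(i, n))

    def explore(i, used, partial, perm):
        nonlocal best_cost, best_perm
        if i == n:                                  # complete solution
            if partial < best_cost:
                best_cost, best_perm = partial, perm[:]
            return
        if lower_bound(i, used, partial) >= best_cost:
            return                                  # pruning
        for c in range(n):                          # branching
            if c not in used:
                used.add(c)
                perm.append(c)
                explore(i + 1, used, partial + cost[i][c], perm)
                perm.pop()
                used.remove(c)

    explore(0, set(), 0, [])
    return best_cost, best_perm
```

The selection strategy here is plain depth-first search; best-first selection would instead pop the frontier node with the smallest lower bound.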
However, the execution time of B&B grows rapidly with instance size, so in practice it is often limited to solving small or moderate-sized instances. For this reason, over the past decade parallel computing has become an attractive approach for tackling larger instances of combinatorial optimization problems.
Similarly, the parallel GPU-B&B algorithm is one of the first approaches to implement all four B&B operators on the GPU, requiring virtually no interaction with the CPU during the exploration process. The approach is based on the IVM data structure, which allows efficient storage and management of the pool of subproblems for permutation-based combinatorial optimization problems (J. Gmys, 2015).
On the other hand, commonly used tree-rearrangement operators include nearest neighbor interchange (NNI), subtree pruning and regrafting (SPR), and tree bisection and reconnection (TBR); each is used to define a state space of tree topologies. This space is then traversed by local searches under strategies such as best-first search and backtracking, or with simulated annealing and genetic algorithms. Operators that perform more extensive tree rearrangements, chiefly TBR and SPR, tend to yield better solutions than less extensive operators such as NNI, but at the cost of longer runtimes (Ivan Gregor, 2013).
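The least extensive of these moves, NNI, can be sketched in a few lines; the nested-tuple tree encoding is an assumption for illustration only:

```python
def nni_neighbors(tree):
    """The two NNI rearrangements across the central edge of an unrooted
    tree viewed as four subtrees ((A, B), (C, D)): each move swaps one
    subtree from the left side with one from the right side."""
    (a, b), (c, d) = tree
    return [((a, c), (b, d)), ((a, d), (b, c))]
```

A local search would score each neighbor (e.g. by parsimony or likelihood) and move to the best one; SPR and TBR generate far larger neighborhoods per tree, which is why they find better optima but run longer.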
Traditionally, researchers' ability to explore tree space has been limited to multiple replicates of Wagner addition combined with typical hill-climbing algorithms such as TBR and SPR branch swapping. Such methods, however, are insufficient for larger data sets.
Thus, an alternative to explicit enumeration relies on shortcuts that still guarantee identification of all optimal trees; the most common such shortcut is the branch and bound algorithm. However, while explicit enumeration and branch and bound do the trick for teaching purposes, these algorithms cannot be applied usefully to most biologically interesting datasets (Giribet, 2007).
Among the five maximum likelihood methods, ECRML, ECRML+PHYML, and fastDNAml are slower overall than RAxML and PHYML. PHYML has recently been recognized as the fastest maximum likelihood program; its efficiency is obtained by optimizing the tree topology and edge lengths simultaneously. The efficiency of RAxML stems to a large extent from a very efficient implementation of tree storage and likelihood calculation. ECRML, ECRML+PHYML, and fastDNAml have no such specialized optimizations. Furthermore, the computing time of ECRML and ECRML+PHYML is essentially the total of about 20 iterations; after each iteration, updating the likelihood of the current tree and the branch lengths occupies most of the computation time (Mao-Zu Guo, 2008).
In contrast, the theoretical understanding of phylogenetic sampling approaches is less developed than that of optimization approaches, and the number of sampling steps required to produce accurate samples from tree partition functions is generally unknown. The method proposed by Navodit Misra et al. therefore replaces the standard tree rearrangement moves of Markov chain sampling with an alternative in which a theoretically hard but practically tractable optimization problem is solved at each step of the sampling process.
The resulting method applies to a wide range of standard probability models, yielding a practical algorithm for efficient sampling together with a rigorous proof of accurate sampling for heated versions of some essential cases. The authors demonstrated the versatility and efficiency of the method in uncertainty analysis of tree inference across varying input sizes. Beyond providing a new practical approach to phylogenetic sampling, the technique should apply to many similar problems that involve sampling over combinatorial objects weighted by a likelihood model (Navodit Misra, 2011).
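The authors' method is not reproduced here, but the baseline it replaces, Metropolis-style sampling over topologies using local rearrangement moves, can be sketched generically; the topology labels, weight function, and uniform proposal below are illustrative assumptions:

```python
import random

def metropolis_sample(topologies, weight, steps, seed=0):
    """Generic Metropolis sampler over a finite set of tree topologies.
    `weight(t)` is an unnormalized posterior weight; the proposal picks a
    uniformly random alternative topology (a stand-in for an NNI move)."""
    rng = random.Random(seed)
    current = topologies[0]
    counts = {t: 0 for t in topologies}
    for _ in range(steps):
        proposal = rng.choice([t for t in topologies if t != current])
        # accept with probability min(1, weight ratio)
        if rng.random() < min(1.0, weight(proposal) / weight(current)):
            current = proposal
        counts[current] += 1
    return counts
```

After enough steps the visit counts become proportional to the weights; how many steps are "enough" is exactly the theoretical gap the cited work addresses.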
On the other hand, for a fixed tree topology, TreeTime infers ancestral sequences by maximizing the joint sequence likelihood. The branch lengths of the maximum likelihood molecular clock phylogeny can be computed in linear time using a dynamic programming (message passing) technique. Given a tree topology and branch lengths, the maximum likelihood ancestral sequences can be inferred in linear time; likewise, given the parent and offspring sequences, the maximum likelihood branch lengths are easy to optimize.
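A minimal sketch of joint maximum likelihood ancestral inference by dynamic programming (a max-product up-pass with a backtracking down-pass) is given below; the nested-tuple tree encoding and toy substitution matrix are assumptions for illustration, not TreeTime's actual data structures:

```python
import numpy as np

STATES = "ACGT"

def joint_ml_ancestors(tree, leaf_state, log_P):
    """Joint maximum-likelihood ancestral states on a rooted binary tree,
    in time linear in the number of nodes. `tree` is a nested tuple whose
    leaves are unique names in `leaf_state`; `log_P[i, j]` is the log
    probability of parent state i mutating to child state j."""
    def up(node):
        # returns (v, trace): v[s] = best subtree log-likelihood given this
        # node has state s; trace(s) maps every node below to its best state
        if isinstance(node, str):                 # leaf: state is observed
            v = np.full(len(STATES), -np.inf)
            v[STATES.index(leaf_state[node])] = 0.0
            return v, lambda s, node=node: {node: STATES[s]}
        vl, tl = up(node[0])
        vr, tr = up(node[1])
        score_l = log_P + vl                      # [s, c]: log P(s->c) + vl[c]
        score_r = log_P + vr
        best_l, best_r = score_l.argmax(1), score_r.argmax(1)
        v = score_l.max(1) + score_r.max(1)
        def trace(s, node=node):
            out = {node: STATES[s]}
            out.update(tl(best_l[s]))             # backtrack into children
            out.update(tr(best_r[s]))
            return out
        return v, trace

    v, trace = up(tree)
    root = int(v.argmax())                        # best root state
    return trace(root), float(v[root])
```

Each node is visited once on the way up and once on the way down, which is the linear-time scaling the text refers to.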
Additionally, in TreeTime each tree node can be given a strict or probabilistic date constraint. More complex models result in longer run times, but runtime scaling remains linear in dataset size, and alignments with thousands of sequences can be analyzed routinely. Time tree dating and inference are somewhat faster than estimation of the tree topology. In cases of extensive uncertainty in ancestral states or tree topology, convergence of the iterative steps is not guaranteed. Furthermore, in a number of cases TreeTime produces approximate branch lengths, time tree estimates, and ancestral assignments that need to be checked for plausibility; in general, this is the cost of avoiding posterior sampling and global optimization (Pavel Sagulenko, 2018).
To obtain improved solutions, it is essential to apply techniques such as branch swapping: exchanging branches on a tree with the aim of refining a previous solution. The first such branch swapper was incorporated in the program PHYSYS under the name branch-breaking, and later came to be known as tree bisection and reconnection. Subsequently, nearest-neighbor interchange (NNI), subtree pruning and regrafting (SPR), and tree bisection and reconnection (TBR) became the standard algorithms for branch swapping (Giribet, Efficient Tree Searches with Available Algorithms, 2007).
Character-based methods are among the most informative approaches for reconstructing mutation sequences and unobserved ancestral states, but they have been computationally infeasible on larger marker sets. Parsimony methods are computationally efficient character-based methods, but they depend entirely on the assumption that mutations are rare, an assumption that is questionable for tumors. In comparison to these earlier approaches, probabilistic models handle high mutation rates, noisy data, and uncertainty in tree inference better, but they are computationally more demanding than parsimony methods.
General methods for phylogenetic inference fall into four categories: distance-based methods, maximum parsimony, maximum likelihood, and Bayesian methods. Distance-based methods cluster the data using only pairwise measurements of evolutionary distance among the sequences. Maximum parsimony assumes that the correct phylogenetic tree is the one requiring the smallest number of evolutionary events to explain the input sequences.
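The parsimony criterion for a single character can be computed with the classic Fitch algorithm, sketched below; the nested-tuple tree encoding is an assumption for illustration:

```python
def fitch_score(tree, leaf_state):
    """Fitch small-parsimony count of the minimum number of state changes
    needed on a rooted binary tree for one character. `tree` is a nested
    tuple of leaf names; `leaf_state` maps each leaf to its state."""
    def walk(node):
        if isinstance(node, str):                 # leaf: singleton state set
            return {leaf_state[node]}, 0
        (sl, cl), (sr, cr) = walk(node[0]), walk(node[1])
        inter = sl & sr
        if inter:                                 # children agree: no change
            return inter, cl + cr
        return sl | sr, cl + cr + 1               # disagreement: one change
    return walk(tree)[1]
```

Summing this score over all characters in an alignment gives the tree's parsimony score, which maximum parsimony methods minimize over topologies.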
Maximum likelihood requires a substitution model to assess the probability of a particular phylogenetic tree, and aims to find the tree with the highest likelihood under the specified substitution model. Bayesian methods likewise rely on probabilistic models of evolution, as maximum likelihood tree inference does (Ivan Gregor, PTree: pattern-based, stochastic search for maximum parsimony phylogenies, 2013).
On the other hand, according to the study by Riester et al., a similar approach developed specifically for RNA sequencing data is based on minimum evolution phylogenies, a distance-based analogue of the parsimony method. Next generation sequencing technologies now produce sets comprising thousands of sequences, and identifying tree topologies that are optimal with respect to standard criteria such as maximum parsimony, maximum likelihood, or posterior probability is a computationally highly demanding task for phylogenetic inference methods.