cgt package¶
cgt.distances module¶
Implements a number of distance measures for genomes under the position paradigm.
The primary function of this module is the distance_matrix function, which returns a distance matrix for a given set of genomes and distance measure.
- The distance measures implemented are:
min: the minimum number of rearrangements required to transform one genome into another
min_weighted: the minimum number of rearrangements required to transform one genome into another, weighted by the inverse of the probability of the rearrangement
MFPT: the mean first passage time from the identity to a given genome, where the target is an absorbing state
MLE: the maximum likelihood estimate of the time elapsed between the identity and a given genome
Individual likelihood functions for the time elapsed between the identity and a given genome can also be obtained
-
cgt.distances.
MFPT
(framework, model, genome_reps=None, scale_by=1)¶ Return the mean time elapsed rearranging ref->genome where the target is an absorbing state
-
cgt.distances.
dict_to_distance_matrix
(distances, framework, genomes=None)¶ If need to convert to pairwise distances, supply a list of genomes.
-
cgt.distances.
distance_matrix
(framework, model, genomes, distance)¶ Compute a distance matrix for a given set of genomes and distance measure.
-
cgt.distances.
likelihood_function
(framework, model, genome, attempt_exact=False, use_projections=True)¶ Return the likelihood function for a given genome
-
cgt.distances.
maximise
(framework, L, max_time=100)¶ Return the time that maximises likelihood function L, using additional information from the framework
- Parameters
framework (Framework) – the framework
L (function) – the likelihood function
max_time (float) – the maximum time to consider
- Returns
the time that maximises the likelihood function
- Return type
float
-
cgt.distances.
min_distance
(framework, model, genome_reps=None, weighted=False)¶ Return dictionary of minimum distances ref->genome, using inverse of model probabilities as weights (or not)
-
cgt.distances.
mle
(framework, model, genome_instance)¶ Returns the maximum likelihood estimate for a given genome instance under the given model and framework.
- Parameters
framework (Framework) – the framework
model (Model) – the model
genome_instance (group element) – the genome instance to compute the MLE for
- Returns
the maximum likelihood estimate
- Return type
float
-
cgt.distances.
mles
(framework, model, genome_instances=None, verbose=False)¶ Returns a dictionary of maximum likelihood estimates for each genome instance under the given model and framework.
- Parameters
framework (Framework) – the framework
model (Model) – the model
genome_instances (list) – a list of genome instances to compute MLEs for. If None, all genomes in the framework are used.
verbose (bool) – whether to print progress
- Returns
a dictionary of maximum likelihood estimates, indexed by genome instance
- Return type
dict
cgt.enums module¶
-
class
cgt.enums.
ALGEBRA
(value)¶ Bases:
enum.Enum
An enumeration.
-
genome
= 2¶
-
genome_class
= 3¶
-
group
= 1¶
-
-
class
cgt.enums.
CLASSES
(value)¶ Bases:
enum.Enum
An enumeration.
-
conjugacy_classes
= 1¶
-
cosets
= 2¶
-
double_cosets
= 3¶
-
-
class
cgt.enums.
DISPLAY
(value)¶ Bases:
enum.Enum
An enumeration.
-
arrows
= 1¶
-
cycles
= 3¶
-
one_row
= 2¶
-
-
class
cgt.enums.
DISTANCE
(value)¶ Bases:
enum.Enum
An enumeration.
-
MFPT
= 3¶
-
MLE
= 4¶
-
maximum_likelihood_distance
= 4¶
-
maximum_likelihood_estimate
= 4¶
-
min
= 1¶
-
min_first_passage_time
= 3¶
-
min_weighted
= 2¶
-
minimum
= 1¶
-
minimum_weighted
= 2¶
-
-
class
cgt.enums.
FORMAT
(value)¶ Bases:
enum.Enum
An enumeration.
-
dictionary
= 3¶
-
equiv_classes
= 2¶
-
formal_sum
= 1¶
-
only_reps
= 4¶
-
-
class
cgt.enums.
IRREP_TYPE
(value)¶ Bases:
enum.Enum
An enumeration.
-
orthogonal
= 2¶
-
seminormal
= 3¶
-
specht
= 1¶
-
speck
= 1¶
-
-
class
cgt.enums.
MODEL
(value)¶ Bases:
enum.Enum
An enumeration.
-
all_inversions
= 'all inversions'¶
-
one_region_inversions
= 'inversions of a single region'¶
-
one_region_swaps
= 'adjacent transpositions'¶
-
two_region_adjacent_transpositions
= 'transpositions of two adjacent regions'¶
-
two_region_inversions
= 'inversions of two adjacent regions'¶
-
-
class
cgt.enums.
SET
(value)¶ Bases:
enum.Enum
An enumeration.
-
one_row
= 3¶
-
signed_cycles
= 1¶
-
unsigned_cycles
= 2¶
-
wreath
= 4¶
-
wreath_s2
= 5¶
-
cgt.genome module¶
-
class
cgt.genome.
Genome
(framework, genome)¶ Bases:
object
Optional wrapper for genomes, so that they can be printed/operated on in a more pythonic way.
cgt.models module¶
-
class
cgt.models.
Model
(framework, generating_dictionary)¶ Bases:
object
Defines a model. A model consists of some collection of permutations and a map from these permutations to probabilities [0,1]
-
classmethod
named_model_with_relative_probs
(framework, named_model_dictionary)¶
-
reg_rep
()¶
-
reg_rep_of_zs
()¶ Return the regular representation of zs as comptued by PositionParadigmFramework.reg_rep_zs, but store the sparse result
-
s_element
(in_algebra=<ALGEBRA.genome: 2>)¶
-
classmethod
cgt.position_paradigm module¶
position_paradigm is the most important module in cgt. PositionParadigmFramework keeps track of important objects, can convert group elements and genomes between different forms, and instances can be passed to functions from other modules so that they can acces information without it needing to be recreated.
-
class
cgt.position_paradigm.
PositionParadigmFramework
(num_regions, oriented=True, symmetry=<SYMMETRY.circular: 1>, genome_type=<TYPE.reg_to_signed_pos: 1>)¶ Bases:
object
Everything you need for working with genomes under the position paradigm
-
canonical_instance
(instance)¶ Return the ‘canonical’ instance of the genome represented by the permutation if there is one.
Examples
sage: import cgt; ppf = cgt.PositionParadigmFramework(3, symmetry=cgt.SYMMETRY.circular) sage: ppf.canonical_instance(‘(1,-2)(-1,2)’) [1, 2, -3]
-
coefficient_in
(more_terms, fewer_terms, none_if_fail=False)¶ For example, the coefficient of (a/2 + b/2) in x=(a/3 + b/3 + c/3) is 2/3, since x=2/3*(a/2 + b/2) + c/3
-
collect_genome_terms
(formal_sum, display=<DISPLAY.one_row: 2>)¶ Return a weighted sum of genomes, represented as [canonical representation]z
-
cycles
(element)¶
-
draw_instance
(instance, shortened=False)¶ Return a one-line drawing of a genome using arrows indicating orientation and elipsis indicating rotational symmetry
-
genome
(instance, format=None)¶
-
genome_conjugacy_classes
(sort_classes=True)¶ Return conjugacy classes of instances
-
genome_equivalence_classes
(combine_inverse_classes=False, sort_classes=True)¶ Return double cosets of instances—genomes in the same class will have the same likelihood functions
-
genome_group
()¶ Return the permutation group containing genome instances.
-
genomes
(format=<FORMAT.dictionary: 3>, sort_genomes=True)¶
-
group_algebra
()¶ Return the group alegbra, where the group is the group containing genome instances.
-
instances
(genome)¶ Return a list of instances of the genome.
-
irreps
(element=None)¶ Return a complete list of pairwise irreducible representations of the genome group.
- Parameters
element – return the image of element for each irreducible representation.
- Returns
a complete list of pairwise irreducible representations of the genome group, or a list of matrices if element is not None. Output is cached so that irreps are computed only once per session. Matrix entries sit inside the UniversalCyclotomicField and are exact expressions.
-
matrix
(element)¶ Return the defining representation matrix. Note that matrix(x)matrix(y) = matrix(yx)
-
num_genomes
()¶ Return the number of distinct genomes up to symmetries.
-
one_row
(element, as_list=False)¶ Return a given genome instance in one-row notation.
- Parameters
element – the genome instance to represent in one-row notation. Multiple formats accepted.
as_list – if true, return as a list of images, otherwise return a sage permutation.
Examples
sage: import cgt sage: ppf = cgt.PositionParadigmFramework(3) sage: ppf.cycles(ppf.one_row(‘(1,-2)(-1,2)’)) == ppf.cycles(‘(1,-2)(-1,2)’) True
-
random_genome
(format=<FORMAT.formal_sum: 1>)¶ Return a random genome
-
random_instance
(genome=None)¶ Return a random permutation corresponding to a given genome, or a random genome if none is given.
-
reg_rep_of_zs
(model, to_adjacency_matrix=False, sparse=True)¶
-
regular_representation
(g)¶ Return the regular representation of a single element
-
standard_reflection
()¶ Return a permutation which reflects a genome instance about the center region (n odd), or the center two regions (n even).
-
standard_rotation
()¶ Return a permutation which rotates positions clockwise by one position.
-
symmetry_element
()¶ Return the symmetry element: a convex sum of elements from symmetry_group().
-
symmetry_group
()¶ Return the symmetry group of the genomes.
-
cgt.rearrangements module¶
-
cgt.rearrangements.
all_adjacent_transpositions_representatives
(framework, num_regions=None)¶
-
cgt.rearrangements.
all_inversions_representatives
(framework, num_regions=None)¶
-
cgt.rearrangements.
c_perm
(n)¶
-
cgt.rearrangements.
conjugacy_class
(framework, perm)¶
-
cgt.rearrangements.
css
(cuts_set)¶
-
cgt.rearrangements.
cuts
(framework, sigma)¶
-
cgt.rearrangements.
double_coset
(framework, perm)¶
-
cgt.rearrangements.
permutation_with_cuts
(framework, cuts, perm=None, start=None)¶
-
cgt.rearrangements.
single_coset
(framework, perm)¶
cgt.structures module¶
A place for functions which return important but general algebraic structures requiring only a few parameters
-
cgt.structures.
HyperoctahedralGroup
(n)¶ Return the hyperoctahedral group of order (2^n)n! as a subgroup of SymmetricGroup(2n) with two generators.
-
cgt.structures.
set_plus_minus
(n)¶
cgt.visualisation module¶
-
cgt.visualisation.
draw_genome_instance
(framework, instance, show=False)¶ Produce a drawing of a circular genome with n regions
cgt.simulations module¶
-
cgt.simulations.
draw_branch_length
()¶
-
cgt.simulations.
draw_genome
(formal_sum)¶
-
cgt.simulations.
draw_rearrangement
(model)¶
-
cgt.simulations.
draw_tree
(tree)¶
-
cgt.simulations.
evolve_on_tree
(tree, framework, model, root='random')¶
-
cgt.simulations.
grow_tree
(framework, model, num_taxa)¶
-
cgt.simulations.
newick_to_tree
(newick_string)¶
-
cgt.simulations.
set_node
(node, framework, genome)¶
cgt.hyperoctahedral module¶
This file contains the class for the hyperoctahedral group.
-
cgt.hyperoctahedral.
HyperoctahedralGroupRepresentations
(n)¶ This function returns the irreducible representations of the hyperoctahedral group of order 2^n * n!, indexed by pairs of partitions of l and m, where l + m = n.
- Parameters
n (int) – The order of the hyperoctahedral group.
- Returns
A dictionary of the irreducible representations of the hyperoctahedral group of order 2^n * n!
- Return type
dict
-
class
cgt.hyperoctahedral.
hyperoctahedral_group
(n, default_element_type='signed_permutation')¶ Bases:
object
This class represents the hyperoctahedral group of order 2^n * n!. It stores the group as a number of different isomorphic structures, as well as a number of other things that are helpful to keep track of in the context of genome rearrangements. It implements functions to compute the irreducible representations of the hyperoctahedral group.
-
Phi_inv
(elt)¶
-
character_for_tuple
(c)¶
-
element_in_little_subgroup
(elt, l, m)¶ Takes elt as an element of the little subgroup of Sn of order l!m! and returns an element of the direct product of S_l and S_m.
-
hyperoctahedral_twist
(s, c)¶ This function implements the twist of the hyperoctahedral group, which is the action of S_n on (C_2)^n by permuting the factors.
- Parameters
s (Permutation) – A permutation in S_n.
c (tuple) – A tuple of elements of C_2.
- Returns
The tuple of elements of C_2 obtained by applying s to the tuple c.
- Return type
an element of (C_2)^n
-
irrep
(partition_pair)¶ Returns the irreducible representation of the hyperoctahedral group of order 2^n * n! indexed by the pair of partitions of l and m where l + m = n. The representation is given as a function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.
- Parameters
partition_pair (tuple) – A pair of partitions of l and m where l + m = n.
- Returns
A function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.
- Return type
function (group element -> real matrix)
-
little_subgroup
(l, m, as_direct_product=False)¶
-
little_subgroup_pairs
()¶
-
little_subgroups
()¶
-
orbit_representative
(l, m)¶
-
representation_little_subgroup
(partition_pair)¶
-
right_transversal
(group, subgroup)¶
-
semidirect_product_with_hyperoctahedral_twist
(H, G)¶ Compute a semi-direct product of H and G with the hyperoctahedral twist (G acts on H^n by permuting factors). The group returned is a group of pairs (c, s) where c is an element of H^n and s is an element of G. The group operation is given by (c1, s1) * (c2, s2) = (c1 * s1(c2), s1 * s2).
- Parameters
H (group) – the direct product of n copies of some group.
G (group) – A group.
- Returns
The wreath product of H and G with the hyperoctahedral twist.
- Return type
group
-