cgt package
cgt.distances module
Implements a number of distance measures for genomes under the position paradigm.
The primary function of this module is the distance_matrix function, which returns a distance matrix for a given set of genomes and distance measure.
- The distance measures implemented are:
min: the minimum number of rearrangements required to transform one genome into another
min_weighted: the minimum number of rearrangements required to transform one genome into another, weighted by the inverse of the probability of the rearrangement
MFPT: the mean first passage time from the identity to a given genome, where the target is an absorbing state
MLE: the maximum likelihood estimate of the time elapsed between the identity and a given genome
Individual likelihood functions for the time elapsed between the identity and a given genome can also be obtained
- cgt.distances.MFPT(framework, model, genome_reps=None, scale_by=1)
Return the mean time elapsed rearranging ref->genome where the target is an absorbing state
- cgt.distances.dict_to_distance_matrix(distances, framework, genomes=None)
If need to convert to pairwise distances, supply a list of genomes.
- cgt.distances.discrete_MFPT(framework, model, genome_reps=None, verbose=False)
- cgt.distances.distance(framework, model, genome_instance, distance_measure)
- cgt.distances.distance_between(framework, model, genome_instance_1, genome_instance_2, distance_measure)
- cgt.distances.distance_matrix(framework, model, genomes, distance, replace_nan_with=nan)
Compute a distance matrix for a given set of genomes and distance measure.
- Parameters:
framework (PositionParadigmFramework) – the framework
model (Model) – the model
genomes (list) – a list of genomes to compute distances for
distance (DISTANCE) – the distance measure to use
- cgt.distances.fast_MFPT(framework, model)
- cgt.distances.first_nonzero_value(framework, function, limit=None)
- cgt.distances.genomes_for_dist_matrix(framework, genomes)
- cgt.distances.get_distance_function(distance_type)
- cgt.distances.likelihood_function(framework, model, genome, use_eigenvectors=True)
Return the likelihood function for a given genome
- cgt.distances.maximise(framework, L, max_time=None)
Return the time that maximises likelihood function L, using additional information from the framework
- Parameters:
framework (PositionParadigmFramework) – the framework
L (function) – the likelihood function to maximise
max_time (float) – the maximum time to consider (default: 100)
- Returns:
the time that maximises the likelihood function
- Return type:
float
- cgt.distances.min_distance(framework, model, genome_reps=None, weighted=False)
Return dictionary of minimum distances ref->genome, using inverse of model probabilities as weights (or not)
- cgt.distances.min_distance_using_irreps(framework, model, genome_reps=None)
- cgt.distances.mle(framework, model, genome_instance, show_work=False)
Returns the maximum likelihood estimate for a given genome instance under the given model and framework.
- Parameters:
framework (PositionParadigmFramework) – the framework
model (Model) – the model
genome_instance (group element) – the genome instance to compute the MLE for
- Returns:
the maximum likelihood estimate
- Return type:
float
- cgt.distances.mles(framework, model, genome_instances=None, verbose=False, show_work=False)
Returns a dictionary of maximum likelihood estimates for each genome instance under the given model and framework.
- Parameters:
framework (PositionParadigmFramework) – the framework
model (Model) – the model
genome_instances (list) – a list of genome instances to compute MLEs for. If None, all genomes in the framework are used.
verbose (bool) – whether to print progress
- Returns:
a dictionary of maximum likelihood estimates, indexed by genome instance
- Return type:
dict
- cgt.distances.prob_to_reach_in_steps_func(framework, model, sigma, use_eigenvectors=True)
- cgt.distances.probability_to_reach_in_steps(framework, model, sigma, k)
cgt.enums module
- class cgt.enums.ALGEBRA(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- genome = 2
- genome_class = 3
- group = 1
- class cgt.enums.CLASSES(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- conjugacy_classes = 1
- cosets = 2
- double_cosets = 3
- class cgt.enums.DATA(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- eig_data = 'eigen_data'
- eigval_lists = 'eigenvalue_lists'
- eigval_sets = 'eigenvalue_sets'
- eigval_sets_old = 'eig_lists'
- eigvec_lists = 'eigenvector_lists'
- eigvec_mat_inv = 'eigenvector_matrix_inverses'
- irreps_z = 'irreps_of_z'
- irreps_z_np = 'irreps_of_z_np'
- irreps_zs = 'irreps_of_zs'
- partial_traces = 'partial_traces'
- projections = 'projections'
- reg_rep_of_zs = 'reg_rep_of_zs'
- class cgt.enums.DISPLAY(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- arrows = 1
- cycles = 3
- one_row = 2
- class cgt.enums.DISTANCE(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- DMFPT = 5
- MFPT = 3
- MLE = 4
- discrete_MFPT = 5
- discrete_mean_first_passage_time = 5
- maximum_likelihood_distance = 4
- maximum_likelihood_estimate = 4
- mean_first_passage_time = 3
- min = 1
- min_weighted = 2
- minimum = 1
- minimum_distance = 1
- minimum_weighted = 2
- class cgt.enums.FORMAT(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- dictionary = 3
- equiv_classes = 2
- formal_sum = 1
- only_reps = 4
- class cgt.enums.IRREP_TYPE(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- orthogonal = 2
- seminormal = 3
- specht = 1
- speck = 1
- class cgt.enums.MODEL(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- all_inversions = 'all inversions'
- all_inversions_larger_less_likely = 'all inversions, with larger inversions less likely'
- all_transpositions = 'all transpositions'
- one_region_inversions = 'inversions of a single region'
- one_region_moves = 'Move a single region, with or without inversion'
- one_region_moves_without_inversions = 'Move a single region, without inversion'
- three_region_transpositions = 'transpositions involving a segment of three regions, with or without inversion'
- two_region_adjacent_transpositions = 'transpositions of two adjacent regions'
- two_region_inversions = 'inversions of two adjacent regions'
- two_region_revrevs = 'Take two adjacent regions and invert both of them'
- two_region_transpositions = 'Two region adjacent transpositions, with or without inversion'
- two_region_transpositions_without_inversions = 'Two region adjacent transpositions, without inversion'
- class cgt.enums.SET(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
- one_row = 3
- signed_cycles = 1
- unsigned_cycles = 2
- wreath = 4
- wreath_s2 = 5
cgt.examples module
- cgt.examples.example(n)
cgt.genome module
- class cgt.genome.Genome(framework, genome)
Bases:
object
Optional wrapper for genomes, so that they can be printed/operated on in a more pythonic way.
cgt.models module
- class cgt.models.Model(framework, generating_dictionary)
Bases:
object
Defines a model. A model consists of some collection of permutations and a map from these permutations to probabilities [0,1]
- classmethod named_model_with_relative_probs(framework, named_model_dictionary)
Create a model from a dictionary of named classes of rearrangements and their relative probabilities.
- Parameters:
framework (PositionParadigmFramework) – The framework to create the model in
named_model_dictionary (dict) – A dictionary of named classes of rearrangements and their relative probabilities.
- Returns:
A model with the specified named classes of rearrangements and their relative probabilities.
- Return type:
- reg_rep_of_zs()
Return the regular representation of zs as comptued by PositionParadigmFramework.reg_rep_zs, but store the sparse result
- s_element(in_algebra=ALGEBRA.genome)
cgt.position_paradigm module
position_paradigm is the most important module in cgt. PositionParadigmFramework keeps track of important objects, can convert group elements and genomes between different forms, and instances can be passed to functions from other modules so that they can acces information without it needing to be recreated.
- cgt.position_paradigm.Framework
alias of
PositionParadigmFramework
- cgt.position_paradigm.GenomeFramework
alias of
PositionParadigmFramework
- class cgt.position_paradigm.PositionParadigmFramework(num_regions, oriented=True, symmetry=SYMMETRY.circular, genome_type=TYPE.reg_to_signed_pos, precompute_irreps=False)
Bases:
object
Everything you need for working with genomes under the position paradigm
- canonical_double_cosets(join_inverse_classes=False)
- canonical_instance(instance)
Return the ‘canonical’ instance of the genome represented by the permutation if there is one.
Examples
sage: import cgt; ppf = cgt.PositionParadigmFramework(3, symmetry=cgt.SYMMETRY.circular) sage: ppf.canonical_instance(‘(1,-2)(-1,2)’) [1, 2, -3]
- coefficient_in(more_terms, fewer_terms, none_if_fail=False)
For example, the coefficient of (a/2 + b/2) in x=(a/3 + b/3 + c/3) is 2/3, since x=2/3*(a/2 + b/2) + c/3
- collect_genome_terms(formal_sum, display=DISPLAY.one_row)
Return a weighted sum of genomes, represented as [canonical representation]z
- cycles(element)
- draw_instance(instance, shortened=False)
Return a one-line drawing of a genome using arrows indicating orientation and elipsis indicating rotational symmetry
- fast_canonical_instance_generator(cycles=False)
- fast_reg_rep_of_zs(model)
- genome(instance, format=None)
- genome_canonical_instances()
Return a set of all canonical genome instances
- genome_canonical_instances_generator()
- genome_conjugacy_classes(sort_classes=True)
Return conjugacy classes of instances
- genome_equivalence_classes(combine_inverse_classes=False, sort_classes=True)
Return double cosets of instances—genomes in the same class will have the same likelihood functions
- genome_group()
Return the permutation group containing genome instances.
- genomes(format=FORMAT.dictionary, sort_genomes=True)
- group_algebra()
Return the group alegbra, where the group is the group containing genome instances.
- identity_instance()
Return the identity permutation.
- instances(genome)
Return a list of instances of the genome.
- irreps(element=None)
Return a complete list of pairwise irreducible representations of the genome group.
- Parameters:
element – return the image of element for each irreducible representation.
- Returns:
a complete list of pairwise irreducible representations of the genome group, or a list of matrices if element is not None. Output is cached so that irreps are computed only once per session. Matrix entries sit inside the UniversalCyclotomicField and are exact expressions.
- num_genomes()
Return the number of distinct genomes up to symmetries.
- one_row(element, as_list=False)
Return a given genome instance in one-row notation.
- Parameters:
element – the genome instance to represent in one-row notation. Multiple formats accepted.
as_list – if true, return as a list of images, otherwise return a sage permutation.
Examples
sage: import cgt sage: ppf = cgt.PositionParadigmFramework(3) sage: ppf.cycles(ppf.one_row(‘(1,-2)(-1,2)’)) == ppf.cycles(‘(1,-2)(-1,2)’) True
- random_genome(format=FORMAT.formal_sum)
Return a random genome
- random_instance(genome=None)
Return a random permutation corresponding to a given genome, or a random genome if none is given.
- reg_rep_of_zs(model, to_adjacency_matrix=False, sparse=True)
- regular_representation(g)
Return the regular representation of a single element
- standard_reflection()
Return a permutation which reflects a genome instance about the center region (n odd), or the center two regions (n even).
- standard_rotation()
Return a permutation which rotates positions clockwise by one position.
- symmetry_element()
Return the symmetry element: a convex sum of elements from symmetry_group().
- symmetry_group()
Return the symmetry group of the genomes.
cgt.rearrangements module
- cgt.rearrangements.all_adjacent_transpositions_representatives(framework, num_regions=None)
- cgt.rearrangements.all_inversion_instances(framework, length=None)
Return all instances that represent an inversion of the given length
- Parameters:
framework (PositionParadigmFramework) – The framework to create the inversion in
length (int, optional) – The length of the inversion. Defaults to None (all inversions).
- Returns:
The set of all instances that represent an inversion of the given length
- Return type:
set
- cgt.rearrangements.all_inversions_representatives(framework, num_regions=None)
Return the representatives of all inversions in the symmetry group of framework
- Parameters:
framework (PositionParadigmFramework) – The framework to create the transposition in
num_regions (int, optional) – The number of regions to invert. Defaults to None (all inversions).
- Returns:
The set of representatives of all inversions in the symmetry group of framework
- Return type:
set
- cgt.rearrangements.all_transposition_instances(framework, scope_limit=None, single_segment_limit=None, with_inversion=True, include_revrevs=False, only_revrevs=False, canonical_reps_only=False)
Return all instances that represent a transposition of the given length
- Parameters:
framework (PositionParadigmFramework) – The framework to create the transposition in
- Returns:
The set of all instances that represent a transposition
- Return type:
set
- cgt.rearrangements.c_perm(n)
Return the permutation (1,…,n)(-n,…,1))
- cgt.rearrangements.conjugacy_class(framework, perm)
Return the conjugacy class of perm in the symmetry group of framework
- cgt.rearrangements.cut_positions(framework, sigma)
The set of cuts required to perform a rearrangement represented by sigma
- Parameters:
framework (PositionParadigmFramework) – The framework to create the transposition in
sigma (Permutation) – The permutation representing the rearrangement
- Returns:
The set of cuts required to perform the rearrangement represented by sigma
- Return type:
set
- cgt.rearrangements.double_coset(framework, perm)
Return the double coset of perm in the symmetry group of framework
- cgt.rearrangements.fast_all_inversion_reps(framework)
- cgt.rearrangements.inversion(framework, about_position, length)
Return an instance of the inversion that inverts a segment in a genome.
- Parameters:
framework (PositionParadigmFramework) – The framework to create the inversion in
about_position (int) – The position to invert about
length (int) – The length of the segment to invert
- Returns:
An instance representing the inversion that inverts the segment
- Return type:
Permutation
- cgt.rearrangements.permutation_with_cuts(framework, cuts, perm=None, start=None)
A generator for all permutations with the specified set of cuts.
- Parameters:
framework (PositionParadigmFramework) – The framework to create the transposition in
cuts (set) – The set of cuts to use
- Yields:
Permutation – A permutation with the specified set of cuts
- cgt.rearrangements.representatives(framework, set_of_permutations, classes=CLASSES.double_cosets, prioritise_string_length=False)
- cgt.rearrangements.segment_length(n, start, end)
Return the length of the segment from start to end
- cgt.rearrangements.segment_midpoint(n, start, end)
Return the midpoint of the segment from start to end
- cgt.rearrangements.signed_inversion(framework, about_position, length)
Return an instance of the inversion that inverts a segment in a genome.
- Parameters:
framework (PositionParadigmFramework) – The framework to create the inversion in
about_position (int) – The position to invert about
length (int) – The length of the segment to invert
- Returns:
An instance representing the inversion that inverts the segment
- Return type:
Permutation
- cgt.rearrangements.single_coset(framework, perm)
Return the single coset of perm in the symmetry group of framework
- cgt.rearrangements.transposition(framework, sec_1, sec_2, inv_1=False, inv_2=False, revrev=False)
Return an instance of the transposition that swaps two segments in a genome.
- Parameters:
framework (PositionParadigmFramework) – The framework to create the transposition in
sec_1 (tuple) – The first section to transpose (tuple of start (included) and end (excluded) positions)
sec_2 (tuple) – The second section to transpose (tuple of start (included) and end (excluded) positions)
inv_1 (bool, optional) – Whether to also invert the first section. Defaults to False.
inv_2 (bool, optional) – Whether to also invert the second section. Defaults to False.
- Returns:
An instance representing the transposition that swaps the two segments
- Return type:
Permutation
cgt.structures module
A place for functions which return important but general algebraic structures requiring only a few parameters
- cgt.structures.HyperoctahedralGroup(n)
Return the hyperoctahedral group of order (2^n)n! as a subgroup of SymmetricGroup(2n) with two generators.
- cgt.structures.set_plus_minus(n)
cgt.visualisation module
- cgt.visualisation.draw_genome_instance(framework, instance, show=False)
Produce a drawing of a circular genome with n regions
cgt.simulations module
This module contains functions for simulating the evolution of genomes on a tree.
- cgt.simulations.draw_branch_length()
- cgt.simulations.draw_genome(formal_sum)
- cgt.simulations.draw_rearrangement(model)
- cgt.simulations.draw_tree(tree)
- cgt.simulations.evolve_on_tree(tree, framework, model, root='random', at_least_one_change=False, exactly_n_changes=None)
- cgt.simulations.grow_tree(framework, model, num_taxa)
- cgt.simulations.newick_to_tree(newick_string)
- cgt.simulations.set_node(node, framework, genome)
cgt.hyperoctahedral module
This file contains the class for the hyperoctahedral group.
- class cgt.hyperoctahedral.HyperoctahedralGroup(n, default_element_type='signed_permutation')
Bases:
object
This class represents the hyperoctahedral group of order 2^n * n!. It stores the group as a number of different isomorphic structures, as well as a number of other things that are helpful to keep track of in the context of genome rearrangements. It implements functions to compute the irreducible representations of the hyperoctahedral group.
- Phi(elt)
- character_for_tuple(c)
- element_in_little_subgroup(elt, l, m)
Takes elt as an element of the little subgroup of Sn of order l!m! and returns an element of the direct product of S_l and S_m.
- hyperoctahedral_twist(s, c)
This function implements the twist of the hyperoctahedral group, which is the action of S_n on (C_2)^n by permuting the factors.
- Parameters:
s (Permutation) – A permutation in S_n.
c (tuple) – A tuple of elements of C_2.
- Returns:
The tuple of elements of C_2 obtained by applying s to the tuple c.
- Return type:
an element of (C_2)^n
- irrep(partition_pair)
Returns the irreducible representation of the hyperoctahedral group of order 2^n * n! indexed by the pair of partitions of l and m where l + m = n. The representation is given as a function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.
- Parameters:
partition_pair (tuple) – A pair of partitions of l and m where l + m = n.
- Returns:
A function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.
- Return type:
function (group element -> real matrix)
- little_subgroup(l, m, as_direct_product=False)
- little_subgroup_pairs()
- little_subgroups()
- orbit_representative(l, m)
- representation_little_subgroup(partition_pair)
- right_transversal(group, subgroup)
- right_transversal_of_little_subgroup(k)
- semidirect_product_with_hyperoctahedral_twist(G, N)
Compute a semi-direct product of H and G with the hyperoctahedral twist (G acts on H^n by permuting factors). The group returned is a group of pairs (c, s) where c is an element of H^n and s is an element of G. The group operation is given by (c1, s1) * (c2, s2) = (c1 * s1(c2), s1 * s2).
- Parameters:
H (group) – the direct product of n copies of some group.
G (group) – A group.
- Returns:
The wreath product of H and G with the hyperoctahedral twist.
- Return type:
group
- cgt.hyperoctahedral.HyperoctahedralGroupRepresentations(n)
This function returns the irreducible representations of the hyperoctahedral group of order 2^n * n!, indexed by pairs of partitions of l and m, where l + m = n.
- Parameters:
n (int) – The order of the hyperoctahedral group.
- Returns:
A dictionary of the irreducible representations of the hyperoctahedral group of order 2^n * n!
- Return type:
dict
cgt.sorting module
Implements sorting key functions for CGT.
cgt.pickle_manager module
- cgt.pickle_manager.retrieve_irrep_of_z(n=None)
Retrieve the irreps of z
cgt.constants module
Declare constants for the CGT package.