cgt package

cgt.distances module

Implements a number of distance measures for genomes under the position paradigm.

The primary function of this module is the distance_matrix function, which returns a distance matrix for a given set of genomes and distance measure.

The distance measures implemented are:
  • min: the minimum number of rearrangements required to transform one genome into another

  • min_weighted: the minimum number of rearrangements required to transform one genome into another, weighted by the inverse of the probability of the rearrangement

  • MFPT: the mean first passage time from the identity to a given genome, where the target is an absorbing state

  • MLE: the maximum likelihood estimate of the time elapsed between the identity and a given genome

Individual likelihood functions for the time elapsed between the identity and a given genome can also be obtained

cgt.distances.MFPT(framework, model, genome_reps=None, scale_by=1)

Return the mean time elapsed rearranging ref->genome where the target is an absorbing state

cgt.distances.dict_to_distance_matrix(distances, framework, genomes=None)

If need to convert to pairwise distances, supply a list of genomes.

cgt.distances.distance_matrix(framework, model, genomes, distance)

Compute a distance matrix for a given set of genomes and distance measure.

Parameters
  • framework (Framework) – the framework

  • model (Model) – the model

  • genomes (list) – a list of genomes to compute distances for

  • distance (DISTANCE) – the distance measure to use

cgt.distances.likelihood_function(framework, model, genome, attempt_exact=False, use_projections=True)

Return the likelihood function for a given genome

cgt.distances.maximise(framework, L, max_time=100)

Return the time that maximises likelihood function L, using additional information from the framework

Parameters
  • framework (Framework) – the framework

  • L (function) – the likelihood function

  • max_time (float) – the maximum time to consider

Returns

the time that maximises the likelihood function

Return type

float

cgt.distances.min_distance(framework, model, genome_reps=None, weighted=False)

Return dictionary of minimum distances ref->genome, using inverse of model probabilities as weights (or not)

cgt.distances.mle(framework, model, genome_instance)

Returns the maximum likelihood estimate for a given genome instance under the given model and framework.

Parameters
  • framework (Framework) – the framework

  • model (Model) – the model

  • genome_instance (group element) – the genome instance to compute the MLE for

Returns

the maximum likelihood estimate

Return type

float

cgt.distances.mles(framework, model, genome_instances=None, verbose=False)

Returns a dictionary of maximum likelihood estimates for each genome instance under the given model and framework.

Parameters
  • framework (Framework) – the framework

  • model (Model) – the model

  • genome_instances (list) – a list of genome instances to compute MLEs for. If None, all genomes in the framework are used.

  • verbose (bool) – whether to print progress

Returns

a dictionary of maximum likelihood estimates, indexed by genome instance

Return type

dict

cgt.enums module

class cgt.enums.ALGEBRA(value)

Bases: enum.Enum

An enumeration.

genome = 2
genome_class = 3
group = 1
class cgt.enums.CLASSES(value)

Bases: enum.Enum

An enumeration.

conjugacy_classes = 1
cosets = 2
double_cosets = 3
class cgt.enums.DISPLAY(value)

Bases: enum.Enum

An enumeration.

arrows = 1
cycles = 3
one_row = 2
class cgt.enums.DISTANCE(value)

Bases: enum.Enum

An enumeration.

MFPT = 3
MLE = 4
maximum_likelihood_distance = 4
maximum_likelihood_estimate = 4
min = 1
min_first_passage_time = 3
min_weighted = 2
minimum = 1
minimum_weighted = 2
class cgt.enums.FORMAT(value)

Bases: enum.Enum

An enumeration.

dictionary = 3
equiv_classes = 2
formal_sum = 1
only_reps = 4
class cgt.enums.IRREP_TYPE(value)

Bases: enum.Enum

An enumeration.

orthogonal = 2
seminormal = 3
specht = 1
speck = 1
class cgt.enums.MODEL(value)

Bases: enum.Enum

An enumeration.

all_inversions = 'all inversions'
one_region_inversions = 'inversions of a single region'
one_region_swaps = 'adjacent transpositions'
two_region_adjacent_transpositions = 'transpositions of two adjacent regions'
two_region_inversions = 'inversions of two adjacent regions'
class cgt.enums.SET(value)

Bases: enum.Enum

An enumeration.

one_row = 3
signed_cycles = 1
unsigned_cycles = 2
wreath = 4
wreath_s2 = 5
class cgt.enums.SYMMETRY(value)

Bases: enum.Enum

An enumeration.

C_2 = 2
D_n = 1
S_2 = 2
circular = 1
dihedral = 1
flip = 2
linear = 2
reflection = 2
class cgt.enums.TYPE(value)

Bases: enum.Enum

An enumeration.

pos_to_signed_reg = 2
reg_to_signed_pos = 1
signed_pos_to_reg = 4
signed_reg_to_pos = 3

cgt.examples module

cgt.examples.example(n)

cgt.genome module

class cgt.genome.Genome(framework, genome)

Bases: object

Optional wrapper for genomes, so that they can be printed/operated on in a more pythonic way.

cgt.models module

class cgt.models.Model(framework, generating_dictionary)

Bases: object

Defines a model. A model consists of some collection of permutations and a map from these permutations to probabilities [0,1]

classmethod named_model_with_relative_probs(framework, named_model_dictionary)
reg_rep()
reg_rep_of_zs()

Return the regular representation of zs as comptued by PositionParadigmFramework.reg_rep_zs, but store the sparse result

s_element(in_algebra=<ALGEBRA.genome: 2>)

cgt.position_paradigm module

position_paradigm is the most important module in cgt. PositionParadigmFramework keeps track of important objects, can convert group elements and genomes between different forms, and instances can be passed to functions from other modules so that they can acces information without it needing to be recreated.

class cgt.position_paradigm.PositionParadigmFramework(num_regions, oriented=True, symmetry=<SYMMETRY.circular: 1>, genome_type=<TYPE.reg_to_signed_pos: 1>)

Bases: object

Everything you need for working with genomes under the position paradigm

canonical_instance(instance)

Return the ‘canonical’ instance of the genome represented by the permutation if there is one.

Examples

sage: import cgt; ppf = cgt.PositionParadigmFramework(3, symmetry=cgt.SYMMETRY.circular) sage: ppf.canonical_instance(‘(1,-2)(-1,2)’) [1, 2, -3]

coefficient_in(more_terms, fewer_terms, none_if_fail=False)

For example, the coefficient of (a/2 + b/2) in x=(a/3 + b/3 + c/3) is 2/3, since x=2/3*(a/2 + b/2) + c/3

collect_genome_terms(formal_sum, display=<DISPLAY.one_row: 2>)

Return a weighted sum of genomes, represented as [canonical representation]z

cycles(element)
draw_instance(instance, shortened=False)

Return a one-line drawing of a genome using arrows indicating orientation and elipsis indicating rotational symmetry

genome(instance, format=None)
genome_conjugacy_classes(sort_classes=True)

Return conjugacy classes of instances

genome_equivalence_classes(combine_inverse_classes=False, sort_classes=True)

Return double cosets of instances—genomes in the same class will have the same likelihood functions

genome_group()

Return the permutation group containing genome instances.

genomes(format=<FORMAT.dictionary: 3>, sort_genomes=True)
group_algebra()

Return the group alegbra, where the group is the group containing genome instances.

instances(genome)

Return a list of instances of the genome.

irreps(element=None)

Return a complete list of pairwise irreducible representations of the genome group.

Parameters

element – return the image of element for each irreducible representation.

Returns

a complete list of pairwise irreducible representations of the genome group, or a list of matrices if element is not None. Output is cached so that irreps are computed only once per session. Matrix entries sit inside the UniversalCyclotomicField and are exact expressions.

matrix(element)

Return the defining representation matrix. Note that matrix(x)matrix(y) = matrix(yx)

num_genomes()

Return the number of distinct genomes up to symmetries.

one_row(element, as_list=False)

Return a given genome instance in one-row notation.

Parameters
  • element – the genome instance to represent in one-row notation. Multiple formats accepted.

  • as_list – if true, return as a list of images, otherwise return a sage permutation.

Examples

sage: import cgt sage: ppf = cgt.PositionParadigmFramework(3) sage: ppf.cycles(ppf.one_row(‘(1,-2)(-1,2)’)) == ppf.cycles(‘(1,-2)(-1,2)’) True

random_genome(format=<FORMAT.formal_sum: 1>)

Return a random genome

random_instance(genome=None)

Return a random permutation corresponding to a given genome, or a random genome if none is given.

reg_rep_of_zs(model, to_adjacency_matrix=False, sparse=True)
regular_representation(g)

Return the regular representation of a single element

standard_reflection()

Return a permutation which reflects a genome instance about the center region (n odd), or the center two regions (n even).

standard_rotation()

Return a permutation which rotates positions clockwise by one position.

symmetry_element()

Return the symmetry element: a convex sum of elements from symmetry_group().

symmetry_group()

Return the symmetry group of the genomes.

cgt.rearrangements module

cgt.rearrangements.all_adjacent_transpositions_representatives(framework, num_regions=None)
cgt.rearrangements.all_inversions_representatives(framework, num_regions=None)
cgt.rearrangements.c_perm(n)
cgt.rearrangements.conjugacy_class(framework, perm)
cgt.rearrangements.css(cuts_set)
cgt.rearrangements.cuts(framework, sigma)
cgt.rearrangements.double_coset(framework, perm)
cgt.rearrangements.permutation_with_cuts(framework, cuts, perm=None, start=None)
cgt.rearrangements.single_coset(framework, perm)

cgt.structures module

A place for functions which return important but general algebraic structures requiring only a few parameters

cgt.structures.HyperoctahedralGroup(n)

Return the hyperoctahedral group of order (2^n)n! as a subgroup of SymmetricGroup(2n) with two generators.

cgt.structures.set_plus_minus(n)

cgt.visualisation module

cgt.visualisation.draw_genome_instance(framework, instance, show=False)

Produce a drawing of a circular genome with n regions

cgt.simulations module

cgt.simulations.draw_branch_length()
cgt.simulations.draw_genome(formal_sum)
cgt.simulations.draw_rearrangement(model)
cgt.simulations.draw_tree(tree)
cgt.simulations.evolve_on_tree(tree, framework, model, root='random')
cgt.simulations.grow_tree(framework, model, num_taxa)
cgt.simulations.newick_to_tree(newick_string)
cgt.simulations.set_node(node, framework, genome)

cgt.hyperoctahedral module

This file contains the class for the hyperoctahedral group.

cgt.hyperoctahedral.HyperoctahedralGroupRepresentations(n)

This function returns the irreducible representations of the hyperoctahedral group of order 2^n * n!, indexed by pairs of partitions of l and m, where l + m = n.

Parameters

n (int) – The order of the hyperoctahedral group.

Returns

A dictionary of the irreducible representations of the hyperoctahedral group of order 2^n * n!

Return type

dict

class cgt.hyperoctahedral.hyperoctahedral_group(n, default_element_type='signed_permutation')

Bases: object

This class represents the hyperoctahedral group of order 2^n * n!. It stores the group as a number of different isomorphic structures, as well as a number of other things that are helpful to keep track of in the context of genome rearrangements. It implements functions to compute the irreducible representations of the hyperoctahedral group.

Phi_inv(elt)
character_for_tuple(c)
element_in_little_subgroup(elt, l, m)

Takes elt as an element of the little subgroup of Sn of order l!m! and returns an element of the direct product of S_l and S_m.

hyperoctahedral_twist(s, c)

This function implements the twist of the hyperoctahedral group, which is the action of S_n on (C_2)^n by permuting the factors.

Parameters
  • s (Permutation) – A permutation in S_n.

  • c (tuple) – A tuple of elements of C_2.

Returns

The tuple of elements of C_2 obtained by applying s to the tuple c.

Return type

an element of (C_2)^n

irrep(partition_pair)

Returns the irreducible representation of the hyperoctahedral group of order 2^n * n! indexed by the pair of partitions of l and m where l + m = n. The representation is given as a function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.

Parameters

partition_pair (tuple) – A pair of partitions of l and m where l + m = n.

Returns

A function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.

Return type

function (group element -> real matrix)

little_subgroup(l, m, as_direct_product=False)
little_subgroup_pairs()
little_subgroups()
orbit_representative(l, m)
representation_little_subgroup(partition_pair)
right_transversal(group, subgroup)
semidirect_product_with_hyperoctahedral_twist(H, G)

Compute a semi-direct product of H and G with the hyperoctahedral twist (G acts on H^n by permuting factors). The group returned is a group of pairs (c, s) where c is an element of H^n and s is an element of G. The group operation is given by (c1, s1) * (c2, s2) = (c1 * s1(c2), s1 * s2).

Parameters
  • H (group) – the direct product of n copies of some group.

  • G (group) – A group.

Returns

The wreath product of H and G with the hyperoctahedral twist.

Return type

group