cgt package

cgt.distances module

Implements a number of distance measures for genomes under the position paradigm.

The primary function of this module is the distance_matrix function, which returns a distance matrix for a given set of genomes and distance measure.

The distance measures implemented are:

min: the minimum number of rearrangements required to transform one genome into another
min_weighted: the minimum number of rearrangements required to transform one genome into another, weighted by the inverse of the probability of the rearrangement
MFPT: the mean first passage time from the identity to a given genome, where the target is an absorbing state
MLE: the maximum likelihood estimate of the time elapsed between the identity and a given genome

Individual likelihood functions for the time elapsed between the identity and a given genome can also be obtained

cgt.distances.MFPT(framework, model, genome_reps=None, scale_by=1): Return the mean time elapsed rearranging ref->genome where the target is an absorbing state

cgt.distances.dict_to_distance_matrix(distances, framework, genomes=None): If need to convert to pairwise distances, supply a list of genomes.

cgt.distances.discrete_MFPT(framework, model, genome_reps=None, verbose=False)

cgt.distances.distance(framework, model, genome_instance, distance_measure)

cgt.distances.distance_between(framework, model, genome_instance_1, genome_instance_2, distance_measure)

cgt.distances.distance_matrix(framework, model, genomes, distance, replace_nan_with=nan)

Compute a distance matrix for a given set of genomes and distance measure.

Parameters:

framework (PositionParadigmFramework) – the framework
model (Model) – the model
genomes (list) – a list of genomes to compute distances for
distance (DISTANCE) – the distance measure to use

cgt.distances.fast_MFPT(framework, model)

cgt.distances.first_nonzero_value(framework, function, limit=None)

cgt.distances.genomes_for_dist_matrix(framework, genomes)

cgt.distances.get_distance_function(distance_type)

cgt.distances.likelihood_function(framework, model, genome, use_eigenvectors=True): Return the likelihood function for a given genome

cgt.distances.maximise(framework, L, max_time=None)

Return the time that maximises likelihood function L, using additional information from the framework

Parameters:

framework (PositionParadigmFramework) – the framework
L (function) – the likelihood function to maximise
max_time (float) – the maximum time to consider (default: 100)

Returns:

the time that maximises the likelihood function

Return type:

float

cgt.distances.min_distance(framework, model, genome_reps=None, weighted=False): Return dictionary of minimum distances ref->genome, using inverse of model probabilities as weights (or not)

cgt.distances.min_distance_using_irreps(framework, model, genome_reps=None)

cgt.distances.mle(framework, model, genome_instance, show_work=False)

Returns the maximum likelihood estimate for a given genome instance under the given model and framework.

Parameters:

framework (PositionParadigmFramework) – the framework
model (Model) – the model
genome_instance (group element) – the genome instance to compute the MLE for

Returns:

the maximum likelihood estimate

Return type:

float

cgt.distances.mles(framework, model, genome_instances=None, verbose=False, show_work=False)

Returns a dictionary of maximum likelihood estimates for each genome instance under the given model and framework.

Parameters:

framework (PositionParadigmFramework) – the framework
model (Model) – the model
genome_instances (list) – a list of genome instances to compute MLEs for. If None, all genomes in the framework are used.
verbose (bool) – whether to print progress

Returns:

a dictionary of maximum likelihood estimates, indexed by genome instance

Return type:

dict

cgt.distances.prob_to_reach_in_steps_func(framework, model, sigma, use_eigenvectors=True)

cgt.distances.probability_to_reach_in_steps(framework, model, sigma, k)

cgt.enums module

class cgt.enums.ALGEBRA(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

genome = 2

genome_class = 3

group = 1

class cgt.enums.CLASSES(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

conjugacy_classes = 1

cosets = 2

double_cosets = 3

class cgt.enums.DATA(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

eig_data = 'eigen_data'

eigval_lists = 'eigenvalue_lists'

eigval_sets = 'eigenvalue_sets'

eigval_sets_old = 'eig_lists'

eigvec_lists = 'eigenvector_lists'

eigvec_mat_inv = 'eigenvector_matrix_inverses'

irreps_z = 'irreps_of_z'

irreps_z_np = 'irreps_of_z_np'

irreps_zs = 'irreps_of_zs'

partial_traces = 'partial_traces'

projections = 'projections'

reg_rep_of_zs = 'reg_rep_of_zs'

class cgt.enums.DISPLAY(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

arrows = 1

cycles = 3

one_row = 2

class cgt.enums.DISTANCE(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

DMFPT = 5

MFPT = 3

MLE = 4

discrete_MFPT = 5

discrete_mean_first_passage_time = 5

maximum_likelihood_distance = 4

maximum_likelihood_estimate = 4

mean_first_passage_time = 3

min = 1

min_weighted = 2

minimum = 1

minimum_distance = 1

minimum_weighted = 2

class cgt.enums.FORMAT(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

dictionary = 3

equiv_classes = 2

formal_sum = 1

only_reps = 4

class cgt.enums.IRREP_TYPE(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

orthogonal = 2

seminormal = 3

specht = 1

speck = 1

class cgt.enums.MODEL(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

all_inversions = 'all inversions'

all_inversions_larger_less_likely = 'all inversions, with larger inversions less likely'

all_transpositions = 'all transpositions'

one_region_inversions = 'inversions of a single region'

one_region_moves = 'Move a single region, with or without inversion'

one_region_moves_without_inversions = 'Move a single region, without inversion'

three_region_transpositions = 'transpositions involving a segment of three regions, with or without inversion'

two_region_adjacent_transpositions = 'transpositions of two adjacent regions'

two_region_inversions = 'inversions of two adjacent regions'

two_region_revrevs = 'Take two adjacent regions and invert both of them'

two_region_transpositions = 'Two region adjacent transpositions, with or without inversion'

two_region_transpositions_without_inversions = 'Two region adjacent transpositions, without inversion'

class cgt.enums.SET(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

one_row = 3

signed_cycles = 1

unsigned_cycles = 2

wreath = 4

wreath_s2 = 5

class cgt.enums.SYMMETRY(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

C_2 = 2

D_n = 1

S_2 = 2

circular = 1

dihedral = 1

flip = 2

linear = 2

reflection = 2

class cgt.enums.TYPE(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

pos_to_signed_reg = 2

reg_to_signed_pos = 1

signed_pos_to_reg = 4

signed_reg_to_pos = 3

cgt.examples module

cgt.examples.example(n)

cgt.genome module

class cgt.genome.Genome(framework, genome)

Bases: object

Optional wrapper for genomes, so that they can be printed/operated on in a more pythonic way.

cgt.models module

class cgt.models.Model(framework, generating_dictionary)

Bases: object

Defines a model. A model consists of some collection of permutations and a map from these permutations to probabilities [0,1]

classmethod named_model_with_relative_probs(framework, named_model_dictionary)

Create a model from a dictionary of named classes of rearrangements and their relative probabilities.

Parameters:

framework (PositionParadigmFramework) – The framework to create the model in
named_model_dictionary (dict) – A dictionary of named classes of rearrangements and their relative probabilities.

Returns:

A model with the specified named classes of rearrangements and their relative probabilities.

Return type:

Model

reg_rep_of_zs(): Return the regular representation of zs as comptued by PositionParadigmFramework.reg_rep_zs, but store the sparse result

s_element(in_algebra=ALGEBRA.genome)

cgt.position_paradigm module

position_paradigm is the most important module in cgt. PositionParadigmFramework keeps track of important objects, can convert group elements and genomes between different forms, and instances can be passed to functions from other modules so that they can acces information without it needing to be recreated.

cgt.position_paradigm.Framework: alias of PositionParadigmFramework

cgt.position_paradigm.GenomeFramework: alias of PositionParadigmFramework

class cgt.position_paradigm.PositionParadigmFramework(num_regions, oriented=True, symmetry=SYMMETRY.circular, genome_type=TYPE.reg_to_signed_pos, precompute_irreps=False)

Bases: object

Everything you need for working with genomes under the position paradigm

canonical_double_cosets(join_inverse_classes=False)

canonical_instance(instance)

Return the ‘canonical’ instance of the genome represented by the permutation if there is one.

Examples

sage: import cgt; ppf = cgt.PositionParadigmFramework(3, symmetry=cgt.SYMMETRY.circular) sage: ppf.canonical_instance(‘(1,-2)(-1,2)’) [1, 2, -3]

coefficient_in(more_terms, fewer_terms, none_if_fail=False): For example, the coefficient of (a/2 + b/2) in x=(a/3 + b/3 + c/3) is 2/3, since x=2/3*(a/2 + b/2) + c/3

collect_genome_terms(formal_sum, display=DISPLAY.one_row): Return a weighted sum of genomes, represented as [canonical representation]z

cycles(element)

draw_instance(instance, shortened=False): Return a one-line drawing of a genome using arrows indicating orientation and elipsis indicating rotational symmetry

fast_canonical_instance_generator(cycles=False)

fast_reg_rep_of_zs(model)

genome(instance, format=None)

genome_canonical_instances(): Return a set of all canonical genome instances

genome_canonical_instances_generator()

genome_conjugacy_classes(sort_classes=True): Return conjugacy classes of instances

genome_equivalence_classes(combine_inverse_classes=False, sort_classes=True): Return double cosets of instances—genomes in the same class will have the same likelihood functions

genome_group(): Return the permutation group containing genome instances.

genomes(format=FORMAT.dictionary, sort_genomes=True)

group_algebra(): Return the group alegbra, where the group is the group containing genome instances.

identity_instance(): Return the identity permutation.

instances(genome): Return a list of instances of the genome.

irreps(element=None)

Return a complete list of pairwise irreducible representations of the genome group.

Parameters:: element – return the image of element for each irreducible representation.
Returns:: a complete list of pairwise irreducible representations of the genome group, or a list of matrices if element is not None. Output is cached so that irreps are computed only once per session. Matrix entries sit inside the UniversalCyclotomicField and are exact expressions.

num_genomes(): Return the number of distinct genomes up to symmetries.

one_row(element, as_list=False)

Return a given genome instance in one-row notation.

Parameters:

element – the genome instance to represent in one-row notation. Multiple formats accepted.
as_list – if true, return as a list of images, otherwise return a sage permutation.

Examples

sage: import cgt sage: ppf = cgt.PositionParadigmFramework(3) sage: ppf.cycles(ppf.one_row(‘(1,-2)(-1,2)’)) == ppf.cycles(‘(1,-2)(-1,2)’) True

random_genome(format=FORMAT.formal_sum): Return a random genome

random_instance(genome=None): Return a random permutation corresponding to a given genome, or a random genome if none is given.

reg_rep_of_zs(model, to_adjacency_matrix=False, sparse=True)

regular_representation(g): Return the regular representation of a single element

standard_reflection(): Return a permutation which reflects a genome instance about the center region (n odd), or the center two regions (n even).

standard_rotation(): Return a permutation which rotates positions clockwise by one position.

symmetry_element(): Return the symmetry element: a convex sum of elements from symmetry_group().

symmetry_group(): Return the symmetry group of the genomes.

cgt.rearrangements module

cgt.rearrangements.all_adjacent_transpositions_representatives(framework, num_regions=None)

cgt.rearrangements.all_inversion_instances(framework, length=None)

Return all instances that represent an inversion of the given length

Parameters:

framework (PositionParadigmFramework) – The framework to create the inversion in
length (int, optional) – The length of the inversion. Defaults to None (all inversions).

Returns:

The set of all instances that represent an inversion of the given length

Return type:

set

cgt.rearrangements.all_inversions_representatives(framework, num_regions=None)

Return the representatives of all inversions in the symmetry group of framework

Parameters:

framework (PositionParadigmFramework) – The framework to create the transposition in
num_regions (int, optional) – The number of regions to invert. Defaults to None (all inversions).

Returns:

The set of representatives of all inversions in the symmetry group of framework

Return type:

set

cgt.rearrangements.all_transposition_instances(framework, scope_limit=None, single_segment_limit=None, with_inversion=True, include_revrevs=False, only_revrevs=False, canonical_reps_only=False)

Return all instances that represent a transposition of the given length

Parameters:: framework (PositionParadigmFramework) – The framework to create the transposition in
Returns:: The set of all instances that represent a transposition
Return type:: set

cgt.rearrangements.c_perm(n): Return the permutation (1,…,n)(-n,…,1))

cgt.rearrangements.conjugacy_class(framework, perm): Return the conjugacy class of perm in the symmetry group of framework

cgt.rearrangements.cut_positions(framework, sigma)

The set of cuts required to perform a rearrangement represented by sigma

Parameters:

framework (PositionParadigmFramework) – The framework to create the transposition in
sigma (Permutation) – The permutation representing the rearrangement

Returns:

The set of cuts required to perform the rearrangement represented by sigma

Return type:

set

cgt.rearrangements.double_coset(framework, perm): Return the double coset of perm in the symmetry group of framework

cgt.rearrangements.fast_all_inversion_reps(framework)

cgt.rearrangements.inversion(framework, about_position, length)

Return an instance of the inversion that inverts a segment in a genome.

Parameters:

framework (PositionParadigmFramework) – The framework to create the inversion in
about_position (int) – The position to invert about
length (int) – The length of the segment to invert

Returns:

An instance representing the inversion that inverts the segment

Return type:

Permutation

cgt.rearrangements.permutation_with_cuts(framework, cuts, perm=None, start=None)

A generator for all permutations with the specified set of cuts.

Parameters:

framework (PositionParadigmFramework) – The framework to create the transposition in
cuts (set) – The set of cuts to use

Yields:

Permutation – A permutation with the specified set of cuts

cgt.rearrangements.representatives(framework, set_of_permutations, classes=CLASSES.double_cosets, prioritise_string_length=False)

cgt.rearrangements.segment_length(n, start, end): Return the length of the segment from start to end

cgt.rearrangements.segment_midpoint(n, start, end): Return the midpoint of the segment from start to end

cgt.rearrangements.signed_inversion(framework, about_position, length)

Return an instance of the inversion that inverts a segment in a genome.

Parameters:

framework (PositionParadigmFramework) – The framework to create the inversion in
about_position (int) – The position to invert about
length (int) – The length of the segment to invert

Returns:

An instance representing the inversion that inverts the segment

Return type:

Permutation

cgt.rearrangements.single_coset(framework, perm): Return the single coset of perm in the symmetry group of framework

cgt.rearrangements.transposition(framework, sec_1, sec_2, inv_1=False, inv_2=False, revrev=False)

Return an instance of the transposition that swaps two segments in a genome.

Parameters:

framework (PositionParadigmFramework) – The framework to create the transposition in
sec_1 (tuple) – The first section to transpose (tuple of start (included) and end (excluded) positions)
sec_2 (tuple) – The second section to transpose (tuple of start (included) and end (excluded) positions)
inv_1 (bool, optional) – Whether to also invert the first section. Defaults to False.
inv_2 (bool, optional) – Whether to also invert the second section. Defaults to False.

Returns:

An instance representing the transposition that swaps the two segments

Return type:

Permutation

cgt.structures module

A place for functions which return important but general algebraic structures requiring only a few parameters

cgt.structures.HyperoctahedralGroup(n): Return the hyperoctahedral group of order (2^n)n! as a subgroup of SymmetricGroup(2n) with two generators.

cgt.structures.set_plus_minus(n)

cgt.visualisation module

cgt.visualisation.draw_genome_instance(framework, instance, show=False): Produce a drawing of a circular genome with n regions

cgt.simulations module

This module contains functions for simulating the evolution of genomes on a tree.

cgt.simulations.draw_branch_length()

cgt.simulations.draw_genome(formal_sum)

cgt.simulations.draw_rearrangement(model)

cgt.simulations.draw_tree(tree)

cgt.simulations.evolve_on_tree(tree, framework, model, root='random', at_least_one_change=False, exactly_n_changes=None)

cgt.simulations.grow_tree(framework, model, num_taxa)

cgt.simulations.newick_to_tree(newick_string)

cgt.simulations.set_node(node, framework, genome)

cgt.hyperoctahedral module

This file contains the class for the hyperoctahedral group.

class cgt.hyperoctahedral.HyperoctahedralGroup(n, default_element_type='signed_permutation')

Bases: object

This class represents the hyperoctahedral group of order 2^n * n!. It stores the group as a number of different isomorphic structures, as well as a number of other things that are helpful to keep track of in the context of genome rearrangements. It implements functions to compute the irreducible representations of the hyperoctahedral group.

Phi(elt)

character_for_tuple(c)

element_in_little_subgroup(elt, l, m): Takes elt as an element of the little subgroup of Sn of order l!m! and returns an element of the direct product of S_l and S_m.

hyperoctahedral_twist(s, c)

This function implements the twist of the hyperoctahedral group, which is the action of S_n on (C_2)^n by permuting the factors.

Parameters:

s (Permutation) – A permutation in S_n.
c (tuple) – A tuple of elements of C_2.

Returns:

The tuple of elements of C_2 obtained by applying s to the tuple c.

Return type:

an element of (C_2)^n

irrep(partition_pair)

Returns the irreducible representation of the hyperoctahedral group of order 2^n * n! indexed by the pair of partitions of l and m where l + m = n. The representation is given as a function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.

Parameters:: partition_pair (tuple) – A pair of partitions of l and m where l + m = n.
Returns:: A function that takes an element of the hyperoctahedral group as input and returns the corresponding matrix.
Return type:: function (group element -> real matrix)

little_subgroup(l, m, as_direct_product=False)

little_subgroup_pairs()

little_subgroups()

orbit_representative(l, m)

representation_little_subgroup(partition_pair)

right_transversal(group, subgroup)

right_transversal_of_little_subgroup(k)

semidirect_product_with_hyperoctahedral_twist(G, N)

Compute a semi-direct product of H and G with the hyperoctahedral twist (G acts on H^n by permuting factors). The group returned is a group of pairs (c, s) where c is an element of H^n and s is an element of G. The group operation is given by (c1, s1) * (c2, s2) = (c1 * s1(c2), s1 * s2).

Parameters:

H (group) – the direct product of n copies of some group.
G (group) – A group.

Returns:

The wreath product of H and G with the hyperoctahedral twist.

Return type:

group

cgt.hyperoctahedral.HyperoctahedralGroupRepresentations(n)

This function returns the irreducible representations of the hyperoctahedral group of order 2^n * n!, indexed by pairs of partitions of l and m, where l + m = n.

Parameters:: n (int) – The order of the hyperoctahedral group.
Returns:: A dictionary of the irreducible representations of the hyperoctahedral group of order 2^n * n!
Return type:: dict

cgt.sorting module

Implements sorting key functions for CGT.

cgt.pickle_manager module

cgt.pickle_manager.retrieve_irrep_of_z(n=None): Retrieve the irreps of z

cgt.constants module

Declare constants for the CGT package.