yatel.weight package

Submodules

yatel.weight.core module

Base structure for weight calculations in Yatel.

class yatel.weight.core.BaseWeight[source]

Bases: object

Base class of all weight calculators.

classmethod names()[source]

Abstract Method.

Names of the registered calculators.

Raises: NotImplementedError

weight(hap0, hap1)[source]

Abstract Method.

A float distance between 2 yatel.dom.Haplotype instances.

weights(nw, to_same=False, env=None, **kwargs)[source]

Calculates the distance between all combinations of existing haplotypes of the network environment or a collection.

Parameters:

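nw : yatel.db.YatelNetwork or iterable of yatel.dom.Haplotype

yatel.db.YatelNetwork instance or iterable of yatel.dom.Haplotype instances.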

to_same : bool

If True, also calculate the distance between each haplotype and itself.

env : dict or None

Environment dictionary; used only if nw is a yatel.db.YatelNetwork instance.

kwargs :

Variable parameters to use as environment filters; used only if nw is a yatel.db.YatelNetwork instance.

Returns:

Iterator

Yields ((hap_x, hap_y), float) pairs, where hap_x is the origin node, hap_y is the end node, and the float is the weight between them.
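The interface above can be pictured with a small stand-alone sketch. It does not import Yatel and is not Yatel's implementation: BaseWeightSketch and AbsDiff are hypothetical names, the items are plain numbers rather than haplotypes, and the env and kwargs filtering of a network is left out.

```python
import itertools


class BaseWeightSketch(object):
    """Hypothetical stand-in for the interface described above."""

    @classmethod
    def names(cls):
        raise NotImplementedError()

    def weight(self, hap0, hap1):
        raise NotImplementedError()

    def weights(self, haps, to_same=False):
        # With to_same=True every item is also paired with itself.
        if to_same:
            pairs = itertools.combinations_with_replacement(haps, 2)
        else:
            pairs = itertools.combinations(haps, 2)
        for hap0, hap1 in pairs:
            yield (hap0, hap1), self.weight(hap0, hap1)


class AbsDiff(BaseWeightSketch):
    """Toy calculator over numbers, used only for illustration."""

    @classmethod
    def names(cls):
        return ("absdiff", "abs")

    def weight(self, hap0, hap1):
        return float(abs(hap0 - hap1))


result = dict(AbsDiff().weights([1, 4, 6]))
```

Because weights is a generator, wrapping it in dict (as the examples in this package do) materializes every pair at once.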

yatel.weight.euclidean module

Euclidean distance implementation of Yatel.

class yatel.weight.euclidean.Euclidean(to_num=None)[source]

Bases: yatel.weight.core.BaseWeight

Calculates “ordinary” distance/weight between two haplotypes given by the Pythagorean formula.

Every attribute value is converted to a number by a to_num function. By default, to_num sums the ord values of the characters in the base64 encoding of the attribute value. Example:

def to_num(attr):
    # Python 2 only: the "base64" str codec is not available in Python 3.
    value = 0
    for c in str(attr).encode("base64"):
        value += ord(c)
    return value

to_num("h") # 294

For more information about Euclidean distance: http://en.wikipedia.org/wiki/Euclidean_distance
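A rough Python 3 sketch of the same idea follows. It works on plain attribute dictionaries instead of yatel.dom.Haplotype instances, and it assumes that an attribute missing on one side contributes 0; how Yatel itself treats missing attributes is not stated here. The function name euclidean is hypothetical.

```python
import base64
import math


def to_num(attr):
    """Sum the byte values of the base64 encoding of str(attr)."""
    # encodebytes appends a trailing newline, as the Python 2 codec did.
    encoded = base64.encodebytes(str(attr).encode("utf-8"))
    return sum(encoded)


def euclidean(attrs0, attrs1, to_num=to_num):
    """Pythagorean distance over the numeric value of every attribute."""
    total = 0.0
    for name in set(attrs0) | set(attrs1):
        # Assumption: an attribute missing on one side counts as 0.
        v0 = to_num(attrs0[name]) if name in attrs0 else 0
        v1 = to_num(attrs1[name]) if name in attrs1 else 0
        total += (v0 - v1) ** 2
    return math.sqrt(total)


print(to_num("h"))  # 294
```

Passing a different to_num callable changes how attribute values are mapped to numbers without touching the distance formula.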

classmethod names()[source]

Synonym names that can be used to call this weight calculation.

weight(hap0, hap1)[source]

A float distance between 2 yatel.dom.Haplotype instances.

yatel.weight.euclidean.to_num_default(attr)[source]

The default to_num behavior: sums the ord values of the characters in the base64 encoding of the attribute value.

yatel.weight.hamming module

Hamming distance implementation of Yatel.

class yatel.weight.hamming.Hamming[source]

Bases: yatel.weight.core.BaseWeight

Calculates the Hamming distance between two haplotypes by counting the number of differences in their attributes.

The distance is incremented by “1” for either of two reasons:

  1. haplotype0.attr_a != haplotype1.attr_a
  2. attr_a exists in haplotype0 but does not exist in haplotype1.
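The two rules above can be sketched on plain attribute dictionaries. This is an illustration only, not Yatel's code: hamming_distance is a hypothetical helper, and treating "missing" symmetrically (an attribute absent from either side) is an assumption.

```python
def hamming_distance(attrs0, attrs1):
    """Count attributes that differ or exist on only one side."""
    distance = 0.0
    for name in set(attrs0) | set(attrs1):
        if name not in attrs0 or name not in attrs1:
            distance += 1  # rule 2: attribute present on only one side
        elif attrs0[name] != attrs1[name]:
            distance += 1  # rule 1: both present, values differ
    return distance


h0 = {"attr_a": "a", "attr_b": "b", "attr_c": 0}
h1 = {"attr_a": "a", "attr_c": "0"}
print(hamming_distance(h0, h1))  # 2.0
```

Here attr_b is missing from h1 and attr_c differs because the integer 0 is not equal to the string "0".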

Examples

>>> from yatel import dom, weight
>>> h0 = dom.Haplotype("0", attr_a="a", attr_b="b", attr_c=0)
>>> h1 = dom.Haplotype("1", attr_a="a", attr_c="0")
>>> hamming = weight.Hamming()
>>> dict(hamming.weights([h0, h1]))
{(<haplotype0>, <haplotype1>): 2.0}

classmethod names()[source]

Synonym names that can be used to call this weight calculation.

weight(hap0, hap1)[source]

A float distance between 2 dom.Haplotype instances.

yatel.weight.levenshtein module

Levenshtein and Damerau-Levenshtein distance implementations of Yatel.

class yatel.weight.levenshtein.DamerauLevenshtein(to_seq=None)[source]

Bases: yatel.weight.levenshtein.Levenshtein

Calculates the Damerau-Levenshtein distance between haplotypes.

This distance is the number of additions, deletions, substitutions, and transpositions needed to transform the first haplotype, viewed as a sequence, into the second.

Transpositions are exchanges of consecutive characters; all other operations are self-explanatory.

This implementation is O(N*M) time and O(M) space, for N and M the lengths of the two sequences.
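One way to reach these bounds is to keep only the last three rows of the dynamic-programming table. The sketch below computes the restricted "optimal string alignment" variant of the distance on plain strings; whether Yatel uses this exact variant is not stated here, and osa_distance is a hypothetical name.

```python
def osa_distance(seq0, seq1):
    """Edit distance with adjacent transpositions, three rows of memory."""
    prev2 = None
    prev = list(range(len(seq1) + 1))
    for i in range(1, len(seq0) + 1):
        cur = [i] + [0] * len(seq1)
        for j in range(1, len(seq1) + 1):
            cost = 0 if seq0[i - 1] == seq1[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # addition
                prev[j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two consecutive items.
            if (i > 1 and j > 1
                    and seq0[i - 1] == seq1[j - 2]
                    and seq0[i - 2] == seq1[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(seq1)]


print(osa_distance("ab", "ba"))  # 1
```

Each row has len(seq1) + 1 entries, so memory grows with M only, while the nested loops visit N * M cells.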

Note: The haplotype attribute values are base64-encoded before the distance is computed.

classmethod names()[source]

Synonym names that can be used to call this weight calculation.

weight(hap0, hap1)[source]

A float distance between 2 dom.Haplotype instances.

class yatel.weight.levenshtein.Levenshtein(to_seq=None)[source]

Bases: yatel.weight.core.BaseWeight

The Levenshtein distance between two haplotypes is defined as the minimum number of edits needed to transform one haplotype, viewed as a sequence built from its attribute values, into the other. The allowed edit operations are insertion, deletion, or substitution of a single character.

Note: The haplotype attribute values are encoded with the to_seq function before the distance is computed.
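The definition above corresponds to the classic dynamic-programming recurrence. A minimal sketch on plain sequences follows; it skips the to_seq encoding step, and the function name levenshtein is hypothetical rather than part of Yatel.

```python
def levenshtein(seq0, seq1):
    """Minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(seq1) + 1))
    for i, item0 in enumerate(seq0, start=1):
        current = [i]
        for j, item1 in enumerate(seq1, start=1):
            current.append(min(
                previous[j] + 1,                     # deletion
                current[j - 1] + 1,                  # insertion
                previous[j - 1] + (item0 != item1),  # substitution
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Unlike the Damerau-Levenshtein variant, swapping two adjacent characters costs two edits here, so levenshtein("ab", "ba") is 2.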

classmethod names()[source]

Synonym names that can be used to call this weight calculation.

weight(hap0, hap1)[source]

A float distance between 2 dom.Haplotype instances.

yatel.weight.levenshtein.to_seq_default(obj)[source]

Converts the given object to a normalized base64 representation of itself.

Module contents

This package contains several modules and functions to calculate distances between haplotypes.

Essentially, it contains some well-known algorithms for calculating distances between elements, which can be used as edge weights.

yatel.weight.weight(calcname, hap0, hap1)[source]

Calculates the weight between yatel.dom.Haplotype instances by the given calculator.

Parameters:

calcname : string

Registered calculator name (see: yatel.weight.calculators)

hap0 : yatel.dom.Haplotype

A Haplotype

hap1 : yatel.dom.Haplotype

A Haplotype
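The lookup by calcname can be pictured as a dictionary built from each calculator's names(). This is only a sketch of the idea under stated assumptions: LengthDiff, build_registry and CALCULATORS are hypothetical, the real registry lives in yatel.weight.calculators, and its structure is not described here.

```python
class LengthDiff(object):
    """Toy calculator: difference in the number of attributes."""

    @classmethod
    def names(cls):
        return ("lendiff", "ld")

    def weight(self, attrs0, attrs1):
        return float(abs(len(attrs0) - len(attrs1)))


def build_registry(*classes):
    """Map every synonym returned by names() to its calculator class."""
    registry = {}
    for cls in classes:
        for name in cls.names():
            registry[name] = cls
    return registry


CALCULATORS = build_registry(LengthDiff)


def weight(calcname, hap0, hap1):
    # Look the calculator up by any of its names, then delegate.
    return CALCULATORS[calcname]().weight(hap0, hap1)


print(weight("ld", {"att0": "foo", "att1": 34}, {"att1": 65}))  # 1.0
```

With such a mapping, every synonym of a calculator resolves to the same class, and an unknown name fails with a KeyError.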

Examples

>>> from yatel import dom, weight
>>> hap0 = dom.Haplotype(1, att0="foo", att1=34)
>>> hap1 = dom.Haplotype(2, att1=65)
>>> weight.weight("hamming", hap0, hap1)
2

yatel.weight.weights(calcname, nw, to_same=False, env=None, **kwargs)[source]

Calculates the distance between all combinations of existing haplotypes in the network environment or a collection by the given calculator algorithm.

Parameters:

calcname : string

Registered calculator name (see: yatel.weight.calculators)

nw : yatel.db.YatelNetwork or iterable of yatel.dom.Haplotype

yatel.db.YatelNetwork instance or iterable of yatel.dom.Haplotype instances.

to_same : bool

If True, also calculate the distance between each haplotype and itself.

env : dict or None

Environment dictionary; used only if nw is a yatel.db.YatelNetwork instance.

kwargs :

Variable parameters to use as environment filters; used only if nw is a yatel.db.YatelNetwork instance.

Returns:

Iterator

Yields ((hap_x, hap_y), float) pairs, where hap_x is the origin node, hap_y is the end node, and the float is the weight between them.

Examples

>>> from yatel import db, dom, weight
>>> nw = db.YatelNetwork('memory', mode=db.MODE_WRITE)
>>> nw.add_elements([dom.Haplotype(1, att0="foo", att1=34),
...                  dom.Haplotype(2, att1=65),
...                  dom.Haplotype(3)])
>>> nw.add_elements([dom.Fact(1, att0=True, att1=4),
...                  dom.Fact(2, att0=False),
...                  dom.Fact(2, att0=True, att2="foo")])
>>> nw.add_elements([dom.Edge(12, 1, 2),
...                  dom.Edge(34, 2, 3)])
>>> nw.confirm_changes()
>>> dict(weight.weights("lev", nw))
{(<Haplotype '1' at 0x2823c10>, <Haplotype '2' at 0x2823c50>): 5,
 (<Haplotype '1' at 0x2823c10>, <Haplotype '3' at 0x2823d50>): 7,
 (<Haplotype '2' at 0x2823c50>, <Haplotype '3' at 0x2823d50>): 4}
>>> dict(weight.weights("ham", nw, to_same=True, att0=False))
{(<Haplotype '2' at 0x1486c90>, <Haplotype '2' at 0x1486c90>): 0}