User guide¶
Wiggelen is a light-weigh library and tries not to over-engineer. For example, builtin datatypes such as tuples are used instead of custom objects. Sane defaults are used throughout and things like indices are handled transparently to the user.
The central operation in Wiggelen is walking a track. Be it in fixedSteps
or variableSteps
format, using any window size and step interval, walking
a track yields values one position at a time. Many operations accept walkers
as input and/or return walkers as output.
This guide uses a.wig
and b.wig
as example wiggle tracks, with the
following contents, respectively:
track type=wiggle_0 name=a visibility=full
variableStep chrom=MT
1 520.0
2 536.0
3 553.0
4 568.0
track type=wiggle_0 name=b visibility=full
variableStep chrom=MT
1 510.0
2 512.0
5 508.0
8 492.0
Walking over a track¶
Walking a track is done with the wiggelen.walk()
function, which yields
tuples of region, position, value:
>>> for region, position, value in walk(open('a.wig')):
... print region, position, value
...
MT 1 520.0
MT 2 536.0
MT 3 553.0
MT 4 568.0
Note
Walkers are implemented as generators, therefore walking (i.e. iterating) over them means consuming them. In other words, you can only iterate over a walker once.
Multiple tracks can be walked simultaneously with the wiggelen.zip_()
function, yielding a walker with lists of values for each track:
>>> a = walk(open('a.wig'))
>>> b = walk(open('b.wig'))
>>> for region, position, value in zip_(a, b):
... print region, position, value
...
1 1 [520.0, 510.0]
1 2 [536.0, 512.0]
1 3 [553.0, None]
1 4 [568.0, None]
1 5 [None, 508.0]
1 8 [None, 492.0]
Sometimes it is useful to force a walk over every subsequent position, even
when some positions are skipped in the original track file. This can be done
with the wiggelen.fill()
function:
>>> for region, position, value in fill(walk(open('b.wig'))):
... print region, position, value
...
1 1 510.0
1 2 512.0
1 3 None
1 4 None
1 5 508.0
1 6 None
1 7 None
1 8 492.0
Writing a walker to a track¶
Any walker can be written to a track file using the wiggelen.write()
function, which by default writes to standard output:
>>> write(walk(open('a.wig')), name='My example')
track type=wiggle_0 name="My example"
variableStep chrom=MT
1 520.0
2 536.0
3 553.0
4 568.0
Value transformations¶
For doing simple transformations on values from a walker, the
itertools.imap()
function is often useful:
>>> from itertools import imap
>>> transform = lambda (r, p, v): (r, p, v * 2)
>>> for region, position, value in imap(transform,
... walk(open('a.wig'))):
... print region, position, value
...
MT 1 1040.0
MT 2 1072.0
MT 3 1106.0
MT 4 1136.0
Similarly, the itertools.ifilter()
function can be used to quickly
filter some values from a walker.
The wiggelen.transform
module contains several predefined
transformations for calculating the derivative of a walker:
>>> for region, position, value in transform.forward_divided_difference(
... walk(open('a.wig'))):
... print region, position, value
...
MT 1 16.0
MT 2 17.0
MT 3 15.0
Note
Walker values can be of any type, but valid wiggle tracks according to the specification can only contain int or float values.
Coverage intervals¶
Genomic intervals of consecutively defined positions can be extracted from a
walker using the wiggelen.intervals.coverage()
function:
>>> for region, begin, end in intervals.coverage(walk(open('b.wig'))):
... print region, begin, end
...
MT 1 2
MT 5 5
MT 8 8
Merging walkers¶
The wiggelen.merge
module provides a way to merge any number of wiggle
tracks with a given merge operation. Some standard merge operations are
pre-defined in wiggelen.merge.mergers
.
>>> for region, position, value in merge.merge(
... walk(open('a.wig')), walk(open('b.wig')),
... merger=merge.mergers['sum']):
... print region, position, value
...
1 1 1030.0
1 2 1048.0
1 3 553.0
1 4 568.0
1 5 508.0
1 8 492.0
Distance matrices¶
Wiggelen can calculate the distance between two or more wiggle tracks
according to a pairwise multiset distance metric. This is implemented in the
wiggelen.distance
module and can be used to assess similarity of next
generation datasets.
>>> distance.distance(open('a.wig'), open('b.wig'))
{(1, 0): 0.5704115928792818}
Four pairwise multiset distance metrics are pre-defined in
wiggelen.distance.metrics
.
Plotting tracks¶
Some rudimentary functionality for plotting a wiggle track is provided by the
wiggelen.plot
module. It requires the matplotlib package to be
installed.
Note
The wiggelen.plot.plot()
function should not be used on very
large tracks.
For example, to quickly visualize the tests/data/complex.wig
file in the
Wiggelen source repository:
>>> fig, _, _, _ = plot.plot(walk(open('tests/data/complex.wig')))
>>> fig.show()