diff --git a/dev/api/public/index.html b/dev/api/public/index.html
index 6602dcb..8b1e08c 100644
--- a/dev/api/public/index.html
+++ b/dev/api/public/index.html
@@ -1,5 +1,5 @@
-Public · XAM.jl

Public

Public API Reference

Contents

SAM API

The following methods and types are provided by the SAM submodule for public use.

XAM.SAM.FLAG_DUPConstant.

0x0400: optical or PCR duplicate

source
XAM.SAM.FLAG_MREVERSEConstant.

0x0020: the mate is mapped to the reverse strand

source
XAM.SAM.FLAG_MUNMAPConstant.

0x0008: the mate is unmapped

source
XAM.SAM.FLAG_PAIREDConstant.

0x0001: the read is paired in sequencing, no matter whether it is mapped in a pair

source

0x0002: the read is mapped in a proper pair

source
XAM.SAM.FLAG_QCFAILConstant.

0x0200: QC failure

source
XAM.SAM.FLAG_READ1Constant.

0x0040: this is read1

source
XAM.SAM.FLAG_READ2Constant.

0x0080: this is read2

source
XAM.SAM.FLAG_REVERSEConstant.

0x0010: the read is mapped to the reverse strand

source

0x0100: not primary alignment

source

0x0800: supplementary alignment

source
XAM.SAM.FLAG_UNMAPConstant.

0x0004: the read itself is unmapped; conflicts with SAM.FLAG_PROPER_PAIR

source
XAM.SAM.HeaderMethod.
SAM.Header()

Create an empty header.

source
XAM.SAM.MetaInfoMethod.
MetaInfo(tag::AbstractString, value)

Create a SAM metainfo with tag and value.

tag is a two-byte ASCII string. If tag is "CO", value must be a string; otherwise, value is an iterable object with key and value pairs.

Examples

julia> SAM.MetaInfo("CO", "some comment")
+Public · XAM.jl

Public

Public API Reference

Contents

SAM API

The following methods and types are provided by the SAM submodule for public use.

XAM.SAM.FLAG_DUPConstant.

0x0400: optical or PCR duplicate

source
XAM.SAM.FLAG_MREVERSEConstant.

0x0020: the mate is mapped to the reverse strand

source
XAM.SAM.FLAG_MUNMAPConstant.

0x0008: the mate is unmapped

source
XAM.SAM.FLAG_PAIREDConstant.

0x0001: the read is paired in sequencing, no matter whether it is mapped in a pair

source

0x0002: the read is mapped in a proper pair

source
XAM.SAM.FLAG_QCFAILConstant.

0x0200: QC failure

source
XAM.SAM.FLAG_READ1Constant.

0x0040: this is read1

source
XAM.SAM.FLAG_READ2Constant.

0x0080: this is read2

source
XAM.SAM.FLAG_REVERSEConstant.

0x0010: the read is mapped to the reverse strand

source

0x0100: not primary alignment

source

0x0800: supplementary alignment

source
XAM.SAM.FLAG_UNMAPConstant.

0x0004: the read itself is unmapped; conflicts with SAM.FLAG_PROPER_PAIR

source
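
As a rough illustration of how the bit values listed above combine in a single flag, the following Base-only sketch decodes a UInt16 into descriptions of the bits that are set. The table FLAG_BITS and the helper describe_flag are hypothetical names used only for this example; they are not part of XAM, and only the hexadecimal values and their meanings come from the listing.

```julia
# Bit values copied from the flag listing above.
# FLAG_BITS and describe_flag are illustrative names, not XAM API.
const FLAG_BITS = [
    0x0001 => "paired",
    0x0002 => "proper pair",
    0x0004 => "unmapped",
    0x0008 => "mate unmapped",
    0x0010 => "reverse strand",
    0x0020 => "mate reverse strand",
    0x0040 => "read1",
    0x0080 => "read2",
    0x0100 => "not primary",
    0x0200 => "QC failure",
    0x0400 => "duplicate",
    0x0800 => "supplementary",
]

# Return the description of every bit that is set in `flag`.
describe_flag(flag::UInt16) = [name for (bit, name) in FLAG_BITS if flag & bit != 0]

# 0x0063 == 0x0001 | 0x0002 | 0x0020 | 0x0040
println(describe_flag(0x0063))
```

With a real record, the value returned by flag(record) could presumably be passed to such a helper directly, since it is documented as a UInt16.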
XAM.SAM.HeaderMethod.
SAM.Header()

Create an empty header.

source
XAM.SAM.MetaInfoMethod.
MetaInfo(tag::AbstractString, value)

Create a SAM metainfo with tag and value.

tag is a two-byte ASCII string. If tag is "CO", value must be a string; otherwise, value is an iterable object with key and value pairs.

Examples

julia> SAM.MetaInfo("CO", "some comment")
 BioAlignments.SAM.MetaInfo:
     tag: CO
   value: some comment
@@ -13,7 +13,7 @@ BioAlignments.SAM.MetaInfo:
   value: SN=chr1 LN=12345
 
 julia> string(ans)
-"@SQ	SN:chr1	LN:12345"
source
XAM.SAM.MetaInfoMethod.
MetaInfo(str::AbstractString)

Create a SAM metainfo from str.

Examples

julia> SAM.MetaInfo("@CO	some comment")
+"@SQ	SN:chr1	LN:12345"
source
XAM.SAM.MetaInfoMethod.
MetaInfo(str::AbstractString)

Create a SAM metainfo from str.

Examples

julia> SAM.MetaInfo("@CO	some comment")
 BioAlignments.SAM.MetaInfo:
     tag: CO
   value: some comment
@@ -21,4 +21,4 @@ BioAlignments.SAM.MetaInfo:
 julia> SAM.MetaInfo("@SQ	SN:chr1	LN:12345")
 BioAlignments.SAM.MetaInfo:
     tag: SQ
-  value: SN=chr1 LN=12345
source
XAM.SAM.ReaderMethod.
SAM.Reader(input::IO)

Create a data reader of the SAM file format.

Arguments

  • input: data source
source
XAM.SAM.RecordMethod.
SAM.Record(str::AbstractString)

Create a SAM record from str. This function verifies the format and indexes fields for accessors.

source
XAM.SAM.RecordMethod.
SAM.Record(data::Vector{UInt8})

Create a SAM record from data. This function verifies the format and indexes fields for accessors. Note that the ownership of data is transferred to a new record object.

source
XAM.SAM.RecordMethod.
SAM.Record()

Create an unfilled SAM record.

source
XAM.SAM.WriterType.
Writer(output::IO, header::Header=Header())

Create a data writer of the SAM file format.

Arguments

  • output: data sink
  • header=Header(): SAM header object
source
Base.findallMethod.
find(header::Header, key::AbstractString)::Vector{MetaInfo}

Find metainfo objects satisfying SAM.tag(metainfo) == key.

source
BioGenerics.headerMethod.
header(reader::Reader)::Header

Get the header of reader.

source
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.SAM.alignmentMethod.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.SAM.auxdataMethod.
auxdata(record::Record)::Dict{String,Any}

Get the auxiliary data (optional fields) of record.

source
XAM.SAM.cigarMethod.
cigar(record::Record)::String

Get the CIGAR string of record.

source
XAM.SAM.flagMethod.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.SAM.iscommentMethod.
iscomment(metainfo::MetaInfo)::Bool

Test if metainfo is a comment (i.e. its tag is "CO").

source
XAM.SAM.ismappedMethod.
ismapped(record::Record)::Bool

Test if record is mapped.

source
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.SAM.isprimaryMethod.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.SAM.keyvaluesMethod.
keyvalues(metainfo::MetaInfo)::Vector{Pair{String,String}}

Get the values of metainfo as string pairs.

source
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
nextposition(record::Record)::Int

Get the position of the mate/next read of record.

source
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.SAM.positionMethod.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.SAM.qualityMethod.
quality(::Type{String}, record::Record)::String

Get the ASCII-encoded base quality of record.

source
XAM.SAM.qualityMethod.
quality(record::Record)::Vector{UInt8}

Get the Phred-scaled base quality of record.

source
XAM.SAM.refnameMethod.
refname(record::Record)::String

Get the reference sequence name of record.

source
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.SAM.seqlengthMethod.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.SAM.sequenceMethod.
sequence(::Type{String}, record::Record)::String

Get the segment sequence of record as String.

source
XAM.SAM.sequenceMethod.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.SAM.tagMethod.
tag(metainfo::MetaInfo)::String

Get the tag of metainfo.

source
XAM.SAM.templengthMethod.
templength(record::Record)::Int

Get the template length of record.

source
XAM.SAM.tempnameMethod.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.SAM.valueMethod.
value(metainfo::MetaInfo)::String

Get the value of metainfo as a string.

source

BAM API

The following methods and types are provided by the BAM submodule for public use.

XAM.BAM.BAIMethod.
BAI(filename::AbstractString)

Load a BAI index from filename.

source
XAM.BAM.BAIMethod.
BAI(input::IO)

Load a BAI index from input.

source
XAM.BAM.ReaderType.
BAM.Reader(input::IO; index=nothing)

Create a data reader of the BAM file format.

Arguments

  • input: data source
  • index=nothing: filepath to a random access index (currently bai is supported)
source
XAM.BAM.RecordType.
BAM.Record()

Create an unfilled BAM record.

source
XAM.BAM.WriterType.
BAM.Writer(output::BGZFStream, header::SAM.Header)

Create a data writer of the BAM file format.

Arguments

  • output: data sink
  • header: SAM header object
source
BioGenerics.headerMethod.
header(reader::Reader; fillSQ::Bool=false)::SAM.Header

Get the header of reader.

If fillSQ is true, this function fills missing "SQ" metainfo in the header.

source
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.BAM.alignmentMethod.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.BAM.auxdataMethod.
auxdata(record::Record)::BAM.AuxData

Get the auxiliary data of record.

source
XAM.BAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the pseudo-cigar stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar_rle.

source
XAM.BAM.cigar_rleFunction.
cigar_rle(record::Record, checkCG::Bool = true)::Tuple{Vector{BioAlignments.Operation},Vector{Int}}

Get a run-length encoded tuple (ops, lens) of the CIGAR string in record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the pseudo-cigar stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar.

source
XAM.BAM.flagMethod.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.BAM.ismappedMethod.
ismapped(record::Record)::Bool

Test if record is mapped.

source
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
ispositivestrand(record::Record)::Bool

Test if record is aligned to the positive strand.

This is equivalent to flag(record) & 0x10 == 0.

source
XAM.BAM.isprimaryMethod.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.BAM.n_cigar_opFunction.
n_cigar_op(record::Record, checkCG::Bool = true)

Return the number of operations in the CIGAR string of record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the number of operations in the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the number of operations in the cigar field of the BAM record, set checkCG to false.

source
nextposition(record::Record)::Int

Get the 1-based leftmost mapping position of the next/mate read of record.

source
XAM.BAM.nextrefidMethod.
nextrefid(record::Record)::Int

Get the next/mate reference sequence ID of record.

source
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.BAM.positionMethod.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.BAM.qualityMethod.
quality(record::Record)::Vector{UInt8}

Get the base quality of record.

source
XAM.BAM.refidMethod.
refid(record::Record)::Int

Get the reference sequence ID of record.

The ID is 1-based (i.e. the first sequence is 1) and is 0 for a record without a mapping position.

See also: BAM.refname

source
XAM.BAM.reflenMethod.
reflen(record::Record)::Int

Get the length of the reference sequence this record applies to.

source
XAM.BAM.refnameMethod.
refname(record::Record)::String

Get the reference sequence name of record.

See also: BAM.refid

source
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.BAM.seqlengthMethod.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.BAM.sequenceMethod.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.BAM.templengthMethod.
templength(record::Record)::Int

Get the template length of record.

source
XAM.BAM.tempnameMethod.
tempname(record::Record)::String

Get the query template name of record.

source
+ value: SN=chr1 LN=12345
source
XAM.SAM.ReaderMethod.
SAM.Reader(input::IO)

Create a data reader of the SAM file format.

Arguments

  • input: data source
source
XAM.SAM.RecordMethod.
SAM.Record(str::AbstractString)

Create a SAM record from str. This function verifies the format and indexes fields for accessors.

source
XAM.SAM.RecordMethod.
SAM.Record(data::Vector{UInt8})

Create a SAM record from data. This function verifies the format and indexes fields for accessors. Note that the ownership of data is transferred to a new record object.

source
XAM.SAM.RecordMethod.
SAM.Record()

Create an unfilled SAM record.

source
XAM.SAM.WriterType.
Writer(output::IO, header::Header=Header())

Create a data writer of the SAM file format.

Arguments

  • output: data sink
  • header=Header(): SAM header object
source
Base.findallMethod.
find(header::Header, key::AbstractString)::Vector{MetaInfo}

Find metainfo objects satisfying SAM.tag(metainfo) == key.

source
BioGenerics.headerMethod.
header(reader::Reader)::Header

Get the header of reader.

source
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.SAM.alignmentMethod.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.SAM.auxdataMethod.
auxdata(record::Record)::Dict{String,Any}

Get the auxiliary data (optional fields) of record.

source
XAM.SAM.cigarMethod.
cigar(record::Record)::String

Get the CIGAR string of record.

source
XAM.SAM.flagMethod.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.SAM.iscommentMethod.
iscomment(metainfo::MetaInfo)::Bool

Test if metainfo is a comment (i.e. its tag is "CO").

source
XAM.SAM.ismappedMethod.
ismapped(record::Record)::Bool

Test if record is mapped.

source
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.SAM.isprimaryMethod.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.SAM.keyvaluesMethod.
keyvalues(metainfo::MetaInfo)::Vector{Pair{String,String}}

Get the values of metainfo as string pairs.

source
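
To make the relationship between a header line such as "@SQ	SN:chr1	LN:12345" and the string pairs returned by keyvalues more concrete, here is a minimal Base-only sketch that splits one header line by hand. The function split_header_line is a hypothetical helper written for this illustration; it is not how XAM parses headers, and it assumes well-formed input (every non-comment field contains a colon).

```julia
# Hypothetical helper, not part of XAM: split one SAM header line by hand.
function split_header_line(line::AbstractString)
    startswith(line, "@") || throw(ArgumentError("header lines start with '@'"))
    fields = split(line, '\t')
    tag = String(fields[1][2:end])          # drop the leading '@'
    # Comment lines carry free text rather than key:value fields.
    tag == "CO" && return tag, join(fields[2:end], '\t')
    # Split each remaining field at its first ':' only, so values may contain colons.
    pairs = [String(k) => String(v)
             for (k, v) in (split(f, ':'; limit=2) for f in fields[2:end])]
    return tag, pairs
end

println(split_header_line("@SQ\tSN:chr1\tLN:12345"))
println(split_header_line("@CO\tsome comment"))
```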
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
nextposition(record::Record)::Int

Get the position of the mate/next read of record.

source
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.SAM.positionMethod.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.SAM.qualityMethod.
quality(::Type{String}, record::Record)::String

Get the ASCII-encoded base quality of record.

source
XAM.SAM.qualityMethod.
quality(record::Record)::Vector{UInt8}

Get the Phred-scaled base quality of record.

source
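
The two quality methods above differ only in encoding. In the SAM text format, each base quality is stored as one ASCII character whose code is the Phred score plus 33. The sketch below shows that conversion with Base Julia only; phred_scores is a hypothetical name for this example and is not part of XAM.

```julia
# SAM text stores each base quality as an ASCII character equal to Phred score + 33.
# phred_scores is an illustrative helper, not XAM API.
phred_scores(ascii_quality::AbstractString) = UInt8[UInt8(c) - 0x21 for c in ascii_quality]

# 'I' is ASCII 73, '5' is ASCII 53, '!' is ASCII 33.
println(Int.(phred_scores("II5!")))
```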
XAM.SAM.refnameMethod.
refname(record::Record)::String

Get the reference sequence name of record.

source
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.SAM.seqlengthMethod.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.SAM.sequenceMethod.
sequence(::Type{String}, record::Record)::String

Get the segment sequence of record as String.

source
XAM.SAM.sequenceMethod.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.SAM.tagMethod.
tag(metainfo::MetaInfo)::String

Get the tag of metainfo.

source
XAM.SAM.templengthMethod.
templength(record::Record)::Int

Get the template length of record.

source
XAM.SAM.tempnameMethod.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.SAM.valueMethod.
value(metainfo::MetaInfo)::String

Get the value of metainfo as a string.

source

BAM API

The following methods and types are provided by the BAM submodule for public use.

XAM.BAM.BAIMethod.
BAI(filename::AbstractString)

Load a BAI index from filename.

source
XAM.BAM.BAIMethod.
BAI(input::IO)

Load a BAI index from input.

source
XAM.BAM.ReaderType.
BAM.Reader(input::IO; index=nothing)

Create a data reader of the BAM file format.

Arguments

  • input: data source
  • index=nothing: filepath to a random access index (currently bai is supported)
source
XAM.BAM.RecordType.
BAM.Record()

Create an unfilled BAM record.

source
XAM.BAM.WriterType.
BAM.Writer(output::BGZFStream, header::SAM.Header)

Create a data writer of the BAM file format.

Arguments

  • output: data sink
  • header: SAM header object
source
BioGenerics.headerMethod.
header(reader::Reader; fillSQ::Bool=false)::SAM.Header

Get the header of reader.

If fillSQ is true, this function fills missing "SQ" metainfo in the header.

source
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.BAM.alignmentMethod.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.BAM.auxdataMethod.
auxdata(record::Record)::BAM.AuxData

Get the auxiliary data of record.

source
XAM.BAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the pseudo-cigar stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar_rle.

source
XAM.BAM.cigar_rleFunction.
cigar_rle(record::Record, checkCG::Bool = true)::Tuple{Vector{BioAlignments.Operation},Vector{Int}}

Get a run-length encoded tuple (ops, lens) of the CIGAR string in record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the pseudo-cigar stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar.

source
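
To illustrate what a run-length encoded (ops, lens) pair looks like, the following Base-only sketch splits a textual CIGAR string into parallel vectors. It is an approximation for illustration: cigar_runs is a hypothetical helper, and it returns plain Char operation codes, whereas cigar_rle is documented above as returning BioAlignments.Operation values.

```julia
# Hypothetical Base-only helper, not XAM API: split a textual CIGAR string
# into parallel vectors of operation characters and run lengths.
function cigar_runs(cigar::AbstractString)
    ops = Char[]
    lens = Int[]
    # Each run is a decimal length followed by a single operation code.
    for m in eachmatch(r"(\d+)([MIDNSHP=X])", cigar)
        push!(lens, parse(Int, m.captures[1]))
        push!(ops, m.captures[2][1])
    end
    return ops, lens
end

ops, lens = cigar_runs("5S10M2I3D")
println(ops)
println(lens)
```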
XAM.BAM.flagMethod.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.BAM.ismappedMethod.
ismapped(record::Record)::Bool

Test if record is mapped.

source
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
ispositivestrand(record::Record)::Bool

Test if record is aligned to the positive strand.

This is equivalent to flag(record) & 0x10 == 0.

source
XAM.BAM.isprimaryMethod.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
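
The two equivalences quoted above, flag(record) & 0x10 == 0 and flag(record) & 0x900 == 0, can be checked on bare flag values without reading a file. In this sketch the constant and function names are illustrative assumptions (not XAM API); the bit values are the ones given in the SAM flag listing, where 0x0100 marks a non-primary alignment and 0x0800 a supplementary alignment.

```julia
# Illustrative names; only the bit values come from the flag listing.
const NOT_PRIMARY   = 0x0100
const SUPPLEMENTARY = 0x0800

# The 0x900 mask is the union of the two bits above.
primary_mask = NOT_PRIMARY | SUPPLEMENTARY

# Same expressions as in the docstrings, applied to a bare flag value.
is_primary(flag::UInt16) = flag & 0x900 == 0
is_positive_strand(flag::UInt16) = flag & 0x10 == 0

println(primary_mask == 0x0900)
println(is_primary(0x0063), " ", is_primary(0x0800))
println(is_positive_strand(0x0063), " ", is_positive_strand(0x0010))
```

Julia gives & a higher precedence than ==, so no extra parentheses are needed in these expressions.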
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.BAM.n_cigar_opFunction.
n_cigar_op(record::Record, checkCG::Bool = true)

Return the number of operations in the CIGAR string of record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the number of operations in the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the number of operations in the cigar field of the BAM record, set checkCG to false.

source
nextposition(record::Record)::Int

Get the 1-based leftmost mapping position of the next/mate read of record.

source
XAM.BAM.nextrefidMethod.
nextrefid(record::Record)::Int

Get the next/mate reference sequence ID of record.

source
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.BAM.positionMethod.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.BAM.qualityMethod.
quality(record::Record)::Vector{UInt8}

Get the base quality of record.

source
XAM.BAM.refidMethod.
refid(record::Record)::Int

Get the reference sequence ID of record.

The ID is 1-based (i.e. the first sequence is 1) and is 0 for a record without a mapping position.

See also: BAM.refname

source
XAM.BAM.reflenMethod.
reflen(record::Record)::Int

Get the length of the reference sequence this record applies to.

source
XAM.BAM.refnameMethod.
refname(record::Record)::String

Get the reference sequence name of record.

See also: BAM.refid

source
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.BAM.seqlengthMethod.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.BAM.sequenceMethod.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.BAM.templengthMethod.
templength(record::Record)::Int

Get the template length of record.

source
XAM.BAM.tempnameMethod.
tempname(record::Record)::String

Get the query template name of record.

source
diff --git a/dev/hts-files/index.html b/dev/hts-files/index.html
index d7cf6d4..913af5d 100644
--- a/dev/hts-files/index.html
+++ b/dev/hts-files/index.html
@@ -51,7 +51,7 @@ julia> find(header(reader), "SQ")
 Bio.Align.SAM.MetaInfo:
     tag: SQ
   value: SN=mitochondria LN=366924
-

In the above we can see there were 7 sequences in the reference: 5 chromosomes, one chloroplast sequence, and one mitochondrial sequence.

SAM and BAM Records

BioAlignments supports the following accessors for SAM.Record types.

XAM.SAM.flagFunction.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.SAM.ismappedFunction.
ismapped(record::Record)::Bool

Test if record is mapped.

source
XAM.SAM.isprimaryFunction.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.SAM.refnameFunction.
refname(record::Record)::String

Get the reference sequence name of record.

source
XAM.SAM.positionFunction.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.SAM.rightpositionFunction.
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.SAM.isnextmappedFunction.
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.SAM.nextrefnameFunction.
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.SAM.nextpositionFunction.
nextposition(record::Record)::Int

Get the position of the mate/next read of record.

source
XAM.SAM.mappingqualityFunction.
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.SAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

source
XAM.SAM.alignmentFunction.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.SAM.alignlengthFunction.
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.SAM.tempnameFunction.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.SAM.templengthFunction.
templength(record::Record)::Int

Get the template length of record.

source
XAM.SAM.sequenceFunction.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
sequence(::Type{String}, record::Record)::String

Get the segment sequence of record as String.

source
XAM.SAM.seqlengthFunction.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.SAM.qualityFunction.
quality(record::Record)::Vector{UInt8}

Get the Phred-scaled base quality of record.

source
quality(::Type{String}, record::Record)::String

Get the ASCII-encoded base quality of record.

source
XAM.SAM.auxdataFunction.
auxdata(record::Record)::Dict{String,Any}

Get the auxiliary data (optional fields) of record.

source

BioAlignments supports the following accessors for BAM.Record types.

XAM.BAM.flagFunction.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.BAM.ismappedFunction.
ismapped(record::Record)::Bool

Test if record is mapped.

source
XAM.BAM.isprimaryFunction.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.BAM.refidFunction.
refid(record::Record)::Int

Get the reference sequence ID of record.

The ID is 1-based (i.e. the first sequence is 1) and is 0 for a record without a mapping position.

See also: BAM.refname

source
XAM.BAM.refnameFunction.
refname(record::Record)::String

Get the reference sequence name of record.

See also: BAM.refid

source
XAM.BAM.reflenFunction.
reflen(record::Record)::Int

Get the length of the reference sequence this record applies to.

source
XAM.BAM.positionFunction.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.BAM.rightpositionFunction.
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.BAM.isnextmappedFunction.
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.BAM.nextrefidFunction.
nextrefid(record::Record)::Int

Get the next/mate reference sequence ID of record.

source
XAM.BAM.nextrefnameFunction.
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.BAM.nextpositionFunction.
nextposition(record::Record)::Int

Get the 1-based leftmost mapping position of the next/mate read of record.

source
XAM.BAM.mappingqualityFunction.
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.BAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

Note that in the BAM specification, the field called cigar typically stores the cigar string of the record. This is not always the case: when the true cigar is very long, constraints of the BAM format force the actual cigar string to be stored in an extra tag, CG:B,I, and the cigar field stores a pseudo-cigar string instead.

When checkCG is true (the default), this method always yields the true cigar string, which is probably what you want the vast majority of the time.

If a record stores the true cigar in a CG:B,I tag but you still want the pseudo-cigar stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar_rle.

source
XAM.BAM.alignmentFunction.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.BAM.alignlengthFunction.
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.BAM.tempnameFunction.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.BAM.templengthFunction.
templength(record::Record)::Int

Get the template length of record.

source
XAM.BAM.sequenceFunction.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.BAM.seqlengthFunction.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.BAM.qualityFunction.
quality(record::Record)::Vector{UInt8}

Get the base quality of record.

source
XAM.BAM.auxdataFunction.
auxdata(record::Record)::BAM.AuxData

Get the auxiliary data of record.

source

Accessing auxiliary data

SAM and BAM records support the storing of optional data fields associated with tags.

Tagged auxiliary data follows the format TAG:TYPE:VALUE. TAG is a two-letter string, and each tag can appear only once per record. TYPE is a single case-sensitive letter that defines the format of VALUE.

Type    Description
'A'     Printable character
'i'     Signed integer
'f'     Single-precision floating number
'Z'     Printable string, including space
'H'     Byte array in Hex format
'B'     Integer or numeric array
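
As a rough sketch of the TAG:TYPE:VALUE layout described above, the following Base-only function parses one textual auxiliary field into a tag and a typed value. The name parse_aux is hypothetical and not part of XAM; the sketch converts only the 'A', 'i' and 'f' types and leaves 'Z', 'H' and 'B' values as raw text.

```julia
# Hypothetical helper, not part of XAM: parse one "TAG:TYPE:VALUE" text field.
function parse_aux(field::AbstractString)
    # limit=3 keeps any further ':' characters inside VALUE.
    tag, typ, value = split(field, ':'; limit=3)
    parsed = if typ == "A"
        first(value)              # single printable character
    elseif typ == "i"
        parse(Int, value)         # signed integer
    elseif typ == "f"
        parse(Float32, value)     # single-precision float
    else
        String(value)             # 'Z', 'H' and 'B' are left as raw text here
    end
    return String(tag) => parsed
end

println(parse_aux("NM:i:1"))
println(parse_aux("RG:Z:grp:1"))
```

Because the returned value type depends on the TYPE letter, a caller that knows the type in advance can assert it, for example parse_aux("NM:i:1").second::Int.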

For more information about these tags and their types, see the SAM/BAM specification and the additional optional fields specification document.

Some tags are reserved as predefined standard tags for specific uses.

To access optional fields stored in tags, use getindex indexing syntax on the record object. Note that accessing optional tag fields results in type instability in Julia, because the type of the optional data is not known until run time, when the tag is read. This can have a significant impact on performance. If you know the type of a value in advance, specifying it as a type annotation alleviates the problem.

Below is an example of looping over records in a BAM file and using indexing syntax to get the data stored in the "NM" tag. Note the UInt8 type assertion to alleviate type instability.

for record in open(BAM.Reader, "data.bam")
+

In the above we can see there were 7 sequences in the reference: 5 chromosomes, one chloroplast sequence, and one mitochondrial sequence.

SAM and BAM Records

BioAlignments supports the following accessors for SAM.Record types.

XAM.SAM.flagFunction.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.SAM.ismappedFunction.
ismapped(record::Record)::Bool

Test if record is mapped.

source
XAM.SAM.isprimaryFunction.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.SAM.refnameFunction.
refname(record::Record)::String

Get the reference sequence name of record.

source
XAM.SAM.positionFunction.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.SAM.rightpositionFunction.
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.SAM.isnextmappedFunction.
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.SAM.nextrefnameFunction.
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.SAM.nextpositionFunction.
nextposition(record::Record)::Int

Get the position of the mate/next read of record.

source
XAM.SAM.mappingqualityFunction.
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.SAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

source
XAM.SAM.alignmentFunction.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.SAM.alignlengthFunction.
alignlength(record::Record)::Int

Get the alignment length of record.

source
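To make the relationship between position, alignlength, and rightposition concrete, here is a minimal plain-Julia sketch. It assumes that the alignment length is the number of reference bases covered by the CIGAR string (operations M, D, N, = and X in the SAM specification) and that the rightmost position is the leftmost position plus that length minus one. The helper `reference_span` is hypothetical and is not part of XAM.

```julia
# Hypothetical helper: number of reference bases spanned by a CIGAR string.
# Only operations that consume the reference are counted: M, D, N, = and X.
function reference_span(cigar::AbstractString)
    span = 0
    for m in eachmatch(r"(\d+)([MIDNSHP=X])", cigar)
        len = parse(Int, m.captures[1])   # operation length
        op = m.captures[2]                # operation letter
        if op in ("M", "D", "N", "=", "X")
            span += len
        end
    end
    return span
end

leftmost = 100                                # 1-based leftmost position
ref_len = reference_span("5S10M2I3D20M")      # 10 + 3 + 20 = 33
rightmost = leftmost + ref_len - 1            # 132
```

Soft clips (S) and insertions (I) do not advance along the reference, which is why they are skipped in the sum.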
XAM.SAM.tempnameFunction.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.SAM.templengthFunction.
templength(record::Record)::Int

Get the template length of record.

source
XAM.SAM.sequenceFunction.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
sequence(::Type{String}, record::Record)::String

Get the segment sequence of record as String.

source
XAM.SAM.seqlengthFunction.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.SAM.qualityFunction.
quality(record::Record)::Vector{UInt8}

Get the Phred-scaled base quality of record.

source
quality(::Type{String}, record::Record)::String

Get the ASCII-encoded base quality of record.

source
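The two quality methods differ only in representation. The sketch below shows the conversion in plain Julia, assuming the standard Phred+33 encoding from the SAM specification; `decode_quality` and `encode_quality` are hypothetical helpers, not XAM functions.

```julia
# Hypothetical helper: decode an ASCII (Phred+33) quality string into scores.
# Each character code minus 33 (0x21) is the Phred-scaled quality.
decode_quality(qual::AbstractString) = [UInt8(c) - 0x21 for c in qual]

# Hypothetical helper: the inverse, from scores back to the ASCII string.
encode_quality(scores::AbstractVector{<:Integer}) =
    String([UInt8(s + 0x21) for s in scores])

scores = decode_quality("II5!")   # 'I' -> 40, '5' -> 20, '!' -> 0
text = encode_quality(scores)     # "II5!"
```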
XAM.SAM.auxdataFunction.
auxdata(record::Record)::Dict{String,Any}

Get the auxiliary data (optional fields) of record.

source

XAM supports the following accessors for BAM.Record types.

XAM.BAM.flagFunction.
flag(record::Record)::UInt16

Get the bitwise flag of record.

source
XAM.BAM.ismappedFunction.
ismapped(record::Record)::Bool

Test if record is mapped.

source
XAM.BAM.isprimaryFunction.
isprimary(record::Record)::Bool

Test if record is a primary line of the read.

This is equivalent to flag(record) & 0x900 == 0.

source
XAM.BAM.refidFunction.
refid(record::Record)::Int

Get the reference sequence ID of record.

The ID is 1-based (i.e. the first sequence is 1) and is 0 for a record without a mapping position.

See also: BAM.refname

source
XAM.BAM.refnameFunction.
refname(record::Record)::String

Get the reference sequence name of record.

See also: BAM.refid

source
XAM.BAM.reflenFunction.
reflen(record::Record)::Int

Get the length of the reference sequence this record applies to.

source
XAM.BAM.positionFunction.
position(record::Record)::Int

Get the 1-based leftmost mapping position of record.

source
XAM.BAM.rightpositionFunction.
rightposition(record::Record)::Int

Get the 1-based rightmost mapping position of record.

source
XAM.BAM.isnextmappedFunction.
isnextmapped(record::Record)::Bool

Test if the mate/next read of record is mapped.

source
XAM.BAM.nextrefidFunction.
nextrefid(record::Record)::Int

Get the next/mate reference sequence ID of record.

source
XAM.BAM.nextrefnameFunction.
nextrefname(record::Record)::String

Get the reference name of the mate/next read of record.

source
XAM.BAM.nextpositionFunction.
nextposition(record::Record)::Int

Get the 1-based leftmost mapping position of the next/mate read of record.

source
XAM.BAM.mappingqualityFunction.
mappingquality(record::Record)::UInt8

Get the mapping quality of record.

source
XAM.BAM.cigarFunction.
cigar(record::Record)::String

Get the CIGAR string of record.

In the BAM specification, the cigar field typically stores the CIGAR string of the record. This is not always the case: when the true CIGAR is very long, constraints of the BAM format require the actual CIGAR string to be stored in an extra tag, CG:B,I, and the cigar field then stores a pseudo-CIGAR string.

When checkCG is true (the default), this method always yields the true CIGAR string, which is what you want the vast majority of the time.

If a record stores the true CIGAR in a CG:B,I tag but you want the pseudo-CIGAR stored in the cigar field of the BAM record, set checkCG to false.

See also BAM.cigar_rle.

source
XAM.BAM.alignmentFunction.
alignment(record::Record)::BioAlignments.Alignment

Get the alignment of record.

source
XAM.BAM.alignlengthFunction.
alignlength(record::Record)::Int

Get the alignment length of record.

source
XAM.BAM.tempnameFunction.
tempname(record::Record)::String

Get the query template name of record.

source
XAM.BAM.templengthFunction.
templength(record::Record)::Int

Get the template length of record.

source
XAM.BAM.sequenceFunction.
sequence(record::Record)::BioSequences.DNASequence

Get the segment sequence of record.

source
XAM.BAM.seqlengthFunction.
seqlength(record::Record)::Int

Get the sequence length of record.

source
XAM.BAM.qualityFunction.
quality(record::Record)::Vector{UInt8}

Get the base quality of record.

source
XAM.BAM.auxdataFunction.
auxdata(record::Record)::BAM.AuxData

Get the auxiliary data of record.

source

Accessing auxiliary data

SAM and BAM records support the storing of optional data fields associated with tags.

Tagged auxiliary data follows the format TAG:TYPE:VALUE. TAG is a two-letter string, and each tag can appear only once per record. TYPE is a single case-sensitive letter that defines the format of VALUE.

Type    Description
'A'     Printable character
'i'     Signed integer
'f'     Single-precision floating-point number
'Z'     Printable string, including space
'H'     Byte array in hex format
'B'     Integer or numeric array
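
As an illustration of the TAG:TYPE:VALUE layout and the types listed above, the following plain-Julia sketch parses one textual optional field. The helper `parse_optional_field` is hypothetical; it is not how XAM reads records, only a way to see what each TYPE letter implies for VALUE.

```julia
# Hypothetical helper: turn one SAM optional field into a `tag => value` pair.
function parse_optional_field(field::AbstractString)
    # VALUE may itself contain ':' (for example in 'Z' strings),
    # so split into at most three parts.
    parts = split(field, ':'; limit = 3)
    length(parts) == 3 || throw(ArgumentError("expected TAG:TYPE:VALUE, got $field"))
    tag, typ, value = parts
    length(tag) == 2 || throw(ArgumentError("TAG must be two characters: $field"))
    parsed = if typ == "A"
        only(value)                   # single printable character
    elseif typ == "i"
        parse(Int, value)             # signed integer
    elseif typ == "f"
        parse(Float32, value)         # single-precision float
    elseif typ == "Z"
        String(value)                 # printable string, spaces allowed
    elseif typ == "H"
        hex2bytes(String(value))      # byte array in hex format
    elseif typ == "B"
        # The first item is the element subtype letter, the rest are numbers.
        items = split(value, ',')
        T = items[1] == "f" ? Float32 : Int
        [parse(T, x) for x in items[2:end]]
    else
        throw(ArgumentError("unknown TYPE '$typ' in $field"))
    end
    return String(tag) => parsed
end

parse_optional_field("NM:i:3")          # "NM" => 3
parse_optional_field("RG:Z:sample 1")   # "RG" => "sample 1"
parse_optional_field("XB:B:c,1,-2,3")   # "XB" => [1, -2, 3]
```

Because the returned value can be a character, a number, a string, or an array, the result type depends on the data, which is the source of the type instability discussed below.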

For more information about these tags and their types we refer you to the [SAM/BAM specification][samtools-spec] and the additional [optional fields specification][samtags] document.

Some tags are reserved as predefined standard tags with specific uses.

To access optional fields stored in tags, use indexing (getindex) syntax on the record object. Note that accessing optional fields is type-unstable in Julia, because the type of the value is not known until run time, when the tag is read. This can have a significant impact on performance. If you know the type of a value in advance, a type assertion alleviates the problem.

Below is an example of looping over records in a BAM file and using indexing syntax to get the data stored in the "NM" tag. Note the UInt8 type assertion to alleviate type instability.

for record in open(BAM.Reader, "data.bam")
     nm = record["NM"]::UInt8
     # do something
 end
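
The effect of the assertion can be reproduced without a BAM file. The sketch below uses a plain `Dict{String,Any}` as a stand-in for a record's optional fields (an assumption of this example, not the actual record type) and sums the "NM" values inside a function.

```julia
# Stand-in for the optional fields of three records; values are stored as `Any`,
# so the compiler cannot know their concrete type when they are looked up.
records = [Dict{String,Any}("NM" => UInt8(n)) for n in (0, 2, 5)]

function total_mismatches(records)
    total = 0
    for aux in records
        # The assertion gives the compiler the concrete type of the value
        # and throws a TypeError if the stored value is not a UInt8.
        nm = aux["NM"]::UInt8
        total += nm
    end
    return total
end

total_mismatches(records)  # 0 + 2 + 5 = 7
```

The assertion is also a run-time check: if a file stores the tag with a different integer width, the lookup fails loudly instead of silently producing a value of another type.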

Getting records in a range

XAM supports BAI indexes for fetching records in a specific range from a BAM file. [Samtools][samtools] provides the index subcommand to create an index file (.bai) from a sorted BAM file.

$ samtools index -b SRR1238088.sort.bam