<htmllang="en"><head><metacharset="UTF-8"/><metaname="viewport"content="width=device-width, initial-scale=1.0"/><title>SAM and BAM · XAM.jl</title><scriptdata-outdated-warnersrc="../../assets/warner.js"></script><linkhref="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css"rel="stylesheet"type="text/css"/><linkhref="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.039/juliamono-regular.css"rel="stylesheet"type="text/css"/><linkhref="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/fontawesome.min.css"rel="stylesheet"type="text/css"/><linkhref="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/solid.min.css"rel="stylesheet"type="text/css"/><linkhref="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/brands.min.css"rel="stylesheet"type="text/css"/><linkhref="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.11/katex.min.css"rel="stylesheet"type="text/css"/><script>documenterBaseURL="../.."</script><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js"data-main="../../assets/documenter.js"></script><scriptsrc="../../siteinfo.js"></script><scriptsrc="../../../versions.js"></script><linkclass="docs-theme-link"rel="stylesheet"type="text/css"href="../../assets/themes/documenter-dark.css"data-theme-name="documenter-dark"data-theme-primary-dark/><linkclass="docs-theme-link"rel="stylesheet"type="text/css"href="../../assets/themes/documenter-light.css"data-theme-name="documenter-light"data-theme-primary/><scriptsrc="../../assets/themeswap.js"></script></head><body><divid="documenter"><navclass="docs-sidebar"><aclass="docs-logo"href="../../"><imgsrc="../../assets/logo.svg"alt="XAM.jl logo"/></a><divclass="docs-package-name"><spanclass="docs-autofit"><ahref="../../">XAM.jl</a></span></div><formclass="docs-search"action="../../search/"><inputclass="docs-search-query"id="documenter-search-query"name="q"type="text"placeholder="Search 
docs"/></form><ulclass="docs-menu"><li><aclass="tocitem"href="../../">Home</a></li><liclass="is-active"><aclass="tocitem"href>SAM and BAM</a><ulclass="internal"><li><aclass="tocitem"href="#Introduction"><span>Introduction</span></a></li><li><aclass="tocitem"href="#Reading-SAM-and-BAM-files"><span>Reading SAM and BAM files</span></a></li><li><aclass="tocitem"href="#SAM-and-BAM-Headers"><span>SAM and BAM Headers</span></a></li><li><aclass="tocitem"href="#SAM-and-BAM-Records"><span>SAM and BAM Records</span></a></li><li><aclass="tocitem"href="#Accessing-auxiliary-data"><span>Accessing auxiliary data</span></a></li><li><aclass="tocitem"href="#Getting-records-in-a-range"><span>Getting records in a range</span></a></li><li><aclass="tocitem"href="#Getting-records-overlapping-genomic-features"><span>Getting records overlapping genomic features</span></a></li><li><aclass="tocitem"href="#Writing-files"><span>Writing files</span></a></li></ul></li><li><aclass="tocitem"href="../api/">API Reference</a></li></ul><divclass="docs-version-selector field has-addons"><divclass="control"><spanclass="docs-label button is-static is-size-7">Version</span></div><divclass="docs-selector control is-expanded"><divclass="select is-fullwidth is-size-7"><selectid="documenter-version-selector"></select></div></div></div></nav><divclass="docs-main"><headerclass="docs-navbar"><navclass="breadcrumb"><ulclass="is-hidden-mobile"><liclass="is-active"><ahref>SAM and BAM</a></li></ul><ulclass="is-hidden-tablet"><liclass="is-active"><ahref>SAM and BAM</a></li></ul></nav><divclass="docs-right"><aclass="docs-edit-link"href="https://github.com/BioJulia/XAM.jl/blob/develop/docs/src/man/hts-files.md"title="Edit on GitHub"><spanclass="docs-icon fab"></span><spanclass="docs-label is-hidden-touch">Edit on GitHub</span></a><aclass="docs-settings-button fas fa-cog"id="documenter-settings-button"href="#"title="Settings"></a><aclass="docs-sidebar-button fa fa-bars 
is-hidden-desktop"id="documenter-sidebar-button"href="#"></a></div></header><articlecl
r001 147 ref 37 30 9M = 7 -39 CAGCGGCAT * NM:i:1</code></pre><p>Here the first two lines form the "header", and the remaining lines are "records". Each record describes how a read aligns to a reference sequence. Often one record describes one read, but in cases such as chimeric reads and split alignments, multiple records apply to a single read. In the example above, <code>r003</code> is a chimeric read, <code>r004</code> is a split alignment, and the <code>r001</code> records are mate-pair reads. Again, we refer you to the official <a href="https://samtools.github.io/hts-specs/SAMv1.pdf">specification</a> for more details.</p><p>A BAM file stores the same information in a binary, compressible format that does not make for pretty printing here!</p><h2 id="Reading-SAM-and-BAM-files"><a class="docs-heading-anchor" href="#Reading-SAM-and-BAM-files">Reading SAM and BAM files</a><a id="Reading-SAM-and-BAM-files-1"></a><a class="docs-heading-anchor-permalink" href="#Reading-SAM-and-BAM-files" title="Permalink"></a></h2><p>A typical script that iterates over all records in a file looks like this:</p><pre><code class="language-julia hljs">using XAM
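
# A minimal sketch of the loop described above, not a definitive implementation.
# Assumption: "data.bam" is a placeholder path to an existing BAM file.
reader = open(BAM.Reader, "data.bam")

for record in reader
    # `record` is a BAM.Record; report where each mapped read aligns.
    if BAM.ismapped(record)
        println(BAM.refname(record), ':', BAM.position(record))
    end
end

# Release the file handle once iteration is finished.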
close(reader)</code></pre><p>BAM files are often extremely large. The iterator interface demonstrated above allocates a new object for each record, which can become a bottleneck when reading a BAM file. In-place reading reuses a single pre-allocated object for every record, so far less memory is allocated while reading:</p><pre><code class="language-julia hljs">reader = open(BAM.Reader, "data.bam")
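# A hedged sketch of in-place reading, assuming `reader` is the BAM.Reader opened above.
record = BAM.Record()      # allocated once, then reused for every read
while !eof(reader)
    read!(reader, record)  # overwrite `record` with the next record in the file
    # Work with `record` here; copy anything that must outlive this iteration.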
end</code></pre><h2 id="SAM-and-BAM-Headers"><a class="docs-heading-anchor" href="#SAM-and-BAM-Headers">SAM and BAM Headers</a><a id="SAM-and-BAM-Headers-1"></a><a class="docs-heading-anchor-permalink" href="#SAM-and-BAM-Headers" title="Permalink"></a></h2><p>Both <code>SAM.Reader</code> and <code>BAM.Reader</code> implement the <code>header</code> function, which returns a <code>SAM.Header</code> object. To extract specific information from a header, use the <code>find</code> method on the header, which selects entries by their SAM/BAM tag. Again, we refer you to the <a href="https://samtools.github.io/hts-specs/SAMv1.pdf">specification</a> for full details of all the different tags that can occur in headers, and what they mean.</p><p>Below is an example of extracting all the information about the reference sequences from a file's header. In SAM/BAM, any description of a reference sequence is stored in the header, under a tag denoted <code>SQ</code> (think <code>reference SeQuence</code>!).</p><pre><code class="language-jlcon hljs">julia> reader = open(SAM.Reader, "data.sam");
</code></pre><p>In the above, we can see that the reference contains 7 sequences: 5 chromosomes, one chloroplast sequence, and one mitochondrial sequence.</p><h2 id="SAM-and-BAM-Records"><a class="docs-heading-anchor" href="#SAM-and-BAM-Records">SAM and BAM Records</a><a id="SAM-and-BAM-Records-1"></a><a class="docs-heading-anchor-permalink" href="#SAM-and-BAM-Records" title="Permalink"></a></h2><h3 id="SAM.Record"><a class="docs-heading-anchor" href="#SAM.Record">SAM.Record</a><a id="SAM.Record-1"></a><a class="docs-heading-anchor-permalink" href="#SAM.Record" title="Permalink"></a></h3><p>The <code>XAM</code> package supports the following accessors for <code>SAM.Record</code> types.</p><article class="docstring"><header><a class="docstring-binding" id="XAM.flag" href="#XAM.flag"><code>XAM.flag</code></a> — <span class="docstring-category">Function</span></header><section><div><pre><code class="language-julia hljs">flag(record::Union{SAM.Record, BAM.Record})::UInt16</code></pre><p>Get the bitwise flags of <code>record</code>. The returned value is a <code>UInt16</code> in which all the flags that apply to the record are OR'd together. The possible flags are:</p><pre><code class="nohighlight hljs">0x0001 template having multiple segments in sequencing
0x0002 each segment properly aligned according to the aligner
0x0004 segment unmapped
0x0008 next segment in the template unmapped
0x0010 SEQ being reverse complemented
0x0020 SEQ of the next segment in the template being reverse complemented
0x0040 the first segment in the template
0x0080 the last segment in the template
0x0100 secondary alignment
0x0200 not passing filters, such as platform/vendor quality controls
0x0400 PCR or optical duplicate
0x0800 supplementary alignment</code></pre></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/XAM.jl#L7-L25">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.ismapped"href="#XAM.SAM.ismapped"><code>XAM.SAM.ismapped</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">ismapped(record::Record)::Bool</code></pre><p>Test if <code>record</code> is mapped.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L162-L166">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.isprimary"href="#XAM.SAM.isprimary"><code>XAM.SAM.isprimary</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">isprimary(record::Record)::Bool</code></pre><p>Test if <code>record</code> is a primary line of the read.</p><p>This is equivalent to <code>flag(record) & 0x900 == 0</code>.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L171-L177">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.refname"href="#XAM.SAM.refname"><code>XAM.SAM.refname</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">refname(record::Record)::String</code></pre><p>Get the reference sequence name of 
<code>record</code>.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L182-L186">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.position"href="#XAM.SAM.position"><code>XAM.SAM.position</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">position(record::Record)::Int</code></pre><p>Get the 1-based leftmost mapping position of <code>record</code>.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L199-L203">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.rightposition"href="#XAM.SAM.rightposition"><code>XAM.SAM.rightposition</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">rightposition(record::Record)::Int</code></pre><p>Get the 1-based rightmost mapping position of <code>record</code>.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L217-L221">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.isnextmapped"href="#XAM.SAM.isnextmapped"><code>XAM.SAM.isnextmapped</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">isnextmapped(record::Record)::Bool</code></pre><p>Test if the mate/next read of <code>record</code> is 
mapped.</p></div><aclass="docs-sourcelink"target="_blank"href="https://github.com/BioJulia/XAM.jl/blob/c7114bce16c331804b748cbbede224d4c35b906f/src/sam/record.jl#L230-L234">source</a></section></article><articleclass="docstring"><header><aclass="docstring-binding"id="XAM.SAM.nextrefname"href="#XAM.SAM.nextrefname"><code>XAM.SAM.nextrefname</code></a> — <spanclass="docstring-category">Function</span></header><section><div><pre><codeclass="language-julia hljs">nextrefname(record::Record)::String</code></pre><p>Get the reference name of the mate/next read of <code>record</code>.</p></
end</code></pre><h2 id="Getting-records-in-a-range"><a class="docs-heading-anchor" href="#Getting-records-in-a-range">Getting records in a range</a><a id="Getting-records-in-a-range-1"></a><a class="docs-heading-anchor-permalink" href="#Getting-records-in-a-range" title="Permalink"></a></h2><p>The <code>XAM</code> package supports BAI indexes for fetching records in a specific range from a BAM file. <a href="https://samtools.github.io/">Samtools</a> provides the <code>index</code> subcommand, which creates an index file (.bai) from a sorted BAM file.</p><pre><code class="language-console hljs">$ samtools index -b SRR1238088.sort.bam
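</code></pre><p>A minimal sketch of the range query that such an index enables. It assumes the index sits next to the BAM file as <code>SRR1238088.sort.bam.bai</code>; the reference name <code>Chr2</code> and the coordinate range are illustrative assumptions, and <code>eachoverlap</code> is taken from GenomicFeatures.jl:</p><pre><code class="language-julia hljs">using GenomicFeatures  # provides `eachoverlap`
using XAM

# Open the sorted BAM file together with its BAI index (file names are assumptions).
reader = open(BAM.Reader, "SRR1238088.sort.bam", index = "SRR1238088.sort.bam.bai")

# Visit only the records overlapping positions 10000 to 11000 of "Chr2".
for record in eachoverlap(reader, "Chr2", 10000:11000)
    println(BAM.tempname(record), '\t', BAM.position(record))
end

# Close the reader when done.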
close(reader)</code></pre><h2 id="Getting-records-overlapping-genomic-features"><a class="docs-heading-anchor" href="#Getting-records-overlapping-genomic-features">Getting records overlapping genomic features</a><a id="Getting-records-overlapping-genomic-features-1"></a><a class="docs-heading-anchor-permalink" href="#Getting-records-overlapping-genomic-features" title="Permalink"></a></h2><p>The <code>eachoverlap</code> method also accepts the <code>Interval</code> type defined in <a href="https://github.com/BioJulia/GenomicFeatures.jl">GenomicFeatures.jl</a>.</p><p>This lets you, for example, first read the genomic features from a GFF3 file and then, for each feature, iterate over all the BAM records that overlap it.</p><pre><code class="language-julia hljs">using GenomicFeatures
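using XAM

# Hedged sketch: `features` is a hypothetical stand-in for intervals loaded from a GFF3 file.
# The reference name, coordinates, and file names below are illustrative assumptions,
# and `Interval` is assumed to be exported by the installed GenomicFeatures.jl version.
features = [Interval("Chr2", 10000, 11000)]

reader = open(BAM.Reader, "SRR1238088.sort.bam", index = "SRR1238088.sort.bam.bai")
for feature in features
    # `eachoverlap` yields every BAM record that overlaps this interval.
    for record in eachoverlap(reader, feature)
        println(BAM.tempname(record))
    end
end

# Close the reader when done.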
close(reader)</code></pre><h2 id="Writing-files"><a class="docs-heading-anchor" href="#Writing-files">Writing files</a><a id="Writing-files-1"></a><a class="docs-heading-anchor-permalink" href="#Writing-files" title="Permalink"></a></h2><p>To write a BAM or SAM file, you must first create a <code>SAM.Header</code>.</p><p>A <code>SAM.Header</code> is constructed from a vector of <code>SAM.MetaInfo</code> objects.</p><p>For example, to create the following simple header:</p><pre><code class="nohighlight hljs">@HD VN:1.6 SO:coordinate
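</code></pre><p>A sketch of building that header, assuming the <code>SAM.MetaInfo(tag, pairs)</code> constructor form. The name <code>h</code> is chosen to match the variable passed to the writer further down:</p><pre><code class="language-julia hljs">julia> # One MetaInfo per header line: the tag first, then its key/value pairs.

julia> a = SAM.MetaInfo("HD", ["VN" => 1.6, "SO" => "coordinate"]);

julia> # Wrap the vector of MetaInfo objects in a header.

julia> h = SAM.Header([a]);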
</code></pre><p>Then, to create the writer for a SAM file, construct a <code>SAM.Writer</code> from an <code>IO</code> object and the header:</p><pre><code class="language-julia hljs">julia> samw = SAM.Writer(open("my-data.sam", "w"), h)
</code></pre><p>Making a BAM writer is slightly different, as you need to wrap the output in a specific stream type from the <a href="https://github.com/BioJulia/BGZFStreams.jl">BGZFStreams.jl</a> package:</p><pre><code class="language-julia hljs">julia> using BGZFStreams
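
# Sketch under stated assumptions: "my-bam.bam" is a placeholder path, `h` is the
# SAM.Header passed to the SAM writer above, and BAM.Writer is assumed to take a
# write-mode BGZFStream followed by that header.
julia> bamw = BAM.Writer(BGZFStream(open("my-bam.bam", "w"), "w"), h)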
</code></pre><p>Once you have a BAM or SAM writer, you can use the <code>write</code> method to write <code>BAM.Record</code>s or <code>SAM.Record</code>s to the file:</p><pre><code class="language-julia hljs">julia> write(bamw, rec) # Here rec is a `BAM.Record`
330780</code></pre></article></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 0.27.10 on <span class="colophon-date" title="Saturday 27 November 2021 01:37">Saturday 27 November 2021</span>. Using Julia version 1.6.4.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>