<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	
	>
<channel>
	<title>
	Comments on: Parsing blast xml output	</title>
	<atom:link href="https://www.polarmicrobes.org/parsing-blast-xml-output/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.polarmicrobes.org/parsing-blast-xml-output/</link>
	<description>Marine Microbial Ecology</description>
	<lastBuildDate>Wed, 30 Aug 2023 06:31:01 +0000</lastBuildDate>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>
		By: Ben		</title>
		<link>https://www.polarmicrobes.org/parsing-blast-xml-output/#comment-486</link>

		<dc:creator><![CDATA[Ben]]></dc:creator>
		<pubDate>Wed, 30 Aug 2023 06:31:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.polarmicrobes.org/?p=753#comment-486</guid>

					<description><![CDATA[Since I am reading this today, others might be tempted to use this script in 2023 too: be careful when using .strip(foo) and .rstrip(bar).

They do not remove foo as a prefix or bar as a suffix. They treat the argument as a set of characters and remove every character in &#039;foo&#039; or &#039;bar&#039; from both ends of the string (or only from the tail, for rstrip).

Example :
&#039;&#039;&#039;
&#062;&#062;&#062; s = &quot;id_seed&quot;
&#062;&#062;&#062; s.strip(&quot;&quot;).strip(&quot;&#060;/&#034;)
&#039;seed&#039;
&#039;&#039;&#039;

...and the &#034;id_&#034; is lost!]]></description>
			<content:encoded><![CDATA[<p>Since I am reading this today, others might be tempted to use this script in 2023 too: be careful when using .strip(foo) and .rstrip(bar).</p>
<p>They do not remove foo as a prefix or bar as a suffix. They treat the argument as a set of characters and remove every character in &#8216;foo&#8217; or &#8216;bar&#8217; from both ends of the string (or only from the tail, for rstrip).</p>
<p>Example :<br />
&#039;&#039;&#039;<br />
&gt;&gt;&gt; s = &quot;id_seed&quot;<br />
&gt;&gt;&gt; s.strip(&quot;&quot;).strip(&quot;&lt;/&quot;)<br />
&#039;seed&#039;<br />
&#039;&#039;&#039;</p>
<p>&#8230;and the &quot;id_&quot; is lost!</p>
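<p>A minimal sketch of the difference, assuming made-up values: the string, the affixes, and the helper name <code>remove_affixes</code> are illustrative and are not taken from the original script. On Python 3.9 and later, <code>str.removeprefix</code> and <code>str.removesuffix</code> do the same job as the helper.</p>

```python
def remove_affixes(text, prefix, suffix):
    """Remove an exact prefix and suffix (hypothetical helper, not from the post)."""
    if prefix and text.startswith(prefix):
        text = text[len(prefix):]
    if suffix and text.endswith(suffix):
        text = text[:-len(suffix)]
    return text


s = "id_seed_id"  # illustrative value

# strip() treats its argument as a set of characters: every leading and
# trailing 'i', 'd' or '_' is removed, including the final 'd' of "seed".
by_strip = s.strip("id_")

# The helper only removes the exact substrings, so "seed" survives intact.
by_affix = remove_affixes(s, "id_", "_id")

print(repr(by_strip), repr(by_affix))
```

<p>Here the set-based strip leaves "see", while exact affix removal leaves "seed".</p>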
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Jeff		</title>
		<link>https://www.polarmicrobes.org/parsing-blast-xml-output/#comment-137</link>

		<dc:creator><![CDATA[Jeff]]></dc:creator>
		<pubDate>Fri, 19 Apr 2013 18:45:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.polarmicrobes.org/?p=753#comment-137</guid>

					<description><![CDATA[I just learned a great trick that makes this a whole lot easier.  Python has a module (gzip) that allows gzipped files to be read as text files, e.g.

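&lt;pre&gt;import gzip
# &#039;rt&#039; opens the archive in text mode, so each line is a str (on Python 3, &#039;rb&#039; yields bytes)
with gzip.open(&#039;really_big_file.xml.gz&#039;, &#039;rt&#039;) as xml:
    for line in xml:
        pass  # do some stuff with each line&lt;/pre&gt;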

This allows massive BLAST output files to stay compressed at about 1/5 of their uncompressed size.  I&#039;ve placed an updated script &lt;a href=&quot;https://sites.google.com/site/bowmanjs/methods/python_scripts&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;.  The updated script also creates a FASTA file of the hits, as amino acid sequences if blastx was used.]]></description>
			<content:encoded><![CDATA[<p>I just learned a great trick that makes this a whole lot easier.  Python has a module (gzip) that allows gzipped files to be read as text files, e.g.</p>
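<pre>import gzip
# 'rt' opens the archive in text mode, so each line is a str (on Python 3, 'rb' yields bytes)
with gzip.open('really_big_file.xml.gz', 'rt') as xml:
    for line in xml:
        pass  # do some stuff with each line</pre>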
<p>This allows massive BLAST output files to stay compressed at about 1/5 of their uncompressed size.  I&#8217;ve placed an updated script <a href="https://sites.google.com/site/bowmanjs/methods/python_scripts" rel="nofollow">here</a>.  The updated script also creates a FASTA file of the hits, as amino acid sequences if blastx was used.</p>
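<p>A hedged, self-contained sketch of the same idea for Python 3: it streams a gzipped file line by line in text mode without inflating it on disk. The helper name <code>count_matching_lines</code>, the demo file, and its contents are assumptions made for illustration, not part of the updated script.</p>

```python
import gzip
import os
import tempfile


def count_matching_lines(path, marker):
    """Count lines containing `marker` in a gzipped text file (illustrative helper)."""
    count = 0
    # 'rt' decompresses on the fly and decodes each line to str.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if marker in line:
                count += 1
    return count


# Build a tiny gzipped file so the sketch runs on its own; the contents are made up.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.xml.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as handle:
    handle.write("Hit_id one\nsomething else\nHit_id two\n")

print(count_matching_lines(demo_path, "Hit_id"))
```

<p>For a real run, pass the path of the compressed BLAST output instead of the demo file.</p>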
]]></content:encoded>
		
			</item>
	</channel>
</rss>
