<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Arin Sarkissian</title>
	<atom:link href="http://arin.me/blog/feed" rel="self" type="application/rss+xml" />
	<link>http://arin.me/blog</link>
	<description>My Blog about Code, PHP, Music, Punk Rock, Guitars, Basses &#38; Other Random Stuff</description>
	<lastBuildDate>Thu, 21 Jan 2010 18:50:09 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Worthless Commenting</title>
		<link>http://arin.me/blog/worthless-commenting</link>
		<comments>http://arin.me/blog/worthless-commenting#comments</comments>
		<pubDate>Thu, 21 Jan 2010 18:50:09 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Rant]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=1030</guid>
		<description><![CDATA[Writing comments in your code is fine and dandy but sometimes it&#8217;s just a fucking waste of time. Here&#8217;s an example:
So, yay &#8211; that chunk of code passes PHPCS (using the PEAR standard). All the parameters are documented, there&#8217;s a line of text explaining what the function does and the docblock even states the return [...]]]></description>
			<content:encoded><![CDATA[<p>Writing comments in your code is fine and dandy but sometimes it&#8217;s just a fucking waste of time. Here&#8217;s an example:</p>
<div id="gist-283042" class="gist">
  
  
    
            

      <div class="gist-file">
        <div class="gist-data gist-syntax">
          
          
          
            <div class="highlight"><pre><div class="line" id="LC1"><span class="cp">&lt;?php</span></div><div class="line" id="LC2"><span class="k">class</span> <span class="nc">UserDataAccessClass</span></div><div class="line" id="LC3"><span class="p">{</span></div><div class="line" id="LC4">&nbsp;&nbsp;&nbsp;&nbsp;<span class="sd">/**</span></div><div class="line" id="LC5"><span class="sd">     * Get the number of followers a user has</span></div><div class="line" id="LC6"><span class="sd">     *</span></div><div class="line" id="LC7"><span class="sd">     * @param string $userID The user id</span></div><div class="line" id="LC8"><span class="sd">     *</span></div><div class="line" id="LC9"><span class="sd">     * @return int</span></div><div class="line" id="LC10"><span class="sd">     */</span></div><div class="line" id="LC11">&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">public</span> <span class="k">function</span> <span class="nf">getNumFollowers</span><span class="p">(</span><span class="nv">$userID</span><span class="p">)</span></div><div class="line" id="LC12">&nbsp;&nbsp;&nbsp;&nbsp;<span class="p">{</span></div><div class="line" id="LC13">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">return</span> <span class="nx">valueFromAServiceCallOrQueryEtc</span><span class="p">(</span><span class="nv">$userID</span><span class="p">);</span></div><div class="line" id="LC14">&nbsp;&nbsp;&nbsp;&nbsp;<span class="p">}</span></div><div class="line" id="LC15"><span class="p">}</span></div><div class="line" id="LC16"><span class="cp">?&gt;</span><span class="x"></span></div></pre></div>
          
        </div>

        <div class="gist-meta">
          <a href="http://gist.github.com/raw/283042/089f505529e12cc23b95cc77a921f78124c560fe/stupidComments.php" style="float:right;">view raw</a>
          <a href="http://gist.github.com/283042#file_stupid_comments.php" style="float:right;margin-right:10px;color:#666">stupidComments.php</a>
          <a href="http://gist.github.com/283042">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
        </div>
      </div>
    
  
</div>

<p>So, yay &#8211; that chunk of code passes <a href="http://pear.php.net/package/PHP_CodeSniffer/redirected">PHPCS</a> (using the PEAR standard). All the parameters are documented, there&#8217;s a line of text explaining what the function does and the docblock even states the return type&#8230; how cute!</p>
<p>But why? Why do I have to type all that crap? The function&#8217;s name is self-documenting &#038; its sole parameter is obvious. The return type makes sense to me but the rest is bullshit. God forbid your function takes multiple parameters; then you&#8217;d have to line up the <code>@param</code>&#8217;s types and descriptions &#8217;cause PHP people have a strange hardon for lining shit up.</p>
<p>The truth is the only reason I do all that stuff is cuz we run <a href="http://pear.php.net/package/PHP_CodeSniffer/redirected">PHPCS</a> on our code at work and I don&#8217;t wanna be &#8220;that guy&#8221;. If it wasn&#8217;t a &#8220;standard&#8221; at work there&#8217;s no way in hell I&#8217;d ever bother.</p>
<p>I&#8217;d much rather have the documentation go like this instead:</p>
<div id="gist-283048" class="gist">
  
  
    
            

      <div class="gist-file">
        <div class="gist-data gist-syntax">
          
          
          
            <div class="highlight"><pre><div class="line" id="LC1"><span class="cp">&lt;?php</span></div><div class="line" id="LC2"><span class="k">class</span> <span class="nc">UserDataAccessClass</span></div><div class="line" id="LC3"><span class="p">{</span></div><div class="line" id="LC4">&nbsp;&nbsp;&nbsp;&nbsp;<span class="sd">/**</span></div><div class="line" id="LC5"><span class="sd">     * @return int</span></div><div class="line" id="LC6"><span class="sd">     */</span></div><div class="line" id="LC7">&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">public</span> <span class="k">function</span> <span class="nf">getNumFollowers</span><span class="p">(</span><span class="nx">string</span> <span class="nv">$userID</span><span class="p">)</span></div><div class="line" id="LC8">&nbsp;&nbsp;&nbsp;&nbsp;<span class="p">{</span></div><div class="line" id="LC9">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">return</span> <span class="nx">valueFromAServiceCallOrQueryEtc</span><span class="p">(</span><span class="nv">$userID</span><span class="p">);</span></div><div class="line" id="LC10">&nbsp;&nbsp;&nbsp;&nbsp;<span class="p">}</span></div><div class="line" id="LC11"><span class="p">}</span></div><div class="line" id="LC12"><span class="cp">?&gt;</span><span class="x"></span></div></pre></div>
          
        </div>

        <div class="gist-meta">
          <a href="http://gist.github.com/raw/283048/501e7979bfa3da43d3099fd43307cd615df4fa40/stupidPHPCS2.php" style="float:right;">view raw</a>
          <a href="http://gist.github.com/283048#file_stupid_phpcs2.php" style="float:right;margin-right:10px;color:#666">stupidPHPCS2.php</a>
          <a href="http://gist.github.com/283048">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
        </div>
      </div>
    
  
</div>

<p>&#8230;And to be 100% honest the only reason I&#8217;d include the <code>@return</code> is &#8217;cause I&#8217;m an Eclipse user &#038; it helps PDT&#8217;s static analysis (aka autocomplete gets more better).</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/worthless-commenting/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brian Setzer is a Pimp</title>
		<link>http://arin.me/blog/brian-setzer-is-a-pimp</link>
		<comments>http://arin.me/blog/brian-setzer-is-a-pimp#comments</comments>
		<pubDate>Mon, 21 Dec 2009 22:20:48 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Guitar & Bass]]></category>
		<category><![CDATA[Music]]></category>
		<category><![CDATA[Brian Setzer]]></category>
		<category><![CDATA[Gretsch]]></category>
		<category><![CDATA[Sleepwalk]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=1011</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<div class="video"><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/ZZuHREUIVz8&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/ZZuHREUIVz8&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></div>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/brian-setzer-is-a-pimp/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Joe Bonamassa Shows Off a Few Guitars</title>
		<link>http://arin.me/blog/joe-bonamassa-shows-off-a-few-guitars</link>
		<comments>http://arin.me/blog/joe-bonamassa-shows-off-a-few-guitars#comments</comments>
		<pubDate>Sat, 12 Dec 2009 09:19:07 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Guitar & Bass]]></category>
		<category><![CDATA[Gibson]]></category>
		<category><![CDATA[Joe Bonamassa]]></category>
		<category><![CDATA[Music Man]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=1002</guid>
		<description><![CDATA[
He shows off a bunch of Gibsons then busts out a few of his Music Man guitars about 2 minutes in. He had great things to say about his MM axes:
They really are the nicest stuff
- Joe Bonamassa
]]></description>
			<content:encoded><![CDATA[<div class="video"><object width="560" height="340"><param name="movie" value="http://www.youtube.com/v/Kco8BEcClyQ&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Kco8BEcClyQ&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="560" height="340"></embed></object></div>
<p>He shows off a bunch of Gibsons then busts out a few of his Music Man guitars about 2 minutes in. He had great things to say about his MM axes:</p>
<blockquote><p>They really are the nicest stuff</p>
<div>- Joe Bonamassa</div></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/joe-bonamassa-shows-off-a-few-guitars/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Indexing Nodes in Neo4J</title>
		<link>http://arin.me/blog/indexing-nodes-in-neo4j</link>
		<comments>http://arin.me/blog/indexing-nodes-in-neo4j#comments</comments>
		<pubDate>Fri, 11 Dec 2009 21:36:29 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Neo4J]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=986</guid>
		<description><![CDATA[I&#8217;ve been playing with #neo4j quite a bit lately. It&#8217;s a great &#038; fun project. It&#8217;s a graph database that mitigates all the bullshit you have to deal with when trying to, ya know, do graph stuff. Example: find all User Nodes whose gender property is set to female, have an outgoing likes relationship to [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing with <a href="http://neo4j.org/">#neo4j</a> quite a bit lately. It&#8217;s a great &#038; fun project. It&#8217;s a graph database that mitigates all the bullshit you have to deal with when trying to, ya know, do graph stuff. Example: find all <code>User Nodes</code> whose <em>gender</em> property is set to <em>female</em>, have an outgoing <em>likes</em> relationship to the <code>Node</code> <em>punk music</em> and are less than 3 degrees of separation from <code>Node #4</code>. Stuff like that. It&#8217;s super good at doing this.</p>
<p>But here&#8217;s the deal&#8230; each <code>Node</code> is gettable via ID which is nice &#8211; but the IDs are Neo4J&#8217;s internal IDs; you don&#8217;t get to set &#8217;em when you create a <code>Node</code>. So, what if I want to get a <code>Node</code> whose <em>username</em> property is <em>phatduckk</em> &#038; start the traversal from there? The problem lies in the fact that you don&#8217;t know that <em>phatduckk</em> is <code>Node #4</code> so you need a simple &#038; efficient way to do that lookup &#038; grab that <code>Node</code>.</p>
<p>If your dataset is small, I guess, you can just use a <code>Map</code> and store the mapping yourself but that solution will fall over pretty quickly. You could also toss info into MySQL but why would you do that? It just doesn&#8217;t feel right to use 2 different stores. So, checking out some of the docs you&#8217;ll see that Neo4J&#8217;s got some indexing capabilities.</p>
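<p>For what it&#8217;s worth, the do-it-yourself <code>Map</code> idea could look roughly like the sketch below. It is plain Java with no Neo4J calls in it. The <code>UsernameLookup</code> class and its <code>register</code> and <code>find</code> methods are made-up names for illustration, and it assumes node IDs are handled as <code>long</code> values.</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of Neo4J): remembers which internal
// node ID belongs to which username so a traversal can start there.
class UsernameLookup {
    // ConcurrentHashMap so several indexing threads can share one instance.
    private final Map<String, Long> idsByUsername = new ConcurrentHashMap<>();

    // Store the node ID for a username; the first registration wins.
    void register(String username, long nodeId) {
        idsByUsername.putIfAbsent(username, nodeId);
    }

    // Return the node ID for a username, or null if it was never registered.
    Long find(String username) {
        return idsByUsername.get(username);
    }

    public static void main(String[] args) {
        UsernameLookup lookup = new UsernameLookup();
        lookup.register("phatduckk", 4L);
        System.out.println(lookup.find("phatduckk")); // prints 4
        System.out.println(lookup.find("nobody"));    // prints null
    }
}
```

<p>Everything here sits in heap and nothing survives a restart, so treat it as a sketch for small datasets only.</p>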
<p>Initially I tried out the <a href="http://gist.github.com/253569"><code>SingleValueIndex</code></a> which <a href="http://gist.github.com/253569">fell over</a> in a multi-threaded scenario. So, I hit up the list and was advised to check out the <a href="http://components.neo4j.org/index-util/apidocs/index.html"><code>LuceneIndexService</code></a>. This worked like a charm. Even with multiple threads constantly indexing the same <code>Node</code>.</p>
<p>Here&#8217;s a little test app. It&#8217;s a brute-force little hack that creates a single <code>Node</code> and indexes it by its <em>username</em> property 1,000,000 times using 10 threads. This is a pretty unrealistic situation but I really wanted to make sure it behaved well in a multi-threaded scenario and didn&#8217;t frustrate me like the <code>SingleValueIndex</code> did.</p>
<div id="gist-254435" class="gist">
  
  
    
            

      <div class="gist-file">
        <div class="gist-data gist-syntax">
          
          
          
            <div class="highlight"><pre><div class="line" id="LC1"><span class="kn">package</span> <span class="n">com</span><span class="o">.</span><span class="na">digg</span><span class="o">.</span><span class="na">tmp</span><span class="o">;</span></div><div class="line" id="LC2">&nbsp;</div><div class="line" id="LC3"><span class="kn">import</span> <span class="nn">org.neo4j.api.core.*</span><span class="o">;</span></div><div class="line" id="LC4"><span class="kn">import</span> <span class="nn">org.neo4j.util.index.IndexService</span><span class="o">;</span></div><div class="line" id="LC5"><span class="kn">import</span> <span class="nn">org.neo4j.util.index.LuceneIndexService</span><span class="o">;</span></div><div class="line" id="LC6">&nbsp;</div><div class="line" id="LC7"><span class="kn">import</span> <span class="nn">java.util.concurrent.ExecutorService</span><span class="o">;</span></div><div class="line" id="LC8"><span class="kn">import</span> <span class="nn">java.util.concurrent.Executors</span><span class="o">;</span></div><div class="line" id="LC9">&nbsp;</div><div class="line" id="LC10"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">LuceneIndex</span> <span class="o">{</span></div><div class="line" id="LC11">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">String</span> <span class="n">USERNAME_INDEX</span> <span class="o">=</span> <span class="s">&quot;usernameIndex&quot;</span><span class="o">;</span></div><div class="line" id="LC12">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="kt">int</span> <span class="n">NUM_THREADS</span> <span class="o">=</span> <span class="mi">10</span><span class="o">;</span></div><div class="line" id="LC13">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> 
<span class="kt">int</span> <span class="n">NUM_LINES</span> <span class="o">=</span> <span class="mi">1000000</span><span class="o">;</span></div><div class="line" id="LC14">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="n">String</span> <span class="n">USERNAME</span> <span class="o">=</span> <span class="s">&quot;phatduckk&quot;</span><span class="o">;</span></div><div class="line" id="LC15">&nbsp;</div><div class="line" id="LC16">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span></div><div class="line" id="LC17">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// always use a new store</span></div><div class="line" id="LC18">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">NeoService</span> <span class="n">neo</span> <span class="o">=</span> <span class="k">new</span> <span class="n">EmbeddedNeo</span><span class="o">(</span><span class="s">&quot;test-&quot;</span> <span class="o">+</span> <span class="n">System</span><span class="o">.</span><span class="na">currentTimeMillis</span><span class="o">());</span></div><div class="line" id="LC19">&nbsp;</div><div class="line" id="LC20">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// now create the node we want indexed:</span></div><div class="line" id="LC21">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">Transaction</span> <span class="n">txUser</span> <span class="o">=</span> <span class="n">neo</span><span class="o">.</span><span class="na">beginTx</span><span class="o">();</span></div><div class="line" id="LC22">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">Node</span> <span class="n">userNode</span> 
<span class="o">=</span> <span class="n">neo</span><span class="o">.</span><span class="na">createNode</span><span class="o">();</span></div><div class="line" id="LC23">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">userNode</span><span class="o">.</span><span class="na">setProperty</span><span class="o">(</span><span class="n">USERNAME_INDEX</span><span class="o">,</span> <span class="n">USERNAME</span><span class="o">);</span></div><div class="line" id="LC24">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">txUser</span><span class="o">.</span><span class="na">success</span><span class="o">();</span></div><div class="line" id="LC25">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">txUser</span><span class="o">.</span><span class="na">finish</span><span class="o">();</span></div><div class="line" id="LC26">&nbsp;</div><div class="line" id="LC27">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// now create the index &amp; setup a pool</span></div><div class="line" id="LC28">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">IndexService</span> <span class="n">idxServ</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LuceneIndexService</span><span class="o">(</span><span class="n">neo</span><span class="o">);</span></div><div class="line" id="LC29">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">final</span> <span class="n">ExecutorService</span> <span class="n">executorService</span> <span class="o">=</span> <span class="n">Executors</span><span class="o">.</span><span class="na">newFixedThreadPool</span><span class="o">(</span><span class="n">NUM_THREADS</span><span class="o">);</span></div><div class="line" id="LC30">&nbsp;</div><div class="line" id="LC31">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// now let&#39;s index that same node NUM_LINES times</span></div><div class="line" 
id="LC32">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// the reason we&#39;re indexing the same node is cuz i&#39;m checking for thread safety during indexing issues</span></div><div class="line" id="LC33">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// otherwise you&#39;d normally be indexing new nodes who&#39;s data you got from some external source</span></div><div class="line" id="LC34">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NUM_LINES</span><span class="o">;</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span></div><div class="line" id="LC35">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;line: &quot;</span> <span class="o">+</span> <span class="n">i</span><span class="o">);</span></div><div class="line" id="LC36">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">IndexRunner</span> <span class="n">command</span> <span class="o">=</span> <span class="k">new</span> <span class="n">IndexRunner</span><span class="o">(</span><span class="n">userNode</span><span class="o">,</span> <span class="n">neo</span><span class="o">,</span> <span class="n">idxServ</span><span class="o">);</span></div><div class="line" id="LC37">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">executorService</span><span class="o">.</span><span class="na">execute</span><span class="o">(</span><span class="n">command</span><span class="o">);</span></div><div class="line" 
id="LC38">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC39">&nbsp;</div><div class="line" id="LC40">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// should do a clean neo.shutdown() at some point ;-)</span></div><div class="line" id="LC41">&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC42">&nbsp;</div><div class="line" id="LC43">&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">static</span> <span class="kd">class</span> <span class="nc">IndexRunner</span> <span class="kd">implements</span> <span class="n">Runnable</span> <span class="o">{</span></div><div class="line" id="LC44">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">NeoService</span> <span class="n">neo</span><span class="o">;</span></div><div class="line" id="LC45">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">IndexService</span> <span class="n">idxServ</span><span class="o">;</span></div><div class="line" id="LC46">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">Node</span> <span class="n">userNode</span><span class="o">;</span></div><div class="line" id="LC47">&nbsp;</div><div class="line" id="LC48">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">IndexRunner</span><span class="o">(</span><span class="n">Node</span> <span class="n">userNode</span><span class="o">,</span> <span class="n">NeoService</span> <span class="n">neo</span><span class="o">,</span> <span class="n">IndexService</span> <span class="n">idxServ</span><span class="o">)</span> <span class="o">{</span></div><div class="line" id="LC49">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">this</span><span class="o">.</span><span class="na">userNode</span> <span class="o">=</span> <span class="n">userNode</span><span class="o">;</span></div><div class="line" id="LC50">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
class="k">this</span><span class="o">.</span><span class="na">neo</span> <span class="o">=</span> <span class="n">neo</span><span class="o">;</span></div><div class="line" id="LC51">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">this</span><span class="o">.</span><span class="na">idxServ</span> <span class="o">=</span> <span class="n">idxServ</span><span class="o">;</span></div><div class="line" id="LC52">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC53">&nbsp;</div><div class="line" id="LC54">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span></div><div class="line" id="LC55">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">Transaction</span> <span class="n">nodetx</span> <span class="o">=</span> <span class="n">neo</span><span class="o">.</span><span class="na">beginTx</span><span class="o">();</span></div><div class="line" id="LC56">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">Node</span> <span class="n">nodeFromIndex</span> <span class="o">=</span> <span class="n">idxServ</span><span class="o">.</span><span class="na">getSingleNode</span><span class="o">(</span><span class="n">USERNAME_INDEX</span><span class="o">,</span> <span class="n">USERNAME</span><span class="o">);</span></div><div class="line" id="LC57">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</div><div class="line" id="LC58">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">if</span> <span class="o">(</span><span class="n">nodeFromIndex</span> <span class="o">!=</span> <span class="kc">null</span><span class="o">)</span> <span class="o">{</span></div><div class="line" 
id="LC59">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;found &quot;</span> <span class="o">+</span> <span class="n">USERNAME</span> <span class="o">+</span> <span class="s">&quot; in the &quot;</span> <span class="o">+</span> <span class="n">USERNAME_INDEX</span></div><div class="line" id="LC60">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">+</span> <span class="s">&quot; index. Node ID is: &quot;</span> <span class="o">+</span> <span class="n">nodeFromIndex</span><span class="o">.</span><span class="na">getId</span><span class="o">());</span></div><div class="line" id="LC61">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span> <span class="k">else</span> <span class="o">{</span></div><div class="line" id="LC62">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">idxServ</span><span class="o">.</span><span class="na">index</span><span class="o">(</span><span class="n">userNode</span><span class="o">,</span> <span class="n">USERNAME_INDEX</span><span class="o">,</span> <span class="n">USERNAME</span><span class="o">);</span></div><div class="line" id="LC63">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC64">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</div><div class="line" id="LC65">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">nodetx</span><span class="o">.</span><span class="na">success</span><span class="o">();</span></div><div class="line" 
id="LC66">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">nodetx</span><span class="o">.</span><span class="na">finish</span><span class="o">();</span></div><div class="line" id="LC67">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC68">&nbsp;&nbsp;&nbsp;&nbsp;<span class="o">}</span></div><div class="line" id="LC69"><span class="o">}</span></div></pre></div>
          
        </div>

        <div class="gist-meta">
          <a href="http://gist.github.com/raw/254435/913da7ecb1094e016b33848530c7729e57da5118/LuceneIndex.java" style="float:right;">view raw</a>
          <a href="http://gist.github.com/254435#file_lucene_index.java" style="float:right;margin-right:10px;color:#666">LuceneIndex.java</a>
          <a href="http://gist.github.com/254435">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
        </div>
      </div>
    
  
</div>

<p>Although this is an off-the-wall example it can also serve as a simple example of how to index a <code>Node</code>. Anywho &#8211; hope this helps out a few folks that ran into the same needs/problems/scenarios I did. In hindsight it&#8217;s all pretty simple &#038; straightforward &#8211; I just went down the wrong path with the <code>SingleValueIndex</code>&#8230; when browsing <a href="http://components.neo4j.org/index-util/">the docs</a> it sounded like the right tool for the job but, from what I can tell, you should avoid it and use the <code>LuceneIndexService</code> instead.</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/indexing-nodes-in-neo4j/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quick &#8217;n Dirty App Store API Search</title>
		<link>http://arin.me/blog/quick-n-dirty-app-store-api-search</link>
		<comments>http://arin.me/blog/quick-n-dirty-app-store-api-search#comments</comments>
		<pubDate>Wed, 09 Dec 2009 07:09:52 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[App Store]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=976</guid>
		<description><![CDATA[More info here.
]]></description>
			<content:encoded><![CDATA[<div id="gist-252317" class="gist">
  
  
    
            

      <div class="gist-file">
        <div class="gist-data gist-syntax">
          
          
          
            <div class="highlight"><pre><div class="line" id="LC1"><span class="cp">&lt;?php</span></div><div class="line" id="LC2">&nbsp;</div><div class="line" id="LC3"><span class="k">if</span> <span class="p">(</span><span class="o">!</span> <span class="nb">isset</span><span class="p">(</span><span class="nv">$argv</span><span class="p">[</span><span class="m">1</span><span class="p">]))</span> <span class="p">{</span></div><div class="line" id="LC4">&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">echo</span> <span class="s2">&quot;enter a search term:</span><span class="se">\n</span><span class="s2">&quot;</span><span class="p">;</span></div><div class="line" id="LC5">&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">echo</span> <span class="s1">&#39;php &#39;</span> <span class="o">.</span> <span class="k">__FILE__</span> <span class="o">.</span> <span class="s2">&quot; &lt;search_term&gt;</span><span class="se">\n</span><span class="s2">&quot;</span><span class="p">;</span></div><div class="line" id="LC6">&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">exit</span><span class="p">;</span></div><div class="line" id="LC7"><span class="p">}</span></div><div class="line" id="LC8">&nbsp;</div><div class="line" id="LC9"><span class="nv">$term</span> <span class="o">=</span> <span class="nb">urlencode</span><span class="p">(</span><span class="nv">$argv</span><span class="p">[</span><span class="m">1</span><span class="p">]);</span></div><div class="line" id="LC10"><span class="nv">$url</span>  <span class="o">=</span> <span class="s2">&quot;http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStoreServices.woa/wa/wsSearch?limit=10&amp;entity=software&amp;term=</span><span class="si">$term</span><span class="s2">&quot;</span><span class="p">;</span></div><div class="line" id="LC11"><span class="nv">$json</span> <span class="o">=</span> <span class="nb">file_get_contents</span><span class="p">(</span><span class="nv">$url</span><span class="p">);</span></div><div class="line" 
id="LC12">&nbsp;</div><div class="line" id="LC13"><span class="nb">print_r</span><span class="p">(</span><span class="nx">json_decode</span><span class="p">(</span><span class="nv">$json</span><span class="p">,</span> <span class="k">true</span><span class="p">));</span></div><div class="line" id="LC14">&nbsp;</div><div class="line" id="LC15"><span class="cp">?&gt;</span><span class="x"></span></div></pre></div>
          
        </div>

        <div class="gist-meta">
          <a href="http://gist.github.com/raw/252317/bbc623bdeaf8a792ee7c01504916c191602b3885/appsearch-example.php" style="float:right;">view raw</a>
          <a href="http://gist.github.com/252317#file_appsearch_example.php" style="float:right;margin-right:10px;color:#666">appsearch-example.php</a>
          <a href="http://gist.github.com/252317">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
        </div>
      </div>
    
  
</div>

<p>More info <a href="http://www.apple.com/itunesaffiliates/API/AffiliatesSearch2.1.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/quick-n-dirty-app-store-api-search/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embed a Gist in Your Wordpress Blog</title>
		<link>http://arin.me/blog/embed-a-gist-in-your-wordpress-blog</link>
		<comments>http://arin.me/blog/embed-a-gist-in-your-wordpress-blog#comments</comments>
		<pubDate>Mon, 07 Dec 2009 09:10:01 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Gist]]></category>
		<category><![CDATA[Gistson]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=958</guid>
		<description><![CDATA[I stumbled upon Paul William&#8217;s plugin for embedding a Gist into a Wordpress blog.
It&#8217;s a quick &#8216;n clean plugin but it relies on a JS &#60;script&#62; tag to render the Gist&#8217;s content&#8230; so, I made a quick tweak to get the plugin to actually put the Gist&#8217;s content into your HTML source. There may already [...]]]></description>
			<content:encoded><![CDATA[<p>I stumbled upon <a href="http://www.entropytheblog.com/blog/2008/12/wordpress-github-gist-shortcode-plugin/">Paul William&#8217;s</a> plugin for embedding a <a href="http://gist.github.com">Gist</a> into a Wordpress blog.</p>
<p>It&#8217;s a quick &#8216;n clean plugin but it relies on a JS &lt;script&gt; tag to render the Gist&#8217;s content&#8230; so, I made a quick tweak to get the plugin to actually put the Gist&#8217;s content into your HTML source. There may already be something similar but, eh, it was just a quick hackjob.</p>
<p>The plugin&#8217;s code and instructions for installation &amp; usage are in the Gist below.</p>
<div id="gist-250722" class="gist">
  
  
    
            

      <div class="gist-file">
        <div class="gist-data gist-syntax">
          
          
          
            <div class="highlight"><pre><div class="line" id="LC1"><span class="cp">&lt;?php</span></div><div class="line" id="LC2"><span class="cm">/*</span></div><div class="line" id="LC3"><span class="cm">Plugin Name: Gistson - Embedded Gist WP Plugin</span></div><div class="line" id="LC4"><span class="cm">Plugin URI: http://arin.me/blog/tag/gistson</span></div><div class="line" id="LC5"><span class="cm">Description: Use a shortcode [gist id=&quot;12345&quot;] to embed A Gist from http://gist.github.com into your blog</span></div><div class="line" id="LC6"><span class="cm">Version: 0.1</span></div><div class="line" id="LC7"><span class="cm">Author: Arin Sarkissian</span></div><div class="line" id="LC8"><span class="cm">Author URI: http://arin.me</span></div><div class="line" id="LC9">&nbsp;</div><div class="line" id="LC10"><span class="cm">Copyright 2009 Arin Sarkissian</span></div><div class="line" id="LC11">&nbsp;</div><div class="line" id="LC12"><span class="cm">This program is free software; you can redistribute it and/or modify</span></div><div class="line" id="LC13"><span class="cm">it under the terms of the GNU General Public License as published by</span></div><div class="line" id="LC14"><span class="cm">the Free Software Foundation; either version 2 of the License, or</span></div><div class="line" id="LC15"><span class="cm">(at your option) any later version.</span></div><div class="line" id="LC16">&nbsp;</div><div class="line" id="LC17"><span class="cm">This program is distributed in the hope that it will be useful,</span></div><div class="line" id="LC18"><span class="cm">but WITHOUT ANY WARRANTY; without even the implied warranty of</span></div><div class="line" id="LC19"><span class="cm">MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  
See the</span></div><div class="line" id="LC20"><span class="cm">GNU General Public License for more details.</span></div><div class="line" id="LC21">&nbsp;</div><div class="line" id="LC22"><span class="cm">You should have received a copy of the GNU General Public License</span></div><div class="line" id="LC23"><span class="cm">along with this program; if not, write to the Free Software</span></div><div class="line" id="LC24"><span class="cm">Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA</span></div><div class="line" id="LC25"><span class="cm">*/</span></div><div class="line" id="LC26">&nbsp;</div><div class="line" id="LC27"><span class="cm">/*</span></div><div class="line" id="LC28"><span class="cm">CREDIT:</span></div><div class="line" id="LC29"><span class="cm">    Heavily based on Paul William&#39;s plugin:</span></div><div class="line" id="LC30"><span class="cm">        http://www.entropytheblog.com/blog/    </span></div><div class="line" id="LC31"><span class="cm">        http://www.entropytheblog.com/blog/2008/12/wordpress-github-gist-shortcode-plugin/</span></div><div class="line" id="LC32"><span class="cm">        </span></div><div class="line" id="LC33"><span class="cm">    Main difference is that this version doesn&#39;t do a JS, &lt;script&gt;, embed... the code from your gist is</span></div><div class="line" id="LC34"><span class="cm">    actually in the HTML source.</span></div><div class="line" id="LC35">&nbsp;</div><div class="line" id="LC36"><span class="cm">INSTALL:</span></div><div class="line" id="LC37"><span class="cm">    Toss the gistson.php file into your blogs wp-content/plugins folder. 
Login to WP and enable the plugin.</span></div><div class="line" id="LC38">&nbsp;</div><div class="line" id="LC39"><span class="cm">USE:</span></div><div class="line" id="LC40"><span class="cm">    Put this &lt;LINK&gt; tag in &lt;HEAD&gt; of header.php</span></div><div class="line" id="LC41"><span class="cm">        &lt;link rel=&quot;stylesheet&quot; href=&quot;http://gist.github.com/stylesheets/gist/embed.css&quot;/&gt;</span></div><div class="line" id="LC42"><span class="cm">    </span></div><div class="line" id="LC43"><span class="cm">    When you wanna embed a gist just type in:</span></div><div class="line" id="LC44"><span class="cm">        [gist id=&quot;gist-id-from-gist.github.com-here&quot;]        </span></div><div class="line" id="LC45"><span class="cm">        example:</span></div><div class="line" id="LC46"><span class="cm">        [gist id=&quot;250709&quot;]</span></div><div class="line" id="LC47"><span class="cm">        </span></div><div class="line" id="LC48"><span class="cm">    You can exclude the attribution by doing this:        </span></div><div class="line" id="LC49"><span class="cm">        [gist id=&quot;250709&quot; nometa=&quot;true&quot;]    </span></div><div class="line" id="LC50"><span class="cm">        </span></div><div class="line" id="LC51"><span class="cm">        This is useful for when you have multiple gists. 
But for big chunks of code etc</span></div><div class="line" id="LC52"><span class="cm">        I&#39;d encourage you to keep the attribution cuz those guys have a business to run</span></div><div class="line" id="LC53"><span class="cm">*/</span>    </div><div class="line" id="LC54">&nbsp;</div><div class="line" id="LC55"><span class="k">function</span> <span class="nf">gist_shortcode_func</span><span class="p">(</span><span class="nv">$atts</span><span class="p">,</span> <span class="nv">$content</span> <span class="o">=</span> <span class="k">null</span><span class="p">)</span> <span class="p">{</span></div><div class="line" id="LC56">	<span class="nv">$url</span>   <span class="o">=</span> <span class="s1">&#39;http://gist.github.com/&#39;</span> <span class="o">.</span> <span class="nb">trim</span><span class="p">(</span><span class="nv">$atts</span><span class="p">[</span><span class="s1">&#39;id&#39;</span><span class="p">])</span> <span class="o">.</span> <span class="s1">&#39;.json&#39;</span><span class="p">;</span></div><div class="line" id="LC57">	<span class="nv">$json</span>  <span class="o">=</span> <span class="nb">file_get_contents</span><span class="p">(</span><span class="nv">$url</span><span class="p">);</span></div><div class="line" id="LC58">	<span class="nv">$assoc</span> <span class="o">=</span> <span class="nx">json_decode</span><span class="p">(</span><span class="nv">$json</span><span class="p">,</span> <span class="k">true</span><span class="p">);</span></div><div class="line" id="LC59">&nbsp;</div><div class="line" id="LC60">	<span class="k">if</span> <span class="p">(</span><span class="nb">isset</span><span class="p">(</span><span class="nv">$atts</span><span class="p">[</span><span class="s1">&#39;nometa&#39;</span><span class="p">]))</span> <span class="p">{</span></div><div class="line" id="LC61">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="c1">// you&#39;ll end up with 2 1px borders at the bottom =(</span></div><div 
class="line" id="LC62">	   <span class="nv">$assoc</span><span class="p">[</span><span class="s1">&#39;div&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="nb">preg_replace</span><span class="p">(</span><span class="s1">&#39;/&lt;div class=&quot;gist\-meta&quot;&gt;.*?(&lt;\/div&gt;)/is&#39;</span><span class="p">,</span> <span class="s1">&#39;&#39;</span><span class="p">,</span> <span class="nv">$assoc</span><span class="p">[</span><span class="s1">&#39;div&#39;</span><span class="p">]);</span></div><div class="line" id="LC63">	<span class="p">}</span></div><div class="line" id="LC64">&nbsp;</div><div class="line" id="LC65">	<span class="k">return</span> <span class="nv">$assoc</span><span class="p">[</span><span class="s1">&#39;div&#39;</span><span class="p">];</span></div><div class="line" id="LC66">&nbsp;</div><div class="line" id="LC67"><span class="p">}</span></div><div class="line" id="LC68"><span class="nx">add_shortcode</span><span class="p">(</span><span class="s1">&#39;gist&#39;</span><span class="p">,</span> <span class="s1">&#39;gist_shortcode_func&#39;</span><span class="p">);</span></div><div class="line" id="LC69">&nbsp;</div><div class="line" id="LC70"><span class="cp">?&gt;</span><span class="x"></span></div></pre></div>
          
        </div>

        <div class="gist-meta">
          <a href="http://gist.github.com/raw/250722/e1579fb44144def9879b0c0ab086c65a7162e20c/Gistson.php" style="float:right;">view raw</a>
          <a href="http://gist.github.com/250722#file_gistson.php" style="float:right;margin-right:10px;color:#666">Gistson.php</a>
          <a href="http://gist.github.com/250722">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
        </div>
      </div>
    
  
</div>

<p>Oh ya &#8211; I named it Gistson &#8217;cause it grabs the Gist&#8217;s data via an HTTP GET to a JSON doc. Ya, I know, not too creative.</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/embed-a-gist-in-your-wordpress-blog/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I Want A More Better A3</title>
		<link>http://arin.me/blog/i-want-a-more-better-a3</link>
		<comments>http://arin.me/blog/i-want-a-more-better-a3#comments</comments>
		<pubDate>Mon, 07 Dec 2009 08:09:47 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[Audi]]></category>
		<category><![CDATA[Audi A3]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=944</guid>
		<description><![CDATA[
So, I&#8217;ve had my current A3 for about 2 years now and I love it but&#8230;
There&#8217;s a lot to like about the car but I settled when I bought. I had an A4 Wagon before that and dug it but never really made peace with the fact that I was driving around in a station [...]]]></description>
			<content:encoded><![CDATA[<div class="image"><img width="650" src="http://farm3.static.flickr.com/2787/4165826568_1513daa4da_o.jpg" alt="" /></div>
<p>So, I&#8217;ve had my current <a href="http://www.audiusa.com/us/brand/en/models/a3.html">A3</a> for about 2 years now and I love it but&#8230;</p>
<p>There&#8217;s a lot to like about the car but I settled when I bought. I had an A4 Wagon before that and dug it but never really made peace with the fact that I was driving around in a station wagon. So, one day, I randomly popped into the dealership and traded it in for the A3.</p>
<p>I was looking for something smaller and a bit less expensive so I didn&#8217;t pop for some features. One feature I was stoked on was the iPod integration &#038; did get that&#8230; unfortunately I didn&#8217;t bother trying it before I signed all the paperwork and it turns out it sucked.</p>
<p>Anyways &#8211; over the last couple years I&#8217;ve been mostly happy with the A3 but really wished I had the nav (better iPod integration), bigger wheels, <a href="http://en.wikipedia.org/wiki/Audi_Quattro">Quattro</a> &#038; a few other niceties that, at the time, I was convinced I didn&#8217;t want/need.</p>
<p>Well, I&#8217;m pretty close to deciding that I kinda want it all at this point. I don&#8217;t want a different car &#8211; just a &#8220;better&#8221; version of the one I already have. So, today I almost got the car above; things didn&#8217;t work out (dealer was a dick) so I didn&#8217;t get it. I&#8217;m glad I didn&#8217;t though&#8230; for now. That one had everything I wanted except the nav/ipod kit. Turns out there&#8217;s none in America w/ the config I want and ordering&#8217;s my only option.</p>
<p>I&#8217;m gonna think on this for a bit and see what happens. If you know me (even remotely) then you know the odds are pretty high that I&#8217;ll be placing an order pretty soon <img src='http://arin.me/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  .</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/i-want-a-more-better-a3/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Video of My NoSQL East Talk</title>
		<link>http://arin.me/blog/video-of-my-nosql-east-talk</link>
		<comments>http://arin.me/blog/video-of-my-nosql-east-talk#comments</comments>
		<pubDate>Sun, 29 Nov 2009 21:46:41 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Digg]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://arin.me/blog/?p=936</guid>
		<description><![CDATA[


]]></description>
			<content:encoded><![CDATA[<p>
<div align="center">
<embed src="http://blip.tv/play/AYGxyC0C" type="application/x-shockwave-flash" width="480" height="300" allowscriptaccess="always" allowfullscreen="true"></embed></div></p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/video-of-my-nosql-east-talk/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Just Add an Index</title>
		<link>http://arin.me/blog/just-add-an-index</link>
		<comments>http://arin.me/blog/just-add-an-index#comments</comments>
		<pubDate>Sat, 31 Oct 2009 16:24:04 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Digg]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://arin.me/?p=935</guid>
		<description><![CDATA[Here&#8217;s my presentation from NoSQL East.
Enjoy

You can also download it as a PDF, Keynote or Powerpoint doc.
]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s my presentation from <a href="http://nosqleast.com">NoSQL East</a>.</p>
<p>Enjoy</p>
<div class="img"><object codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=9,0,0,0" id="doc_564404118410221" name="doc_564404118410221" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" align="middle"	height="500" width="100%" ><param name="movie"	value="http://d1.scribdassets.com/ScribdViewer.swf?document_id=21962142&#038;access_key=key-1yghwzr47eth5g7sd95k&#038;page=1&#038;version=1&#038;viewMode=slideshow"><param name="quality" value="high"><param name="play" value="true"><param name="loop" value="true"><param name="scale" value="showall"><param name="wmode" value="opaque"><param name="devicefont" value="false"><param name="bgcolor" value="#ffffff"><param name="menu" value="true"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><param name="salign" value=""><param name="mode" value="slideshow"><embed src="http://d1.scribdassets.com/ScribdViewer.swf?document_id=21962142&#038;access_key=key-1yghwzr47eth5g7sd95k&#038;page=1&#038;version=1&#038;viewMode=slideshow" quality="high" pluginspage="http://www.macromedia.com/go/getflashplayer" play="true" loop="true" scale="showall" wmode="opaque" devicefont="false" bgcolor="#ffffff" name="doc_564404118410221_object" menu="true" allowfullscreen="true" allowscriptaccess="always" salign="" type="application/x-shockwave-flash" align="middle" mode="slideshow" height="500" width="100%"></embed></object></div>
<p>You can also download it as a <a href="http://arin.s3.amazonaws.com/pub/presentations/nosqleast/NoSQL%20East%20-%20Cassandra%20at%20Digg.pdf">PDF</a>, <a href="http://arin.s3.amazonaws.com/pub/presentations/nosqleast/NoSQL%20East%20-%20Cassandra%20at%20Digg%20(keynote).zip">Keynote</a> or <a href="http://arin.s3.amazonaws.com/pub/presentations/nosqleast/NoSQL%20East%20-%20Cassandra%20at%20Digg%20(powerpoint).zip">Powerpoint</a> doc.</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/just-add-an-index/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>WTF is a SuperColumn? An Intro to the Cassandra Data Model</title>
		<link>http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model</link>
		<comments>http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model#comments</comments>
		<pubDate>Tue, 01 Sep 2009 16:49:24 +0000</pubDate>
		<dc:creator>phatduckk</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://arin.me/?p=683</guid>
		<description><![CDATA[For the last month or two the Digg engineering team has spent quite a bit of time looking into, playing with and finally deploying Cassandra in production. It&#8217;s been a super fun project to take on &#8211; but even before the fun began we had to spend quite a bit of time figuring out Cassandra&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>For the last month or two the Digg engineering team has spent quite a bit of time looking into, playing with and finally deploying <a href="http://incubator.apache.org/cassandra/">Cassandra</a> in production. It&#8217;s been a super fun project to take on &#8211; but even before the fun began we had to spend quite a bit of time figuring out Cassandra&#8217;s data model&#8230; the phrase &#8220;WTF is a &#8216;super column&#8217;&#8221; was uttered quite a few times. <img src='http://arin.me/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>If you&#8217;re coming from an RDBMS background (which is almost everyone) you&#8217;ll probably trip over some of the naming conventions while learning about Cassandra&#8217;s data model. It took me and my team members at Digg a couple days of talking things out before we &#8220;got it&#8221;. In recent weeks a <a href="http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200908.mbox/%3cb6f68fc60908111037ofdc0d6csa39543857e3583a2@mail.gmail.com%3e">bikeshed</a> went down in the dev mailing list proposing a completely new naming scheme to alleviate some of the confusion. Throughout this discussion I kept thinking: &#8220;maybe if there were some decent examples out there people wouldn&#8217;t get so confused by the naming.&#8221; So, this is my stab at explaining Cassandra&#8217;s data model; it&#8217;s intended to help you get your feet wet &#038; doesn&#8217;t go into every single detail but, hopefully, it helps clarify a few things.</p>
<p><em>BTW: this is long. If you&#8217;d rather have a PDF version of this you can <a href="http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf">download it here</a>.</em></p>
<h2>The Pieces</h2>
<p>Let&#8217;s first go thru the building blocks before we see how they can all be stuck together:</p>
<h3>Column</h3>
<p>The <code>column</code> is the lowest/smallest increment of data. It&#8217;s a <a href="http://en.wikipedia.org/wiki/Tuple">tuple (triplet)</a> that contains a name, a value and a timestamp.</p>
<p>Here&#8217;s a <code>column</code> represented in JSON-ish notation:</p>
<div class="code"><code>
<pre>
{  // this is a column
    name: "emailAddress",
    value: "arin@example.com",
    timestamp: 123456789
}
</pre>
</code></div>
<p>That&#8217;s all it is. For simplicity&#8217;s sake let&#8217;s ignore the timestamp. Just think of it as a name/value pair.</p>
<p>Also, it&#8217;s worth noting that the <code>name</code> and <code>value</code> are both binary (technically <code>byte[]</code>) and can be of any length.</p>
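<p>If it helps to see that as code, here&#8217;s a minimal Python sketch of a <code>Column</code> as a plain triplet. The <code>Column</code> class and its field names are hypothetical, made up for illustration; this is not Cassandra&#8217;s client API.</p>

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Hypothetical stand-in for a Column: a (name, value, timestamp) triplet."""
    name: bytes       # names are raw bytes of any length
    value: bytes      # values are raw bytes too
    timestamp: int


col = Column(name=b"emailAddress", value=b"arin@example.com", timestamp=123456789)

# Ignoring the timestamp, a Column boils down to a name/value pair
pair = (col.name, col.value)
print(pair)
```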
<h3>SuperColumn</h3>
<p>A <code>SuperColumn</code> is a tuple w/ a binary name &#038; a value which is a map containing an unbounded number of <code>Column</code>s &#8211; keyed by the <code>Column</code>&#8217;s name. Keeping with the JSON-ish notation we get:</p>
<div class="code"><code>
<pre>
{   // this is a SuperColumn
    name: "homeAddress",
    // with an infinite list of Columns
    value: {
        // note: each key is the name of its Column
        street: {name: "street", value: "1234 x street", timestamp: 123456789},
        city: {name: "city", value: "san francisco", timestamp: 123456789},
        zip: {name: "zip", value: "94107", timestamp: 123456789},
    }
}
</pre>
</code></div>
<h3>Column vs SuperColumn</h3>
<p><code>Column</code>s and <code>SuperColumn</code>s are both tuples w/ a name &#038; value. The key difference is that a standard <code>Column</code>&#8217;s value is a &#8220;string&#8221; and in a <code>SuperColumn</code> the value is a Map of <code>Column</code>s. That&#8217;s the main difference&#8230; their values contain different types of data. Another minor difference is that <code>SuperColumn</code>s don&#8217;t have a timestamp component to them.</p>
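<p>Here&#8217;s a rough Python sketch of that difference. The class names and the <code>make_super_column</code> helper are hypothetical, and names/values are plain strings here only to keep it readable (in Cassandra they&#8217;re bytes).</p>

```python
from typing import Dict, NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int


class SuperColumn(NamedTuple):
    # No timestamp here: just a name plus a map of Columns
    name: str
    value: Dict[str, Column]


def make_super_column(name, columns):
    """Build a SuperColumn whose map is keyed by each Column's own name."""
    return SuperColumn(name=name, value={c.name: c for c in columns})


home = make_super_column("homeAddress", [
    Column("street", "1234 x street", 123456789),
    Column("city", "san francisco", 123456789),
    Column("zip", "94107", 123456789),
])

print(home.value["city"].value)
```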
<h3>Before We Get Rolling</h3>
<p>Before I move on I wanna simplify our notation a couple ways: 1) ditch the timestamps from <code>Column</code>s &#038; 2) pull the <code>Column</code>s&#8217; &#038; <code>SuperColumn</code>s&#8217; name component out so that it looks like a key/value pair. So we&#8217;re gonna go from:</p>
<div class="code"><code>
<pre>
{ // this is a super column
    name: "homeAddress",
    // with an infinite list of columns
    value: {
        street: {name: "street", value: "1234 x street", timestamp: 123456789},
        city: {name: "city", value: "san francisco", timestamp: 123456789},
        zip: {name: "zip", value: "94107", timestamp: 123456789},
    }
}
</pre>
</code></div>
<p>to</p>
<div class="code"><code>
<pre>
homeAddress: {
    street: "1234 x street",
    city: "san francisco",
    zip: "94107",
}
</pre>
</code></div>
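<p>That notational shortcut can be sketched as a tiny Python function. The <code>simplify</code> name and the dict layout are assumptions made up for this example; it just drops the timestamps and pulls the names out as keys.</p>

```python
def simplify(super_column):
    """Collapse the verbose form into {super_column_name: {column_name: value}}."""
    columns = super_column["value"]
    return {
        super_column["name"]: {name: col["value"] for name, col in columns.items()}
    }


verbose = {
    "name": "homeAddress",
    "value": {
        "street": {"name": "street", "value": "1234 x street", "timestamp": 123456789},
        "city": {"name": "city", "value": "san francisco", "timestamp": 123456789},
        "zip": {"name": "zip", "value": "94107", "timestamp": 123456789},
    },
}

simple = simplify(verbose)
print(simple)
```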
<h2>Grouping &#8216;Em</h2>
<p>There&#8217;s a single structure used to group both the <code>Column</code>s and <code>SuperColumn</code>s&#8230; this structure is called a <code>ColumnFamily</code> and comes in 2 varieties: <code>Standard</code> &#038; <code>Super</code>.</p>
<h3>ColumnFamily</h3>
<p>A <code>ColumnFamily</code> is a structure that contains an infinite number of <code>Row</code>s. Huh, did you say <code>Row</code>s? Ya &#8211; <code>rows</code> <img src='http://arin.me/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  To make it sit easier in your head just think of it as a table in an RDBMS.</p>
<p>OK &#8211; each <code>Row</code> has a client supplied (that means you) key &#038; contains a map of <code>Column</code>s. Again, the keys in the map are the names of the <code>Column</code>s and the values are the <code>Column</code>s themselves:</p>
<div class="code"><code>
<pre>
UserProfile = { // this is a ColumnFamily
    phatduckk: {   // this is the key to this Row inside the CF
        // now we have an infinite # of columns in this row
        username: "phatduckk",
        email: "phatduckk@example.com",
        phone: "(900) 976-6666"
    }, // end row
    ieure: {   // this is the key to another row in the CF
        // now we have another infinite # of columns in this row
        username: "ieure",
        email: "ieure@example.com",
        phone: "(888) 555-1212",
        age: "66",
        gender: "undecided"
    },
}

<i>Remember: for simplicity we're only showing the value of the <code>Column</code> but in reality the values in the
map are the entire</i> <code>Column</code>.
</pre>
</code></div>
<p>You can think of it as a <code>HashMap/dictionary</code> or <code>associative array</code>. If you start thinking that way then you&#8217;re on the right track.</p>
<p>One thing I want to point out is that there&#8217;s no schema enforced at this level. The <code>Row</code>s do not have a predefined list of <code>Column</code>s that they contain. In our example above you see that the row with the key &#8220;ieure&#8221; has <code>Column</code>s with names &#8220;age&#8221; and &#8220;gender&#8221; whereas the row identified by the key &#8220;phatduckk&#8221; doesn&#8217;t. It&#8217;s 100% flexible: one <code>Row</code> may have 1,989 <code>Column</code>s whereas the other has 2.  One <code>Row</code> may have a <code>Column</code> called &#8220;foo&#8221; whereas none of the rest do. This is the schemaless aspect of Cassandra.</p>
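<p>Sticking with the dictionary mental model, here&#8217;s a small Python sketch of the schemaless bit. The <code>get_column</code> helper is hypothetical (not a real client call); the point is only that a missing <code>Column</code> in one <code>Row</code> isn&#8217;t an error.</p>

```python
# A Standard ColumnFamily modeled as a dict of Rows, each Row a dict of Columns
user_profile = {
    "phatduckk": {
        "username": "phatduckk",
        "email": "phatduckk@example.com",
        "phone": "(900) 976-6666",
    },
    "ieure": {
        "username": "ieure",
        "email": "ieure@example.com",
        "phone": "(888) 555-1212",
        "age": "66",
        "gender": "undecided",
    },
}


def get_column(column_family, row_key, column_name, default=None):
    """Look up one Column's value; a missing Row or Column just yields the default."""
    return column_family.get(row_key, {}).get(column_name, default)


print(get_column(user_profile, "ieure", "age"))
print(get_column(user_profile, "phatduckk", "age"))

# Columns that exist in one Row but not the other
extra = sorted(user_profile["ieure"].keys() - user_profile["phatduckk"].keys())
print(extra)
```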
<h3>A ColumnFamily Can Be Super Too</h3>
<p>Now, a <code>ColumnFamily</code> can be of type <code>Standard</code> or <code>Super</code>.</p>
<p>What we just went over was an example of the <code>Standard</code> type. What makes it <code>Standard</code> is the fact that all the <code>Row</code>s contain a map of <em>normal</em> (aka not-Super) <code>Column</code>s&#8230; there&#8217;s no <code>SuperColumn</code>s scattered about.</p>
<p>When a <code>ColumnFamily</code> is of type <code>Super</code> we have the opposite: each <code>Row</code> contains a map of <code>SuperColumn</code>s.  The map is keyed with the name of each <code>SuperColumn</code> and the value is the <code>SuperColumn</code> itself. And, just to be clear, since this <code>ColumnFamily</code> is of type <em>Super</em>, there are no <em>normal</em> <code>Column</code>s sitting directly in its <code>Row</code>s. Here&#8217;s an example:</p>
<div class="code"><code>
<pre>
AddressBook = { // this is a ColumnFamily of type Super
    phatduckk: {    // this is the key to this row inside the Super CF
        // the key here is the name of the owner of the address book

        // now we have an infinite # of super columns in this row
        // the keys inside the row are the names for the SuperColumns
        // each of these SuperColumns is an address book entry
        friend1: {street: "8th street", zip: "90210", city: "Beverly Hills", state: "CA"},

        // this is the address book entry for John in phatduckk's address book
        John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
        Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
        Tod: {street: "Jerry street", zip: "54556", city: "Cartoon", state: "CO"},
        Bob: {street: "Q Blvd", zip: "24252", city: "Nowhere", state: "MN"},
        ...
        // we can have an infinite # of SuperColumns (aka address book entries)
    }, // end row
    ieure: {     // this is the key to another row in the Super CF
        // all the address book entries for ieure
        joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
        William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
    },
}
</pre>
</code></div>
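<p>In dictionary terms a <code>Super</code> <code>ColumnFamily</code> just adds one more level of nesting: row key, then <code>SuperColumn</code> name, then <code>Column</code> name. A quick Python sketch (the <code>get_subcolumn</code> helper is a made-up name for illustration, not a real client method):</p>

```python
# A Super ColumnFamily modeled as Row key, then SuperColumn name, then Column name
address_book = {
    "phatduckk": {
        "John": {"street": "Howard street", "zip": "94404", "city": "FC", "state": "CA"},
        "Kim": {"street": "X street", "zip": "87876", "city": "Balls", "state": "VA"},
    },
    "ieure": {
        "joey": {"street": "A ave", "zip": "55485", "city": "Hell", "state": "NV"},
    },
}


def get_subcolumn(super_cf, row_key, super_column_name, column_name):
    """Three keys are needed to reach a single value in a Super ColumnFamily."""
    return super_cf[row_key][super_column_name][column_name]


print(get_subcolumn(address_book, "phatduckk", "John", "zip"))

# All of ieure's address book entries (the SuperColumn names in that Row)
print(sorted(address_book["ieure"]))
```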
<h3>Keyspace</h3>
<p>A <code>Keyspace</code> is the outermost grouping of your data. All your <code>ColumnFamily</code>&#8217;s go inside a <code>Keyspace</code>. Your <code>Keyspace</code> will probably be named after your application.</p>
<p>Now, a <code>Keyspace</code> can have multiple <code>ColumnFamily</code>&#8217;s but that doesn&#8217;t mean there&#8217;s an imposed relationship between them. For example: they&#8217;re not like tables in MySQL&#8230; you can&#8217;t <code>join</code> them. Also, just because <em>ColumnFamily_1</em> has a <code>Row</code> with key &#8220;phatduckk&#8221; that doesn&#8217;t mean <em>ColumnFamily_2</em> has one too.</p>
<h2>Sorting</h2>
<p>OK &#8211; we&#8217;ve gone through what the various data containers are about but another key component of the data model is how the data is sorted. Cassandra is not queryable like SQL &#8211; you do not specify how you want the data sorted when you&#8217;re fetching it (among other differences). The data is sorted as soon as you put it into the cluster and it always remains sorted! This is a tremendous performance boost for reads but in exchange for that benefit you&#8217;re going to have to make sure to plan your data model in such a way that you&#8217;re able to satisfy your access patterns.</p>
<p><code>Column</code>s are always sorted within their <code>Row</code> by the <code>Column</code>&#8217;s name. This is important so I&#8217;ll say it again: <code>Column</code>s are always sorted by their name! How the names are compared depends on the <code>ColumnFamily</code>&#8217;s <em>CompareWith</em> option. Out of the box you have the following options: <em>BytesType, UTF8Type, LexicalUUIDType, TimeUUIDType, AsciiType</em>, and <em>LongType</em>. Each of these options treats the <code>Column</code>s&#8217; name as a different data type giving you quite a bit of flexibility. For example: Using <em>LongType</em> will treat your <code>Column</code>s&#8217; names as 64bit <code>Long</code>s. Let&#8217;s try and clear this up by taking a look at some data before and after it&#8217;s sorted:</p>
<div class="code">
<pre><code>
    // Here's a view of all the Columns from a particular Row in random order
    // Cassandra would "never" store data in random order. This is just an example
    // Also, ignore the values - they don't matter for sorting at all
    {name: 123, value: "hello there"},
    {name: 832416, value: "kjjkbcjkcbbd"},
    {name: 3, value: "101010101010"},
    {name: 976, value: "kjjkbcjkcbbd"}
</code></pre>
</div>
<p>So, given the fact that we&#8217;re using the <em>LongType</em> option, these <code>Column</code>s will look like this when they&#8217;re sorted:</p>
<div class="code">
<pre><code>
    &lt;!--
    ColumnFamily definition from storage-conf.xml
    -->
    &lt;ColumnFamily CompareWith="LongType" Name="CF_NAME_HERE"/>

    // See, each Column's name is treated as a 64bit long
    // in effect, numerically ordering our Columns by name
    {name: 3, value: "101010101010"},
    {name: 123, value: "hello there"},
    {name: 976, value: "kjjkbcjkcbbd"},
    {name: 832416, value: "kjjkbcjkcbbd"}
</code></pre>
</div>
<p>As you can see the <code>Column</code>s&#8217; names were compared as if they were 64bit <code>Long</code>s (aka: numbers that can get pretty big). Now, if we&#8217;d used another <em>CompareWith</em> option we&#8217;d end up with a different result. If we&#8217;d set <em>CompareWith</em> to <em>UTF8Type</em> our <code>Column</code>s&#8217; names would be treated as UTF8 encoded strings, yielding a sort order like this:</p>
<div class="code">
<pre><code>
    &lt;!--
    ColumnFamily definition from storage-conf.xml
    -->
    &lt;ColumnFamily CompareWith="UTF8Type" Name="CF_NAME_HERE"/>

    // Each Column name is treated as a UTF8 string
    {name: 123, value: "hello there"},
    {name: 3, value: "101010101010"},
    {name: 832416, value: "kjjkbcjkcbbd"},
    {name: 976, value: "kjjkbcjkcbbd"}
</code></pre>
</div>
<p>The result is completely different!</p>
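<p>If you want to convince yourself of the difference, here&#8217;s a tiny Python sketch that reproduces both orderings for the four <code>Column</code> names above. It is only an approximation of what Cassandra does internally: <em>LongType</em> is modeled as an integer comparison and <em>UTF8Type</em> as a comparison of the UTF-8 encoded bytes of each name.</p>

```python
# Column names from the example Row, in arbitrary order.
names = [123, 832416, 3, 976]

# LongType-style ordering: compare the names as integers.
long_order = sorted(names)

# UTF8Type-style ordering (approximation): compare the names as UTF-8 bytes.
utf8_order = sorted(names, key=lambda n: str(n).encode("utf-8"))

print(long_order)  # [3, 123, 976, 832416]
print(utf8_order)  # [123, 3, 832416, 976]
```

<p>Same names, two different orders, purely because of how the names are compared.</p>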
<p>This sorting principle applies to <code>SuperColumn</code>s as well but we get an extra dimension to deal with: not only do we determine how the <code>SuperColumn</code>s are sorted in a <code>Row</code> but we also determine how the <code>Column</code>s within each <code>SuperColumn</code> are sorted. The sort of the <code>Column</code>s within each <code>SuperColumn</code> is determined by the value of <em>CompareSubcolumnsWith</em>. Here&#8217;s an example:</p>
<div class="code">
<pre><code>
    // Here's a view of a Row that has 2 SuperColumns in it.
    // currently they're in some random order

    { // first SuperColumn from a Row
        name: "workAddress",
        // and the columns within it
        value: {
            street: {name: "street", value: "1234 x street"},
            city: {name: "city", value: "san francisco"},
            zip: {name: "zip", value: "94107"}
        }
    },
    { // another SuperColumn from same Row
        name: "homeAddress",
        // and the columns within it
        value: {
            street: {name: "street", value: "1234 x street"},
            city: {name: "city", value: "san francisco"},
            zip: {name: "zip", value: "94107"}
        }
    }
</code></pre>
</div>
<p>Now if we decided to set both <em>CompareSubcolumnsWith</em> &#038; <em>CompareWith</em> to <em>UTF8Type</em> we&#8217;d have the following end result:</p>
<div class="code">
<pre><code>
    // Now they're sorted

    { // first SuperColumn from the Row

        // this SuperColumn comes first b/c, when treated as UTF8 strings,
        // "homeAddress" is before "workAddress"
        name: "homeAddress",

        // the columns within this SC are sorted by their names too
        value: {
            city: {name: "city", value: "san francisco"},
            street: {name: "street", value: "1234 x street"},
            zip: {name: "zip", value: "94107"}
        }
    },
    { // second SuperColumn from the same Row
        name: "workAddress",
        value: {
            // the columns within this SC are sorted by their names too
            city: {name: "city", value: "san francisco"},
            street: {name: "street", value: "1234 x street"},
            zip: {name: "zip", value: "94107"}
        }
    }
</code></pre>
</div>
<p>I want to note that in the last example <em>CompareSubcolumnsWith</em> &#038; <em>CompareWith</em> were set to <em>UTF8Type</em> but this doesn&#8217;t have to be the case. You can mix and match the values of <em>CompareSubcolumnsWith</em> &#038; <em>CompareWith</em> as necessary.</p>
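<p>To make the two sort levels concrete, here&#8217;s a rough Python sketch of the address example. The helper names (<code>utf8_key</code>, <code>sort_row</code>) are made up for this illustration and the comparison is an approximation of <em>UTF8Type</em>; the point is only that one key function orders the <code>SuperColumn</code>s and a second, independent one orders the <code>Column</code>s inside each of them.</p>

```python
# A Row with two SuperColumns, in arbitrary order (mirrors the example above).
row = {
    "workAddress": {"street": "1234 x street", "city": "san francisco", "zip": "94107"},
    "homeAddress": {"street": "1234 x street", "city": "san francisco", "zip": "94107"},
}


def utf8_key(name):
    """Sort key approximating UTF8Type: compare names as UTF-8 bytes."""
    return name.encode("utf-8")


def sort_row(row, compare_with, compare_subcolumns_with):
    """Return (super_column_name, [(column_name, value), ...]) pairs in sorted order.

    compare_with orders the SuperColumns; compare_subcolumns_with orders
    the Columns inside each SuperColumn. The two are independent.
    """
    result = []
    for sc_name in sorted(row, key=compare_with):
        columns = sorted(row[sc_name].items(),
                         key=lambda item: compare_subcolumns_with(item[0]))
        result.append((sc_name, columns))
    return result


sorted_row = sort_row(row, utf8_key, utf8_key)
for sc_name, columns in sorted_row:
    print(sc_name, [name for name, _ in columns])
```

<p>Passing a different key function for either argument is the sketch&#8217;s equivalent of mixing and matching <em>CompareWith</em> and <em>CompareSubcolumnsWith</em>.</p>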
<p>The last bit about sorting I want to mention is that you can write a custom class to perform the sorting. The sorting mechanism is pluggable&#8230; you can set <em>CompareSubcolumnsWith</em> and/or <em>CompareWith</em> to any fully-qualified class name as long as that class implements org.apache.cassandra.db.marshal.IType (aka you can write custom comparators).</p>
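<p>A real custom comparator would be a Java class plugged into Cassandra, so the snippet below is not Cassandra code. It&#8217;s just a Python sketch of the idea: you hand the sort a comparison function of your own. The comparator here (<code>reversed_long_compare</code>) is a hypothetical one that orders numeric names largest first.</p>

```python
from functools import cmp_to_key


def reversed_long_compare(a, b):
    """Hypothetical custom comparator: numeric order, largest name first.

    Returns a negative number, zero, or a positive number, in the same
    spirit as a Java compare() method.
    """
    return (b > a) - (a > b)


names = [123, 832416, 3, 976]
custom_order = sorted(names, key=cmp_to_key(reversed_long_compare))

print(custom_order)  # [832416, 976, 123, 3]
```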
<h2>Example Schema</h2>
<p>Alrighty &#8211; Now we&#8217;ve got all the pieces of the puzzle so let&#8217;s finally put &#8216;em all together and model a simple blog application. We&#8217;re going to model a simple app with the following specs:</p>
<ul>
<li>support a single blog</li>
<li>we can have multiple authors</li>
<li>entries contain title, body, slug &#038; publish date</li>
<li>entries can be associated with any # of tags</li>
<li>people can leave comments but can&#8217;t register: they enter profile info each time (just keeping it simple)</li>
<li>comments have text, time submitted, commenter&#8217;s name &#038; commenter&#8217;s email</li>
<li>must be able to show all posts in reverse chronological order (newest first)</li>
<li>must be able to show all posts within a given tag in reverse chronological order</li>
</ul>
<p>Each of the following sections will describe a <code>ColumnFamily</code> that we&#8217;re going to define in our app&#8217;s <em>Keyspace</em>, show the xml definition, talk about why we picked the particular sort option(s) as well as display the data in the <code>ColumnFamily</code> w/ our JSON-ish notation.</p>
<h3>Authors ColumnFamily</h3>
<p>Modeling the <em>Authors</em> <code>ColumnFamily</code> is going to be pretty basic; we&#8217;re not going to do anything fancy here. We&#8217;re going to give each <em>Author</em> their own <code>Row</code> &#038; key it by the <em>Author</em>&#8217;s full name. Inside the <code>Row</code>s each <code>Column</code> is going to represent a single &#8220;profile&#8221; attribute for the <em>Author</em>.</p>
<p>This is an example of using each <code>Row</code> to represent an object&#8230; in this case an <em>Author</em> object. With this approach each <code>Column</code> will serve as an attribute. Super simple. I want to point out that since there&#8217;s no &#8220;definition&#8221; of what <code>Column</code>s must be present within a <code>Row</code> we kinda sorta have a schemaless design.</p>
<p>We&#8217;ll be accessing the <code>Row</code>s in this <code>ColumnFamily</code> via key lookup &#038; will grab every <code>Column</code> with each get (ex: we won&#8217;t ever be fetching the first 3 columns from the <code>Row</code> with key &#8216;foo&#8217;). This means that we don&#8217;t care how the <code>Column</code>s are sorted so we&#8217;ll use the <em>BytesType</em> sort option because it doesn&#8217;t require any validation of the <code>Column</code>s&#8217; names.</p>
<div class="code">
<pre><code>
&lt;!--
    ColumnFamily: Authors
    We'll store all the author data here.

    Row Key => Author's name (implies names must be unique)
    Column Name: an attribute for the author (email, bio, etc)
    Column Value: value of the associated attribute

    Access: get author by name (aka grab all columns from a specific Row)

    Authors : { // CF
        Arin Sarkissian : { // row key
            // and the columns as "profile" attributes
            numPosts: 11,
            twitter: phatduckk,
            email: arin@example.com,
            bio: "bla bla bla"
        },
        // and the other authors
        Author 2 : {
            ...
        }
    }
-->
&lt;ColumnFamily CompareWith="BytesType" Name="Authors"/>
</code></pre>
</div>
<h3>BlogEntries ColumnFamily</h3>
<p>Again, this <code>ColumnFamily</code> is going to act as a simple key/value lookup. We&#8217;ll be storing 1 entry per <code>Row</code>. Within that <code>Row</code> the <code>Column</code>s will just serve as attributes of the entry: title, body, etc (just like the previous example). As a small optimization we&#8217;ll denormalize the tags into a <code>Column</code> as a comma separated string. Upon display we&#8217;ll just split that <code>Column</code>&#8217;s value to get a list of tags.</p>
<p>The key to each <code>Row</code> will be the entry&#8217;s slug. So whenever we want to grab a single entry we can simply look it up by its key (slug).</p>
<div class="code">
<pre><code>
&lt;!--
    ColumnFamily: BlogEntries
    This is where all the blog entries will go:

    Row Key => post's slug (the seo friendly portion of the uri)
    Column Name: an attribute for the entry (title, body, etc)
    Column Value: value of the associated attribute

    Access: grab an entry by slug (always fetch all Columns for Row)

    fyi: tags is a denormalization... it's a comma separated list of tags.
    I'm not using json in order to not interfere with our
    notation but obviously you could use anything as long as your app
    knows how to deal w/ it

    BlogEntries : { // CF
        i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
            title: This is a blog entry about my new, awesome guitar,
            body: this is a cool entry. etc etc yada yada
            author: Arin Sarkissian  // a row key into the Authors CF
            tags: life,guitar,music  // comma sep list of tags (basic denormalization)
            pubDate: 1250558004      // unixtime for publish date
            slug: i-got-a-new-guitar
        },
        // all other entries
        another-cool-guitar : {
            ...
            tags: guitar,
            slug: another-cool-guitar
        },
        scream-is-the-best-movie-ever : {
            ...
            tags: movie,horror,
            slug: scream-is-the-best-movie-ever
        }
    }
-->
&lt;ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
</code></pre>
</div>
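<p>Turning the denormalized &#8220;tags&#8221; <code>Column</code> back into a list is a one-liner in most languages. Here&#8217;s a small Python sketch; the helper names (<code>split_tags</code>, <code>join_tags</code>) are made up for this example, and dropping empty pieces is a defensive choice of mine rather than something the data model requires.</p>

```python
def split_tags(tags_value):
    """Split the denormalized "tags" Column value into a list of tag names."""
    # Drop empty pieces so stray whitespace or a trailing comma does no harm.
    return [tag.strip() for tag in tags_value.split(",") if tag.strip()]


def join_tags(tags):
    """Build the comma separated string to store in the "tags" Column."""
    return ",".join(tags)


# A few Columns of the example entry, as a plain dict.
entry = {
    "title": "This is a blog entry about my new, awesome guitar",
    "tags": "life,guitar,music",
    "slug": "i-got-a-new-guitar",
}

tags = split_tags(entry["tags"])
print(tags)             # ['life', 'guitar', 'music']
print(join_tags(tags))  # life,guitar,music
```

<p>Note that this simple format means a tag itself can&#8217;t contain a comma.</p>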
<h3>TaggedPosts ColumnFamily</h3>
<p>Alright &#8211; here&#8217;s where things get a bit interesting. This <code>ColumnFamily</code> is going to do some heavy lifting for us. It&#8217;s going to be responsible for keeping our tag/entry associations. Not only is it going to store the associations but it&#8217;s going to allow us to fetch all <em>BlogEntry</em>s for a certain tag in pre-sorted order (remember all that sorting jazz we went thru?).</p>
<p>A design point I want to point out is that we&#8217;re going to have our app logic tag every <em>BlogEntry</em> with the tag &#8220;__notag__&#8221; (a tag I just made up). Tagging every <em>BlogEntry</em> with &#8220;__notag__&#8221; will allow us to use this <code>ColumnFamily</code> to also store a list of all <em>BlogEntry</em>s in pre-sorted order. We&#8217;re kinda cheating but it allows us to use a single <code>ColumnFamily</code> to serve &#8220;show me all recent posts&#8221; and &#8220;show me all recent posts tagged &#8216;foo&#8217;&#8221;.</p>
<p>Given this data model if an entry has 3 tags it will have a corresponding <code>Column</code> in 4 <code>Row</code>s&#8230; 1 for each tag and one for the &#8220;__notag__&#8221; tag.</p>
<p>Since we&#8217;re going to want to display lists of entries in chronological order we&#8217;ll make sure each <code>Column</code>&#8217;s name is a <a href="http://en.wikipedia.org/wiki/Universally_Unique_Identifier">time UUID</a> and set the <code>ColumnFamily</code>&#8217;s <em>CompareWith</em> to <em>TimeUUIDType</em>.  This will sort the <code>Column</code>s by time, satisfying our &#8220;chronological order&#8221; requirement <img src='http://arin.me/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  So doing stuff like &#8220;get the latest 10 entries tagged &#8216;foo&#8217;&#8221; is going to be a super efficient operation.</p>
<p>Now when we want to display the 10 most recent entries (on the front page, for example) we would:</p>
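<p>Here&#8217;s a rough Python sketch of the write side of this design: one <code>Column</code> per tag plus one in the &#8220;__notag__&#8221; <code>Row</code>, each named with a time UUID. Everything here is an in-memory stand-in (plain dicts, no Cassandra client), the helper names (<code>time_uuid</code>, <code>tag_entry</code>, <code>latest</code>) are made up, and the second entry&#8217;s publish date is invented. Also keep in mind that Cassandra keeps the <code>Column</code>s sorted as they&#8217;re written, whereas this sketch sorts on read just to show the resulting order.</p>

```python
import uuid

ALL_POSTS_TAG = "__notag__"
# Number of 100-nanosecond ticks between 1582-10-15 and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


def time_uuid(unix_seconds, node=0x123456789ABC, clock_seq=0):
    """Build a version 1 (time-based) UUID for a given Unix timestamp."""
    ticks = int(unix_seconds) * 10_000_000 + GREGORIAN_OFFSET
    time_low = ticks % 2**32
    time_mid = (ticks // 2**32) % 2**16
    time_hi = (ticks // 2**48) % 2**12
    fields = (time_low, time_mid, time_hi, clock_seq // 256, clock_seq % 256, node)
    return uuid.UUID(fields=fields, version=1)


def tag_entry(tagged_posts, slug, tags, pub_date):
    """Add one Column per tag, plus one in the "__notag__" Row, for an entry."""
    for tag in list(tags) + [ALL_POSTS_TAG]:
        row = tagged_posts.setdefault(tag, {})
        row[time_uuid(pub_date)] = slug  # column name: time UUID, value: slug


def latest(tagged_posts, tag, count):
    """Return the slugs of the newest `count` entries for a tag, newest first."""
    row = tagged_posts.get(tag, {})
    newest_first = sorted(row, key=lambda name: name.time, reverse=True)
    return [row[name] for name in newest_first[:count]]


tagged_posts = {}
tag_entry(tagged_posts, "i-got-a-new-guitar", ["life", "guitar", "music"], 1250558004)
tag_entry(tagged_posts, "another-cool-guitar", ["guitar"], 1250644404)  # invented date

print(sorted(tagged_posts))
print(latest(tagged_posts, "guitar", 10))
```

<p>The first entry has 3 tags, so it ends up in 4 <code>Row</code>s, just like described above.</p>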
<ol>
<li>grab the last 10 <code>Column</code>s in the <code>Row</code> w/ key &#8220;__notag__&#8221; (our &#8220;all posts&#8221; tag)</li>
<li>loop thru that set of <code>Column</code>s</li>
<li>while looping, we know the value of each <code>Column</code> is the key to a <code>Row</code> in the <em>BlogEntries</em> <code>ColumnFamily</code></li>
<li>so we go ahead and use that to grab the <code>Row</code> for this entry from the <em>BlogEntries</em> <code>ColumnFamily</code>. this gives us all the data for this entry</li>
<li>one of the <code>Column</code>s from the <em>BlogEntries</em> <code>Row</code> we just grabbed is named &#8220;author&#8221; and the value is the key into the <em>Authors</em> <code>ColumnFamily</code> we need to use to grab that author&#8217;s profile data.</li>
<li>at this point we&#8217;ve got the entry data and the author data on hand</li>
<li>next we&#8217;ll split the &#8220;tags&#8221; <code>Column</code>&#8217;s value to get a list of tags</li>
<li>now we have everything we need to display this post (no comments yet &#8211; this ain&#8217;t the permalink page)</li>
</ol>
<p>We can go through the same procedure above using any tag&#8230; so it works for &#8220;all entries&#8221; and &#8220;entries tagged &#8216;foo&#8217;&#8221;. Kinda nice.</p>
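<p>The lookup steps above can be sketched in a few lines of Python. Again, this is a sketch under loud assumptions: plain dicts stand in for the three <code>ColumnFamily</code>s, integer column names stand in for time UUIDs (bigger means later), the function name <code>recent_posts</code> is made up, and the second entry&#8217;s title and publish date are invented sample values.</p>

```python
# In-memory stand-ins for three ColumnFamilies. Integer column names stand in
# for time UUIDs here; a larger number means a later timestamp.
authors = {
    "Arin Sarkissian": {"numPosts": 11, "twitter": "phatduckk"},
}
blog_entries = {
    "i-got-a-new-guitar": {
        "title": "This is a blog entry about my new, awesome guitar",
        "author": "Arin Sarkissian",
        "tags": "life,guitar,music",
        "pubDate": 1250558004,
    },
    "another-cool-guitar": {
        "title": "Another cool guitar",  # invented sample value
        "author": "Arin Sarkissian",
        "tags": "guitar",
        "pubDate": 1250644404,           # invented sample value
    },
}
tagged_posts = {
    "__notag__": {1: "i-got-a-new-guitar", 2: "another-cool-guitar"},
    "guitar": {1: "i-got-a-new-guitar", 2: "another-cool-guitar"},
}


def recent_posts(tag="__notag__", count=10):
    """Tag Row first, then the BlogEntries Row, then the Authors Row."""
    row = tagged_posts.get(tag, {})
    # Step 1: the last `count` Columns of the Row, newest first.
    newest_names = sorted(row, reverse=True)[:count]
    posts = []
    for name in newest_names:                # step 2: loop over the Columns
        slug = row[name]                     # step 3: value is a BlogEntries key
        entry = blog_entries[slug]           # step 4: fetch the entry Row
        author = authors[entry["author"]]    # step 5: fetch the author Row
        tags = [t for t in entry["tags"].split(",") if t]  # step 7: split tags
        posts.append({"slug": slug, "entry": entry, "author": author, "tags": tags})
    return posts


for post in recent_posts():
    print(post["slug"], post["author"]["twitter"], post["tags"])
```

<p>Calling <code>recent_posts("guitar")</code> runs the exact same code for a tag page. Notice that every post displayed costs two extra key lookups (entry and author) on top of the initial slice.</p>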
<div class="code">
<pre><code>
&lt;!--
    ColumnFamily: TaggedPosts
    A secondary index to determine which BlogEntries are associated with a tag

    Row Key => tag
    Column Names: a TimeUUIDType
    Column Value: row key into BlogEntries CF

    Access: get a slice of entries tagged 'foo'

    We're gonna use this CF to determine which blog entries to show for a tag page.
    We'll be a bit ghetto and use the string __notag__ to mean
    "don't restrict by tag". Each entry will get a column in here...
     this means we'll have to have #tags + 1 columns for each post.

    TaggedPosts : { // CF
        // blog entries tagged "guitar"
        guitar : {  // Row key is the tag name
            // column names are TimeUUIDType, value is the row key into BlogEntries
            timeuuid_1 : i-got-a-new-guitar,
            timeuuid_2 : another-cool-guitar,
        },
        // here's all blog entries
        __notag__ : {
            timeuuid_1b : i-got-a-new-guitar,

            // notice this is in the guitar Row as well
            timeuuid_2b : another-cool-guitar,

            // and this is in the movie Row as well
            timeuuid_3b : scream-is-the-best-movie-ever,
        },
        // blog entries tagged "movie"
        movie: {
            timeuuid_1c: scream-is-the-best-movie-ever
        }
    }
-->
&lt;ColumnFamily CompareWith="TimeUUIDType" Name="TaggedPosts"/>
</code></pre>
</div>
<h3>Comments ColumnFamily</h3>
<p>The last thing we need to do is figure out how to model the comments. Here we&#8217;ll get to bust out some <code>SuperColumn</code>s.</p>
<p>We&#8217;ll have 1 <code>Row</code> per entry. The key to the <code>Row</code> will be the entry&#8217;s slug. Within each <code>Row</code> we&#8217;ll have a <code>SuperColumn</code> for each comment. The name of the <code>SuperColumn</code>s will be a UUID that we&#8217;ll be applying the <em>TimeUUIDType</em> to. This will ensure that all our comments for an entry are sorted in chronological order. The <code>Column</code>s within each <code>SuperColumn</code> will be the various attributes of the comment (commenter&#8217;s name, comment time etc).</p>
<p>So, this is pretty simple as well&#8230; nothing fancy.</p>
<div class="code">
<pre><code>
&lt;!--
    ColumnFamily: Comments
    We store all comments here

    Row key => row key of the BlogEntry
    SuperColumn name: TimeUUIDType

    Access: get all comments for an entry

    Comments : {
        // comments for scream-is-the-best-movie-ever
        scream-is-the-best-movie-ever : { // row key = row key of BlogEntry
            // oldest comment first
            timeuuid_1 : { // SC Name
                // all Columns in the SC are attribute of the comment
                commenter: Joe Blow,
                email: joeb@example.com,
                comment: you're a dumb douche, the godfather is the best movie ever
                commentTime: 1250438004
            },

            ... more comments for scream-is-the-best-movie-ever

            // newest comment last
            timeuuid_2 : {
                commenter: Some Dude,
                email: sd@example.com,
                comment: be nice Joe Blow this isnt youtube
                commentTime: 1250557004
            },
        },

        // comments for i-got-a-new-guitar
        i-got-a-new-guitar : {
            timeuuid_1 : { // SC Name
                // all Columns in the SC are attribute of the comment
                commenter: Johnny Guitar,
                email: guitardude@example.com,
                comment: nice axe dawg...
                commentTime: 1250438004
            },
        }

        ..
        // then more Rows for the other entries
    }
-->
&lt;ColumnFamily CompareWith="TimeUUIDType" type="Super"
&nbsp;&nbsp;&nbsp;&nbsp;CompareSubcolumnsWith="BytesType" Name="Comments"/>
</code></pre>
</div>
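<p>As a final sanity check on the shape of this <code>ColumnFamily</code>, here&#8217;s a small Python sketch of reading an entry&#8217;s comments in order. As before it&#8217;s an in-memory stand-in: nested dicts play the role of the <code>Row</code>s and <code>SuperColumn</code>s, integer <code>SuperColumn</code> names stand in for time UUIDs (bigger means later), and <code>comments_for</code> is a made-up helper name.</p>

```python
# In-memory stand-in for the Comments ColumnFamily:
# row key (entry slug) -> SuperColumn name -> Columns (comment attributes).
comments = {
    "scream-is-the-best-movie-ever": {
        2: {"commenter": "Some Dude", "email": "sd@example.com",
            "commentTime": 1250557004},
        1: {"commenter": "Joe Blow", "email": "joeb@example.com",
            "commentTime": 1250438004},
    },
}


def comments_for(slug):
    """Return every comment for an entry, oldest first."""
    row = comments.get(slug, {})
    # Sorting the SuperColumn names puts the comments in chronological order.
    return [row[name] for name in sorted(row)]


for comment in comments_for("scream-is-the-best-movie-ever"):
    print(comment["commentTime"], comment["commenter"])
```

<p>An entry with no comments simply has no <code>Row</code>, so the lookup comes back empty.</p>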
<h2>Woot!</h2>
<p>That&#8217;s it. Our little blog app is all modeled and ready to go. It&#8217;s quite a bit to digest but in the end you end up with a pretty small chunk of XML you&#8217;ve gotta store in the <em>storage-conf.xml</em>:</p>
<div class="code">
<pre><code>
    &lt;Keyspace Name="BloggyAppy">
        &lt;!-- other keyspace config stuff -->

        &lt;!-- CF definitions -->
        &lt;ColumnFamily CompareWith="BytesType" Name="Authors"/>
        &lt;ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
        &lt;ColumnFamily CompareWith="TimeUUIDType" Name="TaggedPosts"/>
        &lt;ColumnFamily CompareWith="TimeUUIDType" Name="Comments"
            CompareSubcolumnsWith="BytesType" type="Super"/>
    &lt;/Keyspace>
</code></pre>
</div>
<p>Now all you need to do is figure out how to get the data in and out of Cassandra <img src='http://arin.me/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> . That&#8217;s all accomplished via the <a href="http://wiki.apache.org/cassandra/ThriftInterface">Thrift Interface</a>. The <a href="http://wiki.apache.org/cassandra/API">API wiki page</a> does a decent job at explaining what the various endpoints do so I won&#8217;t go into all those details. But, in general, you just compile the <code>cassandra.thrift</code> file and use the generated code to access the various <a href="http://wiki.apache.org/cassandra/API">endpoints</a>. Alternatively you can take advantage of this <a href="http://blog.evanweaver.com/files/doc/fauna/cassandra_client/files/README.html">Ruby client</a> or this <a href="http://github.com/digg/lazyboy/tree/master">Python client</a>.</p>
<p>Alrighty&#8230; hopefully all that made sense &#038; you finally understand WTF a <code>SuperColumn</code> is and can start building some awesome apps.</p>
]]></content:encoded>
			<wfw:commentRss>http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model/feed</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
	</channel>
</rss>
