September 1st, 2010

Blekko’s Pretty Rad

January 21st, 2010

Worthless Commenting

Writing comments in your code is fine and dandy but sometimes it’s just a fucking waste of time. Here’s an example:

<?php
class UserDataAccessClass
{
    /**
     * Get the number of followers a user has
     *
     * @param string $userID The user id
     *
     * @return int
     */
    public function getNumFollowers($userID)
    {
        return valueFromAServiceCallOrQueryEtc($userID);
    }
}
?>

So, yay – that chunk of code passes PHPCS (using the PEAR standard). All the parameters are documented, there’s a line of text explaining what the function does and the docblock even states the return type… how cute!

But why? Why do I have to type all that crap? The function’s name is self documenting & its sole parameter is obvious. The return type makes sense to me but the rest is bullshit. God forbid your function takes multiple parameters; then you’d have to line up the @param‘s types and descriptions ’cause PHP people have a strange hardon for lining shit up.

The truth is the only reason I do all that stuff is cuz we run PHPCS on our code at work and I don’t wanna be “that guy”. If it wasn’t a “standard” at work there’s no way in hell I’d ever bother.

I’d much rather have the documentation go like this instead:

<?php
class UserDataAccessClass
{
    /**
     * @return int
     */
    public function getNumFollowers(string $userID)
    {
        return valueFromAServiceCallOrQueryEtc($userID);
    }
}
?>

…And to be 100% honest the only reason I’d include the @return is ’cause I’m an Eclipse user & it helps PDT’s static analysis (aka autocomplete gets more better).

December 11th, 2009

Indexing Nodes in Neo4J

I’ve been playing with #neo4j quite a bit lately. It’s a great & fun project. It’s a graph database that mitigates all the bullshit you have to deal with when trying to, ya know, do graph stuff. Example: find all User Nodes whose gender property is set to female, have an outgoing likes relationship to the Node punk music and are less than 3 degrees of separation from Node #4. Stuff like that. It’s super good at doing this.
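
To make that example query concrete, here’s a rough sketch of what doing it by hand looks like in plain Java, with no Neo4J involved. Everything in it is made up for illustration: the GraphQuerySketch class, the User type, the friends links and the sample data. It also assumes “less than 3 degrees” means at most 2 hops over undirected links. It’s the kind of bookkeeping a graph database saves you from, not how Neo4J works internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class GraphQuerySketch {
    // Hypothetical stand-in for a User Node: a couple of properties plus links.
    static class User {
        final String gender;
        final Set<String> likes = new HashSet<>();
        final Set<Long> friends = new HashSet<>();

        User(String gender, String... likes) {
            this.gender = gender;
            this.likes.addAll(Arrays.asList(likes));
        }
    }

    /** Breadth-first search: IDs within maxDepth hops of startId that match both filters. */
    static Set<Long> find(Map<Long, User> graph, long startId, int maxDepth,
                          String gender, String liked) {
        Set<Long> matches = new TreeSet<>();
        Set<Long> seen = new HashSet<>();
        seen.add(startId);
        List<Long> frontier = new ArrayList<>();
        frontier.add(startId);

        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<Long> next = new ArrayList<>();
            for (long id : frontier) {
                for (long friendId : graph.get(id).friends) {
                    if (!seen.add(friendId)) {
                        continue; // already reached via a path that is no longer
                    }
                    next.add(friendId);
                    User user = graph.get(friendId);
                    if (gender.equals(user.gender) && user.likes.contains(liked)) {
                        matches.add(friendId);
                    }
                }
            }
            frontier = next;
        }
        return matches;
    }

    static void link(Map<Long, User> graph, long a, long b) {
        graph.get(a).friends.add(b);
        graph.get(b).friends.add(a);
    }

    // Made-up data: 4 - 5, 4 - 6, 6 - 7, 7 - 8
    static Map<Long, User> sampleGraph() {
        Map<Long, User> graph = new HashMap<>();
        graph.put(4L, new User("male"));
        graph.put(5L, new User("female", "punk music"));
        graph.put(6L, new User("female", "jazz"));
        graph.put(7L, new User("female", "punk music"));
        graph.put(8L, new User("female", "punk music"));
        link(graph, 4L, 5L);
        link(graph, 4L, 6L);
        link(graph, 6L, 7L);
        link(graph, 7L, 8L);
        return graph;
    }

    public static void main(String[] args) {
        // Node 8 matches the filters but is 3 hops away, so it is left out.
        System.out.println(find(sampleGraph(), 4L, 2, "female", "punk music")); // prints [5, 7]
    }
}
```

Even for a toy graph that’s a fair amount of code, and it only handles one relationship type held entirely in memory.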

But here’s the deal… each Node is gettable via ID which is nice – but the IDs are Neo4J’s internal IDs; you don’t get to set ‘em when you create a Node. So, what if I want to get a Node whose username property is phatduckk & start the traversal from there? The problem lies in the fact that you don’t know that phatduckk is Node #4 so you need a simple & efficient way to do that lookup & grab that Node.

If your dataset is small, I guess, you can just use a Map and store the mapping yourself but that solution will fall over pretty quickly. You could also toss info into MySQL but why would you do that? It just doesn’t feel right to use 2 different stores. So, checking out some of the docs you’ll see that Neo4J’s got some indexing capabilities.
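
For the small-dataset case, the do-it-yourself Map could look something like the sketch below. The UsernameLookup class and its method names are hypothetical; the only Neo4J piece it assumes is that you’d hand the stored ID to neo.getNodeById(...) afterwards. It uses a ConcurrentHashMap so several threads can register usernames at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class UsernameLookup {
    // username -> Neo4J's internal node ID
    private final ConcurrentMap<String, Long> idsByUsername = new ConcurrentHashMap<>();

    /**
     * Remember which node ID a username maps to. The first writer wins;
     * the return value is the ID that is actually stored.
     */
    public long remember(String username, long nodeId) {
        Long existing = idsByUsername.putIfAbsent(username, nodeId);
        return existing != null ? existing : nodeId;
    }

    /** Returns the node ID for a username, or null if it was never stored. */
    public Long find(String username) {
        return idsByUsername.get(username);
    }

    public static void main(String[] args) {
        UsernameLookup lookup = new UsernameLookup();
        lookup.remember("phatduckk", 4L);
        // This is the ID you'd pass to neo.getNodeById(...) to start a traversal.
        System.out.println(lookup.find("phatduckk"));
    }
}
```

The catch is that the whole mapping lives in memory and has to be rebuilt on every restart, which is a big part of why it stops being practical as the data grows.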

Initially I tried out the SingleValueIndex which fell over in a multi-threaded scenario. So, I hit up the list and was advised to check out the LuceneIndexService. This worked like a charm. Even with multiple threads constantly indexing the same Node.

Here’s a little test app. It’s a brute force, little hack that creates a single Node and indexes it by its username property 1,000,000 times (NUM_LINES in the code) using 10 threads. This is a pretty unrealistic situation but I really wanted to make sure it behaved well in a multi-threaded scenario and didn’t frustrate me like the SingleValueIndex did.

package com.digg.tmp;

import org.neo4j.api.core.*;
import org.neo4j.util.index.IndexService;
import org.neo4j.util.index.LuceneIndexService;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LuceneIndex {
    private static final String USERNAME_INDEX = "usernameIndex";
    private static final int NUM_THREADS = 10;
    private static final int NUM_LINES = 1000000;
    private static final String USERNAME = "phatduckk";

    public static void main(String[] args) {
        // always use a new store
        NeoService neo = new EmbeddedNeo("test-" + System.currentTimeMillis());

        // now create the node we want indexed:
        Transaction txUser = neo.beginTx();
        Node userNode = neo.createNode();
        userNode.setProperty(USERNAME_INDEX, USERNAME);
        txUser.success();
        txUser.finish();

        // now create the index & setup a pool
        IndexService idxServ = new LuceneIndexService(neo);
        final ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);

        // now let's index that same node NUM_LINES times
        // we index the same node over and over to shake out thread-safety issues during indexing;
        // normally you'd be indexing new nodes whose data you got from some external source
        for (int i = 0; i < NUM_LINES; i++) {
            System.out.println("line: " + i);
            IndexRunner command = new IndexRunner(userNode, neo, idxServ);
            executorService.execute(command);
        }

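        // stop accepting new work, wait for the queued tasks, then shut everything down cleanly
        executorService.shutdown();
        try {
            executorService.awaitTermination(1, java.util.concurrent.TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        idxServ.shutdown();
        neo.shutdown();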
    }

    static class IndexRunner implements Runnable {
        NeoService neo;
        IndexService idxServ;
        Node userNode;

        IndexRunner(Node userNode, NeoService neo, IndexService idxServ) {
            this.userNode = userNode;
            this.neo = neo;
            this.idxServ = idxServ;
        }

        public void run() {
            Transaction nodetx = neo.beginTx();
            try {
                Node nodeFromIndex = idxServ.getSingleNode(USERNAME_INDEX, USERNAME);

                if (nodeFromIndex != null) {
                    System.out.println("found " + USERNAME + " in the " + USERNAME_INDEX
                            + " index. Node ID is: " + nodeFromIndex.getId());
                } else {
                    idxServ.index(userNode, USERNAME_INDEX, USERNAME);
                }

                nodetx.success();
            } finally {
                // finish() commits if success() was called, otherwise it rolls back
                nodetx.finish();
            }
        }
    }
}

Although this is an off-the-wall example it can also serve as a simple example of how to index a Node. Anywho – hope this helps out a few folks that ran into the same needs/problems/scenarios I did. In hindsight it’s all pretty simple & straightforward – I just went down the wrong path with the SingleValueIndex… when browsing the docs it sounded like the right tool for the job but, from what I can tell, you should avoid it and use the LuceneIndexService instead.

<?php

if (! isset($argv[1])) {
    echo "enter a search term:\n";
    echo 'php ' . __FILE__ . " <search_term>\n";
    exit;
}

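$term = urlencode($argv[1]);
$url = "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStoreServices.woa/wa/wsSearch?limit=10&entity=software&term=$term";

// file_get_contents() returns false if the request fails
$json = file_get_contents($url);
if ($json === false) {
    echo "request failed\n";
    exit(1);
}

print_r(json_decode($json, true));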

?>

More info here.


I stumbled upon Paul William’s plugin for embedding a Gist into a WordPress blog.

It’s a quick ‘n clean plugin but it relies on a JS <script> tag to render the Gist’s content… so, I made a quick tweak to get the plugin to actually put the Gist’s content into your HTML source. There may already be something similar but, eh, it was just a quick hackjob.

The plugin’s code and instructions for installation & usage are in the Gist below.

<?php
/*
Plugin Name: Gistson - Embedded Gist WP Plugin
Plugin URI: http://arin.me/blog/tag/gistson
Description: Use a shortcode [gist id="12345"] to embed A Gist from http://gist.github.com into your blog
Version: 0.1
Author: Arin Sarkissian
Author URI: http://arin.me

Copyright 2009 Arin Sarkissian

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/

/*
CREDIT:
Heavily based on Paul William's plugin:
http://www.entropytheblog.com/blog/
http://www.entropytheblog.com/blog/2008/12/wordpress-github-gist-shortcode-plugin/
Main difference is that this version doesn't do a JS, <script>, embed... the code from your gist is
actually in the HTML source.

INSTALL:
Toss the gistson.php file into your blog's wp-content/plugins folder. Log in to WP and enable the plugin.

USE:
Put this <LINK> tag in <HEAD> of header.php
<link rel="stylesheet" href="http://gist.github.com/stylesheets/gist/embed.css"/>
When you wanna embed a gist just type in:
[gist id="gist-id-from-gist.github.com-here"]
example:
[gist id="250709"]
You can exclude the attribution by doing this:
[gist id="250709" nometa="true"]
This is useful for when you have multiple gists. But for big chunks of code etc
I'd encourage you to keep the attribution cuz those guys have a business to run
*/

function gist_shortcode_func($atts, $content = null) {
    $url = 'http://gist.github.com/' . trim($atts['id']) . '.json';
    $json = file_get_contents($url);
    $assoc = json_decode($json, true);

    if (isset($atts['nometa'])) {
        // you'll end up with 2 1px borders at the bottom =(
        $assoc['div'] = preg_replace('/<div class="gist\-meta">.*?(<\/div>)/is', '', $assoc['div']);
    }

    return $assoc['div'];
}
add_shortcode('gist', 'gist_shortcode_func');

?>

Oh ya – I named it Gistson ’cause it grabs the Gist’s data via an HTTP GET to a JSON doc. Ya, I know, not too creative.
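
If you’re curious what the nometa option’s regex actually does, here’s the same idea sketched in Java rather than PHP: a non-greedy, case-insensitive match where the dot also matches newlines (the /is flags). The GistMetaStripper class and the sample HTML are made up for illustration; real Gist markup will look different.

```java
import java.util.regex.Pattern;

public class GistMetaStripper {
    // Non-greedy match from the opening gist-meta div to the first closing </div>.
    // CASE_INSENSITIVE and DOTALL correspond to the /i and /s flags in the plugin.
    private static final Pattern GIST_META = Pattern.compile(
            "<div class=\"gist-meta\">.*?</div>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the HTML with every gist-meta block removed. */
    static String stripMeta(String html) {
        return GIST_META.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical markup, only shaped roughly like an embedded Gist.
        String html = "<div class=\"gist\"><pre>echo 1;</pre>"
                + "<div class=\"gist-meta\">\n<a href=\"#\">view raw</a>\n</div></div>";
        System.out.println(stripMeta(html));
    }
}
```

Because the match stops at the first closing div it finds, this only works while the meta block has no nested divs inside it. That’s fine for a quick hack but it’s not a real HTML parser.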
