January 21st, 2010

Worthless Commenting

Writing comments in your code is fine and dandy but sometimes it’s just a fucking waste of time. Here’s an example:

<?php
class UserDataAccessClass
{
    /**
     * Get the number of followers a user has
     *
     * @param string $userID The user id
     *
     * @return int
     */
    public function getNumFollowers($userID)
    {
        return valueFromAServiceCallOrQueryEtc($userID);
    }
}
?>

So, yay – that chunk of code passes PHPCS (using the PEAR standard). All the parameters are documented, there’s a line of text explaining what the function does and the docblock even states the return type… how cute!

But why? Why do I have to type all that crap? The function’s name is self documenting & its sole parameter is obvious. The return type makes sense to me but the rest is bullshit. God forbid your function takes multiple parameters; then you’d have to line up the @param’s types and descriptions ’cause PHP people have a strange hardon for lining shit up.

The truth is the only reason I do all that stuff is cuz we run PHPCS on our code at work and I don’t wanna be “that guy”. If it wasn’t a “standard” at work there’s no way in hell I’d ever bother.

I’d much rather have the documentation go like this instead:

<?php
class UserDataAccessClass
{
    /**
     * @return int
     */
    public function getNumFollowers(string $userID)
    {
        return valueFromAServiceCallOrQueryEtc($userID);
    }
}
?>

…And to be 100% honest the only reason I’d include the @return is ’cause I’m an Eclipse user & it helps PDT’s static analysis (aka autocomplete gets more better).


He shows off a bunch of Gibsons then busts out a few of his Music Man guitars about 2 minutes in. He had great things to say about his MM axes:

They really are the nicest stuff

- Joe Bonamassa

December 11th, 2009

Indexing Nodes in Neo4J

I’ve been playing with #neo4j quite a bit lately. It’s a great & fun project. It’s a graph database that mitigates all the bullshit you have to deal with when trying to, ya know, do graph stuff. Example: find all User Nodes whose gender property is set to female, have an outgoing likes relationship to the Node punk music and are less than 3 degrees of separation from Node #4. Stuff like that. It’s super good at doing this.
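
To make that example query concrete, here’s a rough sketch of what it is asking for, written against a tiny in-memory graph in plain Java. This is not Neo4J’s API: TinyGraph, relate, within and query are made-up names, and I’m assuming “less than 3 degrees” means at most 2 hops over any relationship. Neo4J does this kind of traversal for you; the sketch only shows the bookkeeping you’d otherwise write by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory graph for illustration only; NOT the Neo4J API.
final class TinyGraph {
    // node ID -> properties
    final Map<Long, Map<String, String>> props = new HashMap<>();
    // node ID -> relationship type -> target node IDs (outgoing only)
    final Map<Long, Map<String, Set<Long>>> out = new HashMap<>();
    // undirected adjacency, used to count degrees of separation
    final Map<Long, Set<Long>> neighbors = new HashMap<>();

    void addNode(long id, String key, String value) {
        props.computeIfAbsent(id, k -> new HashMap<>()).put(key, value);
    }

    void relate(long from, String type, long to) {
        out.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(type, k -> new HashSet<>()).add(to);
        neighbors.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        neighbors.computeIfAbsent(to, k -> new HashSet<>()).add(from);
    }

    // Level-by-level breadth-first search: every node within maxHops of start.
    Set<Long> within(long start, int maxHops) {
        Set<Long> seen = new HashSet<>();
        seen.add(start);
        List<Long> frontier = new ArrayList<>();
        frontier.add(start);
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            List<Long> next = new ArrayList<>();
            for (long id : frontier) {
                for (long n : neighbors.getOrDefault(id, Set.of())) {
                    if (seen.add(n)) {
                        next.add(n);
                    }
                }
            }
            frontier = next;
        }
        seen.remove(start);
        return seen;
    }

    // Female users within 2 hops of start that have an outgoing "likes" to likedNode.
    List<Long> query(long start, long likedNode) {
        List<Long> hits = new ArrayList<>();
        for (long id : within(start, 2)) {
            boolean female = "female".equals(props.getOrDefault(id, Map.of()).get("gender"));
            boolean likes = out.getOrDefault(id, Map.of())
                               .getOrDefault("likes", Set.of()).contains(likedNode);
            if (female && likes) {
                hits.add(id);
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        TinyGraph g = new TinyGraph();
        g.addNode(100, "name", "punk music");
        g.addNode(4, "gender", "male");
        g.addNode(5, "gender", "female");
        g.relate(4, "knows", 5);
        g.relate(5, "likes", 100);
        System.out.println(g.query(4, 100));
    }
}
```

Even for a toy this is a fair amount of plumbing, which is pretty much the point of letting a graph database handle it.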

But here’s the deal… each Node is gettable via ID which is nice – but the IDs are Neo4J’s internal IDs; you don’t get to set ‘em when you create a Node. So, what if I want to get a Node whose username property is phatduckk & start the traversal from there? The problem lies in the fact that you don’t know that phatduckk is Node #4 so you need a simple & efficient way to do that lookup & grab that Node.

If your dataset is small, I guess, you can just use a Map and store the mapping yourself but that solution will fall over pretty quickly. You could also toss info into MySQL but why would you do that? It just doesn’t feel right to use 2 different stores. So, checking out some of the docs you’ll see that Neo4J’s got some indexing capabilities.
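
For the record, the do-it-yourself Map approach would look roughly like the sketch below. UsernameLookup, remember and find are names I made up for illustration; nothing here comes from Neo4J. The sketch keeps everything in memory, so the mapping is gone on restart and it isn’t tied to the database’s transactions.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: maps a username to the node ID the database assigned.
final class UsernameLookup {
    private final Map<String, Long> idsByUsername = new ConcurrentHashMap<>();

    // Call after creating the node; returns false if the username was already mapped.
    boolean remember(String username, long nodeId) {
        return idsByUsername.putIfAbsent(username, nodeId) == null;
    }

    // Look up the node ID for a username, if one was recorded.
    OptionalLong find(String username) {
        Long id = idsByUsername.get(username);
        return id == null ? OptionalLong.empty() : OptionalLong.of(id);
    }

    public static void main(String[] args) {
        UsernameLookup lookup = new UsernameLookup();
        lookup.remember("phatduckk", 4L);
        System.out.println(lookup.find("phatduckk"));
    }
}
```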

Initially I tried out the SingleValueIndex which fell over in a multi-threaded scenario. So, I hit up the list and was advised to check out the LuceneIndexService. This worked like a charm, even with multiple threads constantly indexing the same Node.

Here’s a little test app. It’s a brute force, little hack that creates a single Node and indexes it by its username property 1,000,000 times (the NUM_LINES constant) using 10 threads. This is a pretty unrealistic situation but I really wanted to make sure it behaved well in a multi-threaded scenario and didn’t frustrate me like the SingleValueIndex did.

package com.digg.tmp;
 
import org.neo4j.api.core.*;
import org.neo4j.util.index.IndexService;
import org.neo4j.util.index.LuceneIndexService;
 
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
 
public class LuceneIndex {
    private static final String USERNAME_INDEX = "usernameIndex";
    private static final int NUM_THREADS = 10;
    private static final int NUM_LINES = 1000000;
    private static final String USERNAME = "phatduckk";
 
    public static void main(String[] args) {
        // always use a new store
        NeoService neo = new EmbeddedNeo("test-" + System.currentTimeMillis());
 
        // now create the node we want indexed:
        Transaction txUser = neo.beginTx();
        Node userNode = neo.createNode();
        userNode.setProperty(USERNAME_INDEX, USERNAME);
        txUser.success();
        txUser.finish();
 
        // now create the index & setup a pool
        IndexService idxServ = new LuceneIndexService(neo);
        final ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
 
        // now let's index that same node NUM_LINES times
        // we index the same node over and over to check for thread safety issues during indexing;
        // normally you'd be indexing new nodes whose data you got from some external source
        for (int i = 0; i < NUM_LINES; i++) {
            System.out.println("line: " + i);
            IndexRunner command = new IndexRunner(userNode, neo, idxServ);
            executorService.execute(command);
        }
 
        // should do a clean neo.shutdown() at some point ;-)
    }
 
    static class IndexRunner implements Runnable {
        NeoService neo;
        IndexService idxServ;
        Node userNode;
 
        IndexRunner(Node userNode, NeoService neo, IndexService idxServ) {
            this.userNode = userNode;
            this.neo = neo;
            this.idxServ = idxServ;
        }
 
        public void run() {
            Transaction nodetx = neo.beginTx();
            try {
                Node nodeFromIndex = idxServ.getSingleNode(USERNAME_INDEX, USERNAME);

                if (nodeFromIndex != null) {
                    System.out.println("found " + USERNAME + " in the " + USERNAME_INDEX
                            + " index. Node ID is: " + nodeFromIndex.getId());
                } else {
                    idxServ.index(userNode, USERNAME_INDEX, USERNAME);
                }

                nodetx.success();
            } finally {
                // always finish the transaction, even if the lookup or indexing throws
                nodetx.finish();
            }
        }
    }
}

Although this is an off-the-wall example it can also serve as a simple example of how to index a Node. Anywho – hope this helps out a few folks that ran into the same needs/problems/scenarios I did. In hindsight it’s all pretty simple & straightforward – I just went down the wrong path with the SingleValueIndex… when browsing the docs it sounded like the right tool for the job but, from what I can tell, you should avoid it and use the LuceneIndexService instead.
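
If you want to see the shape of that stress test without a Neo4J install, here’s a plain-Java stand-in. It’s only a sketch under my own assumptions: a ConcurrentMap plays the role of the index, IndexStressSketch and hammer are made-up names, and it says nothing about how LuceneIndexService works internally. It also does the clean pool shutdown that the test app above skips.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for the test app; the map is a pretend index, not Neo4J.
final class IndexStressSketch {
    // Runs `tasks` get-or-index attempts on `threads` threads and
    // returns how many of them actually inserted the entry.
    static int hammer(ConcurrentMap<String, Long> index, String key, long nodeId,
                      int threads, int tasks) {
        AtomicInteger inserts = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                // putIfAbsent on a ConcurrentMap is an atomic check-then-insert
                if (index.putIfAbsent(key, nodeId) == null) {
                    inserts.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // stop accepting work, let queued tasks finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return inserts.get();
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Long> index = new ConcurrentHashMap<>();
        int inserts = hammer(index, "phatduckk", 4L, 10, 100_000);
        System.out.println("inserts: " + inserts + ", indexed ID: " + index.get("phatduckk"));
    }
}
```

Because the check and the insert happen in one atomic call, only one task should ever win the insert no matter how many threads pile on.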

<?php
 
if (! isset($argv[1])) {
    echo "enter a search term:\n";
    echo 'php ' . __FILE__ . " <search_term>\n";
    exit;
}
 
$term = urlencode($argv[1]);
$url = "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStoreServices.woa/wa/wsSearch?limit=10&entity=software&term=$term";
$json = file_get_contents($url);
 
print_r(json_decode($json, true));
 
?>

More info here.
