the human graph
Dec. 1st, 2009 05:19 pmSo, right. I mentioned the project for 198 already. The actual project itself ceased to be interesting about 3 seconds after I contemplated it, but I was curious about the nature of the dictionary we were provided.
There are a few obstacles to my curiosity: 1) I'm not actually a programmer. 2) we're working with ruby 3) prof isn't actually interested in helping me to optimize my code.
The original problem is this: given a fixed dictionary (in this case, 2355 4-letter words), construct a valid path between two pairs of words such that in the path the next-hop (word) differs from the current word by no more than one letter, and that no word is repeated in the path. for extra credit, ensure that no shorter path exists.
brute-force BFS will guarantee a shortest path. But I wanted to know which pair of words in the dictionary (that were actually connected) had the longest path between them.
Having constructed the graph (which, given that it is ruby and ruby doesn't do real threading, takes about 10 seconds), finding the shortest path using BFS between a few million unique pairs of words takes... a while. Done serially, it takes about 4 days. Did I mention ruby doesn't do threading?
My prof has no interest in actually helping me (I could be charitable and say it was because it's not part of the project, but really he's just an ass and doesn't want to help me), but fortunately I have friends that are more accommodating.
I'd already been using memoization for parent relationships, but got nudged in the right direction to use it for everything else- all partial chains found, all failed paths, etc. Watching the output of the new code is interesting... Now at about 23% through the pairs, i'd say I'm using partial chains for about 55% of the hits, 30% are cached failed paths, 5% are full hits on cached chains, and the rest are still fully-computed paths. It'll probably be done this evening, just in time for the original run to finish. I think I probably missed a few opportunities for caching chains, but given that this is so far from the original problem as to be pointless, I'm not going to spend any more mental time on it. Trying. There is always the hamiltonian path problem...
There is also the 'do it in another language that uses real threads' path that I am ignoring. It *is* in fact an embarrassingly parallel problem, but I don't really feel like spawning a few million processes on one of our clusters. Maybe later, when I don't have a 40-page paper due in a week.
........
how does this tie into the human graph? Well, I realized that for me, getting to know people is somewhat like this dictionary search. on the surface, every pair is unique, and then you get a level or two in, and realize "oh, wait, I've seen this before. I know how this is going to go."
Now, sometimes, this knowing is a good thing. There are comfortable paths, and comfortable people, and just because they don't necessarily surprise us doesn't mean they are boring or should be discarded. But sometimes we keep trying to explore similar paths, thinking "maybe this time I'll find a solution" when really there is none. And then there are those people that tempt us into hoping that this time we'll find a new, deep, and interesting path, but turn out to be terribly shallow. And already known.
There are a few obstacles to my curiosity: 1) I'm not actually a programmer. 2) we're working with ruby 3) prof isn't actually interested in helping me to optimize my code.
The original problem is this: given a fixed dictionary (in this case, 2355 4-letter words), construct a valid path between two pairs of words such that in the path the next-hop (word) differs from the current word by no more than one letter, and that no word is repeated in the path. for extra credit, ensure that no shorter path exists.
brute-force BFS will guarantee a shortest path. But I wanted to know which pair of words in the dictionary (that were actually connected) had the longest path between them.
Having constructed the graph (which, given that it is ruby and ruby doesn't do real threading, takes about 10 seconds), finding the shortest path using BFS between a few million unique pairs of words takes... a while. Done serially, it takes about 4 days. Did I mention ruby doesn't do threading?
My prof has no interest in actually helping me (I could be charitable and say it was because it's not part of the project, but really he's just an ass and doesn't want to help me), but fortunately I have friends that are more accommodating.
I'd already been using memoization for parent relationships, but got nudged in the right direction to use it for everything else- all partial chains found, all failed paths, etc. Watching the output of the new code is interesting... Now at about 23% through the pairs, i'd say I'm using partial chains for about 55% of the hits, 30% are cached failed paths, 5% are full hits on cached chains, and the rest are still fully-computed paths. It'll probably be done this evening, just in time for the original run to finish. I think I probably missed a few opportunities for caching chains, but given that this is so far from the original problem as to be pointless, I'm not going to spend any more mental time on it. Trying. There is always the hamiltonian path problem...
There is also the 'do it in another language that uses real threads' path that I am ignoring. It *is* in fact an embarrassingly parallel problem, but I don't really feel like spawning a few million processes on one of our clusters. Maybe later, when I don't have a 40-page paper due in a week.
........
how does this tie into the human graph? Well, I realized that for me, getting to know people is somewhat like this dictionary search. on the surface, every pair is unique, and then you get a level or two in, and realize "oh, wait, I've seen this before. I know how this is going to go."
Now, sometimes, this knowing is a good thing. There are comfortable paths, and comfortable people, and just because they don't necessarily surprise us doesn't mean they are boring or should be discarded. But sometimes we keep trying to explore similar paths, thinking "maybe this time I'll find a solution" when really there is none. And then there are those people that tempt us into hoping that this time we'll find a new, deep, and interesting path, but turn out to be terribly shallow. And already known.