One of the users on another web site posted a question about how to associate users in a tree-like organization. That web site isn't well suited to posting code or ongoing discussions about code, so I'm going to post the example here. Feel free to discuss as you see fit.DROP TABLE LI_UserLinks
GO
DROP TABLE LI_Users
GO
DECLARE @.d1 DATETIME
, @.d2 DATETIME
, @.d3 DATETIME
, @.d4 DATETIME
, @.d5 DATETIME
, @.d6 DATETIME
, @.d7 DATETIME
, @.d8 DATETIME
SELECT @.d1 = GetDate()
CREATE TABLE LI_Users (
uid INT
PRIMARY KEY (uid)
)
SELECT @.d2 = GetDate()
CREATE TABLE LI_UserLinks (
uid_from INT
CONSTRAINT XFK01LI_UserLinks FOREIGN KEY (uid_from)
REFERENCES LI_Users (uid)
, uid_to INT
CONSTRAINT SFK01LI_UserLinks FOREIGN KEY (uid_to)
REFERENCES LI_Users (uid)
CONSTRAINT XPKLI_UserLinks PRIMARY KEY (uid_from, uid_to)
)
ALTER TABLE LI_Userlinks
ADD CONSTRAINT XCK01LI_UserLinks CHECK (uid_from != uid_to)
SELECT @.d3 = GetDate()
INSERT INTO LI_Users (
uid) SELECT n0 + 10 * n1 + 100 * n2 + 1000 * n3 + 10000 * n4 + 100000 * n5
FROM (SELECT 0 AS n0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z0
CROSS JOIN (SELECT 0 AS n1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z1
CROSS JOIN (SELECT 0 AS n2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z2
CROSS JOIN (SELECT 0 AS n3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z3
CROSS JOIN (SELECT 0 AS n4 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z4
CROSS JOIN (SELECT 0 AS n5 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z5
SELECT @.d4 = GetDate()
INSERT INTO LI_UserLinks (
uid_from, uid_to) SELECT
uid, 100 * uid + n0 + 10 * n1
FROM LI_Users
CROSS JOIN (SELECT 0 AS n0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z0
CROSS JOIN (SELECT 0 AS n1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z1
WHERE LI_Users.uid BETWEEN 0 AND 99
AND uid != 100 * uid + n0 + 10 * n1
SELECT @.d5 = GetDate()
INSERT INTO LI_UserLinks (
uid_from, uid_to) SELECT
uid, 100 * uid + n0 + 10 * n1
FROM LI_Users
CROSS JOIN (SELECT 0 AS n0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z0
CROSS JOIN (SELECT 0 AS n1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9) AS z1
WHERE LI_Users.uid BETWEEN 100 AND 9999
SELECT @.d6 = GetDate()
SELECT u.uid, r1.uid_to, r2.uid_to
FROM LI_Users AS u
INNER JOIN LI_UserLinks AS r1
ON (r1.uid_from = u.uid)
INNER JOIN LI_UserLinks AS r2
ON (r2.uid_from = r1.uid_to)
WHERE 1 = u.uid
SELECT @.d7 = GetDate()
SELECT Count(DISTINCT u.uid), Count(DISTINCT r1.uid_to), Count(distinct r2.uid_to)
FROM LI_Users AS u
INNER JOIN LI_UserLinks AS r1
ON (r1.uid_from = u.uid)
INNER JOIN LI_UserLinks AS r2
ON (r2.uid_from = r1.uid_to)
SELECT @.d8 = GetDate()
SELECT
DateDiff(ms, @.d1, @.d2)
, DateDiff(ms, @.d2, @.d3)
, DateDiff(ms, @.d3, @.d4)
, DateDiff(ms, @.d4, @.d5)
, DateDiff(ms, @.d5, @.d6)
, DateDiff(ms, @.d6, @.d7)
, DateDiff(ms, @.d7, @.d8)
, DateDiff(ms, @.d1, @.d8)-PatPWhat do you want us to discuss? Where is the description of the problem he/she wants to solve? Is it an organization hierarcy, a family tree?
Really don't know what to discuss or comment on.|||There is no need to discuss anything unless you have something to contribute. I haven't gotten an answer from the other site's management about whether they'll even tolerate cross-site references (some sites object to that as a form of advertising, DBForums is often abused by other sites using posts here to attract new users), so I'll hide the site name for the moment.
The original question was:I'm trying to come up with a solution on how <<removed by PatP>> stores relationships between users, specifically, how it recognizes, and retrieves the network tree.
What I'm after is how this happens in the backend, and database. How it says "David is a 2nd tier connection to Joe". I can't seem to come up with any solution that would be scalable. I had originally considered using Joe Celko's "nested set" model, but this was simpler, efficient enough for this use, and uses much more common schema and code constructs.
-PatP|||Well, that site was easy enough to find using Google|||My only comment would be that the logic is not recursive.|||the network on that site isn't really a tree, it's a graph where there can be more than one way to get from one node to another. There's no parent-child relationship, just a "we're connected" relationship.
the trick is to find the *shortest* path to get from one to the other.
a well known algorithm for this is Djikstra's. Peso has a slick impementation in sql here:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=77262
although my instinct is this algorithm would be better done in compiled code.|||http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=77262
although my instinct is this algorithm would be better done in compiled code.
I thought the goal was to get rid of all the developers|||Well, that site was easy enough to find using GoogleYeah, I know that... Until I get the "mother may I" from their admins to link to another site (like here) I'm not going to just lay it out.
They are relatively new to the question and answer format, and don't really have any good way that I can find to present code. I tend to follow the "ravioli" model for posting and keep things completely contained on one site when I can, but at least to me this problem just itched for a DBForums-like presentation for the solution.
-PatP|||the trick is to find the *shortest* path to get from one to the other.This isn't much of a trick because they only support three levels, and because this is a directed graph it is functionally identical to a tree. Someone of interest to you is either your own direct connection, or a connection of your direct connections... There is no need to search the entire graph because that site limits the search to two levels.
-PatP|||I thought the goal was to get rid of all the developers
without developers, your beloved SQL Server goes away.
besides, I am a dev. ;)|||This isn't much of a trick because they only support three levels, and because this is a directed graph it is functionally identical to a tree. Someone of interest to you is either your own direct connection, or a connection of your direct connections... There is no need to search the entire graph because that site limits the search to two levels.
-PatP
a directed graph is not the same as a tree. trees don't have cycles (unless you are into incest?), but a graph can.
also here's a current thread on sqlteam on this topic: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=89323|||An unconstrained graph can certainly have cycles, but a directed graph can never have cycles by its definition. Because a directed graph has "direction" and the start node of any link must be nearer the root than the end node of that link, cycles can't happen in a directed graph. That's part of why a directed graph can't have circular links (where a node connects to itself) because the end node must be further from the root than the start node for any given link.
-PatP|||One thought about the code sample that I posted that I do need to add is how to find the "degree of relation". You are in the U alias, your direct connections appear in the R1 alias, and their direct connections (your second level connections) appear in the R2 alias. When checking to see how "close" to you another node in the graph is, check R1 first then check R2 (because it is possible that someone could be both your direct connection as well as an indirect connection (through a common friend).
-PatP|||An unconstrained graph can certainly have cycles, but a directed graph can never have cycles by its definition. Because a directed graph has "direction" and the start node of any link must be nearer the root than the end node of that link, cycles can't happen in a directed graph. That's part of why a directed graph can't have circular links (where a node connects to itself) because the end node must be further from the root than the start node for any given link.
-PatP
a directed graph can certainly have cycles. a directed graph is just a normal graph where you replace the edges by "arcs" where arcs have a direction. edges have no direction.
see the section here on directed graphs. there's a diagram there of a directed graph with a cycle.
http://en.wikipedia.org/wiki/Graph_(mathematics)
if a directed graph has no cycles, then it's called a directed acyclic graph.
also I don't know what you mean by root. a graph has no root. trees have roots.|||Christmas trees don't have roots.
Shoe trees don't have roots.|||I was taught (granted this was 30 years ago) that a CSDG (Computer Sicent Directed Graph) was fundamentatlly different than a mathematical graph in several ways, in order to make it possible to solve some problems with the existing hardware in usable time frames.
A mathematical graph is fundamentally a set made up of nodes and associated vectors. As a set, there is no implicit order of the nodes. A CSDG is fundamentally a *possibly multiply) linked list, so that nodes always have an implicit order (this makes pruning possible/practical).
A mathematical graph is undirected, meaning that any two related nodes have two vectors that connect them. These vectors are commonly represented as a single, bi-directional vector even though that is definitely NOT the same thing from the standpoint of a formal proof. A CSDG always has unidirectional vectors that point away from the root, with the understanding that they can and should be inverted if both necerssary and possible (obviously this is irrelevant if the vector or either node can be pruned).
Mathematical graphs are always treated as sets, and as such can only have set operations applied to them. A CSDG on the other hand can be treated as a set (in which case you need to apply special handling to deal with the unique representation of the vectors), but in order to avoid needless processing they can also be handled as a linked list.
I will conceed that Wikipedia considers mathematical graphs and CSDG as interchangable, and in an ideal world they should be, but I have to solve real problems with real hardware... For the purpose of solving the kinds of problems that I use a CSDG to solve, a mathematical graph would either consume more RAM than any machine I've got, execute longer than I'll live, or both. I'll live with the limitations of what I was taught until I can afford hardware that will let me use the purer solution.
-PatP|||Christmas trees don't have roots.
Shoe trees don't have roots.
they did at one time, they were just pruned!
:)|||i have never heard of a CSDG. then again I've never taken any CS courses. :)
when i need to store a graph in code, i'll use either an adjacency list or an adjacency matrix. former is more space efficient but slower on lookups. both are covered in most books on data structures.
No comments:
Post a Comment