Natural Key vs. Surrogate Key - Userid in Social Networks
Almost all of us are members of several social networks: Facebook, Bebo, Myspace, Hi5, etc. We ARE users. A user is a real entity - a real person.
One of the problems is how to identify a user uniquely across the internet.
There are two approaches to identifying an entity, both inherited from the relational database model.
Natural Key
Natural Key - a key which is derived from (or is the same as) one or more existing unique attributes of the entity. The key has to identify a real entity whose existence can be verified. The most common example is a person's national ID number, which uniquely identifies a person within one country. However, the ideal candidate for a person's natural key would be some fictional Global Universe ID. For social networks the best candidate so far is the user's EMAIL address, since:
- An email address is unique.
- An email address identifies a user uniquely across all common social networks. The same email represents the same user on Facebook, Bebo, Myspace, etc.
The problem: because of privacy concerns, the email address usually cannot be shared once a social network decides to open its platform and share user data. For example, the email address is not accessible through the Facebook API. The second issue is that the same user can have several email addresses, which makes identification less trivial. Should we use a combination of all of the user's emails to identify them uniquely, or just the primary email address?
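One way to handle a user with several addresses is to treat the primary email as the natural key and map every other address back to it. The sketch below only illustrates that idea: the names `normalize_email` and `EmailAliases` are hypothetical, and lowercasing the whole address is a simplifying assumption, not something every mail system guarantees.

```python
from typing import Dict, Iterable, Optional


def normalize_email(address: str) -> str:
    """Trim whitespace and lowercase the address (a simplifying assumption)."""
    return address.strip().lower()


class EmailAliases:
    """Maps every known address of a user to that user's primary address."""

    def __init__(self) -> None:
        self._primary_of: Dict[str, str] = {}

    def register(self, primary: str, others: Iterable[str] = ()) -> str:
        """Register a user's addresses and return the normalized primary one."""
        key = normalize_email(primary)
        addresses = [key] + [normalize_email(a) for a in others]
        # Check everything first so a conflict leaves the index unchanged.
        for address in addresses:
            existing = self._primary_of.get(address)
            if existing is not None and existing != key:
                raise ValueError(f"{address} already belongs to {existing}")
        for address in addresses:
            self._primary_of[address] = key
        return key

    def resolve(self, address: str) -> Optional[str]:
        """Return the primary address for any known address, else None."""
        return self._primary_of.get(normalize_email(address))


aliases = EmailAliases()
primary_key = aliases.register("Alice@Example.com", ["alice@work.example"])
resolved = aliases.resolve("  ALICE@work.example ")
unknown = aliases.resolve("bob@example.com")
```

Here `resolved` equals `primary_key`, so either address leads to the same user, while an address nobody registered resolves to `None`. This still relies on someone knowing that the two addresses belong to one person, which is exactly the information a platform may not be able to share.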
Surrogate Key
Surrogate Key - a generated key which is not related to the entity's attributes. It is usually an integer (INT or BIGINT), sometimes a GUID. For example, my userid on Facebook is 500491268 and on Bebo it is 5282127447.
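A generated key of this kind can be sketched with Python's built-in `sqlite3` module. The `users` table and its columns are made up for illustration and are not any real network's schema. The point is that the database hands out `user_id` itself, so the key stays the same even when the email changes.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        email   TEXT NOT NULL UNIQUE,               -- natural key candidate
        name    TEXT NOT NULL
    )
    """
)

# The caller supplies no key; the database generates one.
cursor = conn.execute(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ("alice@example.com", "Alice"),
)
user_id = cursor.lastrowid

# The email can change without touching the surrogate key.
conn.execute(
    "UPDATE users SET email = ? WHERE user_id = ?",
    ("alice@new.example", user_id),
)
row = conn.execute(
    "SELECT user_id, email FROM users WHERE user_id = ?",
    (user_id,),
).fetchone()
```

The generated `user_id` means something only inside this one database. Another database running the same schema would happily give the same number to a different person.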
"Because the surrogate key is completely unrelated to the data of the row to which it is attached, the key is disassociated from that row. Disassociated keys are unnatural to the application's world, resulting in an additional level of indirection from which to audit."
As my two user IDs show, the surrogate key is not unique across the internet. This becomes a real problem when social networks (and their competitors) open their platforms to 3rd parties and decide to share data. To identify a user uniquely, we need to extend the surrogate key with a second key that identifies the platform, but even then it remains impossible to associate the same user across different platforms.
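The extended key can be sketched as a (platform, userid) pair. The class name `PlatformUserKey` and the profile strings are hypothetical; the two IDs are the ones quoted earlier in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformUserKey:
    """Composite key: a platform identifier plus that platform's user id."""
    platform: str
    user_id: int


facebook_key = PlatformUserKey("facebook", 500491268)
bebo_key = PlatformUserKey("bebo", 5282127447)

# The composite key is unique across platforms, so both records can live
# in one shared store without colliding.
profiles = {
    facebook_key: "profile data from Facebook",
    bebo_key: "profile data from Bebo",
}


def same_person(a: PlatformUserKey, b: PlatformUserKey) -> bool:
    """With surrogate keys alone, key equality is the only evidence we have."""
    return a == b


linked = same_person(facebook_key, bebo_key)
```

Both keys belong to me, yet `linked` is `False`: nothing in either key says that the two accounts are the same person. Linking them would need some shared attribute, such as an email address, that both platforms are willing to expose.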
Conclusion
Recently, different social networks have been trying to dominate the social network space by developing Connect schemes that get more sites and applications to connect to them and to use their surrogate key to identify users. Recent examples are Facebook Connect and Google Friend Connect.
I think this approach will not survive in the long run. The solution is to create something like a Global Universe ID that identifies a person in the universe: globally and historically unique, and shareable. LOL
Friday, May 30, 2008 1:36 PM