Project Walkthrough: Cloud Computing Social Networking Timeline

This is an individual project from CMU 15619 Cloud Computing. Its full name is Social Networking Timeline with Heterogeneous Back-ends, and it implements the back end of a simplified Twitter on top of MySQL, HBase, and MongoDB.

Implementing Basic Login with MySQL on RDS

Configure MySQL on AWS RDS and import the users.csv and userinfo.csv datasets.

Datasets:

  • users.csv [UserID, Password]
  • userinfo.csv [UserID, Name, Profile Image URL]

If the user ID and password are correct, return the user's name and Profile Image URL.
If they are not, return "Unauthorized" as the name and "#" as the Profile Image URL.

Request:

GET /task1?id=[UserID]&pwd=[Password]

Response:

returnRes({"name":"my_name", "profile":"profile_image_url"})
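
The login check above can be sketched as a single join over the two tables. This is only a minimal illustration: the table and column names (`users`, `userinfo`, `user_id`, `password`, `name`, `profile`) are assumptions, and an in-memory SQLite database stands in for the MySQL instance on RDS so that the example runs on its own.

```python
import json
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the RDS MySQL database (schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, password TEXT);
        CREATE TABLE userinfo (user_id INTEGER PRIMARY KEY, name TEXT, profile TEXT);
        """
    )
    conn.execute("INSERT INTO users VALUES (1, 'secret')")
    conn.execute(
        "INSERT INTO userinfo VALUES (1, 'Alice', 'http://example.com/a.png')"
    )
    return conn


def login(conn, user_id, password):
    """Return the returnRes(...) payload for a /task1 request."""
    # Parameterized query: the id and password never get spliced into the SQL text.
    row = conn.execute(
        "SELECT i.name, i.profile FROM users u "
        "JOIN userinfo i ON i.user_id = u.user_id "
        "WHERE u.user_id = ? AND u.password = ?",
        (user_id, password),
    ).fetchone()
    # No matching row means a wrong id or password.
    name, profile = row if row else ("Unauthorized", "#")
    return "returnRes(" + json.dumps({"name": name, "profile": profile}) + ")"
```

Calling `login(build_demo_db(), 1, "wrong")` falls through to the "Unauthorized" / "#" pair described above.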

Storing Social Graph using HBase

Dataset:

  • links.csv [Followee, Follower]

Sort the followers by the following rules:

  1. By name, ascending
  2. By Profile Image URL, ascending

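Implementation:
Look up the followers of the given userid in HBase, then fetch each follower's name and profile URL from MySQL by follower userid and sort the results.

The question here is how to design the HBase table for the best performance. One option:
preprocess the dataset by sorting on followee and then on follower, and merge the rows into [Followee, FollowerList], with followee as the rowkey. All followers of a user can then be read with a single lookup by rowkey.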
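
A rough sketch of that preprocessing step, assuming links.csv has no header row and that user IDs are integers (neither is stated above). The function name `build_follower_rows` is made up for illustration; loading the rows into HBase is left out.

```python
import csv
from collections import defaultdict


def build_follower_rows(lines):
    """Group [Followee, Follower] pairs into sorted (followee, follower_list) rows."""
    grouped = defaultdict(set)  # a set drops duplicate links
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        followee, follower = int(row[0]), int(row[1])
        grouped[followee].add(follower)
    # Sort by followee, then sort each follower list.
    return [(followee, sorted(grouped[followee])) for followee in sorted(grouped)]
```

Each returned tuple corresponds to one HBase row: the first element is the rowkey and the second is the value to store, for example as a comma-separated string.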

Request:

GET /task2?id=[UserID]

Response:

{"followers":[{"name":"follower_name_1", "profile":"profile_image_url_1"}, {"name":"follower_name_2", "profile":"profile_image_url_2"}, ...]}
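
As a hedged illustration of the sorting rules and response shape, the sketch below takes a follower ID list (as read from HBase) and a `userinfo` mapping (as read from MySQL) and builds the JSON body. Both inputs are plain Python stand-ins, and the names `followers_response` and `userinfo` are assumptions. Python compares strings by code point, which may not match the collation of a real MySQL `ORDER BY`.

```python
import json


def followers_response(follower_ids, userinfo):
    """Build the /task2 JSON body from follower IDs and a uid -> info mapping."""
    people = [userinfo[uid] for uid in follower_ids if uid in userinfo]
    # Rule 1: name ascending; rule 2: Profile Image URL ascending as the tie-breaker.
    people.sort(key=lambda p: (p["name"], p["profile"]))
    return json.dumps(
        {"followers": [{"name": p["name"], "profile": p["profile"]} for p in people]}
    )
```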

Build Homepage using MongoDB

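This is again a question of HBase table design. The task is to find the followees of a given userid and then the posts of those followees. To improve performance:
preprocess the dataset by sorting on follower and then on followee, and merge the rows into [Follower, FolloweeList].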

Dataset:
posts.csv
{
"pid":xxx, // PostID
"uid":xxx, // UserID of poster
"name":"xxx", // User name of poster
"profile":"xxx", // Poster profile image URL
"timestamp":"YYYY-MM-DD HH:MM:SS", // When post is posted
"image":"xxx", // Post image
"content":"xxx", // Post text content
"comments":[ // comments json array
{
"uid":xxx, // UserID of commenter
"name":"xxx", // User name of commenter
"profile":"xxx", // Commenter profile image URL
"timestamp":"YYYY-MM-DD HH:MM:SS", // When comment is made
"content":"xxx" // Comment text content
},
{
"uid":xxx,
...
},
...
]
}

Request:

GET /task3?id=[UserID]

Response:

{"posts":[{post1_json}, {post2_json}, ...]}

Put Everything Together

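Show the 30 most recent posts from the people the user follows.

Sort the followers by the following rules:

  1. By name, ascending
  2. By Profile Image URL, ascending

Sort the 30 most recent posts by the following rules:

  1. By timestamp, ascending
  2. By pid (PostID), ascending

If there are fewer than 30 posts, return all of them.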
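
One way to read the post rules is "pick the 30 newest, then return them oldest first". The sketch below follows that reading on an in-memory list of post dicts. It relies on the "YYYY-MM-DD HH:MM:SS" format sorting chronologically as plain strings, and it assumes that ties on timestamp at the cut-off are broken by pid, which the rules above do not spell out. The function name `latest_posts` is hypothetical.

```python
import heapq


def post_key(post):
    """Sort key: timestamp first, pid as the tie-breaker."""
    return (post["timestamp"], post["pid"])


def latest_posts(posts, limit=30):
    """Return the `limit` newest posts, ordered ascending by timestamp then pid."""
    newest = heapq.nlargest(limit, posts, key=post_key)  # newest first
    return sorted(newest, key=post_key)  # flip to ascending order
```

With fewer than 30 posts, `heapq.nlargest` simply returns all of them, which matches the last rule.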

Sample Request:

http://backend-public-dns:8080/MiniSite/task4?id=99

Sample Response:

returnRes({
"followers": [
{
"name": "Alastair Moock",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/61ed34b1f6bcd5498d888e3c2a1768.png"
},
{
"name": "Amr Diab",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/507e08e097e49ffaa584b988748180.png"
},
{
"name": "CJ Bolland",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/699bb8d37e61d66750e58fd1513637.png"
},
... (more followers omitted)
],
"name": "Accent",
"posts": [
{
"content": "Wow, just experienced Screamers (2006)",
"timestamp": "2015-08-07 19:06:57",
"uid": 2587,
"_id": {
"$oid": "56b06bde2fa550d2061f30c2"
},
"name": "Beggars Opera",
"image": "http://cmucloudsocial.s3.amazonaws.com/posts/Screamers_2006_.png",
"pid": 156154,
"comments": [
{
"uid": 34190,
"timestamp": "2015-10-21 13:57:59",
"content": "I have seen this movie on starz, I regret to say that I was not lucky enough to have watched it while the screening took place. This documentary follows the one and only System Of A Down, throughout several locations from LA to Europe, while diggin deep in history and showing the truth about the forgotten genocide. This movie includes interviews with experts and some of the survivors of the genocide. This sad story of human history also follows the massacre that took place in africa during 2004 while the whole world stood watching, idle in the face of massive death. To conclude this, several SOAD fans won't be dissapointed by the extensive repertoire of songs played throughout the film. I'm glad I'm finally getting the DVD after a long wait, I hope you feel the same way.",
"name": "Youngster",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/6a75f0eea78b8cd2c0ae4da2f85f34.png"
},
{
"uid": 27184,
"timestamp": "2015-10-21 21:36:12",
"content": "Great movie for fans of System of a Down. Better yet, this is a great movie documenting genocide in general, and the Armenian genocide in particular. I highly recommend this movie. It is a must see. Share the movie with friends, family, and members of your local community. Everyone will thank you for it. Very eye-opening experience.",
"name": "Shirobon",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/539869e8b863511771c3b0b5e13d94.png"
},
... (more comments omitted)
],
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/8f64d02d77cf734ddde87b7832ca76.png"
},
{
"content": "Wow, just experienced The Tube (2004)",
"timestamp": "2015-08-11 02:42:40",
"uid": 357,
"_id": {
"$oid": "56b06be12fa550d2061f706a"
},
"name": "Agoria",
"image": "http://cmucloudsocial.s3.amazonaws.com/posts/The_Tube_2004_.png",
"pid": 175927,
"comments": [
{
"uid": 29700,
"timestamp": "2015-09-15 10:20:44",
"content": "A FORMER GOVERNMENT AGENT HOLDS A TRAIN HOSTAGE WITH A BOMB THAT'LL BLOW UP IF THE TRAIN STOPS AND IT'S UP TO A DETECTIVE TO STOP HIM AND FIND A WAY TO SAVE THE LIVES OF THE PASSENGERS. WHAT WE HAVE HERE IS BASICALLY ANOTHER IMITATION OF ''SPEED''. THE DIALOGUE IS LAUGHABLE AND THE ACTION [WHICH THERE IS PLENTY OF] IS NOT REALLY THAT ENTERTAINING. THE ACTING IS ALSO PRETTY BAD, BUT THE MOVIE TENDS TO SHOW A FEW SIGNS OF LIFE IN THE LAST 30 MINUTES. IF YOU'RE AN ACTION FAN [LIKE ME] AND YOU'RE CURIOUS ABOUT THIS MOVIE, RENT IT. BUT DON'T BUY IT. ON THIS DVD, YOU HAVE THE CHOICE OF WATCHING THIS MOVIE DUBBED IN EITHER ENGLISH OR FRENCH OR YOU CAN WATCH THIS MOVIE IN ITS ORIGINAL LANGUAGE, WHICH IS KOREAN.",
"name": "Teimoso",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/a3a08f80ee44be2438293a04d2f9f2.png"
},
{
"uid": 31346,
"timestamp": "2015-10-12 06:16:52",
"content": "A Hollywood movie went out into the world, traveled to Korea, got assimilated and regurgitated, and now it returns to our shores as this. The studios know it and advertise it using reviews that cast it as the Korean version of Speed. It also \"borrows\" a score straight from Hans Zimmer's work for The Rock, and the main actor looks and acts like Chow Yun Fat light. It's discouraging to see Korean cinema paying homage to American action flicks when it has so many more interesting stories to tell. At least Woon-Hak Baek's first feature, Shiri, spoke in a unique voice and told a story personal to the Korean experience. This is a step backwards for him.On the other hand, this movie composite of so many action movies we've seen before is fascinating in its skewed familiarity. It's not terrible; the production values are high, the acting occasionally thrilling, the one-liners sometimes amusing. It's no more or less diverting than the average Hollywood Die Hard knockoff. I think of it as top notch karaoke, like American Idol. In the proper context, it's impressive.In the grand scheme of things, though, it's depressing, especially when Korean directors like Chan-wook Park are producing such unique and energetic work.",
"name": "The Veronicas",
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/1bd4dc1a4ca7a49daf53ca9a03735e.png"
},
... (more comments omitted)
],
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/329125b52f0db7d04ed8828b5eccac.png"
},
... (more posts omitted)
],
"profile": "https://cmucloudsocial.s3.amazonaws.com/profiles/8e8a1b156037ed1ecfba40b917084e.png"
})

Basic Recommendation

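Recommend 10 users for a given userid.
Algorithm: user-based collaborative filtering.
E.g.
assume A follows {B, C, D}.
Followee B follows {C, E, A},
followee C follows {F, G} and
followee D follows {G, H}.

Scores: {G: 2, E: 1, F: 1, H: 1}. Each candidate scores one point for every followee of A that follows them; A itself and the users A already follows (here C) are excluded.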

Sorting rules:

  1. By score, descending
  2. By user id, ascending

If there are fewer than 10 users, return all of them.
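
The scoring and ranking can be sketched in a few lines. Here `follows` is a plain dict mapping each user to the set of users they follow, a stand-in for the lookups against the real back end, and the name `recommend` is made up for illustration. It returns (user, score) pairs; mapping them to name and profile is left out.

```python
from collections import Counter


def recommend(user, follows, limit=10):
    """Rank followees-of-followees for `user` by score desc, then user id asc."""
    followees = follows.get(user, set())
    scores = Counter()
    for followee in followees:
        for candidate in follows.get(followee, set()):
            # Skip the user themselves and anyone they already follow.
            if candidate != user and candidate not in followees:
                scores[candidate] += 1
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]
```

On the example above, `recommend("A", {"A": {"B", "C", "D"}, "B": {"C", "E", "A"}, "C": {"F", "G"}, "D": {"G", "H"}})` gives G first with a score of 2, followed by E, F and H with a score of 1 each.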

Request:

GET /task5?id=[UserID]

Response:

returnRes({"recommendation":[{name:, profile:},{name:, profile:},...,{name:, profile:}]})
