krishnakanth ks krishnakanth ks - 2 months ago 19
JSON Question

Schema design for Dynamo Db table or Equivalent Nosql Db for Discussion Forum

i am interested in designing a simple discussion forum where users can post a thread,comment on the thread and reply to a particular comment.

I am finding it very difficult to come up with schema/database model to implement the comment and reply functionality

Schema for Users and threads

name : "Krishnakanth",
username : "krish", (-)
password : "rbswew424",
age : 21,
hometown : "abc",

name : "someName",
threadId : "someID",(-)
username : "someName,
timeOfCreation : "time in some standard format" (--)

  • (-) => Partition Key*

  • (--) => Sort Key*

  • kindly help to come up with a schema design to implement the comment and reply functionality*

if you are from a non Dynamo db background just post the JSON format for the comment and reply functionality..!

Thanks in advance.


I have come up with 2 different approaches both have its own pros and cos,

1) Schema without StringSet

threadMaster(threadId-hash, timestamp, otherAttributes)
threadDetails(threadId- hash, replyId-range, timestamp, otherAttributes)

Data in this schema would look like:

t1, 31-11-2016:10:30
t2, 31-11-2016:09:34
t3, 31-11-2016:11:30

t1, r1, 31-11-2016:10:33
t1, r2, 31-11-2016:11:09
t1, r3, 31-11-2016:13:20
r1, r1, 31-11-2016:10:38  **
r1, r2, 31-11-2016:10:44  **

In above schema, theadMaster will hold master threadId which will be passed to detail table to get all the reply for that particular threadId.

**Your reply in-turn have reply that will be maintained in detail table, you can add another attribute in detail table with a true/false value which will show that if this comment has reply or not to avoid scanning table.

The problem with above schema is if one of the thread has millions of reply then there will be an impact on performance.

2) With StringSet

threadMaster(threadId-hash, replyId(Set of Ids), timestamp, otherAttributes)
threadDetails(replyId- hash, timestamp, otherAttributes)

Data in this schema would look like:

t1, [r1,r2,r3..rn], 31-10-2016:10:18
t2, [r11,r12,r13..r1n], 11-11-2016:20:00
t3, [r21,r22,r33..r2n], 21-11-2016:00:30
r4, [r99,r98]  **

r1 31-11-2016:10:30
r2 31-11-2016:11:20
r99 01-11-2016:11:20

In above schema, you master table will have the information about the thread and its replies.

** When there is a reply to you comment you will make another entry in master as a thread itself.

The problem with above schema is that add/remove comment will be tedious, with each reply you have to update master thread.

You can use timestamp attribute at the application level to sort the comments accordingly.

Hope that helps