
Preface

Back in February, fmt01 at Hurricane Electric suffered a power loss event (I believe?), and ALL servers were offline for 4 hours (8:00am-12:00pm). Of course, as a big ISP, they refunded you and said "Sorry":

credit memo

It would be fine if I were just running my own services, because

  1. Dermail has a built-in failure mode, and no emails will be lost just because the central processing API is down.
  2. DNS is deployed throughout the world (currently 9 hosts), so a single-region outage is not an issue.
  3. Up to two (out of five) physical nodes can fail in my deployment and everything will still be fine (except storage, which is currently two independent points of failure). Each node can be rebooted and state will be synchronized automatically.
  4. Plus, we have the amazing trio of KVM, CoreOS, and Kubernetes.
  5. So within fmt01, things can go down.

However, problems arise when you are running client-facing services, because

  1. If fmt01 does go down, then clients will see a total blackout event.
  2. It's not good when your entire datacenter goes down while enrollment is happening and students are actively trying to use your website.

I will admit it: it's my fault when the service goes down. Remember back in 2017 when the internet's hard drive died and countless websites went fucked? Yeah, I remember. I was still in college, and we couldn't submit our homework, because the vendor was too cheap to use multiple AZs.

Still, as a matter of strict principle, no vendor lock-in is a requirement for my services, so I refused to touch Amazon AWS for the longest time. However, the power outage incident back in February got me thinking: what would it take to make SlugSurvival resilient to a datacenter failure?

Sh*t Hits the Fan, now what?

Architecture Redesign

When I was designing the components for SlugSurvival two years ago, I had a rough idea of how to make each component resilient to failure, at least within a datacenter. However, it was still a somewhat monolithic application, where the SlugSurvival frontend does its own thing, and SlugSurvival Notify does its own thing AND stores data AND serves data, etc.

So, looking for inspiration, I found this talk, this talk, and of course this talk.

In my mind, I had a fairly clear idea of what I wanted to do.

  1. I want to separate the persistence layer
  2. I want to separate the business logic layer
  3. I want to separate the presentation layer
  4. Somehow connect all those layers
  5. Somehow all those layers need to be geographically redundant

Persistence Layer

Since I don't have terabytes of data to store, a database that requires minimal administrative overhead and has built-in geographical replication would be an excellent choice. I've personally used Percona before (in fact, for the DNS), and the new features in 5.7 make it an even more appealing choice for my application.

SlugSurvival has lots of reads and writes, but clients don't care about write speed; they want reads to be as fast as possible. The backend service cares about write speed, and it also wants reads to be as fast as possible, yet it requires strong consistency. Galera WAN replication (in its Percona flavor) is the perfect choice for this: local reads and writes; write once, replicate to all; either the write lands everywhere or nowhere.

Connecting the Dots

If we are going to be highly available, this is obviously heading into microservices territory. What's the best messaging system, you ask? Look no further: nats.io.

nodeJS is obviously the language of choice here. Now I need a microservice framework. Look no further: moleculer IS my personal favorite. In fact, I like the features and architecture of the framework so much that I became a Patreon supporter of the project.


We Have a Winner

architecture

Whirlpool

Whirlpool is the persistence layer of the entire system. It is responsible for the Percona instance (as of right now) and a compute service that fronts access to the database. You want information about a user? Make a service call. You want to insert the latest enrollment numbers? Make a service call. It is also responsible for gnatsd (since it is more logical to host the "hub" closer to where your data is).
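
To make this concrete, here is a minimal sketch (not the actual Whirlpool code) of a persistence service in moleculer talking over NATS; the action name, table, hostname, and the mysql2 pool are illustrative assumptions:

const { ServiceBroker } = require('moleculer');
const mysql = require('mysql2/promise');

const broker = new ServiceBroker({
    // the "hub"; hosted next to the data, as noted above
    transporter: 'nats://gnatsd.internal:4222'
});

broker.createService({
    name: 'whirlpool',
    started() {
        // illustrative: a local Percona/MySQL pool queried on behalf of callers
        this.pool = mysql.createPool({ host: '127.0.0.1', user: 'slug', database: 'slugsurvival' });
    },
    actions: {
        async getUser(ctx) {
            const [rows] = await this.pool.query('SELECT * FROM users WHERE id = ?', [ctx.params.id]);
            return rows[0];
        }
    }
});

broker.start();
// anywhere else in the system: broker.call('whirlpool.getUser', { id: 42 })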

The cool thing about this is that you can replace "Percona" with anything else that does replication (such as Cassandra), and the architecture stays the same; service calls will still return the same data. This also has the added benefit of allowing multiple types of databases within the system (SlugSurvival uses Percona; Dermail would use Cassandra, if in the future I do decide to move Dermail off RethinkDB).

Messier

Messier is the "microservices" part of the system. It is stateless, it supports caching, and it does all the logic.

Andromeda

Andromeda is the user-facing end of the system, exposed as a web endpoint. Routing is done here, and static files will be served here (coming soon). Requests are forwarded to Messier, with results optionally cached as well.

Though, Andromeda does break my principle somewhat: DNS is currently powered by Route 53 for latency-based and weighted routing plus health checks. However, I can convince myself that DNS is not strictly vendor lock-in, so that would be fine...

(moleculer)

The microservice framework is great and all, but for my application, I need a little more than what it offers. Currently, there are 7 instances of Whirlpool running across the continental United States, and sometimes a request gets routed to the East Coast even when the user is on the West Coast.

Thus, my fork of the upstream adds a LatencyStrategy module. On the surface, it measures the network latency between A and B. However, since the measurement has to go through gnatsd and the broker, it also indirectly measures application/response latency. It is possible that even though a user is on the West Coast, the West Coast servers are so overloaded that routing the request to the East Coast is the better option.
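
The fork itself lives upstream, but the rough shape of a custom moleculer strategy looks like the sketch below. The latency bookkeeping here is a stand-in for the real ping-over-gnatsd measurements:

const { Strategies } = require('moleculer');

class LatencyStrategy extends Strategies.Base {
    constructor(registry, broker, opts) {
        super(registry, broker, opts);
        this.latency = new Map(); // nodeID -> last measured round-trip in ms
    }

    // called from elsewhere whenever a ping result comes back through the broker
    updateLatency(nodeID, ms) {
        this.latency.set(nodeID, ms);
    }

    // pick the endpoint whose node currently has the lowest measured latency
    select(list) {
        let best = null;
        let bestMs = Infinity;
        for (const ep of list) {
            const ms = this.latency.get(ep.node.id);
            if (ms !== undefined && ms < bestMs) {
                best = ep;
                bestMs = ms;
            }
        }
        // no measurements yet: fall back to a random endpoint
        return best || list[Math.floor(Math.random() * list.length)];
    }
}

// registered via: new ServiceBroker({ registry: { strategy: LatencyStrategy } })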


Migration

Actually, it was worse than I expected. Moving from NoSQL (RethinkDB) to SQL (Percona) was such a painful thing to do... The tracking table has ~600,000 rows right now (and it will gain ~200,000 rows every quarter), thus optimization is a must here. This website is great if you are into that sort of stuff...

As my co-worker said:

You might be the first young engineer that I know who is trying to go from NoSQL to SQL.

I learned a lot of things.

  1. I suddenly have a lot more respect for DBAs.
  2. My SQL class in college was just barely enough to get me going.
  3. Weakly typed languages (yeah, JavaScript, I'm looking at you) are painful when you are designing a schema.

Conclusion

yeah me

Most of SlugSurvival is now running on this new distributed system; some components still require migration to a less monolithic architecture. But now, at least when the entire datacenter goes out again, the website will still be functional.

Oct 15 2016

What the Hell Is This?

It is basically a glorified calendar web app written in VueJS (no backend) that helps you (students) plan and search your classes better.

Here's a TL;DR page for you.

What the Hell? Why Reinvent the Wheel?

Well, sometimes the school's AIS is too slow for my liking. Also, I have always dreamed of being able to enroll in classes with ease. However, a typical process of enrolling or checking for classes has always been like this:

  1. Search your classes on the AIS (Now they have a better interface)
  2. Add the classes that you want to the shopping cart
  3. Oh wait, before you can do that, select the section that you want
  4. Then you add your class+section to the shopping cart
  5. Add classes *click*
  6. Ah shit, class conflicts
  7. Repeat

Finding your classes should not be that hard.

OK, So It's a Calendar App, No Big Deal, Right?

You are wrong.

For starters, where are you getting the data? It was sort of impossible before, until the school rolled out a better interface on PISA in 2015, one that uses Bootstrap and is actually human-readable. Now we can use all kinds of crazy DOM parsing to find the class data.

Of course, me being me, I always write spaghetti code first and fix it later. This is how it looks right now:

// Walk the parsed DOM for one section row: children[1] is split on spaces;
// the first token carries the class number, the third the section label
split = sectionDom[i].children[1].children[0].data.split(' ');
section.num = split[0].match(/\d+/g)[0];
section.sec = split[2];
section.loct = [
    {
        t: classDataCompatibleTime,
        loc: sectionDom[i].children[7].children[0].data.replace('Loc: ', '')
    }
]

// children[5] holds the instructor; for children[9], keep the part
// after the last '/' as the capacity
section.ins = sectionDom[i].children[5].children[0].data.trim();
section.cap = sectionDom[i].children[9].children[0].data.substring(sectionDom[i].children[9].children[0].data.lastIndexOf('/') + 1).trim();
sections.push(section);

Well, don't worry about it; it gets the job done, at least for now. I will use prev() and next() and whatnot when I actually have time to improve the code base.

Fine, So What? Does Your App Enroll Classes For Users Too? HACKS?!

No, it does not enroll users automatically.

Well, no shit, Sherlock. It involves student credentials, and I don't want to fuck with that.

SO YOUR APP DOES SHITS

Calm down. It will notify you when your classes open up. Basically, I have a dispatcher and a bunch of workers that poll data from the website and insert the changes into the database. It does all sorts of magical stuff in the background. Allow me to explain:

Architecture of SlugSurvival

  1. The Data Fetcher will periodically compare the term list on S3 and PISA (usually every 4 or 7 days). If there are new terms available, it will fetch the newest term automatically and upload the course data to S3. If there are no new terms, it will keep refreshing the data for the current quarter until the drop deadline, since there are usually changes to the courses up until then.
  2. The Data Fetcher will also spawn workers periodically to fetch data from RateMyProfessors (usually every 14 days), and only does so incrementally.
  3. Then, the frontend loads the data from S3, and you will see something like this: http://url.sc/fall2016
  4. The real MVP here is the watcher. It polls openings data from PISA and inserts the changes into RethinkDB, and db2Queue has a changefeed to push the delta to another queue (see the sketch after this list), where the notification API can notify students about their class openings.
  5. Of course, when you have time series data, you should graph it.
  6. At the time of writing, I'm still trying to improve the automation aspects of the notification component (tracked here). But the idea is to unsubscribe users automatically after the drop deadline, etc.
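
Here is a minimal sketch of the db2Queue idea from step 4: follow a RethinkDB changefeed on the tracking table and push each delta onto a queue for the notification API. The table and queue names are made up for illustration; the real code differs.

const r = require('rethinkdb');
const Queue = require('bull');

const notifyQueue = new Queue('classOpenings', 'redis://127.0.0.1:6379');

r.connect({ host: '127.0.0.1', port: 28015, db: 'slugsurvival' }).then(conn => {
    return r.table('tracking').changes().run(conn);
}).then(cursor => {
    cursor.each((err, change) => {
        if (err) throw err;
        // change.old_val / change.new_val carry the delta for this row
        notifyQueue.add(change);
    });
});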

So yeah, this is sort of a big project in terms of reliability and automation requirements. I do want to talk to the school and see if they want to use this as part of the AIS.

I will update this post when I have more time and more changes made.

The last "guide" was pretty shitty, I know. But, fear not, I will save you.

Step 1: Consider your architecture

When I was writing Dermail, application-level and infrastructure-level redundancy and failover were a big consideration. Of course, being a young programmer, I will make mistakes. Please open an issue on Github or send me an email when you encounter problems while trying out Dermail.

Anyway, architecture. Dermail runs on three major pieces of software:

  1. node.js
  2. RethinkDB
  3. Redis

Therefore, any one of the three components can be made redundant.

RethinkDB

RethinkDB provides a dead simple approach for redundancy. Read RethinkDB's official documents on Scaling, sharding and replication, Architecture FAQ, and Failover.

You can literally control shards and replicas in the WebUI, and you decide how safe you want your data to be. In my deployment, I have three instances running in a cluster, and each instance runs as a VM on a separate hypervisor. However, they are connected to shared storage, so there's my single point of failure. When I have more capital, I will consider investing in a SAN with multipath redundancy. But for now, this will have to do.

Redis

Come on, just Google how to set up master-slave Redis already.
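
(For the impatient: the gist is a single line in the replica's redis.conf pointing at the master. The IP below is obviously a placeholder, and newer Redis spells the directive replicaof.)

# redis.conf on the replica
slaveof 10.0.0.1 6379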

In my deployment, all Redis instances run solo, meaning there's no redundancy for the message queue. I hope to solve that problem by introducing some form of permanent storage, so that if the queue dies catastrophically, it can at least be restored manually.

node.js

All instances run in cluster mode (4 processes) by default, so redundancy and failover within a single box should be taken care of.
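
"Cluster mode" here is Node's built-in cluster module: fork a few workers and respawn them when they die, so one crashed worker doesn't take the whole instance down. A minimal sketch (./app.js is a placeholder for the actual server entry point):

const cluster = require('cluster');

if (cluster.isMaster) {
    for (let i = 0; i < 4; i++) cluster.fork();
    cluster.on('exit', (worker, code) => {
        console.log('worker ' + worker.process.pid + ' died (' + code + '), respawning');
        cluster.fork();
    });
} else {
    require('./app.js');
}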

However, in my application:

  1. The API runs behind two nginx servers (but without an HA setup), with two instances of its own.
  2. The MTA has three geographically independent instances.
  3. TX has two independent instances.
  4. Webmail has only one instance running.

The worst-case scenario would be: the API dies, the MTA somehow fails to save your incoming mail to be replayed to the API later, and then the MTA dies as well. In that case, the mail is lost permanently. However, that would require all 7 instances (nginx x2, API x2, MTA x3) to die simultaneously.

Step 2: Database (RethinkDB)

Please refer to the official documentation on how to install RethinkDB on your distro. I'm running Debian for its legendary stability.

If you do decide to run RethinkDB in a cluster, please consider running a proxy RethinkDB instance on your API server to alleviate network and CPU pressure.
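
Starting a proxy node is a one-liner; the join address is any member of your cluster (the hostname below is a placeholder):

rethinkdb proxy --join db-1.internal:29015 --bind all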

Step 3: Message Queue (Redis)

By default, apt-get install redis-server should be all you need. However, you do want to change redis.conf to control what happens when eviction occurs. You want allkeys-lru instead of a volatile-* policy, because Bull, the message queue implementation that I'm using, does not set expirations, and a volatile policy only evicts keys that have a TTL. Therefore, LRU over all keys is your best bet.
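
Concretely, that means something like this in redis.conf (the 256mb cap is an arbitrary example; size it to your box):

maxmemory 256mb
maxmemory-policy allkeys-lru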


From now on, assume that you want to use myemail.com as your domain, mx-1.myemail.com as your MX server, api.myemail.com as your API endpoint, tx-1.myemail.com as your TX helper endpoint, and web.myemail.com as your Webmail endpoint.

Step 4: SMTP Outbound, TX

This should be pretty straightforward. On your VPS/dedicated server,

  1. Clone the TX repo (Github, git.fm)
  2. npm install
  3. npm install -g forever
  4. forever start cluster.js

This concludes the installation of the TX component. You want to point tx-1.myemail.com to this server's IP, and it's best to set the rDNS of that IP to tx-1.myemail.com as well.

Step 5: Center of Operation, API

This requires more configuration, but it should be straightforward as well.

  1. Clone the API repo (Github, git.fm)

  2. npm install

  3. npm install -g pm2

  4. Install Redis on this server. Or if you prefer, you can put Redis on a separate server

  5. Install RethinkDB on this server. Or if you prefer, you can put RethinkDB on a separate server.

  6. Then, you will need to fill out the config.json:

    {
        "remoteSecret": "SECRET",
        "jwt": "ANOTHER SECRET",
        "gcm_api_key": "GCM KEY",
        "tx": [{
            "priority": 10,
            "hook": "https://TX-ENDPOINT/tx-hook"
        }],
        "s3": {
            "endpoint": "S3-ENDPOINT",
            "key": "S3-KEY",
            "secret": "S3-SECRET",
            "bucket": "S3-BUCKET"
        },
        "rethinkdb": {
            "host": "127.0.0.1",
            "port": 28015,
            "db": "dermail"
        },
        "redisQ": {
            "host": "127.0.0.1",
            "port": 6379
        }
    }

  7. remoteSecret is used to "authenticate" the MTA and other components. You will need this again later.

  8. jwt is your secret key for signing JWT tokens. Since Webmail is a single-page app, it uses JWT for authentication and authorization. Please make it a strong one.

  9. gcm_api_key is your Google Cloud Messaging API key. You will need it, and you will want it, for push notifications.

  10. tx lists your TX endpoints. Change the hook to the TX endpoint configured above, or you will not be able to send emails. It is an array of endpoints, so you can have multiple to avoid a single point of failure.

  11. s3 holds your S3 information. It can be AWS, it can be on-premises, but you will need an S3-compatible store somewhere for your attachments.

  12. rethinkdb and redisQ should be straightforward.

And you are almost done.

Setting up the database

Under the API repo, there should be a database.md. The easiest way is to copy the code and paste it into RethinkDB's Data Explorer, and then you are done with the database setup.

Or, here's the aggregated version. This assumes that your database is called dermail:

r.db('dermail').tableCreate('accounts', {primaryKey: 'accountId'})
r.db('dermail').tableCreate('addresses', {primaryKey: 'addressId'})
r.db('dermail').tableCreate('attachments', {primaryKey: 'attachmentId'})
r.db('dermail').tableCreate('domains', {primaryKey: 'domainId'})
r.db('dermail').tableCreate('folders', {primaryKey: 'folderId'})
r.db('dermail').tableCreate('pushSubscriptions', {primaryKey: 'userId'})
r.db('dermail').tableCreate('messageHeaders', {primaryKey: 'headerId'})
r.db('dermail').tableCreate('messages', {primaryKey: 'messageId'})
r.db('dermail').tableCreate('queue', {primaryKey: 'queueId'})
r.db('dermail').tableCreate('users', {primaryKey: 'userId'})
r.db('dermail').tableCreate('filters', {primaryKey: 'filterId'})
r.db('dermail').tableCreate('payload', {primaryKey: 'endpoint'})

Then, for the indices:

r.db('dermail').table("domains").indexCreate("domain")
r.db('dermail').table("domains").indexCreate("alias", {multi: true})
r.db('dermail').table("users").indexCreate("username")
r.db('dermail').table("accounts").indexCreate("userId")
r.db('dermail').table("folders").indexCreate("accountId")
r.db('dermail').table("messages").indexCreate("accountId")
r.db('dermail').table("queue").indexCreate("userId")
r.db('dermail').table("filters").indexCreate("accountId")
r.db('dermail').table("attachments").indexCreate("checksum")
r.db('dermail').table('messages').indexCreate('folderDate', [ r.row('folderId'), r.row('date')])
r.db('dermail').table('messages').indexCreate('unreadCount', [r.row('folderId'), r.row('isRead')])
r.db('dermail').table('accounts').indexCreate('userAccountMapping', [ r.row('userId'), r.row('accountId')])
r.db('dermail').table('messages').indexCreate('messageAccountMapping', [r.row('messageId'), r.row('accountId')])
r.db('dermail').table('folders').indexCreate('accountFolderMapping', [ r.row('accountId'), r.row('folderId')])
r.db('dermail').table('accounts').indexCreate('accountDomainId', [ r.row('account'), r.row('domainId')])
r.db('dermail').table('addresses').indexCreate('accountDomain', [ r.row('account'), r.row('domain')])
r.db('dermail').table('addresses').indexCreate('accountDomainAccountId', [ r.row('account'), r.row('domain'), r.row('accountId')])
r.db('dermail').table('folders').indexCreate('accountIdInbox', [ r.row('accountId'), r.row('displayName')])
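
As a sanity check that the compound indices do what we want, here is the kind of query they are built for: all messages in one folder, newest first, served entirely off the folderDate index (folderId here is a placeholder value):

r.db('dermail').table('messages')
    .between([folderId, r.minval], [folderId, r.maxval], {index: 'folderDate'})
    .orderBy({index: r.desc('folderDate')})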

Then, run pm2 start app.json, and your API should be up and running.

Reverse Proxy (nginx)

Please refer to the pro tips.

This concludes the installation of the API component. You want to point api.myemail.com to the reverse proxy's IP, and it's best to set the rDNS of that IP to api.myemail.com as well.

Step 6: Create a first user

Refer to firstUser.js under usefulScripts/ in the API repo, and change the information accordingly. For the bcrypt hash, refer to the screenshot bcrypt.png. After you have changed the information, run node usefulScripts/firstUser.js to create your very first user. You can't do anything yet, though, because you don't have everything set up.
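
If you don't feel like squinting at a screenshot, one way to generate the hash yourself (assuming the bcrypt npm package; 10 is a reasonable cost factor):

const bcrypt = require('bcrypt');
bcrypt.hash('your-password-here', 10).then(hash => console.log(hash));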

Step 7: SMTP Inbound, MTA

This should be pretty straightforward. On your VPS/dedicated server,

  1. Clone the MTA repo (Github, git.fm)

  2. npm install

  3. npm install -g pm2

  4. Install Redis on this server. Or if you prefer, you can put Redis on a separate server

  5. Then, you will need to fill out the config.json:

    {
        "redisQ": {
            "host": "127.0.0.1",
            "port": 6379
        },
        "remoteSecret": "SECRET",
        "apiEndpoint": "https://API"
    }

  6. redisQ should be straightforward.

  7. remoteSecret must match the one in the API's config.

  8. apiEndpoint is the API endpoint set up earlier. Change it to the endpoint configured above, or you will not receive emails. IT MUST BE HTTPS. For example, https://api.myemail.com

Then, run pm2 start app.json, and your MTA should be up and running.

This concludes the installation of the MTA component. You want to point mx-1.myemail.com to this server's IP, and it's best to set the rDNS of that IP to mx-1.myemail.com as well.

Step 8: Interface, Webmail

Unfortunately, Dermail does not have IMAP/POP3 capability yet, although it is on the roadmap. For now, the only way to interact with Dermail is either the Webmail or the API.

This step sets up the Webmail of the Dermail system. Webmail requests resources via the API, so it is merely doing the rendering.

  1. Clone the Webmail repo (Github, git.fm)

  2. npm install

  3. npm install -g pm2

  4. Then, you will need to fill out the config.json:

    {
        "port": 2001,
        "apiEndpoint": "API",
        "siteURL": "DOMAIN"
    }

  5. apiEndpoint is the API endpoint set up earlier. Change API to the endpoint configured above. For example, https://api.myemail.com

  6. The default port is 2001. Point your reverse proxy at it.

  7. siteURL is where your webmail is accessible, WITHOUT TRAILING SLASHES. For example, https://web.myemail.com

Reverse Proxy (nginx)

Come on, just Google this already.
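
Fine, if you insist, a minimal server block looks something like this (certificate paths are placeholders, and the same idea applies to the API's reverse proxy):

server {
    listen 443 ssl;
    server_name web.myemail.com;
    ssl_certificate     /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:2001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}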

This concludes the installation of the Webmail component. You want to point web.myemail.com to the reverse proxy's IP, and it's best to set the rDNS of that IP to web.myemail.com as well.

Step 9: Profit

Your instance of Dermail is now up and running. Send a first email from another email address to your Dermail address (assuming that you have set up the MX record correctly), and you should see a new email in your Inbox!

If you run into problems, again, open an issue on Github, or send me an email.
