Scaling the System at AR - Part 2 - Message Queue for Integration

28 March 2020
misc

It has been one and a half year since my first post about this topic :(

Continue from my previous post Scaling the System at AR - Part 1 - Data Pre-Computation, this time I’m going to talk about one of the most important component of the AR system: The Message Queue.

Message Queue is an asynchronous inter-service communication pattern. It is a temporary place to store the data, waiting for the message receiver to process. It encourages decoupling of logic and components in the system, provides a lightweight and unified protocol for communication between different services (written in different languages) and is perfectly suitable for Microservice design. A good message queue should satisfy these criteria

It must be fast and capable of handling a large amount of messages coming in at the same time.
It have to ensure the success of message processing. A message must be processed and retried until success. Otherwise, an Error queue (Dead Letter queue) should be provided to store the failed messages for later processing.
It is required that each message is processed by one and only one consumer at the same time.
The message queue should be independent from any languages and allow various applications written in different languages to send and receive messages without any problem.

Because the Message Queue is so important to us and there is a limit in number of developers, we decided to switch to third-party services after several months doing both dev and ops work with Kafka. Both Google Pub/Sub and AWS SQS offer the service in a relatively cheap price and you can choose either of them, depending on the Cloud platform that you are using. AWS SQS seems to be better since it offers a lot of functionalities around its SQS service, for example, mapping the Message events to Lambda, which allows us to save a lot of time working on the ops side and focus more on our business core value.

Currently, we are running 2 different systems on 2 different Cloud providers and we are using both solutions.

Con dao đa năng Victorinox đầu tiên của tôi

21 March 2020
misc

Vậy là sau một thời gian đắn đo, cuối cùng tôi cũng đã quyết định xuống tiền mua chiếc dao đa năng Victorinox đầu tiên. Bài blog này chỉ đơn giản là để khoe về chiếc dao đa năng Victorinox mới mua mà thôi 😤

Cảm nhận đầu tiên là nó nhỏ, rất nhỏ gọn luôn. Lúc xem specs trên mạng cũng đo ướm thử xem sao nhưng mà khi ra tới shop rồi mới tận mắt thấy, mấy mẫu 58mm rất đẹp và nhỏ gọn, đúng ý của mình luôn là đang cần tìm một mẫu dao đa năng treo chìa khóa được. Có cậu em đồng nghiệp trong công ty gợi ý là nên mua mấy cái như Nextool hầm hố, có đủ cả kìm kéo và size to luôn nhưng mục đích chính của mình là tìm một mẫu đeo móc chìa khóa trước, sau này thấy ổn sẽ mua thêm những mẫu khác sau.

Hộp đóng gói khá nhỏ gọn

Set up a personal Jenkins server with Nginx and Free Cloudflare SSL

02 January 2020
misc

Just a simple setup, just put here in case I need it in the future.

Yeah, I’m familiar with Jenkins and it has a bunch of useful utilities to automate my personal workflow, not just a simple build tool, for example, an automated task runner with familiar UI. This instruction is for Ubuntu 18.04 and AWS Lightsail but the same instructions are applied for all other VPS/Cloud services.

Bootstrap the server and install Jenkins

Create a new VPS on AWS Lightsail, choose an Ubuntu 18.04 server with any specs that you want.
Optionally: set up swap on the server if you have limited amount of RAM, following this guide but this can be done later.
Some Cloud providers (like AWS Lightsail) offer an extra layer of network security by blocking all the incoming traffic on all ports (except SSH and HTTP) by default. Since Jenkins will run on port 8080, you need to add that port to the allowed list

port

I've just set up my new home network with Tenda Nova Mesh

13 October 2019

and it is deadly simple

So, here is the problem. Although my apartment is just a small one and the distance between the Wifi router and my PC is also not far, they are separated by some wall layers.

Apartment

Async/Await in C# and Javascript

20 November 2018
.net javascript

I recently changed from the NodeJS team to work in the .Net team (in the same company). Coming back to C# after a long time, there are a lot of new stuffs. Actually, I used to hate .Net (simply because I hate using Windows :LOL:). But thing has changed. .Net Core can now run on non-Windows systems without any differences. It is becoming easier to develop .Net applications on Mac/Linux (using Jetbrains Rider like me or Visual Studio Community for Mac, which is a bad idea).

One interesting thing that I found in C# after a long time working in JS is the Async/Await operation, which simplifies asynchronous programming significantly. I heard that JS borrows the Async/Await idea from C#, so I decided to take a deeper look at the Async/Await operation in C# and compare it to the one in JS to see if there are any other things that C# is more successful at. There may be things that I was wrong about because I’m relatively new to C#.

Below is the comparison table between using Async/Await pattern in C# and JS. I also mentioned JS Generator because it can be applied pretty much in the same way as the one using Promise. Actually, it used to be an innovative way to solve asynchronous problems in JS before the birth of Async/Await. Many teams and products are still using it as the code base was developed many years ago. Today, Async/Await is the preferred way for handling asynchronous tasks in JS, leaving Generator back to its original purpose.

Symbol Tables and Binary Search Trees summary

23 September 2018
algorithm

Nothing special here. It’s just a blog post for summarising my algorithm learning course. Although this was already taught in the University, it’s still god to summarize here

1. Symbol Tables

Key-value pair abstraction.

Insert a value with specified key.
Given a key, search for the corresponding value.

Example

domain name	IP address
www.cs.princeton.edu	128.112.136.11
www.princeton.edu	128.112.128.15
www.yale.edu	130.132.143.21
www.harvard.edu	128.103.060.55
www.simpsons.com	209.052.165.60

Symbol Table APIs

Symbol Tables act as an associative array, associate one value with each key.

public class ST<Key, Value> {
    void put(Key key, Value, val);
    Value get(Key key);
    void delete(Key key);
    boolean contains(Key key);
    boolean isEmpty();
    int size();
    Iterable<Key> keys();
}

Scaling the System at AR - Part 1 - Data Pre-Computation

08 August 2018
misc

At the time of this writing, I have been working at Agency Revolution (AR) for more than 2 years, on a product focusing mostly on automation email marketing for the Insurance Agencies. I have been working on this product since it was in beta, when it could only serve only a few clients, send thousands of emails each month and handle very little amount of integration data without downtime until it can deliver millions of emails each month, store and react to terabytes of data flow every day. The dev team has been working very hard and suffering a lot of problem to cope with the increasing number of customers that the sale team brought to us. Here the summary of some techniques and strategies that we have applied in order to deliver a better user experience.

The problem of On-demand computing

By On-demand, I mean the action of computing the required data only when it is needed.

One of the core value of our system is to deliver the right messages to the right people at the right time. Our product allows users to set up automated emails, which will be sent at a suitable time in the future. The emails are customised to each specific recipient based on their newest data at the time they receive the email, for example the current customer status, whether that customer is an active or lost customer at that time, how many policies he/she has or the total value that customer has spent until that time.

Some Optimizations in RethinkDB - Part 2

23 June 2018
rethinkdb

Part 1 here Some Optimizations in RethinkDB - Part 1

Yes, it’s RethinkDB, a discontinued product. Again, read my introduction in the previous post. It’s not only about RethinkDB but it also the basic idea for many other database systems. This post introduces other techniques that I and the team have applied at AR to maximize the workload that RethinkDB can handle but most of them can be applied for other database systems as well.

Increase the Memory with NVME SSD

Well, sound like a very straight forward solution, huh? More memory, better performance, sound quite obvious! Yes, the key thing is how to increase the memory without significant cost. The answer is to setup swap as the temporary space for storing RethinkDB cached data. RethinkDB, as well as other database systems, caches the query result data into memory so that it can be re-used next time the same query executes again. The problem is that swap is much slower than RAM, because we rely on the disk to store the data. However, since we are running on Google Cloud and Google Cloud offers the Local-SSDs solution, we have been exploiting this to place our swap data. Here is the Local-SSDs definition, according to Google

Local SSDs are physically attached to the server that hosts your virtual machine instance. Local SSDs have higher throughput and lower latency than standard persistent disks or SSD persistent disks. The data that you store on a local SSD persists only until the instance is stopped or deleted.

Feature Toggle and Feature Release

17 June 2018
misc

Feature Toggle is a very popular technique that enables you to test the new feature on real production environment before releasing it to your clients. It’s also helpful when you want to enable the feature for just some beta clients or just some clients who pay for the specific features. The technique requires both backend and frontend work involved. In this post, I’m going to talk about some simple solutions that I and the team at AR have applied as well as some other useful ways that we are still discussing and may apply one day in the future.

1. Backend Data Organization

Feature Flag table

Of course, the simplest solution is to create a specific table for storing the all the feature flags in the database. The table may looks like this

{
  featureName: <string>,
  released: <bool>,
  enabledList: <array>, // enabled clients list
  disabledList: <array> // disabled clients list
}

The above mentioned data structure may be suitable for the case your system has a lot of users. You can simply add some admin user to the enabledList and test the new feature on production before releasing it to your users.

Inline User feature data

If your product is to serve business clients, you can also store the enabled feature directly to the client object itself. This can save you extra queries to the database to get the feature information. If that’s the case, your Client object might look like this

{
  clientId: <string>,
  enabledFeatured: <array>
}

Unix Permission style

Binary Heap and Heapsort Summary - Part 2 - Heapsort

16 June 2018
algorithm

Nothing special here. It’s just a blog post for summarising my algorithm learning course. Probably this was taught in the University but I don’t remember anything, I have no idea about its definition and applications until I take this course. Part 1 here Binary Heap & Heapsort Summary - Part 1 - Binary Heap

The Idea

Start with array of keys in arbitrary order
Create max-heap with all N keys.
Repeatedly remove the maximum key (in place) to create a sorted array.