Part 2 here: Basic Logging & Debugging in Microservices - Part 2

ElasticSearch for storing the log data

At this point, you may have a good logging library of your own. Your application probably produces a lot of log entries, and you can debug it using the cat and tail commands. You can also open the log file in your favorite text editor and search for the entries you want. However, there should be a better solution. When your log file grows to thousands, millions, or even billions of lines, debugging manually like that is no longer feasible.

Fortunately, ElasticSearch is a good choice for organising log entries. Setting up ElasticSearch is quite straightforward. There is even a pre-built Docker image for it. Our team at AR doesn't spend time optimising it because we rarely run into problems with it. Of course, our ElasticSearch instance goes down sometimes, but since it is not a critical module of the system and doesn't affect the main application's status, we don't need to invest much time in optimising it. What we do is configure a restart policy so that Docker can recover the process after an OOM. The other thing we need is a daily script that deletes old log entries to save disk space. You can find installation instructions for ElasticSearch on its official Docker image page.
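
The post does not show the daily cleanup script, so here is only a rough sketch of the selection step it would need. It assumes the logs are stored in one index per day named logstash-yyyy.MM.dd and that a 7-day retention window is wanted; both are assumptions for illustration, and the class and method names are hypothetical. The deletion itself (an HTTP DELETE request for each selected index) is left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: picks which daily log indices are old enough to delete.
// The "logstash-yyyy.MM.dd" naming scheme and the retention window are assumptions.
class LogRetention {
    static final String PREFIX = "logstash-"; // hypothetical index prefix
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    // Returns the index names whose date suffix is more than retentionDays before today.
    static List<String> expiredIndices(List<String> indexNames, LocalDate today, int retentionDays) {
        LocalDate cutoff = today.minusDays(retentionDays);
        List<String> expired = new ArrayList<>();
        for (String name : indexNames) {
            if (!name.startsWith(PREFIX)) {
                continue; // not a log index, leave it alone
            }
            try {
                LocalDate day = LocalDate.parse(name.substring(PREFIX.length()), DAY);
                if (day.isBefore(cutoff)) {
                    expired.add(name);
                }
            } catch (DateTimeParseException e) {
                // Unexpected suffix: skip rather than risk deleting the wrong index.
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        List<String> names = List.of("logstash-2020.01.01", "logstash-2020.01.20", ".kibana");
        System.out.println(expiredIndices(names, LocalDate.of(2020, 1, 21), 7));
    }
}
```

Keeping the selection logic as a pure function makes it easy to check before wiring it to a real cluster, where a wrong match would delete data.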

fluentd for pushing the log data to ElasticSearch

Now you have your ElasticSearch server up and running. The next thing you need to do is push your log data to it. fluentd, an open source data collector for a unified logging layer, allows you to unify data collection and consumption for better use and understanding of data. If your application writes all its log entries to a file or to standard output inside Docker, you can set up fluentd to tail the log files and push each line to ElasticSearch as soon as it is written (just like the tail command). In case you don't know, Docker redirects a container's standard output to a file on disk. You can find a sample fluentd setup here https://github.com/tmtxt/clojure-pedigree/tree/master/images/fluentd. The most important thing to look at is its configuration file, td-agent.conf. You will need to modify it to fit your requirements.

Read more

Configuration, it sounds very simple but many people do it the wrong way.

Configuration is an essential part of any application (the term Configuration here does not include internal application config, such as route path definitions). It varies across deploys and environments, provides confidential information to the application (such as the database connection string), and defines how the application should behave (for example, enabling or disabling specific features). It sounds quite simple, but I have found that many people do it the wrong way. That's why I wrote this blog post.

Take a look at this

Recently, I worked on another project, which was written in ASP.NET Core 3.1. After digging into the code for a while, I saw the patterns below and felt really annoyed by them. They relate to how the Configuration is implemented and passed around the application, as well as how it is consumed.

Here is the Setting interface and its implementations.

public interface ISetting
{
    Env Environment { get; }
    string MailgunApiKey { get; }
    string AppName { get; }
}

public class LocalSetting : ISetting
{
    public Env Environment => Env.Local;
    public string MailgunApiKey => "key-xxxx";
    public string AppName => "Prod Name";
}

public class StagingSetting : ISetting
{
    public Env Environment => Env.Staging;
    public string MailgunApiKey => "key-yyyy";
    public string AppName => "Prod Name";
}

public class ProdSetting : ISetting
{
    public Env Environment => Env.Prod;
    public string MailgunApiKey => "key-zzzz";
    public string AppName => "Prod Name";
}

Here is an app component which uses the Setting object

public async Task Execute()
{
    var setting = Resolve<ISetting>();
    var emailToUse = "";

    switch (setting.Environment)
    {
        case Env.Local:
            emailToUse = "[email protected]";
            break;
        case Env.Staging:
            emailToUse = "[email protected]";
            break;
        case Env.Prod:
            emailToUse = request.ActualEmailAddress;
            break;
        default:
            break;
    }

    await SendEmail(emailToUse, setting.MailgunApiKey);
}
Read more

Part 3: Scaling the System at AR - Part 3 - Message Queue in general

Continuing from my previous post, I'm going to demonstrate the internal tool that we use at AR to work with the Message Queue. I will also summarize some of our experience designing a system around a Message Queue.

The Message Queue in AR system

Currently, we have over 100 different types of workers/queues. We have built a tool to manage them efficiently.

The tool allows us to quickly filter for any Subscriptions (Queues/Workers).

Read more

Part 2: Scaling the System at AR - Part 2 - Message Queue for Integration

In the previous post, I mentioned the Message Queue in some specific use cases for Integration components. In this post, I'm going to talk about the Message Queue in general and what the workflow looks like at AR.

Message Queue in design

One of the main differences of the AR system is that most of the tasks are background tasks backed by several Message Queues. There are several reasons why we chose this design:

  • We want to keep the user-facing API and databases simple. This way, the API will respond very fast and the app performs more smoothly. That brings a good impression to our users and makes them happier.
  • We can isolate different aspects of the system
    • We can easily limit the resource consumption of the less important tasks (tasks that are not user-facing or whose results are not needed immediately), for example, the task that logs User Activities or the task that exports User Data.
    • We can also allocate more resources to, and scale only, the tasks that are critical to the users, for instance, the task that sends a Blast email in case of disaster.
    • This is controlled via various parameters when creating the queue and running the worker
      • The number of worker instances running at the same time
      • The number of concurrent messages that a worker instance can pull and process at the same time
      • The delay of the messages that are published to the queue
  • The Message Queue ensures eventual consistency for our system in case of failure. The system is fault-tolerant by design. Even if the database or the network is down, all the tasks are guaranteed to be processed at some point in the future. Moreover, we only need to retry the failed parts, not the full flow.
  • Each worker is a reusable workflow with the message value as the input data. Want to implement a new feature that reuses the same flow? Instead of calling the same functions, you publish a message to the corresponding queue.
  • This allows us to choose a different technology for each worker, depending on the requirements. Most of our workers are written in Nodejs. However, some of them are written in Golang. We also have a team with many C# experts working on Integration projects. We have no problem integrating everything into the same workflow.
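
To make the tuning parameters from the list above a bit more concrete, here is a minimal in-process sketch. It is not AR's tooling: WorkerSketch and its methods are hypothetical names, and a thread pool stands in for a real queue. The pool size plays the role of the number of concurrent messages per worker instance, and delayMillis plays the role of the publish delay. The number of worker instances would simply be how many of these processes you run.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Sketch only: one in-process "worker instance" with two of the tuning knobs
// from the list above. All names here are hypothetical.
class WorkerSketch {
    private final ScheduledExecutorService pool;
    private final long delayMillis;
    private final Consumer<String> handler;

    WorkerSketch(int concurrency, long delayMillis, Consumer<String> handler) {
        // The pool size caps how many messages are processed at the same time.
        this.pool = Executors.newScheduledThreadPool(concurrency);
        this.delayMillis = delayMillis;
        this.handler = handler;
    }

    // Publishing schedules the message; the handler only sees it after the delay.
    void publish(String message) {
        pool.schedule(() -> handler.accept(message), delayMillis, TimeUnit.MILLISECONDS);
    }

    // Stops accepting new messages and waits for the scheduled ones to finish.
    boolean drain(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> processed = new CopyOnWriteArrayList<>();
        // A low-priority worker: at most 2 messages at a time, each delayed by 50 ms.
        WorkerSketch worker = new WorkerSketch(2, 50, processed::add);
        worker.publish("log-user-activity");
        worker.publish("export-user-data");
        worker.drain(1000);
        System.out.println(processed.size() + " messages processed");
    }
}
```

A critical worker would be created with a larger concurrency and no delay, which is how the less important tasks can be throttled without touching the important ones.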
Read more

It has been one and a half years since my first post about this topic :(

Continuing from my previous post, Scaling the System at AR - Part 1 - Data Pre-Computation, this time I'm going to talk about one of the most important components of the AR system: the Message Queue.

A Message Queue is an asynchronous inter-service communication pattern. It is a temporary place to store messages until the receiver is ready to process them. It encourages decoupling of logic and components in the system, provides a lightweight and unified protocol for communication between different services (possibly written in different languages), and is well suited to Microservice design. A good message queue should satisfy these criteria:

  • It must be fast and capable of handling a large number of messages coming in at the same time.
  • It has to ensure the success of message processing. A message must be processed and retried until it succeeds. Otherwise, an Error queue (Dead Letter queue) should be provided to store the failed messages for later processing.
  • Each message must be processed by one and only one consumer at a time.
  • The message queue should be independent of any language and allow applications written in different languages to send and receive messages without any problem.
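
The retry and Dead Letter criteria above can be illustrated with a small in-memory sketch. This is not how Kafka, Pub/Sub, or SQS are implemented; RetryingQueue, its methods, and the maxAttempts limit are hypothetical names chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Sketch only: an in-memory queue that retries failed messages and parks
// the ones that keep failing in a dead letter list. Names are hypothetical.
class RetryingQueue {
    private static final class Envelope {
        final String body;
        int attempts;
        Envelope(String body) { this.body = body; }
    }

    private final Deque<Envelope> pending = new ArrayDeque<>();
    private final List<String> deadLetters = new ArrayList<>();
    private final int maxAttempts;

    RetryingQueue(int maxAttempts) { this.maxAttempts = maxAttempts; }

    void publish(String body) { pending.addLast(new Envelope(body)); }

    // Delivers each message to the handler; the handler returns true on success.
    // A failed message goes back to the end of the queue until it runs out of attempts.
    void drain(Predicate<String> handler) {
        while (!pending.isEmpty()) {
            // The message leaves the queue while in flight, so no one else can pick it up.
            Envelope next = pending.pollFirst();
            next.attempts++;
            boolean ok;
            try {
                ok = handler.test(next.body);
            } catch (RuntimeException e) {
                ok = false; // an exception counts as a failed attempt
            }
            if (ok) {
                continue;
            }
            if (next.attempts < maxAttempts) {
                pending.addLast(next);
            } else {
                deadLetters.add(next.body);
            }
        }
    }

    List<String> deadLetters() { return deadLetters; }

    public static void main(String[] args) {
        RetryingQueue queue = new RetryingQueue(3);
        queue.publish("send-email");
        queue.publish("bad-payload");
        queue.drain(body -> !body.equals("bad-payload"));
        System.out.println("dead letters: " + queue.deadLetters());
    }
}
```

Capping the number of attempts is a design choice in this sketch: it keeps one poisoned message from blocking the queue forever while still leaving it available for later inspection.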

Because the Message Queue is so important to us and we have a limited number of developers, we decided to switch to third-party services after several months of doing both dev and ops work with Kafka. Both Google Pub/Sub and AWS SQS offer the service at a relatively cheap price, and you can choose either of them, depending on the Cloud platform that you are using. AWS SQS seems to be the better option since AWS offers a lot of functionality around it, for example, mapping message events to Lambda, which saves us a lot of time on the ops side and lets us focus more on our core business value.

Currently, we are running 2 different systems on 2 different Cloud providers and we are using both solutions.

Read more

So after a period of hesitation, I finally decided to spend the money on my first Victorinox multi-tool knife. This blog post is simply for showing off my newly bought Victorinox multi-tool 😤

My first impression is that it is small, really compact. When I looked at the specs online I tried to measure and estimate the size, but it was only at the shop that I saw it with my own eyes: the 58mm models are beautiful and compact, exactly what I wanted, since I was looking for a multi-tool knife that can hang on a keychain. A younger colleague at my company suggested I buy something like the chunky Nextool ones, which come with both pliers and scissors in a large size, but my main goal was to find a keychain model first. If it works out, I will buy other models later.

The packaging box is quite compact

Read more

Just a simple setup, put here in case I need it in the future.

Yeah, I'm familiar with Jenkins, and it is more than a simple build tool: it has a bunch of useful utilities for automating my personal workflow, for example, as an automated task runner with a familiar UI. These instructions are for Ubuntu 18.04 on AWS Lightsail, but the same steps apply to other VPS/Cloud services.

Bootstrap the server and install Jenkins

  • Create a new VPS on AWS Lightsail, choose an Ubuntu 18.04 server with any specs that you want.
  • Optionally, set up swap on the server if you have a limited amount of RAM, following this guide. This can be done later.
  • Some Cloud providers (like AWS Lightsail) offer an extra layer of network security by blocking all incoming traffic on all ports (except SSH and HTTP) by default. Since Jenkins will run on port 8080, you need to add that port to the allowed list.

Read more

I recently moved from the NodeJS team to the .Net team (in the same company). Coming back to C# after a long time, I found a lot of new stuff. Actually, I used to hate .Net (simply because I hate using Windows :LOL:). But things have changed. .Net Core can now run on non-Windows systems without any differences. It is becoming easier to develop .Net applications on Mac/Linux (using Jetbrains Rider like me, or Visual Studio Community for Mac, which is a bad idea).

One interesting thing that I found in C# after a long time working in JS is Async/Await, which simplifies asynchronous programming significantly. I heard that JS borrowed the Async/Await idea from C#, so I decided to take a deeper look at Async/Await in C# and compare it to the one in JS to see if there are other things that C# does better. I may be wrong about some things because I'm relatively new to C#.

Below is the comparison table between the Async/Await pattern in C# and in JS. I also mention JS Generator because it can be applied in pretty much the same way as the one using Promise. Actually, it used to be an innovative way to solve asynchronous problems in JS before the birth of Async/Await. Many teams and products still use it because their code bases were developed many years ago. Today, Async/Await is the preferred way to handle asynchronous tasks in JS, returning Generator to its original purpose.

Read more

Nothing special here. It's just a blog post summarising my algorithm learning course. Although this was already taught at university, it's still good to summarize it here.

1. Symbol Tables

Key-value pair abstraction.

  • Insert a value with specified key.
  • Given a key, search for the corresponding value.

Example

domain name            IP address
www.cs.princeton.edu   128.112.136.11
www.princeton.edu      128.112.128.15
www.yale.edu           130.132.143.21
www.harvard.edu        128.103.060.55
www.simpsons.com       209.052.165.60

Symbol Table APIs

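A Symbol Table acts as an associative array: it associates one value with each key.

public interface ST<Key, Value> {
    void put(Key key, Value val);   // insert or overwrite the value for key
    Value get(Key key);             // value paired with key
    void delete(Key key);           // remove key and its value
    boolean contains(Key key);      // is there a value paired with key?
    boolean isEmpty();              // is the table empty?
    int size();                     // number of key-value pairs
    Iterable<Key> keys();           // all the keys in the table
}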
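
As one possible way to fill in the API above, here is a minimal sketch that stores the pairs in an unordered linked list and uses sequential search. It is a standalone class written for this summary, not the course's reference code, and the name SequentialSearchST is only illustrative. It assumes keys and values are never null, since get returns null to signal a missing key.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: an unordered linked-list symbol table using sequential search.
// Assumes keys and values are non-null; get() returns null for a missing key.
class SequentialSearchST<Key, Value> {
    private final class Node {
        final Key key;
        Value val;
        Node next;
        Node(Key key, Value val, Node next) { this.key = key; this.val = val; this.next = next; }
    }

    private Node first; // head of the list
    private int n;      // number of key-value pairs

    // Overwrites the value if the key is already present, otherwise adds a new node.
    void put(Key key, Value val) {
        for (Node x = first; x != null; x = x.next) {
            if (x.key.equals(key)) { x.val = val; return; }
        }
        first = new Node(key, val, first);
        n++;
    }

    // Scans the list from the head; linear time in the number of keys.
    Value get(Key key) {
        for (Node x = first; x != null; x = x.next) {
            if (x.key.equals(key)) return x.val;
        }
        return null;
    }

    // Unlinks the node holding the key, if there is one.
    void delete(Key key) {
        Node prev = null;
        for (Node x = first; x != null; prev = x, x = x.next) {
            if (x.key.equals(key)) {
                if (prev == null) first = x.next; else prev.next = x.next;
                n--;
                return;
            }
        }
    }

    boolean contains(Key key) { return get(key) != null; }
    boolean isEmpty() { return n == 0; }
    int size() { return n; }

    Iterable<Key> keys() {
        List<Key> keys = new ArrayList<>();
        for (Node x = first; x != null; x = x.next) keys.add(x.key);
        return keys;
    }

    public static void main(String[] args) {
        SequentialSearchST<String, String> st = new SequentialSearchST<>();
        st.put("www.cs.princeton.edu", "128.112.136.11");
        st.put("www.yale.edu", "130.132.143.21");
        System.out.println(st.get("www.yale.edu"));
        System.out.println(st.contains("www.harvard.edu"));
    }
}
```

Every operation here walks the list, so put and get take time proportional to the number of keys. That is fine for a small table like the DNS example, but it is the main reason to move on to faster implementations.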
Read more