Nothing special here. It’s just a blog post for summarising my algorithm learning course.
1. The 3-sum problem
The 3-sum problem is described as follows:
Given N distinct integers, how many triples sum to exactly zero?
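To make the problem statement concrete, here is a minimal brute-force sketch that checks every triple. The function name `countZeroTriples` is only an illustration and is not taken from the course; the approach takes time proportional to N^3, so it is meant as a baseline rather than an efficient solution.

```javascript
// Brute-force 3-sum: count the triples (i < j < k) whose values sum to zero.
// countZeroTriples is a hypothetical name chosen for this sketch.
function countZeroTriples(numbers) {
  const n = numbers.length;
  let count = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      for (let k = j + 1; k < n; k++) {
        // Each unordered triple is visited exactly once because i < j < k.
        if (numbers[i] + numbers[j] + numbers[k] === 0) count++;
      }
    }
  }
  return count;
}
```

For example, `countZeroTriples([30, -40, -20, -10, 40, 0, 10, 5])` counts the triples (30, -40, 10), (30, -20, -10), (-40, 40, 0) and (-10, 0, 10).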
A long time ago, I worked on a project that allowed users to select images on a 2D canvas and then draw those images on a cylindrical surface (a mug in this case).
I Googled for suitable libraries but couldn’t find any. I also asked a question on Stack Overflow, but the answer did not satisfy me because it demonstrated how to stretch the image, not how to bend it (which is what is shown in the picture above). Because of that, I decided to implement it myself, and it turned out to be not as hard as I had thought. Everything is just a loop over basic mathematical formulas that I was taught in (Vietnamese) high school.
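The excerpt does not show the formula itself, so the following is only a sketch of one common way to bend rather than stretch, not necessarily the method used in the post. It assumes the visible half of the cylinder is viewed straight on (orthographic projection) and that the source image wraps around exactly that visible half. The function name `cylinderSourceX` and its parameters are hypothetical.

```javascript
// For a destination column on the flat view of the cylinder, return the
// (fractional) source-image column that should be drawn there.
// Assumptions: orthographic view, image wraps the visible half of the cylinder,
// and destWidth >= 2.
function cylinderSourceX(destX, destWidth, srcWidth) {
  const radius = (destWidth - 1) / 2;       // cylinder radius in destination pixels
  const offset = destX - radius;            // horizontal distance from the cylinder axis
  const angle = Math.asin(offset / radius); // angle on the surface, in [-PI/2, PI/2]
  // Equal steps along the surface correspond to equal steps in angle,
  // so the angle maps linearly onto the source image width.
  return (angle / Math.PI + 0.5) * (srcWidth - 1);
}
```

Looping over every destination column and copying the source column returned here compresses the image near the left and right edges, which is what gives the bent look instead of a uniform stretch.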
Nothing special here. It’s just a blog post for summarising my algorithm learning course. Here are some Quick Union related interview questions and my answers.
Given a social network containing n members and a log file containing m timestamps at which times pairs of members formed friendships, design an algorithm to determine the earliest time at which all members are connected (i.e., every member is a friend of a friend of a friend … of a friend). Assume that the log file is sorted by timestamp and that friendship is an equivalence relation. The running time of your algorithm should be m log n or better and use extra space proportional to n.
The earliest time at which all members are connected is the moment the unions leave a single connected component (one tree), which means all the nodes have the same root. This is a small extension of the weighted quick union algorithm: every time we call union, we check whether the size of the merged tree is equal to n.
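The idea above can be sketched as follows. This is a minimal illustration rather than the post's actual solution: the class and function names, and the `{ timestamp, a, b }` shape of a log entry, are assumptions made for the example.

```javascript
// Weighted quick union that reports the size of the tree after each union.
class WeightedQuickUnion {
  constructor(n) {
    this.parent = Array.from({ length: n }, (_, i) => i); // every node starts as its own root
    this.size = new Array(n).fill(1);                     // size of the tree rooted at each node
  }

  find(p) {
    while (p !== this.parent[p]) p = this.parent[p];
    return p;
  }

  // Links the smaller tree under the larger one and returns the merged tree size.
  union(p, q) {
    const rootP = this.find(p);
    const rootQ = this.find(q);
    if (rootP === rootQ) return this.size[rootP];
    const [small, large] =
      this.size[rootP] < this.size[rootQ] ? [rootP, rootQ] : [rootQ, rootP];
    this.parent[small] = large;
    this.size[large] += this.size[small];
    return this.size[large];
  }
}

// log is assumed to be sorted by timestamp; entries look like { timestamp, a, b }.
// Returns the first timestamp at which one tree contains all n members, or null.
function earliestFullyConnected(n, log) {
  const uf = new WeightedQuickUnion(n);
  for (const { timestamp, a, b } of log) {
    if (uf.union(a, b) === n) return timestamp;
  }
  return null;
}
```

Each union costs at most logarithmic time because the trees stay balanced, so processing m log entries fits the m log n bound, and the two arrays use space proportional to n.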
First part here: Basic Logging & Debugging in Microservices - Part 1
In the previous post, I talked about the advantages of building a custom Logger module that can group all the related log data into one single log entry. In this post, I will continue with some basic ideas for integrating it into a Microservices architecture and organising all the log data for better investigation.
Integrating the custom MyLogger module into the application is quite a straightforward task. Instead of manually initialising and flushing the logs, you will need to implement a wrapper or higher-order function to do that automatically. For example, if you are using Koa.js for your HTTP service, simply wrap the requests inside a logger middleware like this:
const MyLogger = require('./my-logger.js');

// initialise koa app
// ...

function* myLoggerMdw(next) {
  // init logger and assign to the context
  const metadata = {
    appName: 'your-service-name',
    routeName: this.request.routeName
  };
  const logger = new MyLogger(metadata);
  this.logger = logger;

  // wrap logger around your request
  try {
    logger.push('info', 'Start request', this.request.headers);
    yield next;
    logger.push('info', 'Request success', this.status);
    logger.write();
  } catch (e) {
    if (e.status < 500) {
      logger.push('warn', 'Handled error', e.message);
    } else {
      logger.push('error', 'Unhandled error', e.message);
    }
    logger.write();
    throw e;
  }
}

// custom middleware for MyLogger
app.use(myLoggerMdw);
Yes, it’s RethinkDB. Please don’t shout at me for still writing about optimisations for a discontinued product like RethinkDB. I’m neither a fan of RethinkDB nor of NoSQL. I write about it because I have to work with RethinkDB right now and deal with all the pains of RethinkDB and NoSQL, and the team cannot move away from it since a lot of services currently depend on it. But hey, most of the enhancements that we made are actually basic principles of database scaling and optimisation. All that theory can be applied later to other database systems, not just RethinkDB.
So, as you may already know, at the time of writing this post I am working at Agency Revolution. We have been running a system that relies on RethinkDB for more than 3 years, and we have built a great and highly scalable system with it. Besides that, we have also faced a lot of difficulties when the system grew too quickly, when the number of requests peaked during real-life events (the agencies needed to send a lot of emails before a holiday or after a disaster) or when large amounts of data came in and out of the system. We have applied a lot of solutions to cope with the increased workload so that our RethinkDB clusters can still serve users within an acceptable time. Some of those optimisations are covered in this post.
One of the biggest difficulties when working with Microservices (or with other Distributed systems) is debugging when problems occur, because the business logic is divided into many small places. A bug in one service can result in a cascading series of issues in many related services, and tracing which service is the root cause is always a challenging mission. By implementing a good Logging solution, you can reduce the time it takes to discover the bug. It also helps you feel more confident about what happened in your code and makes the problem easier to reason about.
So you have decided it’s time to build a logging solution for your Microservices system. Here are some steps that you will probably need to take in order to build it.
Before starting with a full Logging solution for the whole large application, it is important that you get your smallest building block to work properly. You will first need to build a logging solution that works well in one service, and then apply it to all the other services. You have to define a logging standard that all the other services will follow so that you can store all the log entries in a separate logging backend storage for later investigation.
The simplest way of logging is to write the log immediately whenever you want, for example when you receive an API request, when the HTTP request has been processed or when the server finishes updating a record in the database. However, you will soon end up with a bunch of messy log entries because the web server usually processes multiple requests at the same time and you don’t know which entries correlate with which others. This is quite common in the concurrent and parallel world where the system can handle different tasks at once. You need to design a logging backend that can associate all the related log entries into one.
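One way to picture that grouping is a small logger that buffers messages in memory and flushes them as one entry per request. The sketch below is only an assumption about what such a module could look like: it mirrors the `push` and `write` calls used with MyLogger elsewhere on this blog, but the internals and the `sink` parameter are invented for illustration.

```javascript
// A hypothetical grouping logger: one instance per request, one log entry per write.
class MyLogger {
  // metadata: fields shared by every message of the request (service name, route, ...)
  // sink: where the grouped entry goes; defaults to printing JSON to stdout (an assumption)
  constructor(metadata, sink = (entry) => console.log(JSON.stringify(entry))) {
    this.metadata = metadata;
    this.sink = sink;
    this.messages = [];
  }

  // Buffer a message instead of writing it immediately.
  push(level, title, data) {
    this.messages.push({ level, title, data, time: new Date().toISOString() });
  }

  // Flush everything collected so far as a single log entry.
  write() {
    if (this.messages.length === 0) return;
    const entry = { ...this.metadata, messages: this.messages };
    this.messages = [];
    this.sink(entry);
  }
}
```

Because each request owns its own instance, concurrent requests cannot interleave their messages, and the backend receives one self-contained entry per request.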
In the first post, I discussed the overhead that you have to pay for when working with Microservices. This time, I’m going to talk about another problem with Microservices: the problems of distributed systems, which you have to face from the very beginning.
Working with a Distributed system has never been an easy job. With Microservices, you have to face it very early.
A distributed system with a lot of small services backed by different data storages means that there are no constraints between those data storages. In a traditional SQL database, this can be solved easily by adding foreign keys between tables and performing a cascading update/delete whenever you want to modify the data. Ensuring that constraint in a Microservices design is really challenging.
It has been nearly 2 years since I started working at Agency Revolution, a team working on a software platform that utilizes a Microservices architecture to build a highly scalable system for Automation Marketing. Building a Microservices system from scratch comes with both pros and cons, and I’m neither for nor against Microservices. There are many articles and books on the Internet about the advantages of Microservices, so I’m not going to write another post about the benefits of using them. This post is just a summary of my experience and the difficulties after 2 years of working with Microservices, as well as how we deal with those issues to get the most value out of them.
First, let me introduce a bit about the tech stack that we are using. We ran our application on our private server for about 2 years before migrating to Google Cloud Platform. There are 3 types of service in the system. They are
HTTP services are used for handling simple requests, which can be completed within milliseconds/seconds. For long-running tasks, we publish a message to Google PubSub and schedule it to be processed later by the Google PubSub workers. Each of them is deployed and scaled as a pod in Kubernetes.
This post is the second part of the first post here. It focuses on how to use RethinkDB Secondary Indexes efficiently in different use cases.
RethinkDB Indexes, similar to Indexes in other databases, are a trade-off between read and write performance. Therefore, the basic rules for RethinkDB Indexes are similar to those of other databases.
filter
query.
It has been more than a year since my last post. But yeah, I’m still here, not going anywhere. This time, I write about RethinkDB, the database that I have been working with over the last year at Agency Revolution.
At Agency Revolution, we make heavy use of RethinkDB. Nearly everything is stored in RethinkDB. By the time you read this blog post, that may no longer be true and we may be using other databases as well. However, as it is still one of our main data storages, we have had a lot of performance issues related to storing and retrieving data (and we still have some now). This blog post summarizes how we use RethinkDB indexes to solve those problems, as well as some use cases for the different kinds of indexes in RethinkDB.