It has been one and a half year since my first post about this topic :(
Continue from my previous post Scaling the System at AR - Part 1 - Data Pre-Computation, this time I’m going to talk about one of the most important component of the AR system: The Message Queue.
Message Queue is an asynchronous inter-service communication pattern. It is a temporary place to store the data, waiting for the message receiver to process. It encourages decoupling of logic and components in the system, provides a lightweight and unified protocol for communication between different services (written in different languages) and is perfectly suitable for Microservice design. A good message queue should satisfy these criteria
- It must be fast and capable of handling a large amount of messages coming in at the same time.
- It have to ensure the success of message processing. A message must be processed and retried until success. Otherwise, an Error queue (Dead Letter queue) should be provided to store the failed messages for later processing.
- It is required that each message is processed by one and only one consumer at the same time.
- The message queue should be independent from any languages and allow various applications written in different languages to send and receive messages without any problem.
Because the Message Queue is so important to us and there is a limit in number of developers, we decided to switch to third-party services after several months doing both dev and ops work with Kafka. Both Google Pub/Sub and AWS SQS offer the service in a relatively cheap price and you can choose either of them, depending on the Cloud platform that you are using. AWS SQS seems to be better since it offers a lot of functionalities around its SQS service, for example, mapping the Message events to Lambda, which allows us to save a lot of time working on the ops side and focus more on our business core value.
Currently, we are running 2 different systems on 2 different Cloud providers and we are using both solutions.