BigQuery Subscription

It’s hard to view the content of the messages that were published to a topic because the application has already processed and acknowledged them before you can do anything. Usually, you have to create another test subscription for the messages to be replicated to and then pull messages from that test subscription. However, the Google PubSub UI doesn’t provide any way to pull specific message by id. The GCloud Console UI is a frustrating UI itself, slow to load and had to pull several times to find the necessary messages.

Google offers BigQuery Subscription, a solution to that issue and also to provide a long term storage for your messages so you can troubleshoot and do complex query later. In this post, I’m going to show a sample BigQuery Subscription workflow with Pulumi.

Configure BigQuery Dataset and Table

First, you need to create a BigQuery Dataset and a BigQuery Table following the schema defined here. You can do it manually on the UI or via Pulumi

BigQuery Dataset

const pubsubDatasetId = `pubsub`;

export const pubsubDataset = new gcp.bigquery.Dataset(
  `my-dataset`,
  { datasetId: pubsubDatasetId }
);

BigQuery Table (a bit messy since the schema has to be defined in JSON string)

export const messageTable = new gcp.bigquery.Table(
  `my-table`,
  {
    datasetId: pubsubDatasetId,
    tableId: `message-values`,
    // if you don't want other people to accidentally delete is, set to true
    deletionProtection: true,
    schema: `
  [
    {
      "name": "data",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "The message body"
    },
    {
      "name": "subscription_name",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "message_id",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "publish_time",
      "type": "TIMESTAMP",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "attributes",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "Message attributes as JSON string"
    }
  ]
  `,
  },
  {
    dependsOn: [pubsubDataset],
  }
);
Read more

Ok, the story is that, I’m really bad at css. I have never worked on building any frontend component and I was given a task to build the Custom Checkbox component with Reactjs from scratch. Here is how…

1. The basic HTML and CSS

Prepare the structure

Here is how you usually create a checkbox with pure html and css. To avoid any complicated event handler, I will simply wrap the <label> tag around, which allows clicking on any element inside to transfer the event to the corresponding <input> element without any Javascript needed.

<label class="mylabel">
  <input class="myinput" type="checkbox" name="checkbox" />
  <div class="mylabel">Checkbox label</div>
</label>
.mylabel {
  display: flex;
  gap: 5px;
  align-items: center;
  margin: 2px;
}

Try the live example in the below iframe (or direct link)

Read more

just a blog post for summarising my algorithm course

The Problem

Given a data structure organised as a set of N objects, is there a path connecting 2 objects?

// union: connect 2 objects
// connected: whether 2 objects are connected?
union(4, 3);
union(3, 8);
union(6, 5);
union(9, 4);
union(2, 1);

connected(0, 7) return false;
connected(8, 9) return true;

union(5, 0);
union(7, 2);
union(6, 1);
union(1, 0);

connected(0, 7) return true;

Can only answer the question with Yes or No. The Dynamic Connectivity implementation cannot answer the exact path between 2 objects. It can only answer whether there are any paths connecting 2 objects.

Read more

Leetcode: Binary Search

This is so trivial. I just put it here so I can look up faster.

Given an array of integers nums which is sorted in ascending order, and an integer target, write a function to search target in nums. If target exists, then return its index. Otherwise, return -1. You must write an algorithm with O(log n) runtime complexity.

Example 1

Input: nums = [-1,0,3,5,9,12], target = 9
Output: 4
Explanation: 9 exists in nums and its index is 4

Example 2

Input: nums = [-1,0,3,5,9,12], target = 2
Output: -1
Explanation: 2 does not exist in nums so return -1

Constraints

1 <= nums.length <= 104
-104 < nums[i], target < 104
All the integers in nums are unique.
nums is sorted in ascending order.
Read more

Just an interview question 🙃

The interview question: Re-implement Javascript Promise from scratch. The implementation should support chaining (asynchronously, of course).

Promise API

It has been too long since I started using async/await. I almost forgot these Promise APIs.

Let’s revisit the basic Javascript Promise APIs

const myPromise = new Promise((resolve, reject) => {
  setTimeout(() => {
    resolve('foo');
  }, 300);
});

// try catch each one separately
myPromise
  .then(handleResolvedA, handleRejectedA)
  .then(handleResolvedB, handleRejectedB)
  .then(handleResolvedC, handleRejectedC);

// or catch the first one if any
myPromise
  .then(handleResolvedA)
  .then(handleResolvedB)
  .then(handleResolvedC)
  .catch(handleRejectedAny);
Read more

Part 2: Refactor a legacy Worker Base - Part 2 - Scope Management

Rewrite the WorkerBase class

After fixing the Scope management problem, it’s time to rewrite the WorkerBase class in a way that the components are loosely coupling, composable and detachable. The solution turned out to be a very simple approach. It’s the middleware design that is very common in popular Web server frameworks (ASP.Net Core, Express.js, Koa.js,…).

In case you don’t know what a middleware is, read ASP.Net Core Middleware.

After analyzing the legacy WorkerBase class, I found that they could be organized into these middlewares

  • Exception handling middleware
  • Logging middleware
  • Message queue behaviors middleware

Worker middlewares

Read more

Part 1: Refactor a legacy Worker Base - Part 1 - The long lasting pain

The refactoring solution I presented in this post is the solution that I wrote in C#. It doesn’t mean you cannot do this in Nodejs. It is just because the team is migrating away from Nodejs to C#. We are familiar with these tools and they are already available as standard pattern in C#.

First thing first: An IOC Container

We learnt this at university. Why the heck did we forget this? Is it because the program language allows us to make this mistake so easily or is it because of the community that encourages the bad behaviors everywhere?

As I mentioned earlier, scope management of the legacy codebase is awful. We actually used a function to surround the scope of a message and the derived class is actually just a collection of function, not a scope container. Every time we want to activate a new method, we have to pass all the parameters downstream.

class WorkerBase {
  async start() {
    let message;
    do {
      message = await pullMessages(1);
      const context = this.buildContext(message);

      // this processMessage function wrap the scope of a message
      await this.processMessage(message, context);
    } while (message != null);
  }
}

// Worker service 1
class Worker1 extends WorkerBase {
  myProp = 1;

  async processMessage(message, context) {
    logic1(message, context);
    logic2(message, context);

    myProp++; // this will mutate myProp and affect other message
  }

  logic1(message, context) {}

  logic2(message, context) {}
}

We wrote JS in an OOP way but didn’t apply the OOP best practices!

Read more

Let’s talk about Microservices again!

In order to manage a Microservices system efficiently, people usually enforce all the microservices to follow some common patterns. This helps you standardize the monitoring process and add a new microservice more easily. In the Microservices project that I’m currently working on (at Agency Revolution), we also have to implement a base class for the each different type of microservice. The base class contains the logic to write log in the correct format, to handle some common errors correctly or to alert the developers if something go wrong, etc.

Basically, there are 2 types of Microservices in our system: Synchronous and Asynchronous. I will focus mostly on one type of Async worker in this post: The Message Queue workers. The base class was initially built in Nodejs. After several years of development, we started to face many problems with the design. And now, I’m going to show you how I identified the drawbacks and improved it with a better version in C#.

Why C#? I may explain in another post. But right now, you can take a look at this post first.

How it all began

First, we started with this Inheritance model, the way that most people will think of when they start implementing a Worker base module. We defined a super class that all the workers in the system can derive from. It contains the logic to pull messages from the corresponding queue and activate the main handler function.

// This is Javascript code
// The base class
class WorkerBase {
  constructor(config) { this.queueName = config.queueName; }

  async start() {
    let message;
    do {
      message = await pullMessages(1); // pull 1 message at a time
      await this.processMessage(message);
    } while (message != null);
  }

  // implement this in the derived class
  async processMessage(message) { throw new Error('Not implemented'); }
}

// Worker service 1
class Worker1 extends WorkerBase {
  constructor() { super({ queueName: 'Worker1' }); }

  async processMessage(message) {
    // implement worker1 logic here
  }
}

// Worker service 1
class Worker2 extends WorkerBase {
  constructor() { super({ queueName: 'Worker2' }); }

  async processMessage(message) {
    // implement worker2 logic here
  }
}

// to activate a worker
const worker = new Worker1();
await worker.start();
Read more

Nothing special here. It’s just a blog post for summarising my algorithm learning course. Here are some questions related to Priority Queues.

Related knowledge: Binary Heap & Heapsort Summary - Part 1 - Binary Heap

1. Dynamic median

Design a data type that supports insert in logarithmic time, find-the-median in constant time, and remove-the-median in logarithmic time. If the number of keys in the data type is even, find/remove the lower median.

Solution: Use 2 Sorted Binary Heap

  • The Max heap to store half smaller items
    • No items in the Max heap are bigger than the ones in the Mean heap
  • The Min heap to store the other half
    • No items in the Min heap are smaller than the ones in the Max heap
  • The 2 heap should be equal. That means, the number of item in each heap should be equal or maximum 1 item different than the other one.

You will need these methods

class Median {
    int[] maxHeap;
    int[] minHeap;

    void balance() {...}
    void insert(int) {...}
    int findMedian() {...}
    int removeMedian() {...}
}
Read more

Clean architecture with C#/.Net Core and MediatR - Part 3

The MediatR Library

Clean Architecture is an Interface design. Everything is connected via Interfaces and their hidden Implementations. A request comes from the Framework layer to the Business handler via a Business interface. A request from the Business circle to the database or other services is also activated using an Adapter interface.

Mediator pattern and MediatR library are just another way to write Interfaces and Implementations (via the TRequest/TResponse type). In fact, you can simply define your own interface and attach the corresponding implementation through your IOC container. However, the main reason I use MediatR is because of its excellent dynamic pipeline behavior, which helps me separate most of the cross-cutting concerns out of the main handler, makes everything cleaner and produces a concise, testable handler class.

A very simple handler in MediatR looks like this

public class InsertUser
{
    /// <summary>
    /// Insert a new User and return that User
    /// </summary>
    public class Request : IRequest<Response>
    {
        public int ClientId { get; set; }
        public string Username { get; set; }
        public string Password { get; set; }
    }

    public class Response
    {
        public User User { get; set; }
    }

    public class Handler : IRequestHandler<Request, Response>
    {
        public async Task<Response> Handle(Request request, CancellationToken cancellationToken)
        {
            // implement your logic here
            await CheckExistence(request.UserName);

            // implement your logic here
            var user = await SomeFunction(cancellationToken);

            return new Response
            {
                User = user
            }
        }
    }
}

In this simplest form, it doesn’t look so different from the way we usually do with a normal Interface. However, let’s imagine what will happen when you want to add these requirements

  • Log the related information to debug later.
  • Track the process metrics to monitor and analyze performance.
  • Lock the process to avoid race-condition.
  • Transform the request/response format in a pre-defined way.
  • Handle errors.
  • Other cross-cutting concerns?…
  • A more important question: How to group related requests and re-use these cross-cutting concern handlers?
Read more