Initial problem

It’s the classic logging issue again! In the Warehouse Management system that I’m working on, the team usually needs to add this logging pattern:

const sendToteToPackingStation = async (warehouseId: string, toteId: string): Promise<Result> => {
  logger.info('Sending tote to packing station', { warehouseId, toteId });

  const result = await someLogic(...);

  logger.info('Send tote result', { result });
  return result;
};

The purpose is simple. It’s what you have to do for production debugging in every system. You should write out some unique ids so you have a place to start querying your logging system. From that point, you will then trace the related entries using a correlation id that your system provides.

As the system scaled up, new bottlenecks appeared from time to time. We then added more logging, for example execution time logging, to feed visualization dashboards that help identify the root cause.

const sendToteToPackingStation = async (warehouseId: string, toteId: string): Promise<Result> => {
  const startTime = performance.now();

  logger.info('Sending tote to packing station', { warehouseId, toteId });
  const result = await someLogic(...);
  logger.info('Send tote result', { result });

  logger.info('Execution time', { durationMs: elapsedTimeMs(startTime) });
  return result;
};

Of course, once this pattern had been repeated many times, we started thinking about writing a higher order function (HOF) that could be reused everywhere in the system.

First implementation…

Let’s begin with the type definition. Here are the generic types that describe what a HOF looks like. It’s a function that receives a function and returns another function with the same signature as the input function.

export type WrappedFunction<
  FunctionArgs extends ReadonlyArray<unknown>,
  FunctionReturn
> = (...args: FunctionArgs) => FunctionReturn;

export type HigherOrderFunction = <
  FunctionArgs extends ReadonlyArray<unknown>,
  FunctionReturn
>(
  func: WrappedFunction<FunctionArgs, FunctionReturn>
) => WrappedFunction<FunctionArgs, FunctionReturn>;
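Building on the types above, here is a minimal sketch of what an execution time logging HOF could look like. It is an illustration rather than the implementation from the post: the Logger interface, the withExecutionTimeLogging name and the name parameter are all assumptions, and Date.now() stands in for performance.now() only to keep the sketch free of environment-specific globals.

```typescript
type WrappedFunction<
  FunctionArgs extends ReadonlyArray<unknown>,
  FunctionReturn
> = (...args: FunctionArgs) => FunctionReturn;

// Hypothetical logger shape; swap in whatever logger the system provides.
interface Logger {
  info(message: string, context?: Record<string, unknown>): void;
}

// Wraps an async function and logs how long each call took.
// The returned function has the same signature as the input function.
const withExecutionTimeLogging =
  (logger: Logger, name: string) =>
  <FunctionArgs extends ReadonlyArray<unknown>, FunctionReturn>(
    func: WrappedFunction<FunctionArgs, Promise<FunctionReturn>>
  ): WrappedFunction<FunctionArgs, Promise<FunctionReturn>> =>
  async (...args: FunctionArgs): Promise<FunctionReturn> => {
    const startTime = Date.now();
    try {
      return await func(...args);
    } finally {
      // Runs whether the wrapped function resolved or threw.
      logger.info('Execution time', { name, durationMs: Date.now() - startTime });
    }
  };
```

Under these assumptions, a function such as sendToteToPackingStation could be wrapped once at definition time instead of repeating the timing code in its body.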
Read more

Let’s build Product, not Software - Part 2

In the previous post, I showed you a real example of how to solve the problem as a Software Engineer. This time, we are going to do it with the Product Engineer approach.

Make Agile great again!

If you refer to the first post that I wrote, Building Product is about delivering user value, collecting feedback and constantly adapting to changes in the business. Does it sound like Agile? Yes, in my opinion, Agile seems to be the best fit out there for small and medium-sized Product companies.

Don’t do this

Let’s start with the non-Agile way, told through this picture-by-picture story:

delivery-1

delivery-2

delivery-4

delivery-3

When you put this into a Software perspective, it’s pretty much the same as the first example that I showed in my previous post. The way most people would choose is to implement the whole feature, from backend to frontend, before delivering it to the customer. Again, do NOT do this.

Read more

Previous post: Let’s build Product, not Software - Part 1

In the first part of this series, I have walked you quickly through some differences between Building Product and Building Software and why Building Product is important. In this post, I’ll show you a simple example, analyze its problem and come up with a solution following the Product Engineer mindset.

A Software Engineer approach

Let’s take a look at this simple project

You are working for a marketing automation platform for E-commerce merchants. The product needs data about the merchant’s sales orders, and your task is to build a one-way Sales Order integration feature to sync data from Shopify to your system.

After going through several discovery steps with your customers and the PO, you decide that it’s time to make the implementation plan. You come up with this plan:

Milestones | Details | Duration
Backend | Build the backend to handle auth & webhook requests from Shopify | 1 month
Frontend | Build the frontend for the users | 1 month
Beta | Release to some beta customers | 0.5 month
Bug fixes | Handle issues reported from customers | 0.5 month
Go live | Release to everybody | 0.5 month

Sounds good? Yes, this is a perfect plan from a Software Engineer perspective, but…

Read more

This may not always be true. It really depends on the type of company, which I will go through in this post. It can also sound strange from a Software Engineering perspective, but once we got used to it, it really did help FMG (my old company) grow into the leading player in that niche market.

Software engineers love technology, for sure. We love building things and applying the latest technology to our product. However, we usually forget one important thing: technology that doesn’t fit the product and cannot support a profitable business is useless. That sounds obvious, right? Surprisingly, a lot of the Software Engineers that I’ve met have made this mistake, especially the talented ones.

Let me walk you through the 2 approaches, compare the differences between them and analyze some real examples. In later posts, you can also find some techniques that I’ve applied to help build a better Product Engineer mindset.

img1

This is converted from a presentation that I made at work

Read more

Just a collection of tips to make working with Postgres (and other SQL-like databases) easier

Integration data

Usually, when you build a system that integrates with a 3rd party service, you need to store integration information related to each entity, for example the id of the entity on the 3rd party system or some of its configuration on that system. Imagine that you are building an e-commerce related product and you want to sync Sales order information from Shopify to analyze customer behavior. The first solution you can think of is to add a column like shopify_entity_id to that table.

  • What will happen if you introduce another integration later? Does the name shopify_entity_id still make sense? You may consider renaming it to external_entity_id. How do you know where it comes from? Adding another source column? How do you store extra 3rd party information about the sales order? Keep adding columns like external_something? Do those columns actually belong to the sales_order table itself?
  • What will happen if a single entity exists on multiple 3rd party systems? For instance, the sales order may be present on both Shopify and another Shipping service. How would you deal with it? Keep adding more columns? What if we introduce another integration?
    • A Json (Jsonb) column could solve the above issue but also creates a whole new problem. How about schema enforcement and constraints? How do we make sure that nobody will accidentally update it with an incorrect schema? How about null and undefined values (in case you are working with Javascript)? How about indexing the values for quick access? You can index inside the json but it just makes things more complicated due to the schema problems mentioned above.

The solution, of course, is a SQL approach: make an entity integration table (sales_order_integration in this case). It’s a 1-N relationship: 1 sales order can have 0 or more integrations.

sales_order table

id | shipping_address | price | weightMg
1 | Ho Chi Minh city | 10 | 20
2 | Hanoi | 20 | 30

sales_order_integration table

id | sales_order_id | external_entity_id | source
1 | 1 | external-id1 | SHOPIFY
2 | 1 | external-id2 | WOOCOMMERCE
3 | 2 | external-id3 | SHOPIFY
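
To make the 1-N relationship concrete, here is a small in-memory sketch of how application code might model and query the two tables. The interface names and the findExternalEntityId helper are hypothetical, and the rows simply mirror the sample data above; a real system would run the equivalent query against the database.

```typescript
type IntegrationSource = 'SHOPIFY' | 'WOOCOMMERCE';

// One row of the sales_order_integration table.
interface SalesOrderIntegration {
  id: number;
  salesOrderId: number;
  externalEntityId: string;
  source: IntegrationSource;
}

// Sample rows copied from the sales_order_integration table above.
const salesOrderIntegrations: SalesOrderIntegration[] = [
  { id: 1, salesOrderId: 1, externalEntityId: 'external-id1', source: 'SHOPIFY' },
  { id: 2, salesOrderId: 1, externalEntityId: 'external-id2', source: 'WOOCOMMERCE' },
  { id: 3, salesOrderId: 2, externalEntityId: 'external-id3', source: 'SHOPIFY' },
];

// Looks up the external id of a sales order on one specific 3rd party system.
// Returns undefined when the order is not integrated with that source.
const findExternalEntityId = (
  integrations: ReadonlyArray<SalesOrderIntegration>,
  salesOrderId: number,
  source: IntegrationSource
): string | undefined =>
  integrations.find(
    (row) => row.salesOrderId === salesOrderId && row.source === source
  )?.externalEntityId;
```

Adding a new integration then means adding a new source value and new rows, not a new column on sales_order.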
Read more

Just a collection of tips to make working with Postgres (and other SQL-like databases) easier

Measurement Unit

Take this case as an example: you have a product table and you want to store product information like size and weight by adding these columns to describe such properties: width, length, height and weight.

id | name | width | length | height | weight
1 | ipad | 10 | 20 | 0.1 | 0.5
2 | macbook | 20 | 30 | 0.5 | 2

So what’s the problem with the above table? We are assuming that the size props (width, height and length) are measured in cm and the weight is measured in kg. Usually, we could put this logic in the application layer to make sure we convert everything to cm and kg before inserting into the database. However, it could lead to even more problems:

  • What will happen if a new dev joins the team? How can you make sure that person will know when to convert and when not to?
  • What will happen if a dev who uses pounds joins the team?
  • What will happen if a dev accidentally assumes the value in weight is in mg?
  • Sometimes, you could do a double conversion, making things worse.
  • You need to remember to add a comment in every place in your code, just to remind people which measurement unit that function is using.
  • Which data type to choose? Integer, of course, is not a good choice. However, working with real, double, decimal or numeric is always harder compared to int. They can cause problems with parsing and data types in languages/libraries like Nodejs.
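
One possible way around these problems, sketched here as an assumption and not as the approach the post goes on to describe, is to put the unit in the name (for example weightMg) and store a whole number in a small unit. The toWeightMg helper and the WeightUnit type below are hypothetical.

```typescript
type WeightUnit = 'mg' | 'g' | 'kg' | 'lb';

// Milligrams per unit. The pound factor follows from 1 lb = 0.45359237 kg.
const MG_PER_UNIT: Record<WeightUnit, number> = {
  mg: 1,
  g: 1_000,
  kg: 1_000_000,
  lb: 453_592.37,
};

// Converts a weight in any supported unit to whole milligrams,
// so the stored value can be an integer with the unit in its name.
const toWeightMg = (value: number, unit: WeightUnit): number =>
  Math.round(value * MG_PER_UNIT[unit]);
```

With this convention, the ipad row above would store 500000 in a weightMg column instead of 0.5 in an ambiguous weight column.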
Read more

An example about configuring PubSub BigQuery Subscription with Pulumi

BigQuery Subscription

It’s hard to view the content of the messages published to a topic because the application has usually processed and acknowledged them before you can do anything. Usually, you have to create another test subscription that the messages are replicated to and then pull messages from it. However, the Google PubSub UI doesn’t provide any way to pull a specific message by id. The GCloud Console UI is frustrating in itself: it is slow to load and you have to pull several times to find the messages you need.

Google offers BigQuery Subscription, which solves that issue and also provides long-term storage for your messages so you can troubleshoot and run complex queries later. In this post, I’m going to show a sample BigQuery Subscription workflow with Pulumi.

Configure BigQuery Dataset and Table

First, you need to create a BigQuery Dataset and a BigQuery Table following the schema defined here. You can do it manually on the UI or via Pulumi

BigQuery Dataset

import * as gcp from "@pulumi/gcp";

const pubsubDatasetId = `pubsub`;

export const pubsubDataset = new gcp.bigquery.Dataset(
  `my-dataset`,
  { datasetId: pubsubDatasetId }
);

BigQuery Table (a bit messy since the schema has to be defined as a JSON string)

export const messageTable = new gcp.bigquery.Table(
  `my-table`,
  {
    datasetId: pubsubDatasetId,
    tableId: `message-values`,
    // set to true if you don't want other people to accidentally delete it
    deletionProtection: true,
    schema: `
  [
    {
      "name": "data",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "The message body"
    },
    {
      "name": "subscription_name",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "message_id",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "publish_time",
      "type": "TIMESTAMP",
      "mode": "NULLABLE",
      "description": ""
    },
    {
      "name": "attributes",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "Message attributes as JSON string"
    }
  ]
  `,
  },
  {
    dependsOn: [pubsubDataset],
  }
);
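
With the dataset and table in place, the subscription itself can be sketched roughly as follows. This is an assumption-laden illustration: the project id, the topic and the resource names are hypothetical placeholders, and the exact options should be checked against the Pulumi GCP provider documentation. writeMetadata is enabled here because the table schema above contains the metadata columns (subscription_name, message_id, publish_time, attributes).

```typescript
import * as gcp from "@pulumi/gcp";

// Hypothetical values for illustration only; replace with your own.
const projectId = "my-project";
const messageTopic = new gcp.pubsub.Topic("my-topic", { name: "my-topic" });

export const bigQuerySubscription = new gcp.pubsub.Subscription(
  "my-bigquery-subscription",
  {
    topic: messageTopic.id,
    bigqueryConfig: {
      // Points at the dataset and table defined above
      table: `${projectId}.pubsub.message-values`,
      // Also write subscription_name, message_id, publish_time and attributes
      writeMetadata: true,
    },
  }
);
```

Note that the Pub/Sub service account also needs permission to write to the BigQuery table, which is not shown in this sketch.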
Read more

Ok, the story is that I’m really bad at CSS. I had never built any frontend component and I was given the task of building a Custom Checkbox component with Reactjs from scratch. Here is how…

1. The basic HTML and CSS

Prepare the structure

Here is how you usually create a checkbox with pure HTML and CSS. To avoid any complicated event handler, I will simply wrap everything in a <label> tag, which lets a click on any element inside reach the corresponding <input> element without any Javascript.

<label class="mylabel">
  <input class="myinput" type="checkbox" name="checkbox" />
  <div class="mylabel">Checkbox label</div>
</label>

.mylabel {
  display: flex;
  gap: 5px;
  align-items: center;
  margin: 2px;
}

Try the live example in the below iframe (or direct link)

Read more

Just a blog post summarising my algorithm course

The Problem

Given a data structure organised as a set of N objects, is there a path connecting 2 objects?

// union: connect 2 objects
// connected: whether 2 objects are connected
union(4, 3);
union(3, 8);
union(6, 5);
union(9, 4);
union(2, 1);

connected(0, 7); // returns false
connected(8, 9); // returns true

union(5, 0);
union(7, 2);
union(6, 1);
union(1, 0);

connected(0, 7); // returns true

Dynamic Connectivity can only answer the question with Yes or No. The implementation cannot give the exact path between 2 objects; it can only tell whether any path connects them.
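
As a rough illustration of the union and connected operations, here is a minimal union-find sketch. The quick-union with path halving variant and the UnionFind class name are my own choices for the example and not necessarily what the course uses.

```typescript
class UnionFind {
  // parent[i] is the parent of object i; a root is its own parent.
  private readonly parent: number[];

  constructor(size: number) {
    this.parent = Array.from({ length: size }, (_, index) => index);
  }

  // Follows parent links up to the root, halving the path along the way.
  private root(index: number): number {
    let current = index;
    while (this.parent[current] !== current) {
      this.parent[current] = this.parent[this.parent[current]];
      current = this.parent[current];
    }
    return current;
  }

  // Connects the components containing p and q.
  union(p: number, q: number): void {
    const rootP = this.root(p);
    const rootQ = this.root(q);
    if (rootP !== rootQ) {
      this.parent[rootP] = rootQ;
    }
  }

  // Two objects are connected exactly when they share a root.
  connected(p: number, q: number): boolean {
    return this.root(p) === this.root(q);
  }
}
```

Because only roots are compared, the structure can say whether a path exists but keeps no record of which path it is.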

Read more

Leetcode: Binary Search

This is so trivial. I just put it here so I can look up faster.

Given an array of integers nums which is sorted in ascending order, and an integer target, write a function to search target in nums. If target exists, then return its index. Otherwise, return -1. You must write an algorithm with O(log n) runtime complexity.

Example 1

Input: nums = [-1,0,3,5,9,12], target = 9
Output: 4
Explanation: 9 exists in nums and its index is 4

Example 2

Input: nums = [-1,0,3,5,9,12], target = 2
Output: -1
Explanation: 2 does not exist in nums so return -1

Constraints

1 <= nums.length <= 10^4
-10^4 < nums[i], target < 10^4
All the integers in nums are unique.
nums is sorted in ascending order.
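
For quick reference, here is a minimal iterative sketch of the search in TypeScript. The function name search follows the problem statement; everything else is just one common way to write it.

```typescript
// Returns the index of target in the ascending array nums, or -1 if absent.
// Each iteration halves the search range, giving O(log n) runtime.
const search = (nums: ReadonlyArray<number>, target: number): number => {
  let low = 0;
  let high = nums.length - 1;

  while (low <= high) {
    const mid = low + Math.floor((high - low) / 2);
    const value = nums[mid];

    if (value === target) {
      return mid;
    }
    if (value < target) {
      low = mid + 1; // target can only be in the right half
    } else {
      high = mid - 1; // target can only be in the left half
    }
  }

  return -1;
};
```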
Read more