It's has beed a long time since my last post. I was researching the scheduling problem in the Hay Day game for a while. And now is the good time to share the result with you.

## What is Hay Day?

Hay Day is a popular mobile game created by Supercell in 2012. It is a farming simulator where you can plant crops and trees, raise animals, sell products, compete in derbies, and so much more. Hay Day is a freemium game so all features of the game are open to players who choose not to buy in-app-purchases.

In Hay Day, your main job is to make products based on the orders that you receive from the customers. When you complete the order, you will get coins and experience points. The orders are generated randomly and you can choose to accept or decline them. The orders are combine of products that you can make in the game.

To make products in the game, you need two main resources: goods (or product) and machine. Each goods has a production time, a machine that can produce it and goods that are required to make it. For example, to make a *chicken feed*, you need *2 wheat* and *1 corn*. The production time of a *chicken feed* is *5 minutes* and it can be made in the *feed mill*. And

For a deep dive into the game, you can check out the Hay Day Wiki.

## The problem

Hay Day is a really easy game, at least when I started playing. Wheat is planted, gathered, and sold. Alternately, you may use wheat to produce and sell bread. However, as the game progresses, you will acquire additional machines, goods, and orders.

As you can see, just in level 20 of the game, we have 40 goods and 10 machines. The permutation of the goods and machines is huge. At this point, the scheduling issue arises.

So what do we want to solve here? Let clarify the problem:

- An order is a set of goods that need to be made.
- A machine can only produce one good at a time.
- A machine can only produce a good if it has all the required goods.
- A good can only be produced on a specific machine.

Our target:

- Given a set of orders, find a schedule that maximize the
**makespan**of the orders. - The makespan is the time between the start of the first order and the end of the last order.

A simple scheduling to make a order with two egg will look like this:

But in reality, the scheduling is more complex. Here is the image showing the scheduling of 2 orders with 4 goods:

Specifically, our problem is a **Flexible Job Shop Scheduling Problem (FJSP)**. The Flexible Job Shop Scheduling Problem is a variations of the Job-shop scheduling problem (JSSP).

The JSSP can be formulated as follows:

- For $n$ jobs, each one is composed of several operations that must be executed on $m$ machines. For each operation uses only one machine for a fixed duration.
- Each machine can process at most one operation at a time and once an operation starts processing on a given machine, it must complete processing on that same machine without any interruption.
- The operations of a given job have to be processed in a given set of order.
- The problem’s main aim is to schedule the activities or operations of the jobs on the machines such that the total time to finish all jobs, that is the makespan, is minimized.

The term makespan refers to the cumulative time to complete all the operations of all jobs on all machines. It is a measure of the time period from the starting time of the first operation to the ending time of the last operation. Sometimes there may be multiple solutions that have the minimum makespan, but the goal is to find out any one of them: it is not necessary to find all possible optimum solutions

## First attempt

So how can we solve this problem?

The first thing that comes to my mind is genetic algorithm.

The genetic algorithm is a search heuristic inspired by Darwin's theory of evolution. This algorithm borrows the following concepts from natural selection:

- Each individual (solution) has an associated fitness score
- Individuals with high fitness scores are selected for reproduction
- Chosen individuals reproduce to create offspring with the characteristics of both parents
- Some offspring would have random mutations applied to them

The idea is that if the parents have high fitness, the offspring would have high fitness as well.

This entire process of **selection**, **reproduction** (more commonly known as crossover), and **mutation** will repeat many times. At the end, only the fittest individuals will remain. These fittest individuals represent the solutions to our problem.

Only the fittest will survive - Charles Darwin

### Encoding and decoding

Encoding and decoding are the most important parts of the genetic algorithm. The encoding is the way we represent the solution in the genetic algorithm. The decoding is the way we convert the encoded solution to the real solution.

In our case, the encoding solution is a $2 \times n$ matrix where $n$ is the number of task (produce good task) we need to schedule. The first row is the machine that will produce the good and the second row is the id of the good that will be produced. The order in the second row is the order of what task will be produced first. If task $j$ is required to produce task $i$, the index of task $j$ in the second row must be less than the index of task $i$.

For example, given a 5 tasks, 3 machine to schedule:

- Task 1: can be produced by machine 1, required Task 2 and Task 3
- Task 2: can be produced by machine 2,3, required Task 4
- Task 3: can be produced by machine 1, required Task 5
- Task 4: can be produced by machine 1,3
- Task 5: can be produced by machine 3

A valid solution can be encoding as:

$\begin{bmatrix} 1 & 3 & 2 & 1 & 1 \\ 4 & 5 & 2 & 3 & 1 \end{bmatrix}$

Meaning:

- Task 4 will be produced by machine 1, Task 5 will be produced by machine 3. They can be produced at the same time.
- Task 2 will be produced by machine 2 after Task 4 is done.
- Task 3 will be produced by machine 1 after Task 5 is done.
- Task 1 will be produced by machine 1 after Task 2 and Task 3 are done.

To decode the solution, we will create a timeline of the machine and the task that will be produced by the machine at that time. We loop through the solution matrix and add the task to the timeline from left to right. At each step, we check the time with the required tasks of the task that will be produced and the time of the machine that will produce the task.

Here is the visualization of the decoding solution: