In [1]:

```
from IPython.core.display import Image, display
```

The active learning component in Polybot provides AI/ML algorithms that can perform search and optimization of the materials structure-processing-property relationship.

Polybot can utilize off-the-shelf AI/ML algorithms (e.g., Bayesian, stochastic gradient decent, random forest, gaussian processes, etc.) from packages such as Scikit-learn. For exploring the high dimensional structure-processing-property space, we have developed our own algorithm called c-MCTS. Here, we briefly describe the c-MCTS algorithm and provide benchmarking results that demonstrate its performance for different standard trial functions.

Table of Contents

1. Description of our c-MCTS algorithm

- Basic principles of decision tree algorithms
- MCTS for search in a continuous space0

2. Performance and Benchmarking results

- Performance in high dimensional space
- Benchmarking results for standard trial functions

Monte Carlo tree search (MCTS) belongs to the class of decision tree algorithms. It is a powerful algorithm for planning, optimization and learning tasks owing to its generality, simplicity, low computational requirements, and a theoretical bound on exploration-vs-exploitation trade-off.

- Basic principles of decision tree algorithms:

A MCTS search utilizes tree-structure to balance between exploration and exploitation, and the algorithm consists of four key stages:

**selection**:

based on a tree policy select the leaf node that has the highest current score

**expansion**:

add a child node to the selected leaf after taking a possible (unexplored) action

**simulation**:

from the selected node, perform Monte Carlo trials of possible actions using a playout policy to estimate the associated expected reward

**back-propagation**:

pass the rewards generated by the simulated episodes to update the scores of all the parent leafs encountered while moving up the tree.

In [2]:

```
display(Image(filename="cmcts/tree_search.png"))
```

- MCTS for search in a continuous space:

A typical MCTS algorithm only operates in a discrete action space, *e.g.*, "move pawn to e4", "add acetone reagent", or "remove chemical group -COOH". To explore the structure-processing-property space, Polybot needs to make decisions and perform search in a continuous action space.

In [3]:

```
display(Image(filename="cmcts/discreate_to_continuous.png"))
```

This figure highlights the difference between a discrete (left) and continuous (right) action space. For the same boundary conditions, a discrete action space has a finite number of possibilities (e.g., 361 grid points) whereas a continuous action space has infinite number of possibilites.

c-MCTS is specially designed to handle this major difference. We implemented different strategies such as range narrowing and uniqueness score to enable efficient sampling in continous high dimentional space.

</font>

- Performance in high dimensional space:

c-MCTS is efficient in sampling a high dimensional continuous space. The figure below highlights the results of four 50-dimensional trail functions. c-MCTS outperforms standard optimization algorithms such as Bayesian, particle swarm, and random sampling methods both in terms of the number of iterations to reach the solution and the quality of the best solution.

In [4]:

```
display(Image(filename="cmcts/benchmark.png",width=700))
```

- Benchmarking results for standard trial functions:

Here, we compare the performance of c-MCTS with common algorithms in finding the solution for 25 different trial functions. The two performance metrics we focus on are the number of iterations (within 30000 iterations) to reach the target solution and the quality of the best solution obtained. D denotes the dimension of the trial function while the tolerance indicates the accepted error of the solution.

In [5]:

```
display(Image(filename="cmcts/trials.png"))
```