## Massively Parallel Nonlinear Device-Level Electromagnetic-Thermal Modeling of Power Electronic Apparatus for HVDC Grid Transient Simulation

by

Ning Lin

A thesis submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Energy Systems

Department of Electrical and Computer Engineering

University of Alberta

©Ning Lin, 2018

# Abstract

The multi-terminal DC (MTDC) grid is becoming a reality thanks to rapid technological advances in the modular multilevel converter (MMC) and the landmark hybrid DC breaker. Electromagnetic transient (EMT) simulation tools, including hardware-in-the-loop (HIL) emulators and off-line EMT-type solvers, play a significant role in converter design and in testing control and protection strategies in preparation for on-site type tests in industry. However, the growth of the DC grid from two-terminal high-voltage DC (HVDC) transmission to multiple stations poses a severe challenge to the computational capability of transient simulators.

Meanwhile, accurate models that provide insight into detailed device behavior are necessary to shorten the design cycle and consequently reduce costs. The performance of power semiconductor switches in a megawatt or even gigawatt converter is a particular concern. Mainstream simulators currently used for HVDC grid studies lack device-level transient models, even though the actual voltage and current stresses of the switches, as well as their junction temperature, should be estimated during design to avoid unnecessary shutdowns that may incur heavy economic losses.

In the trade-off between computation speed and depth of information, system-level simulation tools, especially the HIL emulators, favor the former, while other off-line solvers are dedicated to the latter. These off-line solvers, however, are unable to compute a large HVDC grid, since the simulation is extraordinarily slow and frequently terminates due to numerical divergence.

Therefore, the focus of this research is to implement device-level power semiconductor switch models for power electronic apparatus in system-level HVDC grid transient simulation, thereby ensuring both computational efficiency and high fidelity. Circuit partitioning is an effective approach for splitting the HVDC grid into a substantial number of sub-circuits that can be processed by parallel algorithms. Parallel processing is another key aspect of the work, in which two types of processors, i.e., the field programmable gate array (FPGA) and the graphics processing unit (GPU), are utilized for different scenarios. Three models of the insulated-gate bipolar transistor (IGBT) and its anti-parallel diode are proposed, i.e., the linearized curve-fitting model, the dynamic curve-fitting model, and the nonlinear behavioral model, to cater for various simulation objectives of the HVDC grid. With the MMC submodules partitioned from their arms, real-time execution becomes feasible on the FPGA with a time-step of 500 ns for the two curve-fitting models, while the results match those of commercial off-line simulation tools and experimental measurements. A multi-layer architecture is proposed in the hardware design so that the system-level and device-level models run simultaneously under distinct time-steps to ensure high fidelity of the emulated IGBT behavior. Similarly, the hybrid HVDC breaker can be represented by a basic unit with a smaller number of nodes after circuit partitioning. Thus, a dramatic reduction in hardware resource utilization is achieved, facilitating the deployment of a large MTDC grid on the FPGA.

The GPU is investigated for efficient off-line simulation of large-scale MTDC grids. The single-instruction-multiple-thread (SIMT) mode enables a GPU kernel to launch many threads and execute them concurrently. Therefore, power electronic components with the same attributes are written as one kernel to achieve a massively parallel computing architecture. For efficiency comparison, a multi-core CPU program for the MTDC grid is also developed. For the entire CIGRÉ B4 DC grid system with nonlinear behavioral IGBT/diode models, the GPU attains up to 134 and 265 times speedup over multi-core CPU parallelism when half-bridge and full-bridge submodules, respectively, are employed in the MMC. The accuracy of the GPU simulation is validated against industry-standard tools such as SaberRD<sup>®</sup> and PSCAD/EMTDC<sup>®</sup>.

# Preface

The content of this thesis is based on original work by Ning Lin. As detailed below, material from some chapters of the thesis has been published as journal articles under the supervision of Dr. Venkata Dinavahi, who contributed to concept formation and provided comments and corrections on the article manuscripts.

Chapter 3 includes the results published in the following paper:

- N. Lin and V. Dinavahi, "Behavioral device-level modeling of modular multilevel converters in real time for variable-speed drive applications", *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 5, no. 3, pp. 1177-1191, Sep. 2017.

Chapter 4 contains contents from the following papers:

- N. Lin and V. Dinavahi, "Dynamic electro-magnetic-thermal modeling of MMC-based DC-DC converter for real-time simulation of MTDC grid", *IEEE Trans. Power Del.*, vol. 33, no. 3, pp. 1337-1347, Jun. 2018.
- N. Lin, B. Shi, and V. Dinavahi, "Non-linear behavioural modelling of device-level transients for complex power electronic converter circuit hardware realisation on FPGA", *IET Power Electron.*, vol. 11, no. 9, pp. 1566-1574, Aug. 2018.

The material presented in Chapter 5 has been published or submitted for publication:

- N. Lin and V. Dinavahi, "Detailed device-level electrothermal modeling of the proactive hybrid HVDC breaker for real-time hardware-in-the-loop simulation of DC grids", *IEEE Trans. Power Electron.*, vol. 33, no. 2, pp. 1118-1134, Feb. 2018.
- N. Lin and V. Dinavahi, "Real-time transient electro-thermal model of thyristor-based ultrafast mechatronic circuit breaker for DC grid application", *under review*.

The contents from the following papers are included in Chapter 6:

- N. Lin and V. Dinavahi, "High-fidelity massively parallel electromagnetic transient simulation of large-scale MTDC grids", *under review*.
- N. Lin and V. Dinavahi, "Exact nonlinear micro-modeling for fine-grained parallel EMT simulation of MTDC grid interaction with wind farm", *IEEE Trans. Ind. Electron.*, accepted, pp. 1-10, DOI: 10.1109/TIE.2018.2860566.

Chapter 7 is based on the following paper that is currently under peer review:

- N. Lin and V. Dinavahi, "Variable time-stepping MMC model for fast parallel EMT simulation of MTDC grid", *under review*.

To my parents for your unconditional love and to my elder sister who always supports me.

# Acknowledgements

I would like to express my sincere appreciation to my supervisor *Prof. Venkata Dinavahi* for his support, encouragement, and inspiring guidance throughout my studies at the University of Alberta. His passion for research has been a tremendous impetus to me.

It is an honor for me to extend my gratitude to my Ph.D. committee members *Dr. John Salmon, Dr. Yasser Abdel-Rady I. Mohamed, Dr. Edmond Lou* and *Dr. Vijay Sood* for reviewing my thesis and providing thoughtful comments to improve it.

I also wish to thank the members of the RTX-Lab, particularly *Mr. Bo Shi*, who worked with me at the beginning of my research.

Many thanks to my friends and former colleagues for providing me with the latest industrial information in this domain and for their support whenever I needed it.

My gratitude to my parents is beyond words, and I also thank my elder sister and brother-in-law for their great support and understanding of my decisions.

# Table of Contents

| Chapter | Section | Subsection | Title |
|---|---|---|---|
| 1 | | | Introduction |
| | 1.1 | | Literature Review |
| | | 1.1.1 | Modular Multi-level Converter Modeling |
| | | 1.1.2 | Hybrid HVDC Breaker Modeling |
| | | 1.1.3 | IGBT and Diode Models |
| | | 1.1.4 | Variable Time-Stepping Methods |
| | | 1.1.5 | Circuit Partitioning |
| | | 1.1.6 | Real-time Hardware-In-The-Loop Emulation |
| | | 1.1.7 | Off-line EMT Simulation |
| | 1.2 | | Motivation of this work |
| | 1.3 | | Thesis Objectives |
| | 1.4 | | Contribution of the Thesis |
| | 1.5 | | Application of the Work |
| | 1.6 | | Thesis Outline |
| 2 | | | Overview of Parallel Processors |
| | 2.1 | | Introduction |
| | 2.2 | | FPGA Introduction |
| | | 2.2.1 | FPGA Hardware Architecture |
| | | 2.2.2 | Configurable Logic Block |
| | | 2.2.3 | Block RAM (BRAM) |
| | | 2.2.4 | Digital Signal Processing (DSP) Slice |
| | 2.3 | | HIL Implementation Procedure |
| | | 2.3.1 | Vivado<sup>®</sup> High-Level Synthesis Tool |
| | | 2.3.2 | Vivado<sup>®</sup> Top-Level Design |
| | | 2.3.3 | FPGA Experiment |
| | 2.4 | | GPU Introduction |
| | | 2.4.1 | NVIDIA<sup>®</sup> GPU Architecture |
| | | 2.4.2 | GPU Massively Parallel Processing |
| | | 2.4.3 | Multi-Core CPU |
| | 2.5 | | Summary |

| Chapter | Section | Subsection | Title |
|---|---|---|---|
| 3 | | | Linearized Device-Level Modular Multi-Level Converter Model |
| | 3.1 | | Introduction |
| | 3.2 | | EMT Model of Basic Elements |
| | | 3.2.1 | Resistor |
| | | 3.2.2 | Inductor |
|   |                                        | 3.2.3                                                                                                                                          | Capacitor                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 33                                                                                                                          |
|   |                                        | 3.2.4                                                                                                                                          | TLM-Link                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 34                                                                                                                          |
|   | 3.3                                    | Powe                                                                                                                                           | r Semiconductor Switch-Based MMC Modeling                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 35                                                                                                                          |
|   |                                        | 3.3.1                                                                                                                                          | IGBT/Diode Curve-Fitting Model                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 36                                                                                                                          |
|   | 3.4                                    | Fine-O                                                                                                                                         | Grained MMC Partitioning Schemes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 39                                                                                                                          |
|   |                                        | 3.4.1                                                                                                                                          | TLM-Link Partitioning                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 39                                                                                                                          |
|   | 3.5                                    | Hardy                                                                                                                                          | ware Design on FPGA                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 42                                                                                                                          |
|   |                                        | 3.5.1                                                                                                                                          | Hardware Platform                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 42                                                                                                                          |
|   |                                        | 3.5.2                                                                                                                                          | Controller Emulation                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 42                                                                                                                          |
|   |                                        | 3.5.3                                                                                                                                          | MMC Emulation on FPGA                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 43                                                                                                                          |
|   | 3.6                                    | Real-1                                                                                                                                         | Time Emulation Results                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | 47                                                                                                                          |
|   |                                        | 3.6.1                                                                                                                                          | MMC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 47                                                                                                                          |
|   |                                        | 3.6.2                                                                                                                                          | Induction Machine Driven by 5-Level MMC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 52                                                                                                                          |
|   | 3.7                                    | Sumn                                                                                                                                           | nary                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 53                                                                                                                          |
|   |                                        |                                                                                                                                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                             |
| 4 | Nor                                    | nlinear                                                                                                                                        | Device-Level Modular Multi-Level Converter Model                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 55                                                                                                                          |
| 4 | <b>Nor</b><br>4.1                      | nlinear<br>Introc                                                                                                                              | Device-Level Modular Multi-Level Converter Model                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | <b>55</b><br>55                                                                                                             |
| 4 | Nor<br>4.1<br>4.2                      | ilinear<br>Introc<br>Powe                                                                                                                      | Device-Level Modular Multi-Level Converter Model       luction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | <b>55</b><br>55<br>56                                                                                                       |
| 4 | Nor<br>4.1<br>4.2                      | Introc<br>Introc<br>Powe<br>4.2.1                                                                                                              | Device-Level Modular Multi-Level Converter Model       luction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | <b>55</b><br>55<br>56<br>56                                                                                                 |
| 4 | Nor<br>4.1<br>4.2                      | Introc<br>Powe<br>4.2.1<br>4.2.2                                                                                                               | Device-Level Modular Multi-Level Converter Model       luction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | <b>55</b><br>55<br>56<br>56<br>57                                                                                           |
| 4 | Nor<br>4.1<br>4.2                      | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3                                                                                                      | Device-Level Modular Multi-Level Converter Model       luction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | <b>55</b><br>55<br>56<br>56<br>56<br>57<br>60                                                                               |
| 4 | Nor<br>4.1<br>4.2                      | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.3                                                                                             | Device-Level Modular Multi-Level Converter Model       Huction       Iuction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model                                                                                                                                                                                                                                                                                                                                                                | <b>55</b><br>55<br>56<br>56<br>57<br>60<br>61                                                                               |
| 4 | Nor<br>4.1<br>4.2                      | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5                                                                                    | Device-Level Modular Multi-Level Converter Model       luction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network                                                                                                                                                                                                                                                                                                                                                | <b>55</b><br>55<br>56<br>56<br>57<br>60<br>61<br>63                                                                         |
| 4 | Nor<br>4.1<br>4.2<br>4.3               | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5<br>Fine-O                                                                          | Device-Level Modular Multi-Level Converter Model       Huction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network       Grained MMC Partitioning Schemes                                                                                                                                                                                                                                                                                                         | <b>55</b><br>55<br>56<br>56<br>57<br>60<br>61<br>63<br>65                                                                   |
| 4 | Nor<br>4.1<br>4.2<br>4.3               | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5<br>Fine-0<br>4.3.1                                                                 | Device-Level Modular Multi-Level Converter Model       luction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network       Grained MMC Partitioning Schemes       V-I Coupling                                                                                                                                                                                                                                                                                      | <b>55</b><br>56<br>56<br>57<br>60<br>61<br>63<br>65<br>65                                                                   |
| 4 | Nor<br>4.1<br>4.2<br>4.3               | Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5<br>Fine-C<br>4.3.1<br>4.3.2                                                        | Device-Level Modular Multi-Level Converter Model       luction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network       Grained MMC Partitioning Schemes       V-I Coupling       Hybrid Arm Model                                                                                                                                                                                                                                                               | <b>55</b><br>56<br>56<br>57<br>60<br>61<br>63<br>65<br>65<br>65                                                             |
| 4 | Nor<br>4.1<br>4.2<br>4.3<br>4.4        | linear<br>Introc<br>Powe<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5<br>Fine-C<br>4.3.1<br>4.3.2<br>Hardy                                     | Device-Level Modular Multi-Level Converter Model       luction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network       Grained MMC Partitioning Schemes       V-I Coupling       Hybrid Arm Model       MMC TLM-Stub Integer                                                                                                                                                                                                                                    | <b>55</b><br>55<br>56<br>57<br>60<br>61<br>63<br>65<br>65<br>65<br>65<br>67                                                 |
| 4 | Nor<br>4.1<br>4.2<br>4.3<br>4.4        | Introd<br>Power<br>4.2.1<br>4.2.2<br>4.2.3<br>4.2.4<br>4.2.5<br>Fine-C<br>4.3.1<br>4.3.2<br>Hardy<br>4.4.1                                     | Device-Level Modular Multi-Level Converter Model       luction       r Semiconductor Switch-Based MMC Modeling       MMC TLM-Stub Model (TLM-S)       IGBT/Diode Dynamic Curve-Fitting Model       Power Diode Nonlinear Behavioral Model       IGBT Nonlinear Behavioral Model       Electro-Thermal Network       V-I Coupling       Hybrid Arm Model       MMC TLM-Station Case 1 – DCFM       DC-DC Converter HIL Emulation                                                                                                                                                                                                                              | <b>55</b><br>55<br>56<br>57<br>60<br>61<br>63<br>65<br>65<br>65<br>67<br>67                                                 |
| 4 |      |         | Nonlinear Device-Level Modular Multi-Level Converter Model | 55 |
|   | 4.1  |         | Introduction | 55 |
|   | 4.2  |         | Power Semiconductor Switch-Based MMC Modeling | 56 |
|   |      | 4.2.1   | MMC TLM-Stub Model (TLM-S) | 56 |
|   |      | 4.2.2   | IGBT/Diode Dynamic Curve-Fitting Model | 57 |
|   |      | 4.2.3   | Power Diode Nonlinear Behavioral Model | 60 |
|   |      | 4.2.4   | IGBT Nonlinear Behavioral Model | 61 |
|   |      | 4.2.5   | Electro-Thermal Network | 63 |
|   | 4.3  |         | Fine-Grained MMC Partitioning Schemes | 65 |
|   |      | 4.3.1   | V-I Coupling | 65 |
|   |      | 4.3.2   | Hybrid Arm Model | 65 |
|   | 4.4  |         | Hardware Emulation Case 1 – DCFM | 67 |
|   |      | 4.4.1   | DC-DC Converter HIL Emulation | 67 |
|   |      | 4.4.2   | Real-Time HIL Emulation Results | 71 |
|   |      |         | 4.4.2.1 Device-Level Behavior | 71 |
|   |      |         | 4.4.2.3 System Tests | 73 |
|   | 4.5  |         | Hardware Emulation Case 2 – NBM | 76 |
|   |      | 4.5.1   | Power Converter HIL Emulation | 76 |

|   |      | 4.5.3   | Islanded MMC Performance | 78 |
|---|------|---------|--------------------------|----|
|   |      | 4.5.4   | MMC-MVDC Performance |  |
|   | 4.6  |         | Summary |  |
| 5 |      |         | High-Fidelity Device-Level Hybrid HVDC Breaker Models | 85 |
|   | 5.1  |         | Introduction |  |
|   | 5.2  |         | HHB in MTDC System |  |
|   |      | 5.2.1   | MTDC Schematic |  |
|   |      | 5.2.2   | DC Line Protection |  |
|   |      |         | 5.2.2.1 Voltage Derivative Protection |  |
|   |      |         | 5.2.2.2 Over Current Protection |  |
|   | 5.3  |         | Proactive Hybrid HVDC Breaker |  |
|   |      | 5.3.1   | EMT Model of the Proposed HHB | 88 |
|   |      | 5.3.2   | Varistor Model |  |
|   |      | 5.3.3   | General HHB Unit Model |  |
|   |      | 5.3.4   | Two-Node IGBT Models | 93 |
|   |      | 5.3.5   | IGBT Nonlinear Behavioral Model | 95 |
|   |      |         | 5.3.5.1 IGBT Fourth-order Behavioral Model |  |
|   |      |         | 5.3.5.2 Parameters Extraction |  |
|   |      |         | 5.3.5.3 Sensitivity Analysis |  |
|   |      |         | 5.3.5.4 Model Parallelization |  |
|   |      | 5.3.6   | Electro-Thermal Network |  |
|   | 5.4  |         | Hardware Implementation on FPGA |  |
|   | 5.5  |         | HIL Emulation Results |  |
|   |      | 5.5.1   | Device-Level Performance |  |
|   |      | 5.5.2   | System-Level Performance |  |
|   | 5.6  |         | Ultrafast Mechatronic Circuit Breaker |  |
|   |      | 5.6.1   | Thyristor Modeling |  |
|   |      | 5.6.2   | UFMCB Modeling | 114 |
|   |      | 5.6.3   | UFMCB Hardware Design |  |
|   |      | 5.6.4   | UFMCB Real-Time Tests and Validation |  |
|   | 5.7  |         | Summary |  |
| 6 |      |         | Fixed Time-Step CIGRÉ DC Grid Simulation on GPU | 122 |
|   | 6.1  |         | Introduction |  |
|   | 6.2  |         | Wind Farm-Integrated MTDC Grid |  |
|   |      | 6.2.1   | Induction Machine Model |  |
|   |      | 6.2.2   | Three-Phase Transformer |  |
|   |      | 6.2.3   | Frequency Dependent Line Model |  |
|   |      | 6.2.4   | Aggregated Wind Farm EMT Model |  |

|    |       | 6.2.5 IGBT/Diode Grouping                                 | 128 |
|----|-------|-----------------------------------------------------------|-----|
|    | 6.3   | MTDC Grid GPU Program Design                              | 130 |
|    |       | 6.3.1 MTDC Multi-Level Partitioning Scheme                | 130 |
|    |       | 6.3.2 MMC GPU Kernel Design                               | 132 |
|    |       | 6.3.3 Hybrid Circuit Breaker Model                        | 135 |
|    |       | 6.3.3.1 HHB GPU Computational Kernel                      | 137 |
|    |       | 6.3.4 Construction of Large-Scale MTDC Grids              | 138 |
|    | 6.4   | EMT Simulation Results with CFM-IGBT                      | 138 |
|    |       | 6.4.1 GPU Simulation of Basic MMC                         | 139 |
|    |       | 6.4.2 GPU Simulation for Point-to-Point HVDC Transmission | 141 |
|    |       | 6.4.3 GPU Simulation of MTDC Grid Test Cases              | 144 |
|    | 6.5   | EMT Simulation Results with NBM-IGBT                      | 150 |
|    |       | 6.5.1 Device-Level Switching Transients                   | 150 |
|    |       | 6.5.2 Wind Farm Integration Dynamics                      | 152 |
|    |       | 6.5.3 MTDC System Tests                                   | 154 |
|    | 6.6   | Summary                                                   | 155 |
| 7  |       | MTDC Grid Variable Time-Stepping Simulation on GPU        | 158 |
|    | 7.1   | Introduction                                              | 158 |
|    | 7.2   | Proposed Variable Time-Stepping Schemes                   | 158 |
|    |       | 7.2.1 Event-Correlated Criterion                          | 158 |
|    |       | 7.2.2 Local Error Truncation                              | 159 |
|    |       | 7.2.3 Newton-Raphson Iteration Count                      | 160 |
|    |       | 7.2.4 Hybrid Time-Step Control and Synchronization        | 160 |
|    |       | 7.2.5 VTS-Based MMC                                       | 161 |
|    |       | 7.2.5.1 Two-State Switch Model                            | 161 |
|    |       | 7.2.5.2 MMC Main Circuit                                  | 162 |
|    |       | 7.2.5.3 VTS MMC Kernel                                    | 162 |
|    | 7.3   | VTS Simulation Results and Validation                     | 163 |
|    |       | 7.3.1 System Setup                                        | 163 |
|    |       | 7.3.2 VTS in Device-Level Simulation                      | 164 |
|    |       | 7.3.3 MTDC System Preview                                 | 164 |
|    | 7.4   | Summary                                                   | 168 |
| 8  |       | Conclusions and Future Works                              | 171 |
|    | 8.1   | Contributions of Thesis                                   | 172 |
|    | 8.2   | Directions for Future Work                                | 174 |
|    |       | Bibliography                                              | 175 |

| Appendix A |                                       | 188 |  |
|--------|-------------------------------------------|-----|--|
| A.1    | IGBT/Diode NBM Parameters                 | 188 |  |
| A.2    | IGBT DCFM Parameters                      | 188 |  |
| A.3    | SST Test Case Parameters in Chapter 4     | 189 |  |
| A.4    | MVDC System Parameters in Chapter 4       | 189 |  |
| A.5    | Full NBM IGBT Matrix                      | 190 |  |
| Appendix B |                                       | 191 |  |
| B.1    | MTDC System Parameters in Chapter 5       | 191 |  |
| B.2    | Transmission Line Parameters in Chapter 5 | 191 |  |
| B.3    | ABB HHB Parameters in Chapter 5           | 191 |  |
| B.4    | Alstom Grid HHB Parameters in Chapter 5   | 191 |  |
| B.5    | UFMCB Companion Model                     | 192 |  |
| Appendix C |                                       | 193 |  |
| C.1    | CIGRÉ B4 DC Grid Parameters               | 193 |  |
| C.2    | Greater CIGRÉ DC Grid                     | 193 |  |
| Appendix D |                                       |     |  |

# List of Tables

| 2.1 | FPGA logic resources                                                                                            | 17   |
|-----|-----------------------------------------------------------------------------------------------------------------|------|
| 2.2 | Dual-port RAM description                                                                                       | 20   |
| 2.3 | GeForce® GTX 1080 and Tesla® V100 specifics                                                                     | 25   |
| 3.1 | MMC submodule operation states                                                                                  | 36   |
| 3.2 | Hardware utilization of the MMC-IM system                                                                       | 42   |
| 3.3 | Latencies of different hardware modules in the 5-level MMC-IM system                                            | 44   |
| 3.4 | Parameters of MMC-IM system                                                                                     | 47   |
| 3.5 | Switching times of IGBT and diode                                                                               | 50   |
| 3.6 | Energy consumption validation of proposed IGBT and diode model                                                  | 51   |
| 4.1 | MMC model simulation speed comparison                                                                           | 67   |
| 4.2 | MMC hardware design specifications                                                                              | 70   |
| 4.3 | Simulation execution times from EMT simulators and HIL systems                                                  | 79   |
| 4.4 | Validation of IGBT and power diode nonlinear behavioral models by SaberRD®                                      | 81   |
| 5.1 | IGBT parameters as a function of junction temperature                                                           | 99   |
| 5.2 | Latencies and hardware resource utilization of principal hardware modules in the 3-terminal HVDC system         | 100  |
| 5.3 | Energy consumed by different HHB components                                                                     | 108  |
| 5.4 | UFMCB parts hardware design summary                                                                             | 116  |
| 6.1 | Execution time $t_{exe}$ of different platforms for 1s simulation duration                                      | 140  |
| 6.2 | CPU and GPU execution times of $\pm 100$ kV HVDC for 1s simulation                                              | 143  |
| 6.3 | CPU and GPU execution times of $\pm 200$ kV DCS2 for 1s simulation                                              | 147  |
| 6.4 | CPU and GPU execution times of the CIGRÉ B4 DC system for 1s simulation                                         | 147  |
| 6.5 | CPU and GPU execution times of the Greater CIGRÉ DC system for 1s simulation                                    | 150  |
| 6.6 | NBM-based MMC execution time by various platforms for 100ms duration                                            | 151  |
| 6.7 | Execution time of CIGRÉ B4 DC grid by CPUs and GPU for 1s duration                                              | 155  |
| 7.1 | Comparison of VTS schemes' efficiency on CPU                                                                    | 167  |
| A.1 | Behavioural IGBT and diode parameters provided by SaberRD®                                                      | 188  |

# List of Figures

| 1.1  | Power semiconductor switch model categorization. | 3  |
|------|------------------------------------------------------------------------------------------------------------|----|
| 2.1  | FPGA hardware: (a) mesh architecture, (b) Virtex®-7 ASMBL architecture. | 18 |
| 2.2  | Configurable logic block architecture. | 19 |
| 2.3  | 7-series FPGA block memory: (a) simple dual-port RAM, (b) true dual-port RAM. | 19 |
| 2.4  | Basic DSP48E1 slice functionality. | 20 |
| 2.5  | Hardware design procedure and experimental setup. | 21 |
| 2.6  | Demonstration of top-level hardware design. | 24 |
| 2.7  | GeForce GTX 1080 block diagram. | 26 |
| 2.8  | Streaming multiprocessor diagram of (a) GeForce GTX 1080, (b) Tesla V100. | 27 |
| 2.9  | GPU implementation process. | 27 |
| 3.1  | A resistor and its EMT model. | 31 |
| 3.2  | Inductor: (a) symbol, (b) companion model, (c) TLM representation, and (d) TLM-stub model. | 32 |
| 3.3  | Capacitor: (a) symbol, (b) companion model, (c) TLM representation, and (d) TLM-stub model. | 33 |
| 3.4  | Lossless transmission line and its TLM-link model. | 34 |
| 3.5  | MMC configuration and its half-bridge submodule models. | 35 |
| 3.6  | The behavior of IGBT and diode: (a) IGBT static $I$-$V$ characteristics and switching transient waveforms, and (b) diode static $I$-$V$ characteristics and reverse recovery process. | 37 |
| 3.7  | Unified IGBT/diode pair behavioral model for (a) static characteristics, and (b) dynamic features. | 39 |
| 3.8  | TLM-based model for $(N+1)$-level MMC: (a) MMC partitioning approach, and (b) discretized schematic for the overall system. | 40 |
| 3.9  | Control algorithm for the MMC-IM system. | 43 |
| 3.10 | Hardware structure and signal flow diagram for the FPGA emulation of the MMC-IM system. | 45 |
| 3.11 | Finite state machine of the overall MMC-IM system for hardware emulation. | 46 |

| 3.12 | Comparison of performances of 5-level ((a), (b) and (c)) and 7-level ((d), (e) and (f)) MMC between real-time HIL emulation (top) and SaberRD® (bottom). (a) 5-level MMC output voltage, (b) arm currents, (c) DC voltage ripples of submodules in upper and lower arms, (d) 7-level MMC output voltage, and (e), (f) DC voltage ripples of submodules. Oscilloscope axes settings: (a), (d) x-axis 5 ms/div, y-axis 133.34 V/div ($v_{out}$) and 66.67 V/div (FFT); (b) x-axis 5 ms/div, y-axis 13.333 A/div; (c), (e) and (f) x-axis 5 ms/div, y-axis 2.667 V/div. | 48 |
|------|-----------------------------------------------------------------------------------------------|----|
| 3.13 | Details of switching processes and power losses of IGBT or diode from HIL emulation (top) and SaberRD® simulation (bottom). (a) IGBT turning on, (b) IGBT turning off, and (c) diode reverse recovery. Oscilloscope axes settings: x-axis 1 $\mu$s/div, y-axis 40 V/div and 26.67 A/div. | 50 |
| 3.14 | System-level behavior of 11-level MMC: (a) real-time oscilloscope results; (b) SaberRD® simulation results. Oscilloscope axes settings: x-axis 5 ms/div, y-axis 133.34 V/div ($v_{out}$), 66.67 V/div (FFT) and 66.67 A/div. | 51 |
| 3.15 | Regulation of induction machine speed by 5-level MMC: (a) real-time oscilloscope results, and (b) off-line simulation results. Oscilloscope x-axis: 1 s/div. | 53 |
| 3.16 | Real-time oscilloscope results of stator current in $\alpha$-$\beta$ frame under (a) starting period, and (b) steady-state with $T_m$=0, 100 and 200 N·m, respectively. Oscilloscope x- and y-axis settings: (a) 93.34 A/div; (b) 26.67 A/div. | 54 |
| 4.1  | MMC TLM-stub model: (a) SM on-state/blocked state, (b) SM off-state/blocked state, and (c) general representation. | 57 |
| 4.2  | IGBT transient waveforms from a bridge-structure test circuit: (a) turn-on process, (b) turn-off process, and (c) coefficient $K$ determination. | 59 |
| 4.3  | Dynamic IGBT model: (a) VCCS for descending curves, (b) CCCS for rising curves. | 60 |
| 4.4  | Nonlinear power diode model: (a) Simplified power diode model, (b) linearized discrete-time equivalent circuit. | 61 |
| 4.5  | Nonlinear IGBT EMT model: (a) Continuous-time behavioral model, (b) Linearized discrete-time equivalent circuit. | 62 |
| 4.6  | IGBT inherent electro-thermal transient network. | 64 |
| 4.7  | MMC partitioning by $V$-$I$ coupling. | 66 |
| 4.8  | MMC hybrid arm model with $V$-$I$ couplings. | 67 |
| 4.9  | MMC-based DC-DC converter for MTDC system. | 68 |
| 4.10 | Top-level hardware structure of $MMC_H$. | 68 |
| 4.11 | SST top-level finite state machine. | 69 |
| 4.12 | SST control scheme: (a) $MMC_L$ controller, and (b) $MMC_H$ controller. | 69 |

| 4.13 | IGBT transient tests under different collector current at $T_{vj}$=125°C: (a) turn-on process, (b) turn-off process, and (c) turn-on and turn-off energy. | 72 |
|------|-------------------------------------------------------------------------------------------------|-------------------|
| 4.14 | $MMC_H$ lower IGBT operation status: (a) power loss waveforms for 55L-MMC, (b) junction temperature waveforms for 55L-MMC, (c) 5L-MMC SaberRD® validation, and (d) relation between $T_{vj}$ and MMC level. | 73 |
| 4.15 | SST converter-level results (left: HIL emulation; right: PSCAD/EMTDC®): (a), (b) MFT primary and secondary voltages at 60Hz and 180Hz, (c) SM DC voltage ripples, and (d) ideal MMC models comparison. Oscilloscope horizontal axes setting: 10ms/div. | 74 |
| 4.16 | MTDC system power reversal from HIL emulation (up/left) and PSCAD/EMTDC® (bottom/right). Oscilloscope horizontal axes setting: 1s/div. | 75 |
| 4.17 | SST fault isolation test waveforms from HIL emulation (up/left) and PSCAD/EMTDC® (bottom/right). Oscilloscope horizontal axes setting: 0.5s/div. | 76 |
| 4.18 | $MMC_1$ blocked state test results. | 77 |
| 4.19 | Passive charging of $MMC_1$ with opened DC line. | 77 |
| 4.20 | MMC-based MVDC system: (a) system configuration, and (b) station control scheme. | 78 |
| 4.21 | Hardware architecture and its signal flow routes for the MMC submodule with non-linear behavioral switch models. | 78 |
| 4.22 | System-level performance of MMC with non-linear behavioral models from proposed models (top, middle) and SaberRD® simulation (bottom): (a) Output voltage, (b) Capacitor voltages, and (c) Arm currents. Oscilloscope y-axis: (a) 396V(A)/div., (b) 155V/div., (c) 155A/div.; x-axis: 50ms/div. | 81 |
| 4.23 | Performance of MMC with non-linear behavioral models from HIL emulation (top) and SaberRD® simulation (bottom): (a) IGBT turn-on without dead-time, (b-c) Switching transients with 5 $\mu$s dead-time, (d-e) Switching transients with 2 $\mu$s dead-time, and (f) Operation of complementary switches in a SM from HIL emulation. Oscilloscope y-axis: (a) 156V(A)/div., (b)-(e) 130V(A)/div., (f) 255V(A)/div.; x-axis: (a)-(e) 5 $\mu$s/div., (f) 10ms/div. | 82 |
| 4.24 | MVDC System-level performance from HIL emulation (top) and PSCAD/EMTDC® (bottom): (a) System start, (b) Line-to-line fault response, and (c) Power reversal. Oscilloscope y-axis: (a) 2.58kV/div., (b) 1.73kV/div., 272A/div., (c) 1.72kV/div., 246A/div.; x-axis: (a) 1s/div., (b) 100ms/div., (c) 10s/div. | 83 |
| 5.1  | Schematic of a three-terminal monopole HVDC system and its control and protection concepts. | 87 |
| 5.2  | Models of unidirectional HHB for EMT simulation: (a) scaled-down model, (b) conventional full-scale model, (c) $3 \times 3$ IGBT array, and (d) decomposition of HHB full-scale model using $v$-$i$ coupling. | 89 |

| 5.3  | HVDC power transfer corridor with HHB separated: (a) equivalent circuit topology, (b) EMT simulation model.                                                                                                                      | 90  |
|------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| 5.4  | (a) IGBT two-state switch model, (b) steady-state representation of IGBT                                                                                                                                                         |     |
|      | IGBT curve-fitting model.                                                                                                                                                                                                        | 94  |
| 5.5  | (a) IGBT fourth-order nonlinear behavioral model, (b) IGBT second-order<br>nonlinear behavioral model, (c) general representation of IGBT behavioral<br>model, and (d) linearized discrete-time equivalent model for electromag- |     |
|      | netic transient analysis                                                                                                                                                                                                         | 96  |
| 5.6  | Hardware design of HHB integrated with MMC in a pipelined structure on                                                                                                                                                           |     |
| 5.7  | the FPGA and signal flow routes                                                                                                                                                                                                  | 101 |
|      | ules                                                                                                                                                                                                                             | 102 |
| 5.8  | Turn-off performance of HHB with RCD snubber circuit: (a) MB NBM model of IGBT from HIL emulation, (b) MB TSSM model of IGBT from HIL emu-                                                                                       |     |
|      | lation, (c) single MB IGBT power loss, (d) MB NBM model of IGBT from                                                                                                                                                             |     |
|      | SaberRD <sup>®</sup> , (e) MB TSSM model of IGBT from PSCAD/EMTDC <sup>®</sup> , and (f)                                                                                                                                         |     |
|      | MB IGBT junction temperature. Oscilloscope horizontal axes settings: $20\mu s/d$                                                                                                                                                 | iv. |
|      |                                                                                                                                                                                                                                  | 104 |
| 5.9  | Turn-off performance of HHB with RC snubber circuit: (a) MB NBM model                                                                                                                                                            |     |
|      | of IGBT from HIL emulation, (b) MB CFM model of IGBT from HIL emula-                                                                                                                                                             |     |
|      | tion, (c) single MB IGBT power loss, (d) MB NBM IGBT from SaberRD <sup>®</sup> , (e)                                                                                                                                             |     |
|      | comparison between RCD and RC snubber, and (f) MB IGBT junction tem-                                                                                                                                                             |     |
|      | perature. Oscilloscope horizontal axes settings: (a) $10\mu s/div$ , (b) $5\mu s/div$ .                                                                                                                                          |     |
|      |                                                                                                                                                                                                                                  | 105 |
| 5.10 | Junction temperature variation during operation: (a) MB IGBT under DC                                                                                                                                                            |     |
|      | current 1kA, (b) MB IGBT under DC current 4kA, (c) LCS IGBT under DC                                                                                                                                                             |     |
|      | current 1kA, and (d) LCS IGBT under DC current 4kA.                                                                                                                                                                              | 106 |
| 5.11 | Varistor voltage and current during protection: (a) HIL emulation of Type-3 HHB model, (b) HIL emulation of Type-1 HHB model, and (c) PSCAD/EMTDC<sup>®</sup> simulation results. Oscilloscope horizontal axes settings: (a) 5ms/div, (b) 1ms/div. | 107 |
| 5.12 | Varistor voltage and line current during protection from HIL emulation (top) and PSCAD/EMTDC<sup>®</sup> simulation (bottom): (a) scaled-down model, (b) full-scale model. Oscilloscope horizontal axes settings: (a) 20ms/div, (b) 2ms/div. | 109 |

| 5.13 | System-level performance of the MTDC system during long-term line fault with proposed and scaled-down HHB models from HIL emulation (top) and PSCAD/EMTDC<sup>®</sup> simulation (middle and bottom). (a)(b)(c) Converter side DC voltages, currents and active powers with proposed HHB models, (d)(e)(f) Converter side DC voltages, currents and active powers with the scaled-down HHB model. Oscilloscope horizontal axes settings: (a)(b)(c) 50ms/div. | 110 |
|------|------------------------------------------------------------------------------------------|-----|
| 5.14 | Ultrafast mechatronic circuit breaker decoupled from the transmission path.              | 111 |
| 5.15 | Nonlinear behavioral thyristor model: (a) Single device, and (b) cascaded                |     |
|      | thyristor equivalent circuit.                                                            | 112 |
| 5.16 | SCR logic validation (top: $v_s$ and $V_g$; middle: thyristor voltage; bottom: thyristor current): (a) Proposed model, and (b) ANSYS/Simplorer<sup>®</sup>. | 113 |
| 5.17 | Basic thyristor test circuit.                                                            | 114 |
| 5.18 | SCR reverse recovery: (a) Proposed model, and (b) ANSYS/Simplorer<sup>®</sup>. | 115 |
| 5.19 | Hardware architecture of the UFMCB.                                                      | 117 |
| 5.20 | UFMCB top-level finite state machine                                                     | 117 |
| 5.21 | UFMCB terminal waveforms.                                                                | 118 |
| 5.22 | UFMCB SCR1 and SCR12 currents                                                            | 119 |
| 5.23 | Power loss calculation by various EMT tools for: (a) SCR1, and (b) SCR2.          | 120 |
| 6.1  | The CIGRÉ B4 DC Grid integrated with offshore wind farms                                 | 124 |
| 6.2  | General form of a frequency-dependent transmission line model: (a) Norton                |     |
|      | equivalent circuit, and (b) the Thévenin equivalent circuit.                             | 127 |
| 6.3  | MMC-based converter station controller                                                   | 127 |
| 6.4  | Offshore wind farm integration into MTDC grid: (a) DFIG array connected                  |     |
|      | with MMC, and (b) rectifier side EMT model with aggregated wind farms                    | 128 |
| 6.5  | IGBT/diode nonlinear electro-thermal behavioral model                                    | 129 |
| 6.6  | Partitioned SM nonlinear behavioral model: (a) Half-bridge submodule,                    |     |
|      | and (b) full-bridge submodule.                                                           | 130 |
| 6.7  | Circuit partitioning of HVDC stations in the DC subsystem DCS2 by trans-                 |     |
|      | mission line models.                                                                     | 131 |
| 6.8  | DC yard kernel structure and variable sort algorithm.                                    | 132 |
| 6.9  | (a) EMT model of a 3-phase MMC main circuit, (b) a general controller scheme for various control targets. | 133 |
| 6.10 | MMC inner loop control for single-phase: (a) Phase-shift control in CPU, (b)             |     |
|      | massive thread parallel structure of PSC and SM on GPU                                   | 134 |
| 6.11 | Hierarchical dynamic parallelism implementation of a 3-phase HVDC con-                   |     |
|      | verter                                                                                   | 135 |
| 6.12 | HHB unit kernel design and its EMT calculation manner in conjunction with                |     |
|      | DC yard and LPR.                                                                         | 137 |

| 6.13 | OpenMP<sup>®</sup> pseudo code for multi-core MTDC system CPU simulation                       | 139              |
|------|-------------------------------------------------------------------------------------------------|------------------|
| 6.14 | Single-phase MMC results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a)-(c) 5-level, 17-level, and 33-level MMC output voltages, (d)-(f) SM DC capacitor voltages of 5-level, 17-level, and 33-level MMCs. | 141 |
| 6.15 | IGBT device-level performance ((c)-(f) results from proposed model (left)                       |                  |
|      | and SaberRD <sup>®</sup> (right)). (a) variation of turn-on and -off times, (b) averaged        |                  |
|      | power loss under different switching frequencies, (c) SM upper IGBT junc-                       |                  |
|      | tion temperature, (d) upper switch current waveform, (e) SM lower IGBT                          |                  |
|      | junction temperature, and (f) lower switch current waveform.                                    | 142              |
| 6.16 | Subsystem DCS1 results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a) System simultaneous start, (b)-(c) rectifier station power step tests, (d) inverter voltage step test, and (e)-(f) DC line-to-line fault lasting 5ms. | 144 |
| 6.17 | 4-terminal MTDC results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a) DC voltages of all stations, (b) DC line currents, (c) current waveform amplification of Lm1 at Cm-E1 side, (d) detailed actions of Cm-E1 HHB, (e) power export of each station, and (f) power transferred on DC lines. | 145 |
| 6.18 | CIGRÉ DC grid power reversal simulation by GPU (left) and PSCAD/EMTDC<sup>®</sup> (right). | 148 |
| 6.19 | GPU performance in simulation of different DC systems with IGBT TSSM                            |                  |
|      | and DCFM. (a) HVDC with HHB, (b) DCS-2, and (c) CIGRÉ B4 DC system.                             | 149              |
| 6.20 | Switching transients of behavioral IGBT/diode pair: (a) Turn-on, (b) turn-                      |                  |
|      | off, (c) diode reverse recovery, and (d) IGBT turn-on current under different                   |                  |
|      | gate conditions.                                                                                | 151              |
| 6.21 | Switching pattern and IGBT junction temperature: (a) Upper switch current,                      |                  |
|      | (b) lower switch current, (c) IGBT junction temperatures, and (d) switching                     |                  |
|      | pattern difference between device-level model and two-state switch model.                       | 152              |
| 6.22 | OWF integration into DCS1: (a) DFIG $P$ - $v$ characteristics, (b) wind speed                   |                  |
|      | and rotor mechanical velocity, (c) rectifier AC currents, (d) rectifier AC volt-                |                  |
|      | age, (e) converter station DC yard power, and (f) terminal DC voltages                          | 153              |
| 6.23 | Inverter side HBSM- and FBSM-MMC response to DC fault (L: GPU simulation, R: PSCAD/EMTDC<sup>®</sup>): (a) DC currents, and (b) DC voltages. | 155 |
| 6.24 | Power flow in the CIGRÉ B4 DC Grid.                                                             | 156              |
| 7.1  | Hybrid FTS-VTS scheme for MTDC grid simulation: (a) System structure, and (b) time instant synchronization. | 161 |
| 7.2  | Nonlinear behavioral MMC kernel with VTS scheme.                                                | 163              |
| 7.3  | MMC-based MTDC grid with wind farm integration.                                                 | 164              |

| 7.4 | VTS schemes for nonlinear MMC simulation (left: output voltage; right: zoomed-in waveform): (a) SaberRD<sup>®</sup> results, (b) event-correlated criterion, (c) LTE, and (d) N-R iteration count. | 165 |
|-----|---------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|
| 7.5 | IGBT nonlinear behavioral model VTS control (left: proposed model; right: SaberRD<sup>®</sup>): (a) IGBT turn-on, (b) diode reverse recovery, and (c) junction temperatures. | 166 |
| 7.6 | HVDC-link1 pole-pole fault: (a) HBSM-MMC response, (b) FBSM-MMC re-                                                                                           |                         |
|     | sponse                                                                                                                                                        | 168                     |
| 7.7 | MTDC system dynamics with wind farms (left: proposed model, right: PSCAD/EMTDC<sup>®</sup>). | 169 |
| C.1 | Greater CIGRÉ DC Grid consisting of multiple CIGRÉ DC B4 systems                                                                                              | 194                     |

# List of Acronyms

- **AC** Alternating Current
- **AVM** Average Value Model
- **CFM** Curve-Fitting Model
- **CPU** Central Processing Unit
- **CUDA** Compute Unified Device Architecture
- **DC** Direct Current
- **DCFM** Dynamic Curve-Fitting Model
- **DEM** Detailed Equivalent Model
- **DFIG** Doubly-Fed Induction Generator
- **EMT** Electromagnetic Transient
- **FDLM** Frequency-Dependent Line Model
- **FPGA** Field Programmable Gate Array
- **FSM** Finite State Machine
- **GPU** Graphics Processing Unit
- **HHB** Hybrid HVDC Breaker
- **HIL** Hardware-in-the-Loop
- **HLS** High-Level Synthesis
- **HVDC** High-Voltage Direct Current
- **IGBT** Insulated-Gate Bipolar Transistor
- **LCS** Load Commutation Switch
- **LTE** Local Truncation Error
- **LUT** Look-Up Table
- **MMC** Modular Multilevel Converter
- **MOV** Metal-Oxide Varistor
- **MTDC** Multi-terminal DC
- **NBM** Nonlinear Behavioral Model
- **OWF** Offshore Wind Farm
- **PSC** Phase-Shift Control
- **RCB** Residual Current Breaker
- **SIMD** Single-Instruction-Multiple-Data
- **SM** Submodule
- **SST** Solid-State Transformer
- **TLM** Transmission Line Modeling
- **TSSM** Two-State Switch Model
- **UFD** Ultra-Fast Disconnector
- **VHDL** Very-High-Speed Integrated Circuit Hardware Description Language
- **VTS** Variable Time-Stepping

# Introduction

The modular multilevel converter (MMC) has received tremendous attention in recent years, with wide industrial use in medium- and high-power applications such as static synchronous compensators [1–3], renewable energy grid integration [4–6], medium-power drives [7, 8], and high-voltage direct current (HVDC) transmission projects [9, 10]. In HVDC transmission, this voltage source converter has been gaining momentum and is expected to overtake the traditional thyristor-based line-commutated converter as the main vehicle for electrical energy conversion, owing to advantages such as resilience to commutation failure and the capability to regulate reactive power. It also has merits over traditional two-level and other multilevel converters: high-quality quasi-sinusoidal output waveforms, which obviate the need for bulky filtering equipment, and scalability, which allows the number of submodules to be changed flexibly to suit different voltage stresses or to produce the demanded voltage levels.

Therefore, the MMC is an ideal option for constructing the HVDC converter station. Its application as a front-to-front connected solid-state transformer (SST) has further prompted the development of the multi-terminal DC (MTDC) grid by enabling the connection of several regional DC grids with distinct DC voltage ratings [11, 12]. Composed of an inverter, a rectifier, and a transformer that is designed to be physically isolated from the AC grid, the SST brings benefits such as fault isolation and power flow control [13–15]; in addition, because the transformer can operate at a medium frequency, it has a compact volume [16]. However, this compactness is accompanied by a corresponding rise in switching frequency and, consequently, higher power loss.

The recently emerged hybrid HVDC breaker (HHB) is another critical component in forming the MTDC grid [17, 18]. It is capable of isolating DC line faults within several milliseconds, reducing their hazardous impact on the power system to a minimum [19, 20]. Meanwhile, merits such as quick response and low conduction power loss make it a favorable alternative to the mechanical circuit breaker, which is too slow, and the solid-state circuit breaker, which is extremely energy consuming [21, 22].

The electromagnetic transient (EMT) simulation enables the study of the overall power system, including both primary and secondary devices, under various scenarios such as line faults, lightning, and transients caused by artificial operation. Thus, the simulation of an MTDC grid has to include models of all the aforementioned power electronic apparatuses for the validation of control and protection strategies. In particular, the power semiconductor switch model is a key factor that determines the credibility of EMT-type solvers. Based on their complexity, the CIGRÉ working group B4.57 identifies seven types of computational models [23]:

1. Type 1: Full physics-based models
2. Type 2: Full detailed models
3. Type 3: Simplified switchable resistances
4. Type 4: Detailed equivalent circuit models
5. Type 5: Average value models based on switching functions
6. Type 6: Simplified average value models
7. Type 7: RMS load-flow models

The document drafted by the group also classifies Types 2 to 6 as suitable for EMT simulators, whereas Type 1, at the highest level, is excluded due to its complexity, and Type 7 cannot represent system dynamics. Fig. 1.1 shows the extent of information that each type is able to reveal, where a higher-level model covers the information of a lower-level model.

As the low-level types have insufficient accuracy to meet the requirement of high fidelity, Type 3 is the prevalent model that current EMT-type solvers provide in their libraries for constructing a power converter. For MMC modeling, Type 4, which is the Thévenin equivalent of Type 3, is also provided for efficient circuit solution. However, as the semiconductor switch model is a mere two-state resistor, both types lack proper reverse blocking capability, resulting in erroneous results under some operation modes. Type 2 and Type 1 avoid this issue and, in addition, disclose device-level information, but the computational burden increases dramatically once they are included in the simulation without further optimization of the program.

As the operating status of an individual switch in a high-power converter is of increasing concern for design purposes, Type 2 is the simplest model that can be adopted; in fact, Type 1, which has much higher fidelity and versatility, is preferred for testing various scenarios. For small-scale converters, device-level power switch models are frequently included since the total number of nodes is acceptable for efficient computation; nevertheless, in MMC-based HVDC applications, hundreds or even thousands of such devices are connected, making the circuit solution extremely slow if performed sequentially, and due to the extensive distribution of nonlinearity, the simulation often terminates with numerical divergence.

| Model type | Information revealed                                               |
|------------|--------------------------------------------------------------------|
| Type 7     | Steady state                                                       |
| Type 6     | System dynamics                                                    |
| Type 5     | Harmonics                                                          |
| Type 3, 4  | Internal imbalance<br>Switching time                               |
| Type 2     | Static characteristics<br>Energy estimation                        |
| Type 1     | Switching transients<br>Parasitic effects<br>Electro-thermal model |

Figure 1.1: Power semiconductor switch model categorization.

Therefore, better parallel processing schemes and processors are gaining tremendous attention. Among them, the real-time simulation system based on the field programmable gate array (FPGA) is the main platform for hardware-in-the-loop (HIL) testing of the functions of actual control and protection devices. In recent work, detailed device-level models have been considered because of the FPGA's high computational efficiency, enabled by its parallelism and pipelined structure.

One noticeable aspect of hardware implementation is resource utilization. When deployed to the FPGA, the MTDC grid consumes various types of resources, e.g., look-up tables (LUTs), flip-flops (FFs), and digital signal processors (DSPs). The larger the MTDC grid, the higher the requirement on available resources. The Type 1 and Type 2 models consume much more hardware than their counterparts, and the scale of the MTDC grid itself is already a severe challenge. Meanwhile, HIL platforms are not always available due to their high cost.

Therefore, off-line simulation packages such as PSCAD/EMTDC<sup>®</sup>, EMTP-RV<sup>®</sup>, and MATLAB/Simulink<sup>®</sup>, which run on the central processing unit (CPU), are popular for validating the framework of control and protection strategies as well as for system preview. Compared with the HIL system, the off-line simulation can be conducted on a personal computer and is not bound by the processor resource. The main drawback is that the simulation slows down as the number of components increases. The graphics processing unit (GPU), with its massive parallelism, is particularly suitable for computing a large-scale HVDC grid.

## 1.1 Literature Review

In this section, previous studies related to this research are reviewed.

#### 1.1.1 Modular Multi-level Converter Modeling

The modularity of the MMC means that there could be dozens or even hundreds of submodules (SMs) and, depending on the SM configuration, a few times more power semiconductor switches. From the circuit solution point of view, the nodes or meshes in the MMC outnumber those of other converter topologies, which means that a CPU takes longer to calculate the results as the number of output voltage levels rises [24–27]. Although this negative impact can be greatly reduced using the parallelism of field-programmable gate arrays, the large quantity of switches still poses a challenge to attaining real-time execution for HIL emulation, not to mention fast simulation for other EMT-type solvers.

A variety of models have so far been proposed for the MMC, which largely fall into the following categories: the device-level model, the detailed equivalent model (DEM), and the average value model (AVM). The switches in a device-level model can be either detailed physics-based models of the IGBT and diode, which are accurate but rather complicated, or equivalent models that combine an ideal switch with classical nonlinear diodes, enabling engineers to estimate conduction loss [28]. While such detailed models are useful, and indeed needed, for offering greater accuracy [29], for constructing and validating new MMC models [30], and for designing converters with high efficiency [31, 32], they are computationally burdensome and may not be suitable for real-time HIL emulation or system-level simulation. The DEM ignores specific features of the IGBT/diode pair, which is replaced by a bidirectional two-state switch with fixed on-state and off-state resistances commanded by the driving pulses, allowing each submodule to be represented by its Thévenin equivalent circuit and all submodules in the same arm to be merged subsequently [33–37]. In this way, the number of nodes is significantly reduced and the computation is fast. The AVM is derived under the assumption that all internal variables, including the submodule DC capacitor voltages, are well controlled and balanced. Switching effects are not explicitly shown, and the submodules are modeled as equivalent voltage and current sources, which are also merged to replace all submodules [38–40]. For the DEM and AVM, which are currently the predominant models for the MMC, the accuracy is sufficient for system-level power system studies; however, both models lose some specificities: for example, switching transients cannot be observed and converter power losses cannot be assessed, and with the AVM even the capacitor charging process is not observable. Although AVM variants have been proposed to show the DC capacitor voltage ripples [41] or to enable converter power loss calculation [42], the focus is still on system-level performance and the individuality of the switches cannot be shown; meanwhile, the power loss based on the DEM is merely a rough estimate, as it is calculated under the assumption that the IGBT and diode have equal and constant on-state resistances.
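The DEM idea described above can be made concrete with a minimal sketch. It assumes a half-bridge submodule with complementary gating, a trapezoidal-rule companion model for the capacitor, and illustrative values for the switch resistances, capacitance, and time-step; the names `HalfBridgeSM`, `arm_thevenin`, `R_ON`, and `R_OFF` are hypothetical and are not taken from the cited works or from any simulator library.

```python
from dataclasses import dataclass

# Illustrative two-state switch resistances (assumed values, not from the thesis).
R_ON = 1e-3   # on-state resistance in ohms
R_OFF = 1e6   # off-state resistance in ohms


@dataclass
class HalfBridgeSM:
    """Half-bridge submodule with a trapezoidal-rule capacitor companion model."""
    c: float           # capacitance in farads
    v_c: float         # capacitor voltage at the previous time-step
    i_c: float = 0.0   # capacitor current at the previous time-step

    def _branches(self, upper_on: bool, dt: float):
        r_c = dt / (2.0 * self.c)            # capacitor companion resistance
        v_hist = self.v_c + r_c * self.i_c   # capacitor history voltage
        r1 = R_ON if upper_on else R_OFF     # upper switch, in series with the capacitor
        r2 = R_OFF if upper_on else R_ON     # lower switch, across the SM terminals
        return r_c, v_hist, r1, r2

    def thevenin(self, upper_on: bool, dt: float):
        """Return (r_eq, v_eq) such that v_sm = r_eq * i_arm + v_eq."""
        r_c, v_hist, r1, r2 = self._branches(upper_on, dt)
        denom = r1 + r2 + r_c
        return r2 * (r1 + r_c) / denom, v_hist * r2 / denom

    def update(self, upper_on: bool, dt: float, i_arm: float) -> float:
        """Advance the capacitor state for a known arm current; return v_sm."""
        r_c, v_hist, r1, _ = self._branches(upper_on, dt)
        r_eq, v_eq = self.thevenin(upper_on, dt)
        v_sm = r_eq * i_arm + v_eq
        i_c = (v_sm - v_hist) / (r1 + r_c)   # current through the capacitor branch
        self.v_c, self.i_c = v_hist + r_c * i_c, i_c
        return v_sm


def arm_thevenin(sms, gates, dt):
    """Merge series-connected submodules into one arm-level Thevenin equivalent."""
    pairs = [sm.thevenin(g, dt) for sm, g in zip(sms, gates)]
    return sum(r for r, _ in pairs), sum(v for _, v in pairs)


if __name__ == "__main__":
    # Four submodules at 2 kV each, two inserted and two bypassed (assumed case).
    arm = [HalfBridgeSM(c=10e-3, v_c=2000.0) for _ in range(4)]
    r_arm, v_arm = arm_thevenin(arm, [True, True, False, False], dt=20e-6)
    print(f"arm equivalent: {r_arm:.6f} ohm in series with {v_arm:.3f} V")
```

Because the whole arm collapses into one resistance and one voltage source, the network solver sees two nodes per arm regardless of the submodule count; the individual capacitor voltages are recovered afterwards by calling `update` on each submodule with the solved arm current.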

#### 1.1.2 Hybrid HVDC Breaker Modeling

It is meaningful to include an integrated HHB model as a fundamental component in the libraries of various EMT tools. Hitherto, precise HHB models suitable for fast simulation or capable of real-time HIL execution are yet to be developed. To withstand the high voltage and large current exerted on the DC circuit breaker during the protection process, hundreds of IGBTs in the main path of the HHB are connected in series and parallel [43]. Nevertheless, for the purpose of fast simulation, much of the previous modeling work has focused on a scaled-down model, i.e., the number of IGBTs in the main path is only one or two for unidirectional and bidirectional HHBs, respectively [44–48], which is far fewer than in a real DC breaker. Such a simplification is reasonable and can provide good results for system-level grid studies, as the main interest is to validate the protection concepts. In the meantime, the IGBTs and their auxiliary circuits, such as the snubber, are also omitted. One benefit of this model is that the number of meshes or nodes can be kept to a minimum, avoiding large matrices in the simulation process that would take an extraordinarily long time for the CPU, GPU, or FPGA to compute. However, these models lose specificities and therefore fall short of providing guidance on HHB design, typically of the snubber circuit, which has a significant impact on HVDC grid performance, and on the state of components or devices within the breaker for operation monitoring.

On the contrary, a full-scale HHB model contains the exact number of IGBTs and other devices so that more system-level details can be shown [49, 50]. However, the hundreds or more circuit nodes of this model significantly slow down off-line simulation, as the corresponding large matrix equation takes a long time to solve [51]. Similarly, it is impractical for real-time simulation platforms to test control and protection strategies of an MTDC system with such a model, due to the extremely slow speed and high hardware resource utilization. Moreover, despite all components being included in the model, the fidelity is still not high enough, because the two-state switch model employed is insufficient for evaluating IGBT device-level behavior, such as switching transients, power loss, and junction temperature, which in turn affects the performance of the HHB.

#### 1.1.3 IGBT and Diode Models

Detailed physics-based analytical device-level models for IGBTs are available in the literature [52, 53] and are among the most prevalent models. Highly exact numerical models based on finite element methods [54] and hybrid models [55] combining the analytical and numerical concepts also exist. Nevertheless, these IGBT models are seldom used for time-domain simulation of power converters, even though high accuracy is preferred and demanded [56, 57], since they involve many nonlinear physical phenomena, and employing them leads to very long computational time even with a few devices, at a moderate switching frequency, and for a fraction of the simulation interval [58]. In addition, parameter extraction is not readily feasible: even for the lumped-charge model [59], which is simpler than the Hefner model, an experimental setup is still required. The situation is similar in power diode modeling, where the aforementioned methods are also adopted [60–62].

Behavioral models reveal the necessary static and dynamic device characteristics in circuit simulation while omitting excessive device physics. Therefore, they have computational advantages over the aforementioned models and offer better accuracy and more detail than system-level models such as the ideal model and the average value model. There are a number of variants, e.g., the macro-model [63] and the Hammerstein configuration [64], all of which have an order greater than 5. A considerably simpler first-order model was proposed in [65]; however, custom experiments and curve-fitting were used for parameter determination. An improved behavioral model was presented in [66] to accurately capture the device behavior; its datasheet-driven nature, which removes the need to acquire the long list of inaccessible device-correlated parameters required by analytical or numerical models, makes it more applicable.

As the hardware emulation of power converters mainly aims at validating the functions of the converter and its control strategies, nonlinear IGBT and diode models have rarely been included due to their complexity, even though they have long been in existence. Instead, simpler switch models prevail. The two-node model having a resistance in parallel with a current source [67] showed its effectiveness in two-level voltage-sourced converters, and the ideal switch model has dominated various circuit simulation applications. Nevertheless, those IGBT and diode models merely reflect the on- and off-state characteristics and are incapable of providing further details for converter evaluation. Typical switching transients were recorded in a curve-fitting-based linear switch model [68] and the LUT method [69]. The accuracy of the former was compromised by the omission of nonlinearities, and both lack versatility, as the waveform shapes stored in FPGA ROM cannot change along with the electromagnetic environment, typically the gate driver circuit conditions, underlining the importance of adaptive models.

#### 1.1.4 Variable Time-Stepping Methods

A fixed time-step is not always mandatory for off-line EMT simulation, as the required density of results differs between stages. Using a sufficiently small time-step when events of concern trigger rapid current or voltage variations, and enlarging it when their impact subsides, avoids unnecessary computation and consequently accelerates the simulation. Based on various criteria such as local truncation error, performance of Newton corrector iteration, and switching events and faults [70–73], this variable time-stepping (VTS) scheme has been successfully applied on a few occasions where the system is relatively simple, but its application to a larger and more complex nonlinear system such as the MTDC grid with device-level specifics has yet to be explored. It has also been explored for real-time hardware implementation of simple cases with the circuit scale strictly limited to a few components [74].
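As a generic illustration of a truncation-error-driven criterion, the sketch below integrates a first-order RC step response with backward Euler and estimates the local error by step doubling (one full step versus two half steps). The circuit, the tolerance, the halving and doubling rule, and the function names `be_step` and `simulate_vts` are assumptions chosen for brevity; this is not the scheme of the cited works.

```python
import math


def be_step(v, u, tau, dt):
    """One backward-Euler step of dv/dt = (u - v) / tau."""
    return (v + dt / tau * u) / (1.0 + dt / tau)


def simulate_vts(tau=1e-3, u=1.0, t_end=10e-3, dt0=1e-6,
                 dt_min=1e-7, dt_max=1e-3, tol=1e-4):
    """Integrate the step response with an error-controlled time-step."""
    t, v, dt = 0.0, 0.0, dt0
    times, values, steps = [t], [v], []
    while t < t_end - 1e-12:
        dt = min(dt, t_end - t)                       # do not overshoot t_end
        v_full = be_step(v, u, tau, dt)               # one step of size dt
        v_half = be_step(be_step(v, u, tau, dt / 2), u, tau, dt / 2)
        err = abs(v_half - v_full)                    # local error estimate
        if err > tol and dt > dt_min:
            dt = max(dt / 2, dt_min)                  # reject and retry with a smaller step
            continue
        t, v = t + dt, v_half                         # accept the more accurate result
        times.append(t)
        values.append(v)
        steps.append(dt)
        if err < tol / 4:
            dt = min(2 * dt, dt_max)                  # transient has subsided: enlarge
    return times, values, steps


if __name__ == "__main__":
    times, values, steps = simulate_vts()
    exact = 1.0 - math.exp(-times[-1] / 1e-3)
    print(f"{len(steps)} steps, smallest {min(steps):.1e} s, largest {max(steps):.1e} s")
    print(f"final value {values[-1]:.6f} versus analytical {exact:.6f}")
```

The step stays small while the voltage changes quickly after the step input and grows as the response settles, so far fewer steps are needed than with a fixed step equal to the initial one.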

## 1.1.5 Circuit Partitioning

In power electronic converter simulation, circuit partitioning was proposed for splitting a relatively large system into several smaller sub-circuits, thereby improving computational efficiency. There are two main partitioning schemes: the transmission line modeling (TLM) link and the voltage-current source coupling. The TLM-link relies on the existence of energy storage elements, which can be modeled as a section of lossless transmission line [75–77]. As a result, a unit time-step delay is introduced in the exchange of information between the sub-circuits on its two sides. However, so long as the capacitor or the inductor is sufficiently large to provide a stiff voltage or current, the error between two neighboring time-steps becomes trivial [78]. If no energy storage element is available, the second decoupling method can be applied, provided that the circuit section behaves in the same way as an energy storage element [79–81]; it can also be applied when such elements are present. Hitherto, those two methods have been applied to small-scale circuits, and therefore, their application to a large MMC topology needs to be investigated.
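The one-step delay introduced by the TLM-link can be illustrated with a minimal C++ sketch. It is not the implementation used in this work: it assumes that an inductance `l` between two purely resistive sub-circuits is replaced by a lossless line with impedance `l/dt` and a transit time of one time-step, and the test circuit (source `vs` behind `r1`, load `r2`) as well as the names `TlmPort` and `simulate_tlm_link` are hypothetical.

```cpp
#include <cassert>

// One end of a TLM link: seen from outside, it is a Thevenin source
// 2*v_inc behind the link impedance z (assumed model of an inductor).
struct TlmPort {
    double v_inc = 0.0;  // pulse arriving at this end in the current step
    double v_ref = 0.0;  // pulse sent back into the link after the local solve
};

// Hypothetical test case: source vs behind r1 on side 1, load r2 on side 2.
// Returns the side-1 current after `steps` time-steps.
double simulate_tlm_link(double vs, double r1, double r2,
                         double l, double dt, int steps) {
    assert(l > 0.0 && dt > 0.0);
    const double z = l / dt;  // link impedance; transit time equals dt
    TlmPort p1, p2;
    double i1 = 0.0;
    for (int n = 0; n < steps; ++n) {
        // Side 1 is solved on its own: vs - r1*i1 = 2*v_inc + z*i1.
        i1 = (vs - 2.0 * p1.v_inc) / (r1 + z);
        const double v1 = 2.0 * p1.v_inc + z * i1;
        p1.v_ref = v1 - p1.v_inc;
        // Side 2 is solved on its own: -r2*i2 = 2*v_inc + z*i2.
        const double i2 = -2.0 * p2.v_inc / (r2 + z);
        const double v2 = 2.0 * p2.v_inc + z * i2;
        p2.v_ref = v2 - p2.v_inc;
        // Only data exchange: pulses cross the link with a one-step delay.
        p1.v_inc = p2.v_ref;
        p2.v_inc = p1.v_ref;
    }
    return i1;
}
```

Each side needs only the pulse sent by the other side one step earlier, so the two local solutions inside the loop could run in parallel. In this assumed test case the current should settle toward `vs / (r1 + r2)`, as for the original undivided circuit.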

## 1.1.6 Real-time Hardware-In-The-Loop Emulation

Driven by increasingly higher demands such as converter design evaluation, control and protection algorithm test, and system performance preview, real-time HIL systems have witnessed explosive growth in these applications because they can interact with real power system secondary devices. Commercial products such as the RTDS<sup>®</sup> and OPAL-RT<sup>®</sup> have been popular in industry for factory acceptance tests as well as user training [82–85], not only because they provide a virtual primary electrical system that is safe to operate, but also because they cost much less than a real system, e.g., the HVDC transmission system. As the goal of real-time systems is to simulate a real power electronic system as closely as possible, other HIL test systems based on FPGA are also springing up for various applications ranging from smaller power-rating converters [86–88], electrical machine drives [89–92], and transformers [93, 94], to hundreds of megawatt-level HVDC transmission [95–97] and large-scale power systems [98, 99], and eventually, the MTDC grid is expected to be deployed onto FPGA boards.

#### 1.1.7 Off-line EMT Simulation

Currently, the CPU is the dominant platform for a variety of computer-aided design (CAD) tools, such as SaberRD<sup>®</sup>, PSpice<sup>®</sup>, and PSCAD/EMTDC<sup>®</sup>, where off-line time-domain EMT simulations of power electronic circuits and power systems can be conducted [100, 101]. The single-core CPU has superior performance in simulating small-scale circuits, while for a gigantic system or a complex component such as the MMC, which is composed of numerous elements, it has been shown that conventional EMT simulation tools based on CPU are extremely time-consuming [102, 103], forcing the investigation of new methods such as model-order reduction with the subsequent loss of device or equipment model details. When many repetitive components exist, multi-threading programming methods supported by the multi-core CPU (MCPU) are able to improve the simulation speed. The drawback is that the parallelism is not sufficiently high since it heavily depends on the number of CPU cores in a personal computer or a workstation, and the speedup over a single-core CPU is unsatisfactory when the system contains many irregularities that cannot be computed concurrently.

The above facts thus demonstrate the necessity of EMT computation of large-scale HVDC grids in the time-domain on the GPU. Hitherto, only a few such cases are available on conventional AC system EMT simulation [104–107], or large-scale power grid analysis [108, 109]. While large-scale AC power systems exhibit highly complex behavior such as frequency-dependency and nonlinearities in equipment such as transmission lines and cables, rotating machines, and transformers, their computational burden pales in comparison to large-scale DC grids with multiple power converters with device-level models, wherein the circuit complexity can quickly become untenable due to the large number of discrete nonlinear switching devices and the high number of voltage levels of the MMCs used in practical DC grids. Thus, the GPU becomes attractive for power converter simulation [110, 111], albeit the circuit scale is currently limited to the converter level and is consequently far from MTDC simulation for both device-level and system-level performance preview.

## **1.2** Motivation of this work

Current EMT-type simulation tools can be largely categorized as the following two types:

- System-level solver
- Device-level simulator

The first type, including PSCAD/EMTDC<sup>®</sup>, EMTP-RV<sup>®</sup>, and Matlab/Simulink<sup>®</sup>, pursues simulation efficiency and therefore does not have detailed device-level power semiconductor models; instead, the switches are typically taken as two-state resistors that at most distinguish the direction of current flow [112]. For the modular multilevel converter, as mentioned in the review section, the arm is even simplified into its Thévenin equivalent circuit to reduce the number of nodes. HIL systems that run in real time also belong to this category.

The second type, represented mainly by off-line tools such as SaberRD<sup>®</sup>, PSpice<sup>®</sup>, and Ansys/Simplorer<sup>®</sup>, has much slower computational speed once detailed models are involved in the simulation. As the power loss of a megawatt-level converter and the actual voltage and current stresses of a power semiconductor switch under various operating conditions are of concern in converter design evaluation, showing device-level transients by simulation becomes increasingly significant [113]. Although the aforementioned simulators are available, they are mostly unable to compute a high-power converter with a large number of IGBTs and diodes, which introduce numerous nodes and nonlinearities. As a consequence, the simulation is quite time-consuming even with a few such devices and often ends with numerical divergence once the converter scales up.

Therefore, integration of device-level models into system-level simulation is the major motivation of this work. A real-time system would be a better hardware-in-the-loop platform for accurate simulation of various power systems and power electronic equipment in factory tests as well as for academic study since it provides higher fidelity to the actual system. The FPGA, as an attractive alternative, is being intensively used in this domain for its excellent hardwired parallelism. However, so far for MMC applications, the power switch models are always simplified with only basic on- and off-state features retained for achieving real-time execution. Over the last decade, the FPGA has witnessed a significant leap in manufacturing technology, which enables new boards to have more hardware resources to accommodate more complex models and a larger system. In the meantime, the programming process has become easier. The software Vivado<sup>®</sup> introduced a high-level synthesis (HLS) tool which facilitates hardware module design by allowing the corresponding function to be written in C/C++ rather than described at the logic-gate level using VHDL. Thus, it can be used in this work to shorten the HIL implementation cycle.

Realization of parallel EMT simulation of the MTDC grid on the GPU is another major goal of this research that has not been achieved before. The hardware resource requirement is quite a realistic challenge for HIL implementation of the MTDC grid even though the manufacturing is maturing, and the high cost of real-time systems hinders their wide application. For validating the framework of control and protection algorithms, off-line simulation is sufficient, and due to its lower cost, it has wider application. However, considering the scale of the HVDC grid under study and the extent of information that needs to be revealed, traditional EMT simulation based on the CPU is not the prime option.

The GPU has begun to attract attention in recent years. It enables the simulation of AC power systems to gain a remarkable speedup, yet its application to the HVDC grid, which is more complex, has yet to be explored. It has better performance in computing large-scale power systems due to its massive parallelism. The single-instruction-multiple-thread (SIMT) mode has higher efficiency than the multi-core CPU and consequently, the GPU is expected to be a new off-line simulation platform. The modeling approaches and the parallel computational techniques investigated could serve as a reference for future commercial EMT-type tools.

## **1.3 Thesis Objectives**

The primary objective of this research work is to develop a multi-terminal HVDC grid model with device-level electro-thermal IGBT and diode models which can be implemented in massive parallelism on FPGA for HIL tests and on the GPU to accelerate off-line simulation. Since device-level models are rarely included in the system-level study, completion of the following tasks would achieve the objective:

• Device-level modeling of IGBT and diode

The term device-level model indicates that at least the requirements of Type 2 should be satisfied, and in case high fidelity is demanded, e.g., for IGBT junction temperature estimation, Type 1, which includes switching transients, is the only option. Nevertheless, as pointed out in the review section, Type 1 covers various models, among which the static curve-fitting model is the basic one and easy to implement. Thus, in the first step, typical voltage and current waveform shapes during turn-on and turn-off processes are recorded and duplicated in the modular multilevel converter with their amplitudes appropriately determined. On the other hand, the static *I-V* characteristics are obtained from the manufacturer's datasheet to ensure a complete IGBT/diode model.

To further increase the fidelity, the dynamic curve-fitting model is proposed, which has the capability to adjust its switching transient waveforms according to the gate resistance, collector current, junction temperature, etc. Thus, this model is able to reflect the normal operation of a power semiconductor switch, since the power loss, as well as the actual voltage or current stress the device withstands, is more accurate.

Nevertheless, the curve-fitting models are still unable to reveal the device's performance under special scenarios, e.g., when the dead-time of two complementary switches is not sufficient. A large current surge is expected during the overlapped conduction period. For this purpose, the datasheet-driven nonlinear behavioral models are proposed.

Meanwhile, an electro-thermal network is built for all three models so that the junction temperature can be calculated and the impact of thermal dynamics on electrical circuits can be studied.

• HVDC grid component modeling

Modeling of the entire HVDC grid is mandatory for this work. Other than the MMC and hybrid HVDC breaker, all remaining components, including the inductor, capacitor, transformer, induction machine, wind turbine, and DC transmission line, are described at the system level.

The inductor and capacitor, as fundamental elements, can be discretized by a few methods, such as the traditional implicit Trapezoidal rule and the Backward Euler method. In addition, the TLM-stub model is also applicable. All those methods result in a companion model represented by either the Thévenin equivalent circuit or its Norton counterpart, which can be converted into each other easily using fundamental circuit theory to accommodate different circuit configurations.
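As a hedged illustration of such a companion model, the sketch below discretizes a capacitor with the Trapezoidal rule into a Norton equivalent, i.e., a conductance `2C/dt` in parallel with a history current source. The names `CapacitorCompanion` and `simulate_rc_charge`, and the series RC charging circuit used to exercise the model, are assumptions made only for this example.

```cpp
#include <cassert>

// Norton companion model of a capacitor under the Trapezoidal rule:
//   i(t) = g * v(t) + i_hist,  g = 2C/dt,
//   i_hist = -(g * v(t - dt) + i(t - dt)).
struct CapacitorCompanion {
    double g;       // equivalent conductance 2C/dt
    double i_hist;  // history current source
    CapacitorCompanion(double c, double dt, double v0, double i0)
        : g(2.0 * c / dt), i_hist(-(g * v0 + i0)) {
        assert(c > 0.0 && dt > 0.0);
    }
    // Refresh the history term once the node voltage of this step is known.
    void update(double v) {
        const double i = g * v + i_hist;
        i_hist = -(g * v + i);
    }
};

// Hypothetical test circuit: DC source vs charging c through r from 0 V.
// Returns the capacitor voltage after `steps` time-steps.
double simulate_rc_charge(double vs, double r, double c, double dt, int steps) {
    CapacitorCompanion cap(c, dt, 0.0, vs / r);  // i(0+) = vs / r
    double v = 0.0;
    for (int n = 0; n < steps; ++n) {
        // Nodal equation: (vs - v) / r = g * v + i_hist.
        v = (vs / r - cap.i_hist) / (1.0 / r + cap.g);
        cap.update(v);
    }
    return v;
}
```

For a time-step well below the time constant `r * c`, the result should follow the analytical charging curve `vs * (1 - exp(-t / (r * c)))` closely. An inductor can be treated in the same way with the roles of voltage and current exchanged.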

The transformer and induction machine, on the other hand, are described by matrices, particularly the latter which is described by the state-space equation and does not have a concrete circuit form that can be directly integrated with its surrounding apparatuses. In such cases, the Trapezoidal rule is applied to those models for discretization.

As for the transmission line, it can be generally represented by the Bergeron line model, or the frequency dependent line model if the requirement on accuracy is stricter. In short distance scenarios, it can even be replaced by basic *RLC* elements.

• Circuit partitioning

The multi-terminal HVDC grid is a huge system corresponding to a matrix equation of high dimension that is too burdensome for processors to compute without partitioning. The transmission line provides natural separation of each converter station after being discretized. However, one MTDC terminal still poses a severe challenge to efficient computation as the MMC and HHB both comprise a large number of IGBTs and diodes.

Thus, artificial circuit partitioning is investigated based on the principle that an element whose voltage or current changes slowly compared with the EMT computation frequency is a potential point of circuit separation. Two partitioning schemes are adopted for reshaping the EMT configuration of the MTDC grid, i.e., the TLM-link, which is the equivalent circuit of reactive components, and the voltage-current source coupling, which is an option when such components are absent. Therefore, a converter station is further split into many subsystems corresponding to a number of matrix equations with lower dimension, and the computational burden is reduced, especially when parallel computation is exploited.

• FPGA hardware design

The FPGA is the main carrier for real-time HIL emulation of power electronics and power systems. One feature that distinguishes this work from previous system-level implementations is that the time-step is much smaller for the purpose of capturing switching transients, which last approximately a few hundred nanoseconds to a few microseconds. On the other hand, the computational burden is even heavier due to more complex device models. Following circuit partitioning, efforts are put into parallel computation and pipelined hardware design. Vivado<sup>®</sup> HLS, which enables C/C++ coding, is utilized for the generation of IP cores, which are then imported into the VHDL description tool Vivado<sup>®</sup> where the final top-level design is conducted. Consequently, the design cycle is shortened by avoiding logic-gate-level design.

• Efficient massively parallel simulation on GPU

The graphics processing unit is used for EMT simulation of HVDC grids, where a large number of identical components exist which can be coded as one GPU kernel, defined as a global function written in the programming language CUDA C. Then, the SIMT mode enables the parallel computation of all circuits corresponding to that kernel so that a high efficiency is achieved. Within the hierarchical HVDC grid, different circuit types exist in distinct numbers, which increases the irregularity. However, after the creation of multiple types of circuits by partitioning, they are programmed as individual kernels. The number of threads invoked by one kernel is exactly the number of the corresponding circuit components; thus, the GPU has a higher parallelism than the multi-core CPU. With a proper reorganization of the HVDC grid and the massively parallel architecture of the GPU, system-level simulation with device-level details, which is otherwise infeasible on the CPU, turns out to be an efficient approach for studying the system.
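The mapping of one thread to one circuit component can be sketched on the CPU as follows. This is only a sequential C++ stand-in for a CUDA C kernel, written under the assumption that each submodule updates its capacitor voltage independently with a forward-Euler step; `Submodule`, `submodule_kernel_body`, and `launch_submodule_kernel` are hypothetical names, and the loop index `tid` plays the role of the GPU thread index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// State of one identical circuit component (assumed here: an MMC submodule).
struct Submodule {
    double vc;      // capacitor voltage (V)
    double i_arm;   // arm current seen by the submodule (A)
    bool inserted;  // true when the capacitor is in the current path
};

// Work done by one thread: the same instructions for every component,
// with the data selected by the thread index tid.
void submodule_kernel_body(std::size_t tid, std::vector<Submodule>& sm,
                           double dt, double c) {
    assert(tid < sm.size());
    Submodule& s = sm[tid];
    if (s.inserted) {
        s.vc += dt / c * s.i_arm;  // forward-Euler capacitor update
    }
}

// Sequential stand-in for a kernel launch with sm.size() threads.
void launch_submodule_kernel(std::vector<Submodule>& sm, double dt, double c) {
    for (std::size_t tid = 0; tid < sm.size(); ++tid) {
        submodule_kernel_body(tid, sm, dt, c);
    }
}
```

Because no iteration reads data written by another one, the loop could be replaced by as many concurrent threads as there are submodules. A different circuit type would get its own body and launcher, mirroring the one-kernel-per-circuit-type organization described above.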

Moreover, several variable time-stepping schemes are proposed to further expedite the off-line simulation. Though nonlinear device-level IGBT and diode models are adopted for system preview, the requirement on the time-step varies between stages. Under steady state, variables in the system change much more slowly than during periods when a dramatic state shift is taking place, and therefore a larger time-step can be tolerated. Then, the focus lies on correctly judging and controlling the time-step. Three main criteria are proposed, i.e., based on events taking place in the system, the local truncation error, and the Newton-Raphson iteration count for nonlinear systems.
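How the three criteria might be combined into one time-step controller is sketched below. The halving and doubling factors, the thresholds, and the names `StepLimits` and `next_time_step` are illustrative assumptions rather than the schemes developed in this work.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical tuning parameters of the time-step controller.
struct StepLimits {
    double dt_min;   // smallest allowed time-step (s)
    double dt_max;   // largest allowed time-step (s)
    double lte_tol;  // tolerated local truncation error
    int nr_max;      // tolerated Newton-Raphson iteration count
};

// Returns the time-step to use for the next solution point.
double next_time_step(double dt, bool event, double lte, int nr_iters,
                      const StepLimits& lim) {
    assert(lim.dt_min > 0.0 && lim.dt_min <= lim.dt_max);
    // Criterion 1: an event (switching, fault) forces the finest step.
    if (event) {
        return lim.dt_min;
    }
    if (lte > lim.lte_tol || nr_iters > lim.nr_max) {
        // Criteria 2 and 3 violated: the solution is changing fast.
        dt *= 0.5;
    } else if (lte < 0.1 * lim.lte_tol && 2 * nr_iters <= lim.nr_max) {
        // Both criteria are met with margin: the step can grow.
        dt *= 2.0;
    }
    return std::min(std::max(dt, lim.dt_min), lim.dt_max);
}
```

In this sketch an event always takes priority, the step shrinks as soon as either numerical criterion is violated, and it grows only when both are satisfied with some margin, which avoids oscillating between two step sizes.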

## **1.4** Contribution of the Thesis

This work targets both real-time HIL emulation and off-line simulation of multi-terminal DC grid. The main contributions are briefly summarized in the following:

- The proposal of a low-latency device-level IGBT/diode model using the curve-fitting technique. The model includes two aspects: the static voltage-current characteristics are reflected by a piecewise linear resistor, while the switching transients are calculated in advance.

- Improvement on switching transients of the aforementioned model by the dynamic curve-fitting method. The IGBT rise and fall times are taken as functions of variables affecting the switching characteristics.
- MMC hybrid arm structure for low hardware resource utilization. The arm is composed of ideal switch-based submodules, which are simplified by their Thévenin equivalent circuits and then merged, and separate submodules containing device-level IGBT/diode models.
- Two circuit partitioning methods for improving simulation efficiency. Circuit sections with stiff current or voltage are taken as lossless transmission lines which are subsequently discretized for reducing the admittance matrix dimension, and the coupled voltage-current source is applicable in the absence of reactive components.
- Subsequent multi-layer hardware implementation of one integrated circuit. The large latency disparity between device-level circuits and system-level systems including the controller can then be processed in a pipelined manner.
- High-order behavioral IGBT modeling using superimposition. The model is taken as a collection of sub-circuits corresponding to various behaviors of the IGBT, and the final waveform is their superimposition.
- Development of the electro-thermal model for IGBT curve-fitting model and the nonlinear behavioral model. The interactive electro-thermal network provides junction temperature which impacts the IGBT behavior.
- The GPU simulation of HVDC grids including the CIGRÉ B4 DC test system using massive parallelism invoked by the processor's single-instruction multiple-thread mode. Various power electronic and power system components are modeled and designed into GPU kernels.
- Introduction of the virtual subsystem to improve the regularity of HVDC grids for SIMT implementation.
- GPU program design of nonlinear IGBT/diode-based HVDC grid for higher fidelity. The creation of a large number of identical sub-circuits caters for GPU SIMT implementation, making system-level simulation involving device-level models feasible.
- Proposal of three variable time-stepping schemes for further acceleration of both CPU and GPU simulation. Three main categories are identified and classified according to their applications.
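The interaction between the electrical and thermal domains mentioned in the list above can be illustrated with a deliberately simple sketch. It assumes a single thermal RC stage between junction and ambient, a conduction loss of `i_c * i_c * r_on`, and an on-state resistance that rises linearly with junction temperature; none of these choices, nor the names `ThermalStage`, `junction_step`, `on_resistance`, and `simulate_junction`, are taken from the models of this work.

```cpp
#include <cassert>

// Assumed single-stage thermal network between junction and ambient.
struct ThermalStage {
    double r_th;  // thermal resistance (K/W)
    double c_th;  // thermal capacitance (J/K)
};

// Backward-Euler update of  c_th * dT/dt = p_loss - (T - t_amb) / r_th.
double junction_step(double tj, double p_loss, double t_amb,
                     const ThermalStage& th, double dt) {
    assert(th.r_th > 0.0 && th.c_th > 0.0 && dt > 0.0);
    const double a = dt / (th.r_th * th.c_th);
    return (tj + dt / th.c_th * p_loss + a * t_amb) / (1.0 + a);
}

// Hypothetical linear temperature dependence of the on-state resistance,
// referred to 25 degrees Celsius.
double on_resistance(double tj, double r_25, double alpha) {
    return r_25 * (1.0 + alpha * (tj - 25.0));
}

// Electro-thermal loop: the loss heats the junction, and the junction
// temperature in turn changes the loss of the next step.
double simulate_junction(double i_c, double r_25, double alpha, double t_amb,
                         const ThermalStage& th, double dt, int steps) {
    double tj = t_amb;
    for (int n = 0; n < steps; ++n) {
        const double p_loss = i_c * i_c * on_resistance(tj, r_25, alpha);
        tj = junction_step(tj, p_loss, t_amb, th, dt);
    }
    return tj;
}
```

With a constant collector current, the junction temperature in this sketch should converge to the value at which the temperature-dependent loss equals the heat flowing out through `r_th`.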

## **1.5** Application of the Work

Based on the processors used, this work has two parts, i.e., the real-time HIL emulation on FPGA and off-line simulation using massive parallelism of the GPU. With the proposal and adoption of device-level models of the switching elements, a virtual primary system with higher fidelity becomes available and the study of the HVDC grid produces more convincing results.

The real-time system is essential to a wide range of applications in both academia and industry. For example, it provides a virtual power system which can be integrated with actual control and protection devices for their function validation during factory acceptance test before being sent to the field where on-site tests are conducted; and even within an HVDC converter station, such real-time systems are available for the operators to learn the power system's behavior under various scenarios, or to gain acquaintance with system operation procedures prior to commanding the actual station. Meanwhile, the inclusion of detailed IGBT/diode models in this work enables revelation of device specifics such as power loss and junction temperature, thus the power electronic apparatus design can be evaluated in an interactive large-scale system, instead of an isolated environment that omits devices' mutual impact.

The off-line simulation based on GPU would also play a significant role in HVDC grid system design and test. Academia has shown great interest in studying multi-terminal HVDC systems, and high accuracy is always pursued. While the traditional EMT simulation based on CPU or even multi-core CPU is inefficient in solving a large system with complex power semiconductor models, the GPU provides a platform which expedites the simulation with a remarkable speedup. Therefore, the time spent on DC grid research is shortened, e.g., the impact of new control algorithms or of the selection of a certain type of device can be immediately known.

## **1.6 Thesis Outline**

This thesis consists of eight chapters and is organized as follows:

- **Chapter 1: Introduction** The background of this work is briefly introduced. The motivation and objectives are also summarized.
- **Chapter 2: Overview of Parallel Processors** In this chapter, parallel processors for EMT simulation are specified.
- **Chapter 3: Linearized Device-Level Modular Multi-Level Converter Model** Detailed MMC modeling technique for efficient simulation is demonstrated in this chapter. Circuit partitioning based on TLM-link is investigated for splitting the MMC, which adopts linearized IGBT/diode models.
- **Chapter 4: Nonlinear Device-Level Modular Multi-Level Converter Model** This chapter focuses on nonlinear device-level modeling of the MMC. Two types of models of the IGBT and its anti-parallel diode are proposed, which have higher accuracy than the piecewise linearized model. Coupled voltage-current source is adopted for fine-grained circuit partitioning of the MMC to achieve parallelism.
- **Chapter 5: High-Fidelity Device-Level Hybrid HVDC Breaker Models** Two types of hybrid HVDC circuit breakers are presented. Modeling approaches which lead to efficient simulation are discussed.
- **Chapter 6: Fixed Time-Step CIGRÉ DC Grid Simulation on GPU** The CIGRÉ B4 DC grid is taken as the study object in this chapter. Three IGBT/diode models, including the ideal two-state switch model, are applied to the HVDC grid for different simulation purposes. Two programs for GPU and multi-core CPU execution are designed and their computational times are compared.
- **Chapter 7: MTDC Grid Variable Time-Stepping Simulation on GPU** Based on the work in Chapter 6, further improvement to the simulation speed is carried out by exploiting the variable time-stepping method.
- **Chapter 8: Conclusions and Future Works** The research conclusions are provided and future work is discussed.

# **2 Overview of Parallel Processors**

### 2.1 Introduction

The EMT simulation can be conducted on various types of processors, e.g., the CPU, the FPGA, the application-specific integrated circuit (ASIC), and the GPU, which has recently become attractive. The FPGA has been widely used in real-time circuit simulation, mainly due to its reconfigurability and the high computational efficiency achieved by its extraordinary parallelism. The hardware design can be altered by the user whenever a new power system is to be simulated on the board, even after a previous design has been completed. On the contrary, the ASIC is excluded from studying power systems due to its high cost and its inability to be reconfigured, albeit it has higher performance. The CPU is not able to achieve real-time execution for a relatively large system by itself, even though it has a much higher clock rate, usually in the range of several gigahertz, since the instructions are executed sequentially. Nevertheless, the CPU still has some parallelism when multiple cores are incorporated into a processor.

With regard to off-line simulation, the FPGA is not the prime choice, as its limited hardware resources confine the scale of the system to be studied. The CPU is currently the most prevalent platform, but it turns out to be inefficient when the number of nodes increases. The GPU, which was initially used as a specialized graphics processor for displaying images, has now evolved into a highly parallel, multi-core processor whose tremendous computational power has attracted increasing interest in accelerating a broad array of computations, known as heterogeneous computing. Moreover, its low cost and compact volume enable it to be installed in workstations, personal computers, and laptops, meaning off-line simulation based on this device will be as convenient as with traditional EMT-type solvers. Thus, motivated by the desire for more efficient off-line simulation, the GPU is expected to be another mainstream processor in the future.

In this chapter, as the main processors for parallel computation, the FPGA and GPU architectures are first briefed, followed by corresponding program designs using the programming languages VHDL and CUDA C, respectively.

### 2.2 FPGA Introduction

### 2.2.1 FPGA Hardware Architecture

The FPGA is an integrated circuit containing a two-dimensional array of configurable logic blocks (CLBs) which are interconnected through hardwires and programmable switch matrices. A fundamental CLB is able to implement both combinational and sequential logic functions, and the programmable switch matrices also help to achieve hardware reconfigurability. Two typical FPGA hardware architectures are given in Fig. 2.1 [114, 115], which shows that the input/output (I/O) blocks connecting the CLBs and programmable switch matrices are arranged at the periphery of the logic array. The column-based advanced silicon modular block (ASMBL) architecture created by Xilinx<sup>®</sup> offers users greater convenience in choosing an FPGA device with proper features for their design. This structure is adopted for the 7-series FPGA boards. According to Fig. 2.1(b), the FPGA is composed of abundant LUT-6 CLBs, on-chip block memory, DSP slices, precise clocking resources, enhanced PCIe<sup>®</sup> interface blocks, and the programmable switches interconnected via wires.

The Xilinx<sup>®</sup> Virtex<sup>®</sup>-7 VX485T FPGA, manufactured with 28nm process technology, is the main platform used in this work for real-time HIL testing of power electronic systems. Compared with other 7-series boards, it has more logic resources and higher computational performance. As new FPGAs keep being launched, the XCVU9P chip on the latest UltraScale+ VCU118 platform, which provides the highest performance and integration on FinFET, is also used. A comprehensive comparison of their main parameters is given in Table 2.1 [116, 117].

| Resource        | Virtex<sup>®</sup>-7 VX485T   | UltraScale+ VU9P    |
|-----------------|-------------------------------|---------------------|
| Logic Cells     | 485760                        | 2586150             |
| CLB FFs         | 607200                        | 2364480             |
| Block RAM (Kb)  | 37080                         | 75900               |
| Clocking (CMTs) | 14                            | 30                  |
| DSP Slices      | 2800                          | 6840                |
| PCIe®           | 4 Gen2                        | 6 Gen3×16/Gen4×8    |
| Transceivers    | GTX (12.5Gb/s) 56             | GTY (32.75Gb/s) 120 |

Table 2.1: FPGA logic resources

Figure 2.1: FPGA hardware: (a) mesh architecture, (b) Virtex<sup>®</sup>-7 ASMBL architecture.

The two FPGA boards share many types of resources, and the latest UltraScale+ VU9P FPGA is more resource-abundant and efficient in data exchange than the Virtex<sup>®</sup>-7 boards. The availability of some of these resources is of concern as it affects the scale of the power system that can be deployed. Thus, they are introduced in the following subsections.

### 2.2.2 Configurable Logic Block

The configurable logic block is the fundamental component in the FPGA for providing basic logic and arithmetic functions as well as data storage. In the Xilinx<sup>®</sup> Virtex<sup>®</sup>-7 series FPGAs, the CLB contains 2 side-by-side slices, each of which is composed of four 6-input LUTs, each with 2 flip-flops [118]. Other than those resources, a CLB also has 3 wide-function multiplexers and a carry chain in its slices to perform addition and subtraction.

Fig. 2.2 gives the scheme of a CLB. The slices, organized as 2 individual columns, are not directly connected to each other. *Slice0* is located at the bottom of the CLB in the left column, while *Slice1* is located at the top in the right column.

There are two types of CLB slices: those that support data storage using distributed RAM and data shifting with 32-bit registers are categorized as SLICEM, while the rest are named SLICEL. A CLB can then contain either two SLICELs or one SLICEL and one SLICEM. The LUT in the Virtex<sup>®</sup>-7 FPGA can be implemented as one 6-input, 1-output LUT for a 64-bit ROM or as two 5-input LUTs with individual outputs for 32-bit ROMs. The carry chain contains multiplexers and an XOR logic gate for the addition or subtraction operation. As can be seen, the inputs and outputs of a slice are also its ports.
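The view of a LUT as a small ROM can be made concrete with a short C++ sketch. It assumes that the 64 configuration bits are held in one integer `init`, that the six inputs form the read address, and that in the dual 5-input mode the lower 32 bits feed one output and the upper 32 bits the other; the function names and the output labels `o5` and `o6` are chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>

// A 6-input LUT read as a 64 x 1-bit ROM: the inputs form the address.
bool lut6(std::uint64_t init, unsigned addr) {
    assert(addr < 64u);
    return (init >> addr) & 1u;
}

// Outputs of the same LUT used as two 5-input functions of shared inputs.
struct Lut5Pair {
    bool o5;  // read from the lower 32 configuration bits
    bool o6;  // read from the upper 32 configuration bits
};

// Two 32 x 1-bit ROMs packed into one 64-bit configuration word
// (assumed split: lower half for o5, upper half for o6).
Lut5Pair lut5x2(std::uint64_t init, unsigned addr) {
    assert(addr < 32u);
    return {lut6(init, addr), lut6(init, addr + 32u)};
}
```

For example, a configuration word with only bit 63 set realizes a 6-input AND gate, since the output is 1 only when all six address bits are 1.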

### 2.2.3 Block RAM (BRAM)

In 7-series FPGAs, the block RAM has up to 36Kb data storage capability, and it can be implemented as either 1 RAM or 2 separate RAMs with each having 18Kb data [119]. In addition, it also has the cascaded manner when an adjacent 36Kb BRAM is implemented, i.e., $1 \times 64$Kb, and under simple dual-port mode, there are a variety of configurations, e.g., $1 \times 32$Kb, $2 \times 16$Kb, or even $72 \times 512$b. Similar configurations are also available to the two separated 18Kb RAMs.

Figure 2.2: Configurable logic block architecture.

Figure 2.3: 7-series FPGA block memory: (a) simple dual-port RAM, (b) true dual-port RAM.

Under simple dual-port mode, there is only one read-only port and one write-only port, which are highly independent of each other: they can be controlled by two clocks, their data widths can differ, and independent read and write actions can take place simultaneously. The other BRAM type is the true dual-port RAM, whose symmetrical configuration is given in Fig. 2.3. It ensures flexible data access to either or both ports by giving each its own address, input/output data, clock signal, write enable, etc. The description of those port names is provided in Table 2.2.

Figure 2.4: Basic DSP48E1 slice functionality.

| Port | Direction | Description            |
|------|-----------|------------------------|
| DI   | in        | data input bus         |
| DIP  | in        | data input parity bus  |
| ADDR | in        | address bus            |
| WE   | in        | Byte-wide write enable |
| EN   | in        | BRAM write enable      |
| CLK  | in        | clock input            |
| DO   | output    | data output bus        |
| DOP  | output    | data output parity bus |

Table 2.2: Dual-port RAM description

#### 2.2.4 Digital Signal Processing (DSP) Slice

Programmable logic devices are efficient carriers for DSP applications, which use many binary multipliers and accumulators. Both the 7-series and UltraScale+ FPGAs have a number of dedicated low-power DSP slices, integrating high speed with compact size while at the same time maintaining system design flexibility. In addition to digital signal processing, the DSP slices also enable wide dynamic bus shifters, memory address generators, memory-mapped I/O registers, etc. On the 7-series FPGA boards, the DSP48E1 slice is adopted [120], as shown in Fig. 2.4, while its UltraScale counterpart uses the more advanced DSP48E2 [121].

As shown in its slice architecture, the DSP48E1 slice includes 25×18 two's-complement multiplier, a 48-bit accumulator, 25-bit power-saving pre-adder, a pattern detector, etc.



Figure 2.5: Hardware design procedure and experimental setup.

# 2.3 HIL Implementation Procedure

The entire hardware design procedure for HIL emulation is depicted in Fig. 2.5. The whole process can be completed in 3 stages, summarized as:

- Data entry by high-level synthesis
- Top-level design and simulation with Vivado<sup>®</sup>
- Bit file generation and experimental test on FPGA

Thus, in the following paragraphs, each stage is specified, including tools necessary for the design, the programming language, and prototype setting, etc.

### 2.3.1 Vivado<sup>®</sup> High-Level Synthesis Tool

The Xilinx<sup>®</sup> high-level synthesis software Vivado HLS<sup>®</sup> transforms C/C++ functions into a register transfer level (RTL) implementation that can be synthesized onto the vendor's FPGAs.

During this stage of design, the user can develop a hardware module in C/C++ rather than in VHDL at the logic gate level, which greatly facilitates the hardware design. For example, to realize a complex multiple-input-multiple-output algebraic module, the C/C++ function can be written as:

`void func(float ai, float bi, float *ao, float *bo) { /* algebraic functions here */ }`

It should be pointed out that the variables are declared as single-precision floating-point numbers (32 bits) because these are more efficient for computation than 64-bit double-precision numbers. The algebraic function description may contain potentially parallel operations, and therefore the design tool offers a pipeline option among its directives, which greatly facilitates programming. The C synthesis function provided by the tool then creates the RTL design of the written function automatically. The syntax is also checked during this process: an erroneous function leads to premature termination of C synthesis. The option *Export RTL* enables the RTL design to be exported as an IP, which has the corresponding input/output ports in VHDL format:

```
COMPONENT func
PORT (
    ao_ap_vld : OUT STD_LOGIC;
    bo_ap_vld : OUT STD_LOGIC;
    ap_clk    : IN  STD_LOGIC;
    ap_rst    : IN  STD_LOGIC;
    ap_start  : IN  STD_LOGIC;
    ap_done   : OUT STD_LOGIC;
    ap_idle   : OUT STD_LOGIC;
    ap_ready  : OUT STD_LOGIC;
    ai        : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
    bi        : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
    ao        : OUT STD_LOGIC_VECTOR(31 DOWNTO 0);
    bo        : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
);
END COMPONENT;
```

Since the RTL design is generated by Vivado HLS<sup>®</sup>, C/RTL co-simulation becomes available in the software once a corresponding test bench is written. The co-simulation is deemed equivalent to a hardware behavioral simulation, so its results serve as a preliminary validation of the hardware design even though the design entry is C-based.
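As an illustration of this design-entry stage, the following C sketch fills in the placeholder body of *func* with an arbitrary algebraic update and adds a self-checking routine of the kind a C test bench would call. The arithmetic inside `func`, the helper `func_testbench`, and the tolerance are assumptions made for illustration only. `#pragma HLS PIPELINE` is the Vivado HLS pipelining directive and is simply ignored by an ordinary C compiler.

```c
#include <assert.h>

/* Hypothetical two-input, two-output algebraic module. The arithmetic is a
 * placeholder; the real equations depend on the circuit model being emulated. */
void func(float ai, float bi, float *ao, float *bo)
{
#pragma HLS PIPELINE
    assert(ao != 0 && bo != 0);
    *ao = 0.5f * (ai + bi);   /* illustrative arithmetic only */
    *bo = 0.5f * (ai - bi);
}

/* Self-checking routine of the kind a C test bench would call:
 * returns 0 when the outputs match the expected values, 1 otherwise. */
int func_testbench(void)
{
    const float tol = 1e-6f;
    float ao = 0.0f, bo = 0.0f;
    func(3.0f, 1.0f, &ao, &bo);
    float err_a = ao - 2.0f;
    float err_b = bo - 1.0f;
    if (err_a < 0.0f) err_a = -err_a;
    if (err_b < 0.0f) err_b = -err_b;
    return (err_a < tol && err_b < tol) ? 0 : 1;
}
```

A test bench `main` would typically just return the value of `func_testbench()`, so that a non-zero exit code flags a mismatch.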

### 2.3.2 Vivado<sup>®</sup> Top-Level Design

A power electronic system EMT model contains a number of modules, normally classified according to their functionality. Thus, for HIL emulation, all those functions are first written in C/C++ in the Vivado HLS<sup>®</sup> environment, and after IP generation and export they can be recognized by Vivado<sup>®</sup>. These user-defined IPs are treated in the same way as the default hardware modules in the IP catalog.

In EMT simulation, signals between various electrical apparatuses are exchanged at the end of each time-step, e.g., the controller sends IGBT gate voltages to the power converter, which in turn returns its sampled voltages and currents; in other cases, the outputs of a module should be fed back into its own inputs. None of these data exchanges is included in the hardware modules designed from C functions. Instead, the connection is made in Vivado<sup>®</sup> using VHDL; taking the above hardware module *func* as an example, the typical syntax is:

*if clk='1' and clk'event then if ao\_ap\_vld='1' then ai<=ao; end if; if bo\_ap\_vld='1' then bi<=bo; end if; end if;* 

which is synthesized into flip-flops; Fig. 2.6 shows the resulting self-connection of the hardware module *func*. Once a valid output is generated, the data valid signal becomes binary 1 and enables the D flip-flop, whose output is fed to the input ports of the C-based module after the next clock edge of *ap\_clk* arrives.

The *ap\_ctrl* port includes four binary ports, among which the start control port is driven by the finite state machine (FSM) along with the reset port *ap\_rst*. Thus, unnecessary calculation by the module, which would lead to incorrect results, can be avoided, and in case a rerun of the emulation is needed to observe particular power system phenomena, issuing a reset command is sufficient. The other 3 signals, which indicate the operation status of the module, are taken by the FSM as feedback for its state-transition decisions.

Since the above design is carried out manually, whereas the RTL design for Vivado HLS<sup>®</sup> co-simulation is generated automatically, the results of the manual design are not guaranteed to be correct. Thus, the behavioral simulation offered by Vivado<sup>®</sup> serves as a further validation of the top-level hardware design.



Figure 2.6: Demonstration of top-level hardware design.

### 2.3.3 FPGA Experiment

The HIL emulation results are ultimately expected to be observed on an oscilloscope. To achieve that goal, the top-level design needs to be implemented on the FPGA through the following steps in Vivado<sup>®</sup>:

- *Run Synthesis* transforms the RTL design into a gate-level representation.
- *Run Implementation* performs all the stages necessary to place and route the netlist onto FPGA resources under various logical, physical, and timing constraints.
- *Generate Bitstream* creates a bit file that can be downloaded into the targeted FPGA board.

As shown in Fig. 2.5, a digital-to-analog conversion medium is mandatory since the oscilloscope channels receive analog signals. The Texas Instruments DAC34H84 quad-channel, 16-bit, digital-to-analog converter with a sample rate as high as 1.25 GSPS is connected to the FPGA board and the Tektronix DPO 7054 oscilloscope so that the hardware design results can be displayed as real-time waveforms.

# 2.4 GPU Introduction

Having evolved from a specialized graphics processor for rapid image display into a platform for 3-D graphics and state-of-the-art high-performance computing (HPC), the GPU enables advances in various fields, such as artificial intelligence, autonomous driving, and numerous compute-intensive applications. This improvement in HPC has been accompanied by the evolution of the GPU architecture. Taking the NVIDIA<sup>®</sup> GPU roadmap as an example, it has gone through the Tesla, Fermi, Kepler, Maxwell, Pascal, and Volta architectures since 2008, with a new architecture launched roughly every two years. In this work, two GPUs are used: the GeForce<sup>®</sup> GTX 1080 (Pascal architecture), manufactured on a 16 nm FinFET process, and the Tesla<sup>®</sup> V100 (Volta architecture) accelerator, fabricated on a 12 nm FFN process. Their detailed specifications are provided in Table 2.3 [122, 123], which shows that the Volta GPU has more abundant hardware resources and a higher memory bandwidth, meaning that the V100 can achieve a higher concurrency in computation. The only metric in which the GTX 1080 exceeds the V100 is the clock frequency, which suggests that it may be faster for circuit simulation with low parallelism.

| Resource             | GTX 1080            | Tesla<sup>®</sup> V100 |
|----------------------|---------------------|------------------------|
| SMs                  | 20                  | 80                     |
| CUDA cores           | 2560                | 5120                   |
| Base clock           | 1607 MHz            | -                      |
| GPU boost clock      | 1733 MHz            | 1530 MHz               |
| Peak FP32 throughput | 8873 GFLOPS         | 15.7 TFLOPS            |
| Texture units        | 160                 | 320                    |
| Memory               | 8 GB                | 16 GB                  |
| Memory bandwidth     | 320 GB/s            | 900 GB/s               |
| L2 cache size        | 2048 KB             | 6144 KB                |
| TDP                  | 180 W               | 300 W                  |
| Transistors          | 7.2 billion         | 21.1 billion           |
| Die size             | 314 mm<sup>2</sup>  | 815 mm<sup>2</sup>     |
| Manufacturing        | 16 nm               | 12 nm FFN              |
| Compute capability   | 6.1                 | 7.0                    |

Table 2.3: GeForce<sup>®</sup> GTX 1080 and Tesla<sup>®</sup> V100 specifics
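The throughput figures in Table 2.3 can be cross-checked from the core counts and boost clocks. The sketch below assumes the usual convention that each CUDA core completes one fused multiply-add, i.e., two floating-point operations, per clock cycle; the function name `peak_gflops` is introduced here for illustration only.

```c
#include <assert.h>

/* Peak single-precision throughput in GFLOPS, assuming each CUDA core
 * retires one fused multiply-add (2 floating-point operations) per cycle. */
double peak_gflops(int cuda_cores, double boost_clock_mhz)
{
    assert(cuda_cores > 0 && boost_clock_mhz > 0.0);
    return 2.0 * cuda_cores * boost_clock_mhz / 1000.0;
}
```

With the table entries, `peak_gflops(2560, 1733.0)` gives about 8873 GFLOPS for the GTX 1080 and `peak_gflops(5120, 1530.0)` gives about 15.7 TFLOPS for the V100.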

### 2.4.1 NVIDIA<sup>®</sup> GPU Architecture

The block diagram of the NVIDIA<sup>®</sup> GeForce<sup>®</sup> GTX 1080 (Pascal architecture) is sketched in Fig. 2.7, which shows that it consists of 4 graphics processing clusters (GPCs), each of which is composed of 5 streaming multiprocessors and a dedicated raster engine. The Tesla<sup>®</sup> V100 (Volta architecture) has a similar block diagram but is built on a larger device: the full chip contains 6 GPCs, each having 14 streaming multiprocessors equally distributed among its 7 TPCs, and eight 512-bit memory controllers.

Figure 2.7: GeForce GTX 1080 block diagram.

As shown in Fig. 2.8(a), each streaming multiprocessor of the NVIDIA<sup>®</sup> GeForce<sup>®</sup> GTX 1080 contains 96 KB of shared memory, up to 128 CUDA cores, and 256 KB of register file capacity. The streaming multiprocessor schedules warps of 32 threads onto its CUDA cores. Thus, with 20 multiprocessors, the GPU is equipped with a total of 2560 CUDA cores. For comparison, the streaming multiprocessor of the Tesla<sup>®</sup> V100 GPU is given in Fig. 2.8(b). It contains 64 FP32 cores, 64 INT32 cores, 8 Tensor cores, and 4 texture units. Therefore, a full Volta chip with 84 streaming multiprocessors has 5376 FP32 cores and 5376 INT32 cores, together with a total of 6144 KB of L2 cache; the Tesla<sup>®</sup> V100 listed in Table 2.3 has 80 of these multiprocessors enabled, which gives its 5120 CUDA cores.

The GPU CUDA compute capability is another concern in this work, since the dynamic parallelism which allows a CUDA kernel to create and synchronize its own new nested work [124] is used in GPU simulation to accommodate the hierarchical configuration of the power electronic system. Besides, with this feature, the programmer is liberated from transferring data and control between the host CPU and the GPU device.

### 2.4.2 GPU Massively Parallel Processing

The massive parallelism of GPU is realized by the kernel, which is a global function coded by CUDA C language and run on the GPU device. The kernel is implemented in the single-instruction-multiple-thread mode. The CUDA C syntax is

kernel<<<Nblock, Nthread>>>(input dev\_ai, input dev\_bi, ..., output dev\_ao, output dev\_bo, ...);

where *Nblock* and *Nthread* represent the number of compute blocks and the number of threads per block, respectively, and the overall *Nblock*×*Nthread* threads constitute a compute grid. The size of the inputs *dev\_ai* and *dev\_bi* is determined by the number of variables they correspond to, and they are accessed by threads in a way defined by the programmer. For example, *dev\_ai* is a single element when it represents the simulation time, yet its value can be read by all threads; if *dev\_ai* has a dimension of *Nblock*×*Nthread*, its elements can be assigned one to each thread. The variable is normally declared on the host CPU and copied to the device by the following statements, taking *dev\_ai* as an example:

*cudaMalloc((void\*\*)&dev\_ai, N\*sizeof(datatype));*

*cudaMemcpy(dev\_ai, host\_ai, N\*sizeof(datatype), cudaMemcpyHostToDevice);*

where *host\_ai* is the variable stored in host memory. However, if the declaration takes place on the device when dynamic parallelism is applied, the memory copy to the device is no longer necessary.

Figure 2.8: Streaming multiprocessor diagram of (a) GeForce GTX 1080, (b) Tesla V100.

Figure 2.9: GPU implementation process.
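The mapping between threads and array elements can be made concrete with a serial emulation in plain C. This is only a sketch: the function `emulate_grid` and the placeholder computation `f` are hypothetical, and the two nested loops stand in for the blocks and threads that a GPU would execute concurrently. The flattened index is the one a CUDA kernel would form as `blockIdx.x * blockDim.x + threadIdx.x`.

```c
#include <assert.h>

/* Placeholder per-circuit computation; the real model equations differ. */
static float f(float a, float t) { return a + t; }

/* Serial emulation of a grid of n_block x n_thread threads. Each
 * (block, thread) pair handles one array element. The scalar t plays the
 * role of a value, such as the simulation time, that every thread reads. */
void emulate_grid(int n_block, int n_thread, float t,
                  const float *ai, float *ao)
{
    assert(n_block > 0 && n_thread > 0 && ai != 0 && ao != 0);
    for (int block = 0; block < n_block; ++block) {
        for (int thread = 0; thread < n_thread; ++thread) {
            int idx = block * n_thread + thread;  /* global thread index */
            ao[idx] = f(ai[idx], t);
        }
    }
}
```

On the GPU the two loops disappear: every thread computes its own `idx` and processes a single element.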

Fig. 2.9 describes the GPU's general computational architecture. The main function, written in C/C++, runs on the host CPU, where data required by the kernel Kernel<sub>x</sub> are defined and copied to the GPU via the PCIe<sup>®</sup> interface. Then, the host invokes a compute grid of multiple blocks and threads on the GPU for Kernel<sub>x</sub>. Data from the CPU, as well as the inputs and outputs of a kernel, are stored in the global memory so that they are accessible to all kernels. According to the definition of a compute grid, circuits with the same attributes can be coded as one kernel, with each circuit corresponding to one thread. Thus, by invoking the kernel, all circuits of the same type are computed in a massively concurrent fashion. During the computation, on GPUs with compute capability 3.5 (*sm\_35*) or higher, an arbitrary thread can launch its own child grid from the device, in which a new kernel Kernel<sub>u</sub> operates, and if necessary it can keep launching grandchild grids in the SIMT fashion without CPU involvement. The synchronization function ensures that a child grid returns to its parent only when all of its threads have completed their tasks, and the same applies to the original compute grid for Kernel<sub>x</sub>. Afterward, the GPU hands process control back to the CPU, so that all data generated on the device can be exported to the host through the PCIe<sup>®</sup> bus.

### 2.4.3 Multi-Core CPU

In addition to GPU simulation, CPU programming is also carried out for speed comparison. The presence of a large number of repetitive components means the CPU implementation would be extraordinarily inefficient if they were processed sequentially. Thus, a multi-core CPU is utilized to accelerate the simulation by distributing the tasks among its cores. The variables of identical circuit components are grouped as an array, so when the program is executed on a single-core CPU, it takes the form of a *for* loop; taking the variable *ao* as an example:

`for (int i=0;i<M;i++){ for (int j=0;j<N;j++){ ao[i][j]=f(ai[i][j],bi[i][j]); } }`

Each element in the above loop is calculated sequentially. Since the elements of *ao* are independent of each other, the ideal case is to compute all of them simultaneously, which is feasible in FPGA implementation. In multi-core CPU computation, however, the parallelism is limited by the number of cores. OpenMP<sup>®</sup> is an application programming interface (API) that supports shared-memory multi-threaded programming on the CPU in C/C++ [125]. This CPU multi-threading approach is therefore applied to power electronic system simulation. By issuing the parallel directive, iterations of the C loop can be assigned to different threads, with the syntax taking the form of

```
#pragma omp parallel for num_threads(Nt)
for (int i=0;i<M;i++){
    #pragma omp parallel for num_threads(Nt)
    for (int j=0;j<N;j++){
        ao[i][j]=f(ai[i][j],bi[i][j]);
    }
}
```

It should be noted that adopting OpenMP<sup>®</sup> does not guarantee a faster computation, since the threads must first be launched and all calculation results are written to the shared memory after completion. Rather, the speedup achieved with multi-core CPU programming depends on the size of the problem. Meanwhile, with regard to parallel capability, the GPU is able to handle a much larger number of threads, whereas the capability of the CPU is bounded by its number of cores. In other words, since the number of circuits the multi-core CPU can handle at one time is smaller, it has to process them in more batches, and consequently its efficiency is lower. In this work, Microsoft<sup>®</sup> Visual Studio is chosen to run the GPU CUDA C codes and the CPU C program.
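A compilable variant of the multi-threaded loop is sketched below. It is not the thesis code: instead of nesting two parallel regions, it uses the standard OpenMP `collapse(2)` clause, which merges both loops into a single iteration space shared among the threads. The array sizes `N_GROUP` and `N_COMP`, the placeholder function `f`, and the name `update_all` are assumptions. Without OpenMP support enabled, the pragma is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>

#define N_GROUP 4   /* hypothetical number of component groups */
#define N_COMP  8   /* hypothetical number of components per group */

/* Placeholder element-wise model equation. */
static float f(float a, float b) { return 2.0f * a + b; }

void update_all(float ai[N_GROUP][N_COMP], float bi[N_GROUP][N_COMP],
                float ao[N_GROUP][N_COMP], int n_threads)
{
    assert(n_threads > 0);
    /* collapse(2) turns both loops into N_GROUP * N_COMP independent
     * iterations that are distributed among n_threads threads. */
    #pragma omp parallel for collapse(2) num_threads(n_threads)
    for (int i = 0; i < N_GROUP; i++) {
        for (int j = 0; j < N_COMP; j++) {
            ao[i][j] = f(ai[i][j], bi[i][j]);
        }
    }
}
```

Because every `ao[i][j]` depends only on its own inputs, no synchronization is needed inside the loop body.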

### 2.5 Summary

This chapter presented some fundamental aspects of high-performance computing using the parallelism of the FPGA and the GPU, whose hardware architectures were briefly introduced. The FPGA is superior in parallelism, and its pipelined structure as well as convenient data access to other processors determines its popularity in real-time hardware-in-the-loop emulation. In the meantime, Xilinx<sup>®</sup> Vivado HLS<sup>®</sup> enables users to define their own IP cores in the high-level language C/C++; these can be imported into Vivado<sup>®</sup> for upper-level circuit design with VHDL, so that the time-consuming process of logic-gate manipulation is avoided.

Multi-threading programming techniques using the CPU and the GPU were also discussed. The large number of GPU cores enables massive parallelism using the programming language CUDA C. The long data exchange time between the GPU and other processors and its relatively lower parallelism than the FPGA are the two main factors that restrict its application in real-time simulation. Nevertheless, its single-instruction-multiple-thread operation mode enables the device to launch a compute grid and conduct the computation concurrently, making it suitable for off-line simulation of large-scale power electronic systems.

# Linearized Device-Level Modular Multi-Level Converter Model

# 3.1 Introduction

This chapter presents a device-level MMC model using a piecewise linear IGBT/diode model for nanosecond-level real-time hardware-in-the-loop emulation.

Due to the large network size of the MMC, its solution in conjunction with surrounding systems is a significant computational challenge. Treating the IGBT/diode pair as a two-state resistor and subsequently merging an entire MMC arm into its Thévenin equivalent are the two main steps in obtaining the detailed equivalent model (DEM), which has a lower computational burden.

However, when the converter power loss is of concern, accurate device-level information on the power semiconductor switch is needed for design evaluation or new model validation. The IGBT and its freewheeling diode are therefore modeled with nonlinear static and dynamic characteristics. In the curve-fitting model, the main features of the IGBT/diode pair, such as the static nonlinear *V*-*I* characteristics, the turn-on/off transients, and the reverse recovery process, are preserved in order to provide more accurate information on the converter within the capability of the real-time system. Unlike previous works, where the simulation time-steps were generally in the range of tens of microseconds to enable real-time execution while omitting nanosecond-scale phenomena, the time-step here is much smaller so as to capture the switching transients.

Once the device-level IGBT/diode model is involved in the MMC, submodule (SM) merging becomes less feasible. As a consequence, the MMC becomes extraordinarily large in terms of the number of circuit nodes, making it burdensome for real-time execution with a small time-step. Therefore, a fine-grained circuit partitioning approach based on the transmission line modeling (TLM) link technique is proposed, which separates every possible minimum subsystem from the original system. The immediate outcome is that the multi-loop MMC is split into several sub-circuits with smaller matrix sizes, which consequently enables a fully parallel implementation on the FPGA.

Figure 3.1: A resistor and its EMT model.

### 3.2 EMT Model of Basic Elements

The resistor, inductor, and capacitor are fundamental passive elements existing as an independent component or constituting the power electronic system. Therefore, their EMT companion circuits should be obtained in discrete-time domain. As TLM has been utilized in large circuit simulations due to its capability to replace reactive components while maintaining high precision [75–78, 126–128], in this work, it is also involved in the simulation.

### 3.2.1 Resistor

The EMT model of a resistor is identical to its continuous-time model, as shown in Fig. 3.1. Consequently, the relationship between its terminal voltage and current is Ohm's law:

$$i_{km} = \frac{v_{km}}{R}. \tag{3.1}$$

### 3.2.2 Inductor

The inductor is an energy storage element with the following differential equation representing its continuous-time model:

$$v_{km} = L \frac{di_{km}}{dt}. \tag{3.2}$$

Discretization of the above equation leads to several commonly seen models of different orders. For example, the second-order model is obtained using the trapezoidal integration rule:

$$\begin{aligned} i_{km}(t + \Delta t) &= i_{km}(t) + \frac{1}{L} \int_{t}^{t + \Delta t} v_{km}\, dt \\ &= i_{km}(t) + \frac{v_{km}(t + \Delta t) + v_{km}(t)}{2L} \Delta t, \end{aligned} \tag{3.3}$$

which can be written in the following general form

$$i_{km}(t + \Delta t) = \frac{v_{km}(t + \Delta t)}{R_{eq}} + I_h(t), \tag{3.4}$$

where $R_{eq}$ and $I_h(t)$ are

$$R_{eq} = \frac{2L}{\Delta t},\tag{3.5}$$

$$I_h(t) = i_{km}(t) + \frac{v_{km}(t)}{R_{eq}}, \tag{3.6}$$

as also given in Fig. 3.2(b).

Figure 3.2: Inductor: (a) symbol, (b) companion model, (c) TLM representation, and (d) TLM-stub model.
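The companion model (3.4)-(3.6) translates directly into a per-time-step update. The sketch below is a minimal illustration in which the branch voltage is supplied by the caller, whereas in a full EMT solver it would come from the network solution; the type `InductorModel` and the function names are hypothetical.

```c
#include <assert.h>

/* Trapezoidal companion model of an inductor, following (3.4)-(3.6). */
typedef struct {
    double req;  /* equivalent resistance R_eq = 2L / dt, (3.5) */
    double ih;   /* history current I_h(t), (3.6) */
    double i;    /* branch current at the latest solved instant */
} InductorModel;

/* i0 and v0 are the branch current and voltage at the initial instant. */
void inductor_init(InductorModel *m, double L, double dt, double i0, double v0)
{
    assert(m != 0 && L > 0.0 && dt > 0.0);
    m->req = 2.0 * L / dt;
    m->i = i0;
    m->ih = i0 + v0 / m->req;
}

/* Advance one time-step given the branch voltage v_km(t + dt);
 * returns the branch current i_km(t + dt). */
double inductor_step(InductorModel *m, double v_new)
{
    m->i = v_new / m->req + m->ih;   /* (3.4) */
    m->ih = m->i + v_new / m->req;   /* (3.6), prepared for the next step */
    return m->i;
}
```

For a constant applied voltage the trapezoidal rule is exact, so with L = 1 mH, Δt = 1 µs, and 10 V the current reaches 10 A after 1000 steps, which matches the analytical ramp v·t/L.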

The inductor can also be represented by a section of lossless transmission line [126–128], as shown in Fig. 3.2(c). According to transmission line theory, the terminal voltage is composed of the incident and reflected pulses, so $v_{km}^i$ and $v_{km}^r$ together form the excitation of the one-port circuit. Fig. 3.2(d) is the Thévenin equivalent circuit of the inductor's TLM-stub model after the transmission line is discretized. The characteristic impedance $Z_L$, also known as the surge impedance, is calculated in the same way as (3.5), whereas the voltage source is twice the incident pulse. At the transmission line terminal, we have

$${}_{n}v_{km} = {}_{n}v_{km}^{i} + {}_{n}v_{km}^{r}, \tag{3.7}$$

$${}_{n}v_{km} = {}_{n}i_{km} \cdot Z_L + 2 \cdot {}_{n}v^i_{km}, \tag{3.8}$$

where the subscript *n* denotes the time instant. Thus, the reflected pulse at instant *n* can be calculated from the above two equations. Then, the incident pulse at the next time instant is

$${}_{n+1}v_{km}^{i} = -{}_{n}v_{km}^{r}, \tag{3.9}$$

since the transmission line is short-circuited at its far end, which causes the returning pulse to have the same magnitude as the reflected pulse but the opposite sign. It should be pointed out that the choice between the Thévenin equivalent circuit and its Norton counterpart is determined by the manner of circuit solution, since they are convertible to each other.
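The pulse bookkeeping of (3.7)-(3.9) can be sketched as follows. The terminal voltage is again supplied by the caller for simplicity, and the names `InductorStub`, `stub_init`, and `stub_step` are hypothetical.

```c
#include <assert.h>

/* TLM-stub model of an inductor, following (3.7)-(3.9). */
typedef struct {
    double z;   /* stub characteristic impedance Z_L = 2L / dt */
    double vi;  /* incident pulse at the present instant n */
} InductorStub;

void stub_init(InductorStub *s, double L, double dt, double vi0)
{
    assert(s != 0 && L > 0.0 && dt > 0.0);
    s->z = 2.0 * L / dt;
    s->vi = vi0;
}

/* Given the terminal voltage at instant n, return the terminal current
 * and prepare the incident pulse for instant n+1. */
double stub_step(InductorStub *s, double v)
{
    double i = (v - 2.0 * s->vi) / s->z;  /* (3.8) rearranged for the current */
    double vr = v - s->vi;                /* (3.7) rearranged for the reflected pulse */
    s->vi = -vr;                          /* (3.9): short-circuited far end */
    return i;
}
```

With a constant 10 V at the terminal and an initial incident pulse of 5 V (which corresponds to zero initial current), the current returned at instant *n* is *n*·VΔt/L, the same ramp produced by the trapezoidal companion model.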

### 3.2.3 Capacitor

The capacitor is another energy storage component, with the continuous-time differential equation written as:

$$i_{km} = C \frac{dv_{km}}{dt}. \tag{3.10}$$

The companion model in Fig. 3.3(b) is obtained in a similar manner to that of the inductor and can also be expressed by the general companion model equation (3.4), where

$$R_{eq} = \frac{\Delta t}{2C},\tag{3.11}$$

$$I_h(t) = -i_{km}(t) - \frac{v_{km}(t)}{R_{eq}}. \tag{3.12}$$

The capacitor can also be modeled as an open-ended lossless transmission line, as given in Fig. 3.3(c), and the TLM-stub model in Thévenin equivalent form in Fig. 3.3(d) has the same configuration as that of an inductor. The characteristic impedance $Z_C$ is defined as in (3.11), and its incident pulse is updated by

$${}_{n+1}v_{km}^{i} = {}_{n}v_{km}^{r}, \tag{3.13}$$

since the far end of the transmission line is open-circuited.

Figure 3.4: Lossless transmission line and its TLM-link model.
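As a small worked case, the sketch below charges a capacitor from a DC source through a series resistor, with the capacitor represented by its TLM stub. The circuit, the function name `rc_charge_tlm`, and the zero initial pulse (an uncharged capacitor) are assumptions chosen for illustration.

```c
#include <assert.h>

/* Charge a capacitor c from a DC source vs through a series resistor r,
 * with the capacitor replaced by its TLM stub of impedance Z_C = dt/(2C).
 * Returns the capacitor voltage after n_steps time-steps. */
double rc_charge_tlm(double vs, double r, double c, double dt, int n_steps)
{
    assert(r > 0.0 && c > 0.0 && dt > 0.0 && n_steps > 0);
    double zc = dt / (2.0 * c);  /* stub impedance, as in (3.11) */
    double vi = 0.0;             /* incident pulse of an uncharged capacitor */
    double v = 0.0;
    for (int n = 0; n < n_steps; ++n) {
        /* Series loop: source vs, resistor r, stub (zc behind source 2*vi). */
        double i = (vs - 2.0 * vi) / (r + zc);
        v = i * zc + 2.0 * vi;   /* terminal voltage, same form as (3.8) */
        double vr = v - vi;      /* reflected pulse, as in (3.7) */
        vi = vr;                 /* (3.13): open-circuited far end */
    }
    return v;
}
```

With a 10 V source, r = 1 kΩ, c = 1 µF, and Δt = 1 µs, the voltage after 1000 steps is close to the analytical one-time-constant value of about 6.32 V.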

### 3.2.4 TLM-Link

In addition to the stub model introduced above for replacing reactive components in EMT simulation, the TLM-link is the other main type of lossless transmission line model. It is a two-port model that has typically been used to decouple a large circuit into a few small sub-circuits, which reduces the sizes of the impedance and admittance matrices and saves calculation time. It is shown in Fig. 3.4, where $v_k(t)$, $i_k(t)$, $v_m(t)$ and $i_m(t)$ are the time-domain voltages and currents at terminals *k* and *m*, and $Z_0$ represents the line characteristic impedance, defined as $\sqrt{L/C}$ Ω, where *L* and *C* are the inductance and capacitance of the line, respectively.

The discretized hybrid Thévenin-Norton equivalent circuit is shown on the right-hand side of Fig. 3.4 for demonstration. In digital simulation, the two discretized pulses are linked between the present time-step *n* and the next time-step *n*+1, that is, ${}_{n+1}v_k^i = {}_{n}v_m^r$ and ${}_{n+1}v_m^i = {}_{n}v_k^r$, assuming it takes one time-step for pulses to travel from one terminal to the other. The following two equations are then valid at both terminals:

$${}_{n}v_{(k,m)} = {}_{n}i_{(k,m)} \cdot Z_0 + 2 \cdot {}_{n}v_{(k,m)}^i, \tag{3.14}$$

$${}_{n}v^{r}_{(k,m)} = {}_{n}v_{(k,m)} - {}_{n}v^{i}_{(k,m)}. \tag{3.15}$$

Therefore, with discretization of the link, a circuit can be divided into two parts. In digital simulation, these two sub-circuits are independent when solving their respective matrix equations; the only connection between them is the update of the incident pulses on both sides, which takes place after the reflected pulses are obtained at the end of each time-step.
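The decoupling can be illustrated with a deliberately small example: a DC source behind a resistor on side *k* and a load resistor on side *m*, joined only by a TLM link. The circuit, the function name `tlm_link_demo`, and the zero initial pulses are assumptions; the characteristic impedance is passed in directly rather than computed from the line parameters.

```c
#include <assert.h>

/* Two sub-circuits decoupled by a TLM link, following (3.14)-(3.15).
 * Side k: DC source vs behind resistor rs. Side m: load resistor rl.
 * Returns the load-side voltage v_m after n_steps time-steps. */
double tlm_link_demo(double vs, double rs, double rl, double z0, int n_steps)
{
    assert(rs > 0.0 && rl > 0.0 && z0 > 0.0 && n_steps > 0);
    double vik = 0.0, vim = 0.0;  /* incident pulses at terminals k and m */
    double vm = 0.0;
    for (int n = 0; n < n_steps; ++n) {
        /* Each side sees the link as a source 2*v^i behind z0, (3.14),
         * so the two nodal equations are solved independently. */
        double vk = (vs / rs + 2.0 * vik / z0) / (1.0 / rs + 1.0 / z0);
        vm = (2.0 * vim / z0) / (1.0 / rl + 1.0 / z0);
        double vrk = vk - vik;    /* (3.15) at terminal k */
        double vrm = vm - vim;    /* (3.15) at terminal m */
        vik = vrm;                /* pulses arrive one time-step later */
        vim = vrk;
    }
    return vm;
}
```

In this sketch the two nodal solutions inside the loop never use each other's present-step values, which is what permits them to run in parallel. The load voltage settles at the resistive-divider value vs·rl/(rs + rl).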



Figure 3.5: MMC configuration and its half-bridge submodule models.

### 3.3 Power Semiconductor Switch-Based MMC Modeling

The configuration of the MMC is given in Fig. 3.5, which also shows two half-bridge submodule (HBSM) models: the device-level model employing nonlinear IGBT/diode models, and the ideal model, which represents the switches as two-state resistors with distinct off- and on-state resistances. The difference between the various MMC EMT models is therefore largely caused by the modeling of the IGBT/diode, which is carried out in this section.

The ideal-switch-based DEM has been validated in EMT tools and is prevalent because it achieves a faster simulation speed than traditional models [33]. It is based on the following equations:

$$R_{eq} = \frac{R_1 R_2 + R_2 Z_{Ck}}{R_1 + R_2 + Z_{Ck}}, \tag{3.16}$$

$$V_{eq}(t - \Delta t) = \frac{R_2 V_{Ceqk}(t - \Delta t)}{R_1 + R_2 + Z_{Ck}}, \tag{3.17}$$

where $Z_{Ck}$ and $V_{Ceqk}$ compose the Thévenin equivalent circuit of the submodule capacitor, $R_1$ and $R_2$ are the resistances of the two complementary switches, and $\Delta t$ is the simulation time-step. However, this model lacks device nonlinearities and is only suitable for a preview of system performance. In contrast, nonlinear switch models are highly inefficient for CPU simulation and usually require a large amount of hardware resources for FPGA implementation due to the iterative nature of their solution. Thus, a trade-off can be made by constructing a hybrid MMC arm model that features device-level details and computational efficiency simultaneously.
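Equations (3.16) and (3.17) can be evaluated per submodule as in the following sketch. From the form of the equations, $R_1$ is taken here as the switch in series with the capacitor and $R_2$ as the switch across the submodule terminals; the type `SmThevenin` and the function name are hypothetical.

```c
#include <assert.h>

typedef struct {
    double req;  /* Thevenin resistance of the submodule, (3.16) */
    double veq;  /* Thevenin voltage of the submodule, (3.17) */
} SmThevenin;

/* r1: two-state resistance of the switch in series with the capacitor.
 * r2: two-state resistance of the switch across the submodule terminals.
 * zc, vceq: Thevenin equivalent of the discretized submodule capacitor. */
SmThevenin sm_thevenin(double r1, double r2, double zc, double vceq)
{
    assert(r1 > 0.0 && r2 > 0.0 && zc > 0.0);
    SmThevenin t;
    double den = r1 + r2 + zc;
    t.req = (r1 * r2 + r2 * zc) / den;
    t.veq = r2 * vceq / den;
    return t;
}
```

With illustrative on- and off-state resistances of 1 mΩ and 1 MΩ, an inserted submodule (r1 on, r2 off) yields approximately the capacitor branch, i.e., `req` close to r1 + zc and `veq` close to `vceq`, whereas a bypassed submodule (r1 off, r2 on) yields `req` close to the on-state resistance and `veq` close to zero.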

### 3.3.1 IGBT/Diode Curve-Fitting Model

Table 3.1 gives the four states that a normally operated HBSM undergoes. It indicates that the conduction condition of each IGBT and diode is unique, so it is easy to pin down which of them is operating at any given time. For example, a positive arm current implies that either the upper diode or the lower IGBT is conducting, and combined with the value of $V_g$, the conducting device can be determined unambiguously. Based on that principle, and since a considerable number of modern IGBT modules integrate both devices, modeling the pair as one switch that exhibits the corresponding features according to the gate voltage and the arm current direction reduces the number of meshes in the submodule and consequently shortens the MMC model calculation time.
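The selection logic described above can be sketched as a small decision function. The gate convention is an assumption: `vg_upper = 1` is taken to mean that the upper IGBT is gated on and the lower one off, and a positive arm current is taken to be the direction for which the upper diode or the lower IGBT conducts, as stated in the text. The remaining two cases follow from the usual half-bridge operation.

```c
#include <assert.h>

typedef enum { UPPER_IGBT, UPPER_DIODE, LOWER_IGBT, LOWER_DIODE } Device;

/* Hypothetical convention: vg_upper = 1 gates the upper IGBT on and the
 * lower IGBT off; vg_upper = 0 is the complementary state. */
Device conducting_device(int vg_upper, double i_arm)
{
    assert(vg_upper == 0 || vg_upper == 1);
    if (i_arm >= 0.0) {
        /* Positive arm current: upper diode or lower IGBT. */
        return vg_upper ? UPPER_DIODE : LOWER_IGBT;
    }
    /* Negative arm current: upper IGBT or lower diode. */
    return vg_upper ? UPPER_IGBT : LOWER_DIODE;
}
```

Only one device is returned for each combination, which is what allows the IGBT/diode pair to be modeled as a single switch.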





To establish an accurate model, both the nonlinear static and dynamic characteristics are required. Most of this information is readily available in the device datasheet, whose parameters are extracted from an experimental setup, meaning that factors such as the stray inductance caused by the IGBT are also accounted for and reflected in the data. The static characteristic of the IGBT is shown in Fig. 3.6(a), from which its steady-state voltage drop can be obtained according to the gate voltage and the collector current, and therefore the power consumption can be calculated. However, the $I_C - V_{CE}$ curves are nonlinear, and representing them by nonlinear or polynomial functions would lead to a long hardware latency in the FPGA implementation. Considering that it is not necessary to reproduce the $I_C - V_{CE}$ values with full precision, the nonlinear curve is divided into several segments, each of which is treated as a straight line, so that the collector-emitter voltage in a certain segment takes the form of

$$V_{CE} = r_0 \cdot I_C + v_{ceo}, \tag{3.18}$$

where the constants $r_0$ and $v_{ceo}$ are obtained by linearization and their values differ from segment to segment. Equation (3.18) describes a linear resistance within a segment. Hence, a piecewise linear resistor that consists of all the linear resistances can be used to replace the IGBT, and a general expression for the resistance of any segment can be written as

$$r = \frac{V_{CE}}{I_C} = r_0 + \frac{v_{ceo}}{I_C}. \tag{3.19}$$

Figure 3.6: The behavior of IGBT and diode: (a) IGBT static I - V characteristics and switching transient waveforms, and (b) diode static I - V characteristics and reverse recovery process.
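A piecewise linear lookup based on (3.18) and (3.19) might be organized as below. The three segments and all their values are invented for illustration and were chosen only so that the curve is continuous at the breakpoints; real values would be read from the datasheet curve for the chosen gate voltage.

```c
#include <assert.h>

/* One linear segment of the static I_C-V_CE curve, valid for collector
 * currents up to i_max: V_CE = r0 * I_C + vceo, (3.18). */
typedef struct {
    double i_max;  /* upper current bound of the segment (A) */
    double r0;     /* slope resistance (ohm) */
    double vceo;   /* voltage-axis intercept (V) */
} Segment;

/* Hypothetical three-segment fit, not taken from any datasheet. */
static const Segment kSegments[] = {
    {  100.0, 8.0e-3, 0.80 },
    {  500.0, 3.0e-3, 1.30 },
    { 1500.0, 1.5e-3, 2.05 },
};
static const int kNumSegments = (int)(sizeof(kSegments) / sizeof(kSegments[0]));

/* Collector-emitter voltage for a given collector current, (3.18). */
double igbt_vce(double ic)
{
    assert(ic > 0.0);
    int s = 0;
    while (s < kNumSegments - 1 && ic > kSegments[s].i_max) {
        ++s;  /* move to the segment that contains ic */
    }
    return kSegments[s].r0 * ic + kSegments[s].vceo;
}

/* Equivalent resistance of the active segment, (3.19). */
double igbt_resistance(double ic)
{
    return igbt_vce(ic) / ic;
}
```

Only a comparison chain, one multiplication, and one addition are needed per evaluation, which is the motivation given above for preferring straight-line segments in hardware.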

The typical terminal voltage and current waveforms of the IGBT are also shown in Fig. 3.6(a); they can be obtained from SaberRD<sup>®</sup> simulation of a circuit with a topology similar to that of the MMC submodule. SaberRD<sup>®</sup> simulation shows that the shapes of $v_{CE}$ and $i_C$ are virtually the same at different levels of the MMC. Therefore, curve-fitting is used and these shapes are applied to the IGBTs in the MMC, meaning that the proportions of the IGBT terminal voltage and current during the turn-off and turn-on stages relative to their steady-state values are known. The $v_{CE}$ value in the off-state can be deemed equal to the submodule capacitor voltage $V_C$, and therefore the transient voltage values can be calculated directly. With regard to the current, the turn-off curve can be easily obtained because its trend is certain: it drops from the steady-state current to a final value of zero at a known rate. However, the final value of the turn-on stage is unavailable until the IGBT enters steady state, so it is difficult to determine the current surge during the turn-on stage in advance. The MMC arm current provides a solution, noting that its absolute value can be deemed equal to the steady-state IGBT current. The voltage and current during the transient stage then take the form of

$$v_{CE}(t + \Delta t_1) = x\% \cdot V_C, \tag{3.20}$$

$$i_C(t + \Delta t_1) = k(t + \Delta t_1) \cdot |i_{u,d}|, \tag{3.21}$$

where $\Delta t_1$ is the time-step used to mark the transient process, and x% and $k(t + \Delta t_1)$ are coefficients determined by the shapes of $v_{CE}$ and $i_C$, respectively. Obviously, the smaller $\Delta t_1$ is, the more precise the model becomes, as the transient stage of the IGBT and diode usually lasts only from several hundred nanoseconds to a few microseconds. It should be pointed out that the gate driver resistance affects the static and dynamic characteristics of the IGBT. However, its value is usually chosen from a small range within which the impact of gate resistance variation is small, and the typical value of 10 $\Omega$ is chosen since a larger resistor leads to a longer dynamic process.
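In an implementation, the coefficients in (3.20) and (3.21) would naturally be stored as lookup tables indexed by the sub-step within the transient. The following sketch shows a turn-off case; the number of sub-steps and all coefficient values are invented for illustration and do not come from the fitted waveforms.

```c
#include <assert.h>

#define N_SUBSTEPS 5  /* hypothetical number of sub-steps in the turn-off transient */

/* Hypothetical normalized turn-off shapes, one entry per sub-step:
 * kVceShape[n] is v_CE as a fraction of V_C (the x% coefficient),
 * kIcShape[n] is i_C as a fraction of the arm current magnitude (k). */
static const double kVceShape[N_SUBSTEPS] = { 0.10, 0.45, 0.85, 1.00, 1.00 };
static const double kIcShape[N_SUBSTEPS]  = { 1.00, 0.90, 0.55, 0.20, 0.00 };

/* Collector-emitter voltage at sub-step n of the turn-off transient, (3.20). */
double vce_turn_off(int n, double v_c)
{
    assert(n >= 0 && n < N_SUBSTEPS);
    return kVceShape[n] * v_c;
}

/* Collector current at sub-step n of the turn-off transient, (3.21). */
double ic_turn_off(int n, double i_arm)
{
    assert(n >= 0 && n < N_SUBSTEPS);
    double i_abs = (i_arm < 0.0) ? -i_arm : i_arm;  /* |i_{u,d}| */
    return kIcShape[n] * i_abs;
}
```

Each evaluation is a table read and one multiplication, so the transient waveform is reproduced without solving a nonlinear device equation.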

Fig. 3.6(b) shows the nonlinear static and dynamic characteristics of the anti-parallel diode. Forward conduction and reverse recovery are the important phenomena since they account for the majority of the power loss. The exponential static curve of the diode is linearized in a similar fashion to that of the IGBT. Hence it has the same form as (3.19) and is shown in Fig. 3.7(a), where the nodes are denoted by those of the IGBT since the diode shares its terminals. During the diode reverse recovery process, the terminal voltage is still treated as a controlled voltage source, whereas the current is a time-dependent current source that is proportional to the static current. For example, the peak value of the reverse current is set equal to the value just prior to the process, and the coefficients $k_1$, $k_2$, $k_3$ decrease successively to represent the tail current. Therefore, the transient model for the IGBT and diode can be unified as a combination of a current-controlled current source and a voltage-controlled voltage source, as shown in Fig. 3.7(b).



Figure 3.7: Unified IGBT/diode pair behavioral model for (a) static characteristics, and (b) dynamic features

# 3.4 Fine-Grained MMC Partitioning Schemes

The introduction of aforementioned nonlinear switch models leads to a more complicated MMC network, for which the submodule merging approach for DEM is not instantly feasible. Meanwhile, the many submodules connecting to each other introduce plenty of meshes and nodes, making direct computation of the converter impractical. As introduced, circuit partitioning is an effective method in reducing the dimension of circuit's matrix equation. Based on the fundamental principle that the section to be split should have a stiff voltage or current, and the fact that the complexity of MMC model is caused by its power semiconductor switches in the SM, the MMC arm turns out to be the ideal partitioning interface to create a group of independent sub-circuits. Therefore, the original large admittance matrix for the MMC is split into a number of smaller matrices and parallel computation can be achieved on the FPGA to accelerate HIL emulation.

### 3.4.1 TLM-Link Partitioning

As the three-phase MMC is symmetrical, it is reasonable to carry out analysis based on one phase. Fig. 3.8(a) shows the process of splitting the large MMC network consisting of a considerable number of nodes and meshes into several structurally independent, electrically related sub-circuits. The arm inductor is first divided into (N+1) parts which are redistributed so that a new inductor  $\delta L$  is connected in series with each submodule to constitute a two-port network, and consequently the remaining inductance for the arm inductor is  $L_{u,d}$ - $N \cdot \delta L$ . Then, these new inductors are replaced by TLM links, discretization of which leads to the separation of submodules from the rest of the converter (MMC main circuit), enabling the replacement of the originally large impedance or admittance matrix by a number of sub-matrices with smaller dimensions, which, if processed in parallel on the FPGA, would be much more time and resource efficient.

The selection of the value of $\delta L$, which decides the characteristic impedance $Z_0$ and vice versa, plays a significant role in the emulation results. The principle, as stated, is that an



Figure 3.8: TLM-based model for (N+1)-level MMC: (a) MMC partitioning approach, and (b) discretized schematic for the overall system.

appropriate value of  $\delta L$  should lead to a tiny current change in the inductor within each time-step [78]. Thus, the optimum value can be picked from its range by running Matlab simulation of the MMC and comparing the current changes. It shows that the final value of  $\delta L$  is negligible compared with the arm inductance so that the latter can still be deemed as  $L_{u,d}$ .

For the MMC main circuit, where the Norton equivalent circuit part of the TLM link's hybrid model is located, merging all the Norton circuits in the upper and lower arms respectively leaves only one unknown node, since the potentials at the other three nodes are known, as shown in Fig. 3.8(b), the overall schematic for the MMC. Then, the nodal voltage equation at the $n^{th}$ time-step can be derived by applying Kirchhoff's Current Law (KCL),

$${}_{n}\mathbf{V}_{s}=\mathbf{G}^{-1}\cdot{}_{n}\mathbf{J},\tag{3.22}$$

where these  $1 \times 1$  matrices are

$$\mathbf{G} = \begin{bmatrix} \frac{2}{N \cdot Z_0 + Z_{Lu,d}} \end{bmatrix},\tag{3.23}$$

$${}_{n}\mathbf{J} = \left[\frac{{}_{n}J_{u}^{\Sigma} - {}_{n}J_{d}^{\Sigma}}{\Sigma Z} - {}_{n}i_{s}\right] = \left[2 \cdot \frac{\sum_{j=N+1}^{2N} {}_{n}v_{mj}^{i} + {}_{n}v_{Ld}^{i} - \left(\sum_{j=1}^{N} {}_{n}v_{mj}^{i} + {}_{n}v_{Lu}^{i}\right)}{N \cdot Z_{0} + Z_{Lu,d}} - {}_{n}i_{s}\right]. \tag{3.24}$$

In (3.24), $\Sigma Z$, ${}_{n}J_{u}^{\Sigma}$ and ${}_{n}J_{d}^{\Sigma}$ are the impedance and current sources of the Norton equivalent circuit of an arm. The one-element voltage vector ${}_{n}\mathbf{V}_{s}$ is numerically equal to the stator voltage ${}_{n}v_{s}$. Also, it should be pointed out that all variables are kept constant for a whole time-step. Prior to calculating the reflected pulses for the next time-step by using (3.14) and (3.15), the upper and lower arm currents should be updated based on the obtained nodal voltages, as briefly expressed by:

$$\begin{bmatrix} {}_{n}i_{u} \\ {}_{n}i_{d} \end{bmatrix} = \begin{bmatrix} \frac{1}{\Sigma Z} & \frac{-1}{\Sigma Z} \\ \frac{1}{\Sigma Z} & \frac{1}{\Sigma Z} \end{bmatrix} \cdot \begin{bmatrix} V_{dc} \\ {}_{n}v_{s} \end{bmatrix} + \begin{bmatrix} -{}_{n}J_{u}^{\Sigma} \\ -{}_{n}J_{d}^{\Sigma} \end{bmatrix}. \tag{3.25}$$
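The single-node solution of (3.22) to (3.24) can be sketched as follows, using the explicit right-hand form of (3.24). The function name `solve_main_node`, its argument names and the numerical values in the usage line are assumptions made for illustration only; the arm current update of (3.25) is not included.

```python
def solve_main_node(v_inc_upper, v_inc_lower, v_lu_inc, v_ld_inc, i_s, z0, z_l):
    """Return the node voltage v_s of the MMC main circuit per (3.22)-(3.24).

    v_inc_upper, v_inc_lower: incident pulses of the N TLM links in each arm.
    v_lu_inc, v_ld_inc: incident pulses of the remaining arm inductors.
    i_s: current drawn at the output node; z0, z_l: link and arm impedances.
    """
    n = len(v_inc_upper)
    if len(v_inc_lower) != n:
        raise ValueError("both arms must contain the same number of links")
    sum_z = n * z0 + z_l                       # N*Z0 + Z_Lu,d
    g = 2.0 / sum_z                            # 1x1 conductance matrix (3.23)
    upper = sum(v_inc_upper) + v_lu_inc
    lower = sum(v_inc_lower) + v_ld_inc
    j = 2.0 * (lower - upper) / sum_z - i_s    # 1x1 source vector (3.24)
    return j / g                               # v_s = G^-1 * J (3.22)


# Illustrative values only: N = 4 links per arm.
v_s = solve_main_node([2.0] * 4, [2.5] * 4, 1.0, 1.0, i_s=0.5, z0=0.1, z_l=2.0)
print(f"v_s = {v_s:.3f} V")
```

Since the matrices are 1 x 1, the inversion reduces to a single division, which is what keeps the main circuit latency low on the FPGA.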

The 2N sub-circuits containing the Thévénin equivalent circuit part of the TLM link are identical. When the CFM is adopted for the IGBT/diode, all SMs have two meshes, so according to Kirchhoff's Voltage Law (KVL), the mesh current equations under steady-state can be written uniformly as

$$\begin{bmatrix} {}_{n}i_{l1} \\ {}_{n}i_{l2} \end{bmatrix} = \begin{bmatrix} Z_C + r_1 + r_2 & -r_2 \\ -r_2 & r_2 + Z_0 \end{bmatrix}^{-1} \begin{bmatrix} 2 \cdot {}_{n}v_{Cj}^{i} \\ -2 \cdot {}_{n}v_{kj}^{i} \end{bmatrix}, \tag{3.26}$$

where ${}_{n}i_{l1}$ and ${}_{n}i_{l2}$ are the mesh currents, and $Z_{C}$ is the characteristic impedance of the DC capacitor. On the other hand, when the transient stage takes place, discretization of (3.21) leads to

$${}_{n}i_{C1,2} = {}_{n}k \cdot |{}_{n}i_{u,d}|, \tag{3.27}$$

from which it is convenient to calculate the mesh currents by

$${}_{n}i_{l1} = {}_{n}i_{C1}, \tag{3.28}$$

$${}_{n}i_{l2} = {}_{n}i_{C1} - {}_{n}i_{C2}. \tag{3.29}$$

Then, the voltage across each Thévénin equivalent circuit, regardless of the stage, can be calculated by

$$\begin{bmatrix} {}_{n}v_{Cj} \\ {}_{n}v_{kj} \end{bmatrix} = \begin{bmatrix} -Z_C & 0 \\ 0 & Z_0 \end{bmatrix} \cdot \begin{bmatrix} {}_{n}i_{l1} \\ {}_{n}i_{l2} \end{bmatrix} + \begin{bmatrix} 2 \cdot {}_{n}v_{Cj}^{i} \\ 2 \cdot {}_{n}v_{kj}^{i} \end{bmatrix}. \tag{3.30}$$

Thus, calculation of reflected pulses can be carried out by substituting the acquired terminal voltages into (3.15), and the time-step ends with updating the incident pulses.
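A minimal sketch of the steady-state submodule solution is given below: the 2 x 2 system of (3.26) is solved by Cramer's rule and the terminal voltages follow from (3.30). The function name `submodule_steady_state` and the numerical values are assumptions for illustration; the resistances $r_1$ and $r_2$ stand for the linearized switch branches.

```python
def submodule_steady_state(v_c_inc, v_k_inc, z_c, z0, r1, r2):
    """Mesh currents from (3.26) and terminal voltages from (3.30)."""
    # Coefficient matrix and right-hand side of (3.26).
    a11, a12 = z_c + r1 + r2, -r2
    a21, a22 = -r2, r2 + z0
    b1, b2 = 2.0 * v_c_inc, -2.0 * v_k_inc

    # The determinant is positive for positive impedances and resistances.
    det = a11 * a22 - a12 * a21
    i_l1 = (b1 * a22 - a12 * b2) / det
    i_l2 = (a11 * b2 - a21 * b1) / det

    # Voltages across the two Thevenin equivalent circuits, as in (3.30).
    v_c = -z_c * i_l1 + 2.0 * v_c_inc
    v_k = z0 * i_l2 + 2.0 * v_k_inc
    return i_l1, i_l2, v_c, v_k


# Illustrative values only.
print(submodule_steady_state(v_c_inc=2.0, v_k_inc=1.0, z_c=1.0, z0=1.0, r1=1.0, r2=2.0))
```

During the transient stage, the mesh currents would instead be taken from (3.27) to (3.29), while the voltage update of (3.30) stays unchanged.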

# 3.5 Hardware Design on FPGA

In this section, the MMC with curve-fitting model is applied to drive an induction machine (IM). The hardware design on FPGA is first carried out, followed by real-time emulation results demonstration, analysis, and validation in the next chapter.

### 3.5.1 Hardware Platform

The hardware design of the MMC-IM system was carried out on the Xilinx<sup>®</sup> XC7VX485T FPGA, which includes 303600 look-up tables (LUTs), 607200 flip-flops (FFs), 2800 DSPs and 2060 block RAMs (BRAMs). Table 3.2 lists an estimation of hardware utilization when different levels of MMCs are implemented on two types of FPGA devices, and the maximum operational frequency $f_{max}$ of each design is also shown. A higher operational frequency gives a larger speed margin for a certain time-step, but the chip power dissipation increases along with it; on the contrary, a lower frequency leads to less power dissipation but the design may fail to attain real-time execution. Therefore, a trade-off is made and the operational frequency of 100 MHz is chosen, with the corresponding FPGA clock period $T_{clk}$ being 10 ns.

| FPGA      | System      | LUT           | FF            | DSP           | $f_{max}$ (MHz) |
|-----------|-------------|---------------|---------------|---------------|-----------------|
| XC7VX485T | MMC5 (3ph)  | 233K (76.74%) | 134K (22.08%) | 966 (34.50%)  | 116             |
| XC7VX485T | MMC7 (1ph)  | 114K (37.50%) | 64K (10.57%)  | 490 (17.50%)  | 115             |
| XC7VX485T | MMC11 (1ph) | 168K (55.38%) | 109K (18.03%) | 901 (32.18%)  | 116             |
| XC7VX485T | MMC5-IM     | 249K (82.06%) | 143K (23.62%) | 1155 (41.25%) | 115             |
| XC7V2000T | MMC5 (3ph)  | 233K (19.08%) | 134K (5.49%)  | 966 (44.73%)  | 125             |
| XC7V2000T | MMC7 (3ph)  | 350K (28.67%) | 194K (7.94%)  | 1530 (70.83%) | 121             |
| XC7V2000T | MMC11 (1ph) | 168K (13.80%) | 111K (4.53%)  | 901 (41.71%)  | 121             |
| XC7V2000T | MMC5-IM     | 250K (20.51%) | 143K (5.87%)  | 1155 (53.47%) | 125             |

Table 3.2: Hardware utilization of the MMC-IM system

The hardware resources of the XC7VX485T are sufficient for running a single-phase 11-level MMC but fall short of driving the induction machine with even a 7-level MMC due to a lack of LUTs. As can be seen from the table, the LUT demand for one phase of the 7-level MMC accounts for 37.50% and will exceed the total available resources if the size triples. This can be avoided if the design is deployed to another FPGA device with abundant LUTs like the XC7V2000T, although it has fewer DSPs, which become the limiting resource for implementing the three-phase 11-level MMC, as shown in the same table.
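The resource argument above can be checked with a short sketch that triples the single-phase utilization figures of Table 3.2. Linear scaling with the number of phases is an assumption made here for a first estimate, and the helper name `three_phase_bottlenecks` is hypothetical.

```python
# Single-phase utilization percentages taken from Table 3.2.
SINGLE_PHASE = {
    ("XC7VX485T", "MMC7"): {"LUT": 37.50, "FF": 10.57, "DSP": 17.50},
    ("XC7VX485T", "MMC11"): {"LUT": 55.38, "FF": 18.03, "DSP": 32.18},
    ("XC7V2000T", "MMC11"): {"LUT": 13.80, "FF": 4.53, "DSP": 41.71},
}


def three_phase_bottlenecks(usage, phases=3):
    """Return the resources whose linearly scaled utilization exceeds 100%."""
    return {name: pct * phases for name, pct in usage.items() if pct * phases > 100.0}


for (device, system), usage in SINGLE_PHASE.items():
    over = three_phase_bottlenecks(usage)
    print(f"{system} (3ph) on {device}: over budget in {sorted(over) or 'nothing'}")
```

Under this assumption the LUTs are the first resource to run out on the XC7VX485T, whereas the DSPs are the first to run out on the XC7V2000T for the 11-level converter.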

### 3.5.2 Controller Emulation

For the MMC-IM system, the control section is twofold, referred to as the MMC inner control and induction machine outer control respectively. The former is in charge of the



Figure 3.9: Control algorithm for the MMC-IM system.

DC capacitor voltages of submodules, and the latter regulates the induction machine's angular velocity. Detailed control algorithms for MMC and the induction machine have been separately developed [129–132], and their relations in HIL emulation are shown in Fig. 3.9. Three-phase stator currents  $i_{sa}$ ,  $i_{sb}$  and  $i_{sc}$  as well as  $\omega_m$  and its reference  $\omega_m^*$  are the inputs for the outer controller, which produces three-phase modulation signals  $v_{abc}^*$  and sends them to the inner controller as its inputs. Then the three-phase MMC inner controller generates driving pulses to control the switches.

As can be seen, regardless of the conditions of surrounding devices such as the induction machine, the hardware latency of the outer controller is restricted to a narrow range between 379 and 382 $T_{clk}$, while the latency of the inner controller is a logarithmic function of the number of submodules in a leg due to the averaging of DC capacitor voltages, and the hardware delay is

$$L_{inner} = (T_{adder} \cdot \lceil \log_2(2N) \rceil + 40) \cdot T_{clk}, \qquad (3.31)$$

where $T_{adder}$ is the latency of the adder, which takes four clock cycles for single-precision numbers, and the ceiling function is equivalent to rounding $2N$ up to the next power of two. Hence, for the 5-level MMC that has 8 submodules in a leg, the controller latency is 52 $T_{clk}$, or a 520 ns time delay, slightly over the time-step of 500 ns for the MMC circuit, and for the 7-level MMC this delay increases to 560 ns, so using the same time-step would prevent real-time execution. The solution is to utilize multiple time-steps for the subsystems: 1 $\mu$s for the MMC inner controller and 4 $\mu$s for the IM outer controller. Theoretically, with this time-step setting, the inner controller is able to deal with MMCs with thousands of levels for real-time HIL emulation purposes, but in reality the number of voltage levels is restricted by hardware resources.
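The latency expression (3.31) can be evaluated with the short sketch below, which reproduces the 52 and 56 clock cycles quoted above and estimates the largest arm that still fits a 1 $\mu$s time-step. The function names are hypothetical, and hardware resource limits are deliberately ignored.

```python
T_ADDER = 4      # adder latency in clock cycles (single precision)
T_CLK_NS = 10    # clock period at 100 MHz, in nanoseconds


def inner_latency_cycles(n_sm_per_arm):
    """Clock cycles from (3.31) for N submodules per arm (2N per leg)."""
    stages = (2 * n_sm_per_arm - 1).bit_length()   # ceil(log2(2N)) for N >= 1
    return T_ADDER * stages + 40


def max_sm_per_arm(time_step_ns):
    """Largest N whose inner-controller latency fits within the time-step."""
    budget = time_step_ns // T_CLK_NS
    stages = (budget - 40) // T_ADDER
    if stages < 1:
        return 0
    return 2 ** stages // 2                        # 2N <= 2**stages


for n in (4, 6):
    cycles = inner_latency_cycles(n)
    print(f"N = {n}: {cycles} T_clk = {cycles * T_CLK_NS} ns")
print("largest N for a 1 us step:", max_sm_per_arm(1000))
```

The adder tree depth grows only with the logarithm of the submodule count, which is why the latency budget alone would permit thousands of levels.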

### 3.5.3 MMC Emulation on FPGA

Table 3.3 is a summary of the latencies of each hardware module in the 5-level MMC-IM system and the emulation time-steps for these subsystems. Based on the update frequency of variables, the whole system is dispatched to three layers, each satisfying the following criterion that ensures real-time,

$$T_{clk} \cdot max\{L_1^i, L_2^i, \dots L_n^i\} \le \Delta t_i, \tag{3.32}$$

where  $L_1^i, L_2^i, ..., L_n^i$  are the latencies of hardware modules that the *i*<sup>th</sup> layer with the timestep  $\Delta t_i$  contains.

| Hardware Module   | Maximum Latency | Time-step                   | Layer   |
|-------------------|-----------------|-----------------------------|---------|
| MMC main circuit  | 37 $T_{clk}$    | $\Delta t_1$ = 0.5 $\mu$s   | Layer 1 |
| Induction machine | 41 $T_{clk}$    | $\Delta t_1$ = 0.5 $\mu$s   | Layer 1 |
| Submodule         | 37 $T_{clk}$    | $\Delta t_1$ = 0.5 $\mu$s   | Layer 1 |
| MMC controller    | 52 $T_{clk}$    | $\Delta t_2$ = 1.0 $\mu$s   | Layer 2 |
| IM controller     | 382 $T_{clk}$   | $\Delta t_3$ = 4.0 $\mu$s   | Layer 3 |

Table 3.3: Latencies of different hardware modules in the 5-level MMC-IM system
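The real-time criterion (3.32) can be checked against the latencies of Table 3.3 with a few lines. The dictionary layout and the function name `real_time_margin_ns` are illustrative assumptions.

```python
T_CLK_NS = 10   # clock period at 100 MHz, in nanoseconds

# Module latencies in clock cycles and layer time-steps from Table 3.3.
LAYERS = {
    "Layer 1": {"dt_ns": 500, "latencies": [37, 41, 37]},
    "Layer 2": {"dt_ns": 1000, "latencies": [52]},
    "Layer 3": {"dt_ns": 4000, "latencies": [382]},
}


def real_time_margin_ns(layer):
    """Time-step minus the slowest module; non-negative means (3.32) holds."""
    return layer["dt_ns"] - T_CLK_NS * max(layer["latencies"])


for name, layer in LAYERS.items():
    print(f"{name}: margin = {real_time_margin_ns(layer)} ns")
```

A positive margin in every layer indicates that each layer finishes its computation before its own time-step elapses.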

In order to run the 5-level MMC HIL emulation in real-time, the time-step for *Layer 1* should be close to 370 ns if the induction machine is not taken into account. According to the device datasheet, this minimum time-step is approximately the rise/fall time of the selected Infineon<sup>®</sup> IGBT FZ400R33KL2C\_B5 ($V_{CES}$=3300 V, $I_C$=400 A) when its gate resistor is 10 $\Omega$. This means that under these circumstances, a maximum of two values can be captured during the rise/fall process and that section of the switching curve is straightened. On the other hand, the transient process is not limited to the aforementioned region and there are other sections of the curves that extend beyond it; thus the time-step can be set a little larger, to 500 ns, and the voltage and current waveforms can be represented by piecewise linearized lines, one of which contains the rise/fall process.

Table 3.3 also shows that the induction machine has the largest latency in *Layer 1*. However, when the number of submodules increases, the MMC main circuit, as the only part whose latency is affected, begins to overtake the IM as the dominant factor determining real-time operation. The latency increment for the (M+1)-level MMC main circuit can be deduced from its (N+1)-level counterpart by

$$\Delta t_{N \to M} = \left( T_{adder} \cdot \left\lceil \log_2 \frac{M+1}{2^{\left\lceil \log_2(N+1) \right\rceil}} \right\rceil \right) \cdot T_{clk}. \tag{3.33}$$

Thus, the maximum number of levels that can be achieved for real-time HIL emulation with a 500 ns time-step is 64, when the latency of the MMC main circuit reaches 49 $T_{clk}$.
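The figure of 64 levels can be reproduced from (3.33) with the following sketch, which starts from the 37 $T_{clk}$ main circuit latency of the 5-level MMC and a budget of 50 clock cycles for the 500 ns time-step. The function names are hypothetical, and clamping the increment to zero when the new level count does not exceed the power-of-two base is an assumption.

```python
T_ADDER = 4   # adder latency in clock cycles


def ceil_log2(x):
    """Ceiling of log2(x) for a positive integer x."""
    return (x - 1).bit_length()


def latency_increment_cycles(levels_new, levels_base):
    """Extra clock cycles from (3.33) when going from (N+1) to (M+1) levels."""
    base = 1 << ceil_log2(levels_base)      # 2^ceil(log2(N+1))
    k = 0
    while (base << k) < levels_new:         # smallest k with base * 2^k >= M+1
        k += 1
    return T_ADDER * k


def max_real_time_levels(base_levels=5, base_latency=37, budget_cycles=50):
    """Largest level count whose main-circuit latency fits the cycle budget."""
    levels = base_levels
    while base_latency + latency_increment_cycles(levels + 1, base_levels) <= budget_cycles:
        levels += 1
    return levels


print(37 + latency_increment_cycles(64, 5), "T_clk at 64 levels")
print("maximum levels:", max_real_time_levels())
```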

The hardware structure and signal flow routes for the MMC-IM system are drawn in Fig. 3.10, where  $j^{th}$  submodule structure can be seen out of the total 2*N* submodules. There are two levels of parallelism in the design: layers with different time-steps run simultaneously, and all hardware modules within a certain layer are also in parallel. In *Layer* 1, after each time-step  $\Delta t_1$ , the MMC main circuit exchanges TLM link information with the submodules and updates the three-phase voltages for the induction machine, from



Figure 3.10: Hardware structure and signal flow diagram for the FPGA emulation of the MMC-IM system.

which stator currents are received. Then there are information exchanges between the layers. Data going to the IM outer controller will not take effect unless an entire time-step  $\Delta t_3$  ends and produces the three-phase modulation waves for the inner one. For *Layer 2*, since  $\Delta t_2$  is between two other time-steps, the values of modulation waves are kept constant for  $\frac{\Delta t_3}{\Delta t_2}$  cycles and the DC capacitor voltages  $V_C$  and arm currents from *Layer 1* can participate in the control only when a new time-step begins. One of the benefits with such a hardware design is that all external and internal signals as well as the hardware other than LUTs in the submodules will not change if a new piecewise linear switch model is established to replace the original one. Even if a more complex switch model such as physics-based model is introduced, the only alteration occurs within submodules, thus there is no necessity to redesign the hardware for other parts.

With regard to the specific structure of each hardware module, their corresponding functions are written in C/C++ in Xilinx<sup>®</sup> Vivado HLS<sup>®</sup>. In this hardware design there are five types of function blocks in total: the induction machine, the MMC main circuit and submodules, as well as the two controllers. Each is coded as an independent function in a separate program, whose inputs and outputs include all external signals of that block. Meanwhile, detailed mathematical as well as logic operations within a function block, such as those in Fig. 3.10, are represented by the programming language in a pipelined fashion. Although Vivado HLS<sup>®</sup> also has a pipeline directive option, which could further increase the maximum operational frequency of the designs, it was not used because the frequency improvement comes at the cost of more hardware resource utilization and 100 MHz was deemed sufficient to ensure real-time execution. By running C synthesis of the completed code and the RTL export operation that follows, an IP core, the hardware module corresponding to the function block, is generated. However, these modules are



Figure 3.11: Finite state machine of the overall MMC-IM system for hardware emulation.

yet to be linked with each other. This is realized by VHDL coding in the form of signal exchange that takes place at the end of every time-step, and so is the finite state machine that achieves the multi-layer design and decides the time sequence of each module.

Fig. 3.11 shows the relationship between different layers and how they cooperate to execute the entire MMC-IM system by finite state machine (FSM). It should be pointed out that the maximum latency in each layer is smaller than the corresponding time-step, meaning that the MMC-IM system will proceed *faster* than real-time. Therefore, a timer is introduced in *Layer 1* to achieve exact real-time: when it counts to $\Delta t_1$, its value is reset and the calculation for the next time-step begins. The command is also sent to the other two layers to enable their respective FSMs to enter a new stage, if they are already waiting. In *Layer 2*, the values of carriers are needed before the control starts, and near the end of each time-step, the carrier addresses are updated so that in the next time-step new values can be referred to. For the last layer, the operation is similar to that of the first layer, other than the fact that shifting to state *S*0 is controlled by the command from *Layer 1*. When the reset order is issued, the states in all three layers begin to circulate and the HIL emulation of the MMC-IM system is ongoing. Thus, by giving proper speed and torque orders through the input interface, the status of the overall system can be observed via the output interface.
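The timing relationship between the three layers can be illustrated with a simple software model in which a free-running 100 MHz counter triggers each layer whenever its time-step elapses. This is only a sketch of the multi-rate idea under that assumption, not the VHDL finite state machine of Fig. 3.11, and the function name `layer_update_counts` is hypothetical.

```python
def layer_update_counts(total_ns, dt_ns=(500, 1000, 4000), t_clk_ns=10):
    """Count how often each layer starts a new time-step within total_ns."""
    ticks_per_step = {dt: dt // t_clk_ns for dt in dt_ns}
    counts = {dt: 0 for dt in dt_ns}
    for tick in range(1, total_ns // t_clk_ns + 1):
        for dt in dt_ns:
            # A layer advances when the clock counter reaches a multiple of
            # its own time-step, mimicking the timer reset command.
            if tick % ticks_per_step[dt] == 0:
                counts[dt] += 1
    return counts


print(layer_update_counts(8000))   # two outer-controller steps
```

In two outer-controller periods, the MMC circuit layer is stepped sixteen times and the inner controller eight times, which is the ratio the modulation waves are held constant over.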

# 3.6 Real-Time Emulation Results

### 3.6.1 MMC

In this section, the functions of different levels of MMCs are tested with an *R*-*L* load. In the test, the DC line voltage is maintained at $2V_{dc}$=900 V, meaning that when the number of levels increases, the DC capacitor voltages decline accordingly. However, the values of other circuit components such as the arm inductance are not changed, as shown in Table 3.4, and the switching frequency is 2000 Hz. To validate the results from HIL emulation, SaberRD<sup>®</sup> simulations are also carried out with a maximum time-step of 500 ns to ensure that transient processes are recorded. The IGBT and diode models employed in the simulations are *igbt1\_3* and *dp1*.

| Parameter                      | Symbol          | Value                            |
|--------------------------------|-----------------|----------------------------------|
| (N+1)-level MMC parameters     |                 |                                  |
| Arm inductance                 | $L_{u,d}$       | 1 mH                             |
| MMC test load                  | R-L             | 5 $\Omega$ - 2 mH                |
| Submodule capacitance          | $C_{1-2N}$      | 6 mF                             |
| Submodule DC voltage           | $V_{C_{1-2N}}$  | $\frac{2V_{dc}}{N}$              |
| Induction machine parameters   |                 |                                  |
| Stator inductance              | $L_s$           | 35.5 mH                          |
| Rotor inductance               | $L_r$           | 35.5 mH                          |
| Magnetizing inductance         | $L_m$           | 34.7 mH                          |
| Stator resistance              | $R_s$           | 0.087 $\Omega$                   |
| Rotor resistance               | $R_r$           | 0.228 $\Omega$                   |
| Inertia                        | $J$             | 1.662 kg$\cdot$m$^2$             |
| Number of poles                | $P$             | 4                                |

Table 3.4: Parameters of MMC-IM system
In Fig. 3.12 specific system-level performances of the 5-level MMC and its 7-level counterpart are shown. Fig. 3.12(a) and (d) are the 60 Hz, single-phase output voltages of the 5-level and 7-level converter, respectively. As can be observed, the voltage waveform in the latter has two more levels than the former, but their peak values are virtually the same, both close to 430 V, and high symmetry is also observed. Moreover, voltage spectral analysis is carried out by the oscilloscope, which demonstrates that for the 5-level MMC, its output voltage harmonics mainly distribute around 8 kHz - 4 times higher than the switching frequency, while for the 7-level converter, the major harmonics center around 12 kHz. This



Figure 3.12: Comparison of performances of 5-level ((a), (b) and (c)) and 7-level ((d), (e) and (f)) MMC between real-time HIL emulation (top) and SaberRD<sup>®</sup> (bottom). (a) 5-level MMC output voltage, (b) arm currents, (c) DC voltage ripples of submodules in upper and lower arms, (d) 7-level MMC output voltage, and (e), (f) DC voltage ripples of submodules. Oscilloscope axes settings: (a), (d) x-axis 5 ms/div, y-axis 133.34 V/div ( $v_{out}$ ) and 66.67 V/div (FFT); (b) x-axis 5 ms/div, y-axis 13.333 A/div; (c), (e) and (f) x-axis 5 ms/div, y-axis 2.667 V/div.

phenomenon agrees with the theory that for an (N+1)-level MMC, the effective switching frequency is N times higher. The results are verified by SaberRD<sup>®</sup> simulations as they give identical waveforms. Fig. 3.12(b) demonstrates the upper and lower arm currents of the 5-level converter; the results from the oscilloscope and simulation agree with each other quite well in both waveshape and values. Fig. 3.12(c) shows the DC voltage ripples of the submodules in the upper and lower arms for the 5-level converter. These values fluctuate around the reference of 225 V, indicating that the inner controller is working properly. The peak-valley difference is estimated to be around 13.3 V from the oscilloscope and simulation. In Fig. 3.12(e) and (f), some DC capacitor voltages of the 7-level MMC are shown and compared. The former indicates that for submodules in the same arm, the rising/declining trends of the DC voltage ripples are the same, while the latter shows that the trend in the opposite arm is exactly the reverse. The average values of these DC voltages, as can be read from these figures, are about 150 V since the number of submodules in an arm increases to six while the DC line voltage is kept constant.
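The capacitor voltage references and harmonic locations quoted above follow directly from the test settings, as the short sketch below shows. The function name `mmc_figures` is hypothetical; the 900 V DC line voltage and the 2000 Hz switching frequency are taken from the test description.

```python
def mmc_figures(levels, v_dc_line=900.0, f_sw=2000.0):
    """Submodule voltage reference and effective switching frequency.

    For an (N+1)-level MMC, each of the N submodules per arm carries
    2*V_dc / N, and the effective switching frequency is N times f_sw.
    """
    n = levels - 1
    return {"v_c": v_dc_line / n, "f_eff": n * f_sw}


for levels in (5, 7):
    fig = mmc_figures(levels)
    print(f"{levels}-level: V_C = {fig['v_c']:.0f} V, f_eff = {fig['f_eff'] / 1e3:.0f} kHz")
```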

Fig. 3.13 gives the switching processes and power losses in the 5-level MMC; the shapes of these waveforms in the 7-level MMC are almost the same and are therefore not shown. Fig. 3.13(a) and (b) are the transient IGBT voltage and current waveforms during the turn-on and turn-off processes. After a positive driving pulse is applied to the gate and a turn-on delay of 1 $\mu$s elapses, the voltage begins to drop and a current surge can be observed in both HIL emulation and SaberRD<sup>®</sup> simulation. Then the current gradually stabilizes and the voltage finally remains slightly above zero due to the conduction resistance. The rise time is defined as the time interval between 10% and 90% of the collector current under steady-state, which is around 0.33 $\mu$s, slightly below the 0.4 $\mu$s provided in the datasheet. When the driving pulse disappears, the turn-off process takes place after a turn-off delay of approximately 4 $\mu$s; it is the opposite process, during which $v_{CE}$ rises to the DC capacitor voltage and the collector current goes to zero. The fall time is defined similarly to the rise time, and its value is near 0.42 $\mu$s, a little larger than the datasheet value of 0.35 $\mu$s. In Fig. 3.13(c), the diode reverse recovery process is shown. As can be seen, after plunging to its peak value, which has virtually the same amplitude as the steady-state current, the reverse current begins to decay to zero and the voltage over the diode climbs to the DC capacitor voltage. The current tail in the SaberRD<sup>®</sup> simulation is a little longer, but since the value of the final stage is extremely small, it is forced to zero in the diode model, which does not cause a significant error when calculating the power loss. Meanwhile, the forward voltage of the diode is also nonzero, which is attributable to the exponential static *I-V* characteristics. The power loss corresponding to each process is also shown, and a high degree of consistency between HIL emulation and SaberRD<sup>®</sup> simulation is observed.

To validate the effectiveness and convenience of the proposed circuit partitioning method in achieving real-time, the 7-level MMC is expanded to 11-level and executed on the FPGA. Fig. 3.14(a) shows the 11-level output voltage and the load current from HIL emulation. Compared with those of the 5- and 7-level MMCs, the voltage quality is higher and, as anticipated, the voltage spectral analysis yields an array of harmonics around 20 kHz, which are almost negligible. The root mean square value of the fundamental component is the same as those of the other two, all about 280 V. The output current, due to the filtering effect of the inductors, is sinusoidal and reaches a peak value of 80 A, which agrees with its theoretical value. The results from SaberRD<sup>®</sup> are also shown in Fig. 3.14(b) for comparison, which indicates that the hardware implementation of the MMC is correct.

Table 3.5 lists the time each switching process takes. The IGBT turn-on delay from HIL emulation is exactly what was provided in the datasheet, while its turn-off delay and diode reverse recovery time are both rounded to integers because the HIL emulation time-step is



Figure 3.13: Details of switching processes and power losses of IGBT or diode from HIL emulation (top) and SaberRD<sup>®</sup> simulation (bottom). (a) IGBT turning on, (b) IGBT turning off, and (c) diode reverse recovery. Oscilloscope axes settings: x-axis 1  $\mu$ s/div, y-axis 40 V/div and 26.67 A/div.

500 ns. The errors for the IGBT rise and fall times are relatively large, because their values are smaller than the time-step and consequently both processes are located on straightened lines and affected by the slopes.

For the upper switch in a submodule, the maximum current flows through the anti-parallel diode, while for the lower switch, the maximum current emerges in the IGBT; thus, their power losses are important. Table 3.6 shows the energy consumption of the IGBT and diode during the transient process and steady-state when the current reaches its maximum in the 5- and 7-level MMC, where the errors are given in their absolute forms to avoid negative values. As can be seen from Fig. 3.13, the maximum steady-state current for both IGBT/diode pairs is about 60 A.

| Time               | Description           | HIL          | Datasheet/SaberRD<sup>®</sup> |
|--------------------|-----------------------|--------------|-------------------------------|
| $t_{IGBT}^{d,on}$  | Turn-on delay         | 1.00 $\mu$s  | 1.00 $\mu$s                   |
| $t_{IGBT}^{r}$     | Rise time             | 0.33 $\mu$s  | 0.40 $\mu$s                   |
| $t_{IGBT}^{d,off}$ | Turn-off delay        | 4.00 $\mu$s  | 3.90 $\mu$s                   |
| $t_{IGBT}^{f}$     | Fall time             | 0.42 $\mu$s  | 0.35 $\mu$s                   |
| $t_{diode}^{rr}$   | Reverse recovery time | 5.00 $\mu$s  | 4.80 $\mu$s                   |

Table 3.5: Switching times of IGBT and diode


Figure 3.14: System-level behavior of 11-level MMC: (a) real-time oscilloscope results; (b) SaberRD<sup>®</sup> simulation results. Oscilloscope axes settings: x-axis 5 ms/div, y-axis 133.34 V/div ( $v_{out}$ ), 66.67 V/div (FFT) and 66.67 A/div.

| Energy (mJ)           | 5L-MMC HIL/SaberRD<sup>®</sup> | 5L-MMC error | 7L-MMC HIL/SaberRD<sup>®</sup> | 7L-MMC error |
|-----------------------|--------------|-------|--------------------------|-------|
| $E_{IGBT}^{turn-on}$  | 14.01/13.43  | 4.32% | 7.93/8.33                | 4.80% |
| $E_{IGBT}^{turn-off}$ | 6.38/6.77    | 5.76% | 5.41/5.43                | 0.37% |
| $E_{IGBT}^{conduct}$  | 2.59/2.58    | 0.39% | 2.40/2.39                | 0.42% |
| $E_{diode}^{rr}$      | 9.28/9.56    | 2.90% | 4.69/5.03                | 6.76% |
| $E_{diode}^{conduct}$ | 1.75/1.76    | 0.57% | 1.69/1.64                | 3.05% |

Table 3.6: Energy consumption validation of proposed IGBT and diode model
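The error columns of Table 3.6 can be recomputed from the tabulated energies as sketched below, taking the SaberRD<sup>®</sup> value as the reference, which is an assumption that matches the listed percentages. Because the energies in the table are already rounded, a recomputed value may differ from the tabulated one in the last digit. The names `ENERGY_MJ` and `rel_error_pct` are hypothetical.

```python
# (HIL, SaberRD) energies in mJ for the 5-level MMC, taken from Table 3.6.
ENERGY_MJ = {
    "IGBT turn-on": (14.01, 13.43),
    "IGBT turn-off": (6.38, 6.77),
    "IGBT conduction": (2.59, 2.58),
    "Diode reverse recovery": (9.28, 9.56),
    "Diode conduction": (1.75, 1.76),
}


def rel_error_pct(hil, reference):
    """Absolute relative error of the HIL value in percent."""
    return abs(hil - reference) / reference * 100.0


for name, (hil, saber) in ENERGY_MJ.items():
    print(f"{name}: {rel_error_pct(hil, saber):.2f}%")
```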

The steady-state power losses are quite accurate because the static V-I characteristics are provided in the datasheet, whereas the transient waveforms are obtained by curve-fitting and their errors are therefore a bit larger; nevertheless, they are still precise enough to be referred to when designing the MMC as well as the cooling system. Moreover, with the increase of the output voltage level, the power consumed by the switches decreases along with the voltage and current stresses. Generally, the proposed HIL system is able to offer accurate power losses of both steady-state and transient stages in the MMC despite the variation of its voltage level and the load. It is more convenient than measuring power losses by setting up an experimental MMC prototype whose excitations as well as loads should be adjusted repeatedly in order to provide the switches with the same electromagnetic environment. In addition, although knowing the steady-state current from simulation of conventional MMC models with ideal switches enables direct acquisition of the steady-state power loss from the device datasheet, estimating the transient portion based on the turn-on and turn-off energy losses provided by the datasheet is less accurate, since they were obtained in an experimental setup with distinct testing conditions.

### 3.6.2 Induction Machine Driven by 5-Level MMC

The speed of the induction machine could only be regulated by the 5-level MMC when the emulation was carried out on the XC7VX485T FPGA. As shown in Table 3.2, the LUTs are insufficient for the other two MMCs to be extended to three phases.

Fig. 3.15(a) shows the regulation of the mechanical angular velocity by real-time HIL emulation. The initial speed reference is 160 rad/s, so the machine starts and the velocity goes up to the reference value in about one second. Meanwhile, a large stator current can be observed in all three phases, and only phase A is shown since they are symmetrical. After 1 s, the actual speed is very close to the reference and the machine operates under steady-state, with the stator currents reduced significantly to around 20 A in amplitude. Then at $t_1=3$ s, $\omega_m^*$ plummets to -160 rad/s, meaning that the rotation direction is reversed, so that the positive speed slows down to zero and later increases in the opposite direction until it reaches the reference value, which sees a slight increase at $t_2=6$ s to -80 rad/s. Consequently, the real speed follows and the machine quickly enters steady-state. Between $t_3=8$ s and $t_4=9$ s, a pulse of 100 N·m is applied to the torque; following this change is a temporary rise of the stator current, but the impact it has on the angular velocity is negligible. As can be seen throughout the whole period, a larger angular velocity leads to a higher current frequency, demonstrated by the density of the waveform. For comparison, Matlab/Simulink simulation is carried out, and the corresponding system-level performance is shown in Fig. 3.15(b), which proves that both controllers are functioning normally and the design theory is correct.

The starting of the induction machine with different torque values was also tested. In Fig. 3.16(a), the locus of the stator currents in the $\alpha$-$\beta$ frame is drawn for the starting period, when the mechanical angular velocity climbs from 0 to 160 rad/s without any load. A momentary current surge at the vertical axis is observed immediately after starting, indicated by curve *A*. Then, as can be seen from curve *B*, the current steadily reduces from 300 A to 150 A, and following a sudden decline shown by curve *C*, the current finally stabilizes around the region *D*.

The loci of stator currents for three torques under steady-state are shown in Fig. 3.16(b). As expected, a larger torque yields a circle with a greater radius. Other information, such as the relation between the duration of the transient process and the torque, is also available. When $\omega_m^*=160 \text{ rad/s}$, it takes 0.52 s, 0.65 s, 0.86 s, 1.31 s and 2.68 s for the machine to reach 95% of $\omega_m^*$ when the torques are -200, -100, 0, 100 and 200 N·m, respectively, indicating that a larger torque leads to a longer time to approach steady state, while the reverse is true for $\omega_m^*=-160 \text{ rad/s}$.

Figure 3.15: Regulation of induction machine speed by 5-level MMC: (a) real-time oscilloscope results, and (b) off-line simulation results. Oscilloscope x-axis: 1 s/div.

# 3.7 Summary

This chapter has demonstrated real-time hardware emulation of a TLM-based MMC structure with piecewise linearized behavioral IGBT/diode model for variable speed drive applications.

From a mathematical point of view, using TLM links to partition the MMC circuit decomposed the large matrix equation of the complete circuit into a set of smaller equations which, when solved in parallel, significantly accelerated the computation even though the emulation time-step was small. Meanwhile, it offered a new perspective for hardware design in which the overall system is represented by several hardware modules and any change specific to one of them has no impact on the others; thus scalability and modularity could be attained, just as in a real MMC system. Moreover, the computational speed is entirely independent of the number of output voltage levels of the converter. The only impact is on the utilization of hardware resources, as the several hardware designs in the results have shown, which helps to determine the appropriate voltage level according to the capacity of the FPGA device. MMCs of different levels were implemented, where the behavioral IGBT/diode model enables HIL emulation to provide accurate system-level performance as well as device-level information such as turn-on/off times and power losses.

Figure 3.16: Real-time oscilloscope results of stator current in  $\alpha$ - $\beta$  frame under (a) starting period, and (b) steady-state with  $T_m$ =0, 100 and 200 N·m, respectively. Oscilloscope x- and y-axis settings: (a) 93.34 A/div; (b) 26.67 A/div.

Implementation of the MMC inner controller and induction machine outer controller was also carried out. Multiple time-steps are particularly useful when there is a remarkable latency disparity between different hardware modules.

# Nonlinear Device-Level Modular Multi-Level Converter Model

# 4.1 Introduction

The model of a modular multilevel converter determines the extent of circuit information that electromagnetic transient simulations can reveal while its complexity has a deep impact on the speed. Therefore, this chapter presents two nonlinear MMC models to cater for various FPGA-based hardware-in-the-loop emulation goals.

In the dynamic curve-fitting model (DCFM), factors affecting the transient performance are taken into account so that device-level behaviors such as power loss and junction temperature can be reproduced accurately in the electro-magnetic-thermal simulation of a power converter. The static parameters are extracted from the manufacturer's datasheet by piecewise linearization, while in the dynamic part, the IGBT rise and fall times are modeled as a nonlinear function of those factors.

The nonlinear behavioral model is widely used in off-line device-level tools such as SaberRD<sup>®</sup> to reproduce device behavior in fine detail. Another merit is its versatility: the model is deemed able to represent a real IGBT under most conditions without changing its parameters. The drawback is that its complexity makes the circuit solution inefficient, since the nonlinear model contains multiple nodes that usually require many iterations of the Newton–Raphson (N–R) method, making it prone to divergence and sensitive to initial conditions. In this work, its modeling details are specified, and simplification is conducted for convergent results and efficient computation.

As in the previous chapter, the inclusion of device-level IGBT/diode models exerts a huge computational burden on the processors. Another fine-grained circuit partitioning approach using a pair of coupled voltage-current sources is applied to the MMC arm. The methodology is simpler than the TLM-link because it omits the selection of a characteristic impedance, while it still leads to the same effect.

The ideal switch based-MMC detailed equivalent model (DEM) has a wide application in electromagnetic transient simulation of HVDC transmission system and multi-terminal HVDC grid. Based on the same switch model, a new MMC model is proposed by regarding the submodule as a transmission line stub, which, compared with the DEM, achieves faster computation speed and utilizes fewer FPGA hardware resources.

The hybrid arm structure is subsequently proposed under the condition of a coexistence of partitioning and merging. It is constructed by taking a number of submodules as a TLM stub whilst the rest use detailed device-level model, which, with merits such as lower hardware resource requirement, faster execution speed, and high numerical accuracy, is suitable for real-time HIL emulation.

# 4.2 Power Semiconductor Switch-Based MMC Modeling

#### 4.2.1 MMC TLM-Stub Model (TLM-S)

As shown in Fig. 4.1(a)(b), when an arbitrary submodule numbered k is in the on-state, the capacitor is charged through the upper switch, which is instantaneously a small resistance, and if the submodule is off, the equivalent circuit is purely a small resistor. Thus, an on-state resistance is always in the conducting path during operation. The presence of the SM capacitor in the path can be determined by the gate signal of the upper switch, treated as a binary value, i.e., $V_{gk}=1$ for the on-state and $V_{gk}=0$ for the off-state. For the blocked state, simply setting $V_{gk}=0$ and $R_2=R_{off}$ omits the freewheeling diode effect, just as in the DEM. To enable the correct SM ON/OFF mode in the blocked state, the gate signal is determined by the direction of the arm current $i_{SM}$: if it flows into the SM through node a, which is defined as the positive direction, then $V_{gk}=1$; otherwise, $V_{gk}=0$. This criterion leads to two equivalent circuits similar to Fig. 4.1(a) and Fig. 4.1(b). For the former state, the submodule impedance equals $R_{on}+Z_{Ck}$, while it is $R_{on}$ for the latter. Correspondingly, the capacitor voltage $v_{Ck}$ alternates between $i_{SM} \cdot Z_{Ck} + V_{Ceqk}$ and $V_{Ceqk}$. Nevertheless, to simulate the high-impedance mode of the blocked state when both diodes are off, $v_{SM}$ and $v_C$ are required to judge whether the upper diode should be turned on. Applying TLM-stub theory, the capacitor voltage and its iterative incident pulse $v_{Ck}^i$ can be written as

$$v_{Ck}(t) = V_{gk}(t) \cdot i_{SM}(t) \cdot Z_{Ck} + 2v_{Ck}^i(t), \qquad (4.1)$$

$$v_{Ck}^{i}(t + \Delta t) = v_{Ck}(t) - v_{Ck}^{i}(t).$$
(4.2)

The Thévenin equivalent circuit of an SM can be obtained as in Fig. 4.1(c), in which

$$R_{eq}(t) = R_{SM} + V_{gk}(t) \cdot Z_{Ck},$$
(4.3)

$$V_{eq}(t) = 2V_{gk}(t) \cdot v_{Ck}^{i}(t),$$
(4.4)



Figure 4.1: MMC TLM-stub model: (a) SM on-state/blocked state, (b) SM off-state/blocked state, and (c) general representation.

where $R_{SM}$ equals $R_{on}$ for all states except the high-impedance mode, in which it should be $R_{off}$.

Then, for an MMC arm containing *N* submodules, the Thévenin equivalent circuit can be expressed as

$$V_{arm\_eq}(t) = \sum_{k=1}^{N} V_{eqk}(t) = 2 \sum_{k=1}^{N} (V_{gk}(t) \cdot v_{Ck}^{i}(t)),$$
(4.5)

$$R_{arm\_eq}(t) = \sum_{k=1}^{N} R_{eqk}(t) = NR_{SM} + Z_{Ck} \sum_{k=1}^{N} V_{gk}(t).$$
(4.6)
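The update in (4.1)–(4.2) and the summation in (4.5)–(4.6) can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and the numerical values in the usage note are placeholders rather than parameters from this work; the stub impedance is passed in directly, so no particular formula for $Z_{Ck}$ is assumed.

```python
from dataclasses import dataclass


@dataclass
class StubSubmodule:
    """One submodule in the TLM-stub representation of (4.1)-(4.4)."""
    z_c: float    # stub impedance Z_Ck (ohm), supplied by the caller
    v_inc: float  # incident pulse v^i_Ck (V)
    r_sm: float   # R_on, or R_off in high-impedance mode (ohm)

    def thevenin(self, gate):
        """Return (R_eq, V_eq) from (4.3) and (4.4); gate is 0 or 1."""
        return self.r_sm + gate * self.z_c, 2.0 * gate * self.v_inc

    def update(self, gate, i_sm):
        """Apply (4.1), then (4.2); return the capacitor voltage v_Ck(t)."""
        v_c = gate * i_sm * self.z_c + 2.0 * self.v_inc
        self.v_inc = v_c - self.v_inc
        return v_c


def arm_thevenin(submodules, gates):
    """Sum the submodule equivalents as in (4.5) and (4.6)."""
    pairs = [sm.thevenin(g) for sm, g in zip(submodules, gates)]
    r_arm = sum(r for r, _ in pairs)
    v_arm = sum(v for _, v in pairs)
    return r_arm, v_arm
```

For example, four identical submodules with a 1 mΩ stub impedance, a 1 mΩ on-state resistance and an incident pulse of 1000 V, two of which are inserted, give an arm equivalent of 6 mΩ in series with 4000 V under these placeholder values.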

#### 4.2.2 IGBT/Diode Dynamic Curve-Fitting Model

When the operation status of the IGBT changes, e.g., the collector current, gate voltage, or even the junction temperature, its dynamic characteristics will also follow suit. However, ordinary CFM is unable to demonstrate the change. Thus, the datasheet-driven dynamic curve-fitting model involving environment-sensitive switching transients is proposed.

Piecewise linearizing the IGBT static *I*-*V* curves provided by the manufacturer into 6 segments, the collector current in the  $j^{th}$  segment can be written as

$$I_C = k_j(T_{vj})V_{CE} - b_j(T_{vj}), \qquad (4.7)$$

where  $b_j$  and  $k_j$  given in Appendix A are linear functions of junction temperature  $T_{vj}$  since data at two different temperatures are available. Taking the IGBT under steady-state as a resistor, its value can then be deduced as

$$r_s = \frac{V_{CE}}{I_C} = \frac{I_C + b_j(T_{vj})}{k_j(T_{vj})I_C}.$$
(4.8)
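As a rough illustration of (4.7) and (4.8), the sketch below evaluates the static resistance of one segment. It assumes that each coefficient is interpolated linearly between two datasheet temperatures, taken here as 25 °C and 125 °C; the function names are hypothetical, and the selection of the active segment $j$ is left out.

```python
def linear_in_temperature(value_25, value_125, t_vj):
    """Interpolate a coefficient linearly between its 25 C and 125 C values."""
    return value_25 + (value_125 - value_25) * (t_vj - 25.0) / 100.0


def static_resistance(i_c, t_vj, k_25, k_125, b_25, b_125):
    """Equivalent steady-state resistance r_s of one segment, from (4.8)."""
    k_j = linear_in_temperature(k_25, k_125, t_vj)
    b_j = linear_in_temperature(b_25, b_125, t_vj)
    v_ce = (i_c + b_j) / k_j  # (4.7) solved for V_CE
    return v_ce / i_c
```

With placeholder coefficients $k_j$ = 1000 S and $b_j$ = 800 A at 25 °C, a collector current of 1000 A corresponds to $V_{CE}$ = 1.8 V and $r_s$ = 1.8 mΩ.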

It should be pointed out that the IGBT off-state accounts for one of the 6 segments. Meanwhile, switching transients must be included as part of the model. In addition to  $T_{vj}$ , the rise and fall times generally denoted by  $t_{r,f}$  are also affected by factors such as gate resistance  $R_{g}$ , and collector current  $I_{C}$ , each of which, according to device datasheet, can be expressed by a piecewise linear function

$$t_{r,f}(x_i) = A_i x_i + B_i, \qquad (4.9)$$

where $x_i$ represents either $T_{vj}$, $R_g$, or $I_C$, and $A_i$, $B_i$ are coefficients. However, when two or more factors vary together, the combined relationship is nonlinear; therefore, the overall effect can be described by a polynomial function

$$t_{r,f}(x_1, x_2, x_3) = k_0 \cdot \prod_{i=1}^3 (x_i) + \sum_{i,j=1\to 3}^{i\neq j} k_i x_i x_j + \sum_{i=1}^3 b_i x_i + b_0,$$
(4.10)

where  $k_i$  and  $b_i$  are coefficients that are obtained in a way that sets two variables constant and forces the function to be equal to (4.9) with the remaining variable, i.e.,

$$t_{r,f}(x_i) = t_{r,f}(x_i, x_j, x_k) \mid_{x_j, x_k = C} .$$
(4.11)

The values then become available, as listed in Appendix A. Note that the gate driving voltage does not appear in (4.10) because for specific applications its amplitude is fixed. Nevertheless, it can be added as a new variable if the  $t_{r,f}$ - $V_g$  relationship is provided by the datasheet.
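The structure of (4.10) can be checked numerically: with two variables held constant, the polynomial collapses to a linear function of the third, as required by (4.11). The sketch below assumes one coefficient per pair of variables, which is one possible reading of the index notation in (4.10); the function name and all coefficient values used with it are hypothetical.

```python
from itertools import combinations
from math import prod


def switching_time(x, k0, k_pairs, b, b0):
    """Evaluate (4.10) for x = (T_vj, R_g, I_C).

    k_pairs maps index pairs (i, j), i < j, to the pairwise coefficient.
    Using one coefficient per pair is an assumption about (4.10).
    """
    pair_sum = sum(k_pairs[(i, j)] * x[i] * x[j]
                   for i, j in combinations(range(3), 2))
    linear_sum = sum(bi * xi for bi, xi in zip(b, x))
    return k0 * prod(x) + pair_sum + linear_sum + b0
```

Because every term contains each variable at most to the first power, evaluating the function at three equally spaced temperatures with $R_g$ and $I_C$ fixed yields three collinear points.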

As the shape of IGBT transient waveforms is influenced by the test circuit, the turn-on and turn-off waveforms of the 5SNA 2000K450300 StakPak IGBT module in Fig. 4.2(a)(b) are obtained from a bridge-structure test circuit which provides the same electromagnetic environment as that of an MMC submodule [133,134], ensuring the applicability of the fitted model. As a result, the diode reverse recovery, reflected by the current surge in Fig. 4.2(a), is automatically included in the IGBT transient waveforms. Fig. 4.3(a)(b) shows the transient model of the IGBT, in which the output of a per-unit circuit is amplified by a proper factor *K*. The voltage-controlled current source (VCCS) and current-controlled current source (CCCS) reproduce simulated curves that closely fit those measured experimentally, as shown in Fig. 4.2. The descending curves can be modeled as the capacitor voltage of a discharging *RC* circuit with a time constant $\tau$. Taking the collector current as an example, the fall time $t_f$, defined as the time for the current to drop from 90% to 10% of its initial value along an extrapolated straight line drawn between the instants when the current is 90% and 60% of its initial value [133], is located on a virtually straight line. To achieve that, the initial capacitor discharging rate $i_{disc}$ should be controlled at

$$i_{disc} = C \frac{dv_C}{dt} = C \frac{(90\% - 10\%)v_C(0)}{t_f}.$$
(4.12)

After  $v_C$  drops to about 33% of its initial value, the curvy tail current emerges. Then, the control object shifts to the resistance while the capacitance is kept constant, and in the  $j^{th}$  nonlinear segment of the curve, it is

$$i_{Cj} = K \cdot v_{Cj}(0) e^{-\frac{t}{\tau_j}},$$
(4.13)



Figure 4.2: IGBT transient waveforms from a bridge-structure test circuit: (a) turn-on process, (b) turn-off process, and (c) coefficient *K* determination.

where  $v_{Cj}(0)$  denotes the initial capacitor voltage of that segment, and coefficient *K* is the last steady-state value for turn-off current, while for turn-on current *K* is the instantaneous arm current, as shown in Fig. 4.2(c).

Similarly, the rising curves are realized by an *RL* circuit. The overshoot is achieved by charging the inductor alone, while introducing a time-varying resistor forces the curve to decline with a certain slope. The rise time, defined as the time taken by the collector current to rise from 10% to 90% of its final value, decides the inductance. Since the segment in which $t_r$ is located is a straight line, the inductance can be derived as:

$$L = U \cdot \frac{dt}{di} = \frac{t_r(T_{vj}, R_g, I_c) \times 1(V)}{(90\% - 10\%)(A)}.$$
(4.14)
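The two slope conditions (4.12) and (4.14) amount to simple arithmetic, sketched below in Python. The function names are hypothetical, and the per-unit source of 1 V and current swing of 0.8 A follow the form of (4.14).

```python
def discharge_current(c, v_c0, t_f):
    """Initial RC discharging rate from (4.12)."""
    return c * (0.90 - 0.10) * v_c0 / t_f


def rise_inductance(t_r, u=1.0, delta_i=0.90 - 0.10):
    """Inductance from (4.14): a source u (V) drives the per-unit
    current across delta_i (A) within the rise time t_r (s)."""
    return u * t_r / delta_i
```

For instance, a rise time of 0.4 µs in the per-unit circuit corresponds to an inductance of 0.5 µH; this value is illustrative only.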



Figure 4.3: Dynamic IGBT model: (a) VCCS for descending curves, (b) CCCS for rising curves.

#### 4.2.3 Power Diode Nonlinear Behavioral Model

The power diode is simplified so that only the static features and the reverse recovery dynamics are preserved, while other negligible components in the original full behavioral model [135] are omitted, as shown in Fig. 4.4(a). The diode static characteristics, represented by the symbol *NLD*, reflect an exponential relationship between the static current $I_d$ and the junction voltage $V_j$, as expressed by

$$I_d = I_s \cdot (e^{\frac{V_j}{V_b}} - 1), \tag{4.15}$$

where  $I_s$  is the leakage current, and  $V_b$  the junction barrier potential. Its discrete-time Norton equivalent circuit which is shown in Fig. 4.4(b) becomes available by taking partial derivative and subsequent linearization, as expressed by

$$G_j = \frac{\partial I_d}{\partial V_j} = \frac{I_s}{V_b} e^{\frac{V_j}{V_b}}$$
(4.16)

$$I_{jeq} = I_d - G_j \cdot V_j, \tag{4.17}$$

respectively, where  $G_j$  and  $I_{jeq}$  are the conductance and the equivalent current contribution of *NLD*.
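To show how the companion model of (4.16) and (4.17) is used inside a Newton–Raphson loop, the following sketch solves a hypothetical test circuit made of a voltage source, a series resistor and the *NLD* element alone. The circuit, the function names and the parameter values are assumptions for illustration and do not represent the full diode model of Fig. 4.4.

```python
import math


def diode_companion(v_j, i_s, v_b):
    """Return (G_j, I_jeq) of (4.16) and (4.17) at junction voltage v_j."""
    e = math.exp(v_j / v_b)
    i_d = i_s * (e - 1.0)        # (4.15)
    g_j = i_s / v_b * e          # (4.16)
    return g_j, i_d - g_j * v_j  # (4.17)


def solve_diode_with_resistor(v_src, r, i_s, v_b,
                              v0=0.7, tol=1e-12, max_iter=50):
    """Newton-Raphson on a source-resistor-diode loop (hypothetical circuit)."""
    v = v0
    for _ in range(max_iter):
        g_j, i_jeq = diode_companion(v, i_s, v_b)
        # Nodal equation of the linearized circuit:
        # (1/r + G_j) * v = v_src / r - I_jeq
        v_new = (v_src / r - i_jeq) / (1.0 / r + g_j)
        if abs(v_new - v) < tol:
            return v_new
        v = v_new
    raise RuntimeError("Newton-Raphson did not converge")
```

Each pass re-linearizes the diode at the latest voltage estimate, which is why a nonlinear element costs several matrix solutions per time-step, and why a poor initial guess can lead to divergence.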

The reverse recovery phenomenon is reproduced by the $R_L$-L pair and the voltage-controlled current source with a coefficient of K. The backward Euler method is adopted due to its lower latency in hardware implementation compared with other integration methods. The Norton equivalent circuit of the linear inductor L is derived by the following equations

$$G_L = \frac{\Delta t}{L},\tag{4.18}$$

$$I_{Leq}(t) = i_L(t - \Delta t), \qquad (4.19)$$

where  $\Delta t$  is the simulation time-step and the iterative inductor current  $i_L(t)$  takes the form of

$$i_L(t) = I_{Leq}(t) + G_L \cdot v_L(t).$$
 (4.20)
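A minimal sketch of the backward Euler inductor companion follows, using the standard companion conductance $G_L = \Delta t / L$ together with (4.19) and (4.20). The series R-L test circuit and the function name are hypothetical and serve only to exercise the update.

```python
def simulate_rl(v_src, r, l, dt, steps):
    """Series R-L circuit stepped with the backward Euler companion model."""
    g_l = dt / l  # companion conductance of the inductor
    i_l = 0.0
    for _ in range(steps):
        i_leq = i_l  # (4.19): history term is the previous current
        # v_L = v_src - r * i_L, substituted into (4.20) and solved for i_L
        i_l = (i_leq + g_l * v_src) / (1.0 + g_l * r)
    return i_l
```

After many time constants, the current settles at $v_{src}/r$, which provides a quick sanity check of the discretization.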

Hence, the matrix equation of the simplified diode model is

$$\mathbf{G}^{Diode} \cdot \mathbf{v}^{Diode} = \mathbf{I}_{eq}^{Diode}, \tag{4.21}$$



Figure 4.4: Nonlinear power diode model: (a) Simplified power diode model, (b) linearized discrete-time equivalent circuit.

where the  $3 \times 3$  admittance matrix is given by

$$\mathbf{G}^{Diode} = \begin{bmatrix} G_j & K - G_j & -K \\ -G_j & G_j + G_L + G_{RL} & -G_L - G_{RL} \\ 0 & -G_L - G_{RL} - K & G_L + G_{RL} + K \end{bmatrix},$$
(4.22)

 $\mathbf{v}^{Diode}$  is a vector of diode nodal voltages, and the equivalent current source contribution vector is

$$\mathbf{I}_{eq}^{Diode} = \begin{bmatrix} -I_{jeq}, & I_{jeq} - I_{Leq}, & I_{Leq} \end{bmatrix}^T.$$
(4.23)

#### 4.2.4 IGBT Nonlinear Behavioral Model

The IGBT behavioral model is shown in Fig. 4.5(a), where *PWLD* denotes a piecewise linear diode,  $R_g$  is the resistance to the gate, and elements such as voltage controlled current sources  $i_{mos}$  and  $i_{tail}$  as well as inter-electrode capacitors  $C_{ce}$  and  $C_{cg}$  are nonlinear.

The basic operation can be summarized as follows: when the collector-emitter voltage $v_{CE}$ is less than the threshold voltage $V_{on}$, *PWLD* stays off and the collector current $i_C$ is zero; when $v_{CE}$ is between $V_{on}$ and the saturation voltage $V_{sat}$, the device is represented by $i_{mos}$ in the quasi-linear region; then, when $v_{CE}$ is greater than $V_{sat}$, $i_C$ mainly depends on the gate-emitter voltage $v_{ge}$ and $v_{ce}$ [66]. The tail current $i_{tail}$, which is controlled by the internal parallel $R_{tail}$-$C_{tail}$ pair, only emerges during the turn-off process. Using the IGBT tool in SaberRD<sup>®</sup>, the static and dynamic parameters can be acquired from the corresponding characteristics and curves provided by the device datasheet [136].



Figure 4.5: Nonlinear IGBT EMT model: (a) Continuous-time behavioral model, (b) Linearized discrete-time equivalent circuit.

The *PWLD* can be deemed as a binary conductor whose on- and off-state conductances are  $g_{on}$  and  $g_{off}$ , respectively. Thus, its Norton equivalent model for electromagnetic transient simulation is

$$G_{pwld} = \begin{cases} g_{on} & (v_{pn} > V_{on}) \\ g_{off} & (v_{pn} \le V_{on}) \end{cases}, \qquad (4.24)$$

$$I_{pwldeq} = -G_{pwld} \cdot V_{on}, \tag{4.25}$$

where $v_{pn}$ is the voltage across *PWLD*, and $V_{on}$ is its forward threshold voltage.
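The two-state companion model of (4.24) and (4.25) can be written as a short Python sketch. The function names are hypothetical, and the branch current is assumed to follow the convention $i = G_{pwld} \cdot v_{pn} + I_{pwldeq}$.

```python
def pwld_companion(v_pn, v_on, g_on, g_off):
    """Norton equivalent (G_pwld, I_pwldeq) from (4.24) and (4.25)."""
    g = g_on if v_pn > v_on else g_off
    return g, -g * v_on


def pwld_current(v_pn, v_on, g_on, g_off):
    """Branch current, assuming i = G_pwld * v_pn + I_pwldeq."""
    g, i_eq = pwld_companion(v_pn, v_on, g_on, g_off)
    return g * v_pn + i_eq
```

Under this convention the current reduces to $G_{pwld}(v_{pn} - V_{on})$, so it is continuous and equal to zero at the switching point $v_{pn} = V_{on}$.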

Using a procedure similar to that illustrated in the diode section, all internal components can be turned into their EMT models, and the outcome is shown in Fig. 4.5(b), in which the linear passive elements are calculated by

$$G_{Cx} = \frac{C_x}{\Delta t},\tag{4.26}$$

$$I_{Cxeq} = -G_{Cx} \cdot v_{Cx}(t - \Delta t), \qquad (4.27)$$

where $C_x$ refers to either $C_{tail}$ or $C_{ge}$. The nonlinear capacitors $C_{ce}$ and $C_{cg}$ are treated in the same fashion; taking $C_{cg}$ as an example,

$$G_{Ccg} = \begin{cases} \frac{(ccgo\cdot(1+\frac{v_{Ccg}}{vcgo})^{-M})}{\Delta t} & (v_{Ccg} > 0) \\ \frac{ccgo}{\Delta t} & (v_{Ccg} \le 0) \end{cases},$$
(4.28)

$$i_{Ccgeq} = \frac{q_{Ccg}(t) - q_{Ccg}(t - \Delta t)}{\Delta t} - G_{Ccg} \cdot v_{Ccg}(t), \qquad (4.29)$$

where M is the Miller capacitance exponent coefficient that affects the current rise and fall time.

Since  $i_{mos}$  and  $i_{tail}$  are dependent on voltages over other components, their EMT models are taken as a combination of equivalent current sources and conductance or transconductance. The voltage controlled current source  $i_{mos}$  reflecting the turn-on and -off behaviors is the most complicated component, as expressed by

$$i_{mos} = \begin{cases} 0, & (v_{Cge} < V_t) || (v_d \le 0) \\ a_2 \cdot v_d^{(z+1)} - b_2 \cdot v_d^{(z+2)}, & v_d < (y \cdot (v_{Cge} - V_t))^{\frac{1}{x}} \\ \frac{(v_{Cge} - V_t)^2}{a_1 + b_1 \cdot (v_{Cge} - V_t)}, & (others) \end{cases}$$
(4.30)

where  $a_1, b_1, a_2, b_2, x, y$  and z are internal parameters,  $V_t$  is the channel threshold voltage, and  $v_d$  the potential difference between Inode1 and Inode2. It indicates that  $i_{mos}$  can branch off conductance  $G_{mosvd}$  and transconductance  $G_{mosvcge}$  derived by taking partial derivatives with respect to  $v_d$  and  $v_{Cge}$ , i.e.,  $\frac{\partial i_{mos}}{\partial v_d}$  and  $\frac{\partial i_{mos}}{\partial v_{Cge}}$ .

Thus, its equivalent current $I_{moseq}$ takes the form

$$I_{moseq} = i_{mos} - G_{mosvd} \cdot v_d - G_{mosvcge} \cdot v_{Cge}.$$
(4.31)

Similarly, the equivalent current contribution from  $i_{tail}$  unit can also be found as an expression of transconductance

$$I_{taileq} = i_{tail} - G_{tailvd}v_d - G_{tailvcge}v_{Cge} - G_{tailvtail}v_{tail}.$$
(4.32)

A 5×5 admittance matrix  $\mathbf{G}^{IGBT}$  and current source contribution vector  $\mathbf{I}_{eq}^{IGBT}$  can be constructed according to the discrete model, as given in Appendix A. Then, the IGBT nodal voltage vector  $\mathbf{v}^{IGBT}$  is obtained by

$$\mathbf{v}^{IGBT} = (\mathbf{G}^{IGBT})^{-1} \cdot \mathbf{I}_{eq}^{IGBT}.$$
(4.33)

#### 4.2.5 Electro-Thermal Network

The IGBT power loss due to conduction and switching produces heat which diffuses through the junction and raises its temperature that in turn affects the device performance. Hence, establishing a dynamic electro-thermal relationship as in Fig. 4.6 is essential for cooling system capacity evaluation.

The IGBT power loss  $P_{loss}$  acts as the input current source, whose terminal voltage represents the junction temperature  $T_{vj}$ . The dynamic junction to case thermal impedance has the following expression

$$Z_{th} = \sum_{i=1}^{n} R_i (1 - e^{-\frac{t}{\tau_i}}), \qquad (4.34)$$



Figure 4.6: IGBT inherent electro-thermal transient network.

where $R_i$ and $\tau_i$ are constants available in the manufacturer's device datasheet. For circuit simulation purposes, the thermal impedance is embodied by an *R*-*C* network [137], with the capacitance being calculated by

$$C_i = \frac{\tau_i}{R_i}.\tag{4.35}$$
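Equations (4.34) and (4.35) translate directly into code. The sketch below is illustrative; the function names are hypothetical and any $R_i$, $\tau_i$ values passed to it would normally be read from the device datasheet.

```python
import math


def foster_capacitances(r, tau):
    """Thermal capacitances C_i = tau_i / R_i from (4.35)."""
    return [ti / ri for ri, ti in zip(r, tau)]


def thermal_impedance(t, r, tau):
    """Junction-to-case transient thermal impedance Z_th(t) from (4.34)."""
    return sum(ri * (1.0 - math.exp(-t / ti)) for ri, ti in zip(r, tau))
```

The impedance starts at zero and approaches the sum of all $R_i$ for long heating times, which is the steady-state junction-to-case thermal resistance.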

Then, the companion circuit  $R_{ti}$ - $I_{hi}$  can be obtained by discretization. Subsequently, the junction temperature is calculated as

$$T_{vj} = \sum_{i=1}^{4} \left[ (P_{loss} + I_{hi}) \times (\frac{1}{R_i} + \frac{2\tau_i}{R_i \Delta t})^{-1} \right] + T_{amb},$$
(4.36)

where  $T_{amb}$  is the ambient temperature set at 25°C, and  $I_{hi}$  is the current source contribution of the capacitors' TLM stub model, written as

$$I_{hi} = 2 \cdot t_{Ci}^i \cdot \frac{2\tau_i}{R_i \Delta t},\tag{4.37}$$

in which  $t_{Ci}^i$  is the incident pulse of capacitor's TLM stub model and is updated by

$$t_{Ci}^{i}(t) = \left[ (P_{loss} + I_{hi}) \times \left(\frac{1}{R_{i}} + \frac{2\tau_{i}}{R_{i}\Delta t}\right)^{-1} \right] - t_{Ci}^{i}(t - \Delta t).$$
(4.38)
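The recursion formed by (4.36) to (4.38) can be exercised with a constant power loss, as in the Python sketch below. It assumes that the network starts at ambient temperature with zero incident pulses; the function name and any $R_i$, $\tau_i$ values are hypothetical placeholders, not datasheet values.

```python
def junction_temperature(p_loss, r, tau, dt, steps, t_amb=25.0):
    """Step the thermal network of (4.36)-(4.38) under constant power loss."""
    t_inc = [0.0] * len(r)  # incident pulses t^i_Ci, network at ambient
    t_vj = t_amb
    for _ in range(steps):
        t_vj = t_amb
        for i, (ri, ti) in enumerate(zip(r, tau)):
            y_c = 2.0 * ti / (ri * dt)                # 2*tau_i / (R_i*dt)
            i_h = 2.0 * t_inc[i] * y_c                # (4.37)
            rise = (p_loss + i_h) / (1.0 / ri + y_c)  # bracket in (4.36)
            t_inc[i] = rise - t_inc[i]                # (4.38)
            t_vj += rise
    return t_vj
```

At the fixed point of the recursion each *R*-*C* stage contributes $P_{loss} \cdot R_i$, so the junction temperature settles at $T_{amb} + P_{loss}\sum R_i$, in agreement with the long-time limit of (4.34).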

As shown by the device datasheet, the junction temperature has a significant impact on IGBT static performance. Thus, these static parameters should be expressed as functions of the temperature. Linear functions which have the form of

$$y(T_j) = k \cdot T_j + p \tag{4.39}$$

are applied to the calculation of these parameters because the datasheet provides curves at only two temperatures, 25°C and 125°C. However, if more data are available, nonlinear functions can be employed to describe the dynamic electro-thermal features more precisely.

# 4.3 Fine-Grained MMC Partitioning Schemes

#### 4.3.1 V-I Coupling

The coupled voltage-current sources can be inserted between the arm and the submodule exactly where the TLM-link would be applied. However, unlike the TLM-link, the V-I coupling has no characteristic impedance, as shown in Fig. 4.7. The current source is placed on the submodule side since that side contains nonlinear IGBT/diode models solved by nodal equations. On the left side, the MMC main circuit can be solved by either mesh current or nodal voltage equations; in the latter case, the coupled voltage sources are first converted to current sources.

The partitioning method induces a unit delay to both sides. At the instant  $t-\Delta t$ ,  $i_u(t-\Delta t)$  and  $i_d(t-\Delta t)$  are obtained by solving the matrix equation corresponding to the left circuit, and they are sent to the submodules. Then, the time instant t begins. On the SM side, based on the submodule current it just received, its port voltage  $v_k(t)$  can be derived. Thus,  $v_k$  is one time-step ahead of  $i_u$  and  $i_d$  on the SM side, while the reverse is the case for the MMC arm. Nevertheless, the fact that the circuit computation frequency is much higher than that of the arm current means  $i_u$  and  $i_d$  can be deemed as constants in two neighboring time-steps and its impact on simulation accuracy is negligible.
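The effect of the unit delay can be illustrated on a deliberately simple circuit. In the sketch below, which is a toy example and not the MMC itself, a DC source with a series R-L branch plays the role of the main circuit and a single resistor stands in for the submodule; the two sides exchange current and voltage with one time-step of delay each. All names and values are hypothetical, and backward Euler is assumed for the inductor.

```python
def simulate_vi_coupling(v_dc, r_arm, l_arm, r_sm, dt, steps):
    """Toy loop split by a V-I coupling with unit delays on both sides."""
    a = l_arm / dt         # backward Euler resistance of the inductor
    i_main = 0.0           # arm current solved on the main-circuit side
    i_seen_by_sm = 0.0     # current received by the submodule side
    v_seen_by_main = 0.0   # voltage received by the main-circuit side
    for _ in range(steps):
        # Both sides advance concurrently on the other's previous output.
        v_sm = r_sm * i_seen_by_sm
        i_new = (v_dc - v_seen_by_main + a * i_main) / (r_arm + a)
        i_seen_by_sm, v_seen_by_main, i_main = i_new, v_sm, i_new
    return i_main
```

When the time-step is much smaller than the circuit time constant, the delayed exchange converges to the same steady state as the unpartitioned loop, $v_{dc}/(r_{arm}+r_{sm})$, which is consistent with the argument that the arm current changes little between neighboring time-steps.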

The nodal voltage equation on the SM side is determined by the IGBT/diode type it uses, while on the main circuit side, the arms have a fixed form, e.g., the Thévenin equivalent circuit is

$$Z_{arm} = Z_{Lu,d} + r_{arm}, \tag{4.40}$$

$$U_{arm} = 2v_{Lu,d}^{i} + \sum_{k=1}^{N} v_k(t - \Delta t).$$
(4.41)

where  $Z_{Lu,d}$  and  $v_{Lu,d}^i$  constitute the TLM-stub model of the inductor, and  $r_{arm}$  is its parasitic resistance.

#### 4.3.2 Hybrid Arm Model

The partitioning scheme solves the long latency issue caused by employing complex IGBT and diode models in FPGA implementation; nevertheless, the high hardware resource requirement has not been alleviated. In Fig. 4.8, a hybrid arm structure is proposed in which a flexible number of submodules are split off while the rest adopt the TLM-stub model, giving efficient computation and lower hardware utilization when deployed on an FPGA. The arm's Thévenin equivalent circuit is then

$$v_{arm}(t) = i_{SM}(t) \cdot Z_{Lu,d} + 2v_L^i(t) + \sum_{k=1}^n v_{SMk}(t - \Delta t) + i_{SM}(t) \cdot \sum_{k=n+1}^N R_{eqk}(t) + \sum_{k=n+1}^N V_{eqk}(t - \Delta t),$$
(4.42)



Figure 4.7: MMC partitioning by *V*-*I* coupling.

where $Z_{Lu,d}$ and $2v_L^i$ constitute the TLM-stub model of the arm inductor, $v_{SMk}$ is the voltage coupling of the $k^{th}$ submodule, and the number $n \in [1, N-1]$.

On the nonlinear submodule side, the computation approach relies on switch state. Under steady-state, both switches of the SM are taken as resistors, then

$$\begin{bmatrix} v_{Ck}(t) \\ v_{SMk}(t) \end{bmatrix} = \begin{bmatrix} Z_{Ck}^{-1} + R_1^{-1} & -R_1^{-1} \\ -R_1^{-1} & R_1^{-1} + R_2^{-1} \end{bmatrix}^{-1} \cdot \begin{bmatrix} v_{Ck}^i(t) \cdot Z_{Ck}^{-1} \\ i_{SMk}(t - \Delta t) \end{bmatrix},$$
(4.43)

where $v_{SMk}(t)$ is the voltage to be sent to the opposite side. The SM blocked state is a special steady-state, where $R_1$ and $R_2$ are determined by the arm current: a positive $i_{SM}$ indicates that the upper diode is on and $R_1$ is small; otherwise, $R_2$ has a small resistance. During the transient state, the IGBT DCFM is taken as a controlled current source, and the current in the complementary switch, defined as flowing from collector to emitter, is given as:

$$i'_C(t) = i_C(t) \pm i_{SM}(t - \Delta t).$$
 (4.44)

Lastly, knowing the branch currents enables the calculation of other circuit variables, such as $v_{Ck}(t)$ and $v_{SMk}(t)$.



Figure 4.8: MMC hybrid arm model with *V*-*I* couplings.

# 4.4 Hardware Emulation Case 1 – DCFM

The DCFM is applied to real-time emulation of a solid-state transformer (SST) in a three-terminal DC system, as shown in Fig. 4.9 and the parameters are given in Appendix A.3.

#### 4.4.1 DC-DC Converter HIL Emulation

The proposed hybrid arm model is implemented on the Xilinx<sup>®</sup> Virtex<sup>®</sup>-7 FPGA. As demonstrated in Table 4.1, the MMC TLM-stub model enables faster simulation on CPU and lower hardware latency on FPGA, making it more suitable for real-time simulation, particularly when a small time-step is required. For instance, the DEM is nearly 30% slower than the proposed TLM-S in simulating a 51-level MMC with a 20-$\mu$s time-step on a 3.40 GHz Intel<sup>®</sup> Core<sup>TM</sup> i7 CPU with 8.00 GB RAM running 64-bit Windows<sup>®</sup> 7 Enterprise SP1, while the accuracy of the two models is the same. Moreover, the resource utilization of the DEM is much higher than that of TLM-S; for example, Xilinx<sup>®</sup> Vivado HLS<sup>®</sup> estimates that for an 11-level MMC, one DEM controller takes around 14% of the LUTs, compared with only 4% for the proposed TLM-S.

| MMC level | DEM (5-s simulation) | TLM-S (5-s simulation) | Speedup | DEM latency   | TLM-S latency |
|-----------|----------------------|------------------------|---------|---------------|---------------|
| 5-L       | 7.8 s                | 6.8 s                  | 1.15    | $76 T_{clk}$  | $64 T_{clk}$  |
| 11-L      | 9.2 s                | 7.7 s                  | 1.19    | $92 T_{clk}$  | $80 T_{clk}$  |
| 51-L      | 17.8 s               | 13.6 s                 | 1.31    | $108 T_{clk}$ | $96 T_{clk}$  |
| 101-L     | 28.8 s               | 21.1 s                 | 1.36    | $116 T_{clk}$ | $104 T_{clk}$ |
| 501-L     | 115.0 s              | 80.1 s                 | 1.44    | $132 T_{clk}$ | $120 T_{clk}$ |
| 1001-L    | 222.0 s              | 153.4 s                | 1.45    | $140 T_{clk}$ | $128 T_{clk}$ |

Table 4.1: MMC model simulation speed comparison



Figure 4.9: MMC-based DC-DC converter for MTDC system.



Figure 4.10: Top-level hardware structure of  $MMC_H$ .

The hardware design of $MMC_H$ is taken as an example because the topology of the SST is symmetrical. Vivado HLS<sup>®</sup> was employed to shorten the design cycle. Signals with the same attribute are grouped in an array, rather than being taken individually, for a significant reduction in hardware resource utilization when C synthesis is conducted. The drawback is that the latency increases slightly with the array size. Therefore, this hardware design strategy is mainly adopted in controllers which have a good latency tolerance. The unroll directive provided by the design tool is also chosen for achieving parallelism of signals in an array.

Table 4.2 gives hardware design specifics. Around 15% of the LUTs and DSPs are required by the 1-phase 55-level MMC and its controller. With an FPGA clock frequency of 100 MHz, the time-step for the PSC should be no less than 13.75 $\mu$s according to its latency of 1375 $T_{clk}$, where $T_{clk}$ = 10 ns, while it is flexible for the remaining modules other than the nonlinear SM hardware module *NSM*, so all of them can be set at 15 $\mu$s. The parallelism of the 55-level MMC is purposely weakened to save hardware resources, and consequently the latency of TLM-S increases to 95 $T_{clk}$. Meanwhile, the time-step for *NSM* is 500 ns to ensure the accuracy of switching transients.
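The lower bound on each time-step is simply the module latency multiplied by the clock period. The sketch below repeats this arithmetic for the latencies listed in Table 4.2, using integer nanoseconds to avoid rounding; the function and variable names are hypothetical.

```python
T_CLK_NS = 10  # clock period at 100 MHz

# Latencies in clock cycles, taken from Table 4.2
LATENCY_CYCLES = {"NSM": 35, "MMCP": 95, "MFT": 205, "DQT": 78,
                  "OLC": 110, "PLL": 8, "PSC": 1375, "THM": 61}


def min_time_step_ns(module):
    """Smallest admissible time-step: latency times clock period."""
    return LATENCY_CYCLES[module] * T_CLK_NS


def fits(module, time_step_ns):
    """True if the module completes within the chosen time-step."""
    return min_time_step_ns(module) <= time_step_ns
```

With these figures the PSC needs at least 13750 ns, so it fits a 15 $\mu$s time-step, and the NSM needs 350 ns, which fits its 500 ns time-step.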

Fig. 4.10 illustrates the general hardware structure of $MMC_H$, where the input and output ports of the MMC and its controller, grouped as one component, are specifically shown, while other functional blocks in the top level are represented by their simplified forms. As can be seen, manipulating gate-level logic is avoided with Vivado HLS<sup>®</sup>, and only the input and output ports of a component are required during hardware design. Since the hybrid arm is used in the simulation, one MMC contains four main blocks, i.e., phase-shift control (PSC), MMC linear part (MMC0), MMC nonlinear submodule (NSM) and the thermal network (THM). Among them, the *THM* is independent of the other modules to shorten the hardware latency: the NSM sends $v_{CE}$ and $i_C$ to the *THM* whenever it completes a calculation, and the newest values take effect when the *THM* starts a new computation cycle; similarly, the THM calculates $t_{r,f}$ and sends them immediately to the *NSM*. Meanwhile, the 6 arms of an MMC are calculated concurrently, and the outputs *G* and *J* are sent to the *MFT* module where the nodal voltages are sought.

Figure 4.11: SST top-level finite state machine.

Figure 4.12: SST control scheme: (a)  $MMC_L$  controller, and (b)  $MMC_H$  controller.

Connecting these modules either by wire or through D flip-flops is achieved by VHDL in Vivado<sup>®</sup>, where the top-level finite state machine defining the operation sequence in Fig. 4.11 is realized. Once hardware emulation starts, two independent loops run simultaneously. Loop 1 contains solely *NSM* and repeats every  $\Delta t_1$ =500ns, while all remaining modules constitute Loop 2 that has a period of  $\Delta t_2$ =15 $\mu$ s. Thus, this multiple timestepping scheme avoids compromising switching transients by other hardware modules that must have a large time-step. The carrier waveforms are stored in ROM so that MMC phase-shift control can operate properly. Table 4.2 shows that for each loop, the time-step is larger than the maximum latency, so two timers are set: once an exact time-step runs out, a new calculation cycle will begin.
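The relation between the two loops can be pictured with a small scheduling sketch. It only lists the instants at which each timer would start a new calculation cycle, assuming ideal timers and the two periods quoted above; the function name and the event representation are hypothetical and do not reflect the VHDL implementation.

```python
def multirate_schedule(dt1_ns=500, dt2_ns=15000, horizon_ns=30000):
    """List (time_ns, loop) start events for two free-running loops."""
    events = [(t, "loop1") for t in range(0, horizon_ns, dt1_ns)]
    events += [(t, "loop2") for t in range(0, horizon_ns, dt2_ns)]
    return sorted(events)
```

Over one 15 $\mu$s period of Loop 2, Loop 1 starts 30 times, so the *NSM* resolves each switching transient at a much finer granularity than the rest of the system.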

Table 4.2: MMC hardware design specifications

Module latency

| Module | Description         | Latency         | Time-step  |
|--------|---------------------|-----------------|------------|
| NSM    | nonlinear SM        | $35\,T_{clk}$   | 500ns      |
| MMCP   | 1-phase MMC         | $95\,T_{clk}$   | $15 \mu s$ |
| MFT    | transformer         | $205\,T_{clk}$  | $15 \mu s$ |
| DQT    | abc-dq              | $78\,T_{clk}$   | $15 \mu s$ |
| OLC    | outer loop control  | $110\,T_{clk}$  | $15 \mu s$ |
| PLL    | phase reference     | $8\,T_{clk}$    | $15 \mu s$ |
| PSC    | phase-shift control | $1375\,T_{clk}$ | $15 \mu s$ |
| THM    | thermal network     | $61\,T_{clk}$   | $15 \mu s$ |

Hardware resource utilization

| Resources | 55L-TLM-S (1p) | NSM (1SM)    | Total  |
|-----------|----------------|--------------|--------|
| LUT       | 45159 (14.87%) | 5848 (1.93%) | 303600 |
| LUTRAM    | 46 (0.04%)     | 49 (0.04%)   | 130800 |
| FF        | 37390 (6.16%)  | 3140 (0.52%) | 607200 |
| BRAM      | 31.50 (3.06%)  | 0 (0%)       | 1030   |
| DSP       | 435 (15.54%)   | 32 (1.14%)   | 2800   |
| BUFG      | 4 (12.5%)      | 4 (12.5%)    | 32     |
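The timing constraint described with Table 4.2 (each loop's time-step must exceed the latency of every module it contains) can be checked with a short script. This is only an illustrative sketch: the latencies and time-steps are copied from Table 4.2, while the clock period `T_CLK_NS` of 10ns (a 100MHz clock) is an assumption, since the clock frequency of this particular design is not stated in this section. The function name `latency_margins` is likewise hypothetical.

```python
# Latency (clock cycles) and time-step (ns) per module, copied from Table 4.2.
MODULES = {
    "NSM": (35, 500),
    "MMCP": (95, 15_000),
    "MFT": (205, 15_000),
    "DQT": (78, 15_000),
    "OLC": (110, 15_000),
    "PLL": (8, 15_000),
    "PSC": (1375, 15_000),
    "THM": (61, 15_000),
}

# ASSUMPTION: 100 MHz FPGA clock, i.e. a 10 ns clock period.
T_CLK_NS = 10


def latency_margins(modules=MODULES, t_clk_ns=T_CLK_NS):
    """Return the slack (time-step minus latency) of each module in ns."""
    return {
        name: step_ns - cycles * t_clk_ns
        for name, (cycles, step_ns) in modules.items()
    }


if __name__ == "__main__":
    for name, margin in latency_margins().items():
        status = "ok" if margin > 0 else "VIOLATED"
        print(f"{name:5s} slack = {margin:6d} ns  [{status}]")
```

Under the assumed clock, the tightest module is the PSC, whose 1375 cycles leave little slack within the 15$\mu$s step of Loop 2, while the NSM fits comfortably within the 500ns step of Loop 1.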

Fig. 4.12 shows the outer-loop controllers of the SST. $MMC_L$ regulates active and reactive power, while its counterpart is in charge of the medium frequency transformer (MFT) AC voltage on the primary side. The control scheme is carried out in the *d*-*q* frame and is largely the same as that of other voltage-source converters, except that an additional MMC inner loop employing phase-shift control is adopted. The angle $\theta$ for the inverse Park transformation is the reference which determines the MFT operation frequency. Internal variables with superscripts *H* and *L* correspond to the primary and secondary sides of the transformer, respectively.

#### 4.4.2 Real-Time HIL Emulation Results

#### 4.4.2.1 Device-Level Behavior

In the MMC, the static SM current is alternating at a frequency decided by $\theta$ and is taken as an example to show its influence on the IGBT's turn-on and turn-off times, as shown in Fig. 4.13(a)(b). When $I_C$ climbs from 500A to 3500A, the turn-on time increases steadily, while the turn-off time first declines from 1300ns to around 800ns, and then rises again. Comparison with the corresponding values provided by the datasheet confirms the accuracy of the proposed IGBT transient model, and with more linear segments to approximate the nonlinear $t_{r,f}$-$I_C$ curves, the accuracy of the results can be further improved. Fig. 4.13(c) gives the corresponding energy consumption: both the turn-on and turn-off energies $E_{on}/E_{off}$ closely follow the datasheet values when identical test conditions are set. The high degree of agreement between simulation and experimental data indicates that the proposed IGBT dynamic curve-fitting model qualifies for MMC simulation to give design guidance.

In Fig. 4.14 and thereafter, simulations are conducted based on the MTDC system in Fig. 4.9. Stipulating that the switching frequency is kept 10 times higher than that of the MFT to ensure MMC output quality, the operation status of the $MMC_H$ lower IGBT in delivering 200MW to $MMC_3$ is shown in Fig. 4.14. The power loss waveforms at two switching frequencies are given in Fig. 4.14(a), which indicates that, with a higher density of power pulses, the higher switching frequency has a more significant impact on junction temperature. Fig. 4.14(b) verifies this viewpoint: with switching frequencies of 600Hz, 1800Hz and 3000Hz, the junction temperatures center around 45°C, 75°C, and 103°C, respectively. The fluctuations in the temperature are caused by the alternating turn-on and turn-off processes, and consequently the higher the switching frequency, the denser the ripples appear.

With respect to safe operation, it can be inferred that measures such as using external cooling apparatus and increasing the MMC level are required when $f_{sw}$=3000Hz, while natural cooling is sufficient for steady-state operation with $f_{sw}$=600Hz. For further validation, the transferred power is reduced to 20MW, and SaberRD<sup>®</sup>, which is commonly referred to for device-level information, is used to simulate a 5-level MMC, since numerical divergence occurs if the number of levels is higher. The results in Fig. 4.14(c) demonstrate that the junction temperatures from the DCFM and the simulation tool's own IGBT model are largely the same, meaning that the proposed DCFM is as accurate as the commercial simulation tool. Fig. 4.14(d) gives the relation between MMC voltage levels and maximum IGBT junction temperatures, which demonstrates that by increasing the MMC voltage level, a dramatic junction temperature drop can be achieved if $f_{sw}$=3000Hz, while the improvement is not significant when $f_{sw}$ is 600Hz.



Figure 4.13: IGBT transient tests under different collector current at  $T_{vj}$ =125°C: (a) turn-on process, (b) turn-off process, and (c) turn-on and turn-off energy.

#### 4.4.2.2 Converter-Level Performance

The $MMC_H$ control target $V_{gd}^{H*}$ is 90kV, and the MFT frequencies for Fig. 4.15(a) and (b) are set to 60Hz and 180Hz, respectively. The real-time results from the oscilloscope show that the MFT primary voltages $v_{pri}$ track the control target. Consequently, on the secondary side, the value is halved to about 45kV. The SM capacitor voltages are also given in Fig. 4.15(c), which indicates that increasing the MFT frequency allows smaller transformers as well as smaller arm inductors and capacitors: under 60Hz, the SM capacitor voltage ripples for both $MMC_H$ and $MMC_L$ are still larger than those under 180Hz even though the SM capacitance and arm inductance are 2 to 4 times larger, as shown in Appendix A. As expected, the introduction of the more accurate DCFM causes some minor differences with respect to the PSCAD/EMTDC<sup>®</sup> results; however, the average values and variations of these signals are similar. In Fig. 4.15(d), the arm currents and SM DC voltages are compared



Figure 4.14:  $MMC_H$  lower IGBT operation status: (a) power loss waveforms for 55L-MMC, (b) junction temperature waveforms for 55L-MMC, (c) 5L-MMC SaberRD<sup>®</sup> validation, and (d) relation between  $T_{vj}$  and MMC level.

between TLM-S and PSCAD/EMTDC<sup>®</sup>; it can be seen that these two types of ideal MMC models agree well.

#### 4.4.2.3 System Tests

Some tests demonstrating the function of the SST are carried out as a further validation of the proposed MMC models. In Fig. 4.16, power reversal is conducted, and all power flowing to the DC yard is defined as positive. Initially, the powers delivered to Station-2 and Station-3 are 300MW and 100MW, respectively; thus $I_{dc2}$ and $I_{dc4}$ are approximately 1.5kA and 0.5kA, and $I_{dc3}$ remains around 2 times $I_{dc4}$. At $t_1=2s$, the power order in $MMC_L$ begins to ramp from -100MW to 100MW, i.e., Station-3 is converted to a rectifier station, and consequently, Station-2 receives 500MW from the other two stations and $I_{dc2}$ finally stabilizes at 2.5kA. The DC currents are clear indicators of power variation because the DC voltage at Station-2, $U_{dc2}$, is precisely controlled at 200kV. During the reversal process, the MFT currents undergo ramping while its voltages remain constant due to $MMC_H$'s control, as demonstrated by $v_{sec}$. Another notable feature of the SST is fault isolation, which is shown in Fig. 4.17. Immediately after $t_0$=1s when the fault on DC Line-3 is detected, both $MMC_H$ and $MMC_L$ are ordered to block their driving pulses. As a result, the voltages on both sides of the MFT vanish, indicating that the SST has fault isolation capability. Meanwhile, the power from Station-1 is diverted solely to Station-2 because $I_{dc2}$ has the same amplitude as $I_{dc1}$ and $I_{dc4}$ reduces to 0. The corresponding results from PSCAD/EMTDC<sup>®</sup> confirm these statements, indicating that the proposed MMC models can be used for MTDC grid studies.

Figure 4.15: SST converter-level results (left: HIL emulation; right: PSCAD/EMTDC<sup>®</sup>): (a), (b) MFT primary and secondary voltages at 60Hz and 180Hz, (c) SM DC voltage ripples, and (d) ideal MMC models comparison. Oscilloscope horizontal axes setting: 10ms/div.



Figure 4.16: MTDC system power reversal from HIL emulation (up/left) and PSCAD/EMTDC<sup>®</sup> (bottom/right). Oscilloscope horizontal axes setting: 1s/div.

Since the SM blocked state cannot be explicitly shown due to the SST's fault isolation capability, it is verified by applying the improved MMC model and the original TLM-S to $MMC_1$ in Fig. 4.9, where a 1$\Omega$ line-to-ground fault is imposed on DC Line 1 right after inductor $L_1$, and corresponding PSCAD/EMTDC<sup>®</sup> simulations are conducted for validation, as given in Fig. 4.18. Both models produce the same correct results until t=0.1s, when an obvious bifurcation emerges. The improved MMC model yields results identical to those of the traditional model with each IGBT having a freewheeling diode in PSCAD/EMTDC<sup>®</sup>. The AC voltage under this scenario is still being rectified by the diodes, so $V_{dc1}$ is about 30kV, and the fault current stabilizes at around 30kA. Moreover, in the controller, $i_{gd}$ and $i_{gq}$ deviate from the control target, and their non-zero values prove energy flow between the AC and DC grids. However, with the original TLM-S that fully blocks the MMC, $V_{dc1}$ finally stabilizes at 0 after periods of oscillation, and similar behavior can be observed with the DC current. Meanwhile, $i_{gd}$ and $i_{gq}$ are zero, meaning that no current is flowing from the AC side to the DC side. These incorrect results are identical to those in PSCAD/EMTDC<sup>®</sup> when a large resistor is inserted between the MMC and the DC yard, proving that the original TLM-S has a high-impedance blocked state.

Figure 4.17: SST fault isolation test waveforms from HIL emulation (up/left) and PSCAD/EMTDC<sup>®</sup> (bottom/right). Oscilloscope horizontal axes setting: 0.5s/div.

In Fig. 4.19, the alternation between different SM blocked states is tested by passive charging of $MMC_1$, which is operating as a STATCOM, and results from an off-line simulation tool are used for comparison. It can be seen that both the SM capacitor voltages and the upper arm currents are the same. Initially $i_u$ is either positive or negative, indicating the ON and OFF modes of the blocked state. Then, the third state with zero arm current emerges, which indicates that both diodes are OFF and the SM is in the high-impedance state.

# 4.5 Hardware Emulation Case 2 – NBM

#### 4.5.1 Power Converter HIL Emulation

A medium-voltage DC (MVDC) system is implemented on the FPGA for demonstration of the nonlinear behavioral IGBT/diode models involved in system-level emulation, as Fig. 4.20(a) shows. The station controller is shown in Fig. 4.20(b). The inverter is set to control the DC line voltage, while the rectifier is in charge of instantaneous power regulation.

In Fig. 4.21, the iterative HIL emulation process of the MMC submodule containing nonlinear behavioral IGBT and diode models is depicted. It should be noted that the *V*-*I* coupling module is designed specifically for the converter part with circuit partitioning.



Figure 4.18: $MMC_1$ blocked state test results.



Figure 4.19: Passive charging of  $MMC_1$  with opened DC line.

#### 4.5.2 HIL Emulation Results and Validation

To showcase the versatility of nonlinear behavioral models, HIL emulation results from device-level to system-level captured by the Tektronix DPO 7054 Digital Phosphor Oscilloscope are validated by off-line simulation tools running under 64-bit Windows<sup>®</sup> 7 Enterprise SP1 operating system with 3.40GHz Intel<sup>®</sup> Core<sup>TM</sup>i7 CPU and 8.00GB of RAM. The employed IGBT and power diode models have been experimentally verified and are available in SaberRD<sup>®</sup>, as also listed in Appendix A.



Figure 4.20: MMC-based MVDC system: (a) system configuration, and (b) station control scheme.



Figure 4.21: Hardware architecture and its signal flow routes for the MMC submodule with non-linear behavioral switch models.

#### 4.5.3 Islanded MMC Performance

The MMC topology in Fig. 4.20(a) is used as an inverter with DC link voltage $V_{dc}$=3kV and an AC-side inductive load of 5$\Omega$-6mH to demonstrate the performance of the non-linear behavioral IGBT and diode models. In device-level simulation, the selection of a switch type should consider the device's capacity. The BSM300GA160D IGBT (1600V/400A) is suitable for this DC voltage rating and thus is chosen. The switching frequency and the AC output frequency are 2.0kHz and 60Hz, respectively.

Due to the nonlinearities in a submodule, a minimum of 5 N-R iterations is needed for convergent results, and each iteration has a latency of 209 clock cycles. The HIL emulation time-step is set to 200ns and the FPGA clock frequency is 100MHz. Table 4.3 summarizes the time that some EMT simulators and the HIL system need to compute a number of circuits for a 100ms period. To achieve high fidelity, multiple switches are considered. The time SaberRD<sup>®</sup> needs to complete the simulation of simple circuits, e.g., a single diode or IGBT, is acceptable, and the hardware speedup is moderate. However, the SaberRD<sup>®</sup> time rises dramatically with the circuit scale and the number of parallel switches. Thus, the speedup over SaberRD<sup>®</sup> (SP1) for a 3-phase 5-level MMC starts at 65, while it reaches 275 for the 11-level MMC. Meanwhile, the HIL system has a similar or even faster simulation speed than PSCAD/EMTDC<sup>®</sup> (speedup SP2) in single 3-phase MMC cases even though the time-step in the latter tool is 20$\mu$s. Thus, it can be inferred that with higher voltage levels, more converters, or more parallel devices, the speedup becomes more significant because the MMC latency remains the same.

| Tool                          | $m$         | Diode       | IGBT        | 5L-MMC     | 11L-MMC     |
|-------------------------------|-------------|-------------|-------------|------------|-------------|
| SaberRD<sup>®</sup> time (s)  | $m_1 = 1$   | 2.96        | 4.2         | 340        | 715         |
| SaberRD<sup>®</sup> time (s)  | $m_2 = 2$   | 4.15        | 6.5         | 528        | 1060        |
| SaberRD<sup>®</sup> time (s)  | $m_3 = 3$   | 5.10        | 8.6         | 620        | 1430        |
| PSCAD<sup>®</sup> time (s)    | 1           | 0.3         | 0.3         | 4.5        | 17.5        |
| HIL system time (s)           | $m_{1,2,3}$ | 0.68        | 2.2         | 5.2        | 5.2         |
| Speedup SP1                   | $m_{1,2,3}$ | 4.3/6.1/7.5 | 1.9/2.9/3.9 | 65/101/119 | 137/204/275 |
| Speedup SP2                   | $m_{1,2,3}$ | 0.44        | 0.13        | 0.87       | 3.37        |

Table 4.3: Simulation execution times from EMT simulators and HIL systems
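The MMC entries of Table 4.3 can be cross-checked with simple arithmetic. The sketch below assumes that the HIL execution time is dominated by the N-R iterations, i.e., (number of time-steps) × (iterations per step) × (cycles per iteration) × (clock period); this is an interpretation rather than a formula given in the text, although it reproduces the 5.2s figure for the MMC cases. The speedups are taken as ratios of the tabulated times. All function names are hypothetical.

```python
def hil_execution_time(duration_s, time_step_s, iterations,
                       cycles_per_iteration, f_clk_hz):
    """Estimate the HIL run time, assuming iteration latency dominates."""
    steps = round(duration_s / time_step_s)
    return steps * iterations * cycles_per_iteration / f_clk_hz


def speedup(reference_time_s, hil_time_s):
    """Ratio of a reference tool's execution time to the HIL time."""
    return reference_time_s / hil_time_s


if __name__ == "__main__":
    # Values quoted in the text: 100 ms window, 200 ns step,
    # 5 N-R iterations of 209 cycles each, 100 MHz clock.
    t_hil = hil_execution_time(100e-3, 200e-9, 5, 209, 100e6)
    print(f"Estimated HIL time: {t_hil:.3f} s")

    # SaberRD times for the 11-level MMC (m = 1, 2, 3) from Table 4.3.
    for t_saber in (715, 1060, 1430):
        print(f"SP1 = {speedup(t_saber, 5.2):.0f}")

    # PSCAD time for the 11-level MMC from Table 4.3.
    print(f"SP2 = {speedup(17.5, 5.2):.2f}")
```

Because the estimate does not depend on the number of MMC levels, it is consistent with the observation that the 5-level and 11-level MMC rows share the same HIL execution time.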

The oscilloscope results in Figs. 4.22(a)-(c) show the start-up of the 5-level MMC. Slightly irregular in the first two cycles, the output voltage later stabilises with 5 evident levels. DC capacitor overcharge is observed in all submodules, with those in the lower arm reaching larger amplitudes of around 1200V, but finally all of them settle around 750V, as shown in Fig. 4.22(b), indicating proper functioning of the controller. Fig. 4.22(c) shows two arm currents; their opposite phase relation explains why the submodule capacitor voltages in the upper and lower arms reach their peaks alternately. Moreover, a momentary current surge at the beginning explains the overcharge in the DC capacitors. The impact of the number of behavioral SMs in an MMC is also tested by setting all of them non-linear, and the results are given in the middle row, which are verified by SaberRD<sup>®</sup> using the same configuration in the bottom row. The ideal switch model leads to some minor differences in the output voltage around the 3rd cycle; other than that, its outcomes are virtually the same as those in the other two rows, indicating that the proposed MMC arm structure has a high fidelity.

From the perspective of a real converter design, switching dead-time is always set to protect the switches in a submodule, and the gate driver circuit also affects their safe operation. Fig. 4.23(a) shows the turn-on waveforms of an IGBT without dead-time, with a gate voltage $V_G$=+15V/0V applied to the device via a gate resistance of 10$\Omega$. A collector current surge of over 1200A appears due to overlapped conduction of the two complementary switches and, consequently, the energy stored in the DC capacitor discharges dramatically through that path. To avoid the hazardous current which may damage the switches, as well as to demonstrate the versatility of the behavioral model, different gate driving conditions are set. As depicted in Fig. 4.23(b-c), the current surge, now caused by diode reverse recovery, is mitigated remarkably to about 2 times the amplitude of the steady-state current by simply setting a sufficient dead-time of $5\mu$s. Reducing the off-state gate voltage $V_G^{off}$ relaxes the requirement on dead-time, as demonstrated in Fig. 4.23(d-e). With a $2\mu$s dead-time and $V_G^{off}$=0, a current surge of up to 1000A can still be observed. In contrast, it disappears when $V_G^{off}$=-10V. Fig. 4.23(f) gives an overview of the switching waveforms of the upper and lower IGBT-diode pairs in a submodule. During Stage 1, the arm current is positive and consequently, the upper diode conducts to charge the DC capacitor, as can be noticed from the rising envelopes of $v_{CE1}$ and $v_{CE2}$. Reverse recovery accompanies the diode operation, and correspondingly, a current overshoot is induced in the lower IGBT. At Stage 2, the arm current becomes negative, so the upper IGBT is ordered to turn on repeatedly, and the lower diode acts in concert to discharge the energy stored in the DC capacitor.
These device-level results prove that the non-linear behavioral model adapts well to variations of the electromagnetic environment, since its switching waveforms change along with the external circuits without any adjustment of its parameters once they are obtained; on the contrary, the ideal switch model and the averaged value model do not have transients. It is also impractical to give the curve-fitting model that capability because potentially there could be numerous switching cases, and selection of an appropriate case is difficult. Moreover, it is also restricted by the availability of hardware resources when implemented on the FPGA.

In Table 4.4, some static and dynamic features of the IGBT and diode models are validated by SaberRD<sup>®</sup> simulation. It shows that the reverse recovery of the diode lasts up to about $2\mu$s, much longer than the IGBT's turn-on and turn-off periods, which are around 200ns and 640ns, respectively. The conduction energy consumption, distinguished by the subscript *cond*, is measured when the collector current reaches its maximum, i.e., 300A. The error with respect to SaberRD<sup>®</sup> is negligible because essentially, it is a comparison of the static *I-V* characteristics, which are easy to model. The transient energy dissipation covers the overall switching period, i.e., from the time prior to the process to the switch's re-entry into steady-state. Thus, the energy consumption $E_{Tr}$ is calculated over the duration of the switching period:

$$E_{Tr} = \int_0^{T_{Tr}} (v \cdot i) dt, \qquad (4.45)$$



Figure 4.22: System-level performance of MMC with non-linear behavioral models from proposed models (top, middle) and SaberRD<sup>®</sup> simulation (bottom): (a) Output voltage, (b) Capacitor voltages, and (c) Arm currents. Oscilloscope y-axis: (a) 396V(A)/div., (b) 155V/div., (c) 155A/div.; x-axis: 50ms/div.

Table 4.4: Validation of IGBT and power diode nonlinear behavioral models by SaberRD<sup>®</sup>

|                    | SaberRD<sup>®</sup> | FPGA    | Error  |
|--------------------|----------|---------|--------|
| Transient time     |          |         |        |
| $t_{rr}^{Diode}$   | 2080ns   | 1970ns  | 5.2%   |
| $t_r^{IGBT}$       | 200ns    | 205ns   | 2.5%   |
| $t_f^{IGBT}$       | 640ns    | 600ns   | 6.6%   |
| Energy consumption |          |         |        |
| $E_{rr}^{Diode}$   | 3.71mJ   | 3.51mJ  | 5.4%   |
| $E_{cond}^{Diode}$ | 7.26mJ   | 7.27mJ  | 0.2%   |
| $E_r^{IGBT}$       | 18.18mJ  | 18.11mJ | 0.4%   |
| $E_f^{IGBT}$       | 103.48mJ | 99.96mJ | 3.4%   |
| $E_{cond}^{IGBT}$  | 6.58mJ   | 6.58mJ  | < 0.1% |

In HIL simulation, the Trapezoidal method is applied to the above equation, leading to

$$E_{Tr} = \frac{\sum_{i=1}^{N_{Tr}} (v_i \cdot i_i + v_{i+1} \cdot i_{i+1}) \Delta t}{2}, \qquad (4.46)$$

where the entire duration  $T_{Tr}$  is divided into  $N_{Tr}=T_{Tr}/\Delta t$  intervals. Though the mathematical model for the switching transients is more complex, the energy loss from HIL



Figure 4.23: Performance of MMC with non-linear behavioral models from HIL emulation (top) and SaberRD<sup>®</sup> simulation (bottom): (a) IGBT turn-on without dead-time, (b-c) Switching transients with $5\mu$s dead-time, (d-e) Switching transients with $2\mu$s dead-time, and (f) Operation of complementary switches in a SM from HIL emulation. Oscilloscope y-axis: (a) 156V(A)/div., (b)-(e) 130V(A)/div., (f) 255V(A)/div.; x-axis: (a)-(e) $5\mu$s/div., (f) 10ms/div.

emulation is still precise, with the diode reverse recovery energy having the largest error of 5.4% and the IGBT turn-off loss next to it, at 3.4%. Moreover, the numerical results indicate that the IGBT switching energies are much higher than its conduction energy, underlining the importance of device-level non-linear switch models for evaluating the safe operation of a converter.
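The trapezoidal accumulation of Eq. (4.46) can be illustrated with a few lines of Python. This is a minimal software sketch of the formula, not the FPGA implementation; the function name `transient_energy` and the sample waveform (a constant 100V with a current ramping linearly from 0 to 10A over 1$\mu$s) are assumptions chosen only so that the result can be compared with the analytic integral.

```python
def transient_energy(v, i, dt):
    """Trapezoidal estimate of the switching energy, following Eq. (4.46).

    v, i : equally spaced voltage and current samples (same length, N + 1
           samples for N intervals); dt : sampling interval in seconds.
    """
    if len(v) != len(i):
        raise ValueError("v and i must have the same number of samples")
    energy = 0.0
    for k in range(len(v) - 1):
        # Average of the instantaneous power at both ends of the interval.
        energy += (v[k] * i[k] + v[k + 1] * i[k + 1]) * dt / 2.0
    return energy


if __name__ == "__main__":
    n = 10                      # number of intervals (illustrative)
    dt = 1e-6 / n               # 1 us transient split into n intervals
    v = [100.0] * (n + 1)       # constant voltage, illustrative
    i = [10.0 * k / n for k in range(n + 1)]  # linear current ramp
    print(f"E_Tr = {transient_energy(v, i, dt):.3e} J")
```

For this linear power ramp the trapezoidal rule is exact, so the estimate coincides with the analytic value $\tfrac{1}{2} \cdot 100\,\mathrm{V} \cdot 10\,\mathrm{A} \cdot 1\,\mu\mathrm{s}$; for real switching waveforms the error depends on how well $\Delta t$ resolves the transient.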

#### 4.5.4 MMC-MVDC Performance

To enable a higher DC voltage with the same 5-level MMC configuration, IGBTs with a larger capacity, such as the 5SNA 2000K450300 StakPak IGBT Module (4500V/2000A), should be used. HIL emulation of a 10kV/0.8kA MVDC system is conducted, while validation of the results relies on PSCAD/EMTDC<sup>®</sup> as SaberRD<sup>®</sup> is unable to simulate such a large



Figure 4.24: MVDC System-level performance from HIL emulation (top) and PSCAD/EMTDC<sup>®</sup> (bottom): (a) System start, (b) Line-to-line fault response, and (c) Power reversal. Oscilloscope y-axis: (a) 2.58kV/div., (b) 1.73kV/div., 272A/div., (c) 1.72kV/div., 246A/div.; x-axis: (a) 1s/div., (b) 100ms/div., (c) 10s/div.

system for a long period. In Fig. 4.24(a), system start is conducted: after a few oscillations at the beginning, the DC voltages stabilise at around 1s, with the rectifier station slightly over 10kV. At t=2s, a pole-to-pole fault lasting 5ms is applied at the centre of the transmission line, as Fig. 4.24(b) depicts; the DC voltages fall immediately, and the transmission line current surges from the initial 500A to approximately 1kA. In Fig. 4.24(c), power reversal is carried out. The power reference in the rectifier station is ordered to ramp from -5MW to 3MW over a time interval of 10s, and consequently, the DC line current $I_{dc}$ declines from approximately +500A to -300A. Before $t_1$=10s, the energy is transferred to the inverter side, and the DC voltage at the rectifier station $V_{dc1}$ is slightly higher than $V_{dc2}$ at the inverter station to ensure energy flow. Then, $I_{dc}$ starts ramping down, accompanied by a minor decrease of the DC voltages at both terminals. At $t_2$=20s, the process ends and, noticeably, the numerical relationship between the two DC voltages has also reversed. These results prove that the decoupled hardware modules of the nonlinear behavioral switch models can be effectively employed for system-level studies: the fully iterative solution provides the same results as the transient simulation tool PSCAD/EMTDC<sup>®</sup> performing a non-iterative solution using ideal switch models, and an obvious speed advantage is witnessed, i.e., PSCAD/EMTDC<sup>®</sup> takes around 752s to simulate a 10s interval with a much larger time-step of $20\mu s$, while the HIL system only requires 520s even though its time-step is 100 times smaller.

# 4.6 Summary

This chapter has demonstrated real-time hardware emulation of MMCs with nonlinear device-level IGBT/diode models for various applications to obtain their precise performance for circuit design evaluation, which otherwise cannot be achieved by the ideal two-state switch model.

The dynamic curve-fitting model improves the versatility of the previous curve-fitting model by linking its turn-on and turn-off times with the time constants of *RC* and *RL* circuits, and consequently, the transient waveforms can be precisely simulated under various normal operation conditions. Meanwhile, the MMC TLM-stub model alleviates the hardware resource burden and shows its speed advantage in both CPU simulation and HIL emulation on FPGA in conjunction with other complex switch models. Circuit partitioning enables the coexistence of separated nonlinear submodules and the linear MMC circuit even though they have distinct time-steps, and with a smaller matrix dimension, a significant speedup can be attained in addition to avoiding numerical divergence.

Compared with the dynamic curve-fitting method, the nonlinear behavioral models are more versatile with respect to electromagnetic environment variation, and consequently can be applied to various power converters to obtain their precise performance for thorough circuit design evaluation that even the dynamic curve-fitting model cannot achieve. The consistency between HIL emulation results and those from off-line simulation tools indicates that the proposed nonlinear behavioral IGBT and diode modules have wide application prospects ranging from device-level behavior evaluation to system-level performance preview.

# 5 High-Fidelity Device-Level Hybrid HVDC Breaker Models

# 5.1 Introduction

Modeling of the HHB can be carried out from two perspectives: system-level and device-level. A precise HHB model should be a full-scale model so that internal details can be investigated. However, designing an HHB with hundreds of IGBTs in a massive array would lead to an extremely heavy computational burden as well as to a high FPGA resource utilization. The inclusion of IGBT models can further improve simulation results by providing more details. The predominant model is the TSSM, which only has two nodes to achieve fast circuit computation. The CFM is also a two-node switch model, whose on-state resistance is obtained from the static I-V characteristics, and whose dynamic waveforms are acquired directly from experimental measurement or indirectly from the device datasheet. Thus, the calculation of steady-state and transient power losses is more accurate. Its major shortcoming is that the stored values for the transient waveforms need to be adjusted repeatedly along with the variation of the electromagnetic environment for accurate results. The nonlinear behavioral model is widely used in off-line device-level tools such as SaberRD<sup>®</sup> to provide every detail of the circuit accurately. Another merit is its versatility: the model is deemed to be able to represent a real IGBT under most conditions without changing its parameters. The drawback is that its complexity leads to an inefficient solution of the circuit, since the nonlinear model contains multiple nodes that are usually solved by many iterations of the Newton-Raphson (N-R) method, making it prone to non-convergence and sensitive to initial conditions.

In this chapter, three types of full-scale HHB models with high fidelity - classified according to IGBT models they contain - are proposed for efficient real-time HIL emulation of HVDC grids on FPGAs as well as for achieving fast electromagnetic transient calculation by off-line simulation tools. Type-1 model is based on TSSM that has been developed in the past, but this work overcomes its original drawbacks of slow simulation speed and high resource utilization so that this new model can be executed in real-time. Type-2 model based on the curve-fitting technique is a further improvement to realize two goals simultaneously: it can be used in real-time HIL emulation, and device-level phenomena are included to enable the model to provide more details. The second-order NBM-based DC breaker is also introduced as the most accurate one and is categorized as the Type-3 model. Like its curve-fitting counterpart, an electro-thermal network is created to enable the acquisition of operation statuses such as IGBT power loss and junction temperature, and consequently, the HHB design including the selection of IGBT type and its number can be evaluated. To reduce FPGA hardware resource utilization caused by a large number of IGBTs, circuit partitioning is first applied and based on that, one of the sub-circuits is used to represent all other identical ones.

# 5.2 HHB in MTDC System

#### 5.2.1 MTDC Schematic

Fig. 5.1 shows a three-terminal HVDC system in which the functions of the hybrid HVDC breaker can be evaluated. The configurations of the three converter stations are symmetrical. $STN_1$ (REC) is set as the rectifier station, while the other two, denoted as $STN_2$ (INV1) and $STN_3$ (INV2), are inverter stations. The former is in charge of power while the latter two control their individual DC bus voltages. $I_1$, $I_{12}$ and $I_{13}$ are rectifier side DC currents, and $I_{21}$ as well as $I_{31}$ represent inverter side DC currents. $L_{12}$, $L_{13}$, $L_{21}$ and $L_{31}$ are current limiting inductors in the DC yards, which, together with the breakers denoted by $B_{12}$, $B_{13}$, $B_{21}$, and $B_{31}$, constitute the HHBs. Line faults with a resistance $R_f$ can be simulated on both transmission lines linking the inverter stations with the rectifier station.

As MMCs are gaining popularity and are presumed to be dominant in future MTDC projects, the proposed HHB models are inserted in the DC yards of such a system. Since the main focus is on the performance of the HHBs, and since a proper modification of discounting the AC side reactance and resistance into the DC side enables the MMC averaged value model (AVM) to predict system-level behaviors when a DC line fault occurs [42, 138], the AVM is adopted to achieve a low computational burden. Furthermore, the installation of an appropriately designed HHB guarantees that the DC fault current from an AVM-based MTDC system is similar to that of a detailed equivalent model over a period longer than the HHB protection time.

The control scheme of the MMC is also shown in Fig. 5.1, which demonstrates that the strategies for the rectifier and inverter are largely the same, except that the control objective is selected according to the role of the converter station. Meanwhile, the scheme based on


Figure 5.1: Schematic of a three-terminal monopole HVDC system and its control and protection concepts.

d-q frame is identical to that of other grid-connected voltage sourced converters, except the modulation signals  $v_{MMC}^{ABC}$  are sent to an additional inner-loop controller, which, in this case, adopts a phase-shift strategy to generate driving pulses denoted by the vector  $\mathbf{V}_{gate}$ . In AVM, the output voltage of a submodule is determined by the state of its upper switch. Thus the combined output voltage of submodules in one arm of an (N+1)-level MMC can be uniformly calculated by

$$v_{u,d} = \sum_{i=1}^{N} V_{g,i}^{upper} \times \frac{U_{dc}}{N},\tag{5.1}$$

where  $V_{g,i}^{upper}$  is a binary number indicating on/off state of the upper switch in the *i*<sup>th</sup> submodule by 1 and 0, respectively, and  $U_{dc}$  denotes the converter side DC line voltage.
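As a minimal sketch of (5.1), the arm voltage can be evaluated directly from the gate states of the upper switches. The function name `arm_voltage` and the numerical values (8 submodules, a 400 kV DC link) are illustrative assumptions and are not taken from the test system.

```python
def arm_voltage(upper_gates, u_dc):
    """Combined output voltage of one MMC arm according to (5.1).

    upper_gates: sequence of 0/1 states of the upper switch in each submodule.
    u_dc: converter-side DC line voltage in volts.
    """
    n = len(upper_gates)
    if n == 0:
        raise ValueError("an arm needs at least one submodule")
    # Each inserted submodule contributes U_dc / N to the arm voltage.
    return sum(upper_gates) * u_dc / n


# Illustrative values only: 8 submodules, 5 of them inserted, 400 kV DC link.
v_arm = arm_voltage([1, 1, 1, 1, 1, 0, 0, 0], 400e3)
```

With five of eight submodules inserted, the sketch returns five eighths of the DC voltage.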

# 5.2.2 DC Line Protection

The DC line protection (LPR) concepts for MMC-HVDC systems share a great similarity with line commutated converter based HVDC, which means that a variety of criteria, such as voltage derivative protection (VDP), under voltage protection (UVP), and over current protection (OCP), can also be applied to judge line faults. The difference lies in the fact that isolating the faulty section is mainly achieved by HHBs on both inverter and rectifier sides. In Fig. 5.1, two popular protection concepts for HHB testing are shown.

## 5.2.2.1 Voltage Derivative Protection

VDP has a fast reaction to line faults. The principle is as follows: when the DC line contacts ground via a small resistance $R_f$, the voltage drops instantly from hundreds of kilovolts to close to zero or a negative value. Thus the voltage change rate DUDT is extremely large; it is calculated by

$$DUDT = \frac{dU_{dc}(t)}{dt} = U_{dc}(t) - U_{dc}(t - \delta t),\tag{5.2}$$

where $\delta t$ is the digital sampling interval and consequently $U_{dc}(t - \delta t)$ indicates the DC line voltage at the previous sample. The protection threshold $\delta u^*$ should be far larger than the DUDT value under steady-state conditions and during the start or stop of the converter. Even so, to avoid maloperation, a width comparison stage is introduced: if DUDT remains larger than $\delta u^*$ for a preset time $\delta t^*$, a trip order is issued to activate the HHB protection process.
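The VDP criterion with width comparison can be sketched as below. This is only an illustration under stated assumptions: the magnitude of the per-sample difference in (5.2) is compared with the threshold (since the voltage falls during a fault), the preset time $\delta t^*$ is expressed as a number of consecutive samples, and the function name `vdp_trip` is hypothetical.

```python
def vdp_trip(u_dc_samples, du_threshold, width_samples):
    """Return the sample index at which a trip order is issued, or None.

    u_dc_samples: sampled DC line voltage.
    du_threshold: protection threshold (delta u*), per-sample difference.
    width_samples: number of consecutive samples the criterion must hold.
    """
    run = 0  # consecutive samples for which |DUDT| exceeded the threshold
    for n in range(1, len(u_dc_samples)):
        dudt = u_dc_samples[n] - u_dc_samples[n - 1]  # per-sample form of (5.2)
        if abs(dudt) > du_threshold:  # assumption: magnitude is compared
            run += 1
            if run >= width_samples:
                return n
        else:
            run = 0  # width comparison restarts after any normal sample
    return None


# Illustrative waveforms in kV: a sustained collapse and a single-sample dip.
fault = [400, 400, 399, 350, 300, 250, 200, 200]
glitch = [400, 400, 350, 350, 350]
trip_index = vdp_trip(fault, du_threshold=20, width_samples=3)
no_trip = vdp_trip(glitch, du_threshold=20, width_samples=3)
```

The sustained collapse produces a trip, whereas the single-sample dip is rejected by the width comparison.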

## 5.2.2.2 Over Current Protection

OCP responds more slowly to line faults than VDP and consequently places a higher requirement on the breaking capability of an HHB. Nevertheless, it is still useful in protecting electrical facilities and can be used as a backup. The principle is as follows: when the line current rises beyond the setting, a tripping pulse with a predefined width is issued, which is followed by the HHB operation sequence.

# 5.3 Proactive Hybrid HVDC Breaker

Fig. 5.2(a) is the scaled-down model of a unidirectional HHB, which, like the real equipment, contains six essential parts: the current limiting inductor *L*, the residual current breaker (RCB), the ultrafast disconnector (UFD), the load commutation switch (LCS), the metal oxide varistor (MOV), and the main breaker (MB) with its snubber circuit. Under normal conduction, the LCS accounts for the majority of energy consumption. In contrast, when DC line protection is triggered, the power loss of the LCS is negligible compared with that of the MB and the MOV, which absorb most of the energy stored in the energy transmission corridor [139], including the current limiting inductor. Thus, an accurate device-level IGBT model is necessary for the MB so that the switching power loss can be calculated for HHB design evaluation, while for the LCS the steady-state power loss is of greater concern.

As part of the line protection concepts, the operation sequence of the HVDC breaker is shown in the right corner of Fig. 5.1. After receiving the trip order, the LCS gate signals are immediately withdrawn and the UFD is commanded to open, which takes around 2 ms to complete. The MB gate voltages should vanish as soon as the previous actions are confirmed. The protection procedure ends with the opening of the RCB when the line current declines to zero so as to protect the varistor from overheating.
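The operation sequence described above can be sketched as a small state machine. The state names, the function `next_state`, and the zero-current tolerance `i_zero` are hypothetical; only the ordering of events and the approximate 2 ms UFD opening time come from the text.

```python
from enum import Enum


class HhbState(Enum):
    CONDUCTING = 0   # LCS and UFD carry the line current
    COMMUTATING = 1  # LCS blocked, UFD opening, MB carries the current
    CLEARING = 2     # MB blocked, MOV absorbs the stored energy
    ISOLATED = 3     # RCB open, varistor protected from overheating


def next_state(state, trip, t_since_trip, i_line, t_ufd=2e-3, i_zero=1.0):
    """Advance the breaker sequence by one evaluation.

    trip: True once the line protection has issued a trip order.
    t_since_trip: time elapsed since the trip order, in seconds.
    i_line: DC line current in amperes.
    t_ufd: UFD opening time (about 2 ms in the text).
    i_zero: hypothetical tolerance below which the current counts as zero.
    """
    if state is HhbState.CONDUCTING and trip:
        return HhbState.COMMUTATING
    if state is HhbState.COMMUTATING and t_since_trip >= t_ufd:
        return HhbState.CLEARING
    if state is HhbState.CLEARING and abs(i_line) <= i_zero:
        return HhbState.ISOLATED
    return state


# Illustrative trajectory: (time since trip in s, line current in A).
trajectory = [(0.0, 3000.0), (1e-3, 4000.0), (2e-3, 5000.0), (3e-3, 100.0), (4e-3, 0.0)]
state = HhbState.CONDUCTING
visited = []
for t, i in trajectory:
    state = next_state(state, True, t, i)
    visited.append(state)
```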

# 5.3.1 EMT Model of the Proposed HHB

The design theory and operation principle of the HHB have been illustrated in detail under the assumption that all IGBTs in the MB chain are synchronized, which is reasonable and also implies that all internal nodes are well balanced. In a scaled-down model, the MB and LCS are taken as two-state resistors, with on- and off-state resistances $R_{on}$ and $R_{off}$, and the snubber circuit is rarely included. Thus, the total node number is 3, as the UFD and LCS can be merged into one resistor so that the internal node between them is eliminated. For further simplification, the UFD-LCS branch is merged with the MB branch, and this constitutes the simplest HHB model for EMT simulation [18, 140]. When a fault occurs on the transmission line, this model can give an approximate performance of the HHB. However, as stated above, it is unable to provide further information about the circuit breaker and may give inaccurate results. In Fig. 5.2(b), the full-scale HHB model is depicted, which has exactly the same configuration as a real one, and two types of commonly used snubber circuits for the circuit breaker are employed [141,142], i.e., RC and RCD, the latter of which is shown in the sub-figures.

Figure 5.2: Models of unidirectional HHB for EMT simulation: (a) scaled-down model, (b) conventional full-scale model, (c) $3 \times 3$ IGBT array, and (d) decomposition of HHB full-scale model using *v*-*i* coupling.

Depending on the required HHB capacity, an IGBT symbol for both the LCS and MB in the full model may actually consist of only one device or a number of such devices organized in an $N \times N$ array. As indicated in Fig. 5.2(c), a $3 \times 3$ array of 5SNA 2000K450300 StakPak IGBT modules ($V_{CE}$=4500 V, $I_C$=2000 A) [143] is able to endure a DC current over 6000 A. However, modeling the circuit breaker as it actually appears would result in a large system admittance matrix due to hundreds of IGBTs in the MB branch and their snubber circuits, which yield a similar number of nodes, making the original HHB model highly time- and resource-inefficient for commercial off-line as well as real-time EMT simulation tools to solve. Therefore, the voltage-current source coupling method is applied for circuit partitioning to achieve speedup in simulation. As shown in Fig. 5.2(d), the $V_p$-$J_s$ coupling enables all HHB units to be physically separated but electrically linked to the power transmission corridor. The UFD and LCS are equally divided into as many parts as there are HHB units, and so are their resistances. It should be stressed that one IGBT unit in Fig. 5.2(d) denotes an IGBT array, i.e., three IGBTs in parallel, and the MOV in the MB unit is also equally divided and denoted by $MOV_u$ since the three IGBT units sharing it have been forcibly separated.

Figure 5.3: HVDC power transfer corridor with HHB separated: (a) equivalent circuit topology, (b) EMT simulation model.

Fig. 5.3(a) shows the power transfer corridor. The configurations on both sides of the transmission line are symmetrical, and therefore only the rectifier station side is shown. The current source $I_{dc}$ and capacitor $C_e$ represent the DC part of the MMC (MMC-DC). By applying transmission line theory, the equivalent circuit for EMT simulation can be obtained, as shown in Fig. 5.3(b), where $C_e$ is represented by its TLM link model for circuit partitioning, and both transmission lines employ the Bergeron line model [144], which adopts a hybrid Thévenin-Norton structure, leading to a number of one-node circuits whose calculation becomes very convenient. The only exception is the sub-circuit where the DC yard is located, whose matrix dimension is 2, as expressed below:

$$\mathbf{Z} = \begin{bmatrix} Z_{Ce} + RCB_2 + Z_L + Z_{13} & -RCB_2 - Z_L - Z_{13} \\ -RCB_2 - Z_L - Z_{13} & \sum RCB + 2Z_L + \sum Z_{1i} \end{bmatrix},\tag{5.3}$$

$$\mathbf{U} = \begin{bmatrix} 2v_2^i - 2v_4^i - \sum V_q + Z_{13} \cdot I_{kt13} \\ 2v_4^i + \sum V_q - Z_{13} \cdot I_{kt13} - 2v_3^i - \sum V_p + Z_{12} \cdot I_{kt12} \end{bmatrix},\tag{5.4}$$

where $v_2^i$, $v_3^i$ and $v_4^i$ are incident pulses of the TLM link and stub models of the capacitor and inductors, respectively. $Z_{Ce}$ and $Z_L$ are the characteristic impedances of these TLM models, $Z$ (carrying the line subscript, as in $Z_{12}$ and $Z_{13}$) is the transmission line's characteristic impedance, and $\sum V_{p,q}$ denotes the voltage sources coupled with the HHB units. Thus, the DC yard is linked to MMC-DC by incident pulses, while it connects to the transmission line through the coupling between current sources $I_{kt}$ and $I_{mt}$. Noticing that line faults are simulated on transmission line 1, a special line section independent from the DC yard is constructed, while the model of transmission line 2 has only two sections.

The advantage of this partitioning method is that the hybrid HVDC breakers in the power transmission path do not introduce any additional mesh; thus, mesh currents, rather than nodal voltages, are taken as variables, which makes the solution of the corresponding matrix equation fast.

#### 5.3.2 Varistor Model

The varistor is modeled as a nonlinear resistor, whose value plummets when the current surges. The rating of the virtual varistor unit $MOV_u$ can be determined from a real one, and since the current flowing through them is the same, their distinction lies in the voltage rating. For a $3 \times 3$ IGBT array, the voltage rating of $MOV_u$ should be reduced by two-thirds, and the *I*-*V* relation is expressed by

$$i_v = \left(\frac{v_v}{k_v \cdot V_{ref}}\right)^{\alpha_v} \cdot I_{ref},\tag{5.5}$$

where  $k_v$ ,  $\alpha_v$  are coefficients,  $V_{ref}$  denotes protection voltage and  $I_{ref}$  is the corresponding current, and  $v_v$  and  $i_v$  are varistor's voltage and current, respectively.

Based on (5.5), the Norton equivalent model of the nonlinear varistor takes the form of

$$G_v = \frac{\partial i_v}{\partial v_v} = \frac{\alpha_v \cdot I_{ref}}{k_v \cdot V_{ref}} \cdot \left(\frac{v_v}{k_v \cdot V_{ref}}\right)^{\alpha_v - 1},\tag{5.6}$$

$$I_{veq} = i_v - \frac{\partial i_v}{\partial v_v} \cdot v_v. \tag{5.7}$$

Since (5.6) and (5.7) are nonlinear equations, Newton-Raphson iteration is necessary to obtain correct results. However, Matlab off-line simulation of the HHB showed that calculation of the transient stage requires over 20 iterations, which is too many and would significantly prolong the computation time. Thus, the nonlinear function is piecewise linearized into 10 sections so as to reduce the number of iterations.
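The role of (5.5) to (5.7) inside a Newton-Raphson loop can be illustrated with a one-node example in which a current source feeds the varistor in parallel with a linear conductance. This is a sketch under stated assumptions: the test circuit, the function names, and all numerical values are hypothetical, positive voltages are assumed, and the piecewise linearization into 10 sections is not reproduced.

```python
def varistor_norton(v, k_v, v_ref, alpha, i_ref):
    """Norton companion of the varistor at operating voltage v (v > 0 assumed)."""
    base = v / (k_v * v_ref)
    i_v = base ** alpha * i_ref                                   # (5.5)
    g_v = alpha * i_ref / (k_v * v_ref) * base ** (alpha - 1)     # (5.6)
    i_veq = i_v - g_v * v                                         # (5.7)
    return g_v, i_veq


def solve_node(j_src, g_par, params, v0, tol=1e-9, max_iter=100):
    """Newton-Raphson solution of j_src = g_par * v + i_v(v).

    Returns the converged voltage and the number of iterations used.
    """
    v = v0
    for iteration in range(1, max_iter + 1):
        g_v, i_veq = varistor_norton(v, **params)
        # Nodal equation of the linearized circuit.
        v_new = (j_src - i_veq) / (g_par + g_v)
        if abs(v_new - v) < tol:
            return v_new, iteration
        v = v_new
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical parameters chosen only to exercise the equations.
params = dict(k_v=1.0, v_ref=100.0, alpha=20, i_ref=1.0)
v_sol, n_iter = solve_node(j_src=2.0, g_par=1e-3, params=params, v0=100.0)
residual = 1e-3 * v_sol + (v_sol / 100.0) ** 20 - 2.0
```

Starting close to the protection voltage, the iteration settles quickly; a poor initial guess on such a steep characteristic needs considerably more iterations, which is the motivation for the piecewise linearization mentioned above.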

The time the HHB takes to block the DC current after a line fault occurs consists of two parts: the reaction time of the UFD, known as the breaking time, which takes about $\Delta t_1 = 2$ ms, and the fault clearance time $\Delta t_2$, which is decided by the inherent *I-V* characteristics of the varistor. After a line-to-ground fault occurs, the DC line current increases and reaches the breaking current at $\Delta t_1$:

$$I_{f,max} = \frac{U_{dc}}{r} \cdot (1 - e^{-\frac{\Delta t_1}{\tau}}) + I_{dc} \cdot e^{-\frac{\Delta t_1}{\tau}},\tag{5.8}$$

where *r* is fault path resistance and to facilitate calculation it is deemed as  $R_f$ , and  $\tau = L/r$  is the time constant. To quench that amount of current within  $\Delta t_2$ , the protection voltage  $V_{ref}$  should meet the following criterion:

$$V_{ref} = \frac{I_{f,max} \cdot L}{\Delta t_2} + U_{dc},\tag{5.9}$$

which indicates that the MOV's protection voltage, reached during fault clearance period, should be set higher than  $U_{dc}$ .
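A short numerical sketch of (5.8) and (5.9) is given below. The function names and the values used (200 kV, 1 kA pre-fault current, 1 Ω fault path, 0.1 H inductor, 1 ms clearance time) are illustrative assumptions and do not correspond to the parameters of the test system.

```python
import math


def breaking_current(u_dc, i_dc, r, l, dt1):
    """Fault current reached after the breaking time dt1, following (5.8)."""
    tau = l / r                       # time constant of the fault path
    decay = math.exp(-dt1 / tau)
    return u_dc / r * (1.0 - decay) + i_dc * decay


def protection_voltage(i_f_max, l, dt2, u_dc):
    """MOV protection voltage needed to quench i_f_max within dt2, following (5.9)."""
    return i_f_max * l / dt2 + u_dc


# Illustrative values only.
i_f_max = breaking_current(u_dc=200e3, i_dc=1e3, r=1.0, l=0.1, dt1=2e-3)
v_ref = protection_voltage(i_f_max, l=0.1, dt2=1e-3, u_dc=200e3)
```

The result of (5.9) always exceeds $U_{dc}$ for a positive breaking current, in line with the remark above.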

## 5.3.3 General HHB Unit Model

Three types of IGBT models are used to meet the different simulation requirements of accuracy and speed. For real-time HIL emulation that aims to acquire system-level performance, the Type-1 model is a good choice as the simulation speed of two-state switch is fast and utilization of FPGA resources is low. In the case that both high simulation speed and specifics of HHB are demanded, the Type-2 model with curve-fitting IGBT is preferred. When the generality of IGBT model is prioritized, the Type-3 model which adopts the nonlinear behavioral model becomes the best alternative.

With regard to the first type, the parallel IGBTs in an HHB unit can ultimately be replaced by an ideal two-state switch, with a small on-state resistance $R_{on}$ and a large off-state resistance $R_{off}$. For the second type, the IGBT is normally deemed a gate-voltage-controlled, time-varying current source. Therefore, both models have only two nodes and do not introduce any additional node to the HHB unit. As for the nonlinear behavioral model, which supposedly has N nodes, it adds an extra N-2 nodes to the originally four-node unit. Therefore, the total number of nodes reaches N+2, among which one is considered the virtual ground, as shown in Fig. 5.2(d). To achieve real-time execution with a time-step on the scale of hundreds of nanoseconds, the smallest possible matrix dimension is favored. Thus, the internal nodes in the UFD-LCS branch and the snubber circuit are merged. For the former, the node is naturally eliminated by taking the branch as one resistor, while for the latter, the diode is dominant when the capacitor $C_s$ is quickly charged during the turn-off stage of the MB, and when the MB turns on, $R_s$ decides the rate of discharging. Based on this, the RCD snubber is equivalent to an RC snubber, whose discretized model, by applying the Backward Euler integration method, can be written as

$$R_{eq} = R_{sD} + \frac{\Delta t}{C_s},\tag{5.10}$$

$$I_{h}(t) = \frac{(R_{sD} - R_{eq})}{R_{eq}^{2}} \cdot v_{s}(t) + \frac{R_{sD}}{R_{eq}} \cdot I_{h}(t - \Delta t),\tag{5.11}$$

where  $R_{sD}$  is the equivalent resistance of  $R_s$ -D pair,  $v_s(t)$  is the voltage over the snubber,  $I_h$ , which is iterative, represents the history current of  $C_s$  and  $\Delta t$  is the simulation timestep. Noticing that all HHB units are identical, it is not necessary to model all of them. Instead, an arbitrary unit is selected for conducting the modeling work, bringing in a significant speedup for off-line simulation, as well as a great reduction in hardware resource utilization when the HHB model is deployed on FPGA. Furthermore, the parallel IGBTs in an IGBT unit are identical and synchronized, indicating they can be represented by one of them to avoid additional nodes caused by the rest. Then, the matrix equation for the *N*-node HHB unit can be generally written as

$$\mathbf{U}_{\mathbf{HHB}} = \begin{bmatrix} G_{11} & mG_{12}^{I} & \dots & mG_{1N}^{I} \\ mG_{21}^{I} & mG_{22}^{I} & \dots & mG_{2N}^{I} \\ \vdots & \vdots & \ddots & \vdots \\ mG_{N1}^{I} & mG_{N2}^{I} & \dots & mG_{NN}^{I} \end{bmatrix}^{-1} \begin{bmatrix} J_{1} \\ mJ_{2}^{I} \\ \vdots \\ mJ_{N}^{I} \end{bmatrix},\tag{5.12}$$

where elements  $G_{11}$  and  $J_1$  take the form of

$$G_{11} = G_v + R_{eq}^{-1} + (UFD + LCS)^{-1} + mG_{11}^I,\tag{5.13}$$

$$J_1 = J_s(t) - I_{veq}(t) - I_h(t) + mJ_1^I.\tag{5.14}$$

In (5.12), m is the number of IGBTs in parallel. The elements from IGBT, which are distinguished by superscript I, are multiplied with m since these parallel IGBTs are identical.
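The discretized snubber of (5.10) and (5.11) can be checked with a simple step response. The sketch below assumes that the branch current is $v_s/R_{eq}$ plus the history current from the previous step, which is consistent with the sign of $I_h$ in (5.14); the function name and the component values are illustrative assumptions.

```python
def snubber_step(v_s, i_h_prev, r_sd, c_s, dt):
    """Advance the discretized RC snubber by one time-step.

    Returns (branch current, updated history current).
    """
    r_eq = r_sd + dt / c_s                                          # (5.10)
    # Assumed companion form: conductance 1/R_eq in parallel with I_h.
    i_branch = v_s / r_eq + i_h_prev
    i_h = (r_sd - r_eq) / r_eq ** 2 * v_s + r_sd / r_eq * i_h_prev  # (5.11)
    return i_branch, i_h


# Illustrative values only: 100 V step, R = 10 ohm, C = 1 uF, time-step 0.1 us.
R_SD, C_S, DT, V_STEP = 10.0, 1e-6, 1e-7, 100.0
i_h = 0.0            # capacitor initially discharged
currents = []
for _ in range(100):  # one time constant, R * C = 10 us
    i_branch, i_h = snubber_step(V_STEP, i_h, R_SD, C_S, DT)
    currents.append(i_branch)
```

After one time constant the current has decayed to roughly $e^{-1}$ of its initial value, as expected for a series RC branch, with the small deviation typical of Backward Euler integration.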

# 5.3.4 Two-Node IGBT Models

Both the two-state switch model (TSSM) and the curve-fitting model (CFM) have only two nodes. As one of the most popular models in EMT simulation for its simplicity, the TSSM realizes the function of an IGBT by switching between $R_{on}$ and $R_{off}$ when it is commanded to turn on and off, respectively, as shown in Fig. 5.4(a). Thus, the switching transient cannot be shown by this model, and the on-state voltage and current are not sufficiently accurate for power loss calculation, let alone thermal analysis. However, in this case, (5.12) is one-dimensional since only one node with unknown voltage is left in the HHB unit, and it can easily be obtained by solving the following algebraic equation:

$$U(t) = \frac{\sum J(t)}{\sum G} = \frac{J_s(t) - I_{veq}(t) - I_h(t)}{\frac{1}{UFD + LCS} + \frac{1}{R_{eq}} + G_v + G_{MB}},\tag{5.15}$$

where $G_{MB}$ is the conductance of the IGBT unit and consequently is the reciprocal of its resistance.

The CFM overcomes the aforementioned shortcomings, as its on-state resistance can be obtained from the IGBT static characteristics and the switching features can be preset in the program. As shown in Fig. 5.4(b), the *V*-*I* relation of the IGBT is embodied by a piecewise linear resistor that can be expressed by a set of functions taking the form of

$$R_{ss,i} = \frac{V_{CE}}{I_C} = \frac{1}{k_i(T_j)} + \frac{b_i(T_j)}{k_i(T_j) \cdot I_C},\tag{5.16}$$

where subscript *i* denotes the *i*<sup>th</sup> linear segment, and $k_i$ and $b_i$ are linear functions of the junction temperature.

Figure 5.4: (a) IGBT two-state switch model, (b) steady-state representation of IGBT curve-fitting model, and (c) controlled current source for the turn-off of IGBT curve-fitting model.

While its turn-off voltage shape is prone to change, the IGBT turn-off current has a robust shape, which is critical to establishing the curve-fitting model for the transient stage [69]. These current values, measured at different times by either experiment or simulation in commercial software and stored in a look-up table (LUT), are programmed as a time-controlled current source so that its value after a given number of time-steps can be accessed, as depicted in Fig. 5.4(c). Although (5.15) is still applicable to the CFM-based HHB during steady state, its model for the turn-off stage differs from the TSSM, with the nodal voltage expressed by

$$U(t) = \frac{\sum J(t)}{\sum G} = \frac{J_s(t) - I_{veq}(t) - I_h(t) - I_{MB}(t)}{\frac{1}{UFD + LCS} + \frac{1}{R_{eq}} + G_v},\tag{5.17}$$

where $I_{MB}$ is the programmed current source representing the IGBT. The combination of (5.15) and (5.17) makes the CFM a little more complex than the TSSM because the way the nodal voltage is solved depends on the operating condition of the switch. Thus, a status indicator *t* is introduced to decide which of the two equations should be used.
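The switching between (5.15) and (5.17) can be sketched as follows. The sketch treats the status indicator as a flag plus a count of time-steps since turn-off began, which is an interpretation rather than something stated in the text; the function names, the LUT contents, and all numerical values are hypothetical.

```python
def cfm_turn_off_current(lut, steps_since_turn_off):
    """Look up the programmed turn-off current; hold the last entry afterwards."""
    index = min(steps_since_turn_off, len(lut) - 1)
    return lut[index]


def hhb_node_voltage(j_s, i_veq, i_h, r_ufd_lcs, r_eq, g_v,
                     turning_off, g_mb=0.0, i_mb=0.0):
    """Nodal voltage of a two-node-IGBT HHB unit.

    Steady state uses (5.15) with the IGBT as conductance g_mb;
    the turn-off stage uses (5.17) with the IGBT as current source i_mb.
    """
    g_sum = 1.0 / r_ufd_lcs + 1.0 / r_eq + g_v
    if turning_off:
        return (j_s - i_veq - i_h - i_mb) / g_sum   # (5.17)
    return (j_s - i_veq - i_h) / (g_sum + g_mb)     # (5.15)


# Hypothetical turn-off current samples in amperes (not measured data).
lut = [900.0, 600.0, 200.0, 50.0, 0.0]
u_on = hhb_node_voltage(1000.0, 0.0, 0.0, r_ufd_lcs=1e6, r_eq=1e3, g_v=0.0,
                        turning_off=False, g_mb=1e3)
u_off = hhb_node_voltage(1000.0, 0.0, 0.0, r_ufd_lcs=1e6, r_eq=1e3, g_v=0.0,
                         turning_off=True, i_mb=cfm_turn_off_current(lut, 0))
```

In the conducting state the node voltage stays near the small on-state drop, whereas during turn-off the current not carried by the IGBT is forced into the remaining branches and the voltage rises sharply.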

# 5.3.5 IGBT Nonlinear Behavioral Model

One prominent merit of the aforementioned two-node models is efficient computation. However, they both have limitations: the TSSM is incapable of showing switching details and its power calculation is not accurate, while the CFM lacks versatility since its transient current waveform does not change with the electromagnetic environment, and consequently the curve has to be amended. There are models that have generality while at the same time providing details of a switch, such as the fourth-order nonlinear behavioral model (NBM). Their main disadvantage is complexity and relatively slow computational speed. To facilitate HIL emulation as well as simulation of circuits comprised of such models, the NBM is simplified in a way that maintains its accuracy.

#### 5.3.5.1 IGBT Fourth-order Behavioral Model

In Chapter 4, the full nonlinear behavioral model of the IGBT was introduced. As shown in Fig. 5.5(a), it mainly comprises the metal-oxide semiconductor field-effect transistor (MOSFET) behavior, represented by the voltage-controlled current source $i_{mos}$ and inter-electrode capacitors $C_{ce}$, $C_{ge}$ and $C_{cg}$; the tail current $i_{tail}$, which is controlled by $v_{tail}$, the voltage over $C_{tail}$ and $R_{tail}$; and a piecewise linear diode D that sets the minimum on-state collector-emitter voltage drop.

The two controlled current sources are the main components deciding the IGBT's static and dynamic performance. The MOSFET behavior was described in (4.30), and for HHB HIL implementation, it is reorganized as

$$i_{mos} = \begin{cases} 0, & v_{Cge} \leq V_t, \\ f_1(v_{Cge}) \cdot v_d^{(z+1)} - f_2(v_{Cge}) \cdot v_d^{(z+2)}, & v_d < y \cdot (v_{Cge} - V_t)^{(1/x)}, \\ (a \cdot (v_{Cge} - V_t) + C)^{-1} + b \cdot (v_{Cge} - V_t) - C^{-1}, & \text{otherwise}, \end{cases}\tag{5.18}$$

where $V_t$ is the IGBT's gate threshold voltage, a, b, x, y, z and C are static parameters, $v_d$ and $v_{Cge}$ are the terminal voltages of $i_{mos}$ and $C_{ge}$, respectively, and $f_1(v_{Cge})$ and $f_2(v_{Cge})$, given by

$$f_1(v_{Cge}) = \frac{z+2}{y^{\frac{z+1}{x}}} (a+b(v_{Cge}-V_t))^{-1} \cdot (v_{Cge}-V_t)^{\frac{2x-z-1}{x}},\tag{5.19}$$

$$f_2(v_{Cge}) = \frac{z+1}{y^{\frac{z+2}{x}}} (a+b(v_{Cge}-V_t))^{-1} \cdot (v_{Cge}-V_t)^{\frac{2x-z-2}{x}},\tag{5.20}$$

are nonlinear functions of the voltage over  $C_{ge}$ , which shares collector-emitter voltage with the nonlinear capacitor  $C_{cg}$ , meaning that nonlinearities from IGBT capacitances are considered. The tail phenomenon that appears only when the IGBT turns off, is dependent on both  $v_{tail}$  and  $i_{mos}$ :

$$i_{tail} = \begin{cases} 0, & \frac{v_{tail}}{R_{tail}} \le i_{mos}, \\ \left(\frac{v_{tail}}{R_{tail}} - i_{mos}\right) \cdot i_{trat}, & \frac{v_{tail}}{R_{tail}} > i_{mos}, \end{cases}\tag{5.21}$$

where $i_{trat}$ is a ratio that decides the emergence of tail current.

Figure 5.5: (a) IGBT fourth-order nonlinear behavioral model, (b) IGBT second-order nonlinear behavioral model, (c) general representation of IGBT behavioral model, and (d) linearized discrete-time equivalent model for electromagnetic transient analysis.
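The tail current rule of (5.21) reduces to a clamped difference, as the following minimal sketch shows. The function name and the numerical values are illustrative assumptions only.

```python
def tail_current(v_tail, r_tail, i_mos, i_trat):
    """Tail current according to (5.21).

    The tail appears only when v_tail / R_tail exceeds the MOSFET current,
    which happens while the IGBT is turning off.
    """
    excess = v_tail / r_tail - i_mos
    return excess * i_trat if excess > 0.0 else 0.0


# Illustrative values only.
i_tail_on = tail_current(v_tail=10.0, r_tail=0.01, i_mos=1500.0, i_trat=0.2)
i_tail_off = tail_current(v_tail=10.0, r_tail=0.01, i_mos=200.0, i_trat=0.2)
```

While the MOSFET current is large, the first call yields zero; once $i_{mos}$ collapses during turn-off, the second call yields a non-zero tail.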

As can be seen, the full model contains 5 nodes, which means any circuit containing it will yield at least a 4×4 admittance matrix, and solving its corresponding equation requires multiple Newton-Raphson iterations and more often than not it is prone to divergence. Thus, model simplification is carried out to improve its computational speed as well as robustness to divergence.

#### 5.3.5.2 Parameters Extraction

Different IGBT model types are distinguished by the parameters which can be extracted from the device datasheet using the IGBT tool in the off-line simulation tool SaberRD<sup>®</sup>. Similar to the CFM, the parameters of IGBT nonlinear behavioral model can be categorized as the static set and the dynamic set reflecting individual characteristics. The former mainly concentrates on  $i_{mos}$ , while the latter is applied to the remaining components.

It should be pointed out that the curves and data in the device datasheet are experimentally measured, which means that the linearities and nonlinearities, including the nonlinear nature of IGBT capacitances, are fully considered and can be reflected by the NBM. A number of curves from the device datasheet, including typical on-state characteristics, typical transfer characteristics, and output characteristics, are imported into the IGBT tool for extracting static parameters such as a, b, x, y, z, $V_t$, and $R_g$. In the meantime, dynamic features, such as the relationship between typical capacitances and collector-emitter voltage, turn-on time and turn-off time, are used for obtaining the remaining parameters shown in Appendix A to ensure that transient characteristics are sufficiently reflected and properly modeled as well. Specific procedures for parameter extraction are provided by SaberRD<sup>®</sup> [136].

$$\mathbf{G^{I}} = \begin{bmatrix} G_{mosvd} + G_{Ccg} & G_{mosvcge} - G_{Ccg} & -G_{mosvcge} - G_{mosvd} \\ -G_{Ccg} & G_{Cge} + G_{Ccg} + R_{g}^{-1} & -G_{Cge} - R_{g}^{-1} \\ -G_{mosvd} & -G_{Cge} - R_{g}^{-1} - G_{mosvcge} & G_{Cge} + G_{mosvd} + G_{mosvcge} + R_{g}^{-1} \end{bmatrix},\tag{5.23}$$

$$\mathbf{J^{I}} = \begin{bmatrix} -I_{moseq} - I_{Ccgeq}, I_{Ccgeq} + \frac{V_{g}}{R_{g}} - I_{Cgeeq}, I_{Cgeeq} + I_{moseq} - \frac{V_{g}}{R_{g}} \end{bmatrix}.\tag{5.24}$$

#### 5.3.5.3 Sensitivity Analysis

As can be seen from the IGBT nonlinear behavioral model, each node links to several branches, making the element  $G_{ij}^{I}$  in the admittance matrix a sum of individual admittances. Thus, when calculating the matrix, a considerable amount of time will be spent on addition and subtraction operations. Based on Jacobian sensitivity analysis, the matrix can further be simplified. To accomplish that goal, the weakly coupled items, which can be identified by putting the IGBT into a test circuit, have to be distinguished from those that are dominant.

At an arbitrary node that connects to N branches, if the conductance or transconductance of the $k^{th}$ branch is negligible at all times compared with the sum of the rest, that is, if

$$\frac{\partial i_k}{\partial f_k(v_1, v_2, \ldots)} \ll \sum_{j=1,\, j \neq k}^{N} \frac{\partial i_j}{\partial f_j(v_1, v_2, \ldots)},\tag{5.22}$$

then that item can be removed from the admittance matrix to reduce the number of algebraic operations. The analysis showed that $G_{tailvd}$, $G_{tailvcge}$ and $G_{Cce}$ can be omitted. Similarly, sensitivity analysis of **J**<sup>I</sup> leads to the removal of $I_{taileq}$.
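The screening criterion of (5.22) can be sketched as a check over recorded operating points. The numerical ratio used to represent "much smaller than", the function name, and the sample values are assumptions made for illustration; they are not results of the test circuit.

```python
def removable_terms(history, ratio=1e-3):
    """List the terms that stay negligible at every recorded time point.

    history: list of snapshots, each mapping a term name to the
             (trans)conductance of one branch connected to the node.
    ratio: hypothetical numerical stand-in for the "much smaller than" test.
    """
    removable = []
    for name in history[0]:
        negligible = all(
            abs(snap[name]) <= ratio * sum(abs(value)
                                           for other, value in snap.items()
                                           if other != name)
            for snap in history
        )
        if negligible:
            removable.append(name)
    return removable


# Made-up snapshots in siemens for a single node.
history = [
    {"G_mosvd": 5.0, "G_Ccg": 2.0, "G_tailvd": 1e-5},
    {"G_mosvd": 0.5, "G_Ccg": 3.0, "G_tailvd": 2e-5},
]
dropped = removable_terms(history)
```

A term is dropped only if it is negligible in every snapshot, matching the "at all times" condition in the text.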

#### 5.3.5.4 Model Parallelization

Noticing that the IGBT behavior can be largely categorized into two types, i.e., the basic MOSFET behavior determined by $v_{Cge}$ and $v_d$, and the tail current phenomenon, which according to the sensitivity analysis can be deemed an augmenting part solely dependent on $v_{tail}$, parallelization of the full behavioral model can be achieved. The former mainly includes components such as $i_{mos}$, $C_{cg}$, $C_{ge}$ and $R_g$, while the latter is a combination of $R_{tail}$, $C_{tail}$ and $i_{tail}$. Thus, the overall model can be deemed a superposition of both behaviors, and consequently it is possible to detach these two parts from each other so as to reduce the number of nodes in each part. Since $i_{tail}$ can be deemed a current- and voltage-controlled current source on which its own terminal voltage has no impact, there is no necessity to physically connect it to other circuit components. Therefore, the tail current itself constitutes an independent circuit. With regard to the parallel $R_{tail}$-$C_{tail}$ combination, the voltage across it is so small compared with that of $i_{mos}$ that its existence has negligible influence on the MOSFET behavior. Thus, it can also be detached. Then the superposition model of the IGBT can be derived as a collection of several sub-models, as shown in Fig. 5.5(b). The second-order MOSFET sub-circuit is the only part that participates in circuit nodal voltage calculation, while the other three are used for result correction, as shown in the general representation of the IGBT behavioral model in Fig. 5.5(c). Hence, the collector current is calculated as

$$i_C = i_{mos} + i_{Ccg} + i_{tail},\tag{5.25}$$

where $i_{mos}$ and $i_{Ccg}$ are obtained by solving the circuit in which the MOSFET part is located, and $i_{tail}$ is calculated directly by (5.21). Similarly, the device's voltage can be deemed a summation of $V_{on}$ and $v_d$. In Fig. 5.5(d), the discretized and linearized circuit for the MOSFET part, which contains merely one transconductance $G_{mosvcge}$, is shown, where the arrow with a dashed line indicates that $G_{mosvcge}$ is related to the node it points to. Based on that, (5.23) and (5.24) can be obtained, and the dimension of the matrix equation reduces to 3.
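The assembly of (5.23) and (5.24) can be sketched as below. The node ordering (collector, gate, emitter) is inferred from the matrix entries rather than stated in the text, and the function name and numerical values are illustrative assumptions.

```python
def igbt_mosfet_companion(g_mosvd, g_mosvcge, g_ccg, g_cge, r_g,
                          i_moseq, i_ccgeq, i_cgeeq, v_g):
    """Assemble G^I of (5.23) and J^I of (5.24) for the MOSFET sub-circuit.

    Rows are presumed to correspond to collector, gate, and emitter.
    """
    g_rg = 1.0 / r_g
    g_matrix = [
        [g_mosvd + g_ccg, g_mosvcge - g_ccg, -g_mosvcge - g_mosvd],
        [-g_ccg, g_cge + g_ccg + g_rg, -g_cge - g_rg],
        [-g_mosvd, -g_cge - g_rg - g_mosvcge,
         g_cge + g_mosvd + g_mosvcge + g_rg],
    ]
    j_vector = [
        -i_moseq - i_ccgeq,
        i_ccgeq + v_g * g_rg - i_cgeeq,
        i_cgeeq + i_moseq - v_g * g_rg,
    ]
    return g_matrix, j_vector


# Illustrative values only.
G, J = igbt_mosfet_companion(g_mosvd=2.0, g_mosvcge=0.5, g_ccg=0.25, g_cge=0.125,
                             r_g=4.0, i_moseq=1.0, i_ccgeq=0.5, i_cgeeq=0.25,
                             v_g=16.0)
row_sums = [sum(row) for row in G]
```

Every row of the assembled matrix and the entries of the current vector sum to zero, as expected for a sub-circuit that has not yet been referenced to ground; the reference is supplied once these entries are merged into the HHB unit equation (5.12).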

Such improvements of the full model lead to multi-fold benefits: the new model is as precise as its original counterpart, and the number of N-R iterations reduces substantially due to fewer nonlinearities as well as a smaller matrix dimension; meanwhile, the maximum time-step for computing the model is also increased so that simulation can run much faster.

## 5.3.6 Electro-Thermal Network

The heat induced by IGBT power loss raises the junction temperature, which in turn affects the HHB performance. Meanwhile, determination of the size of the IGBT array in the LCS and MB units also relies heavily on the junction temperature. Thus, the inherent electro-thermal network established in Fig. 4.6 is included as part of an accurate IGBT model. The cooling system is not included [140] in the MB because the selected IGBT type usually has enough capacity to withstand the DC line current for 2 ms, while the adoption of a cooling system in the LCS can be determined by calculating the IGBT junction temperature using the electro-thermal network. It should be pointed out that this network is suitable for all three proposed HHB models, and only one detailed electro-thermal network corresponding to the selected IGBT is established, since all IGBTs including their electro-thermal networks are identical. Therefore, their operation status, such as the junction temperature, is immediately known when computation of the selected HHB unit is completed.

As illustrated in the last chapter, linear functions taking the form of (4.39) are applied to reflect the impact of junction temperature on IGBT performance. The coefficients of the thermal network are listed in Table 5.1.

| Parameter | Coefficient k            | Coefficient p             |
|-----------|--------------------------|---------------------------|
| $V_t$     | -0.012221                | 8.018885                  |
| a         | $38.3699 \times 10^{-6}$ | 0.004176                  |
| b         | $-0.7738 \times 10^{-6}$ | $464.9903 \times 10^{-6}$ |
| x         | -0.0013681               | 1.353853                  |
| y         | $-852.9 \times 10^{-6}$  | 1.475723                  |
| z         | $-982.22 \times 10^{-6}$ | 1.062776                  |

Table 5.1: IGBT parameters as a function of junction temperature
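The use of Table 5.1 can be sketched as below, assuming that the linear function referred to as (4.39) has the form $k \cdot T_j + p$; that form, the function name, and the temperature unit are assumptions, since (4.39) is not reproduced in this chapter.

```python
# Coefficients (k, p) copied from Table 5.1.
TABLE_5_1 = {
    "V_t": (-0.012221, 8.018885),
    "a": (38.3699e-6, 0.004176),
    "b": (-0.7738e-6, 464.9903e-6),
    "x": (-0.0013681, 1.353853),
    "y": (-852.9e-6, 1.475723),
    "z": (-982.22e-6, 1.062776),
}


def igbt_parameters(t_j):
    """Evaluate every static parameter at junction temperature t_j.

    Assumes the linear form k * T_j + p; the unit of t_j follows (4.39).
    """
    return {name: k * t_j + p for name, (k, p) in TABLE_5_1.items()}


params_cold = igbt_parameters(0.0)
params_hot = igbt_parameters(100.0)
```

Under this assumed form, the negative coefficient k of $V_t$ means that the gate threshold voltage falls as the junction heats up.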

# 5.4 Hardware Implementation on FPGA

The hardware design of the proposed HHB integrated into the MTDC system is targeted at the Virtex 7 FPGA xc7vx485tffg1761-2. As shown in the setup in Fig. 2.5, the FPGA board is connected to the oscilloscope via the DAC34H84 EVM, which converts digital signals into analog ones so that the results can be displayed as waveforms. To achieve a pipelined structure, the overall system is divided into a number of sub-circuits, and hardware modules are designed specifically for each one of them, such as the three types of HHBs and the thermal network, the transmission line with fault stimulus, the controllers for the rectifier and the inverters and their DC yards, and the MMC inner-loop controller and its specific circuits. As in previous chapters, Vivado HLS<sup>®</sup> is employed to enable C/C++ coding of a sub-circuit in the form of a function whose input and output variables are turned into the corresponding hardware module's physical ports after being synthesized and exported as an IP core. Table 5.2 shows the individual latencies obtained from Vivado HLS<sup>®</sup> synthesis as well as the FPGA resource utilization of some key modules of the design.

Table 5.2 demonstrates that the longest hardware delay in the MTDC system employing either HHB-1 or HHB-2 can be attributed to the *abc-dq* transform module, which takes up to 78 clock cycles. However, it is not the factor that determines whether the design can attain real-time execution; instead, the HHB module is decisive because at least one N-R iteration is required in calculating the transients that take place after activating line protection. Therefore, the actual latencies for HHB-1 and HHB-2 are doubled to around 80 $T_{clk}$, considering that a few intervals are inserted between two calculations. Hence, to achieve real-time execution, the time-step should be larger than that value. In contrast, the NBM HHB, according to Table 5.2, has the largest maximum latency of 125 $T_{clk}$ among all of the components and is therefore the determinant of HIL emulation speed. Its varying latency makes output results last for different periods at different stages, leading to distorted waveforms. To avoid that, a timer is included to unify the actual latency of the Type-3 model to a fixed 125 $T_{clk}$. Meanwhile, each type of HHB module has a very low percentage of resource utilization compared with the power converter. And since only one HHB unit containing one IGBT model needs to be designed into a hardware module, the resource utilization for HHB-3 is quite low, let alone the other two types where much simpler IGBT models are employed. The Type-2 model normally requires more resources than the Type-1 model, but after transferring calculation of (5.16) to the electro-thermal network, it has a similar scale to the latter and its latency is also reduced from over 50 $T_{clk}$ to 38 $T_{clk}$.

| Module  | Description     | Latency | LUT           | FF           | DSP         |
|---------|-----------------|---------|---------------|--------------|-------------|
| ABCDQ   | abc-dq          | 77-78   | 4723 (1.56%)  | 3044 (0.50%) | 34 (1.21%)  |
| DQABC   | dq-abc          | 75-76   | 4628 (1.53%)  | 3041 (0.50%) | 34 (1.21%)  |
| LPR     | Protection      | 4       | 665 (0.22%)   | 308 (0.05%)  | 2 (0.07%)   |
| MMC     | MMC Ctrl        | 38      | 6362 (2.10%)  | 4323 (0.72%) | 35 (1.25%)  |
| REC/INV | Rec/Inv control | 35      | 867 (0.29%)   | 697 (0.11%)  | 10 (0.36%)  |
| DCYARD  | DC yard         | 43      | 3235 (1.07%)  | 1984 (0.33%) | 18 (0.64%)  |
| TL      | Line fault      | 31      | 1646 (0.54%)  | 1135 (0.19%) | 10 (0.36%)  |
| MMC-AC  | AC part         | 23      | 3588 (1.18%)  | 2021 (0.33%) | 15 (0.54%)  |
| MMC-DC  | DC part         | 13      | 486 (0.16%)   | 409 (0.07%)  | 8 (0.29%)   |
| ITAIL   | tail current    | 26      | 1291 (0.43%)  | 737 (0.12%)  | 5 (0.18%)   |
| THERM   | Thermal network | 31      | 2893 (0.95%)  | 1779 (0.29%) | 15 (0.54%)  |
| HHB-1   | Type-1          | 40      | 5431 (1.79%)  | 2233 (0.37%) | 23 (0.82%)  |
| HHB-2   | Type-2          | 38      | 5032 (1.66%)  | 2662 (0.44%) | 26 (0.93%)  |
| HHB-3   | Type-3          | 67-125  | 14471 (4.77%) | 6502 (1.07%) | 106 (3.79%) |

Table 5.2: Latencies and hardware resource utilization of principal hardware modules in the 3-terminal HVDC system

Fig. 5.6 depicts the pipelined hardware structure of a portion of the MTDC system as well as signal exchange routes, in which all hardware modules sealed in blocks achieve parallelism. Those modules related to MMC AC part and its control are represented by the module Grid-connected MMC, which receives reactive and active power or DC voltage orders and generates AC side current and voltage information in dq frame for the DC part of MMC, where the transmitted power is obtained and converted to DC current and voltage. Then, the incident pulse  $v_2^i$  is calculated and sent to the rectifier DC yard so as to obtain two mesh currents, based on which, other variables, such as DC line current can also be acquired. The module for Type-3 circuit breaker is shown as an example in the figure, while the other two types have the same ports. The DC line current acts as an excitation, and based on the status of the IGBT, the nodal voltages can be calculated. The signal *t* is introduced specifically for Type-2 HHB, for indication of operation status and consequently the transient current can be ascertained in the LUT. The voltage and current obtained directly or indirectly are delivered to the electro-thermal network so that the power loss and the junction temperature can be obtained, which are in turn sent to the circuit breaker module to update IGBT parameters for the next time-step. The virtual block N-R Iteration is not a hardware module designed by Vivado HLS<sup>®</sup>. In fact, it is realized



Figure 5.6: Hardware design of HHB integrated with MMC in a pipelined structure on the FPGA and signal flow routes.

by VHDL coding in Vivado<sup>®</sup> where all modules are arranged and connected to each other to form the top-level module.

A proper operation sequence for hardware modules is coordinated by a top-level state machine, as shown in Fig. 5.7 for the MTDC system with Type-1 and Type-2 HHBs. With regard to Type-3, the only difference is that the criterion in State  $S_5$  changes to whether the calculation of HHB-3 has been completed. The overall system starts to operate under the command *rst*, which is generated by pressing the reset button on the FPGA board. External signals such as three-phase AC grid voltages and carriers for MMC modulation are stored in ROM, and those data are accessed prior to the operation of all hardware modules. State  $S_5$  lasts 43  $T_{clk}$  from the moment the start order is given, which ensures that by the end of that state, HHB-1 and HHB-2 have already completed their first computation and are waiting to enter a new phase. Then, if the results converge, those finished modules stay idle in State  $S_{10}$  until one time-step runs out, by which time Park's transformation and its inverse have also finished. On the contrary, if the results have not converged, only the nonlinear HHB module is executed again until it converges or the maximum number of iterations has been reached. For Type-1 and Type-2 HHBs, the maximum



Figure 5.7: Top-level state machine for coordinated operation of MTDC hardware modules.

iteration number can be set to 2, while for the NBM-based HHB, 3 iterations are required. The selection of the operating frequency is a trade-off between time-step and FPGA capability. For the first two types of models, the frequency is set to 100MHz, which means  $T_{clk}$=10ns, and consequently the minimum time to wait to synchronize to real-time in State  $S_{10}$  is approximately 100ns. Since the nominal latency of the Type-3 model is unified to 125  $T_{clk}$, the frequency is chosen as 125MHz so that  $T_{clk}$=8ns, and the design executes three times slower than real-time.
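
The timing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original design flow: it assumes a simulation time-step of 1µs (a value not stated in this section, chosen because it is consistent with the quoted figures), and the function name `slowdown_factor` is hypothetical.

```python
def slowdown_factor(latency_clk, f_clk_hz, iterations, time_step_s):
    """Ratio of compute time per time-step to the simulated time-step.

    A value of at most 1 means the design can keep up with real time.
    """
    t_clk = 1.0 / f_clk_hz
    return latency_clk * t_clk * iterations / time_step_s


TIME_STEP = 1e-6  # assumed time-step in seconds (not given in this section)

# Type-1/Type-2: about 80 T_clk in total (the doubling is already included), 100 MHz.
type12 = slowdown_factor(80, 100e6, 1, TIME_STEP)

# Type-3: latency unified to 125 T_clk, 125 MHz clock, 3 N-R iterations.
type3 = slowdown_factor(125, 125e6, 3, TIME_STEP)

print(f"Type-1/2: {type12:.2f} x time-step")
print(f"Type-3:   {type3:.2f} x time-step")
```

Under the assumed time-step, the first two models finish within one time-step, whereas the Type-3 model needs three time-steps of wall-clock time, which agrees with the factor of three quoted above.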

# 5.5 HIL Emulation Results

The functions of the HHB in guaranteeing normal operation of healthy power transmission corridors and isolating the faulty section are tested by HIL emulation of the three-terminal HVDC system, in which the center of transmission line 1 is subjected to a short circuit while line 2 keeps operating, as indicated in Fig. 5.1. The reaction of the overall system and the performance of some of its components, especially the IGBT, are investigated and validated by comparison with results from the industry standard transient simulation tools PSCAD/EMTDC<sup>®</sup> and SaberRD<sup>®</sup>, respectively. The reason is that the former tool is well-known for its accuracy and reliability in system-level simulation, while simulation by the latter tool is routinely conducted for verification of a power converter design prior to constructing a prototype, since the semiconductor switch models in its library have been experimentally validated [52, 145] and consequently are deemed sufficiently accurate. The MMC models established in such tools are AVM-based since they are particularly inefficient

or even unable to compute a circuit with nearly a thousand nodes if the detailed switching function model is applied to the 17-level MMC adopted in the simulation together with two conventional full-scale HHBs. Specific parameters of the MTDC system are given in Appendix B, which shows the combined protection voltage of all MOVs is 340kV, and the number of HHB units is 100, which means that the protection voltage of  $MOV_u$  is 3.4kV. It can be estimated by applying (5.9) that the fault clearance time is close to 4ms considering the actual MOV voltage during that period is a little lower than its protection voltage.

## 5.5.1 Device-Level Performance

The device-level behavior of the HHB mainly includes the voltage and current waveforms of its interior components as well as the IGBT's junction temperature. SaberRD<sup>®</sup> is chosen for results validation since it provides detailed nonlinear behavioral IGBT models with thermal network, such as the chosen *igbt1\_3x* model.

Fig. 5.8 shows the transient waveforms of an MB IGBT with RCD snubber during the turn-off process. The HIL emulation results of the nonlinear behavioral IGBT model are shown in Fig. 5.8(a), which indicates that during the  $23\mu$s period,  $v_{CE}$  slowly rises to around the varistor's protection voltage because the diode is conducting and consequently the RCD snubber circuit is equivalent to a capacitor that is being gradually charged. Even though the device still turns off within about  $1.6\mu$s, an obvious tail current can also be observed. The  $v_{CE}$  curve of the CFM is not shown since the result is identical, while the  $i_C$  waveform differs slightly, with a turn-off time of 2.5$\mu$s to fit the RC snubber case. Fig. 5.8(d) demonstrates the same process by SaberRD<sup>®</sup>, which validates the correctness of the second-order NBM as well as the partitioning approach. It should be pointed out that variables shown on the oscilloscope are annotated based on the actual time the process needs to complete, and therefore the voltage rise time  $t_r^v$  and IGBT turn-off time  $t_f$  are divided by a factor of 3, which is the factor by which the HIL emulation runs slower than real-time. In Fig. 5.8(b) and (e), the turn-off process of the TSSM IGBT is shown to illustrate the usefulness of complex IGBT models. Although the voltage rise process is the same as that of the NBM IGBT, the current waveform is straightened. As a consequence, the power loss during the transient stage is much smaller, while in the NBM IGBT case, the power loss soars to 37kW, as shown in Fig. 5.8(c). The on-state power losses of the CFM and NBM are almost the same, with the former being slightly closer to that of SaberRD<sup>®</sup>. However, the TSSM with an estimated on-state resistance of the IGBT is incapable of describing the power loss accurately. The junction temperature variation is shown in Fig. 5.8(f). 
Since a complete loop in the electro-thermal network cannot be formed in TSSM, its temperature remains slightly above 25°C. On the contrary, the IGBT junction temperature in the other two models rises shortly after 50ms because of the power dissipation induced by the DC line current transferred from UFD-LCS branch to the MB branch by forcing the LCS to turn off immediately after the protection sequence is activated by the line fault which occurs at



Figure 5.8: Turn-off performance of HHB with RCD snubber circuit: (a) MB NBM model of IGBT from HIL emulation, (b) MB TSSM model of IGBT from HIL emulation, (c) single MB IGBT power loss, (d) MB NBM model of IGBT from SaberRD<sup>®</sup>, (e) MB TSSM model of IGBT from PSCAD/EMTDC<sup>®</sup>, and (f) MB IGBT junction temperature. Oscilloscope horizontal axes settings:  $20\mu s/div$ .

50ms. As can be seen, both the NBM and CFM approaches lead to curves that closely agree with the one from SaberRD<sup>®</sup> simulation, meaning that HHBs employing these two models are sufficient for HIL emulation. Moreover, in this case, the latter shows its advantage by running in real-time.

In comparison, Fig. 5.9 shows voltage and current waveforms of HHBs with RC snubber. Fig. 5.9(a) and Fig. 5.9(b) are turn-off waveforms of the MB with NBM and CFM IGBT respectively, and Fig. 5.9(d) shows the simulated  $v_{CE}$-$i_C$  curves from SaberRD<sup>®</sup>, which validates the proposed models and partitioning approach. The IGBT turn-off process becomes a little longer and the MB terminal voltage rises more quickly, from approximately  $23\mu s$  with the RCD snubber to around  $3.6\mu s$ . The two current curves for the RCD and RC snubber cases of the CFM IGBT are the same: both have a turn-off time of  $2.5\mu s$, showing that the CFM is incapable of adjusting to variations in the electromagnetic environment unless the LUT is modified. With the RCD snubber,  $v_{CE}$  stays low during the IGBT turn-off process and consequently the power loss is small, while in the RC snubber case, the rise of  $v_{CE}$  occurs simultaneously with the fall of  $i_C$, and they cross at about 0.8kV(kA), which indicates the power loss is much larger, as shown in Fig. 5.9(c). The on-state power loss is virtually identical to that of the MB with RCD snubber, and consequently the junction temperature in both cases is approximately 26.1°C at the beginning of 52ms before the IGBT turn-off process; nevertheless, its transient power loss is over ten times higher at about 500kW, leading to an instantaneous junction temperature jump to around 26.8°C, an increment of 0.7°C. By comparison, the temperature jump caused by IGBT turn-off in the RCD snubber case is much smaller, estimated at 0.08°C, indicating the effectiveness of the RCD snubber in



Figure 5.9: Turn-off performance of HHB with RC snubber circuit: (a) MB NBM model of IGBT from HIL emulation, (b) MB CFM model of IGBT from HIL emulation, (c) single MB IGBT power loss, (d) MB NBM IGBT from SaberRD<sup>®</sup>, (e) comparison between RCD and RC snubber, and (f) MB IGBT junction temperature. Oscilloscope horizontal axes settings: (a)  $10\mu s/div$ , (b)  $5\mu s/div$ .

reducing IGBT switching loss. Thus, the low junction temperature verifies the statement that no cooling system is required for the IGBT stacks in the MB branch, and the close agreement with results from SaberRD<sup>®</sup> in Fig. 5.9(c) and (f) again proves the accuracy of the proposed models. In Fig. 5.9(e), the snubber currents are compared; all three models generate almost the same waveforms and therefore they are represented by the Type-3 model. It shows the credibility of HIL emulation, which leads to results similar to those of SaberRD<sup>®</sup> in both RCD and RC snubber cases. For the RCD snubber, combined with Fig. 5.8(a), the MB and varistor operating process can be derived. After receiving the block order from line protection, the MB turns off. In the meantime, the DC fault current diverts to the snubber, and after the snubber voltage, and hence the varistor's voltage, increases to the protection voltage, the current diverts again to the varistor, where it gradually vanishes. For the RC snubber, as soon as the MB turns off,  $R_s$  endures the protection voltage because the voltage over  $C_s$  is very low due to a slow charging rate limited by the resistor. Therefore the snubber current is much smaller, but due to the early establishment of a voltage around 3300V, the power loss of the MB is high.

The above two cases show that with a proper snubber circuit and 3 IGBTs in parallel, the junction temperature rise is negligible. However, such a benefit is accompanied by adopting extra IGBTs. With the help of electro-thermal network, evaluation of an appropriate size for IGBT array becomes feasible, and CFM is employed in the emulation. In Fig. 5.10, line protection tests of two HVDC systems with steady-state DC currents 1kA and 4kA are conducted to show the significance of electro-thermal network in guiding HHB design. Fig. 5.10 (a)(b) are MB IGBT junction temperature variations, which indicate that the chosen IGBT type has enough capacity to construct an MB unit with  $1 \times 1$ 



Figure 5.10: Junction temperature variation during operation: (a) MB IGBT under DC current 1kA, (b) MB IGBT under DC current 4kA, (c) LCS IGBT under DC current 1kA, and (d) LCS IGBT under DC current 4kA.

IGBT array to protect a transmission line with even 4kA steady-state DC current since the maximum temperature is only about 55°C. The LCS IGBT temperature variation is given in Fig. 5.10(c)(d). The temperature steadily rises after the HHB starts normal operation, and at the entry into steady-state at t=3s, a line fault is simulated. As can be observed, a single-IGBT LCS is enough to accommodate 1kA, while when the steady-state DC current increases to 4kA, the junction temperature could rise beyond 100°C with self-cooling, indicating that the margin for safe operation is too small and consequently other types of IGBT arrays, such as 2×2, or an external cooling system, are required. In the meantime, a comparison between MB and LCS junction temperature variation validates the theory that switching transient modeling is particularly important for MB IGBTs while it is negligible for the LCS, whose static characteristics dominate the junction temperature rise, meaning that even the steady-state part of the CFM is sufficient to satisfy simulation requirements.

Fig. 5.11 shows the overall performance of different types of HHB models. Fig. 5.11(a) and Fig. 5.11(b) are the results from HIL emulation in which Type-3 and Type-1 models are employed respectively. The shapes of these voltages and currents in both figures are virtually the same, which indicates that the breaking time is 2ms and the fault clearance time is approximately 4ms. The results of Type-2 breaker are omitted since they are identical.



Figure 5.11: Varistor voltage and current during protection: (a) HIL emulation of Type-3 HHB model, (b) HIL emulation of Type-1 HHB model, and (c) PSCAD/EMTDC<sup>®</sup> simulation results. Oscilloscope horizontal axes settings: (a) 5ms/div, (b) 1ms/div.

During the breaking time, the DC fault current is equally divided among three paralleled IGBTs so that each accounts for one-third of the total. The main difference between these two figures is their emulation speed. The Type-3 model runs three times slower than real-time, thus  $\Delta t_1$  and  $\Delta t_2$  on the horizontal axis are 6ms and 12ms, respectively, which correspond to 2ms and 4ms of simulated time. On the contrary, the other model is executed in real-time. In Fig. 5.11(c), the simulation results from PSCAD/EMTDC<sup>®</sup> are given, which have the same shapes and values as the HIL emulation results.

In Fig. 5.12 the importance of developing a full-scale HHB model for HVDC system performance prediction is demonstrated. It would be misleading to apply snubber parameters such as those in Appendix B to a scaled-down model, since this produces unexpected oscillations in  $v_{hcb}$, the circuit breaker's voltage, as well as in the line current that a fully detailed model does not cause, as shown in Fig. 5.12(a), which gives the results from real-time HIL emulation and PSCAD/EMTDC<sup>®</sup> simulation. Therefore, evaluation of the behavior of an HVDC system will be inaccurate. On the other hand, the parameters that enable the scaled-down model to produce device-level waveforms as shown in Fig. 5.11 are probably inappropriate for the full model: both HIL emulation and PSCAD/EMTDC<sup>®</sup> simulation in Fig. 5.12(b) show that some oscillations are introduced to the breaker's voltage and line current when the snubber resistance is altered to 200 $\Omega$ .

| Models              | MOV (RC/RCD)    | Snubber (RC/RCD) | MB (RC/RCD)   |
|---------------------|-----------------|------------------|---------------|
| Type-1 (TSSM)       | 3.075MJ/3.114MJ | 24.0kJ/7866.9J   | 288.0J/288.0J |
| Type-2 (CFM)        | 3.071MJ/3.111MJ | 24.0kJ/7866.0J   | 622.9J/428.8J |
| Type-3 (NBM)        | 3.231MJ/3.254MJ | 23.9kJ/8072.0J   | 627.7J/444.5J |
| SaberRD<sup>®</sup> | 3.193MJ/3.233MJ | 23.1kJ/8258.3J   | 633.8J/442.1J |

Table 5.3: Energy consumed by different HHB components

Table 5.3 summarizes the energy consumed by the three main parts of the HHB under two snubber circuits after line protection is activated. We can see that regardless of the snubber circuit, the MOV absorbs the majority of the remaining energy, slightly over 3MJ in both cases. However, the amount of energy dissipated by the other two parts is heavily dependent on the type of snubber. Energy absorbed by the RCD snubber is about one-third of that absorbed by the RC circuit. On the other hand, although all three types of circuit breakers yield close energy consumption for the MOV and snubber, there is disagreement in the energy consumption of the MB path. The Type-1 model shows the least energy because the turn-off process of the TSSM is inaccurate. The Type-2 and Type-3 models have similar energy consumption, and as can be observed, the one based on the second-order NBM has closer results to SaberRD<sup>®</sup>, where the fourth-order IGBT model *igbt1\_3x* is used, while for the Type-2 model to attain more precise power consumption in the RCD snubber case, the current curve of the CFM should be adjusted. Moreover, it can be observed from the table that different IGBT models can cause minor differences in the energy consumed by the MOV and snubber, which underlines the importance of precise switch models.
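
To make the comparison of the MB path concrete, the relative deviation of each model from the SaberRD<sup>®</sup> reference can be computed directly from the entries of Table 5.3. The short sketch below does only that; the variable and function names are chosen here for illustration and are not taken from the original work.

```python
# MB-path energies from Table 5.3 in joules, as (RC snubber, RCD snubber) pairs.
MB_ENERGY_J = {
    "Type-1 (TSSM)": (288.0, 288.0),
    "Type-2 (CFM)": (622.9, 428.8),
    "Type-3 (NBM)": (627.7, 444.5),
}
SABER_MB_J = (633.8, 442.1)  # SaberRD reference row of Table 5.3


def relative_deviation(value, reference):
    """Signed deviation from the reference, in percent."""
    return (value - reference) / reference * 100.0


deviations = {
    model: tuple(relative_deviation(e, ref) for e, ref in zip(energies, SABER_MB_J))
    for model, energies in MB_ENERGY_J.items()
}

for model, (rc, rcd) in deviations.items():
    print(f"{model:14s} RC: {rc:+6.1f} %   RCD: {rcd:+6.1f} %")
```

With the tabulated values, the Type-3 model stays within about 1% of the reference for both snubbers, the Type-2 model deviates by roughly 3% in the RCD case, and the Type-1 model underestimates the MB energy by a large margin.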

## 5.5.2 System-Level Performance

All three models are applicable to the MTDC system as they produce the same system-level results, so the Type-2 model is used for real-time purposes, and the waveforms are



Figure 5.12: Varistor voltage and line current during protection from HIL emulation (top) and PSCAD/EMTDC<sup>®</sup> simulation (bottom): (a) scaled-down model, (b) full-scale model. Oscilloscope horizontal axes settings: (a) 20ms/div, (b) 2ms/div.

validated by PSCAD/EMTDC<sup>®</sup> simulation, as shown in Fig. 5.13.

Before the line fault, all converter side DC voltages are maintained at around 200kV, with the rectifier having a small margin over the other two to ensure power transfer. Immediately after Line 1 contacts the ground, the DC voltages at all stations sag, and after the faulty section is isolated by HHBs on both sides within around 6ms, the DC voltages are gradually restored since they are controlled by one of the inverters connecting to the rectifier, as shown in Fig. 5.13(a).

Fig. 5.13(b) shows the converter side DC currents during the same period. As can be seen, during the breaking time, the line fault leads to current surges on the Rectifier side as well as the Inverter-1 side, which sees a polarity inversion from -1kA to over 2kA as the fault forces the inverter, along with the rectifier, to provide energy to the ground. As a consequence, the energy received by Inverter-2 reduces, but at a much slower rate since energy is also stored in the 200km-long path, including two current limiting inductors. After the fault is isolated by two HHBs, currents flowing from Inverter-1 and the Rectifier to the ground are interrupted, and therefore the 2kA rectifier current diverts to Inverter-2. In less than 100ms, the current stabilizes at 2kA as it is still controlled by the Rectifier. Fig. 5.13(c) shows the power delivered or consumed by the three stations, and we can see that during the fault, the rectifier station can provide as much as 1GW power to the ground, but



Figure 5.13: System-level performance of the MTDC system during long-term line fault with proposed and scaled-down HHB models from HIL emulation (top) and PSCAD/EMTDC<sup>®</sup> simulation (middle and bottom). (a)(b)(c) Converter side DC voltages, currents and active powers with proposed HHB models, (d)(e)(f) Converter side DC voltages, currents and active powers with the scaled-down HHB model. Oscilloscope horizontal axes settings: (a)(b)(c) 50ms/div.

after completing the protection process, the power is restored and Inverter-2 receives all that amount of energy. As a comparison, the results from using the scaled-down HHB model with the same snubber parameters are also shown. Fig. 5.13(d) indicates the voltages are less affected by the simplification of the model. However, the current waveforms differ remarkably: high-frequency oscillations last up to 100ms and the DC fault currents in Line 1 are not quenched immediately. As a consequence, the power transmission is also unstable during that period, with the Rectifier power reducing to zero momentarily and multiple energy exchanges between Inverter-1 and the rest of the system.

# 5.6 Ultrafast Mechatronic Circuit Breaker

In addition to ABB's hybrid circuit breaker discussed above, Alstom Grid has also proposed an economical hybrid HVDC circuit breaker employing thyristors (SCRs) that features a larger capacity [146, 147], and its design process was also specified [148]. The configuration of the ultrafast mechatronic circuit breaker (UFMCB) is shown in Fig. 5.14, where, as in ABB's case, the *V*-*I* coupling method is used to separate the DC circuit breaker from



Figure 5.14: Ultrafast mechatronic circuit breaker decoupled from the transmission path.

the transmission path for node number reduction.

It should be noted that a single thyristor symbol, i.e.,  $SCR_1$,  $SCR_{11}$,  $SCR_{12}$, and  $SCR_2$, in the UFMCB actually represents a chain of dozens or even over one hundred thyristors. Thus, even after the circuit breaker has been separated from the external system, its detailed model still contains hundreds or even thousands of nodes, depending on the thyristor model used in EMT simulation, making instant circuit solution impractical. Meanwhile, creating a fundamental unit as in the IGBT-based HHB is also infeasible due to the highly irregular configuration of the auxiliary branch, e.g., the thyristor array may differ in the four branches, which means that when an integer number of fundamental units is created, the number of thyristors in some branches will probably be a non-integer. In addition to circuit splitting, merging the cascaded thyristors therefore becomes the other option for node reduction.

## 5.6.1 Thyristor Modeling

The ideal thyristor model is taken as a two-state resistor whose on- and off-state is controlled by the thyristor logic: it turns on after a positive pulse is applied to the gate of the device while its anode-cathode voltage  $v_{AK}$  is greater than the threshold  $V_f$; when the current  $i_{AK}$  vanishes, the thyristor turns off if  $v_{AK}$  stays below 0 over a period longer than the turn-off time  $T_q$.
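
The logic just described can be sketched as a small state holder that is advanced once per time-step. This is one possible reading of the rule, not the implemented hardware module: the class name, the default on- and off-state resistances, and the way the reverse-bias time is accumulated are all assumptions made for illustration.

```python
class IdealThyristor:
    """Two-state resistor driven by the thyristor logic (illustrative sketch)."""

    def __init__(self, v_f, t_q, r_on=1e-3, r_off=1e6):
        self.v_f = v_f            # forward threshold voltage V_f
        self.t_q = t_q            # turn-off time T_q
        self.r_on = r_on          # assumed on-state resistance
        self.r_off = r_off        # assumed off-state resistance
        self.on = False
        self.reverse_time = 0.0   # time spent with no current and v_AK < 0

    def step(self, v_ak, i_ak, gate, dt):
        """Advance the logic by one time-step dt and return the resistance to use."""
        if not self.on:
            # Turn-on needs a gate pulse while the device is forward-biased.
            if gate and v_ak > self.v_f:
                self.on = True
                self.reverse_time = 0.0
        elif i_ak <= 0.0 and v_ak < 0.0:
            # Current has vanished; turn off once v_AK has stayed negative beyond T_q.
            self.reverse_time += dt
            if self.reverse_time > self.t_q:
                self.on = False
        else:
            self.reverse_time = 0.0
        return self.r_on if self.on else self.r_off


# Hypothetical parameters, for demonstration only.
scr = IdealThyristor(v_f=1.0, t_q=25e-6)
dt = 10e-6
states = [
    scr.step(200.0, 0.0, gate=False, dt=dt),  # forward-biased, no pulse
    scr.step(200.0, 0.0, gate=True, dt=dt),   # pulse arrives while forward-biased
    scr.step(-200.0, 0.0, gate=True, dt=dt),  # reverse-biased for 10 us
    scr.step(-200.0, 0.0, gate=True, dt=dt),  # reverse-biased for 20 us
    scr.step(-200.0, 0.0, gate=True, dt=dt),  # reverse-biased for 30 us > T_q
]
```

In this sketch a gate pulse alone never turns the device on under reverse bias, and an active gate signal does not prevent turn-off once the reverse-bias time exceeds  $T_q$.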

In a device-level thyristor model, the on-state *I-V* characteristics should be considered, along with some switching transients. Fig. 5.15(a) shows the nonlinear behavioral thyristor model that shares some similarities with the reverse-recovery diode model. The additional



Figure 5.15: Nonlinear behavioral thyristor model: (a) Single device, and (b) cascaded thyristor equivalent circuit.

SCR logic controls the proper state of the device by regulating the resistance of  $r_s$ : a large resistance for the off-state while a fixed small value is set when the thyristor is turned on.

The nonlinear diode denoted by *NLD* reflects the static on-state characteristics between  $v_{AK}$  and the current  $i_{AK}$. It is approximated by the following analytical function

$$v_{AK} = A + B \cdot i_{AK} + C \cdot \sqrt{i_{AK}} + D \cdot \ln(i_{AK} + 1), \tag{5.26}$$

where A, B, C, D are coefficients, which are either provided by the device manufacturer's datasheet or can be estimated according to available *I-V* curves. For digital simulation based on nodal equations, (5.26) is usually discretized by taking partial derivatives so the companion circuit becomes available. However, for simplicity, *NLD* is taken as a pure nonlinear resistor with its conductance a function of the current

$$G_{AK}(i_{AK}) = \frac{i_{AK}}{v_{AK}(i_{AK})}. \tag{5.27}$$

Like the diode, a thyristor also exhibits the reverse recovery phenomenon, which is why the current source  $i_{rr}$  is retained to yield a proper reverse recovery current whose value depends on the voltage over the  $R_r$-$L_r$  pair.

It can be counted from Fig. 5.15(a) that a single nonlinear behavioral thyristor model contains 4 nodes. Unlike ideal switch models, device-level modeling has a mandatory requirement that the device voltage and current cannot exceed the actual rating. Therefore, the model is always arranged in series or parallel to withstand a high voltage or large currents, and consequently, numerous nodes are introduced into the circuit, exceeding the hardware resources of the FPGA board.

A computationally efficient model for cascaded thyristors is presented in Fig. 5.15(b). The  $R_r$ - $L_r$  pair is separated since their values are too small to affect the static feature, and the *NLD* of each thyristor in series is merged with SCR logic module to eliminate



Figure 5.16: SCR logic validation (top:  $v_s$  and  $V_g$ ; middle: thyristor voltage; bottom: thyristor current): (a) Proposed model, and (b) ANSYS/Simplorer<sup>®</sup>.

the internal nodes, resulting in the following equivalent conductance of  $N_{SCR}$  cascaded thyristors

$$G_{AK}^{N_{SCR}}(i_{AK}) = \frac{G_{AK}(i_{AK})}{N_{SCR}}. \tag{5.28}$$
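
As an illustration of (5.26)-(5.28), the sketch below evaluates the static on-state voltage, the single-device conductance and the merged conductance of a series chain. The coefficient values are hypothetical placeholders, not datasheet figures, and the function names are introduced here only for the example.

```python
import math


def v_ak_static(i_ak, a, b, c, d):
    """On-state voltage from (5.26); valid for i_ak >= 0."""
    return a + b * i_ak + c * math.sqrt(i_ak) + d * math.log(i_ak + 1.0)


def g_ak(i_ak, a, b, c, d):
    """Current-dependent conductance of a single NLD, (5.27)."""
    return i_ak / v_ak_static(i_ak, a, b, c, d)


def g_ak_cascaded(i_ak, n_scr, a, b, c, d):
    """Equivalent conductance of n_scr thyristors in series, (5.28)."""
    return g_ak(i_ak, a, b, c, d) / n_scr


# Hypothetical coefficients, for illustration only (not datasheet values).
COEFFS = dict(a=1.0, b=1.0e-4, c=1.0e-2, d=5.0e-2)

i_test = 1000.0                                  # current in A
v_single = v_ak_static(i_test, **COEFFS)         # drop over one device
g_chain = g_ak_cascaded(i_test, n_scr=100, **COEFFS)
v_chain = i_test / g_chain                       # drop over the whole chain
```

Because all devices in the chain carry the same current, the voltage over the merged branch comes out as  $N_{SCR}$  times the single-device drop, which is the property the merged conductance is meant to preserve without keeping the internal nodes.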

The *NLD* branch, along with  $i_{rr}$ , constitutes the basic part that participates in external circuit solution. Following the acquisition of the diode current  $i_D$ , the reverse recovery current is calculated directly using

$$i_{rr} = K \cdot (G_{Lr} + G_r) \cdot (i_D - I_{Leq}), \tag{5.29}$$

where  $G_{Lr}$  along with  $I_{Leq}$  constitutes the companion circuit of the inductor. As can be seen, no amplification is required for the reverse recovery current since all serial thyristors share the same current.

In Fig. 5.16, the SCR logic is tested by the circuit in Fig. 5.17, where the ABB fast thyristor 5STF 28H2060 is modeled, and ANSYS Simplorer<sup>®</sup>, which provides device-level models, is



Figure 5.17: Basic thyristor test circuit.

used for validation. To show an obvious thyristor reverse recovery phenomenon, the voltage source  $v_s$  is set as a  $\pm 200V/60Hz$  square wave, while the thyristor gate voltage has a frequency of 100Hz, and their phase relationship is given on the top. When the positive gate voltage  $P_1$  arrives, the thyristor turns on since it has already been forward-biased. In contrast, when  $P_2$  appears, the thyristor is not turned on because it is reverse-biased, and the off-state lasts until the arrival of  $P_3$  even though the bias condition changes prior to that. Another noticeable feature of the SCR logic is the turn-off process during  $P_5$: the thyristor turns off when it is reverse-biased even though the gate voltage is still in effect. The above statements are verified by the off-line simulation tool, which shows the same waveforms.

The device-level modeling is validated by the reverse recovery process in Fig. 5.18, which is obtained from the same test circuit. While the current is ascending from  $I_{rrm}$  to 0,  $v_{AK}$  steadily approaches -200V in a process lasting around  $10\mu$s. The results are very close to those of ANSYS Simplorer<sup>®</sup>, indicating that the dynamic part of the proposed model is also correct.

## 5.6.2 UFMCB Modeling

In Fig. 5.14, the UFMCB is separated from the transmission path by a pair of coupled voltage-current sources regardless of the models that the surrounding components adopt. Another merit of the decoupling method is that it makes inserting or removing the circuit breaker convenient, since the impact of the UFMCB on the DC grid is realized by the voltage source  $V_p$, which means that when the DC yard operates without any circuit breaker, the configuration is maintained by forcing  $V_p$  to 0.

The operation principle of the DC breaker dictates that the LCS in the main branch operates under a low voltage which is clamped by the auxiliary branch, and therefore, its heat is mainly induced by normal conduction.

The LCS consists of an array of IGBTs, and a single switching element is taken as



Figure 5.18: SCR reverse recovery: (a) Proposed model, and (b) ANSYS/Simplorer<sup>®</sup>.

a current-dependent resistor. The manufacturer's datasheet provides the static I-V characteristics; thus, its conductance  $G_{LCS}$  can be calculated as the current divided by the terminal voltage, similar to (5.27). The MOV is specifically installed to protect the LCS from over-voltage. Due to its nonlinearity, many Newton-Raphson iterations would be required, prolonging the computation process. To avoid this, the piecewise linear method is adopted by partitioning its entire I-V curve into 11 segments, each having the form of

$$i_v = G_v \cdot v_v + I_0, \tag{5.30}$$

where the conductance  $G_v$  and  $I_0$  are constants in a particular segment. Then, the current contribution of the MOV is

$$I_{veq} = i_v - G_v \cdot v_v. \tag{5.31}$$
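
A minimal sketch of the piecewise linear treatment in (5.30)-(5.31) is given below. The breakpoints are hypothetical and only three segments are used instead of the 11 mentioned above; the helper names `build_segments` and `mov_companion` are likewise assumptions made for illustration.

```python
from bisect import bisect_right


def build_segments(v_points, i_points):
    """Turn I-V breakpoints into (v_start, G_v, I_0) triples, one per segment."""
    segments = []
    for k in range(len(v_points) - 1):
        g_v = (i_points[k + 1] - i_points[k]) / (v_points[k + 1] - v_points[k])
        i_0 = i_points[k] - g_v * v_points[k]
        segments.append((v_points[k], g_v, i_0))
    return segments


def mov_companion(v_v, segments):
    """Return (G_v, I_veq) of the segment that contains the MOV voltage v_v."""
    starts = [s[0] for s in segments]
    # Clamp the index so voltages outside the table reuse the first or last segment.
    k = min(max(bisect_right(starts, v_v) - 1, 0), len(segments) - 1)
    _, g_v, i_0 = segments[k]
    i_v = g_v * v_v + i_0        # segment equation (5.30)
    i_veq = i_v - g_v * v_v      # current contribution (5.31)
    return g_v, i_veq


# Hypothetical breakpoints (volts, amperes), for illustration only.
V_POINTS = [0.0, 3000.0, 3400.0, 3600.0]
I_POINTS = [0.0, 0.001, 10.0, 2000.0]
SEGMENTS = build_segments(V_POINTS, I_POINTS)

g_v, i_veq = mov_companion(3500.0, SEGMENTS)
```

Within one segment both returned values are constants, so the MOV contributes a fixed conductance and a fixed current source to the nodal equation until the voltage crosses into a neighboring segment.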

The UFD, on the other hand, is taken as a pure ideal switch with fixed on and off-state conductances. Its existence induces an internal node in the main branch that will lead to a larger admittance matrix. It is eliminated by merging all three components, as the main branch has an equivalent EMT circuit expressed by the following companion model:

$$G_{MB} = \frac{G_{UFD} \cdot (G_{LCS} + G_{v0})}{G_{UFD} + G_{LCS} + G_{v0}}, \tag{5.32}$$

$$I_{MBeq} = \frac{G_{MB}}{G_{LCS} + G_{v0}} I_{veq}. \tag{5.33}$$

Then, the partitioned UFMCB model has a dimension of 5 in its admittance matrix and current vector, as given in Appendix B.
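The piecewise-linear MOV companion of (5.30)-(5.31) and the main-branch merge of (5.32)-(5.33) can be sketched as follows. This is a minimal Python illustration, not the FPGA implementation: the three-segment table, the breakpoint values, and all function names are hypothetical placeholders (the thesis uses 11 segments taken from the datasheet), and only non-negative MOV voltages are handled.

```python
from bisect import bisect_right

# Hypothetical 3-segment MOV table (assumed values, for illustration only).
# Segment k applies for v >= V_BREAK[k] and obeys i_v = G_V[k] * v_v + I_0[k].
V_BREAK = [0.0, 100e3, 160e3]   # segment start voltages in volts
G_V = [1e-9, 1e-6, 1e-1]        # segment conductances in siemens


def build_offsets(v_break, g_v):
    """Pick I_0 per segment so the I-V curve is continuous at each breakpoint."""
    i_0 = [0.0]
    for k in range(1, len(v_break)):
        i_at_break = g_v[k - 1] * v_break[k] + i_0[k - 1]
        i_0.append(i_at_break - g_v[k] * v_break[k])
    return i_0


I_0 = build_offsets(V_BREAK, G_V)


def mov_companion(v_v):
    """Return (G_v, I_veq) for the segment containing v_v, per (5.30)-(5.31)."""
    k = max(bisect_right(V_BREAK, v_v) - 1, 0)
    i_v = G_V[k] * v_v + I_0[k]        # (5.30)
    i_veq = i_v - G_V[k] * v_v         # (5.31), equal to I_0[k]
    return G_V[k], i_veq


def main_branch(g_ufd, g_lcs, g_v0, i_veq):
    """Merge UFD, LCS and MOV into one companion branch, per (5.32)-(5.33)."""
    g_par = g_lcs + g_v0                       # LCS in parallel with MOV
    g_mb = g_ufd * g_par / (g_ufd + g_par)     # in series with the UFD
    i_mbeq = g_mb / g_par * i_veq
    return g_mb, i_mbeq
```

Because the internal node between the UFD and the LCS/MOV pair never appears in the merged branch, the nodal matrix seen by the solver stays one dimension smaller.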

#### 5.6.3 UFMCB Hardware Design

The hardware implementation of UFMCB is conducted on UltraScale+ XCVU9P FPGA. According to the UFMCB description, its overall EMT model contains 5 components, i.e., the thyristor, the main branch, the MOV, the nodal equation solver, and signal update. Each of them is designed into hardware modules using Vivado HLS<sup>®</sup>, and an estimation of hardware resource utilization is given in Table 5.4.

| Module          | Description   | LUT           | FF            | DSP48       | Latency ($T_{clk}$) |
|-----------------|---------------|---------------|---------------|-------------|---------------------|
| NodalEq         | Solver        | 3386          | 2295          | 27          | 131                 |
| SCR ($\times 4$) | Thyristor     | 2496          | 1366          | 18          | 7-33                |
| B0              | Main Branch   | 32            | 0             | 0           | 0                   |
| Update          | Signal Update | 2075          | 1735          | 26          | 7                   |
| MOV ($\times 4$) | MOV           | 1921          | 844           | 8           | 8-13                |
| UFMCB           | -             | 23161 (1.96%) | 12870 (0.54%) | 157 (2.30%) | 459                 |
| Available       | -             | 1182240       | 2364480       | 6840        | -                   |

Table 5.4: UFMCB parts hardware design summary

The table shows that, with the proposed thyristor modeling method, hardware resource utilization is quite low; in contrast, if the thyristors were modeled individually, not only would the resource requirement surge, but the circuit solution would also become extraordinarily inefficient.

The nodal equation solver module *NodalEq*, according to the table, has the largest latency of 131 clock cycles. However, it is not equivalent to the UFMCB latency, which is determined by the way those hardware modules cooperate. Fig. 5.19 shows the pipelined design and signal routes of the DC circuit breaker on FPGA.

All hardware modules listed in Table 5.4 are shown in the design. The outputs from *SCR*, *MOV*, and *B0* are not sent instantly to the *NodalEq* module for circuit solution. Instead, a D-latch whose clock port is fed by the corresponding data-valid indicator always sits between the output of one hardware module and the input of the next, so that a signal is received by the next module only when it is valid. After the nodal voltages are obtained from the *NodalEq* module, they are sent to the *Update* module for calculating the thyristor currents and the coupling voltage $V_p$. As can be seen, a closed loop is formed, and the inputs of a module are acquired from its upstream module; however, parallelism is available for most of the blocks, and whenever the output data is valid, it is sent to the downstream modules.



Figure 5.19: Hardware architecture of the UFMCB.



Figure 5.20: UFMCB top-level finite state machine.

A proper finite state machine is also required to coordinate those modules, since the nonlinearities involved force the adoption of Newton-Raphson iteration for some modules. According to Fig. 5.20, the *MOV* and *NodalEq* modules may need to run repeatedly.

Figure 5.21: UFMCB terminal waveforms.

At the beginning, the reset order is issued and the emulation starts from state $S_1$, in which *B0* and all *SCR* and *MOV* modules run. Completion of these modules does not trigger the circuit solution immediately; instead, *NodalEq* does not begin until 33 clock cycles have been counted since the start of $S_1$, because $S_1$ involves some modules with a variable latency, of which 33 clock cycles is the maximum. The circuit solution is followed by a convergence check. Since a maximum of 3 iterations is needed, the calculation in each time-step is conducted 3 times to ensure the results are not distorted. When N-R iteration is conducted, only the *MOV* modules need recalculation, and, as in stage 1, 13 clock cycles are counted before entering the nodal voltage solution module so that the results are evenly distributed in the time domain.

Hence, according to the FSM, a simulation time-step involves 3 N-R iterations and the total latency is 459, as given in Table 5.4.
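One accounting that reproduces the 459-cycle figure from the module latencies in Table 5.4 is sketched below. It assumes that the *Update* module runs once after the last circuit solution and that the convergence check adds no cycles of its own; both are assumptions made for this illustration, and the function name is hypothetical.

```python
# Latencies in clock cycles, taken from Table 5.4 and the FSM description.
S1_WAIT = 33     # fixed wait in S1: worst-case SCR latency
MOV_WAIT = 13    # fixed wait on a repeated pass: worst-case MOV latency
NODAL_EQ = 131   # nodal equation solver
UPDATE = 7       # signal update (assumed to run once per time-step)
SOLUTIONS_PER_STEP = 3


def ufmcb_latency(solutions=SOLUTIONS_PER_STEP):
    """Total clock cycles per time-step for a fixed number of circuit solutions."""
    first_pass = S1_WAIT + NODAL_EQ
    repeat_pass = MOV_WAIT + NODAL_EQ
    return first_pass + (solutions - 1) * repeat_pass + UPDATE


print(ufmcb_latency())  # 33 + 131 + 2 * (13 + 131) + 7
```

Fixing the waits at their worst-case values is what makes the per-step latency constant, at the cost of idling whenever a variable-latency module finishes early.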

## 5.6.4 UFMCB Real-Time Tests and Validation

Fig. 5.21 gives the UFMCB terminal voltage and the DC line current when the line-to-ground test is carried out based on Fig. 5.14. Initially, with a DC voltage of 120kV and a resistive load of 80 $\Omega$, the steady-state current is 1.5kA. At t=100ms, the fault occurs at the load side, and the UFMCB begins to operate to isolate the fault. During the breaking period, at least one thyristor remains turned on to ensure that the voltage over the LCS is not too high: $S_1$ and $S_{11}$ conduct first, and then $S_{12}$ turns on to replace $S_{11}$, followed by the transfer of the current from the first auxiliary branch to the second auxiliary branch, where $S_2$ is located. During that period, the DC voltage source keeps feeding the fault, which accounts for the rising current. Following the opening of the UFD and the turning off of $S_2$, the process enters the fault clearance period, in which the MOV $M_2$ begins to extinguish the DC current by maintaining a protection voltage of around 170kV, higher than the DC voltage.

Figure 5.22: UFMCB SCR1 and SCR12 currents.

The adoption of the device-level thyristor model yields some results that differ from those of the ideal switch model; one example is given in Fig. 5.22. The currents in thyristors $SCR_1$ and $SCR_{12}$ obtained by HIL emulation fit well with the ANSYS Simplorer<sup>®</sup> results, whereas PSCAD/EMTDC<sup>®</sup> offers slightly different waveforms.

Fig. 5.23 demonstrates that with device-level models, the thyristor power loss can be accurately calculated. The equivalent resistance of the proposed thyristor model automatically adjusts under different currents, as does that of ANSYS Simplorer<sup>®</sup>. As a result, the instantaneous power loss of $SCR_1$ approaches 7kW, while that of $SCR_2$ is a little higher. In stark contrast, the ideal switch model cannot estimate the power loss reliably. When a 10m$\Omega$ on-state resistance is assumed for both thyristors, their power losses exceed 160kW and 180kW, respectively. Even if their resistances are reduced to 1m$\Omega$, the power loss remains at around 20kW, much higher than the real situation. Conversely, a further decrease in the resistance to 0.1m$\Omega$ leads to extremely low power dissipation. This indicates the importance of device-level models in EMT simulation, which give not only system-level results, in real time in this case, but also critical information that is unavailable in system-level simulation tools.

Figure 5.23: Power loss calculation by various EMT tools for: (a) SCR1, and (b) SCR2.

# 5.7 Summary

This chapter proposed three full-scale models of ABB's HHB for accurate real-time HIL emulation as well as electro-thermal transient simulation, given that their conventional counterpart is highly burdensome and resource-consuming. The approach of partitioning the full HHB model into a number of fundamental and identical sub-circuits results in a remarkable reduction of the dimension of the corresponding matrix equations, and can therefore serve as a reference for modeling other power electronic apparatus. In addition, FPGA hardware resource utilization declined drastically, by at least two orders of magnitude compared with the conventional full-scale model, and the burden of computing the proposed HHB models is virtually the same as that of the scaled-down model. The approach therefore substantially accelerates computation and makes HIL execution feasible for validating control and protection strategies of an MTDC system.

Meanwhile, as the IGBT is a pivotal part of the HHB, three types of IGBT models were adopted to give a variety of guidance in the breaker design process. Comparison with the scaled-down model demonstrated that all three models are capable of verifying whether a selection of parameters is reasonable by investigating the impact of the designed HHB on the overall system, which the scaled-down model is unable to achieve. In particular, the curve-fitting model and the improved nonlinear behavioral IGBT model provide extra information unavailable in previous simulation studies of HVDC circuit breakers, such as the IGBT power loss and, subsequently, its junction temperature. This information is meaningful for determining an appropriate IGBT type and the size of its array for the LCS and MB, as well as for evaluating the cooling condition. HIL emulation results demonstrated that the CFM-based HHB model can be executed in real-time, while the NBM HHB model provides better versatility. They also showed that precise IGBT models with switching transients are needed for IGBT type selection, while a simple steady-state model is sufficient for the LCS IGBT. Moreover, the proposed simplified IGBT nonlinear behavioral model is computationally more efficient and robust against numerical divergence, so it can be applied to the simulation of other power converters.

In addition, the device-merging method was tested in modeling the ultrafast mechatronic circuit breaker. A thyristor model with the reverse recovery phenomenon was proposed, and an equivalent circuit was derived to reduce the node number when thyristors are cascaded. The merging method leads to a significant reduction in hardware resource utilization when the design is deployed on FPGA. As with the IGBT model, the methodology of treating a device-level thyristor as a combination of basic characteristics and an augmented part can be applied to other power semiconductor switches for efficient computing.

# Fixed Time-Step CIGRÉ DC Grid Simulation on GPU

# 6.1 Introduction

Electromagnetic transient simulation of power electronic systems conducted on sequential processors slows down as the system scale increases and the models included become more complex. Thus, in this chapter, taking the CIGRÉ B4 DC grid as the testbench, efforts are made to improve the off-line simulation efficiency from three aspects, i.e., the modeling approaches, the processing units, and the computational techniques.

In the MTDC grid, detailed models as specific as device-level for the semiconductor switches are implemented to ensure high simulation accuracy and provide comprehensive circuit information. As the overall DC grid contains an extremely large number of nodes, which makes the direct solution of the corresponding matrix equations extremely slow, three levels of circuit partitioning are adopted for the parallel simulation. Separation of the converter stations is naturally achieved due to the existence of transmission lines. Within a converter station, the TLM-link enables further partitioning between major subsystems, which are ultimately split into multiple parts by coupled voltage-current sources as illustrated in Chapter 4.

Due to its superior performance in parallel computation, the GPU is investigated and utilized to improve the speed of time-domain transient simulation. Its hardware architecture and the way the parallel EMT program is implemented determine that, in the meshed MTDC system, components of similar characteristics can be represented by one kernel and implemented by blocks of massively parallel threads. Nevertheless, the irregularity of the MTDC grid topology remains the main challenge: many component types appear only a small number of times in the grid, which restricts massively parallel execution, the factor that GPU simulation relies on to gain its speed advantage over the CPU. For example, the CIGRÉ B4 DC grid has 11 AC/DC or DC/AC converters, far fewer than the number of buses in a typical AC system such as the IEEE 39-bus system. In this case, simply taking the MMC as a GPU kernel and running it with 11 concurrent threads falls well short of exploiting massive parallelism.

Thus, the fine-grained partitioned MMC and the HHB, in which a large number of identical circuit units exist, are used to conform to the single-instruction-multiple-thread (SIMT) architecture of the GPU. Massive parallelism is achieved by creating a substantial number of identical units and taking each type of split circuit unit as an individual kernel. Meanwhile, many repetitive cycles in the controller are eliminated, which greatly shortens the computational time. Since the MTDC system can be organized into many levels, e.g., the overall system comprises 11 rectifiers and inverters, each of which contains three phases, and one MMC leg has many SMs, the dynamic parallelism feature of the GPU is employed to accommodate this hierarchical configuration.

## 6.2 Wind Farm-Integrated MTDC Grid

Fig. 6.1 shows the CIGRÉ B4 DC grid [149] integrated with offshore wind farms (OWFs). It comprises 3 DC systems (DCSs) and 11 AC/DC terminals, and the converter stations are numbered. The onshore converter stations connecting with OWF1-5 are rectifiers so that the energy can be transmitted to inland inverters. In this section, models of some critical components forming the CIGRÉ DC grid are introduced.

#### 6.2.1 Induction Machine Model

The induction machine (IM) is the core part that converts the wind's kinetic energy into electricity. It is based on the following state-space equations [150]:

$$\dot{\Phi} = A\Phi + BU, \tag{6.1}$$

$$\mathbf{I} = \mathbf{C}\boldsymbol{\Phi},\tag{6.2}$$

where  $\Phi$ , I and U are vectors of fluxes, currents and excitations of the DFIG in the  $\alpha$ - $\beta$  frame, respectively. Since their elements are arranged in the same sequence, they can be uniformly denoted by a symbolic vector **X** as

$$\mathbf{X} = \begin{bmatrix} X_{\alpha s}, & X_{\beta s}, & X_{\alpha r}, & X_{\beta r} \end{bmatrix}^T. \tag{6.3}$$

Figure 6.1: The CIGRÉ B4 DC Grid integrated with offshore wind farms.

Here, the subscripts $\alpha$ and $\beta$ represent the $\alpha$-$\beta$ frame, and s and r indicate variables belonging to the stator and the rotor, respectively. The input matrix **B** is a 4×4 identity matrix, while the state and the output matrices are given as:

$$\mathbf{A} = \begin{bmatrix} \frac{-R_s L_r}{L_s L_r - L_m^2}, & 0, & \frac{R_s L_m}{L_s L_r - L_m^2}, & 0\\ 0, & \frac{-R_s L_r}{L_s L_r - L_m^2}, & 0, & \frac{R_s L_m}{L_s L_r - L_m^2}\\ \frac{R_r L_m}{L_s L_r - L_m^2}, & 0, & \frac{-R_r L_s}{L_s L_r - L_m^2}, & -\omega_r\\ 0, & \frac{R_r L_m}{L_s L_r - L_m^2}, & \omega_r, & \frac{-R_r L_s}{L_s L_r - L_m^2} \end{bmatrix}, \tag{6.4}$$

$$\mathbf{C} = \begin{bmatrix} \frac{L_r}{L_s L_r - L_m^2}, & 0, & \frac{-L_m}{L_s L_r - L_m^2}, & 0\\ 0, & \frac{L_r}{L_s L_r - L_m^2}, & 0, & \frac{-L_m}{L_s L_r - L_m^2}\\ \frac{-L_m}{L_s L_r - L_m^2}, & 0, & \frac{L_s}{L_s L_r - L_m^2}, & 0\\ 0, & \frac{-L_m}{L_s L_r - L_m^2}, & 0, & \frac{L_s}{L_s L_r - L_m^2} \end{bmatrix}, \tag{6.5}$$

where  $L_s$ ,  $L_r$  and  $L_m$  are stator, rotor and magnetizing inductances,  $R_s$  and  $R_r$  are stator and rotor resistances, and  $\omega_r$  is the electrical rotor velocity.

The electromagnetic torque is calculated following the solution of the state-space equations, as

$$T_e = 1.5P_p(\Phi_{\alpha s}I_{\beta s} - \Phi_{\beta s}I_{\alpha s}), \tag{6.6}$$

where  $P_p$  is the number of pole pairs, and  $\omega_r$  is subsequently calculated by

$$\omega_r = \int \frac{P}{J} (T_e - T_m) dt, \tag{6.7}$$

where $J$ is the inertia, $T_m$ is the mechanical torque, and $P$ denotes the number of pole pairs $P_p$.

For EMT calculation, differential equations need to be discretized. Using the Trapezoidal rule, (6.1) and (6.7) take the forms of

$$\Phi(n) = (\mathbf{I} - \frac{\mathbf{A}\Delta t}{2})^{-1} \cdot [(\mathbf{I} + \frac{\mathbf{A}\Delta t}{2})\Phi(n-1) + \frac{\mathbf{B}\Delta t}{2}(\mathbf{U}(n) + \mathbf{U}(n-1))], \tag{6.8}$$

$$\omega_r(n) = \omega_r(n-1) + \frac{P\Delta t}{2J}(T_e(n) - T_m(n) + T_e(n-1) - T_m(n-1)), \tag{6.9}$$

where **I** is a $4 \times 4$ identity matrix (not to be confused with the current vector in (6.2)), $n$ indicates the time instant, and $\Delta t$ is the time-step.
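A single trapezoidal update of the machine, following (6.4)-(6.9), can be sketched in plain Python as below. The function names are hypothetical, and the sketch assumes that **A** is evaluated with the most recently available $\omega_r$ and that $P$ in (6.9) is the pole-pair number; neither choice is spelled out in the text.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def im_matrices(rs, rr, ls, lr, lm, wr):
    """State matrix A (6.4) and output matrix C (6.5)."""
    d = ls * lr - lm * lm
    a = [[-rs * lr / d, 0.0, rs * lm / d, 0.0],
         [0.0, -rs * lr / d, 0.0, rs * lm / d],
         [rr * lm / d, 0.0, -rr * ls / d, -wr],
         [0.0, rr * lm / d, wr, -rr * ls / d]]
    c = [[lr / d, 0.0, -lm / d, 0.0],
         [0.0, lr / d, 0.0, -lm / d],
         [-lm / d, 0.0, ls / d, 0.0],
         [0.0, -lm / d, 0.0, ls / d]]
    return a, c


def flux_step(a, phi_prev, u_now, u_prev, dt):
    """(6.8) with B = I, solved as a linear system instead of an explicit inverse."""
    lhs = [[(1.0 if i == j else 0.0) - a[i][j] * dt / 2 for j in range(4)]
           for i in range(4)]
    a_phi = matvec(a, phi_prev)
    rhs = [phi_prev[i] + a_phi[i] * dt / 2 + dt / 2 * (u_now[i] + u_prev[i])
           for i in range(4)]
    return solve(lhs, rhs)


def torque(pp, phi, cur):
    """(6.6): electromagnetic torque from stator fluxes and currents."""
    return 1.5 * pp * (phi[0] * cur[1] - phi[1] * cur[0])


def speed_step(wr_prev, pp, j, dt, te_now, tm_now, te_prev, tm_prev):
    """(6.9): trapezoidal update of the electrical rotor velocity."""
    return wr_prev + pp * dt / (2 * j) * (te_now - tm_now + te_prev - tm_prev)
```

The currents for (6.6) follow from `matvec(c, phi)` as in (6.2).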

### 6.2.2 Three-Phase Transformer

The transformer is widely distributed in the MTDC grid. For an *n*-winding transformer, its basic *V*-*I* characteristic is represented by the following differential equation [151]:

$$\mathbf{v}_{\mathbf{T}} = \mathbf{R}\mathbf{i}_{\mathbf{T}} + \mathbf{L}\frac{d}{dt}\mathbf{i}_{\mathbf{T}},\tag{6.10}$$

where $\mathbf{v_T}$ and $\mathbf{i_T}$ are both *n*-dimensional vectors of the terminal voltages and currents, **R** is an $n \times n$ diagonal matrix of winding resistances, and **L** contains the self- and mutual inductances of all windings.

The discretization of the above equation, using Trapezoidal rule, would lead to

$$\mathbf{i}_{\mathbf{T}}(t + \Delta t) = \mathbf{G}_{\mathbf{T}} \mathbf{v}_{\mathbf{T}}(t + \Delta t) + \mathbf{I}_{\mathbf{his}}(t), \tag{6.11}$$

where the admittance matrix and the history current are

$$\mathbf{G}_{\mathbf{T}} = [\mathbf{I} + \frac{\mathbf{L}^{-1}\mathbf{R}}{2}\Delta t]^{-1} \cdot \frac{\mathbf{L}^{-1}}{2}\Delta t, \tag{6.12}$$

$$\mathbf{I}_{\mathbf{his}}(t) = 2\mathbf{G}_{\mathbf{T}}(\mathbf{I} - \mathbf{R}\mathbf{G}_{\mathbf{T}})\mathbf{v}_{\mathbf{T}}(t) + (\mathbf{I} - 2\mathbf{R}\mathbf{G}_{\mathbf{T}})\mathbf{I}_{\mathbf{his}}(t - \Delta t). \tag{6.13}$$

In the DC grid, the 3-phase transformer has 6 windings. Consequently, the admittance matrix for systems containing it has a minimum of $6 \times 6$ elements. Unlike a $2 \times 2$ matrix equation, which can be solved directly, a system with more than 3 nodes is solved more efficiently by Gaussian elimination, which is therefore employed for the solution of the transformer. A universal form of an *N*-node system containing the transformer can be written as

$$\begin{bmatrix} \mathbf{v}_{\mathbf{T}} \,|\, v_{ext7}, \dots, v_{extN} \end{bmatrix}^{T} = \left( \begin{bmatrix} \mathbf{G}_{\mathbf{T}} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}_{N \times N} + \mathrm{diag}[G_{ext1}, \dots, G_{extN}] \right)^{-1} \cdot \left( \begin{bmatrix} \mathbf{I}_{\mathbf{his}} \,|\, \mathbf{0} \end{bmatrix}^{T} + \begin{bmatrix} J_{ext1}, \dots, J_{extN} \end{bmatrix}^{T} \right), \tag{6.14}$$

where the 3-phase transformer corresponds to the first 6 nodes, and elements with subscript *ext* are contributed by its surrounding components. The external conductance results in a diagonal matrix, which is added with the inherent transformer admittance matrix. Similarly, the current contribution vector from the outer system is also combined with that of the transformer. Thus, the interaction between the transformer and its neighboring circuits can be obtained by solving the above equation.
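The assembly and solution in (6.14) can be illustrated with the following Python sketch. For brevity it uses a hypothetical 2-terminal $\mathbf{G_T}$ inside a 3-node system rather than the 6-winding transformer; all numerical values and function names are placeholders, and the sign conventions are taken directly from (6.14).

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def assemble_system(g_t, i_his, g_ext, j_ext):
    """Build the N-node matrix and right-hand side of (6.14).

    g_t and i_his occupy the first len(g_t) nodes; g_ext adds to the diagonal
    of every node and j_ext to every entry of the current vector.
    """
    n, n_t = len(g_ext), len(g_t)
    g = [[0.0] * n for _ in range(n)]
    for r in range(n_t):
        for c in range(n_t):
            g[r][c] = g_t[r][c]
    for k in range(n):
        g[k][k] += g_ext[k]
    j = [j_ext[k] + (i_his[k] if k < n_t else 0.0) for k in range(n)]
    return g, j


# Hypothetical scaled-down example: 2 transformer terminals, 3 nodes in total.
G, J = assemble_system(
    g_t=[[2.0, -1.0], [-1.0, 2.0]],
    i_his=[0.5, -0.5],
    g_ext=[1.0, 1.0, 4.0],
    j_ext=[1.0, 0.0, 2.0],
)
v = solve(G, J)
```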

#### 6.2.3 Frequency Dependent Line Model

The transmission line linking one electrical component with another provides an inherent circuit partitioning method to the power system, due to the time delay induced by traveling waves. The frequency dependent line model (FDLM) is able to describe all underground cables and overhead line geometries accurately in the phase domain [152]. The graphic form it takes for EMT computation is the same as that of other simpler line models, such as the Norton equivalent circuit shown in Fig. 6.2(a), where the history current of an arbitrary terminal is expressed by

$$I_{his(k/m)}(t+1) = \mathbf{Y}_{\mathbf{c}} * v_{(k/m)}(t+1) - 2\mathbf{H} * I_{(m/k)r}(t-\tau), \tag{6.15}$$

where \* symbolizes convolution, $I_{(m/k)r}$ is the reflected current, and $\mathbf{Y_c}$ and $\mathbf{H}$ denote the characteristic admittance matrix and the propagation matrix, respectively. As the equation shows, the history term at terminal *k* is a function of the current at terminal *m* and vice versa, which means the two terminals interact, and the traveling wave is manifested by the travel time $\tau$.

With $I_{his}$ and the admittance $G$ known, the terminal voltage $v_{(k/m)}$ can be obtained by solving the circuit in which the FDLM is located. Then, the terminal current is calculated by

$$i_{(k/m)}(t) = G \cdot v_{(k/m)}(t) - I_{his(k/m)}(t), \tag{6.16}$$

where  $i_{(k/m)}$  is used to calculate the incident current, by which the history currents can be updated as

$$I_{(k/m)i}(t+1) = \mathbf{H} * I_{(m/k)r}(t-\tau). \tag{6.17}$$

To facilitate circuit computation, the Norton equivalent circuit of FDLM is converted to its Thévenin counterpart, as shown in Fig. 6.2(b). Consequently, voltages  $V_{hisk}$  and  $V_{hism}$  participate in circuit EMT computation, whereas the update of FDLM's parameters is undertaken with currents  $I_{hisk}$  and  $I_{hism}$ .
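For a single scalar terminal, the Norton-to-Thévenin conversion and its consistency with (6.16) and (6.23) can be checked with the short Python sketch below. The function names are hypothetical, and the sign conventions are assumed to be those of (6.16) and (6.23); a multi-conductor line would use matrices instead of scalars.

```python
def norton_to_thevenin(g, i_his):
    """Return (Z, V_his) of the Thevenin branch equivalent to the Norton pair."""
    z = 1.0 / g
    return z, i_his * z


def terminal_voltage(i_branch, z, i_his):
    """Terminal voltage from the branch current, as in (6.23)."""
    return i_branch * z + i_his * z


def terminal_current(g, v, i_his):
    """Terminal current from the terminal voltage, as in (6.16)."""
    return g * v - i_his


# Hypothetical values: admittance in siemens, currents in amperes.
g, i_his, i_branch = 0.02, 3.0, 1.5
z, v_his = norton_to_thevenin(g, i_his)
v = terminal_voltage(i_branch, z, i_his)
i_back = terminal_current(g, v, i_his)   # recovers the branch current
```

The round trip returning the original branch current is what allows the Thévenin voltages to take part in the circuit solution while the history update continues to work with the Norton currents.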



Figure 6.2: General form of a frequency-dependent transmission line model: (a) Norton equivalent circuit, and (b) the Thévenin equivalent circuit.



Figure 6.3: MMC-based converter station controller.

#### 6.2.4 Aggregated Wind Farm EMT Model

The rectifier collects wind energy and provides a stable AC voltage to an array of doubly-fed induction generators (DFIGs), which may be located in regions where an AC grid is not available. As shown in the *d-q* frame-based controller in Fig. 6.3, the AC voltage reference in the *d*-axis $V_{gd}^*$ is set for the actual output voltage to follow, and phase-shift control is adopted for regulating the MMC SM capacitor voltages. The configuration of the rectifier side is given in Fig. 6.4(a). The wind turbine model converts wind speed into torque and feeds it into the IM, which generates electric power under vector control [153]. The OWF is represented by aggregated DFIGs since the focus is on GPU simulation of power converters with nonlinear device-level details for system study.

From the perspective of circuit analysis, the rectifier side contains a large number of nodes. Moreover, the IM model cannot be solved along with its surrounding parts by a matrix equation. Thus, twofold voltage-current source (V-I) couplings are introduced for interfacing the IM to external circuits, as shown in Fig. 6.4(b). The voltage sources should be on the induction machine's side since the input of (6.1) is 3-phase voltage, and the current sources are solved in conjunction with the remaining parts.

Figure 6.4: Offshore wind farm integration into MTDC grid: (a) DFIG array connected with MMC, and (b) rectifier side EMT model with aggregated wind farms.

The aggregation of all DFIGs in an OWF introduces the other coupling at the point of common coupling (PCC). The voltage source is placed on the DFIG side, and on the AC side of the rectifier are the 3-phase current-controlled current sources, with a scaling factor $N_D$ representing the number of DFIGs in a wind farm.

## 6.2.5 IGBT/Diode Grouping

Figure 6.5: IGBT/diode nonlinear electro-thermal behavioral model.

A complete model for an IGBT and its anti-parallel diode can bring up to 5 nodes, as shown in Fig. 6.5. An MMC submodule may have a number of IGBT/diode pairs, e.g., a fundamental half-bridge SM (HBSM) contains two switches, and in case the rating of a single IGBT is not enough, one switch symbol may comprise a few parallel IGBTs. Consequently, the number of nodes in the split SM rises dramatically and the simulation is further slowed down. However, noticing that during operation the parallel devices are well balanced and their internal nodes have the same potential, a scaling factor $m$ is introduced to indicate their exact number. Then, as indicated in Fig. 6.6(a), the basic admittance matrix for the HBSM still has a dimension of 8, written as a combination of three parts:

$$\mathbf{G}_{SM} = \begin{bmatrix} G_C & \mathbf{0}_{1\times7} \\ \mathbf{0}_{7\times1} & \mathbf{0}_{7\times7} \end{bmatrix} + \begin{bmatrix} m \cdot \mathbf{G}_{\mathbf{S}_{5\times5}} & \mathbf{0}_{5\times3} \\ \mathbf{0}_{3\times5} & \mathbf{0}_{3\times3} \end{bmatrix} + \begin{bmatrix} \mathbf{0}_{4\times4} & \mathbf{0}_{4\times4} \\ \mathbf{0}_{4\times4} & m \cdot \begin{bmatrix} G_{S11} & \cdots & G_{S14} \\ \vdots & \ddots & \vdots \\ G_{S41} & \cdots & G_{S44} \end{bmatrix} \end{bmatrix}, \tag{6.18}$$

where the element  $G_C$  represents the conductance of SM capacitor; the  $2^{nd}$  matrix lists all elements in the 5-node upper switch, while its lower counterpart is placed in the  $3^{rd}$  matrix, which only contains 16 elements after the  $5^{th}$  node is naturally grounded. Similarly, the current contribution vector can be expressed by

$$\mathbf{J}_{SM} = \begin{bmatrix} I_{hisC} & 0 & 0 & J_s & 0 & 0 \end{bmatrix} + m \cdot \begin{bmatrix} J_{S1} & J_{S2} & J_{S3} & J_{S4} & J_{S5} & 0 & 0 \end{bmatrix} + m \cdot \begin{bmatrix} 0 & 0 & 0 & 0 & J_{S1} & J_{S2} & J_{S3} & J_{S4} \end{bmatrix}, \tag{6.19}$$

where  $I_{hisC}$  and  $J_s$  are the capacitor's history current and arm current, respectively. The SM nodal voltages are subsequently obtained by

$$\mathbf{U}_{\mathbf{SM}} = \mathbf{G}_{\mathbf{SM}}^{-1} \cdot \mathbf{J}_{\mathbf{SM}}. \tag{6.20}$$

The full-bridge SM (FBSM) is another topology used in MMC-based HVDC transmission for DC-side fault ride-through. It originally contains 15 nodes; however, since the $4^{th}$ IGBT is constantly in the off state during normal operation, it can be omitted, and the new FBSM in Fig. 6.6(b) has 13 nodes. Its admittance matrix and current contribution vector can be acquired in a similar manner, and the nodal voltages are also calculated by (6.20).
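The three-part structure of the HBSM admittance matrix in (6.18) can be illustrated with the Python sketch below. The switch matrices passed in are arbitrary placeholders rather than device data, and the function name is hypothetical; the point is only the block placement and the scaling by $m$.

```python
def assemble_hbsm(g_c, g_upper, g_lower, m):
    """Assemble the 8x8 HBSM admittance matrix following (6.18).

    g_upper: 5x5 matrix of the upper switch (nodes 1-5).
    g_lower: 4x4 matrix of the lower switch (nodes 5-8) after its grounded
             node has been removed.
    m:       number of balanced parallel IGBT/diode pairs per switch.
    """
    n = 8
    g = [[0.0] * n for _ in range(n)]
    g[0][0] += g_c                                  # SM capacitor
    for r in range(5):
        for c in range(5):
            g[r][c] += m * g_upper[r][c]            # upper switch block
    for r in range(4):
        for c in range(4):
            g[4 + r][4 + c] += m * g_lower[r][c]    # lower switch block
    return g


# Placeholder matrices chosen only to make the block overlap visible.
upper = [[1.0] * 5 for _ in range(5)]
lower = [[2.0] * 4 for _ in range(4)]
g_sm = assemble_hbsm(g_c=10.0, g_upper=upper, g_lower=lower, m=3)
```

The two switch blocks share the fifth node, so that diagonal entry receives contributions from both switches, while the matrix dimension stays at 8 regardless of how many devices are paralleled.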



Figure 6.6: Partitioned SM nonlinear behavioral model: (a) Half-bridge submodule, and (b) full-bridge submodule.

# 6.3 MTDC Grid GPU Program Design

#### 6.3.1 MTDC Multi-Level Partitioning Scheme

The apparent irregularity of the CIGRÉ DC grid's configuration and its reasonable number of components mean that the DC grid is close to a realistic project. To obtain a circuit structure suitable for massively parallel GPU implementation and a subsequent speedup, the DC grid is partitioned at 3 levels, two of which rely on transmission line modeling. The first level is the natural separation of converter stations by DC transmission lines. As each station connects to a different number of other stations, the configuration of the DC yards also varies. In Fig. 6.7, the partitioning of the 4-terminal DCS2 is shown as an example, where FDLMs are discretized to separate one station from another. Despite that, from a mathematical point of view, a converter station still corresponds to an admittance matrix with a huge dimension, since both the MMC and the DC yard where the HHB is located contain hundreds of nodes. Thus, facilitated by the MMC DC capacitor, which can be deemed a TLM-link, the second level of circuit partitioning is applied to detach them. The third level of partitioning, introduced in previous sections, separates the MMC submodules from the arms and is used in the modeling of the HHB.

It can be noticed that some virtual branches are created in DC yards to enable all of them to have the same configuration. Moreover, identical names are assigned to components at the same position in different DC yards, which enables programming the DC yard as one GPU kernel. However, distinct circuit topologies lead to different computation algorithms. For example, the DC yards of Cm-B3 and Cm-F1 can be uniformly written as:

$$\mathbf{I_m} = \begin{bmatrix} \sum_{i=1}^{2} Z_{HHB_i} + Z_x + Z_y & -Z_{HHB_2} - Z_y \\ -Z_{HHB_2} - Z_y & Z_T + Z_{HHB_2} + Z_y \end{bmatrix}^{-1} \begin{bmatrix} V_y + \sum_{i=1}^{2} (-1)^i V_{HHB_i} - V_x \\ 2v_m^i - V_{HHB_2} - V_y \end{bmatrix}, \tag{6.21}$$

where $Z_{HHB}$ and $V_{HHB}$ are the equivalent impedance and voltage contribution of an HHB, $Z_{x/y}$ and $V_{x/y}$ represent the FDLM, and $v_m^i$ constitutes the TLM-link model of a DC capacitor along with its impedance $Z_T$.

Figure 6.7: Circuit partitioning of HVDC stations in the DC subsystem DCS2 by transmission line models.

On the contrary, in single-line DC yards, there is only one actual loop, and the mesh current can be obtained more conveniently by the following algebraic equation:

$$I_{m1} = \frac{2v_m^i - V_{HHB_1} - V_x}{Z_T + Z_{HHB_1} + Z_x}, \tag{6.22}$$

since it is obvious that  $I_{m2}=0$ .

In the overall CIGRÉ B4 DC grid, one station may connect up to three other stations, e.g., Cb-A1 is connected with Cb-C2, Cb-B1, and Cb-B2. To enable all DC yards to be computed by one kernel and subsequently achieve high parallelism, rather than by 2 kernels with lower parallelism, the standard DC yard should have 3 branches. Thus, the virtual line currents in double-branch DC yard and single-branch DC yard are  $I_z=0$  and  $I_y=I_z=0$ , respectively, whereas the actual branch current can be determined from mesh currents calculated by either (6.21) or (6.22).
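A single routine covering both (6.21) and (6.22) might look like the following Python sketch, which mirrors the idea of one kernel serving every DC yard. The function name, the `double_branch` flag, and the numerical values are hypothetical; the mapping from mesh currents to individual branch currents depends on Fig. 6.7 and is not reproduced here.

```python
def mesh_currents(z_hhb, v_hhb, z_x, v_x, z_y, v_y, z_t, v_inc, double_branch):
    """Return (I_m1, I_m2) for a DC yard.

    z_hhb, v_hhb: pairs (HHB1, HHB2) of equivalent impedances and voltages.
    v_inc:        incident voltage v_m^i of the DC capacitor TLM-link.
    """
    if not double_branch:
        # Single-line DC yard, (6.22); the second mesh does not exist.
        i_m1 = (2 * v_inc - v_hhb[0] - v_x) / (z_t + z_hhb[0] + z_x)
        return i_m1, 0.0
    # Double-branch DC yard, (6.21), solved with the closed-form 2x2 inverse.
    a11 = z_hhb[0] + z_hhb[1] + z_x + z_y
    a12 = -z_hhb[1] - z_y
    a22 = z_t + z_hhb[1] + z_y
    b1 = v_y - v_hhb[0] + v_hhb[1] - v_x
    b2 = 2 * v_inc - v_hhb[1] - v_y
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Hypothetical values for illustration only.
args = dict(z_hhb=(0.1, 0.2), v_hhb=(1.0, 2.0), z_x=5.0, v_x=10.0,
            z_y=6.0, v_y=20.0, z_t=0.5, v_inc=100.0)
i1, i2 = mesh_currents(double_branch=True, **args)
i1_single, i2_single = mesh_currents(double_branch=False, **args)
```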

| Variable Sort1 (DCY)          | Variable Sort2 (FDLM ← DCY) |
|-------------------------------|-----------------------------|
| $I_{hisx}[0] = I_{hisk}[0]$   | $V_k[0] = V_x[0]$           |
| $I_{hisx}[1] = I_{hism}[0]$   | $V_k[1] = V_y[0]$           |
| $I_{hisx}[2] = I_{hisk}[2]$   | $V_k[2] = V_x[2]$           |
| $I_{hisx}[3] = I_{hism}[2]$   | $V_m[0] = V_x[1]$           |
| $I_{hisy}[0] = I_{hisk}[1]$   | $V_m[1] = V_y[2]$           |
| $I_{hisy}[2] = I_{hism}[1]$   | $V_m[2] = V_x[3]$           |
| $I_{hisy}[1] = 0$             |                             |
| $I_{hisy}[3] = 0$             |                             |
| $I_{hisz}[0]$                 |                             |

Figure 6.8: DC yard kernel structure and variable sort algorithm.

Comparing the FDLM voltage variable names in Fig. 6.2(b) and Fig. 6.7 shows that the names in the latter figure need to be sorted. For each pair of coupled DC yards, the one with the smaller station number is defined as terminal *k*. Thus, variables belonging to the station, $V_{x/y/z}$ and $I_{x/y/z}$, can be mapped to those of the FDLM, i.e., $v_{k/m}$ and $I_{his(k/m)}$. Taking DCS2 for example, between MMC0 and MMC2 is the DC line $L_1$; thus, $V_{y0}=V_{hisk1}$, $V_{y2}=V_{hism1}$, where the numbers in the subscripts on the left and right sides denote the MMC number and the line number, respectively. By the same reasoning, for the other two lines, the relationships between station variables and line variables are $V_{x0}=V_{hisk0}$, $V_{x1}=V_{hism0}$; $V_{x2}=V_{hisk2}$, $V_{x3}=V_{hism2}$, while the variables in virtual branches do not have effective assignments. On the other hand, updating the FDLM's information requires its terminal voltage and current, which obey the same rule. The former is calculated by

$$v_{k,m} = I_{x/y} \cdot Z_{x/y} + I_{his(x/y)} \cdot Z_{x/y},\tag{6.23}$$

and the latter  $i_{k,m}$  is chosen from either  $I_x$  or  $I_y$ .

The GPU kernel for the DC yard, which also includes the FDLM, is designed as in Fig. 6.8. The mapping between DC yard and FDLM variables is implemented in CUDA C as a device function so that it can be called directly by global functions. The FDLM is split into 6 kernels because each of them has a different grid size. After the variable sort process described above, (6.15)-(6.17) can be processed in the FDLM kernel without distinguishing between terminals, which increases the parallelism. The introduction of redundant branches gives all DC yards the same inputs and outputs, facilitating concurrent computation on the GPU.
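The variable sort can be pictured as a pair of fixed index maps. The following Python sketch is illustrative only and is not the CUDA C device function of the thesis: the function names, the list-based storage, and the reading of Sort 1 as "DC yard history terms filled from FDLM history terms" are assumptions, while the index pairs are those listed in Fig. 6.8 for DCS2.

```python
# Index maps for DCS2 as listed in Fig. 6.8.
# Sort 1: (DC yard array, station index) <- (FDLM terminal, line index)
SORT1 = {
    ("x", 0): ("k", 0), ("x", 1): ("m", 0),
    ("x", 2): ("k", 2), ("x", 3): ("m", 2),
    ("y", 0): ("k", 1), ("y", 2): ("m", 1),
}
# Sort 2: (FDLM terminal, line index) <- (DC yard array, station index)
SORT2 = {
    ("k", 0): ("x", 0), ("k", 1): ("y", 0), ("k", 2): ("x", 2),
    ("m", 0): ("x", 1), ("m", 1): ("y", 2), ("m", 2): ("x", 3),
}


def sort1(i_hisk, i_hism, n_station=4):
    """Fill the DC yard history arrays I_hisx, I_hisy from the FDLM history terms."""
    src = {"k": i_hisk, "m": i_hism}
    # Entries of virtual (redundant) branches keep the value 0.
    out = {"x": [0.0] * n_station, "y": [0.0] * n_station}
    for (arr, station), (term, line) in SORT1.items():
        out[arr][station] = src[term][line]
    return out["x"], out["y"]


def sort2(v_x, v_y, n_line=3):
    """Fill the FDLM terminal voltages V_k, V_m from the DC yard voltages."""
    src = {"x": v_x, "y": v_y}
    out = {"k": [0.0] * n_line, "m": [0.0] * n_line}
    for (term, line), (arr, station) in SORT2.items():
        out[term][line] = src[arr][station]
    return out["k"], out["m"]
```

Because every assignment is independent of the others, each entry of the maps could be handled by its own thread, which is the property the sorted layout is meant to provide.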

#### 6.3.2 MMC GPU Kernel Design

Fig. 6.9(a) summarizes the MMC structure for all 3 types of conversion, i.e., rectifier, inverter, and DC-DC converter. The presence of the 2 capacitors  $C_1$  and  $C_2$  leads to a stable DC voltage, which justifies the second-level separation by TLM link. In the rectifiers and inverters, the MMC main circuit has 8 nodes, 6 of which are introduced by the transformer, while in the DC-DC converter, the total number of nodes reaches 10 since the topology is symmetric. The MMC arm takes the form of a Norton equivalent circuit, whose elements are

$$G_p = \frac{1}{Z_{Lu,d} + r_{arm}},\tag{6.24}$$



Figure 6.9: (a) EMT model of a 3-phase MMC main circuit, (b) a general controller scheme for various control targets.

$$J_p = G_p \cdot \left(\sum_{i=1}^{N} V_{pi} + 2v_{Lu,d}^i\right),\tag{6.25}$$

where  $r_{arm}$  is the resistance caused by the arm inductor. Then, the MMC can be solved by (6.14) in the EMT program.
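As a minimal numerical sketch of (6.24) and (6.25), the arm's Norton equivalent can be evaluated as follows. The function names are hypothetical, and the stub impedance $2L/\Delta t$ is the standard TLM inductor-stub value, assumed here because the arm inductor is described as a TLM stub; the thesis code itself is not reproduced.

```python
def tlm_inductor_impedance(inductance, dt):
    """Characteristic impedance of a TLM inductor stub (assumed form: Z_L = 2L/dt)."""
    return 2.0 * inductance / dt


def arm_norton(z_l, r_arm, v_p, v_l_inc):
    """Return (G_p, J_p) of one MMC arm following (6.24)-(6.25).

    z_l     : TLM-stub impedance of the arm inductor
    r_arm   : arm resistance
    v_p     : list of the N submodule-side voltage sources V_pi
    v_l_inc : incident pulse of the inductor stub
    """
    g_p = 1.0 / (z_l + r_arm)
    j_p = g_p * (sum(v_p) + 2.0 * v_l_inc)
    return g_p, j_p
```

With, for example, `z_l = 9.0`, `r_arm = 1.0`, three sources of 100, 200 and 300 V and an incident pulse of 50 V, the conductance is 0.1 S and the current source is 0.1 times (600 + 100) V. All six arms use the same function, so they can be evaluated independently.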

In Fig. 6.9(b), a general MMC controller structure is given. Depending on the actual demand, the outer-loop controller, which is based on the d-q frame, compares various feedback signals, such as DC and AC voltages and active and reactive powers, with their references. Then, the inverse Park transformation restores the signals to three phases, and the inner-loop controller employing phase-shift control (PSC) regulates individual phases. The similarity between these controllers allows a single general kernel to be written for the outer-loop controller, in which their differences are distinguished by the GPU thread number.

As can be seen, extensive symmetries exist in the three-phase MMC. The three phases have an identical topology, and so do the three MMC inner-loop controllers, each of which contains three parts: generation of carriers, averaging control, and balancing control (BC), as shown in Fig. 6.10(a). In CPU simulation, several identical algebraic operations must be conducted repeatedly because of the sequential implementation, e.g., the definition of *N* carriers, the summation of all DC capacitor voltages, and the generation of the IGBT gate voltage  $V_g$  for 2*N* SMs, all of which prolong the simulation time. On the GPU, however, the PSC computational structure is dramatically simplified, since most parts of the controller can be implemented in parallel. Accordingly, the PSC is composed of several kernels, and the signals transmitted between them are stored in global memory so that they can be accessed by other kernels.



Figure 6.10: MMC inner loop control for single-phase: (a) Phase-shift control in CPU, (b) massive thread parallel structure of PSC and SM on GPU.

After the SM DC voltages are obtained, they are summed up by multiple threads, as indicated in Fig. 6.10(b). For one phase, the CPU needs to conduct the add operation (2N-1) times, while on the GPU, the number of parallel addition steps is

$$N_{sum} = \log_2(2^M), \quad 2^{M-1} < 2N \le 2^M.\tag{6.26}$$

Thus, for an arbitrary MMC level, if 2N is less than  $2^M$ , the remaining entries are padded with zeros so that, in the addition process defined in *Kernel*<sub>0</sub>, an even number of variables is always retained until the last operation.
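The zero-padded pairwise summation behind (6.26) can be sketched as below. This is a sequential Python illustration of the reduction pattern, not the actual *Kernel*<sub>0</sub>; the function name and the in-place buffer are assumptions. Each pass of the inner loop stands for the work that independent GPU threads would perform in one step.

```python
def tree_sum(values):
    """Pairwise (tree) summation with zero padding.

    Returns the sum and the number of parallel addition steps,
    which equals M with 2^(M-1) < len(values) <= 2^M.
    """
    n = len(values)
    m = 0
    while (1 << m) < n:          # smallest M such that 2^M >= n
        m += 1
    # Pad with zeros so that every step halves an even-sized buffer.
    buf = list(values) + [0.0] * ((1 << m) - n)
    steps = 0
    stride = len(buf) // 2
    while stride >= 1:
        # On a GPU, each value of tid would be handled by its own thread.
        for tid in range(stride):
            buf[tid] += buf[tid + stride]
        stride //= 2
        steps += 1
    return buf[0], steps
```

For a phase with 2N = 12 capacitor voltages, the buffer is padded to 16 entries and the sum is obtained in 4 steps, compared with 11 sequential additions on the CPU.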

The bulk of averaging control is realized by  $Kernel_1$ , which corresponds to one MMC phase. Its output is sent to  $Kernel_2$ , where a massive thread parallel implementation of the balancing control is carried out. It should be pointed out that the repeated definition of the carriers, as on the CPU, can be avoided; instead, each carrier is defined only once in its thread and is stored in global memory. Finally,  $Kernel_3$  receives the output array  $V_g$  and the submodule calculation is conducted. Like previous kernels, each thread corresponds to one SM, enhancing the computational efficiency significantly by reducing the number



Figure 6.11: Hierarchical dynamic parallelism implementation of a 3-phase HVDC converter.

of identical calculations. The output array  $\mathbf{v}_{\mathbf{C}}$  is sent to the global memory so that *Kernel*<sub>0</sub> will be able to read it when a new simulation time-step starts.

As an actual MMC has 3 phases, the above massively parallel structure needs to be extended. From a circuit point of view, all 6 arms share the same configuration, and so do the 6N submodules. With regard to the controller, *Kernel*<sub>0</sub> and *Kernel*<sub>1</sub> of the PSC have 3 copies, one per phase, while *Kernel*<sub>2</sub> is launched as a compute grid of 6N threads on the GPU. Therefore, a new HVDC converter kernel, which contains the PSC kernels (*Kernel*<sub>0</sub>-*Kernel*<sub>2</sub>), the SM kernel, and the MMC main circuit kernel based on (6.14), is constructed using the dynamic parallelism feature, as Fig. 6.11 shows.

Compared with the previous GPU computational architecture, launching new compute grids from the GPU, rather than from the host CPU, enables a flexible expansion of the number of HVDC converters to constitute an MTDC system. Meanwhile, the number of threads in each compute grid is exactly the same as the number of corresponding circuit parts. For example, there is one outer-loop controller and one MMC main circuit, and accordingly, the kernels PQC and MMC each invoke only one thread, whereas  $Kernel_3$  for the SMs launches a grid of 2N threads. On the host, after initialization, the HVDC converter kernel is launched by the CPU, and it then invokes the 6 kernels with different compute grid scales. The input and output signals of each kernel are stored in global memory for the convenience of data exchange.

#### 6.3.3 Hybrid Circuit Breaker Model

Fig. 5.2(b) describes a typical structure of the hybrid HVDC breaker, which contains a large number of repetitive circuit units that could potentially produce a high degree of parallelism. As pointed out, the conventional full model containing as many components as a real HHB, rather than the simplified model with a significantly reduced number of components [154], is preferred in EMT simulation of the MTDC system as it features a higher fidelity and gives more details. As in the MMC, the *V*-*I* coupling can be applied as the third-level decoupling method, as shown in Fig. 5.2(d). Therefore, the HHB unit composed of the inserted current source  $J_s$ , the MOV, the LCS, and the MB becomes independent of the DC yard. This physically isolated structure results in a number of small matrix equations which can be calculated efficiently by parallel cores, and in particular, it suits the massive number of processing units of the GPU.

After partitioning, the HHB leaves the RCB, the current limiting inductor, and a series of voltage sources, denoted by  $V_p$ , in the DC yard, and all nonlinearities are excluded from the DC yard. Applying TLM-stub theory, the HHB's contributions to the DC yard, represented by a combination of impedance  $Z_{HHB}$  and voltage  $V_{HHB}$ , are given as

$$HHB_{DCyard} = Z_{HHB} \cdot J_s + V_{HHB} = (RCB + Z_L) \cdot J_s + 2v_L^i + \sum_{i=1}^{N_H} V_{pi},\tag{6.27}$$

where  $Z_L$  and  $v_L^i$  constitute the inductor TLM-stub model, and RCB denotes the resistance of the RCB itself. On the other hand, the nonlinearities are confined to the small HHB unit, so the simulation efficiency is improved markedly, because the originally extremely large circuit no longer has to be solved repeatedly to reach a convergent result. Instead, the matrix equation to which the Newton-Raphson method applies is merely two-dimensional, and depending on the status of the main breaker IGBTs, it has two forms, which can be expressed uniformly by

$$\mathbf{U_{HHB}} = \begin{bmatrix} R_{sD}^{-1} + T_{rt} \cdot G_{MB} + G_{MOV} + G_{(U-L)} & -R_{sD}^{-1} \\ -R_{sD}^{-1} & R_{sD}^{-1} + G_{Cs} \end{bmatrix}^{-1} \begin{bmatrix} J_s - I_{MOVeq} - (1 - T_{rt}) \cdot I_{MB} \\ 2v_{C_s}^i \cdot G_{C_s} \end{bmatrix},\tag{6.28}$$

where  $T_{rt}$  is a binary value indicating the steady state and switching state by 0 and 1, respectively, and  $R_{sD}$  is the equivalent resistance of the parallel  $R_s$  and D in the snubber. By detecting the convergent HHB unit's nodal voltages in each time-step,  $R_{sD}$  can be ascertained: it behaves as a diode when  $U_{HHB}(1)$  is larger than  $U_{HHB}(2)$ ; otherwise, it is purely  $R_s$ .
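The iteration on (6.28) can be illustrated with the following Python sketch. It is a simplified stand-in under stated assumptions rather than the thesis implementation: the MOV characteristic $i = I_{ref}(v/V_{ref})^{\alpha}$ and all default parameter values are hypothetical placeholders, the function name is invented, and the 2×2 system is solved by Cramer's rule instead of an explicit matrix inverse. In each pass the MOV is linearized into the companion pair $G_{MOV}$, $I_{MOVeq}$ around the latest $U_{HHB}(1)$, and the two-node equation is solved again until the nodal voltages stop changing.

```python
def solve_hhb_unit(j_s, t_rt, g_mb, i_mb, g_ul, r_sd, g_cs, v_cs_inc,
                   v_ref=3400.0, i_ref=1000.0, alpha=20.0,
                   tol=1e-9, max_iter=50):
    """Newton-Raphson solution of the two-node HHB unit equation (6.28).

    The MOV law i = i_ref * (v / v_ref)**alpha is a hypothetical placeholder.
    Returns (u1, u2, iterations).
    """
    g_s = 1.0 / r_sd
    u1 = u2 = v_ref                       # initial guess near the MOV knee
    for it in range(1, max_iter + 1):
        # Linearize the MOV around the latest u1 (companion model).
        ratio = abs(u1) / v_ref
        sign = 1.0 if u1 >= 0.0 else -1.0
        i_mov = sign * i_ref * ratio ** alpha
        g_mov = alpha * i_ref / v_ref * ratio ** (alpha - 1.0)
        i_moveq = i_mov - g_mov * u1
        # Assemble the symmetric 2x2 system of (6.28).
        a11 = g_s + t_rt * g_mb + g_mov + g_ul
        a12 = -g_s
        a22 = g_s + g_cs
        b1 = j_s - i_moveq - (1.0 - t_rt) * i_mb
        b2 = 2.0 * v_cs_inc * g_cs
        # Solve by Cramer's rule.
        det = a11 * a22 - a12 * a12
        n1 = (b1 * a22 - a12 * b2) / det
        n2 = (a11 * b2 - a12 * b1) / det
        converged = (abs(n1 - u1) < tol * v_ref and
                     abs(n2 - u2) < tol * v_ref)
        u1, u2 = n1, n2
        if converged:
            return u1, u2, it
    raise RuntimeError("HHB unit did not converge")
```

Since every HHB unit owns such a small system, one thread per unit can run this loop independently; the update of $R_{sD}$ and of the stored variables would follow only after the loop has returned.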

When the HHB is initiated following the detection of line fault, the instantaneous current during the breaking period is estimated by

$$i_{dc}(t) = \frac{V_{dc}}{R_p} \cdot (1 - e^{-\frac{R_p}{L_p}t}) + \frac{P_{dc}(0)}{V_{dc}} e^{-\frac{R_p}{L_p}t},\tag{6.29}$$

where  $R_p$  and  $L_p$  denote the equivalent resistance and inductance of the power flow path, and  $P_{dc}(0)$  is the power before fault. This indicates that the AC grid at rectifier stations provides energy to the fault position immediately, while at the inverter station, a negative



Figure 6.12: HHB unit kernel design and its EMT calculation manner in conjunction with DC yard and LPR.

 $P_{dc}(0)$  means that the DC current will reduce to 0 before its AC grid could feed energy to the DC side. At the end of the breaking period  $\Delta t_1^f$ , the fault current reaches its peak, and the fault clearance period lasting  $\Delta t_2^f$  takes over. Thus, with a  $\delta$ % margin reserved to ensure safe operation, the number of HHB units  $N_H$  can be calculated from

$$N_H \cdot V_{CES}(1 - \delta\%) = V_{dc} + L_p \frac{i_{dc}(\Delta t_1^f)}{\Delta t_2^f},\tag{6.30}$$

where *V*<sub>CES</sub> is IGBT's maximum collector-emitter voltage.
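A short worked sketch of (6.29) and (6.30) follows. The function names and every numerical value in the usage note are hypothetical and serve only to show the order of the calculation; rounding $N_H$ up to the next integer is also an assumption, since (6.30) is stated as an equality.

```python
import math


def fault_current(t, v_dc, r_p, l_p, p_dc0):
    """Estimated DC current during the breaking period, following (6.29)."""
    decay = math.exp(-r_p / l_p * t)
    return v_dc / r_p * (1.0 - decay) + p_dc0 / v_dc * decay


def hhb_unit_count(v_dc, l_p, i_peak, dt2, v_ces, margin_pct):
    """Number of HHB units from (6.30), rounded up to an integer."""
    required = v_dc + l_p * i_peak / dt2
    per_unit = v_ces * (1.0 - margin_pct / 100.0)
    return math.ceil(required / per_unit)
```

For instance, with the purely illustrative values `v_dc = 200e3`, `r_p = 1.0`, `l_p = 0.1` and `p_dc0 = 400e6`, the current starts from the pre-fault value `p_dc0 / v_dc` at t = 0 and grows towards `v_dc / r_p`. The peak `fault_current(2e-3, ...)` is then passed as `i_peak` to `hhb_unit_count` together with the clearance period, the IGBT rating and the margin.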

#### 6.3.3.1 HHB GPU Computational Kernel

The GPU computational structure for the HHB contains two parts: the voltage source side and the HHB units. The former is included in the linear DC yard function, while the latter constitutes an independent kernel with Newton-Raphson iteration, as Fig. 6.12 shows. For the same configuration as in Fig. 5.2(d), the key to the GPU's speed advantage over the CPU is its capability to exploit massive parallelism over all HHB units, rather than computing them sequentially or in a few batches when a multi-core CPU is available.

In the HHB unit kernel, the varistor, the LCS-UFD branch, and the IGBT DCFM are realized by CUDA C device functions so that they can be accessed by the kernel directly, and only the MOV function that causes nonlinearity may have to be called multiple times by the iterative Newton-Raphson process in which (6.28) is computed repeatedly until a precise result is derived. However, the update of variables stored in global memory as well as the determination of  $R_{sD}$  only takes place when the nodal voltages are convergent.

It should be pointed out that the HHB is always applied in conjunction with line protection (LPR). Various strategies have been proposed, and most of them rely on measurement of the DC line voltage and current, such as the voltage derivative strategy. Thus, the LPR kernel is briefly drawn to illustrate the coordination between the HHB kernel and the protection device. The DC yard of one converter station can have  $N_L$  transmission lines, meaning that theoretically the same number of HHBs should be installed, and potentially the same number of LPR algorithms as well. As a consequence, the total number of HHB units in one DC yard reaches a significant  $N_L \cdot N_H$ . This large disparity in numbers underlines the necessity of using dynamic parallelism to cater to this hierarchical structure, and as with the MMC, all kernels for the DC yard are also included in the HVDC kernel.

#### 6.3.4 Construction of Large-Scale MTDC Grids

A further expansion of the MTDC grid is carried out to simulate the Greater CIGRÉ DC Grid, which is composed of several interconnected CIGRÉ DC systems, as shown in Appendix C. The hierarchical GPU computational structure for this larger grid remains the same, and a current mainstream CPU can also share this structure when it conducts the simulation. Fig. 6.13 gives the pseudo code for the CPU employing OpenMP<sup>®</sup> multi-threading. The parallelism is designed to start from the discrete component level, and since the components have different sizes, the multi-threading function is applied several times. As shown, after separating the HHB units from the DC yard, all functions become available for parallelization to attain a fast simulation speed, and even though different MMCs can be connected arbitrarily in the MTDC system, this arbitrary relationship is moved from the *DCyard* function to the *variablesort* function. In addition, the inputs of the HHB are arrays, as each HHB contains  $N_H$  HHB units, each accounting for 1 element of the input array. Meanwhile, the efficiency of OpenMP<sup>®</sup> is highly dependent on the size of the *for loop*: the more iterations, the fuller the utilization of the multi-core CPU.

# 6.4 EMT Simulation Results with CFM-IGBT

At present, practical MMC-based systems that attract intensive study and are widely used in actual projects include the single 3-phase MMC, point-to-point HVDC transmission, and the MTDC grid with a moderate number (e.g., 4) of terminals. Thus, the GPU's performance in simulating these configurations determines its potential in time-domain simulations of various applications. Meanwhile, it is anticipated that large-scale DC grids will emerge in the future, and therefore, the CIGRÉ B4 DC grid and its extended version are also taken into consideration.



Figure 6.13: OpenMP<sup>®</sup> pseudo code for multi-core MTDC system CPU simulation.

#### 6.4.1 GPU Simulation of Basic MMC

The single-phase MMC with only the inner-loop regulation function is first tested under a variety of voltage levels to compare the computational speedup against an off-line simulation tool using the same time-step of 1 $\mu$ s; the results are summarized in Table 6.1. The effectiveness of the fine-grained circuit partitioning method in achieving efficient computation is manifested by the remarkable speedup attained in the CPU and GPU cases over PSCAD/EMTDC<sup>®</sup>, where the full switch-level model is used. Even when the time-step in the PSCAD/EMTDC<sup>®</sup> simulation is increased to 20 $\mu$ s, the tool still takes over 4000s to complete the simulation of a 129-L MMC. Meanwhile, as can be observed, with the same circuit configuration, the CPU is around 2 times faster than the GPU when the voltage level is low; but once the level exceeds 100, the potential of the GPU's computational capability in MMC circuit simulation emerges even though the converter has only one phase, and this advantage stems from the large quantity of SMs.

In Fig. 6.14, the accuracy of the GPU simulation results is validated against PSCAD/EMTDC<sup>®</sup>. The output has clear, countable voltage levels in the 5-L and 17-L converters, while the waveform is virtually sinusoidal when the level reaches 33. Furthermore, their amplitudes are identical. With a fixed DC link voltage of 800V, the average SM DC capacitor voltage decreases in inverse proportion to the number of SMs in an arm, and the variations in the upper and lower arms are exactly opposite. Meanwhile, small differences can be observed within one DC voltage cluster, e.g.,  $V_{C1}$ - $V_{C4}$  are not exactly the same, and neither are those

| Ph. No. | 1-ph TSSM |        |        | Speedup |        |      | 3-ph D |       | Sp.               |
|---------|-----------|--------|--------|---------|--------|------|--------|-------|-------------------|
| MMC     | ①: PSCAD® | ②: CPU | ③: GPU | ①/②     | ①/③    | ②/③  | CPU    | GPU   | $\frac{CPU}{GPU}$ |
| 5-L     | 33.1      | 9.68   | 20.7   | 3.42    | 1.60   | 0.47 | 32.6   | 49.0  | 0.67              |
| 9-L     | 55.1      | 10.23  | 20.9   | 5.39    | 2.64   | 0.49 | 50.8   | 50.2  | 1.01              |
| 17-L    | 125.8     | 11.25  | 21.0   | 11.18   | 5.99   | 0.54 | 89.9   | 51.5  | 1.75              |
| 33-L    | 389.0     | 13.42  | 21.4   | 28.99   | 18.18  | 0.63 | 166.0  | 54.3  | 3.06              |
| 65-L    | 2206      | 17.23  | 21.5   | 128.0   | 102.6  | 0.80 | 319.0  | 62.9  | 5.07              |
| 129-L   | 9753      | 25.49  | 22.3   | 382.6   | 437.35 | 1.14 | 618.6  | 74.4  | 8.31              |
| 257-L   | -         | 40.27  | 24.5   | -       | -      | 1.64 | 1241.9 | 99.4  | 12.49             |
| 513-L   | -         | 71.29  | 29.8   | -       | -      | 2.39 | 2396.9 | 151.8 | 15.79             |

Table 6.1: Execution time  $t_{exe}$  (s) of different platforms for 1s simulation duration

in the lower arm. The close results of two simulation methods indicate that the *V*-*I* coupling separating every submodule from the arm enables more efficient computation on the converter while it ensures the simulation precision.

As the foundation of HVDC and MTDC systems, as well as other applications, the 3-phase MMC is frequently studied in simulation tools, and therefore, the computational burden of a 3-phase MMC-based inverter on the GPU is also tested and listed in Table 6.1 for various voltage levels. In order to retain IGBT device-level information, the DCFM is used, and therefore, a maximum time-step of  $1\mu$s is chosen for all the systems; however, the selection of the time-step will not change the speed ratio between the two processors. The results show that even though the inclusion of the outer-loop controller increases the irregularity, the speedup catches up as the number of submodules in an arm grows, because the slowdown caused by the controller is eventually outweighed by the SMs. Moreover, Table 6.1 illustrates that for one 3-phase MMC, it is not advisable to use the GPU for low- and medium-voltage applications, but in high-voltage scenarios where dozens or even hundreds of SMs are placed in an arm, the GPU becomes advantageous with over 10 times speedup.
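The break-even point described above can be read off the 3-phase columns of Table 6.1 with a few lines of arithmetic. The sketch below is only a convenience for checking the table; the helper names are hypothetical, and the execution times are copied from Table 6.1.

```python
# 3-phase execution times (s) from Table 6.1: MMC level -> (CPU, GPU)
T_3PH = {
    5: (32.6, 49.0), 9: (50.8, 50.2), 17: (89.9, 51.5), 33: (166.0, 54.3),
    65: (319.0, 62.9), 129: (618.6, 74.4), 257: (1241.9, 99.4),
    513: (2396.9, 151.8),
}


def speedups(table):
    """CPU-to-GPU execution time ratio for every MMC level."""
    return {level: cpu / gpu for level, (cpu, gpu) in table.items()}


def first_level_above(table, threshold):
    """Lowest MMC level whose CPU/GPU ratio exceeds the threshold, or None."""
    for level in sorted(table):
        cpu, gpu = table[level]
        if cpu / gpu > threshold:
            return level
    return None
```

Calling `first_level_above(T_3PH, 1.0)` gives the level at which the GPU starts to outperform the CPU, and a threshold of `10.0` gives the level from which the speedup exceeds ten.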

The switching transient of a semiconductor switch is reflected directly by the rise/fall time and ultimately affects the junction temperature. The simulated curves of the IGBT rise and fall times and the experimental results available in the datasheet under different collector currents are shown in Fig. 6.15(a). Three linear sections are used for the  $t_f$ - $I_C$  curve, while the rise time curve requires only two, and when the y-axis is made logarithmic, as in the datasheet, both curves bend. Fig. 6.15(b)-(f) are device-level results from a 5-level MMC with a 16kV DC bus, 2kHz carrier frequency, and 2kA, 60Hz output current. The relationship between the IGBT average power loss and its switching frequency is drawn in Fig. 6.15(b). The average losses in both switches rise steadily with the frequency, as the transient power loss becomes increasingly significant. Meanwhile, the loss in the lower switch is more severe, resulting in a higher junction temperature than its counterpart, as the upper diode and lower IGBT are expected to be subjected to a larger



Figure 6.14: Single-phase MMC results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a)-(c) 5-level, 17-level, and 33-level MMC output voltages, (d)-(f) SM DC capacitor voltages of 5-level, 17-level, and 33-level MMCs.

current, as shown in Fig. 6.15(d) and Fig. 6.15(f), which also show intensive diode reverse recovery and IGBT turn-on overshoot currents that are not available in the ideal switch model. The above device-level results have been validated against SaberRD<sup>®</sup>, an industry-standard tool frequently referred to for device-level information to guide power converter design evaluation.

#### 6.4.2 GPU Simulation for Point-to-Point HVDC Transmission

The HVDC transmission system is a prevalent application of the MMC, and the GPU's performance relative to the CPU is particularly important since it decides the popularity of this new time-domain simulation platform. The 2-terminal DC subsystem DCS1 of Fig. 6.1 is selected as the testbench, where the DC line voltage is set to  $\pm 100$ kV. Since the installation of HHBs in the DC yard of an HVDC system is not compulsory, both cases are tested: in the first case, the system operates without any HHB, while HHBs are included for fault



Figure 6.15: IGBT device-level performance ((c)-(f) results from proposed model (left) and SaberRD<sup>®</sup> (right)). (a) variation of turn-on and -off times, (b) averaged power loss under different switching frequencies, (c) SM upper IGBT junction temperature, (d) upper switch current waveform, (e) SM lower IGBT junction temperature, and (f) lower switch current waveform.

isolation in the second case.

Table 6.2 indicates that the CPU execution time almost doubles as the voltage level doubles, because the calculation burden of the SMs is comparatively higher than that of the other circuit components. The inclusion of HHBs in *Case-2* provides extra speedup that makes the GPU simulation faster even when the voltage level is very low, although this benefit is gradually neutralized as the MMC level rises. Considering that in a practical HVDC project the voltage level is always in the range of hundreds of kilovolts, the GPU is highly preferable to the CPU as a new simulation platform for such applications owing to a speedup of nearly 30 times, even though its computational capability has yet to be fully utilized.

| Case      | Case-1 $t_{exe}$ (s) |       | Speedup           | Case-2 $t_{exe}$ (s) |       | Speedup           |
|-----------|----------------------|-------|-------------------|----------------------|-------|-------------------|
| MMC Level | CPU                  | GPU   | $\frac{CPU}{GPU}$ | CPU                  | GPU   | $\frac{CPU}{GPU}$ |
| 5-L       | 59.6                 | 62.1  | 0.96              | 113.1  | 74.9          | 1.51              |
| 9-L       | 116.7                | 63.8  | 1.83              | 151.6  | 76.7          | 1.98              |
| 17-L      | 218.5                | 65.2  | 3.35              | 254.6  | 77.8          | 3.27              |
| 33-L      | 349.4                | 66.5  | 5.25              | 419.3  | 78.4          | 5.35              |
| 65-L      | 660.6                | 74.6  | 8.86              | 712.0  | 85.3          | 8.35              |
| 129-L     | 1298.0               | 87.8  | 14.78             | 1329.2 | 97.2          | 13.67             |
| 257-L     | 2551.5               | 112.4 | 22.70             | 2639.9 | 126.9         | 20.80             |
| 513-L     | 5134.7               | 165.2 | 31.08             | 5188.1 | 179.3         | 28.94             |

Table 6.2: CPU and GPU execution times of  $\pm 100$ kV HVDC for 1s simulation

Fig. 6.16 presents test results of the DCS1 HVDC system to illustrate the correctness of the modeling method and the high accuracy of the GPU simulation. Fig. 6.16(a) gives the waveforms for the simultaneous start of both the rectifier and inverter stations. The inverter's voltage rises to around  $\pm 100$ kV immediately after the simulation starts, and the voltage on the rectifier side closely follows, retaining a small margin when the HVDC link enters steady state at around 1.5s to ensure that the power is delivered in the correct direction; the two poles in a station show opposite voltage polarity. The rectifier power step test is carried out in Fig. 6.16(b)-(c): at t=4.0s, the power order is changed abruptly from 400MW to 200MW, so the actual power plunges and gradually stabilizes around the reference. Then, the order steps back to 400MW, and the actual power quickly ascends to follow it. Correspondingly, the DC pole-to-pole voltages exhibit some perturbations, but their amplitude is small. It can also be noticed that the DC voltage gap between the 2 stations Cm-A1 and Cm-C1 is smaller during reduced power transmission, because the DC current in this scenario halves, causing less voltage drop on the transmission corridor.

The inverter voltage step test results are given in Fig. 6.16(d), where the pole-to-pole voltage is shown. Before t=3s, the DC voltages are kept at approximately 1 p.u., with the rectifier station having a slight margin. Then, both curves drop as the voltage order in the inverter station is altered to 0.8 p.u., and the HVDC system operates in reduced voltage mode until 3 seconds later, when the voltages recover as the order steps up to 1 p.u. Fig. 6.16(e)-(f) are the results of a DC line-to-line fault lasting 5ms, marked as *F1* at the rectifier Cm-A1 side in Fig. 6.1. The HHBs are disabled, so the fault current soars to over 11kA immediately after the fault occurs, followed by damped oscillations lasting dozens of milliseconds. Afterwards, the current is able to restore to the pre-fault value; nevertheless, with 100mH inductors installed in the DC yards, the fault's instantaneous impact on the converters' DC voltages is negligible. Corresponding off-line simulations are conducted with PSCAD/EMTDC<sup>®</sup>, whose virtually identical waveforms confirm that the GPU simulation, while more efficient, is equally accurate.

Figure 6.16: Subsystem DCS1 results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a) System simultaneous start, (b)-(c) rectifier station power step tests, (d) inverter voltage step test, and (e)-(f) DC line-to-line fault lasting 5ms.

#### 6.4.3 GPU Simulation of MTDC Grid Test Cases

The MTDC system is a promising topology, and several projects with a few interconnected terminals have already been constructed. The DCS2 subsystem can be taken as a typical example since its scale is very close to that of existing projects as well as those under research and development.

Installation of HHBs in the MTDC system would enhance its resilience to DC line faults, and Fig. 6.17 provides such test results for the 4-terminal DC system. Before the line fault occurs at t=3s, the DC voltages of all stations are around 1 p.u., with those of the rectifier stations slightly above their counterparts, as Fig. 6.17(a) shows where the pole-to-pole



Figure 6.17: 4-terminal MTDC results of GPU simulation (top) validated by PSCAD/EMTDC<sup>®</sup> (bottom). (a) DC voltages of all stations, (b) DC line currents, (c) current waveform amplification of Lm1 at Cm-E1 side, (d) detailed actions of Cm-E1 HHB, (e) power export of each station, and (f) power transferred on DC lines.

voltages are drawn. It can be seen that none of them is severely affected by the fault owing to proper action of the HHBs. In contrast, Fig. 6.17(b) shows that the currents in the DC yards have significant surges at both MMC2 (Cm-F1) and MMC3 (Cm-E1) as the fault  $F_2$  occurs between them.  $I_{dc2}$ , which flows to Cm-E1, keeps increasing until the fault is isolated, while the polarity of  $I_{dc3}$  is reversed, as the fault forces Cm-E1 to operate as a freewheeling rectifier rather than as an inverter station as under normal conditions. The power transfer is restored in about 0.5s, and since Cm-E1 is isolated, MMC1 (Cm-B2) receives all the power from the other two terminals; therefore, its current  $I_{dc1}$  is doubled. The function of the HHBs on both terminals of the faulted line can be illustrated by the phase relation between their voltages and the line current. On the Cm-E1 side, as in Fig. 6.17(c), the current polarity reverses immediately after the fault, and in the next 2ms, it keeps rising while the breaking stage is under way. Then, the current is forced to divert to the MOV, whose voltage is clamped at around 3.4kV when all IGBTs in the HHB are turned off. Thus, the current begins to drop, with the slope determined by the MOV's protection voltage. From Fig. 6.17(d), specific HHB operation principles can be inferred. Initially, Cm-E1 receives power from Cm-F1, and the UFD-LCS is the main branch that the DC current passes through. When the fault is detected, the LCS turns off and consequently  $i_{LCS}$  drops to 0; as the main breaker remains on for the next 2ms, the current diverts to it, and because of the current limiting inductor,  $i_{MB}$  rises gradually from a negative value to a positive one. Following the turn-off of the MB, the current is again diverted, this time to the MOV, where it is quenched in the form of  $i_M$ . Fig. 6.17(e)-(f) show the power flow at different positions. Prior to the fault, Cm-B3 and Cm-F1, the two rectifiers, send approximately 800MW power to Cm-B2 and Cm-E1; thus, the power exchange  $P_{L1}$  on  $L_1$  is virtually 0. After the fault is cleared, Cm-F1 is no longer able to send power to Cm-E1; instead, its export goes entirely to Cm-B2. Thus, the power flow on  $L_1$  rises to 800MW, and together with the power from MMC0 (Cm-B3), the remaining inverter receives nearly 1600MW through  $L_0$ . Details from PSCAD/EMTDC<sup>®</sup> are also given for validation and are virtually the same. It should be pointed out that the simplified HHB model without snubber is used in the PSCAD/EMTDC<sup>®</sup> simulation package; thus, it cannot reveal phenomena peculiar to a full HHB model, such as the voltage sag over the MOV caused by the snubber.

Table 6.3 indicates that the GPU simulation is hugely advantageous over the CPU even in simulating medium-scale MTDC systems. In the default single-CPU mode, it takes 562s to execute 1s of simulation of the  $\pm$ 200kV DCS2 with 5-level MMCs, and this value rises dramatically when the MMC level is raised to that required to withstand high voltage, reaching almost 10500s when the level is 513. In stark contrast, the GPU execution time is similar to that for the HVDC system, even though the scale has been doubled. Thus, in this case, the GPU attains a higher speedup, approximately 50 times for a normal 4-terminal DC system with a reasonable voltage level. On the other hand, there could be up to 12288 SMs in DCS2 when the voltage level is 513. Thus, the multi-core CPU (MCPU) framework is also tested. Compared with the default mode, its execution time is merely reduced by a factor of around 2. The computational capability of the MCPU architecture cannot be fully utilized when the MMC voltage level is low, as launching multiple threads takes a significant part of the total time; when the MMC level reaches hundreds, MCPU gains a higher speedup over the single CPU, but it is still about 20 times slower than the GPU.
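The figures quoted for the 513-level DCS2 case can be reproduced with simple arithmetic. In the sketch below, the helper names are hypothetical, the relation "an (N+1)-level MMC has N submodules per arm" is an assumption that happens to reproduce the quoted 12288 SMs, and the three execution times are copied from the 513-L row of Table 6.3.

```python
def total_submodules(n_stations, mmc_level, arms_per_station=6):
    """Total SM count, assuming N = level - 1 submodules in each of the 6 arms."""
    return n_stations * arms_per_station * (mmc_level - 1)


# Execution times (s) of the 513-level DCS2 case, copied from Table 6.3.
T_CPU, T_MCPU, T_GPU = 10427.9, 4352.0, 194.4


def platform_ratios(t_cpu, t_mcpu, t_gpu):
    """Pairwise execution time ratios between the three platforms."""
    return {
        "CPU/MCPU": t_cpu / t_mcpu,
        "CPU/GPU": t_cpu / t_gpu,
        "MCPU/GPU": t_mcpu / t_gpu,
    }
```

`total_submodules(4, 513)` covers the four stations of DCS2, and `platform_ratios(T_CPU, T_MCPU, T_GPU)` returns the three ratios that the paragraph above rounds to about 2, 50 and 20.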

As the scale of the DC grid enlarges, the speedup also increases, as shown in Table 6.4. In the CIGRÉ DC system, it takes the CPU thousands of seconds to compute 1s of results even though the MMC has only 5 levels, and the time soars to about 30000s when the MMCs have 513 levels. The situation is improved by adopting MCPU; however, it still requires on the order of thousands of seconds. Meanwhile, the GPU simulation time remains mostly the same as for DCS2, although the scale has been nearly quadrupled. As a result, the GPU gains a speedup ranging from 26 to 90, much higher than the MCPU simulation, which only has 2 to 5 times of speedup.

| MMC level | CPU $t_{exe}$ (s) | MCPU $t_{exe}$ (s) | GPU $t_{exe}$ (s) | CPU/MCPU speedup | CPU/GPU speedup | MCPU/GPU speedup |
|-----------|---------|--------|-------|------|-------|-------|
| 5-L       | 561.8   | 387.1  | 77.0  | 1.45 | 7.30  | 5.03  |
| 9-L       | 636.4   | 455.5  | 78.2  | 1.40 | 8.14  | 5.82  |
| 17-L      | 808.9   | 521.6  | 79.2  | 1.55 | 10.21 | 6.59  |
| 33-L      | 1149.6  | 735.6  | 80.8  | 1.56 | 14.23 | 9.10  |
| 65-L      | 1772.4  | 941.6  | 89.2  | 1.88 | 19.87 | 10.57 |
| 129-L     | 2988.0  | 1372.9 | 100.9 | 2.18 | 29.61 | 13.61 |
| 257-L     | 5577.3  | 2424.1 | 132.8 | 2.30 | 42.00 | 18.25 |
| 513-L     | 10427.9 | 4352.0 | 194.4 | 2.40 | 53.64 | 22.39 |

Table 6.3: CPU and GPU execution times of $\pm 200$kV DCS2 for 1s simulation

Table 6.4: CPU and GPU execution times of the CIGRÉ B4 DC system for 1s simulation

| MMC level | CPU $t_{exe}$ (s) | MCPU $t_{exe}$ (s) | GPU $t_{exe}$ (s) | CPU/MCPU speedup | CPU/GPU speedup | MCPU/GPU speedup |
|-----------|---------|--------|-------|------|------|------|
| 5-L       | 1959.5  | 1010.0 | 75.2  | 1.94 | 26.1 | 13.4 |
| 9-L       | 2238.7  | 1033.9 | 77.7  | 2.17 | 28.8 | 13.3 |
| 17-L      | 2658.8  | 1069.2 | 79.8  | 2.49 | 33.3 | 13.4 |
| 33-L      | 3594.3  | 1080.6 | 82.0  | 3.33 | 43.8 | 13.2 |
| 65-L      | 5279.5  | 1507.3 | 92.2  | 3.50 | 57.3 | 16.3 |
| 129-L     | 8847.7  | 2031.2 | 111.3 | 4.36 | 79.5 | 18.2 |
| 257-L     | 15819.6 | 3118.2 | 194.6 | 5.07 | 81.3 | 16.0 |
| 513-L     | 29939.6 | 5724.9 | 334.3 | 5.23 | 89.6 | 17.1 |

Some tests are also carried out to show that the GPU is the more efficient platform for studying the CIGRÉ DC test system. Power reversal is conducted by ordering the output power of the DC-DC converter Cd-E1 to ramp from -200MW to 400MW, and the impact of this single converter's behavior on the overall system is given in Fig. 6.18. Initially, MMC0 and MMC2, as rectifiers in DCS2, release 1.2GW, and MMC1 and MMC3 receive around 660MW and 330MW, respectively. The surplus 200MW is fed to DCS3, as can be seen from the fact that the combined output power of MMC6, MMC8, and MMC10 is 3.2GW, while the inverters MMC7 and MMC9 receive approximately 0.2GW more. During the power ramp, as expected, the output power of all rectifier stations remains virtually constant, and among the inverters only MMC5 absorbs a fixed 800MW, as DCS1 is relatively isolated from its counterparts. In DCS2, the power MMC3 receives almost triples after the process, while MMC1 is slightly affected during the process and recovers afterwards. Meanwhile, as the power now flows from DCS3 to DCS2, the power received by MMC7 and MMC9 reduces to around 1.55GW and 1.24GW, respectively, the sum of which is 400MW less than that provided by the rectifiers in that subsystem. The above process and its impact on the CIGRÉ DC system are validated by PSCAD/EMTDC<sup>®</sup>.

Figure 6.18: CIGRÉ DC grid power reversal simulation by GPU (left) and PSCAD/EMTDC<sup>®</sup> (right).

With regard to the Greater CIGRÉ DC grid, the advantage that the GPU holds is expected to be larger. Taking the 101-level MMC as an example, when the number of CIGRÉ DC systems rises from 2 to 8, the CPU and multi-core CPU take about 4 times and 2.8 times longer to compute, respectively, while the GPU execution time increases by less than 1.4 times. Thus, compared with single-CPU mode, GPU simulation achieves a speedup of about 90 to 270 times; in contrast, the multi-core CPU only achieves a speedup of approximately 7 to 11, as shown in Table 6.5.

The GPU's performance in simulating DC systems with both the TSSM and the proposed DCFM is summarized in Fig. 6.19. All three plots share the trait that the GPU takes only slightly longer to compute when the switch model shifts to the DCFM, regardless of the MMC level, whilst both the CPU and MCPU frameworks show a dramatic rise even on logarithmic axes, which explains why device-level semiconductor models are rarely used in CPU-based large system simulation. Meanwhile, it demonstrates that with the TSSM for system-level simulation only, the GPU still attains a speedup of over a dozen times, and the more complex switch model yields a much higher speedup. The adoption of the GPU greatly alleviates the computational burden caused by model complexity, making the involvement of device-level models in system-level simulation feasible.

Figure 6.19: GPU performance in simulation of different DC systems with IGBT TSSM and DCFM. (a) HVDC with HHB, (b) DCS-2, and (c) CIGRÉ B4 DC system.

| Number of CIGRÉ systems | Number of SMs | CPU time (s) | MCPU time (s) | GPU time (s) | CPU/MCPU speedup | CPU/GPU speedup | MCPU/GPU speedup |
|---|-------|-------|------|-------|------|-------|------|
| 2 | 13200 | 14549 | 1995 | 162.1 | 7.3  | 89.8  | 12.3 |
| 3 | 19800 | 22264 | 2810 | 179.9 | 7.9  | 123.8 | 15.6 |
| 4 | 26400 | 30249 | 3103 | 195.6 | 9.7  | 154.6 | 15.9 |
| 5 | 33000 | 36963 | 4067 | 200.9 | 9.1  | 184.0 | 20.2 |
| 6 | 39600 | 44391 | 4284 | 208.4 | 10.4 | 213.0 | 20.6 |
| 7 | 46200 | 52792 | 4868 | 210.0 | 10.8 | 251.4 | 23.2 |
| 8 | 52800 | 60121 | 5538 | 221.9 | 10.9 | 270.9 | 25.0 |

Table 6.5: CPU and GPU execution times of the Greater CIGRÉ DC system for 1s simulation
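The scaling figures quoted for the Greater CIGRÉ DC grid can be reproduced from the first and last rows of Table 6.5. The short Python sketch below is only an arithmetic check; the dictionary layout and the function names are illustrative and not part of the simulator.

```python
# Execution times (s) copied from Table 6.5 for 2 and 8 CIGRÉ systems.
times = {
    "CPU": (14549.0, 60121.0),
    "MCPU": (1995.0, 5538.0),
    "GPU": (162.1, 221.9),
}


def growth(platform):
    """Ratio of the 8-system execution time to the 2-system execution time."""
    t2, t8 = times[platform]
    return t8 / t2


def speedup(slow, fast, column):
    """Speedup of `fast` over `slow`; column 0 is 2 systems, column 1 is 8."""
    return times[slow][column] / times[fast][column]


for name in times:
    print(f"{name}: execution time grows by {growth(name):.2f}x")
print(f"CPU/GPU speedup: {speedup('CPU', 'GPU', 0):.1f} "
      f"to {speedup('CPU', 'GPU', 1):.1f}")
```

The growth ratios correspond to the "4 times", "2.8 times", and "less than 1.4 times" statements in the text.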

# 6.5 EMT Simulation Results with NBM-IGBT

The GPU used in this case is the Nvidia Tesla<sup>®</sup> V100 (Volta architecture). GPU implementation results at both device-level and system-level are demonstrated and validated by commercial off-line EMT-solvers which run on a 64-bit Windows<sup>®</sup> 10 operating system with 2.20GHz 20-core Intel Xeon<sup>®</sup> E5-2698 v4 CPU and 128GB RAM.

## 6.5.1 Device-Level Switching Transients

The IGBT/diode modeling method is verified by the commercial device-level simulation tool SaberRD<sup>®</sup> using its default Siemens IGBT module BSM300GA160D since it provides switch models that were experimentally validated.

In Table 6.6, the execution time of device-level simulation is compared by computing single-phase MMCs for a 100ms duration with a time-step of 100ns. It takes SaberRD<sup>®</sup> about 1700s to compute a 9-level converter, and the results no longer converge once the voltage level reaches 11. Therefore, it is infeasible for the device-level simulation package to conduct power system computation. In the meantime, the proposed IGBT/diode model and the decoupling method are also tested on a single-core CPU and the Nvidia Tesla<sup>®</sup> V100 GPU. The proposed model enables the CPU to achieve a speedup $SP_1$ of almost 18 times over SaberRD<sup>®</sup> for the 9-level MMC, and the corresponding speedup $SP_2$ of the GPU is nearly 11. The GPU overtakes the CPU when the MMC level reaches 21, where its speedup over the CPU, $SP_3$, becomes greater than 1.

Device-level results from a 9-L MMC with a reduced DC bus voltage of 8kV are given. With a dead-time $\Delta T=5\mu s$, a gate resistance of 10$\Omega$, and a gate voltage of ±15V, the switching transients in Fig. 6.20(a)-(c) are normal. A slight overshoot is observed in the IGBT turn-on current, which is caused by the diode reverse recovery process. Fig. 6.20(d) shows the impact of the gate driving conditions on switching transients, which is only available in device-level modeling. Adjusting the gate turn-off voltage to 0V leads to a tremendous current overshoot, which means this driving condition is hazardous to the IGBT. When the gate resistance is set to 15$\Omega$, the current rises more slowly, as indicated by the time interval $t_2$ being slightly larger than $t_1$. SaberRD<sup>®</sup> simulation is also conducted, and the good agreement validates the proposed IGBT/diode nonlinear behavioral model and the designed MMC GPU kernel.

Figure 6.20: Switching transients of behavioral IGBT/diode pair: (a) Turn-on, (b) turn-off, (c) diode reverse recovery, and (d) IGBT turn-on current under different gate conditions.

Table 6.6: NBM-based MMC execution time by various platforms for 100ms duration

| MMC level | SaberRD<sup>®</sup> time (s) | CPU time (s) | GPU time (s) | $SP_1$ | $SP_2$ | $SP_3$ |
|-----------|------|-------|-------|------|------|------|
| 5-L       | 709  | 56.2  | 159.1 | 12.6 | 4.5  | 0.35 |
| 7-L       | 1240 | 80.3  | 159.8 | 15.4 | 7.8  | 0.50 |
| 9-L       | 1720 | 98.4  | 163.7 | 17.5 | 10.5 | 0.60 |
| 11-L      | -    | 121.2 | 163.3 | -    | -    | 0.74 |
| 21-L      | -    | 260.3 | 206.0 | -    | -    | 1.26 |
| 33-L      | -    | 368.9 | 238.4 | -    | -    | 1.55 |

In Fig. 6.21(a)-(c), the switching patterns obtained from different tools are compared. The proposed NBM yields static and dynamic current waveforms that match those of SaberRD<sup>®</sup>. In contrast, PSCAD/EMTDC<sup>®</sup> is not able to give the actual current stress of an IGBT during operation, as the switching transients cannot be observed. Moreover, the model also determines the simulation accuracy. It is shown by Fig. 6.21(c) that with the default IGBT and diode model TSSM<sub>1</sub> and a typical time-step of $20\mu$s in PSCAD/EMTDC<sup>®</sup>, a current disparity of $\Delta I$=6A out of 190A is observed even under steady state. The result from PSCAD/EMTDC<sup>®</sup> becomes closer to that of the proposed NBM when an approximate on-state resistance and voltage drop of the IGBT/diode pair are assigned to its switch model TSSM<sub>2</sub>. The junction temperatures of the two complementary switches in a submodule are given in Fig. 6.21(d). A dramatic temperature surge is observed when the 9-level MMC starts to operate, and the curve decreases gradually as the converter enters steady state. The correctness of these results is validated by SaberRD<sup>®</sup>.

Figure 6.21: Switching pattern and IGBT junction temperature: (a) Upper switch current, (b) lower switch current, (c) IGBT junction temperatures, and (d) switching pattern difference between device-level model and two-state switch model.

## 6.5.2 Wind Farm Integration Dynamics

The DCS1 subsystem is taken to illustrate the integration of 100 wind turbines into the DC grid. Fig. 6.22(a) gives the power-wind speed characteristics of the DFIG. When the wind speed declines from 11m/s to 8m/s in 1s, as Fig. 6.22(b) indicates, the rotor mechanical velocity drops from the initial 188rad/s to around 137rad/s. Consequently, the rectifier side currents are almost halved; nevertheless, the AC voltage remains virtually constant due to the proper control of the MMC, as shown in Fig. 6.22(c)-(d). In Fig. 6.22(e), the output power of a single DFIG reduces from 1.96MW to about 0.76MW, which agrees with the *P-v* characteristics. Therefore, the power at both stations gradually ramps down from 200MW to about 76MW. Fig. 6.22(f) demonstrates the DC voltage fluctuation caused by the change in wind speed. The DC voltage at the inverter station has a momentary sag to 199kV, but it recovers immediately. The rectifier side DC voltage reduces as the power delivered between the two stations drops significantly. The above results are verified by PSCAD/EMTDC<sup>®</sup> simulation.

Figure 6.22: OWF integration into DCS1: (a) DFIG *P*-*v* characteristics, (b) wind speed and rotor mechanical velocity, (c) rectifier AC currents, (d) rectifier AC voltage, (e) converter station DC yard power, and (f) terminal DC voltages.

## 6.5.3 MTDC System Tests

Fig. 6.23 gives the HBSM- and FBSM-MMC responses to a DC line fault in DCS1. The pole-to-pole fault $F_1$ occurs at the center of the line at t=3s, and subsequently all IGBT gate signals are withdrawn. It is shown in Fig. 6.23(a) that the DC current in the FBSM case reduces to 0 after a few oscillations, while it eventually reaches over 10kA with the HBSM topology. Similarly, the FBSM achieves 0kV on the DC line, but its counterpart is unable to block the fault thoroughly since the freewheeling diodes operate as a rectifier. PSCAD/EMTDC<sup>®</sup> simulation results are also given for validation. Minor differences are observed due to the adoption of different switch models, i.e., the NBM and the TSSM. The fact that different on-state resistances of the TSSM lead to distinct DC current and voltage waveforms indicates the importance of accurate switch models even in system-level studies.

The power flow of the entire grid under steady state is shown in Fig. 6.24, where OWF1–5 send energy to the inland inverter stations. Cm-A1 in DCS1 receives virtually all 200MW of power from the OWF. In DCS2, the combined power that Cm-B2 and Cm-B3 receive is around 100MW more than that from OWF4 and OWF5 since Cd-E1 is ordered to deliver an additional 100MW. As a consequence, Cb-A1, Cb-B1, and Cb-B2 receive 892MW in total while OWF2 and OWF3 send around 1GW. The power distribution from PSCAD/EMTDC<sup>®</sup> simulation shows virtually identical values.

Table 6.7 shows the time CPUs and the NVIDIA Tesla<sup>®</sup> V100 GPU need to calculate the CIGRÉ B4 DC Grid for 1s duration with a time-step of 200ns. It can be seen that the single CPU is hardly able to simulate a practical DC system as it could take more than 1 million seconds. The situation is slightly improved by using multiple cores but they still require an extremely long period. In contrast, the V100 GPU can complete the simulation of 11 401-L MMCs in less than 1800 seconds, and there is no obvious difference in calculating HBSM-MMC or FBSM-MMC. Consequently, the GPU gains a remarkable speedup SP<sub>1</sub> over single CPU, i.e., 1302 and 2608 times when the MMC level is 401, and the speedup SP<sub>2</sub> reaches 134 and 265 over 20-core CPUs. As a further comparison, PSCAD/EMTDC<sup>®</sup> was unable to compute the full-scale CIGRÉ DC grid even with much simpler IGBT and diode models.



Figure 6.23: Inverter side HBSM- and FBSM-MMC response to DC fault (L: GPU simulation, R: PSCAD/EMTDC<sup>®</sup>): (a) DC currents, and (b) DC voltages.

| MMC level | 1 CPU core, HB (s) | 1 CPU core, FB (s) | 20 CPU cores, HB (s) | 20 CPU cores, FB (s) | V100 GPU, HB (s) | V100 GPU, FB (s) | $SP_1$, HB | $SP_1$, FB | $SP_2$, HB | $SP_2$, FB |
|-------|---------------------|---------------------|---------------------|---------------------|------|------|------|------|------|------|
| 51-L  | $2.3 \times 10^{5}$ | $4.0 \times 10^{5}$ | $2.6 \times 10^{4}$ | $5.3 \times 10^{4}$ | 901  | 908  | 254  | 444  | 29.3 | 58.5 |
| 101-L | $4.4 \times 10^{5}$ | $9.0 \times 10^{5}$ | $5.8 \times 10^{4}$ | $1.1 \times 10^{5}$ | 957  | 957  | 462  | 936  | 60.7 | 117  |
| 201-L | $9.5 \times 10^{5}$ | $2.1 \times 10^{6}$ | $1.1 \times 10^{5}$ | $2.3 \times 10^{5}$ | 1215 | 1218 | 779  | 1749 | 93.8 | 186  |
| 401-L | $2.3 \times 10^{6}$ | $4.5 \times 10^{6}$ | $2.3 \times 10^{5}$ | $4.6 \times 10^{5}$ | 1728 | 1729 | 1302 | 2608 | 134  | 265  |

Table 6.7: Execution time of CIGRÉ B4 DC grid by CPUs and GPU for 1s duration

# 6.6 Summary

An efficient methodology for large-scale multi-terminal HVDC system simulation using massive parallelism on the GPU was presented in this chapter, wherein three levels of circuit partitioning were employed to attain fine-grained parallelism. Fundamental power electronic and power system components were designed into CUDA C kernels to constitute the GPU simulation library so that they can be conveniently called. Their massively parallel computation becomes achievable after introducing a general structure that covers every circumstance, considering that in practice components sharing the same property may still have some dissimilarities. The power semiconductor switch was specifically modeled using the prevalent ideal switch model, the dynamic curve-fitting model, and the nonlinear behavioral model to cater for various simulation requirements. Dynamic parallelism appropriately reflected the hierarchy of an MTDC system, and therefore circuit information from the device level to the grid level became available in the multi-terminal layout. Test cases from a single-phase MMC to the Greater CIGRÉ DC Grid were taken as typical examples. With the same accuracy as existing commercial off-line simulation packages and a dramatic speedup over the CPU and multi-core CPU frameworks, the results indicate that the GPU can play a significant role in simulating MTDC systems of a variety of scales in the future. Since the data handling advantage of the GPU becomes overwhelming when more identical components, such as the MMC submodule, are computed, it is expected to be a new-generation platform for off-line time-domain simulation and, in particular, to dominate hybrid system-level and device-level simulation. Besides a detailed demonstration of an approach for extensively parallel computation of an irregular MTDC grid on the GPU, this work also proposed a 3-category circuit partitioning method which showcased its efficacy in accelerating both CPU and GPU simulation; therefore, it can be referred to for large-scale system simulation on various platforms.

Figure 6.24: Power flow in the CIGRÉ B4 DC Grid.

# 7 MTDC Grid Variable Time-Stepping Simulation on GPU

# 7.1 Introduction

In the last chapter, the GPU simulation of the CIGRÉ B4 DC grid achieved a remarkable speedup over traditional CPU simulation and the proposed multi-core CPU architecture. Fast computation with high accuracy is always the paramount goal of electromagnetic transient simulation, and it requires continued effort.

Therefore, the variable time-stepping (VTS) scheme is proposed in this chapter to further expedite the EMT simulation on both the CPU and GPU. The basic principle is that, provided the simulation results remain correct, the time-step can be enlarged when the variables of concern change slowly or are in steady state; on the contrary, when drastic variations occur, the time-step should be reduced to ensure a high resolution. Subsequently, a number of criteria which reflect the state change of the system are categorized and utilized to regulate the time-step dynamically during simulation. Meanwhile, as the accuracy of the results relies heavily on the switch models in the MMC, the corresponding variable time-stepping schemes are analyzed. A combination of the GPU and the VTS schemes makes system-level simulation involving device-level details feasible.

# 7.2 Proposed Variable Time-Stepping Schemes

## 7.2.1 Event-Correlated Criterion

This criterion is based on events taking place in the system, e.g., transmission line faults, breaker operation, and even a power semiconductor switch's action. Though any change in the state of a component results in a perturbation to the system, the impact caused by different components varies. It depends on the simulation requirement, e.g., for system-level results, line faults have a far more significant impact on the grid than a single switch. But when device-level transients are of concern, the turn-on and turn-off processes become the focus. Therefore, regulation of the time-step depends on the system's sensitivity to events related to a certain component, and all these impacts are eventually reflected by the variation in currents or voltages. Their change rates *dv/dt* and *di/dt* are thus two common criteria for time-step control.
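As a minimal illustration of the *dv/dt* criterion, the Python sketch below selects between two candidate time-steps from a backward-difference estimate of the rate of change. The threshold, the two step sizes, and the function names are hypothetical values chosen for the example, not settings taken from this work.

```python
def rate_of_change(x_now, x_prev, dt):
    """Backward-difference estimate of dx/dt over the last time-step."""
    return (x_now - x_prev) / dt


def step_from_rate(rate, threshold, dt_small, dt_large):
    """Use the small step when |dx/dt| exceeds the threshold, else the large one."""
    return dt_small if abs(rate) > threshold else dt_large


# Hypothetical numbers: a 1 kV jump versus a 1 V drift within 100 ns.
fast = rate_of_change(1000.0, 0.0, 100e-9)
slow = rate_of_change(1.0, 0.0, 100e-9)

print(step_from_rate(fast, threshold=1e9, dt_small=10e-9, dt_large=200e-9))
print(step_from_rate(slow, threshold=1e9, dt_small=10e-9, dt_large=200e-9))
```

The same structure applies to *di/dt*; only the monitored quantity and the threshold change.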

## 7.2.2 Local Truncation Error

The local truncation error (LTE) is another general criterion [155] for time-step adjustment, owing to the extensive distribution of energy storage components such as inductors and capacitors, whose integral *i*-*v* relationship needs discretization before being applied in EMT calculation. As introduced in Chapter 3, one-step integration approximations such as the Backward Euler and Trapezoidal rules are the main methods in EMT solvers. However, due to their relatively low orders, their estimates are less precise than those of multi-step methods, and the error increases with the integration step. Thus, the LTE is obtained in such a manner that a prediction is first computed by a linear multi-step method and, within that time-step, compared with the corresponding solution of the nodal matrix equation.

For linear energy storage components, their *i*-*v* characteristics can generally be expressed by the following first-order ordinary differential equation:

$$y' = F(t, y(t)),\tag{7.1}$$

where *y* can be either inductor current or capacitor voltage. Given a new time-step n+1, the *y* value is calculated by integrating both sides

$$y_{n+1} = y_n + \int_{t_n}^{t_{n+1}} F(\tau, y(\tau))\, d\tau.\tag{7.2}$$

In EMT simulation, the above equation needs to be discretized. An accurate *y* can be predicted by the implicit *s*-step Adams-Moulton (AM) Method

$$\bar{y}_{n+1} = y_n + \sum_{m=n-s+1}^{n+1} \left(\int_0^{\Delta \tilde{t}} \psi(\tau)\, d\tau\right) F(t_m, y_m),\tag{7.3}$$

where  $\Delta \tilde{t}$  is the adaptive time-step, and  $\psi(\tau)$  is the Lagrange interpolating polynomial:

$$\psi(\tau) = \prod_{k=n-s+1,\, k \neq m}^{n+1} \frac{\tau - t_k}{t_m - t_k}.\tag{7.4}$$

On the other hand, solving the circuit nodal equation based on the One-Step integration approximation gives the EMT simulation outcome  $y_{n+1}$ , which has a lower accuracy. Therefore, the relative error is obtained by

$$\epsilon = \left|\frac{\bar{y}_{n+1} - y_{n+1}}{\bar{y}_{n+1}}\right| \times 100\%.\tag{7.5}$$

The time-step is dynamically adjusted according to  $\epsilon$ , either being reduced to ensure accuracy or enlarged to accelerate the simulation in a predefined manner.

For efficient computation, medium-order AM formulas are adopted to predict the value of the next time-step, e.g., the 4th-order AM is given as

$$\bar{y}_{n+1} = y_n + \frac{\Delta \tilde{t}}{24} (9F_{n+1} + 19F_n - 5F_{n-1} + F_{n-2}).\tag{7.6}$$
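To make the LTE procedure concrete, the following Python sketch applies (7.5) and (7.6) to a simple test equation, $y' = -y/\tau$, whose exact solution supplies the stored history. The test equation, the use of the Trapezoidal rule as the one-step EMT solution, and all function names are assumptions made for illustration; $F_{n+1}$ is evaluated from the one-step solution, which is one reading of the comparison described above.

```python
import math


def am4_predict(y_n, f_np1, f_n, f_nm1, f_nm2, h):
    """Fourth-order Adams-Moulton estimate, Eq. (7.6)."""
    return y_n + h / 24.0 * (9.0 * f_np1 + 19.0 * f_n - 5.0 * f_nm1 + f_nm2)


def relative_error_percent(y_bar, y):
    """Relative error of Eq. (7.5), in percent."""
    return abs((y_bar - y) / y_bar) * 100.0


def lte_for_decay(tau, h):
    """LTE of one trapezoidal step of y' = -y/tau with exact history."""
    def f(y):
        return -y / tau

    # Exact samples at t = 0, h, 2h act as the history at n-2, n-1, n.
    y_nm2, y_nm1, y_n = (math.exp(-k * h / tau) for k in range(3))
    # One-step (trapezoidal) solution, standing in for the EMT result.
    y_np1 = y_n * (1.0 - h / (2.0 * tau)) / (1.0 + h / (2.0 * tau))
    y_bar = am4_predict(y_n, f(y_np1), f(y_n), f(y_nm1), f(y_nm2), h)
    return relative_error_percent(y_bar, y_np1)


for h in (0.05, 0.2, 0.5):
    print(f"h = {h}: LTE = {lte_for_decay(tau=1.0, h=h):.5f} %")
```

The error grows with the step size, which is the property the time-step controller relies on when comparing $\epsilon$ against its thresholds.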

## 7.2.3 Newton-Raphson Iteration Count

The nodal voltage equation of a nonlinear system can generally be written as

$$\mathbf{U}^k = (\mathbf{G}^{-1})^k \cdot \mathbf{J}^k,\tag{7.7}$$

where *k* is the iteration count. Equation (7.7) is computed repeatedly within a time-step until the nodal voltage vector converges, i.e., until the difference between the results of two successive iterations is smaller than the threshold $\zeta$,

$$\left\|\frac{\mathbf{U}^{k+1} - \mathbf{U}^k}{\mathbf{U}^{k+1}}\right\| \le \zeta.\tag{7.8}$$

The Newton-Raphson iteration count is thus a VTS criterion peculiar to nonlinear components. The time-step can be determined according to the number of iterations, e.g., for the nonlinear behavioral IGBT/diode model, the steady-state has fewest iterations and consequently the largest applicable time-step, while during transient stages it is below that upper limit.
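The idea can be sketched on a single-node circuit: a voltage source in series with a resistor feeding an exponential diode, solved by Newton-Raphson with the convergence test of (7.8). The circuit, the diode parameters, and the rule that halves the step for each extra iteration are assumptions for illustration only; they are not the behavioral IGBT/diode model of this work.

```python
import math


def newton_diode(v_src, r, v0, zeta=1e-6, max_iter=50,
                 i_s=1e-12, v_t=0.025875):
    """Solve the diode node voltage; return (voltage, iteration count)."""
    v = v0
    for k in range(1, max_iter + 1):
        i_d = i_s * (math.exp(v / v_t) - 1.0)
        g_d = i_s / v_t * math.exp(v / v_t)   # linearized conductance
        i_eq = i_d - g_d * v                  # companion current source
        v_new = (v_src / r - i_eq) / (1.0 / r + g_d)
        if abs((v_new - v) / v_new) <= zeta:  # scalar form of Eq. (7.8)
            return v_new, k
        v = v_new
    raise RuntimeError("Newton-Raphson did not converge")


def step_from_iterations(k, k_base=2, dt_max=200e-9, dt_min=10e-9):
    """Assumed rule: halve the step for every iteration beyond k_base."""
    extra = max(0, k - k_base)
    return max(dt_min, dt_max / 2 ** extra)


# Start far from the solution (as after a switching event) ...
v_cold, k_cold = newton_diode(5.0, 1.0, v0=0.8)
# ... and from the previous solution (as in steady state).
v_warm, k_warm = newton_diode(5.0, 1.0, v0=v_cold)

print(k_cold, step_from_iterations(k_cold))
print(k_warm, step_from_iterations(k_warm))
```

A start close to the previous solution converges in fewer iterations, so the assumed rule assigns it the larger time-step.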

## 7.2.4 Hybrid Time-Step Control and Synchronization

A small value is preferred as the default time-step in the EMT program initialization. Once the simulation commences, the time-step is doubled each time an increase is warranted until it reaches the upper limit, which need not be exactly twice the second-largest value. Time-step reduction follows the same route in reverse: if the current time-step is at the maximum, it first steps down to the nearest smaller value; otherwise, the time-step is halved until it reaches the lower limit.
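The doubling and halving rule amounts to moving up and down a fixed ladder of admissible time-steps. A minimal Python sketch follows, using integer nanoseconds and, as an example, limits of 10ns and 200ns; the function names are illustrative.

```python
def build_ladder(dt_min, dt_max):
    """Admissible time-steps: doubling from dt_min, capped by dt_max."""
    ladder = [dt_min]
    while ladder[-1] * 2 < dt_max:
        ladder.append(ladder[-1] * 2)
    if ladder[-1] != dt_max:
        ladder.append(dt_max)  # top rung need not be twice the one below
    return ladder


def enlarge(dt, ladder):
    """Move one rung up, saturating at the upper limit."""
    i = ladder.index(dt)
    return ladder[min(i + 1, len(ladder) - 1)]


def shrink(dt, ladder):
    """Move one rung down, saturating at the lower limit."""
    i = ladder.index(dt)
    return ladder[max(i - 1, 0)]


ladder = build_ladder(10, 200)  # values in nanoseconds
print(ladder)
print(enlarge(160, ladder), shrink(200, ladder))
```

With these limits the top rung (200) is not twice the second-largest value (160), so stepping down from the maximum lands on 160 rather than 100.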

In the MTDC grid, a large number of components can utilize VTS schemes, which means that each of them will produce an individual time-step $\Delta t_i$. The *localized* VTS algorithm is applied taking the converter as the basic unit, as shown in Fig. 7.1(a), where a hybrid FTS and VTS scheme is adopted. System-level components are computed at a fixed *global* time-step $\Delta T$, which is much larger than the maximum VTS value. Since a large disparity exists between them, the FTS system proceeds at a much lower rate, i.e., it enters the next time-step only when the time instants of all VTS systems have passed its current value, as demonstrated in Fig. 7.1(b). Otherwise, all VTS systems continue their individual computations while the FTS system waits for them to finish.

Figure 7.1: Hybrid FTS-VTS scheme for MTDC grid simulation: (a) System structure, and (b) time instant synchronization.
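The synchronization rule can be sketched as a loop in which every VTS subsystem advances until its local time passes the current FTS time, after which the FTS system takes one global step. The Python sketch below follows one literal reading of that rule; the fixed per-subsystem steps stand in for the adaptive $\Delta t_i$, and all names and numbers are hypothetical.

```python
def run_hybrid(global_dt, local_dts, n_global_steps):
    """Return (FTS time, VTS times) recorded just before each FTS step.

    Times are integers (e.g. nanoseconds) to avoid floating-point drift.
    local_dts[i] is a fixed stand-in for the adaptive step of subsystem i.
    """
    t_fts = 0
    t_vts = [0] * len(local_dts)
    history = []
    for _ in range(n_global_steps):
        # Each VTS subsystem computes until it has passed the FTS time.
        for i, dt in enumerate(local_dts):
            while t_vts[i] <= t_fts:
                t_vts[i] += dt
        history.append((t_fts, tuple(t_vts)))
        # Only now may the FTS system enter its next time-step.
        t_fts += global_dt
    return history


for t_fts, t_vts in run_hybrid(1000, [200, 160, 10], 3):
    print(t_fts, t_vts)
```

Each subsystem overshoots the FTS time by at most one of its own steps, so a smaller local step leads to tighter alignment at the synchronization points.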

## 7.2.5 VTS-Based MMC

### 7.2.5.1 Two-State Switch Model

The TSSM is the simplest model for power semiconductor switches: turn-on and turn-off actions complete instantaneously, with the transition between the two distinct states lasting only one time-step. Since it is not a device-level model, system-level results are of interest when it is applied in the MTDC grid. Therefore, in this scenario, switching is not taken as a criterion for time-step control, nor is the N-R iteration count, due to the absence of nonlinearity in the converter. Nevertheless, discrete events are still a criterion for indicating state shifts in other components such as the transmission line. The LTE, as a general method, is applicable to the MMC, as it contains a large number of SM capacitors and 6 arm inductors. Therefore, in the TSSM-based DC grid, a combination of the LTE and the event-correlated criterion can be utilized.

Take the partitioned half-bridge SM (HBSM) in Fig. 5.6 for instance. It has the following matrix equation

$$\begin{bmatrix} U_1(t) \\ U_2(t) \end{bmatrix} = \begin{bmatrix} G_C + R_1^{-1} & -R_1^{-1} \\ -R_1^{-1} & R_1^{-1} + R_2^{-1} \end{bmatrix}^{-1} \cdot \begin{bmatrix} I_{Ceq}(t) \\ J_s(t - \Delta t) \end{bmatrix}\tag{7.9}$$

for computing nodal voltages, where $G_C$ and $I_{Ceq}$ form the discrete-time companion model of the SM capacitor, $R_1$ ($R_2$) represents the equivalent resistance of the upper (lower) switch, and $J_s(t - \Delta t)$ reflects the one time-step delay due to circuit partitioning. The calculated capacitor voltage $U_1(t)$ is then compared with its predicted value, which is calculated by (7.6) where

$$F_m = \frac{1}{C} \left(G_C \cdot U_1(t - (n+1-m)\Delta t) - I_{Ceq}(t - (n+1-m)\Delta t)\right).\tag{7.10}$$

The subscripts m = (n+1, n, n-1, n-2) denote values at the current time-step and previous time-steps.
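A small numerical sketch of this step is given below in Python. It reads (7.9) as the nodal system $\mathbf{G}\mathbf{U} = \mathbf{J}$, in line with (7.7), and solves the 2×2 case by Cramer's rule before evaluating $F_m$ from (7.10). The component values (capacitance, time-step, on- and off-state resistances, source currents) and the trapezoidal form $G_C = 2C/\Delta t$ are assumptions chosen only to exercise the formulas.

```python
def solve_hbsm(g_c, r1, r2, i_ceq, j_s):
    """Solve G * [U1, U2]^T = [I_Ceq, J_s]^T for the half-bridge SM."""
    g11 = g_c + 1.0 / r1
    g12 = -1.0 / r1
    g22 = 1.0 / r1 + 1.0 / r2
    det = g11 * g22 - g12 * g12
    u1 = (g22 * i_ceq - g12 * j_s) / det
    u2 = (g11 * j_s - g12 * i_ceq) / det
    return u1, u2


def capacitor_derivative(c, g_c, u1, i_ceq):
    """F_m of Eq. (7.10): capacitor current divided by capacitance."""
    return (g_c * u1 - i_ceq) / c


# Assumed values: 10 mF capacitor, 1 us step, upper switch on, lower off.
c, dt = 10e-3, 1e-6
g_c = 2.0 * c / dt            # trapezoidal companion conductance (assumed)
r_on, r_off = 1e-3, 1e6
i_ceq, j_s = g_c * 2000.0, 500.0

u1, u2 = solve_hbsm(g_c, r_on, r_off, i_ceq, j_s)
print(u1, u2, capacitor_derivative(c, g_c, u1, i_ceq))
```

The values of $F_m$ obtained at the current and three previous time-steps are the inputs to the predictor (7.6).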

### 7.2.5.2 MMC Main Circuit

For various MMC topologies, the main circuit is always the same after partitioning. It contains 5 nodes after deriving the Norton equivalent circuit, i.e., 3 nodes on the AC side and the other 2 on the DC side, as shown in Fig. 5.9 if the converter is transformer-less. The matrix equation for this universal part is

$$\mathbf{G} = \mathbf{G}_{ext} + \begin{bmatrix} 2G_{\Sigma} \cdot \mathbf{I}_{3\times3} & [-G_{\Sigma}]_{3\times2} \\ [-G_{\Sigma}]_{2\times3} & (3G_{\Sigma} + G_{C_C}) \cdot \mathbf{I}_{2\times2} \end{bmatrix},\tag{7.11}$$

$$\mathbf{J} = \begin{bmatrix} -J_{\Sigma A u} + J_{\Sigma A d}, -J_{\Sigma B u} + J_{\Sigma B d}, -J_{\Sigma C u} + J_{\Sigma C d}, \\ J_{\Sigma A u} + J_{\Sigma B u} + J_{\Sigma C u} + I_{C_1 e q}, \\ -J_{\Sigma A d} - J_{\Sigma B d} - J_{\Sigma C d} - I_{C_2 e q} \end{bmatrix} + \mathbf{J}_{ext},\tag{7.12}$$

where matrices $\mathbf{G}_{ext}$ and $\mathbf{J}_{ext}$ represent elements contributed by the AC and DC grids the MMC connects to, $G_{C_C}$ and $I_{Ceq}$ denote the companion model of the DC bus capacitor, and $G_{\Sigma}$ and $J_{\Sigma}$ are the companion model of an MMC arm, where subscripts *u* and *d* stand for the upper and lower arm, respectively. For variable time-stepping control, after solving the matrix equation of this part, the 6 arm currents are predicted in a similar manner by (7.6) and the LTE is obtained by (7.5).
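The block structure of (7.11) can be checked with a short Python sketch that assembles the 5×5 conductance matrix without the external contribution $\mathbf{G}_{ext}$. Node ordering (three AC nodes followed by two DC nodes) follows the text; the numerical values and the function name are illustrative.

```python
def mmc_main_conductance(g_sigma, g_cc):
    """Assemble the bracketed matrix of Eq. (7.11), excluding G_ext."""
    n_ac, n_dc = 3, 2
    size = n_ac + n_dc
    g = [[0.0] * size for _ in range(size)]
    for i in range(n_ac):
        g[i][i] = 2.0 * g_sigma              # upper and lower arm per phase
        for j in range(n_ac, size):
            g[i][j] = -g_sigma               # arm between AC and DC node
            g[j][i] = -g_sigma
    for j in range(n_ac, size):
        g[j][j] = 3.0 * g_sigma + g_cc       # three arms plus bus capacitor
    return g


g = mmc_main_conductance(g_sigma=0.5, g_cc=2.0)
for row in g:
    print(row)
```

A quick sanity check is that each AC row sums to zero, since the AC nodes connect only to the two DC nodes through the arms, while each DC row sums to $G_{C_C}$.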

As a universal approach to systems which comprise reactive components, the LTE is one choice for time-step regulation, and the procedure is the same as illustrated above. Meanwhile, a proper judgment on events can also be utilized, e.g., in the nonlinear behavioral IGBT model, $v_{Cge}$ is an indicator of switching behavior: when the IGBT turns on or off, it approaches the gate voltage, making *dv/dt* nonzero; otherwise, its derivative is nearly 0. The N-R iteration count is the most convenient criterion for nonlinearities. For the solution of the SM matrix equation (7.7), the steady state takes the fewest iterations and tolerates a time-step of up to 200ns, which is subsequently selected as the upper limit. On the other hand, the transient stage requires more iterations and is prone to divergence if the time-step is kept large. Thus, the lower limit is 10ns.

### 7.2.5.3 VTS MMC Kernel

As the core part of the MTDC grid, the GPU kernel of the NBM-based MMC with the N-R iteration VTS scheme is specified in Fig. 7.2. The inputs and outputs of all kernels are stored in the GPU global memory so that they can be accessed by other kernels. The kernels are designed according to the number of functions the partitioned MMC has. Among them, the SM kernel is the most complex part. The IGBT/diode model is programmed as a GPU device function which can be called directly by a kernel. Its outputs are properly organized according to (6.18) and (6.19). The N-R iteration of the matrix equation (7.7) repeats until all nodal voltages converge, and the final iteration count $K_{NR}$ is stored in global memory so that it can be read by the kernel *VTS*, which produces the proper time increment for the next calculation.

Figure 7.2: Nonlinear behavioral MMC kernel with VTS scheme.

It is noted that not all threads launched by the same kernel execute exactly identical instructions, e.g., the number of N-R iterations conducted by the SM kernel varies among SMs; therefore, all threads are synchronized at the end.
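The effect of a varying iteration count can be illustrated with a small Python sketch. It is a sequential emulation rather than GPU code, and it replaces the SM matrix equation (7.7) by a single hypothetical nonlinear equation, a source feeding a diode through a resistor, solved by N-R iteration. The circuit, the parameter values and the function names are assumptions made only for this illustration.

```python
import math

I_S = 1e-12      # assumed diode saturation current (A)
V_T = 0.025852   # thermal voltage near room temperature (V)
R = 1e3          # assumed series resistance (ohm)


def solve_sm(v_src, v0=0.6, tol=1e-9, max_iter=100):
    """Newton-Raphson solve of (v_src - v)/R = I_S*(exp(v/V_T) - 1).

    Returns the converged diode voltage and the iteration count.
    """
    v = v0
    for k in range(1, max_iter + 1):
        e = math.exp(v / V_T)
        f = (v_src - v) / R - I_S * (e - 1.0)
        df = -1.0 / R - I_S * e / V_T
        dv = -f / df
        v += dv
        if abs(dv) < tol:
            return v, k
    raise RuntimeError("N-R iteration did not converge")


def sm_kernel(sources):
    """Emulate one kernel launch: one 'thread' per stand-in SM circuit.

    On a GPU the threads would run concurrently; the launch is complete
    only when the slowest thread has converged, i.e. after max(counts).
    """
    results = [solve_sm(v_src) for v_src in sources]
    voltages = [v for v, _ in results]
    counts = [k for _, k in results]
    k_nr = max(counts)  # count seen after synchronizing all threads
    return voltages, counts, k_nr


voltages, counts, k_nr = sm_kernel([0.5, 5.0, 50.0])
```

Because each circuit starts at a different distance from its solution, the entries of `counts` differ, and the launch as a whole takes as long as the largest of them. Taking that maximum as the value passed on to the time-step control is a choice of this sketch.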

### 7.3 VTS Simulation Results and Validation

#### 7.3.1 System Setup

Fig. 7.3 shows two HVDC links integrated with offshore wind farms (OWFs), and by connecting them, e.g., between buses *B*1 and *B*2, an MTDC system is formed.  $MMC_1$  and  $MMC_2$  are rectifiers which provide stable AC voltage for the OWFs while simultaneously converting their energy into DC.  $MMC_3$  and  $MMC_4$  operate as inverters regulating the DC voltage. Bergeron's traveling wave model is adopted for the transmission lines, and transformers are required on the rectifier side for wind energy integration. Each OWF is modeled as an aggregation of 100 doubly-fed induction generators (DFIGs). Specifics of the DC grid are listed in Appendix D.

The VTS simulations are conducted on both the CPU and GPU under the 64-bit Windows<sup>®</sup> 10 operating system on the 2.2GHz Intel<sup>®</sup> Xeon E5-2698 v4 CPU and 160GB RAM. The device-level and system-level results in the following subsections are validated by SaberRD<sup>®</sup> and PSCAD/EMTDC<sup>®</sup>, respectively.



Figure 7.3: MMC-based MTDC grid with wind farm integration.

#### 7.3.2 VTS in Device-Level Simulation

In Fig. 7.4, all 3 proposed VTS schemes are tested in regulating the time-step of a nonlinear single-phase 9-level MMC fed with an 8kV DC bus voltage and switched at 1kHz. The output voltages, shown on the left, are virtually the same. The time-step variation in a zoomed 0.5ms segment is shown on the right. As can be seen, the 3 schemes lead to different time-step profiles: under steady state they all settle at 200ns, while dramatic regulation is observed during the transient stage. The efficiency of the schemes in computing low-level MMCs for a 100ms duration on the CPU is summarized in Table 7.1. With circuit partitioning, MMCs having more than 9 voltage levels can be computed, which SaberRD<sup>®</sup> is unable to achieve. The N-R iteration method has the highest efficiency, around 16 times faster than the FTS for the 5-L MMC.
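The speedup columns of Table 7.1 can be reproduced from the listed execution times. The short Python check below assumes, as the tabulated values suggest, that $Sp_1$ is the ratio of the SaberRD<sup>®</sup> time to the N-R time and $Sp_2$ the ratio of the FTS time to the N-R time; the dictionary layout and function name are illustrative only.

```python
# Execution times in seconds copied from Table 7.1.
# "saber" is None where no SaberRD result is listed.
TABLE_7_1 = {
    "5-L":  {"saber": 463,  "fts": 218,  "nr": 13.8},
    "7-L":  {"saber": 723,  "fts": 303,  "nr": 20.0},
    "9-L":  {"saber": 966,  "fts": 557,  "nr": 42.5},
    "17-L": {"saber": None, "fts": 653,  "nr": 60.1},
    "33-L": {"saber": None, "fts": 1102, "nr": 142.6},
}


def speedups(row):
    """Return (Sp1, Sp2) for one table row; Sp1 is None without SaberRD data."""
    sp1 = row["saber"] / row["nr"] if row["saber"] is not None else None
    sp2 = row["fts"] / row["nr"]
    return sp1, sp2


for level, row in TABLE_7_1.items():
    sp1, sp2 = speedups(row)
    sp1_text = "-" if sp1 is None else f"{sp1:.1f}"
    print(f"{level:>5}: Sp1 = {sp1_text:>5}, Sp2 = {sp2:.1f}")
```

Rounded to integers, these ratios agree with the tabulated speedups, including the factor of around 16 quoted for the 5-L MMC.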

In Fig. 7.5, device-level results are given for the 9-L MMC whose time-step is controlled by the N-R iteration count. Fig. 7.5(a) gives the IGBT turn-on waveforms, which show that the density of points is higher during the transient stage and that it varies, meaning the MMC is computed at a variable rate. The diode reverse recovery waveforms in Fig. 7.5(b) demonstrate the same phenomenon. The power loss variation is ultimately reflected by the junction temperature, as shown in Fig. 7.5(c). The temperature of the lower IGBT/diode surges to over 100°C immediately after the converter is started, but it is still within the normal operating region. On the other hand, the upper IGBT/diode has a much lower temperature, and eventually they all settle at around 30°C. These results are virtually identical to those of SaberRD<sup>®</sup>, indicating that under the VTS scheme, the GPU simulation produces correct results.
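The junction temperature follows from feeding the device power loss into a thermal network of cascaded R-C circuits. A minimal Python sketch of one possible discretization is given below. It assumes a Foster-type network integrated by backward Euler, and the thermal resistances, capacitances, power loss and ambient temperature are hypothetical values rather than the device parameters used in this work.

```python
# Hypothetical Foster network: three parallel R-C stages in series.
R_TH = [0.01, 0.05, 0.10]   # thermal resistances (K/W), assumed
C_TH = [0.1, 1.0, 10.0]     # thermal capacitances (J/K), assumed
T_AMB = 25.0                # ambient temperature (deg C), assumed


def thermal_step(temps, power, dt, r_th, c_th):
    """Advance every stage by one backward-Euler step of length dt.

    Each stage obeys C*dT/dt = P - T/R, where T is the temperature
    rise across the stage and P is the power loss in watts.
    """
    return [
        (t + dt * power / c) / (1.0 + dt / (r * c))
        for t, r, c in zip(temps, r_th, c_th)
    ]


def junction_temperature(temps, t_amb):
    """Junction temperature is ambient plus the sum of the stage rises."""
    return t_amb + sum(temps)


def simulate(power, dt, n_steps):
    """Apply a constant power loss for n_steps and return T_j at the end."""
    temps = [0.0] * len(R_TH)
    for _ in range(n_steps):
        temps = thermal_step(temps, power, dt, R_TH, C_TH)
    return junction_temperature(temps, T_AMB)


t_j_final = simulate(power=500.0, dt=1e-3, n_steps=20000)
```

Since `dt` is an argument of every step, the same update could be called with the time increment produced by a variable time-stepping scheme. With a constant loss, the result tends to the ambient temperature plus the loss multiplied by the total thermal resistance.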

#### 7.3.3 MTDC System Preview

The proposed VTS-NBM MMC model is also applied to system studies. In Fig. 7.6, with TL3 disconnected, the results of a permanent pole-pole fault with a resistance of  $1\Omega$  occurring at t=5s in HVDC Link1, with both MMCs supported by stiff AC grids, are given. Immediately after the fault is detected, all IGBTs are blocked. However, with the HBSM topology, as given in Fig. 7.6(a), the DC system still interacts with the AC grid, because the freewheeling diodes operate as a rectifier. Thus, a residual line-line voltage of around 30kV is observed, along with a DC current of nearly 13kA. Because the residual current depends on the resistance of the switch, PSCAD/EMTDC<sup>®</sup> yields various values, while with the NBM, the current is definitive. In Fig. 7.6(b), the FBSM-MMC is able to achieve the blocking function, and consequently, the DC line-line voltages and currents eventually remain at 0. PSCAD/EMTDC<sup>®</sup> shows similar results.

Figure 7.4: VTS schemes for nonlinear MMC simulation (left: output voltage; right: zoomed-in waveform): (a) SaberRD<sup>®</sup> results, (b) event-correlated criterion, (c) LTE, and (d) N-R iteration count.

Figure 7.5: IGBT nonlinear behavioral model VTS control (left: proposed model; right: SaberRD<sup>®</sup>): (a) IGBT turn-on, (b) diode reverse recovery, and (c) junction temperatures.

In Fig. 7.7, the impact of wind speed on the MTDC system is shown. Starting at t=12s, the wind speed at  $OWF_1$  rises linearly from 8m/s to 11m/s in 1s, while the reverse is true for  $OWF_2$ . It is observed that the voltage at *Grid* 1 remains stable due to the proper functioning of  $MMC_1$ , and so does the voltage at  $OWF_2$ , both of which are close to sinusoidal waveforms. Due to the stronger wind, the current  $I_{grid1}$  fed by  $OWF_1$  more than doubles, while its counterpart has the opposite trend. As for a single wind turbine, the power of a DFIG at  $OWF_1$  increases from approximately 750kW to 2.0MW, while those at  $OWF_2$  have the exact opposite output. The variations in wind speed also affect the power flow in the DC grid, as well as the DC line voltage. The power delivered by  $MMC_1$  and  $MMC_2$  follows the same trend as that of a single DFIG in the respective OWFs, except that the values are 100 times larger, and the power received by the two inverters also exchanges positions. Meanwhile, minor perturbations are caused to the DC voltages, but at the inverter stations, they recover immediately.

Table 7.1: Comparison of VTS schemes' efficiency on CPU

| Level | SaberRD<sup>®</sup> (s) | FTS (s) | Event (s) | LTE (s) | N-R (s) | $\mathbf{Sp}_1$ | $\mathbf{Sp}_2$ |
|-------|-------------------------|---------|-----------|---------|---------|-----------------|-----------------|
| 5-L   | 463                     | 218     | 28.1      | 29.3    | 13.8    | 34              | 16              |
| 7-L   | 723                     | 303     | 44.6      | 40.3    | 20.0    | 36              | 15              |
| 9-L   | 966                     | 557     | 65.1      | 50.4    | 42.5    | 23              | 13              |
| 17-L  | -                       | 653     | 183.6     | 132.9   | 60.1    | -               | 11              |
| 33-L  | -                       | 1102    | 555.2     | 305.3   | 142.6   | -               | 8               |

Table 7.2: Execution time  $t_{exe}$  of a 4-T DC system for 0.1s duration

| SM type | MMC level | CPU FTS $t_{exe}$ (s) | CPU VTS $t_{exe}$ (s) | $\mathbf{Sp}_1$ | GPU FTS $t_{exe}$ (s) | GPU VTS $t_{exe}$ (s) | $\mathbf{Sp}_2$ | $\mathbf{Sp}_3$ |
|---------|-----------|-----------------------|-----------------------|-----------------|-----------------------|-----------------------|-----------------|-----------------|
| HBSM    | 51-L      | 11539                 | 349                   | 30              | 554                   | 23.2                  | 24              | 497             |
| HBSM    | 101-L     | 24442                 | 821                   | 30              | 550                   | 22.1                  | 25              | 1106            |
| HBSM    | 201-L     | 50484                 | 1574                  | 32              | 548                   | 26.4                  | 21              | 1912            |
| HBSM    | 401-L     | 102084                | 3177                  | 32              | 574                   | 75.0                  | 7.7             | 1361            |
| FBSM    | 51-L      | 25471                 | 571                   | 45              | 1181                  | 79                    | 15              | 322             |
| FBSM    | 101-L     | 51421                 | 1138                  | 45              | 1155                  | 133                   | 8.7             | 387             |
| FBSM    | 201-L     | 94309                 | 2201                  | 43              | 1355                  | 154                   | 8.8             | 612             |

Table 7.2 summarizes the execution times of different MMCs under the two time-stepping schemes on both processors. Tested under a switching frequency of 200Hz, the proposed VTS scheme helps the CPU achieve a speedup  $Sp_1$  of around 30 times for the HBSM-MMC and around 45 times for the FBSM-MMC. The GPU gains a speedup  $Sp_2$  of roughly 9 to 15 times with the FBSM-MMC, and 21 to 25 times with the HBSM-MMC when the voltage level is below 401. Therefore, the proposed VTS scheme implemented on the GPU is able to attain a dramatic speedup over the CPU with the fixed time-stepping scheme, e.g., for the HBSM-MMC and the FBSM-MMC, the speedup  $Sp_3$  reaches almost 2000 and over 600 times, respectively.

Figure 7.6: HVDC-link1 pole-pole fault: (a) HBSM-MMC response, (b) FBSM-MMC response.

### 7.4 Summary

A variable time-stepping MMC model with nonlinear device-level details was presented in this work for MTDC grid study. The high-order IGBT and diode models are more accurate and reveal information unavailable in the detailed model based on the two-state switch representation, but their nonlinearity may lead to an inefficient solution. Thus, fine-grained circuit partitioning was applied to separate the MMC SMs from the arms, which created a substantial number of identical circuits, each with a smaller matrix dimension. Numerical stability also improved, as the resulting nonlinear circuit parts converge more readily. The partitioned submodules were designed into a GPU kernel. The SIMT mode enables the GPU to conduct massively parallel execution and thus avoids sequential calculation of the partitioned circuit parts.

Figure 7.7: MTDC system dynamics with wind farms (left: proposed model, right: PSCAD/EMTDC<sup>®</sup>).

Meanwhile, several variable time-stepping schemes were proposed, and their application scenarios for different MMC models were analyzed. The event-correlated criterion and the LTE are general methods regardless of the linearity of the system they apply to, while the N-R iteration count is specific to the nonlinear behavioral SM based MMC. A hybrid FTS-VTS scheme was proposed to mitigate the computational burden of the overall system, and the simulations conducted on different processors gained significant speedup compared with the fixed time-step scheme. The execution time of an MTDC system on the GPU also indicated that system-level EMT simulation involving highly complex nonlinear device-level IGBT/diode models is feasible when massive parallelism is utilized, which CPUs can hardly achieve.
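The benefit of a smaller matrix dimension can be made concrete with a rough operation count. The Python sketch below assumes dense LU factorization, whose leading-order cost is about $\frac{2}{3}n^3$ floating-point operations, and uses a hypothetical number of nodes per SM. Practical admittance matrices are sparse, so the figure it produces is an upper-bound style illustration rather than a measured gain.

```python
def lu_flops(n):
    """Leading-order flop count of dense LU factorization, about (2/3) n^3."""
    return 2.0 * n ** 3 / 3.0


def partition_gain(num_sm, nodes_per_sm):
    """Ratio of monolithic to partitioned factorization cost.

    Monolithic: one matrix containing all SM nodes.
    Partitioned: num_sm independent matrices of nodes_per_sm each.
    The small arm-level equation left after partitioning is ignored here.
    """
    monolithic = lu_flops(num_sm * nodes_per_sm)
    partitioned = num_sm * lu_flops(nodes_per_sm)
    return monolithic / partitioned


# 400 SMs per arm (as in a 401-level MMC) with an assumed 5 nodes per SM.
gain = partition_gain(num_sm=400, nodes_per_sm=5)
```

Under these assumptions the ratio equals the square of the number of SMs, and it applies even before the small matrices are distributed over parallel threads.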

# Conclusions and Future Works

Having been applied in many HVDC projects around the world in recent years, the modular multilevel converter has drawn tremendous attention from both academia and industry for its potential in electricity delivery, or even in the power redistribution of a region through the construction of a multi-terminal DC grid. Electromagnetic transient simulation is the main approach for studying the control and protection algorithms on the electrical secondary side, as well as the system performance on the primary side. Nevertheless, the simulation slows down dramatically when a practical power electronic system containing a large number of nodes is involved. In the meantime, high fidelity is required to acquire more accurate results, as well as micro-level information such as power loss and IGBT junction temperature for converter design evaluation.

The FPGA is the prime platform for real-time hardware-in-the-loop emulation of power electronic systems. Its intrinsic parallelism and pipeline architecture enable more efficient computation even though its clock frequency is lower than that of other processors. Meanwhile, with a rapid growth in logic gates and higher clock frequencies due to maturing manufacturing technology, the FPGA is able to handle more complex power electronic systems. In addition, the development of corresponding software tools shortens the hardware design cycle by enabling programming in high-level languages.

On the other hand, off-line simulation is, in fact, more common since it is available on desktop computers. The graphics processing unit, which was initially designed for displaying images, has attracted interest for its massive parallelism, which is deemed more efficient in computing the HVDC transmission system and its extension, the MTDC grid.

Thus, in this thesis, both real-time hardware-in-the-loop emulation of the MTDC grid with device-level details and the off-line GPU simulation are investigated.

### 8.1 Contributions of Thesis

The main contributions of this thesis are summarized as follows:

- The development of two types of curve-fitting models of the IGBT and its freewheeling diode for nanosecond-level real-time HIL emulation of the MMC on the FPGA. Real-time execution is quite a challenge as it requires the computation of the proposed models to be completed within a small time-step. Thus, the transient waveform shapes of the first curve-fitting model are stored in the FPGA LUT, so that the data can be instantly accessed and scaled according to the steady-state values. The dynamic curve-fitting model takes the factors affecting the IGBT switching transients as variables of its rise and fall times, which are represented by piecewise linearized functions. Thus, the impact of external circuits, as well as the operating condition of the IGBT, can be precisely predicted, revealing more accurate device-level results for converter design assessment.
- When the ideal switch model is used in MMC simulation, replacing the MMC submodule by the TLM-stub achieves faster simulation speed than the detailed equivalent model. Besides, this model requires fewer hardware resources when deployed to the FPGA. A hybrid arm structure becomes available by combining the TLM-stub with partitioned MMC submodules. As a consequence, the MMC model has a lower requirement on FPGA resources while the fidelity is also ensured.
- The proposal of two circuit partitioning schemes based on TLM-link and voltage-current coupling. Direct solution of the MMC is inefficient due to a large number of nodes. The circuit partitioning approaches improve the computational efficiency by splitting the submodules from the MMC arms. Consequently, the originally large admittance matrix is converted into a number of smaller matrices that can be processed in parallel, enabling real-time MMC emulation on the FPGA even when a complex IGBT/diode model is used.
- A number of detailed device-level full-scale hybrid HVDC circuit breaker models are proposed for real-time HIL emulation. Adoption of a scaled-down model of the HHB sometimes induces incorrect results, while the inclusion of a full-scale HHB would require an extraordinarily long simulation time. Thus, circuit partitioning using a pair of coupled voltage-current sources is applied to the HHB, which is separated into a number of identical units that can be processed concurrently.
- A multi-layer hardware implementation structure is proposed. In power converter real-time emulation, the controller usually requires a much larger time-step than that of the device-level IGBT/diode models. If a unified time-step is adopted, it has to be the step size of the controller, meaning the device-level transients cannot be captured with a high resolution. The multi-layer hardware design enables a pipelined computation structure between the circuit and the controller. As a result, the switching transients can be recorded without distortion.

- A twofold basic-augment IGBT nonlinear behavioral model is proposed to improve the EMT simulation efficiency and the numerical stability. The original model includes the static and dynamic characteristics in one circuit, leading to an inefficient solution and, more often than not, to numerical divergence which forces the simulation to terminate. In the proposed IGBT model, the basic section providing the MOSFET functions participates in the circuit solution, while the augment part is added later for the IGBT dynamic features and the subsequent power loss as well as junction temperature. This model can be widely used in system-level simulations involving device-level details, and the methodology can be referred to in the future for the modeling of other complex power semiconductor switches.
- Development of an electro-thermal model of two types of IGBTs: the curve-fitting model and the nonlinear behavioral model. The inclusion of the thermal network represented by cascaded R-C circuits enables the revelation of the junction temperature, which is a key factor for power converter evaluation.
- GPU simulation of the CIGRÉ B4 DC grid by massive parallelism is investigated. Various power system and power electronic components, including the transmission line, transformer, varistor, IGBT and diode, MMC and its controller, are designed into GPU kernels. Fine-grained circuit partitioning is carried out to reduce the size of the system's matrix equation and to create a substantial number of physically independent subsystems catering for the massively parallel architecture of the GPU. Various scales of MMCs adopting both the ideal switch model and the curve-fitting model are designed, from single-phase to the Greater CIGRÉ DC grid. It is shown that even in simulating small-scale converters, the GPU is advantageous over the CPU, let alone for a much larger DC grid. The kernel design methodology can be referred to in the future when a new generation of commercial off-line simulation tools based on GPU is developed. Moreover, parallel programs for multi-core CPU implementation are also designed, and currently, even this type of simulation is rarely seen in commercial products.
- For higher versatility, the nonlinear, iterative IGBT and diode behavioral models are applied to the CIGRÉ B4 DC grid with offshore wind farm integration. GPU kernel design involving nonlinear elements is demonstrated, and a remarkable speedup is attained over the multi-core CPU. The variable time-stepping scheme further expedites the EMT computation process, making system-level simulation containing device-level models feasible. Three main criteria to judge and control the simulation time-step dynamically are proposed, and their usage is summarized.

### 8.2 Directions for Future Work

The following topics are proposed for future work:

- With a growing size of the hybrid AC/DC grid, the computational burden on a single GPU will increase, as the number of threads will far exceed the available CUDA cores. The multi-GPU structure which distributes the burden equally among several GPUs by programming will expedite the simulation and can be explored.
- The variable time-stepping schemes could be applied to other power system configurations for faster off-line simulations on GPU and CPU, as well as real-time HIL emulation on FPGA. In this work, they have been applied to a single energy storage element. Nevertheless, the algorithm for complex systems such as electrical machines [156, 157], whose mathematical equations take a matrix form, has not been developed. New criteria for time-stepping judgment and control can also be investigated.
- A fully detailed MTDC grid could be developed on both GPU and FPGA by using high fidelity models of power system components such as the transmission line and the transformer, rather than the lumped model in current EMT simulation tools. The finite element method is deemed as the most accurate model for the transformer [158] and various rotating machines [159]. Thus, the simulation can provide more accurate results and information that is unavailable in previous simulation platforms.
- Hardware resource is the main factor that restricts the scale of the power electronic system deployed to the FPGA board. For the CIGRÉ B4 DC grid to be deployed to a current single FPGA board, the averaged value model has to be employed, as the hardware resource could hardly meet the requirement of even the detailed equivalent circuit model, not to mention the curve-fitting model or the nonlinear behavioral model. Therefore, in the future, a multi-FPGA system could be developed in which the boards are connected to each other with low latency for inter-board data exchange. On the other hand, corresponding power system reconfiguration using methodologies such as circuit partitioning could be investigated for accommodating the inherent transmission delay between neighboring boards. With more hardware resources, detailed information from a large system running in real-time becomes possible.

# Bibliography

- [1] H. M. P. and M. T. Bina, "A transformerless medium-voltage STATCOM topology based on extended modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 26, no. 5, pp. 1534-1545, May 2011.
- [2] M. Hagiwara, R. Maeda, and H. Akagi, "Negative-sequence reactive-power control by a PWM STATCOM based on a modular multilevel cascade converter (MMCC-SDBC)," *IEEE Trans. Ind. Appl.*, vol. 48, no. 2, pp. 720-729, Mar. 2012.
- [3] P. Sotoodeh and R. D. Miller, "Design and implementation of an 11-level inverter with FACTS capability for distributed energy systems," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 2, no. 1, pp. 87-96, Mar. 2014.
- [4] S. Debnath and M. Saeedifard, "A new hybrid modular multilevel converter for grid connection of large wind turbines," *IEEE Trans. Sustainable Energy*, vol. 4, no. 4, pp. 1051-1064, Oct. 2013.
- [5] J. Mei, B. Xiao, K. Shen, L. M. Tolbert, and J. Y. Zheng, "Modular multilevel inverter with new modulation method and its application to photovoltaic grid-connected generator," *IEEE Trans. Power Electron.*, vol. 28, no. 11, pp. 5063-5073, Nov. 2013.
- [6] M. R. Islam, Y. Guo, and J. Zhu, "A high-frequency link multilevel cascaded medium-voltage converter for direct grid integration of renewable energy systems," *IEEE Trans. Power Electron.*, vol. 29, no. 8, pp. 4167-4182, Nov. 2014.
- [7] A. Antonopoulos, L. Angquist, L. Harnefors, and H. P. Nee, "Optimal selection of the average capacitor voltage for variable-speed drives with modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 227-234, Jan. 2015.
- [8] W. Kawamura, K. L. Chen, M. Hagiwara, and H. Akagi, "A low-speed, high-torque motor drive using a modular multilevel cascade converter based on triple-star bridge cells (MMCC-TSBC)," *IEEE Trans. Ind. Appl.*, vol. 51, no. 5, pp. 3965-3974, Sep. 2015.
- [9] J. Qin and M. Saeedifard, "Predictive control of a modular multilevel converter for a back-to-back HVDC system," *IEEE Trans. Power Del.*, vol. 27, no. 3, pp. 1538-1547, Jul. 2012.

- [10] W. Wang, A. Beddard, M. Barnes, and O. Marjanovic, "Analysis of active power control for VSC-HVDC," *IEEE Trans. Power Del.*, vol. 29, no. 4, pp. 1978-1988, Aug. 2014.
- [11] I. A. Gowaid, G. P. Adam, A. M. Massoud, S. Ahmed, D. Holliday, and B. W. Williams, "Quasi two-level operation of modular multilevel converter for use in a high-power DC transformer with DC fault isolation capability," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 108-123, Jan. 2015.
- [12] Z. Xing, X. Ruan, H. You, X. Yang, D. Yao, and C. Yuan, "Soft-switching operation of isolated modular DC/DC converters for application in HVDC grids," *IEEE Trans. Power Electron.*, vol. 31, no. 4, pp. 2753-2766, Apr. 2016.
- [13] X. She, A. Q. Huang, and R. Burgos, "Review of solid-state transformer technologies and their application in power distribution systems," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 1, no. 3, pp. 186-198, Sep. 2013.
- [14] R. Li, L. Xu, L. Yao, and B. W. Williams, "Active control of DC fault currents in DC solid-state transformers during ride-through operation of multi-terminal HVDC systems," *IEEE Trans. Energy Convers.*, vol. 31, no. 4, pp. 1336-1346, Dec. 2016.
- [15] D. Jovcic and H. Zhang, "Dual channel control with DC fault ride through for MMC-based, isolated DC/DC converter," *IEEE Trans. Power Del.*, vol. 32, no. 3, pp. 1574-1582, Jun. 2017.
- [16] T. Lüth, M. M. C. Merlin, T. C. Green, F. Hassan, and C. D. Barker, "High-frequency operation of a DC/AC/DC system for HVDC applications," *IEEE Trans. Power Electron.*, vol. 29, no. 8, pp. 4107-4115, Aug. 2014.
- [17] J. Häfner and B. Jacobson, "Proactive hybrid HVDC breakers - a key innovation for reliable HVDC grids," in *Proc. Cigré Symp.*, Bologna, Italy, Sep. 13-15, 2011.
- [18] M. Callavik, A. Blomberg, J. Häfner, and B. Jacobson, "The hybrid HVDC breaker: an innovation breakthrough enabling reliable HVDC grids," *ABB Grid Systems, Technical Paper*, Nov. 2012.
- [19] M. Mobarrez, M. G. Kashani, and S. Bhattacharya, "A novel control approach for protection of multi-terminal VSC based HVDC transmission system against DC faults," in *Proc. Energy Conversion Congr. Expo.*, Sep. 2015, pp. 4208-4213.
- [20] N. A. Belda and R. P. P. Smeets, "Test circuits for HVDC circuit breakers," *IEEE Trans. Power Del.*, vol. 32, no. 1, pp. 285-293, Feb. 2017.
- [21] M. Hajian, L. Zhang, and D. Jovcic, "DC transmission grid with low-speed protection using mechanical DC circuit breakers," *IEEE Trans. Power Del.*, vol. 30, no. 3, pp. 1383-1391, Jun. 2015.

- [22] A. Shukla and G. D. Demetriades, "A survey on hybrid circuit-breaker topologies," *IEEE Trans. Power Del.*, vol. 30, no. 2, pp. 627-641, Apr. 2015.
- [23] Working Group B4.57, Guide for the Development of Models for HVDC Converters in a HVDC Grid. CIGRE, 2014.
- [24] H. Saad, T. Ould-Bachir, J. Mahseredjian, C. Dufour, S. Dennetière, and S. Nguefeu, "Real-time simulation of MMCs using CPU and FPGA," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 259-267, Jan. 2015.
- [25] F. Yu, W. Lin, X. Wang, and D. Xie, "Fast voltage-balancing control and fast numerical simulation model for the modular multilevel converter," *IEEE Trans. Power Del.*, vol. 30, no. 1, pp. 220-228, Feb. 2015.
- [26] A. Beddard, M. Barnes, and R. Preece, "Comparison of detailed modeling techniques for MMC employed on VSC-HVDC schemes," *IEEE Trans. Power Del.*, vol. 30, no. 2, pp. 579-589, Apr. 2015.
- [27] N. Ahmed, L. Angquist, S. Mahmood, A. Antonopoulos, L. Harnefors, S. Norrga, and H. P. Nee, "Efficient modeling of an MMC-based multiterminal DC system employing hybrid HVDC breakers," *IEEE Trans. Power Del.*, vol. 30, no. 4, pp. 1792-1801, Aug. 2015.
- [28] K. Sheng, B. W. Williams, and S. J. Finney, "A review of IGBT models," *IEEE Trans. Power Electron.*, vol. 15, no. 6, pp. 1250-1266, Nov. 2000.
- [29] H. Saad, J. Peralta, S. Dennetière, J. Mahseredjian, J. Jatskevich, J. A. Martinez, A. Davoudi, M. Saeedifard, V. Sood, X. Wang, J. Cano, and A. Mehrizi-Sani, "Dynamic averaged and simplified models for MMC-based HVDC transmission systems," *IEEE Trans. Power Del.*, vol. 28, no. 3, pp. 1723-1730, Jul. 2013.
- [30] J. Peralta, H. Saad, S. Dennetiere, J. Mahseredjian, and S. Nguefeu, "Detailed and averaged models for a 401-level MMC-HVDC system," *IEEE Trans. Power Del.*, vol. 27, no. 3, pp. 1501-1508, Jul. 2012.
- [31] Z. Zheng, K. Wang, L. Xu, and Y. Li, "A hybrid cascaded multilevel converter for battery energy management applied in electric vehicles," *IEEE Trans. Power Electron.*, vol. 29, no. 7, pp. 3537-3546, Jul. 2014.
- [32] T. Soong and P. W. Lehn, "Evaluation of emerging modular multilevel converters for BESS applications," *IEEE Trans. Power Del.*, vol. 29, no. 5, pp. 2086-2094, Oct. 2014.
- [33] U. N. Gnanarathna, A. M. Gole, and R. P. Jayasinghe, "Efficient modeling of modular multilevel HVDC converters (MMC) on electromagnetic transient simulation programs," *IEEE Trans. Power Del.*, vol. 26, no. 1, pp. 316-324, Jan. 2011.

- [34] J. Xu, C. Zhao, W. Liu, and C. Guo, "Accelerated model of modular multilevel converters in PSCAD/EMTDC," *IEEE Trans. Power Del.*, vol. 28, no. 1, pp. 129-136, Jan. 2013.
- [35] G. P. Adam and B. W. Williams, "Half- and full-bridge modular multilevel converter models for simulations of full-scale HVDC links and multiterminal DC grids," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 2, no. 4, pp. 1089-1108, Dec. 2014.
- [36] J. Xu, C. Zhao, Y. Xiong, C. Li, Y. Ji, and T. An, "Optimal design of MMC levels for electromagnetic transient studies of MMC-HVDC," *IEEE Trans. Power Del.*, vol. 31, no. 4, pp. 1663-1672, Aug. 2016.
- [37] M. Bhesaniya and A. Shukla, "Norton equivalent modeling of current source MMC and its use for dynamic studies of back-to-back converter system," *IEEE Trans. Power Del.*, vol. 32, no. 4, pp. 1935-1945, Aug. 2017.
- [38] D. C. Ludois and G. Venkataramanan, "Simplified terminal behavioral model for a modular multilevel converter," *IEEE Trans. Power Electron.*, vol. 29, no. 4, pp. 1622-1631, Apr. 2014.
- [39] H. Saad, S. Dennetière, J. Mahseredjian, P. Delarue, X. Guillaud, J. Peralta, and S. Nguefeu, "Modular multilevel converter models for electromagnetic transients," *IEEE Trans. Power Del.*, vol. 29, no. 3, pp. 1481-1489, Jun. 2014.
- [40] M. A. Perez, S. Bernet, J. Rodriguez, S. Kouro, and R. Lizana, "Circuit topologies, modeling, control schemes, and applications of modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 4-17, Jan. 2015.
- [41] H. Yang, Y. Dong, W. Li, and X. He, "Average-value model of modular multilevel converters considering capacitor voltage ripple," *IEEE Trans. Power Del.*, vol. 32, no. 2, pp. 723-732, Apr. 2017.
- [42] A. Beddard, C. Sheridan, M. Barnes, and T. Green, "Improved accuracy average value models of modular multilevel converters," *IEEE Trans. Power Del.*, vol. 31, no. 5, pp. 2260-2269, Oct. 2016.
- [43] A. Hassanpoor, J. Häfner, and B. Jacobson, "Technical assessment of load commutation switch in hybrid HVDC breaker," *IEEE Trans. Power Electron.*, vol. 30, no. 10, pp. 5393-5400, Oct. 2015.
- [44] U. A. Khan, J. G. Lee, F. Amir, and B. W. Lee, "A novel model of HVDC hybrid-type superconducting circuit breaker and its performance analysis for limiting and breaking DC fault currents," *IEEE Trans. Appl. Supercond.*, vol. 25, no. 6, pp. 1-9, Dec. 2015.

- [45] W. Lin, D. Jovcic, S. Nguefeu, and H. Saad, "Modelling of high-power hybrid DC circuit breaker for grid-level studies," *IET Power Electron.*, vol. 9, no. 2, pp. 237-246, Feb. 2016.
- [46] E. Kontos, T. Schultz, L. Mackay, L. M. Ramirez-Elizondo, C. M. Franck, and P. Bauer, "Multiline breaker for HVdc applications," *IEEE Trans. Power Del.*, vol. 33, no. 3, pp. 1469-1478, Jun. 2018.
- [47] Z. Q. Shi, Y. K. Zhang, S. L. Jia, X. C. Song, L. J. Wang, and M. Chen, "Design and numerical investigation of a HVDC vacuum switch based on artificial current zero," *IEEE Trans. Dielectr. Electr. Insul.*, vol. 22, no. 1, pp. 135-141, Feb. 2015.
- [48] S. P. Azad and D. V. Hertem, "A fast local bus current-based primary relaying algorithm for HVDC grids," *IEEE Trans. Power Del.*, vol. 32, no. 1, pp. 193-202, Feb. 2017.
- [49] M. K. Bucher and C. M. Franck, "Fault current interruption in multiterminal HVDC networks," *IEEE Trans. Power Del.*, vol. 31, no. 1, pp. 87-95, Feb. 2016.
- [50] O. Cwikowski, M. Barnes, R. Shuttleworth, and B. Chang, "Analysis and simulation of the proactive hybrid circuit breaker," in *Proc. IEEE PEDS*, Sydney, Australia, Jun. 9-12 2015, pp. 4-11.
- [51] J. A. Martinez and J. Magnusson, "EMTP modeling of hybrid HVDC breakers," in *Proc. IEEE Power Energy Soc. Gen. Meet.*, Jul. 26-30 2015, pp. 1-5.
- [52] A. R. Hefner and D. M. Diebolt, "An experimentally verified IGBT model implemented in the Saber circuit simulator," *IEEE Trans. Power Electron.*, vol. 9, no. 5, pp. 532-542, Sep. 1994.
- [53] X. Yang, M. Otsuki, and P. R. Palmer, "Physics-based insulated-gate bipolar transistor model with input capacitance correction," *IET Power Electron.*, vol. 8, no. 3, pp. 417-427, 2015.
- [54] R. Chibante, A. Araújo, and A. Carvalho, "Finite-element modeling and optimization-based parameter extraction algorithm for NPT-IGBTs," *IEEE Trans. Power Electron.*, vol. 24, no. 5, pp. 1417-1427, May 2009.
- [55] K. Sheng, B. W. Williams, and S. J. Finney, "A review of IGBT models," *IEEE Trans. Power Electron.*, vol. 15, no. 6, pp. 1250-1266, Nov. 2000.
- [56] M. Miyake, M. Ueno, U. Feldmann, and H. J. Mattausch, "Modeling of SiC IGBT turn-off behavior valid for over 5-kV circuit simulation," *IEEE Trans. Electron Dev.*, vol. 60, no. 2, pp. 622-629, Feb. 2013.

- [57] M. Miyake, M. Ueno, U. Feldmann, and H. J. Mattausch, "Steady-state loss model of half-bridge modular multilevel converters," *IEEE Trans. Indus. Appl.*, vol. 52, no. 3, pp. 2415-2425, May 2016.
- [58] W. Wang, Z. Shen, and V. Dinavahi, "Physics-based device-level power electronic circuit hardware emulation on FPGA," *IEEE Trans. Ind. Informat.*, vol. 10, no. 4, pp. 2166-2179, Nov. 2014.
- [59] P. O. Lauritzen, G. K. Andersen, and M. Helsper, "A basic IGBT model with easy parameter extraction," in *IEEE PESC'01*, Vancouver, Canada, Jun. 2001, pp. 2160–2165.
- [60] R. Fu, A. E. Grekov, K. Peng, and E. Santi, "Parameter extraction procedure for a physics-based power SiC schottky diode model," *IEEE Trans. Indus. Appl.*, vol. 50, no. 5, pp. 3558-3568, Sep. 2014.
- [61] A. Shaker, M. Abouelatta, G. T. Sayah, and A. Zekry, "Comprehensive physically based modelling and simulation of power diodes with parameter extraction using MATLAB," *IET Power Electron.*, vol. 7, no. 10, pp. 2464-2471, 2014.
- [62] P. Xue, G. Fu, and D. Zhang, "Physics-based compact model for the EMCON p-i-n diode using MATLAB and Simulink," *IET Power Electron.*, vol. 9, no. 12, pp. 2416-2424, 2016.
- [63] G. Bazzano, D. G. Cavallaro, and G. Greco, "An analog behavioral thermal macromodel aimed at representing an elementary portion of a discrete IGBT power device," in *THERMINIC'11*, Paris, France, Sept. 2011, pp. 1–6.
- [64] J. T. Hsu, and K. D. T. Ngo, "Behavioral modeling of the IGBT using the Hammerstein configuration," *IEEE Trans. Power Electron.*, vol. 11, no. 6, pp. 746-754, Nov. 1996.
- [65] J. L. Tichenor, S. D. Sudhoff, and J. L. Drewniak, "Behavioral IGBT modeling for predicting high frequency effects in motor drives," *IEEE Trans. Power Electron.*, vol. 15, no. 2, pp. 354-360, Mar. 2000.
- [66] M. Zhang, A. Courtay, and Z. Yang, "An improved behavioral IGBT model and its characterization tool," in *IEEE Proc. Electron Devices Meeting*, Hong Kong, China, Jun. 2000, pp. 142–145.
- [67] L. Herrera, C. Li, X. Yao, and J. Wang, "FPGA-based detailed real-time simulation of power converters and electric machines for EV HIL applications," *IEEE Trans. Indus. Appl.*, vol. 51, no. 2, pp. 1702-1712, Mar. 2015.
- [68] Z. Shen and V. Dinavahi, "Real-time device-level transient electrothermal model for modular multilevel converter on FPGA," *IEEE Trans. Power Electron.*, vol. 31, no. 9, pp. 6155-6168, Sep. 2016.

- [69] C. Wong, "EMTP modeling of IGBT dynamic performance for power dissipation estimation," *IEEE Trans. Indus. Appl.*, vol. 33, no. 1, pp. 64-71, Jan. 1997.
- [70] J. J. Sanchez-Gasca, R. D'Aquila, J. J. Paserba, W. W. Price, D. B. Klapper, and I. P. Hu, "Extended-term dynamic simulation using variable time step integration," *IEEE Computer Applications in Power*, vol. 6, no. 4, pp. 23-28, Oct. 1993.
- [71] J. J. Sanchez-Gasca, R. D'Aquila, W. W. Price, and J. J. Paserba, "Variable time step, implicit integration for extended-term power system dynamic simulation," *Proc. IEEE Power Industry Computer Applications Conf.*, 7-12 May, 1995, pp. 183-189.
- [72] J. Yao, T. Wang, and J. Roychowdhury, "An efficient time step control method in transient simulation for DAE system," *Proc. IEEE ICECS*, 7-10 Dec., 2014, pp. 44-47.
- [73] S. Kumashiro, T. Kamei, A. Hiroki, and K. Kobayashi, "An accurate metric to control time step of transient device simulation by matrix exponential method," in *Proc. IEEE SISPAD*, 7-9 Sep., 2017, pp. 37-40.
- [74] Z. Shen and V. Dinavahi, "Dynamic variable time-stepping schemes for real-time FPGA-based nonlinear electromagnetic transient emulation," *IEEE Trans. Ind. Electron.*, vol. 64, no. 5, pp. 4006-4016, May 2017.
- [75] S. Y. R. Hui, K. K. Fung, and C. Christopoulos, "Decoupled simulation of DC-linked power electronic systems using transmission-line links," *IEEE Trans. Power Electron.*, vol. 9, no. 1, pp. 85-91, Jan. 1994.
- [76] K. K. Fung, S. Y. R. Hui, and C. Christopoulos, "Concurrent programming and simulation of decoupled power electronic circuits," *IEE Proc.- Sci., Meas. Technol.*, vol. 143, no. 2, pp. 131-136, Mar. 1996.
- [77] H. Selhi, C. Christopoulos, A. F. Howe, and S. Y. R. Hui, "The application of transmission-line modelling to the simulation of an induction motor drive," *IEEE Trans. Energy Convers.*, vol. 11, no. 2, pp. 287-297, Jun. 1996.
- [78] K. K. Fung and S. Y. R. Hui, "Fast simulation of multistage power electronic systems with widely separated operating frequencies," *IEEE Trans. Power Electron.*, vol. 11, no. 3, pp. 405-412, May 1996.
- [79] R. Champagne, L. A. Dessaint, H. Fortin-Blanchette, and G. Sybille, "Analysis and validation of a real-time AC drive simulator," *IEEE Trans. Power Electron.*, vol. 19, no. 2, pp. 336-345, Mar. 2004.
- [80] T. Kato, K. Inoue, T. Fukutani, and Y. Kanda, "Multirate analysis method for a power electronic system by circuit partitioning," *IEEE Trans. Power Electron.*, vol. 24, no. 12, pp. 2791-2802, Dec. 2009.

- [81] A. Davoudi, J. Jatskevich, P. L. Chapman, and A. Bidram, "Multi-resolution modeling of power electronics circuits using model-order reduction techniques," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 3, pp. 810-823, Mar. 2013.
- [82] S. P. Azad and D. V. Hertem, "A fast local bus current-based primary relaying algorithm for HVDC grids," *IEEE Trans. Power Del.*, vol. 32, no. 1, pp. 193-202, Feb. 2017.
- [83] V. Jalili-Marandi, L. F. Pak, and V. Dinavahi, "Real-time simulation of grid-connected wind farms using physical aggregation," *IEEE Trans. Ind. Electron.*, vol. 57, no. 9, pp. 3010-3021, Sep. 2010.
- [84] J. H. Jung, S. Ahmed, and P. Enjeti, "PEM fuel cell stack model development for real-time simulation applications," *IEEE Trans. Ind. Electron.*, vol. 58, no. 9, pp. 4217-4231, Sep. 2011.
- [85] K. Ou, H. Rao, Z. Cai, H. Guo, X. Lin, L. Guan, T. Maguire, B. Warkentin, and Y. Chen, "MMC-HVDC simulation and testing based on real-time digital simulator and physical control system," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 2, no. 4, pp. 1109-1116, Dec. 2014.
- [86] H. F. Blanchette, T. Ould-Bachir, and J. P. David, "A state-space modeling approach for the FPGA-based real-time simulation of high switching frequency power converters," *IEEE Trans. Ind. Electron.*, vol. 59, no. 12, pp. 4555-4567, Dec. 2012.
- [87] T. Ould-Bachir, H. F. Blanchette, and K. Al-Haddad, "A network tearing technique for FPGA-based real-time simulation of power converters," *IEEE Trans. Ind. Electron.*, vol. 62, no. 6, pp. 3409-3418, Jun. 2015.
- [88] F. Montano, T. Ould-Bachir, and J. P. David, "An evaluation of a high-level synthesis approach to the FPGA-based submicrosecond real-time simulation of power converters," *IEEE Trans. Ind. Electron.*, vol. 65, no. 1, pp. 636-644, Jan. 2018.
- [89] Y. Chen and V. Dinavahi, "Digital hardware emulation of universal machine and universal line models for real-time electromagnetic transient simulation," *IEEE Trans. Ind. Electron.*, vol. 59, no. 2, pp. 1300-1309, Feb. 2012.
- [90] G. G. Parma and V. Dinavahi, "Real-time digital hardware simulation of power electronics and drives," *IEEE Trans. Power Del.*, vol. 22, no. 2, pp. 1235-1246, Apr. 2007.
- [91] A. Myaing and V. Dinavahi, "FPGA-based real-time emulation of power electronic systems with detailed representation of device characteristics," *IEEE Trans. Ind. Electron.*, vol. 58, no. 1, pp. 358-368, Jan. 2011.
- [92] M. Matar and R. Iravani, "Massively parallel implementation of AC machine models for FPGA-based real-time simulation of electromagnetic transients," *IEEE Trans. Power Del.*, vol. 26, no. 2, pp. 830-840, Apr. 2011.

- [93] J. Liu and V. Dinavahi, "A real-time nonlinear hysteretic power transformer transient model on FPGA," *IEEE Trans. Ind. Electron.*, vol. 61, no. 7, pp. 3587-3597, Jul. 2014.
- [94] J. Liu and V. Dinavahi, "Detailed magnetic equivalent circuit based real-time nonlinear power transformer model on FPGA for electromagnetic transient studies," *IEEE Trans. Ind. Electron.*, vol. 63, no. 2, pp. 1191-1202, Feb. 2016.
- [95] X. Liu, A. H. Osman, and O. P. Malik, "Real-time implementation of a hybrid protection scheme for bipolar HVDC line using FPGA," *IEEE Trans. Power Del.*, vol. 26, no. 1, pp. 101-108, Jan. 2011.
- [96] W. Li and J. Bélanger, "An equivalent circuit method for modelling and simulation of modular multilevel converters in real-time HIL test bench," *IEEE Trans. Power Del.*, vol. 31, no. 5, pp. 2401-2409, Oct. 2016.
- [97] M. Ashourloo, R. Mirzahosseini, and R. Iravani, "Enhanced model and real-time simulation architecture for modular multilevel converter," *IEEE Trans. Power Del.*, vol. 33, no. 1, pp. 466-476, Feb. 2018.
- [98] Y. Chen and V. Dinavahi, "Multi-FPGA digital hardware design for detailed large-scale real-time electromagnetic transient simulation of power systems," *IET Gener. Trans. Distrib.*, vol. 7, no. 5, pp. 451-463, May 2013.
- [99] Y. Chen and V. Dinavahi, "Hardware emulation building blocks for real-time simulation of large-scale power grids," *IEEE Trans. Ind. Informat.*, vol. 10, no. 1, pp. 373-381, Feb. 2014.
- [100] G. Liu, Z. Xu, Y. Xue, and G. Tang, "Optimized control strategy based on dynamic redundancy for the modular multilevel converter," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 339-348, Jan. 2015.
- [101] N. Yousefpoor, A. Narwal, and S. Bhattacharya, "Control of DC-fault-resilient voltage source converter-based HVDC transmission system under DC fault operating condition," *IEEE Trans. Indus. Electron.*, vol. 62, no. 6, pp. 3683-3690, Jun. 2015.
- [102] L. Codecasa, V. d'Alessandro, A. Magnani, and A. Irace, "Circuit-based electrothermal simulation of power devices by an ultrafast nonlinear MOR approach," *IEEE Trans. Power Electron.*, vol. 31, no. 8, pp. 5906-5916, Aug. 2016.
- [103] J. Kwon, X. Wang, F. Blaabjerg, and C. L. Bak, "Frequency-domain modeling and simulation of DC power electronic systems using harmonic state space method," *IEEE Trans. Power Electron.*, vol. 32, no. 2, pp. 1044-1055, Feb. 2017.
- [104] V. Jalili-Marandi and V. Dinavahi, "SIMD-based large-scale transient stability simulation on the graphics processing unit," *IEEE Trans. Power Syst.*, vol. 25, no. 3, pp. 1589-1599, Aug. 2010.

- [105] Z. Zhou and V. Dinavahi, "Parallel massive-thread electromagnetic transient simulation on GPU," *IEEE Trans. Power Del.*, vol. 29, no. 3, pp. 1045-1053, Jun. 2014.
- [106] J. K. Debnath, A. M. Gole, and W. K. Fung, "Graphics-processing-unit-based acceleration of electromagnetic transients simulation," *IEEE Trans. Power Del.*, vol. 31, no. 5, pp. 2036-2044, Oct. 2016.
- [107] Z. Zhou and V. Dinavahi, "Fine-grained network decomposition for massively parallel electromagnetic transient simulation of large power systems," *IEEE Power Energy Technol. Syst. J.*, vol. 4, no. 3, pp. 51-64, Sep. 2017.
- [108] H. Karimipour and V. Dinavahi, "Parallel domain-decomposition-based distributed state estimation for large-scale power systems," *IEEE Trans. Indus. Appl.*, vol. 52, no. 2, pp. 1265-1269, Mar. 2016.
- [109] H. Karimipour and V. Dinavahi, "Parallel relaxation-based joint dynamic state estimation of large-scale power systems," *IET Gener. Trans. Distrib.*, vol. 10, no. 2, pp. 452-459, 2016.
- [110] X. X. Liu, S. X. D. Tan, H. Wang, and H. Yu, "A GPU-accelerated envelope-following method for switching power converter simulation," in *Proc. Design, Autom. Test Eur. Conf. Exhibit.*, Mar. 2012, pp. 1349-1354.
- [111] S. Yan, Z. Zhou, and V. Dinavahi, "Large-scale nonlinear device-level power electronic circuit simulation on massively parallel graphics processing architectures," *IEEE Trans. Power Electron.*, vol. 33, no. 6, pp. 4660-4678, Jun. 2018.
- [112] "EMTDC users guide," Manitoba HVDC Research Centre Inc., Canada, Apr. 2005.
- [113] D. Tan, "Power electronics in 2025 and beyond: a focus on power electronics and systems technology," *IEEE Power Electron. Mag.*, vol. 4, no. 4, pp. 33-36, Dec. 2017.
- [114] U. Farooq, Z. Marrakchi, and H. Mehrez, *Tree-Based Heterogeneous FPGA Architectures*. New York: Springer, 2012.
- [115] "7 series FPGA overview," Xilinx, Inc., USA, 2011.
- [116] "7 series FPGAs data sheet: overview," Xilinx, Inc., USA, Feb. 2018.
- [117] "UltraScale architecture and product data sheet: overview," Xilinx, Inc., USA, May 2018.
- [118] "7 series FPGAs configurable logic block," Xilinx, Inc., USA, Sep. 2016.
- [119] "7 series FPGAs memory resources," Xilinx, Inc., USA, Sep. 2016.
- [120] "7 series DSP48E1 slice user guide," Xilinx, Inc., USA, Mar. 2018.

- [121] "UltraScale architecture DSP slice user guide," Xilinx, Inc., USA, Apr. 2018.
- [122] "NVIDIA GeForce GTX 1080," NVIDIA Corp., USA, Aug. 2017.
- [123] "NVIDIA Tesla V100 GPU architecture," NVIDIA Corp., USA, Aug. 2017.
- [124] "CUDA C programming guide," NVIDIA Corp., USA, Jun. 2017.
- [125] "OpenMP application programming interface," OpenMP architecture review board, Nov. 2015.
- [126] P. B. Johns and M. O'Brien, "Use of the transmission-line modelling (t.l.m.) method to solve non-linear lumped networks," *Radio and Electronic Engineer*, vol. 50, no. 1.2, pp. 59-70, Jan. 1980.
- [127] S. Y. R. Hui and C. Christopoulos, "Modeling non-linear power electronic circuits with the transmission-line modeling technique," *IEEE Trans. Power Electron.*, vol. 10, no. 1, pp. 48-54, Jan. 1995.
- [128] C. J. Smartt and C. Christopoulos, "Modelling nonlinear and dispersive propagation problems by using the TLM method," *IEE Proc. Microw., Antennas Propag.*, vol. 145, no. 3, pp. 193-200, Jun. 1998.
- [129] M. Hagiwara and H. Akagi, "Control and experiment of pulsewidth-modulated modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 24, no. 7, pp. 1737-1746, Jul. 2009.
- [130] F. Blaschke, "The principles of field orientation as applied to the new transvector closed-loop system for rotating field machines," *Siemens Rev.*, vol. 34, pp. 217-220, 1972.
- [131] P. Karamanakos, P. Stolze, R. M. Kennel, S. Manias, and H. du Toit Mouton, "Variable switching point predictive torque control of induction machines," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 2, no. 2, pp. 285-295, Jun. 2014.
- [132] M. A. Fnaiech, S. Khadraoui, H. N. Nounou, M. N. Nounou, J. Guzinski, H. Abu-Rub, A. Datta, and S. P. Bhattacharyya, "A measurement-based approach for speed control of induction machines," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 2, no. 2, pp. 308-318, Jun. 2014.
- [133] ABB Applying IGBTs, Application Note 5SYA 2053-04. Aug. 2012. [Online]. Available: http://www.abb.com/abblibrary/DownloadCenter/
- [134] F. Dugal, E. Tsyplakov, A. Baschnagel, L. Storasta, and T. Clausen, "IGBT press-packs for the industrial market," *Proc. PCIM12*, Nürnberg, Germany, 2012.
- [135] A. Courtay, "MAST power diode and thyristor models including automatic parameter extraction," *SABER User Group Meeting*, Brighton, UK, Sep. 1995.

- [136] "Saber model architect tool user guide," Synopsys, Inc., USA, Sep. 2011.
- [137] Z. Luo, H. Ahn, and M. A. E. Nokali, "A thermal model for insulated gate bipolar transistor module," *IEEE Trans. Power Electron.*, vol. 19, no. 4, pp. 902-907, Jul. 2004.
- [138] J. Xu, A. M. Gole, and C. Zhao, "The use of averaged-value model of modular multilevel converter in DC grid," *IEEE Trans. Power Del.*, vol. 30, no. 2, pp. 519-528, Apr. 2015.
- [139] D. Döring, D. Ergin, K. Würflinger, J. Dorn, F. Schettler, and E. Spahic, "System integration aspects of DC circuit breakers," *IET Power Electron.*, vol. 9, no. 2, pp. 219-227, Feb. 2016.
- [140] J. Häfner and B. Jacobson, "Proactive hybrid HVDC breakers - a key innovation for reliable HVDC grids," in *Proc. Cigré Symp.*, Bologna, Italy, Sep. 13-15, 2011.
- [141] R. M. Cuzner and V. Singh, "Future shipboard MVdc system protection requirements and solid-state protective device topological tradeoffs," *IEEE J. Emerging Sel. Topics Power Electron.*, vol. 5, no. 1, pp. 244-259, Mar. 2017.
- [142] W. Wen, Y. Huang, Y. Sun, J. Wu, M. A. Dweikat, and W. Liu, "Research on current commutation measures for hybrid DC circuit breakers," *IEEE Trans. Power Del.*, vol. 31, no. 4, pp. 1456-1463, Aug. 2016.
- [143] ABB 5SNA-2000K450300 StakPak IGBT module, doc. No. 5SYA1431-00, available online: http://new.abb.com/semiconductors/stakpak.
- [144] H. W. Dommel, "Digital computer solution of electromagnetic transients in single- and multiphase networks," *IEEE Trans. Power App. Syst.*, vol. PAS-88, no. 4, pp. 388-399, Apr. 1969.
- [145] J. Chen, S. Downer, A. Murray, A. Guerra, and T. McDonald, "Combined device and system simulation for automotive application using SABER," *IEEE Trans. Electron. Transportat.*, pp. 99-104, 2002.
- [146] W. Grieshaber, J. P. Dupraz, D. L. Penache, and L. Violleau, "Development and test of a 120 kV direct current circuit breaker," in *Proc. Cigré Session*, Paris, France, Aug. 2014, pp. 1-11.
- [147] C. C. Davidson, R. S. Whitehouse, C. D. Barker, J. P. Dupraz, and W. Grieshaber, "A new ultra-fast HVDC Circuit breaker for meshed DC networks," in *11th IET Int. Conf. AC DC Power Transmiss.*, 2015, pp. 1-7.
- [148] A. Jamshidi Far and D. Jovcic, "Design, modeling and control of hybrid DC circuit breaker based on fast thyristors," *IEEE Trans. Power Del.*, vol. 33, no. 2, pp. 919-927, Apr. 2018.

- [149] T. K. Vrana, S. Dennetière, Y. Yang, J. Jardini, D. Jovcic, and H. Saad, "The CIGRE B4 DC grid test system," *ELECTRA*, issue 270, pp. 10-19, Oct. 2013.
- [150] P. C. Krause, O. Wasynczuk, and S. D. Sudhoff, *Analysis of Electric Machinery*. New York: IEEE Press, Jan. 1995.
- [151] V. Brandwajn, H. W. Dommel, and I. I. Dommel, "Matrix representation of three-phase N-winding transformers for steady-state and transient studies," *IEEE Trans. Power App. Syst.*, vol. PAS-101, no. 6, pp. 1369-1378, Jun. 1982.
- [152] J. R. Marti, "Accurate modelling of frequency-dependent transmission lines in electromagnetic transient simulations," *IEEE Trans. Power App. Syst.*, vol. PAS-101, no. 1, pp. 147-157, Jan. 1982.
- [153] H. Abu-Rub, M. Malinowski, and Kamal Al-Haddad, *Power Electronics for Renewable Energy Systems, Transportation and Industrial Applications*. Wiley-IEEE Press, 2014.
- [154] D. Jovcic, M. Taherbaneh, J. P. Taisne, and S. Nguefeu, "Offshore DC grids as an interconnection of radial systems: protection and control aspects," *IEEE Trans. Smart Grid*, vol. 6, no. 2, pp. 903-910, Mar. 2015.
- [155] L. O. Chua and P. M. Lin, *Computer-aided analysis of electronic circuits: algorithms and computational techniques*. NJ: Prentice-Hall, 1975.
- [156] N. R. Tavana and V. Dinavahi, "A general framework for FPGA-based real-time emulation of electrical machines for HIL applications," *IEEE Trans. Ind. Electron.*, vol. 62, no. 4, pp. 2041-2053, Apr. 2015.
- [157] N. R. Tavana and V. Dinavahi, "Real-time nonlinear magnetic equivalent circuit model of induction machine on FPGA for hardware-in-the-Loop simulation," *IEEE Trans. Energy Convers.*, vol. 31, no. 2, pp. 520-530, Jun. 2016.
- [158] P. Liu and V. Dinavahi, "Real-time finite-element simulation of electromagnetic transients of transformer on FPGA," *IEEE Trans. Power Del.*, vol. 33, no. 4, pp. 1991-2001, Aug. 2018.
- [159] B. Jandaghi and V. Dinavahi, "Prototyping of nonlinear time-stepped finite element simulation for linear induction machines on parallel reconfigurable hardware," *IEEE Trans. Ind. Electron.*, vol. 64, no. 10, pp. 7711-7720, Oct. 2017.



# A

## A.1 IGBT/Diode NBM Parameters

Table A.1: Behavioural IGBT and diode parameters provided by SaberRD®

| Device model | Parameters |
|--------------|------------|
| Siemens® BSM300GA160D IGBT | $r_{off}=10^{9}\Omega$, $g_{off}=10^{-12}$S, $g_{on}=10^{6}$S, $r_{g}=5\Omega$, $vce1=4.8$V, $vge1=9$V, $ic1=225$A, $vce2=1.8$V, $vge2=7$V, $ic2=20$A, $vce3=4$V, $vge3=17$V, $ic3=400$A, $V_t=6.3$V, $V_{on}=0.8$V, $vce4=10$V, $vce5=4$V, $vce6=800$V, $vge4=10$V, $vge5=20$V, $itrat=20$, $crss1=30$nF, $crss2=1.6$nF, $coss1=42$nF, $coss2=5$nF, $q1=400$nC, $q2=2000$nC, $q3=3500$nC, $\tau=10\mu$s, $M=0.5$, $R_{tail}=1\mu\Omega$, $C_{tail}=10$F, $a1=0.0217$, $a3=91.705$, $b1=0.00395$, $b3=3.221$, $x=0.973$, $y=1.428$, $z=0.369$, $icsat3=1.789$kA, $cceo=12$nF, $ccgo=110$nF, $cgeo=40$nF, $vceo=0.873$V, $vcgo=0.0189$V |
| ABB® 5SNA 2000K450300 StakPak IGBT | $r_{off}=10^{9}\Omega$, $g_{off}=10^{-12}$S, $g_{on}=10^{6}$S, $r_{g}=1.2\Omega$, $V_{t}=7.71$V, $V_{on}=0.43$V, $itrat=4$, $a=0.00514$, $b=445.6\mu$, $x=1.32$, $y=1.45$, $z=1.04$, $ittau=1\mu$, $cres0=30$nF, $cres1=25$nF, $cres2=4$nF, $coes0=40$nF, $coes1=32$nF, $coes2=10$nF, $cies0=40$nF, $M=0.5$, $V1=12$V, $V2=20$V |
| Behavioural diode | $r_{on}=10\text{m}\Omega$, $r_{off}=100\text{k}\Omega$, $V_{on}=0.7\text{V}$, $I_{Fo}=10\text{A}$, $\frac{\text{d}I_r}{\text{d}t}=50\times10^6$, $I_{rrm}=10\text{A}$, $t_{rr}=2\mu\text{s}$, $K=9.883\times10^4$, $L=10\times10^{-12}$H, $R_L=1.279\times10^{-5}\Omega$ |

## A.2 IGBT DCFM Parameters

The DCFM parameters of the ABB 5SNA 2000K450300 StakPak IGBT module used in Chapter 4 are as follows.

The 6 piecewise linearized IGBT static model segments are:

1. $I_C>1000$A: $k_1=-4.428T_{vj}+1567$, $b_1=-4.263T_{vj}+1975.5$;
2. $I_C \in (500, 1000]$A: $k_2=-2.684T_{vj}+1113.1$, $b_2=-1.867T_{vj}+1107.4$;
3. $I_C \in (300, 500]$A: $k_3=-2.185T_{vj}+881$, $b_3=-1.588T_{vj}+772.7$;
4. $I_C \in (200, 300]$A: $k_4=-1.692T_{vj}+709$, $b_4=-1.179T_{vj}+562.8$;
5. $I_C \in (0, 200]$A: $k_5=200$, $b_5=0$;
6. $I_C<0$: $k_6=10^{-6}$, $b_6=0$.

The IGBT turn-on model's coefficients are:

- Segment 1: $k_0=0$, $k_1=0$, $k_2=1$, $k_3=0$, $b_0=3375$, $b_1=1$, $b_2=-1833.3$, $b_3=-1.6$;
- Segment 2: $k_0=0$, $k_1=0$, $k_2=0$, $k_3=0$, $b_0=5$, $b_1=1$, $b_2=0$, $b_3=0.24$.

The IGBT turn-off model's coefficients are:

- Segment 1: $k_0=0$, $k_1=0$, $k_2=0$, $k_3=0$, $b_0=1748.3$, $b_1=1$, $b_2=33.33$, $b_3=-0.6867$;
- Segment 2: $k_0=0$, $k_1=0$, $k_2=0$, $k_3=0$, $b_0=2048.3$, $b_1=1$, $b_2=-200$, $b_3=-0.6867$;
- Segment 3: $k_0=0$, $k_1=0$, $k_2=0$, $k_3=0$, $b_0=1420$, $b_1=1$, $b_2=0$, $b_3=-0.49$.

The IGBT thermal network parameters: $R_1=1.601$K/kW, $R_2=1.765$K/kW, $R_3=0.358$K/kW, $R_4=0.328$K/kW, $C_1=0.362898$kJ/K, $C_2=0.033428$kJ/K, $C_3=0.01676$kJ/K, $C_4=0.003049$kJ/K.
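As a quick plausibility check on the thermal network parameters above, the following Python sketch computes the time constant of each RC pair and the resulting transient thermal impedance. It assumes, without confirmation from this appendix, that the four pairs form a Foster-type network (parallel RC cells connected in series). The function names are hypothetical and are not taken from the thesis.

```python
import math

# Thermal network parameters from Appendix A.2 (R in K/kW, C in kJ/K).
R_TH = [1.601, 1.765, 0.358, 0.328]
C_TH = [0.362898, 0.033428, 0.01676, 0.003049]


def time_constants(r=R_TH, c=C_TH):
    """Return tau_i = R_i * C_i in seconds, since (K/kW) * (kJ/K) = s."""
    return [ri * ci for ri, ci in zip(r, c)]


def foster_zth(t, r=R_TH, c=C_TH):
    """Transient thermal impedance in K/kW at time t (s).

    ASSUMPTION: Foster-type network, Zth(t) = sum_i R_i * (1 - exp(-t / tau_i)).
    """
    return sum(ri * (1.0 - math.exp(-t / (ri * ci))) for ri, ci in zip(r, c))


def junction_temp_rise(p_loss_kw, t):
    """Temperature rise in K after a constant loss of p_loss_kw applied for t seconds."""
    return p_loss_kw * foster_zth(t)


if __name__ == "__main__":
    for i, tau in enumerate(time_constants(), start=1):
        print(f"tau_{i} = {tau * 1e3:.1f} ms")
    print(f"Zth(10 s) = {foster_zth(10.0):.3f} K/kW")
```

Under this assumption the four time constants come out close to 581 ms, 59 ms, 6 ms, and 1 ms, and the steady-state thermal resistance is the sum of the four resistances, 4.052 K/kW.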

## A.3 SST Test Case Parameters in Chapter 4

The parameters of the MTDC system for the SST test in Chapter 4 are: $MMC_1$ rated power $P_{rec}$=400MW, DC line 1 and 2 voltage $V_{dc1,2}$=200kV, DC line 3 voltage $V_{dc3}$=100kV, $L_{1-4}$=100mH. The SST parameters at 300/180/60Hz are: SM capacitance $C_{SM}^{MMC_H}$=3/12/20mF, $C_{SM}^{MMC_L}$=3/5/10mF, arm inductance $L_{u,d}$=10/15/50mH; Y-Y MFT capacity 600MVar, 110/55kV; $MMC_H$ 55-level, $MMC_L$ 31-level. Transmission line parameters: distance 100km, r=0.01$\Omega$/km, l=0.1mH/km, C=0.2$\mu$F/km. The parameters for $MMC_1$-$MMC_3$ are: 5-level, $L_{u,d}$=20mH, arm inductor resistance $r_{u,d}$=0.1$\Omega$, $C_{SM}^{MMC_{1-3}}$=10mF, grid voltage (L-L, RMS) $V_{g1,2}$=134kV, $V_{g3}$=67kV.

## A.4 MVDC System Parameters in Chapter 4

The MVDC system parameters are:  $V_{dc}$ =10kV,  $P_{dc}$ =8MW,  $C_{SM}$ =1mF,  $L_{u,d}$ =20mH,  $V_{g1,2}$ =5.5kV/60Hz, feed-in resistor  $r_{1,2}$ =0.4 $\Omega$ , feed-in inductor  $L_{1,2}$ =1mH.

## A.5 Full NBM IGBT Matrix

$$\mathbf{G^{IGBT}} = \begin{bmatrix}
G_x & -G_x & 0 & 0 & 0 \\
-G_x & G_x + G_{tailvd} + G_{mosvd} + G_{Ccg} + G_{Cce} & G_{mosvcge} + G_{tailvcge} - G_{Ccg} & -G_{tailvd} + G_{tailvtail} - G_{mosvd} & -G_{mosvcge} - G_{tailvcge} - G_{Cce} - G_{tailvtail} \\
0 & -G_{Ccg} & G_{Cge} + R_g^{-1} + G_{Ccg} & 0 & -G_{Cge} - R_g^{-1} \\
0 & -G_{mosvd} & -G_{mosvcge} & r_{tail}^{-1} + G_{ct} + G_{mosvd} & -r_{tail}^{-1} - G_{ct} + G_{mosvcge} \\
0 & -G_{Cce} - G_{tailvd} & -G_{Cge} - R_g^{-1} - G_{tailvcge} & -G_{ct} - r_{tail}^{-1} - G_{tailvtail} + G_{tailvd} & G_{tailvcge} + G_{Cce} + G_{ct} + r_{tail}^{-1} + R_g^{-1} + G_{Cge} + G_{tailvtail}
\end{bmatrix}_{5 \times 5}$$
(A.1)

$$\mathbf{I_{eq}^{IGBT}} = \begin{bmatrix} G_x v_{on} \\ -G_x v_{on} - I_{moseq} - I_{Ccgeq} - I_{taileq} - I_{Cceeq} \\ I_{Ccgeq} + \frac{V_g}{R_g} - I_{Cgeeq} \\ I_{moseq} - I_{Cteq} \\ I_{Cgeeq} + I_{Cteq} + I_{Cceeq} + I_{taileq} - \frac{V_g}{R_g} \end{bmatrix}^T.$$
(A.2)

# B

## B.1 MTDC System Parameters in Chapter 5

AC side impedance $Z_{ac}=0.1+j11.3\Omega$, MMC DC side capacitor $C_e=500\mu F$, total power $P_{rec}=400$MW, rectifier DC current $I_1=2$kA, inverter DC voltage $U_{dc1,2}=200$kV, inverter DC current $I_{dc1,2}=1$kA.

## B.2 Transmission Line Parameters in Chapter 5

Resistance $r_0=0.012\Omega/\text{km}$, inductance $l_0=0.106\text{mH/km}$, capacitance $c_0=0.296\mu F/\text{km}$, length D=200km.
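For orientation, the per-kilometre values above imply a surge impedance and a one-way wave travel time. The Python sketch below estimates both under a lossless-line approximation, so the series resistance is ignored. It is an illustrative estimate with hypothetical function names, not the line model used in Chapter 5.

```python
import math


def surge_impedance(l_per_km, c_per_km):
    """Lossless surge impedance in ohms from inductance (H/km) and capacitance (F/km)."""
    return math.sqrt(l_per_km / c_per_km)


def travel_time(l_per_km, c_per_km, length_km):
    """One-way wave travel time in seconds over a line of length_km."""
    return length_km * math.sqrt(l_per_km * c_per_km)


if __name__ == "__main__":
    l0 = 0.106e-3  # H/km, from B.2
    c0 = 0.296e-6  # F/km, from B.2
    d = 200.0      # km, from B.2
    print(f"Zc  = {surge_impedance(l0, c0):.2f} ohm")
    print(f"tau = {travel_time(l0, c0, d) * 1e3:.3f} ms")
```

With the B.2 values this gives a surge impedance of roughly 18.9 Ω and a travel time of about 1.12 ms.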

## B.3 ABB HHB Parameters in Chapter 5

Snubber resistor  $R_s=10\Omega$ , snubber capacitor  $C_s=30\mu F$ , MOV overall protection voltage  $V_{ref}=340$ kV,  $I_{ref}=2$ kA, number of HHB units  $N_{hcb}=100$ .

## B.4 Alstom Grid HHB Parameters in Chapter 5

 $N_{SCR_1} = N_{SCR_{11}} = N_{SCR_{12}} = 60, N_{SCR_2} = 120, C_{11} = 500 \mu\text{F}, C_{12} = 190 \mu\text{F}, C_2 = 13 \mu\text{F}, r_{11} = 750\Omega, r_{12} = 2k\Omega, r_2 = 30k\Omega.$  Varistor protection voltages:  $V_{M_0} = 11 \text{kV}, V_{M_{11}} = 7 \text{kV}, V_{M_{12}} = 80 \text{kV}, V_{M_2} = 180 \text{kV}.$ 

## B.5 UFMCB Companion Model

$$\mathbf{I_{eq}^{UFMCB}} = \begin{bmatrix} J_s - I_{Meq2} - I_{MBeq} \\ 0 \\ 2v_{C11}^i G_{C11} - I_{Meq11} \\ 2v_{C12}^i G_{C12} - I_{Meq12} \\ 2v_{C2}^i G_{C2} \end{bmatrix}^T.$$
(B.2)

# C

## C.1 CIGRÉ B4 DC Grid Parameters

MMC parameters:

voltage level 5-513, arm inductance  $L_{u,d}$ =20mH, SM capacitance  $C_i$ =3mF; AC grid voltage  $V_g$ =380kV, AC frequency f=60Hz;

DCS1,2 transformer ratio 380/270kV, YY structure;

DCS 1,2 rated DC voltage  $V_{dc}$ =±200kV, DCS 3 rated DC voltage  $V_{dc}$ =±400kV.

Rectifier stations: MMC0 800MW, MMC2 400MW, MMC4 800MW, MMC6 1600MW, MMC8 800MW, MMC10 800MW;

inverter stations: MMC1  $\pm 200$  kV, MMC3  $\pm 200$  kV, MMC5  $\pm 200$  kV, MMC7  $\pm 400$  kV, MMC9  $\pm 400$  kV.

Transmission line parameters: distance DCS1  $d_1$ =50km, DCS2  $d_2$ =100km, DCS3  $d_3$ =200km; shunt conductance g=10<sup>-8</sup>m $\Omega$ /km, conductor outer radius 10cm, height H=50m, sag 2m, DC resistance  $r_{dc}$ =0.01 $\Omega$ /km.

## C.2 Greater CIGRÉ DC Grid



Figure C.1: Greater CIGRÉ DC Grid consisting of multiple CIGRÉ B4 DC systems.

# D

In Chapter 7, the MMC parameters are: rated power 200MW, AC voltage 135kV, DC voltage  $\pm$ 100kV, MMC level 201, SM capacitor 20mF, arm inductor 50mH. DC line parameters:  $10m\Omega/km$ , 0.1mH/km,  $0.3\mu F/km$ , length 100km.