Received 19 May 2018; revised 4 July 2018; accepted 17 August 2018. Date of publication 24 August 2018; date of current version 18 September 2018.

Digital Object Identifier 10.1109/JPETS.2018.2866589

# Design and Implementation of Real-Time MPSoC-FPGA-Based Electromagnetic Transient Emulator of CIGRÉ DC Grid for HIL Application

# ZHUOXUAN SHEN (Student Member, IEEE), TONG DUAN (Student Member, IEEE), AND VENKATA DINAVAHI (Senior Member, IEEE)

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada CORRESPONDING AUTHOR: Z. SHEN (zshen@ualberta.ca)

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).

**ABSTRACT** Real-time electromagnetic transient simulation is a powerful tool for power system transient studies and hardware-in-the-loop (HIL) testing. A large-scale DC grid can meet flexible transmission requirements with high power efficiency and high controllability. The CIGRÉ working group has proposed a DC grid test system, which covers various HVDC configurations and deploys modular multi-level converters (MMCs) in the grid. This work focuses on an efficient real-time emulator of the DC grid that provides accurate and detailed results. The design and implementation of the CIGRÉ DC grid emulator are carried out on a hybrid MPSoC-FPGA platform, which realizes the synergy between the Xilinx Virtex UltraScale+ FPGA device, containing a large number of logic resources, and the Xilinx Zynq UltraScale+ MPSoC device, containing an ARM multi-core processing system and FPGA resources on a single chip. A hybrid modeling methodology using the device-level electrothermal model, the equivalent circuit model, and the average value model for the converters is employed to present detailed device-level results for local equipment and accurate system-level results for the global interactions of the DC grid. The detailed design partitioning and implementation methods are presented, and the real-time results are captured by an oscilloscope and validated against the commercial simulation tools PSCAD/EMTDC and SaberRD.

**INDEX TERMS** Electromagnetic transient simulation, field-programmable gate arrays, HVDC grid, hybrid modeling, modular multi-level converter, multi-processing system-on-chip, real-time systems.

#### NOMENCLATURE

| Abbreviation | Definition                         |
|--------------|------------------------------------|
| FPGA         | Field-programmable gate array.     |
| IGBT         | Insulated-gate bipolar transistor. |
| MMC          | Modular multi-level converter.     |
| MPSoC        | Multi-processing system-on-chip.   |
| MTDC         | Multi-terminal direct current.     |
| PL           | Programmable logic.                |
| PS           | Processing system.                 |

### I. INTRODUCTION

Real-time electromagnetic transient (EMT) simulation is often used in the hardware-in-the-loop (HIL) scenario for the closed-loop testing of power system equipment and of control and protection systems [1]–[3]. While the design, testing, and commissioning of *local* control and protection functions of system equipment is the primary objective of such simulation, it is also paramount for *global* dynamic and interactive studies of large-scale power systems. Simulating large AC-DC grids in real time is a significant challenge because it requires detailed *device-level* modeling of system components while simultaneously reproducing the *system-level* interactions accurately for large system sizes. These requirements place enormous pressure on the simulation timing constraints and on the hardware selected for implementing the real-time simulation.

The CIGRÉ working group has proposed a high-voltage DC (HVDC) grid test system, which is composed of 3 DC sub-systems connecting the off-shore renewable resource power plants and the on-shore AC system [4]–[6]. The DC grid test system contains a total of 11 AC-DC converters and 2 DC-DC converters and covers three major DC transmission configurations: the two-terminal HVDC system, the non-meshed multi-terminal DC (MTDC) system, and the meshed MTDC system. Modular multi-level converters (MMCs), which have the advantages of low switching losses and low harmonics, are utilized for the test system [7]–[9]. Although the DC grid has the benefits of high controllability, low power losses, and low capital investment for long-distance bulk-power transmission, the low impedance of DC transmission lines, which is a merit in normal operation, can cause severe short-circuit currents during contingencies. It is critical to develop accurate and efficient modeling schemes and implementation platforms for real-time EMT simulation to conduct in-depth studies of such systems, and to validate the solutions, control algorithms, and equipment by HIL testing.

2332-7707 © 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

Field-programmable gate arrays (FPGAs), which contain numerous configurable logic resources and I/Os, can provide very high parallelism and have been applied in high-performance computing, including real-time EMT emulation systems [10]. Recent developments in FPGA technology have not only improved the timing performance and enlarged the resource capacity in terms of transistor count and distributed memories, but have also advanced the design methodology and fundamental architecture [11]–[13]. The high-level synthesis (HLS) design methodology uses the C/C++ programming language instead of a hardware description language (HDL), which significantly reduces the design effort while maintaining the synthesis efficiency. The synthesized modules can be pipelined efficiently, with performance optimized in terms of resource usage and latency. The multi-processing system-on-chip (MPSoC) combines a multi-core processing system and an FPGA on the same chip, which maximizes the communication bandwidth and the synergy of fast sequential computing and high hardware parallelism. In this work, a hybrid MPSoC-FPGA platform using Xilinx Zynq UltraScale+ XCZU9EG and Virtex UltraScale+ XCVU9P devices is established for the realization of the CIGRÉ DC grid real-time emulator.

To provide both accurate and detailed results for the components in the local study zone and the global interactive results of the other components in the DC grid, at low hardware resource and design cost, a hybrid modeling scheme is utilized for representing the various MMCs in the DC grid. The device-level electrothermal model, the equivalent circuit model, and the average value model are used for representing different components and zones in the emulated system [9], [14], [15]. Among them, the device-level electrothermal model is the most complex, since it considers the temperature-dependent characteristics of the IGBTs and diodes. Various data and waveforms are calculated, including the switching and conduction power losses, the junction temperatures, and the linearized device-level switching waveforms. The electrothermal model can be used to evaluate the converter efficiency during normal operation and the device status during fault transients by monitoring the device junction temperatures. Applying the electrothermal model can largely extend the functionality of real-time simulation of the DC grid, but it requires high parallelism to ensure real-time performance.

In the literature, a three-terminal MMC HVDC system has been simulated with the equivalent circuit model [16]. The complete CIGRÉ DC grid has been simulated with all the MMCs using the average value model, which cannot be used to verify the valve-level control [17]. This work substantially increases the capability of the real-time emulator by applying a more accurate and detailed MMC modeling scheme to a significantly larger circuit topology with optimized cost, which requires integrated solutions for the modeling scheme, the decomposition and partition scheme, the hardware selection, and the implementation. The major contributions of this work are summarized as follows:

- 1) applying the device-level electrothermal model to the complex DC grid;
- 2) developing the MPSoC-FPGA hybrid digital hardware platform for the real-time simulation application;
- 3) applying the hybrid modeling scheme to the real-time simulation of the large-scale DC grid.

All the above achievements are presented for the first time in the area of real-time EMT simulation. The algorithm decomposition, hardware design partitioning, and implementation methodologies for the CIGRÉ DC grid emulator are proposed and presented in detail in this work. The complete system is decomposed into 20 sub-systems to achieve high parallelism and modularity. The system-level and the device-level emulation results captured in real time on the oscilloscope are compared with the commercial EMT tools PSCAD/EMTDC and SaberRD, respectively. The developed emulation system can significantly enhance the accuracy and efficiency of HIL tests for the control and protection schemes in the DC grid. The paper is organized as follows: Section II introduces the architecture, design methodology, and communication of the FPGA and MPSoC devices; Section III shows the topology and control scheme of the CIGRÉ DC grid and the hybrid modeling scheme of the MMCs and other devices; Section IV presents the detailed partitioning and design implementation methodologies of the emulation system; Section V shows the captured real-time system-level and device-level results during normal operation, power flow control, and DC fault conditions, together with their validation; and Section VI gives the conclusions.

# II. FPGA AND MPSOC HYBRID HARDWARE ARCHITECTURE AND DESIGN METHODOLOGY

Both FPGA and MPSoC devices have been widely used in data centers, communication, image processing, industrial control, high-performance computing, and other applications [18], [19]. FPGAs have also been used for real-time EMT emulation systems and HIL testing [3], [9], and are adopted by commercial products, especially for MMC emulation [15].



FIGURE 1. FPGA and MPSoC hybrid architecture: (a) Xilinx VCU118 FPGA board and UltraScale+ XCVU9P device, (b) Xilinx ZCU102 MPSoC board and UltraScale+ XCZU9EG device.

The basic architectures of FPGA and MPSoC are presented and compared in this section, and such knowledge is useful for the device selection, system partitioning, and hardware implementation.

# A. FPGA ARCHITECTURE

The major configuration and architecture of the Xilinx VCU118 FPGA board and the corresponding Virtex UltraScale+ XCVU9P device are shown in Fig. 1(a) [11]. An FPGA is essentially a reconfigurable digital integrated circuit, realized by the programmable connection and configuration of the switch matrices and the configurable logic blocks (CLBs). A CLB is composed of combinational logic elements, the look-up tables (LUTs), and sequential logic elements, the flip-flops (FFs). Block RAMs and DSP slices, which provide the memory resources and arithmetic logic units respectively, are added to improve the overall performance. The rich I/O pins and their flexible allocation are further advantages of the FPGA.

# **B. MPSOC ARCHITECTURE**

In an FPGA, because of the complex routing and the additional combinational and sequential logic inferred for reconfiguration, the achievable clock frequency for a specific function is lower than that of an application-specific integrated circuit (ASIC), such as a central processing unit (CPU). The MPSoC architecture used in the Xilinx UltraScale+ XCZU9EG integrates the FPGA resources and a multi-processing system, including ARM Cortex-A53 quad-core application processing units (APUs), ARM Cortex-R5 dual-core real-time processing units (RPUs), and an ARM Mali-400 MP2 graphics processing unit (GPU), on a single chip, as shown in Fig. 1(b) [12]. The processing system (PS) communicates with the programmable logic (PL) through high-bandwidth, low-latency Advanced eXtensible Interface (AXI) channels. Such an architecture provides high flexibility to combine the advantages of fast sequential calculation and massive parallelism to meet the requirements of high-performance computing. However, the logic resources of the MPSoC are much smaller than those of the FPGA; a summary of the main resources available in this work is shown in Table 1.

TABLE 1. Programmable logic resource comparison.

| Device      | XCVU9P FPGA | XCZU9EG MPSoC |
|-------------|-------------|---------------|
| LUT         | 1,182,240   | 274,080       |
| FF          | 2,364,480   | 548,160       |
| DSP Slice   | 6,840       | 2,520         |
| Memory (Mb) | 345.9       | 32.1          |
| I/O         | 832         | 328           |

# C. COMMUNICATION

Both the Xilinx VCU118 and ZCU102 boards have plenty of components and interfaces, such as double data rate fourth-generation (DDR4) memory, quad serial peripheral interface (QSPI) flash, general purpose I/O (GPIO), and small form-factor pluggable (SFP) interfaces. This work uses the SFP interface of the ZCU102 and the quad SFP (QSFP) interface of the VCU118, combined with the Xilinx Aurora IP cores, to accomplish the communication between the two boards, as shown in Fig. 2(a). The Aurora 64B/66B core is a scalable, lightweight, high data rate, link-layer protocol for high-speed serial communication, supporting bi-directional transfer of data between devices using consecutive bonded gigabit transceivers: GTY on the VCU118 board and GTH on the ZCU102 board [20]. This work uses a total of four transceivers located in the ZCU102 SFP and VCU118 QSFP interfaces to construct four lanes with a 64-bit AXI4 user data stream transmitted in each lane, which can achieve a throughput from 500 Mb/s to over 254 Gb/s. Fig. 2(b) shows the waveforms of the major signals during the communication process. When the CHANNEL\_UP signal is asserted, the Aurora cores have been initialized and have established the four lanes of the channel for user applications to pass frames of data. User data are loaded on the AXI4\_TDATA[0:255] bus (64 b $\times$ 4 lanes) at each edge of USER\_CLK when AXI\_TREADY is asserted. The user data are then converted into encoded differential serial data (RXP/N[0:3] and TXP/N[0:3]) and transmitted through the four GTH or GTY transceivers and the QSFP/SFP cable.

FIGURE 2. Communication process: (a) block diagram and (b) digital signal waveforms.
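As a purely illustrative sketch of the data layout described above, the following Python snippet packs four 64-bit lane words into one 256-bit bus word and estimates the raw user payload rate. The lane ordering (lane 0 in the least significant bits), the function names, and the example USER\_CLK frequency are assumptions made for illustration; they are not taken from the Aurora core documentation or from the implementation in this work.

```python
LANES = 4        # four bonded lanes, as stated in the text
LANE_BITS = 64   # 64-bit user data stream per lane


def pack_lanes(words):
    """Pack one 64-bit word per lane into a single 256-bit integer.

    Assumption: lane 0 occupies the least significant 64 bits.
    """
    if len(words) != LANES:
        raise ValueError("expected one word per lane")
    bus = 0
    for lane, word in enumerate(words):
        if not 0 <= word < (1 << LANE_BITS):
            raise ValueError("word does not fit in 64 bits")
        bus |= word << (lane * LANE_BITS)
    return bus


def unpack_lanes(bus):
    """Split a 256-bit bus word back into four 64-bit lane words."""
    mask = (1 << LANE_BITS) - 1
    return [(bus >> (lane * LANE_BITS)) & mask for lane in range(LANES)]


def payload_rate_gbps(user_clk_mhz):
    """Raw user payload rate in Gb/s if one 256-bit word is accepted
    on every USER_CLK cycle (hypothetical clock value supplied by caller)."""
    return LANES * LANE_BITS * user_clk_mhz * 1e6 / 1e9
```

For example, `payload_rate_gbps(100.0)` evaluates to about 25.6 Gb/s for a hypothetical 100 MHz user clock; the actual USER\_CLK frequency depends on the configured line rate.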

#### D. HIGH-LEVEL SYNTHESIS

Conventionally, HDL is used for the register-transfer level (RTL) design of FPGAs, which can be time-consuming and error-prone. Instead, the HLS methodology significantly raises the design abstraction level and shortens the design cycle by using the C/C++ programming language [13]. The synthesized design can still meet specific latency and resource optimization requirements by inserting various directives, such as pipelining, loop unrolling, and limiting the number of instances. However, for modules with particularly demanding timing constraints, such as counters, HDL programming is necessary. This work uses both HLS and HDL design methodologies for the hardware design of the real-time DC grid emulator.

## III. CIGRÉ DC GRID NETWORK TOPOLOGY, CONTROL SCHEME, AND HYBRID MODELING METHODOLOGY

#### A. NETWORK TOPOLOGY

The CIGRÉ DC grid test system emulated in this work is shown in Fig. 3(a), with the parameters from [4]. DC System 3 uses the meshed topology, with the DC-DC converter Cd-B1 providing an extra degree of freedom for controlling the loop power flow. DC System 1 and DC System 3 are connected through the AC system, while DC System 2 and DC System 3 are connected directly by the DC-DC converter Cd-E1. The circuit topology of the MMC, which is used for all the AC-DC converters, is shown in Fig. 3(b). The MMC contains 6 arms, and each arm is composed of series-connected sub-modules (SMs) and an arm inductor $L_m$. The half-bridge SM topology is utilized for all the MMCs; it contains the upper IGBT module (IGBT $S_1$ and diode $D_1$), the lower IGBT module (IGBT $S_2$ and diode $D_2$), and the SM capacitor, as shown in Fig. 3(c).

#### **B. CONTROL SCHEME**

The control diagram of the MMC is shown in Fig. 4, which is composed of the outer loop control, inner current loop control, and valve-level control [21], [22]. The controlled variables are determined by the converter function and the type of the AC system. In this work, Converters Cm-A1, Cb-A1, Cm-B2 connecting to the strong on-shore AC system are utilized for DC voltage regulation, while the other AC-DC converters primarily control the active power flow. The reactive power flow can be regulated independently for all converters. Capacitor-sorting based control scheme is used for the MMC valve-level control [21], [22]. It is noted that the computation effort of valve-level control is dependent on the SM number in an arm, and can be significantly larger than the system-level control including the outer and inner loop controls.
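The capacitor-sorting idea behind the valve-level control can be sketched as follows. This is a minimal illustration of a commonly used sort-and-select scheme, not the exact algorithm of [21], [22]; the function name and the sign convention (a non-negative arm current charges the inserted capacitors) are assumptions.

```python
def select_submodules(cap_voltages, n_on, arm_current):
    """Return the indices of the sub-modules to insert in one arm.

    cap_voltages : capacitor voltage of every SM in the arm
    n_on         : number of SMs requested by the modulator
    arm_current  : arm current; assumed convention is that a
                   non-negative value charges inserted capacitors
    """
    n_total = len(cap_voltages)
    if not 0 <= n_on <= n_total:
        raise ValueError("n_on must lie between 0 and the SM count")
    # Sort SM indices by capacitor voltage, lowest first.
    order = sorted(range(n_total), key=lambda k: cap_voltages[k])
    if arm_current >= 0:
        chosen = order[:n_on]            # charge the lowest voltages
    else:
        chosen = order[n_total - n_on:]  # discharge the highest voltages
    return sorted(chosen)
```

The sorting step scales with the number of SMs per arm, which is consistent with the remark above that the valve-level control effort grows with the SM count.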

## C. HYBRID MODELING METHODOLOGY OF MMC

The major challenge and focus of this work are the modeling and implementation of all the converters in the DC grid for real-time EMT simulation with limited hardware resources. Although the FPGA board can provide high parallelism, realizing all converters with detailed models would consume a huge amount of resources, exceeding the capacity of a single board. A hybrid modeling strategy is applied to reduce the resource consumption while maintaining the accuracy of the study zone and the size of the complete system. Among various modeling methods, the device-level electrothermal model, the equivalent circuit model, and the average value model, which cover a wide range of modeling complexity, are adopted for various locations in the DC grid [9], [14], [15]. The detailed modeling scheme is used at the local element or zone of most interest, which can be the converter closest to the fault location for transient analysis; the equivalent circuit modeling scheme is applied to the converters surrounding the local element; the average value model is used for the converters in remote locations. In Fig. 3, the modeling schemes adopted for the various converters are annotated. Such a strategy is also applicable to other elements, such as the transmission lines.

FIGURE 3. Topology of the: (a) CIGRÉ DC grid test system, (b) modular multi-level converter, and (c) half-bridge sub-module.

FIGURE 4. Control diagram of MMC: (a) outer loop control for power and DC voltage, (b) inner current loop control, and (c) MMC valve-level control.

# 1) DEVICE-LEVEL ELECTROTHERMAL MODEL

The device-level electrothermal model is the most complex method adopted in this work; it can provide the power losses, the junction temperatures, and the device-level switching transients of any single IGBT or diode [9]. The parameters of the electrical interface model, obtained from the manufacturer's datasheet [23], depend on the junction temperature, which matches the actual physical behavior of power electronic devices. Using such a modeling scheme in real-time simulation can provide a comprehensive on-line evaluation of the efficiency and security of the power converter under normal conditions and fault transients, with direct thermal indicators.



FIGURE 5. Major procedures of device-level electrothermal model.

The major processes of the electrothermal model are illustrated in Fig. 5. The IGBT module (one IGBT and one diode connected in anti-parallel) is modeled as a resistor $r_i$ in series with a voltage source $v_i$, where *i* is either 1 or 2 for the upper or lower IGBT module, as shown in Fig. 6(b). When the IGBT or the diode is turned on, polynomial functions are applied to fit the output characteristic curves shown in Fig. 6(a) for the interface model as follows:

FIGURE 6. (a) Temperature-dependent IGBT module output characteristics, (b) circuit model of half-bridge SM, and (c) simplified SM model.

$$v(i) = \sum_{k=0}^{n} \alpha_k i^k \tag{1}$$

$$r_{on} = \frac{\mathrm{d}v}{\mathrm{d}i} = \sum_{k=0}^{n-1} \beta_k i^k \tag{2}$$

$$v_{on} = v(i) - r_{on}i. \tag{3}$$
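A minimal sketch of how (1)-(3) can be evaluated is given below. The coefficient list is hypothetical and only serves the example; `alpha[k]` is taken to be the coefficient of the k-th power of the current, and the slope resistance is obtained by differentiating the fitted polynomial term by term.

```python
def on_state_parameters(alpha, current):
    """Evaluate (1)-(3) for a fitted output characteristic.

    alpha   : polynomial coefficients, alpha[k] multiplies current**k
    current : device current in A
    Returns (r_on, v_on): slope resistance and threshold voltage.
    """
    # (1) fitted voltage drop at the operating current
    v = sum(a * current**k for k, a in enumerate(alpha))
    # (2) slope resistance dv/di, differentiated term by term
    r_on = sum(k * a * current**(k - 1)
               for k, a in enumerate(alpha) if k >= 1)
    # (3) threshold voltage of the tangent line at the operating point
    v_on = v - r_on * current
    return r_on, v_on
```

With the hypothetical coefficients `[1.0, 2e-3, 1e-6]` and a current of 500 A, the fitted drop is 2.25 V, so the sketch returns a slope resistance of 3 mΩ and a threshold voltage of 0.75 V.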

The slope resistance $r_{on}$ and the threshold voltage $v_{on}$ are assigned to $r_i$ and $v_i$ if the current flows through the corresponding device. When the IGBT module is turned off, a very large value is assigned to $r_i$, and $v_i$ is 0 V. Since the output characteristic curves are given only at $T_1$ (25 °C) and $T_2$ (125 °C) in the datasheet, as shown in Fig. 6(a), the $r_{on}$ and $v_{on}$ at an arbitrary junction temperature $T_{vj}$ are interpolated using the following equations:

$$r_{on}(T_{vj}) = \frac{T_{vj} - T_2}{T_2 - T_1} \left( r_{on}^{T_2} - r_{on}^{T_1} \right) + r_{on}^{T_2} \tag{4}$$

$$v_{on}(T_{vj}) = \frac{T_{vj} - T_2}{T_2 - T_1} \left( v_{on}^{T_2} - v_{on}^{T_1} \right) + v_{on}^{T_2}. \tag{5}$$
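The interpolation in (4) and (5) is a straight line through the two datasheet points, and it can be sketched in a few lines. The function name and the example values are illustrative assumptions; only the two reference temperatures come from the text.

```python
T1 = 25.0    # first datasheet temperature in deg C
T2 = 125.0   # second datasheet temperature in deg C


def interpolate(t_vj, value_t1, value_t2, t1=T1, t2=T2):
    """Linear interpolation of a device parameter following (4)/(5).

    value_t1, value_t2 : parameter values given at t1 and t2
    t_vj               : junction temperature in deg C
    """
    return (t_vj - t2) / (t2 - t1) * (value_t2 - value_t1) + value_t2
```

The expression reproduces the datasheet values at 25 °C and 125 °C and extrapolates linearly outside that range, for example `interpolate(150.0, 1.0, 2.0)` gives 2.25.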

Similar equations are also used to update the other temperature-dependent parameters, including the switching energy losses and the IGBT current rise and fall times. The equivalent circuit model is then applied to further simplify the MMC arm interface element, as described in the next sub-section. The complete system matrix equation, based on nodal analysis and the Trapezoidal integration method, is formed as follows:

$$\mathbf{G} \cdot \mathbf{V}(t) = \mathbf{I},\tag{6}$$

where **G** is the conductance matrix of the complete system, **V**(t) is the vector of node voltages, and **I** is the vector of equivalent current sources composed of the power sources and history terms.
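To illustrate how (6) is assembled and solved at each time-step, the sketch below applies nodal analysis with the Trapezoidal companion model to a small hypothetical circuit: a voltage source behind $R_1$ feeding node 1, a resistor $R_2$ between nodes 1 and 2, and a capacitor from node 2 to ground. The circuit, its parameter values, and the function names are assumptions chosen for the example; they are unrelated to the DC grid model itself.

```python
def solve_2x2(g, rhs):
    """Solve a 2x2 system G.V = I with Cramer's rule."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    v1 = (rhs[0] * g[1][1] - g[0][1] * rhs[1]) / det
    v2 = (g[0][0] * rhs[1] - g[1][0] * rhs[0]) / det
    return v1, v2


def simulate_rc(v_source=1.0, r1=1.0, r2=1.0, c=1e-4,
                dt=20e-6, steps=1000):
    """Step the hypothetical two-node circuit and return (v1, v2)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    g_c = 2.0 * c / dt  # Trapezoidal companion conductance of C
    # Conductance matrix G: constant because the topology is fixed.
    g = [[1.0 / r1 + 1.0 / r2, -1.0 / r2],
         [-1.0 / r2, 1.0 / r2 + g_c]]
    v_c, i_c = 0.0, 0.0  # capacitor voltage and current history
    for _ in range(steps):
        i_hist = g_c * v_c + i_c          # history current source
        rhs = [v_source / r1, i_hist]     # vector I of (6)
        v1, v2 = solve_2x2(g, rhs)        # node voltages V(t)
        i_c = g_c * v2 - i_hist           # update capacitor current
        v_c = v2
    return v1, v2
```

Because **G** is constant in this example, its inverse could be computed once and reused; switching events in a converter change **G**, which is one motivation for the equivalent circuit simplifications that keep the matrix small.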

The updated nodal voltages are then used to calculate other variables, such as the arm currents and the SM capacitor voltages, which are in turn used to calculate the power losses $P_{\text{loss}}$, given as:

$$P_{\rm loss}(t) = P_{\rm cond}(t) + P_{\rm switch}(t), \tag{7}$$

where,

$$P_{\text{cond}}(t) = r_{on}(T_{vj})i^{2}(t) + v_{on}(T_{vj})i(t) \tag{8}$$

$$P_{\text{switch}}(t) = \left(\sum_{k=0}^{2} \alpha_{k} i^{k}\right) \frac{v(t)}{v_{\text{rated}} \cdot \Delta t}. \tag{9}$$

The power losses are composed of the conduction loss $P_{\text{cond}}$ and the switching power loss $P_{\text{switch}}$. The switching energy is fitted by a second-order polynomial function of the current and is assumed to scale linearly with the ratio of the device voltage $v(t)$ to the rated voltage $v_{\text{rated}}$ given in the datasheet. The interpolation with respect to temperature is also applied to the switching power loss, similar to (4) and (5).
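The loss calculation in (7)-(9) can be sketched as below. The coefficient values are hypothetical, and the sketch assumes that the switching term is evaluated only for a time-step in which a switching event occurs, which is one reading of (9) rather than something stated explicitly in the text.

```python
def conduction_loss(r_on, v_on, current):
    """Conduction loss of (8) in W."""
    return r_on * current**2 + v_on * current


def switching_loss(energy_coeffs, current, voltage, v_rated, dt):
    """Switching loss of (9) in W, averaged over one time-step.

    energy_coeffs : second-order fit of the switching energy (J) at
                    v_rated; energy_coeffs[k] multiplies current**k
    voltage       : device voltage at the switching instant
    """
    energy_at_rated = sum(a * current**k
                          for k, a in enumerate(energy_coeffs))
    return energy_at_rated * voltage / (v_rated * dt)


def total_loss(r_on, v_on, energy_coeffs, current, voltage,
               v_rated, dt, switched):
    """Total loss of (7); 'switched' flags a switching event (assumed)."""
    p = conduction_loss(r_on, v_on, current)
    if switched:
        p += switching_loss(energy_coeffs, current, voltage, v_rated, dt)
    return p
```

For instance, with a hypothetical energy fit of 1 mJ/A, a current of 500 A, half the rated voltage, and a 20 µs time-step, the switching term is 0.25 J spread over 20 µs, that is 12.5 kW for that single step.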

The computed power losses are then used to calculate the device junction temperatures through a sixth-order thermal network. The thermal network is composed of six series-connected thermal impedances representing the path from the device junction to the case, from the case to the heatsink, and from the heatsink to the ambient. Each impedance is modeled as a thermal resistor and a thermal capacitor in parallel, given by the resistance $R_t^i$ and the time constant $\tau_t^i$. Multiple devices are mounted on the same heatsink, which therefore receives the sum $P_{\text{sum}}$ of the power losses of all these devices. In this work, one upper sub-module and one lower sub-module are mounted on the same heatsink. The junction temperature $T_{vj}$ is calculated as:

$$T_{vj}(t) = \sum_{i=1}^{5} \left(\alpha_i \left(P_{\text{loss}}(t) + P_{\text{loss}}(t - \Delta t)\right) + \beta_i \Delta T_i(t - \Delta t)\right) + \alpha_6 \left(P_{\text{sum}}(t) + P_{\text{sum}}(t - \Delta t)\right) + \beta_6 \Delta T_6(t - \Delta t) + T_{amb}, \tag{10}$$

where

$$\alpha_i = \frac{R_t^i \cdot \Delta t}{2\tau_t^i + \Delta t}, \tag{11}$$

and

$$\beta_i = \frac{2\tau_t^i - \Delta t}{2\tau_t^i + \Delta t}, \quad i = 1, 2, \dots, 6. \tag{12}$$

$T_{amb}$ is the ambient temperature, which is 30 °C in this work; $\Delta T_i(t - \Delta t)$ is the history temperature difference across the $i$th thermal impedance. The junction temperatures of all devices can then be used to update the parameters for the next time-step.

The device-level calculation requires temperature-dependent parameters, such as the rise time and fall time of the current waveforms, which can be obtained from the datasheet. The rise time and fall time of the voltage waveform are estimated from the data of the current waveform and the calculated temperature-dependent switching power losses. Linearized curves are used to represent the switching transients of the IGBT. When implemented on hardware, the system-level circuit and thermal network calculations use IEEE 32-bit floating-point precision, while the device-level waveform generation module uses fixed-point data in order to update the results at every FPGA clock cycle; the clock frequency is 100 MHz in this work.
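The thermal-network update of (10)-(12) can be sketched as a small stateful class. The thermal resistances, time constants, and loss values used in the example are hypothetical; the structure (five stages driven by the device loss, a sixth stage driven by the summed heatsink loss, and an ambient temperature of 30 °C) follows the description above.

```python
def thermal_coefficients(r_th, tau, dt):
    """Coefficients alpha and beta of (11) and (12) for one stage."""
    alpha = r_th * dt / (2.0 * tau + dt)
    beta = (2.0 * tau - dt) / (2.0 * tau + dt)
    return alpha, beta


class ThermalNetwork:
    """Sixth-order thermal network updated with the Trapezoidal rule."""

    def __init__(self, r_th, tau, dt, t_amb=30.0):
        if len(r_th) != 6 or len(tau) != 6:
            raise ValueError("six thermal stages are expected")
        self.coeffs = [thermal_coefficients(r, t, dt)
                       for r, t in zip(r_th, tau)]
        self.delta_t = [0.0] * 6   # temperature drop across each stage
        self.p_loss_prev = 0.0     # device loss at the previous step
        self.p_sum_prev = 0.0      # heatsink loss at the previous step
        self.t_amb = t_amb

    def step(self, p_loss, p_sum):
        """Advance one time-step and return the junction temperature."""
        for i, (alpha, beta) in enumerate(self.coeffs):
            if i < 5:   # stages 1-5 carry the loss of the single device
                drive = p_loss + self.p_loss_prev
            else:       # stage 6 carries the loss of all mounted devices
                drive = p_sum + self.p_sum_prev
            self.delta_t[i] = alpha * drive + beta * self.delta_t[i]
        self.p_loss_prev, self.p_sum_prev = p_loss, p_sum
        return sum(self.delta_t) + self.t_amb
```

With constant losses, each stage settles at its thermal resistance multiplied by the power flowing through it, so the steady-state junction temperature is $T_{amb} + P_{\text{loss}}\sum_{i=1}^{5} R_t^i + P_{\text{sum}} R_t^6$, which is a convenient check on any implementation.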

## 2) EQUIVALENT CIRCUIT MODEL

Since an MMC with a large number of levels contains many SMs, which generate a large number of system matrix nodes, the conventional EMT simulation can be extremely time-consuming and cannot execute in real time. An equivalent or surrogate circuit is commonly used to simplify the sub-module model shown in Fig. 6(b). For the conventional equivalent circuit model, $r_i$ ($r_1$ or $r_2$) and $v_i$ have fixed values to represent the on-state and off-state, whereas for the electrothermal model, $r_i$ and $v_i$ are temperature-dependent and change during the simulation. The equivalent resistance $r_{\text{SMeq}}$ and the equivalent voltage source $v_{\text{SMeq}}$ of each sub-module are given as:

$$r_{\rm SMeq} = r_i + k \cdot R_{\rm cap} \tag{13}$$

$$v_{\text{SMeq}} = v_i + k \cdot v_{\text{cap}}^{\text{Hist}}(t - \Delta t) \tag{14}$$

where $k$ is either 0 or 1, determined by the IGBT gate signals, or by the arm current direction in the blocked mode; $R_{cap}$ and $v_{cap}^{Hist}$ are the equivalent resistance and the history term of the SM capacitor obtained with the Trapezoidal rule. The simplified SM models shown in Fig. 6(c) can be further summed to generate the interface of a converter arm, which substantially decreases the number of electrical nodes, as given by the following equation:

$$v_{\rm arm}(t) = \sum_{i=1}^{n} \left(v_{\rm SMeq(i)} + r_{\rm SMeq(i)} \cdot i_{\rm arm}(t - \Delta t)\right), \tag{15}$$

where *n* is the number of sub-modules in each arm and $i_{arm}$ is the arm current. The equivalent circuit modeling scheme accurately calculates the SM capacitor voltages and can verify the fundamental function of the complete control algorithm.
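The arm aggregation of (13)-(15) can be sketched as follows. The Trapezoidal companion resistance of the capacitor, $R_{cap} = \Delta t / (2C_{\rm sm})$, is the standard Thevenin form and is stated here as an assumption about how $R_{cap}$ is obtained; all numerical values and function names are hypothetical.

```python
def capacitor_companion_resistance(c_sm, dt):
    """Thevenin resistance of a capacitor under the Trapezoidal rule."""
    return dt / (2.0 * c_sm)


def arm_voltage(r_dev, v_dev, k, v_cap_hist, r_cap, i_arm_prev):
    """Arm voltage of (15) built from the SM equivalents (13), (14).

    r_dev, v_dev : device resistance r_i and source v_i of each SM
    k            : 1 if the SM capacitor is in the current path, else 0
    v_cap_hist   : capacitor history voltage of each SM
    r_cap        : capacitor companion resistance (same for all SMs)
    i_arm_prev   : arm current of the previous time-step
    """
    total = 0.0
    for r_i, v_i, k_i, v_h in zip(r_dev, v_dev, k, v_cap_hist):
        r_eq = r_i + k_i * r_cap   # (13)
        v_eq = v_i + k_i * v_h     # (14)
        total += v_eq + r_eq * i_arm_prev
    return total
```

However many SMs the arm contains, the result is a single voltage value per arm, which is why the number of electrical nodes in the system matrix stays small.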

#### 3) AVERAGE VALUE MODEL (AVM)

The average value model is the simplest modeling scheme for an MMC converter applied in this work. For each converter arm at the AC side, the arm inductor is connected with the voltage source with the fundamental frequency, given as

$$v_{p,i}(t) = \frac{1 - m_i}{2} v_{dc}, \tag{16}$$

$$v_{n,i}(t) = \frac{1+m_i}{2} v_{dc}, \tag{17}$$

where $v_{p,i}$ and $v_{n,i}$ are the voltage sources of the upper and lower arms of phase-*i*; $m_i$ is the modulation signal of phase-*i*; and $v_{dc}$ is the DC side voltage across the DC side equivalent capacitor. The DC side capacitor $C_{avm}$ is given as:

$$C_{\rm avm} = \frac{6 \cdot C_{\rm sm}}{n},\tag{18}$$

where  $C_{\rm sm}$  is the SM capacitance; *n* is the SM number in each arm. The DC side equivalent current source  $i_{\rm avm}$  is generated conserving the transferred power, given as:

$$i_{\text{avm}} = \frac{1}{2} \cdot (m_a \cdot i_a + m_b \cdot i_b + m_c \cdot i_c), \tag{19}$$

where $i_{a,b,c}$ are the converter AC side currents. The valve-level control cannot be verified with this model, since all the SM capacitor voltages are assumed to be balanced at all times. The average value model uses controlled voltage and current sources with a one time-step latency, which allows the system matrix to be separated between the AC and DC sides. Such a feature is beneficial for system decomposition because it reduces the matrix size, although it may also cause numerical issues during faults.
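The average value model relations (16)-(19) translate directly into a few functions. The sketch below uses hypothetical values in the usage notes; the function names are illustrative.

```python
def avm_arm_voltages(m, v_dc):
    """Upper and lower arm voltage sources of one phase, (16), (17)."""
    v_p = (1.0 - m) / 2.0 * v_dc
    v_n = (1.0 + m) / 2.0 * v_dc
    return v_p, v_n


def avm_dc_capacitance(c_sm, n):
    """Equivalent DC side capacitance of (18)."""
    return 6.0 * c_sm / n


def avm_dc_current(m_abc, i_abc):
    """DC side equivalent current source of (19)."""
    return 0.5 * sum(m * i for m, i in zip(m_abc, i_abc))
```

For any modulation signal, the two arm sources of a phase add up to $v_{dc}$, for example `avm_arm_voltages(0.5, 400.0)` gives 100 V and 300 V. In a time-stepped solver the modulation signals and AC currents fed to these functions would come from the previous time-step, which is the one time-step latency mentioned above.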

#### 4) TRANSMISSION LINE MODEL

A hybrid modeling methodology is also applied to the transmission lines by using both the universal line model (ULM) and the Bergeron line model (BLM) [24], [25]. The ULM fully considers the frequency-dependent characteristics of the transmission lines, but the time-domain convolution required by this model consumes relatively large computation and memory resources. The BLM, on the other hand, is simpler and less accurate, since it considers only the fundamental frequency. The ULM is applied to the lines in the study zone that connect the local elements of interest, while the BLM is used for short lines or remote locations in the AC system. The PI section model is avoided because of its inaccuracy and because it lacks the latency feature needed for decomposing the DC grid.

In this work, Converter Cb-A1 is chosen as the component of interest, and the detailed electrothermal model is applied to all the IGBT modules of this converter. The three nearby Converters Cb-B1, Cb-C2, and Cm-A1 utilize the equivalent circuit model. Each of the above four converters has 65 levels or 384 modules. Multiple IGBT modules are connected in parallel and in series (10 and 7 in this work, respectively) to provide sufficient voltage and current ratings for the converter. The transmission lines connecting these converters utilize the ULM. The rest of the converters and transmission lines use the AVM and BLM, respectively. A uniform modeling scheme is used for the power sources and transformers in the grid. In future work, hybrid modeling schemes could also be used for these elements, such as connecting a detailed wind power plant on the off-shore side.

# IV. DESIGN AND IMPLEMENTATION OF REAL-TIME MPSOC-FPGA BASED DC GRID EMULATOR

# A. SYSTEM DECOMPOSITION METHOD

Besides the rapid development of IC technology, the feasibility of conducting real-time electromagnetic transient simulation for large-scale power systems relies on the appropriate decomposition or relaxation of the system, which divides the complete system into multiple smaller sub-systems. The large system matrix is thereby decomposed into multiple sub-matrices, which can be calculated in parallel. The decomposition scheme requires a delay property, which is obtained in this work by the following means:

- 1) using distributed line models, such as ULM and BLM, for the widely existing transmission lines in the DC grid;
- 2) using the delay property of the average value model of the converters.

When using distributed line models, the line must be sufficiently long that the traveling time is at least as long as the system simulation time-step, which is 20 $\mu$s in this work. The traveling time of some specific transmission lines shall also be longer than the communication latencies between the various hardware computing units. Such requirements can generally be met for a transmission network, but may be an issue for a smaller system, such as a microgrid; in that case, short distributed line models can be inserted for decomposition. As previously explained, the average value model of the MMC can also provide system decomposition, at the cost of accuracy and numerical stability. During the periodic calculation for each time-step, the EMT simulation includes three major steps: the electrical and control element calculation, the data exchange between sub-systems, and the matrix equation formation and solution, as shown in Fig. 7(a). Except for the exchange of the history values of the transmission line models and the voltage and current values of the MMC average value models, the calculations of the sub-systems are independent and can be parallelized.
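The traveling-time condition for decomposition can be checked with a short calculation. In the sketch below, only the 20 µs time-step comes from the text; the propagation velocity, the line lengths, and the function names are hypothetical values chosen for illustration.

```python
TIME_STEP = 20e-6  # simulation time-step in s, as stated in the text


def travel_time(length_km, velocity_km_per_s):
    """Wave traveling time of a line in s."""
    return length_km / velocity_km_per_s


def can_decouple(length_km, velocity_km_per_s,
                 dt=TIME_STEP, comm_latency=0.0):
    """True if the traveling time covers both the time-step and the
    communication latency between the hardware units."""
    tau = travel_time(length_km, velocity_km_per_s)
    return tau >= max(dt, comm_latency)


def min_length_km(velocity_km_per_s, dt=TIME_STEP):
    """Shortest line whose traveling time equals one time-step."""
    return velocity_km_per_s * dt


def history_delay_steps(length_km, velocity_km_per_s, dt=TIME_STEP):
    """Number of whole time-steps covered by the traveling time."""
    return int(travel_time(length_km, velocity_km_per_s) // dt)
```

With an assumed propagation velocity of 2.9e5 km/s, the shortest line that can decouple two sub-systems at a 20 µs time-step is about 5.8 km, which indicates why the requirement is easily met in a transmission network but may not be in a microgrid.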

# B. HARDWARE RESOURCE ALLOCATION AND TASK PARTITIONING

Multi-core CPUs are often used for real-time EMT simulation. Admittedly, the communication latencies among the CPU cores are small, but frequent exchange of large amounts of data between cores is a heavy burden and is seldom recommended. For this reason, the complete calculation of a sub-system often resides in a single thread or core, as illustrated in Fig. 7(a). Since a CPU core is a sequential calculation device, its compute capability limits the number of sub-systems that can be contained in a single core. When complex topologies and components exist, it is necessary to use multiple cores for the calculation of a sub-system to increase the parallelism and ensure real-time performance, which in effect increases the inter-core communication burden.

The utilization of FPGA resources can effectively eliminate the above dilemma, since various modules are connected directly by appropriate routing design without the complex mechanism for generalized communication. In effect, the module-level connections have no fundamental differences from the internal connections for various logic resources within an FPGA module. Instead of partitioning the system by sub-systems used in the conventional CPU-based simulator, this work partitions the system by different functions. The internal design of the modules uses pipelined structures to optimize the resource utilization. One or multiple module instances are applied to different components based on the timing requirements. Such processes are automatically done by HLS tools with the input of C code and directives enforcing specific resource and latency requirements. Generally, resource and latency optimization cannot be achieved simultaneously, which may require multiple design iterations to achieve the balance for a specific design.

The resources on one FPGA or MPSoC board may not be sufficient for the emulator of a large and detailed system. Multiple boards can be interconnected to provide enormous computation capability. The board-level partition scheme is similar to the sub-system based partition, due to the relatively high communication cost: a sub-system group containing a large number of sub-systems is assigned to a board. In order to present the versatility of the proposed CIGRÉ DC grid emulator, which can conduct control device HIL tests, power equipment HIL tests, and multi-board expansion, the MMC control of all 11 converters and the calculation of the two AC buses Ba-A0 and Ba-B0, along with the corresponding equivalent power source and solver modules, reside on the MPSoC board. Other allocations are possible; for example, the MPSoC can contain only the MMC control algorithm, which is the typical configuration for a controller HIL test.

## C. DESIGN AND IMPLEMENTATION

Fig. 7(b) shows the hardware implementation of the CIGRÉ DC grid emulator on the MPSoC-FPGA platform. The resource consumption and the latency of major components or functions are presented. Fig. 8 shows the flowchart of the execution process of the DC grid emulator. When the emulator starts, both the FPGA and the MPSoC begin the initialization of the hardware. The coefficients and variables for various components are initialized and loaded to the corresponding modules. Before the periodic process begins, both devices are synchronized by the communication module.

On the FPGA, all the electrical element calculation modules and the MMC and DC-DC converter model calculation modules begin simultaneously. The MMC arm interface, which is a voltage source in series with an arm inductor and is shared by the MMC calculation modules using different modeling schemes, is then computed. These modules first update the history current values and then calculate the equivalent current sources in the matrix equation. If a test condition or a potential protection scheme is triggered in a certain time-step, some elements in the conductance matrix require modification. In this work, when the fault happens at Bus Bb-A1, the conductance connected to the nodes of the DC bus changes from a very small value to a very large value, representing the short-circuit scenario. When the values of the matrix equations of all components are updated, the solver modules run simultaneously. The communication modules exchange the data between the FPGA board and the MPSoC board through the Aurora core, which handles the data conversion for the serial transceivers. The transferred data includes the emulation control signals, MMC control I/O data, transmission line data, and chosen results for observation.
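The conductance change that represents the fault can be illustrated on a small nodal system. The Python sketch below is a minimal example under assumed values: the two-node circuit, the conductances, the current injection, and the helper names are hypothetical and are not the matrices of the emulator.

```python
def stamp_shunt(G, node, g_old, g_new):
    """Replace a node-to-ground conductance g_old by g_new in matrix G (in place)."""
    G[node][node] += g_new - g_old


def solve_2x2(G, i):
    """Solve G v = i for a 2x2 nodal system by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    v0 = (i[0] * G[1][1] - G[0][1] * i[1]) / det
    v1 = (G[0][0] * i[1] - i[0] * G[1][0]) / det
    return [v0, v1]


G_SRC = 1.0     # S, assumed source conductance at node 0
G_LINK = 1.0    # S, assumed branch between node 0 and node 1
G_OPEN = 1e-9   # S, pre-fault shunt at node 1 (very small value)
G_SHORT = 1e6   # S, faulted shunt at node 1 (very large value)

G = [[G_SRC + G_LINK, -G_LINK],
     [-G_LINK, G_LINK + G_OPEN]]
i_eq = [1.0, 0.0]  # A, assumed equivalent current injections

v_pre = solve_2x2(G, i_eq)            # node 1 sits near 1 V before the fault
stamp_shunt(G, 1, G_OPEN, G_SHORT)    # only one diagonal entry changes
v_fault = solve_2x2(G, i_eq)          # node 1 collapses towards 0 V
```

Only the diagonal entry of the faulted node is modified, so the update is local to the sub-matrix that contains the bus.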



FIGURE 7. (a) System decomposition and conventional processor-based partition scheme, (b) MPSoC-FPGA based partition and implementation details.

The MPSoC board follows the synchronization signal from the FPGA board and accomplishes the tasks of MMC control and the calculation of two AC buses. It is noted that the emulator and the controller processes are relatively independent of each other. For instance, the emulator portion can be disabled for a smaller circuit topology, in which case the MPSoC-FPGA platform acts as a typical controller HIL configuration. When the exact time of the next time-step is reached, the periodic calculation starts again on both the FPGA and the MPSoC until the end of the simulation.

The DC grid emulator contains 20 sub-matrices with sizes varying from  $2 \times 2$  to  $13 \times 13$ . Four types of solvers are implemented, each covering a different size range, to optimize the resource and latency performance as shown in Fig. 7(b). For example, a  $9 \times 9$  matrix equation can be solved by a  $13 \times 13$  solver, while a smaller matrix equation such as  $6 \times 6$  can use a smaller matrix solver.
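The mapping of a matrix equation onto a fixed-size solver can be sketched as follows. This Python example rests on assumptions: only the $13 \times 13$ capacity is stated in the text, the other three capacities are hypothetical, and padding with an identity block is one possible way to run a smaller system on a larger solver, not necessarily the one used in the hardware.

```python
SOLVER_SIZES = (4, 6, 8, 13)  # hypothetical capacities; only 13 comes from the text


def pick_solver(n, sizes=SOLVER_SIZES):
    """Return the smallest solver capacity that fits an n x n system."""
    for size in sorted(sizes):
        if n <= size:
            return size
    raise ValueError(f"no solver fits a {n} x {n} system")


def pad_system(G, b, size):
    """Embed G x = b into a size x size system using identity padding."""
    n = len(G)
    Gp = [[0.0] * size for _ in range(size)]
    bp = [0.0] * size
    for r in range(n):
        Gp[r][:n] = G[r]   # copy the original row into the top-left block
        bp[r] = b[r]
    for k in range(n, size):
        Gp[k][k] = 1.0     # padded unknowns solve to zero
    return Gp, bp
```

With these assumed capacities, `pick_solver(9)` selects the 13-wide solver and `pick_solver(6)` selects the 6-wide one, which matches the example in the text. The padded unknowns do not couple to the original ones, so the first `n` entries of the solution are unchanged.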

The implementation of the MMC controller takes advantage of the MPSoC device. The relatively complex and sequential calculations of the system-level control of the 11 MMC converters are completed in the four APU cores. The system-level control schemes and parameters can also be conveniently modified in the APU cores for HIL tests. The valve-level control required by Converters Cb-A1, Cb-B1, Cb-C2, and Cm-A1 includes the sorting of all the SM capacitor voltages in an arm and the gate signal generation of all IGBT modules; being computationally demanding, it is assigned to the PL for acceleration.
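The sorting step of the valve-level control can be sketched in software. The Python example below uses the commonly applied sorting-based capacitor voltage balancing rule; this section does not state the exact rule used in the emulator, so the rule, the current sign convention, and the function names are assumptions.

```python
def select_inserted_sms(cap_voltages, n_insert, arm_current):
    """Return the sorted indices of the SMs to insert in one arm.

    Assumed convention: arm_current >= 0 charges the inserted capacitors,
    so the lowest-voltage SMs are inserted; otherwise the highest-voltage
    SMs are inserted.
    """
    order = sorted(range(len(cap_voltages)),
                   key=lambda k: cap_voltages[k],
                   reverse=arm_current < 0)
    return sorted(order[:n_insert])


def gate_signals(n_sm, inserted):
    """Return one flag per SM: True means the SM is inserted."""
    chosen = set(inserted)
    return [k in chosen for k in range(n_sm)]
```

For an arm with many SMs the sort dominates the cost of this routine, which is consistent with assigning the valve-level control to the PL.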

There are various design flexibilities regarding the latency and resource consumption, when using the same devices or other devices, such as Xilinx XCVU13P, which



FIGURE 8. Flowchart of the MPSoC-FPGA based DC grid emulator

has 46% more logic blocks and 80% more DSP slices than the Xilinx XCVU9P used in this work. In this work, the calculation modules for converters on the FPGA board wait for the MMC control signals from the MPSoC board within the same time-step. If a delay of the MMC control signals is acceptable, the converter calculation modules can instead use the control signals of the previous time-step, and the simulation time-step can be further reduced. In terms of resource consumption, the number of SMs using the electrothermal model within an MMC model, the number of MMC converters using the electrothermal model or the equivalent circuit model, and the number of MMC levels can be adjusted based on the available resources and emulation requirements.

## D. MPSOC-FPGA HYBRID HARDWARE CONFIGURATION

Fig. 9 shows the hybrid platform configuration of the emulator. The VCU118 FPGA board is connected to the ZCU102 board through a QSFP to  $4 \times$ SFP cable. The programmable logic of both boards runs at 100 MHz. The USB-JTAG cables of the two boards are connected to the host computer for downloading the configuration files. A digital-to-analog converter (DAC) board connects the FPGA mezzanine card (FMC) connector of the VCU118 board to the oscilloscope to capture the real-time results from the emulator.



FIGURE 9. Hybrid platform configuration of the DC grid emulator.

# V. REAL-TIME EMULATION RESULTS, VALIDATION, AND DISCUSSION

Various studies and tests along with substantial results can be accomplished with the proposed real-time CIGRÉ DC grid emulator. The emulator uses $20\,\mu$s as the time-step for the system-level calculation and 10 ns as the time-step for the device-level transient waveform update. This section presents some of the system-level and device-level real-time results under both normal operation and DC fault transients, validated with the commercial software PSCAD/EMTDC and SaberRD, respectively. The complete CIGRÉ DC grid is modeled in PSCAD/EMTDC with the same hybrid modeling scheme as the real-time emulator, except that the electrothermal model is not used. A 3 s run took 63 s in PSCAD/EMTDC on a computer with an Intel Xeon E5-2609 CPU at 2.4 GHz and the Windows 7 operating system. However, the complete DC grid could not be implemented in SaberRD due to numerical stability and convergence issues for the complex DC grid system; instead, the external system-level waveforms are imported into the precisely modeled sub-module for device-level results verification.
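The timing figures quoted above can be related by simple arithmetic. The Python sketch below uses only the numbers stated in the text; the variable names are chosen here for illustration.

```python
SYSTEM_DT = 20e-6   # s, system-level time-step
DEVICE_DT = 10e-9   # s, device-level waveform update step
PSCAD_WALL = 63.0   # s, reported wall-clock time of the offline run
SIM_SPAN = 3.0      # s, simulated interval of that run

# Device-level waveform updates that fit in one system-level time-step
device_updates_per_step = round(SYSTEM_DT / DEVICE_DT)

# System-level time-steps the emulator executes to cover the same 3 s interval
steps_in_run = round(SIM_SPAN / SYSTEM_DT)

# Ratio of offline wall-clock time to simulated time
offline_slowdown = PSCAD_WALL / SIM_SPAN
```

The offline run is therefore 21 times slower than real time, while the emulator must complete each of its 150,000 system-level steps within $20\,\mu$s, with 2000 device-level updates per step.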

## A. STEADY-STATE OPERATION RESULTS

Fig. 10 shows the system-level steady-state operation results including the AC voltages of Converters Cb-A1 and Cm-C1, and the SM capacitor voltages of the first upper arm SM of Converter Cb-A1, which are compared with PSCAD/EMTDC. Since the MMCs modeled in this work have 65 levels, the AC voltage of the Converter Cb-A1 is close to sinusoidal waveforms with few harmonics. Such harmonics can also be observed for Converter Cm-C1 modeled with average value model since the other converter Cm-A1



FIGURE 10. System-level steady-state results: (a), (b) AC voltages of Converter Cb-A1, (c), (d) AC voltages of Converter Cm-C1, and (e), (f) the first upper arm SM capacitor voltages of Converter Cb-A1. ((a), (c), (e) are the real-time results; (b), (d), (f) are the offline PSCAD/EMTDC results.)

uses the equivalent circuit model and can affect the waveforms of Converter Cm-C1, which is in the same DC system. The SM capacitor voltages of the real-time results and the PSCAD/EMTDC results have some differences, due to the different modeling schemes. The IGBT model used in the PSCAD/EMTDC simulation is the two-state resistor model, with a small resistance representing the on-state and a large resistance representing the off-state, whereas the electrothermal model used for Converter Cb-A1 is based on the detailed and accurate temperature-dependent nonlinear output characteristics of the IGBT module obtained from the manufacturer's datasheet for the FZ400R33KL2C\_B5 IGBT module [23].

The device-level results of IGBTs and diodes are presented in Fig. 11 with the steady-state initial condition. SaberRD is utilized for the verification, using a dynamic thermal IGBT transistor model and a power diode model. The final values and dynamic changing patterns of the junction temperatures for  $S_1$ ,  $S_2$ ,  $D_1$ , and  $D_2$  are consistent between the real-time results and the SaberRD results shown in Fig. 11(a)–(d). The linearized voltage and current waveforms can give a good estimation for the switching



FIGURE 11. Device-level steady-state results of the first SM in Converter Cb-A1 upper arm Phase A: (a), (b) junction temperatures of IGBTs S<sub>1</sub> and S<sub>2</sub>, (c), (d) junction temperatures of Diodes D<sub>1</sub> and D<sub>2</sub>, (e), (f) voltage  $v_{ce}$  and current  $i_c$  of S<sub>1</sub> during switching-on transient, (g), (h) voltage  $v_{ce}$  and current  $i_c$ of S<sub>1</sub> during switching-off transient, ((a), (c), (e), (g) are the real-time results; (b), (d), (f), (h) are the offline SaberRD results.)

transients with a small amount of additional computation effort, as shown in Fig. 11(e)–(h). The overshoot current shown in Fig. 11(e) and (f) is caused by the reverse recovery phenomenon during the diode turn-off process.

## B. RESULTS DURING POWER FLOW COMMAND CHANGE

Fig. 12 shows the results during power flow command change. The power generation from Converter Cb-C2 changes from 600 MW to 0 MW between 3.5 s and 4.5 s of the simulation run, representing the scenario of cutting off an off-shore



FIGURE 12. Results during power flow command change: (a) (b) active power of Converter Cb-C2, (c) power loss of IGBT  $S_1$  from the first SM in Converter Cb-A1 upper arm Phase A, and (d) zoomed power loss. ((a), (c), (d) are the real-time results; (b) is the offline PSCAD/EMTDC result.)

power plant. Fig. 12(a) and (b) show the active power tracking performance of Converter Cb-C2. Since Converter Cb-A1 is responsible for DC voltage regulation, its power flow increases in response to the power flow change of Converter Cb-C2, which influences the corresponding power losses of the IGBT as shown in Fig. 12(c) and (d).
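The power command of Converter Cb-C2 in this test can be written as a function of time. The Python sketch below assumes a linear ramp between the two stated instants; the text gives only the end values and the interval, so the ramp shape and the function name are assumptions.

```python
def power_reference_mw(t, p_start=600.0, p_end=0.0, t_start=3.5, t_end=4.5):
    """Active power command in MW, ramped linearly between t_start and t_end (s)."""
    if t <= t_start:
        return p_start
    if t >= t_end:
        return p_end
    frac = (t - t_start) / (t_end - t_start)
    return p_start + frac * (p_end - p_start)
```

Under this assumption the command passes through 300 MW at 4.0 s, and the same function can be sampled every $20\,\mu$s to drive a controller under test.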

## C. RESULTS DURING DC FAULT

A DC ground fault of both poles is applied to Bus Bb-A1 at 1.0 s of the simulation run with the steady-state initial condition. The DC voltages of Converters Cb-A1, Cm-B3, and Cm-C1, which are located in DC System 3, DC System 2, and DC System 1, respectively, are shown in Fig. 13(a)–(f). The real-time results agree well with PSCAD/EMTDC. Although Converter Cm-B3 is far away from the fault location, its DC voltage is seriously affected by the fault, because DC System 3 and DC System 2 are directly connected by the DC-DC converter without galvanic isolation. On the other hand, the fault effect is small for DC System 1 due to the galvanic isolation and the support from the relatively strong AC system. Fig. 13(g)–(j) presents the active powers of Converters Cb-D1 and Cm-F1 during the DC fault. Fig. 13(k) and (l) show the junction temperatures of  $S_1$  and  $D_2$ , through which the fault current flows. The characteristics and parameters at high temperatures and high currents are estimated from the normal operation data in the datasheet and the IGBT and diode parameter fitting tools provided by SaberRD. It is observed that the temperatures increase significantly within a few milliseconds. The times for



FIGURE 13. Results during DC fault: (a), (b) DC voltage of Converter Cb-A1, (c), (d) DC voltage of Converter Cm-B3, (e), (f) DC voltage of Converter Cm-C1, (g), (h) active power of Converter Cb-D1, (i), (j) active power of Converter Cm-F1, and (k), (l) junction temperatures of  $S_1$  and  $D_2$ . ((a), (c), (e), (g), (i), (k) are the real-time results; (b), (d), (f), (h), (j) are the offline PSCAD/EMTDC results; (l) is the offline SaberRD result.)

the devices to reach 150 °C from the beginning of the fault are measured and compared with SaberRD. When temperatures exceed 150 °C, the devices are no longer safe for operation. It is noted that such results may not be accurate at extremely high temperatures, where in practice the devices could be damaged and their characteristics fundamentally changed. The electrothermal model can still give a rough estimation of the junction temperatures during faults to evaluate the potential damage to the devices and to verify the control and protection algorithms for the devices.
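The time for a device to reach the 150 °C limit can be extracted from a sampled junction temperature trace. The Python sketch below is one possible post-processing routine, assuming uniformly or non-uniformly sampled data and linear interpolation between samples; the function name and interface are hypothetical.

```python
def time_to_threshold(times, temps, threshold, t_fault):
    """Return the elapsed time from t_fault to the first upward crossing.

    Only sample intervals that start at or after t_fault are examined.
    Linear interpolation is used inside the crossing interval.
    Returns None if the trace never reaches the threshold.
    """
    for k in range(1, len(times)):
        if times[k - 1] < t_fault:
            continue
        lo, hi = temps[k - 1], temps[k]
        if lo < threshold <= hi:
            frac = (threshold - lo) / (hi - lo)
            t_cross = times[k - 1] + frac * (times[k] - times[k - 1])
            return t_cross - t_fault
    return None
```

Applying the same routine to the real-time trace and to the SaberRD trace gives two crossing times that can be compared directly, with `threshold=150.0` and `t_fault=1.0` for the fault case described above.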

# VI. CONCLUSIONS

This work presents the design and implementation of a hybrid real-time DC grid emulator on an MPSoC-FPGA platform. A flexible hybrid modeling scheme and an extensible, detailed hardware implementation of the real-time EMT emulation system of the complete CIGRÉ DC grid test system are presented. PSCAD/EMTDC and SaberRD were used to validate the system-level and device-level hardware emulation results, respectively. The improvement of the modeling schemes and implementation techniques can enlarge the functionality and the scope of the EMT study and provide comprehensive analyses of the AC-DC grid. The proposed real-time emulator can benefit the control and protection study of the DC grid. For the transient study, the electrothermal model can provide the thermal data to evaluate the converter efficiency during normal operation and to determine the device status during transients from the junction temperatures. The usage of the electrothermal model can also increase the overall simulation accuracy. Comprehensive tests of the supervisory system-level control and the detailed valve-level control can be accomplished using the proposed simulation techniques. The effect of the local controller on the complete DC grid can be analyzed in HIL tests. Future work will include further development of the emulation system by including different MMC topologies and by combining the DC grid with a large-scale AC system modeled in detail and off-shore renewable energy power plants, using more interconnected computing devices.

#### REFERENCES

- [1] M. D. O. Faruque *et al.*, "Real-time simulation technologies for power systems design, testing, and analysis," *IEEE Power Energy Technol. Syst. J.*, vol. 2, no. 2, pp. 63–73, Jun. 2015.
- [2] X. Guillaud *et al.*, "Applications of real-time simulation technologies in power and energy systems," *IEEE Power Energy Technol. Syst. J.*, vol. 2, no. 3, pp. 103–115, Sep. 2015.
- [3] Y. Chen and V. Dinavahi, "Hardware emulation building blocks for realtime simulation of large-scale power grids," *IEEE Trans. Ind. Informat.*, vol. 10, no. 1, pp. 373–381, Feb. 2014.
- [4] T. K. Vrana, Y. Yang, D. Jovcic, S. Dennetiére, J. Jardini, and H. Saad, "The CIGRÉ B4 DC grid test system," Cigré, Tech. Rep., 2014. [Online]. Available: http://b4.cigre.org/Publications/Documents-relatedto-the-development-of-HVDC-Grids.
- [5] D. Van Hertem and M. Ghandhari, "Multi-terminal VSC HVDC for the European supergrid: Obstacles," *Renew. Sustain. Energy Rev.*, vol. 14, no. 9, pp. 3156–3163, Dec. 2010.
- [6] Z. Shen and V. Dinavahi, "Comprehensive electromagnetic transient simulation of AC/DC grid with multiple converter topologies and hybrid modeling schemes," *IEEE Power Energy Technol. Syst. J.*, vol. 4, no. 3, pp. 40–50, Sep. 2017.
- [7] A. Lesnicar and R. Marquardt, "An innovative modular multilevel converter topology suitable for a wide power range," in *Proc. IEEE Bologna Power Tech Conf.*, vol. 3, Jun. 2003, p. 6.
- [8] M. A. Perez, S. Bernet, J. Rodriguez, S. Kouro, and R. Lizana, "Circuit topologies, modeling, control schemes, and applications of modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 30, no. 1, pp. 4–17, Jan. 2015.
- [9] Z. Shen and V. Dinavahi, "Real-time device-level transient electrothermal model for modular multilevel converter on FPGA," *IEEE Trans. Power Electron.*, vol. 31, no. 9, pp. 6155–6168, Sep. 2016.
- [10] X. Zhou, G. He, and X. Zhou, "FPGA design and implementation for realtime electromagnetic transient simulation system," in *Proc. IEEE 17th Int. Conf. High Perform. Comput. Commun., IEEE 7th Int. Symp. Cyberspace Saf. Secur., IEEE 12th Int. Conf. Embedded Softw. Syst.*, New York, NY, USA, Aug. 2015, pp. 848–851.
- [11] VCU118 Evaluation Board User Guide, UG1224 (V1.2), Xilinx, Inc., San Jose, CA, USA, Nov. 2017.
- [12] ZCU102 Evaluation Board User Guide, UG1182 (V1.3), Xilinx, Inc., San Jose, CA, USA, Aug. 2017.

- [13] Vivado Design Suite User Guide, High-Level Synthesis, UG902 (V2017.4), Xilinx, Inc., San Jose, CA, USA, 2017.
- [14] U. N. Gnanarathna, A. M. Gole, and R. P. Jayasinghe, "Efficient modeling of modular multilevel HVDC converters (MMC) on electromagnetic transient simulation programs," *IEEE Trans. Power Del.*, vol. 26, no. 1, pp. 316–324, Jan. 2011.
- [15] T. Maguire, B. Warkentin, Y. Chen, and J. Hasler, "Efficient techniques for real time simulation of MMC systems," in *Proc. Int. Conf. Power Syst. Transients (IPST)*, Vancouver, BC, Canada, Jul. 2013.
- [16] K. Ou *et al.*, "MMC-HVDC simulation and testing based on real-time digital simulator and physical control system," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 2, no. 4, pp. 1109–1116, Dec. 2014.
- [17] S. Elimban, Y. Zhang, and J. C. G. Alonso, "Real time simulation for HVDC grids with modular multi-level converters," in *Proc. 11th IET Int. Conf. AC DC Power Trans.*, Birmingham, U.K., 2015, pp. 1–8.
- [18] Applications of FPGA and MPSoC. Accessed: 2018. [Online]. Available: https://www.xilinx.com/applications.html
- [19] J. J. Rodríguez-Andina, M. D. Valdés-Peña, and M. J. Moure, "Advanced features and industrial applications of FPGAs—A review," *IEEE Trans. Ind. Informat.*, vol. 11, no. 4, pp. 853–864, Aug. 2015.
- [20] Aurora 64B/66B LogiCORE IP Product Guide, PG074 (V11.2), Xilinx, Inc., San Jose, CA, USA, 2017.
- [21] M. Saeedifard and R. Iravani, "Dynamic performance of a modular multilevel back-to-back HVDC system," *IEEE Trans. Power Del.*, vol. 25, no. 4, pp. 2903–2912, Oct. 2010.
- [22] B. Wu, High-Power Converters and AC Drives. Hoboken, NJ, USA: Wiley, 2006.
- [23] Technical Information, Infineon IGBT Modules FZ400R33KL2C\_B5. Accessed: Jan. 15, 2018. [Online]. Available: https://www.infineon.com
- [24] A. Morched, B. Gustavsen, and M. Tartibi, "A universal model for accurate calculation of electromagnetic transients on overhead lines and underground cables," *IEEE Trans. Power Del.*, vol. 14, no. 3, pp. 1032–1038, Jul. 1999.
- [25] H. W. Dommel, *EMTP Theory Book*. Portland, OR, USA: Bonneville Power Administration, 1984.



**ZHUOXUAN SHEN** (S'14) received the B.Eng. degree in electrical engineering from Jiangsu University, Zhenjiang, Jiangsu, China, in 2013. He is currently pursuing the Ph.D. degree in electrical and computer engineering with the University of Alberta, Edmonton, AB, Canada. His research interests include real-time simulation of power systems, power electronics, and field-programmable gate arrays.







**VENKATA DINAVAHI** (S'94–M'00–SM'08) received the B.Eng. degree in electrical engineering from the National Institute of Technology, Nagpur, India, in 1993, the M.Tech. degree in electrical engineering from IIT Kanpur, India, in 1996, and the Ph.D. degree in electrical and computer engineering from the University of Toronto, ON, Canada, in 2000. He is currently a Professor with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada. His research interests include real-time simulation of power systems and power electronic systems, electromagnetic transients, device-level modeling, large-scale systems, and parallel and distributed computing.