### Lecture 6 Transaction Level Modeling in SystemC

Multimedia Architecture and Processing Laboratory

多媒體架構與處理實驗室

Prof. Wen-Hsiao Peng (彭文孝)

pawn@mail.si2lab.org

2007 Spring Term

### Acknowledgements

This lecture note is partly contributed by Prof. Gwo Giun Lee (李國君) of the Dept. of EE, National Cheng-Kung University, and his team members 王明俊 and 林和源 of the research laboratory 多媒體系統晶片實驗室 (Multimedia SoC Laboratory)


E-mail: clee@mail.ncku.edu.tw

Tel: +886-6-275-7575 ext. 62448

Web: http://140.116.216.53



### Transaction Level Modeling

- A high-level approach to modeling digital systems
  - Care more about what data are transferred to and from which locations
  - Care less about the actual protocol used for the data transfer
- Features
  - Details of communication are separated from details of computation
  - Communication mechanisms are modeled as channels
  - Low-level details of information exchanged are hidden in channels
  - Pin-level details at the structural boundary are abstracted into interfaces
  - Transaction requests take place by calling interface functions
  - Synchronization details of channels are typically abstracted into blocking and/or non-blocking I/O
- Advantages
  - Enable high simulation speed by hiding uninteresting details
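The separation of computation from communication can be sketched in plain C++, without the SystemC library. The names `mem_transfer_if`, `memory_channel`, and `producer_consumer_roundtrip` are hypothetical and chosen only for illustration; a real SystemC model would derive the interface from `sc_interface` and reach the channel through a port.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical interface: states what may be transferred, with no pins or protocol.
class mem_transfer_if {
public:
    virtual ~mem_transfer_if() = default;
    virtual void burst_write(const char* data, unsigned addr, unsigned length) = 0;
    virtual void burst_read(char* data, unsigned addr, unsigned length) = 0;
};

// Hypothetical channel: owns the storage and hides how a transfer is carried out.
// No bounds checking, for brevity.
class memory_channel : public mem_transfer_if {
public:
    explicit memory_channel(std::size_t size) : mem_(size, 0) {}
    void burst_write(const char* data, unsigned addr, unsigned length) override {
        std::memcpy(mem_.data() + addr, data, length);
    }
    void burst_read(char* data, unsigned addr, unsigned length) override {
        std::memcpy(data, mem_.data() + addr, length);
    }
private:
    std::vector<char> mem_;
};

// Computation side: sees only the interface, never the channel internals.
inline bool producer_consumer_roundtrip(mem_transfer_if& bus) {
    const char message[] = "TLM";
    char readback[sizeof(message)] = {};
    bus.burst_write(message, 16, sizeof(message));   // transaction = function call
    bus.burst_read(readback, 16, sizeof(message));
    return std::memcmp(message, readback, sizeof(message)) == 0;
}
```

Because `producer_consumer_roundtrip` depends only on the interface, the channel could later be swapped for a more detailed model without touching the computation code.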





### Implementation of The Very Simple Bus

```cpp
#include <cstring>
#include <systemc.h>

class very_simple_bus_if : virtual public sc_interface
{
public:
  virtual void burst_write(char *data, unsigned addr, unsigned length) = 0;
  virtual void burst_read(char *data, unsigned addr, unsigned length) = 0;
};

// Cycle-count accurate bus model
class very_simple_bus : public very_simple_bus_if, public sc_channel
{
public:
  very_simple_bus(sc_module_name nm, unsigned mem_size, sc_time cycle_time)
    : sc_channel(nm), _cycle_time(cycle_time)
  {
    // we model bus memory access using an embedded memory array
    _mem = new char[mem_size];
    // set initial value of memory to zero
    memset(_mem, 0, mem_size);
  }

  ~very_simple_bus() { delete [] _mem; }

  virtual void burst_read(char *data, unsigned addr, unsigned length)
  {
    // model bus contention using mutex, but no arbitration rules
    _bus_mutex.lock();
    // block the caller for the length of the burst transaction
    wait(length * _cycle_time);
    // copy the data from memory to the requestor
    memcpy(data, _mem + addr, length);
    // unlock the mutex to allow others access to the bus
    _bus_mutex.unlock();
  }

  virtual void burst_write(char *data, unsigned addr, unsigned length)
  {
    _bus_mutex.lock();
    wait(length * _cycle_time);
    // copy the data from the requestor to memory
    memcpy(_mem + addr, data, length);
    _bus_mutex.unlock();
  }

protected:
  char* _mem;
  sc_time _cycle_time;
  sc_mutex _bus_mutex;
};
```







### Simple Bus Design (c. 1)



### Structure of the Simple Bus Design

- Masters (CPUs, DSPs, arithmetic-intensive ASICs)
  - Initiate transactions on the bus
- Bus
  - Allow the masters and slaves to communicate using bus transactions
- Slaves (ROMs, RAMs, I/O devices, hardware accelerators)
  - Respond to the bus requests
- Arbiter
  - Arbitrate which master can issue the transaction via the bus
  - Select a request to execute from the competing bus requests
  - When a master is granted access to the bus, the requests from the other masters are queued by the bus and executed in later cycles
- Clock generator
  - Provide the system clock that can synchronize the blocks

### Structure of the Simple Bus Design (c. 1)

- Master 1: Blocking Master
  - Use blocking master interface
  - Model high-level software that initiates transactions as it executes
- Master 2: Non-Blocking Master
  - Use non-blocking master interface
  - Model a detailed processor (instruction-set simulator, ISS) that must execute on every clock edge even when waiting for its bus transactions to complete
- Master 3: Direct Master
  - Use direct master interface
  - Print debug information about the contents of the memories
  - Does not represent a block that will exist in the real world

### Structure of the Simple Bus Design (c. 2)

- Slave 1 (Fast Memory)
  - Implement slave interface
  - Model a random access memory that supports single-cycle read/write operations with no wait states and has no clock port
  - React immediately to the bus request and set the status
- Slave 2 (Slow Memory)
  - Implement slave interface
  - Model a random access memory that takes several cycles to complete a read/write operation and has a clock port
- Wait states
  - Additional cycles that a slave takes to complete an operation
  - All other activity on the bus waits until the operation completes

### Features of the Simple Bus Design

High performance, cycle-accurate, platform transaction-level model

- Cycle-accurate transaction level modeling
  - Model is done at transaction level
  - Model is based on cycle-based synchronization
- Cycle-based synchronization
  - Model the data movement on a clock by clock basis
  - Sub-cycle events are of no interest
- Transaction-based modeling
  - Communication between components is described as function calls
  - Sequences of events on a group of wires are denoted by a set of function calls in an abstract interface
- Two-phase synchronization
  - Modules attached to the bus execute on the rising clock edge
  - The bus executes on a falling clock edge

### Features of the Simple Bus Design (c. 1)

Easy to add different kinds and numbers of masters or slaves

- Masters connect to the bus using just one port connection
- Slaves connect to the bus using SystemC multi-port feature
- Easy to change the arbitration policy by replacing the arbiter
  - Arbiter is a separate module from the bus

### Ideas behind the Simple Bus Model

#### Modeling efforts

- Relatively easy to develop, understand, use, and extend
- Capable of being constructed very early in the system design
- Enable designers to explore implementation alternatives
- Make design trade-offs before it is too late or too expensive to do so


#### Accuracy

- Being fully cycle-accurate
- Being able to accurately simulate with both the SW and HW components
- Fast and accurate enough to validate SW before more detailed HW models or implementations are available

#### Speed

- Capable of simulating at a speed of more than 0.1 MHz
- Fast enough to allow meaningful amounts of SW to be executed along with HW models



### Master Interface

Describe the communication between the master and the bus

- Master interface is used by masters and implemented in the bus
- 3 sets of master interface functions
  - Blocking master interface
  - Non-blocking master interface
  - Direct master interface
- Multiple masters can be connected to a bus
  - Each master is independent of the others
    - Each master can issue a bus request at any time
  - Each master is identified by a unique priority number
    - The lower the priority number, the more important the master
    - Each master interface function uses this priority to set the importance of the call
  - A master can reserve the bus for a subsequent access
    - The bus can be locked for the same master in consecutive cycles



### Return Values of Master Interface Methods

- SIMPLE\_BUS\_REQUEST
  - The request has been issued and placed in the queue
  - This is always the status right after a request is issued
  - The status changes only when the bus processes the request
- SIMPLE\_BUS\_WAIT
  - The request is being served but has not completed yet
- SIMPLE\_BUS\_OK
  - The request completed without errors
- SIMPLE\_BUS\_ERROR
  - The request finished, but the transfer did not complete successfully
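The life cycle of a master request can be written as a small transition function. This is a sketch in plain C++: the enumerator order and the helper `next_master_status` are assumptions made for illustration, not code from the simple bus source.

```cpp
// Status values named in the notes; the numeric order is an assumption.
enum simple_bus_status {
    SIMPLE_BUS_OK,
    SIMPLE_BUS_REQUEST,
    SIMPLE_BUS_WAIT,
    SIMPLE_BUS_ERROR
};

// Hypothetical helper: status seen by the master after one bus cycle.
//   current      - status before the bus runs
//   selected     - whether the bus picked this request in this cycle
//   slave_result - what the slave returned (WAIT, OK or ERROR) if it was picked
inline simple_bus_status next_master_status(simple_bus_status current,
                                            bool selected,
                                            simple_bus_status slave_result) {
    // Finished requests keep their final status.
    if (current == SIMPLE_BUS_OK || current == SIMPLE_BUS_ERROR) return current;
    // A request that was not processed this cycle keeps its status.
    if (!selected) return current;
    // Once the bus processes the request, the status mirrors the slave's answer.
    return slave_result;
}
```

A queued request therefore stays at SIMPLE\_BUS\_REQUEST for as long as it loses arbitration, and moves to SIMPLE\_BUS\_WAIT, SIMPLE\_BUS\_OK, or SIMPLE\_BUS\_ERROR only in a cycle where the bus actually serves it.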

### Non-Blocking Master Interface

These functions return immediately, but the read/write itself can take more than one cycle when competing requests exist

- Caller must check the status of the last request using get\_status()
- Used by ISS models, which cannot be suspended while they have outstanding bus requests

```cpp
class simple_bus_non_blocking_if : public virtual sc_interface
{
public:
  // non-blocking BUS interface
  virtual void read(unsigned int unique_priority
                    , int *data
                    , unsigned int address
                    , bool lock = false) = 0;

  virtual void write(unsigned int unique_priority
                     , int *data
                     , unsigned int address
                     , bool lock = false) = 0;

  virtual simple_bus_status get_status(unsigned int unique_priority) = 0;

}; // end class simple_bus_non_blocking_if
```

### Non-Blocking Master Interface (c. 2)

A non-blocking request can be made if the status of the last request is either SIMPLE\_BUS\_OK or SIMPLE\_BUS\_ERROR

If a new request is issued before the current one has completed, an error message is produced and execution is aborted
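The rule above suggests a polling pattern on the master side. The sketch below is plain C++ under stated assumptions: `request_slot` and `try_issue` are hypothetical names, and a thrown `std::logic_error` stands in for the error message and abort described in the notes.

```cpp
#include <stdexcept>

enum simple_bus_status { SIMPLE_BUS_OK, SIMPLE_BUS_REQUEST, SIMPLE_BUS_WAIT, SIMPLE_BUS_ERROR };

// Hypothetical per-master bookkeeping kept by the bus.
class request_slot {
public:
    // Issue a new non-blocking request; legal only once the previous one has finished.
    void issue() {
        if (status_ != SIMPLE_BUS_OK && status_ != SIMPLE_BUS_ERROR)
            throw std::logic_error("new request issued while the previous one is pending");
        status_ = SIMPLE_BUS_REQUEST;
    }
    // Called by the bus model when it makes progress on the request.
    void set_status(simple_bus_status s) { status_ = s; }
    simple_bus_status get_status() const { return status_; }
private:
    simple_bus_status status_ = SIMPLE_BUS_OK;  // an idle slot behaves like a completed request
};

// Master-side pattern: run every cycle, issue only when the slot is free.
inline bool try_issue(request_slot& slot) {
    simple_bus_status s = slot.get_status();
    if (s == SIMPLE_BUS_OK || s == SIMPLE_BUS_ERROR) {
        slot.issue();
        return true;
    }
    return false;  // still pending: do other work this cycle and poll again later
}
```

An ISS-style master would call `try_issue` from its clocked process and carry on with other work whenever it returns `false`.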

### Direct Master Interface

- These functions provide instantaneous read/write
  - Simulated time will not advance and the scheduler will not intervene
  - Data accesses go through the bus for proper routing of the requests
  - Data transfer is done without using the bus protocol
- Used for creating simulation monitors
  - Enable debuggers running on top of ISS models to read/write to slaves without waiting for the simulation time to advance

```cpp
class simple_bus_direct_if : public virtual sc_interface
{
public:
  // direct BUS/Slave interface
  virtual bool direct_read(int *data, unsigned int address) = 0;
  virtual bool direct_write(int *data, unsigned int address) = 0;

}; // end class simple_bus_direct_if
```
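The difference between a timed access and a direct access can be illustrated without SystemC. In this sketch the struct `direct_bus_sketch` and its `cycles` counter are hypothetical: the counter stands in for simulated time, and a single word-addressed memory stands in for the slaves behind the bus.

```cpp
#include <vector>

struct direct_bus_sketch {
    std::vector<int> mem = std::vector<int>(16, 0);  // 16 words of 4 bytes each
    unsigned long cycles = 0;                        // hypothetical stand-in for simulated time

    // Timed access: costs one bus cycle in this sketch.
    bool read(int* data, unsigned address) {
        ++cycles;
        return direct_read(data, address);
    }
    // Direct access: same routing and storage, but simulated time is untouched.
    bool direct_read(int* data, unsigned address) const {
        unsigned index = address / 4;
        if (index >= mem.size()) return false;       // no slave at this address
        *data = mem[index];
        return true;
    }
    bool direct_write(int* data, unsigned address) {
        unsigned index = address / 4;
        if (index >= mem.size()) return false;
        mem[index] = *data;
        return true;
    }
};
```

A monitor or debugger can call `direct_read` as often as it likes without perturbing the timing the model reports.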



### Slave Interface

- Describe the communication between the bus and the slave
  - Slave interface is used by the bus and implemented by every slave
  - By definition, the slaves thus play the role of channels
- 2 sets of slave interface functions
  - Normal slave interface
    - Serve the default read/write to and from the slaves
  - Direct slave interface
    - Similar to direct master interface
- Multiple slaves can be connected to a bus
  - Two functions can be used to obtain the memory range of a slave
    - unsigned int start\_address() const;
    - unsigned int end\_address() const;
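The two address-range functions are what allow the bus to route a request to the right slave. A minimal plain C++ sketch of such a lookup follows; `slave_range_if`, `simple_range`, and `find_slave` are hypothetical names, and treating `end_address()` as inclusive is an assumption.

```cpp
#include <vector>

// Hypothetical interface holding only the address-range part of a slave.
class slave_range_if {
public:
    virtual ~slave_range_if() = default;
    virtual unsigned int start_address() const = 0;
    virtual unsigned int end_address() const = 0;
};

class simple_range : public slave_range_if {
public:
    simple_range(unsigned int start, unsigned int end) : start_(start), end_(end) {}
    unsigned int start_address() const override { return start_; }
    unsigned int end_address() const override { return end_; }
private:
    unsigned int start_;
    unsigned int end_;
};

// Return the slave whose [start, end] range contains the address, or nullptr if none does.
inline const slave_range_if* find_slave(const std::vector<const slave_range_if*>& slaves,
                                        unsigned int address) {
    for (const slave_range_if* s : slaves)
        if (s->start_address() <= address && address <= s->end_address())
            return s;
    return nullptr;
}
```

Returning `nullptr` for an unmapped address gives the bus a natural point at which to report SIMPLE\_BUS\_ERROR to the master.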

### Normal Slave Interface

The read/write function performs a single data transfer and returns immediately, and caller must check the return values

- Return values of slave interface methods
  - SIMPLE\_BUS\_WAIT: the slave issues a wait state
  - SIMPLE\_BUS\_OK: the transfer was successful
  - SIMPLE\_BUS\_ERROR: an error occurs during the transfer
- If the return status is SIMPLE\_BUS\_WAIT, caller must call the function again until the status becomes SIMPLE\_BUS\_OK



### Arbiter Interface

- Describe the communication between the bus and the arbiter
  - Arbiter interface is used by the bus and implemented in the arbiter
  - By definition, the arbiter thus plays the role of a channel
- Arbitrate competing requests issued by different masters
  - The bus passes its outstanding requests to an arbiter on each cycle
  - One of the requests is selected for execution based on the arbitration policy while the others are kept in the SIMPLE\_BUS\_REQUEST state

```cpp
class simple_bus_arbiter_if : public virtual sc_interface
{
public:
  // Outstanding requests are passed to the arbiter as a vector
  virtual simple_bus_request* arbitrate(const simple_bus_request_vec &requests) = 0;

}; // end class simple_bus_arbiter_if
```

### Master and Slave Request Status

- Master request status (read by the master)
  - SIMPLE\_BUS\_REQUEST
    - The request is issued and placed in the queue
  - SIMPLE\_BUS\_WAIT
    - The request is being served but not completed yet
  - SIMPLE\_BUS\_OK
    - The request is completed without errors
  - SIMPLE\_BUS\_ERROR
    - The request is finished but the transfer did not complete successfully
- Slave request status (read by the bus)
  - SIMPLE\_BUS\_WAIT
    - The slave issues a wait state
  - SIMPLE\_BUS\_OK
    - The transfer was successful
  - SIMPLE\_BUS\_ERROR
    - An error occurred during the transfer



### Overall Execution Scheme

- On the rising edge of the clock
  - Masters execute and may send requests to the bus
  - Bus maintains a set of outstanding requests, including unfinished ones from past cycles
- On the falling edge of the clock
  - Bus calls the arbiter to select a request for execution
  - Bus looks up the address of the request to determine the target slave
  - Bus invokes the read()/write() functions of the target slave
  - Functions return and indicate whether the slave issues wait states
    - Bus will reissue the request on the next cycle upon receiving wait states
  - Bus updates the status of the original master once the slave completes the request
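The falling-edge steps can be sketched as a small loop in plain C++. Everything named here (`write_request`, `slave_sketch`, `bus_sketch`) is hypothetical. The sketch also assumes a single slave, so no address lookup is shown, a slave that counts its wait states per call rather than per clock edge, and a "lowest priority number wins" policy when no request is in progress.

```cpp
#include <cstddef>
#include <vector>

enum simple_bus_status { SIMPLE_BUS_OK, SIMPLE_BUS_REQUEST, SIMPLE_BUS_WAIT, SIMPLE_BUS_ERROR };

struct write_request {
    unsigned priority;          // unique per master, lower number = more important
    unsigned address;           // byte address, 4-byte words
    int value;
    simple_bus_status status;
};

// Hypothetical slave that answers WAIT `wait_states` times before completing.
class slave_sketch {
public:
    explicit slave_sketch(int wait_states)
        : mem(8, 0), wait_states_(wait_states), remaining_(-1) {}
    simple_bus_status write(const write_request& r) {
        if (remaining_ < 0) remaining_ = wait_states_;         // a new transfer starts
        if (remaining_ > 0) { --remaining_; return SIMPLE_BUS_WAIT; }
        remaining_ = -1;                                       // done, slave is idle again
        mem[r.address / 4] = r.value;
        return SIMPLE_BUS_OK;
    }
    std::vector<int> mem;
private:
    int wait_states_;
    int remaining_;
};

struct bus_sketch {
    std::vector<write_request> queue;   // filled by masters on the rising edge
    int current = -1;                   // index of the request being served, -1 if none

    void falling_edge(slave_sketch& slave) {
        if (current < 0) {              // arbitrate only when no request is in progress
            for (std::size_t i = 0; i < queue.size(); ++i) {
                if (queue[i].status != SIMPLE_BUS_REQUEST) continue;
                if (current < 0 || queue[i].priority < queue[current].priority)
                    current = static_cast<int>(i);
            }
        }
        if (current < 0) return;        // nothing outstanding
        queue[current].status = slave.write(queue[current]);          // forward to the slave
        if (queue[current].status != SIMPLE_BUS_WAIT) current = -1;   // finished this cycle
    }
};
```

With two masters posting in the same cycle and a slave configured for one wait state, the more important request passes through SIMPLE\_BUS\_WAIT to SIMPLE\_BUS\_OK over two falling edges while the other remains at SIMPLE\_BUS\_REQUEST.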









### Two-Phase Synchronization

- Masters and slaves are active on the rising edge of the clock
- Bus and arbiters are active on the falling edge of the clock
- Two-phase synchronization
  - Communication between modules attached to the bus goes through the bus
  - Communication is delayed by a clock cycle
  - On the rising edge of the clock, no state changes of the bus are visible
  - On the falling edge of the clock, the bus arbitrates the competing requests
- Request-update mechanism (the analogous scheme in the SystemC kernel)
  - Communication between processes goes through the primitive channels
  - Communication is delayed by a delta cycle
  - In the evaluation phase, no state changes of primitive channels are visible
  - In the update phase, primitive channels resolve competing requests

### Two-Phase Synchronization (c. 1)

Triggering the bus using the clock falling edge is just a technique

- Actual implementation may not use the falling edge of the clock
- Designs with the two-phase synchronization and deterministic arbitration rules are deterministic
  - The order of process execution will not affect the execution results
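The determinism claim can be checked with a toy model. In the sketch below, which is plain C++ with a hypothetical function `served_order`, masters only post requests during the rising-edge phase, and the bus serves one request per falling edge using an assumed "lowest priority number first" rule. Priorities are taken to be unique.

```cpp
#include <algorithm>
#include <vector>

// master_execution_order: the (arbitrary) order in which the scheduler happens to
// run the masters on the rising edge; each entry is that master's unique priority.
inline std::vector<unsigned> served_order(const std::vector<unsigned>& master_execution_order) {
    std::vector<unsigned> pending;                  // outstanding requests, by priority
    for (unsigned prio : master_execution_order)    // rising edge: masters only post requests
        pending.push_back(prio);

    std::vector<unsigned> served;
    while (!pending.empty()) {                      // one falling edge per iteration
        auto best = std::min_element(pending.begin(), pending.end());
        served.push_back(*best);                    // deterministic arbitration rule
        pending.erase(best);
    }
    return served;
}
```

Whatever order the masters run in, the service order comes out the same, because the bus never acts on a request in the same phase in which it was posted.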















### Request Arbitration Rules

If the most recently executed request had its "lock" flag set when the master issued it to the bus

- If the request was a burst request and it is not yet completed, it is always selected
- If the master that issued the request is issuing another request in the current cycle, then that master's request is always selected
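These rules can be folded into one selection function. The sketch below is plain C++ with hypothetical types `bus_request` and `previous_grant`. It treats both locked cases as "the locked master's request, if present, wins", and it assumes that the fallback policy is "lowest priority number wins", which the notes on master priorities suggest but this slide does not state.

```cpp
#include <vector>

struct bus_request {
    unsigned priority;   // unique per master; lower number = more important
    bool lock;           // master asked to keep the bus for its next access
};

// Hypothetical record of what the bus executed last.
struct previous_grant {
    bool valid;          // false if nothing has been executed yet
    unsigned priority;   // master that was served
    bool lock;           // whether that request had its lock flag set
};

// Select one of the outstanding requests, or nullptr if there are none.
inline const bus_request* arbitrate(const std::vector<bus_request>& requests,
                                    const previous_grant& prev) {
    if (requests.empty()) return nullptr;

    // Locked case: an unfinished burst or a follow-up request from the same master
    // both appear here as a request carrying the previously served priority.
    if (prev.valid && prev.lock) {
        for (const bus_request& r : requests)
            if (r.priority == prev.priority) return &r;
    }

    // Assumed fallback: the most important master (lowest number) wins.
    const bus_request* best = &requests.front();
    for (const bus_request& r : requests)
        if (r.priority < best->priority) best = &r;
    return best;
}
```

Because the arbiter is a separate block behind its own interface, a different policy such as round-robin could replace this function without changing the bus.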





### Implementation of Blocking Interface









```cpp
// Slave 2: Slow Memory
class simple_bus_slow_mem
  : public simple_bus_slave_if
  , public sc_module
{
public:
  // ports
  sc_in_clk clock;

  SC_HAS_PROCESS(simple_bus_slow_mem);

  // constructor
  simple_bus_slow_mem(sc_module_name name_
                      , unsigned int start_address
                      , unsigned int end_address
                      , unsigned int nr_wait_states)
    : sc_module(name_)
    , m_start_address(start_address)
    , m_end_address(end_address)
    , m_nr_wait_states(nr_wait_states)
    , m_wait_count(-1)
  {
    // process declaration: a method process
    SC_METHOD(wait_loop);
    dont_initialize();
    sensitive << clock.pos();  // written as "sensitive_pos << clock" on the slide
    // (memory allocation is not shown on the slide)
  }

  // process
  void wait_loop();

  // direct Slave Interface
  // ... (declarations elided on the slide)

  // Slave Interface
  unsigned int start_address() const;
  unsigned int end_address() const;

private:
  int *MEM;
  unsigned int m_start_address;
  unsigned int m_end_address;
  unsigned int m_nr_wait_states;
  int m_wait_count;
}; // end class simple_bus_slow_mem

// The bus keeps invoking the same function until it returns SIMPLE_BUS_OK.
inline simple_bus_status simple_bus_slow_mem::read(int *data
                                                   , unsigned int address)
{
  // accept a new call if m_wait_count < 0
  if (m_wait_count < 0)
    {
      m_wait_count = m_nr_wait_states;  // initialize the wait-state counter
      return SIMPLE_BUS_WAIT;
    }
  if (m_wait_count == 0)
    {
      *data = MEM[(address - m_start_address)/4];
      return SIMPLE_BUS_OK;
    }
  return SIMPLE_BUS_WAIT;
}

// The work done in this frequently activated method process is kept minimal.
inline void simple_bus_slow_mem::wait_loop()
{
  // does nothing when there is no read/write request
  if (m_wait_count >= 0) m_wait_count--;
}

inline bool simple_bus_slow_mem::direct_read(int *data, unsigned int address)
{
  *data = MEM[(address - m_start_address)/4];
  return true;
}
```
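The wait-state counter can be exercised outside SystemC to see how many bus calls one read takes. This is a plain C++ sketch: `slow_mem_sketch` keeps only the counter logic and a single data word, `poke` and `falling_edges_until_ok` are hypothetical helpers, and calling `wait_loop()` by hand stands in for the rising clock edge that triggers the method process. It assumes at least one wait state, since with zero the counter logic shown above never reaches the completion branch.

```cpp
enum simple_bus_status { SIMPLE_BUS_OK, SIMPLE_BUS_REQUEST, SIMPLE_BUS_WAIT, SIMPLE_BUS_ERROR };

class slow_mem_sketch {
public:
    explicit slow_mem_sketch(int nr_wait_states)
        : m_nr_wait_states(nr_wait_states), m_wait_count(-1), m_word(0) {}

    // Same counter logic as the read() shown above, reduced to one word of storage.
    simple_bus_status read(int* data) {
        if (m_wait_count < 0) {                 // idle: accept the call, start counting
            m_wait_count = m_nr_wait_states;
            return SIMPLE_BUS_WAIT;
        }
        if (m_wait_count == 0) {                // wait states elapsed: deliver the data
            *data = m_word;
            return SIMPLE_BUS_OK;
        }
        return SIMPLE_BUS_WAIT;
    }
    // Stand-in for the clocked method process: call once per rising edge.
    void wait_loop() { if (m_wait_count >= 0) m_wait_count--; }
    void poke(int value) { m_word = value; }    // hypothetical test hook
private:
    int m_nr_wait_states;
    int m_wait_count;
    int m_word;
};

// Count how many times the bus has to call read() before it sees SIMPLE_BUS_OK.
inline int falling_edges_until_ok(slow_mem_sketch& mem, int* data) {
    int calls = 0;
    for (;;) {
        ++calls;                                            // falling edge: bus calls the slave
        if (mem.read(data) == SIMPLE_BUS_OK) return calls;
        mem.wait_loop();                                    // next rising edge: count down
    }
}
```

In this sketch a read takes `nr_wait_states + 1` calls: one to start the transfer and one per wait state, with the last call delivering the data.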



### High-Performance Modeling Techniques

- Simple modules are modeled without any processes at all
  - Example: fast\_mem and arbiter
- Blocks to be activated most frequently should use SC\_METHOD
  - SC\_METHOD consumes less memory and executes more quickly than SC\_THREAD
- Frequently activated processes should do as little work as possible
  - Example: in slow\_mem, there is a clocked SC\_METHOD that simply decrements a counter to indicate when the wait states have completed

### Comparisons between TLM and RTL

- RTL uses signals for communication; TLM employs transactions
  - Transactions are modeled by function calls
  - Both control and data are transferred along with function calls
    - There is no pin-accuracy
    - Data can be bundled and passed more efficiently
  - Pointers to data are transferred between modules by transaction
    - Enable one module to very efficiently copy blocks of data to another
    - Example: the burst\_read/burst\_write transactions
- RTL uses low-level bit vectors; TLM uses high-level C data types
- RTL uses static sensitivity; TLM uses dynamic sensitivity
  - RTL modules execute on every cycle even if no work is being done
  - TLM modules execute only when they have real work to perform
    - Processes are suspended until the bus requests complete

### Common Questions

- What is the distinction between modules and hierarchical channels?
  - In an informal way
    - *Hierarchical channels:* implement interface functions and contain no ports
    - *Modules:* do not implement interface functions and contain ports
  - In reality
    - Hierarchical channels and modules are the same thing
  - In the simple\_bus design
    - Blocks implementing transactions are designed to be channels that inherit from their transaction interface
    - Blocks that initiate transactions are designed to be modules with ports that allow them to access the channels
    - The bus implements several interface functions and also has ports to access the interfaces of the slaves and the arbiter

### Common Questions (c. 1)

- Why do slaves implement the slave interface rather than having normal ports like other modules?
  - Eliminate the need for a process within fast\_mem and the arbiter
  - Minimize the amount of work in the process of slow\_mem
- Why are multiple slave channels attached to the same port on the bus?
  - Do not want to fix the number of slaves
  - Allow binding as many slaves to the bus as desired during elaboration
  - Multi-port feature of SystemC
    - sc\_port<simple\_bus\_slave\_if, 0> slave\_port
    - slave\_port.size() returns the number of channels bound to the port
    - slave\_port[N] selects the N-th slave channel bound to the port
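The behaviour of a multi-port can be imitated in plain C++ to show why the bus needs only one port for any number of slaves. The template `multi_port_sketch` and the types `slave_if_sketch` and `mem_sketch` are hypothetical stand-ins; in SystemC the binding is done by the port itself during elaboration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a port that accepts any number of bound channels.
template <typename IF>
class multi_port_sketch {
public:
    void bind(IF& channel) { channels_.push_back(&channel); }
    // Number of channels bound to the port.
    int size() const { return static_cast<int>(channels_.size()); }
    // Access to the n-th bound channel (throws std::out_of_range if n is invalid).
    IF* operator[](int n) const { return channels_.at(static_cast<std::size_t>(n)); }
private:
    std::vector<IF*> channels_;
};

// Hypothetical slave interface and implementation, used only to exercise the port.
struct slave_if_sketch {
    virtual ~slave_if_sketch() = default;
    virtual unsigned int start_address() const = 0;
};

struct mem_sketch : slave_if_sketch {
    explicit mem_sketch(unsigned int base) : base_(base) {}
    unsigned int start_address() const override { return base_; }
    unsigned int base_;
};
```

The bus can then loop from `0` to `size() - 1` and query each bound slave, so adding a slave is a change to the top-level binding only, not to the bus.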