Lecture 8: Memories, lab4



TSEA44: Computer hardware - a system on a chip

2022-12-07


# **Today**

- Memories/memory controller
- Lab4, new instruction




### Practical info

- Lab closed after 23/12
  - Opens again after New Year (2/1-20)
  - Remote login still works (thinlinc.edu.liu.se -> muxen1-0xx, xx = 01-16)
- Office corridors locked during Christmas/New Year
  - Hard to get access to people (even if not on holiday/vacation)






















# SDRAM; Synchronous Dynamic RAM

- Clocked device
- Memory element: Capacitance
- Needs periodic refreshing
- Pipelined operation
- Burst oriented
  - Single burst in our design
- $2 \times (16\text{M} \times 16\ \text{bit}) = 64\ \text{MB}$
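
The capacity figure can be checked with a few lines of arithmetic. The sketch below assumes that the factor 2 is the number of devices, that each device holds 16M words of 16 bits, and that "M" means $2^{20}$; the function name `sdram_capacity_bytes` is made up for illustration.

```c
#include <stdint.h>

/* Total capacity in bytes of `chips` identical memory devices, each
 * organised as `words` locations of `width_bits` bits.
 * (Hypothetical helper, not part of the lab code.) */
uint64_t sdram_capacity_bytes(unsigned chips, uint64_t words, unsigned width_bits)
{
    return (uint64_t)chips * words * width_bits / 8u;
}
```

Calling `sdram_capacity_bytes(2, 16ull << 20, 16)` gives `64ull << 20`, that is 64 MB.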
















### FLASH - Interface

- Looks like SRAM
  - Read
  - Write commands
- Erase is done in blocks
- Contains uCLinux kernel + file system
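
A small software model may help to see why erase is a separate, block-wide operation. The sketch below relies on a general property of flash cells that the slide does not state: programming can only change bits from 1 to 0, and only an erase sets them back to 1. The sizes and all function names are illustrative assumptions, not the parameters or the command interface of the flash on the lab board.

```c
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE (64u * 1024u)  /* illustrative size, not the lab device */
#define BLOCK_SIZE (4u * 1024u)   /* illustrative erase-block size */

static uint8_t flash[FLASH_SIZE];

/* Reads behave like SRAM: just index the array. */
uint8_t flash_read(uint32_t addr)
{
    return flash[addr % FLASH_SIZE];
}

/* Programming can only clear bits (1 -> 0), hence the AND. */
void flash_program(uint32_t addr, uint8_t data)
{
    flash[addr % FLASH_SIZE] &= data;
}

/* Erase works on the whole block containing addr and sets every bit to 1. */
void flash_erase_block(uint32_t addr)
{
    uint32_t base = (addr % FLASH_SIZE) / BLOCK_SIZE * BLOCK_SIZE;
    memset(&flash[base], 0xFF, BLOCK_SIZE);
}
```

In this model, overwriting a byte that is already programmed yields the AND of the old and new values, so updating data in place requires erasing the whole block first.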




# FLASH - Cell




















# Lab 4, Custom instruction

- Increase performance by adjusting instruction set
- Specific for application domain
  - General purpose processor is general purpose
  - Not exceptionally good at anything
- Use profiling to find the most time-consuming part of the application code




# **Huffman Encoding/Decoding**

### 1) After Q

[8×8 block of quantized coefficients; first row: 22 12 0 -12 0 0 0 0, remaining entries mostly 0]

### 2) After zig-zag

### 3) After RLE

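| Value | Raw bits (amplitude value) | Note |
|-------|----------------------------|------|
| 05    | 10110                      |      |
| 04    | 1100                       |      |
| 13    | 100                        |      |
| 24    | 0011                       | -12 => 12-1, force MSB=0 => 0011 |
| 04    | 0111                       | 8-1, force MSB=0 => 0111 |
| F0    |                            |      |
| F0    |                            |      |
| D1    | 1                          |      |
| 00    |                            |      |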
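
The run-length step can be sketched as follows. Each output byte packs the run of zeros in the upper four bits and the size (number of amplitude bits) in the lower four, with `F0` standing for 16 zeros and `00` for end of block. The function names and the test sequence in the usage note are made up for illustration and are not taken from the lab code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Number of bits needed to represent |v| (the "size" nibble). */
static int size_category(int v)
{
    int n = 0;
    for (v = abs(v); v != 0; v >>= 1)
        n++;
    return n;
}

/* Run-length encode the 63 AC coefficients, already in zig-zag order.
 * Each output byte is (run << 4) | size. Returns the number of bytes. */
int rle_ac(const int ac[63], uint8_t out[64])
{
    int n = 0, run = 0;
    for (int k = 0; k < 63; k++) {
        if (ac[k] == 0) {
            run++;
            continue;
        }
        while (run > 15) {        /* F0 stands for a run of 16 zeros */
            out[n++] = 0xF0;
            run -= 16;
        }
        out[n++] = (uint8_t)((run << 4) | size_category(ac[k]));
        run = 0;
    }
    if (run > 0)
        out[n++] = 0x00;          /* 00 = end of block, the rest is zero */
    return n;
}
```

For an input with `ac[0] = 12`, `ac[3] = -8`, `ac[49] = 1` and zeros elsewhere, this produces the six bytes `04 24 F0 F0 D1 00`.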

### 4) Huffman coding

- Value (run of 0:s and magnitude) is Huffman coded (variable length) using table lookup
- Raw bits are left untouched
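
As a minimal sketch of the table lookup, the code below encodes a DC value: the size category is looked up in a Huffman table and the raw amplitude bits are appended unchanged. It assumes the typical luminance DC table from Annex K of the JPEG standard; whether the lab code uses exactly this table is an assumption. The negative-value rule follows libjpeg (low bits of `v - 1`). The names `raw_bits` and `encode_dc` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Huffman codes and code lengths for DC size categories 0..11
 * (typical luminance DC table, JPEG standard Annex K; assumed here). */
static const uint16_t dc_code[12] = {
    0x0, 0x2, 0x3, 0x4, 0x5, 0x6, 0xE, 0x1E, 0x3E, 0x7E, 0xFE, 0x1FE
};
static const uint8_t dc_len[12] = {2, 3, 3, 3, 3, 3, 4, 5, 6, 7, 8, 9};

/* Raw amplitude bits for value v in an n-bit category: positive values
 * as they are, negative values as the low n bits of v - 1 (MSB becomes 0). */
unsigned raw_bits(int v, int n)
{
    if (v < 0)
        v -= 1;
    return (unsigned)v & ((1u << n) - 1u);
}

/* Encode a DC value with |diff| < 2048: Huffman code for the size,
 * followed by the raw bits. Returns the bit string right-aligned and
 * stores its length in *len. */
uint32_t encode_dc(int diff, int *len)
{
    int n = 0;
    for (int a = abs(diff); a != 0; a >>= 1)
        n++;
    *len = dc_len[n] + n;
    return ((uint32_t)dc_code[n] << n) | raw_bits(diff, n);
}
```

With this table, the value 22 (size 5, raw bits `10110`) becomes the 8-bit string `110 10110`, and `raw_bits(-12, 4)` and `raw_bits(-8, 4)` give `0011` and `0111`.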




# **Huffman in JFIF**

- Output: 1-16 bits
- Encodes bytes
- 2 tables used
  - YDC
  - YAC


```
jpegfiles: jpegtest.c, jcdctmgr.c, jdct.c, jchuff.c

draw_image()
init_encoder()              "write header, init your HW"
    init_huffman()
    init_image()            "init some variables"
encode_image()
    forward_DCT()
        jpeg_fdct_islow()
        "quantize"
    encode_mcu_huff()
        emit_bits()
finish_pass_huff()
    flush_bits()            "flush remaining bits"
```

# Emit_bits()

```c
/* Only the right 24 bits of put_buffer are used; the valid bits are left-justified in
 * this part. At most 16 bits can be passed to emit_bits in one call, and we never retain
 * more than 7 bits in put_buffer between calls, so 24 bits are sufficient.
 *
 * buffer, next_buffer, current_buffer_bit, old_put_buffer and new_put_buffer are
 * state variables, assumed here to be declared at file scope.
 */
static void emit_bits (unsigned int code, int size)
{
    unsigned int startcycle;

    new_put_buffer = (int) code;

    // Add new bits to old bits. If at least 8 bits then write a char to buffer,
    // save the rest until we get more bits.
    new_put_buffer &= (1<<size) - 1;    /* mask off any extra bits in code */
    current_buffer_bit += size;         /* new number of bits in buffer */
    new_put_buffer = new_put_buffer << (24 - current_buffer_bit); /* align incoming bits */
    new_put_buffer = new_put_buffer | old_put_buffer; /* and merge with old buffer contents */

    while (current_buffer_bit >= 8) {
        int c = ((new_put_buffer >> 16) & 0xFF); // Mask out the 8 bits we want
        buffer[next_buffer] = (char) c;
        next_buffer++;
        if (c == 0xFF) {                // 0xFF is a reserved code for tags, if we get image data
            buffer[next_buffer] = 0x00; // with an FF value it has to be followed by 0x00.
            next_buffer++;
        }
        new_put_buffer <<= 8;
        current_buffer_bit -= 8;
    }
    old_put_buffer = new_put_buffer;    /* update state variables */
}
```




# Adding an Instruction

1. Instruction selection
2. Hardware modification
3. Assembler modification
4. Compiler modification


# **Instruction Selection**

- l.custx
  - No operands
- Instructions for 64 bit
  - Not used
  - Assembler can understand
  - l.sd I(rA),rB




# **Hardware Modifications**

- Instruction decoder modifications
  - Legal instruction
  - or1200\_ctrl.v
- Special purpose register
  - New group
  - or1200\_sprs.v
- Data path
  - New hardware
  - or1200\_lsu.v
  - or1200\_vlx\_top.v






# OR1200 Pipeline

- Remember stall

| Stage | 1  | 2   | 3   | 4  | 5   | 6   | 7   |
|-------|----|-----|-----|----|-----|-----|-----|
| IF    | ld | add | sub | -  |     |     |     |
| ID/RR |    | ld  | add | -  | sub |     |     |
| EX/M  |    |     | ld  | ld | add | sub |     |
| W     |    |     |     | -  | ld  | add | sub |
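
The stall in the table can be reproduced with a small scheduling model. This is a sketch under assumptions read off the table rather than taken from the OR1200 documentation: the pipeline is in-order with four stages, a load occupies EX/M for two cycles, every other instruction for one, and an instruction waits in its stage until the next stage is free. The type and function names are hypothetical.

```c
#include <stddef.h>

/* First cycle in which an instruction occupies each stage. */
struct timing {
    int fetch, decode, execute, writeback;
};

static int imax(int a, int b)
{
    return a > b ? a : b;
}

/* ex_cycles[i] is the number of cycles instruction i spends in EX/M
 * (assumed: 2 for a load, 1 otherwise). Fills t[0..n-1]. */
void schedule(const int ex_cycles[], size_t n, struct timing t[])
{
    for (size_t i = 0; i < n; i++) {
        if (i == 0) {
            t[i].fetch = 1;
            t[i].decode = 2;
            t[i].execute = 3;
        } else {
            /* EX/M is free once the previous instruction has left it. */
            int ex_free = t[i - 1].execute + ex_cycles[i - 1];
            /* A stage is free once the previous instruction has moved on. */
            t[i].fetch = imax(t[i - 1].fetch + 1, t[i - 1].decode);
            t[i].decode = imax(t[i].fetch + 1, t[i - 1].execute);
            t[i].execute = imax(t[i].decode + 1, ex_free);
        }
        t[i].writeback = t[i].execute + ex_cycles[i];
    }
}
```

For the sequence `ld, add, sub` (`ex_cycles = {2, 1, 1}`), the model puts `add` into EX/M in cycle 5 instead of cycle 4 and the three instructions reach W in cycles 5, 6 and 7, matching the table.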












## Control

- May not be needed
- May be an FSM




# **Store Unit**

- Stores the data
- 0xFF stored as 0xFF00
  - JPEG markers
- Only byte alignment!
  - Parallel stores faster
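
In software terms, the byte stuffing that the store unit performs can be sketched as below. This is only a behavioural model with a hypothetical function name; the hardware interface in the lab will look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from in to out, inserting 0x00 after every 0xFF so that
 * image data is never mistaken for a JPEG marker.
 * out must have room for 2 * n bytes. Returns the number of bytes written. */
size_t stuff_bytes(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        out[w++] = in[i];
        if (in[i] == 0xFF)
            out[w++] = 0x00;
    }
    return w;
}
```

Because one input byte may turn into two output bytes, the write position can land on any byte address, which is why the unit has to handle byte alignment.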




### Software

- New assembler
  - Easy
- New compiler
  - Hard problem for complex instructions
  - Compiler knows functions
- C
  - Inline Assembler



# Instruction Usage

```c
unsigned char* sb_get_buff_pos(void)
{
   unsigned char* pos;
   /* "=r"(pos) is the output operand */
   asm volatile("l.mfspr %0,%1,0x2":"=r"(pos):"r"(0xc000));
   return pos;
}
```

```
00000250 <_sb_get_buff_pos>:
  250: 9c 21 ff fc    l.addi  r1,r1,0xfffffffc
  254: d4 01 10 00    l.sw    0x0(r1),r2
  258: 9c 41 00 04    l.addi  r2,r1,0x4
  25c: a9 60 c0 00    l.ori   r11,r0,0xc000
  260: b5 6b 00 02    l.mfspr r11,r11,0x2
  264: 84 41 00 00    l.lwz   r2,0x0(r1)
  268: 44 00 48 00    l.jr    r9
  26c: 9c 21 00 04    l.addi  r1,r1,0x4
```
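
To see which special purpose register the `l.mfspr` above addresses, the SPR address can be split into its group and register index. The sketch assumes the OpenRISC 1000 layout of a 16-bit SPR address (group in bits 15..11, register index in bits 10..0) and that `l.mfspr rD,rA,K` reads the SPR at address `rA | K`. The helper names are hypothetical, and how the lab's new group is wired up is not shown here.

```c
#include <stdint.h>

/* Effective SPR address of l.mfspr rD,rA,K: the value of rA OR-ed with K. */
uint16_t mfspr_address(uint32_t ra, uint16_t k)
{
    return (uint16_t)(ra | k);
}

/* Group number: bits 15..11 of the SPR address. */
unsigned spr_group(uint16_t addr)
{
    return addr >> 11;
}

/* Register index within the group: bits 10..0. */
unsigned spr_index(uint16_t addr)
{
    return addr & 0x7FFu;
}
```

With `rA = 0xc000` and `K = 0x2` the address is `0xc002`, which under this layout is register 2 in group 24.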



