A processor X1 operating at 2 GHz has a standard 5-stage RISC instruction pipeline with a base CPI (cycles per instruction) of one in the absence of pipeline hazards. For a given program P in which 30% of the instructions are branches, control hazards incur a 2-cycle stall for every branch. A new version of the processor, X2, operating at the same clock frequency, has an additional branch predictor unit (BPU) that completely eliminates stalls for correctly predicted branches. Wrong predictions yield neither savings nor additional stalls. There are no structural or data hazards for X1 and X2. If the BPU has a prediction accuracy of 80%, the speedup (rounded off to two decimal places) obtained by X2 over X1 in executing P is ____________.
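One way to check the arithmetic for this question is the sketch below. It assumes that, because both processors run at the same clock frequency and execute the same program, the speedup reduces to the ratio of effective CPIs. The function names `effective_cpi` and `bpu_speedup` are illustrative and not part of the question.

```python
def effective_cpi(base_cpi, branch_fraction, stall_cycles, stalled_share):
    """CPI when only `stalled_share` of the branches pay the stall penalty."""
    return base_cpi + branch_fraction * stalled_share * stall_cycles


def bpu_speedup(base_cpi=1.0, branch_fraction=0.30, stall_cycles=2, accuracy=0.80):
    # X1 has no predictor, so every branch stalls.
    cpi_x1 = effective_cpi(base_cpi, branch_fraction, stall_cycles, 1.0)
    # X2 stalls only on mispredicted branches (20% of them here).
    cpi_x2 = effective_cpi(base_cpi, branch_fraction, stall_cycles, 1.0 - accuracy)
    # Same clock and same instruction count, so the time ratio is the CPI ratio.
    return cpi_x1 / cpi_x2


print(f"speedup = {bpu_speedup():.2f}")
```

Under these assumptions the CPIs are 1.6 for X1 and 1.12 for X2, giving a ratio of roughly 1.43.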
Consider a pipelined processor with 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory Access (MEM), and Write Back (WB). Each stage of the pipeline, except the EX stage, takes one cycle. Assume that the ID stage merely decodes the instruction and that the register read is performed in the EX stage. The EX stage takes one cycle for an ADD instruction and two cycles for a MUL instruction. Ignore pipeline register latencies.
Consider the following sequence of 8 instructions:
ADD, MUL, ADD, MUL, ADD, MUL, ADD, MUL
Assume that every MUL instruction is data-dependent on the ADD instruction just before it and every ADD instruction (except the first ADD) is data-dependent on the MUL instruction just before it. The speedup is defined as follows:
$$\text{Speedup} = \frac{\text{Execution time without operand forwarding}}{\text{Execution time with operand forwarding}}$$
The Speedup achieved in executing the given instruction sequence on the pipelined processor (rounded to 2 decimal places) is _______
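The two execution times can be estimated with a small in-order pipeline model such as the sketch below. It rests on assumptions the question does not spell out: without forwarding, a dependent instruction starts EX in the cycle after its producer's WB; with forwarding, it starts EX in the cycle after the producer's EX finishes; and an instruction holds its stage until the next stage is free. The function name `total_cycles` is illustrative.

```python
def total_cycles(ex_latencies, forwarding):
    """Return the cycle in which the last instruction performs WB.

    Every instruction is assumed to depend on the one just before it,
    matching the ADD/MUL chain in the question.
    """
    # Cycle in which the previous instruction entered each stage; the initial
    # values let the first instruction enter IF in cycle 1 without waiting.
    prev = {"ID": 1, "EX": 0, "MEM": 0, "WB": 0, "ready": 0}
    for latency in ex_latencies:
        fetch = prev["ID"]                    # IF is free once the predecessor is in ID
        decode = max(fetch + 1, prev["EX"])   # ID is free once the predecessor is in EX
        # EX needs a free stage and an available operand (read happens in EX).
        execute = max(decode + 1, prev["MEM"], prev["ready"])
        memory = max(execute + latency, prev["WB"])
        writeback = memory + 1
        # First cycle in which a dependent instruction may begin EX.
        ready = execute + latency if forwarding else writeback + 1
        prev = {"ID": decode, "EX": execute, "MEM": memory,
                "WB": writeback, "ready": ready}
    return prev["WB"]


ADD, MUL = 1, 2                 # EX latencies in cycles
program = [ADD, MUL] * 4        # ADD, MUL, ADD, MUL, ADD, MUL, ADD, MUL

without_fwd = total_cycles(program, forwarding=False)
with_fwd = total_cycles(program, forwarding=True)
print(without_fwd, with_fwd, without_fwd / with_fwd)
```

With this model the sequence takes 30 cycles without forwarding and 16 cycles with forwarding, a ratio of 1.875. A different convention, for example allowing a register write and read in the same cycle, would change the no-forwarding count.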
A five-stage pipeline has stage delays of 150, 120, 150, 160 and 140 nanoseconds. The registers that are used between the pipeline stages have a delay of 5 nanoseconds each.
The total time to execute 100 independent instructions on this pipeline, assuming there are no pipeline stalls, is ______ nanoseconds.
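The calculation can be sketched as follows, assuming the clock period is set by the slowest stage plus one register delay and that n instructions on a k-stage pipeline need k + n - 1 cycles when there are no stalls. The function name `pipeline_time_ns` is illustrative.

```python
def pipeline_time_ns(stage_delays_ns, register_delay_ns, instructions):
    """Total time for `instructions` independent instructions, with no stalls."""
    # The clock must accommodate the slowest stage plus its pipeline register.
    cycle_ns = max(stage_delays_ns) + register_delay_ns
    # The first instruction needs k cycles; each later one completes a cycle apart.
    cycles = len(stage_delays_ns) + instructions - 1
    return cycles * cycle_ns


print(pipeline_time_ns([150, 120, 150, 160, 140], 5, 100))
```

Here the cycle time is 160 + 5 = 165 ns and the cycle count is 5 + 100 - 1 = 104, so the total is 104 × 165 = 17160 ns.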
Consider the following instruction sequence, where registers R1, R2 and R3 are general-purpose registers and MEMORY[X] denotes the content at memory location X.
| Instruction | Semantics | Instruction Size (bytes) |
|---|---|---|
| MOV R1, (5000) | R1 ← MEMORY[5000] | 4 |
| MOV R2, (R3) | R2 ← MEMORY[R3] | 4 |
| ADD R2, R1 | R2 ← R1 + R2 | 2 |
| MOV (R3), R2 | MEMORY[R3] ← R2 | 4 |
| INC R3 | R3 ← R3 + 1 | 2 |
| DEC R1 | R1 ← R1 - 1 | 2 |
| BNZ 1004 | Branch if not zero to the given absolute address | 2 |
| HALT | Stop | 1 |
Assume that the content of the memory location 5000 is 10, and the content of the register R3 is 3000. The content of each of the memory locations from 3000 to 3010 is 50. The instruction sequence starts from the memory location 1000. All the numbers are in decimal format. Assume that the memory is byte addressable.
After the execution of the program, the content of memory location 3010 is ______
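The program can be traced with a short simulation like the sketch below. It assumes that BNZ tests the zero flag set by the preceding DEC R1, and it models each data location as holding one value. The instruction addresses in the comments follow from the start address 1000 and the listed instruction sizes, which places `MOV R2, (R3)` at the branch target 1004. The function name `run_program` is illustrative.

```python
def run_program():
    """Simulate the instruction sequence and return the final data memory."""
    memory = {addr: 50 for addr in range(3000, 3011)}  # locations 3000..3010 hold 50
    memory[5000] = 10
    r3 = 3000

    r1 = memory[5000]        # 1000: MOV R1, (5000)
    while True:
        r2 = memory[r3]      # 1004: MOV R2, (R3)
        r2 = r1 + r2         # 1008: ADD R2, R1
        memory[r3] = r2      # 1010: MOV (R3), R2
        r3 += 1              # 1014: INC R3
        r1 -= 1              # 1016: DEC R1
        if r1 == 0:          # 1018: BNZ 1004 falls through once R1 reaches zero
            break
    return memory            # 1020: HALT


final_memory = run_program()
print(final_memory[3010])
```

In this trace the loop runs ten times and writes locations 3000 through 3009 (values 60 down to 51), so location 3010 is never written and keeps its initial value of 50.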