A 5-stage instruction pipeline has stage delays of $180,250,150,170$, and 250 , respectively, in nanoseconds. The delay of an inter-stage latch is 10 nanoseconds. Assume that there are no pipeline stalls due to branches and other hazards. The time taken to process 1000 instructions in microseconds is ________ . (Rounded off to two decimal places)
An application executes $6.4 \times 10^8$ number of instructions in 6.3 seconds. There are four types of instructions, the details of which are given in the table. The duration of a clock cycle in nanoseconds is _________. (rounded off to one decimal place)
Instruction type |
Clock cycles required per instruction (CPI) |
Number of Instructions executed |
---|---|---|
Branch | 2 | $2.25\times10^8$ |
Load | 5 | $1.20\times10^8$ |
Store | 4 | $1.65\times10^8$ |
Arithmetic | 3 | $1.30\times10^8$ |
A non-pipelined instruction execution unit operating at 2 GHz takes an average of 6 cycles to execute an instruction of a program P. The unit is then redesigned to operate on a 5-stage pipeline at 2 GHz. Assume that the ideal throughput of the pipelined unit is 1 instruction per cycle. In the execution of program P, 20% instructions incur an average of 2 cycles stall due to data hazards and 20% instructions incur an average of 3 cycles stall due to control hazards. The speedup (rounded off to one decimal place) obtained by the pipelined design over the non-pipelined design is ________
The baseline execution time of a program on a 2 GHz single core machine is 100 nanoseconds (ns). The code corresponding to 90% of the execution time can be fully parallelized. The overhead for using an additional core is 10 ns when running on a multicore system. Assume that all cores in the multicore system run their share of the parallelized code for an equal amount of time.
$$ \text { The number of cores that minimize the execution time of the program is _______. } $$