Prescott: The Last of the Mohicans? (Pentium 4: from Willamette to Prescott). Part II (page 8)
Category: CPU [ 05/30/2005 | 01:32 PM ]

Moreover, it is easy to show that the efficiency of this strategy generally drops as the pipeline grows longer. As you have just seen, with 6 stages between the scheduler and the execution units we got 8 clock cycles instead of 2, so the resulting efficiency is 25%. For a pipeline with only one stage between the scheduler and the execution unit the efficiency rises to 67%. For a pipeline with 666 stages we get 668 clock cycles instead of 2, and the efficiency is 0.3%.

At the same time, if the instruction takes longer to execute, this strategy may actually work much better. Say, for instance, that our pipeline has 6 stages between the scheduler and the execution units, but the instruction in question takes 50-100 clock cycles to execute (depending on the circumstances). We do not know the exact execution time from the very beginning; it becomes known only after about 25 clocks.

The execution unit receives the micro-operation at the [0] time point. At the [25] time point the execution unit learns that the operation will take 51 clocks to complete. At the same time point ([25]) the scheduler receives the same information. It waits for a while and … at the [45] time point it sends out the dependent micro-operation, which reaches the execution unit exactly at … the [51] time point, where it finds the freshly obtained result of the previous micro-operation.

In other words, there are situations where the combination of the pipeline length, the micro-operation latency and the time at which this latency becomes known makes this strategy truly efficient.
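The arithmetic behind these figures can be put into a few lines of Python. This is only a rough sketch under a simplified model: the function names are ours and purely illustrative, all times are counted in clocks from the moment the first micro-operation enters the execution unit, and we assume the scheduler can send the dependent micro-operation no earlier than the moment the latency becomes known.

```python
def wait_strategy_efficiency(latency, distance):
    """Share of useful clocks when the scheduler waits for the result
    and only then sends the dependent micro-operation.

    latency  -- clocks the first micro-operation needs to execute
    distance -- pipeline stages between scheduler and execution unit
    """
    return latency / (latency + distance)


def latest_send_time(latency, distance):
    """Time point at which the dependent micro-operation must leave the
    scheduler to arrive exactly when the result becomes available."""
    return latency - distance


def arrives_just_in_time(latency, distance, known_at):
    """True if the latency is known early enough for the scheduler to
    send the dependent micro-operation at the ideal time point."""
    return latest_send_time(latency, distance) >= known_at


if __name__ == "__main__":
    # Pipelines with 6, 1 and 666 stages, 2-clock operation.
    for stages in (6, 1, 666):
        print(stages, round(wait_strategy_efficiency(2, stages) * 100, 1))
    # Long operation: 51 clocks, latency known at [25], 6 stages.
    print(latest_send_time(51, 6), arrives_just_in_time(51, 6, 25))
```

With the numbers from the text, the long operation gives a send time of [45], which lies after the [25] time point at which the latency becomes known, so the dependent micro-operation can be sent just in time.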
This strategy is 100% efficient when [the distance between the scheduler and functional units] is smaller than the difference between [the micro-operation latency] and [the time the latency becomes known]. Integer operations do not meet this condition, which is why this strategy doesn’t work for us here.

Third option (optimistic). From the performance point of view, the two options we have just discussed are of little interest to us. The first option is awfully stupid, and the second is too inefficient. Only one option is left: to send instructions in advance, before we know the execution status of the previous micro-operations. Let me describe this option in a bit more detail.

The commands can be released one after another, hoping for the best in terms of the data load outcome. In our case it means that 2 clock cycles after the data load micro-operation is sent out, the next micro-operation should already be sent as well. How can we benefit from this strategy?

At the [0] time point we send the data load micro-operation to the execution unit. It should reach this unit at the [0+6] time point, and the scheduler knows about it. Without waiting for this particular time point, the scheduler releases the next micro-operation at the [0+2] time point (i.e. two clocks behind the previous command in the pipeline). What happens next?

At the [0+6] time point the data load command reaches the execution unit. The next command, which depends on it, is 2 clocks behind. At the [0+6+2] time point the data load command receives its data from the cache and continues its trip down the pipeline, and the execution unit receives the second micro-operation right on time, just as the result becomes ready. So, it turns out that the execution unit works two clocks in a row without pausing.
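The time points of the optimistic option can be traced with a short Python sketch. It is an illustration only: the function and key names are ours, the defaults (6 stages, 2-clock load) are the numbers used in the text, and the model assumes the load finds its data in the cache, which is exactly the "hoping for the best" case.

```python
def optimistic_timeline(distance=6, load_latency=2, start=0):
    """Time points for a load and its dependent micro-operation when the
    scheduler sends the dependent one without waiting for the load result.

    distance     -- pipeline stages between scheduler and execution unit
    load_latency -- clocks the load needs once it is in the execution unit
                    (assumes the data is found in the cache)
    """
    load_sent = start
    # The scheduler releases the dependent micro-operation in advance,
    # exactly load_latency clocks behind the load.
    dependent_sent = load_sent + load_latency
    load_at_unit = load_sent + distance
    data_ready = load_at_unit + load_latency
    dependent_at_unit = dependent_sent + distance
    return {
        "load_sent": load_sent,
        "dependent_sent": dependent_sent,
        "load_at_unit": load_at_unit,
        "data_ready": data_ready,
        "dependent_at_unit": dependent_at_unit,
        # Clocks the execution unit idles between the two micro-operations.
        "idle_clocks": dependent_at_unit - data_ready,
    }


if __name__ == "__main__":
    for name, value in optimistic_timeline().items():
        print(name, value)
```

In this model the dependent micro-operation arrives at [0+2+6], the same time point [0+6+2] at which the loaded data becomes ready, so the execution unit has no idle clocks between the two micro-operations.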