Prescott: The Last of the Mohicans? (Pentium 4: from Willamette to Prescott) (page 17)
Category: CPU [ 05/25/2005 | 11:45 AM ]

There is one important point to mention: the cache access latencies given in the whitepaper documentation differ from those measured in real testing, because they do not take the cache access strategy into account. The claimed values are simply the L2 cache latencies for the case when the cache is completely isolated from all other subsystems. Any actual request to the L2 cache of the Northwood processor, however, will take 9+ clocks. When new data is requested, the L1 cache is always checked first. If the requested data is not there, the search continues at the next levels of the cache hierarchy. For Northwood this means that the total latency is made up of 2 clock cycles in the L1 cache plus 7 clock cycles in the L2 cache. By the same logic, the actual access latency of the Prescott core will be 22 clocks or even more. Note that it is not easy to obtain these numbers: the requests need to be composed in a specific way, so that data prefetch does not influence the measured latency.

Since we have come to speak about prefetch, let me remind you that this mechanism was developed further in Prescott (as you remember, its major task is to “deliver” the data to the processor core in time). To be more exact, the limitations that existed in the Northwood core have finally been eliminated. In particular, the prefetch of the Northwood processor could not cross the 4KB border of a virtual memory page, which reduced its efficiency tremendously. The problem was that the Northwood prefetch mechanism could grab data from a page only if that page had already been written into the TLB. If it requested data from the “following” page, and that page had not yet been written into the TLB, the Northwood core would not initiate a request to the page tables for further translation and would not add any record to the TLB.
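The rule just described can be written down as a small model. This is only an illustrative sketch, not a description of the actual hardware logic: the function name `can_prefetch`, the `may_fill_tlb` flag and the representation of the TLB as a set of page numbers are all assumptions made for the example. A prefetcher restricted in the Northwood way corresponds to `may_fill_tlb=False`; a prefetcher without that restriction is modelled, as an assumption, by `may_fill_tlb=True`.

```python
PAGE_SIZE = 4096  # 4KB virtual memory page, as in the text


def page_of(addr):
    """Return the virtual page number that contains addr."""
    return addr // PAGE_SIZE


def can_prefetch(target_addr, tlb_pages, may_fill_tlb):
    """Decide whether the modelled prefetcher may fetch target_addr.

    tlb_pages    -- set of page numbers currently held in the (modelled) TLB
    may_fill_tlb -- hypothetical switch: False models the restriction
                    described for Northwood, True models a prefetcher that
                    is allowed to trigger a translation and add a TLB record
    """
    target_page = page_of(target_addr)
    if target_page in tlb_pages:
        # Translation is already available, so the prefetch can proceed.
        return True
    if may_fill_tlb:
        # Assumed behaviour: translate the page and record it in the TLB.
        tlb_pages.add(target_page)
        return True
    # No translation and no permission to obtain one: the prefetch is dropped.
    return False
```

With only page 0 in the modelled TLB, a restricted prefetcher is refused at address 4096 + 64 (the first lines of the following page), while the unrestricted variant proceeds and leaves page 1 recorded in the TLB.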
That is why this prefetch mechanism did not go beyond the 4KB limit. Strictly speaking, it can go beyond 4KB, but only if the page is already in the TLB; it cannot add a record about the page to the TLB if the page is not there yet. In Prescott everything is different. The prefetch mechanism is now more aggressive and has no limitation such as the 4KB border of a virtual memory page. Moreover, it handles loop exits much better.

As a result of all these improvements (as well as the higher bus speed), work with the memory subsystem has turned into Prescott’s major advantage. At the same time, work with the caches has become somewhat less efficient, and the maximum bandwidth has dropped, especially for write operations, as we have already mentioned above.

I have to point out that once the sizes of the L1 and L2 caches changed, the speed at which cache conflicts are handled changed as well. Aliasing conflicts are the most important among them. Here the term refers to a situation where two different memory addresses have identical low-order bits, for instance the low-order 16 bits. If we try to place both memory lines in the cache and access them in turn, they will evict one another from the cache every time, which reduces caching efficiency a lot. Since we have not yet dwelled on aliasing in this article, let’s do it now. To understand better what an aliasing conflict is, we will have to recall the structure of an n-way set-associative cache.
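The aliasing condition described above can be illustrated with a few lines of code. This is a deliberately simplified sketch and not a model of the real Pentium 4 caches: the line size, the number of sets and the LRU replacement policy are assumed values chosen for the example, and the helper names (`aliases`, `set_index`, `count_misses`) are hypothetical. With one way per set, two aliasing lines contend for a single slot and evict each other on every access.

```python
LINE_SIZE = 64    # bytes per cache line (assumed for the example)
NUM_SETS = 32     # number of sets in the modelled cache (assumed)
ALIAS_BITS = 16   # low-order bits compared, as in the example in the text


def aliases(a, b, bits=ALIAS_BITS):
    """True if addresses a and b have identical low-order `bits` bits."""
    mask = (1 << bits) - 1
    return (a & mask) == (b & mask)


def set_index(addr):
    """Set selected by addr in the modelled set-associative cache."""
    return (addr // LINE_SIZE) % NUM_SETS


def count_misses(addresses, ways=1):
    """Count misses for an access sequence, using LRU replacement per set."""
    sets = {}  # set index -> resident line numbers, most recently used last
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        resident = sets.setdefault(set_index(addr), [])
        if line in resident:
            resident.remove(line)      # hit: refresh its LRU position below
        else:
            misses += 1
            if len(resident) >= ways:
                resident.pop(0)        # evict the least recently used line
        resident.append(line)
    return misses
```

For example, the addresses 0x10040 and 0x20040 share their low 16 bits and map to the same set in this model. Accessing them alternately with `ways=1` misses on every access, whereas two lines that fall into different sets miss only on their first access.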