Replay: Unknown Features of the NetBurst Core (page 14)
Category: CPU
by Victor Kartunov, Yury Malich, Jan Keruchenko aka C@t, and Vadim Levchenko aka VLev
[ 06/06/2005 | 04:20 PM ]

Recording Data into Write Buffer

The story here is rather funny: it all started when we discovered completely new replay loops that seemed to appear out of nowhere. Of course, we had to find out where they came from and why. At first the investigation went very slowly, because the loops would often emerge and then disappear for no apparent reason. A detailed study eventually revealed a connection with data being recorded into the Write Buffer (WB). Let's discuss a few details about it.

The L1 cache of the Pentium 4 processor uses a Write Through update policy, which implies that the contents of both the L1 and the L2 cache should be updated simultaneously. The write is therefore actually made to the L2 cache, but the results of multiple operations are first accumulated in the WB, so that the transfer to the L2 cache can be performed in the minimal number of transactions. When a write is made to a line that is not yet present in the WB, that line gets blocked for reading for a while. For data reading from the Write Buffer to be blocked, a few conditions should be fulfilled:
Note that the addresses of the read and write operations may differ: they only have to belong to the same line. According to our test results, if we read from the first half of the line (bytes 0-31), the Load command goes to replay within 21-33 clock cycles after the Store. If we read from the second half of the line (bytes 32-63), the Load command goes to replay within 21-34 clock cycles after the Store. A difference of only one clock cycle between the lower and upper 256 bits of the line indicates that the modified data is being combined with the non-modified data from the cache during these periods.

The Pentium 4 Northwood processor features a Write Buffer of 6 lines, 64 bytes each. That is not much, so when read and write operations follow one another closely, write delays are not the only penalty. It is also impossible to read the saved data from the Write Buffer immediately, which causes further delays and sends the read commands through the replay system.
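The same-line rule and the measured windows can be illustrated with a small model. This is only a sketch under stated assumptions, not a description of the actual hardware logic: it reads the figures above as "a Load that hits the same 64-byte line 21-33 (or 21-34) clock cycles after the Store goes to replay", it ignores the other blocking conditions, and all function and constant names are hypothetical.

```python
LINE_SIZE = 64   # bytes per Write Buffer line (figure from the article)
WB_LINES = 6     # Write Buffer lines on Northwood (figure from the article)

# Measured replay windows, in clock cycles after the Store (from the article).
# Treating them as inclusive ranges is an assumption of this sketch.
REPLAY_WINDOW = {
    "lower": (21, 33),  # Load from bytes 0-31 of the line
    "upper": (21, 34),  # Load from bytes 32-63 of the line
}


def line_index(addr: int) -> int:
    """Index of the 64-byte line that contains the address."""
    return addr // LINE_SIZE


def half_of_line(addr: int) -> str:
    """Which 256-bit half of its line the address falls into."""
    return "lower" if addr % LINE_SIZE < LINE_SIZE // 2 else "upper"


def same_line(store_addr: int, load_addr: int) -> bool:
    """Addresses may differ; they only need to share a line."""
    return line_index(store_addr) == line_index(load_addr)


def load_replays(store_addr: int, load_addr: int, cycles_after_store: int) -> bool:
    """Hypothetical model: does the Load land in the measured replay window?"""
    if not same_line(store_addr, load_addr):
        return False
    first, last = REPLAY_WINDOW[half_of_line(load_addr)]
    return first <= cycles_after_store <= last


if __name__ == "__main__":
    print("WB capacity in bytes:", WB_LINES * LINE_SIZE)
    print(load_replays(0x1000, 0x1008, 25))  # same line, lower half
    print(load_replays(0x1000, 0x1048, 25))  # different line
```

With 6 lines of 64 bytes the buffer holds only 384 bytes in total, which is why closely spaced Stores and Loads to the same lines run into this window so easily.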