
Replay: Unknown Features of the NetBurst Core (page 9)


Category: CPU

by Victor Kartunov, Yury Malich, Jan Keruchenko aka C@t, and Vadim Levchenko aka VLev

[ 06/06/2005 | 04:20 PM ]



STLF Violations

Those of you who have studied the NetBurst micro-architecture carefully should know that a Store operation is split into two quasi-independent micro-operations: Store Data (STD) and Store Address (STA). The results of these two micro-operations are combined in the Store Buffer (SB). The data stays in the SB until the store instruction retires. After that it is written to the cache and RAM via the intermediate Write Buffer, where the modified data is put back together. Load instructions can speculatively read data that has not reached the cache yet directly from the SB. This process is called store-to-load forwarding (STLF).
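To make the store life cycle easier to follow, here is a minimal Python sketch of it. It is a toy model, not a description of the real hardware: the names `StoreBufferEntry` and `ToyCore` are hypothetical, the cache is a plain dictionary, the Write Buffer is left out, and stores whose STA or STD result is still missing are simply skipped by the load.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoreBufferEntry:
    """One pending store; address and data arrive separately."""
    address: Optional[int] = None   # filled in by the STA micro-operation
    data: Optional[int] = None      # filled in by the STD micro-operation

    def ready(self) -> bool:
        return self.address is not None and self.data is not None


class ToyCore:
    def __init__(self):
        self.cache = {}          # address -> value, stands in for L1D and RAM
        self.store_buffer = []   # pending stores in program order

    def allocate_store(self) -> StoreBufferEntry:
        entry = StoreBufferEntry()
        self.store_buffer.append(entry)
        return entry

    def retire_oldest_store(self) -> None:
        # Only at retirement does the data leave the SB for the cache.
        entry = self.store_buffer.pop(0)
        if not entry.ready():
            raise RuntimeError("store retired before STA and STD completed")
        self.cache[entry.address] = entry.data

    def load(self, address: int) -> Optional[int]:
        # The youngest matching complete store wins (forwarding);
        # otherwise the value comes from the cache.
        for entry in reversed(self.store_buffer):
            if entry.ready() and entry.address == address:
                return entry.data
        return self.cache.get(address)


core = ToyCore()
core.cache[0x100] = 1
store = core.allocate_store()
store.address = 0x100                    # STA result arrives
store.data = 42                          # STD result arrives
forwarded = core.load(0x100)             # served from the SB
cache_before_retire = core.cache[0x100]  # the cache still holds the old value
core.retire_oldest_store()
after_retire = core.load(0x100)          # now served from the cache
```

In this sketch the load sees the new value while the cache still holds the old one, which is exactly the situation STLF exists for.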

For STLF to succeed, certain conditions must be met:

  • The data requested by the load can be taken either from the SB or from the L1D, but never from a combination of both, so the size and the address of the loaded data should match the SB record.
  • By the time the load executes, the SB should already hold the correct results of STA and STD.
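The two conditions above can be written down as a small decision function. This is only a sketch under stated assumptions: it follows the strict wording of the list (address and size must match exactly), whereas the real forwarding rules are more detailed; the order of the checks, the `Outcome` names and the `classify_load` signature are ours, and the model is assumed to know the store address even when STA has not executed yet.

```python
from enum import Enum


class Outcome(Enum):
    FORWARD = "forward from the SB"
    CACHE = "read from the L1D"
    VIOLATION = "STLF violation"
    NOT_READY = "wait for STA/STD"


def classify_load(load_addr, load_size, store_addr, store_size,
                  sta_done=True, std_done=True):
    """Classify one load against one pending store (toy model)."""
    load_end = load_addr + load_size
    store_end = store_addr + store_size

    # No byte in common: the store is irrelevant, the load uses the cache.
    if load_end <= store_addr or store_end <= load_addr:
        return Outcome.CACHE

    # Overlap without an exact match: the data would have to be
    # assembled from the SB and the L1D together, which is not allowed.
    if load_addr != store_addr or load_size != store_size:
        return Outcome.VIOLATION

    # Exact match, but the SB record is still incomplete.
    if not (sta_done and std_done):
        return Outcome.NOT_READY

    return Outcome.FORWARD


same_dword = classify_load(0x100, 4, 0x100, 4)
wide_load_after_narrow_store = classify_load(0x100, 4, 0x102, 2)
unrelated = classify_load(0x200, 4, 0x100, 4)
data_missing = classify_load(0x100, 4, 0x100, 4, std_done=False)
```

The second call is the classic trouble case: a 2-byte store followed by a 4-byte load that covers it, so part of the data sits in the SB and the rest in the L1D.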

Violating either of these conditions may lead to very unpleasant delays. A load that could not complete successfully because of an STLF violation is also re-executed via the replay system.

For a Store to be able to pass its data to a Load, the SB should have the STA and STD results ready in advance. Although the “IA-32 Intel Architecture Optimization Reference Manual” classifies this condition among the “Store Forwarding Restrictions”, we should realize that it requires specific processing and leads to specific consequences. A true STLF violation, when data already located in the SB cannot be forwarded, results in a significant delay: the store and all preceding instructions must retire first, and the store result must be saved in the cache. In our case, i.e. an STLF restriction on data availability, we only have to wait for the STA/STD result. As you may have already guessed, replay comes into play here: the LD and all dependent instructions are sent to the replay loop (RL) and circle there until the Store results arrive.

We have just looked at two types of STLF violations: those where either the STA or the STD result is not ready by the time the Load should execute (the third type, when neither result is ready, is determined by the worst consequences of the first two types).
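The "circle until the results arrive" behaviour can be illustrated with a rough timing sketch. Everything numeric here is an assumption made for the example: the function name `load_completion_cycle`, the cycle values and the loop length of 7 cycles are arbitrary, and both the STA and the STD case are treated as a plain wait, which is a simplification.

```python
def load_completion_cycle(issue_cycle, sta_ready_cycle, std_ready_cycle,
                          replay_loop_cycles):
    """Return (completion_cycle, replays) for a load waiting on a store.

    Toy model: the load can only re-attempt execution once per pass
    through the replay loop, so it completes on the first pass that
    finds both STA and STD results in the store buffer.
    """
    # When neither result is ready, the later of the two decides.
    store_ready_cycle = max(sta_ready_cycle, std_ready_cycle)

    cycle = issue_cycle
    replays = 0
    while cycle < store_ready_cycle:
        cycle += replay_loop_cycles   # one more trip around the loop
        replays += 1
    return cycle, replays


# Hypothetical numbers, chosen only for illustration.
no_wait = load_completion_cycle(10, 5, 5, replay_loop_cycles=7)
std_late = load_completion_cycle(10, 12, 25, replay_loop_cycles=7)
sta_late = load_completion_cycle(10, 25, 5, replay_loop_cycles=7)
both_late = load_completion_cycle(10, 20, 25, replay_loop_cycles=7)
```

With these numbers the store result is available at cycle 25, but the load only completes at cycle 31 after three replays: in this model the penalty grows in whole loop passes rather than cycle by cycle.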

