The current trend in microprocessor development is to build devices with as many cores as possible. However, some algorithms cannot benefit from many-core architectures or multi-threaded execution. To boost the performance of single-threaded applications on multi-core microprocessors, Intel Corp. recently outlined a technology called “Anaphase”.

Researchers from Intel Labs Barcelona have developed “Anaphase”, a novel hardware/software hybrid approach that uses multiple cores to improve single-thread performance on multi-core processors. The research focuses on speculative techniques that automatically partition single-threaded applications so that they can be processed on multiple cores.

The proposed technique features a set of novel hardware mechanisms that support the execution of threads generated at compile time. These threads result from a fine-grain speculative decomposition of the original application, and they are executed on a modified multi-core system that includes mechanisms to support multiple versions of data, to detect violations among threads, to reconstruct the original sequential order, and to checkpoint the architectural state and recover from misspeculations. On the hardware side, a new unit called “Inter-Core Memory Coherency Module” (ICMC) could be integrated into the die of future processors.
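The four mechanisms listed above can be illustrated with a toy software model. The sketch below is not Intel's design and does not model the ICMC; it is a minimal Python simulation of generic thread-level speculation, and every name in it (SpecContext, run_speculatively, the three example tasks) is hypothetical. Each task gets a private write buffer (multiple versions), records what it reads (violation detection), commits in program order (sequential order reconstruction), and is discarded and re-run if it read a stale value (checkpoint and recovery).

```python
from typing import Callable, Dict, List, Set

Memory = Dict[str, int]


class SpecContext:
    """Speculative view of memory for one task (hypothetical helper)."""

    def __init__(self, base: Memory) -> None:
        self._base = base             # state the task reads from
        self.writes: Memory = {}      # private, uncommitted versions
        self.reads: Set[str] = set()  # locations read from the base state

    def load(self, addr: str) -> int:
        if addr in self.writes:       # a task sees its own writes first
            return self.writes[addr]
        self.reads.add(addr)
        return self._base.get(addr, 0)

    def store(self, addr: str, value: int) -> None:
        self.writes[addr] = value


Task = Callable[[SpecContext], None]


def run_speculatively(tasks: List[Task], memory: Memory) -> int:
    """Execute tasks speculatively, commit in program order, and re-run
    any task that read stale data. Returns the number of re-executions."""
    checkpoint = dict(memory)         # every task starts from this snapshot
    contexts = []
    for task in tasks:                # stands in for parallel execution
        ctx = SpecContext(checkpoint)
        task(ctx)
        contexts.append(ctx)

    squashes = 0
    dirty: Set[str] = set()           # locations written by committed tasks
    for task, ctx in zip(tasks, contexts):
        if ctx.reads & dirty:         # violation: an earlier task wrote it
            squashes += 1
            ctx = SpecContext(memory)  # discard buffer, re-run on live state
            task(ctx)
        memory.update(ctx.writes)     # commit in original sequential order
        dirty.update(ctx.writes)
    return squashes


# Three pieces of a hypothetical sequential program.
def task_a(ctx: SpecContext) -> None:
    ctx.store("x", ctx.load("x") + 1)   # x = x + 1


def task_b(ctx: SpecContext) -> None:
    ctx.store("y", ctx.load("x") * 2)   # y = x * 2, depends on task_a


def task_c(ctx: SpecContext) -> None:
    ctx.store("z", ctx.load("w") + 5)   # z = w + 5, independent
```

Running `run_speculatively([task_a, task_b, task_c], {"x": 1, "w": 10})` leaves the memory in the same state as plain sequential execution: task_b is squashed once because it read x before task_a committed, while task_c commits without re-execution. Real hardware would do the bookkeeping per cache line and run the tasks concurrently; the loop here only stands in for that.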

According to Intel, the proposed hardware/software scheme outperforms previous hardware-only schemes for combining cores to execute single-threaded applications by more than 10% on average on SPEC2006 across all configurations. Moreover, single-thread performance improves by 41% on average, and by up to 2.6 times for some selected applications, when the proposed scheme is used on a so-called “tiny-core” (Intel did not reveal what a tiny-core actually is, but it may be a part of the company’s 48-core SCC processor).

At present, Anaphase is a research project, and the Intel Labs Barcelona researchers are looking into ways to integrate the technology into future processor designs.

Considering that Intel is working on numerous many-core designs, including the Larrabee x86 graphics processor and the 48-core supercomputer on a chip (SCC) prototype, the ICMC may indeed be a useful piece of hardware, and not only for Intel. Both ATI, the graphics business unit of Advanced Micro Devices, and Nvidia Corp. are working hard to bring their many-core graphics processing units (GPUs) into various high-performance computing (HPC) segments. Although raw horsepower matters more for HPC than the performance of single-threaded apps, hardware/software tricks to speed up single-threaded algorithms may become necessary as general-purpose processing on GPUs (GPGPU) becomes more popular in different markets.

Tags: Intel, ICMC, Core, SCC, Larrabee


Comments currently: 10
Discussion started: 05/20/10 02:54:55 PM
Latest comment: 05/21/10 12:09:27 PM


Sounds good, and... Is there any AMD plan for such things?
0 0 [Posted by: Pouria  | Date: 05/20/10 02:54:55 PM]

It doesn't necessarily need to be. The ATI division should do their job and provide more useful software that uses the GPU. Working on that chapter will, in some years, lead to implementations in CPUs. But they aren't working on that at all. They should have provided around 6 to 9 projects per year since they've been promoting GPGPU. Work with software makers. Make a plug-in for a multi-format video encoder, one for archiving, one for 2D processing, one for password recovery, etc. IMHO, they should have made the best video encoder on the market and made it available for $19 and only compatible with ATI cards. That would have been very profitable and good for their image.
0 0 [Posted by: East17  | Date: 05/20/10 04:27:57 PM]

What about ATI Avivo, the fastest video encoder on the market?
0 0 [Posted by: Blackcode  | Date: 05/20/10 08:49:37 PM]

AMD already boosts three cores for single-threaded applications in turbo mode.
0 0 [Posted by: Blackcode  | Date: 05/20/10 08:46:55 PM]

Which is just a "not as good" copy of Intel's turbo mode.

But this is not talking about turbo mode, it's talking about splitting a single thread so it can run on multiple cores (in ADDITION to turbo mode).

AMD isn't working on anything like that
0 0 [Posted by: taltamir  | Date: 05/21/10 10:37:36 AM]

I think Intel is going the wrong way; most of today's applications are becoming multithreaded. We will see. AMD's turbo mode is better than Intel's, because Intel overclocks only one core in turbo mode, while AMD overclocks three cores. Intel powers off the other cores in turbo mode, while the AMD cores that are not in turbo mode go idle. So I don't agree with you that AMD's turbo is a copy of Intel's turbo; it is a different approach.
0 0 [Posted by: Blackcode  | Date: 05/21/10 12:09:27 PM]

AMD was very good, especially in graphics and gaming (processors, I mean, not VGA cards). AMD's memory controller in the CPU instead of the north bridge was wonderful work. But why, since Intel changed its architecture to be similar to AMD's (maybe I am wrong), is AMD far behind Intel?
Also AMD does not have any good response for Hyper-threading.
"AMD True Core Scalability" is there
who cares?!!!!
0 0 [Posted by: Pouria  | Date: 05/21/10 04:13:13 AM]

Well, Pouria, Nehalem may have a hierarchy and layout that is very similar to K10, but remember that Core2/i7 has a phat execution engine. Intel focused on execution with C2 and more so on "plumbing" with Nehalem. I'd say that C2->i7 ~= K8->K10 in terms of transition, but C2 was a lot meaner to begin with, therefore i7 has the performance edge now.

What this article is talking about is something very different from what we've seen so far. They are claiming that they can put multiple cores to work on a single thread. I don't know how on earth they will pull it off because there should be a lot of synchronization problems with trying to do something like this.
0 0 [Posted by: cheeseman  | Date: 05/21/10 05:45:02 AM]

The computer architecture community has been doing research along these lines for some time now but research has picked up with the multi-core era. This research area is known as "Speculative Multithreading" or "Thread-Level Speculation" and does in fact require mechanisms for cross-thread communication. (A google scholar search should yield some good works. The seminal paper is "Multiscalar".)

The research by Intel improves on previous work and is novel, especially in its compiler support to create fine-grain speculative threads. I'm personally excited to see Intel is looking more seriously at this approach.

Some recent research has looked at using Transactional Memory hardware to implement Speculative Multithreading (disclosure - I do research in this area). Intel has looked at Transactional Memory (as has AMD) so I'm curious if they may be looking at some form of merger.
0 0 [Posted by: leporter  | Date: 05/21/10 10:49:23 AM]

About the need for the software part of this new tech; On existing CPU designs, maybe... but what if... sometime in the near future... Intel/AMD/? solve cache coherency problems in multi-core processors when addressing a single process memory scope, then apply a simple hardware switch that, in effect, frequency-interlaces multiple cores instruction-wise? I know there would be certain cache latency penalties, but branch predictive functions of the CPU should make up for that in most cases and put those parts to even better use.
0 0 [Posted by: MyK  | Date: 05/21/10 06:27:50 AM]

