Discussion

Discussion on Article:
Tilera Announces 72-Core Chip with 23MB Cache for Network Devices.

Started by: pwerwebna | Date 02/24/13 06:54:17 PM
Comments: 6 | Last Comment:  02/28/13 09:24:42 AM

1. 
Sounds like a lot, but take a moment to think it through:
23MB cache / 72 cores = 0.32MB per core.
4 DDR channels: 4 x 16GB = 64GB max for a server???

My desktop rig, an i7-3770: 8MB cache / 4 cores = 2MB per core.
4 DIMMs x 8GB = 32GB.
My whole system idles at 30-35W and draws 100W at full load.

I don't know what ARM core it uses. If it is A15, it easily sucks 1-2W at load. 72 cores x 1W = 72W at least.
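
For anyone who wants to redo the per-core cache arithmetic above, here is a quick Python sketch. It uses only the numbers quoted in this post (23MB over 72 cores, 8MB over 4 cores). The function name is made up for illustration, and it treats the cache as evenly split across cores, which is a simplification.

```python
def cache_per_core_mb(total_cache_mb, cores):
    """Average cache per core in MB, assuming an even split (a simplification)."""
    return total_cache_mb / cores


# Figures as quoted in the post above.
tilera_72core = cache_per_core_mb(23, 72)
i7_3770 = cache_per_core_mb(8, 4)

print(f"72-core Tilera part: {tilera_72core:.2f} MB per core")
print(f"i7-3770:             {i7_3770:.2f} MB per core")
print(f"Desktop advantage:   {i7_3770 / tilera_72core:.1f}x per core")
```

Averaging like this says nothing about how the cache is actually organized on either chip; it only reproduces the division.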
0 1 [Posted by: Tukee44  | Date: 02/24/13 07:24:06 PM]

 
You do know this kind of processor goes in network routers/switches and not servers, right?
1 0 [Posted by: BillionPa  | Date: 02/25/13 05:32:43 PM]
 
They're nominally communication/imaging processors, but they have a broader potential application space than that. Facebook tested the previous-generation parts and saw a pretty nice power/performance advantage relative to Xeon for some of their workloads.
0 0 [Posted by: patrickjchase  | Date: 02/25/13 05:59:11 PM]
 
For the applications this targets, DDR bandwidth is only part of the picture: mesh and L2 bandwidth matter at least as much, since they often use these cores to implement higher-level structures like pipelines and systolic arrays.

Also, who said anything about an "ARM core"?

Tilera uses their own 3-wide 64-bit VLIW core as opposed to an out-of-order superscalar like A9/A15/proAptiv. That's actually a pretty nice tradeoff for their application space. Most of the workloads/algorithms they implement have reasonable static ILP, and they don't care about binary portability between core generations. They can therefore save a LOT of power/gates by pushing dependency analysis and instruction scheduling into the compiler.

This makes for an interesting contrast with LSI, who went the "many ARM core" route (16 A15s at 1.6 GHz) for a similar application space and chip size. Tilera achieves 216 Gops/sec peak, whereas LSI gets 77 Gops/sec. That's actually a pretty typical ratio for a simple VLIW vs an OoO core like A15. The trick is to use all of that bandwidth :-).
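
The peak numbers above can be sanity-checked as cores x clock x ops per cycle. Here is a rough Python sketch that uses only the figures quoted in this post. The Tilera clock and the A15 ops-per-cycle figure are not stated here, so the code back-derives what the quoted peaks imply instead of assuming them. The function name is hypothetical.

```python
def peak_gops(cores, clock_ghz, ops_per_cycle):
    """Theoretical peak in Gops/sec: every core issues at full width every cycle."""
    return cores * clock_ghz * ops_per_cycle


# Quoted in the post: 72 cores, 3-wide VLIW, 216 Gops/sec peak.
TILERA_PEAK_GOPS = 216.0
# Quoted in the post: 16 A15 cores at 1.6 GHz, 77 Gops/sec peak.
LSI_PEAK_GOPS = 77.0

# Derived values (implied by the quoted peaks, not stated in the post).
tilera_implied_clock_ghz = TILERA_PEAK_GOPS / (72 * 3)
lsi_implied_ops_per_cycle = LSI_PEAK_GOPS / (16 * 1.6)
peak_ratio = TILERA_PEAK_GOPS / LSI_PEAK_GOPS

print(f"Implied Tilera clock:      {tilera_implied_clock_ghz:.2f} GHz")
print(f"Implied A15 ops per cycle: {lsi_implied_ops_per_cycle:.2f}")
print(f"Tilera / LSI peak ratio:   {peak_ratio:.1f}x")
```

These are peak figures only; as noted above, sustained throughput depends on actually keeping all of those issue slots busy.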

0 0 [Posted by: patrickjchase  | Date: 02/25/13 05:44:11 PM]

2. 
Up to 23 MBytes of coherent cache is available and the high-end TILE-Gx devices can address up to 1 TB of DDR3 memory.

See their webpage here :
http://www.tilera.com/pro...processors/TILE-Gx_Family
0 0 [Posted by: reif.mike@gmail.com  | Date: 02/28/13 09:24:42 AM]