Tilera Corp. has announced the Tile-Gx72 processor, which couples 72 power-efficient cores with massive I/O and a quad-channel, high-performance DDR3 memory controller to drive the next generation of network, multimedia and cloud infrastructure.
Tilera’s customers are building the intelligent infrastructure for tomorrow’s hyper-connected mobile networks, and the amount of real-time data and video content is growing exponentially each year. Substantially more complex services are being offered, and the rapid pace of new feature additions demands a software-based model with familiar, standards-based programming tools. Further, OEMs are insisting on dramatically higher levels of performance and scalability from their processor supplier.
The Tile-Gx72 leverages Tilera’s many innovations – including the iMesh 2-dimensional interconnect, DDC distributed coherent cache, and TileDirect direct-to-cache I/O – to deliver the highest compute-per-watt efficiency of any multicore processor in its class. Each of the 72 cores is a 64-bit, three-issue design that integrates L1 and L2 caches and supports virtual memory and multiple privilege levels. In total, the chip has 23MB of L3 ECC cache. The Tile-Gx72 provides eight 10Gb/s Ethernet ports, configurable as 32 1Gb/s ports, and six PCI Express ports with 24 lanes of SerDes, among other capabilities.
“The Tile-Gx72 rounds out our processor portfolio, complementing our 9-, 16- and 36-core TILE-Gx processors and offering a remarkable range of processing performance. Customers demand ever-increasing levels of performance and performance-per-watt to stay competitive, and they simultaneously want to reuse their software and hardware investments across their product portfolio. The TILE-Gx72 brings an unprecedented amount of compute to customer designs, and leverages thousands of open source libraries and the growing Linux ecosystem,” said Devesh Garg, president and chief executive officer of Tilera.
The Tile-Gx72 is suited for compute and I/O-intensive applications including:
- L2-7 networking and firewall appliances;
- High throughput SDN (Software Defined Network) computing;
- Network monitoring and analytics with 100% line-rate packet capture at 100 Gb/s;
- Layer 7 Deep Packet Inspection (DPI) at over 50 Gb/s;
- Compute offload NIC (Network Interface Card);
- Intrusion prevention and detection (IPS/IDS) at over 20 Gb/s;
- “Big Data” transaction processing at over 4 million transactions per second;
- Streaming video server/content delivery networking offering 50 Gb/s HTTP streaming;
- HD video conferencing with dozens of H.264 1080p encode/decode channels.
“We continue to be impressed with the scalability of the Tile-Gx family with its seamless software compatibility from 9 cores to 72 cores. The TILE-Gx72 processor brings the right mix of compute, low-latency I/O, memory bandwidth, and accelerators for the needs of our intelligent, integrated security appliances,” said Ofer Raz, head of platforms and architecture of Check Point Software Technologies.
Tags: Tilera, Gx, Gx72
Comments currently: 6
Discussion started: 02/24/13 06:54:17 PM
Latest comment: 02/28/13 09:24:42 AM
Sounds like a lot, but take a moment to think about it:
23MB cache / 72 cores = 0.32 MB per core.
4 DDR channels: 4x16 = 64GB max for a server???
My desktop rig i7-3770: 8MB cache /4 cores = 2MB per core.
4 DIMMS x 8GB = 32GB.
My whole system idles at 30-35W, full load 100W.
I don't know what ARM core it uses; if it is an A15, it easily sucks 1-2W at load. 72 cores x 1W = 72W at least.
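The back-of-envelope figures in this comment can be reproduced with a few lines of Python. This is only a sketch of the commenter's arithmetic: the 16 GB per DDR channel and the 8 GB DIMM sizes are the commenter's own assumptions, not vendor specifications, and the helper name `per_core_mb` is made up for illustration.

```python
def per_core_mb(total_cache_mb, cores):
    """Shared cache divided evenly across cores (a simplification)."""
    return total_cache_mb / cores

# Cache figures quoted in the article and in the comment
tile_gx72 = per_core_mb(23, 72)
i7_3770 = per_core_mb(8, 4)

# Memory ceilings as the commenter estimates them (assumed module sizes)
tile_max_gb = 4 * 16     # 4 channels x 16 GB, commenter's assumption
desktop_max_gb = 4 * 8   # 4 DIMMs x 8 GB

print(f"Tile-Gx72: {tile_gx72:.2f} MB/core, {tile_max_gb} GB")  # 0.32 MB/core, 64 GB
print(f"i7-3770:   {i7_3770:.2f} MB/core, {desktop_max_gb} GB")  # 2.00 MB/core, 32 GB
```

The division gives roughly 0.32 MB per core for the Tile-Gx72 against 2 MB per core for the desktop part, which is the gap the comment is pointing at.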
02/24/13 07:24:06 PM
You do know this kind of processor goes in network routers/switches and not servers, right?
02/25/13 05:32:43 PM
They're nominally communication/imaging processors, but they have a broader potential application space than that. Facebook tested the previous generation parts and saw a pretty nice power/performance advantage relative to Xeon for some of their workloads.
02/25/13 05:59:11 PM
For the applications this targets, DDR bandwidth is only part of the picture: mesh and L2 bandwidth matter at least as much, since they often use these cores to implement higher-level structures like pipelines and systolic arrays.
Also, who said anything about an "ARM core"?
Tilera uses their own 3-wide 64-bit VLIW core as opposed to an out-of-order superscalar like A9/A15/proAptive. That's actually a pretty nice tradeoff for their application space. Most of the workloads/algorithms they implement have reasonable static ILP, and they don't care about binary portability between core generations. They can therefore save a LOT of power/gates by pushing dependency analysis and instruction scheduling into the compiler.
This makes for an interesting contrast with LSI, who went the "many ARM core" route (16 A15s at 1.6 GHz) for a similar application space and chip size. Tilera achieves 216 Gops/sec peak, whereas LSI gets 77 Gops/sec. That's actually a pretty typical ratio for a simple VLIW vs an OoO core like A15. The trick is to use all of that bandwidth :-).
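For anyone who wants to check those peak numbers, here is a rough sketch of the arithmetic. The 3 ops/cycle for Tilera comes from the 3-wide core and the 1.6 GHz for LSI is the figure quoted above; the 1.0 GHz Tilera clock and the 3 ops/cycle for the A15 are assumptions, picked only because they reproduce the 216 and 77 figures. The function name `peak_gops` is made up for illustration.

```python
def peak_gops(cores, ops_per_cycle, clock_ghz):
    """Theoretical peak: every core issues its full width every cycle."""
    return cores * ops_per_cycle * clock_ghz

# Tilera: 72 cores, 3-wide VLIW, clock ASSUMED to be 1.0 GHz
tilera = peak_gops(cores=72, ops_per_cycle=3, clock_ghz=1.0)

# LSI: 16 A15 cores at 1.6 GHz, issue width ASSUMED to be 3 ops/cycle
lsi = peak_gops(cores=16, ops_per_cycle=3, clock_ghz=1.6)

ratio = tilera / lsi

print(f"Tilera peak: {tilera:.0f} Gops/sec")  # 216
print(f"LSI peak:    {lsi:.0f} Gops/sec")     # 77
print(f"Ratio:       {ratio:.1f}x")           # 2.8x
```

These are issue-slot ceilings, not measured throughput, so the ratio only holds if the compiler can actually keep all three VLIW slots busy.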
02/25/13 05:44:11 PM
Up to 23 MBytes of coherent cache is available and the high-end TILE-Gx devices can address up to 1 TB of DDR3 memory.
See their webpage here :
02/28/13 09:24:42 AM