Ants and beetles have exoskeletons, and chips with 60 and 80 cores are going to need them as well.
Researchers at Intel are working on ways to mask the intricate functionality of massive multicore chips to make it easier for computer makers and software developers to adapt to them, said Jerry Bautista, co-director of Intel’s Tera-scale Computing Research Program.
These multicore chips, he added, will also likely contain both x86 processing cores, similar to the brains inside the vast majority of Intel’s server and PC chips today, as well as other types of cores. A 64-core chip, for instance, might contain 42 x86 cores, 18 accelerators and four embedded graphics cores.
Some labs and companies such as ClearSpeed Technology, Azul Systems and Riken have developed chips with large numbers of cores (ClearSpeed has one with 96), but those cores are capable of performing only certain types of operations.
The 80-core mystery
Ever since Intel showed off its 80-core prototype processor, people have asked, “Why 80 cores?”
There’s actually nothing magical about the number, Bautista and others have said. Intel wanted to make a chip that could perform 1 trillion floating-point operations per second, known as a teraflop. Eighty cores did the trick. The chip does not contain x86 cores, the kind inside Intel’s PC chips, but cores optimized for floating-point math.
Other sources at Intel pointed out that 80 cores also allowed the company to maximize the room inside the reticle, the mask used to direct light from a lithography machine onto a photoresist-coated silicon wafer. Light shining through the reticle creates a pattern on the wafer, and that pattern serves as a blueprint for the circuits of a chip. With more cores, Intel would have needed a larger reticle.
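The arithmetic behind the teraflop target is simple; a back-of-the-envelope sketch (the per-core rate here is an illustrative assumption, not a published Intel figure):

```python
# Back-of-the-envelope check: 80 floating-point cores reaching 1 teraflop.
# The per-core throughput is assumed for illustration.
cores = 80
flops_per_core = 12.5e9  # assumed ~12.5 GFLOPS per core

total_flops = cores * flops_per_core
print(total_flops)  # 1e12, i.e. one teraflop
```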
Last year, Intel showed off a prototype chip with 80 computing cores. While the semiconductor world took note of the achievement, practical questions immediately arose: Will the company come out with a multicore chip with x86 cores? (The prototype doesn’t have them.) Will these chips run existing software and operating systems? How do you solve data traffic, heat and latency problems?
Intel’s answer essentially is, yes, and we’re working on it.
One idea, proposed in a paper released this month at the Programming Language Design and Implementation Conference in San Diego, involves cloaking all of the cores in a heterogeneous multicore chip in a metaphorical exoskeleton so that all of the cores look like a series of conventional x86 cores, or even just one big core.
“It will look like a pool of resources that the run time will use as it sees fit,” Bautista said. “It is for ease of programming.”
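A toy sketch of that "pool of resources" idea, with the runtime rather than the programmer choosing the core type (the 42/18/4 mix echoes the hypothetical 64-core chip described earlier; the core names and dispatch rule are assumptions for illustration):

```python
# Sketch of the "exoskeleton": heterogeneous cores presented as one
# uniform pool. The runtime maps each task to a suitable core type;
# the programmer never sees the distinction.
CORES = (["x86"] * 42) + (["accelerator"] * 18) + (["graphics"] * 4)

def dispatch(task_kind):
    """Pick a core type for a task; fall back to x86 for anything unknown."""
    preferred = {"general": "x86",
                 "media": "graphics",
                 "math": "accelerator"}.get(task_kind, "x86")
    return preferred if preferred in CORES else "x86"

print(dispatch("media"))    # graphics
print(dispatch("general"))  # x86
```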
A paper at the International Symposium on Computer Architecture, also in San Diego, details a hardware scheduler that will split up computing jobs among various cores on a chip. With the scheduler, certain computing tasks can be completed in less time, Bautista noted. It also can prevent the emergence of “hot spots”: if a single processor core starts to get warm because it’s been performing nonstop, the scheduler can shift computing jobs to a neighbor.
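In toy form, a hot-spot-avoiding scheduler of the kind Bautista describes might look like this (the topology, temperatures and threshold are all illustrative assumptions, not details from the paper):

```python
# Sketch of thermal-aware scheduling: if the current core is running hot,
# migrate the next job to its coolest neighbor. All values are assumed.
THRESHOLD = 80.0  # degrees C, assumed trip point

def pick_core(temps, neighbors, current):
    """Return the core the next job should run on."""
    if temps[current] < THRESHOLD:
        return current  # current core is fine; stay put
    # current core is a hot spot: shift work to the coolest neighbor
    return min(neighbors[current], key=lambda c: temps[c])

temps = {0: 85.0, 1: 60.0, 2: 70.0}
neighbors = {0: [1, 2]}
print(pick_core(temps, neighbors, 0))  # 1 (the coolest neighbor)
```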
Intel is also tinkering with ways to let multicore chips share caches, pools of memory embedded in processors for rapid data access. Cores on many dual- and quad-core chips on the market today share caches, and at that scale it remains a somewhat manageable problem.
“When you get to eight and 16 cores, it can get pretty complicated,” Bautista said.
The technology would prioritize operations. Early indications show that improved cache management could improve overall chip performance by 10 percent to 20 percent, according to Intel.
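Intel doesn’t spell out the mechanism, but one simple way to prioritize a shared cache is to partition it in proportion to each core’s priority; a minimal sketch, with the cache size, core names and priorities all assumed:

```python
# Sketch of priority-based shared-cache partitioning: each core gets
# cache ways in proportion to its priority, with a floor of one way.
# All sizes and priorities are illustrative assumptions.
def partition_cache(total_ways, priorities):
    """Split cache ways among cores in proportion to priority."""
    total_priority = sum(priorities.values())
    return {core: max(1, round(total_ways * p / total_priority))
            for core, p in priorities.items()}

print(partition_cache(16, {"core0": 3, "core1": 1}))
# {'core0': 12, 'core1': 4} -- the high-priority core gets 3x the ways
```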
As with the exoskeleton approach for heterogeneous chips, programmers ideally won’t have to understand or deliberately accommodate the cache-sharing or hardware-scheduling technologies. These operations will largely be handled by the chip itself and obscured from view.
Heat is another issue that will need to be contained. Right now, I/O (input-output) systems need about 10 watts of power to shuttle data at 1 terabit per second. An Intel lab has developed a low-power I/O system that can transfer 5 gigabits per second at 14 milliwatts (less than 14 percent of the power used by current 5Gbps systems) and 15Gbps at 75 milliwatts, according to Intel. A paper outlining the issue was released at the VLSI Circuits Symposium in Japan this month.
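The per-bit energy implied by those figures is easy to work out:

```python
# Energy per bit implied by the figures above: 10 W at 1 Tb/s today,
# versus 14 mW at 5 Gb/s and 75 mW at 15 Gb/s for the lab prototype.
def picojoules_per_bit(watts, bits_per_second):
    return watts / bits_per_second * 1e12

print(picojoules_per_bit(10, 1e12))    # ~10 pJ/bit for today's I/O
print(picojoules_per_bit(0.014, 5e9))  # ~2.8 pJ/bit for the prototype
print(picojoules_per_bit(0.075, 15e9)) # ~5 pJ/bit at the higher rate
```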
Low-power I/O systems will be needed for core-to-core communication as well as chip-to-chip contacts.
“Without better power efficiency, this just won’t happen,” said Randy Mooney, an Intel fellow and director of I/O research.
Intel executives have said they would like to see massive multicore chips coming out in about five years. But a lot of work remains. Right now, for instance, Intel doesn’t even have a massive multicore chip based around x86 cores, a company spokeswoman said.
The massive multicore chips from the company will likely rely on technology called Through Silicon Vias (TSVs), other executives have said. TSVs connect external memory chips to processors through thousands of microscopic wires rather than one large connection on the side. This increases bandwidth.