Reflections: Parallelism via Multithreaded and Multicore CPUs

Readers who are members of the ACM or IEEE may be familiar with the magazine Computer. This segment is a critical look at an article from the March 2010 issue called “Parallelism via Multithreaded and Multicore CPUs”. I will offer my own analysis along with a general summary of what the article contains.

In short, the article is a “comparison between multicore and multithreaded CPUs currently on the market”. The attributes it focuses on are “design decisions, performance, power efficiency, and software concerns in relation to application and workload characteristics”. This area intrigues me because it is one I have always wanted more clarification on. What are the benefits of multicore and multithreaded processors, and how much do they matter, especially when it comes to choosing the right CPU for a new computer?

The article starts off with design decisions. It describes how multithreaded cores have multiple hardware threads to make switching between threads easier and more efficient. The most common approach is simultaneous multithreading (SMT), which Intel markets as Hyper-Threading. This technique issues instructions from only a subset of the threads on the chip in any given cycle. Interestingly, the article also mentions that no commercial CPU issues instructions from more than two threads per core per cycle. This tells me that the way a CPU is threaded makes little difference when deciding which CPU is better. The article further explains that the limits on threading come down to scalability. Beyond two threads a core passes a “saturation point”, after which issuing from additional threads yields little further benefit. However, there is a way around this dilemma: multiple cores, which is great news. These facts indicate that threading is standard, but how many cores you have does make a big difference in capability.
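For readers curious about their own machine, here is a minimal Python sketch that reports how many logical CPUs the operating system exposes. The helper name logical_cpus is my own, not something from the article. Note that os.cpu_count() counts logical CPUs, so on a chip with two hardware threads per core the number is typically twice the physical core count.

```python
import os


def logical_cpus():
    """Return the number of logical CPUs the OS reports.

    os.cpu_count() counts hardware threads, not physical cores, and may
    return None when the count cannot be determined, so fall back to 1.
    """
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Logical CPUs visible to the OS:", logical_cpus())
```

The standard library has no portable way to report physical cores separately, so telling cores apart from hardware threads still means checking the CPU's spec sheet.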

Another consideration is the cache. There are currently three types of cache: shared, private, and dynamic, the last being very rare. For this reason the article compares only the two major types, private and shared. A shared cache is shared between the cores, while a private cache belongs to one core alone and cannot be used by the others. For multicore programs a shared cache is better if the software threads need to share data. It also removes the need to copy data and is more efficient, because a core does not have to reach indirectly into another core's cache. The drawback is that a shared cache is less predictable. The software is less isolated, so one thread can end up using much more cache than it needs. It is also difficult to gauge how much service each thread receives, which leads to instability. A private cache is therefore more predictable and gives more control over performance. These findings show some of the tricky decisions in choosing a CPU. It becomes a matter of deciding which trade-off you want to make based on the software you use. Personally, I would prefer the private cache, because stability is often more valuable than speed when it comes to managing memory.
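To make the isolation trade-off concrete, here is a toy Python sketch of my own. It is not from the article and does not model real hardware. It assumes a simple least-recently-used (LRU) replacement policy, made-up capacities (one shared cache of 8 lines versus two private caches of 4 lines each), and a made-up workload: a "small" thread that keeps reusing four addresses, and a "greedy" thread that streams through new addresses three times as fast.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that counts hits per owner (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # oldest entry first
        self.hits = {}

    def access(self, owner, address):
        key = (owner, address)
        if key in self.lines:
            self.lines.move_to_end(key)  # mark as most recently used
            self.hits[owner] = self.hits.get(owner, 0) + 1
        else:
            self.lines[key] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used


def small_thread_hits(shared, rounds=100):
    """Return how many of the small thread's accesses hit the cache."""
    if shared:
        cache = LRUCache(8)  # one cache used by both threads
        caches = {"small": cache, "greedy": cache}
    else:
        caches = {"small": LRUCache(4), "greedy": LRUCache(4)}

    next_greedy = 0
    for step in range(rounds):
        # The small thread cycles through the same four addresses.
        caches["small"].access("small", step % 4)
        # The greedy thread touches three brand-new addresses each round.
        for _ in range(3):
            caches["greedy"].access("greedy", next_greedy)
            next_greedy += 1
    return caches["small"].hits.get("small", 0)


if __name__ == "__main__":
    print("shared cache, small-thread hits: ", small_thread_hits(shared=True))
    print("private cache, small-thread hits:", small_thread_hits(shared=False))
```

Under these assumptions the small thread gets 96 hits out of 100 accesses with a private cache but 0 hits with the shared one, because the greedy thread keeps evicting its lines. The sketch shows only the isolation side of the trade-off; it says nothing about the case where threads genuinely share data, which is where the article says a shared cache helps.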

So when it comes to multicore processors, this article suggests that there is no clear-cut choice. It all depends on your software’s specific needs and design. Hardware is always built with certain software designs in mind, and CPUs are no exception. Just like in life, there is never one true right answer.