A Tip A Day [:: ATAD ::]

a fortune, two cents a day

ATAD #15 – Beyond the CPU clock speed


To carry out a task, the computer splits each instruction into a series of independent steps. Driven by a clock, the processor advances each instruction one step through the pipeline whenever the next clock signal arrives. So, naturally, the faster the clock speed, the sooner the instructions waiting in the pipeline are completed. A generic pipeline performs four independent steps, one per clock cycle:

  1. Fetch: read the instruction from memory
  2. Decode: interpret the instruction and fetch its register operands
  3. Execute: perform the operation, which might involve memory access
  4. Write Back: store the result
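The payoff of those four steps can be sketched in a few lines. This is a toy cycle-count model (the stage names follow the list above; no stalls are modeled): with the pipeline full, a new instruction finishes every cycle rather than every four cycles.

```python
# Toy model of a 4-stage pipeline: with one instruction entering per
# cycle, N instructions finish in N + stages - 1 cycles, versus
# N * stages if each instruction had the machine to itself.
STAGES = ["Fetch", "Decode", "Execute", "WriteBack"]

def pipeline_cycles(num_instructions, stages=len(STAGES)):
    """Cycles to retire all instructions, assuming no stalls."""
    if num_instructions == 0:
        return 0
    return num_instructions + stages - 1

def unpipelined_cycles(num_instructions, stages=len(STAGES)):
    """Each instruction occupies the whole machine for all stages."""
    return num_instructions * stages

print(pipeline_cycles(10))     # 13 cycles
print(unpipelined_cycles(10))  # 40 cycles
```

As the instruction count grows, the pipelined machine approaches one instruction completed per clock, which is exactly why clock speed matters so much.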

(Note: newer processors are built to perform more than one of these steps in the pipeline during a single clock cycle. E.g. the upcoming Intel Nehalem processor can decode up to four instructions at a time.)

Processing instructions requires memory access. A CPU cache is employed here: a small but very fast memory that can be accessed in just a few cycles. Modern processors adopt caches of different sizes at various levels of the memory hierarchy. An efficient cache design is also pivotal in improving processing speed and clock frequency.
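The idea can be made concrete with a minimal direct-mapped cache model. The sizes and cycle costs below are illustrative assumptions, not figures from any particular processor: each address maps to exactly one cache line, a hit costs a few cycles, and a miss costs a trip to main memory.

```python
# Minimal direct-mapped cache model (sizes and latencies are
# illustrative): an address maps to exactly one line; a matching
# tag is a hit, anything else is a miss that refills the line.
LINE_SIZE = 64          # bytes per cache line
NUM_LINES = 256         # 256 lines * 64 B = 16 KiB cache
HIT_CYCLES, MISS_CYCLES = 3, 100

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES

    def access(self, address):
        """Return the cycle cost of one memory access."""
        line = (address // LINE_SIZE) % NUM_LINES
        tag = address // (LINE_SIZE * NUM_LINES)
        if self.tags[line] == tag:
            return HIT_CYCLES
        self.tags[line] = tag   # fill the line on a miss
        return MISS_CYCLES

cache = DirectMappedCache()
print(cache.access(0x1000))  # 100 (cold miss)
print(cache.access(0x1000))  # 3 (hit: same line, same tag)
```

The gap between 3 and 100 cycles is why cache design is pivotal: every avoided miss saves roughly the time of dozens of instructions.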

Instruction flow is not just sequential; it can branch out based on conditional evaluation. Sometimes a branch taken will require new instructions to be fetched, in which case the pipeline must be stalled or flushed. Modern microarchitectures have introduced techniques such as branch prediction and speculative execution to reduce such penalties.
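One classic scheme behind branch prediction is the 2-bit saturating counter, sketched here. Counter values 0-1 predict not-taken and 2-3 predict taken, so a single surprise (like a loop exit) does not flip a strongly-biased prediction.

```python
# Sketch of a 2-bit saturating-counter branch predictor: it takes
# two consecutive mispredictions to reverse a confident prediction.
class TwoBitPredictor:
    def __init__(self):
        self.counter = 2  # start in weakly-taken state

    def predict(self):
        return self.counter >= 2  # True means "predict taken"

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

# A loop branch: taken 8 times, exits once, then taken 8 more times.
p = TwoBitPredictor()
outcomes = [True] * 8 + [False] + [True] * 8
correct = sum(1 for taken in outcomes
              if (p.predict() == taken, p.update(taken))[0])
print(f"{correct}/{len(outcomes)} predicted correctly")  # 16/17
```

Only the single loop exit is mispredicted; the predictor stays biased "taken" and immediately resumes correct predictions, which is what keeps the pipeline from flushing on every loop iteration.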

Improvements in processor design and VLSI techniques, such as the superscalar processor, facilitate executing several instructions in parallel per cycle. The keys to superscalar execution are an instruction fetching unit that can fetch more than one instruction at a time from cache; instruction decoding logic that can decide when instructions are independent and can thus execute simultaneously; and sufficient execution units to process several instructions at one time.
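The "decide when instructions are independent" part of that decoding logic can be sketched as a simple register-dependence check. The instruction format here is invented for illustration: each instruction is a destination register plus a set of source registers.

```python
# Toy dual-issue check: two adjacent instructions may issue in the
# same cycle only if the younger one neither reads nor overwrites
# the register the older one writes.
def can_dual_issue(older, younger):
    """Each instruction is (dest_reg, set_of_src_regs)."""
    dest1, _ = older
    dest2, srcs2 = younger
    return dest1 not in srcs2 and dest1 != dest2

# r1 = r2 + r3 ; r4 = r5 + r6  -> independent, can pair up
print(can_dual_issue(("r1", {"r2", "r3"}), ("r4", {"r5", "r6"})))  # True
# r1 = r2 + r3 ; r4 = r1 + r6  -> r4 needs r1, must serialize
print(can_dual_issue(("r1", {"r2", "r3"}), ("r4", {"r1", "r6"})))  # False
```

Real decoders track many more hazards (memory, flags, structural limits), but this is the core question they answer every cycle.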

Increasing cache size allows more instructions to be buffered, which opens up the possibility of out-of-order execution: younger instructions execute while an older instruction waits on the cache, and the processor then re-orders the results to make it appear that everything happened in the programmed order.
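A much-simplified sketch of that issue logic, with an invented instruction format: an instruction may execute as soon as its source operands are ready, even if an older instruction is still stalled, for example on a cache miss.

```python
# Simplified out-of-order issue: scan for any instruction whose
# sources are ready and issue it, repeatedly, until nothing can go.
def ooo_issue_order(program, ready):
    """program: list of (name, dest, srcs); ready: available registers.
    Returns the order in which instructions can issue."""
    ready = set(ready)
    pending = list(program)
    order = []
    while pending:
        for instr in pending:
            name, dest, srcs = instr
            if set(srcs) <= ready:     # all operands available
                order.append(name)
                ready.add(dest)        # result becomes available
                pending.remove(instr)
                break
        else:
            break  # everything left is blocked (e.g. a cache miss)
    return order

program = [
    ("load", "r1", ["mem"]),  # oldest: stalled until memory responds
    ("add",  "r3", ["r2"]),   # independent: can run ahead of the load
    ("use",  "r4", ["r1"]),   # depends on the load's result
]
print(ooo_issue_order(program, ready=["r2"]))         # ['add']
print(ooo_issue_order(program, ready=["r2", "mem"]))  # all three
```

A real processor adds a reorder buffer on top of this so that results retire strictly in program order, preserving the illusion of sequential execution.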

Other well-known concepts are multiprocessing, where two or more CPUs are used in a single computer, and multithreading, which targets thread-level as well as instruction-level parallelism.
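Thread-level parallelism can be illustrated from the software side with Python's standard library: independent pieces of work are handed to a pool of worker threads instead of running strictly one after another. The `checksum` task here is a made-up example workload.

```python
# Thread-level parallelism sketch: independent chunks of work are
# distributed across a pool of worker threads.
from concurrent.futures import ThreadPoolExecutor

def checksum(chunk):
    """Toy per-chunk work, independent of every other chunk."""
    return sum(chunk) % 251

chunks = [list(range(i, i + 100)) for i in range(0, 1000, 100)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(checksum, chunks))

print(results == [checksum(c) for c in chunks])  # True
```

(Note that in CPython, threads mainly help I/O-bound work because of the global interpreter lock; the hardware techniques above are what extract parallelism from CPU-bound instruction streams.)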



Written by veed

August 11, 2008 at 7:25 pm

Posted in ATAD, computing, tech

