What is a CPU?
Central Processing Unit
CPU is the abbreviation of "Central Processing Unit". A CPU generally consists of an arithmetic-logic unit, a control unit, and a storage unit. The arithmetic-logic and control units include registers that temporarily hold data while the CPU processes it. In practice, when buying a CPU we do not need to know its internal structure; knowing its performance is enough.
The main performance indicators of the CPU are:
1. Main frequency
The main frequency, also called the clock frequency and measured in MHz, indicates the CPU's computing speed. CPU main frequency = base frequency (external clock) × multiplier. Many people think the main frequency alone determines how fast a CPU runs; this is not only one-sided, but for servers the notion is especially misleading. So far there is no fixed formula relating the main frequency to actual computing speed, and even the two major processor manufacturers, Intel and AMD, dispute the point. Judging from the development of Intel's product line, the company clearly emphasizes raising its own clock frequencies; yet comparisons have shown, for example, that a 1 GHz Transmeta processor can be roughly as efficient as a 2 GHz Intel processor.
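The clock formula above is simple enough to check numerically. Here is a minimal sketch in Python; the 200 MHz base frequency and ×14 multiplier are hypothetical figures chosen only for illustration:

```python
def core_clock_mhz(base_mhz: float, multiplier: float) -> float:
    """CPU main frequency = base frequency (external clock) x multiplier."""
    return base_mhz * multiplier

# A hypothetical CPU with a 200 MHz base frequency and a x14 multiplier:
print(core_clock_mhz(200, 14))  # 2800 (MHz), i.e. 2.8 GHz
```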
Therefore, the CPU's main frequency is not directly tied to its actual computing power; the main frequency only indicates how fast the digital pulse signal inside the CPU oscillates. Intel's own products illustrate this: a 1 GHz Itanium chip can perform almost as fast as a 2.66 GHz Xeon/Opteron, and a 1.5 GHz Itanium 2 is about as fast as a 4 GHz Xeon/Opteron. The CPU's computing speed also depends on the performance of its pipeline and other subsystems.
Of course, the main frequency is related to actual computing speed; it is simply one aspect of CPU performance and does not represent the CPU's overall performance.
2. Base frequency (external clock)
The base frequency, also called the external clock and likewise measured in MHz, determines the running speed of the whole motherboard. Bluntly put, what we call overclocking on desktop machines means raising the CPU's base frequency (under normal circumstances the multiplier is locked). For server CPUs, however, overclocking is absolutely out of the question: as noted above, the CPU sets the speed at which the motherboard runs, and the two run synchronously. If a server CPU is overclocked by changing the base frequency, the components fall out of sync (many desktop motherboards do support asynchronous operation), and the whole server system becomes unstable.
In most current computer systems, the base frequency is also the synchronous running speed between memory and motherboard; in that sense the CPU's base frequency is connected directly to the memory, keeping the two in lockstep. The base frequency and the front-side bus (FSB) frequency are easily confused; the differences between the two are covered in the introduction to the front-side bus below.
3. Front-side bus (FSB) frequency
The front-side bus (FSB) frequency (i.e., the bus frequency) directly affects the speed of data exchange between the CPU and memory. There is a formula for it: data bandwidth = (bus frequency × bus width) / 8. The maximum data-transfer bandwidth depends on the width of the bus and its transfer frequency. For example, the current 64-bit Xeon Nocona has an 800 MHz front-side bus; by the formula, its maximum data-transfer bandwidth is 6.4 GB/s.
The difference between the base frequency and the FSB frequency: the FSB speed refers to the data-transfer speed, while the base frequency is the synchronous running speed between the CPU and the motherboard.
In other words, a 100 MHz base frequency means the digital pulse signal oscillates 100 million times per second, whereas a 100 MHz front-side bus refers to the amount of data the CPU can accept per second: 100 MHz × 64 bit ÷ 8 = 800 MB/s.
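The bandwidth arithmetic in the examples above can be reproduced with a one-line helper. This is just the formula from the text applied to the same figures, not vendor data:

```python
def peak_bandwidth_mb_s(bus_mhz: float, width_bits: int) -> float:
    """Peak bandwidth (MB/s) = bus frequency (MHz) x bus width (bits) / 8."""
    return bus_mhz * width_bits / 8

print(peak_bandwidth_mb_s(100, 64))  # 800.0 MB/s  (the 100 MHz example)
print(peak_bandwidth_mb_s(800, 64))  # 6400.0 MB/s, i.e. 6.4 GB/s (Xeon Nocona)
```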
In fact, the emergence of the HyperTransport architecture changed what the front-side bus (FSB) frequency actually means. We previously knew that the IA-32 architecture requires three important components: the Memory Controller Hub (MCH), the I/O Controller Hub, and the PCI Hub. Intel's typical chipsets, the Intel 7501 and Intel 7505, are tailor-made for dual Xeon processors; the MCH they contain provides the CPU with a 533 MHz front-side bus, and with DDR memory the front-side bus bandwidth can reach 4.3 GB/s. But as processor performance kept improving, this design created many problems for the system architecture. The HyperTransport architecture not only solved those problems but also improved bus bandwidth more effectively. AMD Opteron processors, for example, use the flexible HyperTransport I/O bus architecture and integrate the memory controller, so the processor exchanges data with memory directly rather than going through the system bus to the chipset. In that case, it is hard to say where the front-side bus (FSB) frequency even lies in an AMD Opteron processor.
4. CPU bits and word length
Bit: Digital circuits and computer technology use binary, whose only codes are "0" and "1"; each "0" or "1" is one bit in the CPU.
Word length: In computer technology, the number of binary digits a CPU can process at one time is called the word length. A CPU that can process 8-bit data at a time is therefore called an 8-bit CPU; likewise, a 32-bit CPU can process 32 bits of binary data at a time. The difference between byte and word length: since common English characters can be represented in 8 binary bits, 8 bits are usually called a byte. Word length is not fixed and differs between CPUs. An 8-bit CPU can only process one byte at a time, a 32-bit CPU can process 4 bytes at a time, and likewise a 64-bit CPU can process 8 bytes at a time.
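The byte/word-length relationship described above reduces to integer division by 8; a small sketch:

```python
def bytes_per_access(word_length_bits: int) -> int:
    """A w-bit CPU moves w // 8 bytes at a time (8 bits = 1 byte)."""
    return word_length_bits // 8

for bits in (8, 32, 64):
    print(f"{bits}-bit CPU -> {bytes_per_access(bits)} byte(s) per access")
```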
5. Multiplier coefficient
The multiplier is the ratio between the CPU's main frequency and its base frequency. At the same base frequency, a higher multiplier means a higher CPU frequency. In practice, however, a high multiplier at the same base frequency means little by itself, because the data-transfer speed between the CPU and the system is limited: a CPU that blindly chases a high main frequency through a high multiplier hits an obvious bottleneck — the maximum speed at which the CPU can get data from the system cannot keep up with the CPU's computing speed. Generally speaking, apart from engineering samples, Intel's CPUs have locked multipliers, whereas AMD's were previously unlocked.
6. Cache
Cache size is another important CPU indicator, and cache structure and size have a large impact on CPU speed. The cache inside the CPU runs at an extremely high frequency, generally the same frequency as the processor, and is far more efficient than system memory or the hard disk. In real workloads, the CPU often needs to read the same block of data repeatedly; a larger cache raises the hit rate for data already inside the CPU, avoiding trips to memory or disk and thereby improving system performance. However, because of chip area and cost constraints, the cache is quite small.
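Why a higher hit rate matters so much can be seen from the standard average-memory-access-time formula. The 1 ns and 100 ns latencies below are hypothetical round numbers, not figures for any specific CPU:

```python
def avg_access_time_ns(hit_rate: float, cache_ns: float, mem_ns: float) -> float:
    """Average access time = hit_rate * cache latency + miss rate * memory latency."""
    return hit_rate * cache_ns + (1 - hit_rate) * mem_ns

# Hypothetical latencies: 1 ns cache, 100 ns main memory.
print(avg_access_time_ns(0.90, 1, 100))  # about 10.9 ns
print(avg_access_time_ns(0.99, 1, 100))  # about 1.99 ns -- a big win from +9% hit rate
```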
L1 Cache (level-1 cache) is the CPU's first-level cache, divided into a data cache and an instruction cache. The capacity and structure of the built-in L1 cache have a considerable impact on CPU performance. However, cache memory is built from static RAM and has a complex structure, so with limited CPU die area the L1 cache cannot be made too large.
The L1 cache of a typical server CPU is usually 32 KB to 256 KB.
L2 Cache (level-2 cache) is the CPU's second-level cache, which may be on-chip or external. On-chip L2 cache runs at the same speed as the main frequency, while external L2 cache runs at only half the main frequency. L2 capacity also affects CPU performance, and the rule is the bigger the better. The largest current home CPUs carry 512 KB, while the L2 cache of server and workstation CPUs ranges from 256 KB to 1 MB, with some as high as 2 MB or 3 MB.
L3 Cache (level-3 cache) comes in two kinds: early versions were external, while current ones are built in. Its practical effect is to further reduce memory latency and improve processor performance on large data sets. Reducing memory latency and improving large-data computing capability also helps games. In the server field, adding L3 cache still brings a significant performance improvement: a configuration with a larger L3 cache uses physical memory more efficiently, so it can serve more data requests than a slower disk I/O subsystem could. Processors with larger L3 caches also provide more efficient file-system caching and shorter message and processor queue lengths.
In fact, the earliest L3 cache appeared on AMD's K6-III processor. Limited by the manufacturing process of the time, that L3 cache was not integrated into the chip but placed on the motherboard, where it could only run synchronously with the system bus frequency and so was not much different from main memory. Later, L3 cache was adopted by Intel's Itanium processor for the server market, then by the P4EE and Xeon MP. Intel also planned an Itanium 2 processor with 9 MB of L3 cache, and later a dual-core Itanium 2 with 24 MB of L3 cache.
But on the whole, L3 cache is not that important for improving processor performance. For example, the Xeon MP equipped with 1 MB of L3 cache is still no match for the Opteron, which shows that an increase in front-side bus bandwidth brings a more effective performance improvement than an increase in cache.
7. CPU extended instruction set
The CPU relies on instructions to compute and to control the system; each CPU is designed with an instruction system that matches its hardware circuits. Instruction-set strength is thus an important CPU indicator, and the instruction set is one of the most effective tools for improving microprocessor efficiency. Among today's mainstream architectures, instruction sets divide into complex instruction sets and reduced instruction sets. From the application angle, Intel's MMX (MultiMedia eXtensions), SSE, SSE2 (Streaming SIMD Extensions 2), SSE3, and AMD's 3DNow! are all CPU extended instruction sets, enhancing the CPU's multimedia, graphics, and Internet processing capabilities respectively. We usually call a CPU's extended instruction sets the "CPU instruction set". SSE3 is also the smallest of these: MMX contained 57 instructions, SSE 50, SSE2 144, and SSE3 13. SSE3 is currently also the most advanced; Intel's Prescott processors already support it, AMD planned to add SSE3 support to later dual-core processors, and Transmeta processors were also to support it.
8. CPU core and I/O working voltage
Starting from the 586 CPUs, the CPU's working voltage is divided into core voltage and I/O voltage. Usually the core voltage is less than or equal to the I/O voltage. The core voltage is determined by the CPU's production process: generally, the smaller the process, the lower the core operating voltage. I/O voltages are generally 1.6 to 5 V.
Low voltage mitigates the problems of excessive power consumption and heat.
9. Manufacturing process
The micron figure of a manufacturing process refers to the distance between circuits within the IC. The trend is towards higher density: higher-density IC designs mean that a chip of the same size can hold denser, more functionally complex circuitry. The main processes now are 180 nm, 130 nm, and 90 nm, and a 65 nm process has recently been announced.
10. Instruction set
(1) CISC instruction set
The CISC instruction set, also known as the complex instruction set (CISC, for Complex Instruction Set Computer), executes each program instruction serially in order, and each operation within an instruction is likewise executed serially. The advantage of sequential execution is simple control, but utilization of the computer's components is low and execution is slow. In practice, CISC means the x86-series (IA-32 architecture) CPUs produced by Intel and compatible CPUs from AMD, VIA, and others. Even the new X86-64 (also called AMD64) belongs to the CISC category.
To understand the instruction set, we must start with today's X86-architecture CPUs. The X86 instruction set was developed by Intel specifically for its first 16-bit CPU, the i8086. The CPU in the world's first PC, launched by IBM in 1981 — the i8088 (a simplified i8086) — also used X86 instructions, and an X87 chip was added to the machine to improve floating-point processing. Thereafter, the X86 and X87 instruction sets were collectively referred to as the X86 instruction set.
Although Intel, as CPU technology developed, successively produced the newer i80386 and i80486, then the PII Xeon, PIII Xeon, and Pentium III, and finally today's Pentium 4 series and Xeon (excluding Xeon Nocona), all of Intel's CPUs continue to use the X86 instruction set so that computers can keep running the applications developed in the past and that rich software inheritance is protected; its CPUs therefore still belong to the X86 series. Since Intel's X86 series and its compatible CPUs (such as the AMD Athlon MP) all use the X86 instruction set, today's huge lineup of X86-series and compatible CPUs was formed. x86 CPUs currently mainly include Intel's server CPUs and AMD's server CPUs.
(2) RISC instruction set
RISC is the abbreviation of "Reduced Instruction Set Computing". It was developed on the basis of the CISC instruction system: tests on CISC machines showed that the usage frequency of different instructions varies widely — the most commonly used are relatively simple instructions that account for only 20% of the instruction count but 80% of occurrences in programs. A complex instruction system inevitably increases microprocessor complexity, making processor development long and costly, and complex instructions require complex operations that slow the computer down. For these reasons, RISC CPUs were born in the 1980s. Compared with CISC CPUs, RISC CPUs not only streamline the instruction system but also adopt superscalar and super-pipelined structures, greatly increasing parallel processing capability. The RISC instruction set is the development direction of high-performance CPUs and stands opposed to the traditional CISC (complex instruction set). By comparison, RISC has a unified instruction format, fewer instruction types, and fewer addressing modes than a complex instruction set, and its processing speed is correspondingly much higher. At present, CPUs with this instruction system are common in mid-to-high-end servers; high-end servers in particular all use RISC CPUs.
The RISC instruction system is better suited to UNIX, the operating system of high-end servers; Linux is likewise a UNIX-like operating system. RISC CPUs are incompatible with Intel and AMD CPUs in both software and hardware.
At present, the CPUs that use RISC instructions in mid-to-high-end servers mainly include the following categories: PowerPC processors, SPARC processors, PA-RISC processors, MIPS processors, and Alpha processors.
(3) IA-64
There has been much debate about whether EPIC (Explicitly Parallel Instruction Computing) is the successor to the RISC and CISC systems; on its own terms, it looks more like an important step by Intel's processors towards the RISC system. Theoretically, a CPU designed around EPIC should, on the same host configuration, handle Windows application software much better than UNIX-based application software.
Intel's server CPU using EPIC technology is the Itanium (development codename Merced), a 64-bit processor and the first in the IA-64 series. Microsoft also developed an operating system codenamed Win64 to support it in software. After Intel adopted EPIC, the IA-64 architecture using the EPIC instruction set was born. IA-64 is a huge improvement over x86 in many respects: it breaks through many limitations of the traditional IA-32 architecture and achieves breakthrough improvements in data-processing capability, system stability, security, usability, and manageability.
The biggest flaw of IA-64 microprocessors is their lack of x86 compatibility. To let its IA-64 processors run software from both generations, Intel introduced an x86-to-IA-64 decoder on the Itanium and Itanium 2 so that x86 instructions could be translated into IA-64 instructions. This decoder is neither the most efficient decoder nor the best way to run x86 code (the best way is to run x86 code directly on an x86 processor), so the performance of the Itanium and Itanium 2 when running x86 applications is very poor. This became the fundamental reason for the emergence of X86-64.
(4) X86-64 (AMD64/EM64T)
Designed by AMD, this architecture can handle 64-bit integer operations and is compatible with the X86-32 architecture. It supports 64-bit logical addressing while providing an option to convert to 32-bit addressing; data-operation instructions default to 32-bit and 8-bit, with an option to convert to 64-bit and 16-bit; and it supports the general-purpose registers, widening the result of a 32-bit operation to a full 64 bits. Instructions thus come in "direct execution" and "converted execution" forms; the instruction field is 8 or 32 bits, which keeps fields from getting too long.
The creation of x86-64 (also called AMD64) did not come out of nowhere: the 32-bit address space of x86 processors is limited to 4 GB of memory, and IA-64 processors are not x86-compatible. AMD, taking full account of customer needs, enhanced the x86 instruction set so that it could also support 64-bit computing modes, hence the name x86-64. Technically, to perform 64-bit operations in the x86-64 architecture, AMD introduced the new general-purpose registers R8–R15 as an extension of the original x86 registers.
The original registers such as EAX and EBX were also widened from 32 bits to 64 bits, and eight new registers were added to the SSE unit to support SSE2. The larger register count brings a performance improvement. To support both 32-bit and 64-bit code and registers, the x86-64 architecture lets the processor work in two modes, Long Mode and Legacy Mode, with Long Mode divided into two sub-modes (64-bit mode and Compatibility mode). Among AMD's server processors, the standard was first introduced with the Opteron.
That same year, Intel also launched its 64-bit EM64T technology. Before being officially named EM64T it was called IA-32E, Intel's name for its 64-bit extension technology, used to distinguish it from the X86 instruction set. Intel's EM64T supports a 64-bit sub-mode similar to AMD's X86-64 technology: it uses 64-bit linear flat addressing, adds 8 new general-purpose registers (GPRs), and adds 8 registers supporting SSE instructions. Like AMD's, Intel's 64-bit technology is compatible with IA-32 and IA-32E; IA-32E is used only when running a 64-bit operating system and consists of two sub-modes, a 64-bit sub-mode and a 32-bit sub-mode, backward compatible with AMD64. Intel's EM64T will be fully compatible with AMD's X86-64 technology. The Nocona processor already incorporates some 64-bit technology, and Intel's Pentium 4E also supports 64-bit technology.
It should be said that both are 64-bit microprocessor architectures compatible with the x86 instruction set, but there are still some differences between EM64T and AMD64: for example, the NX bit present in AMD64 processors will not be provided in Intel's server processors.
11. Superpipeline and superscalar
Before explaining super-pipelining and superscalar design, let us first understand the pipeline, which Intel first used in the 486 chip. A pipeline works like an assembly line in industrial production. In the CPU, an instruction-processing pipeline is composed of five or six circuit units with different functions; an X86 instruction is split into five or six steps executed by these units in turn, so that one instruction can complete per CPU clock cycle, increasing the CPU's computing speed. Each integer pipeline of the classic Pentium has four stages — instruction prefetch, decode, execute, and write-back — while the floating-point pipeline has eight stages.
Superscalar design uses multiple built-in pipelines to execute multiple instructions at the same time; its essence is trading space for time. Super-pipelining completes one or more operations per machine cycle by subdividing the pipeline and raising the main frequency; its essence is trading time for space. The Pentium 4's pipeline, for example, is 20 stages long. The more finely staged the pipeline, the faster each step can complete, suiting CPUs with higher operating frequencies. However, an overly long pipeline also has side effects: a CPU with a higher frequency may well have a lower actual computing speed. This is the case with Intel's Pentium 4 — although its main frequency can exceed 1.4 GHz, its computing performance is far inferior to AMD's 1.2 GHz Athlon and even to the Pentium III.
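The idealized gain from pipelining is easy to model: the first instruction takes one cycle per stage, and every later instruction then completes once per cycle (real pipelines lose some of this to stalls and branch mispredictions). A sketch with hypothetical numbers:

```python
def pipelined_time_ns(n_instructions: int, stages: int, cycle_ns: float) -> float:
    """Ideal pipeline: (stages + n - 1) cycles to finish n instructions."""
    return (stages + n_instructions - 1) * cycle_ns

def unpipelined_time_ns(n_instructions: int, stages: int, cycle_ns: float) -> float:
    """Without pipelining, every instruction occupies all `stages` cycles."""
    return n_instructions * stages * cycle_ns

# Hypothetical: 5-stage pipeline, 1 ns cycle, 100 instructions.
print(pipelined_time_ns(100, 5, 1))    # 104 ns
print(unpipelined_time_ns(100, 5, 1))  # 500 ns
```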
12. Packaging form
CPU packaging is a protective measure in which the CPU chip or module is sealed in specific materials to prevent damage; CPUs must generally be packaged before they can be delivered to users. The packaging method depends on the CPU's installation form and device-integration design. Broadly, CPUs installed in Socket sockets use PGA (pin grid array) packaging, while CPUs installed in Slot x slots use SEC (Single Edge Contact) cartridge packaging.
There are also packaging technologies such as PLGA (Plastic Land Grid Array) and OLGA (Organic Land Grid Array). With market competition growing ever fiercer, the current development direction of CPU packaging technology is mainly cost saving.
13. Multithreading
Simultaneous multithreading is abbreviated SMT. SMT replicates the architectural state on the processor, allowing multiple threads on the same processor to execute simultaneously and fully share its execution resources. It makes the most of wide-issue, out-of-order superscalar processing, improves the utilization of the processor's computing units, and mitigates memory-access delays caused by data dependencies or cache misses. When multiple threads are not available, an SMT processor behaves almost the same as a traditional wide-issue superscalar processor. The most attractive thing about SMT is that it requires only a small change to the processor-core design and can significantly improve performance at almost no extra cost. Multithreading technology keeps more data ready for the high-speed computing core, reducing its idle time — undoubtedly very attractive even for low-end desktop systems. Starting from the 3.06 GHz Pentium 4, Intel's processors support SMT technology (Intel's implementation is called Hyper-Threading).
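Operating-system threads are not the same thing as hardware SMT, but the motivation — overlapping one thread's stalls with another thread's work — can be illustrated with a toy Python sketch in which `time.sleep` stands in for a long stall:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(n: int) -> int:
    time.sleep(0.2)  # stand-in for a long stall (e.g. waiting on memory or I/O)
    return n * n

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(task, [3, 4]))
elapsed = time.perf_counter() - start

print(results)         # [9, 16]
print(elapsed < 0.35)  # True: the two 0.2 s stalls overlapped instead of adding up
```

Run serially, the two tasks would take about 0.4 s; overlapped, the wall time stays near 0.2 s — the same utilization argument the text makes for SMT.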
14. Multi-core
Multi-core also refers to chip multiprocessors (CMP). CMP was proposed by Stanford University; the idea is to integrate the SMP (symmetric multiprocessing) of large-scale parallel processors into a single chip, with each processor executing different processes in parallel. Compared with CMP, the SMT processor structure is more flexible. However, once semiconductor processes reached 0.18 micron, wire delay exceeded gate delay, which requires microprocessor designs to be partitioned into many basic unit structures of smaller scale and better locality. Since a CMP structure is already divided into multiple processor cores, each core is relatively simple and easier to design optimally, so CMP has better development prospects. Currently, IBM's Power 4 chip and Sun's MAJC5200 chip both use the CMP structure. Multi-core processors can share cache within the processor, improving cache utilization while simplifying the design of multiprocessor systems.
In the second half of 2005, new processors from both Intel and AMD were also to adopt the CMP structure. The new Itanium processor, codenamed Montecito, uses a dual-core design with at least 18 MB of on-chip cache and is manufactured on a 90 nm process; each of its cores has independent L1, L2, and L3 caches, and the chip contains approximately 1.7 billion transistors. Its design is certainly a challenge to today's chip industry.
15. SMP
SMP (Symmetric Multi-Processing) refers to a symmetric multiprocessing structure: a group of processors (multiple CPUs) assembled in one computer, sharing the memory subsystem and bus structure between CPUs. With the support of this technology, a server system can run multiple processors at the same time and share memory and other host resources. The dual Xeon — what we call two-way — is the most common type of symmetric processor system (Xeon MP can support up to four-way, AMD Opteron can support 1- to 8-way), and a few systems are 16-way. Generally speaking, though, SMP machines scale poorly; it is hard to go beyond 100 processors, and conventional systems have 8 to 16, which is enough for most users. SMP is most common in high-performance server and workstation motherboard architectures; some UNIX servers can support systems with up to 256 CPUs.
The necessary conditions for building an SMP system are: hardware that supports SMP, including the motherboard and CPUs; a system platform that supports SMP; and application software that supports SMP.
For an SMP system to perform efficiently, the operating system must support SMP, as 32-bit systems such as WINNT, LINUX, and UNIX do — that is, it must be capable of multitasking and multithreading. Multitasking means the operating system can have different CPUs complete different tasks at the same time; multithreading means it can have different CPUs complete the same task in parallel.
Building an SMP system places very high demands on the chosen CPUs. First, each CPU must have a built-in APIC (Advanced Programmable Interrupt Controller) unit — the core of the Intel multiprocessing specification is the use of APICs. Second, the CPUs must be the same product model, with the same type of core and exactly the same operating frequency. Finally, the product serial numbers should be kept as close as possible, because when CPUs from two production batches run as a dual-processor pair, one CPU may end up overloaded while the other is barely loaded, preventing maximum performance; worse, it may cause crashes.
16. NUMA technology
NUMA is non-uniform-memory-access distributed shared-memory technology: a system composed of several independent nodes connected through a high-speed dedicated network, where each node can be a single CPU or an SMP system. In NUMA there are multiple solutions for cache consistency, which require support from the operating system and special software. Figure 2 shows an example of Sequent's NUMA system: three SMP modules connected by a high-speed dedicated network form a node, and each node can have 12 CPUs. A system like Sequent's can reach 64 or even 256 CPUs. Clearly, this builds on SMP and then expands with NUMA technology — a combination of the two.
17. Out-of-order execution technology
Out-of-order execution means the CPU is allowed to dispatch multiple instructions to the corresponding circuit units in an order other than the one specified by the program. After analyzing the status of each circuit unit and whether each instruction can be executed early, instructions that can execute early are sent immediately to the corresponding circuit units; during this period instructions are not executed in program order, and a reorder unit afterwards rearranges the results of the execution units back into instruction order. The purpose of out-of-order execution is to keep the CPU's internal circuits fully busy and thereby speed up the CPU's execution of programs. Branch technology: branch instructions must wait on results before operating. In general, unconditional branches simply execute in instruction order, while conditional branches must decide whether to continue in the original order based on the processed result.
18. Memory controller inside the CPU
Many applications have more complex read patterns (almost random, especially when cache hits are unpredictable) and do not use bandwidth efficiently; business-processing software is a typical example. Even with CPU features such as out-of-order execution, such applications are still limited by memory latency: the CPU must wait until the data required for an operation has been loaded before it can execute the instruction (whether the data comes from the CPU cache or from main memory). The memory latency of current low-end systems is about 120-150 ns, while CPU speeds have exceeded 3 GHz, so a single memory request can waste 200-300 CPU cycles. Even with a 99% cache hit rate, the CPU may spend 50% of its time waiting for memory requests to complete because of memory latency.
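The cycle figures quoted above follow from latency × clock rate (a 200-300-cycle stall at 120-150 ns corresponds to roughly a 2 GHz clock; at 3 GHz the cost is higher still). A quick sanity check:

```python
def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Cycles lost to one memory request: latency (ns) x clock rate (GHz = cycles/ns)."""
    return latency_ns * clock_ghz

print(stall_cycles(120, 2.0))  # 240.0 cycles
print(stall_cycles(150, 2.0))  # 300.0 cycles
print(stall_cycles(150, 3.0))  # 450.0 cycles at 3 GHz
```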
You can see that the latency of the Opteron's integrated memory controller is much lower than that of a chipset supporting dual-channel DDR memory controllers. Intel also plans to integrate the memory controller into the processor, which will make the northbridge chip less important. It changes the way the processor accesses main memory, helping to increase bandwidth, reduce memory latency, and improve processor performance.
Manufacturing process: at the time of this writing the CPU manufacturing process was 0.35 micron, the latest PII reached 0.25 micron, and future CPU manufacturing processes were expected to reach 0.18 micron.
CPU manufacturers
1. Intel Corporation
Intel is the big brother in CPU production, holding more than 80% of the market share. The CPUs Intel produces have become the de facto x86 CPU technical specification and standard. Its latest PII became the CPU of choice.
2. AMD Company
Several companies currently make CPU products. Besides Intel, the strongest challenger is AMD; its latest K6 and K6-2 offer a very good price/performance ratio, and the K6-2 in particular uses 3DNow! technology, giving it very good 3D performance.
3. IBM and Cyrix
After Cyrix merged with US National Semiconductor, it finally had its own chip production line, and its product range will become increasingly complete. The current MII also performs well, and its price in particular is very low.
4. IDT Company
IDT is a rising star among processor manufacturers, but its products are not yet mature.
5. VIA Corporation
VIA is a motherboard-chipset manufacturer from Taiwan. It acquired the CPU departments of the aforementioned Cyrix and IDT and launched its own CPU.
6. Domestic GodSon
GodSon (Loongson), nicknamed "Gousheng", is a domestic general-purpose processor with independent intellectual property rights. It currently has two generations of products, which only catch up with Intel's Pentium II era.