
How is High Speed Digital IO designed and how is it implemented?

2022-07-19 10:01:34

Communication in the 5G and IoT era is characterized by high speed, low latency, and high bandwidth. Realizing 5G and IoT connectivity requires a large number of base stations, which in turn brings a large number of power, signal, and optical-fiber connectors and cables. As a result, high-speed digital interfaces are becoming increasingly common.


At present the most common are PCIe, SAS, and SATA. There are also the XAUI protocol for 10G Ethernet, and Thunderbolt, a universal high-speed interface jointly developed by Intel and Apple that incorporates DisplayPort. Today we will look at how high-speed digital IO itself is designed and implemented; for comparisons between specific interfaces, a web search for "PCIe, SAS and SATA: which will lead the storage interface" will turn up plenty of material.


High-speed digital IO is usually a serial interface, and it is precisely because it is serial that the speed can become so high: data on the chip's internal parallel bus is converted into a serial stream for transmission. The QSFP 40G mentioned above, for example, is implemented as 4 lanes of 10G each. But why must it be done this way? The SI (signal integrity) problems of high-speed digital signals on a PCB are complicated and hard to design around, so why can't the parallel data inside the chip simply be sent out directly? Let's find out.


This goes back to Moore's Law. As chip scale keeps growing, the number of pins on the package cannot keep up, so it is impossible to route every internal IO out of the package individually. Anyone familiar with packaging knows that each additional pin raises the packaging cost, and cost-effectiveness is a market requirement. A small pin count is also good for SI and PI: if dozens of parallel IOs toggle at the same time, the resulting SSO (simultaneous switching output) noise makes the power supply unstable and produces large EMI across the whole board.


When dozens of single-ended parallel IOs are replaced by one serial pair, the SSO problem is greatly reduced. High-speed serial interfaces are usually differential at both input and output, and because the two legs of a differential pair carry opposite currents, the current drawn from the driving IC's power supply stays stable. So the high-speed serial interface exists mainly to reduce cost. At the same time, however, it must solve a board-level synchronization problem: when IO speed is low, the delay introduced by board routing is negligible, but when IO speed is high, the routing delay can amount to several bit periods.


The figure shows two traces of the same length: the high-speed IO has toggled through several cycles in the time the low-speed IO has seen only a single rising edge.
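The comparison in the figure can be quantified with a quick calculation. The numbers below are illustrative assumptions (a 10-inch trace and a typical FR-4 propagation delay of about 170 ps/inch), not values from the original text:

```python
# How many bit periods (unit intervals, UI) a trace's flight time
# spans at different data rates. 170 ps/inch is an assumed typical
# FR-4 stripline propagation delay.
PROP_DELAY_PS_PER_INCH = 170

def delay_in_ui(trace_inches: float, bit_rate_gbps: float) -> float:
    """Trace flight time expressed in unit intervals (bit periods)."""
    delay_ps = trace_inches * PROP_DELAY_PS_PER_INCH
    ui_ps = 1000.0 / bit_rate_gbps  # one bit period in picoseconds
    return delay_ps / ui_ps

# A 10-inch trace at a slow 100 Mbps IO:
print(delay_in_ui(10, 0.1))   # ~0.17 UI: negligible
# The same trace at 10 Gbps:
print(delay_in_ui(10, 10.0))  # ~17 UI: many full bit periods
```

The same physical delay that is a rounding error at 100 Mbps is worth many complete bit periods at 10 Gbps, which is exactly why timing must be rethought.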

So the old system-level synchronization method no longer works. With the whole board synchronized to one shared clk, the distance from the clk source to each IC is different, so the signals cannot possibly arrive aligned, unless every trace is made exactly the same length, and a board designed under that constraint would probably never make it to market!


Another synchronization method is source-synchronous: the driving IC sends clk and data to the receiver together, and length matching keeps clock and data aligned. Our common DDR memory works this way. The trouble is that transmitting at Gbps speeds still imposes very strict length-matching requirements, which limits routing flexibility; that is why DDR3 and DDR4 chips add a training process for read/write leveling, leaving more margin for routing. In addition, when the receiver gets the clk sent by the driver, it must reconcile it with its own main chip clock. With 32-bit DDR data, one clk accompanies every 8 bits, so the receiver gets 4 different clks and must move each into the chip's own clock domain, leaving 5 different clock domains inside the chip. That makes the chip design more troublesome as well.


None of these solutions is good enough; a really good solution should be simple and effective. Since sending clk and data together is inflexible, could we send only the data and have the receiving end extract the clock information from the data itself? Then the data would not need to be aligned to a clk, and no clock-domain crossing of the clk would be required!


The answer is yes, and every part of a high-speed digital IO serves this purpose: recover data and clk reliably, and then move them onto the chip's internal parallel bus. The extra chip area these modules consume is nothing compared to the cost of dozens of additional package pins. You may feel the inventors of the high-speed serial interface were obsessive, using ingenuity to solve a problem money could have solved, which only makes it harder for ordinary engineers to keep up!



The transmitter must first have a parallel-to-serial structure, which can be implemented with a chain of D flip-flops: using the parallel clk as the enable, data from the parallel bus is loaded bit by bit into the flip-flop chain. Then the internal parallel clk is phase-shifted (for example by 90 degrees) to create four clocks at 0, 90, 180, and 270 degrees; combined, these yield a clock at 4 times the original frequency. This faster clock shifts the data out of the D flip-flops for transmission! The receiving end works in a similar way, but in reverse, performing serial-to-parallel conversion. Clock recovery from the data is done with a PLL, and in fact it is not only the clock that is recovered here: the control signals must also be separated from the data!
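The parallel-to-serial idea can be sketched behaviorally (this is a minimal Python model of the data flow, not RTL; the 4x "bit clock" from the four phases is represented simply by the inner loop running four times per parallel word):

```python
# Behavioral sketch of a 4:1 serializer/deserializer pair.
# Each parallel-clock cycle loads one 4-bit word; the 4x bit clock
# then shifts it out one bit at a time, MSB first.

def serialize(words, width=4):
    """Shift each parallel word out MSB-first at the 4x bit clock."""
    bits = []
    for word in words:                    # one parallel-clock cycle
        for i in reversed(range(width)):  # four bit-clock cycles
            bits.append((word >> i) & 1)
    return bits

def deserialize(bits, width=4):
    """Receiver side: shift bits back in and regroup into words."""
    words = []
    for k in range(0, len(bits), width):
        word = 0
        for b in bits[k:k + width]:
            word = (word << 1) | b
        words.append(word)
    return words

tx = [0b1010, 0b0011]
line = serialize(tx)          # [1,0,1,0, 0,0,1,1] on the wire
assert deserialize(line) == tx  # round-trip through the serial link
```

The real receiver, of course, does not get the word boundaries for free; recovering them is what the protocol flags discussed below are for.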

The PLL locks onto the transition rate of the signal arriving on the Rx pin, and the recovered clock is then used to sample Rx into the chip's internal Rx data!



Now that the data and the clock are both in place, it is time for the so-called protocols to come into play. A protocol defines certain combinations of data bits as flags; when the receiver sees these flags, it can judge whether what follows is data, control, or idle!
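As an illustrative sketch of flag hunting, the snippet below scans an unaligned bit stream for one well-known flag: the 8b/10b K28.5 comma code group (0011111010 in its negative-disparity form), whose bit pattern cannot be produced by adjacent ordinary data symbols, so finding it fixes the 10-bit symbol alignment. The stream contents are made-up example data:

```python
# Receiver-side comma alignment sketch: search an unaligned bit
# stream for the K28.5 comma; its position marks a symbol boundary.

K28_5_RDM = [0, 0, 1, 1, 1, 1, 1, 0, 1, 0]  # K28.5, running disparity -

def find_alignment(bits):
    """Return the index where a K28.5 comma starts, or -1 if absent."""
    n = len(K28_5_RDM)
    for i in range(len(bits) - n + 1):
        if bits[i:i + n] == K28_5_RDM:
            return i
    return -1

# Example stream with 3 junk bits in front of the comma:
stream = [1, 0, 1] + K28_5_RDM + [0, 1, 0, 1]
print(find_alignment(stream))  # 3: 10-bit symbol boundaries start here
```

Once aligned, the receiver decodes each 10-bit group and checks whether it is a control symbol (like K28.5) or a data symbol.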


One more point: why does idle need to be distinguished at all? Isn't it enough to simply send a string of 0s? No. For the clock to be recovered reliably, the transmitted data must avoid long runs of 0s or 1s. This is where our most common 8b/10b encoding, 64b/66b scrambling, and so on come in, and they too are defined in the protocol.
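The 64b/66b scrambler can be sketched in a few lines. This is a hedged behavioral model of the self-synchronizing scrambler polynomial x^58 + x^39 + 1 (the one used by 10G Ethernet), not the exact bit ordering of any particular implementation; the seed values below are arbitrary examples:

```python
# Self-synchronizing scrambler/descrambler, polynomial x^58 + x^39 + 1.
# 'state' holds the previous 58 line bits (bit 0 = most recent), so the
# taps are bit indices 38 and 57.

MASK = (1 << 58) - 1

def scramble(bits, state=0):
    """Each output bit is input XOR taps of the previous OUTPUT bits."""
    out = []
    for b in bits:
        s = b ^ ((state >> 38) & 1) ^ ((state >> 57) & 1)
        out.append(s)
        state = ((state << 1) | s) & MASK
    return out

def descramble(bits, state=0):
    """Same taps, but the state is fed from the RECEIVED bits, so the
    descrambler self-synchronizes after 58 good bits regardless of its
    starting state."""
    out = []
    for b in bits:
        out.append(b ^ ((state >> 38) & 1) ^ ((state >> 57) & 1))
        state = ((state << 1) | b) & MASK
    return out

payload = [0, 1] * 100                      # example data
line = scramble(payload, state=12345)       # arbitrary example seed
rx = descramble(line, state=0)              # receiver starts WRONG
assert rx[58:] == payload[58:]              # yet recovers after 58 bits
```

The self-synchronizing property is the design point: because the descrambler's state is rebuilt from the received bits themselves, no scrambler state ever needs to be communicated, which fits a link whose whole philosophy is "send only the data."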


