Tuesday, 20 March 2018

Project Everest - Adaptive Compute Acceleration Platform

What would you do with a 50 billion gate budget?

Mt Everest from the Tibetan plateau - Wikipedia
I meander about this occasionally in my dreams. Xilinx has gone a step further and spent about $1B, four to five years, and the efforts of around fifteen hundred engineers to come up with Project Everest.

Xilinx is calling the result the adaptive compute acceleration platform (ACAP). They expect to tape out the first version in chip form, code-named "Everest", in 7nm at TSMC later this year [Xilinx Press Release].

The idea is to use a network on chip (NOC) as a fabric assist to glue together some ARM-based CPUs; "programmable engines"; HBM; RF ADCs / DACs; 33G, 58G, and 112G SerDes; and programmable IO. It also suggests that Xilinx is going to swing back to using a mix of monolithic and interposer-based chip delivery after eschewing the interposer approach for their latest monolithic UltraScale+ designs.

The goal is to target the kind of workloads GPUs and AI chips, such as Google's TPUs, are now doing at a reasonable W/Op rate. In addition, they want to target more application-specific chips, such as RF baseband ASICs, with this flexible, or adaptable, chip architecture that can seemingly do it all.
It is a very interesting approach. FPGAs have been adding more complex, dense application-specific logic blocks for a while to add connectivity or compute opportunities that would otherwise consume too many FPGA resources or be too slow in their generic FPGA-routed form. We've seen that in the growing use of PCIe, memory interface, embedded processor, DSP, BRAM, and network-specific blocks in modern FPGAs. Project Everest takes it another step, offering a bigger palette for your palate and adding the NOC as an alternative glue instead of just relying on FPGA routing resources.

Those familiar with Jan Gray's very cool GRVI Phalanx will not need to wonder long about the usefulness of such an approach. It is useful. 
Why it takes four or five years, 1500 engineers, and a billion dollars I don't know, but perhaps there is some confounding of various project costs in there. It reminds me of Wave Computing's CGRA to the extent that they are adding a lot of coarseness to the FPGA fabric, but what that really means is a bit lost in the vagueness of what we know so far, especially with respect to the "programmable engines." We don't really know what they are. The effort is certainly a bit more extreme than the light budget and 1-2 man-years that went into the Epiphany-V, a 1024 core 64-bit processor:
Epiphany-V development (source)
One of the keys, perhaps the KEY, to the usefulness of this project will be the usability of the engine. Xilinx understands this and is suggesting you'll be able to access it with your C++ or Python. How flexible that will be remains to be seen. Targeting ML toolkits, BLAS, etc. will go a long way to making such a beast an indispensable bit of kit. The following development stack is encouraging:


Xilinx suggests it will be easier to use than CUDA, with Python access, TensorFlow, etc, so large numbers of software developers may ignore the custom hardware engine underneath. A nice and ambitious goal for sure.

Having blocks for CCIX connectivity for building cache coherent interconnects for multiple chip solutions makes the mind meander into some fascinating territory. That is quite the kitchen sink.

The idea of 50B gates opens up this kind of dreamy architecture, as Victor Peng, Xilinx CEO, alludes to here [AnandTech interview]:
"It is fundamentally different than what it was a few years ago, for example, somebody asked the question could you have done an ACAP prior to 7nm. Well we could, it would be in certain respects maybe a little less powerful, more costly perhaps, but the kind of the coming together of all those things together at 7nm makes this just the right time for the company to take this more quantum leap."
The intriguing bit, pardon the pun, is going to be in understanding the so-called hardware-software programmable engines. They will not quite be programmable to the bit. Their details are undisclosed for now:
"It's a little bit hard to talk the new product without pre-announcing some of the features, but I talked about hardware software programmable engine, which won't be an engine that is customizable down to a single bit, but it will have notions of some granular data paths and memory and things like that, and it has some overlay of an instruction set, but while most people won't program it, it still has hardware programmability. We’re not just somebody else coming out with a VLI doping multi core architecture because then what is the difference between us and someone else right? There's always going to be that secret sauce."
Finding out the NOC capabilities will be interesting. Xilinx has licensed a third-party NOC, so it will not be groundbreaking in itself, but you'd expect the combination to be powerful.

Here are some performance claims in Xilinx's footnotes:
Xilinx is saying here that Everest will provide 20x the deep learning acceleration of a Virtex VU9P. Frankly, that is too low to dislodge the case for an Nvidia V100, Drive Pegasus, or Google TPUv2. We'll have to wait until we see some real benchmarks.

We can't get too excited as shipping and revenue will be next year even if tape-out and perhaps some engineering samples come out this year. Xilinx may still shoot themselves in the foot as their high-end Virtex FPGAs retail for $10-50K. Selling a $50K accelerator will cause some pause. My paws will pause. Delivering a beast that can outperform an Nvidia V100's 120 TOPS for ML at a similar price will be the benchmark they will have to meet.

I like the approach but the specifics will matter. We will all have to wait for more meat on the bone, especially on what those "programmable engines" really are. Can they compete with thousands of Nvidia streaming compute units? I don't think Everest completely eclipses the idea of a NOC with thousands of small CPUs with custom co-processors attached, but it may make it harder to justify a start. Again, specific performance and price matter. Xilinx has been strong on delivering glorious high-end chips, just not high-end chips priced for the masses. Let's hope they give us something fun we can afford.

Happy chipping away,

--Matt.





Friday, 2 March 2018

Nasdaq sues IEX for patent infringement

Nasdaq is attempting to neuter the InvestorSexChange (IEX).

Yesterday Nasdaq and Nasdaq Technology AB decided to preempt the usual Ides of March bushwhacking with an earlier little march to the District Court in New Jersey to file Case 3:18-cv-03014 against IEX Group, Inc. and Investors Exchange LLC.

Let's have a meander around this.

It has been widely reported:
Nasdaq Press Release, "Nasdaq Files Patent Infringement Lawsuit to Protect Intellectual Property"

Reuters, "Nasdaq sues rival IEX Group over patent infringement" by John McCrank
Bloomberg, "Nasdaq Sues ‘Flash Boys’ Exchange IEX for Patent Infringement" by Annie Massa
WSJ, "Nasdaq Sues IEX Over Stock-Exchange Technology Patents" by Alexander Osipovich
FT, "Nasdaq sues IEX for allegedly infringing technology patents" by Nicole Bullock

Here is the complaint filed on behalf of Nasdaq by Critchley, Kinum & Denoia, LLC and Susman Godfrey LLP in all its insomnia-inducing eighty pages of glory.



Notably, Nasdaq is alleging willful infringement, which could attract up to treble damages if IEX is found to have infringed at the requested jury trial. If it gets to trial.

I'm not the biggest fan of IEX being a public exchange as their Dark Fader style does not promote the kind of public price discovery exchanges are meant to perform. I'm firm in my view that IEX should be placed back in the SEC's ATS box.

That said, on a quick glance, I'm not necessarily a fan of all of the seven "Patents-in-Suit" Nasdaq has opened the suit with. Some of the patents seem a little abstract or obvious to me. Some are quite specific and do hit IEX quite directly. Overall, it looks like a strong opening gambit from Nasdaq until discovery allows some refinement.

The seven patents in question are as follows, but you'd best just scan and skip if you wish your grey matter to not fuse and cry out in pain from the l33t patent-speak.

7,647,264, "Closing in an electronic market"

Abstract: A method for trading a security in an electronic market includes receiving closing orders and orders for the security traded in the electronic market, disseminating an order imbalance indicator indicative of predicted trading characteristics of the security at the close of trading, determining a closing price for the security based on the closing orders and orders, and executing at least some of the closing orders at the determined closing price.

"Closing orders are executed in a single transaction. The information included in the imbalance indicator improves transparency and price discovery. Disseminating the imbalance indicator gives market participants an opportunity to adjust their trading based on the imbalance indicator by adjusting the price and/or size of existing imbalance only orders, or by submitting additional imbalance only orders. The closing process improves liquidity, reduces risk and reduces costs for investors seeking to trade at the closing price."

7,895,112, "Order book process and method"

Abstract: A system for execution of transactions includes a main memory of a computer system storing an order book to match a portion of security interest in the order book to a received order for a security.
Primary claim

7,933,827, "Multi-parallel architecture and a method of using the same"

Abstract: Multiple securities processors each process attributable security interest messages generated by market participants. Each of these attributable security interest messages relates to a specific security chosen from a plurality of securities traded on the securities trading system, such that each individual security is assigned to one or more of the securities processors. An order routing system routes each attributable security interest message to one of the securities processors.


8,117,609, "System and method for optimizing changes of data sets"

Abstract: A system and method for generating an update data set to be sent to remote terminals. The update data set comprises operators describing differences between two data sets, so that a remote terminal is able to transform an old data set into a more recent data set. The system comprises a comparator for comparing data elements in the data sets, and a selector for selecting operators based on a change parameter stored in a memory.

8,244,622, "Order matching process and method"


Abstract: A trading process for trading securities in an electronic market includes a matching process to match a portion of a received order for a security against a security interest stored in an order book that resides in main memory of a computer system.

"According to an aspect of this invention, a trading process for trading securities in an electronic market includes a matching process to match a portion of a received order for a security against a security interest stored in an order book that resides in main memory of a computer system.

According to a further aspect of the invention, a method for trading securities in an electronic market includes matching a portion of a received order for a security against a security interest of stored in an order book that resides in main memory of a computer system.

According to a further aspect of the invention, a computer program product residing on a computer readable medium includes instructions for trading securities in an electronic market cause a computer to match a portion of a received order for a security against a security interest stored in an order book that resides in main memory of a computer system.

One or more of the following features may also be included.

The main memory may be random access memory. The main memory may be a cache. The received order may be validated. The marketability of the received order may be checked against a state of the electronic market prior to matching the portion of the received order. The security interest may be retrieved from the order book in main memory. The security interest may be updated in the order book in main memory. The security interest in the order book in main memory may be added to. The matching of the portion of the received order may be reported to an execution log file. The matching may occur in a securities processor.

One or more advantages can be provided from the above. By matching security orders with security interests stored in a random access memory based order book, the orders may be quickly executed due to the fast access time of the order book. Further, besides quickly executing an order that can be matched, an order may be quickly entered into the order book if the order can not be matched. Additionally, if an order is not marketable the order can be quickly returned to the user. By providing faster matching more security transactions may be matched over a period of time thereby reducing the potential backlog of security transactions."

8,280,797, "Closing in an electronic market"

Abstract: A method for trading a security in an electronic market includes receiving closing orders and orders for the security traded in the electronic market, disseminating an order imbalance indicator indicative of predicted trading characteristics of the security at the close of trading, determining a closing price for the security based on the closing orders and orders, and executing at least some of the closing orders at the determined closing price.


8,386,362, "Information distribution process and method"


Abstract: A process for distributing information in an electronic market includes an insertion process to insert, in a file that resides in a storage medium, information representing an activity relating to a security interest stored in an order book that resides in main memory and is accessible by a matching process.

Further claims
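To make the '895 and '244 abstracts a little more concrete, here is a minimal sketch of matching a received order against interest resting in an order book held in main memory. It is an illustration only, not Nasdaq's or IEX's implementation: the `Order` and `OrderBook` names, the integer prices, and the price-time priority rule are all assumptions of mine.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    side: str   # "buy" or "sell"
    price: int  # limit price in cents (integers avoid float rounding)
    qty: int


class OrderBook:
    """Toy in-memory book for one symbol with price-time priority (assumed rule)."""

    def __init__(self):
        # price -> FIFO queue of resting orders at that price
        self.bids = {}
        self.asks = {}

    def submit(self, order):
        """Match the incoming order, rest any remainder, and return the fills.

        Each fill is (resting_order_id, incoming_order_id, price, qty).
        """
        if order.side == "buy":
            own, opposite, best_price = self.bids, self.asks, min
        else:
            own, opposite, best_price = self.asks, self.bids, max

        fills = []
        while order.qty > 0 and opposite:
            px = best_price(opposite)
            crosses = px <= order.price if order.side == "buy" else px >= order.price
            if not crosses:
                break
            queue = opposite[px]
            resting = queue[0]
            traded = min(order.qty, resting.qty)
            fills.append((resting.order_id, order.order_id, px, traded))
            order.qty -= traded
            resting.qty -= traded
            if resting.qty == 0:
                queue.popleft()
            if not queue:
                del opposite[px]

        # Whatever could not be matched rests in the book.
        if order.qty > 0:
            own.setdefault(order.price, deque()).append(order)
        return fills
```

With 100 offered at 1001 and 50 offered at 1000, a buy of 120 at 1001 takes the 50 at 1000 first and then 70 of the 100 at 1001, leaving 30 resting. It is hard to imagine a modern matching engine that does not do something of this general shape, which is part of why these claims read as broad.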


Nasdaq is claiming at least four former Nasdaq employees went to work at IEX and some or all were likely to have knowledge of the patent material. Nasdaq points to quite explicit language where IEX has acknowledged in their own material that they have carefully studied Nasdaq material so that they may perform similarly, especially with respect to the closing auction indication.
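For those who have not worked with a closing auction, the '264 and '797 abstracts boil down to: collect closing orders, publish an imbalance indicator, determine a single closing price, and execute at that price. A rough sketch follows. The function name, the tie-break rules, and the sign convention for the imbalance are my own assumptions and are not taken from the patents or from either exchange's rule book.

```python
def closing_cross(buys, sells):
    """Pick a single closing price from limit-on-close style orders.

    buys and sells are lists of (limit_price, qty) tuples.
    Returns (price, matched_qty, imbalance), or None if there are no orders.
    The price maximises matched volume; ties go to the smallest absolute
    imbalance and then to the lowest price (assumed tie-breaks).
    imbalance > 0 means excess buy interest at that price.
    """
    prices = sorted({p for p, _ in buys} | {p for p, _ in sells})
    best = None
    for px in prices:
        demand = sum(q for p, q in buys if p >= px)   # buyers willing to pay px
        supply = sum(q for p, q in sells if p <= px)  # sellers willing to accept px
        matched = min(demand, supply)
        imbalance = demand - supply
        key = (-matched, abs(imbalance), px)
        if best is None or key < best[0]:
            best = (key, (px, matched, imbalance))
    return best[1] if best else None


# Example book: 600 shares bid across three prices, 550 offered across three.
buys = [(1002, 300), (1001, 200), (1000, 100)]
sells = [(999, 150), (1000, 150), (1001, 250)]
print(closing_cross(buys, sells))  # (1001, 500, -50)
```

Running the same calculation on the book as it stands before the close, and publishing the indicative price and imbalance, is roughly what the imbalance indicator is about. Participants can then adjust or add orders before the final cross.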

The case will not be easily dismissed.

It's worth reiterating that discovery will be interesting, as Nasdaq will be hunting for further similarities and possible patent infringement additions to the suit. No source code copying has been alleged and you'd expect there not to be any, but you never know as stranger things have happened. Non-disclosure experts at forty paces will be expensive for IEX.

Some of the Nasdaq patents' claims seem overly broad, and one line of attack from IEX will be trying to get patents invalidated in parallel with the suit. Alice, the 2014 US Supreme Court decision in Alice Corp. v. CLS Bank International, may help as some seem a little too abstract to really hold water. Alice rails against the all too abstract. IEX will be busy. It certainly looks like the closing imbalance indication allegation will be a hard one for IEX to dodge. The employment of the ex-Nasdaq staff does not help.

I'm not sure all the other exchanges around the world using a non-Nasdaq platform will be cheering as many exchanges' matching engines use similar techniques. For example, there are not many exchanges that don't use the '827 patent, "Multi-parallel architecture and a method of using the same" technique for separating symbols over separate hardware matching engines. It's a pretty natural technique to use and it's not the kind of patent I like to see granted.
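To show just how natural the symbol-partitioning idea is, here is about the smallest sketch I can think of. The names `engine_for` and `route`, the order dictionaries, and the choice of a CRC32 hash are all hypothetical; a static symbol-to-engine table would do the same job.

```python
import zlib


def engine_for(symbol, num_engines):
    """Map a symbol to a matching-engine index using a stable hash."""
    return zlib.crc32(symbol.encode("ascii")) % num_engines


def route(orders, num_engines):
    """Split a stream of orders into one queue per matching engine.

    Every order for a given symbol lands on the same engine, in arrival
    order, so each engine can match its own symbols independently.
    """
    shards = [[] for _ in range(num_engines)]
    for order in orders:
        shards[engine_for(order["symbol"], num_engines)].append(order)
    return shards


orders = [
    {"symbol": "AAPL", "qty": 100},
    {"symbol": "MSFT", "qty": 200},
    {"symbol": "AAPL", "qty": 300},
]
for index, shard in enumerate(route(orders, 4)):
    print(index, shard)
```

Because no order for one symbol ever needs to see the book of another, the engines do not have to coordinate, which is the whole appeal of the approach.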

It is certainly going to be quite the distraction for IEX but you can have little sympathy for the "plucky Flash Boys exchange." IEX has long misled the public about their fairness and efficacy. IEX has used improper and intemperate language in the market structure debate to both glorify itself and tear down its critics. One should not forget the shameful attacks IEX has launched on some commentators. Such misbehaviour from IEX may not help them. You might imagine IEX's credibility could be called into question in court, especially with respect to a jury trial, when the court's attention is drawn to IEX's pattern of promoting incorrect information and misleading the public and their customers. Their BS may come back to bite them. This may even be an essential line of attack in court. IEX's false public image is that of the good little guy providing fairness against the evil exchanges, which is far from reality. IEX is, in some regards, the least fair and most uneven of the public exchanges in the US. It may be necessary for Nasdaq's counsel to dissuade the jury from their potential IEX prejudice due to IEX's false fairness schmarketing. That could be interesting.

IEX performance


IEX has a bit over 2% market share for the last year. It remains a very expensive place to trade due to their high fees. It would not be unexpected if IEX were to bring in somewhere between $50M and $100M in fees over the next year, and they are likely to be sitting on a cash pile of the order of $100M, you'd think. Inordinate fees for such a small market share.

IEX's market structure abuse and false marketing may have them laughing all the way to the bank. They may even have enough cash to pay some treble damages if the allegations are proved and willful. Though I expect the elimination of the infringing technology is probably the Nasdaq goal. If Nasdaq wins on all counts and the infringing tech has to be eliminated, not much of IEX will be left standing. It's an existential threat to IEX. IEX will no doubt try to trivialise the suit as just a nasty competitive approach. That is kind of right and kind of the point. Patents are granted to those that take the innovation risk for good reason. Nasdaq is not a troll. They have real tech that should be given the protection an innovator deserves, even if I might not see eye to eye on all their claims.

Here is the recent IEX market share, averaging around 2.3%:

IEX continues to be a mainly dark exchange, though the additional US-wide volume spike from the volatility burst in February drove IEX's darkness to record low levels, where only 76.5% of trades in February were not lit. Still pretty dank and dark in the expensive trade restaurant, you'd have to say.

It is interesting to see in more detail that on the few especially high-volume days in February, IEX had more lit activity:

Though this still correlates to being darker for more market share:

Darker for more market share is not an encouraging trend for a public exchange. Efficient public price discovery needs to be promoted. The SEC needs to reconsider IEX's place as a public exchange. It is simply not a well-functioning public market. IEX should be an ATS.

One thought as to why IEX had more lit volume in the market chaos could simply be that the resting orders were disadvantaged for some customers due to the unfairness of the IEX technology platform combined with the fact that the crazy high fees were more acceptable to pay in such chaotic conditions. Hmmm. Also, as more of an outlier, there is an increased chance that the Crumbling Quote Indicator's (CQI) logistic regression may have performed under par.
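For readers unfamiliar with the mechanics, a logistic regression indicator of this kind turns a handful of quote features into a probability and fires when that probability crosses a threshold. The sketch below shows only that general shape. The function names, features, weights, and threshold are placeholders of my own, and IEX's actual variables and coefficients are not reproduced here.

```python
import math


def crumbling_probability(features, weights, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(w_i * x_i))))."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def quote_is_crumbling(features, weights, intercept, threshold):
    """Fire the indicator when the modelled probability exceeds the threshold."""
    return crumbling_probability(features, weights, intercept) > threshold


# Placeholder inputs, e.g. counts of venues that recently left the best bid or offer.
features = [2.0, 1.0]
weights = [0.8, 0.5]  # hypothetical coefficients
print(quote_is_crumbling(features, weights, intercept=-1.5, threshold=0.6))
```

A model like this is fitted on past data, so unusually chaotic days sit in the tails of whatever it was trained on, which is one reason it might perform under par in such conditions.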

I'm not sure Nasdaq would want to buy IEX but perhaps a nominal purchase price could be the easiest way out for IEX? Would Nasdaq want to license their tech to IEX? Might Nasdaq be required to offer reasonable royalty terms? Could IEX have to eliminate various aspects of the technology? You know, just the matching bits, the data feed, the closing auction, et cetera. The non-essential bits, really. They'd still have a brand left but not much tech in the smoking ruins.

Beyond Nasdaq's few hundred patents, both CME and ICE have broad and significant patent portfolios too. You have to wonder why they're not protecting their IP with suits against IEX? It is, after all, kind of a requirement that you need to enforce your patents to keep them viable. Will CME and ICE step up?

Happy trading,

--Matt.