Saturday, 11 March 2017

Using the matrix - exchange and enterprise network improvement

To achieve the lowest of low latency market data distribution you need to use layer one switching devices, otherwise known as matrix switches. Sometimes they are called crossbar switches. In the old days of the telephone exchange such things were built with relays and the like, but the world builds nice electronic ones now. They are not packet switches. A matrix switch is all about making a circuit from one place to another. Here is a little simplified diagram:
Example crosspoint schematic
If we close the Y1 to X1 switch we have a connection between them and a signal can flow, whether it is a 10G Ethernet packet or an analogue voice call. It's just a wire. If the circuit supported one source and many destinations, we could close the Y1/X1 switch and the Y1/X3 switch so the signal from Y1 could flow to both X1 and X3 almost instantly. That is the essence of the matrix, if you choose to live in it.
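To make the one-to-many idea concrete, here is a toy model of a crosspoint matrix in Python. It is purely illustrative; the class and method names are my own invention, not any vendor's API:

```python
# Toy model of a crosspoint matrix: a circuit is just a closed switch,
# and one-to-many fan-out is several closed switches sharing the same
# input. Illustrative only - not any real device's interface.

class CrosspointMatrix:
    def __init__(self, n_inputs, n_outputs):
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        self.closed = set()          # set of (input, output) pairs

    def connect(self, y, x):
        # An output can listen to at most one input at a time,
        # so steal the output from any previous input first.
        self.closed = {(i, o) for (i, o) in self.closed if o != x}
        self.closed.add((y, x))

    def destinations(self, y):
        return sorted(o for (i, o) in self.closed if i == y)

m = CrosspointMatrix(4, 4)
m.connect("Y1", "X1")
m.connect("Y1", "X3")            # fan-out: Y1 now feeds X1 and X3
print(m.destinations("Y1"))      # -> ['X1', 'X3']
```

Closing another switch on the same row is all a multicast "subscription" amounts to at layer one: no lookup, no queue, just another copy of the signal.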

The step after asking a telephone operator to patch your wire through to its destination was the development of fancy schmancy dialing systems that had matrix switches at their core. Here is a picture of a Western Electric crossbar switch manufactured in 1970:

6-wire crossbar switch. Manufactured by Western Electric, April 1970.
N Reynolds of Western Electric first invented the crossbar selector in 1913.  Here is a picture of a technician fooling around at East 30th St NY, NY in 1938:

It kind of reminds me of a steam punk version of a modern matrix switch, except with no steam or punk, I guess. Anyway, here is a picture from 1955 showing us the modern data centre of its day, care of AT&T:

Matrix switches are important and have been around for over a hundred years. Quite a few of us don't appreciate their long history. I was certainly late to the party.

You might like to mull on the irony that the crossbar switch of 1913 was capable of lower latency than a modern data centre's packet switch infrastructure. Yes, Arista, Mellanox, Juniper, and Cisco packet switches are all slower than hundred-year-old technology. The switching was precomputed by the relays choosing the path. That path was then simply a direct connection. No microseconds or nanoseconds wasted on choosing paths. Certainly the clink, clank, thunk of the path setting was pretty slow, but once the path is in place, you're off to the races - as fast as your electrons can travel.

There are a couple of handfuls of vendors who provide these kinds of layer one matrix switches. Such switches have a strong history of use in video and audio environments, besides the obvious telecommunication use cases. Zeptonics introduced a device specifically targeting financial applications a few years ago, but times have moved on and better devices exist from excellent new generation vendors such as Metamako.

Market data

In a packet switched environment, multicast UDP is typically used for market data delivery. Whether it be dense or sparse multicast, a load is added to the switching device that may also interfere with other traffic. Even without contention, there will be one or two orders of magnitude difference between packet switched multicast and matrix switched traffic. Matrix switches are simply really good at doing nothing. If your destination is preordained by the network's omniscient subscription God, then let it be.

Some packet switches offer spanning which may cost as little as 50ns for duplication between ports, but this is still an order of magnitude more than the typical matrix switch port to port cost of around 5ns. That 5ns cost is largely related to trace lengths within the device, with only 0.1 to 2 ns due to the matrix switching chip. Distance matters at such speed.
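As a rough sanity check on why trace length dominates: signal propagation in copper or fibre runs at roughly two-thirds the speed of light, which works out to about 5 ns per metre. The two-thirds figure is a typical ballpark, not a measured number for any particular device:

```python
# Rough propagation arithmetic behind the "distance matters" point.
# Signals in typical PCB traces and cables travel at ~0.66c.
C = 299_792_458          # speed of light, m/s
v = 0.66 * C             # ~2e8 m/s in a typical trace or cable

ns_per_m = 1e9 / v
print(f"~{ns_per_m:.1f} ns per metre")

# So a ~5 ns port-to-port figure is consistent with roughly a metre
# of internal trace plus 0.1-2 ns in the switching silicon itself.
print(f"5 ns of flight = about {5 / ns_per_m:.2f} m of trace")
```

At these speeds the physical layout of the box matters as much as the chip inside it.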

So, the best and most efficient way to send data from one point to many, as you need to do in a market data environment, is to simply use a matrix switch, or layers of such, and fan out the data.

There are two main types of modern matrix switch, the optical and the electronic. The optical usually uses a MEMS chip, think little magical mirrors, that directs beams around the place. They are a bit slow to set up, typically milliseconds, but typically lead the electronic devices in the bandwidth race. Often they have more scale, that is, more ports, but you pay for the privilege as they are not a cheap thing to make. It is a bit easier to stamp out a complex bit of silicon as society is better geared up for that. The electronic variety is typically faster in set up, microseconds, but a bit lower in bandwidth, with 25G being new for silicon but a bit older for optical MEMS. The electronic variety usually has the advantage of multi-tap or multicast, where you can go from one to many, or all, which is harder to arrange optically. Also, electrickery usually fares better than optics with regard to signal integrity, simply because we are better at massaging electrickery than we are at catching rainbows.

One of the magical things I used to use such matrix switches for was unit based performance tests. As the connectivity of the matrix is scriptable, when code was checked into the version repository, a series of tests would be run to check the performance. The network config was part of the unit test: the network was appropriately reconfigured for each test to get real wire to wire measurements for components. This is a very handy way to keep the nanoseconds under control and stop speed bumps being inadvertently introduced. A Graphite database we used showed all the nanoseconds of code evolution over time. Alarms would ring and emails would go out if a handful of extraneous nanoseconds suddenly appeared, which is surprisingly easy to do. The key to this was having the network completely configurable for each unit test. That is a joy of the matrix switch in the lab.
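A minimal sketch of the regression-check idea, with hypothetical component names and baseline numbers. The real setup drove the matrix switch's scripting interface and fed Graphite; none of that plumbing is shown here:

```python
# Sketch of a latency regression gate in a CI run: measure a
# component wire-to-wire (via a timestamping tap), then fail the
# build if extraneous nanoseconds crept in. Component names and
# baselines below are invented for illustration.

BASELINE_NS = {"feed_handler": 850, "order_gateway": 1200}
TOLERANCE_NS = 20    # alarm on more than a handful of stray ns

def check_regression(component, measured_ns):
    budget = BASELINE_NS[component] + TOLERANCE_NS
    if measured_ns > budget:
        raise AssertionError(
            f"{component}: {measured_ns} ns exceeds budget {budget} ns")
    return measured_ns

# Pretend measurements from the tap:
check_regression("feed_handler", 855)        # within budget
try:
    check_regression("order_gateway", 1300)  # a regression
except AssertionError as e:
    print(e)
```

The measured numbers would come from the switch's timestamped taps; the point is that the network configuration and the budget check both live inside the test.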

If I were to set about designing your enterprise or data centre network, I would always use a matrix switch just to save on maintenance by having it as a flexible, no remote hands, patch panel. You can save yourself a lot of expensive data centre walks or transport fares. That kind of use case is usually enough to justify the cost of the device to start with. It is one of the few obvious ways to save money in network design.

If you are distributing market data and you care about latency, you really should always use a matrix switch, otherwise you're not doing your job properly. Stick ye olde modern packet switch in the way, and now you're just going slow. Don't do that. Packet switches are good for humans, but not so good for algorithms. If one algorithm is hooked up to a packet switch and another is hooked up to a matrix switch, the algo on the packet switch will lose. Don't lose: use a matrix switch.

One thing exchange failures have taught us is that vendors are not to be trusted ;-) Homogeneity kills. A vendor wants you to use all of their equipment and be homogeneous. Sometimes that makes sense but often it doesn't. We need our own architects to bypass the BS and build real resilience into our network architectures. For this reason, I would do my redundancy with a packet switched multi-cast UDP network. That is, I’d do both layer one and layer two or three, with the packet switching being the back-up path. To me, heterogeneity in both design and vendors matters. You won’t get that kind of advice from your vendor, which is why you need to rely on your own team.

For finance there is one stand-out vendor for matrix switches. Metamako make by far the best product, for two main reasons. First, their switches understand common packet structures. As I've explained, the layer one matrix is essentially just a point to point wire, but on top of that the Metamako gear understands a few things, such as 1G and 10G Ethernet, and you can get some packet information which helps a great deal in building and monitoring the network. Most other vendors just give you a cable equivalent, leaving you somewhat blind to the packets going across. Some will also do similar signal integrity tricks to what Metamako do, but that is not the same as counting packets.

The second win from Metamako is the embedded latency measurement capability. In the old daze you'd have to use an expensive Endace DAG card, or equivalent, and tap a specific line to it to capture packets with time-stamp annotations. The Metamako gear lets you add time-stamps to any port you want and benchmark many ports simultaneously. Depending on how you think of it, such time-stamping capability is either a massive cost saving or a massive capability upgrade. The downside is that a few hundred nanoseconds are added to a time-stamped packet, but you can add a tap, as is the nature of the matrix switch, and have an undelayed reproduction alongside the delayed, time-stamped line. A big advantage of the electronic matrix chips is that adding an additional replicant of a line leaves the original line undisturbed, so you can replicate data in production without interfering with the original path. Very cool. You'd better get it right though, as it is unwise to mess with a production network, and with great flexibility comes great danger. But when you need Felix's bag of tricks, you really need it.

I was triggered to write this meandering piece as Metamako just released a 2RU 96 port matrix switch:
Metamako's MetaConnect 96

This is a pretty serious bit of kit that encourages large enterprise fanouts. With only two layers of MetaConnect 96s you have support for over nine thousand end points of replicated market data. Neat. Exchanges, banks, and brokers should take note. Three layers would give you the possibility of over 800,000 end points at a cost of around 18 nanoseconds plus wire time. Wire length becomes the obvious constraint rather than the switch.
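The fan-out arithmetic is simple enough to sketch: treat each 96-port device as one inbound feed replicated to the remaining 95 ports, and assume roughly 6 ns of device cost per layer (my round number, consistent with the figures above):

```python
# Back-of-envelope fan-out for a 96-port layer-one switch used as
# 1 input -> 95 replicated outputs. Reach grows as 95^layers.
PORTS = 96
FANOUT = PORTS - 1        # one port consumes the inbound feed
PER_LAYER_NS = 6          # assumed approximate device cost per layer

for layers in (1, 2, 3):
    endpoints = FANOUT ** layers
    print(f"{layers} layer(s): {endpoints:>7} end points, "
          f"~{layers * PER_LAYER_NS} ns plus wire time")
```

Two layers gives 9,025 end points and three gives 857,375, which is where the "over nine thousand" and "over 800,000" figures come from; beyond that, cable length dominates the budget.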

Future networks

In a future world, I hope we’ll see what I like to call SDN++ networks, next gen software defined networks, that not only support advanced flexible routing, virtual networks, and packet switching but also directed circuits via matrix switches. Perhaps we’ll then see support for on demand bandwidth, such as for VM migrations, plus the automation of resilience planning as well as the expected latency and bandwidth optimization planning.

Resilience planning is especially interesting to me. Just as you may use instrument delta and gamma simplifications for VaR planning for fast risk, you should be thinking of using device and link failure potentials for network deltas to plan your instant and automated redundancy responses.

P4 is a step in the right direction, but it is not enough. Plexxi is one such hybrid approach but it is also limited and not enough. Plexxi is neat but seems to have lost sight of the forest for the trees even though I think they are heading in the right direction. The future will belong to not just packets but also to circuits. That is, the key will be the orchestration of not just packets, or flow tables, but also the planning of links and the transient rewiring of links to support bandwidth, latency, and resiliency within the context of competing priorities.

Feeding into such a framework should also be aggregation. Metamako's fastest Mux application is a 69ns service that is handily faster than, say, a fully connected layer 2 switch could be with the same technology. A mux is faster than a fully connected switch simply because it is simpler. Such things are important when you have a latency-critical criterion for an important aggregation problem, such as a financial service talking to an exchange or broker. So imagine a future where you have not only flow tables in different devices to optimise, but flexible circuits, packet switches, and aggregators; plus all the monitoring, timing, and measurement foo you wish to throw around. Then consider clients, servers, operating systems, network cards, and network devices all having flexible circuit and switching capabilities. Such a rich environment provides awesome opportunities for optimisation and improvement. We write VHDL and have software optimise our circuit layouts as part of modern chip design, but we still hand specify our networks due to the artistry involved. Hmmm. There is an obvious destination if we can find the right path.

As the old saying goes, many problems can be solved by adding a layer of indirection, hence packet switching. Reducing layers of indirection by circuits is also a noble act. Let's do that too.

I really want this future SDN++ network. No one yet is planning such a beast. Modern matrix switches and better monitoring with measurement is a step in the right direction but there is much more to come. It feels we’re on the threshold of exciting changes that will bring real, practical benefits to the data centre.

Happy trading,


Wednesday, 1 March 2017

Hashcat 2017

It's been a while since I've cracked a password. Perhaps three years?

My eldest daughter is doing a cybersecurity course as part of her engineering degree which is the kind of thing you do after an acting degree, right? Yep, that's a weird mix of degrees happening there.

Anyhow, I just wanted to show her a quick example of how to listen in for an auth handshake for a wifi SSID and then crack the WPA2 password. I used her grandparents' network as an example. It is an unchanged Telstra AP from a couple of years ago. It has one of those printed credit-card-like plastic id card things with a ten digit WPA2 password that has never been changed to something more secure. Old people trust giant telecommunication vendors.

How fast can a modern modest laptop crack that?

We gathered the auth handshake with aircrack-ng with a little help from its deauth replay attack. That packet trace then generated a hashcat hash capture file. Wind up the clockwork spring on the laptop with its Nvidia 970M GPU and hashcat puts out over 100k H/s of WPA key searching. A better single desktop processor may do five times as many hashes, but, to me, that seems terrific for a little battery powered device.

This job will finish a 10 decimal digit search in a bit over a day if it is not lucky. As I know the pass phrase, I know it is not going to get lucky ;-) I was quite surprised that no multi-GPU cluster is required to keep the expected value of this task to under a day. Times have really moved on.
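The back-of-envelope arithmetic behind that estimate:

```python
# Ten decimal digits of WPA2 keyspace at ~100k H/s on the 970M.
keyspace = 10 ** 10
rate = 100_000                          # hashes per second

worst_case_h = keyspace / rate / 3600   # "a bit over a day"
expected_h = worst_case_h / 2           # on average, halfway through
print(f"worst case ~{worst_case_h:.1f} h, expected ~{expected_h:.1f} h")
```

Worst case is about 28 hours and the expected value about 14, which squares nicely with the 15 hours the crack actually took once summer throttling is accounted for.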

Another nice advance with the reinvigoration and open-sourcing of hashcat is that it can potentially support FPGA kernels via OpenCL. That's a very interesting option. Well done hashcat team.

Ten digit hex WPA keys may be feasibly found with a multi-GPU set up. A random 10 digit alpha-numeric is pretty safe as you'd expect it to take around a month on a cluster with a thousand state-of-the-art GPUs. Despite XKCD, beware of pass-phrases thanks to modern Markov chains and dictionaries. XKCD's 2^44 is only slightly better than ten random hexadecimal digits. Though, in good salt we can trust.
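Comparing those keyspaces in bits makes the ranking obvious:

```python
import math

# Keyspaces from the paragraph above, expressed as bits of entropy.
hex10 = 16 ** 10      # ten random hex digits = 40 bits
xkcd = 2 ** 44        # XKCD's four-random-words estimate
alnum10 = 36 ** 10    # ten random lowercase alphanumerics

for name, space in [("hex10", hex10), ("xkcd 2^44", xkcd),
                    ("alnum10", alnum10)]:
    print(f"{name:>10}: 2^{math.log2(space):.1f}")
```

Ten hex digits is 2^40, so 2^44 buys you only four extra bits, while ten random alphanumerics lands near 2^51.7: each extra bit doubles the work, which is why the alphanumeric key takes a GPU cluster a month.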

KeePassX is my friend. I've been slowly converting to 16 random characters for each of my passwords. I'd better hurry up. You too should try to pick better passwords to keep the anti-social at bay. Entropy is your friend in both trading and in passwords.


Update: the WPA2 crack took 15 hours for a correct result. There was a bit of GPU throttling due to the summer heat and workload.

Saturday, 18 February 2017

Some of my favourite, or most useful, finance books

Fortune's Formula was one of the most fun reads I've ever been fortunate enough to stumble across. That's a big call, but I mean it. It's certainly been a while since I last read it, but it has been worth reading more than once. I must read it again. William Poundstone's narrative theme is the Kelly Criterion largely centred around Ed Thorp with a dash of Claude Shannon. You gotta read it to understand how the mob's low latency telephone betting arbitrage underwrote an embarrassingly large amount of AT&T/Bell's revenue. The amazingly simple story within: buy the worst performing stocks the next day, sell the best performing, rinse repeat for a couple of decades. Just read it.

At four companies I've been involved with, I used C/C++ code that was transliterated from Espen Haug's The Complete Guide to Option Pricing Formulas book. As far as option pricing goes, this approach was largely enough to make some millions of dollars. Last time I just OCR'd the relevant pages, as the CD had gone walkabout, and transliterated the BASIC code to C/C++ fairly directly. Pricing and unit tests done in less than a day. Perhaps I should do it again and release such C++ as open source so I can stop the repetition. I kind of prefer the size and convenience of the first edition, but the second edition is certainly an improvement.
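For flavour, here is the kind of routine involved: a minimal sketch of the generalised Black-Scholes formula that Haug's book covers, in Python rather than the book's BASIC or my C/C++. This is my own illustrative transliteration, not the book's code, and the example inputs are just assumed numbers:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def gbs(call, S, K, T, r, b, sigma):
    """Generalised Black-Scholes sketch: cost-of-carry b = r gives
    Black-Scholes-Merton on a stock, b = 0 gives Black-76 on a
    future. Illustrative, not a production pricer."""
    d1 = (log(S / K) + (b + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    if call:
        return (S * exp((b - r) * T) * norm_cdf(d1)
                - K * exp(-r * T) * norm_cdf(d2))
    return (K * exp(-r * T) * norm_cdf(-d2)
            - S * exp((b - r) * T) * norm_cdf(-d1))

# Assumed example: European put, S=75, K=70, six months,
# r=10%, carry b=5%, vol=35%.
print(f"put = {gbs(False, 75, 70, 0.5, 0.10, 0.05, 0.35):.4f}")
```

A handful of such routines plus unit tests against the book's worked examples is the whole of the day's transliteration job.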

I like Barry Johnson's Algorithmic Trading and DMA: An introduction to direct access trading strategies even though it is not overly insightful for a market professional. It is nice, clean, and easy to read, but its real usefulness to me has been as the "goto" description of a call auction if anyone asks. That small snippet is dog-eared. Somehow I find it a particularly pleasant book even if it is not filled with great insight. A tidy reference.

As far as understanding option trading goes, there is only one worthy book I've come across. It is ancient but still relevant and a great introduction for a budding trader: Sheldon Natenberg's Option Volatility & Pricing: Advanced Trading Strategies and Techniques. The ancient 1994 edition is the one I've read and recommended over the years. There is a newer 2014 edition, "Option Volatility and Pricing: Advanced Trading Strategies and Techniques, 2nd Edition", but I can't vouch for it as I haven't read it, though I know I probably should.

For futures and option basics, especially for new finance staff, just stick to the biblical Hull, "Options, Futures, and Other Derivatives (9th Edition)", but why is it now so expensive? You might find a better price in a university's co-op bookshop. Just a tip.

I'm not the biggest Taleb fan, but whilst I found "Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets (Incerto)" mildly annoying, it was well worth the read, albeit through gritted teeth. You gotta give Taleb a lot of credit for the masterclass that is "Dynamic Hedging: Managing Vanilla and Exotic Options."

Rebonato's "Volatility and Correlation: The Perfect Hedger and the Fox" is a favourite of mine though I've only read about half of it as it is my most "missing" book. Over the years, somehow it has just walked out of a few of my offices and disappeared into the Ether. Perhaps there can't be a much better recommendation than that? I'd like to read it all but I can't really afford to keep buying copies. I credit this book with stimulating me into some newer and novel ways of thinking about volatility and pricing that aren't contained in any text book. That is a real credit to Rebonato's intuitive way of presenting his thoughts on pricing and volatility. He's a good teacher. Perhaps it is the most valuable half-read book I've never completed? Hmmm.

As another fun, albeit soft, read, Peter Bernstein's "Against the Gods: The Remarkable Story of Risk" is about as good as it gets. The story about the English using the Roman life tables for their annuities is a rip-roarer, especially when you look at those median life spans from the 1600s. There is a lot of context in many of his books that has enriched my life. Highly recommended.

What's next on my list? Well, I'm waiting on Amazon to deliver Dave Cummings' autobiography, "Make the Trade". It should be a beauty based on this snippet:

A snippet from Sniper's tweet
I've chatted briefly to Dave a few times and always found him to be terribly interesting and engaging. I'm sure his tome will be a great read, full of historical gems that only someone who has been there and done that can provide.

Happy trading,


Tuesday, 24 January 2017

US microstructure - why the rules don't matter

That's a little facetious. Rules do matter except when they don't.

Let's trudge past the click-bait-like title a little to revisit the recent Citadel SEC fine: A tale of two cities' firms.

First a fair warning. This is really not that interesting and you'll have to be a very bored market structure geek to care much about this meandering meander. My advice is to run away whilst you still can.

OK. Don't say I didn't warn you.

CES was fined for a lack of disclosure around how their internaliser worked under some limited circumstances. In the article linked above, I explained why I think PFOF and best execution are kind of oxymoronic. Nevertheless, it is useful, and prudent, to remember that the retail punter is better off, on the whole, with CES stepping in and providing general price improvement. CES is a remarkably efficient PFOF machine that benefits many clients. Perhaps the market could do better if given a chance to chew on retail orders? Probably not easily, and perhaps never, especially with the sub-penny rule. It's an interesting dilemma.

All that aside, there were some questions around whether, and if so when and how, FastFill and SmartProvide at CES would have become improper had they continued. Larry Tabb suggested such activity was allowed at the time but was not likely to be OK now:

That was interesting as I wasn't sure when or where the regulations made the direct feed (DF) essential if you had access. It still remains that you can use just the SIP if that's all you have, by the way.

Mr Kipp Rogers pointed out the correct FINRA regulation covering this. Kipp also pointed out the preceding reg from the daze of NASD. The text is pretty short. Importantly, all the various versions are also referenced as they changed over time. You won't see any reference to a DF in them there wordy things though. Hmmm.

Let's have a quick look at FINRA 5310, the pertinent regulation. The particular sub-section reads:

5310. Best Execution and Interpositioning 

(a)(1) In any transaction for or with a customer or a customer of another broker-dealer, a member and persons associated with a member shall use reasonable diligence to ascertain the best market for the subject security and buy or sell in such market so that the resultant price to the customer is as favorable as possible under prevailing market conditions. Among the factors that will be considered in determining whether a member has used "reasonable diligence" are:
(A) the character of the market for the security (e.g., price, volatility, relative liquidity, and pressure on available communications); 
(B) the size and type of transaction; 
(C) the number of markets checked; 
(D) accessibility of the quotation; and 
(E) the terms and conditions of the order which result in the transaction, as communicated to the member and persons associated with the member.

That form of the rule is the latest incarnation, as from 9-May-2014. Some form of this best execution obligation has been a rule since May 1968, as you'll see in the history listed at the bottom of the reg. I presume that prior to 1968, best ex was just a moral obligation if there was no such rule. I expect it was likely covered by some code of practice somewhere. After all, doing the right thing by a customer is an obvious and indispensable thought.

If you choose to go to the previous version of the FINRA rule, the text of 5310 (a)(1) remains exactly the same for the period covering May 9 2011 - May 30 2012. This is still after the CES fine period. We have to go back to the preceding reg at NASD2320 and look at the third last version to get to the last period covered by the CES settlement:

2320. Best Execution and Interpositioning

Past version: effective from Dec 14 2009 - Jun 27 2010.

(a)(1) In any transaction for or with a customer or a customer of another broker-dealer, a member and persons associated with a member shall use reasonable diligence to ascertain the best market for the subject security and buy or sell in such market so that the resultant price to the customer is as favorable as possible under prevailing market conditions. Among the factors that will be considered in determining whether a member has used "reasonable diligence" are: 

(A) the character of the market for the security, e.g., price, volatility, relative liquidity, and pressure on available communications;
(B) the size and type of transaction; 
(C) the number of markets checked; 
(D) accessibility of the quotation; and 
(E) the terms and conditions of the order which result in the transaction, as communicated to the member and persons associated with the member.
Yes. It's the same. The regulatory text hasn't changed but what was proper is now improper. So how did it change? How do you know what is the law of the land?

It is unfortunate that reading the law gives not that much understanding of the interpretation of the law. That changes over time. This is a particular case in point. The Best Ex obligations and wording have remained the same, but the interpretation changed. When you re-read Larry Tabb's tweet above, carefully note the word "guidance." Hmmm.

I asked for a pointer to when that was on twitter. Mr David Weisberger politely replied as follows:

The link to the November 2015 interpretation that mentions DFs in footnote 12 is here, with the following snippets extracted:

So, that's settled. If you're somewhat unsettled by the settling of something so important in a footnote to a regulatory notice that is not explicitly referenced in the regulatory legalese, then you're not alone. Practitioners in law and tax have long had such problems. Law is set by precedents and interpretations by different strengths of courts and officials. Sometimes concurrently. Sometimes with paradoxical conflict. The sometimes twisty, long history of many of these things matters. Some interpretations go back to the Magna Carta over 800 years ago, so don't feel bad if you're too young to remember the actual regulatory events. It is simply not possible for a mortal to hold the collective history of all laws and interpretations, so we rely on study and specialisation. As Matt Levine points out in Marblegate, sometimes we forget how things became the way they are until cases are lost and then won on appeal, when clever archaeology assists the memory reconstruction process to derive the thoughts that once resulted in a heuristic now assumed to be innate:
"There are three possible levels of understanding the law, or a bond document, or whatever:

  1. Not reading it.
  2. Reading it.
  3. Reading it while also being familiar with the institutional memory of the legal community."
The largest law library in the world, available to all
Good luck reading
"approximately 5 million items"
It is a curse of the modern world. My father pines for a return to an understandable tax system. In the 1960s, when he first became a partner in an accounting firm, he fondly remembers being able to read the tax law in a thinnish but not tiny volume, digest it, and understand it. Today's Australian tax law is voluminous, cumbersome, and esoteric, and yet still likely simpler than the US tax code. One of his practice's partners had deep anxiety about the ever-growing complexity of tax. He felt he could no longer serve his clients diligently as he simply could no longer keep it all in his head. This fellow took an early retirement rather than cope with the anxiety of the unknowable that all sensible accountants and lawyers must contain within their true selves, beyond the carefully marketed veneer of expertise. Hence we have the practice of regulatory books reproducing laws along with carefully researched annotations regarding precedents, cases, and interpretations that are necessarily incomplete but serve as the "real" law to most, except for the rare exceedingly expert bird.

Market regulation is simply a simpler case of the same lack of simplicity.

That is, market structure is perhaps now getting to a stage where it has a similar need for an interpretive dance book just like larger fields. DF versus SIP for PFOF. IOIs in the dark. Speed bump interpretations and types of delays. The so-called millisecond "de minimis" that isn't a millisecond. Allowable orders. There is an annual interpretive book needed there that I definitely don't want to read but probably would. It would be incomplete, a good start, and cheaper than a poorly focused discussion with a securities law firm.

For now, the bottom line is, you won't know the law if you read the law. Be alert, not alarmed. That's how it is designed to be. You'll need the institutional memory of bright folk like Larry Tabb, David Weisberger, David Lauer, Kipp Rogers, and their kin to keep you straight.

Good luck with that,


Tuesday, 17 January 2017

A tale of two cities' firms

Not so comical?
(source: Wikipedia)
"It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to heaven, we were all going direct the other way - in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only."

Charles Dickens is no less relevant in 2017 than he was in 1859. Yeah, the quote is a bit delightfully long.

The big news from last week was the Moody's gentle US$864m wrist slap and general escape. However, the micro news was a bit more stimulating to me. The SEC came out with two interesting finalisations of matters.

Firstly, ITG's Frank Troise's valiant attempt to turn around belief struck a hiccough with ITG's well telegraphed ADR fiasco being finalised by the SEC ($US 24.4M). Secondly, Citadel, or more specifically Citadel Execution Services (CES) as part of Citadel Securities LLC, was poked in the ribs for misrepresenting how some of its wholesale internalisation, or Payment For Order Flow (PFOF), worked ($US 22.6M). There was a very important difference between the two findings. ITG was doing something wrong. CES was not found to be doing something wrong with respect to transactions but, rather, was found to be miscommunicating what it was doing. Let's meander through both of these.

New York, New York

It's just on a year since Frank Troise took over at ITG. He has made significant progress in changing the focus of the firm back to its clients. The most important action undertaken was closing the proprietary trading and lending businesses down. ITG had been called to account for their pool abuses with their previous settlement of $US 20.4M. This new settlement may finalise the lending abuses and Troise's aim to deflect the settlement to historical legacy is fair enough. It remains to be seen if the strong language from the SEC,
"Many of the ADRs obtained by ITG through pre-release transactions were ultimately used to engage in short selling and dividend arbitrage even though they may not have been backed by foreign shares."
results in any other parties being held to account. The large missing item for ITG is that it has not been properly held to account for historically lying to its customers about being an agency only business. ITG did in fact engage in proprietary trading in the same or similar products to many of its clients. Bob Gasser misled a US Senate Committee in 2012 when he claimed ITG did not engage in proprietary trading,
"ITG is not a market maker, and we do not take on proprietary positions. In other words, we do not have “skin in the game...”
ITG closed its proprietary trading business in 2016. I don't think proprietary trading is necessarily a bad thing in a diversified financial services business, but lying about it is definitely bad. When you're caught doing something you said you weren't doing, the sensible thing is to get rid of it so your customers can rebuild trust in your integrity. ITG seems a bit swollen in head count with ageing products, but at least it now has a chance if it can lift its game. However, the deception associated with ITG's proprietary trading has not yet been accounted for by any regulator. We'll have to wait to see if that penny drops.

ITG's share price has been doing OK, with reasonable interest in sizeable positions. Revenues may see some improvement through the diversification being promised. It is also possible significant head-count reductions could deliver a much better ROI from ITG's legacy IP even if revenue faded; that is, I'm not sure ITG would perform much worse with only 250 people instead of over 1,000. You'd expect 2017 to be "interesting" for ITG staffers. Time will tell whether New York's ITG is a value trap or not.

A cover from the 1859 Serial. Has the SEC started another serial?
Is there more to come? (Source: Wikipedia)

By the inland sea: the other city

In a tale from the city of Chicago, Citadel had a rather different diagnosis and prognosis from its SEC fine. After a detailed look under the covers, the SEC found CES made misstatements about how its executions worked in some circumstances. Its operations were not found to be at fault, but its marketing was. That's an important distinction. So how bad was it? Is this similar to ITG deceiving its customers about its pool operations or prop trading? It looks quite different to me.

After trying to parse the SEC order I found myself without enough detail to make a proper judgement of the situation. There are curious twists in the saga worth discussing, however. I was certainly left with the impression that this was simple miscommunication rather than anything nefarious. The FastFill aspect goes to the heart of PFOF; more about that below, as I find the juxtaposition of PFOF and best-execution responsibilities particularly troublesome. The other procedure the SEC focused upon was a CES algo called SmartProvide. There is not enough detail to evaluate SmartProvide; the scant details could even be interpreted as a gain for the customer, though that is the less likely reading. Nevertheless, CES was penalised for saying it was doing X while instead doing mainly X with a dash of Y.

The scale of Citadel's misdeeds was pretty small beer according to the SEC. It points out that CES does about 35% of retail execution in the US. Yet for the period covered, "late 2007 through January 2010", the SEC had CES disgorge $5.2M: two and a bit years' worth. It is interesting to try to put that in per-customer perspective. CES is one of the big internalisers; for example, it handles most of TD Ameritrade's volume, and TD Ameritrade is one of the big five retail brokers (Nov 2015). That suggests north of 50 million or so retail accounts for the big five. The SEC counted 109 million retail and institutional brokerage accounts in 2011, so it would not surprise if there were approximately 100 million retail brokerage accounts in the US. Spreading the CES disgorgement over 35% of those accounts equates to about $0.075 annually. Yes, seven and a half cents, per account, per year. That calculation is arguably quite a bit wrong, but you get the idea: the order of magnitude is small. The profit attributed to those algos was modest.
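Laying that back-of-envelope calculation out explicitly (the $5.2M disgorgement and the 35% share are from the SEC order; the account count and period are my rough assumptions):

```python
# Rough per-account cost of the conduct covered by the CES disgorgement.
# Only the $5.2M and the ~35% share come from the SEC order; the rest is guesswork.
disgorgement = 5.2e6       # USD, per the SEC order
ces_share = 0.35           # CES's approximate share of US retail execution
retail_accounts = 100e6    # assumed US retail brokerage accounts
period_years = 2           # "late 2007 through January 2010", call it two years

affected_accounts = retail_accounts * ces_share
per_account_per_year = disgorgement / affected_accounts / period_years
print(f"${per_account_per_year:.3f} per account per year")  # about 7.4 cents
```

Halve the account estimate or stretch the period and you still land within a nickel or two per account per year, which is the point.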

There are a few interesting facets to all of this. Let's look at FastFill, from the SEC order,
"10. One strategy, known as FastFill, was triggered when the best price from one or more of the depth of book feeds that FastFill referenced was better than the best price disseminated by the SIP feed. Assuming all other eligibility conditions were met, FastFill immediately internalized a marketable order at the SIP NBB or NBO, as applicable, or better. 
11. For example, if CES was handling a marketable order to buy shares, and the SIP best offer was $10.01, and the best offer from one or more of the depth of book feeds was $10.00, FastFill immediately internalized the order using the SIP offer of $10.01 per share. FastFill did not internalize at or seek to obtain through routing the better $10.00 price from the depth of book feeds."
This is basically saying that if the SIP was behind the direct feeds, the customer got the SIP price and CES would do its best to capture something better for itself. That may not have always worked out for CES, but it probably did. This is the direct feed (DF) versus SIP game, the so-called latency arb that isn't, though one frowned upon at an ATS or exchange. I'm not sure this was improper for an internaliser under the rules at the time; I don't think it was. Newer rulings, subsequent to January 2010, may affect current interpretations. The SEC did not take the view it was improper; it just wanted proper disclosure. For example, say you used some smart ML to determine the price was about to change, and thus filled your clients over the spread with the expectation, though not the certainty, that you would do better as you hedged. Is that wrong? At what price innovation? It quickly gets cloudy.
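Read that way, paragraphs 10 and 11 reduce to a small decision rule. Here is a minimal sketch of the buy side only; the function name and shape are invented, and the real system had many more eligibility conditions:

```python
# Hypothetical sketch of the FastFill trigger for a marketable buy order,
# per paragraphs 10-11 of the SEC order. Names and structure are invented.
def fastfill_buy(sip_offer, df_offers):
    """Return the internalisation price, or None if FastFill doesn't fire."""
    best_df_offer = min(df_offers)  # best (lowest) offer across direct feeds
    if best_df_offer < sip_offer:
        # Direct feeds show a better offer than the SIP: fill the customer
        # at the SIP offer, keeping the DF/SIP gap as potential edge.
        return sip_offer
    return None  # not triggered; other handling applies

print(fastfill_buy(10.01, [10.00, 10.02]))  # fills at 10.01, the SIP offer
print(fastfill_buy(10.01, [10.01, 10.02]))  # None: no DF improvement
```

The point of the sketch is only that the customer's fill is pegged to the SIP even when the direct feeds show a better price; whether CES then captured that gap on the hedge was the statistical bet.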

SmartProvide gets murkier. The SEC order doesn't give full details but enough to know that you can't really make a judgement,
"12. The second strategy, known as SmartProvide, was triggered when the SIP NBB or NBO, as applicable, was better than the best price from at least one of the depth of book feeds. SmartProvide did not internalize at the SIP price, nor did it seek to obtain an execution at that price by sending an order to the market. Instead, assuming all other conditions for order handling by SmartProvide were met, SmartProvide would route a non-marketable order to the market.
13. For example, if CES was handling a marketable order to buy shares, and the SIP NBO was $10.01, and the best offer from one or more of the depth of book feeds was $10.02, SmartProvide would send a buy order to be displayed in the market at a price less than $10.01, such as $10.00. This order would be displayed for up to one to five seconds, depending on the size of the order. If this order received an execution, the customer order would benefit from the execution at the better price (i.e., the shares purchased by the customer would be at a price at least one penny better than the NBO). This occurred for approximately 18% of the shares handled by SmartProvide. If the order did not receive a full execution from this routing, CES’s algorithms reassessed the handling of the remaining shares, and could either internalize or seek to obtain an execution in the market. Some of the orders that CES internalized after SmartProvide displayed an order in the market on their behalf received a price that was worse than they otherwise would have received in the absence of SmartProvide."
So, 18% of the shares handled by SmartProvide were executed at a better price; the customer benefited in those cases. Did that offset the other 82%? It is unclear. It probably didn't, but it could have if the average gain on the 18% was roughly 4.6 times larger than the average displacement on the other 82%. We don't have enough information to know. Also, it was not a simple DF-versus-SIP equation here: how the algo assessed the decision depended on the order size and the particular stock. Here is how the SEC described the SmartProvide trigger,
"Triggering Event for SmartProvide 
34. SmartProvide was triggered when the SIP NBB or NBO, as applicable, was better than the best bid or offer from one or more depth of book feeds. SmartProvide referenced only one depth of book feed for many securities and fewer than all of the depth of book feeds for other securities. Accordingly, at times, SmartProvide was triggered when the SIP NBB or NBO, as applicable, was from an exchange whose depth of book feed SmartProvide did not reference. In addition, SmartProvide sometimes could be triggered when the difference existed between the SIP and only one of the depth of book feeds SmartProvide referenced, and not the others.
35. For example, in the case of a marketable order to buy shares, SmartProvide could be triggered if the SIP NBO was $10.01, and the best offer from one or more of the depth of book feeds was $10.02, even though the best offer on one or more of the depth of book feeds from one or more other exchanges was $10.01."
So, it was a bit more complex than DF versus SIP. There was some judgement as to which DFs got used, often only one. However, this is not what was disclosed,
"40. During the relevant period, CES provided a written disclosure to certain retail broker-dealer clients that described a market order as an “[o]rder to buy (sell) at the best offer (bid) price currently available in the marketplace,” and made other, similar representations to its clients. As discussed above, these statements suggested that CES would either internalize the marketable order at, or seek to obtain through routing, the best bid or offer from the various market data feeds CES referenced. These statements were materially misleading in light of the way that FastFill and SmartProvide functioned."
Sometimes a client got a much better price by not crossing the spread, thanks to SmartProvide. However, that is not what paragraph 40 says, and it is wrong to say one thing and do another, even when the other is advantageous to the client. So the customers still received the SIP NBBO or better, but the marketing didn't correctly represent the ever-changing algo operations. This CES story is not a simple black-hat-versus-white-hat tale.
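On the earlier 18%/82% question, the break-even arithmetic is at least simple to state: for customers to come out even in aggregate, the average price improvement on the 18% of shares must outweigh the average displacement on the 82%. A quick sketch:

```python
# Break-even ratio for SmartProvide's 18%/82% split: how much larger the
# average gain on improved shares must be than the average slippage on the
# rest for the aggregate customer outcome to net to zero.
p_improved = 0.18   # share of SmartProvide shares filled at a better price
p_worse = 1 - p_improved

# Net effect per share = p_improved * gain - p_worse * loss; setting this
# to zero gives the required gain/loss ratio.
breakeven_ratio = p_worse / p_improved
print(round(breakeven_ratio, 2))  # 4.56
```

So the average improvement would need to be roughly 4.6 times the average slippage, and the SEC order gives no data on either average.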

Best execution versus Payment For Order Flow

If you were to do best execution by policy, I'd argue payment for order flow could not exist. By definition the wholesaler is getting money, say $0.002 per share, for handling the order. The wholesaler is not a charity. They are expecting to receive more than the fee they pay for the execution or the execution information as a statistical whole. They need to make a profit.

That is, fundamentally, the customer is not getting best ex in a holistic fee-and-execution sense. The broker could make the same decisions as the wholesaler. Then, instead of losing the wholesale fee and the wholesale profit, the customer could receive that cost back as a benefit. This is what I mean by best ex and PFOF being in tension. It is also just weird that US retail broking is not really all that concerned with, you know, broking. Maybe it's just me.
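Toy arithmetic for a single 1,000-share order makes the tension concrete. The $0.002/share payment is the figure used above; the wholesaler's expected gross capture is a pure assumption for illustration:

```python
# Illustrative PFOF economics on one retail order. The edge figure is an
# assumption, not a measured number; the point is only who keeps what.
shares = 1_000
pfof_per_share = 0.002          # wholesaler pays this to the routing broker
assumed_edge_per_share = 0.004  # wholesaler's expected gross capture (assumed)

broker_receives = shares * pfof_per_share                              # $2.00
wholesaler_keeps = shares * (assumed_edge_per_share - pfof_per_share)  # $2.00
# If the broker made the wholesaler's decisions itself, both legs could in
# principle be returned to the customer as price improvement:
potential_customer_benefit = broker_receives + wholesaler_keeps        # $4.00
print(broker_receives, wholesaler_keeps, potential_customer_benefit)
```

Whatever the true edge per share is, the structure is the same: the fee plus the wholesaler's profit is value the order carried that the customer never sees.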

All that said, there are obligations on the broker to shop for an appropriate wholesaler and to monitor and report on such. There is some competition and tension in the marketplace even though the big wholesalers are very few in number. To me, PFOF, like the order protection rule, had a point in days gone by, but it appears to have overstayed its welcome.

Europe does best ex better than the US. In the US you get audited and profiled against the SIP, and there are specific procedural elements, such as the order protection rule, you must heed. In Europe there is a better approach where best execution is a policy. Best ex is thus a little woollier, but it ostensibly takes the gaming of specifics out of the equation. However, a lack of enforcement makes for weak policy in Europe, though enforcement seems to be improving over time. Canada has learnt from the US and European experience, and I think it has struck a better policy / execution balance. The SEC could learn a little from IIROC but is unlikely to look north for inspiration from its parochial world.

The order protection rule is ripe for change: not only is it tired, incumbents benefit if it is retired. PFOF is unlikely to ease out of the picture, as large brokers and wholesalers benefit and, arguably, smaller brokers still benefit by being able to outsource their operations as the NMS gets ever more complex. However, best execution and PFOF will remain oxymoronic to me.

Happy trading,