So in the past few weeks I’ve made some progress on my direction and what I plan to make by Thesis Week.
For the map / visualization of high frequency trading and the importance of location, I’d like to create something in the style of this map, or board game, from the time of the Alaskan Gold Rush.
I’ve made some headway and found a listing of proprietary trading firms, some of which specialize in high frequency trading. I’ve started to find all their locations, though some gaps are still there. I’m also trying to locate the data centers for all of the trading exchanges in the US as a starting point, then perhaps globally if I can manage.
I also started moving in a bit of a different direction. Just as I am interested in the importance of location for high frequency trading, I’m also interested in the importance of time. HFT exists on a timescale that is below the limits of human perception. There is an excellent article that a lot of people were kind enough to forward to me last week, which pinpointed the differences in algorithmic behavior at timescales below 650 milliseconds, which is the time it takes a chess grandmaster to realize a mistake. At this timescale, the algorithms are mostly interacting with one another rather than with human traders. Most interesting is that the fractal patterns that explain the market don’t hold up at these sub-second timescales. As the researcher states, they “broke the fractal.”
This got me thinking about the behavior at millisecond increments of the market, but it also got me thinking about the importance of timekeeping in order to keep high frequency trading running. Then I came across this article, which REALLY got me thinking about how important timekeeping is to our financial system. From the GPS.gov website:
Each GPS satellite contains multiple atomic clocks that contribute very precise time data to the GPS signals. GPS receivers decode these signals, effectively synchronizing each receiver to the atomic clocks. This enables users to determine the time to within 100 billionths of a second, without the cost of owning and operating atomic clocks.
HFT uses a combination of a network timing protocol and GPS to keep time. The network protocol functions along the cable, while the GPS checks the accuracy on either end of the exchange. Since GPS receivers need to point at satellites, they are usually located on the roofs of trading exchanges. To read more about the methods for timekeeping in HFT, this is a great article.
The methods the researchers used to achieve this are a bit beyond my means, but I’ve found a few hacker articles which gave me hope that I could theoretically interact with the market by confusing the GPS signal. I plan to test this in a controlled environment, and construct a sculpture that would have the potential to cause disruption, without ever actually using it for that purpose.
So my new plan is to make a sculpture that would theoretically shift the timestamp of algorithmic trades and confuse the market. I would not use this, but I think making it points out the vulnerabilities in the system, and the reliance that our financial system has on technologies that we don’t realize are involved.
Lastly, here are some screenshots, and links to a video piece I made. It’s a landscape generated from financial data. I hope to refine this piece, but I’m not sure it’s quite in the same vein as the other pieces I plan to build for my thesis.
Here are some thoughts on three papers related to the qualities of high frequency trading when analyzing them in terms of small increments of time at incredibly high speeds.
High frequency trading has been in the news as of late. People have been forwarding me a few insightful articles that led me to new journal articles. This well-written article referenced this paper:
Hasbrouck, J., & Saar, G. (2011). Low-Latency Trading, 10012(September).
It gave me a better understanding of the difference between what they call Agency Algorithms (which I believe other papers have referred to as ‘passive’ algorithms) and Proprietary Algorithms. Agency algorithms are used by large institutions when buying or selling many orders at once, in order to time them so as to “reduce slippage,” and keep as much profit on the order as they can. These algos still look to larger market trends, and might suggest to a human trader which stock(s) to buy or sell, but the trader would most likely determine volume, then execute the order via Agency Algos.
Proprietary Algorithms are what actually qualify as “low-latency algos” or aggressive high frequency trading. These try to game the speed of the system itself, baiting other algorithms to place an order, so that they can pounce and do it first. The patterns of these algorithms are a lot of buy-cancel-execute orders in millisecond periods of time, in an attempt to confuse the other algorithms out there and profit before they can.
Another article that many people have emailed me is Wired’s article on how HFT could negatively affect markets. The article was primarily a synopsis of this paper:
Johnson, N., Zhao, G., Hunsader, E., Meng, J., Ravindar, A., Carran, S., & Tivnan, B. (n.d.). Financial black swans driven by ultrafast machine ecology. Physics.
I found this paper incredibly compelling. The authors look at periods of time less than 650 ms, which is the threshold of human response time. As an example, they cite that 650 ms is the time it takes a chess Grandmaster to realize they are in trouble. This gives context to the transition away from “traditional human-machine systems,” where human oversight is possible if changes to the system are observable within human response time.
They describe the global financial system as governed by “the self-organized activity of a global collective of trading agents, including both humans and machine algorithms.” Since this system operates without much oversight or “real-time controller,” the study urges researchers to develop a “scientific theory for the underlying human-machine ecology on these ultrafast timescales.”
They use the term “black swan” to describe events in the market that reflect extreme volatility, or jumps in pricing. Their definition: the stock price had to tick down or up at least ten times before ticking up or down, and the price change had to exceed 0.8%. They reference Francis Bacon in their interest in studying these “black swan events” as “it is in such moments that a complex system offers glimpses into the true nature of the underlying fundamental forces that drive it.” They also determine that the nature of black swan events changes fundamentally as “the duration threshold is reduced beyond typical human reaction times.” The paper reflects a shift away from a market made up of mixed decisions between humans and machines, in which humans have time to assess information, to a system primarily governed by ultrafast machines dictating pricing.
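To make that definition concrete for myself, here is a minimal Python sketch of how such a run could be flagged in a list of tick prices. It is my own reading of the definition as summarized above, not the authors' code: the function names, the handling of unchanged ticks (they simply end a run), and measuring the change relative to the price at the start of the run are all assumptions.

```python
def find_extreme_runs(prices, min_ticks=10, min_change=0.008):
    """Return (start, end, fractional_change) for each run of at least
    min_ticks consecutive ticks in one direction whose total price
    change exceeds min_change (0.008 = 0.8%)."""
    events = []
    start = 0       # index where the current run began
    direction = 0   # +1 rising, -1 falling, 0 flat or not yet known

    def record(end):
        # Assumption: change is measured against the price at the run start.
        ticks = end - start
        change = (prices[end] - prices[start]) / prices[start]
        if direction != 0 and ticks >= min_ticks and abs(change) > min_change:
            events.append((start, end, change))

    for i in range(1, len(prices)):
        diff = prices[i] - prices[i - 1]
        step = (diff > 0) - (diff < 0)   # sign of this tick
        if step != direction:
            record(i - 1)                # the previous run ended at i - 1
            start = i - 1
            direction = step
    if len(prices) > 1:
        record(len(prices) - 1)          # close the final run
    return events


# Invented example: eleven straight down-ticks (about -1.1%), then a reversal.
ticks = [100.0] + [100.0 - 0.1 * k for k in range(1, 12)] + [99.5]
crashes = find_extreme_runs(ticks)
```

On real tick data the timestamps would matter too, since the paper is specifically about events shorter than human reaction time; this sketch ignores duration entirely.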
These articles have made me intrigued by the “quantum” properties of HFT at the sub-second time level. In the same way that Newtonian physics gives way to different behaviors at the subatomic level, financial systems (generated from the interplay of programmed agents collectively displaying complex behaviors) have similar properties at the sub-second level – namely at time periods below the threshold of human perception: 650ms. As Wired states, “While market behavior tends to rise and fall in patterns that repeat themselves, fractal-style, in periods of days, weeks, months and years, ‘that only holds down to the time scale at which humans stop being able to respond,’ said Johnson. ‘The fractal gets broken.’”
These areas of inquiry lead me back to the first paper I read on this topic, when my interests were primarily on Colocation and the effect of distance on high frequency trading. I went back and re-read it, and some things were made clearer, and others not. There is a lot of math there, but here’s what the paper describes in my understanding.
Wissner-Gross, a., & Freer, C. (2010). Relativistic statistical arbitrage. Physical Review E, 82(5), 1-7. doi:10.1103/PhysRevE.82.056104
1. The speed of HFT – with typical trade latencies below 500 microseconds – has made the speed that information travels over distance relevant. Basically, firms are bumping up against a fundamental physical constant, the speed of light.
2. The paper calculates optimal nodes for communication with multiple exchanges.
3. “Within financial markets, the relevant time series are typically the logarithms of the prices (log-prices) of financial instruments.” (look into log prices)
4. They use the Vasicek model to describe the behavior of “correlated financial instruments.” Based on Brownian motion.
5. The optimal intermediate location simplifies to the two center locations weighted by speeds of reversion. The speeds of reversion scale with market turnover velocities. The optimal intermediate locations are midpoints weighted by turnover velocity.
6. Note that while some nodes are in regions with dense fiber-optic networks, many others are in the ocean or other sparsely connected regions, perhaps ultimately motivating the deployment of low-latency trading infrastructure at such remote but well-positioned locations.
7. “Such slowing or stopping of the propagation of pricing information due to arbitrage is somewhat analogous to the refraction and scattering of light by a dielectric medium, but novel in an econophysical context…This result also raises the possibility of establishing arbitrage analogs of other concepts from optics and acoustics, such as reflection and diffraction.” Perhaps there’s something here I can tap for an installation.
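Since point 3 left me a note to look into log prices, here is a small Python sketch of what they are, as far as I understand it: the natural logarithm of each price, with the difference between consecutive log-prices giving the log return. The function names and the sample prices are made up for illustration; nothing here comes from the paper itself.

```python
import math


def log_prices(prices):
    """Natural log of each price in the series."""
    return [math.log(p) for p in prices]


def log_returns(prices):
    """Difference between consecutive log-prices."""
    logs = log_prices(prices)
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Invented prices: two consecutive 10% gains.
series = [100.0, 110.0, 121.0]
returns = log_returns(series)

# Log returns add up: their sum equals log(last / first).
total = sum(returns)
```

One reason this is convenient is that equal percentage moves show up as equal steps regardless of the price level, so stocks trading at very different prices can be compared on the same scale.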
I’m still trying to determine how to tie what I’d like to build for Thesis, with the research, programming, and writing I’ll be doing in Algorithms, Research Studio.
Miguel and I worked together to generate financial data as a three-dimensional landscape. The goal was to see whether a 3D environment would be a new or interesting way to explore a familiar (to some more than others) view of financial data: the daily stock charts. Conceptually, we wanted to connect the idea that the financial system is really an environment that we are all living within, and feeling the effects of. The thought was that data that evoked a natural landscape, like mountains or canyons, would make people connect the financial system to a “natural” system. Other artists have already made this metaphorical connection, like Michael Najjar, but we were curious about what the effects of rendering this in a manufactured 3D environment might be.
We began with an OpenFrameworks program that Miguel wrote to parse Yahoo Financial Data, and render it as a heightmap. We chose a window of historical stock data to analyze: Jan 1, 2000 – Feb 7, 2012, which we hoped would give enough of a range of time, but not be too long before certain tech stocks were publicly traded.
Our first thought of how to render the data so that it would make a terrain was inputting it into a Perlin Noise function. Miguel found a good tutorial for generating noise in C++, which also provided a better understanding of how noise works, and then wrote a nice program that parsed the data csv, and fed values into noise. We determined that the closing price each day and the volume of trades were the values we should take into consideration the most, and fed these values into noise.
The result makes for good terrain, lots of peaks and valleys, but I started to worry that it was a bit too random – that the comparison of multiple stocks would cease to have meaning.
I tried going about generating a 3D mesh straight from the data itself. I finally plotted the points, but although they ended up in 3D space it looked pretty linear (or planar I should say). I needed something that had volume.
Just to illustrate, here’s the data graphed in 3D, with time as the x value, volume as y, and closing price as z:
It’s a lot truer to the data, but it doesn’t give me the same terrain to play with as the noise heightmap. So, I started thinking about various ways I could achieve a similar result while staying “truer” to the data. I tried to figure out if there was a way to start with a base plane mesh, and then extrude along the y axis for every point that aligned with the x and z values. It turns out that an image is a good way to generate a planar mesh (at least in OF), so after talking with Greg, I decided to write up a quick Processing sketch that took 2 columns of data as input and output a grayscale image. This combined the grayscale heightmapping with the mesh technique of mapping the data points to a shared grid system.
To generate the grayscale image, my process was:
1. Get data to read into Processing
2. Map data to width and height of image. x = time, y = closing price
3. Map volume data (what we want to extrude by) to color, 0 – 255.
4. Step through x and y, but plot volume as grayscale circles, radius getting smaller as color gets lighter (taller)
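The four steps above can be sketched in plain Python as follows. This is not the original Processing sketch, just a rough reconstruction under assumptions: the `remap` helper imitates Processing's `map()`, the image is a plain list of rows rather than a real image object, the radius limits are arbitrary, and where circles overlap I keep the brighter value.

```python
def remap(value, in_min, in_max, out_min, out_max):
    """Linear remap, similar to Processing's map().
    Returns out_min when the input range is empty."""
    if in_max == in_min:
        return out_min
    t = (value - in_min) / (in_max - in_min)
    return out_min + t * (out_max - out_min)


def render_heightmap(rows, width=64, height=64, r_max=6, r_min=1):
    """rows: list of (close, volume) in time order.
    Returns a height x width grid of gray values 0..255."""
    closes = [close for close, _ in rows]
    volumes = [volume for _, volume in rows]
    image = [[0] * width for _ in range(height)]   # start all black

    for i, (close, volume) in enumerate(rows):
        # Step 2: x = time, y = closing price (higher price nearer the top).
        cx = round(remap(i, 0, len(rows) - 1, 0, width - 1))
        cy = round(remap(close, min(closes), max(closes), height - 1, 0))
        # Step 3: volume becomes a gray value.
        gray = round(remap(volume, min(volumes), max(volumes), 0, 255))
        # Step 4: lighter (taller) circles get a smaller radius.
        radius = round(remap(gray, 0, 255, r_max, r_min))
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    # Assumption: overlapping circles keep the brighter value.
                    image[y][x] = max(image[y][x], gray)
    return image


# Invented sample: (close, volume) for three days.
sample_rows = [(10.0, 100), (20.0, 300), (15.0, 200)]
sample_image = render_heightmap(sample_rows, width=32, height=32)
```

One side effect of mapping volume straight to 0..255 is that the lowest-volume day comes out black and disappears into the background, so starting the gray range a little above zero might be worth trying.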
Here is the image result from this quick test:
After testing it in OpenFrameworks to render a 3D mesh, I decided that this was the direction to move forward with in Unity.
With this in mind, Miguel and I sat down to start generating the terrain in Unity. Working from the HeightmapGenerator from the Procedural Examples, we set out to create a financial data terrain programmatically. There might have been a better use of our time than trying to do everything in code, but it definitely made us a lot more familiar with the Unity docs and general structure of writing programs for this environment.
Unsure how much would be visible if we didn’t generate it programmatically, we erred on that side, but underestimated the time it would take to implement. Saving the scene wouldn’t work as we tested out placement and textures. The end result is a bit obvious, and looks sort of like a candy raver emerald city in the clouds, but I like it. Here are some screenshots.
Though not interactive, or even animated, the final result does use an external input to generate imagery. The interaction is primarily camera flyaround. Next steps like changing the terrain based on other financial forces (overall market trends, Dow Jones Divisor) might be interesting. The stocks featured are the 30 stocks in the DJIA – Dow Jones Industrial Average.
Continuing from my previous post, I was able to get OpenFrameworks working to extrude the brightness values into the Y axis. I tried first with the noise images that Miguel and I generated at the beginning of the week.
Here’s the noise image in 2D:
And here it is extruded into 3D:
Here you can see them both, so you get the idea of how the brightness maps to height:
So, the form looks pretty good. But it also looks pretty random. By feeding financial data into a noise function, we can create a great looking terrain, but it starts to be so far removed from the data itself that it loses meaning. So, I tried to think about how I could generate mesh straight from the data. I finally plotted the points, but although they ended up in 3D space it looked pretty linear (or planar I should say). I needed something that had volume.
Just to illustrate, here’s the data graphed in 3D, with time as the x value, volume as y, and closing price as z:
It’s a lot truer to the data, but it doesn’t give me the same terrain to play with as the noise heightmap. So, I started thinking about various ways I could achieve a similar result while staying “truer” to my data. I tried to figure out if there was a way to start with a base plane mesh, and then extrude along the y axis for every point that aligned with the x and z values. It turns out that an image is a good way to generate a planar mesh (at least in OF), so after talking with Greg, I decided to write up a quick Processing sketch that took 2 columns of data as input and output a grayscale image. This combined the grayscale heightmapping with the mesh technique of mapping the data points to a shared grid system.
To generate the grayscale image, my process was:
1. Get data to read into Processing
2. Map data to width and height of image. x = time, y = closing price
3. Map volume data (what we want to extrude by) to color, 0 – 255.
4. Step through x and y, but plot volume as grayscale circles, radius getting smaller as color gets lighter (taller)
Here is the image result I got from this quick test:
I opened it up in Photoshop to do a little blurring. I’m not sure if it was necessary, but I didn’t want jagged steps between gradient values. Oh, and I also cropped, which I should probably implement in code but oh well:
Final step was importing the image back into OF and generating a heightmap there. I used code from James George which made this pretty simple. I ran into a memory error, but I believe that had to do with pixel referencing or something. Here’s a little screencap video that I set to music so it’s a bit more enjoyable to just hang out in dataland. The glitches are due to my computer not enjoying recording and running the app at the same time. Perhaps I need to ramp down the framerate for next time, not that I mind it much.
This week has been productive in some ways, but I’m still feeling a bit behind where I’d like to be in terms of determining the best direction to take in regards to “visualizing” high frequency trading, and conveying its qualities to a non-financial audience.
I submitted a proposal for an article to Triple Canopy magazine that would explore the colocation aspect of High Frequency Trading. I will see whether they’d like to work with me to pursue this research by early March. If they don’t, I intend to use the paper I write in Research Studio to reflect the direction I proposed to Triple Canopy. Hopefully this will become settled once I really determine the focus of my research.
After I was able to get the Bloomberg API running in Eclipse, I realized (with Dan Shiffman and Heather’s guidance) that I wouldn’t be able to access any of the data without an active subscription. Apparently the API for Bloomberg software is open, but the data itself is still commodified, and very expensive to access. I wrote to them inquiring about student discounts, but received a reply asking me if real-time data was totally crucial to my project. At this point, it’s not, but having access to the Bloomberg API would allow me to call so much data incredibly easily that it’s a shame I can’t work that way. Instead, I’ve resorted to downloading CSV files of historical data from Yahoo Finance, which gives me the bare minimum of datasets, but which are probably sufficient for now. I can access Date, Open, Close, High, Low, Volume and Adjusted Close. From looking into the metrics of the stock market a bit more, I realized that the Close value and the Volume are what everyone watches for.
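For reference, pulling just the Close and Volume columns out of one of these CSV files could look something like the Python sketch below. The parser Miguel wrote was in C++, so this is only an illustration. The header names and their order are my assumption about the download format, and the two sample rows are invented rather than real quotes.

```python
import csv
import io

# Invented sample in the assumed column layout; not real market data.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume,Adj Close
2012-02-07,10.0,10.5,9.8,10.2,1500000,10.2
2012-02-06,9.9,10.1,9.7,10.0,1200000,10.0
"""


def load_close_and_volume(text):
    """Return (date, close, volume) tuples sorted oldest first."""
    reader = csv.DictReader(io.StringIO(text))
    rows = [(r["Date"], float(r["Close"]), int(r["Volume"])) for r in reader]
    # ISO-style dates (YYYY-MM-DD) sort chronologically as plain strings.
    rows.sort(key=lambda row: row[0])
    return rows


history = load_close_and_volume(SAMPLE_CSV)
```

For a real file, the text would come from `open(path).read()` instead of the inline sample, and sorting oldest first keeps time running left to right when the data gets plotted.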
If I were to write my own HFT algorithm, I would need access to a certain stock’s buy and sell pricing, as well as the share volumes available at those prices. As far as I know, this data is not available in the free historical datasets. I will have to keep looking into alternatives, or press the Bloomberg people I’ve been corresponding with that access to the API is essential to the project.
I began using the historical datasets to see if I could visualize the data in 3D space. One aim is to generate a 3D landscape from financial data. I worked with Miguel Bermudez, who wrote a CSV parser in C++ (thanks Miguel!), and then we brought the data into OpenFrameworks. We debated various avenues, but decided that feeding the data into a noise function would be an easy way to generate nice looking topography. It’s definitely not the most accurate, as 2D noise essentially turns the data into a lot of structured noise, but the effect makes for a nice way to extrude the texture into 3D space, mapping brightness to the y axis. Miguel found a nice tutorial on noise in C++ that we followed, in order to generate some “financial noise.”
Next step is to map the y coordinates to the brightness of the image. Then it should be a 3D model. My next goal is to have “real-time” data animating through the 3D stock terrain. I’m having trouble determining which part of the data makes sense for this and how, but hopefully with more thought I’ll get it.
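As a note to self, the brightness-to-height step can be sketched in Python like this. It assumes the image is already available as a grid of gray values from 0 to 255, and the function names and the `max_height` scale are placeholders of mine. In openFrameworks the resulting vertices and triangle indices would be handed to a mesh, which this sketch does not attempt.

```python
def heightmap_to_vertices(image, max_height=100.0):
    """image: list of rows of gray values 0..255.
    Returns (x, y, z) vertices where y is the extruded height."""
    vertices = []
    for row_index, row in enumerate(image):
        for col_index, brightness in enumerate(row):
            y = brightness / 255.0 * max_height   # brighter means taller
            vertices.append((float(col_index), y, float(row_index)))
    return vertices


def grid_triangles(width, height):
    """Vertex indices for two triangles per grid cell, row-major order."""
    triangles = []
    for r in range(height - 1):
        for c in range(width - 1):
            i = r * width + c                      # top-left corner of the cell
            triangles.append((i, i + 1, i + width))
            triangles.append((i + 1, i + width + 1, i + width))
    return triangles


# Tiny invented 2x2 image: one white pixel, one mid-gray, two black.
tiny = [[0, 255],
        [128, 0]]
tiny_vertices = heightmap_to_vertices(tiny)
tiny_triangles = grid_triangles(2, 2)
```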
I also tried experimenting with generating a 3D mesh straight from the data, essentially mapping x to time, y to normalized volume, and z to normalized close. Instead of a plane, it generates more of a line. I’m trying to map mesh to an image so that it looks as if it’s coming out of a plane, but haven’t quite finished that yet. The first image is before I normalized the data, mapping the min and max values to 0 and 1000.
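The normalization itself is just a linear rescale of each column so that its minimum lands on 0 and its maximum on 1000. A minimal Python sketch, with hypothetical function names and made-up numbers, assuming time is simply the row index:

```python
def normalize(values, out_min=0.0, out_max=1000.0):
    """Rescale values linearly so min maps to out_min and max to out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to stretch; put everything at out_min.
        return [out_min for _ in values]
    scale = (out_max - out_min) / (hi - lo)
    return [out_min + (v - lo) * scale for v in values]


def to_points(volumes, closes):
    """x = time index, y = normalized volume, z = normalized close."""
    ys = normalize(volumes)
    zs = normalize(closes)
    return [(x, y, z) for x, (y, z) in enumerate(zip(ys, zs))]


# Invented values for three days.
points = to_points([2, 4, 6], [10.0, 30.0, 20.0])
```

Without this step the volume column (millions of shares) dwarfs the closing price (tens of dollars), which is probably why the un-normalized plot looked so flat.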
I’m going to continue pursuing the 3D landscape generation from the data for another week. Then I plan to assess whether this direction makes sense with my larger goals, and if so, how I can improve my visualization algorithms to better reflect qualities in the data sets.
I talked to three alumni on Monday about my project direction: Sherri ?, Gabe B-C, and Drew Burrows. All three found the concept compelling, once I was finally able to summarize what I’m interested in. They all said that it sounded like I had a few directions I was considering, and that I’d need to decide soon which direction was the most compelling.
Out of the three I spoke with, Sherri picked up the most on the idea of a map or visualization on the distance – speed – value relationship, or Value in terms of distance and time. She pointed me to a few references and inspiration sources:
Aaron Straup made Pretty Maps while working at Stamen. He also has nice Flickr sets.
Sherri said she was currently working on an exhibit with David Breashears that documents the receding Glaciers in the Himalaya. She pointed out that the source photographs from the early 1900s are all surveying tools, and how that could be an interesting angle to tell the story.
Trevor Paglen’s work – especially on investigating the hidden.
She also told me to focus on a simple story that manifests the idea – an entry point. When I described the Spread Networks cable, she said that the communication between Chicago and New York could be a good entry point.
She also told me to really focus on who is my audience – namely in the data viz infographic vs art piece debate, which affects the level of accuracy I should be looking to portray, as well as if this is a “tool” a la the interactive map with a slider, or a piece that provokes conversation – a la the maps as critical design tool.
Her final advice to me was to get all the iterations and angles that I could possibly go in out and down on paper. Then select and refine as I choose a focus.
The next alum I spoke with was Gabe B-C. I wanted his opinion on a video piece I did last semester, which is a sketch for one direction I could take this project. When I showed him the piece, he told me to look up Marco Brambilla’s work, especially the pieces he did with gridded comparisons, like every revolving restaurant, or split screen reaction shots of first person shooter games.
When I mentioned that one idea I had was a roadtrip from New York to Chicago traveling along the same route as the fiber optic cable, he said he could see a series of pieces inspired by transactions – or doing things with technology and money, but sloooowwwly. I’d have to engage in transactions on either side of the journey for sure. He also got me thinking along more physical lines, like making an actual sculpture that might be influenced by financial data – a glacier whose temperature is controlled according to financial data.
I talked to Drew next, who seemed to have the hardest time understanding what I wanted to do, or perhaps more, why I wanted to do it. He said he liked the idea of the 3D landscape generated by real-time financial data, but that the surveyor maps were the hardest to grasp.
Last, I also talked to Kathy, who nicely volunteered to reach out to people in the NYU Financial Mathematics Dept, basically the Wall Street Quant program that NYU has. She also said to look into Mae West and Mae East, which are routing maps of the internet. I have a lot more research to do, but I found a list of IXPs, Internet Exchange Points, where the various ISPs, Internet Service Providers, exchange traffic between their networks. This happens in various physical locations around the world.
Our first assignment in Research Studio Algorithms was to research and summarize relevant research pertaining to our topics. I had been reading a few papers about High Frequency Trading over break, but took this as an opportunity to delve more deeply into the topic. My sources are a combination of academic papers in finance, economics, math, physics, law, as well as news articles, and general books on the topic.
For a more in depth description on High Frequency Trading and the qualities I will be looking into, download my Literature Review here.
Goals:
- Have a greater understanding of High Frequency Trading Algorithms
- Research methods of analyzing financial data with models used for weather and naturally occurring patterns
- Triple Canopy article on High Frequency Geography – whether that’s the co-location angle, or new nature angle TBD
- Analyze Bloomberg API in realtime – make my own algorithm
Week1
- Start researching High Frequency Trading at Bobst / Courant
Week2
- Start writing Literature Review, compiling sources
Week4
- Get Bloomberg API working in Java
- Read through Developer’s Guide
- Determine which data is most important
Week5
- Use historical financial data to create perlin noise 2D images
- Extrude 2D images into 3D
- Determine how realtime financial data should affect 3D datascape
Week6
- Investigate making docs or a processing wrapper for Bloomberg API
- Research stochastic and chaotic models for financial data
- Research value – distance – speed data for colocation
Week7
- Make documentation or wrapper for Bloomberg API in processing
- Begin visualizing stochastic and chaotic models for financial data
- Finalize topic
Week8
- Final Topic is due
- Continue researching natural and financial visualization models
- Assess success of stochastic and chaotic models for financial data
Week9
- Begin writing research paper
- Finish coding data analysis
Week10
- Present rough draft of research paper
- Finalize data models
The second week’s assignment in AppNewTech was to do something with faces. Kyle McDonald does a lot of work with faces, facial recognition, face substitution, and has a great list of resources to peruse when you have some time.
I decided to mess around with Kyle’s ofxFaceTracker addon, which requires the FaceTracker library that Jason Saragih releases to people if they email him and ask nicely. I wanted to make a symmetry mirror, which takes one half of your face, then reflects it across your face to draw the other half. Due to the way the Face Tracker library was trained (with images of faces facing slightly to the right), the effect looks a lot more distorted than I thought it would. I also made a way to toggle back and forth between reflecting the left side of the face, and the right side of the face. This is especially visible when there’s a strong lighting difference on each side of the face, as visible in the example with Stepan below:
My code is up on github if you want to check it out. You won’t be able to run it without Jason’s library though.
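If you just want the gist of the mirroring without the library, here is a stripped-down Python sketch. It is not the code in my repo: it works on a plain grid of pixel values and reflects about the center of the image, whereas the real version relies on the tracked face, so treat the function name and the approach as a simplified stand-in.

```python
def mirror_half(image, use_left=True):
    """image: list of rows of pixel values.
    Returns a new image where one half is reflected onto the other.
    Assumption: the axis of symmetry is the vertical center of the image."""
    mirrored = []
    for row in image:
        width = len(row)
        new_row = list(row)
        for x in range(width // 2):
            if use_left:
                new_row[width - 1 - x] = row[x]    # copy left onto right
            else:
                new_row[x] = row[width - 1 - x]    # copy right onto left
        mirrored.append(new_row)
    return mirrored


# One invented row of four pixels.
left_version = mirror_half([[1, 2, 3, 4]])
right_version = mirror_half([[1, 2, 3, 4]], use_left=False)
```

Switching `use_left` is the same idea as the toggle between reflecting the left and right sides of the face.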
For the first assignment in Kyle McDonald’s class Appropriating New Technologies, I worked from the Face Flip example he showed us. I worked mostly just at getting up and running with OpenFrameworks and github, since Kyle wants us to work with Git exclusively in order to collaborate.
One of the starting points of the code we worked on was an example from Zach Lieberman that lets you grab and manipulate pixels beneath the program window. Next I did Haar Detection to find faces in the program window, with the goal of sending the program out to chatroulette via CamTwist, and replacing my video feed with the person’s own video feed manipulated by my code.
I had high hopes for this assignment – namely trading faces with someone else on chatroulette – but of course didn’t get as far as I hoped in a week. I ended up doing something fairly rough and simple – a “head spin” – which looks rough but still has a neat effect. Here’s a video of it in action on chatroulette.
My code is up on github if you’d like to peruse, in the week 1 folder.