Data – Intraday Trading Data and R

Posted: May 15th, 2012 | Author: genevieve | Filed under: Data, Thesis

So a few weeks ago I had the good fortune to get my hands on intraday trading data, thanks to the kindness of the folks at Nanex Research. I asked for a plain old text file rather than learn their custom software and API, and what I received was a 23 GB text file. Now, it’s a bit difficult just to open a 23 GB text file and see what’s inside, since most programs try to load the whole file into memory at once.

But I was excited to see what was inside nonetheless. This was also the dataset that I’d been waiting all semester to receive, and the one that I was looking forward to exploring more in depth in Mark Hansen’s Data class. Unfortunately, it came to me when there were only about 2 weeks left in the semester, but Mark was kind enough to sit down with me and go over some basic techniques of how to open, parse and model the data inside this massive file.

So, first of all there are some helpful commands in Terminal which allowed me to see what was inside the file, and then make smaller files that R could actually open. For instance, just to see what the file contained, I typed:

head 'filename'

Then, in order to see more of the file, I typed:

head -n 10000 'filename' > tenthousandlines.txt

This gave me a new text file, named “tenthousandlines.txt” which contained the first ten thousand lines of the original file.

Another useful command is grep. Grep searches for a term and returns every line that contains it. Looking through the head file with Mark, we noticed that one of the first actual trades was for the stock symbol eNOK, which stands for Nokia. So we used grep to search the big file for “eNOK” and saved all the matching lines into a new file.
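A minimal sketch of that kind of search, for reference. The sample rows are a tiny stand-in modeled on the eNOK lines printed later in this post; the pipe-separated raw layout, the second symbol eXYZ and the file names are assumptions for illustration, not the actual Nanex file.

```shell
# Tiny stand-in for the 23 GB file; the layout and the eXYZ symbol are assumed.
cat > sample_quotes.txt <<'EOF'
MQ|NYSE|eNOK|04:00:00.175|04:00:00|PACF|4.03|190|4.05|10
MQ|NYSE|eXYZ|04:00:00.200|04:00:00|PACF|9.10|100|9.12|100
MQ|NYSE|eNOK|04:00:01.125|04:00:01|PACF|4.03|290|4.05|10
EOF

# -F treats the pattern as a fixed string; > writes the matching lines to a new file.
grep -F 'eNOK' sample_quotes.txt > enok.txt

# Count how many lines matched.
wc -l < enok.txt
```

Because grep streams through the file line by line, it never needs to hold the whole file in memory, which is why it copes with a file this size.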

Also, by typing ‘tail’ to see the last lines of the large file, we were able to check its final timestamp. It turns out that a 23 GB file of electronic exchange activity (on what day I’m not sure) translates to about 4 hours’ worth of data.

We also made a file that counted how many lines share the same timestamp (to the second). The resolution of the exchange data is 25 milliseconds, but we wanted to get a feel for how much activity there was at different times of the day.
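One way such a count file could be built is sketched below. It assumes the raw file is pipe-separated with the second-level timestamp in the fifth field (as the eNOK rows later in this post suggest) and that lines are already in time order; the sample rows and file names are stand-ins.

```shell
# Stand-in rows; the real file layout is assumed from the eNOK output shown later.
cat > sample_quotes.txt <<'EOF'
MQ|NYSE|eNOK|04:00:00.175|04:00:00|PACF|4.03|190|4.05|10
MQ|NYSE|eNOK|04:00:01.125|04:00:01|PACF|4.03|290|4.05|10
MQ|NYSE|eNOK|04:00:03.875|04:00:03|PACF|4.06|500|4.09|500
MQ|NYSE|eNOK|04:00:03.875|04:00:03|PACF|4.07|500|4.09|500
EOF

# cut keeps only field 5 (timestamp to the second).
# uniq -c counts runs of identical adjacent lines, so it relies on the time ordering.
cut -d'|' -f5 sample_quotes.txt | uniq -c > count.txt

cat count.txt
```

With this approach the count lands in the first column of count.txt, which is the column R names V1 when the file is read with read.table.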

Next we brought the files into R.

We took a few approaches to understanding what was in the data. Keeping with the quote activity over the course of 4 hours, we first graphed that activity by typing:

cntPerSec = read.table("Desktop/count.txt")
plot(cntPerSec$V1,type="l")

This yielded a plot that looks like this:

We wanted to get a better look at what was happening in the later part of the data, so we typed:

plot(cntPerSec$V1[12000:15000],type="l")

Which gave us a plot that looked like this:

Then we did the next logical thing, which was to plot it as a histogram:

hist(cntPerSec$V1)

This looked like the familiar “hockey stick” so we took the obvious next step and plotted a histogram of the log of the counts:

hist(log(cntPerSec$V1))

This gave us a very normal looking distribution:

So normal that Mark showed me how to draw a normal Q-Q plot (qqnorm), which from a little googling compares the quantiles of the data against the quantiles of a reference distribution, in our case to see whether the logged counts follow a normal distribution (right Mark?)

So we typed:

qqnorm(log(cntPerSec$V1))
qqline(log(cntPerSec$V1))

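Which gave us the QQ-norm plot and a reference line. By default, qqline draws that line through the first and third quartiles rather than fitting a linear regression; the closer the points sit to it, the more normal the data looks.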

So the timing of things was one way to approach the dataset, but we also looked at a more traditional way of approaching financial data, stock by stock. We took the eNOK file we had made with grep in Terminal into R and looked at what was there.

The main thing we did was plot the price of eNOK stock over the course of the 4 hours. We typed:

> enok <- read.table("Desktop/enok.txt",sep="|",as.is=T)
>
> head(enok)
  V1   V2      V3           V4       V5   V6   V7  V8   V9 V10
1 MQ NYSE eNOK    04:00:00.175 04:00:00 PACF 4.03 190 4.05  10
2 MQ NYSE eNOK    04:00:01.125 04:00:01 PACF 4.03 290 4.05  10
3 MQ NYSE eNOK    04:00:03.875 04:00:03 PACF 4.06 500 4.09 500
4 MQ NYSE eNOK    04:00:03.875 04:00:03 PACF 4.07 500 4.09 500
5 MQ NYSE eNOK    04:00:13.100 04:00:13 PACF 4.08 150 4.09 500
6 MQ NYSE eNOK    04:00:13.125 04:00:13 PACF 4.07 500 4.09 500
> plot(enok$V7,type="l")

And plotted this chart:

While it’s not necessarily the most interesting thing to learn about the data, to my delight, the chart looks a lot like the Nanex charts I’ve been poring over the whole semester. So I realized that using R is the way it’s done.

A huge thanks to Mark Hansen for encouraging me to keep asking for data even after getting refused numerous times. I’m looking forward to using the techniques he showed me to keep poring through this dataset. My first plans will most likely be a sonification of actual trades through the constant noise of orders posted and cancelled. But I’m very excited to take a look at the activity happening at a sub-second timescale. It’s pretty mind-boggling, but I guess that’s how 4 hours fills up a 23 GB text file.


Mapping

Posted: March 21st, 2012 | Author: genevieve | Filed under: Research Studio Algorithms, Thesis, Uncategorized

Two serendipitous things happened today.

Toby sent me an article on a plan in the works where three different companies, one Russian, one Canadian and one American, are all investing heavily in laying high-speed fiber optic cables that would traverse the Arctic Circle and provide much faster connections between the US, Western Europe and Japan. Despite the huge investment and undertaking (each cable is estimated to cost between $600 million and $1.5 billion, and will reduce latency between London and Tokyo by 30%), this is only possible due to global warming and polar ice caps receding significantly in the last few years.

I believe that this cable, this story, really brings together the connections between the financial system and the environment that I’ve been trying to deal with in more metaphorical ways (with the Unity landscape). Currently, I’m trying to figure out how to relate the speed gained by investing billions into these cables, an idea only made possible by human impact on our planet, and the effects that the financial system (and the infrastructural feats we are willing to do in its name) will continue to have on the environment. Here is the map of the planned cable.

I also met with Heather earlier in the day about my paper topic for her research studio class and my thesis project. Since I haven’t actually completed the projects I want to make, I can’t necessarily write the artist’s paper she had in mind for me. I still want to frame what I’ve been researching in that vein, but I may have to focus on other artists who have tried to do similar things.

So that got me looking at a book I’ve had for a while called Else/Where: Mapping New Cartographies of Networks and Territories. Flipping through, I came across a diagram that immediately resonated with me in light of the previous map.

The diagram, called “Centers and Peripheries,” was originally made by geographer Denis Retaillé in 1992, but was included in a 1994 volume on the “globalization of capital” by the economist François Chesnais. In his chapter “Counter Cartographies,” Brian Holmes discusses the map:

This map shows three things. First, a circuit linking the United States, Western Europe and Japan, the so-called “Triad” regions, which form a “global oligopoly” accounting for the majority of industrial and financial exchanges. Second, the major nodes of the world network, represented by densely outlined circles. And third, the hierarchical relations between the regions, as described with these categories: center; periphery integrated to the center; annexed periphery; exploited periphery; abandoned periphery. Chesnais performs a Marxist analysis, showing how globally fragmented production lines are coordinated through the computerized circuits of the financial sphere. His map describes the hierarchy of social relations in a post-national era, when no political formation can erect any substantial barrier to the dictates of capital. And it reveals the near-perfect correlation between the graph of virtual flows and the geography of human exploitation.

I need to think about the relationship of these diagrams a bit more, but it’s as if one is predicting the existence of the other.


Post Spring Break Thesis Update

Posted: March 19th, 2012 | Author: genevieve | Filed under: Thesis

There have been some positive developments in the past few weeks, but since I spent my Spring Break in Austin I did not get as much work done on my thesis as I hoped to.

Access to Intraday Trading Data
I am now in touch with Knight Capital to get one day’s worth of exchange data. When I met with Marius Watz for my 1-on-1 meeting a few weeks back, he mentioned that he’d done a project for them, called Stockspace. He told me to drop his name and see if they’d share data with me like they did for him. Finally, after a few weeks of emails, they are working to get me data for a day’s worth of exchange activity, as well as one day’s worth of Knight trading activity. This will really help me get a feeling for the intraday behavior of the trading exchanges, which is especially important since high frequency traders buy and sell during the day, but try to leave their position “flat” at closing. Basically, this means they don’t want to end the day holding shares of stock whose values might drop. I hope to use Mark Hansen’s Data class to really explore the intraday data and see what kinds of patterns and visualizations emerge.

I may use this data to animate a landscape in Unity similar to the one I’ve been experimenting with this semester. Here is a video of the latest version, though it’s still not where I envision it in the long run.

Hunting down data for Colocation Map
I spent a lot of my Spring Break hunting down data on the locations of trading exchanges, as well as colocation facilities that service the proprietary trading firms that engage in high frequency trading. Following Dave Boyhan’s suggestion, I did a lot of whois.com searches on the IP addresses of the firms in this not completely inclusive list of prop trading firms. Each search returns latitude and longitude coordinates, but I’m not quite sure if they’re really accurate. I’m also torn about whether to list the trading firms at their official addresses, or to track down where they colocate. I will try to do both, but again, there’s a lot of private data so I’m also limited to what I have access to.

Spread Networks, which currently runs the fastest fiber cable between New York and Chicago, making a roundtrip in 13.10 ms, updated their website recently and made a lot more of their locations public. They also posted a map of “amplification sites” along the way. This makes me really want to roadtrip to all of these towns, but I’m afraid that dream may have passed since I spent my Spring Break eating tacos in Austin.

Still, it’s great that they’re publishing more of their data. I’d like to contrast the Spread Networks cable with a few that are a tiny bit slower, but are much cheaper because of it. I’m assuming that the real HFTs use Spread Networks, since even a 1 ms advantage is worth the cost in most cases.
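To get a rough feel for what these numbers mean in distance, here is a back-of-the-envelope sketch. It assumes signals travel through fiber at the vacuum speed of light divided by a refractive index of about 1.47 (a typical textbook value for glass fiber, not a Spread Networks figure) and ignores all switching and amplification delay. The function name is made up for this example.

```shell
# One-way fiber length (km) implied by a round-trip time in milliseconds.
# Usage: fiber_km_for_round_trip <round_trip_ms> [refractive_index]
fiber_km_for_round_trip() {
    awk -v ms="$1" -v n="${2:-1.47}" 'BEGIN {
        c_km_per_ms = 299792.458 / 1000      # speed of light in vacuum, km per ms
        printf "%.1f\n", ms * (c_km_per_ms / n) / 2
    }'
}

fiber_km_for_round_trip 13.10   # the 13.10 ms round trip quoted above
fiber_km_for_round_trip 1       # a 1 ms round-trip advantage
```

Under these assumptions, a 1 ms round-trip edge corresponds to roughly a hundred kilometers of cable, which gives a sense of why a straighter route matters so much.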

In addition to the Klondike Gold Rush map, I’m looking at some maps Stamen made of London, in which they made a heat map corresponding to the time it takes to commute downtown.

GPS Spoofing
I’ve spent a while weighing the pros and cons of this since I last presented it in Thesis class a few weeks ago. I’m still fascinated by the extreme importance of time keeping in HFT, and the fact that it needs to be accurate down to the microsecond. I have ideas like making a Time Microscope or a High Frequency Time Machine, which would be imaginative objects that address the conflict of humans trying to experience computer time, or the lengths that traders might go to find that edge.

When I talked about the GPS Spoofer in Crit Group, I received interesting feedback that made me question why I wanted to make it. Abigail Simon said that the financial system is vulnerable (and problematic) due to larger issues than GPS timekeeping. I’m torn between exploring this technical facet vs. pushing some more conceptual ideas I have, especially about the relationship between finance and landscape/geography.

I’m still going to try and make the spoofer, and will hopefully have a report on that soon. I’m just not sure how to frame it. Is it a “guerilla weapon?” A sculpture? Do I have to test it in order for it to be worth doing?

Research Group Paper
I’m going to take the paper I’m writing in Heather Dewey-Hagborg’s Research Studio as an opportunity to write down what I’ve been thinking about High Frequency Trading in a more formal way. I will write an update with more specific details, but I see this paper as both explaining High Frequency Trading to a general audience and writing about the issues of time and space, geography and value that my thesis is exploring.


Research Studio Update Week 7

Posted: March 14th, 2012 | Author: genevieve | Filed under: Research Studio Algorithms, Thesis

This past week has been a whirlwind of speaking to experts and consultants about my research. In chronological order, these are the people I’ve spoken to with a few notes from our conversations.

Nancy Nowacek
Meeting with Nancy was wonderful. She immediately got my concept and was really good about offering references that she thought might be relevant. The first thing we discussed was waterlots, and a project we’d both seen at the CCA Curatorial MFA show in 2010 (I guess we’re both from the Bay Area). Sandra Nakamura makes installations with pennies that represent larger value connected with prices of land. For this piece she took a grant from CCA for the amount it would have cost to buy the waterlot it sits on during the Gold Rush. She then turned this amount into pennies.
She also referenced the Propeller Group, a collective from Vietnam who were in the last Triennial at the New Museum. They did a project where they re-branded Communism, overlaying two opposing forces in a way that makes the viewer confront the absurdity of the capitalist machine.
She told me to look at Carsten Höller’s work, since his approach to art is very grounded in his scientific background. She told me that the slides are about doubt, which is an interesting and non-interactive take on them.
Find a poetics about the technical
Try to make connections; 5-6 thought experiments
Do something that confronts the body – where the emotion lies
Look at Xavier LeRoy (choreographer) and Cassie Thornton, who has an excellent project where she turns people’s debts (bank statements, bills) into nuggets of paper mache gold, or bling.
Exercise to examine the core mechanics of a prop – light, heavy, bouncy. Emotion as an end goal.

Sean McIntyre
Sean McIntyre, a first year who does a lot of work with mesh networks, had a previous life as a high frequency trading programmer. It’s sort of the best case scenario, since he gets what we do here at ITP, and he’s not under an NDA like all current high frequency traders. He was nice enough to sit down with me last week and tell me a little bit about how the system worked (from his experience). He worked at Virtu, and did a lot of quality assurance, which was basically making sure that the algorithms worked properly before they “unleashed the beasts” (his words). He confirmed that they colocated their algorithms in three locations – Carteret, Weehawken and Secaucus – most likely so that they could communicate with the trading exchanges nearby. At this point, NYSE didn’t have their Mahwah data center built yet (2008-2010).
GETCO was the company to beat
Bankruptcy in seconds (if things went bad)
Arbitrage across data centers
Citigroup stock was consistently in the top 5 for volume. Volume was the biggest indicator for HFT, much more important than closing price. They liked Citigroup cuz it was relatively cheap, and predictable.
In terms of liquidity rebates and transaction fees (other factors beside bid and ask spread that affect HFT algo decision-making), these are negotiated individually between the exchange and each trading company. This is one of the reasons that Virtu poached Chris Concannon, a former VP of Nasdaq, due to his connections and ability to negotiate better prices for their company.
Rebates are tiered according to a firm’s performance. More volume, lower fees
Chronos – their name for a time-based strategy
Algos usually incorporated multiple strategies
Not so sure about the HFT algos that lure others by buying and cancelling – You can easily piss off an exchange by spamming them with buy/cancel orders
HFT has a data processing problem
Nasdaq exchange protocol = FIX protocol
Every exchange has a different standard of sending messages, need to figure out how to get them all to talk to each other
UDP
Messages from data center in 1 of 2 formats
Whole book or stock specific

Petter Kolm
Last Thursday I spoke with Petter Kolm from the Courant Mathematical Finance Dept, which has direct connections to Wall Street firms.
He told me that HFT algorithms are actually not that complex, they just operate really fast
He suggested that I might model one simple system – and change parameters over time
If volatility in the market goes up, algorithms become more aggressive
trading – sell-side activity, service to customers to minimize transaction costs – “agency algorithms, sell-side algos”
Aggressive HFT – one strategy is to pick off those large orders, and buy ahead in order to sell them the stock they want at a profit
Passive – place limit orders in the book
Limit – spread based on supply/demand
Aggressive HFT – instead of providing liquidity, you take it
Prop Trading Firms, Hedge Funds & HFT firms all employ different strategies at different frequencies
Market Making – they can be at the top of the limit order, buy low, sell high
colocated latency, 3-4 ms in exchange, longer if outside
Dark Pools – 30 dark pools exist
-they’re listed on the web
-just “another form of electronic trading”
-allow people who want to trade larger amounts of shares at once to execute them in one go, without the market seeing it and changing the price of the trades – “slippage”
-in dark pools, trades are marked at the midpoint between the bid and ask prices, don’t have to reach the ask.
-in reality, the avg size of a trade in dark pools isn’t as large as what they were designed to accommodate

Marius Watz
Very helpful, offered to put me in touch with Knight Capital Group, who he did dataviz for. They gave him a full day of intraday trading data for various stocks to visualize. He said I could use his name and perhaps they’d offer something similar.
He said the map sounded really interesting, referenced “They Rule”, Kevin Slavin, etc. Something he would “tweet”
He said to look at the Nanex crop circles and pick them apart with someone who might know what’s going on – they could be an opportunity to visualize

Tom Igoe
Went to office hours with Tom for his advice on the “understanding networks” angle to my research and project. He had some great ideas in terms of distilling my concept and how to proceed
Some notes from our meeting:
Not just actual locations, but distances/speeds in relation to how fast packets can travel. For instance, speed of fiber cable will connect the same two locations at different rates.
How has HFT changed the daily workflow for traders? (Trying to see if it has effects on human actors, or if things have changed over time because of it)
In addition to talking to quants, reach out to fund managers to see if this has changed the way they manage their team?
How might it affect a fund manager vs a specific stock trader?
If HFT injects liquidity it also injects volatility
Look at the GPS Spoofer article again with the faster-than-light neutrino debunk in mind – basically, they thought they had found a particle that went faster than the speed of light but really it was an error in their GPS-based timing
What do I need to ask a quant vs a trader?
Have the principles behind shorting changed (because of HFT?)
How have human trading patterns changed since HFT
Arbitrage – pure inter market arbitrage, other strategies
Tell me about some of the different methods that traders use, pattern of those methods over time, both manually and algorithmically

In addition to these first person sources, I have also been reading about human perception of time, and the time it takes to process actions and our consciousness of our actions. I read a chapter from The User Illusion, by Tor Norretranders, that described an experiment done by Benjamin Libet in which he attempted to determine the time and order of people’s consciousness of their own actions. Essentially, people react before their conscious brains do, so the whole idea of conscious action and agency originates from impulses in the body, which our brains then explain by saying we “wanted” to do something.

Another interesting thing the reading referenced is Wilhelm Wundt’s complexity clock, a clock that takes about 2.56 seconds to make a full rotation. People can still see the 3 o’clock, 7 o’clock (etc.) spots around the clock, so they can pinpoint smaller amounts of time more easily than by just trying to sense what time it was when they made a decision.

I’m also reading up on whitepapers about GPS and different high frequency trading strategies, which I will summarize in another blog post.


Thesis Update

Posted: February 27th, 2012 | Author: genevieve | Filed under: Thesis

So in the past few weeks I’ve made some progress on my direction and what I plan to make by Thesis Week.

For the map / visualization of high frequency trading and the importance of location, I’d like to create something in the style of this map, or board game, from the time of the Alaskan Gold Rush.

I’ve made some headway and found a listing of proprietary trading firms, some of which specialize in high frequency trading. I’ve started to find all their locations, though some gaps are still there. I’m also trying to locate the data centers for all of the trading exchanges in the US as a starting point, then perhaps globally if I can manage.

I also started moving in a bit of a different direction. Just as I am interested in the importance of location for high frequency trading, I’m also interested in the importance of time. HFT exists in a timescale that is below the limits of human perception. There is an excellent article that a lot of people were kind enough to forward to me last week, which pinpointed the differences in algorithmic behavior at timescales below 650 milliseconds, which is the time it takes a chess grandmaster to realize a mistake. At this timescale, the algorithms are just trying to interact with one another as opposed to human traders. Most interesting is that the fractal patterns that explain the market don’t hold up at microsecond timescales. As the researcher states, they “broke the fractal.”

This got me thinking about the behavior at millisecond increments of the market, but it also got me thinking about the importance of timekeeping in order to keep high frequency trading running. Then I came across this article, which REALLY got me thinking about how important timekeeping is to our financial system. From the GPS.gov website:

Each GPS satellite contains multiple atomic clocks that contribute very precise time data to the GPS signals. GPS receivers decode these signals, effectively synchronizing each receiver to the atomic clocks. This enables users to determine the time to within 100 billionths of a second, without the cost of owning and operating atomic clocks.

HFT uses a combination of a network timing protocol and GPS to keep time. The network protocol functions along the cable, while the GPS checks the accuracy on either end of the exchange. Since GPS receivers need to point at satellites, they are usually located on the roofs of trading exchanges. To read more about the methods for timekeeping in HFT, this is a great article.

The methods the researchers used to achieve this are a bit beyond my means, but I’ve found a few hacker articles which gave me hope that I could theoretically interact with the market by confusing the GPS signal. I plan to test this in a controlled environment, and construct a sculpture that would have the potential to cause disruption, without ever actually using it for that purpose.

So my new plan is to make a sculpture that would theoretically shift the timestamp of algorithmic trades and confuse the market. I would not use this, but I think making it points out the vulnerabilities in the system, and the reliance that our financial system has on technologies that we don’t realize are involved.

Lastly, here are some screenshots, and links to a video piece I made. It’s a landscape generated from financial data. I hope to refine this piece, but I’m not sure it’s quite in the same vein as the other pieces I plan to build for my thesis.


Financial Data Meshes

Posted: February 19th, 2012 | Author: genevieve | Filed under: Appropriating New Technologies, Thesis, Unity3d

Continuing from my previous post, I was able to get OpenFrameworks working to extrude the brightness values into the Y axis. I tried first with the noise images that Miguel and I generated at the beginning of the week.

Here’s the noise image in 2D:

And here it is extruded into 3D:

Here you can see them both, so you get the idea of how the brightness maps to height:

So, the form looks pretty good. But it also looks pretty random. By feeding financial data into a noise function, we can create a great looking terrain, but it starts to be so far removed from the data itself that it loses meaning. So, I tried to think about how I could generate mesh straight from the data. I finally plotted the points, but although they ended up in 3D space it looked pretty linear (or planar I should say). I needed something that had volume.

Just to illustrate, here’s the data graphed in 3D, with time as the x value, volume as y, and closing price as z:

It’s a lot truer to the data, but it doesn’t give me the same terrain to play with as the noise heightmap. So, I started thinking about various ways I could achieve a similar result while staying “truer” to my data. I tried to figure out if there was a way to start with a base plane mesh, and then extrude along the y axis for every point that aligned with the x and z values. It turns out that an image is a good way to generate a planar mesh (at least in OF), so after talking with Greg, I decided to write up a quick Processing sketch that took columns of data as input and output a grayscale image. This combined the grayscale heightmapping with the mesh technique of mapping the data points to a shared grid system.

To generate the grayscale image, my process was:
1. Get data to read into Processing
2. Map data to width and height of image. x = time, y = closing price
3. Map volume data (what we want to extrude by) to color, 0 – 255.
4. Step through x and y, but plot volume as grayscale circles, radius getting smaller as color gets lighter (taller)
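
The steps above can be sketched in a simplified form. This is not the Processing sketch itself: it uses awk, plots a single pixel per data point instead of circles, and writes a plain-text PGM grayscale image. The sample rows (row index, closing price, volume), the image size and the file names are all invented for illustration, and it assumes at least two rows with non-constant price and volume.

```shell
# Invented sample: row index (time), closing price, volume.
cat > sample_prices.csv <<'EOF'
1,10.0,100
2,12.0,300
3,11.0,200
4,14.0,500
EOF

W=8; H=4   # image width and height in pixels (arbitrary for this sketch)

awk -F',' -v W="$W" -v H="$H" '
NR == FNR {                               # pass 1: find the price and volume ranges
    c = $2 + 0; v = $3 + 0; n++
    if (n == 1 || c < cmin) cmin = c
    if (n == 1 || c > cmax) cmax = c
    if (n == 1 || v < vmin) vmin = v
    if (n == 1 || v > vmax) vmax = v
    next
}
{                                         # pass 2: one pixel per row
    x = int((FNR - 1) / (n - 1) * (W - 1) + 0.5)            # time  -> column
    y = int(($2 - cmin) / (cmax - cmin) * (H - 1) + 0.5)    # close -> row
    g = int(($3 - vmin) / (vmax - vmin) * 255 + 0.5)        # volume -> gray 0..255
    img[y, x] = g
}
END {                                     # write a plain PGM (P2) image
    print "P2"; print W, H; print 255
    for (y = 0; y < H; y++) {
        line = ""
        for (x = 0; x < W; x++) {
            v = ((y, x) in img) ? img[y, x] : 0
            line = line (x ? " " : "") v
        }
        print line
    }
}' sample_prices.csv sample_prices.csv > heightmap.pgm

cat heightmap.pgm
```

Image rows count downward from the top, so in this sketch higher closing prices land lower in the picture; flipping y would put them at the top. Brighter pixels mean more volume, which is what would be extruded taller.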

Here is the image result I got from this quick test:

I opened it up in Photoshop to do a little blurring, not sure if it was necessary, but didn’t want jagged steps between gradient values. Oh, and I also cropped, which I should probably implement in code but oh well:

Final step was importing the image back into OF and generating a heightmap there. I used code from James George which made this pretty simple. I ran into a memory error, but I believe that had to do with pixel referencing or something. Here’s a little screencap video that I set to music so it’s a bit more enjoyable to just hang out in dataland. The glitches are due to my computer not enjoying recording and running the app at the same time. Perhaps I need to ramp down the framerate for next time, not that I mind it much.


High Frequency Geography – Week 3 Progress

Posted: February 16th, 2012 | Author: genevieve | Filed under: Research Studio Algorithms, Thesis

This week has been productive in some ways, but I’m still feeling a bit behind where I’d like to be in terms of determining the best direction to take with regard to “visualizing” high frequency trading, and conveying its qualities to a non-financial audience.

I submitted a proposal for an article to Triple Canopy magazine that would explore the colocation aspect of High Frequency Trading. I will see whether they’d like to work with me to pursue this research by early March. If they don’t, I intend to use the paper I write in Research Studio to reflect the direction I proposed to Triple Canopy. Hopefully this will become settled once I really determine the focus of my research.

After I was able to get the Bloomberg API running in Eclipse, I realized (with Dan Shiffman and Heather’s guidance) that I wouldn’t be able to access any of the data without an active subscription. Apparently the API for Bloomberg software is open, but the data itself is still commodified, and very expensive to access. I wrote to them inquiring about student discounts, but received a reply asking me if real-time data was totally crucial to my project. At this point, it’s not, but having access to the Bloomberg API would allow me to call so much data so easily that it’s a shame I can’t work that way. Instead, I’ve resorted to downloading CSV files of historical data from Yahoo Finance, which gives me the bare minimum of datasets, but which are probably sufficient for now. I can access Date, Open, Close, High, Low, Volume and Adjusted Close. From looking into the metrics of the stock market a bit more, I realized that the Close value and the Volume are what everyone watches for.

If I were to write my own HFT algorithm, I would need access to a certain stock’s buy and sell pricing, as well as the share volumes available at those prices. As far as I know, this data is not available in the free historical datasets. I will have to keep looking into alternatives, or press the Bloomberg people I’ve been corresponding with on the point that access to the API is essential to the project.

I began using the historical datasets to see if I could visualize the data in 3D space. One aim is to generate a 3D landscape from financial data. I worked with Miguel Bermudez, who wrote a CSV parser in C++ (thanks Miguel!), and then we brought the data into OpenFrameworks. We debated various avenues, but decided that feeding the data into a noise function would be an easy way to generate nice looking topography. It’s definitely not the most accurate, as 2D noise essentially turns the data into a lot of structured noise, but the effect makes for a nice way to extrude the texture into 3D space, mapping brightness to the y axis. Miguel found a nice tutorial on noise in C++ that we followed, in order to generate some “financial noise.”

Next step is to map the y coordinates to the brightness of the image. Then it should be a 3D model. My next goal is to have “real-time” data animating through the 3D stock terrain. I’m having trouble determining which part of the data makes sense for this and how, but hopefully with more thought I’ll get it.

I also tried experimenting with generating a 3D mesh straight from the data, essentially mapping x to time, y to normalized volume, and z to normalized close. Instead of a plane, it generates more of a line. I’m trying to map the mesh to an image so that it looks as if it’s coming out of a plane, but haven’t quite finished that yet. The first image is from before I normalized the data, which means mapping the min and max values to 0 and 1000.
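
As a rough illustration of that normalization (not the C++/OpenFrameworks code itself), here is a small awk sketch that rescales the Close and Volume columns so their minimum maps to 0 and their maximum to 1000. The sample rows are invented, the column order is an assumption about the downloaded CSV, and the row number stands in for time on the x axis.

```shell
# Invented sample; the column order is an assumption about the downloaded CSV.
cat > sample_daily.csv <<'EOF'
Date,Open,High,Low,Close,Volume,Adj Close
2012-02-13,10.0,10.5,9.8,10.0,1000,10.0
2012-02-14,10.2,12.1,10.1,12.0,3000,12.0
2012-02-15,11.8,11.9,10.9,11.0,2000,11.0
EOF

awk -F',' -v LO=0 -v HI=1000 '
FNR == 1 { next }                          # skip the header row in both passes
NR == FNR {                                # pass 1: min and max of close and volume
    c = $5 + 0; v = $6 + 0
    if (!seen || c < cmin) cmin = c
    if (!seen || c > cmax) cmax = c
    if (!seen || v < vmin) vmin = v
    if (!seen || v > vmax) vmax = v
    seen = 1
    next
}
{                                          # pass 2: x = row number, y = volume, z = close
    y = LO + ($6 - vmin) / (vmax - vmin) * (HI - LO)
    z = LO + ($5 - cmin) / (cmax - cmin) * (HI - LO)
    printf "%d,%.1f,%.1f\n", FNR - 1, y, z
}' sample_daily.csv sample_daily.csv > points.csv

cat points.csv
```

Reading the file twice keeps the script simple: the first pass only collects the ranges, the second applies them. It assumes neither column is constant, since that would make the range zero.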

I’m going to continue pursuing the 3D landscape generation from the data for another week. Then I plan to assess whether this direction makes sense with my larger goals, and if so, how I can improve my visualization algorithms to better reflect qualities in the data sets.


Thesis Updates – Alumni Feedback

Posted: February 11th, 2012 | Author: genevieve | Filed under: Thesis | No Comments »

I talked to three alumni on Monday about my project direction: Sherri ?, Gabe B-C, and Drew Burrows. All three found the concept compelling, once I was finally able to summarize what I’m interested in. They all said that it sounded like I had a few directions I was considering, and that I’d need to decide soon which direction was the most compelling.

Out of the three I spoke with, Sherri picked up the most on the idea of a map or visualization of the distance – speed – value relationship, or Value in terms of distance and time. She pointed me to a few references and inspiration sources:

Aaron Straup made Pretty Maps while working at Stamen. He also has nice Flickr sets.

Sherri said she was currently working on an exhibit with David Breashears that documents the receding glaciers in the Himalaya. She pointed out that the source photographs from the early 1900s were all made as surveying tools, and that this could be an interesting angle from which to tell the story.

Trevor Paglen’s work – especially on investigating the hidden.

She also told me to focus on a simple story that manifests the idea – an entry point. When I described the Spread Networks cable, she said that the communication between Chicago and New York could be a good entry point.

She also told me to really focus on who my audience is – namely in the data viz infographic vs art piece debate, which affects the level of accuracy I should be looking to portray, as well as whether this is a “tool” a la the interactive map with a slider, or a piece that provokes conversation – a la the maps as critical design tool.

Her final advice to me was to get all the iterations and angles I could possibly pursue out and down on paper. Then select and refine as I choose a focus.

The next alum I spoke with was Gabe B-C. I wanted his opinion on a video piece I did last semester, which is a sketch for one direction I could take this project. When I showed him the piece, he told me to look up Marco Brambilla’s work, especially the pieces he did with gridded comparisons, like every revolving restaurant, or split-screen reaction shots of first-person shooter games.

When I mentioned that one idea I had was a roadtrip from New York to Chicago traveling along the same route as the fiber optic cable, he said he could see a series of pieces inspired by transactions – or doing things with technology and money, but sloooowwwly. I’d have to engage in transactions on either side of the journey for sure. He also got me thinking along more physical lines, like making an actual sculpture that might be influenced by financial data – a glacier whose temperature is controlled according to financial data.

I talked to Drew next, who seemed to have the hardest time understanding what I wanted to do, or perhaps more, why I wanted to do it. He said he liked the idea of the 3D landscape generated by real-time financial data, but that the surveyor maps were the hardest to grasp.

Last, I also talked to Kathy, who nicely volunteered to reach out to people in the NYU Financial Mathematics Dept, basically the Wall Street Quant program that NYU has. She also said to look into Mae West and Mae East, which are routing maps of the internet. I have a lot more research to do, but I found a list of IXPs, Internet Exchange Points, where the various ISPs, Internet Service Providers, exchange traffic between their networks. This happens in various physical locations around the world.


High Frequency Trading Literature Review

Posted: February 11th, 2012 | Author: genevieve | Filed under: Research Studio Algorithms, Thesis | No Comments »

Our first assignment in Research Studio Algorithms was to research and summarize relevant research pertaining to our topics. I had been reading a few papers about High Frequency Trading over break, but took this as an opportunity to delve more deeply into the topic. My sources are a combination of academic papers in finance, economics, math, physics, law, as well as news articles, and general books on the topic.

For a more in-depth description of High Frequency Trading and the qualities I will be looking into, download my Literature Review here.


Research Studio Algorithms – Research Plan

Posted: February 10th, 2012 | Author: genevieve | Filed under: Thesis | No Comments »

Goals:
- Have a greater understanding of High Frequency Trading Algorithms
- Research methods of analyzing financial data with models used for weather, naturally occurring patterns
- Triple Canopy article on High Frequency Geography – whether that’s the co-location angle, or new nature angle TBD
- Analyze Bloomberg API in realtime – make my own algorithm

Week1
- Start researching High Frequency Trading at Bobst / Courant

Week2
- Start writing Literature Review, compiling sources

Week3
- Write timeline
- Finish writing Literature Review
- Write Triple Canopy proposal

Week4
- Get Bloomberg API working in Java
- Read through Developer’s Guide
- Determine which data is most important

Week5
- Use historical financial data to create perlin noise 2D images
- Extrude 2D images into 3D
- Determine how realtime financial data should affect 3D datascape

Week6
- Investigate making docs or a processing wrapper for Bloomberg API
- Research stochastic and chaotic models for financial data
- Research value – distance – speed data for colocation

Week7
- Make documentation or wrapper for Bloomberg API in processing
- Begin visualizing stochastic and chaotic models for financial data
- Finalize topic

Week8
- Final Topic is due
- Continue researching natural and financial visualization models
- Assess success of stochastic and chaotic models for financial data

Week9
- Begin writing research paper
- Finish coding data analysis

Week10
- Present rough draft of research paper
- Finalize data models

Week11
- Write research paper

Week12
- Bring questions and problems to class

Week13
- Finish research paper

Week14
- Present project