My final project idea for Bit by Bit was to test whether traffic images could be used to predict future traffic patterns.
As background for this project, I studied traffic patterns in places that are among the worst hit by traffic congestion, including Paris (among the most badly hit in Europe), Sao Paulo, Bangalore, Sierra Leone and the San Francisco Bay Area.
One approach I thought would be interesting was to use low-cost cameras to fetch images at discrete time intervals and process them to obtain the traffic density at those intervals. These traffic densities could then be fed into various prediction models to predict the traffic density for a future time interval.
Traffic density, in this context, is the number of vehicles present on the road, which can also be indicated by the amount of free space visible on the road. The more free space there is, the fewer the vehicles and hence the lower the traffic density.
To test this, I decided to examine two specific junctions: 42nd Street and 5th Avenue, and 42nd Street and 6th Avenue. Both are among the busier areas, with slow-moving traffic. The NYCDOT cameras at these locations served as my data source.
To build the corpus, images were collected over a period of five days and then processed. The detailed distribution of the collected images is as follows:
The total number of images collected was 9248. Sample images of 42nd Street and 5th Avenue and of 42nd Street and 6th Avenue are as follows:
All the images were examined for the amount of free space in them. The free space is the portion of the road surface visible in the image, which can be measured by counting the number of pixels that are gray in color. To do this, all the images were converted to grayscale, and the pixels whose values lie in the range from 123 to 165 (an approximate measure of the gray color range on the grayscale) were counted.
To measure the gray color density, a designated area of the image was selected for each of the two test locations. The coordinates of this area were kept the same throughout the process for each location.
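The measurement described above can be sketched roughly as follows. This is a minimal illustration rather than the code actually used in the project: the function names, the `(top, left, bottom, right)` region format, the standard luma weights used for the grayscale conversion, and the choice to report a fraction instead of a raw pixel count are all assumptions made for the example. Only the 123 to 165 range comes from the description above.

```python
# Gray range quoted in the text (treated here as inclusive).
GRAY_LOW, GRAY_HIGH = 123, 165


def to_gray(rgb):
    """Convert one (R, G, B) pixel to a 0-255 gray level.

    Uses the common BT.601 luma weights; the conversion used in the
    original project is not stated, so this is an assumption.
    """
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def free_space_fraction(image, region):
    """Fraction of pixels inside `region` whose gray level is in the road range.

    image  : list of rows, each row a list of (R, G, B) tuples
    region : hypothetical fixed crop (top, left, bottom, right),
             with bottom and right exclusive
    """
    top, left, bottom, right = region
    total = 0
    gray_count = 0
    for row in image[top:bottom]:
        for pixel in row[left:right]:
            total += 1
            if GRAY_LOW <= to_gray(pixel) <= GRAY_HIGH:
                gray_count += 1
    # Guard against an empty region.
    return gray_count / total if total else 0.0
```

Calling `free_space_fraction(image, region)` with the same `region` for every image from a given camera mirrors the fixed-coordinates approach described above. A higher value would suggest more visible road and therefore lower traffic density.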
A histogram showing the gray color density for 42nd Street and 5th Avenue is shown in the following figure:
For the regression, each traffic density was mapped to a time value representing the instant at which the image was collected. This time value was expressed as an offset, in seconds, from midnight.
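As a small illustration of this time encoding, the offset can be computed from a capture timestamp as shown here. The function name is hypothetical, and the sketch assumes each image's capture time is available as a Python `datetime`.

```python
from datetime import datetime


def seconds_since_midnight(timestamp):
    """Return the offset, in whole seconds, from midnight of the same day."""
    midnight = timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    return int((timestamp - midnight).total_seconds())
```

For example, an image captured at 08:30:15 would map to 8 * 3600 + 30 * 60 + 15 = 30615 seconds.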
Once the corpus was created, a linear regression model was used for prediction. The accuracy rate obtained was above 50%.
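For readers unfamiliar with the technique, a simple one-variable linear regression can be fitted with ordinary least squares as in the sketch below. It assumes the time offset is the predictor and the traffic density is the target, which is one plausible reading of the setup. The function names are hypothetical, and this is not necessarily the implementation or library used in the project.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    xs : time offsets in seconds from midnight (assumed predictor)
    ys : traffic densities measured at those times (assumed target)
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the density at time offset x using the fitted line."""
    return slope * x + intercept
```

A straight line over the time of day cannot capture rush-hour peaks on its own, which is consistent with the later suggestion to try more complex prediction models.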
Some of the next steps for this project are as follows:
1. Increase the corpus size by collecting data over a larger time frame.
2. Use more complex prediction models.
3. Determine ways of interpreting the traffic density and how it could be put to use.