NetEX VXP

Video By The Pixel

These Java classes allow you to take in live incoming video, visit each pixel individually, analyze it, change it, and spit it back out to the screen. They can enable you to write software to, for instance, track something within the image, remove the background, change the colors, or blur the image. These classes really just allow you to do things like getPixel(x,y) and setPixel(x,y,color); the schemes for using these functions to accomplish tracking or background removal are up to you (with a few examples from me). I could not resist throwing a bunch of other utilities into the classes for simple things like creating BufferedImages or JPEGs from the video.
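
As a rough illustration of what "visit each pixel, analyze it, change it" looks like in practice, here is a minimal sketch that inverts every pixel of a frame. It deliberately uses the standard java.awt.image.BufferedImage (getRGB/setRGB) as a stand-in rather than the vxp classes, because the exact PixelSource signatures are not reproduced on this page; the class name PixelVisitSketch is hypothetical.

```java
import java.awt.image.BufferedImage;

// Stand-in sketch: a plain BufferedImage plays the role of the video frame.
// PixelVisitSketch is a hypothetical name, not part of vxp.
class PixelVisitSketch {

    // Visit every pixel, unpack it into alpha, red, green and blue,
    // invert the three color channels, and write the pixel back.
    static void invert(BufferedImage frame) {
        for (int y = 0; y < frame.getHeight(); y++) {
            for (int x = 0; x < frame.getWidth(); x++) {
                int argb = frame.getRGB(x, y);   // packed as 0xAARRGGBB
                int a = (argb >>> 24) & 0xFF;
                int r = (argb >> 16) & 0xFF;
                int g = (argb >> 8) & 0xFF;
                int b = argb & 0xFF;
                int inverted = (a << 24) | ((255 - r) << 16) | ((255 - g) << 8) | (255 - b);
                frame.setRGB(x, y, inverted);
            }
        }
    }
}
```

With the real classes the two nested loops would stay the same; only the calls that read and write the pixel would change to the getPixel and setPixel routines described above.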

By Dan O'Sullivan

Short Story:

  1. Download [vxp_fat.jar]
  2. Make it accessible to your classes by 1) putting it in the same folder as all your classes; or 2) putting it in your jdk.../jre/lib/ext folder; or 3) telling your IDE where to find it (in Eclipse, right click on project>properties>java build path>libraries tab>add external library).
  3. Make sure you have QuickTime (or [Java Media Framework] on the PC) installed on your machine.
  4. Look at the [documentation] for PixelSource and for one of its implementations, like QTLivePixelSource or JMFLivePixelSource.
  5. Try this [code].

System Installation

There is a sometimes tricky pipeline that the video pixels travel through before you can get your mitts on them. Most of these drivers are not too weird, so it is possible that your machine is ready to go now without your having to read this. You can choose QuickTime or JMF; QuickTime is probably more common.

Quicktime PipeLine (Mac or PC)

  1. VDIG Driver -- This should come with your camera. It converts the proprietary details of your camera into the QuickTime standard format. VDIG drivers are very common on the Mac and rare on the PC. Luckily there are translators that turn the WDM drivers usually included for the PC into a VDIG. You can install one automatically from [free] (people are experiencing some temporary mismatches between the current version of QuickTime and WinVDIG; you can get old versions [here]) or [$]. This is the piece that is a bit unusual and that you are unlikely to already have on your machine. The translator is usually just a .qtx file. Where that file is installed is changing with QT 7, from a Quicktime folder in the Windows/System32/ directory to a QuickTimeComponents directory in /program files/Quicktime.
  2. Test that you are getting video input by using the simple hackTV application [osx ] [PC]
  3. QTJava.zip -- These are the "QuickTime For Java" classes that should come as part of every install of QuickTime. You might search for this file to see whether it is installed. On a Mac with a recent install of QuickTime you should be fine. Otherwise you might have to go to your QuickTime Player and do a custom install with QuickTime for Java checked, or you might have to use the [QT Installer]. The file may have been installed in a different jdk.../jre/lib/ext than the one you are using, so you can move it (try placing it in all the ext folders) or explicitly tell your IDE where to find it (in Eclipse, right click on project>properties>java build path>libraries tab>add external library). These days the best place to look for QTJava.zip is in "Program Files/QuickTime/QuickTimeSystem/".
  4. vxp.jar --This is software that I wrote to simplify the interface of Quicktime for live image processing, and unify it with other systems. Your classes also have to have access to this file.
  5. Use the QTLivePixelSource class

  1. If you get an error like: quicktime.std.StdQTException[QTJava:6.1.0g1],9405=couldntGetRequiredComponent,QT.vers:6528000 -- you probably do not have a camera attached or, on a PC, you do not have a VDIG driver as explained above.

JMF PipeLine (PC or Linux)

  1. A driver (WDM or VFW) should come with your camera. It converts the proprietary details of your camera into the standard Windows DirectShow format.
  2. JMF.jar -- Install [JMF]. It may have been installed in a different jdk.../jre/lib/ext than the one you are using, so you can move it (try placing it in all the ext folders) or explicitly tell your IDE where to find it (in Eclipse, right click on project>properties>java build path>libraries tab>add external library).
  3. Register the video devices on your machine by using the preferences of the JMF Studio application that gets installed with JMF.
  4. Test that you are getting video input by using the JMF test capture application (JMF Studio) that should have been installed or [amcap], a simple video viewing program.
  5. vxp.jar --This is software that I wrote to simplify the interface of JMF for live image processing, and unify it with other systems. Your classes also have to have access to this file.
  6. Use the JMFLivePixelSource class

Long Story:

Having computers fast enough to analyze video on a pixel by pixel basis is relatively new. I find that people who experiment with this stuff experience a mind shift from which they never recover. When you treat video as an input to your software instead of the usual output of your software, lots of things open up to you. It solves one big problem: your software is boring your body. Your body is capable of so much subtle movement and response that is really just lost on the mouse and keyboard. A video camera with its millions of pixels every second, now that is a volume and a speed that is a worthy match for your body. After you get a hold of the pixels you start pulling objects out of the video image, perhaps just separating a foreground from the background. Pretty soon you are treating video more like a mutable vector than a "take it or leave it" bitmap, which opens up all sorts of interactive possibilities. You should know that you will never win; there are just too many pixels and they change too often. They will just beat you down and you will stop bathing.

New Appreciation of your body:

Having a machine that is fast enough to go pixel by pixel through video is one thing; using that to, for instance, reliably track a face moving through the video is another. As you get into this stuff you will have a new appreciation of your body's visual and mental equipment. The thing that makes computer vision so difficult is that our eyes and brains automatically adjust but cameras and computers do not. For instance, your eye compensates when it sees a fire engine at noon and in the evening, so the truck appears to be the same color even though the changing color of the sunlight is changing the color of the truck. The computer will just see different colors. Making matters worse, we are using old video standards (NTSC, PAL, etc.) that are very noisy, and even if the color is not changing, blips in the signal will make it appear to change. Even if we could solve those problems with better equipment and spectrographic imaging (like Mitchell Rosen is doing at RIT), that still leaves us with the biggest problem: recognizing objects within the video. Our brains can so effortlessly pick out parts of a scene, for instance a head, a hand or a snake, but when you sit down to write software to do that, it is really hard to even know where to start.

80% Approach:

Many of these techniques fall roughly within the fields of Computer Vision or Digital Image Processing. I suspect many of you are not anxious to go back to school to get a PhD in these subjects. In the examples I show you how to get pretty far, say 80% of the way, pretty easily and then expect you to contrive your situation so the remaining 20% never happens. For instance, it might be very difficult to find a person against some changing and arbitrary background (the stuff of PhD dissertations) but very easy to find them against a uniform white background (the stuff of Art School projects). It is very difficult to recognize a person's face but trivial to find a particular red hat. If you want to make your life easy, have the person wear a red hat and ensure that there is a uniform white background. It does not have to be that contrived, but you get the idea.
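
To make the red hat idea concrete, here is a minimal sketch of one way to find it: scan every pixel and keep the one whose color is closest to a target color. This is an illustration under assumptions, not the example code that ships with the classes. It uses a standard BufferedImage as a stand-in for the video frame, and the class name ColorTrackSketch, the method findClosest and the tolerance parameter are all hypothetical.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch of closest-color tracking on a single frame.
class ColorTrackSketch {

    // Returns {x, y} of the pixel closest in color to the target, or null
    // when no pixel is within the given tolerance (a distance in RGB space).
    static int[] findClosest(BufferedImage frame, int targetR, int targetG,
                             int targetB, int tolerance) {
        int bestX = -1;
        int bestY = -1;
        int bestDist = tolerance * tolerance + 1;   // squared distance to beat
        for (int y = 0; y < frame.getHeight(); y++) {
            for (int x = 0; x < frame.getWidth(); x++) {
                int rgb = frame.getRGB(x, y);
                int dr = ((rgb >> 16) & 0xFF) - targetR;
                int dg = ((rgb >> 8) & 0xFF) - targetG;
                int db = (rgb & 0xFF) - targetB;
                int dist = dr * dr + dg * dg + db * db;
                if (dist < bestDist) {              // closer than anything so far
                    bestDist = dist;
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return bestX < 0 ? null : new int[] {bestX, bestY};
    }
}
```

The tolerance is what lets the contrived setup do its work: against a uniform white background nothing but the hat comes anywhere near pure red, so even a generous tolerance will not produce false matches.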

How these classes compare with other technologies:

Java may not leap to mind as the perfect solution for pixel by pixel work because it tends to lag behind C++ in speed of execution, which is so crucial when you are talking about so many pixels per frame. Java is now fast enough for some video scanning applications, and as machines and just-in-time compilers get faster this will become less of a problem. For example, I have been working on a 2.0 GHz PC with a FireWire camera (some cameras or input devices cannot provide you with 640 x 480). I can get the 640 x 480 pixels and display them at about 24fps. I can track a color at 640 x 480 at 15fps. I can blur (this is the most expensive act) and remove the background at 320 x 240 at 10fps. These are all using the convenient (but slower) setPixel and getPixel routines instead of unpacking things manually. The ConvolveOps for blurring are theoretically tapping your native graphics capabilities, so a better graphics card might also help in addition to pure processor speed. There are of course all the usual benefits of Java, like the fact that it is portable between platforms and can be networked like the dickens, but the main reason you might want to use Java for video by the pixel is that you already know how to use it or want to learn.
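
For reference, here is a minimal sketch of what blurring with a ConvolveOp can look like, using only the standard java.awt.image classes (ConvolveOp and Kernel). It is not the blur code from these classes; the class name BlurSketch, the method boxBlur and the choice of a simple box kernel are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.util.Arrays;

// Hypothetical sketch: box blur of one frame with java.awt.image.ConvolveOp.
class BlurSketch {

    // Averages each pixel with its neighbors in a size x size window.
    // Pixels too close to the border are copied unchanged (EDGE_NO_OP).
    static BufferedImage boxBlur(BufferedImage src, int size) {
        float[] weights = new float[size * size];
        Arrays.fill(weights, 1.0f / (size * size));   // equal weights that sum to 1
        Kernel kernel = new Kernel(size, size, weights);
        ConvolveOp op = new ConvolveOp(kernel, ConvolveOp.EDGE_NO_OP, null);
        return op.filter(src, null);                  // null lets the op allocate the result
    }
}
```

The cost grows with the square of the kernel size, since every output pixel reads size x size input pixels, which is consistent with blur being the most expensive step in the timings above.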

Other solutions for video by the pixel:

TrackThemColors Xtra -- Danny Rozin made this extension for Macromedia Director. Director is not fast enough to go through the pixels one by one, but the Xtra (written in C) does most of the work for you. This is by far the easiest solution if you already use Director. Josh Nimoy has a similar Xtra for Director that I have never used, but he is a smart guy and I like his web page.

Jitter or NATO-- These are additions to MAX. If you are a Max or PD programmer this may be just the ticket for you.

C++. This approach will give you the absolute best speed a given machine is able to deliver in going through all those many pixels. It also has the advantage that you can connect with certain useful libraries like Intel's OpenCV (computer vision) libraries. Here are some samples for how to do it using Quicktime.

Microcontrollers. Now even tiny single chip computers (for instance the SX) that you can buy for $10 are fast enough to process video on a pixel by pixel basis. CMU has a kit and companies like TYZX are developing interesting products.

Code for finding the width (which can be mapped to distance): After the loops that find the best x and y for a given color, run two more loops that scan outward from that best-matching pixel, one to the right and one to the left, until each hits an edge of the colored region.

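        // x and y are the position of the best color match found earlier;
        // ps is the PixelSource and kWidth is the width of the frame.
        int thresholdOfChange = 10;

        // Scan right until the value at index 1 of the pixel array (treated
        // as red here) jumps by more than the threshold between neighbors.
        rightEdge = x;
        int[] nrgb = ps.getPixel(rightEdge, y);
        int pRed = nrgb[1];
        while (rightEdge < kWidth) {
            nrgb = ps.getPixel(rightEdge, y);
            int diff = Math.abs(nrgb[1] - pRed);
            if (diff > thresholdOfChange) break;
            pRed = nrgb[1];
            rightEdge++;
        }

        // Scan left from the same starting pixel in the same way.
        leftEdge = x;
        nrgb = ps.getPixel(x, y);
        pRed = nrgb[1];
        while (leftEdge >= 0) {
            nrgb = ps.getPixel(leftEdge, y);
            int diff = Math.abs(nrgb[1] - pRed);
            if (diff > thresholdOfChange) break;
            pRed = nrgb[1];
            leftEdge--;
        }

        // Both edges now sit one pixel outside the region (or just off the frame).
        int width = rightEdge - leftEdge - 1;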
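
One possible way to map that width to a distance is the pinhole camera approximation, where an object of known real width looks narrower in proportion to how far away it is. The sketch below is an assumption about how the mapping could be done, not something taken from these classes. The class name WidthToDistanceSketch and the parameters realWidth and focalLengthPixels are hypothetical, and the focal length in pixels would have to be calibrated for your own camera (for example by measuring the pixel width of the object at a known distance).

```java
// Hypothetical sketch: estimate distance from the measured pixel width.
class WidthToDistanceSketch {

    // Pinhole approximation: distance = realWidth * focalLengthPixels / widthPixels.
    // The result is in the same unit as realWidth (for example meters).
    static double distance(double realWidth, double focalLengthPixels, int widthPixels) {
        if (widthPixels <= 0) {
            throw new IllegalArgumentException("widthPixels must be positive");
        }
        return realWidth * focalLengthPixels / widthPixels;
    }
}
```

For example, a hat 0.2 units wide that measures 80 pixels across, seen through a camera with an assumed focal length of 800 pixels, would come out at about 2 units away. Halve the pixel width and the estimate doubles.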
        

The expressiveness of the individual (and therefore his capacity to give impressions) appears to involve two radically different kinds of sign activity: the expression that he gives, and the expression that he gives off. The first involves verbal symbols or their substitutes which he uses admittedly and solely to convey the information that he and the others are known to attach to these symbols. This is communication in the traditional and narrow sense. The second involves a wide range of action that others can treat as symptomatic of the actor, the expectation being that the action was performed for reasons other than the information conveyed in this way. As we shall have to see, this distinction has an only initial validity. The individual does of course intentionally convey misinformation by means of both of these types of communication, the first involving deceit, the second feigning.

Of the two kinds of communication - expressions given and expressions given off - this report will be primarily concerned with the latter, with the more theatrical and contextual kind, the non-verbal, presumably unintentional kind, whether this communication be purposely engineered or not.


Last edited February 6, 2007 11:21 am (diff)