Vision

Parsable Space

Telling stories in moving pictures is the most powerful technology ever developed. Unfortunately, in the usual progression toward an even more powerful medium that can be parsed and then recombined like text, motion pictures are stuck in the age of scribes.  Film and video have been kept from having their “movable type moment” because scenes never get broken down beyond the level of the frame.  Each frame is merely a collection of colored pixels instead of a collection of separable and recombinable elements.  We are seeing hints of progress here, with elements being captured and manipulated separately in video games and special effects.  This week we are going to take some steps toward finding some structure in video images.

Getting an Image

From a URL or a  File.

Many of you have already done this.  If you use a filename rather than a URL, you first have to put the file inside your sketch folder.  You may even want to create an assets folder inside your sketch folder and store all your images there.  Make sure you declare the variable at the top so you can load it in setup() but use it in draw().  If you load it over and over in draw() your program will be slow. Very slow.  The image() function takes two coordinates for the position of the image, but optionally you can supply two more to scale the width and height of the image.

var img;  // Declare variable 'img'.

function setup() {
  createCanvas(720, 400);
  img = loadImage("assets/moonwalk.jpg");  // Load the image
}

function draw() {
  // Displays the image at its actual size at point (0,0)
  image(img, 0, 0);
  // Displays the image at point (0, height/2) at half size
  image(img, 0, height/2, img.width/2, img.height/2);
}

 

From the Camera

We are going to make a “capture” variable for pulling in the images from your webcam. You should declare it at the top so it can be created in setup() and used in draw().

In setup(), use the createCapture() function to store your camera’s images in the capture variable.  Set the size of your capture using capture.size().

Finally you want to show the images in draw().  You use the image() function just as if it were a still image.  From this point on, video is treated just the same as a still image.

var capture;

function setup() {
  createCanvas(390, 240);
  capture = createCapture(VIDEO);
  capture.size(320, 240);
  //capture.hide();
}

function draw() {
  background(255);
  image(capture, 0, 0, 320, 240);
  filter('INVERT');
}

Also notice I’ve commented out the capture.hide() function.  Uncomment to see what it does.

From a Movie

You can also bring movies into P5.  The process is nearly identical to playing sound.

var playing = false; // a boolean for switching between states
var fingers; // variable to hold our video (video that happens to be about fingers)

function setup() {
  fingers = createVideo('assets/fingers.mov'); // refer to your video file here
  
}

// plays or pauses the video depending on current state
function mousePressed() {
  if (playing) {// if you click the mouse and the video is already playing
    fingers.pause();//then pause the video
    
  } else { // if you click the mouse and the video is not already playing
    fingers.loop(); // then play and loop the video
  }
  playing = !playing; // on each click set your playing boolean to be the opposite of what it already was. this way it will always update correctly  
}

 

Getting the Pixels Array from an Image

Images are not very computationally interesting.  They are really take-it-or-leave-it, all-or-nothing entities with very little opportunity for interactivity.  We want to break down beyond the frame into the pixels.  Regardless of where your images came from (the internet, a file, a camera, a movie), in the draw loop they are all just images.

Behind every image there is an array of numbers with 4 integers (R, G, B, A) for each pixel of the picture.  It is very easy to use dot notation to get at the array of pixels from inside the image object.  We will use this notation to ask about a pixel’s color or to set a pixel’s color.

pixels[0] // the Red value of the first pixel in the array

pixels[1]  // the Green value of the first pixel in the array

pixels[2]  // the Blue value of the first pixel in the array

pixels[3]  // the Alpha value (transparency) of the first pixel in the array
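Since these notes lean on p5, here is the same idea in plain JavaScript you can run anywhere, with no camera or canvas needed. The getPixel helper below is just an illustrative name, not part of p5:

```javascript
// A minimal sketch of how a flat RGBA array works. Imagine a 2x1 image:
// one red pixel, then one blue pixel.
const pixels = [
  255, 0, 0, 255,   // pixel 0: R, G, B, A
  0, 0, 255, 255    // pixel 1: R, G, B, A
];

// Hypothetical helper that pulls the 4 components of pixel n.
function getPixel(pixels, n) {
  const i = n * 4; // each pixel occupies 4 consecutive entries
  return { r: pixels[i], g: pixels[i + 1], b: pixels[i + 2], a: pixels[i + 3] };
}

console.log(getPixel(pixels, 1)); // → { r: 0, g: 0, b: 255, a: 255 }
```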

 

Load Pixels and Update Pixels

Before you start operating on the pixels of an image object you should call that object’s loadPixels() function to make sure the pixels are fresh.  On the other side, once you are done changing the individual pixels of an image object you should call updatePixels() to make sure the array refreshes the image when it is displayed.  You can get away without doing this quite a lot.

Visit All the Pixels

There are a lot of pixels, so you will need a loop to visit each one.   It is a pretty powerful thing to be able to visit hundreds of thousands of pixels every 33 milliseconds (30 frames/sec), ask each one a question (what color are you?), and set it to a new color accordingly.

var img; // image we will eventually display on our canvas
var mirror; // our video capture object

function setup() {
  createCanvas(640, 480);
  mirror = createCapture(VIDEO); // our video object that will just see what the camera sees
  mirror.size(320, 240);
  //mirror.hide();
  img = createImage(320, 240); // create our image to be the same size as the capture
  img.loadPixels(); // load the image's pixel array so we can write into it
}

function draw() {
  background(255);
  mirror.loadPixels(); // load pixel information into our mirror array
  
  for (var i=0; i < 4*(mirror.width*mirror.height); i+=4) { //multiply and step by 4 since each pixel has 4 color variables associated with it (r,g,b,a)
   
      var r = mirror.pixels[i];   // store red value
      var g = mirror.pixels[i+ 1]; // store green value
      var b = mirror.pixels[i+ 2]; // store blue value
      var a = mirror.pixels[i+ 3];// store alpha value

      img.pixels[i] = b;       // in the image swap red values for blue values
      img.pixels[i + 1] = r;  // in the image swap green values for red values
      img.pixels[i + 2] = g;  // in the image swap blue values for green values
      img.pixels[i + 3]= a;   // keep alpha the same for now
    
  }
  img.updatePixels();  // update all the pixels for the image after you've looped through all the pixels
  image(img,0,0);      // finally write new processed image to the canvas
}

 

But we can do more than swap colors.  We can begin representing pixel information with other objects as well.  Let’s look at each pixel and see how bright it is.  Then for each pixel let’s map that brightness to the size of an ellipse.  Since we will be changing the size of each ellipse, it’s overkill to look at each individual pixel; we should step over a certain number of pixels based on the maximum potential size of our ellipses:

var mirror; // our video capture object

function setup() {
  createCanvas(640, 480);
  mirror = createCapture(VIDEO);
  mirror.size(640, 480);
  //mirror.hide();
  noStroke();
  fill(0);
}

function draw() {
  background(255);
  mirror.loadPixels();
  var stepSize = round(constrain(mouseX / 8, 6, 32)); // based on mouseX limit the size of our ellipses' diameter to be between 6 and 32 px. 

  for (var y=0; y<height; y+=stepSize) {
    for (var x=0; x<width; x+=stepSize) {
      
      var i = y * width + x;
      
      var darkness = (255 - mirror.pixels[i*4]) / 255; // use the red channel as a quick stand-in for brightness
      var radius = stepSize * darkness;
      ellipse(x, y, radius, radius);
    }
  }
}

 

Image Processing vs Computer Vision

As powerful as that repeat loop is, it is not good enough.  The code above does image processing, and then the results are delivered to your eyeballs to interpret.  We want the machine to do a little more interpretation itself, for instance separate an object from the background and tell me the position of the object in the scene.  Rather than delivering the results only to your eyeballs, I want it to deliver the results in numbers that can then be used in your code for other things.  For instance you could have an if statement that says if their x position is greater than 100, pour a virtual bucket of water on them. Or play a higher pitched note as your hand moves up.   Your next goal should be to track the position of something in the camera’s view.

Art School vs Homeland Security

You should contrive the physical side of your tracking situation as much as possible to make the software side as easy as possible.  Computer vision is pretty hard if you are trying to find terrorists at the Super Bowl.  But your users have a stake in the thing working and will play along.  Rather than tracking natural things, it is okay to have the user wear a big orange oven mitt.

The Classic: Rows and Columns and Questions

Unfortunately the pixels are given to you in one long array of numbers, but we think of images in terms of rows and columns.  Instead of a single for loop we will have a “for loop” inside a “for loop.”  The first loop will visit every column and the second will visit each row in the column (it works just as well the other way around: visit each row, then each column in the row).  Then you pull out the colors of the pixel and ask a question about them (more on the possible questions below).  This “for loop” within a “for loop,” extracting color components followed by an if statement question, will be the heart of every computer vision program you ever write.

 for (var x = 0; x < video.width; x++) {
    for (var y = 0; y < video.height; y++) {
      var loc = (x + y * video.width) * 4; // 4 array entries per pixel
      // What is the current color?
      var r = video.pixels[loc];
      var g = video.pixels[loc + 1];
      var b = video.pixels[loc + 2];
      // SOME IF STATEMENT THAT ASKS A QUESTION ABOUT THAT COLOR
    }
 }

Formula for Location in Pixel Array for a Given Row and Column

There is just one trick in there for finding the offset in the long linear array of numbers for a given row and column location.  The pixel number is:

var loc = x + y*video.width;

Since each pixel takes up four entries (R, G, B, A), multiply by 4 to get the position in the pixels array: var loc = (x + y*video.width) * 4;

You perform this kind of calculation every time you try to figure out how many people are in a theater.  Pretending people filled all the seats in order, you would take the number of seats in a row, multiply by how many rows are full, and then add how many seats are taken in the last, partially full row.
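The formula can be wrapped up as two plain JavaScript helpers (the names pixelNumber and byteOffset are just for illustration). The first is the seat-counting number from the theater metaphor; the second multiplies by 4 because every pixel takes four entries in the pixels array:

```javascript
// Which pixel is at column x, row y of a `width`-wide image?
function pixelNumber(x, y, width) {
  return x + y * width;
}

// Where do that pixel's four color values start in the flat pixels array?
function byteOffset(x, y, width) {
  return pixelNumber(x, y, width) * 4;
}

// In a 320-wide image, the pixel at column 10, row 2:
console.log(pixelNumber(10, 2, 320)); // → 650
console.log(byteOffset(10, 2, 320)); // → 2600
```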

Comparing Colors

As we will see, after you extract the colors of a pixel you might want to compare it to a color you are looking for, the color of the adjacent pixel, the same pixel in the last frame, etc.  You could just subtract and use absolute value to find the difference in color.  But it turns out that in color space, as in real space, finding the distance as the crow flies between two colors is better done with the Pythagorean theorem.  Coming to the rescue is p5’s dist() function.

   var r1 = red(currentColor);
   var g1 = green(currentColor);
   var b1 = blue(currentColor);
   var r2 = red(trackColor);
   var g2 = green(trackColor);
   var b2 = blue(trackColor);

   // Using euclidean distance to compare colors
   var d = dist(r1,g1,b1,r2,g2,b2); 
   // We are using the dist( ) function to compare the current color with the color we are tracking.
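If you are curious what dist() is doing under the hood, here is a plain JavaScript sketch of the same Pythagorean theorem in three dimensions (colorDistance is a made-up name for illustration):

```javascript
// Euclidean distance between two colors in RGB "space".
function colorDistance(r1, g1, b1, r2, g2, b2) {
  return Math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2);
}

// Pure red vs. pure black differs only along the red axis:
console.log(colorDistance(255, 0, 0, 0, 0, 0)); // → 255
```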

Things You Can Ask Pixels

Color Tracking: Which Pixel Is Closest to A Particular Color?

In this code we will use three variables: one to keep track of how close the best match yet is, and two more to store the location of the best match so far.

You click on a pixel to set the color you are tracking.  Notice that in mousePressed() we use video.get() to grab the color at the particular row and column you clicked.  Remember it is usually better to insert some patch of unnatural color, say a brightly colored toy, into your scene than to make software that can be very discerning of nature.

var video;

// A variable for the color we are searching for.
var trackColor;

function setup() {
  createCanvas(640, 480);
  // pixelDensity(1);
  video = createCapture(VIDEO);
  video.size(320, 240);
  // The above function actually makes a separate video
  // element on the page.  The line below hides it since we are
  // drawing the video to the canvas
  // video.hide();

  // Start off tracking for red
  trackColor = [255, 0, 0];
}

function draw() {


  // Draw the video
  image(video, 0, 0);

  // We are going to look at the video's pixels
  video.loadPixels();

  // Before we begin searching, the "world record" for closest color is set to a high number that is easy for the first pixel to beat.
  var worldRecord = 500;

  // XY coordinate of closest color
  var closestX = 0;
  var closestY = 0;

  for (var y = 0; y < video.height; y++) {
    for (var x = 0; x < video.width; x++) {

      var loc = (x + y * video.width) * 4;
      // Pull the three color components straight out of the pixels array.
      var r1 = video.pixels[loc];
      var g1 = video.pixels[loc + 1];
      var b1 = video.pixels[loc + 2];

      var r2 = trackColor[0];
      var g2 = trackColor[1];
      var b2 = trackColor[2];

      // Using euclidean distance to compare colors
      var d = dist(r1, g1, b1, r2, g2, b2); // We are using the dist( ) function to compare the current color with the color we are tracking.

      // If current color is more similar to tracked color than
      // closest color, save current location and current difference
      if (d < worldRecord) {
        worldRecord = d;
        closestX = x;
        closestY = y;
      }
    }
  }

  // We only consider the color found if its color distance is less than 50. 
  // This threshold of 50 is arbitrary; you can adjust it depending on how accurate you require the tracking to be.
  if (worldRecord < 50) {
    // Draw a circle at the tracked pixel
    fill(trackColor);
    strokeWeight(4.0);
    stroke(0);
    ellipse(closestX, closestY, 16, 16);
  }
}


function mousePressed() {
  // Save color where the mouse is clicked in trackColor variable
  trackColor = video.get(mouseX, mouseY);
  console.log(trackColor);
}

 

Smoother Color Tracking: What is the Average Location of the Pixels That are Close Enough to a Particular Color?

The key part of this code is that I am not looking for a single pixel, but instead using a threshold to find a group of pixels that are close enough.  In this case I will average the locations of all the pixels that were close enough.  In the previous example, a bunch of similar pixels would win the contest of being closest in different frames, so it was kind of jumpy.  Now the whole group of close ones wins, so it will be smoother.
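The averaging idea can be tried out on a plain array before pointing it at a camera. This is a sketch assuming the usual flat RGBA layout; averageLocation is an illustrative name:

```javascript
// Average the locations of all pixels within `threshold` of `target` (an
// [r, g, b] array). Returns null if nothing qualifies.
function averageLocation(pixels, width, height, target, threshold) {
  let sumX = 0, sumY = 0, found = 0;
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const i = (row * width + col) * 4; // 4 entries per pixel
      const d = Math.sqrt(
        (pixels[i] - target[0]) ** 2 +
        (pixels[i + 1] - target[1]) ** 2 +
        (pixels[i + 2] - target[2]) ** 2
      );
      if (d < threshold) { sumX += col; sumY += row; found++; }
    }
  }
  return found > 0 ? { x: sumX / found, y: sumY / found } : null;
}

// A 3x1 image: red, black, red. Averaging the two red pixels lands
// between them, at x = 1.
const px = [255, 0, 0, 255,   0, 0, 0, 255,   255, 0, 0, 255];
console.log(averageLocation(px, 3, 1, [255, 0, 0], 20)); // → { x: 1, y: 0 }
```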

   if (diff < threshold) {  //if it is close enough in color, add it to the average
     sumX = sumX + col;
     sumY = sumY + row;
     totalFoundPixels++;
   }

Another thing to notice in this code is the keyPressed() function.  It is used to change variables, especially the threshold, dynamically while the program is running.  This is important in computer vision because light conditions can change so easily.

function keyPressed() {
  //for adjusting things on the fly
  if (key == '-') {
    threshold--;
    console.log("Threshold " + threshold);
  } 
  else if (key == '=') {
    threshold++;
    console.log("Threshold " + threshold);
  }
  else if (key == 'd') {
    background(255);
    debug = !debug;
    console.log("Debug " + debug);
  }
  else if (key == 't') {
    console.log("Time Between Frames " + ellapsedTime);
  }
}

This is not very important, don’t read it, but I am using a timer to test the performance.  As fast as computers are these days, you will really be testing them with computer vision, where you need to visit millions of pixels per second.  If you add a third repeat loop in there you will really need to start checking performance.  We have seen this method before of using a variable to store the millis() and then checking millis() against that variable to see how much time has passed.  There are about 33 milliseconds in a frame of video at 30 frames/second.

ellapsedTime = millis() - lastTime;  
//find time since last time, only print it out if you press "t"
lastTime = millis();  //reset timer for checking time next frame
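The same trick works in plain JavaScript with Date.now() standing in for p5’s millis():

```javascript
// Store the clock, do a frame's worth of work, subtract.
let lastTime = Date.now();

// ...pretend a frame of pixel-crunching happens here...
let total = 0;
for (let i = 0; i < 100000; i++) total += i;

const elapsedTime = Date.now() - lastTime; // milliseconds spent on the "frame"
lastTime = Date.now(); // reset the timer for checking the next frame
console.log(elapsedTime + " ms");
```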

Okay here is some smoother tracking.

var video; // our video capture object
var threshold = 20; //255 is white, 0 is black
var aveX, aveY; //this is what we are trying to find
var objectR = 255;
var objectG = 0;
var objectB = 0;
var debug = true;
var lastTime, ellapsedTime; //for checking performance


function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(320, 240);
}

function draw(){
  
    ellapsedTime = millis() - lastTime;  //find time since last time, only print it out if you press "t"
    lastTime = millis();  //reset timer for checking time next frame
   
    video.loadPixels();
   
    var totalFoundPixels = 0;  //we are going to find the average location of the qualifying pixels, so
    var sumX = 0;  //we will need the sum of all the x finds, the sum of all the y finds, and the total finds
    var sumY = 0;
    
    //enter into the classic nested for statements of computer vision
    for (var row = 0; row < video.height; row++) {
      for (var col = 0; col < video.width; col++) {
        //the pixels come in one long line; use this simple formula to find the offset for a given row and column

        var offset = (row * video.width + col) * 4;

        //pull out the individual color components for this pixel
        var r = video.pixels[offset];
        var g = video.pixels[offset + 1];
        var b = video.pixels[offset + 2];

        //in a color "space" you find the distance between colors the same way you would in Cartesian space: the Pythagorean theorem, or dist() in p5
        var diff = dist(r, g, b, objectR, objectG, objectB);

        if (diff < threshold) {  //if it is close enough in size, add it to the average
          sumX = sumX + col;
          sumY= sumY + row;
          totalFoundPixels++;
         // if (debug) video.pixels[offset] = 0xff000000;//debugging
        }
      }
    }
    video.updatePixels();
    
    image(video,0,0);
    
    if (totalFoundPixels > 0){
      aveX = sumX/totalFoundPixels;
      aveY = sumY/totalFoundPixels;
      ellipse(aveX, aveY, 20, 20); //ellipse() draws from the center by default
    }
  
}
function mousePressed(){
  //if they click, use that pixel's color as the new color to follow
  var thisColor = video.get(mouseX, mouseY);

  //pull out the individual color components
   objectR = thisColor[0];
   objectG = thisColor[1];
   objectB = thisColor[2];
   console.log("Chasing new color  " + objectR + " " + objectG + " " + objectB);
}
function keyPressed() {
  //for adjusting things on the fly
  if (key == '-') {
    threshold--;
    console.log("Threshold " + threshold);
  } 
  else if (key == '=') {
    threshold++;
    console.log("Threshold " + threshold);
  }
  else if (key == 'd') {
    background(255);
    debug = !debug;
    console.log("Debug " + debug);
  }
  else if (key == 't') {
    console.log("Time Between Frames " + ellapsedTime);
  }
}

 

 

Looking for Change: How does this Pixel Compare to the background?

For this we need to store another set of background pixels to compare the incoming frame to.  If you use the previous frame as the background pixels, you will be tracking change.  If you set the background in setup() or in mousePressed() (as in this example), you will be doing background removal.
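Here is a sketch of the comparison logic on bare RGBA arrays, separate from any camera (removeBackground is an illustrative name; green stands in for the removed background, as in the p5 code that follows):

```javascript
// Pixels that differ from the remembered background by more than
// `threshold` keep their color; the rest turn green.
function removeBackground(frame, bg, threshold) {
  const out = new Array(frame.length);
  for (let i = 0; i < frame.length; i += 4) {
    const d = Math.sqrt(
      (frame[i] - bg[i]) ** 2 +
      (frame[i + 1] - bg[i + 1]) ** 2 +
      (frame[i + 2] - bg[i + 2]) ** 2
    );
    if (d > threshold) {
      // foreground: keep the frame's color
      out[i] = frame[i]; out[i + 1] = frame[i + 1];
      out[i + 2] = frame[i + 2]; out[i + 3] = frame[i + 3];
    } else {
      // background: paint it green
      out[i] = 0; out[i + 1] = 255; out[i + 2] = 0; out[i + 3] = 255;
    }
  }
  return out;
}

// Background is black; the frame has one white pixel (foreground) and
// one black pixel (still background):
const bg = [0, 0, 0, 255,   0, 0, 0, 255];
const frame = [255, 255, 255, 255,   0, 0, 0, 255];
console.log(removeBackground(frame, bg, 20)); // white pixel kept, black one turned green
```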

var video;
var display;
var bgImage;

// How different must a pixel be to be a foreground pixel
var threshold = 20;

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(320, 240);

  // Create empty images the same size as the video (not the canvas),
  // since we index them with video.width
  bgImage = createImage(320, 240);
  display = createImage(320, 240);

}

function draw() {


  // We are looking at the video's pixels, the memorized backgroundImage's pixels, as well as accessing the display pixels. 
  // So we must loadPixels() for all!

  video.loadPixels();
  bgImage.loadPixels();
  display.loadPixels();

  // Begin loop to walk through every pixel
  for (var x = 0; x < video.width; x++) {
    for (var y = 0; y < video.height; y++) {
      var loc = (x + y * video.width) * 4; // What is the 1D pixel location

      // Store the RGB values for each pixel of the video and our background image
      var r1 = video.pixels[loc];
      var g1 = video.pixels[loc + 1];
      var b1 = video.pixels[loc + 2];
      var r2 = bgImage.pixels[loc];
      var g2 = bgImage.pixels[loc + 1];
      var b2 = bgImage.pixels[loc + 2];

      //Compare the foreground and background color
      var diff = dist(r1, g1, b1, r2, g2, b2);

      // Is the foreground color different from the background color
      if (diff > threshold) {
        // If so, display the foreground color
        display.pixels[loc] = video.pixels[loc];
        display.pixels[loc + 1] = video.pixels[loc + 1];
        display.pixels[loc + 2] = video.pixels[loc + 2];
        display.pixels[loc + 3] = video.pixels[loc + 3];

      } else {
        // If not, display green
        display.pixels[loc] = 0;
        display.pixels[loc + 1] = 255;
        display.pixels[loc + 2] = 0; // We could choose to replace the background pixels with something other than the color green!
        display.pixels[loc + 3] = 255;
      }
    }
  }
  //update the display image
  display.updatePixels();
  
  // display the updated display image
  image(display, 0, 0);

  // show our threshold in text at the bottom of the screen
  fill(0);
  rect(0, height - 20, width, 20);
  fill(255);
  text("Threshold is now: " + threshold, 20, height - 5);
}

function mousePressed() {
  for (var i = 0; i < video.pixels.length; i++) {
    bgImage.pixels[i] = video.pixels[i];
  }

  bgImage.updatePixels();
  changeThreshold(); // comment this out to keep threshold constant
}

// change the threshold on the fly depending on where you click
function changeThreshold() {
  threshold = map(mouseX, 0, width, 0, 175);
  console.log("Threshold is now: " + threshold);
}

 

 

Skin Rectangles: Is this pixel skin colored and is it part of an existing group of pixels?

Okay, this example combines two new things.  The most important is that we are looking for multiple groups of pixels, so finding a single average for the whole frame, as we were doing, won’t be enough.  Instead we are going to ask each pixel first whether it qualifies against a threshold, and second whether it is close to an existing group or needs to start a new group.  This could be used for pixels that qualify for things other than skin color; for example, the pixels could qualify for being bright or being part of the foreground.

          //is this spot in an existing box?
          if (existingBox.isNear(col, row, reach)) {
            existingBox.add(col, row);
            foundAHome = true; //no need to make a new one
            break; //no need to look through the rest of the boxes
          }
        }
        //if this pixel does not belong to one of the existing boxes, make a new one at this place
        if (foundAHome == false) boxes.add(new Rectangle(col, row));
      }
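The grouping logic can be exercised on a synthetic binary grid in plain JavaScript; findBoxes below mirrors the isNear/add growing of the Rectangle class further down, with made-up names:

```javascript
// Grouping pass on a binary grid: 1 means "qualifying pixel". A box grows
// when a qualifying pixel lands within `reach` of it; otherwise a new box
// starts.
function findBoxes(grid, width, height, reach) {
  const boxes = []; // each box: { left, up, right, down }
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      if (!grid[row * width + col]) continue; // pixel doesn't qualify
      // is this spot near an existing box?
      const home = boxes.find(b =>
        col >= b.left - reach && col <= b.right + reach &&
        row >= b.up - reach && row <= b.down + reach);
      if (home) {
        // stretch the box to include this pixel
        home.left = Math.min(home.left, col);
        home.right = Math.max(home.right, col);
        home.up = Math.min(home.up, row);
        home.down = Math.max(home.down, row);
      } else {
        boxes.push({ left: col, up: row, right: col, down: row });
      }
    }
  }
  return boxes;
}

// Two clumps of pixels far apart come back as two boxes:
const grid = [
  1, 1, 0, 0, 0, 0, 0, 1,
  1, 1, 0, 0, 0, 0, 0, 1
];
console.log(findBoxes(grid, 8, 2, 1).length); // → 2
```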

This example also happens to be about skin color.  It is a heartwarming fact that regardless of race we are all pretty much the same color, because we all have the same blood.  We are, however, different brightnesses.  Because brightness is not a separable notion in the RGB color space, we use a “normalized” color space where we look for percentages of red and green (the third color will just be the remaining percentage).

  float r = red(_thisPixel);
  float g = green(_thisPixel);
  float total = r + g + blue(_thisPixel);
//convert into a "normalized" instead of RGB
  float percentRed = r/total;
  float percentGreen = g/total;
  return (percentRed < skinRedUpper && percentRed > skinRedLower  && percentGreen < skinGreenUpper && percentGreen > skinGreenLower);
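The same normalized test in plain JavaScript, using the sketch’s starting bounds (these numbers are the example’s initial guesses, tunable at runtime with the keys, not universal constants; isSkin is a made-up name). Notice that halving a color’s brightness does not change its percentages:

```javascript
// Starting window from the sketch below.
const skinRedLower = 0.35, skinRedUpper = 0.55;
const skinGreenLower = 0.26, skinGreenUpper = 0.35;

function isSkin(r, g, b) {
  const total = r + g + b;
  if (total === 0) return false; // avoid dividing by zero on pure black
  // convert into "normalized" percentages instead of raw RGB
  const percentRed = r / total;
  const percentGreen = g / total;
  return percentRed > skinRedLower && percentRed < skinRedUpper &&
         percentGreen > skinGreenLower && percentGreen < skinGreenUpper;
}

console.log(isSkin(150, 90, 60)); // → true (50% red, 30% green)
console.log(isSkin(75, 45, 30));  // → true (same percentages at half brightness)
console.log(isSkin(100, 100, 100)); // → false (33% red is below the window)
```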

Okay, here is the code in two parts.  Note that this example is written in Processing (Java) rather than p5.  There is also a Rectangle object for holding the qualifying pixels.

import processing.video.*;

float skinRedLower = .35f;
float skinRedUpper = .55f;
float skinGreenLower = .26f;
float skinGreenUpper = .35f;
int reach = 3;
Capture cam;
int w = 640;
int h = 480;
boolean debug = true;
long elapsedTime = 0;

void setup() {
  size(w, h);  
  cam = new Capture(this, w, h, 30);
  cam.start();
}

public void draw() {  //called every time there is a new frame

  long startTime = System.currentTimeMillis();

  if (cam.available()) {
    cam.read(); //get the incoming frame as a picture
    ArrayList rects = findRectangles();

    //println(elapsedTime);  
    consolidate(rects, 0, 0);
    cleanUp(rects, 100);
    if (debug) image(cam, 0, 0);
    fill(0, 0, 0, 0);
    stroke(255, 0, 0);
    for (int i = 0; i < rects.size(); i++) {
      Rectangle thisBox = (Rectangle) rects.get(i);
      thisBox.draw();
    }

    elapsedTime = System.currentTimeMillis() - startTime;
  }
}

boolean test(int _thisPixel) {
  float r = red(_thisPixel);
  float g = green(_thisPixel);
  float total = r + g + blue(_thisPixel);
//convert into a "normalized" instead of RGB
  float percentRed = r/total;
  float percentGreen = g/total;
  return (percentRed < skinRedUpper && percentRed > skinRedLower  && percentGreen < skinGreenUpper && percentGreen > skinGreenLower);
}

void keyPressed() {

  //for adjusting things on the fly
  if (key == '-') {
    skinRedUpper = skinRedUpper - .01f;

    skinRedLower = skinRedLower - .01f;
  } 
  else if (key == '=') {
    skinRedUpper = skinRedUpper + .01f;

    skinRedLower = skinRedLower + .01f;
  }
  else if (key == '_') {
    skinGreenUpper = skinGreenUpper - .01f;

    skinGreenLower = skinGreenLower - .01f;
  } 
  else if (key == '+') {
    skinGreenUpper = skinGreenUpper + .01f;

    skinGreenLower = skinGreenLower + .01f;
  }
  else if (key == 'r') {
    reach--;
    println("reach " + reach);
  } 
  else if (key == 'R') {
    reach++;
    println("reach " + reach);
  } 
  else if (key == 't') {
    println("Elapsedtime " + elapsedTime);
  }
  else if (key == 'd') {
    debug = !debug;
    println("debug  " + debug);
  }
  println("RU:" + skinRedUpper + " RL:" + skinRedLower + " GU:" + skinGreenUpper + " GL:" + skinGreenLower);
}
}
void cleanUp(ArrayList _rects, int _sizeThreshold) {
  for (int j = _rects.size() - 1; j > -1; j--) { 

    Rectangle newRect = (Rectangle) _rects.get(j);
    if (newRect.getHeight()*newRect.getWidth() < _sizeThreshold) _rects.remove(j); //if the area is too small, lose it
  }
} 

public void consolidate(ArrayList _shapes, int _consolidateReachX, int _consolidateReachY) { 

  //check every combination of shapes for overlap 
  //make the repeat loop backwards so you delete off the bottom of the stack
  for (int i = _shapes.size() - 1; i > -1; i--) {
    //only check the ones up 
    Rectangle shape1 = (Rectangle) _shapes.get(i);

    for (int j = i - 1; j > -1; j--) {
      Rectangle shape2 = (Rectangle) _shapes.get(j);
      if (shape1.intersects(shape2) ) {
        shape1.add(shape2);
        //System.out.println("Remove" + j);
        _shapes.remove(j);
        break;
      }
    }
  }
} 
ArrayList findRectangles() {
  ArrayList boxes = new ArrayList();

  for (int row = 0; row < cam.height; row++) {
    for (int col = 0; col < cam.width; col++) {
      int offset = row * cam.width + col;
      int thisPixel = cam.pixels[offset];
      if (test(thisPixel)) {
        cam.pixels[offset] = 0xFFFFFF00; // mark qualifying pixels opaque yellow
        //be pessimistic
        boolean foundAHome = false;
        //look through the existing boxes
        for (int i = 0; i < boxes.size(); i++) {
          Rectangle existingBox =  (Rectangle) boxes.get(i);

          //is this spot in an  existing box
          if (existingBox.isNear(col, row,reach)) {
            existingBox.add(col, row);
            foundAHome = true; //no need to make a new one
            break; //no need to look through the rest of the boxes
          }
        }
        //if this does not belong to one of the existing boxes make a new one at this place
        if (foundAHome == false) boxes.add(new Rectangle(col, row));
      }
    }
  }
  return boxes;
}

 

class Rectangle {
  public int furthestLeft;
  public int furthestUp;
  public int furthestRight;
  public int furthestDown;

  Rectangle(int _x, int _y) {
    furthestLeft = _x;
    furthestUp = _y;
    furthestRight = _x;
    furthestDown = _y;
  }

  void add(int _x, int _y) {
    if (_x < furthestLeft) furthestLeft = _x;
    if (_x > furthestRight) furthestRight = _x;
    if (_y < furthestUp) furthestUp = _y;
    if (_y > furthestDown) furthestDown = _y;
  }

  void add(Rectangle _rect){
    if (_rect.furthestLeft < furthestLeft) furthestLeft = _rect.furthestLeft;
    if (_rect.furthestRight  > furthestRight) furthestRight = _rect.furthestRight ;
    if (_rect.furthestUp   < furthestUp) furthestUp = _rect.furthestUp ;
    if (_rect.furthestDown  > furthestDown) furthestDown = _rect.furthestDown ;
  }

  boolean isNear(int _x, int _y, int _reach) {
    //make sure this new spot is inside the current box stretched by reach
    return ((_x >= furthestLeft - _reach && _x < furthestRight + _reach) && (_y >= furthestUp - _reach && _y < furthestDown + _reach));
  }

  void draw() {
    rect(furthestLeft, furthestUp, furthestRight-furthestLeft, furthestDown-furthestUp);
  }

  int getWidth(){
    return furthestRight-furthestLeft;
  }

  int getHeight(){
    return furthestDown-furthestUp;
  }

  boolean intersects(Rectangle _other){
    return ! ( _other.furthestLeft > furthestRight
    || _other.furthestRight < furthestLeft
    || _other.furthestUp > furthestDown
    || _other.furthestDown < furthestUp
    );
  }
}

 

Other Examples:

Face detection

Kinect For P5

Kinect For Processing

Network Camera

Related reading:
Learning Processing, Chapters 15-16
Assignment

Pixels Project

Track a color and have some animation or sound change as a result.  Create a software mirror by designing an abstract drawing machine which you color according to pixels from live video.
Create a video player. Consider combining your pcomp media controller assignment and build a Processing sketch that allows you to switch between videos, process pixels of a video, scrub a video, etc.
Use the Kinect to track a skeleton. Can you “puppeteer” an avatar/animation with the Kinect?

 
