Object recognition: current process

Some days ago I whined about my problems with C++ and my webcam. After I figured out how to get the raw pixel data at high speed, I started on the interesting things.

I started with a datatype for the color. Since I can copy all the bytes I receive from the webcam into a struct of B, G, R (yes, reverse order), I can now use my 3-byte datatype for more tricky operations. I overloaded most of the operators. This is not possible in Flash (AS1, AS2, AS3), which is a shame. Maybe you want to compare RGB(255,0,255) with 0xff00ff. In Flash you have to write your own functions for that, which is boring and of course not a very sexy solution. Another example is data conversion.
Imagine you have an RGB class with three components of type uint (AS3) or Number (AS1, AS2). Functions like lineStyle() need the integer representation 0xRRGGBB, so you have to write a function that packs your r, g and b bytes into this format. The code is simply (r << 16) | (g << 8) | b, BUT you have to write lineStyle( color.toInt(), ... ); every time, and the extra call slows the code down.
I am so excited about all the features C++ offers that I overloaded almost all of the operators and type casts. ;-) Something like BlendMode.ADD is done by rgb0 + rgb1, etc. This makes coding much easier.
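
To make this concrete, here is a minimal sketch of what such a 3-byte color type could look like, assuming C++11. It is not the code from this project: the names Bgr, toInt, saturatingAdd and fromRawBytes are made up for illustration, and clamping the sum at 255 is an assumption modelled on additive blending.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 3-byte pixel type. The field order follows the B,G,R layout
// described in the post; the name and members are illustrative only.
struct Bgr {
    std::uint8_t b, g, r;

    // Pack the three channels into the 0xRRGGBB integer form.
    constexpr std::uint32_t toInt() const {
        return (std::uint32_t{r} << 16) | (std::uint32_t{g} << 8) | std::uint32_t{b};
    }

    // Implicit cast, so a Bgr can be used wherever an integer color is expected.
    constexpr operator std::uint32_t() const { return toInt(); }
};

static_assert(sizeof(Bgr) == 3, "Bgr must be exactly three bytes to match the raw frame data");

// Channel-wise comparison of two pixels.
inline bool operator==(Bgr lhs, Bgr rhs) {
    return lhs.b == rhs.b && lhs.g == rhs.g && lhs.r == rhs.r;
}

// Add two channel values and clamp the result at 255 (assumed behaviour).
inline std::uint8_t saturatingAdd(std::uint8_t x, std::uint8_t y) {
    const int sum = int{x} + int{y};
    return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}

// Additive blend expressed as operator+.
inline Bgr operator+(Bgr lhs, Bgr rhs) {
    return Bgr{saturatingAdd(lhs.b, rhs.b),
               saturatingAdd(lhs.g, rhs.g),
               saturatingAdd(lhs.r, rhs.r)};
}

// Copy a raw B,G,R byte stream (as delivered by the camera) into pixels.
inline std::vector<Bgr> fromRawBytes(const std::uint8_t* data, std::size_t byteCount) {
    assert(byteCount % sizeof(Bgr) == 0);
    std::vector<Bgr> pixels(byteCount / sizeof(Bgr));
    if (!pixels.empty()) {
        std::memcpy(pixels.data(), data, byteCount);
    }
    return pixels;
}
```

With this sketch, Bgr{255, 0, 255} == 0xff00ff compiles and is true thanks to the implicit cast, and a pixel can be handed directly to any function that takes a 0xRRGGBB integer without an explicit toInt() call.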

After I played around with the datatypes and operators, I started with filters. Filters are very important for my approach to the object recognition problem. First of all, I do not want to track moving objects, so I need the outline of an object that is not moving. The hard part is removing all the noise (e.g. the background). It is hard to decide which parts of the picture belong to the object and which are just an outline in the background.
But I only started working on this project four days ago, and there are lots of things still to be done. What I want to do is create a polygon of the outline and then check that polygon with a neural network. I will try to build the polygon from the outline that I get using different filters.

What you see on the left in the 2nd picture is a combination of blend modes, a convolution filter and a binary filter. The third picture is my evil tricky part ;-) It is the outline of objects in motion with different weights. This will be used to extract the part of the picture where the object has been placed. It could also be used for motion vectors (maybe).
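
As a rough illustration of the convolution and binary filter steps, here is a sketch that runs a 3x3 kernel over a grayscale image and then thresholds the result. It is not the filter chain used for the pictures: the GrayImage type, the function names, the Laplacian kernel and the threshold value in the usage note are all assumptions chosen to keep the example small.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical single-channel image; the post does not show its own image type.
struct GrayImage {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // row-major, width * height entries

    std::uint8_t at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

using Kernel3x3 = std::array<int, 9>;

// Edge-detecting (Laplacian) kernel, picked here only as an example.
const Kernel3x3 kLaplacian = {{0, -1, 0,
                               -1, 4, -1,
                               0, -1, 0}};

// Apply a 3x3 kernel (without flipping it), take the absolute response and
// clamp it to 0..255. Border pixels are left at 0 to keep the sketch short.
GrayImage convolve3x3(const GrayImage& src, const Kernel3x3& kernel) {
    assert(src.pixels.size() == src.width * src.height);
    GrayImage dst{src.width, src.height,
                  std::vector<std::uint8_t>(src.pixels.size(), 0)};
    for (std::size_t y = 1; y + 1 < src.height; ++y) {
        for (std::size_t x = 1; x + 1 < src.width; ++x) {
            int sum = 0;
            for (std::size_t ky = 0; ky < 3; ++ky) {
                for (std::size_t kx = 0; kx < 3; ++kx) {
                    sum += kernel[ky * 3 + kx] * src.at(x + kx - 1, y + ky - 1);
                }
            }
            const int magnitude = std::abs(sum);
            dst.pixels[y * src.width + x] =
                static_cast<std::uint8_t>(magnitude > 255 ? 255 : magnitude);
        }
    }
    return dst;
}

// Binary filter: every pixel at or above the threshold becomes white (255),
// everything else becomes black (0).
GrayImage binarize(const GrayImage& src, std::uint8_t threshold) {
    GrayImage dst = src;
    for (auto& p : dst.pixels) {
        p = (p >= threshold) ? 255 : 0;
    }
    return dst;
}
```

Calling binarize(convolve3x3(image, kLaplacian), 128) leaves white pixels only where the brightness changes sharply, which is the kind of raw outline a later polygon step could start from.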

So what will be next? I have to buy something that allows me to use USB 2.0, because my camera is doing 1 FPS. This was also a problem when I did some experiments in ActionScript 3. The next steps in development are extracting the relevant part of the whole picture, calculating a polygon of the outline and then finally using the neural network.

13 Responses to “Object recognition: current process”


  • I’ve been working on a vector trace class in the Flex 2.0 IDE (AS 3.0). I’ve got the contours of the objects working, but the holes in the objects are a pain. The contours are found by making the image black and white. But deciding whether a boundary is a hole or just a boundary is still quite difficult…
    The trace of a 160×120 webcam feed takes about 0.5 sec, but that’s before I have optimized the tracing code.
    Of course, working with C++ gives you more tools.

  • You are using shape-space estimation?

  • Probably not, because I don’t know what that is ;). No, I blur the image and raise the contrast so high that the image turns black and white. Then, scanning through the image line by line, I look for a black pixel above and a white pixel below. When I find this combination, I follow the edge round until I’m back at the start. Of course I register the information in a two-dimensional Array. It works quite well. At the moment I also rely on the scanline to keep checking for holes, but that makes it difficult to find out whether there is a hole.

  • I see. Shape space estimation is far too mathematical for me. I’m a designer trying to make the things I imagine…

  • I don’t know what you want to achieve, but maybe you need the medial axis of an object.
    This is only the “inline” of an object. It could also be useful for you.

    It is a shame that I can’t find any good link for that topic. It is related to Voronoi diagrams and skeletonization.

  • I’ve checked the topics you named, but you’re thinking in a far too complicated direction. It’s like a two-dimensional doughnut: it has a hole. I’m trying to find the hole, but with an irregular polygon that has an irregular polygon hole.

  • If you know the outline and have some connected points of that polygon, you could calculate the center. Then you could walk outward from the center in all directions to the outline, and everything you cross on the way will be a hole.

  • That’s right, that was one of the tactics I considered, but then the center could be a hole… :) And I have no way of deciding either way.
    I’ve got another solution working now. I use scanlines to run through the pixels and look for outlines. I run along the points of the outline and save, for each scanline, the boundary points of that outline. When I’ve got an outline, I use those boundary points to scan the inner space of the outline for holes. I also mark the inner points as points of the outline. That way I can ignore the inner space of the outline and still scan the inside of the holes for other outlines.

    It’s all pretty rough and unscientific, but it seems to work. There are some problems, because when scanning for holes between the boundaries of an outline on a scanline I could still hit an outline outside of the current outline.
    But I think there’s no way of determining, within the context of a single scanline, whether I have found a hole or an outline.

  • Do you have a static background or is the background changing? There are some (complex) algorithms to determine what is in the background and what is in the foreground.

    There are also simple solutions. You could set the background by pressing a button, and then you would know which other objects are in front of the camera. Using scanlines to determine the outline would become much easier.

  • In principle the background is mostly static because I use a webcam feed, and because it’s a live feed, determining the background by clicking is not really an option. But combining the idea of a static background with measuring the changes in the bitmapdata could make localized motion detection quite cool!!
    Foreground and background detection would be even cooler!! :)

  • Do you know you have a popup advertisement running on your page?

  • No, but it might be the webstats counter. Thanks for the information, I’ll remove that thing.

    About the background/foreground: leave me your email, or maybe I will create a post here about how to easily distinguish between background and foreground in AS3. It is not that hard, even using a webcam. There are simple approaches.

  • A post is always good because it’s also interesting for other people. My address is m-i-c-h_a_ u-d=At=v_e-n ~a *n^. nl (remove all the non-alphabetic chars and change the At into @).
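
The simple approach mentioned in the comments (store the background once, then compare every new frame against it) can be sketched roughly as follows. This is an illustration in C++ rather than the AS3 discussed above, and it is not code from either project: the Frame alias, the BackgroundMask class and the tolerance parameter are hypothetical names, and frames are assumed to be grayscale buffers of equal size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical frame type: one grayscale byte per pixel.
using Frame = std::vector<std::uint8_t>;

class BackgroundMask {
public:
    // Call once (for example on a button press) while only the background
    // is in front of the camera.
    void capture(const Frame& frame) { background_ = frame; }

    // Returns a mask that is 255 where the frame differs from the stored
    // background by more than `tolerance`, and 0 everywhere else.
    Frame foreground(const Frame& frame, int tolerance) const {
        assert(frame.size() == background_.size());
        Frame mask(frame.size(), 0);
        for (std::size_t i = 0; i < frame.size(); ++i) {
            const int diff = std::abs(int{frame[i]} - int{background_[i]});
            if (diff > tolerance) {
                mask[i] = 255;
            }
        }
        return mask;
    }

private:
    Frame background_;
};
```

The tolerance is there to absorb camera noise. With a live feed where clicking is not practical, the same comparison against the previous frame instead of a stored background gives a crude motion mask.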
