This week was a very successful one for me. I found a professor at the computing faculty of Bielefeld University who can examine me in my fifth A-levels exam. This is a special topic in informatics. Normally you take exams in four subjects. My fifth is optional; I chose to do it. It is about object recognition. I had wanted to start with simple motion detection and maybe some object detection, something like detecting a red ball.
Now everything gets more complicated. My new task is to create an application that does the following:
- Some objects are placed on a table. A camera is pointed at these objects from a known distance.
- The user grabs an object and holds it in front of the camera.
- The user puts the object back in any rotation. The object can also be flipped horizontally or vertically.
- The program has to identify that object.
This will be a lot of work. I will use artificial neural networks for that process, along with some motion detection and probably a lot of filters. I am just starting now with an application written in C++, and there is the first problem: I need the RGB values in something like an Array (Flash) or int[WIDTH][HEIGHT][3] if you know C. The trouble is that I do not have a clue about the WinAPI and all that stuff. There are a lot of tricks I do not know about. Maybe I will try DirectShow for the video capture, because I am currently using the Video for Windows (VfW) API and everything is very slow (9 fps ;-)).
If I cannot find a way (as someone unskilled in C/C++) to capture the webcam nice and fast and put all the RGB values in an array, I will have to use proce55ing or ActionScript 3 for the whole project. But I need the speed.
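To make the target data layout concrete, here is a minimal C++ sketch of what I mean by "RGB values in an array". It assumes the pixels arrive as packed 0xRRGGBB integers in row-major order; the frame size and the names Rgb, unpack, toChannels and indexOf are made up for illustration and have nothing to do with the WinAPI.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame size; a real webcam reports its own dimensions.
constexpr std::size_t WIDTH = 320;
constexpr std::size_t HEIGHT = 240;

// One pixel as three separate channels: {R, G, B}.
using Rgb = std::array<std::uint8_t, 3>;

// Split one packed 0xRRGGBB value into its channels.
inline Rgb unpack(std::uint32_t packed) {
    return Rgb{static_cast<std::uint8_t>((packed >> 16) & 0xFF),
               static_cast<std::uint8_t>((packed >> 8) & 0xFF),
               static_cast<std::uint8_t>(packed & 0xFF)};
}

// Index of the pixel at column x, row y in a row-major buffer.
inline std::size_t indexOf(std::size_t x, std::size_t y) {
    return y * WIDTH + x;
}

// Convert a whole frame of packed pixels into per-channel values.
// The result plays the role of int[WIDTH][HEIGHT][3], but lives on the heap.
std::vector<Rgb> toChannels(const std::vector<std::uint32_t>& packed) {
    assert(packed.size() == WIDTH * HEIGHT);
    std::vector<Rgb> out;
    out.reserve(packed.size());
    for (std::uint32_t p : packed) {
        out.push_back(unpack(p));
    }
    return out;
}
```

With this layout, toChannels(frame)[indexOf(x, y)][0] would be the red value of the pixel at (x, y). Doing the shifts per pixel is simple but touches every value once, which is exactly the kind of per-pixel work I would like to keep cheap.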
Anyway. Thanks to David Rowald from extrajetzt for the nice stickers. They will look nice on my new ride - a Fiat 126.
Update:
Phew. It was not that difficult to get more speed. Instead of using the slow GetPixel function of the WinAPI, I now use the raw data straight from the camera. One memcpy() call does the work of putting all pixels into an RGB array, so I do not even have to split the values from 0xRRGGBB into RR, GG, BB myself. This is now done in one call. FPS still greater than 2*10^5.
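The idea behind the single memcpy() can be sketched roughly like this. It is not my actual capture code, only an illustration under some assumptions: the frame is an uncompressed 24-bit buffer with three bytes per pixel, rows are tightly packed with no padding, and the bytes come in B, G, R order (which is how 24-bit Windows bitmaps are usually stored). The names Pixel, copyFrame and raw are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed in-memory layout of one pixel: three bytes in B, G, R order.
struct Pixel {
    std::uint8_t b, g, r;
};
static_assert(sizeof(Pixel) == 3, "Pixel must be exactly three bytes");

// Copy a raw frame into typed pixels with a single memcpy.
// `raw` stands for a hypothetical pointer to width * height * 3 bytes
// handed over by the capture callback.
std::vector<Pixel> copyFrame(const std::uint8_t* raw,
                             std::size_t width, std::size_t height) {
    assert(raw != nullptr);
    std::vector<Pixel> pixels(width * height);
    // Because Pixel matches the byte layout of the buffer, one block copy
    // is enough; no per-pixel shifting or masking is needed.
    std::memcpy(pixels.data(), raw, pixels.size() * sizeof(Pixel));
    return pixels;
}
```

After the copy, copyFrame(raw, w, h)[y * w + x].r is the red value at (x, y). If the real buffer pads each row to a multiple of four bytes or is stored bottom-up, the copy would have to go row by row instead.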



