Read more: Paul Mason
Recap: What needs fixing?
Please note: An assumption is made in this article that all
invalid opcodes are single byte opcodes; this example does not cater for
invalid double byte opcodes.
Well, to work out what needs fixing, we'll first write some code
that we'll use to break Mono.Cecil (and for testing):
    //Load the assembly
    var assembly = AssemblyFactory.GetAssembly(
        @"D:\temp\Obfuscated\SimpleLibrary.dll");

    //Go through each type in the assembly
    foreach (TypeDefinition type in assembly.MainModule.Types)
    {
        //Go through each method
        foreach (MethodDefinition def in type.Methods)
        {
            //Check the body
            if (def.HasBody)
            {
                //Get the CIL worker
                CilWorker worker = def.Body.CilWorker;

                //Chuck the bad instructions in here to avoid modifying the collection
                List<Instruction> instructionsToFix = new List<Instruction>();

                //Go through each instruction
                foreach (Instruction instr in def.Body.Instructions)
                {
                    //TODO: Somehow figure out if it is one to fix and add it to be fixed
                }

                //Go through the ones to fix and replace
                foreach (Instruction instr in instructionsToFix)
                {
                    Instruction newInstr = worker.Create(OpCodes.Nop);
                    worker.Replace(instr, newInstr);
                }
            }
        }
    }

    //Save the assembly
    AssemblyFactory.SaveAssembly(assembly,
        @"D:\temp\Obfuscated\SimpleLibrary.new.dll");
This is some pretty basic code which simply goes through each type
and each method inside an assembly and replaces all invalid opcodes with
a nop.
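The reason for the separate instructionsToFix list is worth a quick illustration. The sketch below is not Mono.Cecil code: it uses a plain List<int> as a stand-in for the instruction collection (the class and method names are made up for this example), and it assumes the instruction collection reacts to changes during enumeration the way the standard .NET collections do.

```csharp
using System;
using System.Collections.Generic;

public static class CollectThenReplace
{
    // Replaces every negative number with 0, standing in for
    // "replace every invalid instruction with a nop".
    public static List<int> Fix(List<int> items)
    {
        // Pass 1: only record the positions that need fixing.
        var indexesToFix = new List<int>();
        for (int i = 0; i < items.Count; i++)
        {
            if (items[i] < 0)
                indexesToFix.Add(i);
        }

        // Pass 2: mutate the list now that the scan has finished.
        foreach (int index in indexesToFix)
            items[index] = 0;

        return items;
    }

    // Returns true if changing a list inside its own foreach throws.
    public static bool ModifyingWhileEnumeratingThrows()
    {
        var items = new List<int> { 1, -2, 3 };
        try
        {
            foreach (int item in items)
            {
                if (item < 0)
                    items.Remove(item); // invalidates the enumerator
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Fix(new List<int> { 1, -2, 3 }))); // 1,0,3
        Console.WriteLine(ModifyingWhileEnumeratingThrows());                 // True
    }
}
```

Collecting first and replacing afterwards costs one small extra list per method body, which is a cheap price for not having to think about enumerator invalidation.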
When we run this code using the default version of Mono.Cecil we
unfortunately come across an error:
[Screenshot: Mono.Cecil didn't like an opcode]
Now we know what we're fixing!
Getting the source
First of all, we need to get the source for Mono.Cecil to start
working with it. Rather than get the entire Mono system, I decided to
just check out the project that I needed via SVN:
svn co svn://anonsvn.mono-project.com/source/trunk/mcs/class/Mono.Cecil
Unfortunately the project won't compile by itself due to the .snk
file being located in a directory one up from Mono.Cecil. For this
example I simply turned off assembly signing to get this compiling,
however please feel free to download the .snk file and place it in the
appropriate location to have a fully signed version of Mono.Cecil.
Hacking Mono.Cecil
Now that we've got the source and it's compiling, let's hack it. From
the screenshot you'll see that the error originates in the CodeReader
class on line 207 (in my copy anyway). Taking a look at the code at
that line, we see the following switch statement:
    if (cursor == 0xfe)
        op = OpCodes.TwoBytesOpCode [br.ReadByte ()];
    else
        op = OpCodes.OneByteOpCode [cursor];

    Instruction instr = new Instruction ((int) offset, op);
    switch (op.OperandType) {
    case OperandType.InlineNone :
        break;
    ...
    case OperandType.InlineTok :
        MetadataToken token = new MetadataToken (br.ReadInt32 ());
        switch (token.TokenType) {
        ...
        default :
            throw new ReflectionException ("Wrong token: " + token);
        }
        break;
    }
That's our error message all right, and it seems to be happening
because execution ends up in the OperandType.InlineTok case. Ideally
we'd like to go into InlineNone instead, since an invalid opcode has no
subsequent operand. As you can see, the OperandType comes from the
variable op, which is defined by the lines:
    if (cursor == 0xfe)
        op = OpCodes.TwoBytesOpCode [br.ReadByte ()];
    else
        op = OpCodes.OneByteOpCode [cursor];
Well, since we're only working with one byte opcodes in this
example, let's concentrate on that. The OpCodes.OneByteOpCode variable
is actually an array which places each opcode at the position in the
array given by its byte code representation; for example: index 0 = 0x00
= nop, index 1 = 0x01 = break ... etc. In one
of our previous articles, we placed several invalid opcode bytes
throughout the code, all within a certain subset: 0xbe, 0xc0, 0xc1...
etc. Therefore, our invalid opcodes should be at the corresponding
indexes of OneByteOpCode; i.e. 190, 192, 193... etc.
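If the hex-to-index mapping isn't obvious, here is a tiny sketch of the idea. It is a toy table, not the real OpCodes class: the names table and the IndexFor helper are assumptions made purely for illustration, and only the two entries mentioned above are filled in.

```csharp
using System;

public static class OpCodeIndexDemo
{
    // In a 256-slot one-byte table, the opcode byte is its own index.
    public static int IndexFor(byte opcode)
    {
        return opcode;
    }

    public static void Main()
    {
        // Toy table holding only the two entries named in the text.
        var names = new string[256];
        names[0x00] = "nop";
        names[0x01] = "break";
        Console.WriteLine(names[IndexFor(0x01)]); // break

        // The invalid bytes from the previous article and where they land.
        byte[] invalid = { 0xbe, 0xc0, 0xc1 };
        foreach (byte b in invalid)
            Console.WriteLine($"0x{b:x2} -> index {IndexFor(b)}");
        // 0xbe -> index 190
        // 0xc0 -> index 192
        // 0xc1 -> index 193
    }
}
```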
Still following? Essentially, to solve this problem we need to see
what opcodes are being defined at these indexes in Mono.Cecil at
runtime. Well, as we all know, a struct is never null, therefore the
object at each of those "unused" opcode indexes is an empty struct (i.e.
all fields left at their default values). Due to the way that the
Mono.Cecil OpCode object works, this gives us a confusing result stating
that the size of the OpCode is two bytes, even though it is in the one
byte array (check out the OpCode.Size property to see why).
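To see the effect in isolation, here is a minimal sketch using a toy struct. ToyOpCode, its fields and its Size rule (a first byte of 0xff marks a single byte opcode) are assumptions modelled on the behaviour described above, not a copy of the Mono.Cecil source.

```csharp
using System;

// Toy stand-in for an opcode struct. The field names and the Size rule
// are assumptions made for this sketch.
public struct ToyOpCode
{
    public byte Op1;
    public byte Op2;

    public ToyOpCode(byte op1, byte op2)
    {
        Op1 = op1;
        Op2 = op2;
    }

    // Assumed rule: a first byte of 0xff means "single byte opcode".
    public int Size
    {
        get { return Op1 == 0xff ? 1 : 2; }
    }
}

public static class DefaultStructDemo
{
    public static ToyOpCode[] BuildTable()
    {
        // Every slot starts life as default(ToyOpCode): all fields zero.
        var table = new ToyOpCode[256];
        table[0x00] = new ToyOpCode(0xff, 0x00); // nop, explicitly initialised
        return table;
    }

    public static void Main()
    {
        var table = BuildTable();
        Console.WriteLine(table[0x00].Size); // 1
        Console.WriteLine(table[0xbe].Size); // 2: never assigned, so Op1 is 0
    }
}
```

Because a never-assigned slot has Op1 equal to 0 rather than 0xff, it reports itself as a two byte opcode despite living in the one byte table.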
No wonder it causes problems! So how do we fix this? Well, for a
start we should initialise the array inside the OpCodes class to avoid
this issue:
    static OpCodes()
    {
        //Start from first index to avoid nop
        for (int i = 1; i < OneByteOpCode.Length; i++)
        {
            //Check to see if it is listed as an arglist... but not one
            if (OneByteOpCode[i].Op2 == 0x00 && OneByteOpCode[i].Code != Code.Arglist)
            {
                OneByteOpCode[i] = new OpCode(0xff, (byte) i,
                    Code.Unused, FlowControl.Next, OpCodeType.Primitive,
                    OperandType.InlineNone, StackBehaviour.Pop0,
                    StackBehaviour.Push0);
            }
        }
    }
Basically we are looking for all OpCodes that haven't been
initialised properly; that is, those with Op2 == 0x00. We have to be
careful however: both Nop and Arglist use an empty Op2 correctly, so we
intentionally skip those. Now, if you copy and paste this into
your code, the compiler will complain about Code.Unused, which doesn't
exist yet. To make things cleaner I simply added a new member to the
Code enum so that identification of invalid OpCodes is nice and easy.
The reason I use the word "unused" is really so that it is in line with
how ILDASM sees an invalid OpCode.
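The same sentinel-filling idea can be tried out without touching Mono.Cecil at all. The sketch below is a toy model: MiniCode, MiniOpCode and SentinelFill are hypothetical names invented for this example, and the table only defines nop and break.

```csharp
using System;

public enum MiniCode { Nop, Break, Unused }

// Toy opcode struct; names and layout are assumptions for this sketch.
public struct MiniOpCode
{
    public readonly byte Op1;
    public readonly byte Op2;
    public readonly MiniCode Code;

    public MiniOpCode(byte op1, byte op2, MiniCode code)
    {
        Op1 = op1;
        Op2 = op2;
        Code = code;
    }
}

public static class SentinelFill
{
    public static MiniOpCode[] BuildTable()
    {
        var table = new MiniOpCode[256];
        table[0x00] = new MiniOpCode(0xff, 0x00, MiniCode.Nop);
        table[0x01] = new MiniOpCode(0xff, 0x01, MiniCode.Break);

        // Index 0 is nop, whose Op2 is legitimately 0, so start at 1.
        for (int i = 1; i < table.Length; i++)
        {
            // A defined entry at index i has Op2 == i, so Op2 == 0 here
            // means the slot was never assigned.
            if (table[i].Op2 == 0x00)
                table[i] = new MiniOpCode(0xff, (byte)i, MiniCode.Unused);
        }
        return table;
    }

    public static int CountUnused(MiniOpCode[] table)
    {
        int count = 0;
        foreach (MiniOpCode op in table)
        {
            if (op.Code == MiniCode.Unused)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        var table = BuildTable();
        Console.WriteLine(table[0xbe].Code);   // Unused
        Console.WriteLine(CountUnused(table)); // 254
    }
}
```

Having a dedicated Unused member means callers can test for an invalid opcode with a single comparison instead of poking at raw bytes.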
Before we finish hacking Mono.Cecil, there is one more "aesthetic"
change that I thought I'd make. Technically, the change above fixes the
issue for us; however, being the pedantic guy that I am, I also wanted to
fix the "ToString()" method so that it'd display "unused" instead of
"arglist" when an invalid OpCode is present. Well, it actually isn't a
hard aesthetic fix to make. Simply find the Name property in the OpCode
class, and use the following:
    public string Name {
        get {
            int index = (Size == 1) ? Op2 : (Op2 + 256);
            return OpCodeNames.names [index] ?? "unused";
        }
    }
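For anyone wondering how the index calculation and the ?? fallback play together, here is a standalone sketch. The 512-slot Names table and the GetName helper are assumptions for illustration only; the real OpCodeNames.names array may well be laid out or sized differently.

```csharp
using System;

public static class NameLookup
{
    // Hypothetical names table: slots 0-255 for one byte opcodes,
    // slots 256 and up for two byte opcodes. Unfilled slots stay null.
    static readonly string[] Names = BuildNames();

    static string[] BuildNames()
    {
        var names = new string[512];
        names[0x00] = "nop";
        names[0x01] = "break";
        names[256 + 0x00] = "arglist"; // two byte opcode 0xfe 0x00
        return names;
    }

    public static string GetName(int size, byte op2)
    {
        // Same index calculation as the Name property above.
        int index = (size == 1) ? op2 : (op2 + 256);

        // A null slot means no name was registered for that opcode.
        return Names[index] ?? "unused";
    }

    public static void Main()
    {
        Console.WriteLine(GetName(1, 0x00)); // nop
        Console.WriteLine(GetName(2, 0x00)); // arglist
        Console.WriteLine(GetName(1, 0xbe)); // unused
    }
}
```

The middle call also shows why an uninitialised opcode used to print as "arglist": with a reported size of two and an Op2 of zero, the lookup lands on slot 256.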
Now to test it all...
Testing our results
As you'll remember, I declared a new enum member: Code.Unused.
It comes in handy when we rewrite our testing program:
    //Load the assembly
    var assembly = AssemblyFactory.GetAssembly(
        @"D:\temp\Obfuscated\SimpleLibrary.dll");

    //Output the il for each method in the assembly
    foreach (TypeDefinition type in assembly.MainModule.Types)
    {
        //Go through each method
        foreach (MethodDefinition def in type.Methods)
        {
            //Check the body
            if (def.HasBody)
            {
                //Get the CIL worker
                CilWorker worker = def.Body.CilWorker;

                //Chuck the bad instructions in here to avoid modifying the collection
                List<Instruction> instructionsToFix = new List<Instruction>();

                //Go through each instruction
                foreach (Instruction instr in def.Body.Instructions)
                {
                    //Remove invalid opcode
                    if (instr.OpCode.Code == Code.Unused)
                        instructionsToFix.Add(instr);
                }

                //Go through the ones to fix and replace
                foreach (Instruction instr in instructionsToFix)
                {
                    Instruction newInstr = worker.Create(OpCodes.Nop);
                    worker.Replace(instr, newInstr);
                }
            }
        }
    }

    //Save the assembly
    AssemblyFactory.SaveAssembly(assembly,
        @"D:\temp\Obfuscated\SimpleLibrary.new.dll");
We use Code.Unused to test for an invalid opcode to replace.
What are the results? Well, Reflector can now decompile the code as per
usual (again):
[Screenshot: Reflector now works ok again]
Conclusion
This week we took a look at "fixing" the problem with Mono.Cecil when
we reached an invalid OpCode. Essentially, fixing the problem in
Mono.Cecil involved:
- Creating a new enum member Code.Unused so that we can identify
  invalid opcodes.
- Initialising the static array with our invalid opcodes:
  OpCodes.OneByteOpCode. This helped provide us with accurate opcode
  descriptions in unused positions.
- (Optional) Changing OpCode.Name to return an accurate friendly name
  for invalid opcodes.
Once Mono.Cecil could handle these opcodes, we had no problem
whatsoever writing an automated tool to "fix" the assembly for us. It
certainly doesn't take much to reverse some of the "value added"
obfuscation techniques, does it!?
Next time
Well, that's all for this week. If you have any
questions/suggestions/notes, then please let me know. Not sure what the
next article will be about yet, however I'll be sure to make it
something interesting (perhaps tamper proofing?). What are your
thoughts?