Java Asm Patching
(bytecode patching)

by epokh

Date: 14/01/2006

UIC's Home Page

Published by Quequero

Whenever you advise a ruler in the way of Tao, counsel him not to use force to conquer the universe.
For this would only cause resistance.

- Screw more and study less - Epokh
- Nerd out less and screw more - Federica (the helping hand)

Thorn bushes spring up wherever the army has passed.
Lean years follow in the wake of a great war.
Just do what needs to be done.

(Tao Te Ching)

....

Home page : http://www.epokh.org
E-mail: [email protected]
[email protected]

....

Level

( )NewBie( )Intermediate (X)Advanced ( )Master

 


Introduction

The reader should know the JVM architecture and Java asm (bytecode); however, I will recall some essentials during the article.

Used tools

Just
Class Construction Kit
 

Program

We are going to crack a simple program written by me: the application is App.jar.
There is also an obfuscated version.

Essay


In this tutorial we introduce a new way to crack a Java program: patching the bytecode directly is much better than decompiling the Java code and then recompiling it! I assume that the reader has some basic knowledge of the Java VM specification; in any case I will summarize the basics of Java assembly code.

Frames

A frame is used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions.

A new frame is created each time a method is invoked. A frame is destroyed when its method invocation completes, whether that completion is normal or abrupt (it throws an uncaught exception). Frames are allocated from the Java virtual machine stack of the thread creating the frame. Each frame has its own array of local variables, its own operand stack, and a reference to the runtime constant pool of the class of the current method.

The sizes of the local variable array and the operand stack are determined at compile time and are supplied along with the code for the method associated with the frame. Thus the size of the frame data structure depends only on the implementation of the Java virtual machine, and the memory for these structures can be allocated simultaneously on method invocation.

Only one frame, the frame for the executing method, is active at any point in a given thread of control. This frame is referred to as the current frame, and its method is known as the current method. The class in which the current method is defined is the current class. Operations on local variables and the operand stack are typically with reference to the current frame.

A frame ceases to be current if its method invokes another method or if its method completes. When a method is invoked, a new frame is created and becomes current when control transfers to the new method. On method return, the current frame passes back the result of its method invocation, if any, to the previous frame. The current frame is then discarded as the previous frame becomes the current one.

Note that a frame created by a thread is local to that thread and cannot be referenced by any other thread.
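As a small illustration of "one frame per method invocation", the sketch below counts the stack-trace elements of the current thread at two call depths. It assumes that each stack-trace element stands for one frame, which is how mainstream JVMs report them; the class and method names (FrameDepthDemo, depth, inner, outer) are made up for this example.

```java
final class FrameDepthDemo {
    // Number of stack-trace elements for the calling thread; each element
    // stands for one method invocation currently on this thread's stack.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Calling depth() through inner() adds one more invocation below it.
    static int inner() {
        return depth();
    }

    // Calling it through outer() -> inner() adds two.
    static int outer() {
        return inner();
    }

    public static void main(String[] args) {
        int direct = depth();   // main -> depth
        int nested = outer();   // main -> outer -> inner -> depth
        System.out.println("extra frames while nested: " + (nested - direct));
    }
}
```

When outer() returns, the frames of inner() and depth() are already gone, so a later call to depth() reports the smaller number again.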

Local Variables

Each frame contains an array of variables known as its local variables. The length of the local variable array of a frame is determined at compile time and supplied in the binary representation of a class or interface along with the code for the method associated with the frame.

A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. A pair of local variables can hold a value of type long or double.

Local variables are addressed by indexing. The index of the first local variable is zero. An integer is considered to be an index into the local variable array if and only if that integer is between zero and one less than the size of the local variable array.

A value of type long or type double occupies two consecutive local variables. Such a value may only be addressed using the lesser index. For example, a value of type double stored in the local variable array at index n actually occupies the local variables with indices n and n+1; however, the local variable at index n+1 cannot be loaded from. It can be stored into. However, doing so invalidates the contents of local variable n.

The Java virtual machine does not require n to be even. In intuitive terms, values of types double and long need not be 64-bit aligned in the local variables array. Implementors are free to decide the appropriate way to represent such values using the two local variables reserved for the value.

The Java virtual machine uses local variables to pass parameters on method invocation. On class method invocation any parameters are passed in consecutive local variables starting from local variable 0. On instance method invocation, local variable 0 is always used to pass a reference to the object on which the instance method is being invoked (this in the Java programming language). Any parameters are subsequently passed in consecutive local variables starting from local variable 1.
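The slot numbering described above can be sketched in a few lines. This is only an illustration of the rule (slot 0 is "this" for instance methods, long and double take two slots); the helper name parameterSlots and the use of plain type names as strings are assumptions made for the example.

```java
import java.util.Arrays;

final class LocalSlots {
    // Returns the local variable index at which each parameter starts.
    // paramTypes holds Java type names such as "int", "long", "double".
    static int[] parameterSlots(boolean isStatic, String... paramTypes) {
        int[] slots = new int[paramTypes.length];
        int next = isStatic ? 0 : 1;          // slot 0 holds "this" for instance methods
        for (int i = 0; i < paramTypes.length; i++) {
            slots[i] = next;
            boolean wide = paramTypes[i].equals("long") || paramTypes[i].equals("double");
            next += wide ? 2 : 1;             // long and double occupy two slots
        }
        return slots;
    }

    public static void main(String[] args) {
        // static void m(int a, long b, double c, int d)
        System.out.println(Arrays.toString(parameterSlots(true, "int", "long", "double", "int")));
        // the same signature as an instance method
        System.out.println(Arrays.toString(parameterSlots(false, "int", "long", "double", "int")));
    }
}
```

For the static method the parameters start at slots 0, 1, 3 and 5; for the instance method everything shifts by one because slot 0 is taken by the object reference.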

Operand Stacks

Each frame contains a last-in-first-out (LIFO) stack known as its operand stack. The maximum depth of the operand stack of a frame is determined at compile time and is supplied along with the code for the method associated with the frame.

Where it is clear by context, we will sometimes refer to the operand stack of the current frame as simply the operand stack.

The operand stack is empty when the frame that contains it is created. The Java virtual machine supplies instructions to load constants or values from local variables or fields onto the operand stack. Other Java virtual machine instructions take operands from the operand stack, operate on them, and push the result back onto the operand stack. The operand stack is also used to prepare parameters to be passed to methods and to receive method results.

For example, the iadd instruction adds two int values together. It requires that the int values to be added be the top two values of the operand stack, pushed there by previous instructions. Both of the int values are popped from the operand stack. They are added, and their sum is pushed back onto the operand stack. Subcomputations may be nested on the operand stack, resulting in values that can be used by the encompassing computation.
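To make the iadd example concrete, here is a toy interpreter for a handful of int instructions. It is a sketch of the stack discipline only, not of a real JVM: the class name OperandStackDemo and the idea of passing mnemonics as strings are assumptions for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackDemo {
    // Executes a tiny straight-line program and returns the value left on top.
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            switch (insn) {
                case "iconst_1": stack.push(1); break;
                case "iconst_2": stack.push(2); break;
                case "iconst_3": stack.push(3); break;
                case "iadd": {
                    int value2 = stack.pop();      // top of stack
                    int value1 = stack.pop();      // value just beneath it
                    stack.push(value1 + value2);   // sum replaces both operands
                    break;
                }
                case "imul": {
                    int value2 = stack.pop();
                    int value1 = stack.pop();
                    stack.push(value1 * value2);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (1 + 2) * 3: the iadd result stays on the stack and feeds imul.
        System.out.println(run("iconst_1", "iconst_2", "iadd", "iconst_3", "imul"));
    }
}
```

The nested subcomputation (1 + 2) never touches a local variable: its result simply stays on the stack until imul consumes it.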

Each entry on the operand stack can hold a value of any Java virtual machine type, including a value of type long or type double.

Values from the operand stack must be operated upon in ways appropriate to their types. It is not possible, for example, to push two int values and subsequently treat them as a long or to push two float values and subsequently add them with an iadd instruction. A small number of Java virtual machine instructions (the dup instructions and swap) operate on runtime data areas as raw values without regard to their specific types; these instructions are defined in such a way that they cannot be used to modify or break up individual values. These restrictions on operand stack manipulation are enforced through class file verification.

JVM instruction set

A Java virtual machine instruction consists of an opcode specifying the operation to be performed, followed by zero or more operands embodying values to be operated upon.
 
 
 
Format of Instruction Descriptions

mnemonic

Operation

Short description of the instruction

Format

mnemonic
operand1
operand2
...

Forms

mnemonic = opcode

Operand Stack

..., value1, value2 => ..., value3

Description

A longer description detailing constraints on operand stack contents or constant pool entries, the operation performed, the type of the results, etc.

Each cell in the instruction format diagram represents a single 8-bit byte. The instruction's mnemonic is its name. Its opcode is its numeric representation and is given in both decimal and hexadecimal forms. Only the numeric representation is actually present in the Java virtual machine code in a class file.

Keep in mind that there are "operands" generated at compile time and embedded within Java virtual machine instructions, as well as "operands" calculated at run time and supplied on the operand stack. Although they are supplied from several different areas, all these operands represent the same thing: values to be operated upon by the Java virtual machine instruction being executed. By implicitly taking many of its operands from its operand stack, rather than representing them explicitly in its compiled code as additional operand bytes, register numbers, etc., the Java virtual machine's code stays compact.

Some instructions are presented as members of a family of related instructions sharing a single description, format, and operand stack diagram. As such, a family of instructions includes several opcodes and opcode mnemonics; only the family mnemonic appears in the instruction format diagram, and a separate forms line lists all member mnemonics and opcodes. For example, the forms line for the lconst_<l> family of instructions, giving mnemonic and opcode information for the two instructions in that family (lconst_0 and lconst_1), is

Forms

                    lconst_0 = 9 (0x9)
                    lconst_1 = 10 (0xa)
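Since only the numeric opcode is stored in the class file, a patcher needs a mnemonic-to-number table. The sketch below lists just a handful of opcodes, with values taken from the JVM specification; the class name OpcodeNames and the "unknown" fallback are assumptions for this example, and a real tool would cover the whole instruction set.

```java
import java.util.HashMap;
import java.util.Map;

final class OpcodeNames {
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(0x00, "nop");
        NAMES.put(0x03, "iconst_0");
        NAMES.put(0x04, "iconst_1");
        NAMES.put(0x09, "lconst_0");
        NAMES.put(0x0a, "lconst_1");
        NAMES.put(0x57, "pop");
        NAMES.put(0x99, "ifeq");
        NAMES.put(0x9a, "ifne");
        NAMES.put(0xac, "ireturn");
        NAMES.put(0xb6, "invokevirtual");
        NAMES.put(0xb7, "invokespecial");
        NAMES.put(0xb8, "invokestatic");
    }

    // Maps an opcode (0..255) to its mnemonic, or "unknown" if not in the table.
    static String mnemonic(int opcode) {
        return NAMES.getOrDefault(opcode, "unknown");
    }

    public static void main(String[] args) {
        byte raw = (byte) 0x99;                     // bytes are signed in Java
        System.out.println(mnemonic(raw & 0xFF));   // mask before looking up
        System.out.println(mnemonic(9));
    }
}
```

Note the `& 0xFF` mask: Java bytes are signed, so an opcode such as 0x99 read from a byte array is negative until it is masked.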

In the description of the Java virtual machine instructions, the effect of an instruction's execution on the operand stack of the current frame is represented textually, with the stack growing from left to right and each value represented separately. Thus,

Operand Stack

                    ..., value1, value2 => ..., result

shows an operation that begins by having value2 on top of the operand stack with value1 just beneath it. As a result of the execution of the instruction, value1 and value2 are popped from the operand stack and replaced by result value, which has been calculated by the instruction. The remainder of the operand stack, represented by an ellipsis (...), is unaffected by the instruction's execution.

If the reader has trouble understanding how the instructions work, we will explain them during the reversing section.

Patching the program

 
Let's take a look at the application we want to patch. It simply consists of a dialog box.



The user must enter a valid username with the corresponding serial. If we explore the jar of the app we can see the following files:
Main.class, Main$1.class, Main$2.class, Main$3.class, Register.class
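The Main$1, Main$2, Main$3 names in this listing follow the way javac names anonymous inner classes: the enclosing class name, a $ sign and a counter. A minimal sketch, using a made-up class AnonNames instead of the target's Main, shows the effect; the numbering in source order is javac's usual behaviour rather than something the language guarantees.

```java
final class AnonNames {
    // Two anonymous classes declared inside AnonNames, in source order.
    static final Runnable FIRST = new Runnable() {
        @Override
        public void run() { }
    };

    static final Runnable SECOND = new Runnable() {
        @Override
        public void run() { }
    };

    public static void main(String[] args) {
        // With javac these are compiled to AnonNames$1.class and AnonNames$2.class.
        System.out.println(FIRST.getClass().getName());
        System.out.println(SECOND.getClass().getName());
    }
}
```

Compiling this file produces one .class file per anonymous class next to AnonNames.class, just like the files found in the jar.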
As we know, the Java compiler names the anonymous inner classes of a class (here the ActionListener objects of Main) with the $ symbol followed by a number. We now use the Class Construction Kit (aka cck) to open the Main.class file.
So we can see the fields and methods of the Main class:



Inside the red ellipse I marked the method that initializes the button used to register the program. As we can see, at offset 40 there is a call made through the invokevirtual instruction (a virtual method invocation) that adds an ActionListener. As we can see from the invokespecial highlighted in violet, it "links" the class named Main$3 as the ActionListener for the register button.
Then we open the Main$3 class with cck and look at the method called actionPerformed:



So we have found the right place to bypass the check: at offset 7 there is an invokestatic (a static method invocation) of the check method of the class Register:



The two arguments passed are the username and the password (the serial) read from the jTextField components. Then at offset 23 we have the instruction:
ifeq 41
Let's look up this instruction in the JVM opcode manual:

if<cond>

Operation

Branch if int comparison with zero succeeds

Format

if<cond>
branchbyte1
branchbyte2

Forms

                    ifeq = 153 (0x99)
                    ifne = 154 (0x9a)
                    iflt = 155 (0x9b)
                    ifge = 156 (0x9c)
                    ifgt = 157 (0x9d)
                    ifle = 158 (0x9e)

Operand Stack

..., value => ...

Description

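The value must be of type int. It is popped from the operand stack and compared against zero. All comparisons are signed. The results of the comparisons are as follows:

                    ifeq succeeds if and only if value = 0
                    ifne succeeds if and only if value != 0
                    iflt succeeds if and only if value < 0
                    ifle succeeds if and only if value <= 0
                    ifgt succeeds if and only if value > 0
                    ifge succeeds if and only if value >= 0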

If the comparison succeeds, the unsigned branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset, where the offset is calculated to be (branchbyte1 << 8) | branchbyte2. Execution then proceeds at that offset from the address of the opcode of this if<cond> instruction. The target address must be that of an opcode of an instruction within the method that contains this if<cond> instruction. Otherwise, execution proceeds at the address of the instruction following this if<cond> instruction.
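The offset arithmetic in this description can be checked with a few lines of Java. The sketch assumes the three bytes of our instruction are 0x99 0x00 0x12, which is what "ifeq 41" at offset 23 must encode, since 41 - 23 = 18 = 0x0012; the class name BranchTarget is made up for the example.

```java
final class BranchTarget {
    // Builds the signed 16-bit branch offset from the two operand bytes.
    static int offset(int branchbyte1, int branchbyte2) {
        return (short) (((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF));
    }

    // The target is relative to the address of the if<cond> opcode itself.
    static int target(int opcodeAddress, int branchbyte1, int branchbyte2) {
        return opcodeAddress + offset(branchbyte1, branchbyte2);
    }

    public static void main(String[] args) {
        // ifeq at offset 23 with operand bytes 0x00 0x12 branches to 23 + 18 = 41.
        System.out.println(target(23, 0x00, 0x12));
        // Operand bytes 0xFF 0xFE encode -2: a backward branch.
        System.out.println(target(10, 0xFF, 0xFE));
    }
}
```

The cast to short is what makes the offset signed, so the same code handles forward and backward jumps.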

So basically, after the call to Register.check we have on the stack an int value (a boolean, 1 or 0) returned by the method. If it is 0 the program jumps to offset 41 and the serial is rejected; otherwise, if it is 1, execution continues with the next instruction and the program is registered.
So to patch it we can, for example, do this:



Now save Main$3.class and launch the program: we can register with any serial!
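For readers who want to see such a patch at the byte level, here is one possible way to neutralize a conditional branch like the ifeq above; it is not necessarily the edit made in cck. The idea is to overwrite the three bytes of ifeq with pop, nop, nop: the return value of check is still removed from the stack, but execution always falls through to the "registered" path. The code array below is synthetic (only offsets 23 to 25 are meaningful) and the names IfeqPatch, neutralize and sample are assumptions for this sketch.

```java
import java.util.Arrays;

final class IfeqPatch {
    static final int IFEQ = 0x99, POP = 0x57, NOP = 0x00;

    // Returns a copy of code in which the ifeq at the given offset is
    // replaced by pop, nop, nop (same length, same stack effect, no jump).
    static byte[] neutralize(byte[] code, int offset) {
        if (offset < 0 || offset + 2 >= code.length || (code[offset] & 0xFF) != IFEQ) {
            throw new IllegalArgumentException("no ifeq at offset " + offset);
        }
        byte[] patched = Arrays.copyOf(code, code.length);
        patched[offset] = (byte) POP;       // discard the int that ifeq would test
        patched[offset + 1] = (byte) NOP;   // former branchbyte1
        patched[offset + 2] = (byte) NOP;   // former branchbyte2
        return patched;
    }

    // Synthetic method body: all nop except "ifeq 41" placed at offset 23.
    static byte[] sample() {
        byte[] code = new byte[45];
        code[23] = (byte) IFEQ;
        code[24] = 0x00;
        code[25] = 0x12;                    // 23 + 0x12 = 41
        return code;
    }

    public static void main(String[] args) {
        byte[] patched = neutralize(sample(), 23);
        System.out.printf("%02x %02x %02x%n", patched[23] & 0xFF, patched[24] & 0xFF, patched[25] & 0xFF);
    }
}
```

Keeping the patch the same length as the original instruction matters: every other offset in the method, including other branch targets, stays valid.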
Another possible approach is to patch the check method in Register.class, so let's open it:




We have to act at offset 68, where the instruction iconst_0 pushes the constant 0 onto the operand stack, and at offset 24, where the program checks whether the username has the same length as the serial. So we can patch both iconst_0 instructions to iconst_1:



This way, too, the program is registered for any serial!
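At the byte level, turning iconst_0 into iconst_1 means changing a single byte from 0x03 to 0x04. The sketch below applies that change to a synthetic two-instruction method body (iconst_0, ireturn), standing in for a "return false" path; the real offsets in Register.class are the ones read in cck, and the names ConstPatch and flipToTrue are assumptions for this example.

```java
final class ConstPatch {
    static final int ICONST_0 = 0x03, ICONST_1 = 0x04, IRETURN = 0xAC;

    // Rewrites iconst_0 to iconst_1 in place at each of the given offsets.
    static void flipToTrue(byte[] code, int... offsets) {
        for (int off : offsets) {
            if ((code[off] & 0xFF) != ICONST_0) {
                throw new IllegalArgumentException("no iconst_0 at offset " + off);
            }
            code[off] = (byte) ICONST_1;
        }
    }

    public static void main(String[] args) {
        // Synthetic body equivalent to "return false": iconst_0, ireturn.
        byte[] code = { (byte) ICONST_0, (byte) IRETURN };
        flipToTrue(code, 0);
        // Now equivalent to "return true": iconst_1, ireturn.
        System.out.printf("%02x %02x%n", code[0] & 0xFF, code[1] & 0xFF);
    }
}
```

The check before writing is a cheap safety net: if the offset is wrong, the patcher refuses instead of silently corrupting another instruction.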



Good job reverser!!!
The application called AppObfusc is the same application obfuscated with RetroGuard, so now we briefly examine what the obfuscator does.
The obfuscated classes have these properties:

  1. random method names taken from the set of Java keywords, types and single characters: do, int, byte, long, g, f, void
    in our case getButtonRegister has become the method a
  2. line numbers deleted: no debug info
  3. no field modification: the names are the same
  4. code optimization when possible: for example in a.class (that is Register.class), in the method a (that is the check method), the obfuscator deleted the variables called usernameL, serialL, total_sum, i

We can patch the obfuscated class in the same way.

P.S.
These are the patched versions of Main$3.class and Register.class for the unobfuscated application; the reader can try to patch the obfuscated version for fun.

                                                                                                                 ..::EPOKH::..

Final notes

Thanks to all the girls who don't consider me, so that I have more time for hacking and reversing (thanks gals for considering me instead of Epokh, so he has more time for doing stuff and I spend less time breaking my back bent over a PC keyboard ;p Que's note).

Disclaimer

I would like to remind you that software must be bought, not stolen: you have to register your product after the evaluation period. I am not responsible for any damage to your computer caused by improper use of this tutorial. This document was written to encourage consumers to register their programs legally, rather than to use the many crack files available on the net; indeed, it helps to understand the effort every developer has had to put in to give their customers the best possible products.

We reverse for informational purposes only and to improve our knowledge of the Assembly language.