Tuesday 2 August 2016

TAOSSA Chapter 5-6

Ch 5 - Memory corruption

All memory corruption vulnerabilities should be treated as exploitable until proved otherwise

Buffer overflows

Process memory layout
Stack overflows
The runtime stack, activation records (function frames). Stack usually grows downward (Full Descending?). ESP / EBP. Calling conventions
Exploiting stack overflows
SEH attacks. Convenient method for exploiting stack overflows on a Windows system b/c the exception handler registration structures are located on the stack. Stack overflow followed by any exception
Off-by-one errors. Often arise with C strings when the terminating NUL byte is not accounted for. Easy to exploit on x86 because the extra byte overwrites the least significant byte of the saved EBP (a consequence of little-endianness combined with an FD stack)
Heap overflows. Heap management. malloc(). Exploited by overwriting the next block's header so that it is marked as free; the allocator's unlink then writes a single controlled fixed-size value to a controlled location
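A minimal sketch of the off-by-one case noted above. The function names and the 8-byte buffer in the usage note are hypothetical, not from the book. The buggy check accepts a string whose length equals the buffer size, so the NUL would land one byte past the end; the correct check reserves a byte for it.

```c
#include <stddef.h>
#include <string.h>

/* Buggy capacity check: a string of exactly cap characters passes,
 * although its terminating NUL would not fit. */
int fits_buggy(const char *src, size_t cap)
{
    return strlen(src) <= cap;
}

/* Correct check: one byte is reserved for the terminating NUL. */
int fits_correct(const char *src, size_t cap)
{
    return strlen(src) < cap;
}

/* Copies only when the string and its NUL both fit; returns 0 on success. */
int copy_checked(char *dst, size_t cap, const char *src)
{
    if (!fits_correct(src, cap))
        return -1;
    memcpy(dst, src, strlen(src) + 1);
    return 0;
}
```

With char buf[8], the 8-character string "12345678" passes fits_buggy() but is rejected by copy_checked().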

Popular targets of heap overwrites

  • Global Offset Table (GOT), procedure linkage table (PLT)
  • Exit handlers (Unix)
  • Lock pointers (Win) in process environment block PEB
  • Exception handling routines in PEB (Win)
  • Function pointers
  • Global and static data overflows. Usually result in application-specific attacks. No runtime structures to control


Writing the code
Finding your code in memory. Shellcode must be position-independent

Protection mechanisms

Stack cookies (canary values). Do not protect against overwriting adjacent local vars, only the saved frame pointer and return address; nor against SEH overwrite (so SEH-based exploitation was developed because of stack cookies?)
Heap implementation hardening. Header cookie related to a global cookie and chunk’s address. Or additional checks on unlink operation. Similar deficiencies
Non-executable stack and heap. ROP bypasses this
Address space layout randomisation. Limitations: find something in memory that’s in a static location; or bruteforce where possible
SafeSEH. When an exception is triggered, EH target addresses are checked. Bypasses depend on the specific implementation
Function pointer obfuscation. Obfuscate any sensitive pointers stored in globally visible data structures. Limitations - can still overwrite app-specific pointers

Assessing memory corruption impact

Where is the buffer located in memory?
What other data is overwritten? Pay special attention to any variables in the overflow path that mitigate exploit attempts (e.g. pointers that are freed before return)
How many bytes can be overwritten?
What data can be used to corrupt memory? Sometimes attacker does not control what data is used to overwrite memory. Often happens with off-by-ones
Are memory blocks shared? Determining whether memory-block-sharing vulnerabilities are exploitable is usually complicated and application specific
What protections are in place?

Ch 6 - C language issues


Data storage overview

Chars are usually signed in implementations
Integers, floats, bit fields
Integers have precision and width, for signed ints width = precision + 1
Typedef as aliasing
Signed representations: sign+magnitude; one’s complement; two’s complement (the most common)
Byte order - big endian (various RISC), little endian (x86)
Common lengths: char is signed by default and takes 1 byte, short takes 2 bytes, int takes 4 bytes, long is also 4 bytes, and long long is 8 bytes. Also known as ILP32LL
64 bit architectures tend to be LP64 (long, long long, and pointer are 64 bit)
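A small sketch for checking which data model a given build uses. The function name is hypothetical, and the LLP64 branch (the model used by 64-bit Windows) is not in the notes above; it is included only for completeness.

```c
/* Names the data model of the current build from the widths of int,
 * long and pointers. Returns "other" for anything not listed. */
const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 8)
        return "LLP64"; /* assumption: not covered in the notes */
    return "other";
}
```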

Arithmetic boundary conditions

Numeric overflow / underflow (wrapping). Used to manipulate length checks
NB: chars and shorts in arithmetic expressions are converted to ints first

Unsigned integer boundaries

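In a typical case memory is allocated after multiplying user-controlled values, then a for loop is used to fill the memory. If the multiplication wraps, the allocation is too small for the loop
x86 detects integer overflows (CF flag for unsigned arithmetic, OF flag for signed) but C cannot access the mechanism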
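A sketch of guarding the multiply-then-allocate pattern described above. checked_mul and alloc_array are hypothetical helper names. With a 32-bit size_t, 0x40000001 * 4 wraps to 4, so the check divides instead of multiplying.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 and stores count * size in *out, or returns 0 if the
 * product would wrap past SIZE_MAX. */
int checked_mul(size_t count, size_t size, size_t *out)
{
    if (size != 0 && count > SIZE_MAX / size)
        return 0;
    *out = count * size;
    return 1;
}

/* Allocation wrapper that refuses requests whose byte count wraps. */
void *alloc_array(size_t count, size_t size)
{
    size_t bytes;

    if (!checked_mul(count, size, &bytes))
        return NULL;
    return malloc(bytes);
}
```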

Signed integer boundaries

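In the C spec, the result of under/overflow with signed integers is undefined behaviour (the book describes it as implementation defined) and could include a machine trap. On common two’s complement platforms it usually wraps predictably and does not lead to exceptions, but optimising compilers may assume it never happens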
2’s complement used very commonly
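Because the overflow itself cannot be relied on, a signed addition is usually checked before it is performed. A minimal sketch; safe_add is a hypothetical name.

```c
#include <limits.h>

/* Returns 1 and stores a + b in *out, or returns 0 if the sum would
 * fall outside the range of int. The test never overflows itself. */
int safe_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```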

Type conversions

Explicit vs. implicit type conversions. Conversions of integers are most interesting

Conversion rules

  • These rules are two’s complement specific
  • Value-preserving vs value-changing conversions
  • Widening - zero extension (unsigned source) and sign extension (signed source). If a narrow signed type is converted to a wider unsigned type, sign extension occurs(!).
  • Narrowing - only truncation.
  • Converting signed to unsigned of the same width does not modify bit pattern
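The conversion rules above, sketched with fixed-width types so the values are predictable. The function names are hypothetical.

```c
#include <stdint.h>

/* Widening from a signed source: sign extension, -1 becomes 0xFFFFFFFF. */
uint32_t widen_from_signed(int8_t v)
{
    return (uint32_t)v;
}

/* Widening from an unsigned source: zero extension, 0xFF stays 255. */
uint32_t widen_from_unsigned(uint8_t v)
{
    return (uint32_t)v;
}

/* Narrowing: the high-order bytes are dropped. */
uint8_t narrow(uint32_t v)
{
    return (uint8_t)v;
}

/* Same width, signed to unsigned: bit pattern kept, value reinterpreted. */
uint32_t same_width(int32_t v)
{
    return (uint32_t)v;
}
```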

Simple conversions

  • Casts (to the specified type)
  • Assignments (to the type of the left operand)
  • Function prototypes (to the types in the prototype; if there is no prototype, default argument promotions apply)
  • Return statement (to the type in the function definition)

Integer promotions

Each integer data type is assigned an integer conversion rank (top to btm):
  1. long long int, unsigned long long int
  2. long int, unsigned long int
  3. unsigned int, int
  4. unsigned short, short
  5. char, unsigned char, signed char
  6. _Bool
Wherever an int or unsigned int can be used, any integer type of lower rank can be used instead. Bit fields are narrower than their base type
Variables of integer types ranked at or above int do not get promoted. Lower-rank (narrower) ones are converted to int if that conversion is value-preserving; otherwise they are converted to unsigned int
Almost everything is converted to an int. To unsigned int - only unsigned int bit fields with 32 bits or some implementation-specific integer types
In K&R days promotions were unsigned-preserving (e.g. unsigned char -> unsigned int)

Integer promotion applications

  • Unary + operator promotes
  • Unary - operator promotes first then does negation by 2’s complement, regardless whether the promoted operand is signed
  • Unary ~ operator - 1’s complement after promotion
  • Bitwise shifts promote both operands; result is the same type as promoted left operand
  • Switch statements: controlling expression is promoted, then all switch constants converted to the resulting promoted type
  • Function invocations: for functions w/o prototypes - when called, default argument promotions apply. Each argument is promoted, and any floats converted to doubles

Usual arithmetic conversions

Transforming 2 operands in an expression into a common real type
  1. Floating point takes precedence (int -> float, less precise -> more precise)
  2. Apply integer promotions (when neither is float). => Types narrower than int are promoted to an int
  3. If both operands are the same type after int promotions, done
  4. If operands are same signedness, different type, converted to the wider type (these will be always wider than int)
  5. If the unsigned type is wider than or same width as a signed type - signed is converted to the type of the unsigned operand
  6. If the signed type is wider than the unsigned type & value-preserving conversion is possible - transform everything into the signed integer type
  7. If the signed type is wider than the unsigned type & value-preserving conversion is impossible - take the signed integer type, find the corresponding unsigned integer type, then convert both operands to that type. E.g. unsigned int and long int (if long int width is the same as int) converts to unsigned long int
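A sketch of rules 5 and 6 from the list above. The function names are hypothetical, and the second function assumes long long is wider than unsigned int, which holds on common platforms but is not guaranteed.

```c
#include <limits.h>

/* Rule 5: unsigned int is as wide as int, so a is converted to
 * unsigned int. -1 becomes UINT_MAX and the comparison is false. */
int less_int_vs_uint(int a, unsigned int b)
{
    return a < b;
}

/* Rule 6 (assuming long long is wider than unsigned int): b is
 * converted to long long, so -1 < 1 holds as expected. */
int less_llong_vs_uint(long long a, unsigned int b)
{
    return a < b;
}

/* The same conversion applies to arithmetic: -2 + 1u is computed as
 * unsigned int and yields UINT_MAX rather than -1. */
unsigned int mixed_sum(int a, unsigned int b)
{
    return a + b;
}
```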

Usual arithmetic conversion applications

  • Addition (when both arguments are of arithmetic type)
  • Subtraction between 2 arithmetic types
  • Multiplicative - always arithmetic types, always converted
  • Comparison - always converted, result is an int = 0 or 1
  • Binary bitwise operators require integer operands, conversion applies
  • Question mark (conditional) operator - the result type is decided from the types of the 2nd and 3rd operands; the first operand does not affect it

Type conversion summary

In addition to the above, the result of sizeof has type size_t, which is an unsigned integer type
Auditing tip: check assembly for conversions, beware of optimisations.

Type conversion vulnerabilities

Implicit type conversions are the source of vulnerabilities

Signed/Unsigned conversions

The most common case is simple conversions b/w signed and unsigned integers, esp. in assignments, function calls, or typecasts
Calling a function that expects an unsigned int with a negative parameter is a common case. Negative value gets interpreted as a huge positive int
Many libc routines take an argument of type size_t, an unsigned integer type that is usually the same width as a pointer
Important not to get a negative parameter into read(), recvfrom(), memcpy(), memset(), bcopy(), snprintf(), strncat(), strncpy(), and malloc()
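A sketch of how a negative length slips past a signed upper-bound check and then turns into a huge size_t. MAX_LEN and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LEN 1024 /* hypothetical upper bound */

/* Buggy: only the upper bound is tested, so -1 is accepted. */
int length_ok_buggy(int len)
{
    return len <= MAX_LEN;
}

/* Fixed: negative lengths are rejected before any conversion. */
int length_ok_fixed(int len)
{
    return len >= 0 && len <= MAX_LEN;
}

/* What memcpy() and friends see when handed the accepted value:
 * -1 converts to SIZE_MAX. */
size_t as_size_t(int len)
{
    return (size_t)len;
}
```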

Sign extension

Can occur:
  • because of typecast, assignment, or function call
  • if a signed type smaller than an integer is promoted via the integer promotions
  • as a result of the usual arithmetic conversions applied after integer promotions because a signed integer type could be promoted to a larger type, such as long long
In some cases sign extension is value changing (e.g. a negative value from char to unsigned int) and has an unexpected result
Programmers often forget that char and short types are signed, especially in network code that deals with signed integer lengths or code processing binary or text data one char at a time
Another place where programmers forget whether small types are signed occurs with use of the ctype libc functions
Programmers rarely intend for their smaller data types to be sign-extended when they are converted, and the presence of sign extension often indicates a bug
Sign extension is somewhat difficult to locate in C, but it shows up well in assembly code as the movsx eax, [XXX] instruction; zero extension is movzx eax, [XXX], or xor eax, eax / mov al, [XXX]
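For the ctype case mentioned above, the usual remedy is to go through unsigned char before the value is widened. A sketch with hypothetical wrapper names:

```c
#include <ctype.h>

/* A plain char may be negative; passing it straight to isalpha() is
 * out of range for the function. The cast keeps it in 0..UCHAR_MAX. */
int is_alpha_safe(char c)
{
    return isalpha((unsigned char)c) != 0;
}

/* Reading a byte as a length or index: convert via unsigned char so
 * 0xFF becomes 255 rather than being sign-extended. */
unsigned int byte_value(char c)
{
    return (unsigned char)c;
}
```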


Truncation

Occurs when a larger type is converted into a smaller type. This can only happen as a result of an assignment, a typecast, or a function call that has a prototype
A good place to look at - in structure definitions, especially in network-oriented code
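A sketch of truncation through a narrow structure member. struct packet_hdr and the function names are hypothetical. A length of 65541 stored in a 16-bit field silently becomes 5.

```c
#include <stddef.h>
#include <stdint.h>

struct packet_hdr {
    uint16_t length; /* hypothetical 16-bit length field */
};

/* Buggy: the assignment silently drops the high-order bits. */
uint16_t store_length(size_t n)
{
    struct packet_hdr h;

    h.length = (uint16_t)n;
    return h.length;
}

/* Fixed: values that do not fit in the field are rejected. */
int store_length_checked(size_t n, uint16_t *out)
{
    if (n > UINT16_MAX)
        return 0;
    *out = (uint16_t)n;
    return 1;
}
```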


Comparisons

In comparisons, the compiler first performs integer promotions on the operands and then follows the usual arithmetic conversions on the operands to get compatible types. These promotions and conversions might result in value changes (because of sign change), and the comparison then gives an unexpected result
gcc -Wall does not warn on impossible condition checks (e.g. unsigned < 0), but gcc -W (-Wextra in current gcc) does
Pay particular attention to comparisons that protect allocation, array indexing, and copy operations
Watch for unsigned integer values that cause their peer operands to be promoted to unsigned integers. sizeof and strlen() are classic examples
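A sketch of a length check broken by sizeof. HDR_LEN and the function names are hypothetical. Subtracting a size_t makes the whole expression unsigned, so a length of 1 produces a huge positive value instead of -1.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN sizeof(uint16_t) /* hypothetical 2-byte header */

/* Buggy: length is converted to size_t, so for length 1 the
 * subtraction wraps and the "payload present" test passes. */
int has_payload_buggy(int length)
{
    return length - HDR_LEN > 0;
}

/* Fixed: compare instead of subtracting, and reject negative
 * lengths before the conversion to size_t. */
int has_payload_fixed(int length)
{
    return length > 0 && (size_t)length > HDR_LEN;
}
```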


Operators

Each operator has associated type promotions that are performed implicitly on each of its operands, which can produce unexpected results

Sizeof operator

One of the most common mistakes with sizeof is accidentally using it on a pointer instead of its target (ok to use sizeof(array) but not sizeof(pointer))
Often shows up as a result of editing when a buffer is moved from being within a function to being passed into a function
Look for it in expressions that cause operands to be converted to unsigned values
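A sketch of the sizeof-on-pointer mistake after a buffer is moved out of a function. The function names are hypothetical. Inside the callee only the pointer size is visible, so the length has to be passed separately.

```c
#include <stddef.h>
#include <string.h>

/* The array is in scope, so sizeof gives the full 64 bytes. */
size_t size_in_scope(void)
{
    char buf[64];

    return sizeof(buf);
}

/* After refactoring buf is a pointer: sizeof gives the pointer size
 * (typically 4 or 8), not the size of the caller's buffer. */
size_t size_in_callee(char *buf)
{
    return sizeof(buf);
}

/* Fixed pattern: the caller supplies the real size. */
size_t clear_buffer(char *buf, size_t len)
{
    memset(buf, 0, len);
    return len;
}
```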

Unexpected results

2 primary issues with arithmetic operators: boundary conditions related to storage of integer types and issues with conversions that occur in expressions
On 2’s complement machines there are only a few C operators where signedness of operands can affect the result of operations. Their underlying implementation is sign-aware
Comparisons plus right shift >>, division /, modulus %
Right shift - problems when the left operand is signed (and negative). Easy to locate in assembly, look for the sar mnemonic
Division - when exactly one operand is negative, the result is also negative. Apps often do not account for this possibility
Modulus - same: the result is negative when the dividend is negative. Often used with fixed-size arrays (hash tables)
Look for div (unsigned) vs idiv (signed) mnemonic in the x86 assembly
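A sketch of the modulus problem with a hash-table index. The function names are hypothetical. C99 and later truncate towards zero, so a negative key yields a negative bucket.

```c
/* Buggy: -7 % 4 is -3, which indexes before the start of the table. */
int bucket_buggy(int key, int nbuckets)
{
    return key % nbuckets;
}

/* Fixed: do the arithmetic on unsigned values so the result is
 * always in the range 0 .. nbuckets - 1. */
unsigned int bucket_fixed(int key, unsigned int nbuckets)
{
    return (unsigned int)key % nbuckets;
}

/* Signed division also truncates towards zero: -7 / 2 is -3. */
int half(int v)
{
    return v / 2;
}
```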

Pointer arithmetic

When pointers are subtracted, the result is a signed integer type ptrdiff_t
When C does arithmetic involving a pointer, it does the operation relative to the size of the pointer’s target.
Pointer to an object is treated as an array composed of one element of that object
You can add an integer type to a pointer type or a pointer type to an integer type, but you can’t add a pointer type to a pointer type
E1[E2] is equivalent to (*((E1)+(E2)))
Resulting type of the addition between an integer and a pointer is the type of the pointer


Plenty of vulnerabilities that involve manipulation of character pointers essentially boil down to miscounting buffer sizes
Also, developers mistakenly perform arithmetic on pointers without realising that their integer operands are being scaled by the size of the pointer’s target
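A sketch of the scaling described above. The function names are hypothetical. Pointer differences and offsets are counted in elements, so multiplying by sizeof again overshoots.

```c
#include <stddef.h>

/* Difference in elements: scaled by sizeof(int) automatically. */
ptrdiff_t elems_between(const int *a, const int *b)
{
    return b - a;
}

/* Difference in bytes: only visible through a char pointer. */
ptrdiff_t bytes_between(const int *a, const int *b)
{
    return (const char *)b - (const char *)a;
}

/* One past the end of an array of n ints is base + n. Writing
 * base + n * sizeof(int) would scale twice and point far beyond it. */
const int *array_end(const int *base, size_t n)
{
    return base + n;
}
```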

Other C nuances

Order of evaluation

C doesn’t guarantee the order of evaluation of operands or the order of assignments from expression “side effects”.
Example is macros or functions with side effects used several times in an expression

Structure padding

Structure members don’t have to be laid out contiguously in memory.
The order of members is guaranteed to follow the order programmers specify, but structure padding can be used between members to facilitate alignment and performance needs.
More visible on 64-bit platforms.
Comparing the raw memory of 2 “identical” structs whose padding bytes differ gives incorrect results, which can lead to errors such as double-frees
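A sketch of working around padding. struct record and the function names are hypothetical. Comparison is done member by member instead of with memcmp(), and initialisation zeroes the whole object first so the padding does not carry stale data.

```c
#include <stddef.h>
#include <string.h>

struct record {
    char tag;   /* usually followed by padding so value is aligned */
    int  value;
};

/* Bytes in the struct that belong to no member. */
size_t padding_bytes(void)
{
    return sizeof(struct record) - (sizeof(char) + sizeof(int));
}

/* Zero everything, padding included, before filling the members. */
void record_init(struct record *r, char tag, int value)
{
    memset(r, 0, sizeof *r);
    r->tag = tag;
    r->value = value;
}

/* Member-wise comparison: unaffected by padding contents. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag && a->value == b->value;
}
```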


Precedence

Sometimes a precedence mistake is made but occurs in such a way that it doesn’t totally disrupt the program
Precedence of the bitwise & and | operators, especially when you mix them with comparison and equality operators
Same with assignment, but these usually result in a compiler warning
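A sketch of the & versus == precedence trap. FLAG_ADMIN and the function names are hypothetical. Since == binds tighter than &, the buggy test compares the constant with 0 first and then masks with the result. Compilers typically warn about the buggy line.

```c
#define FLAG_ADMIN 0x4u /* hypothetical flag bit */

/* Buggy: parsed as flags & (FLAG_ADMIN == 0), i.e. flags & 0,
 * so it returns 0 for every input. */
int lacks_admin_buggy(unsigned int flags)
{
    return flags & FLAG_ADMIN == 0;
}

/* Fixed: the mask is applied before the comparison. */
int lacks_admin_fixed(unsigned int flags)
{
    return (flags & FLAG_ADMIN) == 0;
}
```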

Macros / Preprocessor

Parenthesising of params is important
Problems with order of evaluation and side effects exist
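A sketch of both macro problems. The macro and function names are hypothetical. Without parentheses the argument is spliced into the surrounding expression, and an argument with side effects is evaluated once per appearance in the macro body.

```c
#define SQUARE_BAD(x) x * x
#define SQUARE(x) ((x) * (x))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

static int calls;

/* Side effect: every call bumps the counter and returns it. */
int next_value(void)
{
    return ++calls;
}

/* Expands to 1 + 2 * 1 + 2, which is 5 rather than 9. */
int square_bad(void)
{
    return SQUARE_BAD(1 + 2);
}

/* Expands to ((1 + 2) * (1 + 2)), which is 9. */
int square_ok(void)
{
    return SQUARE(1 + 2);
}

/* next_value() runs twice: once in the test, once for the result,
 * so the macro yields 2 although the first call returned 1. */
int max_with_side_effect(void)
{
    calls = 0;
    return MAX(next_value(), 0);
}

int call_count(void)
{
    return calls;
}
```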


Typos

Programmers can make many simple typographic errors that might not affect program compilation or disrupt a program’s runtime processes, but these typos could lead to security-relevant problems
= in comparisons instead of ==
== instead of = in for loop initial assignment
& instead of && - even if the difference between bitwise and logical AND causes no issue in a given situation, there’s still the critical problem of losing short-circuit evaluation (which the logical && provides) and its guaranteed order of execution
Semicolons at the end of an if(); statement
Accidental octal conversion. E.g. 040 is 32 decimal
Missing comment sign */ at the end of the line will hide code until the end of the next comment
Missing {} in if statements
Commenting out an if’s “then” part together with its “;” will make the next statement the “then” part
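A sketch of three of the typos above that compile but change behaviour. The function names are hypothetical. Compilers typically warn about the buggy lines without rejecting them.

```c
/* A leading zero makes the constant octal: 040 is 32, not 40. */
int octal_value(void)
{
    return 040;
}

/* Buggy: = assigns 0 to uid, so the condition is always false. */
int is_root_buggy(int uid)
{
    if (uid = 0)
        return 1;
    return 0;
}

/* Fixed: == compares. */
int is_root_fixed(int uid)
{
    return uid == 0;
}

/* Buggy: the stray ; ends the if, so the assignment below it runs
 * unconditionally and access is always granted. */
int access_buggy(int authorised)
{
    int granted = 0;

    if (authorised);
        granted = 1;
    return granted;
}
```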