Ch 8 - Strings and metacharacters
Major areas of string handling:
- memory corruption due to string mishandling;
- vulnerabilities due to in-band control data in the form of metacharacters;
- vulnerabilities resulting from conversions between character encodings in different languages.
The second way is better, but it adds a lot of overhead and requires freeing memory correctly.
C++ has a safer string class, but the need to interface with C reintroduces the same issues.
scanf() family: used when reading in data from a file stream or string. Each data element specified in the format string is stored in a corresponding argument.
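A minimal sketch of bounded reading, assuming the function described above belongs to the scanf() family. The helper name parse_name and the 16-byte buffer size are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited token from `input`
 * into `out`, which must hold at least 16 bytes.
 * Returns 1 on success, 0 if no token was found. */
int parse_name(const char *input, char *out)
{
    /* "%15s" stores at most 15 characters plus the NUL terminator,
     * so a 16-byte buffer cannot be overrun. A bare "%s" has no limit. */
    return sscanf(input, "%15s", out) == 1;
}
```

The field width counts characters stored, not the buffer size, so it has to be one less than the size of the destination.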
sprintf() family: the destination buffer can be overflowed, usually by %s or % formats, occasionally by %d or %f. Format string vulnerabilities also arise when the user can control the format string.
_wsprintfA() and _wsprintfW() copy a maximum of 1024 characters.
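One way to avoid both problems noted above is to keep the format string constant and pass untrusted text only as an argument. The sketch below assumes a hypothetical helper named format_message; it is an illustration, not a recommended API.

```c
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into a fixed buffer.
 * Returns 0 if the whole string fit, -1 otherwise. */
int format_message(char *dst, size_t dst_size, const char *user_input)
{
    /* Unsafe alternatives: sprintf(dst, user_input) lets the caller's data
     * act as the format string, and sprintf(dst, "%s", user_input)
     * has no length limit. */
    int n = snprintf(dst, dst_size, "%s", user_input);

    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

With a constant format, input such as "%x%x%n" is copied literally instead of being interpreted.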
strcpy() family: the destination buffer can be overflowed. There are Windows variants.
strcat(): the destination buffer (dst) must be large enough to hold the string already there, the concatenated string (src), and the NUL terminator.
snprintf(): accepts a maximum number of bytes that can be written to the output buffer.
On Windows, if there’s not enough room to fit all the data into the resulting buffer, a value of -1 is returned and NUL termination is not guaranteed.
UNIX implementations guarantee NUL termination no matter what and return the number of characters that would have been written had there been enough room. That is, if the resulting buffer isn’t big enough to hold all the data, it’s NUL-terminated, and a positive integer is returned that’s larger than the supplied buffer size.
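Because the two return conventions differ, a caller that wants portable behavior has to handle both. The following is a sketch under that assumption; checked_snprintf is a hypothetical wrapper, not a standard function.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: formats into dst and reports truncation uniformly.
 * Returns 0 if the output fit, -1 if it was truncated or an error occurred.
 * dst is always NUL-terminated when dst_size > 0. */
int checked_snprintf(char *dst, size_t dst_size, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (dst_size == 0)
        return -1;

    va_start(ap, fmt);
    n = vsnprintf(dst, dst_size, fmt, ap);
    va_end(ap);

    /* Covers implementations that may leave the buffer unterminated. */
    dst[dst_size - 1] = '\0';

    /* n < 0: an error, or the -1 convention described above.
     * n >= dst_size: the other convention, meaning output was truncated. */
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```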
strncpy(): accepts a maximum number of bytes to be copied into the destination.
Does not guarantee NUL-termination of the destination string. If the source string is larger than the destination buffer, strncpy() copies as many bytes as indicated by the size parameter, and then ceases copying without NUL-terminating the buffer.
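The usual workaround is to terminate the buffer by hand after the copy. A minimal sketch, with copy_terminated as a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: bounded copy that always NUL-terminates.
 * Returns 0 if src fit completely, -1 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;

    /* Copy at most dst_size - 1 bytes so the last byte stays free. */
    strncpy(dst, src, dst_size - 1);
    /* strncpy() does not write a terminator when src is too long. */
    dst[dst_size - 1] = '\0';

    return strlen(src) < dst_size ? 0 : -1;
}
```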
The wcsncpy() function is the bounded alternative to wcscpy(). Wide characters confuse developers, who often supply the destination buffer’s size in bytes rather than in wide characters.
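The bytes-versus-characters confusion can be avoided by always passing an element count. The sketch below assumes the standard wcsncpy() function; ARRAY_COUNT and copy_wide are hypothetical names.

```c
#include <wchar.h>

/* Element count of a true array; using this on a pointer gives a wrong result. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: copies a wide string into a caller-supplied buffer
 * whose capacity is given in wide characters, not bytes. */
void copy_wide(wchar_t *dst, size_t dst_count, const wchar_t *src)
{
    if (dst_count == 0)
        return;
    wcsncpy(dst, src, dst_count - 1);
    dst[dst_count - 1] = L'\0';   /* wcsncpy() may leave dst unterminated */
}
```

For a buffer declared as wchar_t name[16], the call would be copy_wide(name, ARRAY_COUNT(name), input). Passing sizeof(name) instead would claim 32 or 64 elements, depending on whether wchar_t is 2 or 4 bytes wide.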
strncat(): copies at most n bytes, where n should be the space left in the buffer minus 1 for the NUL byte. This one byte is often miscalculated, resulting in an off-by-one error.
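The size calculation described above can be sketched as follows. append_bounded is a hypothetical helper, and it assumes dst already holds a NUL-terminated string.

```c
#include <string.h>

/* Hypothetical helper: appends src to the string already in dst.
 * dst_size is the total size of the dst buffer in bytes.
 * Returns 0 if all of src was appended, -1 if it was truncated. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);   /* assumes dst is already NUL-terminated */
    size_t room;

    if (used >= dst_size)
        return -1;               /* inconsistent arguments */

    /* Space left, minus one byte for the terminator strncat() always adds. */
    room = dst_size - used - 1;
    strncat(dst, src, room);

    return strlen(src) <= room ? 0 : -1;
}
```

Passing dst_size - used without the "- 1" is the off-by-one case: strncat() would then write its terminator one byte past the end of the buffer.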
strlcpy(): BSD alternative to strncpy(). Guarantees NUL termination of the destination buffer. The return value is the length of the source string, not including the NUL byte. It can be larger than the destination buffer size, so using it unchecked as a length, e.g. together with strncat(), can lead to an off-by-one error.
strlcat(): similar to strncat(), but the size parameter n is the total size of the destination buffer, not the remaining space. Guarantees NUL termination. Returns the total length of the string it tried to create. If the destination string is already longer than n, the buffer is left untouched and n plus the length of the source string is returned. One of the safest alternatives.
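Since the return value can exceed the buffer size, it should be compared against the size before being used as a length or index. The sketch below uses a stand-in named sketch_strlcpy, written here only because strlcpy() is not available in every C library; it is assumed to mirror the semantics described above. copy_checked is likewise hypothetical.

```c
#include <string.h>

/* Stand-in with the strlcpy() semantics described above (an assumption
 * made for portability of this example, not the BSD source code). */
size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;   /* may be >= dst_size, which signals truncation */
}

/* Hypothetical caller: checks the return value before trusting it. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t needed = sketch_strlcpy(dst, src, dst_size);

    if (needed >= dst_size)
        return -1;    /* truncated; dst[needed] would be out of bounds */
    return 0;
}
```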
- Unbounded copies. Not checking the bounds of destination buffers;
- Character expansion where software encodes special chars, resulting in longer string than the original. Common when processing metacharacters or formatting raw data for human readability;
- Incorrectly incrementing pointers. Pointers can be incremented outside the bounds of the string being operated on. Two main cases: when a string isn’t NUL-terminated correctly, or when a NUL terminator can be skipped because of a processing error;
- Typos. One occasional mistake is a simple pointer use error, which happens when a developer accidentally dereferences a pointer incorrectly or doesn’t dereference a pointer when necessary.
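The character expansion case from the list above can be sketched as an encoder that checks the expanded length rather than the input length. escape_angle is a hypothetical function, and the choice of HTML-style entities is only an illustration.

```c
#include <string.h>

/* Hypothetical encoder: writes src to dst, expanding '<' to "&lt;" and
 * '>' to "&gt;". Returns 0 on success, -1 if dst is too small. */
int escape_angle(char *dst, size_t dst_size, const char *src)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        char single[2] = { *src, '\0' };
        const char *rep;
        size_t rep_len;

        if (*src == '<')
            rep = "&lt;";
        else if (*src == '>')
            rep = "&gt;";
        else
            rep = single;
        rep_len = strlen(rep);

        /* Check the expanded length, not the input length, and
         * keep one byte for the terminator. */
        if (dst_size == 0 || rep_len > dst_size - 1 - out)
            return -1;
        memcpy(dst + out, rep, rep_len);
        out += rep_len;
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

A bounds check based on strlen(src) alone would pass for "<<" with a 5-byte buffer even though the encoded form needs 9 bytes.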
- Embedded delimiters. A pattern in which the application takes user input that isn’t filtered sufficiently and uses it as input to a function that interprets the formatted string. This interpretation might not happen immediately; the input might be written to a secondary storage facility and then interpreted later. An attack of this kind is sometimes referred to as a “second-order injection attack.”
- NUL character injection. Special case of embedded delimiters, important in scenarios where Web apps, Java, etc. pass strings to C-based APIs.
An example is fgets(), which stops reading when it runs out of space in the destination buffer or encounters \n or EOF. It does not stop at NUL bytes, so they have to be dealt with separately.
- Truncation. In statically sized buffers, input that exceeds the length of the buffer must be truncated to fit the buffer size and avoid buffer overflows. This avoids memory corruption, but could lead to interesting side effects from data loss in the shortened input string.
This can happen when using snprintf() instead of sprintf(). For functions in this family:
Consider how every function behaves when it receives data that isn’t going to fit in a destination buffer. Does it just overflow the destination buffer? If it truncates the data, does it correctly NUL-terminate the destination buffer? Does it have a way for the caller to know whether it truncated data? If so, does the caller check for this truncation?
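As an illustration of a truncation side effect, consider a path that gets a fixed extension appended. The sketch below assumes an snprintf() that returns the untruncated length; build_report_path and the ".txt" suffix are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: builds "<name>.txt" in dst.
 * Returns 0 on success, -1 if the result did not fit. */
int build_report_path(char *dst, size_t dst_size, const char *name)
{
    int n = snprintf(dst, dst_size, "%s.txt", name);

    /* Without this check an over-long name silently pushes ".txt" out of
     * the buffer, and the caller would go on to use a path with a
     * different extension than intended. */
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

With a 12-byte buffer, the name "aaaaaaaa.sh" leaves the buffer holding "aaaaaaaa.sh" with no ".txt" at all, so a caller that ignores the return value ends up with an attacker-chosen extension.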
- Path metacharacters
- File canonicalisation - especially directory traversal
- The Windows registry paths
- C format strings - printf(), err(), syslog() families of functions
- Shell metacharacters - e.g. using popen() or Perl open() call
- SQL queries
- Detect erroneous input and reject what appears to be an attack.
- Detect and strip dangerous characters.
- Insufficient filtering.
- Character stripping vulnerabilities - mistakes in sanitisation routines.
- Detect and encode dangerous characters with a metacharacter escape sequence.
- If the escape character is not treated carefully, it can be used to undermine the whole escaping routine.
- Vulnerabilities in decoding
- Homographic attacks
- Windows Unicode functions
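The character stripping pitfall listed above can be shown with a deliberately flawed routine, followed by a check that rejects the input instead of repairing it. Both function names are hypothetical, and the sketch assumes '/' is the only path separator and that the input has already been decoded.

```c
#include <string.h>

/* Flawed on purpose: removes each "../" in a single left-to-right pass. */
void strip_dotdot_once(char *path)
{
    char *p;

    while ((p = strstr(path, "../")) != NULL) {
        memmove(p, p + 3, strlen(p + 3) + 1);
        path = p;   /* resume after the removed text; never re-scan */
    }
}

/* Rejecting instead of repairing: returns 1 if any component is "..". */
int has_dotdot_component(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of this component */

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += len;
        if (*p == '/')
            p++;
    }
    return 0;
}
```

Given "....//etc/passwd", the stripping routine removes the inner "../" and leaves "../etc/passwd", which is exactly the sequence it was meant to eliminate. The rejecting check flags that result.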