Race condition vulnerability in syscall wrappers

Category: Security

12:15 PM, Thu, Aug 9 2007

Slashdot has a shocking headline, shouting that OpenBSD's systrace facility had been compromised. Of course they picked OpenBSD for shock-value, because OpenBSD's reputation and brand is based on security-first. Its reputation is well-earned, and as soon as OpenJDK becomes available for OpenBSD, we'll start trying it on servers. OpenBSD

The race conditions paper itself, by Robert N. M. Watson, is simple and insightful. The paper analyzes how system call wrappers work, and how a race condition creates a vulnerability in almost all system call wrappers today. This affects OpenBSD's systrace auditing feature. That has shock-value to those who understand the OpenBSD brand, but doesn't affect many end-users, because OpenBSD's market share is so small. The real punch of the paper is that anti-virus software for Microsoft Windows relies on the same mechanisms: a wrapper intercepts system (kernel) calls, and checks to see if the application is authorized. If the application is authorized, the arguments are passed on to the kernel. If the application is not authorized, the access is blocked. This is how an anti-virus program can prevent a program from opening a network connection, for example. If you download a program that says it's a desktop calculator or a screen saver, it's nice to be able to prevent that program from opening network connections or a network port. That's what anti-virus software does and it does it using system call wrappers, and Mr. Watson has shown a technique for bypassing all of them.

Mr. Watson's paper provides source code and details. Here is the quick summary: A process (application) attempts to make a system call. Let's say it tries a simple open() system call, to open a file. This is how the call works:

int open(const char *pathname, int flags)

Notice the const modifier for the *pathname. We'll come back to that in a moment.

A wrapper would provide its own implementation of this open call, and intercept the process' calls. The wrapper's code might look like this:

int open(const char *pathname, int flags) {
  if(allowedFile(pathname)) syscall_open(pathname, flags);
  // point A
  else accessDenied();
}

Nothing wrong with that, right? But there is something wrong, which is what Mr. Watson describes. The process could have more than one thread. What if, in the time between the allowedFile(pathname) and the real syscall, the contents of the memory region referenced by *pathname were to change? Mr. Watson demonstrates that this is not only possible, but easy to do, by carefully constructed code which either waits the right number of cycles (on a multi-core machine) or simply does something to cause the kernel to wait right after the wrapped call is made. See his paper for details. The process passes in a *pathname which points to a string in memory which is the process is authorized to access. The wrapper successfully completes its authorization check. Right at this time, the process (in another thread) modifies the contents of that string, just as the wrapper is about to pass the string on to the kernel for real. A safe access to write /tmp/tempdoc-5844 is approved by the wrapper, but is transformed into a write to /etc/shadow by the time the kernel sees it. Not good at all. The same method can also defeat tracing and auditing systems, also with harmful results.

Now, coming back to the const modifier in the declaration, wouldn't that prevent the characters from being modified, solving this whole problem? The answer is no. const is a message to the compiler and is not enforced in any way by anything. It doesn't help.

Now I will show you how this same class of attack is possible in Java, and why Java programmers rarely worry about it, and why java.lang.String is immutable.

Mr. Watson is discussing processes, which roughly correspond with what we call applications: a word processor, a calculator, a screen saver. At some much earlier point in time, applications could be trusted. You went to the computer store, bought your copy of Lotus 123, and installed it, and that was that. Now, there are literally millions of applications out there, they are often downloaded or emailed, and some of them are malicious. Some are so-called Trojan Horse applications. We use anti-virus software (or call tracing software and the like) to allow us to run applications which we don't fully trust. Mr. Watson shows that this fundamentally doesn't work.

Java was designed from the beginning to run non-trusted applications embedded in web pages. These applications are called Java Applets, or simply Applets. These applets needed some access to system resources to be able to draw on the screen, make network connections back to their server computer, and other limited access, but they must not have access to resources like the file system. Java was and is one of the few environments able to provide this security capability.

Let's set up a contrived example to show how Mr. Watson's vulnerability could exist in Java. This example could be an Applet implementation gone wrong. I'll put both the security check and the attack code in the same class. That's something that would never happen, but it shows how a thread can bypass a syscall wrapper (anti-virus) style security check in Java. First, here's the security check method. In this case, users can only open a file in /tmp with a certain file name:

public File secureOpen(StringBuilder fileName) 
  throws SecurityCheckException {
  StringBuilder authorizedName = "/tmp/userTempFile";
  if(fileName.equals(authorizedName)) return new File(fileName.toString());
  else throw new SecurityCheckException("Can't open: " + fileName);
}

That looks just like the anti-virus wrapper code Mr. Watson is describing. It also looks secure. But it isn't; it is Java code which suffers exactly the same vulnerability Mr. Watson describes. Here is an exploit test:

public final class Watson {

  public File secureOpen(StringBuilder fileName) throws Exception {

    StringBuilder authorizedName = new StringBuilder("/tmp/userTempFile");
    if(authorizedName.toString().equals(fileName.toString())) {
      return new File(fileName.toString());
    } else
      return null;
  }

  public void attack() throws Exception {
    final StringBuilder fileName = new StringBuilder("/tmp/userTempFile");
    final Thread attackThread = new Thread() {
        public void run() {
          fileName.delete(0, fileName.length());
          fileName.append("/etc/passwd");
        }
      };
    attackThread.start();
    File result = secureOpen(fileName);
    if(result == null)
      System.out.println("Attack FAILED");
    else {
      System.out.println("File opened: " + result);
      if(result.getName().equals("passwd"))
        System.out.println("ATTACK SUCCESSFUL!");
    }
  }

  public static void main(String[] args) throws Exception {
    for(int i = 0; i < 10; i++) {
      final Watson watson = new Watson();
      watson.attack();
      System.out.println();
      Thread.sleep(5);
    }
  }

}

I compiled and ran it with Java 6. The attack failed every time. Apparently the thread was able to modify the StringBuilder before the security check could happen, 10 out of 10 tries. This makes sense; the thread has started before the security check call can happen.

Time to rig the attack. I put a Thread.sleep(3); before the return new File. Now the attack works every time.

For comparison, I tried with and without the Thread.sleep(3) call in gcj-4.1:

gcj-4.1 --main=Watson  Watson.java
./a.out

The results were the same as with Sun's Java 6, not surprisingly, in the case of both the rigged and non-rigged versions. Binaries compiled with gcj and class files executed by Sun's JVM both end up using native threads. As a follow-up it might be interesting to explore how the timing is different between these two while trying to hit the exploit using a genetic algorithm.

So, the attack doesn't work unless there is a Thread.sleep() call in the security check. It is not realistic for there to be a Thread.sleep() call there, but in real code, the security check method would be doing something more complicated than a simple string comparison and file open. It might need to access a network resource or a file. It might also be possible for the attacking process to do something to slow down the security check at the right time. The validity of the security check is timing dependent, which means it is broken. We see that, even in secure Java, Mr. Watson's attack is possible.

This type of vulnerability is a race condition called a time-of-check-to-time-of-use vulnerability, abreviated TOCTTOU. In our example, the first test did not work, and I needed to rig the code with a Thread.sleep. In reality, there are ways for one thread to influence the timing of another thread. Mr. Watson's paper shows a very clever way to do that by triggering a page fault at just the right time. In a future blog entry I may explore ways of homing in on the right timing as a demonstration of a simple genetic algorithm, so this rigging won't be necessary. For example if a thread requires disk access or network access, it is certainly possible for an attacking process to take actions to slow down those resources, by issuing a careful flood of similar requests.

But Java is not plagued with such vulnerabilities. Far from it. Java has great security. This is because:

Beginning Java programmers are often frustrated and confused about why String can't be modified. This is why.

Looking back at the kernel open call:

int open(const char *pathname, int flags)

we see that, if the const modifier there could actually be enforced, like it is in Java with the immutable String class, things would be safe. At the end of his paper, Mr. Watson says, "As approach message passing, safety improves". In other words, a message passing architecture, where the message (the filename in our example) leaves the control of the calling process, is not vulnerable to this attack. The instant the calling process makes the call, the filename string is no longer under its control. That's good. Unfortunately this is a design change which cannot be retrofitted into existing operating systems. Also, messaging-based kernels, such as The GNU Hurd, have a reputation for being slow, and none are in mainstream use at this point. The causes of slowness are context switching (a context switch is needed every time a message is sent) and the need to copy all data in the message out of the process space. Some way for a process to write-protect a piece of memory, without being able to unlock it, would remove the need for copying or messaging.

The other alternative is to do what Java does, which is to have argument types which are immutable. Modern CPUs can write-protect pages. If there were a mechanism whereby the calling process could write protect the page, and could not undo the write protection until the call is complete, this would allow us to have secure calls like this. Until such a thing is possible, user-space wrappers, like anti-virus software, are out of luck.

Authorization wrappers like virus blockers could copy arguments before doing both the authorization check and the real system call. For calls which don't pass much data, like opening a file, where the only data are the filename and flags, copying the arguments won't cause any performance penalty. For operations such as writing a large file to disk, the performance penalty of copying all the arguments before writing might be substantial. One could argue that there isn't much security risk in a process being able to write blocks to a file it has already opened. Unfortunately, on further thought, that is not the case. For example, a virus blocker might want to inspect all file writes to make sure that a process is not trying to write a known virus. Using Mr. Watson's attack, the process could pretend to write something, get approved, and then write something else (a virus). Copy-and-inspect of all data written to disk or network would probably have a noticeable impact on performance.

Note that a virus wrapper written in Java would face the same problem when writing a file. Imagine a wrapper for a FileChannel which inspects the ByteBuffer for harmful bytes before writing. Unlike String, ByteBuffer is mutable, and the attacking thread could very well change the contents of the ByteBuffer after it has been authorized but before it has been written.

For Java programmers, we need to keep this in mind when doing authorization checks. If the arguments are all immutable, there is little to worry about. But if the argument is something like a Collection, which is mutable, it must be copied locally before either the authorization check or the access. It's best to keep things simple and use immutable arguments. Joshua Bloch argues for this approach in Effective Java Programming Language Guide, in his Item 13: "Favor immutability".

Looking at it from the wrapper perspective, the more wrappers start copying arguments and isolating code, the more the wrapper becomes like a virtual machine for executing executables. It might be better to use a virtual machine that was designed to be a virtual machine from the very beginning.

Download the source code for this article.