Fuzzing IOCTLs with angr
angr and I will join forces in a quest through anarchy, reversing compiled kernel modules to tear their beloved ioctls down…
This was the subject of my talk at LSEWeek 2016. The stream and the slides are available at the end for further information.
My first project at the LSE was what put me on angr. The idea was to help strace
developers (some of whom are at the lab) handle IOCTLs and determine which commands
could be sent to a given kernel driver. Right now, the whole system is sadly too
unstable to be reliable.
This article will briefly explain what was done, what didn’t work, and what I did to fix the multiple issues I encountered. I will skip the very first parts where I tried to use metasm and miasm, because nothing actually worked, and I will directly jump to the part where angr comes in.
The project
The idea was to take an IOCTL, launch a symbolic/concolic execution
engine on it, and determine which constraints were applied on the cmd
argument.
As a bonus, finding the type of the arg argument would be nice.
The overall ‘algorithm’ looks like this:
To begin, I had to tell the tool explicitly which function was interesting. I
would then get its endpoints (the basic blocks where the function returns), tell angr
to go from the entry point to each of them, and print the paths where %eax
would be positive when reaching the end. I would then analyze the constraints
and get the possible return values. On a test module, which was neither too easy nor
too complicated, it would work fine:
would give:
The result was interesting, but it lacked precision and would quickly explode when the function was big. That was one of the first problems I encountered when using angr: I was not in a CTF, where only one path interested me and where I could help angr by patching some stuff. Here, it had to be fully automatic. I thus tried to provide some generic handling for recurring errors.
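The return-value filter mentioned above can be sketched without angr. This is only an illustration: the function names and the sample register values are hypothetical, and treating zero as a success alongside positive values is my assumption (the kernel convention being that errors come back as negative errno values).

```python
def to_signed32(value: int) -> int:
    """Interpret a raw 32-bit register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def successful_returns(raw_values):
    """Keep only the return values that are not negative error codes."""
    signed = (to_signed32(v) for v in raw_values)
    return [v for v in signed if v >= 0]


# Hypothetical %eax values collected at the endpoints of a function.
candidates = [0x0, 0x2A, 0xFFFFFFEA, 0xFFFFFFFE]
print(successful_returns(candidates))
```

The last two sample values are -22 and -2 once read as signed 32-bit integers, so they are dropped as error paths.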
Finding IOCTLs and duct taping
One thing I eventually figured out was that when you ask angr to find some address,
this address should be that of the first instruction of a basic block, or the
exploration will not give you the expected result. This means that, if your address
was inside a block but not at its first instruction, angr would still stop at this
first instruction (it found the block, so why would it go any further?). I didn’t
think this was a problem until I found that some instructions ‘vital’ to
the project might be at the end of a block, like mov eax, 0xfffffffe, which
would set the return value of the function before returning. Fixing this was my
first angr pull request.
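To make the pitfall concrete, here is a small stand-alone sketch of how an address in the middle of a block maps back to the first instruction of that block. It does not use angr and is not the fix from the pull request; the block layout and the helper name are made up for the example.

```python
from bisect import bisect_right


def block_start_for(blocks, addr):
    """Return the start address of the basic block containing addr, or None.

    blocks is a list of (start, size) tuples sorted by start address.
    """
    starts = [start for start, _ in blocks]
    index = bisect_right(starts, addr) - 1
    if index < 0:
        return None  # addr lies before the first block
    start, size = blocks[index]
    return start if addr < start + size else None


# Hypothetical layout: three consecutive basic blocks of a small function.
blocks = [(0x1000, 0x10), (0x1010, 0x08), (0x1018, 0x0C)]

# An instruction in the middle of the second block resolves to its start.
print(hex(block_start_for(blocks, 0x1014)))
```

Stopping at the returned address means every instruction between it and the requested one has not been executed yet, which is exactly what hides a late mov eax, 0xfffffffe.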
After polishing the exploration part and securing it as much as possible (merging paths to save memory, handling exceptions when it’s ‘not too dangerous’…), another problem was to determine which IOCTL was interesting and where to find it. I first went for a dumb method that would serve as a fallback if the clever one failed: looking for every symbol with ‘ioctl’ in its name. This worked pretty well, but it obviously missed the handlers that were not named like that, and it added work because not every function with ‘ioctl’ in its name is an actual handler, as some are just helpers.
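The name-based fallback boils down to a substring filter over the symbol table. A minimal sketch, assuming the symbols have already been extracted as (name, address) pairs; the driver names below are invented, and matching case-insensitively is my choice, not something the original tool is known to do.

```python
def ioctl_candidates(symbols):
    """Return the (name, address) pairs whose name contains 'ioctl'."""
    return [(name, addr) for name, addr in symbols if "ioctl" in name.lower()]


# Hypothetical symbol table of a small driver.
symbols = [
    ("mydrv_open", 0x1000),
    ("mydrv_unlocked_ioctl", 0x1040),
    ("mydrv_ioctl_helper", 0x10C0),
    ("mydrv_release", 0x1100),
]

for name, addr in ioctl_candidates(symbols):
    print(f"{name} @ {addr:#x}")
```

Note that the helper is reported too, which is the extra work described above.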
The second way is the cleverest one in my opinion. When an IOCTL is registered
in a driver, it is first stored in a file_operations structure. This structure
holds multiple function pointers used by the module for different operations, and
among them are compat_ioctl and unlocked_ioctl. Sometimes, compat_ioctl is used
for compatibility purposes and will call unlocked_ioctl, or they just point to
the same function. Usually, these structures are static structures that live in
the .data section:
So the idea was to find them in memory and determine the value of the fields to
get the interesting IOCTLs. The fops is given to a register function for… well,
registration. There are a few of them. I thus took some for testing, and tried to
look in the module to see whether any of them was called. When I found one, I knew that
the fops would be one of its parameters and could get its address. For this, I needed
to find where it was called… Let’s be stupid for once, and parse the CFG
(Control Flow Graph) to find something like call my_register_function.
When found, I could take the instruction’s address and determine which
function it was in by checking the upper and lower addresses of each symbol. With
that, I had the caller function and the address of the call to the
registration function.
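The bounds check described above can be sketched as a linear scan over the symbols. This assumes each symbol comes with a start address and a size; the names and addresses are hypothetical.

```python
def enclosing_function(symbols, addr):
    """Return the name of the symbol whose [start, start + size) range holds addr."""
    for name, start, size in symbols:
        if start <= addr < start + size:
            return name
    return None


# Hypothetical function symbols: (name, start address, size in bytes).
symbols = [
    ("mydrv_init", 0x2000, 0x80),
    ("mydrv_exit", 0x2080, 0x20),
]

# Hypothetical address of a call to the registration function.
call_addr = 0x2044
print(enclosing_function(symbols, call_addr))
```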
To determine the address of fops, I decided to launch an angr explorer from
the entry point of the caller toward the register function call, where I could break and just check
the registers. Sadly, the callers are often long functions, and the multiple
unresolved function calls (we are in kernel land, which is not where angr feels
at its best) used to break angr. So, let’s try to be clever: the parameters are
passed to the function through registers, and they are set just before the call,
so we can just launch the explorer toward the call from the beginning of the block
containing it, not from the caller’s entry point. This saves time and memory.
Great. Doing this, we are able to examine the registers, get the fops address,
compute the offset of the IOCTLs in the structure and carve the memory at the
found address to get the right IOCTL functions. FINALLY.
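The last step, reading the two function pointers out of the structure, can be sketched on a raw byte buffer. Everything concrete here is an assumption made for the example: the field offsets (they depend on the kernel version and configuration), the 64-bit little-endian pointers, the addresses, and the fact that the buffer holds an already relocated .data section.

```python
import struct

# Hypothetical field offsets inside struct file_operations.
UNLOCKED_IOCTL_OFFSET = 0x50
COMPAT_IOCTL_OFFSET = 0x58


def read_pointer(image, base, addr):
    """Read a little-endian 64-bit pointer stored at virtual address addr."""
    (value,) = struct.unpack_from("<Q", image, addr - base)
    return value


def ioctl_handlers(image, base, fops_addr):
    """Return the handler addresses stored in the fops found at fops_addr."""
    return {
        "unlocked_ioctl": read_pointer(image, base, fops_addr + UNLOCKED_IOCTL_OFFSET),
        "compat_ioctl": read_pointer(image, base, fops_addr + COMPAT_IOCTL_OFFSET),
    }


# Build a fake .data section with a fops whose two fields share one handler.
base, fops_addr, handler = 0x4000, 0x4020, 0x1040
data = bytearray(0x100)
struct.pack_into("<Q", data, fops_addr - base + UNLOCKED_IOCTL_OFFSET, handler)
struct.pack_into("<Q", data, fops_addr - base + COMPAT_IOCTL_OFFSET, handler)

handlers = ioctl_handlers(bytes(data), base, fops_addr)
print({name: hex(addr) for name, addr in handlers.items()})
```

A field left at zero would mean the driver does not provide that handler.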
Well, sadly, it doesn’t work all the time. Calling conventions do not always put the argument in the same register, and the memory at this place is often symbolic and doesn’t give anything interesting. So I had to find something else to add another layer of fallback.
This was done in a simple manner: the DWARF symbols. We compiled the modules
with KCFLAGS='-g' and just used the debug info to get the fops offset in
memory. However, this closed the project to private modules whose source code was
not available…
The project’s IOCTL carver then looked like this:
In the end, the whole system is unusable for big modules¹, and I would have needed too many layers of fallback for it to be robust enough. However, angr is a great framework that can prove to be incredibly efficient when you are willing to help it a little, and it was nice to dive into it in an unusual context (meaning, not a CTF!).
¹. A.K.A. modules whose functions call ANY library function, or have more than ~10 basic blocks…
LSEWEEK 2016
You can see the slides and the video (with English subtitles) of my talk on the subject for more details and context.