09 - How to maybe not be so bad at OSINT?

Disclaimer: For this post, I've taken a lot of IP / DNS info from Google and simply modified it to be similar to a scenario I encountered a little while back at work.  It's not intended to be perfect or accurate as far as the DNS information goes; this post is just about the process.

BlAcK bOx TeStInG?

A while ago we had a client who had requested a 'full black-box test', as they'd put it.  As a tester, visions of grandeur and epic hacks flooded my mind straight away, imagining myself owning the perimeter in some non-specific way, dodging IPS laser beams, battling AI defenses-- but definitely getting root.

Thinking about it though, I realized that my OSINT skills aren't actually all that sharp...  What would happen if, say, the client didn't have any IPs registered to them?  Could we locate IPs to test?  There are lots of ways, but when it came down to it, how would I?

After many logistics, and what I'd assume was an actual nightmare for the people scheduling and billing, it was time to begin testing.  We had a phone number, two email addresses and the client's name.  Really, that's not a lot to go on.

theHarvester found us very little, and subdomain bruteforcing didn't find much either.  None of the MX or NS records were client owned, the website was hosted by a third party, social engineering was explicitly left out of scope, etc.

As my experience with OSINT is limited, I'd never really done anything in-depth, so this would be a good exercise.

What do?

Over the course of the past few years I've added several tools and resources to my 'OSINT' arsenal, but I quickly found I could exhaust those resources and still need more information. 

The client had requested that we reach out to confirm the devices we'd be testing once we 'discovered' them.  As much as this was useful for the client in making sure we were testing the devices they wanted us to, I'm sure the stipulation was also pushed for by legal, to make sure we didn't test something we shouldn't.  So, my general (passive) 'footprinting' process is usually something like:

1. whois client-domain.com
2. ARIN whois lookup (looking for additional subnets / domains)
3. DNS lookups (MX, NS, SPF, etc)
4. Pastebin searches
5. Shodan / Censys
6. theHarvester (usually superfluous information)
7. GXFR (Doesn't really work that well, I run it anyway)
8. Subdomain bruteforcing (sublist3r, fierce, etc)
9. SSL Cert checking (for additional domains)
10. ???
11. Profit...? 

Well, maybe profit.  I usually follow the above, but not to a T.  In most cases I'm given a scope; it may not be correct, but it's a scope nonetheless.  I've made sure to perform a process which (hopefully) locates additional ranges, whilst (hopefully) identifying errors in the scope the client sent.  I fairly regularly find a fat-fingered IP or an incomplete subnet.  Diligence in that respect is key-- clients may not know to scan the whole subnet, as they assume only certain devices are 'active'.  Really, it primarily depends on the maturity of the organization; every case is different.
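For the fat-fingered IP / incomplete subnet problem, here's a minimal sketch of the kind of sanity check I mean, using Python's standard `ipaddress` module.  The `check_scope` helper and the sample scope are hypothetical (the addresses are documentation ranges, not the client's), so treat it as an illustration rather than a finished tool.

```python
import ipaddress

def check_scope(entries):
    """Classify each scope entry as 'ok', 'host bits set', or 'invalid'."""
    findings = []
    for entry in entries:
        text = entry.strip()
        try:
            # strict=True rejects CIDR blocks whose address isn't the network address
            net = ipaddress.ip_network(text, strict=True)
            findings.append((text, "ok", str(net)))
        except ValueError:
            try:
                # strict=False masks off the host bits, showing the subnet
                # that was probably meant
                net = ipaddress.ip_network(text, strict=False)
                findings.append((text, "host bits set", str(net)))
            except ValueError:
                # not parseable as an address or network at all
                findings.append((text, "invalid", None))
    return findings

# Hypothetical scope list as a client might send it
scope = ["192.0.2.0/27", "192.0.2.40/27", "198.51.100.256", "203.0.113.7"]
for entry, status, network in check_scope(scope):
    print(entry, status, network)
```

Anything flagged 'host bits set' or 'invalid' is worth a quick question back to the client before scanning, not a guess on my part.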

While the above 'process' tends to work well enough for a typical pentest which we're given a scope for, can it be applied to a 'black box' test?  Yes and no-- if you don't have IPs, it throws a wrench in the gears.  All of the steps are applicable, but the information you glean from each step is probably pretty limited.  In this particular case, the first few ranges were easy enough to identify-- the client had subnets explicitly registered to them.  Great!  Three /27's.  Nothing huge.

Was that enough though?  I started looking a bit further into ARIN's database and identified 3 more subnets, all /28's, plus a large IPv6 range.  The registration information hadn't been updated in quite a while, but it was registered to the client and they didn't say 'Reallocated', so they might still be in use.  Either way, it's part of the client's internet presence.  So, I reached out to the client, thinking I'd got more information than they thought we'd get, faster and more easily than I'd expected.

"Thanks, Howard.  The three /27's are correct, but we don't own those other 28s and don't have any IPv6 ranges.  There are also more subnets we'd like tested that aren't in the list you sent."

Huh.  Alright.  Definitely not expected.  Now I was at a loss...  The client 'doesn't own' the 28s, and owns no IPv6 addresses.  Obviously, this seemed... wrong?  They said they don't, but according to ARIN, and the 'Organization' field of the whois results... they do.  So, I reached back out to the client, thanked them for the information and requested some clarification.  More or less I just wanted them to know that they DO in fact own the /28s and the IPv6 range.  I also wanted to stress to them that if the additional ranges they wanted included in the test weren't registered to them, it was going to be hard to identify the correct subnets.  The response was typical and expected: they DO own them, they just don't use them, and they'd like me to continue searching.

Making the connections

Long story short, the other two ranges were identified.  They aren't registered to the client, so it wasn't as simple as finding the other addresses, but in retrospect it wasn't much harder, either.  In actuality, I had all the information I needed after the first DNS lookups, but I didn't put two and two together.

A challenging part of being a tester is the wealth (mountain) of information we are sometimes presented with.  It can be difficult to pick the needles out of the haystack, especially if you're not sure the needle looks like a needle.  Maybe it's just another piece of hay.

Let's look at some sample dig output, querying TXT records:

root@kali:~# dig txt google.com
google.com. 5 IN TXT "v=spf1 include:_spf.google.com ~all"
google.com. 5 IN TXT "globalsign-smime-dv=CDYX+XFHUw2wml6/Gb8+59BsH31KzUr6c1l2BPvqKX8="
google.com. 5 IN TXT "facebook-domain-verification=22rm551cu4k0ab0bxsw536tlds4h95"
google.com. 5 IN TXT "docusign=05958488-4752-4ef2-95eb-aa7ba8a3bd0e"

Not unexpected.  Let's dig deeper:

root@kali:~# dig txt _spf.google.com
_spf.google.com. 5 IN TXT "v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all"


root@kali:~# dig txt _netblocks.google.com
_netblocks.google.com. 5 IN TXT "v=spf1 ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ~all"

And, suddenly we have some IPs.  But, do these matter?  On their own, perhaps not.  In Google's case, many of the IP addresses within their SPF records are likely to be registered to Google.  (I didn't check, don't kill me)  What do if the client is small, though?  What do if none of the IPs in the SPF record (if there even is one) are registered to or even seem relevant to the client?  Well, it comes down to needing more information. 
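Chasing those include: chains by hand gets old quickly, so here's a rough sketch of how the same walk could be automated.  Everything in it is an assumption for illustration: `expand_spf` is a hypothetical helper, the lookup is passed in as a plain function so the example runs offline, and the TXT data is made up using documentation ranges.  It also only follows `include:` and collects `ip4:` / `ip6:` terms; other SPF mechanisms are ignored.

```python
def expand_spf(domain, lookup_txt, seen=None):
    """Recursively collect ip4:/ip6: ranges from a domain's SPF record.

    lookup_txt is any callable that maps a domain name to a list of
    TXT record strings (hypothetical interface for this sketch).
    """
    seen = set() if seen is None else seen
    if domain in seen:
        # already visited: guards against include loops
        return []
    seen.add(domain)

    ranges = []
    for record in lookup_txt(domain):
        if not record.lower().startswith("v=spf1"):
            continue  # skip verification tokens and other TXT noise
        for term in record.split()[1:]:
            term = term.lstrip("+-~?")  # drop the optional qualifier
            lower = term.lower()
            if lower.startswith(("ip4:", "ip6:")):
                ranges.append(term.split(":", 1)[1])
            elif lower.startswith("include:"):
                target = term.split(":", 1)[1]
                ranges.extend(expand_spf(target, lookup_txt, seen))
    return ranges

# Made-up TXT data standing in for real lookups
FAKE_TXT = {
    "clientdomain.com": ["v=spf1 include:_spf.clientdomain.com ~all",
                         "docusign=hypothetical-token"],
    "_spf.clientdomain.com": ["v=spf1 include:_netblocks.clientdomain.com ~all"],
    "_netblocks.clientdomain.com": ["v=spf1 ip4:192.0.2.0/27 ip4:198.51.100.16/28 ~all"],
}

print(expand_spf("clientdomain.com", lambda d: FAKE_TXT.get(d, [])))
```

Against a live domain, the lambda would be swapped for a real resolver call (dig output, or a DNS library of your choice); the walk itself stays the same.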

According to this site (https://ns1.com/resources/dns-types-records-servers-and-queries), there are 10 kinds of DNS records commonly used:

A Record
AAAA Record
CNAME Record
MX Record
NS Record
PTR Record
CERT Record
SRV Record
TXT Record
SOA Record

Click that link if you want to read about each.  Anyway, DNS does a lot of stuff.  I spoke of the challenge of correlating mountains of information to one another, and how sometimes things don't immediately seem related.  After enumerating by hand for a while, I decided to run DNS Recon against the domain we knew and just look at the results.  DNS Recon is nice, as it automates those searches we'd normally do manually.  It also presents the information in a nice format.  Sometimes that's all you need to make it click; a nice, pretty visual.

After running DNS Recon, I looked at the output and facepalmed.  The additional ranges were there in front of me the whole time.  

So let's assume the addresses in the previous SPF record lookups DON'T resolve to Google.  Now, let's look at the following 'clientdomain.com' DNS Recon (it's actually just a condensed lookup against Google, modified for the sake of being confusing).

root@kali:~# dnsrecon -d clientdomain.com
[*] Performing General Enumeration of Domain: clientdomain.com
[-] DNSSEC is not configured for clientdomain.com
[*]  SOA ns1.clientdomain.com
[*]  NS ns3.clientdomain.com
[*]  MX alt2.aspmx.l.clientdomain.com
[*]  AAAA clientdomain.com 2607:f8b0:4006:81a::200e
[*]  TXT clientdomain.com v=spf1 include:_spf.clientdomain.com ~all
[*] Enumerating SRV Records
[*]  SRV _ldap._tcp.clientdomain.com ldap.clientdomain.com 389 0
[*]  SRV _jabber._tcp.clientdomain.com alt4.xmpp-server.l.clientdomain.com 5269 0
[*]  SRV _xmpp-client._tcp.clientdomain.com alt4.xmpp.l.clientdomain.com 5222 0
[*]  SRV _jabber-client._tcp.clientdomain.com alt2.xmpp.l.clientdomain.com 2a00:1450:400b:c00::7d 5222 0
[*]  SRV _carddavs._tcp.clientdomain.com clientdomain.com 2607:f8b0:4006:81a::200e 443
[*]  SRV _caldavs._tcp.clientdomain.com calendar.clientdomain.com 2607:f8b0:4006:810::200e 443 0
[*]  SRV _sip._tcp.clientdomain.com.clientdomain.com 5060 0

Do you notice anything in particular? 

The additional ranges are contained within that output, as service records.  However, without further enumerating the SPF record (beyond a simple SPF lookup) that wouldn't be known. 


"v=spf1 ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ip4: ~all"


[*]  SRV _jabber._tcp.clientdomain.com alt4.xmpp-server.l.clientdomain.com 5269 0
[*]  SRV _xmpp-client._tcp.clientdomain.com alt4.xmpp.l.clientdomain.com 5222 0
[*]  SRV _sip._tcp.clientdomain.com.clientdomain.com 5060 0

After running whois against the SRV record IPs above, I found that they were registered as generic ISP addresses, whereas all of the other IPs in the DNS records were registered in a way that allowed me to identify them as likely NOT being associated with the client.  Now I could reasonably assume that the two ranges in the SPF record were likely the client's.  Reaching back out to the client, I inquired if '' and '' were the additional ranges.  It was confirmed they were, and I was able to begin testing. 
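The correlation I eventually did in my head (SRV hosts sitting inside ranges listed in the SPF record) is easy to sketch in code.  This is a hypothetical example: `match_hosts_to_ranges` is my own name for it, and the hostnames-to-IP mapping and ranges below are invented from documentation address space, not the real engagement data.

```python
import ipaddress

def match_hosts_to_ranges(host_ips, spf_ranges):
    """Return {hostname: [matching ranges]} for hosts inside any SPF range."""
    networks = [ipaddress.ip_network(r, strict=False) for r in spf_ranges]
    matches = {}
    for name, ip in host_ips.items():
        addr = ipaddress.ip_address(ip)
        # compare only within the same IP version (v4 vs v6)
        hits = [str(net) for net in networks
                if addr.version == net.version and addr in net]
        if hits:
            matches[name] = hits
    return matches

# Invented data: SRV targets with their resolved addresses
srv_hosts = {
    "alt4.xmpp-server.l.clientdomain.com": "192.0.2.10",
    "alt4.xmpp.l.clientdomain.com": "198.51.100.20",
    "calendar.clientdomain.com": "2001:db8::200e",
}
# Invented data: ranges pulled from the fully expanded SPF record
spf_ranges = ["192.0.2.0/27", "198.51.100.16/28", "203.0.113.0/24"]

for host, ranges in match_hosts_to_ranges(srv_hosts, spf_ranges).items():
    print(host, "->", ", ".join(ranges))
```

A hit here doesn't prove ownership, it just tells you which ranges deserve a whois lookup and a confirmation email to the client.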


This was a fun one, as it wasn't the typical "Here's some IPs, make sure they're right, then test" scenario.  It also taught me a lesson in making sure I really paid attention to the information I was receiving, and reiterated the importance of understanding basic networking concepts, such as DNS. It's easy to ignore stuff that you regularly see, but don't normally pay close attention to.

This isn't a post about a how-to, nor is it a post about a process.  It's a moral-- what it comes down to in the end is staying sharp.  The moment you start to get lazy with this stuff is the moment it leaves you behind.  You think 'well, that tool never yielded anything' and don't run it, or run it and casually peruse the results.  It's the smallest of things that sometimes make the connections.  Ironically, I breached the perimeter on this client because of vulnerabilities within a webserver on one of the ranges I worked so hard to locate.

The hard work doesn't always pay off, but when it does, it pays ten-fold.
