I recently completed a contact form on a technology website to send in a query.
The site looked like it was built on WordPress.
After populating the usual fields, rather than displaying an email confirmation or populating a dropdown with some standard confirmation text, the site instead displayed a copy of my query on a confirmation URL.
The URL structure looked familiar to me.
I was almost certain it was Contact Form 7 — which I used to use before jumping ship to WP Forms, which is the form embedded in this site.
I was wrong. There are services for detecting which themes and plugins a WordPress site is running, but in this case all I had to do was inspect the source HTML:
The confirmation URL looked like this:
http://www.website.com/contact-us?contact-form-id=243&contact-form-sent=24245&contact-form-hash=353533598988958&_wpnonce=cdf79b8067#contact-form-865
So we can clearly identify four variables:
- The contact form ID (a three-digit number). Its raw string is contact-form-id
- The contact form confirmation number (a five-digit number). Its raw string is contact-form-sent
- A 40-character hash. Its raw string is contact-form-hash
- A WordPress nonce, which is a short-lived security token rather than a second form identifier. Its raw string is _wpnonce
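As a quick illustration, the variables can be pulled out of the confirmation URL with Python's standard library. This is only a sketch: the function name confirmation_params is my own, and the URL is the example shown above.

```python
from urllib.parse import urlsplit, parse_qs

# The example confirmation URL from above.
url = (
    "http://www.website.com/contact-us?contact-form-id=243"
    "&contact-form-sent=24245&contact-form-hash=353533598988958"
    "&_wpnonce=cdf79b8067#contact-form-865"
)


def confirmation_params(url):
    """Return the query parameters and the fragment of a URL as a dict."""
    parts = urlsplit(url)
    # parse_qs maps each name to a list of values; keep the first of each.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    params["fragment"] = parts.fragment
    return params


params = confirmation_params(url)
for name, value in params.items():
    print(f"{name}: {value}")
```

The fragment (contact-form-865) is never sent to the server, so it presumably only scrolls the browser to the form on the page.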
The contact form hash alone is enough to prevent people from accessing other users’ completed forms simply by polling URLs with sequential contact form IDs.
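To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the 40 characters are hexadecimal (16 possible values each), which I have not confirmed, and the guessing rate is a made-up figure purely for illustration.

```python
HASH_LENGTH = 40                # characters in the hash, as observed
ALPHABET_SIZE = 16              # assumption: hexadecimal characters
GUESSES_PER_SECOND = 1_000_000  # hypothetical request rate, wildly optimistic

SECONDS_PER_YEAR = 365 * 24 * 3600

# Number of distinct hashes an attacker would have to consider.
search_space = ALPHABET_SIZE ** HASH_LENGTH

# Time needed to try every one of them at the assumed rate.
years_to_exhaust = search_space / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"Possible hashes: {search_space:.2e}")
print(f"Years to try them all: {years_to_exhaust:.2e}")
```

Under those assumptions there are 16^40 (that is, 2^160) possible values, so polling URLs blindly is not a realistic way to land on someone else's confirmation page.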
I presumed that the plugin was generating these confirmation pages with the appropriate SEO tags (a noindex directive, for example) to keep them out of search results.
Being the naturally inquisitive type, I immediately wondered: are these confirmation URLs ephemeral? And even if they are supposed to sit in the deep web, might any have slipped through the net and be retrievable in the wild?
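One tag that would do this job is a robots meta tag carrying noindex. I did not verify that the plugin emits one, so the following is only a sketch of how such a check could look; the class name, function name and sample HTML are all made up for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, not taken from any real site.
sample_html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(is_noindex(sample_html))
```

A page can also be kept out of the index with an X-Robots-Tag HTTP header, which this sketch does not look at.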
I went searching.
Query Building
After analyzing the URL structure of the confirmation page, my next task was building a Google query that might dredge these up.
Clearly the confirmation messages might vary, so polling for a specific URL pattern seemed like a better approach than looking up text strings.
I figured that "contact-us?contact-form-id="
should be common to all form completions using this WordPress plugin.
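For anyone who would rather script the lookup than type it, the quoted phrase just needs URL-encoding before it goes into a search URL. A minimal sketch follows; the helper name google_search_url is my own.

```python
from urllib.parse import urlencode


def google_search_url(query):
    """Build a Google search URL for the given query string."""
    # urlencode percent-encodes the quotes, '?' and '=' in the phrase.
    return "https://www.google.com/search?" + urlencode({"q": query})


# The double quotes ask for an exact phrase match.
phrase_query = '"contact-us?contact-form-id="'
search_url = google_search_url(phrase_query)
print(search_url)
```

The resulting URL can be pasted into a browser. Fetching it from a script is likely to run into Google's automated-traffic checks.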
First query, no success:
The URLs either don’t seem to persist or are well de-indexed.
But simply searching for the text string dug up some interesting information — potentially content created through integrations:
A form has been partially completed and held online. But it looks like a bot submission:
Unfortunately, directly looking for queries doesn’t yield any obviously exposed form completions.
I decided to search for inurl:contact-form-hash
and started to find actual form confirmations:
Unfortunately these sites haven’t piped the form output to the confirmation page.
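An inurl: search can match the string anywhere in a URL, so not every hit is necessarily a confirmation page. A rough way to separate the two is to check that the expected parameters are really present in the query string. This is a sketch only, and the candidate URLs below are invented for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names seen in the confirmation URL earlier in this post.
EXPECTED_KEYS = {"contact-form-id", "contact-form-sent", "contact-form-hash"}


def looks_like_confirmation(url):
    """Return True if the URL's query string has all the expected parameters."""
    keys = set(parse_qs(urlsplit(url).query, keep_blank_values=True))
    return EXPECTED_KEYS <= keys


# Invented URLs for illustration; neither is a real search result.
candidates = [
    "http://example.com/contact-us?contact-form-id=1&contact-form-sent=2&contact-form-hash=abc",
    "http://example.com/blog/what-is-contact-form-hash",
]

matches = [u for u in candidates if looks_like_confirmation(u)]
print(matches)
```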
And if they did, what’s a word somebody might use in a form? Regards, perhaps?
Note: any time you run advanced operators like this, you are virtually guaranteed to start getting this:
Which merely draws up more Russian spam:
Unfortunately these are not cached in Google:
One interesting observation is that the backend fields are sometimes captured by search engines, probably due to a misconfiguration:
We can find some more by searching for:
Or:
Outcome
This OSINT adventure didn’t yield too much in the way of useful information — but it did reassure me that my form completions are probably not being scraped. But every challenge like this is certainly an interesting experiment.
Some things I picked up from this:
- Contact form completion URLs are generally properly deindexed and so are on the deep web rather than the open internet — as they should be.
- The URL structure includes several unique tokens, including a hash, which would make attempting to guess live completion URLs, or brute force them, basically impossible.
- The only completions available ‘in the wild’ on the open web are from Russian spammers.
- Sometimes, form fields that should be kept within the backend are exposed to Google, making it possible to identify site owners’ private email addresses.