Identifying & linking Darknet vendors through NLP applications

Vageesh Saxena, Jerry Spanakis and Gijs Van Dijck

The Dark Web, also known as the Darknet, is a collection of countless hidden websites that facilitates various criminal activities, including financial fraud, hacking services, child sexual exploitation, and trafficking of drugs, organs, weapons, and even humans. While law enforcement actively pursues such criminal activities, the anonymity and vast scope of the Darknet help these criminals stay undetected. Gauging the scope and size of a Darknet market is a further challenge; tracing and linking vendors across markets and existing criminal databases can help law enforcement prioritize its resources. To aid law enforcement agencies, we therefore propose a framework that examines the writing style of vendors in Darknet advertisements and links them within and across seven distinct Darknet markets. This research addresses the following concerns:

(1) Vendors on the Darknet often distribute their business across multiple markets to stay under the radar of law enforcement. Therefore, we establish a BERT-based supervised baseline that classifies 3,896 vendors across three Darknet markets in an open-set multiclass classification setting with an accuracy of 0.91.

(2) Vendors on the Darknet often change their handles within and across multiple markets to stay undetected. Therefore, using our established baseline, we extract sentence embeddings and compute their representational similarity to identify vendors with identical advertisements.

(3) Countless new markets emerge on the Darknet every day. Unfortunately, not all law enforcement agencies have the resources to train resource-intensive state-of-the-art (SOTA) classifiers. Therefore, we perform knowledge distillation from our established baseline to a smaller network for an emerging low-resource (LR) market and claim performance comparable to SOTA architectures.
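
The abstract does not state which distillation objective is used for concern (3). As an illustration only, the sketch below assumes the common soft-target formulation: a temperature-scaled KL divergence between teacher and student outputs, blended with the usual cross-entropy on the true vendor label. The function names, the temperature of 2.0, and the weight alpha are hypothetical choices, not values taken from this work.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Soft-target distillation loss for a single advertisement.

    Assumed objective (not confirmed by the abstract):
        alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label)
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Standard cross-entropy of the student against the true vendor label.
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard


# Toy logits over three hypothetical vendor classes.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.2, 3.0, 1.0]
loss_close = distillation_loss(close_student, teacher, label=0)
loss_far = distillation_loss(far_student, teacher, label=0)
```

A student whose logits track the teacher's receives a lower loss than one that disagrees with it, which is the signal a smaller network would be trained on in place of a full-size classifier.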

Finally, we claim to identify 201 migrants and 57 aliases across the Alphabay, Dreams, Silk Road-1, Traderoute, Agora, Valhalla, and Berlusconi Darknet markets. We believe that law enforcement can benefit from our framework by following our approach and training the established baseline on an extensive criminal database.
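
To make the alias-identification step in concern (2) concrete, the following is a minimal sketch under stated assumptions. It assumes each advertisement has already been mapped to a fixed-length embedding by the trained classifier, averages those embeddings per vendor handle, and uses cosine similarity as the similarity measure. The abstract names neither the measure nor a threshold, so `cosine_similarity`, `candidate_aliases`, the 0.95 cutoff, and the toy vectors are all hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_embedding(vectors):
    """Average a vendor's advertisement embeddings into one vector."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def candidate_aliases(vendor_ads, threshold=0.95):
    """Return handle pairs whose mean embeddings are at least `threshold` similar.

    `vendor_ads` maps a vendor handle to a list of advertisement embeddings.
    The threshold is an illustrative assumption, not a value from this work.
    """
    centroids = {handle: mean_embedding(ads) for handle, ads in vendor_ads.items()}
    pairs = []
    for first, second in combinations(sorted(centroids), 2):
        score = cosine_similarity(centroids[first], centroids[second])
        if score >= threshold:
            pairs.append((first, second, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy 3-dimensional embeddings; real sentence embeddings are much larger.
toy_ads = {
    "vendor_a": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]],
    "vendor_b": [[0.85, 0.15, 0.0]],
    "vendor_c": [[0.0, 0.1, 0.9]],
}
aliases = candidate_aliases(toy_ads)
```

In this toy input, `vendor_a` and `vendor_b` are flagged as a candidate alias pair, while `vendor_c` is not. In practice, any such pair would be a lead for manual review rather than proof that the two handles belong to the same vendor.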