Spammers use dedicated programs and technologies to generate and transmit the billions of spam emails which are sent every day. This requires significant investment of both time and money.
Spammer activity can be broken down into the following steps:
1. Collecting and verifying recipient addresses; sorting the addresses into target groups
2. Creating platforms for mass mailing (servers and/or individual computers)
3. Writing mass mailing programs
4. Marketing spammer services
5. Developing texts for specific campaigns
6. Sending spam
Each step in the process is carried out independently of the others.
Creating address databases
Collecting and verifying addresses; creating address lists
The first step in running a spammer business is creating an email database. Entries consist of more than email addresses: each entry may contain additional information such as geographical location, sphere of activity (for corporate entries) or interests (for personal entries). A database may contain addresses from specific mail providers, such as Yandex, Hotmail or AOL, or from online services such as PayPal or eBay.
There are a number of methods spammers typically use to collect addresses:
* Guessing addresses using common combinations of words and numbers - john@, destroyer@, alex-2@
* Guessing addresses by analogy - if there is a verified joe.user@yahoo.com, then it's reasonable to search for a joe.user@hotmail.com, @aol.com etc.
* Scanning public resources including web sites, forums, chat rooms, Whois databases, Usenet News and so forth for word combinations that look like addresses (i.e. word1@word2.word3, with word3 being a top-level domain such as .com or .info)
* Stealing databases from web services, ISPs etc.
* Stealing users' personal data using Trojans
Topical databases are usually created using the third method, since public resources often contain information about user preferences along with personal information such as gender, age etc. Stolen databases from web services and ISPs may also include such information, enabling spammers to further personalize and target their mailings.
Stealing personal data such as mail client address books is a recent innovation, but is proving to be highly effective, as the majority of addresses will be active. Unfortunately, recent virus epidemics have demonstrated that there are still a great many systems without adequate antivirus protection; this method will continue to be successfully used until the vast majority of systems have been adequately secured.
Address verification
Once email databases have been created, the addresses need to be verified before they can be sold or used for mass mailing. Spammers send a variety of trial messages to check that addresses are active and that email messages are being read.
1. Initial test mailing. A test message with a random text which is designed to evade spam filters is sent to the entire address list. The mail server logs are analysed for active and defunct addresses and the database is cleaned accordingly.
2. Once addresses have been verified, a second message is often sent to check whether recipients are reading messages. For instance, the message may contain a link to a picture on a designated web server. Once the message is opened, the picture is downloaded automatically and the web site will log the address as active. Most email clients no longer download pictures automatically, so this method is on the wane.
3. A more successful way of verifying that an address is active is social engineering. Most end users know that they have the right to unsubscribe from unsolicited and/or unwanted mailings. Spammers take advantage of this by sending messages with an 'unsubscribe' link. When a user clicks the link, a message purportedly unsubscribing the user is sent. In reality, the spammer receives confirmation that the address in question is not only valid but that the user is active.
However, none of these methods are foolproof and any spammer database will always contain a large number of inactive addresses.
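From the recipient's side, the picture-based check in step 2 can be blunted by refusing to fetch remote images. The following is a minimal sketch, assuming the message body is available as an HTML string; the class name, function name and sample URL are hypothetical and do not reflect how any particular mail client is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteImageFinder(HTMLParser):
    """Collects the URLs of images that would be fetched from a remote server."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only http(s) sources trigger a request to a web server;
        # embedded images (cid:, data:) do not reveal that the mail was opened.
        if urlparse(src).scheme in ("http", "https"):
            self.remote_images.append(src)


def find_remote_images(html_body):
    """Return the list of remote image URLs found in an HTML message body."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    finder.close()
    return finder.remote_images


# Hypothetical message body: one remote picture, one embedded picture.
sample = '<p>Hello</p><img src="http://tracker.example/p.gif?id=123"><img src="cid:logo">'
print(find_remote_images(sample))
```

A client could use such a list to block the downloads or to ask the user first, which is the behaviour that has put this verification method on the wane.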
Creating platforms for mass mailing
Today's spammers use one of these three mass mailing methods:
1. Direct mailing from rented servers
2. Using open relays and open proxies - servers which have been poorly configured, and are therefore freely accessible
3. Bot networks - networks of zombie machines infected with malware, usually a Trojan, which allow spammers to use the infected machines as platforms for mass mailings without the knowledge or consent of the owner.
Renting servers is problematic, since antispam organizations monitor mass mailings and are quick to add servers to black lists. Most ISPs and antispam solutions use black lists as one method to identify spam: this means that once a server has been blacklisted, it can no longer be used by spammers.
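The article does not say how black lists are published. One widely used convention, assumed here, is the DNS-based list: the receiving server reverses the octets of the sending IPv4 address, appends the list's zone and looks the name up in DNS. The sketch below only builds the query name; the zone dnsbl.example.org is a placeholder, not a real list, and no network lookup is performed.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name a mail server would look up for an IPv4 address.

    The octets are reversed and the list's zone is appended. Whether the
    name resolves (i.e. whether the address is listed) is left to the caller.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.25 is a documentation address; the zone is hypothetical.
print(dnsbl_query_name("192.0.2.25", "dnsbl.example.org"))
```

Because a lookup of this kind is cheap for every receiving server to perform, a rented server loses its value to the spammer soon after it is listed.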
Using open relay and open proxy servers is also time-consuming and costly. First, spammers need to write and maintain robots that search the Internet for vulnerable servers. Then the servers need to be penetrated. Very often, however, these servers are also detected and blacklisted after a few successful mailings.
As a result, today most spammers prefer to create or purchase bot networks. Professional virus writers use a variety of methods to create and maintain these networks:
1. Exploiting vulnerabilities in Internet browsers, primarily MS Internet Explorer. There are a number of browser vulnerabilities which make it possible to penetrate a computer from a site being viewed by the machine's user. Virus writers exploit such holes and write Trojans and other malware to penetrate victim machines, giving malware owners full access to, and control over, these infected machines.
For instance, porn sites and other frequently visited semi-legal sites are often infested with such malicious programs. In 2004 a large number of sites running under MS IIS were penetrated and infected with Trojans. These Trojans then attacked the machines of users who believed that these sites were safe.
2. Using email worms and exploiting vulnerabilities in MS Windows services to distribute and install Trojans:
1. Most recent virus outbreaks have been caused by blended threats, which included installation of a backdoor on infected machines. In fact, nearly all email worms have a Trojan payload.
2. MS Windows systems are inherently vulnerable, and hackers and virus writers are always ready to exploit this. Independent tests have demonstrated that a Windows XP system without either a firewall or antivirus software will be attacked within approximately 20 minutes of being connected to the Internet.
3. Pirate software is also a favorite vehicle for spreading malicious code. Since these programs are often spread via file-sharing networks, such as Kazaa, eDonkey and others, the networks themselves are penetrated and even users who do not use pirate software will be at risk.
Spammer Software
An average mass mailing contains about a million messages. The objective is to send the maximum number of messages in the minimum possible time: there is a limited window of opportunity before antispam vendors update signature databases to deflect the latest types of spam.
Sending a large number of messages within a limited timeframe requires appropriate technology. A number of programs developed and used by professional spammers are available. These programs need to be able to:
1. Send mail via a variety of channels including open relays and individual infected machines.
2. Create dynamic texts.
3. Spoof legitimate message headers.
4. Track the validity of an email address database.
5. Detect whether individual messages are delivered or not and resend them from alternate platforms if the original platform has been blacklisted.
These spammer applications are available as subscription services or as stand-alone applications for a one-off fee.
Creating the message body
Today, antispam filters are sophisticated enough to instantly detect and block a large number of identical messages. Spammers therefore now make sure that the emails in a mass mailing have almost identical content, with each text very slightly altered. They have developed a range of methods to mask the similarity between messages in each mailing:
* Inclusion of random text strings, words or invisible text. This may be as simple as including a random string of words and/or characters or a real text from a real source at either the beginning or the end of the message body. An HTML message may contain invisible text - tiny fonts or text which is colored to match the background.
All of these tricks interfere with the fuzzy matching and Bayesian filtering methods used by antispam solutions. However, antispam developers have responded by developing quotation scanners, detailed analysis of HTML encoding and other techniques. In many cases spam filters simply detect that such tricks have been used in a message and automatically flag it as spam.
* Graphical spam. Sending text in graphics format hindered automatic text analysis for a period of time, though today a good antispam solution is able to detect and analyze incoming graphics.
* Dynamic graphics. Spammers are now utilizing complicated graphics with extra information to evade antispam filters.
* Dynamic texts. The same text is rewritten in numerous ways so that it is necessary to compare a large number of samples before it will be possible to identify a group of messages as spam. This means that antispam filters can only be updated once most of the mailing has already reached its target.
A good spammer application will utilize all of the above methods, since different potential victims use different antispam filters. Using a variety of techniques ensures that a commercially viable number of messages will escape filtration and reach the intended recipients.
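The fuzzy matching mentioned above can be illustrated from the filter's side. The sketch below is one simple way to do it, not a description of any vendor's method: each message is broken into overlapping three-word sequences (shingles) and two messages are compared by the share of shingles they have in common (Jaccard similarity). The function names, shingle size and sample messages are illustrative assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles common to both texts, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


original = "Buy cheap watches today at our online store"
padded = "Buy cheap watches today at our online store xqzv plmk"  # random tail added
unrelated = "Meeting moved to Thursday at ten in room four"

print(jaccard(original, padded))     # 0.75: still recognisably the same mailing
print(jaccard(original, unrelated))  # 0.0: no three-word sequence in common
```

Appending a short random string leaves most shingles intact, so the two copies still score high. Dynamic texts that rewrite the whole message lower the score much further, which is why a filter needs many samples before it can group such a mailing.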
Marketing spammer services
Strangely enough, spammers advertise their services using spam. In fact, the advertising which spammers use to promote their services is a separate category of spam. Spammer-related spam also includes advertisements for spammer applications, bot networks and email address databases.
The structure of a spammer business
The steps listed above require a team of different specialists or outsourcing certain tasks. The spammers themselves, i.e. the people who run the business and collect money from clients, usually purchase or rent the applications and services they need to conduct mass mailings.
Spammers are divided into professional programmers and virus writers who develop and implement the software needed to send spam, and amateurs who may not be programmers or IT people, but simply want to make some easy money.
Future Trends
The spam market today is valued at several hundred million dollars annually. How is this figure reached? Divide the number of messages detected every day by the number of messages in a standard mailing, then multiply the result by the average cost of a standard mailing: 30 billion (messages) divided by 1 million (messages), multiplied by US $100, multiplied by 365 (days), gives an estimated annual turnover of $1,095 million.
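The estimate can be checked step by step. The short calculation below uses only the figures quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the article.
messages_per_day = 30_000_000_000   # spam messages detected every day
messages_per_mailing = 1_000_000    # size of a standard mailing
price_per_mailing_usd = 100         # average cost of a standard mailing
days_per_year = 365

mailings_per_day = messages_per_day // messages_per_mailing      # 30,000 mailings
daily_turnover_usd = mailings_per_day * price_per_mailing_usd    # $3,000,000 a day
annual_turnover_usd = daily_turnover_usd * days_per_year

print(f"${annual_turnover_usd / 1_000_000:,.0f} million per year")  # $1,095 million per year
```

The result is sensitive to the assumed price: halving the average cost of a mailing halves the estimate.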
Such a lucrative market encourages the emergence of full-scale companies which run the entire business cycle in-house in a professional and cost-effective manner. There are also legal issues: collecting personal data and sending unsolicited correspondence are currently illegal in most countries of the world. However, the money is good enough to attract the interest of people who are willing to take risks and potentially make a fat profit.
The spam industry is therefore likely to follow in the footsteps of other illegal activities: go underground and engage in a prolonged cyclic battle with law enforcement agencies.