11 March, 2008

I am not a robot!


Péter Gyöngyösi - developer

It did not come as a surprise, since you could read about cracked captchas, new methods and new results throughout the year (for example here, here, here and here), but with the recent release of the most detailed, thoroughly commented analysis of such an attack, it is time to review this topic a bit.

Let's start with what a captcha is: the name stands for "Completely Automated Public Turing test to tell Computers and Humans Apart." The Turing test is a reference to Alan Turing, the British cryptographer and mathematician who greatly contributed to the victory of the Allies by being a key player in cracking Enigma. It was his idea that there are tasks that a human can easily solve within seconds but that are virtually impossible for machines, so they can be used to tell a machine from a real human.

It took half a century for his idea to be used not only by researchers of artificial intelligence, but also by half of the world. Most commonly it is used on webmail sites to prevent automated access that could create hundreds of accounts to send spam by the billion. Usually the test takes the form of a few distorted letters that the user must type -- algorithms are notoriously bad at this task, but owing to how our minds operate, we can solve it effortlessly. There are also funny solutions: I have seen websites ask what day it is in my country, which one of three pictures shows a cat, or which one shows the prettier lady. The security of these solutions comes more from their uniqueness, but for a computer, answering these questions is still far from trivial.

The vulnerability of these and similar tests is that the number of questions is limited. A typical recognize-the-letters captcha implementation uses short words instead of random letters, because the human brain recognizes words like "puppy" more easily than the "ahs3Eina" string. An average implementation uses a dictionary of 5,000-10,000 words: if the prize is tempting, the attackers can and will build this dictionary -- with some human input, or even simply using brute force.
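
To put rough numbers on this weakness, here is a minimal sketch. It assumes that the captcha picks its word uniformly from the dictionary and that the attacker already knows the full word list; the function names and the 62-symbol alphabet (upper case, lower case, digits) are illustrative assumptions, not details of any real implementation.

```python
def blind_guess_success(dictionary_size: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent blind guesses
    names the right word, assuming a uniformly chosen dictionary word."""
    p_single = 1.0 / dictionary_size
    return 1.0 - (1.0 - p_single) ** attempts


def random_string_space(length: int, alphabet_size: int = 62) -> int:
    """Number of possible random strings of the given length
    (assumption: 26 lower-case + 26 upper-case letters + 10 digits)."""
    return alphabet_size ** length


# A 10,000-word dictionary versus an 8-character random string.
dictionary_words = 10_000
random_strings = random_string_space(8)
print(f"dictionary answers:    {dictionary_words}")
print(f"random 8-char answers: {random_strings}")
print(f"hit chance after 1,000 blind guesses: "
      f"{blind_guess_success(dictionary_words, 1_000):.1%}")
```

The point of the comparison is only the difference in scale: a word list is friendlier to humans, but it leaves the attacker with a far smaller space of possible answers than a random string would.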

Another way would be to create even better algorithms and use the ever cheaper computing capacity to solve a problem that we originally thought difficult and expensive to crack. Since captchas first appeared, character-recognition technology (OCR) has improved tremendously: earlier it was sufficient to skew the letters a bit to the right or left, while now, on the more secure pages, even we humans are scratching our heads to make out the text among all the skewing, rotation, and strikethroughs that are supposed to prevent automatic recognition. But self-learning algorithms (for example neural networks) can also help -- after some training they might find the few parameters required to decide if there is a cat in the picture.

But the third and scariest approach is to delegate the problem: the attacker does not even try to use machines to solve the problem, but simply removes it from its original environment and asks another human to answer the question. This has thousands of variations, ranging from games (using mostly adult content) that require the answer to access the next level to actual "jobs" that pay money for recognizing the letters. And what is scary about that? It negates the entire concept: in such cases the result of the Turing test is obviously positive, because the answers come from real humans. The success of the attack comes from the fact that the problem-solving is centralized and the cost of an answer is reduced to a rate that is economically acceptable for the attacker.

It is possible to avoid the first two types of attack with some good ideas and careful design and implementation -- though it is becoming more difficult because of the cheap computing capacity available, we can still find tasks that the human brain solves much more effectively than a computer, and enlarging the dictionary or varying the tasks is just a question of technology. But fighting centralized human recognition needs a different approach. Two interests collide here, and the conflict between them cannot be resolved: first, the protection must not become a tiring obstacle for the average user; second, we do not want the security questions to be easy to answer by the thousands with a bit of organizing.

To solve the problem, we have to change the concept. If we ask the user to prove that he is a human only seldom (ideally, only once during the user's "online life"), we can ask for something more. Something that only humans have, and that is not worth forging in large numbers. At this level, we can go beyond logical problems and questions to answer -- think rather of something from the physical world. Like returning a code received in a text message or via snail mail: for the attacker, a fake account is usually not worth the cost of a new mail address or phone number. But the service providers cannot create such security systems on their own, because it would be too expensive for them, and the users would not invest so much energy into every online service they use. The solution could be to manage the identities centrally, and we can be optimistic about that, because - as we have commented - OpenID is right around the corner.
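
The "return a code received in a text message" idea can be sketched roughly as follows. This is a minimal illustration only: the class name, the six-digit format, the ten-minute lifetime and the in-memory storage are all assumptions, and actually delivering the code (through an SMS gateway or by post) is left out.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 600  # assumption: a code stays valid for ten minutes


class OneTimeCodes:
    """Hypothetical in-memory store of pending verification codes."""

    def __init__(self, ttl=CODE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._pending = {}  # phone number -> (code, expiry timestamp)

    def issue(self, phone: str) -> str:
        # Draw six digits from a cryptographically secure generator.
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[phone] = (code, self._clock() + self._ttl)
        # A real service would send the code out of band instead of
        # returning it to the caller.
        return code

    def verify(self, phone: str, submitted: str) -> bool:
        # pop() makes every code single-use, whether the check passes or not.
        entry = self._pending.pop(phone, None)
        if entry is None:
            return False
        code, expires_at = entry
        if self._clock() > expires_at:
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        return hmac.compare_digest(code.encode(), submitted.encode())
```

The check itself is cheap; what makes the scheme costly for an attacker is the phone number or postal address behind it, which is exactly the property described above.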

The analysis published by Websense is a must-read for everyone interested in security: you can see in detail - including logs, screenshots, network dumps - how they track an attack against Google's captcha. The attackers combined multiple methods: they created a well-designed and professional system that achieves a success rate of about 20% using algorithms, database building and human labor.

The ball is rolling, and fake accounts are being created by the thousands even now. And how will Google, which is methodically hiring the greatest minds of the world, respond? The answer will probably be among the most exciting security ideas of the spring.

A silent explosion


Attila Kiss - marketing manager

Do you know which data source grows at the fastest rate for today's companies? Video files? Maybe for YouTube, but not for banks, telecommunication companies, or others. The answer is log files. Right, these short text messages can consume terabytes of storage, because every single "thought" of every computer-like device generates a log entry. And computers are becoming faster, there are more of them by the day, and for some reason, these logs are becoming increasingly important for organizations. IT is experiencing a huge explosion, but it is an underground boom that no one has noticed yet except for some insiders, even though it will rock the foundations of the entire industry.

Another question: Which IT security system will companies spend the most money on in 2008? Antivirus solutions? Firewalls? Oh, no. None of the trendy, over-marketed stuff will be as important for companies as cleaning up their logging infrastructure. According to an Ernst & Young forecast for this year, the greatest risk for companies - even ahead of the possibility of a global financial crisis - is that their IT system fails to comply with regulatory requirements. And regulations aim at authenticity, auditability, and traceability: these can be achieved by proper log management.

There are developers who have already recognized this trend: IBM is already on the field, HP is busily preparing, and there are the specialists: ArcSight, LogLogic and LogRhythm. But they are all after the big companies, for whom logging was already an important issue. Where are the developers who address the masses, who make efficient tools affordable even to medium-sized companies? Why is Cisco missing from the market? They should be the leaders, but their MARS device reminds us of the past. And what are Microsoft, Symantec, CA, or McAfee waiting for? Where are the developers who will satisfy the currently small, but quickly growing demand?

Well, maybe in Hungary. It seems that at last there is a field of industry where we can take the lead. Several promising developments have matured in this country, and - as we bragged about in an earlier blogpost - a significant contribution to the new international syslog standard came from Budapest. Why Hungary? If I want to be sentimental, the most creative minds of the world live here. If I want to be rational, it just turned out this way. The conditions were right, and Lady Luck gave a hand.

At first sight, logging infrastructure might seem simple, and log management trivial. This might have been true in the past, but nowadays it is undeniably a process of strategic importance, and not only because of the standards or regulations. Information is power, and you cannot guarantee the security of a large IT system without logs. The idea is simple: Collect the logs in a central place, preferably using an encrypted channel. Get proper filtering and archiving. Finally, add some intelligence and analyzing capabilities, and you will know what is happening on your network.
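
As a rough illustration of the "filtering" and "intelligence" steps, here is a minimal sketch that works on lines already collected in one place. It relies only on the classic BSD syslog convention that a message starts with `<PRI>`, where the priority equals facility * 8 + severity. The function names, the severity threshold and the sample lines are assumptions made for the example; transport over an encrypted channel and archiving are not shown.

```python
import re
from collections import Counter

# "<PRI>" prefix of a classic syslog line, followed by the rest of the message.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$")

SEVERITY_NAMES = ["emerg", "alert", "crit", "err",
                  "warning", "notice", "info", "debug"]


def parse_line(line):
    """Split a raw line into facility, severity and message, or return None."""
    match = PRI_PATTERN.match(line.strip())
    if match is None:
        return None
    priority = int(match.group(1))
    return {"facility": priority // 8,
            "severity": priority % 8,
            "message": match.group(2)}


def filter_records(lines, max_severity=4):
    """Keep parseable records at 'warning' level or more severe
    (lower numbers mean higher severity)."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None and record["severity"] <= max_severity:
            records.append(record)
    return records


def count_by_severity(records):
    """A very small piece of 'intelligence': how many records per level."""
    return Counter(SEVERITY_NAMES[r["severity"]] for r in records)


# Made-up sample input for illustration.
SAMPLE_LINES = [
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8",
    "<190>Mar 11 10:00:01 gateway sshd: session opened for user demo",
    "<13>Mar 11 10:00:02 gateway cron: job started",
    "line without a priority prefix",
]

important = filter_records(SAMPLE_LINES)
print(count_by_severity(important))
```

Real deployments add much more (reliable transport, time stamps, correlation across hosts), but even this small filter shows why a central, uniformly formatted log stream is the precondition for any analysis.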

Thoughts on botnets


Péter Höltzl - IT security advisor

"Here is the new botnet that dwarfs even Storm" - wrote the press not so long ago. Let's see what botnets really are!

A "botnet" (en.wikipedia.org/wiki/Botnet) is a network of zombie computers infected with viruses or worms that its controllers use for malicious purposes. There are billions of infected computers, and the size of botnets can reach tens of millions of machines. The "owners" of the network use the devastating power of the botnet for attacks and other misdeeds: usually spamming, cracking e-mail databases, performing distributed denial of service (DDoS) attacks - or blackmailing with the threat of a DDoS. Of course, the zombie machines are not performing attacks all the time: they try to remain invisible, even hiding for weeks or months without any sign of activity, and they also work on infecting other computers. Notice the similarity with well-intentioned distributed computing projects that build supercomputer capacity using the idle processing time of millions of contributing computers. The best-known such project is SETI@home, which is looking for messages from alien civilizations in the radio waves received from space.

Botnets have evolved remarkably during the last few years, which makes defense against them very difficult.

Earlier, simple ICMP messages were used to control botnets (ICMP is the control protocol of TCP/IP networks; its "echo request" message, for example, is used to check whether a particular computer is online at the moment). Later, IRC channels and other instant-messaging systems (like MSN and ICQ) were used to control the army of zombies. Today botnets communicate on a peer-to-peer basis, the same way the well-known file sharing networks do. The new technologies make it difficult to block botnets, because they do not have a single command center anymore. Unfortunately, the "services" of botnets are commercially available over the web, and they do have customers. Your mailbox is the evidence for that.

But why is it bad for us?

Spam sent from botnets is annoying for everyone, but attacks that paralyze companies seemingly do not affect us. (Some even rejoice at such news.) But contrary to common belief, we are all victims. Spam and phishing mail fill our mailboxes every day, and it takes effort, time, and tools to make them go away - which essentially means cost. But they fill up the logs on the servers as well, generating false or real incidents in our network intrusion detection systems (nIDS). About 90% of the e-mails transmitted through the Internet are spam, meaning that companies often have to maintain ten times the capacity that would be required in reality.
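
The "ten times" figure follows directly from the 90% share: if only one message in ten is legitimate, the mail system has to handle ten messages for every one that matters. A minimal sketch of the arithmetic, with a hypothetical function name and assuming that capacity scales linearly with message count:

```python
def capacity_multiplier(spam_fraction: float) -> float:
    """How many times more capacity is needed than for legitimate mail alone,
    assuming capacity grows linearly with the number of messages."""
    if not 0.0 <= spam_fraction < 1.0:
        raise ValueError("spam_fraction must be in the range [0, 1)")
    return 1.0 / (1.0 - spam_fraction)


for share in (0.5, 0.8, 0.9):
    print(f"{share:.0%} spam -> about {capacity_multiplier(share):.0f}x capacity")
```

The relationship is far from linear in the spam share itself: going from 80% to 90% spam doubles the required capacity, from roughly five to roughly ten times.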

If a worm infiltrates our infrastructure and starts spamming from our network, we can end up on an e-mail blacklist (a so-called RBL), which can result in serious losses.

But the real danger of botnets is that anyone can become a victim, even if you take every measure to protect your network. In contrast with traditional viruses, botnets do the most harm to those they are used against, and not to the infected computers. A well-executed DDoS attack can block IT systems for hours, even days, causing great losses not only to the affected organization, but indirectly to the entire economy. The global economic system is running at full speed and has become extremely volatile: the outage of a few elements can have catastrophic results, and its effects can quickly spread to other regions and fields of industry.

In the not too distant past, the first comprehensive attack against a country was carried out, which we can regard as a dress rehearsal, or even a demonstration of power. On January 5, 2008, the IT systems of about 3,500 Belarusian companies collapsed at the same time as the result of a coordinated DDoS attack. Authorities were not able to identify the attackers, although the networks of certain Russian Internet Service Providers were found to be involved.

The question is, is there any protection against such attacks?

From a legal perspective, there is not much to do. Botnets are like the Internet: they do not respect country borders. Law and politics do.

There is nothing the law can do about a botnet controlled from a remote country, and even if it could, it would only be a remedy after the attack. The real solution would be to develop software to higher quality standards, but this is not in the interest of most software companies, because the market demand for it is low (high-quality software here means properly designed and tested applications, and these take time and money to develop).
Just as countries around the world are taking legal steps against the monopoly of certain software developers, they could also demand more secure and better tested products. A good example of such a measure is making ABS (Anti-lock Braking Systems) mandatory in cars sold in the EU.

We can use preventive methods and strict firewall policies that delay the catastrophe, but there will always be covert channels, and you simply have to allow some protocols when running a company. Only a pair of pliers or pulling the power plug from the wall guarantees one hundred percent protection.

An interesting idea is to create "good" worms that use the same vulnerabilities that botnets do, infect the same computers, and remove the viruses and worms. Obviously, this could not completely eliminate botnets, but shrinking the size of a botnet below a critical level would be sufficient. While technically sound, this approach raises many legal and ethical issues. (And I would be interested to see how long it would take for a "bad botnet that kills the good botnet" to appear.)