Viruses, Worms and Trojans

Prabhaker Mateti

Abstract: This lecture discusses virus related terminology, the structure of a typical virus, and how virus removal programs work.

Slides:  virusPM.pptx  •  Stuxnet 2011

Table of Contents

  1. Educational Objectives
  2. Viruses, Worms and Trojans
    1. Definitions
    2. Virus Varieties
      1. Stealth Virus
      2. Macro Viruses
      3. Unix/Linux Viruses
    3. Spreading Malware via the Internet
    4. Structure of Viruses
    5. Virus Detection
  3. Lab Experiment
  4. Acknowledgements
  5. References

Educational Objectives

  1. Understand the technique of infection.
  2. Learn virus removal techniques.
  3. Be able to distinguish between viruses, worms and trojans.
  4. Understand the modern delivery of viruses.

Viruses, Worms and Trojans

Unix.  The world's first computer virus.
Title of Chapter 1 of The Unix Haters Handbook, ISBN: 1-56884-203-1

The above is indeed the title of a chapter, and the book was in fact written by serious computer scientists.  Nevertheless, we must treat the suggestion that Unix is a virus as an attempt at humor.  Equally unhelpful are the news media, which use the term virus for any piece of malicious software. The academic world uses the term "malware" for these.  Many computer security experts have given rigorous definitions, but these do not match typical usage, even by other security experts.  Thus, we must settle for practical "definitions" of malicious software.

Definitions

Virus Varieties

Stealth Virus

A stealth virus has code in it that seeks to conceal itself from discovery or defends itself against attempts to analyze or remove it.  The stealth virus adds itself to a file or boot sector but, when you examine the infected object, it appears normal and unchanged. The stealth virus performs this trickery by staying in memory after it is executed. From there, it monitors and intercepts system calls. When the system seeks to open an infected file, the stealth virus presents the uninfected version, thus hiding itself.

Macro Viruses

Macro languages are (often) equal in power to ordinary programming languages such as C.  A program written in a macro language is interpreted by the application.  Macro languages are conceptually no different from so-called scripting languages.  GNU Emacs uses Lisp as its macro language, and most Microsoft Office applications use Visual Basic for Applications (VBA). The typical use of a macro in applications, such as MS Word, is to extend the features of the application. Some of these macros, known as auto-execute macros, are executed in response to some event, such as opening a file, closing a file, starting an application, or even pressing a certain key.  A macro virus is a piece of self-replicating code inserted into an auto-execute macro. Once such a macro is running, it can copy itself to other documents, delete files, etc.  Another type of hazardous macro is one named for an existing command of the application.  For example, if a macro named FileSave exists in the "normal.dot" template of MS Word, that macro is executed whenever you choose the Save command on the File menu. Unfortunately, there is often no way to disable such features.
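The two hazards described above (auto-execute macros, and macros named after built-in commands) suggest a simple screening step once the macro names in a template are known. The following is a minimal sketch, not a real scanner: the function classify_macros and both name lists are assumptions made for illustration, and the lists are far from complete.

```python
# Illustrative name lists (assumptions for this sketch, not a complete inventory).
AUTO_EXECUTE = {"autoexec", "autoopen", "autoclose", "autonew", "autoexit"}
BUILTIN_COMMANDS = {"filesave", "filesaveas", "fileopen", "fileprint"}


def classify_macros(macro_names):
    """Split macro names into (auto-execute, command-shadowing) sorted lists.

    Names that fall in neither category are ignored.
    """
    auto, shadowing = [], []
    for name in macro_names:
        key = name.lower()  # macro names are compared case-insensitively here
        if key in AUTO_EXECUTE:
            auto.append(name)
        elif key in BUILTIN_COMMANDS:
            shadowing.append(name)
    return sorted(auto), sorted(shadowing)


# Hypothetical list of macro names found in a template.
found = classify_macros(["AutoOpen", "FileSave", "InsertLogo"])
print(found)
```

A macro flagged this way is not necessarily malicious; the sketch only marks the names that deserve a closer look.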

In May 2000, LOVELETTER, a Visual Basic Script worm that spread through the Outlook mail program, propagated widely.

Unix/Linux Viruses

The most famous security incident of recent times was the Internet Worm incident of 1988, which began from a Unix system.  Even so, Unix systems were long considered immune to viruses.  They are not: several Linux viruses have been discovered. The Staog virus first appeared in 1996 and was written in assembly language by the VLAD virus writing group, the same group responsible for creating the first Windows 95 virus, called Boza.

Like the Boza virus, the Staog virus is a proof-of-concept virus to demonstrate the potential of Linux virus writing without actually causing any real damage. Still, with the Staog assembly language source code floating around the Internet, other virus writers are likely to study and modify the code to create new strains of Linux viruses in the future.

The second known Linux virus is called the Bliss virus. Unlike the Staog virus, the Bliss virus can not only spread in the wild, but also possesses a potentially dangerous payload that could wipe out data.

While neither virus is a serious threat to Linux systems, Linux and other Unix systems will not remain virus-free.  Fortunately, Linux virus writing is more difficult than macro virus writing for Windows, so the greatest virus threat still remains with Windows.  [July 2000, http://www.boardwatch.com/mag/2000/jul/bwm142pg2.html]

Spreading Malware via the Internet

Whereas a Trojan horse is delivered pre-built, a virus infects.  In the past, such malicious programs arrived via tapes and disks, and the spread of a virus around the world took many months.  Antivirus companies had time to identify a new viral strain and create cleaning procedures.  Today, Trojan horses and viruses are deliverable over the network as E-mail, Java applets, ActiveX controls, JavaScripted pages, CGI-BIN scripts, or self-extracting packages.

Integrated mail systems such as Microsoft Outlook make it very simple to send anyone not only a quick note edited within a limited text editor but also previously composed documents of arbitrary complexity, and to work with objects received via standards such as MIME. They also support application programming interfaces (such as MAPI) that allow programs to send and process mail automatically. As of July 2000, well over 500 million E-mail messages were delivered daily.
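Because MIME lets a message carry arbitrary objects, a mail filter can at least look at what is attached before a user opens it. The sketch below uses Python's standard email package to list attachments whose file names end in an extension commonly associated with executable content. The function risky_attachments, the extension list, and the sample message are all assumptions made for illustration; a real filter would need much more than a file name check.

```python
from email import message_from_string

# Illustrative extension list (an assumption, not an authoritative blocklist).
RISKY_EXTENSIONS = (".exe", ".vbs", ".js", ".scr", ".bat")


def risky_attachments(raw_message):
    """Return the file names of MIME parts that end in a risky extension."""
    msg = message_from_string(raw_message)
    flagged = []
    for part in msg.walk():            # visits the message and every sub-part
        filename = part.get_filename()  # None when the part has no file name
        if filename and filename.lower().endswith(RISKY_EXTENSIONS):
            flagged.append(filename)
    return flagged


# A made-up two-part message: a text note plus one attachment.
SAMPLE = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "kindly check the attached note\n"
    "--XYZ\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="NOTE.TXT.vbs"\n'
    "\n"
    "placeholder body\n"
    "--XYZ--\n"
)

print(risky_attachments(SAMPLE))
```

Note the double extension in the sample: a name that looks like a text file to a casual reader still ends in a script extension, which is why the check looks only at the final suffix.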

Mobile-program systems are becoming more and more widespread.  The most widely-hyped examples today are Java and ActiveX.  This technology became popular with Web servers and browsers, but it is now integrated into mail systems as well (e.g., Java into Lotus Notes, and ActiveX into Outlook). Both Java and ActiveX have been found to have security bugs.

Structure of Viruses

Here is a simple structure of a virus.  In the infected binary, at a known byte location in the file, a virus inserts a signature byte used to determine if a potential carrier program has been previously infected.

V()
{
  infectExecutable();
  if (triggered()) {
    doDamage();
  }
  jump to main of infected program;
}

void infectExecutable()
{
 file = choose an uninfected executable file;
 prepend V to file;
}

void doDamage() {
   ...
}

int triggered()
{
  return (some test? 1 : 0);
}

The above virus makes the infected file longer than it was, which makes the infection easy to spot.  There are many techniques for infecting a file while leaving its length, and even its checksum, unchanged.  For example, executable files often contain long sequences of zero bytes, which the virus can overwrite with its own code and regenerate before the host program runs.  It is also possible to compress the original executable code, as typical Zip programs do, uncompress it before execution, and pad the file with bytes chosen so that the checksum comes out to what it was.
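From the defender's side, the point of the paragraph above is that file length and a simple checksum are weak evidence that a file is unmodified. The following sketch illustrates this with harmless data: two byte strings of equal length and equal additive checksum that a cryptographic hash still tells apart. The function additive_checksum and the sample bytes are assumptions chosen for illustration; SHA-256 is used here only as one example of a stronger check.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Sum of all bytes modulo 2**16: a deliberately weak checksum."""
    return sum(data) % 65536


# Stand-ins for an executable before and after tampering.
original = b"\x10\x20\x30\x40" + bytes(8)
# The same byte values in a different order, so length and sum are unchanged.
altered = b"\x40\x30\x20\x10" + bytes(8)

same_length = len(original) == len(altered)
same_checksum = additive_checksum(original) == additive_checksum(altered)
same_digest = hashlib.sha256(original).digest() == hashlib.sha256(altered).digest()

print(same_length, same_checksum, same_digest)
```

The length and additive checksum agree while the SHA-256 digests differ, which is why integrity checkers prefer cryptographic hashes over simple sums.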

Virus Detection

Known viruses are by far the most common security problem on modern computer systems. Several web sites maintain complete lists of known viruses.  There are thousands.  Visit, e.g., www.cai.com/virusinfo/encyclopedia/.  In the month of July 2000, there were 200+ "PC Viruses in the Wild" (www.wildlist.org).  Virus detection programs analyze a suspect program for the presence of known viruses.
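At its simplest, checking a suspect program for known viruses means searching its bytes for patterns taken from previously analyzed samples. Here is a minimal sketch of that idea. The signature names and byte patterns are invented placeholders, not real virus signatures, and the function scan is an assumption for illustration; production scanners use far more efficient multi-pattern matching.

```python
# Hypothetical signature database: name -> byte pattern (placeholders only).
SIGNATURES = {
    "Demo.A": bytes.fromhex("deadbeef"),
    "Demo.B": b"EXAMPLE-PATTERN",
}


def scan(data: bytes, signatures=SIGNATURES):
    """Return sorted (name, offset) pairs for each signature found in data.

    Only the first occurrence of each pattern is reported.
    """
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # -1 when the pattern is absent
        if offset != -1:
            hits.append((name, offset))
    return sorted(hits)


suspect = b"\x00\x01\xde\xad\xbe\xef\x02"
print(scan(suspect))
```

A clean result from such a scan only means that none of the listed patterns occurs; it says nothing about viruses whose signatures are not in the database.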

Fred Cohen has proven mathematically that perfect detection of unknown viruses is impossible: no program can look at other programs and say either "a virus is present" or "no virus is present", and always be correct. But, in the real world, most new viruses are sufficiently like old viruses that the same sort of scanning that finds known viruses also finds the new ones. And there are a large number of heuristic tricks that anti-virus programs use to detect new viruses, based either on how they look, or what they do. These heuristics are only sometimes successful, but since brand-new viruses are comparatively rare, they are sufficient to the purpose.
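The flavor of the impossibility argument can be conveyed with a toy model. This is an informal illustration, not Cohen's formal proof. Assume a "program" is a Python callable that returns True if it behaves virally, and a "detector" is a function that predicts that behavior without running the program. For any such detector we can build a program that consults the detector about itself and then does the opposite. The names make_contrary and detector_is_wrong are assumptions for this sketch.

```python
def make_contrary(detector):
    """Build a program that does the opposite of what the detector predicts."""
    def contrary():
        # Behave virally (return True) exactly when the detector says we will not.
        return not detector(contrary)
    return contrary


def detector_is_wrong(detector):
    """Check whether the detector misjudges its own contrary program."""
    program = make_contrary(detector)
    predicted = detector(program)  # what the detector claims
    actual = program()             # what the program really does
    return predicted != actual


# Two trivial detectors: one flags everything, one flags nothing.
print(detector_is_wrong(lambda program: True))
print(detector_is_wrong(lambda program: False))
```

Whatever deterministic prediction the detector makes about the contrary program, the program's actual behavior is the negation of it, so no detector in this toy model is always correct.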

Virus scanners are sometimes classified by their "generation."  First-generation scanners use a previously obtained virus signature, a bit pattern, to detect a known virus. They also record and check the length of all executables. Second-generation scanners examine executables with heuristic rules, looking, e.g., for fragments of code associated with a typical virus. They also do integrity checking by calculating a checksum of a program and storing the encrypted checksum somewhere else. Third-generation scanners use a memory-resident program to monitor the execution behavior of programs and identify a virus by the types of action it takes. Fourth-generation scanners combine all the previous approaches and include access control capabilities.
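The length recording and integrity checking mentioned above can be sketched in a few lines. In this sketch a keyed hash (HMAC-SHA-256 from Python's standard library) stands in for the "encrypted checksum", so that someone who alters a file cannot recompute a matching value without the key. The functions fingerprint, build_baseline and changed_files, the key, and the in-memory file contents are all assumptions for illustration; a real checker would read files from disk and keep the baseline and key somewhere the monitored system cannot modify.

```python
import hashlib
import hmac


def fingerprint(data: bytes, key: bytes):
    """Return (length, keyed checksum) for one file's contents."""
    return len(data), hmac.new(key, data, hashlib.sha256).hexdigest()


def build_baseline(files, key):
    """Record a fingerprint for every file; files maps name -> contents."""
    return {name: fingerprint(data, key) for name, data in files.items()}


def changed_files(baseline, files, key):
    """Return sorted names whose fingerprint differs from, or is missing in, the baseline."""
    return sorted(
        name
        for name, data in files.items()
        if baseline.get(name) != fingerprint(data, key)
    )


KEY = b"demo-key-kept-offline"  # hypothetical secret
before = {"ls": b"original ls bytes", "cat": b"original cat bytes"}
after = {"ls": b"original ls bytes", "cat": b"modified cat bytes"}

baseline = build_baseline(before, KEY)
print(changed_files(baseline, after, KEY))
```

Such a checker reports that a file changed, not why; legitimate upgrades also alter executables, so the baseline has to be rebuilt after trusted changes.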

It is very educational to study the details of a scanner.  The paper by Sandeep Kumar and Gene Spafford, "A Generic Virus Scanner in C++," Proceedings of the 8th Computer Security Applications Conference, IEEE Press, Piscataway, NJ; pp. 210-219, 2-4 Dec 1992 [Local copy .pdf], is Required Reading.

Lab Experiment

TBD

Acknowledgements

These lecture materials are gleaned from many sources.  All are presented after careful reading.  In some cases, I may have neglected proper attribution. I assure the reader it is not because I claim authorship.  Indeed, there is hardly anything new in the lectures that I have contributed.  Suggestions for improvement are always welcome.

References

  1. Simson Garfinkel, Gene Spafford, Practical Unix and Internet Security, 3rd edition (2003), O'Reilly & Associates; ISBN: 0596003234.  Chapter 11. Protecting Against Programmed Threats.  Required Reading.
  2. Sandeep Kumar, and Gene Spafford, "A Generic Virus Scanner in C++," Proceedings of the 8th Computer Security Applications Conference;  IEEE Press, Piscataway, NJ; pp. 210-219, 2-4 Dec 1992. [Local copy .pdf]  Required Reading.
  3. Ozgun Erdogan and Pei Cao, Hash-AV: fast virus signature scanning by cache-resident filters, International Journal of Security and Networks, Volume 2, Number 1-2 / 2007   Pages:  50 - 59.  Recommended Reading.
  4. Anthony Cheuk Tung Lai, "Comprehensive Blended Malware Threat Dissection Analyze Fake Anti-Virus Software and PDF Payloads", 2010, http://www.sans.org/reading_room/    Recommended Reading.
    Bryan Barber, "Cheese Worm: Pros and Cons of a Friendly Worm", 2003, http://www.sans.org/reading_room/    Recommended Reading.
  5. Schaffer, G.P., Worms and viruses and botnets, oh my! Rational responses to emerging Internet threats, Security & Privacy, IEEE, May-June 2006, Volume: 4,  Issue: 3, pp. 52-58.  Recommended Reading.
  6. Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Manasi Bhattacharyya, and Salvatore J. Stolfo, "MEF: Malicious Email Filter A UNIX Mail Filter that Detects Malicious Windows Executables," Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, June 25-30, 2001, Boston, Massachusetts, USA; http://www.usenix.org/publications/library/proceedings/usenix01/freenix01/schultz/schultz_html/index.html Reference.
  7. Virus Bulletin, www.virusbtn.com/VirusInformation/  Technical journal on developments in the field of computer viruses and anti-virus products,  Reference.
  8. http://vxheavens.com/ Their slogan: "Viruses don't harm, ignorance does!" Collection of viruses source code.
  9. Steve R. White, Morton Swimmer, Edward J. Pring, William C. Arnold, David M. Chess, John F. Morar, "Anatomy of a Commercial-Grade Immune System," 1999, www.research.ibm.com/antivirus/SciPapers/White/Anatomy/anatomy.html   The site (www.research.ibm.com/antivirus/) has many other excellent articles.  Recommended Reading.
 
Copyright 2011 pmateti@wright.edu