I'll explain here how to use DEBUG to disassemble
and step through a few 8086 Assembly programs, and also comment on
how complex the code produced by "C compilers" can become compared with that of
small .COM programs, whose programmers often write the Machine
code themselves in Assembly Language.
If you have no experience whatsoever with DEBUG, I suggest that you first work through my Guide to DEBUG (make sure to work on the DEBUG program listed under the ENTER [e] command as it concerns displaying all the Extended ASCII characters on screen), and then study the Detailed Step-by-step Analysis of the EICAR Program to gain experience in using more DEBUG commands before finally coming back to this page.
Since our first Assembly program is only 69 bytes long, you can simply "Copy and Paste" the following Enter Data commands into DEBUG:
e 100 b8 00 02 ba 00 00 b9 16 00 50 52 51 e8 13 00 59
e 110 5a 58 cd 21 42 50 52 51 e8 15 00 59 5a 58 e2 e9
e 120 eb 1e b8 00 02 ba 2e 00 b9 03 00 cd 21 e2 fc c3
e 130 e8 ef ff ba 3b 01 b4 09 cd 21 c3 0d 0a 24 90 90
e 140 b8 00 4c cd 21
Then type these commands at the DEBUG prompts to create the program file called
disp22.com (it will be created in the same folder DEBUG was started from):
-n disp22.com
-rcx
:45 [ 69 bytes in decimal ]
-w
-q
This program simply displays the bytes 00h through 15h (a total of 22 characters), each on a separate line with three dots on either side of the character itself.
Note: Some of these bytes will move the cursor position or perform some other action instead of displaying a character on your screen.
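As a cross-check on the value 45 (hex) used above, the byte count can also be worked out outside DEBUG. The following is only a sketch in Python (not part of the original DEBUG procedure); the function name build_image is my own invention for illustration. It rebuilds the program image from the Enter Data lines and reports its length in decimal and hex:

```python
# Sketch: rebuild the program image from the DEBUG "e" (Enter) lines
# and count the bytes. build_image is a hypothetical helper name.
ENTER_LINES = """\
e 100 b8 00 02 ba 00 00 b9 16 00 50 52 51 e8 13 00 59
e 110 5a 58 cd 21 42 50 52 51 e8 15 00 59 5a 58 e2 e9
e 120 eb 1e b8 00 02 ba 2e 00 b9 03 00 cd 21 e2 fc c3
e 130 e8 ef ff ba 3b 01 b4 09 cd 21 c3 0d 0a 24 90 90
e 140 b8 00 4c cd 21
"""

def build_image(text, origin=0x100):
    """Return the bytes described by 'e ADDR b1 b2 ...' lines.

    origin is the load offset of a .COM program (100h).
    """
    image = bytearray()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0].lower() != "e":
            continue  # ignore anything that is not an Enter command
        offset = int(parts[1], 16) - origin
        data = bytes(int(p, 16) for p in parts[2:])
        if offset > len(image):
            # pad with zeros if a line starts past the current end
            image.extend(b"\x00" * (offset - len(image)))
        image[offset:offset + len(data)] = data
    return bytes(image)

image = build_image(ENTER_LINES)
print(len(image), format(len(image), "X"))  # prints: 69 45
```

Since a .COM file is just the raw image that DOS loads at offset 100h, writing these bytes to a file opened in binary mode should give the same file that DEBUG's Write command creates, though I'd still recommend doing it in DEBUG for the practice.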
PROJECT: Once you've run the program to see what the output looks like, use DEBUG's U (unassemble) command to disassemble it into its Assembly Language code for reference. You can make a one-pass disassembly listing quite easily by creating and saving a text file with the following lines (and then following my instructions below):
u 100 13a
d 13b 13d
u 13e 144
q
( Make sure you press the ENTER key once or even twice after typing the final 'q'; if you don't, DEBUG will 'lock up' waiting for a RETURN you'll never be able to enter.
NOTE: I saved you the trouble of having to figure out later that the bytes from 13b to 13d are DATA, not instructions. Commercial disassemblers often make many passes through code trying to determine the difference between Code and Data elements. )
You can name this file anything you want, but I'll use the name, disp22.dsf, here. Run this Debug script file in the same folder as disp22.com from a command line prompt like this:
C:\temp>debug < disp22.dsf > disp22.asm
which redirects the normal DEBUG screen output into the file disp22.asm. ( Unfortunately, and I have no idea why, the file this creates has many spaces at the end of each line, and sometimes you need to add more RETURNS after saving it in Notepad.) Clean up the file as best you can, then try separating the Subroutines (sections of code that are pointed to by CALL instructions) from the rest of the code and data.
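If you'd rather not trim the trailing spaces by hand, a few lines of script can do it. This is a minimal sketch in Python, not something the procedure above requires; the names clean_listing and clean_file are hypothetical, and reading the file as code page 437 is an assumption about how the redirected DEBUG output was saved:

```python
def clean_listing(text):
    """Strip trailing blanks from every line and end each with one newline."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())

def clean_file(src, dst):
    """Read a redirected DEBUG listing and write a cleaned copy.

    cp437 is assumed here because it is the usual DOS code page
    and can decode any byte value.
    """
    with open(src, encoding="cp437") as f:
        text = f.read()
    with open(dst, "w", encoding="cp437") as f:
        f.write(clean_listing(text))

# Example (file names as used on this page; the output name is arbitrary):
# clean_file("disp22.asm", "disp22_clean.asm")
```

This only removes the padding at the ends of lines; separating the Subroutines and adding comments is still up to you.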
Open the program disp22.com in DEBUG, and try stepping through it using the T (Trace) and P (Proceed) commands (while making reference to your Assembly listing) until you understand how it operates. [ CAUTION: Always use the Proceed command to execute any 'INT' instruction, or you'll find yourself trying to trace through a huge section of the computer's DOS or BIOS code instead of just this little program! ] You should be able to place some comments in your Assembly listing describing how various instructions affect the program and/or its output, or labeling the names of the BIOS or DOS INTerrupt(s) that are used to display data on the screen. You can send me questions or comments about the code using this online Form.
If you're interested in doing this, but have problems along the way, I'll try to help you without giving away too much of how to do it.
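One small hint for matching CALL instructions to their Subroutines: the two bytes after an E8 opcode are not an address, but a signed 16-bit displacement (low byte first) that the CPU adds to the offset of the next instruction. The sketch below (Python, with a helper name of my own choosing) works that arithmetic for two E8 sequences taken from the Enter Data lines, assuming each one sits at the start of an instruction, which your U listing should confirm:

```python
def near_call_target(addr, disp_lo, disp_hi):
    """Target offset of a 3-byte near CALL (opcode E8) located at addr."""
    disp = disp_lo | (disp_hi << 8)   # little-endian 16-bit displacement
    next_ip = addr + 3                # IP already points past the CALL
    return (next_ip + disp) & 0xFFFF  # wraps around inside the 64 KB segment

# e8 13 00 at offset 010C, and e8 ef ff at offset 0130
print(format(near_call_target(0x010C, 0x13, 0x00), "04X"))  # 0122
print(format(near_call_target(0x0130, 0xEF, 0xFF), "04X"))  # 0122
```

The second example shows why a displacement such as FFEF means "backwards": after wrapping at 16 bits it is the same as subtracting 11h.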
[ Screenshot: DISP32.COM running under a .PIF file set to display in a DOS-Window of 43 lines per page using a 7 x 12 Bitmap Font. ]
After running Chartype.exe once or twice on your computer, open the program in NOTEPAD and under the 'Edit' menu, select 'WordWrap' before proceeding... then press the 'CTRL + END' keys to go to the end of the file where you'll see a lot of the program's text. Note that there's an extra line of text near the bottom that has nothing to do with what you saw in the program's output: "COMPAQ print scanf : floating point formats not linked" and moving up to the beginning of this text section, you'll also find a phrase that's purely for identifying the type of compiler/linker I used, "Borland C++ - Copyright 1991 Borland Intl."
IF you are running Windows 9x/ME (the NOTEPAD in 2000/XP doesn't have this problem): When exiting NOTEPAD, MAKE SURE YOU click on the 'NO' button in answer to the question: Do you want to save the changes? For some weird reason, if you save the file (even though you simply Word-wrapped it), Notepad will convert every single 00-byte of the binary file to a space character (20h), making the executable completely useless! I suggest that all Windows 9x/ME users obtain TheGUN.exe from my FreeTools page (to replace NOTEPAD); you'll be able to open files of any size with it, and never have to worry about this word-wrap nuisance!
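Another way to look at the text inside an executable, without any risk of an editor saving changes to it, is to read the file in binary mode and pull out the runs of printable characters. Here is a rough sketch in Python; the function name ascii_strings and the minimum length of 4 are my own arbitrary choices, and the sample bytes are made up around the compiler phrase quoted above:

```python
import re

def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII (20h-7Eh) at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Made-up sample: the phrase surrounded by 00-bytes, plus a run too short to keep
sample = b"\x00\x01Borland C++ - Copyright 1991 Borland Intl.\x00\x00ab\x00"
print(ascii_strings(sample))

# With the real program (the file is only read, never written):
# with open("Chartype.exe", "rb") as f:
#     for s in ascii_strings(f.read()):
#         print(s)
```

Because the file is opened read-only in binary mode, none of its 00-bytes can be altered this way.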
Although it's possible to open Chartype.exe in MS-DEBUG and begin stepping through the code with the Proceed or Trace commands, you'll most likely become bored very quickly, since Borland's compiler added lots of extra 'housekeeping' routines concerning DOS handles and Memory allocation right at the beginning of the code. If you really want to work through the relevant parts of this program, here's a time-saving tip: You can immediately skip to the instruction beginning at CS:0291 with the command: g 291 to bypass all that Borland stuff. But even then, there are lots of lines of code that seem quite wasteful compared to an Assembly program.
Some important subroutines are found at 055D and 1A94, both of which call many other subroutines. These will put your head into a spin unless you take the time to disassemble the whole program before trying to find the very few lines of code that actually call a BIOS Video INTerrupt to display the characters on your screen! (Note: Locations 0CB9 thru 0CE8 and 0F45 thru 0F68 are all DATA locations, not CODE, even though they are found in the Code Section of the program! Anything found in the Data Section should always be DATA though.)
There are at least five different video functions used in this program (due to some 'convoluted' programming there are actually others), and they're all found in one subroutine that's 161 bytes long. Can you tell me where it's located and/or what the five explicit video functions are called?
If you have any questions about these programs
or discussions, please use my online feedback form here: Comments/Questions
for The Starman.
[ The Starman. Revised: 27 OCT 2001.]
Last Update: 27 JUN 2003.