From news.sas.ab.ca!rover.ucs.ualberta.ca!news.bc.net!vanbc.wimsey.com!news.mindlink.net!agate!howland.reston.ans.net!news.sprintlink.net!news.dgsys.com!DGS.dgsys.com!raymoon Mon Mar 27 16:41:23 1995
Path: news.sas.ab.ca!rover.ucs.ualberta.ca!news.bc.net!vanbc.wimsey.com!news.mindlink.net!agate!howland.reston.ans.net!news.sprintlink.net!news.dgsys.com!DGS.dgsys.com!raymoon
From: raymoon@dgsys.com (Raymond Moon)
Newsgroups: comp.lang.asm.x86,alt.lang.asm
Subject: x86 Assembly Language FAQ - General Part 2/2
Followup-To: poster
Date: 25 Mar 1995 23:14:12 GMT
Organization: Digital Gateway Systems
Lines: 371
Distribution: world
Expires: Thu, 20 Apr 1995 23:59:59 GMT
Message-ID: <3l2844$1j8@news.dgsys.com>
References: <3l27l1$1j8@news.dgsys.com>
Reply-To: raymoon@dgs.dgsys.com
NNTP-Posting-Host: dgs.dgsys.com
Summary: This is the FAQ for the x86 Assembly Language programmers
 for the alt.lang.asm and comp.lang.asm.x86 newsgroups.  This
 particular section of the FAQ contains x86 assembly information
 common to all assemblers.
Keywords: x86 ASM FAQ General
X-Newsreader: TIN [version 1.2 PL2]
Xref: news.sas.ab.ca comp.lang.asm.x86:6022 alt.lang.asm:4554

Archive-name: asm_x86_faq/gen_part2
Alt-lang-asm-archive-name: asm_x86_faq/gen_part2
Comp-lang-asm-x86-archive-name: asm_x86_faq/gen_part2
Posting-Frequency: monthly (21st of every month)
Last-modified: 1995/03/22

------------------------------

Subject: 16. Accessing 4 Gigs of Memory in Real Mode

Flat real mode is a popular name for a technique used to access up to
4 GB of memory while remaining in real mode.  This technique requires
an 80386 or higher processor.  The address space is not really flat;
rather, this technique allows you to treat one or more segments as
large (32-bit) segments, thereby accessing memory above 1 MB.

When the CPU accesses memory, the base address of the segment used is
not taken directly from the value currently in the segment register.
It is stored internally in a structure known as the descriptor
cache.  Changing the value of a segment register results in that
segment's entry in the descriptor cache being recalculated according
to the rules of the current mode.  In real mode, the value of the
segment register is shifted left four bits to find the base address of
the segment, and the size of the segment is always 64 KB.  In protected
mode, the value in the segment register is used to select an entry in
a descriptor table located in memory, and the base address and size
(which may be as small as 1 byte, or as large as 4 GB) from the
descriptor table are loaded into the descriptor cache.

When the processor changes modes, the contents of the processor's
internal descriptor cache are not changed.  The reason is that
changing them would result in (at the very least) the code segment
being recalculated according to the new mode's rules, most likely
causing your program to crash.  Thus the program must load the segment
registers with sensible values after the mode switch occurs.  Consider
an example where real mode code is located in segment 1000h.  If
switching modes caused an immediate recalculation of the descriptor
cache, the processor would attempt to read the descriptor table entry
selected by 1000h immediately upon switching to protected mode.  Even
if this were a valid descriptor (unlikely), it would have to have a
base address identical to real mode segment 1000h (i.e., 10000h), and
a size limit of 64 KB to prevent a probable crash.  An invalid
descriptor would cause an immediate processor exception.

Normally, aside from preventing situations like that in the above
example, there is little to be said about this feature.  After all, as
soon as you reload new values into the segment register, the
descriptor cache entry for that segment will be reset according to the
rules of the current mode.  After switching from protected mode to
real mode, however, when you load the segment registers with their new
values, the segment's base address is recalculated according to real
mode rules, but the size limit is not changed.  Once the 4 GB limit
has been set (which must be done in protected mode), it stays in place
until changed by another protected mode program, regardless of what
values are loaded in the segment register in real mode.
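
The behaviour described above can be modelled in a few lines.  This is
only a sketch, written in Python rather than assembly so that it can
be run anywhere; the class and method names are made up for
illustration and do not correspond to anything in the processor
documentation.  It shows a real mode load changing the base but
leaving a limit set in protected mode untouched.

```python
from dataclasses import dataclass

REAL_MODE_LIMIT = 0xFFFF  # 64 KB - 1, the limit a segment has after reset


@dataclass
class CachedSegment:
    """Hypothetical model of one hidden descriptor-cache entry."""
    base: int = 0
    limit: int = REAL_MODE_LIMIT

    def load_real(self, segment: int) -> None:
        # Real mode rule: only the base is recalculated; the limit is kept.
        self.base = (segment & 0xFFFF) << 4

    def load_protected(self, base: int, limit: int) -> None:
        # Protected mode rule: base and limit both come from the descriptor.
        self.base = base
        self.limit = limit

    def linear(self, offset: int) -> int:
        # An offset past the cached limit is rejected rather than wrapped.
        if offset > self.limit:
            raise ValueError("offset exceeds segment limit")
        return self.base + offset


fs = CachedSegment()
fs.load_real(0x1000)               # ordinary real mode: base 10000h, 64 KB
fs.load_protected(0, 0xFFFFFFFF)   # in protected mode: 4 GB descriptor
fs.load_real(0x0000)               # back in real mode: base 0, limit kept
print(hex(fs.linear(0x00100000)))  # the 1 MB mark is now reachable
```

A segment that never made the trip through protected mode keeps the
64 KB limit, so the same offset would be rejected by the model.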

So, the steps to using this technique are as follows:
    1.  Set up a bare bones global descriptor table, with a null
entry, and a single entry for a 4 GB segment.  The base address of
this segment is not important.
    2.  If you don't wish to define an interrupt descriptor table
(IDT), you must disable interrupts before switching to protected mode. 
You do not need a full-fledged protected mode environment for this, so
it is easiest just to disable interrupts and not worry about the IDT.
    3.  Switch to protected mode. 
    4.  Load the segment registers you wish to change with the
selector for the 4 GB segment.  I recommend using FS and/or GS for
this purpose, for reasons I'll describe below.
    5.  Return to real mode.
    6.  Re-enable interrupts.

After these steps, you can then load your segment registers with any
value you wish.  Keep in mind that the base address will be calculated
according to real mode rules.  Loading a value of 0 into a segment
register will result in a 4 GB segment beginning at physical address
0.  You can use any of the usual 32-bit registers to generate offsets
into this segment.
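
As a rough illustration of the descriptor table from step 1, the
sketch below packs the two 8-byte entries in Python.  The helper name
and the choice of access byte 92h (present, ring 0, writable data) and
flags 8h (page granularity) are my assumptions about a reasonable
minimal setup, not values taken from realmem.zip; check them against
the Intel documentation before relying on them.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte 80386 segment descriptor.

    limit is the 20-bit limit field; flags is the high nibble of byte 6
    (bit 3 = granularity, bit 2 = default operand size).
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                # limit bits 15..0
        base & 0xFFFF,                                 # base bits 15..0
        (base >> 16) & 0xFF,                           # base bits 23..16
        access,                                        # access-rights byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags, limit 19..16
        (base >> 24) & 0xFF,                           # base bits 31..24
    )


NULL_ENTRY = bytes(8)
# With page granularity the 20-bit limit FFFFFh covers the full 4 GB.
FLAT_DATA = encode_descriptor(base=0, limit=0xFFFFF, access=0x92, flags=0x8)
GDT = NULL_ENTRY + FLAT_DATA
FLAT_SELECTOR = 0x08  # index 1, GDT, requested privilege level 0

print(FLAT_DATA.hex())
```

The selector 08h is the value that would be loaded into FS or GS in
step 4; the base field is left at 0 because, as noted above, it is
recalculated by the next real mode load anyway.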

Some points to keep in mind:
    1.  Some software depends on 64 KB segment wrap-around.  While
rare, it is possible that you will encounter software that crashes if
the older segments (DS or ES) are 4 GB in size.  For that reason, I
recommend only using FS and/or GS for this purpose, as they are not
used as widely as the others.
    2.  You should never change the limit of the code segment.  The
processor uses IP (not EIP) to generate offsets into the code segment
in real mode; any code beyond the 64 KB mark would be inaccessible,
regardless of the segment size.
    3.  You should never change the limit of the stack segment.  This
is similar to the above; the processor uses SP in real mode, rather
than ESP.
    4.  Because of the necessity of switching to protected mode, this
technique will not work in a virtual 8086 mode "DOS box" from Windows,
OS/2, or any other protected mode environment.  It only works when you
start from plain, real mode DOS.  Many memory managers also run DOS in
V86 mode, and prevent the switch to protected mode.  It is possible to
use VCPI to work around this, but if you go to that length you will
probably find that you have implemented a complete protected mode
environment, and would not need to return to real mode anyway.
    5.  This technique will not work in the presence of any protected
mode software that changes segment size limits.  When that software
returns control to your real mode program, the limits will be the
values to which the protected mode code set them.  If these limits are
different from what your program used, problems can result.  At the
very least, your program will return incorrect results when accessing
data stored in extended memory.  At worst, your program will crash and
burn.

The benefits of this technique are many.  Most importantly, you can
access extended memory without resorting to slow BIOS calls or having
to implement a complete DOS extender.  If your program uses interrupts
extensively (timer interrupts for animation or sound, for example),
real mode is a better choice because protected mode handles interrupts
more slowly.  DOS itself uses this technique in HIMEM.SYS as a fast,
practical method of providing access to extended memory.

Code demonstrating this technique is available in the file
realmem.zip.  This file is available using anonymous ftp from
x2ftp.oulu.fi in the directory pub/msdos/programming/memory.

For further reading on this topic, I suggest "DOS Internals," by Geoff
Chappell.  It is published by Addison-Wesley as part of the Andrew
Schulman Programming Series.  The ISBN is 0-201-60835-9.

Contributor: Sherm Pendley, grinch@access.mountain.net
Last changed: 15 Jan 95  

------------------------------

Subject: 17. What Is Available at intel.com

To obtain a description of the files available at Intel:

    ftp ftp.intel.com
    anonymous log on
    cd pub/IAL
    get libdir.txt

Most of the files are press releases, but there are some hidden
jewels.  Search for and read the files under the Intel386, Intel486
and Pentium subdirectories.  Here is a sample of files that may be of
interest (path from IAL subdirectory is given):

Pentium/opt32.doc - Optimizations for Intel's 32-bit processors in MS
    Word format.
Pentium/pairng.txt - Instruction pairing optimization for Pentium
    processor - text format.
Pentium/p5cpui.txt - new official CPU identification scheme - text
    format.
Pentium/p5masm.mac - MASM macros for instructions new with the Pentium.
Tools_Utils_Demos/prot.txt - MASM code for entering protected mode -
    text format.
Intel486/2asm10.zip - C source for 8086-80486 16/32 bit disassembler -
    pkzip format.
 
Contributor: Raymond Moon, raymoon@dgs.dgsys.com
Last changed: 8 Jan 95

------------------------------

Subject: 18. Interrupts and Exceptions

    "(with interrupts) the processor doesn't waste its time looking
    for work - when there is something to be done, the work comes
    looking for the processor."
                                    - Peter Norton

INTERRUPTS AND EXCEPTIONS

Interrupts and exceptions both alter the program flow.  The difference
between the two is that interrupts are used to handle external events
(serial ports, keyboard) and exceptions are used to handle instruction
faults (division by zero, undefined opcode).

Interrupts are handled by the processor after it finishes the current
instruction.  If it finds a signal on its interrupt pin, it will look
up the address of the interrupt handler in the interrupt table and
pass control to that routine.  After returning from the interrupt
handler routine, it will resume program execution at the instruction
after the interrupted instruction.
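
In real mode the interrupt table is an array of 4-byte entries
starting at address 0, each holding the handler's offset followed by
its segment.  The lookup can be sketched as below; this is a Python
model over a made-up table (the handler address 0019:40F8 is purely
illustrative), not code that reads a live machine.

```python
import struct


def ivt_entry(memory: bytes, vector: int) -> tuple:
    """Return (segment, offset) of the real mode handler for `vector`."""
    # Entry for vector n sits at address n * 4: offset word, then segment word.
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    return segment, offset


# Hypothetical 1 KB table (256 vectors) in which only vector 21h is filled in.
table = bytearray(1024)
struct.pack_into("<HH", table, 0x21 * 4, 0x40F8, 0x0019)

segment, offset = ivt_entry(table, 0x21)
print(f"INT 21h -> {segment:04X}:{offset:04X}")
```

In protected mode the table is an interrupt descriptor table with
8-byte entries instead, so this layout applies to real mode only.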

Exceptions, on the other hand, are divided into three kinds: Faults,
Traps and Aborts.  Faults are detected and serviced by the processor
before the faulting instruction completes, so the instruction can be
restarted afterwards.  Traps are serviced after the instruction
causing the trap.  User defined interrupts fall into this category and
can be said to be traps; this includes the MS-DOS INT 21h software
interrupt, for example.  Aborts are used only to signal severe system
problems, when operation is no longer possible.

See the table below for information on interrupt assignments in the
Intel 386, 486 SX/DX processors, and the Pentium processor.  Type
specifies the type of exception.

---------------------------------------------------------------------
Vector number   Type        Description
---------------------------------------------------------------------
     0          Fault       Divide Error (Division by zero)
     1          Fault/Trap  Debug Interrupt (Single step)
     2          Interrupt   NMI Interrupt
     3          Trap        Breakpoint
     4          Trap        Interrupt on overflow
     5          Fault       BOUND range exceeded
     6          Fault       Invalid Opcode
     7          Fault       Device not available (1)
     8          Abort       Double fault
     9          Abort       Not used in DX models and Pentium (2)
    10          Fault       Invalid TSS
    11          Fault       Segment not present
    12          Fault       Stack exception
    13          Fault       General protection fault
    14          Fault       Page fault
    15          -           Reserved
    16          Fault       Floating point exception (3)
    17          Fault       Alignment check (4)
    18 - 31     -           Reserved on 3/486, See (5) for Pentium
    32 - 255    Interrupt   Maskable, user defined interrupts
---------------------------------------------------------------------
(1) Exception 7 is used to signal that a floating point processor is
    not present in the SX model.  Exception 7 is used by programs and
    OSes that have floating point emulation.  The DX chips can also
    be set to trap floating point instructions by setting bit 2 of
    CR0.
(2) Exception 9 is reserved in the DX models and the Pentium, and is
    only used in the 3/486 SX models to signal Coprocessor segment
    overrun.  This will cause an Abort type exception on the SX.
(3) In the SX models this exception is called 'Coprocessor error'.
(4) Alignment check is only defined on the 486 and Pentium.  Reserved
    on any other Intel processor.
(5) On Pentiums, Exception 18 is used to signal what is called a
    'Machine check exception'.

The other interrupts (32-255) are user defined.  They differ in use
from one OS to another.

For a list of MS-DOS interrupts, see 'Obtaining HELPPC' (Subject #6)
or Ralf Brown's Interrupt List (Subject #11).

Contributor: Patrik Ohman, patrik@astrakan.hgs.se
Last changed: 10 Jan 95

------------------------------

Subject: 19. ASM Books Available

Help build this section by posting your favorite assembly book to the
newsgroups.

Contributor: Raymond Moon, raymoon@dgs.dgsys.com
Last changed:  

------------------------------

Subject: 20. ASM Code Available On The Internet

The SimTel sites have a directory devoted to assembly language. 

    ftp oak.oakland.edu
    anonymous log on
    cd SimTel/msdos/asmutil

Enjoy!

Contributor: Raymond Moon, raymoon@dgs.dgsys.com
Last changed: 22 Mar 95  

------------------------------

Subject: 21. How To Commit A File

The easiest solution is to open or create the file to be committed
using Int 21h function 6Ch, extended open/create.  The BX register
contains the desired Open Mode.  One option that can be or'ed into
this register is what Microsoft calls OPEN_FLAGS_COMMIT, which has the
value 4000h.  Using this option causes DOS to commit the file after
each write.  This function has been available (documented) since DOS
4.0.
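
The register setup for such a call can be sketched as follows.  This
is a Python model of the values only, not working DOS code.  Apart
from OPEN_FLAGS_COMMIT, the constant names are my own, and the access
mode and action codes are given as I understand the documented
interface; verify them against Ralf Brown's Interrupt List before use.

```python
OPEN_ACCESS_READWRITE = 0x0002  # access-mode bits of the open mode word
OPEN_FLAGS_COMMIT = 0x4000      # commit after each write (value given above)

ACTION_OPEN_IF_EXISTS = 0x0001     # assumed name: open the file if present
ACTION_CREATE_IF_MISSING = 0x0010  # assumed name: create the file if absent


def extended_open_registers(commit: bool) -> dict:
    """Register values to load before Int 21h for an extended open/create."""
    mode = OPEN_ACCESS_READWRITE
    if commit:
        mode |= OPEN_FLAGS_COMMIT
    # DS:SI must additionally point at the ASCIIZ file name.
    return {
        "AX": 0x6C00,  # AH = 6Ch, AL = 0
        "BX": mode,    # open mode
        "CX": 0x0000,  # attributes used if the file is created (normal)
        "DX": ACTION_OPEN_IF_EXISTS | ACTION_CREATE_IF_MISSING,
    }


regs = extended_open_registers(commit=True)
print({name: f"{value:04X}h" for name, value in regs.items()})
```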

If you do not want to commit the file at each write but only when
certain conditions are met, use Int 21h function 68h, commit file.
This function has been available (documented) since DOS 3.3.

If you need to support versions of DOS before 3.3, the following
technique will flush all stored data without closing and reopening
the file.  It is the opening of the file that is time consuming.
    1.  Use Int 21h function 45h to create a duplicate file handle to
        the file to be flushed.
    2.  Close that duplicate file handle.

This technique will work all the way back to DOS 2.0.

Contributor: Raymond Moon, raymoon@dgs.dgsys.com
Last changed: 30 Jan 95  

------------------------------

Subject: 22. Using Extended Memory Manager

22.1  HOW TO USE XMS

XMS usage - short recipe:
1.  Verify that the CPU is at least a 286 (pushf; pop AX; test AX,AX;
    js error).
2.  Verify that vector 2Fh is set (DOS 3+ sets it during boot).
3.  AX=4300h, int 2Fh, verify AL=80h (means XMS is installed).
4.  AX=4310h, int 2Fh, save ES:BX as dword XmsDriverAddr.
5.  AH=8, call [XmsDriverAddr] - returns AX=largest free XMS memory
    block size in KB (0 if error).
6.  AH=9, DX=required size in KB, call [XmsDriverAddr] - allocates
    memory (returns handle in DX - save it).
7.  AH=0Bh, DS:SI->structure {
        dword size (in bytes and must be even),
        word source_handle,
        dword source_offset,
        word destination_handle,
        dword destination_offset },
    call [XmsDriverAddr] - copies memory
    (if any handle is 0, the "offset" is Real Mode segment:offset).
8.  AH=0Fh, BX=new size in KB, DX=handle, call [XmsDriverAddr] -
    changes memory block size (without losing previous data).
9.  AH=0Ah, DX=handle, call [XmsDriverAddr] - frees handle and memory.

Initially, you should perform #1-#6; then you can use #7 to move data
into or out of XMS memory, or #8 to change the XMS memory block size.
On exit, use #9 to free the allocated memory and handle.
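
The structure passed in step #7 is 16 bytes with no padding.  The
sketch below packs it in Python so the layout can be inspected; the
function names and the handle value 1234h are hypothetical, and the
far-pointer encoding (offset in the low word, segment in the high
word) is my reading of what "Real Mode segment:offset" means here.

```python
import struct


def far_pointer(segment: int, offset: int) -> int:
    """Encode segment:offset as the dword used when a handle is 0."""
    return ((segment & 0xFFFF) << 16) | (offset & 0xFFFF)


def xms_move_block(length, src_handle, src_offset, dst_handle, dst_offset):
    """Pack the 16-byte structure that DS:SI points to for AH=0Bh."""
    if length % 2:
        raise ValueError("XMS move length must be even")
    return struct.pack("<IHIHI", length, src_handle, src_offset,
                       dst_handle, dst_offset)


# Copy 4 KB from conventional memory at 2000:0100 to the start of the
# extended memory block with (hypothetical) handle 1234h.
block = xms_move_block(4096, 0, far_pointer(0x2000, 0x0100), 0x1234, 0)
print(len(block), block.hex())
```

Swapping the source and destination fields gives the copy in the other
direction, from XMS memory back to conventional memory.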

Hint: an allocated handle cannot be 0, since zero is used as the "no
handle allocated" value.

Errors for XMS calls (except AH=7 - Query A20) are signaled by AX=0.
The error code is returned in BL; a few codes you can check for are:
    80h - not implemented,
    81h - VDISK detected (and it leaves no memory for XMS),
    82h - A20 error (e.g. fail to enable address line A20),
    A0h - all allocated,
    A1h - all handles used,
    A2h - invalid handle,
    A3h/A4h - bad source handle/offset,
    A5h/A6h - bad destination handle/offset,
    A7h - bad length,
    A8h - overlap (of source and destination areas on copy),
    A9h - parity error (hardware error in memory),
    ABh - block is locked,
    00h - OK

For more info, read INT 2Fh, AH=43h in Ralf Brown's Interrupt List.

22.2  WHAT IS THE 'LINEAR BLOCK ADDRESS' RETURNED BY LOCK MEM BLOCK?

When you lock a memory block, the XMS driver arranges the memory it
governs so that the locked block forms one contiguous area in linear
address space, and returns the starting address of that area.  A
linear address is the base address of a segment plus the offset within
the segment.  In Real Mode it is segment*16+offset; in Protected Mode
the base address is kept in the LDT or GDT.  Note that the offset can
be 32-bit on a 386+.  If paging isn't enabled, linear address =
physical address.  You don't need the linear address unless you use
32-bit offsets in Real Mode or you use Protected Mode (see the
previous answer for an explanation of how you can access XMS memory).

Contributor: Jerzy Tarasiuk, JT@zfja-gate.fuw.edu.pl
Last Changed: 30 Jan 95

------------------------------

Subject: 23. Acknowledgments

I would like to acknowledge all the people who have assisted me or any
of the contributors.  Thanks to their time and effort, this FAQ is a
better product.

Kris Heidenstrom, Alan Illeman, Chabad Lubavitch, Jeff Owens, Janos
Szamosfalvi and Cedric Ware 
 



