SD32-C SYSTEM SPECIFICATION VERSION 0.84

© 2003-2004 Yury Benesh[1]

 

WARNING! NO REDISTRIBUTION! THIS IS AN UNFINISHED DOCUMENT!!!

 

Creation date is 11/Jan/2004.

Modification date is 7/Nov/2004.

 

 

CONTENTS

 

MAIN CONCEPT

FILE SYSTEM LAYER

MEMORY

MULTITASKING

DRIVERS

SYSTEM CALLS

USER ENVIRONMENT

INTERNATIONALIZATION AND LOCALIZATION

ALGORITHMS IN KERNEL

SOURCES DIRECTORY TREE

CODING STYLE

USED UNITS

TOOLS

SYSTEM INTERRUPT

SYSTEM INTERRUPT-2

NAMED MESSAGE QUEUES

CONSOLE

FILE SYSTEM RELATION


MAIN CONCEPT

SD-32 ALPHA (currently SD32-C), referred to below as SD, is a multitasking 32-bit operating system written in Free Pascal (FPK, http://www.freepascal.org ). It is distributed under the GNU GPL v2 license, with the SDK files under the GNU Lesser GPL v2.1.

 

The future OS must meet the following requirements:

1)      Small size: the binaries must be able to run from a 3.5” 1.41 MiB diskette (known as “1.44MB”).

2)      Relatively fast performance: the system must be comfortable to use on a Pentium machine, allowing browsing of large hard disks with complicated file systems like ext3fs, NTFS etc.

3)      Economical memory consumption: the system must work comfortably on a machine with 16 MiB RAM without swapping.

4)      Simplicity: the sources must be easy to understand and the final distribution must be easy to learn and use. No deep and confusing directory structure like in UNIX-like operating systems is allowed.

5)      Maximally reusable components: the basic distribution libraries must provide the majority of the common functions in order to reduce the size of the executables.

6)      Powerful text mode service: SD is primarily a console OS with a pseudo-graphical user interface.

7)      DOS-like: must be easy to learn by DOS users.

8)      WINDOWS-like: must provide several major features of the Win32 API, PE executables and DLLs.

9)      UNIX-like: must provide easy porting of UNIX-oriented software by supporting “/” slashes and devices like “zero”, “null” etc.

10)  Modern standards compliant: Unicode support implemented using UTF-8 everywhere, and IEEE 1541 (UNITS AND PREFIXES FOR DIGITAL ELECTRONICS), i.e. using kibi (Ki), mebi (Mi), gibi (Gi) prefixes for measuring digital data (1024 bytes = 1 KiB, 1000 bytes = 1 kB).

11)  Software tools for hard disk maintenance and system backup: SD is planned to be an auxiliary OS providing a wide range of tools for preparing an HDD for the main OS installation or removing an OS, restoring data, backups and boot sectors, managers, editing configuration files or the Windows registry...
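The unit convention from requirement 10 can be illustrated with a short sketch. It is written in Python purely for illustration (SD itself is written in Free Pascal), and the function name format_size is hypothetical, not part of SD.

```python
# Binary (IEEE 1541) and decimal (SI) prefixes, as in requirement 10.
BINARY_PREFIXES = ["", "Ki", "Mi", "Gi"]   # powers of 1024
DECIMAL_PREFIXES = ["", "k", "M", "G"]     # powers of 1000


def format_size(num_bytes, binary=True):
    """Format a byte count with binary (KiB) or decimal (kB) prefixes."""
    base = 1024 if binary else 1000
    prefixes = BINARY_PREFIXES if binary else DECIMAL_PREFIXES
    value = float(num_bytes)
    index = 0
    # Move to the next prefix while the value is still too large.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    if index == 0:
        return f"{num_bytes} B"
    return f"{value:.2f} {prefixes[index]}B"


# A "1.44MB" diskette holds 1440 KiB = 1,474,560 bytes.
FLOPPY_BYTES = 1440 * 1024
print(format_size(FLOPPY_BYTES))                # 1.41 MiB
print(format_size(FLOPPY_BYTES, binary=False))  # 1.47 MB
```

The same diskette is therefore 1.41 MiB in binary units and 1.47 MB in decimal units; the familiar “1.44MB” label mixes the two bases (1.44 x 1000 x 1024 bytes).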

FILE SYSTEM LAYER

The file system layer of SD aims to be both UNIX-like and WINDOWS-like. Block and character devices are accessible and used like in UNIX, so there are several Unix-specific devices treated as files, like “/dev/zero”, “/dev/hda”, “/dev/printer” etc. In SD, however, they are named with a “..” prefix and are accessible as files from any directory or drive (“..zero”, “..hda”, “..printer”) while not being members of any directory.

There is no concept of a root mounting point. Mounting points are disks (or drives) like in DOS: “C:”, “D:”, “E:” etc. The disk name is terminated by a colon. Directory and file names are separated by “/” or “\”, i.e. the path name “C:\dir/dir\dir/file” is correct.

Disk names are not restricted to a single Latin letter. A disk name or a device name can’t contain spaces or any of the symbols : * ? / \ " < > | (colon, asterisk, question mark, slashes, double quote, less than, greater than, pipe) and must be at most 11 characters long. File names may be up to 255 symbols long and may contain spaces.

Device names are case-sensitive, while disk names consisting of one letter (A..Z) are not. Examples: “A:” is equal to “a:”, but “Ab:” is not equal to “AB:”. File and directory names are handled as case-insensitive only if the file system that holds them is case-insensitive.
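The naming rules above can be sketched as follows. This is an illustrative Python sketch, not SD code: the function names are hypothetical, and it assumes that the 11-character limit does not count the trailing colon.

```python
FORBIDDEN = set(':*?/\\"<>|')   # characters not allowed in disk/device names
MAX_NAME_LEN = 11               # assumption: the trailing colon is not counted


def is_valid_disk_name(name):
    """Check a disk or device name (without the colon) against the rules."""
    if not name or len(name) > MAX_NAME_LEN:
        return False
    return not any(ch == " " or ch in FORBIDDEN for ch in name)


def same_disk(a, b):
    """Single-letter names A..Z compare case-insensitively, others exactly."""
    def is_letter(name):
        return len(name) == 1 and name.isascii() and name.isalpha()
    if is_letter(a) and is_letter(b):
        return a.upper() == b.upper()
    return a == b


def split_path(path):
    """Split 'C:\\dir/file' into ('C', ['dir', 'file'])."""
    disk, colon, rest = path.partition(":")
    if not colon:
        disk, rest = "", path            # no disk part was given
    # Both kinds of slashes separate directory and file names.
    parts = [p for p in rest.replace("\\", "/").split("/") if p]
    return disk, parts


print(split_path("C:\\dir/dir\\dir/file"))   # ('C', ['dir', 'dir', 'dir', 'file'])
```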

 

MEMORY

Physical memory is allocated in contiguous blocks, so it may become fragmented after allocation and freeing. Because free memory is tracked as blocks, there’s no need for page bitmaps.
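A minimal sketch of block-based allocation without a page bitmap is given below. It is illustrative Python, not the SD implementation: the class name BlockAllocator, the first-fit policy and the merging of adjacent free ranges are all assumptions.

```python
class BlockAllocator:
    """First-fit allocator over a list of free (start, size) ranges."""

    def __init__(self, start, size):
        self.free = [(start, size)]          # kept sorted by start address

    def alloc(self, size):
        """Return the start of a contiguous block, or None if none fits."""
        for i, (start, free_size) in enumerate(self.free):
            if free_size >= size:
                if free_size == size:
                    del self.free[i]
                else:
                    self.free[i] = (start + size, free_size - size)
                return start
        return None

    def release(self, start, size):
        """Return a block to the pool and merge adjacent free ranges."""
        self.free.append((start, size))
        self.free.sort()
        merged = [self.free[0]]
        for s, n in self.free[1:]:
            last_s, last_n = merged[-1]
            if last_s + last_n == s:         # touching ranges: coalesce
                merged[-1] = (last_s, last_n + n)
            else:
                merged.append((s, n))
        self.free = merged
```

Freeing a block in the middle of the pool leaves two separate free ranges, which is the fragmentation mentioned above: a request larger than either range fails even though enough memory is free in total.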

 

There are 3 pools: the free physical memory pool, the free per-process virtual memory pool and the free global DLL memory pool. Since SD does not support swap files, virtual memory in SD is always mapped to physical memory. Virtual memory, i.e. the memory visible to a process, is logically divided into 3 zones: low (first 1 GiB), middle (2 GiB) and high (upper 1 GiB).

 

In the low zone (first 1 GiB) the addresses correspond to physical addresses, and the kernel is situated there. The program and its allocated heap are placed within the next 2 GiB. In the high zone the addresses are reserved for DLLs. So each process has its own free memory pool of 2 GiB and shares the single system pool of the high zone (the upper 1 GiB).
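The zone layout can be written down as a small classification function. This is an illustrative Python sketch; the exact boundary addresses are derived from the zone sizes given above and are an assumption, as is the name zone_of.

```python
GIB = 1024 ** 3

LOW_END = 1 * GIB       # [0, 1 GiB): kernel, matches physical addresses
MIDDLE_END = 3 * GIB    # [1 GiB, 3 GiB): program image and its heap
HIGH_END = 4 * GIB      # [3 GiB, 4 GiB): area reserved for DLLs


def zone_of(address):
    """Return the name of the zone that a 32-bit virtual address falls in."""
    if not 0 <= address < HIGH_END:
        raise ValueError("not a 32-bit address")
    if address < LOW_END:
        return "low"
    if address < MIDDLE_END:
        return "middle"
    return "high"


print(zone_of(0x00100000))   # low
print(zone_of(0x40000000))   # middle
print(zone_of(0xC0000000))   # high
```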

 

Shared areas are tracked by internal kernel structures and the executable image manager.

 

Loadable images and modules (module manager) are described in a dedicated section of this manual.

 

MULTITASKING

Some kind of protection for shared resources of the system must be implemented, as well as deadlock protection. All system calls are issued via the system interrupt, which may either pass the request to a driver and return at once, or halt the system for a short time and handle the request immediately.

 

The most common functions are placed into the system library (sdsys.dll or SDSYS.L32), which performs the main actions. The driver API provides locking handles (“mutex” objects): a documented function must be called only when a locking handle has been granted to the driver that is going to execute it.

 

The following TSSes are used:

1)      1 TSS for VM86 with an I/O bitmap enabling direct access to all ports; it is used to call BIOS services directly and is available to the kernel and drivers ONLY.

2)      32 TSSes for exceptions.

3)      1 TSS for kernel/user/fully emulated VM86; software context switching is used.

 

DRIVERS

The kernel is monolithic; it maintains a device registry. A driver cannot be unloaded because it is statically linked to the kernel, but the devices it handles can be disabled or removed. Nevertheless, the drivers are connected to the kernel through a special unit or a number of dedicated units (a window), so they are isolated from the kernel code. This is needed to provide greater stability and easier debugging. Thanks to this isolation, separate dynamically loadable drivers could be implemented.

 

Drivers register virtual devices of the following types:

 

An alias is a device that refers to another device with the same functionality under a different name, e.g. C: is a real volume and c: is an alias that refers to C:.

A substitute is a device that does not directly correspond to another one but sends modified requests to a linked device, e.g. G: is a substituted volume that rewrites the path name of each request to C:\myrefdir\sss\...

A volume is a device that accepts file I/O operations and serves as the base of a path name, e.g. C:\mypath\myfile.txt. It has a block device and a file system device attached; in other words, it lets a file system device communicate with a block device.

A file system device has nothing attached below it; volumes use it.

A block device has nothing attached below it; it supports I/O operations, and volumes use it.

A character device is almost the same as a block device but can't contain a file system.

A special device, such as a mouse or a bus device, does not handle general I/O but provides special device requests and notification events. The other device types can also provide additional requests and notification events.
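The difference between an alias and a substitute can be sketched as a lookup that is repeated until a real volume is reached. This is illustrative Python, not SD code; the class DeviceRegistry and its fields are hypothetical, and the device names are taken from the examples above.

```python
class DeviceRegistry:
    """Hypothetical registry holding volumes, aliases and substitutes."""

    def __init__(self):
        self.volumes = set()
        self.aliases = {}       # alias name -> target device name
        self.substitutes = {}   # substitute name -> (target device, path prefix)

    def resolve(self, device, path):
        """Follow aliases and substitutes until a real volume is reached."""
        seen = set()
        while device not in self.volumes:
            if device in seen:
                raise ValueError("cyclic device reference: " + device)
            seen.add(device)
            if device in self.aliases:
                # An alias forwards the request unchanged under another name.
                device = self.aliases[device]
            elif device in self.substitutes:
                # A substitute modifies the path name of the request.
                device, prefix = self.substitutes[device]
                path = prefix + "\\" + path
            else:
                raise KeyError("unknown device: " + device)
        return device, path


registry = DeviceRegistry()
registry.volumes.add("C:")
registry.aliases["c:"] = "C:"
registry.substitutes["G:"] = ("C:", "myrefdir\\sss")
print(registry.resolve("G:", "file.txt"))   # ('C:', 'myrefdir\\sss\\file.txt')
```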

 

The FS layer may operate on volumes and character/block devices only. If a path contains a colon or slashes, then only volumes are considered.

 

The order of initialization:

1)      initialize FS drivers (they simply register devices);

2)      initialize block/char drivers;

3)      automatically mount all block devices which can contain a file system: the volume devices are created by the kernel. Later they may only be remounted, while new volume devices are created by the user. New devices can also request to be mounted…

 

A volume device must check whether the underlying media has changed, send DETECTFS requests to all available FS devices, (re-)mount and unmount FS devices etc. One volume device is created for each registered block device marked as an FS container.

 

An FS device must concentrate on lower-level operation. It performs direct block manipulation and stores buffers and contexts, and maybe even a PATH CACHE etc.

 

A volume sends a MOUNT request to get a context handle, which is then used in all requests to the FS device. The context handle is needed to let many block devices use the same FS device.

 

If a block device has changeable media, it contains an ID of the inserted media. The ID must be time-dependent or otherwise guaranteed to be unique for each inserted media, so that the higher-level devices (volumes, substitutes) can correctly detect a media change.
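The interplay of DETECTFS, MOUNT, context handles and media IDs described above can be sketched as follows. This is illustrative Python under several assumptions: the classes FatDevice, Floppy and Volume are hypothetical, requests are modelled as plain method calls, file system detection is reduced to comparing a signature string, and the media ID is a counter incremented on each insertion.

```python
class FatDevice:
    """Hypothetical FS device; one instance can serve many block devices."""

    def __init__(self):
        self.contexts = {}       # context handle -> block device
        self.next_handle = 1

    def detect_fs(self, block):              # stands in for DETECTFS
        return block.read_signature() == "FAT"

    def mount(self, block):                  # stands in for MOUNT
        handle = self.next_handle
        self.next_handle += 1
        self.contexts[handle] = block
        return handle

    def unmount(self, handle):
        del self.contexts[handle]


class Floppy:
    """Hypothetical block device with changeable media."""

    def __init__(self):
        self.media_id = 0
        self.signature = None

    def insert(self, signature):
        self.media_id += 1                   # unique for every insertion
        self.signature = signature

    def read_signature(self):
        return self.signature


class Volume:
    def __init__(self, block, fs_devices):
        self.block = block
        self.fs_devices = fs_devices
        self.fs = None
        self.context = None
        self.mounted_media = None

    def check_media(self):
        """Remount if the media ID of the block device has changed."""
        if self.block.media_id == self.mounted_media:
            return
        if self.fs is not None:
            self.fs.unmount(self.context)
            self.fs, self.context = None, None
        self.mounted_media = self.block.media_id
        for fs in self.fs_devices:           # ask every FS device in turn
            if fs.detect_fs(self.block):
                self.fs = fs
                self.context = fs.mount(self.block)
                break
```

Two volumes backed by different floppies can share one FatDevice and are told apart by their context handles; inserting a new diskette changes the media ID, so the next check_media call remounts.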

 

A device driver provides a queue for system requests and a queue for user requests; it runs on a separate thread.

SYSTEM CALLS

 

This topic is subject to review.

The kernel is called via an interrupt and/or by using the main command message map. It is used by drivers as well as by user applications. The fields of the message queue entry are filled in the following way:

·        objid = [thread id or zero for kernel/drivers],

·        command=[system request number],

·        param1=[pointer to request’s parameters structure],

·        param2=[pointer to result structure].

 

Each structure begins with a field of DWORD size that contains the number of bytes occupied by the structure. The pointers are in the address space of the thread, which is why conversion to the kernel’s address space is needed if objid>0. The result structure must contain at least the following: ( DWORD szbytes, LONG result ).

 

Some requests may ignore param1 but they’ll never ignore param2!!!
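The size-prefixed structures can be illustrated with a short sketch. It is illustrative Python rather than SD code: the helper names are hypothetical, the byte order is little-endian as on IA-32, and it assumes that the size field counts itself.

```python
import struct
from collections import namedtuple

# One entry of the message queue, with the fields listed above.
Message = namedtuple("Message", "objid command param1 param2")

SIZE_FIELD = struct.Struct("<I")       # DWORD, little-endian
RESULT_HEADER = struct.Struct("<Ii")   # DWORD szbytes, LONG result


def pack_structure(payload=b""):
    """Prefix a payload with a DWORD holding the total structure size."""
    return SIZE_FIELD.pack(SIZE_FIELD.size + len(payload)) + payload


def pack_result(result, extra=b""):
    """Build the minimal result structure, optionally followed by data."""
    size = RESULT_HEADER.size + len(extra)
    return RESULT_HEADER.pack(size, result) + extra


def unpack_result(buffer):
    """Return (result, extra data) after checking the size field."""
    szbytes, result = RESULT_HEADER.unpack_from(buffer)
    if szbytes < RESULT_HEADER.size or szbytes > len(buffer):
        raise ValueError("bad result structure size")
    return result, buffer[RESULT_HEADER.size:szbytes]


def needs_address_translation(message):
    """Pointers of a thread (objid > 0) must be converted for the kernel."""
    return message.objid > 0


print(unpack_result(pack_result(-1, b"abc")))   # (-1, b'abc')
```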

 

USER ENVIRONMENT

There is a basic shell command interpreter like COMMAND.COM in DOS. It provides commands like “ls” and “cd”, supports pipes etc. It is planned to be light and fast. On top of this basic shell a more complex, multifunctional file manager is provided. It uses text mode and provides a text pseudo-graphics user interface with frames, windows, menus, buttons etc., and works like a server for applications, like a desktop. It uses pipes and so can launch CMD.E32 in a text window. (The command interpreter of SD should be named CMD.E32; all native PE executables must have the .E32 extension and dynamic libraries the .L32 extension.) The multi-window text desktop may be called something like TWMAN.E32 (Text Window Manager), but it should be accessed in a special way: via message queues or other special IPC means. The environment variables are case-sensitive and may contain only Latin letters and several symbols.

 

The typical user of SD is an administrator or a member of a customer support service. However, future versions could become suitable as the main operating system for servers, remote-desktop workstations, embedded systems etc.

 

INTERNATIONALIZATION AND LOCALIZATION

SD uses UTF-8 encoding in all string-related data, which includes the system libraries, the file system layer, the translated strings base and other things. Two system variables are used to maintain compatibility with DOS and Windows; they are used by the emulation subsystems, console and file system drivers:

 

Native SD applications use UTF-8 and UTF-32. The console buffer is stored in UTF-32 and rendered in text display mode via the DosCP and VGA fonts; in graphics mode, Unicode fonts are used…

 

The kernel and/or system libraries provide such functions as UpCase, CharCompare, UTF8toUTF32 and so on.
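As an illustration of what a function such as UTF8toUTF32 has to do, here is a minimal decoder sketch. It is written in Python for illustration only, the name utf8_to_utf32 is hypothetical, and it does not reject overlong encodings or surrogate code points as a complete decoder should.

```python
def utf8_to_utf32(data):
    """Decode UTF-8 bytes into a list of 32-bit code points."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells how many continuation bytes follow.
        if lead < 0x80:
            code, extra = lead, 0
        elif lead >> 5 == 0b110:
            code, extra = lead & 0x1F, 1
        elif lead >> 4 == 0b1110:
            code, extra = lead & 0x0F, 2
        elif lead >> 3 == 0b11110:
            code, extra = lead & 0x07, 3
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + extra >= len(data):
            raise ValueError("truncated sequence at end of data")
        for j in range(1, extra + 1):
            byte = data[i + j]
            if byte >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte at offset {i + j}")
            code = (code << 6) | (byte & 0x3F)   # append 6 payload bits
        code_points.append(code)
        i += extra + 1
    return code_points


print([hex(cp) for cp in utf8_to_utf32(b"A\xe2\x82\xac")])   # ['0x41', '0x20ac']
```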

 

 

ALGORITHMS IN KERNEL

The kernel needs the following facilities:

 

They are utilized as:

 

Message queues will be accessed through the kernel interrupt so no mutex is needed.

 

 

SOURCES DIRECTORY TREE

This topic is subject to review.

/src/
    doc
    drivers/
        fs/
            fat             FAT12, 16, 32 driver
        bnc/
            disk/
                fdd         Floppy disk drivers
                hdd         Hard disk drivers
            std             Standard devices (“..zero”, “null” etc.)
            kbd             Keyboard drivers
        special/
            mouse           Mouse drivers
    kernel/
        ia-32               Low-level routines, IDT, GDT stuff…
        memory              Memory management units
        dev_engine          Device engine units
        threads             Processes, Threads, Contexts stuff…
        execs               Executable, DLLs, Modules stuff…
        nls_out             Generated Tables (const) etc.
        internat            Internationalization stuff…
    sysdll/
        sdsys               sdsys.L32
    nls_tab                 ISO, CP tables, internationalization strings etc.
    app_sdk/                SDK for Application writers
        asm
        fpc
    mbsyspas/               System unit to produce GRUB-compliant kernels
        fpc1_0_10
    apps/                   System applications
        sdcmd

 

 

CODING STYLE

 

1)      Expand tabs to spaces;

2)      Line width is 80 symbols;

3)      Each level (begin…end) is indented by two spaces;

4)      Var, goto, begin, end are at the same indentation as the function, interface, implementation etc.;

5)      Begin…end are at the same indentation as the corresponding if, while, for;

6)      Subfunctions, including those in the interface and implementation, are initially indented by two spaces;

7)      Return arguments must be named as p_out_* or r_out_*;

8)      Pointers should be used for return variables, structures, arrays etc.;

9)      Standard types are named with the first capital letter, e.g. Boolean, Char, PChar, Byte;

10)  Underscores (“_”) should be used if a name is otherwise hard to read, e.g. TUI_Designer, TFAT_FileSystem, LinkedList_Init()…

11)  “(*” and “*)” are the most preferable delimiters for comments.

12)  See gentypes.pas and gulnklst.pas version 2.0 for examples.

 

 

CODE STYLE SAMPLE

~~~~~~~~~~~~~~~~~

 

GlobalFunctionName()

 

FakeObject_MethodName() – e.g. LinkedList_Init, RobotCarret_Move etc.

 

gGlobalVariableName

 

mMemberName (objects, classes)

 

methodName

 

local_function_name (which is not declared in Interface)

 

local_variable_name

 

g_local_variable_name (not declared in Interface, but global for the implementation part)

 

argument_variable_name

 

TTypeName

CClassName

OObjectName

 

PPointerTypeName

 

CONSTANT_NAME

DEFINE_NAME

 

pPointerVariable

p_pointer_variable

 

rVariableReference

r_argument_reference

r_local_reference

 

 

MLCC_TTypeName (with prefix)

MLCC_Function

 

while (condition)

if (condition)

 

enum TMyEnum

{

  tmeEnumOption, tmeEnumOption2

}

 

Global function names:

ObjectAction

e.g. CursorDraw, FileOpen, FileClose...

 

USED UNITS

The following units (packages) were not included in the SD sources repository:

 

Visit http://ybx.narod.ru to download them.

 

TOOLS

The following tools were used:

 

SYSTEM INTERRUPT

This topic is subject to review.

 

The following functions are provided via interrupt:

1)      Get SD-32 version;

2)      Terminate program;

3)      Find named message queue;

4)      Post message;

5)      Get message;

6)      Create message queue (supports SYSTEM flag in order to grant kernel-only access to the queue)

7)      Delete message queue;

8)      Register name for message queue;

9)      Sleep;

10)  Pause/resume thread;

11)  Kill thread;

12)  Restart thread;

13)  Copy list of devices (returns the size and then fills in the provided memory with the device list)

14)  Copy list of volumes (returns list of volumes)

15)  Gain Mutex object (automatically allocates this mutex)

16)  Release Mutex object

17)  Delete Mutex object

18)  Allocate shared memory (in 4KiB blocks)

19)  Free shared memory

SYSTEM INTERRUPT-2

This topic is subject to review.

 

The following functions are provided via the second interrupt, which is dedicated to drivers and not available to user applications:

1)      Register driver

2)      Register device

3)      Delete device

4)      Allocate DMA block

5)      Free DMA block

6)      Delete driver

7)      Allocate IRQ

8)      Release IRQ

9)      Signal IRQ Handled

10)  WaitForIRQ – puts the calling thread to sleep until the IRQ arrives

 

NAMED MESSAGE QUEUES

The message queue is the means of communication in a multitasking environment chosen for SD-32. First, the user creates a message queue via the system interrupt and gets a special value (a handle) to access it. Then the user can give the message queue a name, making it accessible to other processes. A named message queue is identified by two main components: a name (8 bytes long) and a key (8 bytes long). The term “named” does not imply a meaningful human-readable word but an identifier that is recognized between applications. However, “REQUESTS” may be a name for the system request queue. The names of the system queues will be chosen later… The key is optional: it may be equal to zero for public message queues, or set to a specified value in order to create different queues with the same name but different keys (the key may be treated as a port number)…

 

An application can register a number of message queues, for example, a message queue for notification by the system, a message queue for communication with a device driver and two message queues for communication with another application.

 

A message queue can have an attribute “WAKE UP THE OWNER”. So a thread may create a message queue and “go to bed” until a message is posted to the queue (or to any of a number of marked queues).
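The create, register and find steps for named message queues can be sketched as follows. This is illustrative Python, not SD code: the class QueueRegistry and the helper pad_name are hypothetical, names shorter than 8 bytes are assumed to be padded with zero bytes, and the “WAKE UP THE OWNER” attribute is not modelled.

```python
from collections import deque

NAME_LEN = 8   # a queue name is 8 bytes long


def pad_name(name):
    """Encode a name and pad it to 8 bytes (zero padding is an assumption)."""
    raw = name.encode("ascii")
    if len(raw) > NAME_LEN:
        raise ValueError("queue name is longer than 8 bytes")
    return raw.ljust(NAME_LEN, b"\0")


class QueueRegistry:
    def __init__(self):
        self.queues = {}        # handle -> queued messages
        self.names = {}         # (name, key) -> handle
        self.next_handle = 1

    def create(self):
        """Create an unnamed queue and return its handle."""
        handle = self.next_handle
        self.next_handle += 1
        self.queues[handle] = deque()
        return handle

    def register_name(self, handle, name, key=0):
        """Make a queue findable by other processes; key 0 means public."""
        if handle not in self.queues:
            raise KeyError("unknown queue handle")
        if not 0 <= key < 2 ** 64:
            raise ValueError("key must fit into 8 bytes")
        ident = (pad_name(name), key)
        if ident in self.names:
            raise ValueError("name and key are already registered")
        self.names[ident] = handle

    def find(self, name, key=0):
        return self.names.get((pad_name(name), key))

    def post(self, handle, message):
        self.queues[handle].append(message)

    def get(self, handle):
        queue = self.queues[handle]
        return queue.popleft() if queue else None
```

Two queues may share the name “REQUESTS” as long as their keys differ, which is what lets the key act like a port number.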

 

TODO: virtual named ports for data transactions…

CONSOLE

A console has a visual part (stdout=display) and an input part (stdin=keyboard). Several filters are applied to the visual part. The reason is the universal character encoding (UTF-8, Unicode) and the limitations of display devices (text mode and 8-bit character fonts). That is why there are many code pages (limited sets of characters from the huge range of all possible Unicode characters). The non-visual part works with native characters, i.e. it uses UTF-8 and no filter is applied. The display, however, uses a code page to translate UTF-8 encodings into visible symbols on the screen. Sometimes the non-visual part may also be affected when compatibility requires it. So the keyboard driver requires translation tables (a virtual key code and the corresponding UTF-32 code), and the kernel provides all code page related functionality, so the console driver needs only a code page identifier.
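The display filter can be sketched as a translation from code points to the bytes of an 8-bit code page. This is illustrative Python: the function render_cells is hypothetical, Python's built-in cp437 and cp866 codecs stand in for SD's own code page tables, and replacing unavailable characters with “?” is an assumption.

```python
def render_cells(code_points, code_page="cp437"):
    """Translate UTF-32 code points to the bytes of an 8-bit code page."""
    text = "".join(chr(cp) for cp in code_points)
    # Characters missing from the code page are replaced with "?".
    return text.encode(code_page, errors="replace")


latin = [0x41, 0xE9]       # "A" and "e" with acute accent
cyrillic = [0x416]         # Cyrillic capital letter ZHE

print(render_cells(latin))               # b'A\x82'
print(render_cells(cyrillic))            # b'?'
print(render_cells(cyrillic, "cp866"))   # b'\x86'
```

The same console buffer therefore produces different screen bytes depending on the active code page, which is why the console driver only needs the code page identifier.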

FILE SYSTEM RELATION

There are two ways to implement it:

1)      The system maintains a list of file descriptors. A descriptor contains information about the volume, access type, process, device driver (if it is an opened block device)… Each descriptor is unique. There is also a global list of file references, which refer to a descriptor and keep the current position, operation info etc. In addition, there are per-process lists of IDs of such references. So we have a two-layered system:

·        Unique descriptors;

·        References.

The FS driver may also have a list of descriptors and references. When the FS layer uses FS driver it passes the needed information and provides pointers to store private data.

2)      The system maintains a list of references only. Each entry contains the type identifier (IO device or a file), the device id, the handle obtained from the device and a custom data pointer provided to the device. The handles are used when accessing devices. It is the business of a device to maintain the needed descriptors and further references…

 

I think the second way will be implemented, i.e. the devices provide ready-to-use references and do not expose descriptor identifiers… When a device is opened, its request function returns a handle identifier, and future requests use that handle.
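The second approach can be sketched as a system-wide table of references in which each entry stores the handle returned by the device. This is illustrative Python, not SD code: the classes NullDevice and ReferenceTable and their methods are hypothetical, and device requests are modelled as plain method calls.

```python
class NullDevice:
    """Hypothetical device that hands out handles and discards all data."""

    def __init__(self):
        self.next_handle = 1
        self.open_handles = set()

    def open(self):
        handle = self.next_handle
        self.next_handle += 1
        self.open_handles.add(handle)
        return handle

    def write(self, handle, data):
        if handle not in self.open_handles:
            raise ValueError("invalid handle")
        return len(data)                     # pretend everything was written

    def close(self, handle):
        self.open_handles.remove(handle)


class ReferenceTable:
    """System-side list of references; descriptors stay inside the device."""

    def __init__(self):
        self.entries = {}   # reference id -> (kind, device, handle, custom data)
        self.next_id = 1

    def open(self, kind, device, custom_data=None):
        handle = device.open()               # the device returns its own handle
        ref = self.next_id
        self.next_id += 1
        self.entries[ref] = (kind, device, handle, custom_data)
        return ref

    def write(self, ref, data):
        kind, device, handle, custom_data = self.entries[ref]
        return device.write(handle, data)    # later requests reuse the handle

    def close(self, ref):
        kind, device, handle, custom_data = self.entries.pop(ref)
        device.close(handle)
```

The system never sees the device's internal descriptors; it only stores the handle and passes it back with each request.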



[1] Also known as Juras Benesz, ybx, exhu, Juras B., juras.