What is a Computer File System & How Does It Work?

What is a File System?

In simple terms, a file system determines how a file is named, stored, and retrieved from a storage device. It also specifies the attributes kept for each file.

The file system is also involved whenever you copy, edit, or delete a file, and it plays a major role when you visit a website or download a file.

KEY TAKEAWAYS

  • A file system is used by the operating system of a computer to store data in files on the storage device and to retrieve and load them when needed.
  • Its functions include helping to copy, move, edit, print, or delete a file, and to access and download a file from a website.
  • The file system determines the naming conventions for a file, such as the character limits, the permitted character types, and the length of the suffix.
  • A file is a series of bits, bytes, and records, and its meaning is defined by the creator and user of the file.

Understanding the Computer File System and How It Works

Computer File System

A computer file is a set of interrelated information. This information is recorded and stored on a non-volatile or secondary storage device such as:

  • A magnetic disk
  • An optical disk and
  • Tapes.

A file is a collection of data that a program can read as input and write as output.

In technical terms, a file is a series of bits, bytes, and records. Its meaning is defined by the creator and user of the file.

Every file has a logical location on a storage device so that it can be retrieved when required.

All these files are organized and defined by the file system.

The file systems therefore can be considered as an index of all the files stored in a storage device along with the data contained in it.

In addition to that, a file system also contains other important information such as:

  • The file size
  • The attributes of the file and
  • The location details.

There is also metadata that indicates each file's place in the directory hierarchy. This metadata also identifies the free blocks available for storage on the drive and how much space remains.

To understand the file system and how it works more thoroughly, there are several related concepts you need to know. The sections below go through them one by one.

File system tree:

It is the structure of the file directory, which gives the file system a specific format for identifying the path to a file.

In the file tree structure, a file is placed in a directory or a subdirectory (called a folder in the Windows operating system) at the preferred location.

In short, all operating systems of a computer or a mobile device have a file system wherein all the files are located in some sort of a hierarchical tree structure.
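
The hierarchical tree described above can be explored from a program. The following is a minimal sketch in Python, assuming a small made-up hierarchy (the names `docs`, `reports`, `photos` and the helper functions are invented for illustration) created in a temporary directory so that nothing on your own disk is touched.

```python
import tempfile
from pathlib import Path

def build_sample_tree(root: Path) -> None:
    """Create a tiny, made-up directory hierarchy under root."""
    (root / "docs" / "reports").mkdir(parents=True)
    (root / "docs" / "reports" / "q1.txt").write_text("quarterly report")
    (root / "docs" / "notes.txt").write_text("notes")
    (root / "photos").mkdir()

def list_tree(root: Path) -> list[str]:
    """Return every path below root, relative to root, in sorted order."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_sample_tree(root)
    tree = list_tree(root)
    for entry in tree:
        print(entry)
```

Each printed line is a path, that is, the route from the top of the tree down through the directories to a file or subdirectory.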

This means that proper partitions should be made and put into the right place before creating the files and directories.

This partition is a dedicated place in the hard disk or other storage devices which is managed separately by the operating systems.

Each partition holds one file system, and most operating systems allow creating multiple partitions on a single disk.

This is advantageous because, if one file system gets corrupted, the files in the other partitions remain safe.

Objectives:

The file system has a few specific objectives to accomplish. Some of the main objectives of a file system are:

  • Providing I/O support to different types of storage devices
  • Minimizing the chances of loss of data and
  • Helping the operating system to standardize the routines of the I/O interfaces for user processes.

The file system also provides I/O support to a variety of users in a varied set of systems and environments.

Properties:

The computer file systems also come with different important properties, and some of them are:

  • It keeps the files on the storage device safe, so they are not lost when the user logs out or shuts down the computer system and
  • Each file has a dedicated name and associated access permissions, which ensure controlled sharing of files.

The files are arranged in structured hierarchies in a file system, which show how they relate to each other.

File structure:

A file system needs a predefined file structure that the operating system can understand easily. The structure is defined by the type of the file.

There are mainly three types of file structures in an operating system namely:

  • A text file – This refers to a set of characters that are written or organized in lines.
  • An object file – This refers to a set of bytes that is structured into blocks.
  • A source file – This refers to the set of processes and functions.

Depending on the file structure, files also have varied attributes, as said earlier. Here are the important file attributes used in an operating system, with a brief explanation of each:

  • File name – This is the only information about the file that is kept in human-readable form.
  • File identifier – This refers to the unique tag number of the file in the file system.
  • File location – This indicates where the file can be found on the storage device.
  • File type – This is an important attribute that is required for the file system to support different types of files.
  • File size – This attribute shows how big the file is, usually in bytes or in multiples such as KB.
  • File protection – This attribute lets the file system assign and control the rights to read, write, or execute a file.

Other significant attributes are the date, time, and security information, which help in monitoring and protection.
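
Most of the attributes listed above can be read from a program. Here is a minimal sketch using Python's standard `os.stat` call on a temporary file; the file name `example.txt` and the dictionary keys are made up for illustration, and the exact values (identifier, permissions, timestamp) will differ from system to system.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write(b"hello file system")   # 17 bytes of content

    info = os.stat(path)                # ask the file system for the metadata
    attributes = {
        "name": os.path.basename(path),
        "identifier": info.st_ino,                      # unique number within the file system
        "size_bytes": info.st_size,
        "is_regular_file": stat.S_ISREG(info.st_mode),  # file type
        "permissions": stat.filemode(info.st_mode),     # protection bits
        "modified": time.ctime(info.st_mtime),          # date and time
    }
    for key, value in attributes.items():
        print(f"{key}: {value}")
```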

File type:

Depending on the operating system, the types of files can also differ. Usually, the UNIX and MS-DOS operating systems have the following file types:

  • Character special file – This refers to the hardware files that read and write data character by character.
  • Ordinary files – These files store user information, text, databases, and executable programs and allow the users to perform specific tasks such as adding, deleting, and modifying.
  • Directory files – This refers to the storage of other information relevant to the file. It is basically a folder which helps in holding and organizing several files.
  • Special files – Also referred to as device files, these files represent the physical devices such as printers, flash drives, disks, networks, and others.

The file type actually indicates the capability of the operating system to distinguish between different types of files such as source files, binary and text files.

File access methods:

The file system also works on the access methods of the files and the ways in which these files are read into memory. Usually, the operating systems support a single access method.

However, there are specific operating systems that may support different access methods. These are:

  • Sequential access – In this type of file access method, the operating system follows a specific and predefined sequence to access the records and processes the information stored in the file one by one. Most compilers follow this method.
  • Direct random access – Also referred to simply as the random method, this file access method allows the operating system to read or write any record directly and in any order. It relies on the unique address of each record.
  • Index sequential access – This method builds on simple sequential access: every file has an index built for it, with direct pointers to its blocks. The operating system searches the index sequentially and then follows the pointer to reach the record directly.

Though all methods are good, the index sequential method is preferred a bit more because it allows using multiple levels of indexing. This increases the efficiency in accessing the files and at the same time reduces the time taken to access a particular record.
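
The three access methods can be compared on a small in-memory file. This is only a sketch: it assumes fixed-size records of 16 bytes, and the function names are invented for illustration. The index here is a simple Python dictionary, which is simpler than the multi-level indexes mentioned above.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes

def make_file(names):
    """Pack each name into a fixed-size, space-padded record."""
    data = b"".join(n.encode().ljust(RECORD_SIZE) for n in names)
    return io.BytesIO(data)

def read_sequential(f):
    """Sequential access: read records one after another from the start."""
    f.seek(0)
    records = []
    while chunk := f.read(RECORD_SIZE):
        records.append(chunk.decode().rstrip())
    return records

def read_direct(f, record_number):
    """Direct access: jump straight to a record using its number."""
    f.seek(record_number * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode().rstrip()

def build_index(f):
    """Build an index that maps each record's key to its byte offset."""
    return {name: i * RECORD_SIZE for i, name in enumerate(read_sequential(f))}

def read_indexed(f, index, key):
    """Indexed access: look the key up in the index, then seek to the record."""
    f.seek(index[key])
    return f.read(RECORD_SIZE).decode().rstrip()

f = make_file(["alice", "bob", "carol"])
index = build_index(f)
print(read_sequential(f))             # all records, in order
print(read_direct(f, 2))              # the third record, without reading the first two
print(read_indexed(f, index, "bob"))  # found through the index
```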

Disk space allocation:

Ideally, there are three basic types of space allocation methods followed. These are:

  • Contiguous allocation – In this method, the operating system assigns disk addresses in linear order and every file occupies a contiguous range of blocks on the disk. However, the most significant problem with this method is external fragmentation.
  • Linked allocation – In this method, every file is a linked list of blocks, and the directory holds a pointer to the first block of the file. Though there is no external fragmentation issue, this method is not suitable for direct file access because the blocks can only be reached by following the links sequentially.
  • Indexed allocation – In this method, the directory contains the index blocks of the files which are created with all the pointers for that particular file in it.

Ideally, the working process of the file system depends heavily on the space allocation in the disks.
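
To make the difference between the three methods concrete, here is a small sketch in Python. The block numbers and function names are made up; the point is only to show what the directory has to store in each case and how a block of the file is located.

```python
def contiguous_blocks(start, length):
    """Contiguous allocation: the directory stores a start block and a length."""
    return list(range(start, start + length))

def linked_blocks(first, next_pointer):
    """Linked allocation: follow each block's pointer until the end marker (None)."""
    blocks, current = [], first
    while current is not None:
        blocks.append(current)
        current = next_pointer[current]
    return blocks

def indexed_block(index_block, n):
    """Indexed allocation: the index block lists every data block,
    so the n-th block of the file is found with a single lookup."""
    return index_block[n]

print(contiguous_blocks(4, 3))        # blocks 4, 5 and 6

links = {9: 2, 2: 7, 7: None}         # block 9 points to 2, which points to 7
print(linked_blocks(9, links))        # must be walked in order

print(indexed_block([9, 2, 7], 2))    # third block of the file, reached directly
```

Note how reaching the third block requires walking two links in the linked scheme but only one lookup in the indexed scheme, which is why linked allocation is a poor fit for direct access.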

File directories:

The directories are called folders in the Windows operating system. A directory may contain any number of files, or none at all. There may also be sub-directories, as said earlier, inside a directory.

The directories maintain the information regarding the files in a file system. The different types of information that are contained in a directory include:

  • The name of the file that is shown to the user
  • The type of the file
  • The positions of the current and next read and write pointers
  • The location of the file header on the device
  • The size of the file in blocks, bytes, or words and
  • The usage details, including the date and time the file was created along with access and modification details.

The directory also maintains access control to the files to read, write, execute, or delete it, which provides protection to the file.

Need:

The need for a file system is immense. Without it, the storage device would hold one huge chunk of data stored back to back.

It would be impossible for the operating system to tell the files apart and access the right one in a short time.

The file systems, which are based on the principles of the traditional paper-based file management systems, store documents in files and the files into directories.

This makes it easy and quick to find them as and when required thereby making your electronic device useful.

The responsibilities of a file system include:

  • File management or ‘bookkeeping’
  • Space management
  • Data encryption
  • Metadata
  • Data integrity and
  • File access control.

The entire working process of the file system typically starts with partitioning and formatting the storage device before it is used.

Disk partitioning:

Partitioning, just as the word signifies, is dividing the storage device into multiple logical regions.

This makes them a sort of separate storage device within a device which helps the operating system in managing them separately and more efficiently.

Therefore, the basic need for partitioning the disk is to help in file management.

For Linux operating systems, three partitions are typically used, and each has a different use case:

  • One is used by the operating system
  • One is used for the users’ files and
  • One is used as an optional swap partition.

Windows and Mac operating systems have almost the same partition structure, the only difference being that they do not have a dedicated swap partition.

Instead, swap space is managed within the partition where the operating system is installed.

Multiple partitions also allow installing different operating systems and choosing which one to boot each time the system starts.

It also facilitates diagnostics and recovery utilities. Apart from that, it allows keeping crucial system files separated from the ordinary files as well.

Typically, the Windows operating system assigns a letter to each partition.

For example, the main partition on Windows is C: or drive C where Windows is installed. However, in UNIX operating systems the partitions are shown as normal directories under the root directory.

Disk partitioning methods:

While partitioning a storage device, two methods are mainly followed. These are:

  • MBR (Master Boot Record) and
  • GPT (GUID Partition Table).

Irrespective of the method, the first few blocks on the storage device will typically contain the vital data about the partitions.

The MBR method is part of the BIOS specifications and contains vital information such as:

  • The boot loader that initiates the first phase of the booting process and
  • The partition table with information about the partitions.

The boot loader works with BIOS-based firmware, which operates differently from UEFI-based firmware. Because the boot loader is located in the MBR itself, the firmware does not need to deal with any file to find it. So, it is easy and fast.

When the system is powered on, the firmware starts to load the boot loader program onto the memory. Once it is done, the Central Processing Unit or the CPU takes over and starts executing it.
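
The two parts of the MBR mentioned above can be picked apart with a few lines of code. The sketch below assumes the classic MBR layout, which the article does not spell out: a 512-byte first sector with 446 bytes of boot loader code, four 16-byte partition entries, and a 2-byte signature at the end. The sample sector is synthetic and its partition values are made up; nothing is read from a real disk.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446       # space reserved for the boot loader
ENTRY_SIZE = 16            # size of one partition table entry
SIGNATURE = b"\x55\xaa"    # marks the sector as a valid MBR
ENTRY_FORMAT = "<B3sB3sII" # status, CHS start, type, CHS end, first LBA, sector count

def make_sample_mbr():
    """Build a synthetic MBR holding one partition entry (illustrative values)."""
    boot_code = bytes(BOOT_CODE_SIZE)
    entry = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 1_000_000)
    table = entry + bytes(ENTRY_SIZE * 3)   # the other three entries are empty
    return boot_code + table + SIGNATURE

def parse_mbr(sector):
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        offset = BOOT_CODE_SIZE + i * ENTRY_SIZE
        status, _, ptype, _, first_lba, count = struct.unpack(
            ENTRY_FORMAT, sector[offset:offset + ENTRY_SIZE])
        if ptype != 0:                       # type 0 means the entry is unused
            partitions.append({"bootable": status == 0x80, "type": ptype,
                               "first_lba": first_lba, "sectors": count})
    return partitions

partitions = parse_mbr(make_sample_mbr())
print(partitions)
```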

One significant and worrying drawback of this scheme is that there is no backup of the MBR. This means that, if it is corrupted somehow, the partition information is lost and the disk may become unusable until it is repaired or partitioned again.

On the other hand, GPT partitioning, which uses UEFI-based firmware, stands out in that aspect.

It is much more sophisticated and does not have the common limitations found in MBR.

For example, you can have as many partitions on your disk as the operating system allows.

And, each of these partitions can be pretty large in size. That is why modern computers are replacing MBR with GPT.

In GPT partitioning, there are different sectors. The first sector is called the Protective MBR and is kept for compatibility with BIOS-based systems, because some systems still use BIOS-based firmware.

After this sector, the partition entries, GPT header and other GPT data structures are stored.

A backup is created for these so that it can be recovered if the main copy is corrupted. This backup is known as the Secondary GPT.

All booting services in GPT are stored in a dedicated partition known as ESP or the EFI System Partition.

This is used by the UEFI firmware. The ESP is formatted with its own FAT-based file system.

Formatting partitions:

The next important step involves formatting the partition in the storage device before it can be used for file storage and management.

This process follows the rules of the chosen file system and involves creating the data structures and metadata that will be used to manage the files in the partition.

Space management:

The next step in the working process of the file system is space management. Typically, every storage device is divided into fixed-size units called sectors.

A sector is the smallest unit of storage on the device, typically with a capacity between 512 bytes and 4096 bytes.

However, file systems use a higher-level unit to store files. They use blocks as storage units, and each block is made up of one or more sectors.

The file system may allocate the files on one or more blocks depending on the size.

The file system is well aware of the used and free blocks in the partition which helps it in space management.
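
The relationship between sectors, blocks, and file size comes down to simple arithmetic. The sketch below assumes 512-byte sectors grouped eight to a block, giving 4096-byte blocks; both numbers are assumptions chosen for the example, not fixed rules.

```python
import math

SECTOR_SIZE = 512        # bytes per sector (assumed)
SECTORS_PER_BLOCK = 8    # sectors per block (assumed)
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK   # 4096 bytes

def blocks_needed(file_size):
    """A file always occupies a whole number of blocks."""
    return math.ceil(file_size / BLOCK_SIZE)

def size_on_disk(file_size):
    """Space actually reserved for the file, in bytes."""
    return blocks_needed(file_size) * BLOCK_SIZE

for size in (1, 4096, 10_000):
    wasted = size_on_disk(size) - size
    print(f"{size} bytes -> {blocks_needed(size)} block(s), "
          f"{size_on_disk(size)} bytes on disk, {wasted} wasted")
```

A 10,000-byte file, for instance, needs three blocks and leaves part of the third block unused.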

However, for much better and easier space management, the contiguous blocks are clustered into block groups.

Each of these block groups comes with its own data blocks and structures and contains:

  • A super block or a metadata repository of the whole file system
  • Group descriptors that contain bookkeeping information
  • An inode bitmap to identify used and unused inodes within it
  • A block bitmap or a data structure to identify used and unused data blocks
  • An inode table to define the relationship between the files and the inodes and
  • Data blocks to store the file contents.

That is the case for the ext3 file system. The ext4 file system takes space management a level higher with some additional features in its block groups.

In the ext4 file system, the block groups are organized into larger groups. These are called flex block groups.

In the first block group of each of these flex block groups the data structures are stored. It includes:

  • The inode bitmap
  • The block bitmap and
  • The inode table.

This gives it a significant advantage. It frees the other block groups in each flex block group to hold larger runs of contiguous data blocks.

Whatever the structure of the block group, managing files at this level offers better space management and significantly enhances the performance of the file system compared with organizing and managing files in one single unit.
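
The block bitmap mentioned in the list of block group structures is a simple idea: one bit per block, set when the block is in use. The class below is a toy version written for illustration; the class and method names are invented, and a real file system stores the bitmap on disk rather than in a Python object.

```python
class BlockBitmap:
    """One bit per block: 1 means used, 0 means free."""

    def __init__(self, total_blocks):
        self.total = total_blocks
        self.bits = bytearray((total_blocks + 7) // 8)   # 8 blocks per byte

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def first_free(self):
        """Return the lowest-numbered free block, or None if all are used."""
        for block in range(self.total):
            if not self.is_used(block):
                return block
        return None

bitmap = BlockBitmap(16)
for b in (0, 1, 2, 5):
    bitmap.mark_used(b)
first = bitmap.first_free()          # blocks 0-2 are taken, so this is block 3
bitmap.mark_free(1)
after_free = bitmap.first_free()     # block 1 is available again
print(first, after_free)
```

An inode bitmap works the same way, with one bit per inode instead of one bit per data block.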

Fragmentation:

More and more files are written over time and all these are required to be stored in the disk.

Moreover, the existing files may also get bigger with new text or contents added to them.

As you shrink or delete files from time to time, these frequent changes to the storage medium leave a lot of small empty spaces between the files.

There is also a difference between the actual size of a file and its size on the disk.

Some files will not fill up their last block entirely and therefore leave some space in it that is wasted.

Over time these gaps and wasted spaces add up, which means there may not be enough consecutive, contiguous blocks available to store a new file as a whole in one run.

In such cases the new file will need to be stored in fragments.

File fragmentation increases the burden of the file system significantly.

This is because the system will have to find out and collect every bit of the file located in different blocks in the disk every time a user program makes a request for such a file.

The same work overload will apply while saving such a file every time the user finishes working on it.

File fragmentation may also happen even when a file is written for the first time.

This is because the size of the file may be huge and there may not be enough continuous or contiguous blocks available in the partition to store it as a whole.
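
How a gap left by a deleted file leads to fragmentation can be shown with a toy disk of twelve blocks. This is a deliberately naive sketch: the allocator simply takes the first free blocks it finds, and the file names and function names are made up.

```python
def allocate(disk, name, n_blocks):
    """Place a file in the first free blocks found, wherever they are."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < n_blocks:
        raise OSError("disk full")
    for i in free[:n_blocks]:
        disk[i] = name

def delete(disk, name):
    """Free every block that belongs to the file."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None

def fragments(disk, name):
    """Count the separate contiguous runs that hold the file."""
    count, previous = 0, None
    for owner in disk:
        if owner == name and previous != name:
            count += 1
        previous = owner
    return count

disk = [None] * 12
allocate(disk, "A", 3)
allocate(disk, "B", 2)
allocate(disk, "C", 3)
delete(disk, "B")          # leaves a 2-block gap between A and C
allocate(disk, "D", 5)     # needs 5 blocks, but the gap only holds 2
print(disk)
print(fragments(disk, "D"))
```

File D ends up split into two pieces, one in the old gap and one after C, so reading it means visiting two separate places on the disk.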

File fragmentation is one specific reason that makes an operating system slow, especially when the file system itself grows old.

However, the good news is that you need not worry about file fragmentation anymore these days.

This is because modern file systems are more sophisticated and use smart algorithms.

This helps them to detect chances of file fragmentation early and avoid it as much as possible.

The good thing about the ext4 file system is that its smart algorithms also do some pre-allocation.

This is done by reserving specific blocks for a file before they are actually needed. This ensures that the file will not get fragmented if and when it gets bigger over time.

Now, you may ask how exactly the file system determines the number of blocks required for pre-allocation.

Well, it is defined in the length field of the inode object of the file.

Moreover, the ext4 file system uses a specific type of allocation method which is called the delayed allocation method.

The basic concept followed in this method is to accumulate the allocation requests in a buffer rather than allocating data blocks one at a time during each write.

The buffered data is then written to the disk in one go, without needing to call the block allocator of the file system every time a write request is made.

This further helps the file system to make superior choices when it comes to distributing the available block space.

For example, the file system can place the bigger and smaller files separately.

Otherwise, if a smaller file sits between two large files, deleting it will leave a small gap between them that may remain unused.

Also, when the files are well spread out, it will leave enough spaces between the data blocks.

Delayed allocation helps the file system avoid fragmentation more actively, which improves its performance significantly and makes files easier to manage.
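
The idea behind delayed allocation can be sketched in a few lines. This is a simplified model and not how ext4 is actually implemented: the class name, the 4-byte block size, and the counters are all assumptions made for the example.

```python
class DelayedAllocator:
    """Toy model: writes are buffered, and blocks are allocated only on flush."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.pending = {}          # file name -> buffered bytes
        self.extents = {}          # file name -> (start block, block count)
        self.next_free = 0
        self.allocator_calls = 0

    def write(self, name, data):
        """Buffer the write; no blocks are allocated yet."""
        self.pending.setdefault(name, bytearray()).extend(data)

    def flush(self):
        """Allocate once per file, now that its full size is known."""
        for name, data in self.pending.items():
            count = -(-len(data) // self.block_size)   # ceiling division
            self.extents[name] = (self.next_free, count)
            self.next_free += count
            self.allocator_calls += 1
        self.pending.clear()

fs = DelayedAllocator()
for chunk in (b"ab", b"cdef", b"ghi"):    # three separate write requests
    fs.write("log.txt", chunk)
fs.flush()
print(fs.extents, fs.allocator_calls)
```

Three write requests result in a single call to the allocator, which can then hand out one contiguous run of three blocks because it knows the total size up front.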

File manager programs:

As you may know, there is a logical layer in the file system. This layer plays a significant role in the file management since it provides it with an API that enables the user applications to carry out different types of file operations such as:

  • Read
  • Write
  • Edit
  • Delete and
  • Execute.

This API of the file system is a low-level mechanism. It is intended for computer programs, shells, and runtime environments rather than for daily use by people.
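
To give a feel for how a program uses this API, here is a short sketch in Python. Python's built-in `open` and `os.remove` are thin wrappers over the operating system's file calls; the file name `demo.txt` is made up and the file lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    with open(path, "w") as f:       # write: create the file and store data
        f.write("first line\n")
    with open(path, "a") as f:       # edit: append to the existing file
        f.write("second line\n")
    with open(path) as f:            # read: load the contents back
        content = f.read()

    os.remove(path)                  # delete: remove the file from the directory
    still_exists = os.path.exists(path)

print(content, end="")
print(still_exists)
```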

For daily use, operating systems offer convenient and simple file management utilities.

These features of the operating systems are enough for dealing with your daily file management needs.

There are different types of such useful file manager programs available in different types of operating systems such as:

  • Windows operating system comes with File Explorer
  • Mac OS comes with Finder and
  • Ubuntu comes with Nautilus.

All these file manager programs use the API of the logical file system under the hood to offer the best performance.

In addition to the GUI tools, operating systems also expose the file system APIs through Command Line Interfaces.

For example, on the Windows operating system you can use the Command Prompt and on Mac and Linux operating systems you can use the Terminal.

All these are text-based interfaces and will help you significantly to perform all types of file operations as text commands.

Managing file access:

File access management is an important job of the file system.

Proper access management ensures that not just anyone can access the files owned by another person and modify or remove them at will.

The primary objective of file access management is to prevent unauthorized access to the files.

Therefore, the modern file systems are equipped with mechanisms that control the access of users and their capabilities as well.

This is done with the help of specific data sets pertaining to file ownership and access permissions.

This data is stored in a particular data structure which is called ACL or Access Control List on Windows operating systems and ACE or Access Control Entries on Unix-like operating systems such as Linux and Mac OS.

This particular feature is also offered in the Command prompt or Terminal.

You can use it to modify file ownerships or restrict permissions of each file that you want, directly from the Command Line Interface.
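
Permissions can also be changed from a program. The sketch below uses the classic owner/group/other permission bits of Unix-like systems rather than full access control lists, and it assumes a Linux or Mac system; on Windows, `os.chmod` can only toggle the read-only flag. The file name `private.txt` is made up.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "private.txt")
    with open(path, "w") as f:
        f.write("secret")

    os.chmod(path, 0o600)    # owner may read and write; group and others get nothing
    mode = stat.S_IMODE(os.stat(path).st_mode)
    owner_can_write = bool(mode & stat.S_IWUSR)
    others_can_read = bool(mode & stat.S_IROTH)
    print(oct(mode), stat.filemode(os.stat(path).st_mode))
```

This is the programmatic equivalent of running `chmod 600 private.txt` in the Terminal.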

Data integrity management:

Another significant job of the file system is to ensure that the integrity of data is maintained all through.

When you make some changes in an existing file and save it, the word processor program will send a ‘write’ request to the API of the file system.

This request will be sent eventually to the physical layer after which the file may be stored on more than a few blocks.

Well, so far so good. But what happens if the system crashes while the older version of the file is being replaced on the disk with the new version you want to save?

This would have been a real worry with an older file system such as ext2 or FAT32.

In such situations, the data may be only partially written to the disk, leaving the file corrupted.

However, such incidents are less likely to occur when you use the modern file systems.

This is because the modern file systems use a special method known as journaling.

In this particular method, the file systems record each operation that is supposed to happen but has not happened yet in the physical layer.

The primary objective to do so is to keep a track of all the changes that have not been made physically to the file system.

This journal is ideally a particular allocation on the storage disk where every writing effort is stored as a transaction initially.

The change is made in the file system only when the data is written and placed physically on the storage device.

If, during the time of writing the system fails or crashes, the file system will identify the partial transaction and roll it back as if nothing had happened.

However, in that case, the new content that you have written on the existing file may still be lost.

The good thing is that the existing data on the existing file will not be lost and remain intact.

Modern file systems such as NTFS, ext3, and ext4 use this journaling technique, which helps prevent corruption of the file being worked on in the event of a system failure. APFS protects against the same problem with a copy-on-write design.
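
The behaviour described above, where an interrupted write is rolled back and the old data survives, can be modelled with a toy journal. This is a heavily simplified sketch and not the on-disk format of any real file system: the class name, the dictionary standing in for disk blocks, and the `replay` step are all assumptions made for illustration.

```python
class JournaledStore:
    """Toy model: changes are logged first and applied only once committed."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # stands in for the data on disk
        self.journal = []            # pending transactions

    def log(self, changes):
        """Record the intended changes before touching any block."""
        txn = {"changes": dict(changes), "committed": False}
        self.journal.append(txn)
        return txn

    def commit(self, txn):
        """Mark the transaction as completely recorded."""
        txn["committed"] = True

    def replay(self):
        """Apply committed transactions and discard the incomplete ones."""
        for txn in self.journal:
            if txn["committed"]:
                self.blocks.update(txn["changes"])
        self.journal.clear()

store = JournaledStore({0: "old"})

t1 = store.log({0: "new", 1: "more"})
# Simulated crash: t1 was never committed.
store.replay()
after_crash = dict(store.blocks)     # the old data is intact, the new data is lost

t2 = store.log({0: "new"})
store.commit(t2)
store.replay()
after_commit = dict(store.blocks)    # the committed change has been applied
print(after_crash, after_commit)
```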

Conclusion

It is not possible to describe the file system and its working process in one sentence. It is quite a complex topic.

Having reached the end of this article, you now know quite a lot about file systems. However, do not consider this to be the end of your learning.