What is Computer File System & How Does It Work?


What is Computer File System?

In simple terms, a file system determines how a file is to be named, stored, and retrieved from the storage device.

The file system also helps in copying, editing, or deleting a file and it also plays a major role when you visit a website on the internet or download a file.

KEY TAKEAWAYS

  • A file system is used by the operating system of a computer to store data in files and retrieve and load them when needed from the storage device.
  • Its function is to help in copying, moving, editing, printing or deleting a file and also to help in accessing and downloading a file from a website.
  • The file system helps in determining the conventions, character limits, types, and length of the suffix when naming a file.
  • A file is defined by its creator and user and consists of a series of bits, bytes, and records, which the file system organizes on the storage device.

Understanding Computer File System and How Does It Work?

A computer file is a set of interrelated information. This information is recorded and stored on a non-volatile or secondary storage device such as:

  • A magnetic disk
  • An optical disk and
  • Tapes.

A file is actually a collection of data that a program can use as a medium to receive input and give output.

The operating system of the computer uses the file system in order to store and retrieve files from the storage device and load them.

In technical terms, a file is a series of records in bits and bytes and is defined by the creator and user.

All these files have a logical location on a storage device for retrieval when required.

The files are organized and defined by the file system.

The file system, therefore, can be considered an index of all the files stored on a storage device along with the data contained in them.

It specifies different facets of the file, such as:

  • The conventions for naming files
  • The maximum character limit
  • The types of characters to be used
  • The length of the suffix and more

As for naming the files, case sensitivity depends on the file system. File names are not case sensitive by default on NTFS and APFS, while they are case sensitive on ext4.

In addition to that, a file system also contains other important information, such as:

  • The file size
  • The attributes of the file
  • The location details

There is also metadata to indicate the position of a file in the directory hierarchy. The metadata of the file system also identifies the free blocks available for storage on the drive along with the total space that is available on it.
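The metadata described above can be inspected from a program. The following is a minimal sketch in Python, assuming a scratch file with the made-up name `notes.txt`; it reads the size, type, permissions, identifier, and modification time that the file system keeps for that file.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a few of the attributes the file system keeps for a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                   # file size
        "is_directory": stat.S_ISDIR(info.st_mode),   # file type
        "permissions": stat.filemode(info.st_mode),   # access attributes
        "identifier": info.st_ino,                    # unique file identifier
        "modified": time.ctime(info.st_mtime),        # last modification time
    }


# Create a small scratch file (hypothetical name) and inspect it.
with tempfile.TemporaryDirectory() as scratch:
    sample_path = os.path.join(scratch, "notes.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("hello")
    sample_info = describe_file(sample_path)

print(sample_info["size_bytes"], sample_info["is_directory"])
```

The exact values of the identifier and the permission string depend on the operating system and file system in use.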

To understand the file system and its working process more comprehensively, there are several other relevant things that you will need to understand.

File System Tree:

It is the structure of the file directory that helps the file system use a specific format to identify the path to a file.

In the file tree structure, a file is placed in a directory, a subdirectory, or a folder in the Windows operating system at a preferred location.

In short, all operating systems of a computer or a mobile device have a file system wherein all the files are located in some sort of hierarchical tree structure.
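To see how a path identifies a location in this tree, here is a small sketch using Python's `pathlib`. The two paths are invented for illustration and do not need to exist on your machine, since the "pure" path classes only parse the text of a path.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical example paths; nothing is read from the disk here.
unix_path = PurePosixPath("/home/alice/reports/2024/summary.txt")
windows_path = PureWindowsPath(r"C:\Users\alice\reports\summary.txt")

# Each component is one level of the directory tree, starting at the root.
print(unix_path.parts)

# The parent is the directory that contains the file.
print(unix_path.parent)

# On Windows the tree starts at a drive letter rather than a single root.
print(windows_path.drive)

# The suffix is the extension that follows the file name.
print(unix_path.suffix)
```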

This means that proper partitions should be made and put in the right place before creating the files and directories. This partition is a dedicated space on the hard disk or other storage device which is managed separately by the operating system.

Each partition holds one file system, and most operating systems allow creating multiple partitions on a single disk.

This is advantageous in the way that, in the case of one file system getting corrupted, there will be another where the files will be safe.

Objectives:

The file system has a few specific objectives to accomplish. Some of them are:

  • Providing I/O support to different types of storage devices
  • Minimizing the chances of loss of data and
  • Helping the operating system standardize the routines of the I/O interfaces for user processes.

The file system also provides I/O support to a variety of users in a varied set of systems and environments.

Properties:

The computer file systems also come with different important properties, and some of them are:

  • It keeps the files stored in the storage device safe and they are not lost when the user logs out or shuts off the computer system.
  • The dedicated file names give access permission and ensure controlled sharing of files.

The files are arranged properly in complex structures in a file system which shows the relationships between each other.

File Structure:

A file system needs to have a predefined file structure which the operating system can easily understand. The structure of a file, however, depends on its type.

There are mainly three types of file structures in an operating system, namely:

  • A text file – This refers to a sequence of characters that are organized in lines.
  • An object file – This refers to a sequence of bytes that is structured into blocks.
  • A source file – This refers to a sequence of procedures and functions.

Attributes:

Depending on the file structure, the file system will also have varied attributes. The important attributes used in an operating system are:

  • File name – This is the only information regarding the file that is kept in human-readable form.
  • File identifier – This refers to the unique tag number in the file system.
  • File location – This indicates where the file can be found on the storage device.
  • File type – This is an important attribute that is required for the file system to support different types of files.
  • File size – This attribute displays how big the file is, in bytes or in larger units such as KB or MB.
  • File protection – This trait of the file system helps it to assign and control the right to access a file to read, write, or execute a file.

Other significant attributes are the date and time stamps and the security details of a file, which help with monitoring and protection.

File Type:

Depending on the type of operating system, the types of files in a file system can also vary and include the following:

  • Character special file – This refers to the hardware files that read and write data character by character.
  • Ordinary files – These files store user information, text, databases, and executable programs and allow the users to perform specific tasks such as adding, deleting, and modifying.
  • Directory files – This refers to the storage of other information relevant to the file. It is basically a folder that helps in storing and organizing several files.
  • Special files – Also referred to as device files, these files represent the physical devices such as printers, flash drives, disks, networks, and others.

The file type actually indicates the capability of the operating system to distinguish between different types of files such as source files, binary files, and text files.

File Access Methods:

The file system also works on the access methods of the files and the ways in which these files are read into memory. Usually, the operating systems support a single access method.

However, there are specific operating systems that may support different access methods. These are:

  • Sequential access – Here, the operating system reads the records in a fixed order, one after another, from the beginning of the file. Most compilers and editors follow this method.
  • Direct random access – Just as the name implies, this file access method, which is also referred to as the random method, allows the operating system to read or write any record directly, in any order. It uses the unique address of each record to locate it.
  • Index sequential access – In this method, an index is built for every file, with pointers to the various blocks of the file. The OS searches the index sequentially and then uses the pointer to access the record directly.

Though all methods are good, the index sequential method is preferred a bit more because it allows using multiple levels of indexing. This increases the efficiency of accessing the files and, at the same time, reduces the time taken to access a particular record.
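The difference between sequential and direct access can be sketched in a few lines of Python. This is only an illustration, assuming fixed-size records of 16 bytes held in an in-memory stream; the record size and the record contents are made up.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def make_record(text):
    """Pad a short ASCII string to exactly RECORD_SIZE bytes."""
    return text.encode("ascii").ljust(RECORD_SIZE, b" ")


def read_sequential(stream, target_index):
    """Read records one after another until the target is reached."""
    stream.seek(0)
    record = b""
    for _ in range(target_index + 1):
        record = stream.read(RECORD_SIZE)
    return record.decode("ascii").rstrip()


def read_direct(stream, target_index):
    """Jump straight to the record by computing its byte offset."""
    stream.seek(target_index * RECORD_SIZE)
    return stream.read(RECORD_SIZE).decode("ascii").rstrip()


# One hundred made-up records stored back-to-back.
records = io.BytesIO(b"".join(make_record(f"record-{i}") for i in range(100)))

print(read_sequential(records, 42))  # touches 43 records on the way
print(read_direct(records, 42))      # touches only the requested record
```

Both calls return the same record. The sequential version has to pass over every earlier record, while the direct version computes the address and reads one record only.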

Disk Space Allocation:

Ideally, there are three basic types of space allocation methods followed. These are:

  • Contiguous allocation – Here, the OS assigns disk addresses in linear order and every file occupies a contiguous run of blocks on the disk. However, it may have an external fragmentation issue.
  • Linked allocation – Each file is a linked list of blocks, and the directory contains a pointer to the first block. The method is not fit for direct file access because the chain must be followed block by block, though there is no external fragmentation issue because any free block can be used.
  • Indexed allocation – In this method, every file has an index block that holds all the pointers to the blocks of that particular file, and the directory contains the address of the index block.
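The linked and indexed methods can be compared with a toy model. The sketch below is not how any real file system stores its pointers; the block numbers and contents are invented, and the "disk" is just a Python dictionary.

```python
def read_linked(first_block, next_pointer, data):
    """Linked allocation: follow the chain of pointers from the first block."""
    contents, block = [], first_block
    while block is not None:
        contents.append(data[block])
        block = next_pointer[block]
    return "".join(contents)


def read_indexed(index_block, data, position=None):
    """Indexed allocation: the index block lists every data block of the file."""
    if position is not None:
        return data[index_block[position]]  # direct access to one block
    return "".join(data[block] for block in index_block)


# Toy disk: block number -> contents (all values are made up).
disk = {9: "FI", 16: "LE", 1: " S", 10: "YS", 25: "TEM"}

# Linked: the directory stores only block 9; each block points to the next.
next_pointer = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}

# Indexed: the directory points to the index block, which holds all pointers.
index_block = [9, 16, 1, 10, 25]

print(read_linked(9, next_pointer, disk))
print(read_indexed(index_block, disk, position=3))
```

With the linked layout, reaching the fourth block means walking through the first three. With the index block, the fourth pointer can be looked up immediately.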

File Directories:

Directories are called folders in the Windows operating system. A directory may contain any number of files, or none at all. There may also be subdirectories in the primary directory.

The directories maintain information regarding the files in a file system. The different types of information that are contained in a directory include:

  • The name of the file that is shown to the user
  • The type of the file
  • The positions of the current and next read and write pointers
  • The location of the file header on the device
  • The size of the file along with the blocks, bytes, and words in it and
  • The usage details, including the date and time of file creation, along with access and modification details.

The directory also maintains access control for the files to read, write, execute, or delete, which provides protection for the file.

Need:

The need for a file system is immense. Without it, the storage device would hold one huge chunk of data stored back-to-back.

It would be impossible for the operating system to distinguish one file from another and access the right file in a short time.

The file systems, which are based on the principles of traditional paper-based file management systems, store documents in files and the files into directories.

This makes it easy and quick to find them as and when required, thereby making your electronic device useful.

The responsibilities of a file system include:

  • File management or ‘bookkeeping’
  • Space management
  • Data encryption
  • Metadata
  • Data integrity and
  • File access control.

How Does It Work?

The entire working process of the file system typically starts with partitioning and formatting the storage device before it is used.

Ideally, the working process of the file system depends heavily on the space allocation on the disks.

Disk Partitioning:

Partitioning, just as the word implies, is the process of dividing the storage device into multiple logical regions.

This makes them a sort of separate storage device within a device, which helps the operating system manage them separately and more efficiently.

Therefore, the basic need for partitioning the disk is to help the file system manage files.

For Linux operating systems, a typical installation uses up to three partitions, and each partition has a different use case, such as:

  • One is used by the operating system.
  • One is used for the users’ files.
  • One is used as an optional swap partition.

Windows and Mac operating systems also have almost the same partition structure, with the only difference being that they do not have a dedicated swap partition.

Instead, swap space is managed as files within the partition where the operating system is installed.

Multiple partitions also allow installing different operating systems and choosing a different one to boot the system up every time.

It also facilitates diagnostics and recovery utilities. Apart from that, it allows keeping crucial system files separate from the ordinary files as well.

Typically, the Windows operating system assigns each partitioned drive a letter. For example, the main partition on Windows is C: or drive C where Windows is installed.

However, in UNIX operating systems, the partitions are mounted at normal directories under the root directory.

Disk Partitioning Methods:

While partitioning a storage device, two methods are mainly followed. These are:

  • Master Boot Record (MBR) and
  • GUID Partition Table (GPT).

Irrespective of the method, the first few blocks on the storage device typically contain the vital data about the partitions.

The MBR method comes from BIOS-based systems. The MBR occupies the first sector of the disk and contains vital information such as:

  • The boot loader that initiates the first phase of the booting process and
  • The partition table, which contains information about the partitions.

It works with BIOS-based firmware, which works differently than UEFI-based firmware.

Because the boot loader is located in the MBR itself, the firmware does not need to deal with any files to find it. So, the process is simple and fast.

When the system is powered on, the firmware loads the boot loader program into memory.

Once it is done, the Central Processing Unit or the CPU takes over and starts executing it.

One significant drawback of this scheme is that there is no backup of the MBR segment.

This means that, if it is corrupted somehow, the disk can no longer be booted and its partitions cannot be located until the MBR is repaired or rebuilt.
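As an illustration of what the MBR holds, the sketch below parses the partition table out of a 512-byte sector. It relies on the classic MBR layout: 446 bytes of boot loader code, four 16-byte partition entries, and the two signature bytes 0x55 0xAA at the end. The sector used here is synthetic and built in memory, not read from a real disk, and the partition values in it are made up.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # bytes 0-445 hold the boot loader code
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
ENTRY_FORMAT = "<B3xB3xII"     # status, (CHS), type, (CHS), first LBA, sectors

# A 32-bit sector count caps a partition at 2 TiB with 512-byte sectors.
MAX_MBR_BYTES = 2**32 * SECTOR_SIZE


def parse_mbr(sector):
    """Return the non-empty partition entries found in a 512-byte MBR."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for slot in range(4):  # the table has room for exactly four entries
        start = PARTITION_TABLE_OFFSET + slot * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        status, part_type, first_lba, sectors = struct.unpack(ENTRY_FORMAT, entry)
        if part_type != 0:  # type 0 marks an unused slot
            partitions.append({
                "bootable": status == 0x80,
                "type": part_type,
                "first_lba": first_lba,
                "size_bytes": sectors * SECTOR_SIZE,
            })
    return partitions


# Synthetic MBR with one bootable partition of type 0x83 (values are made up).
entry = struct.pack(ENTRY_FORMAT, 0x80, 0x83, 2048, 204800)
fake_mbr = bytes(446) + entry + bytes(48) + BOOT_SIGNATURE

partitions = parse_mbr(fake_mbr)
print(partitions)
```

Since the whole table lives in this single sector with no second copy, losing the sector means losing the description of every partition on the disk.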

On the other hand, GPT partitioning, which is used with UEFI-based firmware, stands out in that aspect.

It is much more sophisticated and does not have the same limitations as MBR, which allows only four primary partitions and, with 512-byte sectors, cannot address more than about 2 TiB.

For example, GPT typically reserves room for 128 partition entries, although the operating system may impose its own limit.

And each of these partitions can be pretty large in size. That is why modern computers are replacing MBR with GPT.

In GPT partitioning, there are different sectors. The first sector is called the Protective MBR and is kept for compatibility with BIOS-based systems because some systems still use BIOS-based firmware.

After this sector, the GPT header, the partition entries, and other GPT data structures are stored.

A backup copy of these is kept at the end of the disk so that they can be recovered if the main copy is corrupted. This backup is known as the Secondary GPT.

The boot loaders and related boot files in GPT are stored in a dedicated partition known as ESP or the EFI System Partition.

This is used by the UEFI firmware. The ESP is formatted with a FAT file system.

Formatting Partitions:

The next important step involves formatting the partition on the storage device before it can be used for file storage and management.

This process depends on the specific file system chosen and involves creating the different data structures and metadata that are to be used to manage the files in the partition.

Space Management:

The next step in the working process of the file system is space management.

Typically, every storage device is divided into units of a fixed size called sectors. A sector is the smallest unit the device can read or write, with a capacity that is typically either 512 bytes or 4096 bytes.

However, file systems use a higher-level abstraction to store files. They use blocks as storage units. Each block is made up of one or more sectors.

The file system may allocate the files on one or more blocks, depending on their size.

The file system is well aware of the used and free blocks in the partition, which helps with space management.

However, for much better and easier space management, the contiguous blocks are clustered into block groups.

Each of these block groups comes with its own data blocks and structures and contains:

  • A superblock or a metadata repository of the whole file system
  • Group descriptors that contain bookkeeping information
  • An inode bitmap to identify used and unused inodes within it
  • A block bitmap or a data structure to identify used and unused data blocks
  • An inode table that stores the inodes, each of which describes one file
  • Data blocks to store the file contents
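The idea of a block bitmap can be sketched as follows. This is a toy model with one flag per block and a simple "take the lowest free blocks" policy; real file systems pack the flags into bits and use far more careful allocation strategies.

```python
class BlockBitmap:
    """Toy block bitmap: one flag per data block, True means 'in use'."""

    def __init__(self, total_blocks):
        self.used = [False] * total_blocks

    def allocate(self, count):
        """Mark the lowest `count` free blocks as used and return their numbers."""
        free = [n for n, in_use in enumerate(self.used) if not in_use]
        if len(free) < count:
            raise OSError("no space left in this block group")
        chosen = free[:count]
        for n in chosen:
            self.used[n] = True
        return chosen

    def release(self, blocks):
        """Mark the given blocks as free again, for example after a delete."""
        for n in blocks:
            self.used[n] = False

    def free_count(self):
        return self.used.count(False)


bitmap = BlockBitmap(8)
first_file = bitmap.allocate(3)
second_file = bitmap.allocate(2)
bitmap.release(first_file)          # the first file is deleted
free_after_delete = bitmap.free_count()
third_file = bitmap.allocate(4)

print(first_file, second_file, free_after_delete, third_file)
```

Note that the third file ends up on blocks that are not all adjacent, because the freed blocks are reused first.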

That is the case with the ext3 file system. If you consider the ext4 file system, you will find that it has taken space management to a higher level with some additional features in its block groups.

In the ext4 file system, the block groups are organized into larger groups. These are called flex block groups. In the first block group of each of these flex block groups, the data structures of all its block groups are stored. These include:

  • The inode bitmap
  • The block bitmap and
  • The inode table.

This gives it a significant advantage. It leaves larger contiguous runs of data blocks free in the other block groups of each flex block group.

No matter what the structure of the block group is, managing files at this level offers better space management and enhances the performance of the file systems to a significant extent in comparison to organizing and managing files in one single unit.

Fragmentation:

More and more files are written over time, and all of these are required to be stored on the disk. Moreover, the existing files may also get bigger with new text or contents added to them.

When you shrink or delete a few files from time to time, the changes frequently leave a lot of small, empty spaces between the remaining files.

In addition, block-based storage results in a difference between the actual size of a file and its size on the disk.

There will be some files that will not fill up the entire block and therefore leave some space in it that will be wasted.
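The gap between the size of a file and its size on disk is simple arithmetic. The sketch below assumes a block size of 4096 bytes and ignores special cases such as sparse or compressed files.

```python
def size_on_disk(file_size, block_size=4096):
    """Return (blocks_used, bytes_on_disk, wasted_bytes) for one file."""
    blocks = (file_size + block_size - 1) // block_size  # round up
    on_disk = blocks * block_size
    return blocks, on_disk, on_disk - file_size


# A 5000-byte file needs two whole blocks, so part of the second is wasted.
print(size_on_disk(5000))

# A file that is an exact multiple of the block size wastes nothing.
print(size_on_disk(4096))

# Even a 1-byte file occupies a full block.
print(size_on_disk(1))
```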

These small gaps and wasted spaces add up over time, which will mean that there will not be enough consecutive free blocks available to store a new file as a whole in one run.

At such times, new files will have to be stored in fragments.

File fragmentation increases the burden on the file system significantly.

This is because the system will have to find out and collect every bit of the file located in different blocks on the disk every time a user program makes a request for such a file.

The same work overload will apply while saving such a file every time the user finishes working on it.

File fragmentation may also happen when a file is written for the first time.

This is because the size of the file may be huge, and there may not be enough continuous or contiguous blocks available in the partition to store it as a whole.
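How fragmentation arises can be shown with a small simulation. The sketch assumes a made-up disk of 12 blocks and a naive policy that always fills the lowest-numbered free blocks; the file names are placeholders.

```python
def extents(blocks):
    """Group sorted block numbers into (start, length) runs."""
    runs = []
    for n in blocks:
        if runs and runs[-1][0] + runs[-1][1] == n:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((n, 1))
    return runs


def store(disk, name, blocks_needed):
    """Fill the lowest-numbered free blocks with `name`; return the runs used."""
    free = [n for n, owner in enumerate(disk) if owner is None]
    if len(free) < blocks_needed:
        raise OSError("disk full")
    chosen = free[:blocks_needed]
    for n in chosen:
        disk[n] = name
    return extents(chosen)


def delete(disk, name):
    """Free every block that belongs to `name`."""
    for n, owner in enumerate(disk):
        if owner == name:
            disk[n] = None


disk = [None] * 12
a_runs = store(disk, "A", 3)
b_runs = store(disk, "B", 2)
c_runs = store(disk, "C", 4)
delete(disk, "B")               # leaves a 2-block hole between A and C
d_runs = store(disk, "D", 4)    # does not fit in the hole, so it is split

print(d_runs)
```

File D needs four blocks, but the hole left by B has only two, so D is stored in two separate runs and must be gathered from both whenever it is read.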

File fragmentation is one specific reason that makes an operating system slow, especially when the file system itself grows old.

However, the good news is that file fragmentation is much less of a concern these days.

This is because modern file systems are more sophisticated and use smart algorithms. This helps them detect the chances of file fragmentation early and avoid it as much as possible.

The good thing about the ext4 file system is that it also supports preallocation.

This is done by reserving specific blocks for a file before they are actually needed. This ensures that the file will not get fragmented if and when it gets bigger over time.

Now, you may ask how exactly the number of blocks required for preallocation is determined by the file system.

Well, it is defined in the length field of the inode object of the file.

Moreover, the ext4 file system uses a specific type of allocation method called the delayed allocation method.

The basic concept followed in this method is to hold newly written data in a memory buffer and postpone choosing disk blocks for it until the data is actually flushed to the disk, rather than allocating blocks one write at a time.

This avoids the need to call the block allocator of the file system every time a write request is made.

This further helps the file system make superior choices when it comes to distributing the available block space.

For example, the file system can place the bigger and smaller files separately.

Otherwise, if a smaller file is located in between two large files, it will leave a small gap between them when the smaller file is deleted. This gap may be left unused.

Also, when the files are well spread out, there is enough free space next to each file for it to grow.

Delayed allocation helps the file system avoid file fragmentation more actively, increase its performance significantly, and manage the files more easily.
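The effect of delayed allocation can be illustrated with a highly simplified model. This is not the ext4 algorithm; it only contrasts allocating one block per write request with buffering the writes and allocating per file at flush time. The file names and data are placeholders.

```python
from collections import defaultdict


def allocate_immediately(writes):
    """Assign the next free block at every single write request."""
    layout = []
    for name, _data in writes:
        layout.append(name)          # each write grabs the next block
    return layout


def allocate_delayed(writes):
    """Buffer writes in memory, then allocate each file's blocks in one go."""
    pending = defaultdict(list)
    for name, data in writes:
        pending[name].append(data)   # nothing touches the disk yet
    layout = []
    for name, chunks in pending.items():   # flush: one allocation per file
        layout.extend([name] * len(chunks))
    return layout


# Two programs appending to their own files at the same time.
writes = [("a.log", "x"), ("b.log", "y")] * 3

immediate_layout = allocate_immediately(writes)
delayed_layout = allocate_delayed(writes)

print(immediate_layout)
print(delayed_layout)
```

In the first layout the blocks of the two files are interleaved, so both files are fragmented. In the second, each file occupies one contiguous run because the allocator saw the full size of each file before choosing blocks.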

File Manager Programs:

As you may know, there is a logical layer in the file system. This layer plays a significant role in file management since it provides an API that enables the user applications to carry out different types of file operations, such as:

  • Read
  • Write
  • Edit
  • Delete and
  • Execute

This API of the file system is a low-level mechanism. It is intended for computer programs, shells, and runtime environments, and it is certainly not created for daily use by end users.
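Programs normally reach this API through the standard library of their language. The sketch below uses Python's `pathlib` to create, edit, read, rename, and delete a file inside a temporary directory; the file names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    report = Path(scratch) / "report.txt"                    # hypothetical name

    report.write_text("first line\n", encoding="utf-8")      # create and write
    with report.open("a", encoding="utf-8") as handle:       # edit by appending
        handle.write("second line\n")
    contents = report.read_text(encoding="utf-8")            # read

    final = report.with_name("final.txt")
    report.rename(final)                                      # rename
    exists_after_rename = final.exists()

    final.unlink()                                            # delete
    exists_after_delete = final.exists()

print(contents)
```

Each of these calls ends up as one or more requests to the file system, which decides where the bytes actually go on the storage device.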

Ideally, for that, you have operating systems that offer convenient and simple file management utilities. These features of the operating systems are enough for dealing with your daily file management needs.

There are different types of such useful file manager programs available in different types of operating systems, such as:

  • The Windows operating system comes with File Explorer.
  • Mac OS comes with the Finder.
  • Ubuntu comes with Nautilus.

All these file manager programs use the API of the logical file system under the hood to offer the best performance.

The APIs of the file system are exposed by the different operating systems using the Command Line Interfaces as well as the GUI tools.

For example, on the Windows operating system, you can use the Command Prompt and on Mac and Linux operating systems you can use the Terminal.

All these are text-based interfaces and will help you significantly to perform all types of file operations as text commands.

Managing File Access:

File access management is an important job of the file system. Proper access management ensures that not just anyone can access the files owned by some other person and modify or remove them as they wish.

The primary objective of file access management is to prevent unauthorized access to files.

Therefore, modern file systems are equipped with mechanisms that control the access of users and their capabilities as well.

This is done with the help of specific data sets pertaining to file ownership and access permissions.

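This information is stored in a data structure called an ACL or Access Control List, which is made up of individual ACEs or Access Control Entries. Windows operating systems rely on ACLs, while Unix-like operating systems such as Linux and Mac OS primarily use owner, group, and other permission bits and can use ACLs in addition.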

This particular feature is also offered in the Command Prompt or Terminal. You can use it to modify file ownerships or restrict permissions for each file that you want, directly from the Command Line Interface.
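The same permission changes can be made from a program. The following is a minimal sketch that assumes a Unix-like system, where permissions are expressed as owner, group, and other bits; on Windows, `os.chmod` can only toggle the read-only flag, so the results would differ. The file name is a placeholder.

```python
import os
import stat
import tempfile


def permission_string(path):
    """Return the permission string for a path, in the style of `ls -l`."""
    return stat.filemode(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as scratch:
    secret = os.path.join(scratch, "secret.txt")   # hypothetical file name
    with open(secret, "w", encoding="utf-8") as handle:
        handle.write("private")

    # Owner may read and write; group and others get no access.
    os.chmod(secret, stat.S_IRUSR | stat.S_IWUSR)
    owner_only = permission_string(secret)

    # Additionally let members of the file's group read it.
    os.chmod(secret, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    group_readable = permission_string(secret)

print(owner_only)
print(group_readable)
```

This mirrors what `chmod 600` and `chmod 640` do in the Terminal.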

Data Integrity Management:

Another significant job of the file system is to ensure that the integrity of data is maintained throughout.

When you make some changes to an existing file and save it, the word processor program will send a ‘write’ request to the API of the file system.

This request will be sent eventually to the physical layer, after which the file may be stored on more than a few blocks.

Well, so far, so good. But what happens if the system crashes while the older version of the file you modified is being replaced with the new version you want to save on the disk?

Typically, this would have caused a lot of worry if you used an older file system, such as ext2 or FAT32. In such situations, the data may be only partially written to the disk, and the file may be corrupted.

However, such incidents are less likely to occur when you use modern file systems.

This is because modern file systems use a special method known as journaling.

In this particular method, the file systems record each operation that is supposed to happen but has not yet happened in the physical layer.

The primary objective of doing so is to keep track of all the changes that have not been made physically to the file system.

This journal is ideally a particular allocation on the storage disk where every writing effort is initially stored as a transaction.

The transaction is marked as complete only when the data is written and placed physically on the storage device.

If, during the time of writing, the system fails or crashes, the file system will identify the partial transaction and roll it back as if nothing had happened.

However, in that case, the new content that you have written on the existing file may still be lost.

The good thing is that the existing data in the existing file will not be lost and will remain intact.
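The journaling idea can be sketched with a toy model. This is a simplified illustration, not the on-disk journal format of any real file system: the "disk" is a dictionary, a crash is simulated with a flag, and the class and file names are invented.

```python
class JournaledStore:
    """Toy 'disk' that records every write in a journal before applying it."""

    def __init__(self):
        self.disk = {}      # committed file contents
        self.journal = []   # transactions that are not yet finished

    def write(self, name, data, crash_before_commit=False):
        # 1. Record the intended change in the journal first.
        self.journal.append({"name": name, "data": data, "committed": False})
        if crash_before_commit:
            return          # simulated crash: the entry stays uncommitted
        # 2. Mark the transaction as committed, then apply it to the disk.
        self.journal[-1]["committed"] = True
        self.disk[name] = data
        # 3. The transaction is finished, so its journal entry can go.
        self.journal.pop()

    def recover(self):
        """After a restart, replay committed entries and drop partial ones."""
        for entry in self.journal:
            if entry["committed"]:
                self.disk[entry["name"]] = entry["data"]
        self.journal.clear()


store = JournaledStore()
store.write("essay.txt", "draft 1")
store.write("essay.txt", "draft 2", crash_before_commit=True)  # crash mid-save
pending_before_recovery = len(store.journal)
store.recover()

print(store.disk["essay.txt"])
```

After recovery the file still holds the first draft. The second draft is lost, but the file is not left half old and half new.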

Modern file systems such as NTFS, ext3, and ext4 use this journaling technique, which helps prevent corruption of a file being worked upon in the event of system failure. APFS achieves similar protection with a copy-on-write design instead of a journal.

Conclusion

The file system and its working process form quite a complex topic, and it is not possible to describe them in one sentence.

Having reached the end of the article, you now know quite a lot about them. However, do not consider this to be the end of your learning.

About Puja Chatterjee

Puja Chatterjee, a distinguished technical writer, boasts an extensive and nuanced understanding of computer technology. She is an esteemed graduate of the Bengal Institute of Management Studies (BIMS), where she honed her skills and knowledge in the tech domain. Over the span of more than 12 years, Puja has developed a deep expertise that encompasses not only technology writing, where she articulates complex technical concepts with clarity and precision, but also in the realm of client relationship management. Her experience in this area is characterized by her ability to effectively communicate and engage with clients, ensuring their needs are met with the highest level of professionalism and understanding of their technical requirements. Puja's career is marked by a commitment to excellence in both written communication within the tech industry and fostering strong, productive relationships with clients.
