
Re: [LI] Question regarding the Linux file system i/o



On Sat, Nov 27, 1999 at 05:05:52PM +0530, Satya wrote:

The answers to your question are not simple. Here are some references:

http://web.mit.edu/tytso/www/linux/ext2.html
http://www.idiom.com/~beverly/reiserfs.html

> Hi,
> 
> We are using RedHat Linux 6.0. In our application we also use the Linux file
> system and the database - though we are heavily tilting towards the Linux file
> system at the moment. The Linux kernel is 2.2.10.

If I were you, I'd use a database for large amounts of data. General-purpose
filesystems tend to be optimized for "normal usage" by interactive users, not
for one application's particular access pattern.

> 
> I wanted to ask questions on these issues:
> 1. What is the right directory structure one has to create - if I have to
> create a set of files for every user of the system?
> 	Ex:
> 	All files in a single directory or each set of files in separate
> directory.

I remember reading somewhere that SGI's XFS is capable of handling millions
of files in the same directory. I assume that other filesystems don't have
that capability.
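
If you stay on ext2, one common way to avoid a single huge directory is to
spread the per-user directories over a few levels of subdirectories chosen by
a hash of the user name. The following is only a rough sketch of that idea,
not something taken from the references above: the function names
(user_dir, create_user_dir), the two-level layout and the choice of SHA-256
are all my own assumptions for illustration.

```python
import hashlib
import os


def user_dir(root, username, levels=2, width=2):
    """Return a nested directory path for `username` under `root`.

    Hypothetical layout: the first levels * width hex digits of a hash of
    the user name select the intermediate subdirectories, so each
    intermediate directory holds at most 16 ** width subdirectories.
    """
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, username)


def create_user_dir(root, username):
    """Create (if needed) and return the directory for `username`."""
    path = user_dir(root, username)
    os.makedirs(path, exist_ok=True)
    return path
```

With the defaults, create_user_dir("/data", "satya") returns a path of the
form /data/xx/yy/satya, where xx and yy are hex digits derived from the name.
Each user's set of files then lives in its own small directory.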


> 2. Is there a recommended limit on the number of files in a single directory?
> When does the performance of the file system degrade?

The performance degrades when the locality of reference rule is violated.
When related files are no longer in the same or "nearby" cylinder groups
(called "block groups" in the ext2 reference above), you get bad performance.

In a filesystem where files are constantly being created and destroyed, it is
hard to predict the size of a directory or file and to reserve space for data
that may be created in the future. Most implementations use rules of thumb
(read the references above) that work OK for the general case.

This is where databases with their boutique data structures come into play.
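
Since I can't give you a hard limit for question 2, it may be easier to
measure on your own machine. Below is a minimal sketch that fills one
directory with empty files and times os.stat() on a sample of them. The
function name time_lookups and its parameters are made up for this example.
Note that the files have just been created, so the lookups will mostly hit
the kernel's caches; treat the result as a best case, not as a measure of
cold disk performance.

```python
import os
import tempfile
import time


def time_lookups(n_files, n_lookups=1000):
    """Create n_files empty files in one fresh temporary directory and
    return the average number of seconds per os.stat() call on a sample
    of up to roughly n_lookups of those files."""
    if n_files < 1:
        raise ValueError("n_files must be at least 1")
    with tempfile.TemporaryDirectory() as d:
        names = [os.path.join(d, "f%07d" % i) for i in range(n_files)]
        for name in names:
            open(name, "w").close()
        # Take an evenly spaced sample so we touch the whole directory.
        step = max(1, n_files // n_lookups)
        sample = names[::step]
        start = time.perf_counter()
        for name in sample:
            os.stat(name)
        elapsed = time.perf_counter() - start
    return elapsed / len(sample)
```

Calling time_lookups(n) for n = 100, 1000, 10000 and so on, and comparing
the averages, should give a rough idea of where lookups in a single
directory start to slow down on your filesystem.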

> 3. Has anybody done an experiment on database query vs. file system i/o
> (seek - basically)? Which one is recommended - though my gut feeling says it
> is database (I am using Oracle 8)

For a very specific case, a filesystem seek can be made faster than a
database query. But the optimal strategy very much depends on the data you
have. ext2fs knows nothing about your data. Oracle does (or at least there
is a way to tell it).
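
One such "very specific case" is a file of fixed-size records, where the
position of record i is simply i times the record size, so a single seek
finds it. Here is a small sketch of that idea. The record layout (a 4-byte
user id followed by a 28-byte name field) and the function names are
hypothetical and chosen only for illustration. Names longer than 28 bytes
would be truncated by this layout.

```python
import struct

# Hypothetical record: little-endian 4-byte unsigned id + 28-byte name field.
RECORD = struct.Struct("<I28s")


def write_records(path, records):
    """Write (user_id, name) pairs to `path` as fixed-size records."""
    with open(path, "wb") as f:
        for user_id, name in records:
            # struct pads short names with NUL bytes up to 28 bytes.
            f.write(RECORD.pack(user_id, name.encode("utf-8")))


def read_record(path, index):
    """Read record number `index` (0-based) with a single seek."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        data = f.read(RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError("no record at index %d" % index)
    user_id, raw_name = RECORD.unpack(data)
    return user_id, raw_name.rstrip(b"\0").decode("utf-8")
```

This only works because the application knows the shape of its data. As
soon as you need variable-length records, lookups by anything other than
position, or concurrent updates, you are rebuilding what the database
already gives you.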

	-Arun
