I made a change in the Blogger configuration to ease later blogging work. It is possible that older entries are not correctly formatted.


Sunday, 2 November 2008

Oprofile - What art thou?

O.K., apart from making silly references to films I actually have not really liked: what am I doing?

After reading a few mails from the Linux Kernel Mailing List, I found the following tool, which seems useful: oprofile. I must admit I still do not have a clear idea of all the possibilities this tool offers.

The first thing to know is that it is a profiler, and one capable of giving a lot of information about a program with quite a low overhead. But what is a profiler?

A profiler collects information on the running processes of a computer in order to analyze the performance of a program (see the Wikipedia article on performance analysis).

It gives the possibility to look at the following aspects of a system:

  • Call graphs (i.e. it looks at the graph of the function calls of programs)
  • libraries used by a program
  • instruction-level information collection (down to the assembly level)
I will probably continue to take a few looks at the possibilities of this tool in the next few weeks.

Monday, 15 September 2008

Autotest - WOW they did it!!!!

Well, I was reading, as always, a little about this automated testing approach, and I was again on this page: http://autotest.kernel.org/. Autotest is a piece of software which allows automated testing of the Linux kernel, that is, it is an infrastructure to build, boot and ... Linux kernels. It has quite a lot of features:
  • bisection...
  • building
  • booting
  • filesystem check
  • python based library to automate scripts
There is also a good presentation on autotest.

Wednesday, 10 September 2008

A central Linux Documentation page

As I was looking for a way to submit a patch to the documentation of the kernel about the i386 --> x86 (as well as x86_64) change, I came upon an article about Linux documentation, which gave a pointer to the work of Rob Landley at kernel.org/doc. I may take a look at what could be missing tomorrow.

Tuesday, 9 September 2008

Useful appendices :-)

I have been reading an excellent book on Linux: Wolfgang Mauerer, Linux-Kernelarchitektur: Konzepte, Strukturen und Algorithmen von Kernel 2.6, Hanser 2004, ISBN 3-446-22566-8. Some of the information I write on this blog has been largely adapted from or influenced by the reading of this book. Very usefully, the book also has a website with PDF versions of the appendices which are not in the book. That is a bit strange, but still extremely useful. The book is in German, so it will not be useful for everybody. There is also a list of useful documentation links: http://www.linux-kernel.de/docs/index.html. In particular:
  • Online Documents about Kernel
  • important RFCs (TCP/IP..., Differentiated Services fields)
  • GNU tool information
  • ELF format
  • important documentation from the kernel
So I've got to say this is really a wonderful book on Linux. I just happened to learn from the author that he is writing a new, more current version of it.

Translation Lookaside Buffer, aka TLB

In a few words, from the Wikipedia article: a TLB is a CPU cache that the memory management hardware uses to improve the speed of virtual address translation (Wikipedia).

Much information comes from this article.

The idea is that CPUs keep an associative memory to cache page table entries (PTEs) of virtual pages which were recently accessed.

When the CPU must access virtual memory, it first looks up the requested virtual page number in the TLB.

If an entry is found (a TLB hit), the CPU uses the value of the PTE it retrieved and calculates the physical address directly.

If no entry is found (a TLB miss), the miss is handled in one of two ways, depending on the architecture:

  • through hardware: the CPU walks the page table itself to find the correct PTE. If one is found, the TLB is updated; if none is found, the CPU raises a page fault, which is then handled by the operating system.
  • through software: the CPU raises a TLB miss fault. The operating system intercepts it and invokes the corresponding handler, which walks the page table. If the PTE is found and present, the TLB is updated; if it is not present, the page fault handler takes over.

Mostly, CISC architectures (e.g. IA-32) use the hardware approach, while RISC architectures (e.g. Alpha) use the software one. IA-64 uses a hybrid approach, because the hardware approach is faster but less flexible than the software one.

Replacement policy

If the TLB is full, some entries must be replaced. Depending on the miss handling strategy, different replacement policies exist:

  • Least recently used (aka LRU)
  • Not recently used (aka NRU)
Even when the TLB miss mechanism is implemented in software, the replacement strategy can be implemented in hardware; this is done by a number of newer architectures such as IA-64.

Ensuring coherence with the page table

Another issue is keeping the TLB coherent with the page table it caches: when the operating system modifies a page table entry, it must also invalidate (flush) the corresponding TLB entry, otherwise the CPU may keep using a stale translation.

Wednesday, 3 September 2008

Read Copy Update

Read Copy Update (aka RCU) is another synchronisation mechanism, designed to avoid reader-writer locks.

An excellent explanation, in three parts by Paul McKenney and Jonathan Walpole, can be found at LWN.net:

The basic idea behind it is that when a resource is modified, a new updated structure is put in its place and the old structure is not discarded right away: it is kept until the references to it held by other processes are dropped. This can be seen as similar to garbage collection but, as noted in What is RCU? Part 2: Usage, the old structure is not discarded automatically when there are no references any more; the programmer must indicate the critical read sections of the code.

There is an interesting page on RCU usage arguing that this technique is used more and more in the kernel as a replacement for reader writer locks.

Tuesday, 2 September 2008

Kernel Locking mechanisms

An important aspect of programming in an environment with threads and processes is preventing the different processes from interfering with each other at the wrong time.

In Linux, a number of methods are used to ensure that the data or code sections of processes are not disturbed by others. These methods are:

  • atomic operations
  • spinlocks
  • semaphore
  • reader writer locks

These locks and mechanisms are in the kernel space. Other methods or locking mechanisms are used in the user space.

atomic operations

The idea behind atomic operations is to perform very basic changes to a variable which cannot be interrupted by other processes, because they are so small. For this, a special data type called atomic_t is used.

On this data type, a number of atomic operations can be performed:

function                                   description
atomic_read(atomic_t *v)                   read the variable
atomic_set(atomic_t *v, int i)             set the variable to i
atomic_add(int i, atomic_t *v)             add i to the variable
atomic_sub(int i, atomic_t *v)             subtract i from the variable
atomic_sub_and_test(int i, atomic_t *v)    subtract i from the variable; return true if the result is 0, else false
atomic_inc(atomic_t *v)                    increment the variable
atomic_inc_and_test(atomic_t *v)           increment the variable; return true if the result is 0, else false
atomic_dec(atomic_t *v)                    decrement the variable
atomic_dec_and_test(atomic_t *v)           decrement the variable; return true if the result is 0, else false
atomic_add_negative(int i, atomic_t *v)    add i to the variable; return true if the result is negative, else false

Note that I discussed in another post the local variables for CPUs.

spinlocks

This is the most commonly used kind of lock, above all to protect sections from access by other processes for short periods.

While the lock is held elsewhere, the kernel continuously checks whether it can be taken. This is an example of busy waiting.

spinlocks are used in the following way:

spinlock_t lock = SPIN_LOCK_UNLOCKED;
...
spin_lock(&lock);
/** critical operations */
spin_unlock(&lock);

Due to the busy waiting, if the lock is not released... the computer may freeze. Therefore spinlocks should not be held for long periods.

semaphores

Unlike with spinlocks, the kernel sleeps while waiting for the release of a semaphore. This kind of structure should therefore only be used for locks which are held for a certain length of time; for short locks, spinlocks are recommended.
DECLARE_MUTEX(mutex);
....
down(&mutex);
/** critical section*/
up(&mutex);

The waiting processes then sleep in an uninterruptible state until the lock is released; such a process cannot be woken up by signals during its sleep.

There are other alternatives to the down(&mutex) operation:
  • down_interruptible(&mutex): the process can be woken up by signals while waiting
  • down_trylock(&mutex): tries to take the lock and returns immediately; the process never sleeps

For the user space, there are also futexes.... But this is another story.

reader writer locks

With this kind of lock, any number of processors may read the locked data structure concurrently, but when the structure is to be written, it may be manipulated by only one processor at a time.

Friday, 29 August 2008

Linux Kernel Mailing List FAQ

One page I had not noticed, and which seems to contain lots of interesting things: the Linux Kernel Mailing List FAQ. There is some nice info for newbies... not that much information in itself, but a lot of very nice pointers to different useful places.

Linux Kernel: Copy on Write

Copy on Write is an important technique in Linux kernel memory management. The basic idea is to avoid creating unnecessary copies of structures when creating new processes. For this, when a child process is created, the kernel first gives both parent and child read-only access to the shared pages. When one of the processes needs write access, a page fault is raised at the moment it tries to write to memory, indicating that a new copy should be created for the requesting process. Thus, the data is actually copied only when a process actually tries to write.

Some information is of course available on Wikipedia on the Copy on Write page. Note that Copy on Write is commonly abbreviated COW.

Sunday, 29 June 2008

Git-bisect instructions

After reading a bit of a long thread on testing bugs for the Linux kernel, I found a small HOWTO for running bisects of the Linux kernel.

I write it down again here to make sure it is easier to find:

# install git

# clone Linus' tree:
git clone \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

# start bisecting:
cd linux-2.6
git bisect start
git bisect bad v2.6.21
git bisect good v2.6.20
cp /path/to/.config .
# start a round
make oldconfig
make
# install kernel, check whether it's good or bad, then:
git bisect [bad|good]
# start next round
After about 10-15 reboots you'll have found the guilty commit
("... is first bad commit").
More information on git bisecting:
man git-bisect

But this is not all... Bisection can also be used as a means of automatically checking whether a given test works or not. Suppose you have a script to run a test, called test.sh. Then you can let the bisection run on its own:

$ git bisect start v1.3 v1.1 -- # v1.3 is bad, v1.1 is good
$ git bisect run ~/test.sh

For more information on this, a good explanation can be found at this page.

Moreover, there is also the possibility to restrict the search to commits which changed a given part of the repository. For example, if you know that the bug is located in certain subdirectories, you can specify these directories:

$ git bisect start -- mydirectory1 mydirectory2/directory3

Friday, 27 June 2008

Ketchup - Linux Kernel Automatic Patcher

I discovered an interesting script: ketchup, which can be used to automatically patch the kernel from one version to the next.