Des Profundis...: 2008

Friday, 5 December 2008

Lotus Notes

Well I have been learning to use lotus notes at work... And I am not so sure how good it is. There are many things that would need to be improved. It is used as a platform for some software which thus provides as a whole an interesting CRM solution. But I believe there is much room for improvement there.

Subversion and unicode

I was having some difficulty at work with documents which were not saved as unicode in the subversion repository. This caused problems with eclipse which has not been able to open it.

What I did to solve my problem was to use the command: file to determine the encoding, and use pconv to reencode the documents.

I am planning to do a small perl script (though some java program one would be also OK) which downloads files, checks the type and then reload and then performs a commit on the documents. However, I must make sure there is no problem with different encodings of the document for the history.

RAP

Well I learnt some things about RAP these days and I recalled a few things about eclipse I had forgotten as well I learnt new ones.

Part of the interesting things are targets in eclipse, which allow you to choose exactly the plugins available to the eclipse platform. This is particularly helpful when some packages conflict with each other.

In the case of RAP, the RCP and RAP plugins seems to have difficulty working together. I must admit though that I do not know what is the issue there.

So expect a RAP intro tutorial and perhaps other issues.

Tuesday, 18 November 2008

Common Errors in C

A nice link I had wished to see earlier about programming in C. The author has also a tutorial on assembler.

Sunday, 2 November 2008

Java Workqueue API

As I explained in a previouse entry, since java 1.5 there are new classes and interfaces to help the implementation of parallel systems. These structures are found in the java.util.concurrent package. But how are these structures to be used. The package description recalls the main types of interfaces and classes in this package:

executors
queues
concurrent collections
synchronizers
atomic (in package: java.util.concurrent.atomic )
locks (in package: java.util.concurrent.locks )

Of course, the structures discussed in this package have somewhat similar to the one found in typical kernel discussions: work queues and semaphores... Let us now take a look at the different parts.

Executors

Executors are containers taking care of the execution of tasks, just as SwingUtilities used to do it. Different kinds of executors are imaginable:

DirectExecutor (performing the task but not asynchronously)
ThreadPerTaskExecutor (one thread is created for each task to be performed)
SerialExecutor (each task is performed sequentially)
ThreadPoolExecutor (executor performing using a pool of threads)
ScheduledThreadPoolExecutor (executor performing using a pool of threads at certain specified moments or periodically)

The implentations of the java.util.concurrent package implement the ExecutorService interface which provides a number of supplementary methods.

Queues

Another useful data structure for performing task in parallel and asynchronously are queues (also known as FIFO data structures). The java.util.concurrent package provides a number of data structures for this purpose too. One particular kind of FIFO are blocking queues, for which five different versions exist:

LinkedBlockingQueue
ArrayBlockingQueue
SynchronousQueue
PriorityBlockingQueue
DelayQueue

The first two are queue data structure backed by a linked list, respectively an array. The synchronous queue is special in the way that the queue accept elements only if at the same times elements are removed from the queue on the other end (though I must admit I still need to look at how this actually performed). The priority blocking queue is a FIFO with priority to use the same ordering as a PriorityQueue. Finally, the delay queue is a blocking queue only providing elements when a certain delay as passed for the element to be retrieved.

Synchronizers

A number of possible techniques can be used to synchronize threads. Among these are semaphores which we discussed in the linux kernel context. Other types provided by the java.util.concurrent package, such as the CountDownLatch, used to block some actions until a number of events or elements have been counted, the CyclicBarrier, which is a resettable multiway synchronization mechanism, an exchanger to allow threads to exchange objects at some definite point and the already mentioned locks.

Concurrent Collections

The java.util Collection Framework already contained a number of snychronized or synchronizable classes. However, the java.util.concurrent package introduces new structures useful in a multithreaded context:

ConcurrentHashMap,
ConcurrentSkipListMap,
ConcurrentSkipListSet,
CopyOnWriteArrayList, and
CopyOnWriteArraySet

The synchronization approach of the usual synchonized classes came with a drawback for the scalability, because they imposed a strong check on the serializability of the action performed on the data structure. The Concurrent implementations are on contrary not as restrcited. It remains in the job of the developper to know when to prefer which implementation.

Timing Units

The java.util.concurrent package also introduces a new class TimingUnit to provide different kind of granularity as to which unit of interval for time measurements should be used. Here again, it would be useful to take a look at the implementation of the kernel and compare.

Atomic and Locks

Since atomic variables and locks are in other packages, I will describe them in other entries. However, it is again interesting to note that the same topics were already discussed in other entries of this blog.

Flash with Flex

So I tried a few things with Flash now that I more or less know the basics of flex (and only the basics).

So I wrote my first little movie in flash and compiled it using flex's compiler.

I can not really do an introduction yet to flash and flex. I still need to get use to too many things. For example, I am not really happy not knowing which data structures are available in action script. Maybe I have been spoiled by java using the java.util collections. Yet I think it sensible to expect the existence of a number of standard libraries one can use to prevent the inventingthewheelitis.

One thing I found at least im internet is from a developer from Mannheim who wrote some data structures for games in action script. I will have to take a look at it. It sure sounds really interesting.

I have also been interested in what is the best way to create small movies for fun. So I just thought of the general object oriented structure of my character creations. Actually I already had some project of the sort in java. But I had not that much because other priorities popped up as they always do.

Oprofile - What are thou ?

O.K Apart from doing stupid references to films I actually have not really liked. What am I doing?

After reading a few mails from the Linux Kernel Mailing list, I found the following tool which seems useful: oprofile. I must admit I still do not have a clear idea of all the possibilities that this tool offers.

The first thing to know it is a profiler and it is a profiler capable of giving a number of information on a program with quite a low overhead. But what is a profiler?

A profiler collects information on running processes of a computer to analyze the performance of a program (see the wikipedia article on performance analysis).

It gives the possibility to look at the following aspects of a system:

Call-graphs (i.e it looks at the graph of the functions calls of programs)
libraries used by a program
instruction level information collections (also on assembly code level)

I will probably continue to take a few look at the possibilities of this tool in the next few weeks.

Saturday, 25 October 2008

Interesting Week

A lot happened during the last week. I learnt flash, flex, action script 3. I don't know everything... But I did get a good glimpse of the whole. It was a lot of fun trying to solve my brother's problem, which consisted in trying to command some light software using a flash or flex interface. It is not completely solved yet. But we are getting close to it. I will also write a few important things I noted while looking for solutions for the problem. There are two types of problems:

network problems with TCP/UDP broadcast
flash/flex sandbox security issues

Monday, 6 October 2008

Performance according to Martin Fowler

Martin Fowler is well known for his work on refactoring and software engineering. In his very good book: Martin Fowler: patterns of enterprise application architecture. Addison Wesley 2003. he has some discussion of performance which usually cover the following issues:

response time
responsiveness
latency
throughput

A number of measures of a system are good indicators for the performance of a system regarding these issues:

the load
load sensitivity
efficiency
capacity

To make these notions clear, it is important to have some definition:

response time: the time it take to a system to process a request
responsiveness: the time that the system takes to acknowledge a request
latency: the minimum time that the system needs to perform any task even if the task has nothing to be done
throughput: how much work can be done in a given amount of time

We define the measures presented above:

load: how much strain are put on the resources (for example how many users ( or processes ) are logged on (or running)
load sensitivity: an indicator of the variation in the response time depending on the load
efficiency: the efficiency is the performance (i.e, either throughput or response time) divided by resources
capacity: indicate the maximum effective throughput or load for the system

In order to discuss performance, he also describe two kind of processing power scalability, i.e how good a system reacts when the size of the system grows in processing power. Adding more processing power to a server is an example of vertical scalability (or the system is scaled up), where as adding more servers is an example of horizontal scalability (or the system scales out).

Thursday, 2 October 2008

What's the ack is that ?

Well! as I was looking at the programs I was updating on my fedora 9 box. I fell upon the program ack. I had no idea what it was so I ran rpm -qi ack... And it told me that it was a kind of grep replacement. It seems to be faster than grep in most cases. Some people tried it out it seems. Though some other seem to disagree.

One of the advantage of ack is that it allows to select more easily the type of files to be checked. So I guess I will have to take a look at it....

Monday, 29 September 2008

Some Ideas for Desktop Improvements

I have few ideas to improve the desktop presentation to support better my needs. Basically a few things that I find important:

To Do List

I want a to do list which is more or less always present when the desktop is on.

classification by priorities
- important and urgent
- not important but urgent
- important and not urgent
- not important and not urgent
- not classified
classification by subject
- work
- administrativ
- Hobby
- Family

Style of the task and Icons
- Position of Tasks as Desktop Icons
- Size of Desktop Icons

I had seen a few months ago a presentation by Mozilla Labs (if I recall well) as well as others on how to improve the display of the desktop. I find this really good. This might not be that complex to implement. But before starting the implementation, I only need to know where the different parts are stored, as well as the process used for making the changes. I am not completely sure whether the information remains in memory or is stored on file. For instance, the position of the different icons is found in the following file: ~/.nautilus/metafiles/file:%2F%2F%2Fhome%2Fusername%2FDesktop.xml

Organized Important Files

I want that the files which I have on my desktop are organized in a meaningful way, for example in thematic and time oriented way. From left to right time oriented, top to bottom thematic. Of course the classification cannot be automatic for the thematic way. Moreover, the time oriented way might not always be relevant.

Friday, 26 September 2008

Firefox Plugins

There are a number of useful firefox add-ons:

No Script: an add on to control easily whether scripts (java, javascript, flash) are allowed to be performed. This is a per domain enabling or not.
Download Them All: with this add on, it easier to download many resources from a web page
Download Status Bar: adds a status bar which shows the status of downloading of things from firefox.
Greasemonkey: using greasemonkey allows to write scripts to be performed on top of web pages
Firebug: This is a utility to help in developing javascripts.

Monday, 22 September 2008

Hibernate

Hibernate is an OR-Mappping. Mostly adapted the examples from the chapter 2 of Java Persistence with Hibernate
Second Edition of Hibernate in Action
Christian Bauer and Gavin King
November, 2006 | 880 pages
ISBN: 1-932394-88-5
The goal is to map the objects created by a object programming language such as Java to a relational database in order to provide persistent objects, i.e object which can be stored on disk and which do not disappear when the virtual machine shutdowns. Hibernate performs the mapping using configurations files in xml (or other). Here is an example of XML file for a mapping called tasks.hbm.xml:

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
   "-//Hibernate/Hibernate Mapping DTD//EN"
   "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
   <class
     name="mytasks.Task"
      table="TASKS">
    <id
       name="id"
       column="TASK_NAME">
       <generator class="increment"/>
    </id>
    <property
        name="name"
        column="Task_NAME"/>
    <many-to-one
       name="nexttask"
       cascade="all"
       column="SPOUSE_ID"
       foreign-key="FK_NEXT_TASK"/>
   </class>
</hibernate-mapping>

a class like:

class Task {
private String name;
private Task nexttask;
}

package mytasks; import org.hibernate.*;
import persistence.*;
import java.util.*;
public class TaskExample {
     public static void main(String[] args) {
         // First unit of work
         Session session =

        HibernateUtil.getSessionFactory().openSession();
         Transaction tx = session.beginTransaction();
         Task firsttask = new Task("Learn Hibernate");
         Long meberId = (Long) session.save(firsttask);
         tx.commit();
         session.close();
         // Second unit of work
         Session newSession =                     HibernateUtil.getSessionFactory().openSession();
         Transaction newTransaction =            newSession.beginTransaction();
         List tasks =            newSession.createQuery("from Task m order by m.name asc").list();
         System.out.println( tasks.size() + " Task(s) found:" );
         for ( Iterator iter = members.iterator(); iter.hasNext(); ) {
           Tasks task = (Task) iter.next();
           System.out.println( task.getName() );
         }
         newTransaction.commit();
         newSession.close();
         // Shutting down the application
         HibernateUtil.shutdown();
     }
}

It is possible to set the configuration file for the session factory using new Configuration().configure(<locationof config file>); for example:

SessionFactory sessionFactory = new Configuration() .configure("/persistence/tasks.cfg.xml") .buildSessionFactory();

calling new Configuration().configure(); would look for a file called: hibernate.properties in the class path directory outside of any package. the Hibernate Configuration file:

<!DOCTYPE hibernate-configuration SYSTEM "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
     <session-factory>
       <property name="hibernate.connection.driver_class">
         org.postgresql.Driver        </property>
       <property name="hibernate.connection.url">
         jdbc:postgresqll://localhost
       </property>
       <property name="hibernate.connection.username">
         sa
       </property>
       <property name="hibernate.dialect">
         org.hibernate.dialect.HSQLDialect
       </property>
     
       <property name="hibernate.c3p0.min_size">5</property>
         <property name="hibernate.c3p0.max_size">20</property>
         <property name="hibernate.c3p0.timeout">300</property>
         <property name="hibernate.c3p0.max_statements">50</property>
         <property name="hibernate.c3p0.idle_test_period">3000</property>
         
         <property name="show_sql">true</property>
         <property name="format_sql">true</property>
         
         <mapping resource="mytasks/tasks.hbm.xml"/>
     </session-factory>
</hibernate-configuration>

Note the use of a certain number of configuration entries:

the Hibernate connection pool provider: here the C3PO-connection-pool-provider
the hibernate dialect used, here the HSQLDialect
the connection information: drivers, url and username
the mapping file

There are a number of other possibilities to configure Hibernate.

Antipatterns and Code Problems

Here a few resources for antipatterns, which are patterns of behaviour or architecture which tend to create problems. First of all, there is a well known book on the subject. There is also a web site for the books: http://www.antipatterns.com/. Here another resource about Antipatterns in the software development: AntiPattern in der Softwareentwicklung (Note the text is in German). Also in addition a resource on symptoms which call for refactoring: SmellsToRefactorings.

Tuesday, 16 September 2008

Java AWT bug on linux with XCB

When starting my java application on linux, I have the following traceback:

Locking assertion failure. Backtrace:
#0 /usr/lib/libxcb-xlib.so.0 [0xc3e767]
#1 /usr/lib/libxcb-xlib.so.0(xcb_xlib_unlock+0x31) [0xc3e831]
#2 /usr/lib/libX11.so.6(_XReply+0x244) [0xc89f64]
#3 /usr/java/jre1.6.0_03/lib/i386/xawt/libmawt.so [0xb534064e]
#4 /usr/java/jre1.6.0_03/lib/i386/xawt/libmawt.so [0xb531ef97]
#5 /usr/java/jre1.6.0_03/lib/i386/xawt/libmawt.so [0xb531f248]
#6 /usr/java/jre1.6.0_03/lib/i386/xawt/libmawt.so(Java_sun_awt_X11GraphicsEnvironment_initD

It has already been discussed in a number of forums and bug reports:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6532373, or https://bugs.launchpad.net/ubuntu/+source/libxcb/+bug/87947

One possible work around seems to be to add the following variable setting when starting the application:

export LIBXCB_ALLOW_SLOPPY_LOCK=1

Monday, 15 September 2008

Autotest - WOW they did it!!!!

Welll I was reading as always a little bit of this automated testing approach. And I was again on this page: http://autotest.kernel.org/ Autotest is a software which allows the automated testing of the linux kernel, that is, it is an infrastructure to build, boot and ... linux kernels. It has quite a lot of features:

bisection...
building
booting
filesystem check
python based library to automate scripts

There is a good presentation on autotest presentation.

PXE Problems with NAS 1000

A good friend of mine gave me a NAS 1000 so that I could try a few things with it. In particular, I wanted to try PXE and diskless solutions with the installation files or disk data on the NAS server.

First I had some troubles starting the atftpd daemon because of user and group information which did not work. I should have checked the messages information right away... duh!!! It would have saved me a lot to time.

But then as I tried getting the data from my linux box using the fedora tftp client, it did not work. Well actually I am still not sure why it is not. Some routing errors obviously:

Jan 1 13:36:21 icybox daemon.info atftpd[1951]: Server thread exiting
Jan 1 13:36:26 icybox daemon.notice atftpd[1952]: Fetching from 192.168.0.104 to ess

Saturday, 13 September 2008

Syntax highlighting for the Web

As I was reading an interesting post from Otaku, Cedric's weblog, I learned about the existence of a web syntax highlighting solution: pastebin, which is GPL software. It seems to be based on another GPL software: Genshi. This may prove useful once in a while. Especially if I intend to port my post to another blogging software since I am not completely sure about the user settings of blogger.

Friday, 12 September 2008

Java invokeLater

A number of month ago, I took a look at the new features of Java 1.5 and 1.6. And I fell on the new java.util.concurrent package.

Whoever programmed GUI interfaces in Java is certainly aware of the importance of using thread to run in the background in order to enable the user to perform other tasks and not just wait in front of a screen which is not refreshing. Using nice runnable threads, you could have a much more responsive GUI. A typical example was things of the sort:

Thread t = new Thread(){
public void run(){
// the task to perform which requires a certain amount of time
}
};

SwingUtilities.invokeLater(t);

This technique is really fundamental to a well programmed graphical interface.

But since Java 1.5, there are a number of supplementary structures which can be used to perform tasks in parallel. And these are found in the package java.util.concurrent which will be the topic of a future entry.

Overview of Maven

Maven is a tool design to support as many task as possible for the management of a software project.

Its purpose is to provide a simple tool to achieve the following tasks:

Builds
Documentation
Reporting
Dependencies
SCMs
Releases
Distribution

A number of good tutorials can be found on maven's guide page.

Archetypes:

In maven there is the possibility to create archetype models of projects. This means that it is possible to create very easily new projects which have a number of templates to start with. This is related to the possibilities of RAILS.

This is performed by issueing the following command:

$ mvn archetype:create -DgroupId=com.mycompany.app -DartifactId=my-app

Project Object Model: POM

There is a concept of project object model somewhat similar to the ant build files.

An example from the website (see this page):

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.mycompany.app</groupId>
<artifactId>my-app</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>Maven Quick Start Archetype</name>
<url>http://maven.apache.org</url>
<dependencies>
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
      </dependency>
</dependencies>
</project>

This model is quite easy to understand.

Project File and Directory Structure

The project file and directory structure depends on the archetype chosen to create a new project. There are means of configuring this, see: http://maven.apache.org/guides/mini/guide-creating-archetypes.html.

The build Life cycle

Each possible tasks (e.g. validate, compile, package) may require other to be performed first. This means that there are dependencies between the tasks (like in ant).

Common tasks

mvn compile (compile the code)
mvn test (test the functionalities of this project)
mvn test-compile (compile test classes for this project)
mvn package (package the code of this project)
mvn clean (clean the builds and task )
mvn site (create a template website for the project)
mvn idea:idea (create an IntelliJ IDEA descriptor for the project)
mvn eclipse:eclipse (create the project description files for eclipse)

Maven Plugins

There are a number of plugins which can be useful for maven. You can add them to the POM file of the project, see: How_do_I_use_plug-ins

A list of plugins can be found there.

SCM Plugin ( Source Code Management plugin)

One of the many pluging is the SCM plugin which offers useful tasks/goals for interacting with a SCM

External Dependencies

There is also the possibility to configure external dependencies.

Deployment

There are also possibilities of deployment if things are configured. For example, the created distribution can be copied and added to a repository using scp. For this, some information about user names, keys and/or passwords have to be configured.

Documentation

There are also some thing to help in the creation of a documentation site using archetypes. See: guide-site.

Thursday, 11 September 2008

Linux Links

Some information can be found using: http://www.tuxfinder.com/ There is a guide about kernel development from Jonathan Corbet, well known linux author and editor at LWN.net : https://ldn.linuxfoundation.org/article/everything-linux-kernel

www.google.com/linux and co

Well I just discovered an interesting thing while looking through old papers the existence of a certain number of URLs for specific google search engine: http://www.google.com/linux http://www.google.com/microsoft Though I would love to get a number of info on which other possibilities there are... Is there a list somewhere.

Wednesday, 10 September 2008

Using Busybox for serving linux distributtions

I want to use a busybox in order to test the kernel through a PXE installation as well as not requiring the hard disk of my machine which should cut some part of the noise. For this I would install tftp on the busybox... Though it might also work with an NFS or a samba technique... I should check that.

A central Linux Documentation page

As I was looking for a way to submit a patch to the documentation of the kernel about the i386 --> x86 as well as x86_64 change, I came on to an article about the linux documentation, which gave a pointer on the work of Rob Landley at kernel.org/doc. I may take a look at what could be missing tomorrow.

Tuesday, 9 September 2008

Useful appendices :-)

I have been reading this book, an excellent book on linux: Wolfgang Mauerer. Linux Kernelarchitektur Konzepte, Strukturen und Algorithmen von Kernel 2.6, Hanser 2004. ISBN 3-446-22566-8 Some of the information I write from this blog have been largely adapted or influenced through the reading of the book. A very useful thing is also, that the book has a web site, with PDF version of the appendices which are not in the book. It is a bit strange but still extremely useful. The book is in german, therefore it will not be useful for everybody.

There is also a list of useful Documentation links: http://www.linux-kernel.de/docs/index.html. In particular:

Online Documents about Kernel
important RFCs (TCP/IP..., Differentiated Services fields)
GNU tool information
ELF format
important documentation from the kernel

So I've got to say this is really a wonderful book on linux. I just happened to learn from the author that he is writing a new, more current version of the book.

Have you looked at JBOSS' projects lately ?

Did you take a look at JBoss lately... It is quite impressive the amount of technologies they have. Well I knew already some of them... But There are other technologies, which I was not aware of. See for example the projects doc page. They have things on Application servers, extension for Rich clients using JSF, Rule engines, remoting mechanisms, Object relational mappings.... You na... No perhaps not... But it is really impressive. So I am going to attack a JBOSS technology serie in this blog. So except blog entries on:

JBoss application server
rich faces
JBoss Remoting
hibernate (though there was already one or two entries)
JRUNIT a JUNIT extension to test client/server applications

JBoss Remoting

An interesting framework JBoss remoting. There is a demo at: http://docs.jboss.org/jbossas/remoting/demo/JBossRemoting_demo.htm There is also a very good article at: http://www.onjava.com/pub/a/onjava/2005/02/23/remoting.html

Translation Lookaside buffer, aka TLB

in a few words from the wikipedia article: a CPU cache used memory management hardware to improve the speed of virtual address translation.(wikipedia).

Much information comes from this article.

The idea is that CPUs keep an associative memory to cache page table entries (PTEs) of virtual pages which were recently accessed.

When the CPU must access virtual memory, it looks up in the TLB for a number corresponding to the entry to obtain.

If an entry was found (a TLB hit), then the CPU can use the value of of the PTE which accessed and calculate the physical address.

In case it was not found (a TLB miss), then depending on the architecture, the miss is handled:

through hardware, then the CPU tries to walk the page table and find the correct PTE. if one is found the TLB is updated, if none is found then the CPU raises a page fault, which is then treated by the operating system.
through software, then the CPU raises a TLB miss fault. The operating system intercepts the miss fault and invoke the corresponding handler, which walks the page. if the PTE is found, it is marked present and the TLB is updated. if it not present, the page fault handler is then in charge.

Mostly, CISC (IA-32) use hardware, while RISC (alpha) use software. IA-64 uses an hybrid approach because the hardware approach is faster but less flexible as the software one.

Replacement policy

If the TLB is full, some entries must be replaced. For this depending on the miss handling strategy, different strategies and policy exist:

Least recently used (aka LRU)
Not recently used (aka NRU)

though the TLB miss mechanism is implemented in software, the replacement strategy can be implemented using hardware. This is performed by a number of new architectures: IA-64...

Ensure coherence with page Table

Another issue is to keep the TLB coherent with the page table it represents.

Monday, 8 September 2008

Nice little tool isoinfo

I am working on havin a simple ram based distribution to test a few things at a test system. For this, I learnt from this page, that there exists a command to extract directly some files from iso without mounting the filesystem. This command is isoinfo and is used in the following way:

$ isoinfo -i isofilesystem.iso -J -x /filetobeextracted > filereceivingtheextracteddata

Nice!

Sunday, 7 September 2008

Kudos to helpwithpcs.com

I found a very nice and simple course on the basic knowledge of computer architectures: helpwithpcs.com. I collected very simple reminders of things I already knew... But it is good to take a look at what you might not know or might have missed such as:

It is simple and very well explained...

Wednesday, 3 September 2008

Read Copy Update

Read Copy Update (aka RCU) is another synchronisation mechanism in order to avoid reader writer locks.

An excellent explanation can be found at the LWN.net in three parts by Paul McKenney and Jonathan Walpole:

The basic idea behind it is that when a resource is modified, a new updated structure is put in its place and the old structure is not discarded right away, it waits until references to this structure by other processes are dropped. It can be seen as similar to the concept of garbage collection, but as noted in What is RCU? Part 2: usage, the old structure is not discarded automatically when there are no references any more and the programmer must indicate the critical read portions of the code.

There is an interesting page on the RCU argueing that this technique is used more and more in the kernel as a replacement for the reader writer locks.

Tuesday, 2 September 2008

Kernel Locking mechanisms

An important aspect of programming in an environment with threads and processes is to prevent the different processes to interfer with the functionalities of other processes at the wrong time.

In linux, a number of methods are used to ensure that the data or code section of processes is not disturbed by others. These methods are:

atomic operations
spinlocks
semaphore
reader writer locks

These locks and mechanisms are in the kernel space. Other methods or locking mechanisms are used in the user space.

atomic operations

The idea behind atomic operations is to perform very basic changes on variable but which cannot be interfered by other processes, because they are so small. For this, special data type is used called: atomic_t.

On this data type, a number of atomic operations can be performed:

function	description
atomic_read(atomic_t *v)	read the variable
atomic_set(atomic_t *v, int i)	set the variable to i
atomic_add(int i, atomic_t *v)	add i to the variable
atomic_sub(int i, atomic_t *v)	substract i to the variable
atomic_sub_and_test(int i, atomic_t *v)	substract i to the variable, return true value if 0 else return false
atomic_inc(atomic_t *v)	increment the variable
atomic_inc_and_test(atomic_t *v)	increment the variable, return true value if 0 else return false
atomic_dec(atomic_t *v)	decrement the variable
atomic_dec_and_test(atomic_t *v)	decrement the variable, return true value if 0 else return false
atomic_add_negative(int i, atomic_t *v)	add i to the variable, and return true if its value is negative else false

Note that I discussed in another post the local variables for CPUs.

spinlocks

This kind of locking is used the most, above all to protect sections for short periods from access of other processes.

The kernel checks continuously whether a lock can be taken on the data. This is an example of busy waiting.

spinlocks are used in the following way:

spinlock_t lock = SPIN_LOCK_UNLOCKED;
...
spin_lock(&lock);
/** critical operations */
spin_unlock(&lock);

Due to the busy waiting, if the lock is not released... the computer may freeze, therefore spinlocks should not be used for long times.

semaphores

Unlike linux spinlocks, the kernel sleeps while waiting for the release of the semaphore. Contrary to spinlocks, this kind of structure should only be used for locks which have a certain length, while for short locks using linux spinlocks is recommended.

DECLARE_MUTEX(mutex);
....
down(&mutex);
/** critical section*/
up(&mutex);

The waiting processes then sleep in an uninterruptable state to wait for the release of the lock. The process cannot be woken up using signals during his sleep.

There are other alternatives to the down(&mutex) operation:

down_interruptible(&mutex) : the process can be woken up using signals
down_trylock(&mutex): if the lock was successful then the process goes on and does not sleep

For the user space, there are also futexes.... But this is another story.

reader writer locks

Using this kind of locks, processors can read the locked data structure but when the structure is to be written the structure can only be manipulated by one processor at a time.

Monday, 1 September 2008

GIT tutorial

I was having a look at the git tutorial. The important tasks: 1/ download the code of the linux kernel from git:

>git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6

2/ pulling new code from git:

> git pull

3/ reverse code changes

> git checkout -f

4/ commiting the modifications:

> git commit -a

5/ undo last commits (note it is different from a revert which consists of a patch reverting some other patch)

> git reset HEAD~2

6/ list branches

> git branch

7/ create branch

> git checkout -b my-new-branch-name master

8/ choose a branch and make it the current one:

> git checkout branch

9/ Tell which branch is current

> git status

10/ merging code into a branch mybranch

> git checkout mybranch

> git merge anotherbranch

Friday, 29 August 2008

Linux Kernel Mailing List FAQ

One page I had not noticed, and which seems to contain lots of interesting features: the Linux Kernel Mailing List FAQ. There are some nice infos for newbies... But not so much information except a lot of very nice pointers to different useful places.

Linux Kernel: Copy on Write

Copy on Write is an important technique in the linux kernel memory management. The basic idea is to prevent the creation of unnecessary copies of structures when creating new processes. For this, when a child process is created, the kernel first provides only readable access to both parent and son. If one of the process need writable access, at the time it needs to write data in memory, the kernel throws a page fault indicating that a new copy should be created for the asking process. Thus, the data is actually copied only when the process actually tries to write.

Some information is of course available on Wikipedia on the Copy on Write page. Note that there is an acronym COW for Copy on Write.

Wednesday, 27 August 2008

Memory Management Documentation in Linux Kernel

As I was looking at the kernel newbies info. I found this post from Mel Gorman about documentation for the Memory Management under Linux. It contains links to two documents:

Wednesday, 23 July 2008

Customer orientation

customer orientation is the regular acquisition of the wishes, needs and expectation of customers as well as their realization in products, services and interactive processes. The purpose is to ensure stable, reasonably economic relations with the clients. Two levels:

Information level (what do we know of customers, which information collected, how are they organized, and how can they be used in processes: marketing, production..)
Customer level
- quality of products,
- of services,
- flexibility of services realization,
qualification of salesperson as well as their
- flexibility,
- reliability and
- friendliness,
treatment of
- sales
- complaints
interaction between customers and employees

Success formulas:

personal interaction with customer
knowledge of the customer (needs, expectation, desires)
check customer satisfaction
problem solving suggestions
customer oriented organization
customer oriented employees

Friday, 18 July 2008

Online C++ Documentation

I found a nice page as a C++ documentation: http://www.cplusplus.com/reference/.

C library

A number of headers are present in the standard C++ library, for example:

asserts
types
errors
floats
Standard definitions
localization
limits of standard types
maths
C I/O library functions
C strings
C time
C standard libraries
Jump routines to preserve calling properties and environment ***
and handling of variable arguments

I/O Streams

There is a number of headers existing for the C++ stream operations as well as a number of possible classes for all these headers information. The main reference page has a nice picture summarizing the relationship between these classes.

string header

The string headers is useful to represent characters sequence. It contains a number of operations which are described on a special page for the string operation topics. A nice aspect is also that these pages have examples. So it's quite nice to learn how to use them correctly.

Standard Template Libraries

One of the nice aspect of this page was for example the page on the Standard Template Library. It presents all the complexities of the different operation on the diverese containers which can be used with these structures. Dividing the different containers in three groups: sequence containers, container adapters and associative containers.

<algorithm>

But there is also one I was not aware of: the algorithm header, which provides a number of standard algorithms which are designed to work on ranges of elements. For example, it contains search, merge, sorting, heap, min/max as well as operation to modify sequences and structures (swapping, copying, partitioning).

Supplementary headers

There are 4 other headers which do not belong to the other above categories:

Friday, 11 July 2008

Spanning Tree Protocol

Today I read the Linux Networking book, and learnr about bridging and the Spanning Tree Protocol which is used to improve communication in LANs by bridging different atomic LANs together. The idea behind this protocol is that some computer is chosen as the root of the network and a spanning tree is built to optimize creation of the spanning tree and to prevent loops to be created in the network. This protocol which has an IEEE specification has also two other successors: the Rapid Spanning Tree protocol and the Multiple Spanning Tree Protocol. I will probably update this entry in order to give an overview of the details of these protocols.

Wednesday, 9 July 2008

Personal Projects

Here is a list of personal pet technical projects I have:

Web 2.0 gallery
data mining web site
data mining experiments
data warehousing
information extraction tool
ontology learning
linux kernel learning
inference engine technology
fca implementation
database implementation aspects

Each of these project is already in some kind of planing or development stage.

Thursday, 3 July 2008

Open Data Sources - presented By UK Governement

The U.K has set a number of interesting data sources for free use in order to ease the development of new tools. These data sources are found there.

Tuesday, 1 July 2008

Lguest - simple virtualization mechanism

From the article at LWN.net, I learnt that there is a small virtualisation framework for linux using an hypervisor: lguest. This virtualization framework allows the testing of drivers. So I should use this to test some kernel changes which might be problematic.

Weka Online

It seems that there is now the possibility to use Data Mining online using Weka. A company, CEO delegates, has built a web site where some arff files can be loaded as well as the corresponding data mining algorithms chosen.

It was actually partly one of the things I planed to do. But there are still lots of supplementary things which can be done. So let's see what happens.

Sunday, 29 June 2008

Git-bisect instructions

After reading a little bit of a long thread on testing Bugs for the linux kernel, there was a small HOWTO for running bissects of the linux kernel.

I write it again here in order to make sure, it is easier to find:

# install git

# clone Linus' tree:
git clone \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

# start bisecting:
cd linux-2.6
git bisect start
git bisect bad v2.6.21
git bisect good v2.6.20
cp /path/to/.config .
# start a round
make oldconfig
make
# install kernel, check whether it's good or bad, then:
git bisect [bad|good]
# start next round
After at about 10-15 reboots you'll have found the guilty commit
("... is first bad commit").
More information on git bisecting:
man git-bisect

But this is not all... The bisection as means of automatically checking whether a given tests works or not. Suppose you have a script to run a test called: test.sh. Then you could call for the bisection:

$ git bisect start v1.3 v1.1 -- # v1.3 is bad, v1.1 is good
$ git bisect run ~/test.sh

For more information on this a good explanation is found at this page.

Moreover, there is also the possibility to restrict the search for versions which are good or bad which had a change in a given part of the repository. For example, you may know that the bug is found in a certain subdirectories. Then you can specify these directories:

$ git bisect start -- mydirectory1 mydirectory2/directory3

Friday, 27 June 2008

Ketchup - Linux Kernel Automatic Patcher

I discovered an interesting script: ketchup which can be used in order to automatically perform patching of the kernel from one version to the next.

Thursday, 26 June 2008

Learning Method

How do I learn about a topic? Well there are different strategy. One of the things I often do is to look for excellent books on the topic and go through the content and look for things I do not already know. Take for example: The Database System Implementation book by Hector Garcia-Molina, Jeffrey D. Ullman and Jennifer Widom,Prentice-Hall International London Limited ,2000 ISBN 0-13-0400264-8. I look at the section in the table of content. For example:

2.4 Improving the access of secondary storage
- 2.4.1 I/O model of computation
- 2.4.2 Sorting data on secondary storage
- 2.4.3 Merge Sort
- 2.4.4 Two-phase, multiway merge sort
- 2.4.5 extension of multiway merging for larger relations

I know already something about merge sort. However, I don't know much about the other topics. So I take a look at each of them. If something seems interesting and important then I try to sum up the information learnt somewhere, for example, in this blog.

Friday, 20 June 2008

Combining Data Mining Models

Here is a little summary of the possible way of combining multiple models for Data Mining (I use as first resource the "Data Mining" Book from Ian H. Witten und Eibe Frank):

Bagging
Bagging with costs
Randomization
Boosting
Additive regression
Additive logistic regression
Option trees
Logistic model trees
Stacking
Error correcting output codes

Bagging

The principle of Bagging is to let create a number of models for a training set, and use the class returned the most frequently for a specific instance for each of these models. In other word, it is important that the different model return the same set of possible class output.

Bagging with costs

This extension of the Bagging approach uses a cost model. It is particularly useful when the predictions made by the models used in the bagging provide probabilities telling how likely is to be exact.

Randomization

The idea here is to introduce some kind of randomization in the model creation in order to create different models. Depending on the stability of the process, a certain class can then be chosen as prediction.

Boosting

Similarly to the bagging approach, boosting tries to create models as a kind of cascade, each model is built with the purpose of classifying better the instances which have not been suitably classified by previous models. This is a type of forward stagewise additive modelling

Additive regression

The additive regression is alse a kind of forward stagewise additive modelling which is suitable for numeric prediction with regression. Here again the principe is to use a serie of regressions which try to classify better the elements which were incorrectly classified.

Additive logistic regression

This type of regression is an adapation of the previous combination approach but for logistic regression.

Option trees

I still have to describe this but the concept is quite simple

Logistic model trees

I still have to describe this but the concept is quite simple

Stacking

The purpose of stacking is to combine different types of models which might not have the same labels. In order to achieve this, a meta learner is introduced. A number of models are built for the data. A meta learner, i.e a model which decides from the learning output of other learners, created in order to classify and adapt to all the models from the first phase.

Error correcting output codes

I still have to describe this but the concept is quite simple

Of course, all these mechanisms have been implemented in Weka.

Thursday, 19 June 2008

FishEye - Visualisation of Subversion

There seems to be a nice tool for the visualisation of subversion repository: FishEye (see this page). Unfortunately, the tool is not open source. Though an open source project may ask for an open source community license. From the web site, it is not clear to me, which SCM (source code management) system is supported. At least, Subversion seems to support CVS, SVN and Perforce (?).

Tuesday, 17 June 2008

Transaction Models

As a reminder I will list here the possible transaction models. The aim is not to be exhaustive, but merely to have this entry to check the possible transactions. First, let's list the different models:

Flat transactions (supported by J2EE)
Nested Transactions (supported by J2EE)
Chained Transactions
Sagas

Flat transactions

Flat transactions consists of a series of operations which should be performed as one unit of work, and return true or false depending on whether the operation failed or not. The reasons causing a transaction to abort may be for instance:

some invalid parameters have been given to the components
some system state which should not have changed, was violated
Hardware or software failure

Nested transactions

Contrary to flat transactions, nested transactions allow to have units of work consisting of other atomic units of works. Some of the embedded unit of work may roll back without forcing the entire transaction to roll back. In that way, failed transactions can be performed again. Note that subtransactions of transactions may also be nested transactions.

Chained transactions

In the chained transaction model, new transactions are started automatically for a program when the previous one has been commited or aborted. This is not implemented by j2ee.

Long Running Transactions - Sagas

Sagas, or Long-Running transactions are transactions which may take a long time to finish and may last for days either because they wait for an external event, or they wait for the decision of one of the actors involved in the transaction. This type of transaction usually occurs in a web service context (see here). This is not implemented by j2ee.

Wednesday, 11 June 2008

Kernel KnowHow: Per CPU Variables

I just read the section on per CPU variables from the Linux Device Drivers book from O'Reilly (TODO link).

The use of per CPU variable helps improving performance by limiting the cache sharing between CPUs. The idea is that each CPU has his own private instance of a given variable.

A variable can be defined using the following macro: DEFINE_PER_CPU(type,name);

It is important to note that a kind of locking mechanism is still needed for the CPU variables in case:

the processor would move the process to another cpu
the processor would be preemted in the middle of a modification of the variable

Therefore, each instance should be updated using a simple mechanism to lock the variable during the changes. This can be done by using the get_cpu_var() and put_cpu_var() functions, for example:

get_cpu_var(variabletoupdate)++;
put_cpu_var(variabletoupdate);

It is also possible to access the variable values from other processors using: per_cpu(variable, int cpu_id); but a locking mechanism must be implemented for these cases.

Memory allocation for dynamically allocated per-CPU variables is performed using:

void *alloc_percpu(type);
void *__alloc_percpu(size_t size , size_t align);

Deallocation is achieved using: free_percpu().

Accessing a dynamically allocated per-CPU variable is performed by using

per_cpu_ptr(void * per_cpu_var, int cpu_id).

It returns a pointer to the variable content. For example,

int cpu;
cpu= get_cpu();
ptr = per_cpu_ptr(per_cpu_var, cpu);
put_cpu();

Note: the use of the method to free the cpu from the lock.

Tuesday, 3 June 2008

Kernel Index page from LWN.net

As I was looking for some supplementary information on the kernel, I found this page which returns a categorized list of entries of articles in the LWN page.

Saturday, 31 May 2008

Alsa library musings

I have taken a look at the alsa library tonight.

There seems to be a few fun things one can do. I started to use the small test program: pcm.c from the alsa-lib source directory.

I looked at the code. I have not changed anything yet. But the program is small but shows many aspects of how to interact with pcm streams and drivers.

So one thing I did was to use the small program to create a little scale playing: some thing like:

for a in 1 1.25 1.33 1.5 1.66 1.75 2 do \
./pcm -f `echo $a * 440|bc`& pid $! \
& (sleep 1; kill $pid) \
done

Of course, the numbers 1 1.25 1.33 1.5 1.66 1.75 2 are approximation of some of the notes of a scala. I have been to lazy to look for the exact numbers.

This little test is really not much, and it certainly does make clear how to use the alsa function libraries, but it's fun. Nevertheless, after looking at the code, I have the feeling that I am going to have much fun in the future with playing with sound.

The possibility of combining both my musical theory musings together with my programming is quite a wonderful feeling.

Some of the ideas I have:

find the different instruments harmonic series and implement a small tool to be able to play some instrument
implement some composition mechanisms:
- arpegios
- Travis picking
- combinations of instruments
- combine with language ?

Friday, 30 May 2008

SMP and NUMA architectures

I have had a look at the SMP and NUMA architectures which are for multiprocessor systems.

I found a number of interesting resources:

OpenMP.org

SMP means symmetric multiprocessor while NUMA stands for Non-Uniform Memory Access. Actually SMP is a type of UMA, i.e Uniform Memory Access. Basically, SMP processor share all the same memory, this imposes some overhead on the cache of the processor to speed up the processing. NUMA architectures (of which different flavor exist, in particular cache-coherent models, aka ccNUMA) have a non uniform access to memory, some processor have access to local caches. In that way, the processes do not need to synchronize to access the data.

Thursday, 29 May 2008

Mail Spam Report and Strategies

It is always good to keep an eye on strategies to reduce SPAM. Though this report is a bit old. It presents at least most of the methods to prevent SPAM on one's email address.

Wednesday, 28 May 2008

Time Managment - Decision Making Techniques

Well... Right now my time management is not so bad as long as I don't have to do anything. But it is still important to remind oneself of the different time management techniques. So I think this blog should be a good way of making sure of this. At least, I will always have this to go back to.

One of the things I always have to keep an eye, it is certainly the decision making process. For this there are few techniques which exist (I looked this up more than once... but the following site should be suitable):

pareto analysis
six thinking hats
grid analysis
cost-benefit analysis
decision trees
force field analysis
paired comparison analysis
pmi

I will probably come back to these later.

Monday, 26 May 2008

Kernel Makefile

In this post, I sum up the main Makefile parameters and targets, right now it mainly corresponds to $make help but I might edit this entry to add useful information.

First of all use (if in the directory of the kernel sources)

$ make help

or if you are not in the directory of the kernel sources (then located in )

$ make -C help

This more or less gives the following information.

This outputs a list of possible target and supplementary information:

a few variable can be set
Documentation
- make [htmldocs|mandocs|pdfdocs|psdocs|xmldocs] ->>> build the corresponding docs
Packages
- make rpm-pkg > allows to build src and binary rpm packages
- make binrpm-pkg > allows to build binary rpm packages
- make deb-pkg > allows to build deb packages
- make tar-pkg > allows to build uncompressed tarball
- make targz-pkg > allows to build gzipped compressed tarball
- make tarbz2-pkg > allows to build bzip2 compressed tarball
Cleaning targets:
- clean - Remove most generated files but keep the config and enough build support to build external modules
- mrproper - Remove all generated files + config + various backup files
- distclean - mrproper + remove editor backup and patch files
Note that it can be useful to use ARCH=... in the cleaning process
Kernel Configuration targets:
- config - Update current config utilising a line-oriented program
- menuconfig - Update current config utilising a menu based program
- xconfig - Update current config utilising a QT based front-end
- gconfig - Update current config utilising a GTK based front-end
- oldconfig - Update current config utilising a provided .config as base
- silentoldconfig - Same as oldconfig, but quietly
- randconfig - New config with random answer to all options
- defconfig - New config with default answer to all options
- allmodconfig - New config selecting modules when possible
- allyesconfig - New config where all options are accepted with yes
- allnoconfig - New config where all options are answered with no
Other useful targets:
- prepare - Set up for building external modules
- all - Build all targets marked with [*]
  - * vmlinux - Build the bare kernel
  - * modules - Build all modules
  - modules_install - Install all modules to INSTALL_MOD_PATH (default: /)
- dir/ - Build all files in dir and below
- dir/file.[ois] - Build specified target only
- dir/file.ko - Build module including final link

Note that there are also some little things about tags for editors, but I am not so sure what it really brings.

Linux Standard Base

Yes I like standards... Standards are great... Of course you should not exaggerate it, but yeah standard base are a good thing for linux.

So take a look at the Linux Standard Base ( aka LSB). It is a specification of what rules all linux distribution should respect.

For example, it specifies the executable and linking format: ELF, as well as specifying a number of useful libraries: libc, libm, lipthread, libgcc_s, librt, libcrypt, libpam and libdl. Some util libraries are also specified: libz, libncurses, libutil.

It also specifies a number of commandline command ( see the standard on this subject)

Linux Config Archive

I found an interesting site which is an archive for configuration files from the linux kernel. Unfortunately, it did not work when I tried it. But at least the idea is quite good. I am sure I will try this from time to time.

Sunday, 25 May 2008

init scripts

For a few things I am interested in doing, I wanted to be able to have a small script preparing as soon as I boot up. Perhaps it is more interesting to use atd or cron for this but I wanted to make sure how the initscript system works.

So I prepare a small script in order to start some system tools as soon as the boot process is finished.

For example, a little tool starting a remote process when I first boot which would allow me to use some remote processing facilities, e.g (focused) crawler. This could be also some system starting before/after the httpd daemon is up.

For this I took a look at the /etc/init.d/postgres script.

#!/bin/sh # newscript This is the init script for starting up the newscript
# service
#
# chkconfig: - 99 99
# description: Starts and stops the newscript that handles \
# all newscript requests.
# processname: mynewscript
# pidfile: /var/run/mynewscript.pid

# Version 0.1 myname
# Added code to start newscript

Note the use of chkconfig: - 99 99 .

This should be adapted with more useful priorities, basically 99 means that the initscript is started as one of the last scripts. Taking a look at $man chkconfig should prove useful.

The new script stores the pid of the newscript application in /var/run/mynewscript.pid

Note that it also stores things in /var/lock/subsys/

Maintainers File in Kernel and SCM tools for the kernel

I have just had a look at the maintener file in the linux kernel Tree.

I have noticed that there are a number of orphaned project. The question is whether any of these orphaned project really needs to be taken care of.

Another interesting thing was to learn about the different scm and patching tools used in kernel development: git, quilt and hg.

Here is an interesting overview of the reason for the development of git and quilt.

I really start to like the patch approach, and the article linked above gives a good idea of the reasons to use this approach. I should try to summarise in a future post the advantages and disadvantages of the different source code management approaches.

Kernel Stuff

I did some little things with kernel programming (or more compiling) these days.

Part of the things I did were compiling kernel because I wanted to try UML (user mode linux).

So that's what I did:

Download the kernel configs from: http://uml.nagafix.co.uk/kernels/.
Download kernels from kernel.org.
untar the kernels to some directory
cd into the main directory of the kernel
copy the config of the kernels into main directory as .config file
$ make ARCH=um oldconfig
answered the necessary questions as good as I could
$ make ARCH=um
At that point some errors appeared, so I tried to correct them.
to help me in the debugging process I used $ make V=1 ARCH=um
when I had some things that did not work well I used the gcc output to call it right away. For example, sometimes the architecture files would not be found right so I used -Iinclude sometimes a precompiler marks was not set correctly so I used -D__someprecompilermarks__. At some point I removed some problematic definition by using this together with a #ifndef in the header file. $ gcc ..... -Iinclude -D__someprecompilermarks__ ...
then I also downloaded a few kernel repositories using git, though I still need to perfect this.
I read (or skipped/read) quite a few Documentation files from the kernel or from the internet.
I familiarised myself with the git web interface, this together with having a kernel RSS feed in my thunderbird.

And all this in one day and half together with other things.

Saturday, 24 May 2008

Fedora and chkconfig

I was having some troubles with the starting up of my linux box. Now these problems are solved or at least I hope so :-). The problem seemed to lie with haldaemon or messagebus. After reading a bug description for my problem on bugzilla, I had the solution. (see the bugzilla bug page for bug : 444410). The priorities of some services in runlevel 5 had to be corrected. this can be performed using: $ /sbin/chkconfig --level 5 resetpriorities with being the service's name, e.g NetworkManager, messagebus, ...

Sunday, 18 May 2008

Naemi and RPM

As I tend to forget pretty often the syntax of the query format of rpm, I had a short look at the internet. I found this web page which explains some aspects of the use of the rpm -q command.

I discovered a command which I did not know and which might turn out quite useful: namei. It follows the symbolic links until their end point can be found.

Example:

$> namei /usr/java/latest/
f: /usr/java/latest/
d /
d usr
d java
l latest -> /usr/java/jdk1.6.0_03
d /
d usr
d java
d jdk1.6.0_03

Friday, 9 May 2008

AjaxMP - a small Music server

In order to listen to your music on the local network, a php based tool: AjaxMP using mpd as music publishing daemon. Edit

I noticed that the mpd server must be configured in another way if you want to publish it to other nodes in the network.

For this:

install mpd
set password and port for mpd
add music to the mpd directory
extract the tar from ajaxmp in a php enabled web server
copy config.php.dist to config.php
edit the mpd parameters: host, port and password
copy user.txt.example to user.txt
add for each person needing access a username and a password separated by tabs

The stored password can be stored as md5, and in that way somewhat more secure... But not much!!!

Apache Problem With IP resolution

One of the apache rule for the resolution of API seems to be that addresses of request cannot be numeric.

I commented the rule out... But I should take a look whether there is no better solution.

To find the rule posing the problem I looked at the logs in: /etc/httpd/logs/error_log.

There was a line:

[Fri May 09 02:09:51 2008] [error] [client xxx.xxx.x.xxx] ModSecurity: Access denied with code 400 (phase 2). Pattern match "^[\\\\d\\\\.]+$" at REQUEST_HEADERS:Host. [id "960017"] [msg "Host header is a numeric IP address"] [severity "CRITICAL"] [hostname "xxx.xxx.x.xxx"] [uri "/ajaxmp"] [unique_id "BFUWMX8AAAEAAA8ewlgAAAAC"]

I then did a grep:

$> grep 960017 /etc/httpd/modsecurity.d/*.conf /etc/httpd/modsecurity.d/modsecurity_crs_21_protocol_anomalies.conf:SecRule REQUEST_HEADERS:Host "^[\d\.]+$" "deny,log,auditlog,status:400,msg:'Host header is a numeric IP address', severity:'2',id:'960017'"

I had found the rule causing the problem and commented it out. I hope there is a beeter solution, perhaps a better rule ???

Thursday, 17 April 2008

Ubuntu Printer Canon IP1700 Pixma using IP1800 driver

I have been trying to install using packages the drivers for the printer of a friend: a Canon IP 1700 PIXMA- It proved more complex than once thought, so she has been without a printer for a while. I had tried as suggested in a few web pages to use the driver of the IP 2200 using alien. I finally managed after I first removed all the drivers I had installed for the IP2200 and then reinstalling IP1800 drivers for debian from some web page. So at least, she has a printer right now. Edit: Note that packages for the version 7.10 of ubuntu can be downloaded there. It seems that for the Hardy Heron (aka Ubuntu 8.04) some other drivers exist, see this page.

Friday, 14 March 2008

java problem with browser

I noticed again that the java applet were not working correctly in firefox. I looked for the solution since I knew I once had solved the problem. But I could not find it easily again. After I tried to start the ControlPanel from the jre, an error occured:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/java/jre1.6.0_03/lib/i386/libdeploy.so: libstdc++.so.5: cannot open shared object file: No such file or directory

I installed the compatibility libraries ( after calling yum provides libstdc++.so.5. Once installed, java applets were working just fine.

Friday, 7 March 2008

Ubuntu Installation

Yesterday, I installed a Ubuntu system on the laptop of a friend of mine, whose windows got bugged by a virus and who could not boot any more.

The laptop of my friend is a Acer Travelmate 2420. There seems to be a number of good experiences with this laptop under linux so I went ahead and tried the installation.

Moving ext3 impossible

The installation went without real problems, apart that as I tried to update ubuntu there was enough place on the hard disk (well there may have been in the end if I had cleaned the cache of the previous updates. I will have to take a look at this). Once I noticed that I tried to resize the root partition. To do this, I resized the home partition. This was no problem. But as I tried to move the ext3 partition, I noticed that it did not work. So I had to reinstall the system another time.

Printer

The other problem I had was installing the printer. She has a Canon PIXMAP IP 1700 and there is no direct driver for this printer. It seems that another driver works: the IP2200 driver. Yet I haven't tried whether it works yet. I wanted to make sure that it does not ruin her printer.

DVD Burner

I will also have to make sure her DVD burner works without problem.

Antivirus

She also asked me to install an antivirus which updates itself. I used for this the antivirus which is accessible from ubuntu (I think it is the clamav). And she tried to install another one from the internet antivir, but I do not know if that works.

Friday, 22 February 2008

User Mode Linux- Mounting File Systems

Mounting an existing file systems in an user mode linux can be done in two manners, either using :

hostfs,
humfs.

Using hostfs implies the following restriction, you should not boot on it, or use devices or sockets on it, and finally the files written by the user mode linux are files with root permissions. For these reasons, it is better to use humfs. There are some issues with page caching under UML. I refer to the origin of these information on this topic, so see here.

Make the directory and cd to it host% mkdir humfs-mount host% cd humfs-mount

the directory hierarchy which should be available to the user mode linux is in a subdirectory called "data". It is possible to use an existing UML filesystem for this, information on this are found at the previously cited source.
As root, run the humfsify utility to convert this directory to the format needed by the UML humfs filesystem: host# humfsify user group 4G

Thursday, 21 February 2008

Kernel - User Mode Linux

While I was playing with kernel programming, I managed to hang my linux box. That's why I decided to use again a usermode linux box to do my module programming. First step was to start an linux user box. As specified on the User-mode Linux Kernel Home Page, download the uml linux kernel version as well as the filesystem file which should be used. For this a number of file system are possible, see for example here. In order to build a uml ou need to download the sources of the kernel. You will also need the readline libraries and the fuse libraries. You might also need others. I haven't yet checked that. Once you have these elements ou can compile a unix kernel with usermode support. Then the only remaining things to do were:

host% make defconfig ARCH=um
host% # now make menuconfig or xconfig if desired
host% make menuconfig ARCH=um
host% make ARCH=um

Note the importance of the ARCH=um parametrisation. The menuconfig is useful for further configuration of your kernel.

Kernel Programming - WORKQUEUE Problems

I decided to look at the API for schedulling in the kernel modules. This turned out to be more time consuming than what I thought. The problem is that the tutorial I have been using is not up-to-date with the current version of the linux kernel. There used to be only one structure to represent some task to perform. But after a change in the kernel workqueue API, there are now to structures: struct work_struct and struct delayed_work. This was the reason why the kernel did not compile in a nice way. I will post here the rest of the small example module for documentation purposes.

Install firefox plugins only once for all users

From the weblog: Urs-o-Log, I obtained the way of installing plugins only once for all applications. prompt $ firefox -install-global-extension One question remains though, what happens when the plugins should be updated. Are they updated individually or ?

Ohloh - an open source software network

The portal Ohloh is an other alternative to freshmeat. It provides a number of statistics and many information which are not available on freshmeat.

Quality Management

The quality of a project is extremely important to consider a project a success. Much work has been done in the enterprise world to offer processes and solutions to ensure the quality of many aspects of projects. The information of this entry are mainly obtained from the article from the article on quality management of the Linux Magazine 12/07. IT Infrastructure Library (ITIL) ITIL is a mean of checking that the infrastructures of a company can ensure the quality of the processes used. It is currently a very important quality management method. The ITIL is a number of books describing aspects of IT processes. Important processes The following processes play an important role in the quality in the ITIL library. Service Support, Service Delivery, Planning to Implement Service Management, Application Management, ICT Infrastructure Management, Security Management. For this a number of norms have been developed:

ISO 9001

DIN ISO 9001 is an important norm for quality. It comes from production companies and remains somewhat generic. The purpose of this norm is to define how the quality should be taken care of.

Important for the ISO 9001 is that the quality is defined on the result of the processes and not on their implementation. For example, it requires that systems be documentated, but does not specify how this documentation should be implemented.

A number of quality management norms have been integrated in the ISO 9001, and many other system are getting closer to the norm.

This norm defines that the quality management responsabilities lie directly by the directing managers.

ISO 20 000

ISO 20 000 defines a norm and is very similar to the ITIL. A company can be certified for this norm through a serie of audits.

Six Sigma

Six Sigma is a very strict way of checking the quality of the processes of an enterprise. For this, there is a central instance as well as many other mechanism to check in all processes the quality of the whole system. ITIL Terminology ITIL uses its own terminology to discuss the quality management of a company.

Wednesday, 20 February 2008

Firefox extensions

I have been fooling around with firefox extensions and tried one of my own. There are a few easy things to remember: create a chrome directory with the following structure:

chrome ->
- content ->
  - - extensionname.js
  - - extensionname.xul
  - - overlay.xul
- locale->
  - en-US->
    - - extensionname.dtd
    - - extensionname.properties
    - - overlay.dtd
  - skins->
components->
defaults->
- preferences->
- chrome.manifest

- install.rdf

Here is a little summary of the purpose of all these files: - the content directory contains the core of the extension, an overlay.xul file which defines some modifications to an existing xul file. The extensionname.xul file contains a xul file for the visualisation of the application, and the extensionname.js contains javascript code used for interaction purposes - the locale contains dtd and property files for the possible localisation of the extension. For each possibility, there is a directory with the name of the locale. The dtd are used for xul files, while the properties files are for javascript files. - the defaults directory is used for things related with the user profile, - while the components contains possible XPCOM components. Finally, install.rdf gives some information about the extension, while the manifest file chrome.manifest indicates on which xul file the overlay is applied and the place where the extension is to be found. To simplify the work, I use a templating method to generate extensions from some definition files. To package the extension once the install.rdf and the manifest file have been correctly edited, you only need to zip it with an ".xpi" as file extension.

JGroups - Reliable Multicasting

The JGroups is a framework I do not know yet. It seems to be required for some applications, so I took a quick look at it. From the documentation of the version 2.5, I could have a better idea of what this toolkit can be used for.

Friday, 11 January 2008

Spring Aspect Oriented Programming

The Spring framework has many features for aspect orientation. I do not define here the usual concepts of aspect oriented programming: join points, cut points, advice, weaving... A few important aspects:

the weaving process of the Spring framework is performed at runtime.
the Spring framework uses proxies to implement the aspect orientation. These proxies come in two flavors, either a JDK dynamic Proxy (?) or a CGLIB proxy (see the web site of CGLIB for this).
no field point cut (use aspectJ for this)
the spring AOP does not implement the complete AOP, but mainly the most important aspects of AOP needed for the Inversion of Control approach of the spring framework for enterprise applications

Differences between JUnit 3.8 and JUnit 4

There is a number of changes between these two versions of JUnit. For backward compability. Old Test method should still work I believe. I will sum up which are the features which changed. package names: version 3.8 used the junit.framework packages whereas 4.0 uses org.junit packages. Inheritance: the Test classes do not have to inherit from TestCase anymore. However, the new JUnit approach makes a stronger use of annotations. Assert changes: In order to use asserts with JUnit, you can either use the Assert.assertEqual(...) ( or similar methods) or import staticly the Assert class ( from Java 1.5 on) using:

import static org.junit.Assert.*;

in the import section of your test file.

Moreover, from JUnit 4.0 on there are also  method to compare arrays of objects.

Initialisation and cleaning:

The initialisation and cleaning were performed using setUp() and tearDown() methods. In version 4.0,
it is not possible anymore since the class does not extend TestCase. To solve this, new Annotations are
used: @Before and @After.

Note that there are also the annotations: @BeforeClass and @AfterClass which are the methods which
are called before loading the class for test and after all the tests have been performed.

Tests

Tests are annotated using the @Test annotation and must also return void and may not have any parameters.
These properties are checked at runtime and issue exceptions if these rules are not respected.

Ignoring Tests
It is possible to ignore a given test by using the @Ignore annotation before or after the @Test annotation


Performing Test

One performs test using the following call:
$ java –ea org.junit.runner.JUnitCore 

where  is the java complete name of the test class.

Timeouts

It is also possible to use a Timeout parameter for the test methods

Parametrised Tests

It is also possible to apply the same test but with different parameters. For this, the annotation
@Parameters may be used together with the class annotation @RunWith(Parameterized.class)

Suite

Like for the preceding version, there is also the possibility to use Suites of Tests. For this the
annotations @RunWith(Suite.class) and @Suite.SuiteClasses({FirstTestClass.class, SecondTestClass.class})

The article I used to write this entry states the lack of support in IDE for the new JUnit 4.0 version. But I suppose
that this changed in the latest versions of eclipse.