Linux Process 2 ~ Hash

2月 13, 2008

上一次有提到說，在 Linux Kernel 裡面找尋特定 PID 的 task 不是使用 sequential search，而是使用 hash 的方式，所以現在就來看看它的 hash 到底是怎麼做的！

Kernel: 2.6.23

透過 PID 找尋 task 的 API 是下面那一支( kernel/pid.c )：


struct task_struct *find_task_by_pid_type(int type, int nr)
{
    return pid_task(find_pid(nr), type);
}

type? type 是什麼呢？幹嘛要有 type? 先來看看宣告( include/linux/pid.h )吧：


enum pid_type
{
    PIDTYPE_PID,     // pid
    PIDTYPE_PGID,    // pgid
    PIDTYPE_SID,     // session
    PIDTYPE_MAX      // 用來擔任 enum 的最大值（也許是作為檢查用）
};

後面的註解是我自己加上去的。簡單來說，Linux Kernel 不是單單建立一個 hash table，而是多個，所以要給一個 type 來說明要找哪一個 hash table。接下來就繼續看 pid_task 和 find_pid 到底是幹嘛囉～


struct task_struct * fastcall pid_task(struct pid *pid, enum pid_type type)
{
    struct task_struct *result = NULL;
    if (pid) {
         struct hlist_node *first;
         first = rcu_dereference(pid->tasks[type].first);
         if (first)
             result = hlist_entry(first, struct task_struct, pids[(type)].node);
    }
    return result;
}

struct pid * fastcall find_pid(int nr)
{
    struct hlist_node *elem;
    struct pid *pid;

    hlist_for_each_entry_rcu(pid, elem,
       &pid_hash[pid_hashfn(nr)], pid_chain) {
   if (pid->nr == nr)
       return pid;
}
return NULL;
}

應該看出來了吧，在給定 Pid 之後，會先透過一個 pid_hashfn 的函示對 Pid 進行 Hash 的動作，然後在根據 Hash 出來的結果，到 pid_hash table 裡面進行搜尋（因為 Hash 的結果可能發生 collision），最後再去透過 hlist_entry 來取出它的 task_struct。

最後，來注意三件事情。

第一，Hash function 到底是如何實作的？

在 linux/kernel/pid.c 裡面，我們可以看到如下的定義：

#define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)

繼續找下去，我們可以在 include/linux/hash.h 裡面看到下面的程式碼。


#if BITS_PER_LONG == 32
/* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
#define GOLDEN_RATIO_PRIME 0x9e370001UL
#elif BITS_PER_LONG == 64
/*  2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
#define GOLDEN_RATIO_PRIME 0x9e37fffffffc0001UL
#else
#error Define GOLDEN_RATIO_PRIME for your wordsize.
#endif

static inline unsigned long hash_long(unsigned long val, unsigned int bits)
{
      unsigned long hash = val;

#if BITS_PER_LONG == 64
      /*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
      unsigned long n = hash;
      n <<= 18;        
      hash -= n;         
      n <<= 33;   
      hash -= n;  
      n <<= 3;    
      hash += n; 
      n <<= 3;  
      hash -= n;   
      n <<= 4;   
      hash += n;  
      n <<= 2;  
      hash += n;  
#else
      /* On some cpus multiply is faster, on others gcc will do shifts */                          
      hash *= GOLDEN_RATIO_PRIME;  
#endif                 
      /* High bits are more random, so use them. */  
      return hash >> (BITS_PER_LONG - bits);
}

真是有趣的寫法，簡單來說，Knuth 認為乘上一個接近黃金比例的質數時，可以有不錯的結果。在註解裡面來付了一篇論文：http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf 來加以佐證。

第二，Hash Table 的大小到底是怎麼決定的？

在開機的時候才會決定，詳情請見 void __init pidhash_init(void)

第三，struct hlist_head 是？

其實跟之前介紹的 struct list_head 用法上幾乎一樣，唯一不同的是它不是 double linked list，而是 single linked list。根據註解的說法，因為只是拿來做 Hash 使用，所以使用 double linked list 會有點浪費。

搜尋此網誌

Start a New Journey

Linux Process 2 ~ Hash

留言

張貼留言

這個網誌中的熱門文章

如何將Linux打造成OpenFlow Switch：Openvswitch

我弟家的新居感恩禮拜分享：善頌善禱

Linux Virtual Interface: TUN/TAP