[Tools] Customizing skip list of GNU Global (including regular expression)

Tools 2011.07.06 11:37

GNU Global (henceforth Global) is a good source-code tagging tool.
In Global, you can list the files and directories to skip during parsing in the configuration file (~/.globalrc by default).

[ Customization 1 ]
However, for a huge, monster-like system, this file list becomes too big and complex.
One way I resolved this is to use regular expressions (henceforth regex) directly in the configuration file.

To do this, the source code of Global has to be modified.
The following diff is based on GNU Global 5.9.7.

diff --git a/libutil/find.c b/libutil/find.c
index aa12e81..a49bcdf 100644
--- a/libutil/find.c
+++ b/libutil/find.c
@@ -263,9 +263,18 @@ prepare_skip(void)
                char *skipf = p;
                if ((p = locatestring(p, ",", MATCH_FIRST)) != NULL)
                        *p++ = 0;
+               /*
+                * [ Modification For My Private Usage! ]
+                * string starts with '/' is treated as regular expression.
+                * (Not file name!)
+                */
                if (*skipf == '/') {
-                       list_count++;
-                       strbuf_puts0(list, skipf);
+                       reg_count++;
+                       /* put it as it is as regular expression! */
+                       for (q = skipf + 1; *q; q++)
+                               strbuf_putc(reg, *q);
+                       if (p)
+                               strbuf_putc(reg, '|');
                } else {
                        reg_count++;
                        strbuf_putc(reg, '/');
@@ -288,6 +297,8 @@ prepare_skip(void)
                /*
                 * compile regular expression.
                 */
+               printf("********* skip regular expression ***********\n%s\n"
+                      ,strbuf_value(reg));
                skip = &skip_area;
                retval = regcomp(skip, strbuf_value(reg), flags);
                if (retval != 0)

Global keeps its skip list in the following form.

(/<path0>|/<path1> ...)

Any character that has a special meaning in a regex is escaped with a leading '\'.
So every character is treated literally and has no meta meaning when the regex is compiled.
To use a regex directly, this part of the code has to be modified, and the diff above does that.
Another thing to know when using a regex directly is that the file paths Global uses for parsing always start with './'.
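
The stock behavior described above can be sketched in a few lines of C. This is only an illustration, not Global's code: build_skip_regex() and is_meta() are hypothetical names, the set of escaped characters is an assumption (Global uses its own isregexchar()), and the rule that a file entry gets a trailing '$' is inferred from the regex strings shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Global's isregexchar(): assumed metacharacters. */
int is_meta(char c)
{
    return c != '\0' && strchr(".[]()*+?{}|^$\\", c) != NULL;
}

/*
 * Build "(/<path0>|/<path1>...)" from a comma-separated skip list.
 * Every metacharacter is escaped, so each entry matches literally.
 * Entries that do not end in '/' (file names) get a trailing '$'.
 * Returns 0 on success, -1 if `out` is too small.
 */
int build_skip_regex(const char *list, char *out, size_t outsz)
{
    size_t n = 0;
    const char *p = list;

    assert(list != NULL && out != NULL);
/* Append one character, always leaving room for the final NUL. */
#define PUT(ch) do { if (n + 1 >= outsz) return -1; out[n++] = (ch); } while (0)
    PUT('(');
    while (*p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        size_t i;

        PUT('/');                       /* default leading '/' */
        for (i = 0; i < len; i++) {
            if (is_meta(p[i]))
                PUT('\\');              /* escape: no meta meaning */
            PUT(p[i]);
        }
        if (len > 0 && p[len - 1] != '/')
            PUT('$');                   /* file name: anchor at the end */
        p += len;
        if (*p == ',') {
            p++;
            if (*p)
                PUT('|');               /* separator before the next entry */
        }
    }
    PUT(')');
#undef PUT
    out[n] = '\0';
    return 0;
}
```

For the list "tags,project/bin/" this sketch produces "(/tags$|/project/bin/)", and an entry such as "a.b" becomes "/a\.b$", which is why a regex written in the configuration file loses its meaning in the unmodified Global.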

The original Global supports absolute paths in the skip list (entries that start with '/').
In my case, that was almost useless.
So, instead of absolute paths, I modified Global to use that syntax for regex.
That is, in the modified Global, the configuration-file syntax for an absolute path is treated as a regex.
(The printf is for debugging the configuration file. :-) )

Here is an example with the modified version.
(The default expression that is automatically added at the front is omitted.)

Standard case
-------------

[ Configuration ]
:skip=tags,project/bin/,/b[0-9]+\.txt$:

[ Regular expression string ]
(/tags$|/project/bin/)

=> The absolute path is stored in a separate list, not in the regex.

Customized case
---------------

[ Configuration ]
:skip=tags,project/bin/,/b[0-9]+\.txt$:

[ Regular expression string ]
(/tags$|/project/bin/|b[0-9]+.txt$)
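
To see what the customized regex actually skips, the string above can be tried against a few paths with POSIX regcomp()/regexec(). This is a minimal sketch: match_offset() is a hypothetical helper, and REG_EXTENDED is an assumption, since the flags Global passes to regcomp() are not shown in the diff.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return the offset in `path` where the skip regex starts to match,
 * -1 if the path is not skipped, or -2 if the pattern does not compile.
 * REG_EXTENDED is an assumption about how the skip regex is compiled.
 */
int match_offset(const char *skip_re, const char *path)
{
    regex_t re;
    regmatch_t m;
    int rc;

    assert(skip_re != NULL && path != NULL);
    if (regcomp(&re, skip_re, REG_EXTENDED) != 0)
        return -2;
    rc = regexec(&re, path, 1, &m, 0);
    regfree(&re);
    return rc == 0 ? (int)m.rm_so : -1;
}
```

With the regex string above, "./doc/b01.txt" matches at offset 6, and "./src/lib12.txt" also matches, at offset 8, because the regex entry is not tied to a leading '/'. "./b.txt" is not skipped, since [0-9]+ needs at least one digit. If a name must start right after a directory separator, write the '/' explicitly inside the regex.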

[ Customization 2 ]

There is no well-written official documentation for the Global configuration file.
So I'm not sure whether the following behavior is intentional.

In the configuration file, every skip entry gets a leading '/' in the regex.
That is, "skip=build/:" becomes "/build/" in the regex.
Therefore it matches not only "./build/" but also "xxxxx/build/xxxxx".
This always confused me.
So let me introduce a modification to fix it.

--- a/libutil/find.c
+++ b/libutil/find.c
@@ -277,7 +277,10 @@ prepare_skip(void)
                                strbuf_putc(reg, '|');
                } else {
                        reg_count++;
-                       strbuf_putc(reg, '/');
+                       /*
+                        * all local files are started with './'
+                        */
+                       strbuf_puts(reg, "^\\./");
                        for (q = skipf; *q; q++) {
                                if (isregexchar(*q))
                                        strbuf_putc(reg, '\\');

This means that every file path in the skip list is relative to gtags' root.
For me, this is much better and clearer than the original behavior.
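
The difference between the two prefixes can be checked with a small test program. This is a sketch under assumptions: regex_hits() is a hypothetical helper, REG_EXTENDED is assumed, and the two patterns are written by hand in the form the original and the modified code would generate for "skip=build/:".

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if `path` matches the extended regex `pattern`,
 * 0 if it does not, and -1 if the pattern does not compile.
 */
int regex_hits(const char *pattern, const char *path)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && path != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, path, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With the original pattern "(/build/)", both "./build/a.o" and "./src/build/a.o" are hits. With the anchored pattern "(^\./build/)", only "./build/a.o" is a hit, so a nested build directory is no longer skipped by accident.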

Enjoy!
