"LD_PRELOAD" is set to path of shared libraries. And those are loaded at first (even before C runtime).

LD_PRELOAD=./my.so:/path/to/a.so:/path/to/b.so

One good point is that a developer can override symbols in the stock libraries with symbols from the libraries listed in LD_PRELOAD.
For example, 'malloc' can be overridden with a user-defined one by using LD_PRELOAD.

Another good tip is using LD_PRELOAD together with '__attribute__((constructor))'.
'__attribute__((constructor))' is a GCC-specific extension for C/C++.
Functions tagged '__attribute__((constructor))' are registered in the '.ctors' section of the ELF file ('.init_array' with newer toolchains) and run when the shared library is loaded.
('__attribute__((destructor))' functions are registered in the '.dtors' section ('.fini_array' with newer toolchains) and run when the shared library is unloaded.)
So, functions tagged '__attribute__((constructor))' in a library listed in LD_PRELOAD are executed before the 'main' function.
It is fantastic, isn't it?
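Here is a minimal sketch of such a library. The names 'on_load', 'on_unload' and 'constructor_has_run' are made up for illustration; only the two attributes come from GCC.

```c
#include <stdio.h>

/* Hypothetical flag, used only to observe that the constructor ran. */
static int loaded = 0;

/* Runs when the object containing it is loaded, i.e. before main(). */
__attribute__((constructor)) static void on_load(void)
{
    loaded = 1;
}

/* Runs when the object is unloaded or the process exits normally. */
__attribute__((destructor)) static void on_unload(void)
{
    fputs("on_unload: library is going away\n", stderr);
}

/* Returns 1 if the constructor has already run. */
int constructor_has_run(void)
{
    return loaded;
}
```

Assuming the file is saved as hello.c, it can be built with 'gcc -shared -fPIC hello.c -o libhello.so' and tried with 'LD_PRELOAD=./libhello.so ls'.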

A real example is 'stdbuf' in GNU coreutils.
There are two main parts in 'stdbuf'. Here are the details.

libstdbuf.so :
    libstdbuf.so has a function stdbuf() tagged '__attribute__((constructor))'.
    In stdbuf(), the modes of the standard buffers (stdin, stdout, stderr) are modified.

stdbuf
    stdbuf puts 'libstdbuf.so' into LD_PRELOAD.
    And then it calls 'exec()' to run the main program.

==> So, the buffer modes of the standard streams of the main program can be changed.
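The launcher side of this trick can be sketched roughly as follows. This is NOT the real 'stdbuf' source: 'run_with_preload' is a hypothetical helper, and unlike 'stdbuf' (which execs in place) it forks first so that the caller gets the exit status back.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with LD_PRELOAD set to `lib`.
 * Returns the exit status of the program, or -1 on failure. */
int run_with_preload(const char *lib, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: only this process and the program it execs see the variable. */
        if (setenv("LD_PRELOAD", lib, 1) != 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127); /* reached only when exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The dynamic loader reads LD_PRELOAD when the new program starts, so the variable has to be set before 'exec', not after.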

Enjoy the LD_PRELOAD trick!
For a more detailed example, see this post.

There are lots of articles that introduce ways of using a pipe and redirecting standard IO with it.
But whenever you try this, there is always one big issue: the buffer mode!
See the following example.

< Tested on [Ubuntu EGLIBC 2.12.1-0ubuntu9] + [Ubuntu 2.6.35-23-generic-pae] >

#include <stdio.h>
#include <unistd.h> /* pipe, fork, dup2, close, read, write, sleep */
#define _tstr "Sample Text\n"
int
main () {
    int fdp[2];
    pipe (fdp);
    if (0 != fork ()) {/* parent */
        dup2 (fdp[1],1); /* redirect standard out */
        /* fdp is not used anymore */
        close (fdp[0]);
        close (fdp[1]);
        [*A] /* <--- see below */
        sleep(99999999);
    } else {
        int  rb;
        char buf[100];
        dup2 (fdp[0],0); /* redirect standard in */
        /* fdp is not used anymore */
        close (fdp[0]);
        close (fdp[1]);
        if (0 >= (rb = [*B])) { /* <-- see below for [*B] */
            perror ("IO Error");
            rb = 0; /* keep buf[rb] inside the array */
        }
        buf[rb] = 0; /* add trailing 0 */
        printf ("read from input:%s\n", buf);
        sleep(99999999);
    }
}

< *** [*A][*B] pair and result. *** >
OK pairs
    [*A] : write (fdp[1], ...)   |   [*B] : read (fdp[0], ...)
    [*A] : write (1, ...)        |   [*B] : read (fdp[0], ...)
    [*A] : write (1, ...)        |   [*B] : read (0, ...)
    [*A] : printf (_tstr); fflush (stdout) | [*B] : read (0, ...)

NOT OK pairs - printed output is "read from input:" ('_tstr' is not printed immediately)
    [*A] : printf (_tstr);       | [*B] : read (0, ...)
        -> 'fflush' is missing here. But '\n' is at the end of test string...

Why doesn't 'printf' work without 'fflush'?
'printf' uses the standard IO buffer (the buffer is allocated with 'malloc' at the first IO operation).
And because the output device is a pipe, not a console, fully buffered mode is used.
So, until it is flushed, all output stays in the buffer (NOT in the device).
To make the pipe behave REALLY like standard IO on a console, the buffer should be in LINE BUFFERED mode.
So, 'setvbuf()' or 'setlinebuf()' should be called at the beginning of [*A], as follows.

[*A] : setlinebuf (stdout); printf (_tstr);
    OR setvbuf (stdout, (char*)NULL, _IOLBF, 0); printf (_tstr);

It is simple, isn't it?
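The difference between the two modes can be checked with a small self-contained sketch. It is a variation of the test above, not the same program: 'pipe_demo' is a hypothetical helper, the writer is the child here, and it ends with '_exit' (which does not flush stdio buffers) instead of sleeping, so the result is deterministic.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a writer that printf()s one line to a pipe and then calls _exit().
 * Returns the number of bytes the reader received, or -1 on error. */
int pipe_demo(int line_buffered, char *out, size_t cap)
{
    int fdp[2];
    if (cap == 0 || pipe(fdp) != 0)
        return -1;
    fflush(NULL); /* do not let the child inherit pending output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fdp[0]);
        close(fdp[1]);
        return -1;
    }
    if (pid == 0) { /* writer */
        dup2(fdp[1], STDOUT_FILENO); /* redirect standard out */
        close(fdp[0]);
        close(fdp[1]);
        /* _IOFBF is what a pipe gets by default; it is set explicitly here
         * so the demo does not depend on what stdout was before the fork. */
        setvbuf(stdout, NULL, line_buffered ? _IOLBF : _IOFBF, BUFSIZ);
        printf("Sample Text\n");
        _exit(0); /* skips the stdio flush that exit() would do */
    }
    close(fdp[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fdp[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fdp[0]);
    waitpid(pid, NULL, 0);
    return (int)total;
}
```

With line buffering the reader gets the whole line. With full buffering it gets nothing, because the text never leaves the writer's buffer.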

Here is a more complicated case.
See the following example.

< Tested on [Ubuntu EGLIBC 2.12.1-0ubuntu9] + [Ubuntu 2.6.35-23-generic-pae] >

< main.c >
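#include <stdio.h>
#include <stdlib.h> /* putenv (used in [*C]) */
#include <unistd.h> /* pipe, fork, dup2, close, read, execlp, sleep */
int
main() {
    int fdp[2]; /* pipe */
    pipe (fdp);
    if (0 != fork()) { /* parent */
        dup2 (fdp[1], 1); /* redirect standard out */
        close (fdp[0]);
        close (fdp[1]);
        [*C] /* <-- see below */
        /* "./test" avoids a PATH lookup (which could find /usr/bin/test); "test" is argv[0] */
        execlp ("./test", "test", (char*)0); /* run test (*1) */
    } else {
        int  rb;
        char buf[100];
        dup2 (fdp[0],0); /* redirect standard in */
        /* fdp is not used anymore */
        close (fdp[0]);
        close (fdp[1]);
        if (0 >= (rb = [*B])) { /* [*B] : read (0, ...) as in the first example */
            perror ("IO Error");
            rb = 0; /* keep buf[rb] inside the array */
        }
        buf[rb] = 0; /* add trailing 0 */
        printf ("read from input:%s\n", buf);
        sleep(99999999);
    }
    return 0;
}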

< test.c > => test (executable)
#include <stdio.h>
#include <unistd.h> /* sleep */
int
main () {
    printf("This is Test!\n");
    sleep(99999999);
}

In the above case, the string from 'test' ("This is Test!") is not printed to the console immediately, because it is buffered.
(Assume that < test.c > MUST NOT be modified!)
Is there a solution? Yes.
Before moving to the next step, see this post first.

The combination of LD_PRELOAD and __attribute__ ((constructor)) is the solution.
To do this, a new file is added to build a shared library that will be preloaded.

< mystdbuf.c > => libmystdbuf.so
#include <stdio.h>
__attribute__ ((constructor)) static void
mystdbuf () {
    setvbuf (stdout, (char*)NULL, _IOLBF, 0);
}

And add the following code to section [*C].

putenv ("LD_PRELOAD=./libmystdbuf.so");

Resolved!

Another easy and popular solution is the 'stdbuf' command in GNU coreutils.
Replace (*1) with

execlp ("stdbuf", "stdbuf", "-oL", "./test", (char*)0);

As described in the post linked above, the mechanism of 'stdbuf' is exactly the same as the manual solution above!
Done!

The following test was done with "Ubuntu EGLIBC 2.12.1-0ubuntu9".

Based on my simple test, the 'select' function is poor at validating file descriptors.
In my test, select doesn't return an error (-1) even for file descriptor 99999. Sometimes the process crashes with a segmentation fault.
(A likely cause: an 'fd_set' only has room for FD_SETSIZE descriptors, and using FD_SET with a descriptor outside that range is undefined behavior.)
So, file descriptors should be validated before calling 'select'.
(I think I need to look into this more. These notes are based on a simple test only, so I'm not 100% sure about this.)

There are lots of ways. Here is a well-known one.

int is_valid_fd(int fd) { return fcntl(fd, F_GETFL) != -1 || errno != EBADF; }

Summary: be careful when using 'select'!
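A slightly fuller sketch of the check, with the headers it needs. 'is_selectable_fd' is a hypothetical helper added here; it also rejects descriptors that do not fit into an 'fd_set', which is my assumption about what went wrong with descriptor 99999.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>

/* Returns 1 if fd refers to an open file descriptor, otherwise 0. */
int is_valid_fd(int fd)
{
    return fcntl(fd, F_GETFL) != -1 || errno != EBADF;
}

/* Returns 1 if fd is open AND small enough to be stored in an fd_set. */
int is_selectable_fd(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE && is_valid_fd(fd);
}
```

Call 'is_selectable_fd(fd)' before 'FD_SET(fd, &set)'. For descriptors that can exceed FD_SETSIZE, 'poll' does not have this limit.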


Usually, a pointer to an incomplete (opaque) structure type is used to hide a module's internals.
C beginners may use 'void*' to do this. But it's not a good way.
Let's see the following example. (*1) and (*2) are alternatives; only one of them is used at a time.
(The following code was tested with GCC 4.4 and the '-Wall' option.)

typedef void module_t;             /* <-- *1 */
typedef struct _sModule module_t;  /* <-- *2 */

module_t* create_module(int arg);
int       do_something(module_t* m, int arg);
...
do_something((int*)m, 1); /* <-- *a */

A pointer to any object type can be converted to 'void*', and 'void*' can be converted to a pointer to any object type, without a warning.
So, in case of (*1), (*a) doesn't generate any warning. That is, it is NOT TYPE SAFE (at compile time).
But in case of (*2), GCC gives a warning like "... incompatible pointer type ...". It's TYPE SAFE.
And, interestingly, the compiler doesn't complain about (*2) even when 'struct _sModule' is not defined, because it doesn't need to know the size of 'struct _sModule'.
Only a pointer is used. So, knowing the size of the pointer type is enough, and the compiler already knows it.
So, in terms of syntax, it's OK too!
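Put together, variant (*2) could look like the sketch below. Everything is in one file only to keep it short; normally the first part is the public header and the second part is the .c file. The member 'value' and the function 'destroy_module' are made up for illustration.

```c
#include <stdlib.h>

/* ---- public part (header): the struct layout is NOT visible here ---- */
typedef struct _sModule module_t;

module_t *create_module(int arg);
int       do_something(module_t *m, int arg);
void      destroy_module(module_t *m); /* hypothetical counterpart */

/* ---- private part (.c file): only here the layout is known ---- */
struct _sModule {
    int value;
};

module_t *create_module(int arg)
{
    module_t *m = malloc(sizeof *m);
    if (m != NULL)
        m->value = arg;
    return m;
}

/* Adds arg to the hidden value and returns the new value. */
int do_something(module_t *m, int arg)
{
    m->value += arg;
    return m->value;
}

void destroy_module(module_t *m)
{
    free(m);
}
```

Users of the header can hold and pass 'module_t*', but 'm->value' or 'sizeof(module_t)' does not compile outside the private part.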

Boolean to integer - C.

Let's think about a function that returns 0 if false, otherwise 1.

 => Naive way : return (e)? 1: 0;

C originally has no boolean type (C99 added '_Bool'). Instead, 0 is false and any non-zero value is true.
So there is no single fixed value that represents TRUE in an expression like 'e'.
But the '!' operator is defined to yield an 'int' that is exactly 0 or 1 (C99 6.5.3.3). (In C++, 4.5/4 says that 'true' is promoted to 1 and 'false' to 0.)
So, we can improve this to

 => Better way : return !!(e);

This way is also useful because we get a fixed integer value, 1, for any TRUE value.
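A tiny sketch to see it in action ('to_bool' is just a name chosen for this example):

```c
/* Normalizes any truth value to exactly 0 or 1. */
int to_bool(int e)
{
    return !!e;
}
```

For example, 'to_bool(42)' and 'to_bool(-7)' both give 1, and 'to_bool(0)' gives 0.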

* The ORDER in which function ARGUMENTS are EVALUATED is NOT SPECIFIED.
The only requirement is that all of them are fully evaluated before the function is called.
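When the order matters, a simple way out is to evaluate into local variables first. In the sketch below (all names are made up), 'subtract(next_id(), next_id())' could give -1 or 1 depending on the compiler, while the version with locals always gives -1.

```c
static int counter = 0;

/* Function with a side effect: returns 1, 2, 3, ... */
static int next_id(void)
{
    return ++counter;
}

static int subtract(int a, int b)
{
    return a - b;
}

/* Forces the evaluation order with two separate statements. */
int ordered_difference(void)
{
    counter = 0;
    int first  = next_id(); /* always evaluated first: 1 */
    int second = next_id(); /* always evaluated second: 2 */
    return subtract(first, second);
}
```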

Getting the return address of the current function is very simple... I'm just writing it down here as a reminder to myself.

=== ARM(32bit) - RVCT ===
/* #pragma O0 <- due to optimized out, this may be required */
{ /* Just Scope */
    void* ra;
    /* register lr(r14) has return address */
    __asm
    { mov ra, lr }
    /* now variable 'ra' has return address */
}

=== x86(32bit) - GCC ===
{ /* Just Scope */
    register void* ra; /* return address */
    /* return address is stored 4 bytes above 'ebp' (needs a frame pointer, so no -fomit-frame-pointer) */
    asm ("movl 4(%%ebp), %0;"
         :"=r"(ra));
    /* now variable 'ra' has return address */
}
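
If portability across architectures matters more than seeing the raw registers, GCC (and Clang) also offer a builtin for this. A minimal sketch, where 'current_return_address' is a hypothetical wrapper name:

```c
#include <stddef.h>

/* Returns the address this function will return to.
 * 'noinline' keeps the compiler from merging it into the caller,
 * which would change whose return address is reported. */
__attribute__((noinline)) void *current_return_address(void)
{
    return __builtin_return_address(0);
}
```

The argument 0 means "the current function". Other levels walk up the call stack and are less reliable.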


A signal is used for interaction between User Mode processes (henceforth UMP) and for the kernel to notify processes of system events.
There are lots of materials that explain what a signal is, so let's skip that.
The point of this article is "How is a user signal handler (henceforth USH) executed?" in Linux.

The core of what the Linux kernel does to deliver a signal is modifying the stack of the UMP, usually by adding data.
This is very important! The UMP's stack itself is changed!
The kernel changes the UMP's stack and register values as if the USH had been called from a specific function; let's call it F.
(For example, PC is set to the USH, and the return address in the stack is set to function F.)
And usually F is just a system call: sigreturn. In this system call, the kernel restores the UMP's stack to its original values.
Here is the simplified flow.

Signal is issued -> Kernel changes UMP's stack -> USH is executed -> return to function F -> System call (sigreturn) -> UMP's stack is restored -> UMP continues normally.

In case of a multi-threaded process, the thread's stack is changed. Nothing is different.
Understood? Then what is the point?
The point is that the USH runs in the context of the interrupted process / thread, in User Mode.
Let's see the following code.

/* Timer is used for example */
static pthread_mutex_t _m;
...
static void
_signal_handler(int sig, siginfo_t* si, void* uc) {
    pthread_mutex_lock(&_m);
    ...
    pthread_mutex_unlock(&_m);
}

int main(...) {
    ... /* signal is requested (ex timer) somewhere here */
    pthread_mutex_lock(&_m);
    ... /* <--- *a */
    pthread_mutex_unlock(&_m);
    ...
    return 0;
}

Can you imagine what I am going to talk about?
As I mentioned above, the signal handler runs in the context of the thread that receives the signal. So, if the signal is delivered at (*a), the program is stuck in a deadlock!
So, the signal handler of the above code should look like this.

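static void*
_signal_handler_thread(void* arg) {
    pthread_mutex_lock(&_m);
    ...
    pthread_mutex_unlock(&_m);
    return NULL;
}

static void
_signal_handler(int sig, siginfo_t* si, void* uc) {
    /* NOTE: pthread_create is not on the POSIX list of
     * async-signal-safe functions, so this shows the idea only. */
    pthread_t thd;
    if (0 == pthread_create(&thd, NULL, &_signal_handler_thread, NULL))
        pthread_detach(thd); /* nobody joins this thread */
}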

Done!


Here is an example.

sizeof("12345") == 6

What does this mean? The type of a hard-coded string (like "12345") is an array, char[6] here (5 characters plus the terminating '\0'), not char*.
It's important and interesting! I hadn't known this for several years! Hmm...
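
A small sketch that puts the three sizes side by side (the function names are made up for this example):

```c
#include <stddef.h>
#include <string.h>

/* Size of the array type char[6]: 5 characters plus '\0'. */
size_t literal_size(void)
{
    return sizeof("12345");
}

/* Length as seen by strlen: the '\0' is not counted. */
size_t literal_length(void)
{
    return strlen("12345");
}

/* Once the literal is stored in a pointer, sizeof sees only the pointer. */
size_t pointer_size(void)
{
    const char *p = "12345";
    return sizeof(p);
}
```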

This is an example of using function pointers in C. (Basically, the C++ case can be easily inferred from the C one.)

The syntax related to function pointer types is quite different from other, normal(?) ones.
So, here is a summary.

* type definition
typedef void (*<type name>) (int, void*);

* variable definition
void* (*<var name>) (int, void*) = NULL;
=> define a function pointer variable whose type is "void*(*)(int, void*)" and initialize it to 'NULL'.

* type casting.
fpointer1 = (void*(*)(int, void*)) fpointer2;
=> cast function pointer 'fpointer2' into "void*(*)(int, void*)" type.

* function that returns function pointer - float(*)(float, float).
=> float (*get_ptr(char c))(float, float); <- parameter is "char c", return type is "float(*)(float, float)"

* function - returns function pointer - pointer variable .
=> void(*(*var_name)(char c)) (void*); <- return function pointer type is "void(*)(void*)", parameter is "char c" and variable name is 'var_name'

* declare function pointer variable whose return type is function pointer.
=> int(* (*get_func)(int))(int, int); <- parameter of function is "int" And it returns function pointer "int(*)(int,int)"

* typecasting to function pointer whose return type is function pointer.
=> int(*(*)(int))(int, int); <- pointed function's return type is "int(*)(int,int)".

* array of function pointer.
=> static int(*array_name[])(float) = { ... }; <- "int(*)(float)" typed function pointer array.

* pointer to function pointer
=> void*(**<var name>)(int, void*); <- <var name> is a pointer to a function pointer. Its type is "void*(**)(int, void*)"
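
Several of the declarations above, put into one compilable sketch. 'add', 'mul', 'binop_t', 'ops', 'getter' and the 'apply_*' helpers are names invented for this example; 'get_ptr' follows the declaration from the summary.

```c
static float add(float a, float b) { return a + b; }
static float mul(float a, float b) { return a * b; }

/* type definition */
typedef float (*binop_t)(float, float);

/* function that returns a function pointer of type float(*)(float, float) */
float (*get_ptr(char c))(float, float)
{
    return (c == '*') ? mul : add;
}

/* the same signature, written with the typedef */
binop_t get_ptr_typedef(char c)
{
    return get_ptr(c);
}

/* array of function pointers */
static float (*ops[])(float, float) = { add, mul };

/* pointer to a function that returns a function pointer */
static float (*(*getter)(char))(float, float) = get_ptr;

/* Calls every function in the array and sums the results. */
float apply_all(float a, float b)
{
    float sum = 0.0f;
    for (unsigned i = 0; i < sizeof ops / sizeof ops[0]; i++)
        sum += ops[i](a, b);
    return sum;
}

/* Looks up the operation through 'getter' and calls it. */
float apply_via_getter(char c, float a, float b)
{
    return getter(c)(a, b);
}
```

The typedef version is usually much easier to read than the raw declarator.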

Let's narrow the subject down to the 'C' case.

When implementing a reusable library, sometimes we want to attach some hidden information to the memory allocated by the library. To achieve this, the library may provide a special paired interface. For example

    void* mylib_malloc(unsigned int size);
    void mylib_free(void* p);

To simplify the discussion, let's assume that we want to add just one hidden 'int' value.
We can easily think of two options.
- add the hidden value to the end of the allocated memory. -- (*1)
- add it to the start of the allocated memory. -- (*2)

But we cannot use (*1) because, in general, we cannot know the size of the allocated memory from its address. Therefore there is no general way to find the hidden value from the address.
So, (*2) is the only way we can use. Actually, this is the usual way to implement the standard 'malloc' function, too.

Then, the memory block structure will be like this.

            returned value
             of 'mylib_malloc'
                 |
                 v
    +------------+-------------------
    | hidden-int | user-allocated
    +------------+-------------------

We should keep these points in mind.
- The size of the hidden area depends on the data alignment constraint. That is, the address of the user-allocated space must obey the data alignment constraint. For example, on 32bit ARM, 4 byte data alignment is required. So, the size of the hidden area should be a multiple of 4 to get a 4-byte-aligned address for the user space.
- 'mylib_malloc/mylib_free' must be used as a pair. (The standard 'free' function must not be used on these pointers: the address is not the one 'malloc' returned, so the behavior is undefined.)
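
A minimal sketch of option (*2), under a few assumptions of mine: the hidden value is a fixed tag 'MYLIB_MAGIC', 'mylib_hidden_value' is a helper added only to show how the value is found again, and the header is padded with C11 'max_align_t' so the user pointer gets the strictest fundamental alignment instead of a hand-picked multiple of 4.

```c
#include <stddef.h>
#include <stdlib.h>

#define MYLIB_MAGIC 0x4d4c /* hypothetical hidden value */

/* Hidden header placed right before the user memory. */
typedef union {
    int         hidden;
    max_align_t align; /* pads the header to the strictest alignment */
} mylib_header_t;

void *mylib_malloc(unsigned int size)
{
    /* NOTE: where size_t is as narrow as unsigned int, this sum can wrap
     * around for huge sizes; a real library should check for that. */
    mylib_header_t *h = malloc(sizeof *h + (size_t)size);
    if (h == NULL)
        return NULL;
    h->hidden = MYLIB_MAGIC;
    return h + 1; /* user memory starts right after the header */
}

/* Steps back from the user pointer to the header and reads the hidden value. */
int mylib_hidden_value(const void *p)
{
    return ((const mylib_header_t *)p - 1)->hidden;
}

void mylib_free(void *p)
{
    if (p != NULL)
        free((mylib_header_t *)p - 1); /* pass the ORIGINAL address to free */
}
```

Because 'mylib_free' steps back by exactly one header, passing it a pointer that did not come from 'mylib_malloc' is as wrong as passing a 'mylib_malloc' pointer to the standard 'free'.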
