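I want to state clearly up front that the following is a personal opinion written without any logical basis.

There is a saying: "Wealth does not last three generations." A similar, often-repeated claim is that "chaebols run into trouble in the third generation", the so-called "third-generation crisis theory". Is that really so? I don't know. As I said in the first line, I have not looked for any supporting evidence. Still, it does seem true that such sayings circulate widely.
Let's assume the proposition is plausible and try to find a reason for it.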

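I don't know where it comes from, but I once heard this: "The founder built the business through hardship with his own hands. The second generation grew up watching that hardship and sharing some of it, so they know the hard and dark sides of the business well. The third generation, however, grew up in a stable situation from the start without much hardship, so they do not know the various sides of the business and are weaker at responding to and overcoming crises. That is why problems usually begin with the third generation."
What do you think? Does it sound convincing? Of course, this argument assumes the case where "the founder starts the business, and the second generation expands and stabilizes it before handing it over to the third generation" (in fact, many chaebols in Korea follow this pattern). I, too, have agreed with this. Recently, however, I came to a new point of view, and I would like to talk about it.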

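First, what is the difference between a "monarch" of old and the so-called "chairman" of a chaebol? If the two positions are nearly the same, why is the "third-generation crisis theory" rarely applied to the hereditary monarchies of the past? I think the biggest difference is whether there is a constant threat to the "throne". In old kingdoms, even after primogeniture was institutionalized, succession by the eldest son often did not go smoothly because of plots and intrigue. The first in line had to feel a constant threat to his life. Even after he became monarch, it took a long time for the throne to become secure.
Modern chaebols, however, face few such threats, and threats to life even less. In particular, when there is only "one" son, succession goes through without problems in most cases (in Korea, although much has changed now, succession has so far favored the "male" "eldest son"). In other words, there is no threat or competition, or only a very weak one. I think this is the biggest difference between the "monarch" of the past and the "chairman" of today, and that this difference produces the gap between the "lifespan of a chaebol" and the "lifespan of a kingdom", that is, the third-generation crisis theory.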

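Then, concretely, what difference does this make? Is it the difference, mentioned earlier, between "a generation that went through hardship" and "a generation that did not"? And more concretely, what does that difference amount to?
I think the fundamental factor is the difference in "the eye for people" that results from all of this. Around someone whose succession was settled from the start, without hardship, the many kinds of people have no reason to reveal what they really think. People show their many sides as relationships change. The attitude of the weaker party ("eul") toward the stronger party ("gap") cannot be the same as the attitude of "gap" toward "eul". Someone who is always in the "gap" position sees only one "cross-section" of a person, the one that shows when that person is "eul". Judging people properly from this alone is not easy.
"Going through hardship", as mentioned earlier, means experiencing a network of people with varied power dynamics along the way. You become "gap", then "eul"; you have to trust someone, and you experience betrayal. Through this process you find people you can trust and rely on, and you develop an eye for people. But what about someone who has only ever been in the position of "absolute gap"? Can such a person really know the many sides of people and have the "eye" to judge them?
In the end, the seat of a "monarch" or a "chairman" is a seat for "putting people to use". "Personnel is everything", as the saying goes. Of course the ability to make decisions matters too, but one person cannot make every decision. That is why "appointing people who can make good decisions" is the most important job. But what if you have no eye for people? I think this is the fundamental problem behind the "third-generation crisis theory".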

Let me share a quite interesting observation.

Environment

  • clang++: aarch64-linux-android34-clang++ from the Android NDK.
  • Run on an Android device.

Consider the following project.

a.cpp

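#include <map>
#include <string>

namespace {
    std::map<std::string, int> m0_;  // namespace scope: dynamically initialized at startup
}
static std::map<std::string, int> m1_;  // namespace scope: dynamically initialized at startup

void noticeFromConstructor(const char *name, int value) {
    static std::map<std::string, int> m2_;  // function-local: initialized on the first call
    m0_[name]  = value; // (*1)
    m1_[name]  = value; // (*2)
    m2_[name]  = value; // (*3)
}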

b.cpp

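void noticeFromConstructor(const char *name, int value);

namespace {
__attribute__((constructor)) void notify() {
    noticeFromConstructor("b", 1);
}
}

c.cpp

void noticeFromConstructor(const char *name, int value);

__attribute__((constructor)) static void notify() {
    noticeFromConstructor("c", 2);
}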

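SIGSEGV is raised at both (*1) and (*2) when it runs, from both b.cpp and c.cpp, but (*3) is OK. I think it's because m0_ and m1_ are not initialized yet when the constructor functions run, while m2_, being a function-local static, is initialized on the first call.
I'm not sure whether there is an explicit rule about the initialization order for this case. As far as I know, the GCC documentation of the constructor attribute only says that the order between constructors of C++ objects with static storage duration and functions marked with the constructor attribute is unspecified.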


In early 2022, I used glib (2.66.x) for about 3 months in a project running on OpenEmbedded Linux. Based on this short (3 months) experience, I would like to review it. I mostly used the gio and gdbus parts of glib, so this review focuses on those features.

Pros

  • It seems that glib is very popular. However, I am not sure whether gio and gdbus are as popular as glib itself.
  • It has a large set of features.
  • It is easy to use from both C and C++.
  • It supports multiple platforms (Linux, Windows and so on).

Cons

  • Lack of documentation. It was not easy for me to find useful documents and examples. Again, it's just my opinion. I think it may be caused by frequent API changes.
  • Some APIs don't report errors to the caller. That is, there is no way for the caller to detect an error that happened inside the API.
    • For instance, in my case g_thread_pool_push() printed some error messages to stderr but didn't return any error value to the caller. I think it's because many glib APIs provide very high-level functionality and use various techniques, such as memory pools and async operations, to improve performance and resource usage. However, having no programmatic way for the caller to detect an API error is very critical. I think it may be a critical defect when developers consider using glib in a project requiring a high level of stability, such as a system daemon.
  • The behavior of Linux file descriptors does not seem intuitive to me. One of the reasons may be that glib provides a unified API for multiple platforms. For instance, when passing a file descriptor via GDBus, even if the receiver receives only one file descriptor, sometimes two new open file descriptors are created.
    • When using org.freedesktop.login1.Manager.Inhibit, even if only one file descriptor is passed via GDBus, two new file descriptors are created at /proc/<pid>/fd. So, even if the FD (file descriptor) passed via GDBus is closed, the inhibitor cannot be released because of the other open FD.
    • The same scenario works as expected if sd-bus is used instead of GDBus.

Summary

Once you get over the lack of documentation and get used to it, it looks like a very powerful and useful library. However, as mentioned above, because some APIs don't report errors, its usability may be limited in domains that demand a very high level of stability.


The title says it all (surprisingly).

This is my test.

Test source code - a.c

#include <stdio.h>


int main() {
    int a = 0;
    printf("%d\n", 1 / a);
    printf("Done\n");
    return 0;
}

Build it for x64 (ubuntu 22.04).

$ lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:    22.04
Codename:    jammy

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

$ gcc a.c
$ ./a.out
Floating point exception (core dumped)

However, build it for aarch64 with the android-ndk toolchain.

$ android/ndk/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android33-clang a.c
$ adb push a.out /data/a.out
$ adb shell /data/a.out
0
Done

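No error! It just gives ZERO!! Integer division by zero is undefined behavior in C, so both results are allowed: on x86-64 the divide instruction traps and Linux reports it as SIGFPE, while the AArch64 integer divide instruction does not trap and simply returns 0.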

You can get a depth image from OpenGL or Vulkan. This article assumes that Vulkan's VK_FORMAT_D32_SFLOAT is used as the format of the depth buffer.
In this case, the pixel value of the depth image is the z-value in NDC space. So, you can reconstruct its position in world space by using the inverse of the transformation matrix, including the perspective matrix.
Then, how can you get this depth image from Blender?
First of all, to get depth values from Blender, you have to create an Output-File-Node with the OpenEXR image format and connect it to the Depth (or Z) attribute.
But the value stored in this image is not the depth (Z) value in NDC space. It's the depth value in your view space. So, it's good practice to clamp the value between near-Z and far-Z.
After clamping, the value is between near-Z and far-Z. To get the real z-value in NDC space, you can use your projection matrix (usually a perspective matrix).
You don't need to know the (x, y) value corresponding to this depth value; you can easily see this by having a look at the perspective matrix multiplication.
In summary, you have to follow the steps below.
- Get the OpenEXR depth image in view space from Blender.
- Convert it by using the projection matrix to get the depth value in NDC space.

    : You can even use Python: cv2, struct.(un)pack, torchvision, PIL and so on.

- Use the (u,v) value and the depth value of the image to reconstruct the position in view space.
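The conversion step above can be sketched in plain Python. This is only a sketch under stated assumptions: a standard (non-reversed) perspective projection with the Vulkan depth range [0, 1], and a depth value given as a positive distance along the view axis. The function names and the near/far values are hypothetical and only for illustration.

```python
def clamp(value, low, high):
    """Clamp value into [low, high]."""
    return min(max(value, low), high)


def view_depth_to_ndc(d, near, far):
    """Map positive view-space depth d to NDC z in [0, 1].

    Assumes a standard perspective matrix with depth range [0, 1]:
    z_ndc = far * (d - near) / ((far - near) * d)
    """
    d = clamp(d, near, far)
    return far * (d - near) / ((far - near) * d)


def ndc_to_view_depth(z, near, far):
    """Inverse mapping: NDC z in [0, 1] back to positive view-space depth."""
    return near * far / (far - z * (far - near))


# Hypothetical clip planes, only for demonstration.
NEAR, FAR = 0.1, 100.0
for depth in (0.1, 1.0, 10.0, 100.0, 1.0e10):
    print(depth, view_depth_to_ndc(depth, NEAR, FAR))
```

The last sample (1.0e10) stands for a very large value such as a background pixel; clamping maps it to far-Z before the conversion.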

This is not a Linux kernel topic. I would like to introduce one popular way of protecting the SELinux policy files that is used in lots of Linux (including embedded Linux) systems.

systemd is a user process that is widely used as the init process on Linux these days. If SELinux is enabled, systemd loads the SELinux data from the SELinux root directory (/etc/selinux by default) at a very early stage. After that, the registered services are started.

Here is sample mount status.

...(skip)...
overlay on /etc type overlay (rw,relatime,rootcontext=system_u:object_r:etc_t:s0,seclabel,lowerdir=/etc,upperdir=/overlay/etc,workdir=/overlay/.etc-work,x-systemd.automount)
...(skip)...

And you can easily find the systemd service doing this mount task.

According to the startup steps of systemd, the SELinux policy is loaded before /etc/ is hidden behind the overlay filesystem. So, the original SELinux data can be safely protected from users.

This is a very popular way of protecting original data from users. You can apply this trick to various cases on your system.

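Overview

This happened quite a while ago, but I leave it here as a record of the experience.

Environment

  • Team: five people in a completely flat relationship (and of similar age), centered on one person who led the planning.
  • Project: ran for one year with agreement on only the rough direction.
  • Situation: personnel evaluations depended on whether the project succeeded.

Progress

Early stage

  • Opinions clashed while the direction of the project was being made concrete.
  • With no decision maker, the conflicts were hard to mediate.
  • Situations where people hurt each other's feelings came up from time to time.

Middle stage

  • People avoided bringing up topics that were likely to cause conflict. As a result, important decisions were not made properly.
  • Those who disagreed with the planning direction did not take an active part in the work.
  • Productivity suffered.

Late stage

  • The project's results fell short of expectations.

Summary

  • It is not easy to make an organization succeed when it is led by several people with similar responsibility and authority. I don't know whether this is better or worse than an organization led by one person, but there seem to be more things to consider.
  • I think organizations where one person holds the decision-making power (responsibility and authority) are the common form because of accumulated experience of this kind.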

Test environment: Node 12.18.2 on Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz


const SZ = 99999999


const l2a = [];
for (let i = 0; i < 10000; i++) {
    const sub = [];
    for (let j = 0; j < 10000; j++) {
        sub.push(j);
    }
    l2a.push(sub);
}

let s = new Set();
let t = -Date.now();
for (const a of l2a) {
    for (const n of a) {
        s.add(n);
    }
}
t += Date.now();
console.log('Takes: ', t);

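const ta = [];
t = -Date.now();
for (const a of l2a) {
    // NOTE: concat() returns a new array and does not modify ta.
    // The result is discarded here, so ta stays empty.
    ta.concat(...a);
}
s = new Set(ta);
t += Date.now();
console.log('Takes: ', t);

Result is

Takes:  3070
Takes:  1665

Concatenating and spreading arrays is very fast compared to calling a function (Set.add). Note, however, that the second timing only covers the spreading and concatenating itself: because the result of concat() is not assigned back, ta stays empty and the final Set is built from an empty array.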

NOTE: I observed this with rxjs 5.5.12, 6.5.4 and 6.5.5.

You may think that the two pieces of code below give the same results, where s0 and s1 are instances of BehaviorSubject.

Code-1

s1.pipe(
    withLatestFrom(s0)
).subscribe(([v1, v0]) => {
    console.log(v0, v1);
});

Code-2

s1.subscribe(v1 => {
    s0.pipe(take(1)).subscribe(v0 => {
        console.log(v0, v1);
    });
});

In the general case, that's true. But in a very particular situation, they give different results.
Here is test code for it.

import { BehaviorSubject } from 'rxjs';
import { take, withLatestFrom } from 'rxjs/operators';

const s0 = new BehaviorSubject('A');
const s1 = new BehaviorSubject('a');

setTimeout(() => s0.next('B'), 500);

let i = 0;
s0.pipe( /* Part-A */
    withLatestFrom(s1)
).subscribe(([v0, v1]) => {
    console.log('s0-pipe: ', v0, v1);
    s1.next(++i);
});

s1.pipe(
    withLatestFrom(s0)
).subscribe(([v1, v0]) => {
    console.log('s1-pipe: ', v0, v1);
});

s1.subscribe(v1 => {
    s0.pipe(take(1)).subscribe(v0 => {
        console.log('s1-sub: ', v0, v1);
    });
});

This gives following result.

s0-pipe:  A a
s1-pipe:  A 1
s1-sub:  A 1
s0-pipe:  B 1
s1-pipe:  A 2  <=
s1-sub:  B 2

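Two very interesting results are observed. s1-pipe: A a and s1-sub: A a are not shown. And the s0 value at the second s1-pipe is not the latest one!
The first happens because s1.next(++i) in Part-A runs synchronously while Part-A is being subscribed, before the two s1 subscriptions exist. The second happens because s0.next('B') reaches the Part-A subscriber first, and the s1.next() call inside it triggers s1-pipe before s0 has delivered 'B' to the withLatestFrom inside s1-pipe.

To avoid this, I used the workaround below in Part-A of the code.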

s0.pipe(
    withLatestFrom(s1)
).subscribe(([v0, v1]) => {
    console.log('s0-pipe: ', v0, v1);
    setTimeout(() => s1.next(++i));
});

Then, it gives

s0-pipe:  A a
s1-pipe:  A a  <=
s1-sub:  A a  <=
s1-pipe:  A 1
s1-sub:  A 1
s0-pipe:  B 1
s1-pipe:  B 2  <=
s1-sub:  B 2

I think this is by design in rxjs. One thing I learned from the above is that it's very risky to use circular withLatestFrom in rxjs.

Henceforth, term CmpBO is used to mean Comparing build outputs.

Opening

This article is based on experience in a project that I led. The project was successful and the techniques described here worked well in a real situation. Here is a rough description of the environment.

  • Software is built on a Linux host.
  • Most source code is written in C and built with GNU Make.
  • Some of it is external source code (ex. sources from 3rd parties or from open source projects), and most of that uses autotools for building.
  • Various toolchains are used for cross-compiling.
  • Java, Python and more tools take part in the build.

Purpose

Imagine following cases.

  • A developer says that code changes affecting only one executable binary were committed to the software (henceforth SW) repository. Packages are then built from the latest source code in the repository and released to the verification team. In this situation, can the verification team believe what the developer says and skip testing everything except the affected binary?
  • The build system and infrastructure are improved. Then, can we say that these improvements don't affect our SW packages?

If we can say, "This build output is exactly the same as that build output, semantically", we can improve the build system and infrastructure freely and with confidence. And the verification team can save the effort of testing the same software again.

Difficulties

Workspace path for build

Even though the cloud has become popular, lots of build systems still use bare-metal machines to build SW packages, especially for large SW products (ex. file-system images for mobile devices such as Android). The main reason is that building such huge SW uses system resources so heavily that using a container or VM may cost build performance (time). And to separate build jobs safely, that is, to avoid potential issues caused by sharing the same workspace among several build jobs, a build agent usually uses a unique directory path as the build workspace. For example, something like /build/PROJECT-NAME/BUILD-ID/ (BUILD-ID is a unique string identifying this build job). This means that the directory path changes at every build even if the build job uses the same source code.

For CmpBO, this causes problems in combination with the following.

  • The __FILE__ or __BASE_FILE__ macro in C/C++.
  • Debugging information (ex. DWARF sections in ELF).
  • Any code using a file path.

Order of files in a directory

On some filesystems (ex. ext4) the order of files in a directory is not deterministic. For example, the following commands may show different results on different machines.

$ touch a b c d e f
$ ls -U

Because of this, a file created by a linker, zip or tar on another machine may not be the same (in terms of binary), even if the files in the directory are exactly the same. But, semantically, as you know, they are the same.
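To make the effect concrete, here is a minimal Python sketch, not taken from the project. It packs the same three files into an in-memory tar twice, once in each "listing" order, and then again with the names sorted first. The helper name build_tar and the sample file names are hypothetical, and sorting is only an option when you control the archiving step yourself.

```python
import io
import tarfile


def build_tar(files, sort_names):
    """Pack {name: bytes} into an in-memory, uncompressed tar archive."""
    names = sorted(files) if sort_names else list(files)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp, so only the order can differ
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Same files, but "listed" in a different order, as two machines might do.
machine_a = {"a": b"1", "b": b"2", "c": b"3"}
machine_b = {"c": b"3", "a": b"1", "b": b"2"}

print(build_tar(machine_a, False) == build_tar(machine_b, False))  # False
print(build_tar(machine_a, True) == build_tar(machine_b, True))    # True
```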

Time

In some projects, the current time is used at build time. It may be used as a seed for random number generation or as debugging information. Either way, the time is different at every build even if the source code is the same. For example, you may find the __TIME__ or __DATE__ macro in C/C++ files. Then the build results are always different at every build in terms of binary.

Inappropriate command line

Usually, various types of host machines are used as build agents. So, even if everything else in the environment is the same, the host machine may be different. In this case, if someone uses a command line that depends on the host machine, build outputs may differ depending on which host machine is used. For example

$ gcc -march=native ...

is a well-known compile option of this kind.

Unexpected behavior of compiler

Some C/C++ compilers generate different object files even if there are no logical changes.
For instance, when some header files are changed but those changes don't logically affect a C source file, we may expect that the same object file is generated.

But with some compilers (taking gcc as an example), outputs that differ in terms of ELF comparison may be generated if the source files are compiled with optimization enabled (ex. the -O2 option). Even when a compiler generates different object files with -O2, it may create the same object files with -O0, because logically the sources are not changed even though some header files are.

Therefore, even if compile outputs are different in terms of binary comparison, it's very difficult to say that build outputs are changed, because logically they are not.

Workaround

Because of the difficulties above, it's almost impossible to compare outputs as they are. But usually the main purpose of comparing is to tell "which outputs are semantically changed?". That is, we don't need to compare outputs of a releasable build. Let me introduce 4 ways I used to work around these difficulties and generate outputs only for CmpBO (again, this is NOT FOR RELEASE!).

Fix workspace path for build

One popular option is "Let build agents run in a container with a fixed workspace path". But in this project, this was rejected by the infra team. So, instead of running the build agent in a container, we just ran a build container inside the build agent with volume mapping (mapped to a fixed path).

Fix order of files in a directory

In my case, the ext4 filesystem is used for the build partition. As mentioned above, the file order in a directory needs to be fixed among build agents. This is possible by using the same hash seed on ext4. For details, please refer to my previous article here.

Do not use inappropriate command lines for release builds

A command line generating outputs that depend on the host machine is usually not correct for a general release build. So, this kind of command line should be fixed to suit a general release. Even at a quick glance, -march=native is not a proper option for a release build.

Fix time

This is very hacky and dangerous. But time is always the biggest headache in CmpBO. Time always changes and is unpredictable! Furthermore, a system cannot even run correctly without time! So, we need to fix time only for the build. I used a hack based on LD_PRELOAD (for details, please refer to my previous short article here). Here are the steps.

Create a library overriding the C/POSIX functions that get the current time.

Code compiled to libfixtm.so

#include <time.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <errno.h>

// #define D(args...) printf(args)
#define D(args...) do {} while(0)


/*
 * IMPORTANT NOTE
 * ==============
 *
 * CASE: 'LD_PRELOAD=./libfixtm.so java XXX'
 *     tv_sec = 1000000
 *     tv_nsec / tv_usec = 0
 * --------------------------------------------------------
 * Env: openjdk 11.0.3 2019-04-16
 *
 * 'java' or 'javac' process never ended if tv_sec is large enough.
 * But, it's ok if 'tv_sec' is 0, 1 or 2 etc.
 * It's based on experimental results.
 * Detail analysis is NOT performed yet.
 *
 * >>> 'tv_sec' should be small enough.
 *
 *
 * CASE: 'LD_PRELOAD=./libfixtm.so python XXX' (Python 2.x)
 *     tv_sec = 0
 *     tv_nsec / tv_usec = 0
 * --------------------------------------------------------
 * Env: Python 2.7.15+
 *
 * time.time() fails at python 2.x with following error.
 *     Traceback (most recent call last):
 *      File "main.py", line 5, in <module>
 *        ts = time.time()
 *     IOError: [Errno 0] Error
 *
 * >>> 'tv_sec' > 0 for python 2.x.
 */

#define MAGIC_TIME_SEC 1

time_t
time(time_t *t) {
        D("*** time\n");
        errno = 0;
        if (t) *t = MAGIC_TIME_SEC;
        return MAGIC_TIME_SEC;
}


int
timespec_get(struct timespec *ts, int base) {
        D("*** timespec_get\n");
        errno = 0;
        if (!ts) {
                errno = EFAULT;
                return 0;
        }
        ts->tv_sec = MAGIC_TIME_SEC;
        ts->tv_nsec = 0;
        return base;
}


int
clock_gettime(clockid_t clk_id, struct timespec *ts) {
        D("*** clock_gettime\n");
        errno = 0;
        if (!ts) {
                errno = EFAULT;
                return -1;
        }
        ts->tv_sec = MAGIC_TIME_SEC;
        ts->tv_nsec = 0;
        return 0;
}


int
gettimeofday(struct timeval *tv, struct timezone *tz) {
        D("*** gettimeofday\n");
        errno = 0;
        if (!tv) return 0;

        tv->tv_sec = MAGIC_TIME_SEC;
        tv->tv_usec = 0;
        return 0;
}

Verification of time-overriding

With the libfixtm.so generated above, I observed that time is fixed in most of the major cases in my build project. Please refer to the details below.

C

File: main.c

#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <string.h>

int
main() {
        struct timespec ts;
        struct timeval tv;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        if ((1 != ts.tv_sec) || ts.tv_nsec)
                return 1;

        gettimeofday(&tv, NULL);
        if ((1 != tv.tv_sec) || tv.tv_usec)
                return 1;
        return 0;
}

Console:

$ gcc main.c
$ LD_PRELOAD=$(pwd)/libfixtm.so ./a.out
$ echo $?
0

Summary: the functions clock_gettime and gettimeofday are successfully overridden by libfixtm.so.

GCC

File: a.c

#include <stdio.h>
int main(void) {
        printf("%s\n", __TIME__);
        return 0;
}

Console:

$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ LD_PRELOAD=$(pwd)/libfixtm.so gcc -o a0 a.c
$ sleep 2
$ LD_PRELOAD=$(pwd)/libfixtm.so gcc -o a1 a.c
$ cmp a0 a1
$ echo $?
0

Summary: the values of time-related macros are fixed with gcc.

Java

File: Main.java

import java.util.Date;

public class Main {
        public static void main(String[] args) {
                if (1000 != System.currentTimeMillis())
                        System.exit(1);
                if (1000 != (new Date()).getTime())
                        System.exit(1);
        }
}

Console:

$ javac --version
javac 11.0.3
$ java --version
openjdk 11.0.3 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)
$ javac Main.java
$ LD_PRELOAD=$(pwd)/libfixtm.so java Main
$ echo $?
0

Python

File: main.py

import time
import datetime
import calendar

ts = time.time()
assert ts == 1.0
dt = datetime.datetime.now()
if hasattr(dt, 'timestamp'):
    ts = dt.timestamp()
    assert ts == 1.0
ts = calendar.timegm(time.gmtime())
assert ts == 1

Console:

$ python --version
Python 2.7.15+
$ python3 --version
Python 3.6.8
$ LD_PRELOAD=$(pwd)/libfixtm.so python main.py
$ echo $?
0
$ LD_PRELOAD=$(pwd)/libfixtm.so python3 main.py
$ echo $?
0

Applying to build

Some tools may never finish when time is frozen (ex. tools that wait for a timeout). A workaround is to override such a tool with a proxy by changing PATH during the build for CmpBO. This is a sample bash script of a proxy executable for ping.

#!/bin/bash
# Drop the preload so that the real ping sees the real time.
unset LD_PRELOAD
exec /bin/ping "$@"

Change known C/C++ macros to others having a fixed value

Some toolchains used for cross-compiling are NOT dynamic executables. That means the hack above doesn't work for them. So, before the build, all __DATE__ and __TIME__ macros in C/C++ source code are changed to __VERSION__. This is possible based on the assumption that __DATE__ and __TIME__ are used only for debugging or information, that is, they don't have any semantic meaning in the software.
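The replacement described above could be done with a small script. This is a minimal sketch, not the script used in the project; the function name neutralize_time_macros is hypothetical. It is a plain text substitution, so it also rewrites occurrences inside comments and string literals, which is acceptable only for a CmpBO build.

```python
import re

# Match the two macros only as whole identifiers.
_TIME_MACROS = re.compile(r"\b(?:__DATE__|__TIME__)\b")


def neutralize_time_macros(source):
    """Replace __DATE__ and __TIME__ with __VERSION__ in C/C++ source text."""
    return _TIME_MACROS.sub("__VERSION__", source)


sample = 'printf("%s %s\\n", __DATE__, __TIME__);'
print(neutralize_time_macros(sample))
```

Applying it to a source tree would mean reading each C/C++ file, calling the function, and writing the result back in the CmpBO workspace only.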

Check compiler's characteristics

Your compiler may not be good for comparing build outputs if, as mentioned above, it generates different outputs for the same logic. In this case, you may have to give up comparing or disable optimization; most compilers generate the same outputs if the code is not logically changed and optimization is disabled.

Comparing

Even if every part of the environment looks the same, it's very rare that two build outputs are the same in terms of binary. For example, tar stores the last modification time of each file in its header (see Wikipedia). And, in the case of JSON, the following two are semantically the same but different in binary.

{"a": 1,"b": 2}
{"b": 2,"a": 1}

So, for CmpBO, tools for semantic comparison depending on the file type are needed. For example, eu-elfcmp may be used to compare ELF files.
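For the JSON case above, a semantic comparison can be sketched with Python's standard json module. This is only an illustration of the idea, not the tool used in the project; the helper names are hypothetical.

```python
import json


def canonical_json(text):
    """Re-serialize JSON text with sorted keys and fixed separators."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def json_semantically_equal(left, right):
    """True if both texts decode to the same JSON value."""
    return json.loads(left) == json.loads(right)


left = '{"a": 1,"b": 2}'
right = '{"b": 2,"a": 1}'
print(left == right)                         # False: different bytes
print(json_semantically_equal(left, right))  # True: same content
print(canonical_json(right))                 # {"a":1,"b":2}
```

Note that the order of array elements still matters in this comparison; only the order of object keys is ignored.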

Comments

If you are lucky enough, you can do this with very little effort and without anything above, as follows.

  • Check out the base source code files.
  • Build them and check the timestamps of the outputs.
  • Check out the changes committed after the base.
  • Build them again incrementally.
  • Find the outputs whose timestamps are changed.

But this works only with a well-structured and well-defined build system. And the cooperation of the teams in charge of the modules taking part in the build is essential.
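The timestamp check in the steps above can be sketched as follows. This is a minimal sketch under the assumption that all build outputs live under one directory; the function names and the directory name in the usage note are hypothetical.

```python
import os


def snapshot_mtimes(root):
    """Return {relative path: modification time in ns} for all files under root."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            result[os.path.relpath(path, root)] = os.stat(path).st_mtime_ns
    return result


def changed_outputs(before, after):
    """List outputs that are new or whose timestamp differs from the snapshot."""
    return sorted(path for path, mtime in after.items() if before.get(path) != mtime)
```

Usage would be: take snapshot_mtimes("out") after the base build, run the incremental build, take a second snapshot, and pass both to changed_outputs().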

Consequence

In my project, I successfully built a system that can tell which binaries are semantically changed between two different versions of the source code. I hope these hacks and workarounds are helpful for you. Enjoy!
