In early 2022, I used GLib (2.66.x) for about three months on a project running on OpenEmbedded Linux. Based on that short experience, I would like to review it. I mostly used GIO and GDBus, so this review focuses on those features.

Pros

  • GLib seems to be very popular. However, I am not sure whether GIO and GDBus are as popular as GLib itself.
  • It has a large set of features.
  • It is easy to use from both C and C++.
  • It supports multiple platforms (Linux, Windows and so on).

Cons

  • Lack of documentation. It was not easy for me to find useful documents and examples. Again, this is just my opinion. I think it may be caused by frequent API changes.
  • Some APIs do not report errors to the caller. That is, the caller has no way to detect an error that happened inside the API.
    • For instance, in my experience g_thread_pool_push() printed some error messages to stderr without reporting the failure to the caller. I think this is because many GLib APIs provide very high-level functionality and use various techniques to improve performance and resource usage, such as memory pools and asynchronous operations. However, it is a serious problem when the caller has no programmatic way to detect an API error. I think it may be a critical defect when developers consider using GLib in a project that requires high stability, such as a system daemon.
  • The behavior of Linux file descriptors does not seem intuitive to me. One reason may be that GLib provides a unified API across platforms. For instance, when passing a file descriptor via GDBus, even if the receiver receives only one file descriptor, sometimes two new open file descriptors are created.
    • When using org.freedesktop.login1.Manager.Inhibit, even if only one file descriptor is passed via GDBus, two new file descriptors appear under /proc/<pid>/fd. So, even if the FD (file descriptor) passed via GDBus is closed, the inhibitor cannot be released because of the other open FD.
    • The same scenario works as expected if sd-bus is used instead of GDBus.

Summary

Once you get over the lack of documentation and get used to it, GLib looks like a very powerful and useful library. However, as mentioned above, because some APIs do not report errors, its usability may be limited in domains that demand very high stability.


Henceforth, the term CmpBO is used to mean comparing build outputs.

Opening

This article is based on experience from a project that I led. The project was successful, and the techniques described here worked well in a real situation. Here is a rough description of the environment.

  • Software is built on a Linux host.
  • Most source code is written in C and built with GNU Make.
  • Some of it is external source code (ex. sources from 3rd parties or from open-source projects), and most of that uses autotools for building.
  • Various toolchains are used for cross-compiling.
  • Java, Python and other tools take part in the build.

Purpose

Imagine the following cases.

  • A developer says that code changes affecting only one executable binary have been committed to the software (henceforth SW) repository. Packages are built from the latest source code in the repository and released to the verification team. In this situation, can the verification team believe what the developer says and skip testing everything except the affected binary?
  • The build system and infrastructure are improved. Then, can we say that these improvements don't affect our SW packages?

If we can say, "This build output is semantically exactly the same as that build output", we can improve the build system and infra freely and with confidence. And the verification team can save the effort of testing the same software again.

Difficulties

Workspace path for build

Even though the cloud has become popular, many build systems still use bare-metal machines to build SW packages, especially for large SW products (ex. file-system images for mobile devices such as Android). The main reason is that building such huge SW uses system resources so heavily that a container or VM may cost build performance (time). And to isolate build jobs safely, that is, to avoid potential issues caused by several build jobs sharing the same workspace, a build agent usually uses a unique directory path as the build workspace. For example, something like /build/PROJECT-NAME/BUILD-ID/ (BUILD-ID is a unique string identifying the build job). This means that the directory path changes at every build even if the build job uses the same source code.

In CmpBO, this causes problems in combination with the following.

  • The __FILE__ or __BASE_FILE__ macro in C/C++.
  • Debugging information (ex. DWARF sections in ELF).
  • Any code that uses file paths.
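
One rough way to see which outputs are affected is to search them for the workspace path. The following is only a sketch: the function name find_workspace_leaks and the sample file names are made up for illustration, and a plain byte search will miss paths stored in compressed sections.

```python
import tempfile
from pathlib import Path


def find_workspace_leaks(output_dir, workspace):
    """Return build outputs under output_dir whose bytes contain the workspace path."""
    needle = workspace.encode()
    leaks = []
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file() and needle in path.read_bytes():
            leaks.append(str(path.relative_to(output_dir)))
    return leaks


# Tiny demonstration with fake "binaries" (hypothetical names and contents).
with tempfile.TemporaryDirectory() as out:
    Path(out, "clean.bin").write_bytes(b"\x7fELF no paths here")
    Path(out, "leaky.bin").write_bytes(b"\x7fELF /build/PROJECT-NAME/1234/src/main.c")
    leaks = find_workspace_leaks(out, "/build/PROJECT-NAME/1234")

print(leaks)
```

Outputs reported here will differ between two builds that use different workspace paths, even when the source code is identical.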

Order of files in a directory

On some filesystems (ex. ext4), the order of files in a directory is not deterministic. For example, the following commands may show different results on different machines.

$ touch a b c d e f
$ ls -U

Because of this, a file created by link, zip or tar on another machine may not be the same in terms of binary, even if the files in the directory are exactly the same. But semantically, as you know, they are the same.
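
The effect can be reproduced in a few lines. This is a minimal sketch using Python's tarfile module; the helper make_tar and the two "listings" are made up to stand in for what `ls -U` might return on two hosts.

```python
import io
import tarfile


def make_tar(files, order):
    """Pack files (name -> bytes) into a tar, adding members in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in order:
            info = tarfile.TarInfo(name)  # mtime, uid and gid default to 0
            info.size = len(files[name])
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


files = {"a": b"1", "b": b"2", "c": b"3"}
listing_on_host_1 = ["a", "b", "c"]  # pretend directory order on one machine
listing_on_host_2 = ["c", "a", "b"]  # pretend directory order on another

# Same files, different order: the archives differ byte for byte.
differs = make_tar(files, listing_on_host_1) != make_tar(files, listing_on_host_2)

# Sorting the names before packing removes the difference.
same_when_sorted = (
    make_tar(files, sorted(listing_on_host_1))
    == make_tar(files, sorted(listing_on_host_2))
)
print(differs, same_when_sorted)
```

Sorting inside the packaging step is an alternative when you control that step; it does not help with tools that read the directory themselves.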

Time

In some projects, the current time is used during the build. It may be used as a seed for generating random numbers or as debugging information. Either way, the time is different at every build even if the source code is the same. For example, you may find the __TIME__ or __DATE__ macro in C/C++ files. Then the build results always differ at every build in terms of binary.

Inappropriate command line

Usually, various types of host machines are used as build agents. So, even if everything else in the environment is the same, the host machine may be different. In this case, if someone uses a command line that depends on the host machine, build outputs may differ depending on which host machine is used. For example

$ gcc -march=native ...

is a well-known compile option of this kind.

Unexpected behavior of compiler

Some C/C++ compilers generate different object files even if there are no logical changes.
For instance, when some header files are changed but those changes do not logically affect a C source file, we may expect the same object file to be generated.

But some compilers (take gcc as an example) may generate outputs that differ in terms of ELF comparison if the source files are compiled with optimization enabled (ex. the -O2 option). Even so, the same compiler may create identical object files with the -O0 option, because logically the sources are not changed even if some header files are.

Therefore, even if compile outputs differ in terms of binary comparison, it is very difficult to say that the build outputs have changed, because logically they have not.

Workaround

Because of the difficulties above, it is almost impossible to compare outputs as they are. But usually the main purpose of comparing is to tell "which outputs are semantically changed?". That is, we don't need to compare the outputs of a releasable build. Let me introduce the 4 ways I used to work around these difficulties and generate outputs only for CmpBO (again, this is NOT FOR RELEASE!).

Fix workspace path for build

One popular option is "Let build agents run in a container with a fixed workspace path". But in this project, this was rejected by the infra team. So, instead of running the build agent in a container, we ran a build container inside the build agent with volume mapping (mapped to a fixed path).

Fix order of files in a directory

In my case, an ext4 filesystem is used for the build partition. As mentioned above, the file order in a directory needs to be the same across build agents. This is possible by using the same hash seed on ext4. For details, please refer to my previous article here

Do not use inappropriate command line for release build.

A command line that generates outputs dependent on the host machine is usually not correct for a general release build anyway. So, this kind of command line should be fixed to be suitable for general release. Even at a quick glance, -march=native is not a proper option for a release build.
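
Such options can be looked for mechanically in the build log. The sketch below assumes you can obtain each compiler invocation as a string; the flag list is only an illustration covering the "native" variants of GCC-style options and is certainly not complete.

```python
import shlex

# Options whose effect depends on the CPU of the machine running the build.
HOST_DEPENDENT_FLAGS = ("-march=native", "-mtune=native", "-mcpu=native")


def host_dependent_flags(command_line):
    """Return the flags in a compiler command line that depend on the build host."""
    return [arg for arg in shlex.split(command_line) if arg in HOST_DEPENDENT_FLAGS]


found = host_dependent_flags("gcc -O2 -march=native -c foo.c -o foo.o")
clean = host_dependent_flags("gcc -O2 -march=x86-64 -c foo.c -o foo.o")
print(found, clean)
```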

Fix time

This is very hacky and dangerous. But time is always the biggest headache in CmpBO. Time always changes and is unpredictable! Furthermore, the system cannot even run correctly without time! So, we need to fix time only for the build. I used a hack based on LD_PRELOAD (for details, please refer to my previous short article here). Here are the steps.

Create a library that overrides the POSIX functions for getting the time.

Code compiled into libfixtm.so

#include <time.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <errno.h>

// #define D(args...) printf(args)
#define D(args...) do {} while(0)


/*
 * IMPORTANT NOTE
 * ==============
 *
 * CASE: 'LD_PRELOAD=./libfixtm.so java XXX'
 *     tv_sec = 1000000
 *     tv_nsec / tv_usec = 0
 * --------------------------------------------------------
 * Env: openjdk 11.0.3 2019-04-16
 *
 * 'java' or 'javac' process never ended if tv_sec is large enough.
 * But, it's ok if 'tv_sec' is 0, 1 or 2 etc.
 * It's based on experimental results.
 * Detail analysis is NOT performed yet.
 *
 * >>> 'tv_sec' should be small enough.
 *
 *
 * CASE: 'LD_PRELOAD=./libfixtm.so python XXX' (Python 2.x)
 *     tv_sec = 0
 *     tv_nsec / tv_usec = 0
 * --------------------------------------------------------
 * Env: Python 2.7.15+
 *
 * time.time() fails at python 2.x with following error.
 *     Traceback (most recent call last):
 *      File "main.py", line 5, in <module>
 *        ts = time.time()
 *     IOError: [Errno 0] Error
 *
 * >>> 'tv_sec' > 0 for python 2.x.
 */

#define MAGIC_TIME_SEC 1

time_t
time(time_t *t) {
        D("*** time\n");
        errno = 0;
        if (t) *t = MAGIC_TIME_SEC;
        return MAGIC_TIME_SEC;
}


int
timespec_get(struct timespec *ts, int base) {
        D("*** timespec_get\n");
        errno = 0;
        if (!ts) {
                errno = EFAULT;
                return 0;
        }
        ts->tv_sec = MAGIC_TIME_SEC;
        ts->tv_nsec = 0;
        return base;
}


int
clock_gettime(clockid_t clk_id, struct timespec *ts) {
        D("*** clock_gettime\n");
        errno = 0;
        if (!ts) {
                errno = EFAULT;
                return -1;
        }
        ts->tv_sec = MAGIC_TIME_SEC;
        ts->tv_nsec = 0;
        return 0;
}


int
gettimeofday(struct timeval *tv, struct timezone *tz) {
        D("*** gettimeofday\n");
        errno = 0;
        if (!tv) return 0;

        tv->tv_sec = MAGIC_TIME_SEC;
        tv->tv_usec = 0;
        return 0;
}

Verification of time-overriding

With the libfixtm.so generated above, I observed that time is fixed in most of the major cases in my build project. Please refer to the details below.

C

File: main.c

#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <string.h>

int
main() {
        struct timespec ts;
        struct timeval tv;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        if ((1 != ts.tv_sec) || ts.tv_nsec)
                return 1;

        gettimeofday(&tv, NULL);
        if ((1 != tv.tv_sec) || tv.tv_usec)
                return 1;
        return 0;
}

Console:

$ gcc main.c
$ LD_PRELOAD=$(pwd)/libfixtm.so ./a.out
$ echo $?
0

Summary: the functions clock_gettime and gettimeofday are successfully overridden by libfixtm.so.

GCC

File: a.c

#include <stdio.h>
int main(void) {
        printf("%s\n", __TIME__);
        return 0;
}

Console:

$ gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ LD_PRELOAD=$(pwd)/libfixtm.so gcc -o a0 a.c
$ sleep 2
$ LD_PRELOAD=$(pwd)/libfixtm.so gcc -o a1 a.c
$ cmp a0 a1
$ echo $?
0

Summary: the values of time-related macros are fixed in gcc.

Java

File: Main.java

import java.util.Date;

public class Main {
        public static void main(String[] args) {
                if (1000 != System.currentTimeMillis())
                        System.exit(1);
                if (1000 != (new Date()).getTime())
                        System.exit(1);
        }
}

Console:

$ javac --version
javac 11.0.3
$ java --version
openjdk 11.0.3 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)
$ javac Main.java
$ LD_PRELOAD=$(pwd)/libfixtm.so java Main
$ echo $?
0

Python

File: main.py

import time
import datetime
import calendar

ts = time.time()
assert ts == 1.0
dt = datetime.datetime.now()
if hasattr(dt, 'timestamp'):
    ts = dt.timestamp()
    assert ts == 1.0
ts = calendar.timegm(time.gmtime())
assert ts == 1

Console:

$ python --version
Python 2.7.15+
$ python3 --version
Python 3.6.8
$ LD_PRELOAD=$(pwd)/libfixtm.so python main.py
$ echo $?
0
$ LD_PRELOAD=$(pwd)/libfixtm.so python3 main.py
$ echo $?
0

Applying to build

Some tools may never end (ex. when they use a timeout). The workaround is to override such a tool with a proxy by changing PATH during the build for CmpBO. This is a sample bash script of a proxy executable for ping.

#!/bin/bash
# Run the real ping without the time-fixing library preloaded.
unset LD_PRELOAD
exec /bin/ping "$@"

Change known C/C++ macros to others with fixed values.

Some toolchains used for cross-compiling are NOT dynamic executables. That means the hack above doesn't work for them. So, before the build, all __DATE__ and __TIME__ macros in C/C++ source code are changed to __VERSION__. This is possible under the assumption that __DATE__ and __TIME__ are used only for debugging or information, that is, they don't have any semantic meaning in the software.
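
The replacement itself can be a simple text substitution. The sketch below is an assumption about how one might do it, not the script used in the project; neutralize_time_macros is a made-up name, and the regular expression also rewrites occurrences inside comments and string literals, which is usually harmless for a CmpBO-only build.

```python
import re

# Matches exactly __DATE__ or __TIME__, but not e.g. __TIMESTAMP__.
TIME_MACROS = re.compile(r"\b__(?:DATE|TIME)__\b")


def neutralize_time_macros(source):
    """Replace __DATE__ / __TIME__ with __VERSION__ in C/C++ source text."""
    return TIME_MACROS.sub("__VERSION__", source)


before = 'printf("built %s %s\\n", __DATE__, __TIME__);'
after = neutralize_time_macros(before)
print(after)
```

__VERSION__ works as a stand-in with GCC-compatible compilers because, like the two time macros, it expands to a string literal.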

Check compiler's characteristics

Your compiler may not be well suited to comparing build outputs if it generates different outputs for the same logic, as mentioned above. In this case, you may have to give up comparing or disable optimization. Most compilers generate the same output when nothing is logically changed and optimization is disabled.

Comparing

Even if every part of the environment looks the same, it is very rare for two build outputs to be identical in terms of binary. For example, tar stores the last modification time of each file in its header (see Wikipedia). And in the case of JSON, the following two are semantically the same but different in binary.

{"a": 1,"b": 2}
{"b": 2,"a": 1}

So, for CmpBO, tools for semantic comparison are needed for each file type. For example, eu-elfcmp may be used to compare ELF files.
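
For the JSON and tar cases above, a semantic comparison can be sketched with the Python standard library. The function names (same_json, same_tar, tar_contents, build_tar) are made up for this example, and which tar fields count as "semantic" is a choice; here only member name, type, mode and content are compared.

```python
import io
import json
import tarfile


def same_json(a, b):
    """Compare two JSON documents by value, ignoring key order and whitespace."""
    return json.loads(a) == json.loads(b)


def tar_contents(data):
    """Map member name -> (type, mode, content), ignoring member order and mtime."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(data)) as tar:
        for member in tar.getmembers():
            content = tar.extractfile(member).read() if member.isfile() else None
            result[member.name] = (member.type, member.mode, content)
    return result


def same_tar(a, b):
    return tar_contents(a) == tar_contents(b)


def build_tar(names, mtime):
    """Helper for the demonstration: each file's content is its own name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


json_equal = same_json(b'{"a": 1,"b": 2}', b'{"b": 2,"a": 1}')
tar_a = build_tar(["x", "y"], mtime=1)
tar_b = build_tar(["y", "x"], mtime=2)
bytes_equal = tar_a == tar_b
tar_equal = same_tar(tar_a, tar_b)
print(json_equal, bytes_equal, tar_equal)
```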

Comments

If you are lucky enough, you can do this with very little effort and without anything above, as follows.

  • Check out the base source code.
  • Build it and record the timestamps of the outputs.
  • Check out the changes committed after the base.
  • Build again incrementally.
  • Find the outputs whose timestamps changed.

But this works only with a well-structured and well-defined build system. And cooperation from the teams in charge of the modules participating in the build is essential.
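
The timestamp steps can be sketched as follows. The names snapshot and changed_outputs are made up, and the "incremental build" is simulated by touching one file, so treat this as an outline rather than a finished tool.

```python
import os
import tempfile
from pathlib import Path


def snapshot(output_dir):
    """Map each file under output_dir to its modification time in nanoseconds."""
    return {
        str(p.relative_to(output_dir)): p.stat().st_mtime_ns
        for p in Path(output_dir).rglob("*")
        if p.is_file()
    }


def changed_outputs(before, after):
    """Outputs that are new or whose timestamp differs from the base build."""
    return sorted(name for name, mtime in after.items() if before.get(name) != mtime)


with tempfile.TemporaryDirectory() as out:
    Path(out, "libfoo.so").write_bytes(b"foo")
    Path(out, "app").write_bytes(b"app")
    base = snapshot(out)  # taken after building the base revision

    # Stand-in for an incremental build that relinks only 'app'.
    later = base["app"] + 10**9
    os.utime(Path(out, "app"), ns=(later, later))

    changed = changed_outputs(base, snapshot(out))

print(changed)
```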

Consequence

In my project, I successfully built a system that can tell which binaries are semantically changed between two different versions of the source code. I hope these hacks and workarounds are helpful for you. Enjoy it!

An extremely basic concept! But sometimes the importance of this concept is ignored for several reasons.

  • Ease of use with libraries and other functions.
  • Ease of use because of language characteristics.
  • To save typing effort.
  • ...

But, to increase the readability of code, it is very important to keep to the fundamental concepts and follow the use cases of each data structure.

  • Array: Order matters. And duplication is allowed.
  • Set: Order does NOT matter. And duplication is NOT allowed.
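
A small illustration of the difference, using Python's list as the array; the variable names are only examples.

```python
# Array (list): order matters and duplicates are allowed.
build_steps = ["configure", "compile", "compile", "link"]

# Set: order does not matter and duplicates are not allowed.
supported_archs = {"arm", "x86_64", "arm"}

lists_differ_by_order = ["a", "b"] != ["b", "a"]
sets_ignore_order = {"a", "b"} == {"b", "a"}

print(len(build_steps), len(supported_archs))
print(lists_differ_by_order, sets_ignore_order)
```

Choosing the type that matches the intent tells the reader whether order and duplicates carry meaning, without having to read the surrounding code.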
[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

My personal definition of the software development process. (There are no grounds for this other than my experience.)

1. Rough analysis
  - Analyzing requirements.
  - Analyzing features and functionality that are required.
  - Analyzing required schedule.

2. Detail analysis
  - Defining the way to measure the project's completeness. (When can we say "We completed this project!"?)
  - Defining risks.
  - Estimating costs and time.
  - Deciding whether to go or not.

3. Planning
  - Making a reasonable schedule.
  - Confirming the resource plan.
  - Confirming the way to measure the project's progress, including milestones. (ex. if XXX is YYY, then the project is oo% completed.)
  - Planning the schedule, solutions, alternatives, etc. for risk management.

4. Development
  - Determining the development environment. (ex. language, CM tool, etc.)
  - Designing SW.
  - Making test plan and cases.
  - Implementation & debugging.

5. Maintenance

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

A team member (henceforth member) needs to report the current state (what he/she is doing, progress, etc.) frequently to the team leader (henceforth leader); the leader always wants to know what members are doing.
Conversely, the leader needs to tell members what he/she wants to be done.

Usually, the leader sets the vision, goal and direction; members make those come true. Members cannot decide what they should do if they have no idea what the leader wants. The same is true for the leader: the leader cannot make the right decisions about future goals and vision without knowing the current state of what members are doing.

So, "Sharing current state and goal" is very important to a team.

In general, members know the practical state and issues better than the leader. And the leader understands the external situation better and studies the vision and future goals harder than members.

Now, it's time to share it!!

(Is it too trivial to discuss? Never!)

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

There are lots of books and guides about project management. But in my opinion, the most important and efficient way is "Divide and Conquer". In general, a small project is easy to succeed at regardless of methodology. So, the most important thing for success is "Dividing a large project into several small independent projects". We should focus and spend time on this!!!

( Yes.. I know..."Easier said than done!" :-( )

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

Requirements are one of the most common causes of problems in software development. A change of requirements always affects overall software development severely. I want to skip the general points about requirements because there is already a lot of material on them. Let's just focus on requirements in handset software development, especially the User Interface Specification (henceforth UIS).

In general, there is a department in charge of the UIS. The department composes the UIS. Then, this UIS is delivered to the development team (which designs the big picture of the software based on it), to the designers (who make images, screen layouts, etc. based on it), and to the test team (which makes test and verification cases based on it).
As you see, the point is that the UIS's impact on software development is extremely critical. And usually, more than one person writes the UIS. The same is true for design, software development and test. Let's imagine that the UIS is not clear. What will happen? In this case, with development based on an unclear UIS, a huge amount of communication overhead is required between the UIS, design, development and test teams. This cost is extremely large, larger than expected. (I say this based on my experience. Believe me.)

Even though the clarity of the UIS (we can also call it 'completeness') is very important, in many cases it is underestimated. Instead of clarity, originality or creativity is usually what is demanded of the UIS. I'm not saying that originality and creativity are not important. My point is that clarity or completeness is as important as originality or creativity.

The UIS is also a kind of "Specification". So, it should be clear. That's my point.

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

It is already well known that quality-oriented development is better than time-oriented development in terms of both quality and time.

Time Oriented Development
Developing software on a fixed schedule. For example, "we should complete this project by the 10th of March".
In this case, functionality usually has priority in order to show progress, and quality is put behind. So, software is not tested enough during development. Usually, the real stage of testing and debugging begins in the latter part of the project. This is the main reason why the time for testing and debugging becomes longer than expected.

Quality Oriented Development
Development focusing on software quality. Software is continuously tested during development to keep its quality good. So, software is debugged at an early stage, and the cost of debugging decreases. (It is well known that debugging at an early stage is much cheaper than debugging at a later stage.) This shortens time and increases quality.

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

1. Sharing the team roster and contact points.
(Members need to know the mapping between person and job.)

2. Sharing the development environment. (If possible, harmonize environments.)
(Using the same development environment can reduce communication overhead dramatically.)

[[ The exact posting date was lost while moving the blog. The year and quarter are probably about right. ]]

[Note: I'm not talking about costs. We can develop something at very low cost over a long time (ex. 1 person for 5 years).]

Predicting the future is always very difficult. The further away it is, the more difficult it gets!
We should predict the market situation (the future) at the time the product is released. So, development time determines how far into the future we have to predict. "1-year development" means "We should predict the future 1 year from now". But "3-month development" means just "predicting the future 3 months from now".

Can you understand how important this is?!!!