Volume 4 - Issue 4
Open Source Software Detection using Function-level Static Software Birthmark
- Dongjin Kim
Dankook University, Yongin 448-701, Korea
kdjorang@dankook.ac.kr
- Seong-je Cho
Dankook University, Yongin 448-701, Korea
sjcho@dankook.ac.kr
- Sangchul Han
Konkuk University, Chungbuk 380-701, Korea
schan@kku.ac.kr
- Minkyu Park
Konkuk University, Chungbuk 380-701, Korea
minkyup@kku.ac.kr
- Ilsun You
Korean Bible University, Seoul 138-791, Korea
isyou@bible.ac.kr
Keywords: Open-source software, Static analysis, Software birthmark, Sequence alignment
Abstract
As open-source software (OSS) is widely used, many IT organizations adopt OSS without complying with
the terms of open-source license agreements. To reduce the risks related to open-source licenses,
these organizations should meet the requirements of OSS licenses. Because some OSS components
may be delivered by major upstream suppliers in binary form, it is very hard to verify whether a binary
program contains unlicensed OSS components. In this paper, we propose a novel technique
for determining whether a binary includes certain OSS components without respecting the OSS licensing
terms. Our technique employs a function-level static software birthmark to detect code clones
in binaries. In our technique, the birthmark is a sequence of the size information of arguments and
local variables of functions inside a binary, and the similarity between birthmarks is computed using
semi-global sequence alignment or the k-gram method. We evaluate the effectiveness of the proposed
technique by performing experiments with some binaries and OSS components.
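
The two similarity measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each function contributes one birthmark element, a hypothetical (argument bytes, local-variable bytes) pair, and that the scoring values (match = 1, mismatch = -1, gap = -1), the normalization, and the use of Jaccard overlap for k-grams are illustrative choices that the abstract does not specify.

```python
from typing import Hashable, Sequence, Set, Tuple


def semi_global_score(query: Sequence[Hashable], target: Sequence[Hashable],
                      match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best score for aligning ALL of `query` to any stretch of `target`.

    Leading and trailing elements of `target` are skipped at no cost,
    which is what makes the alignment semi-global. Scoring values are
    assumed defaults, not taken from the paper.
    """
    # Row 0 is all zeros: skipping a prefix of `target` is free.
    prev = [0] * (len(target) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: the first i query elements are aligned to gaps.
        curr = [i * gap] + [0] * len(target)
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    # Taking the maximum over the last row makes a target suffix free too.
    return max(prev)


def semi_global_similarity(query: Sequence[Hashable],
                           target: Sequence[Hashable]) -> float:
    """Score normalized to [0, 1] by the best possible score (assumed scheme)."""
    if not query:
        return 0.0
    return max(0, semi_global_score(query, target)) / len(query)


def kgrams(seq: Sequence[Hashable], k: int) -> Set[Tuple[Hashable, ...]]:
    """Set of all contiguous windows of length k in `seq`."""
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def kgram_similarity(a: Sequence[Hashable], b: Sequence[Hashable],
                     k: int = 2) -> float:
    """Jaccard overlap of the two k-gram sets (an assumed similarity measure)."""
    ga, gb = kgrams(a, k), kgrams(b, k)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


# Hypothetical birthmarks: one (argument bytes, local-variable bytes) pair per function.
component = [(8, 16), (4, 32), (12, 0), (8, 64)]
binary = [(0, 8), (4, 4)] + component + [(16, 16), (0, 0)]

print(semi_global_similarity(component, binary))
print(kgram_similarity(component, binary, k=2))
```

In this sketch the component's birthmark is embedded unchanged in the binary's birthmark, so the semi-global similarity reaches its maximum even though the binary contains additional functions before and after it. The k-gram measure is lower because it also counts the binary's unrelated windows.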