Finding (Risky) Signals in the Open Software Noise

Recently the Linux Foundation teamed up with DHS to create the Census Project analyze those open source software projects that should be considered risky and to define what risk might be. Risky might be: small community, slow update, no website, IRC, no listed people, etc. People need more Code Intelligence (CodeINT) on the source code they use and ways of classifying things.

Check it out:  The Census represents CII’s current view of the open source ecosystem and which projects are at risk. The Heartbleed vulnerability in OpenSSL highlighted that while some open source software (OSS) is widely used and depended on, vulnerabilities can have serious ramifications, and yet some projects have not received the level of security analysis appropriate to their importance. Some OSS projects have many participants, perform in-depth security analyses, and produce software that is widely considered to have high quality and strong security. However, other OSS projects have small teams that have limited time to do the tasks necessary for strong security. The trick is to identify quickly which critical projects fall into the second bucket.

The Census Project focuses on automatically gathering metrics, especially those that suggest less active projects (such as a low contributor count). We also provided a human estimate of the program’s exposure to attack, and developed a scoring system to heuristically combine these metrics. These heuristics identified especially plausible candidates for further consideration. For the initial set of projects to examine, we took the set of packages installed by Debian base and added a set of packages that were identified as potentially concerning. A natural outcome of the census will be a list of projects to consider funding. The decision to fund a project in need is not automated by any means.