Skip to content

C++: Improve cpp/weak-cryptographic-algorithm#5896

Merged
MathiasVP merged 26 commits into
github:mainfrom
geoffw0:weak_crypto
May 19, 2021
Merged

C++: Improve cpp/weak-cryptographic-algorithm#5896
MathiasVP merged 26 commits into
github:mainfrom
geoffw0:weak_crypto

Conversation

@geoffw0

@geoffw0 geoffw0 commented May 13, 2021

Copy link
Copy Markdown
Contributor

Many improvements aimed at increasing the precision of this query:

  • added test cases, including some inspired by real world TP and FP results from LGTM
  • made many changes to fix FPs
  • made two extensions that may cause new results to be found (support for enum constants and class member functions)

Diff run on LGTM:
https://lgtm.com/query/7335937195043825500/

  • FFmpeg; it appears to contain a DES implementation that's used for reading "Sony OpenMG Music files". If this can only be used to read encrypted files the potential security impact is minimal (the danger posed by weak encryption is to Sony rather than a user of FFmpeg), but I think these results are valid.
    • I don't much like that we report what is essentially (I think) one implementation + use of DES in 25 places (previously 31) across 3 files. I'd prefer if we only reported one or a few of the most appropriate places so we don't crowd out other results.
  • php/php-src; this contains a DES implementation that appears to be used as a 'fallback' encryption algorithm. That not a good choice - users of this code are likely to be fooling themselves about the security it offers.
  • videolan/vlc; four false positives all eliminated by the changes ( "des" is short for "description").
  • openssl/openssl; it appears to have a macro OPENSSL_NO_DES that has not been set, we're right to flag a problem in such a build. There's also use of RC2 (not controlled by a macro).
    • that said, we were reporting 777 results and are still reporting a good portion of those places. Too many!

The query was previously tagged error, medium, security`, so it is currently run but not displayed on LGTM; and is part of the security extended suite.

  • based on the above, I don't think we're quite ready to increase precision yet.

Performance figures (locally on my laptop, cleared cache for each run):

  • kamailo, main: 31.9s -> 83.4s (261%) --- the issue is MacroInvocation.getAGeneratedElement(), on this specific project only (as far as I can tell). It can be resolved by substituting the slightly simpler MacroInvocation.getAnExpandedElement() and inlining it, but I'm a bit reluctant to rush into making that library change. Though the query is quite a bit slower on this project, it's a long way from a timeout and is fine elsewhere. I think the right call is to leave this alone, unless we've seen problems elsewhere that this would also fix?
  • wireshark, main: 70.1s -> 69.6s (99%)
  • linux, main: 50.3s -> 66.8s (133%)
  • chakra, main: 15.7s -> 18.6s (118%)

@geoffw0 geoffw0 added the C++ label May 13, 2021
@geoffw0 geoffw0 requested a review from a team as a code owner May 13, 2021 15:23
@geoffw0

geoffw0 commented May 13, 2021

Copy link
Copy Markdown
Contributor Author

Re: the test failure and triple-DES:

It turns out we already had an internal test for this query which I didn't find when I looked for it - though it only has a handful of test cases anyway. There's a test case using CALG_3DES which is labelled as BAD, which we now don't label (and consequently fail the test) because triple-DES isn't on our list of bad algorithms. However it was previously accidentally flagged because the old regexp saw the string DES in 3DES.

It occurs to me that, though triple-DES was not listed as a bad algorithm before, the test case was labelled as bad - so I'm not really sure what the original design intent was. My own knowledge + brief research suggests that triple-DES is old and had a few known weaknesses, but I'm not sure if I'd condemn it as entirely 'broken'. According to wikipedia "Triple DES has been deprecated by NIST in 2017". SAMATE seems to consider it broken. I am keen to hear people's thoughts about whether I should add it to the list of "bad" algorithms?

Comment thread cpp/ql/src/Security/CWE/CWE-327/BrokenCryptoAlgorithm.ql Outdated
Comment thread cpp/ql/src/semmle/code/cpp/security/Encryption.qll Outdated
Comment thread cpp/ql/src/Security/CWE/CWE-327/BrokenCryptoAlgorithm.ql Outdated
Comment thread cpp/ql/src/semmle/code/cpp/security/Encryption.qll Outdated
@MathiasVP

Copy link
Copy Markdown
Contributor

Oops. Sorry for leaving a sequence of tiny reviews, @geoffw0!

@rdmarsh2

Copy link
Copy Markdown
Contributor

I just had a chat with @raulgarciamsft about 3DES - he pointed out a successful attack on it under laboratory conditions, and that it's no longer included in OpenSSL by default. Also, looking at the NIST deprecation, 3DES is disallowed in new FIPS-compliant applications. Given all of that, and that we're already flagging it, I'm comfortable continuing to do so unless we've gotten complaints from users of security-extended.qls.

Relatedly, we should update the link to NIST SP 800-131A to point to the 2019 revision, or perhaps include both (since the 2019 revision doesn't discuss algorithms that were disallowed by the 2015 revision).

@geoffw0

geoffw0 commented May 17, 2021

Copy link
Copy Markdown
Contributor Author

I've made some requested changes, a couple of minor fixes, and I've added triple-DES to the insecure list.

@MathiasVP

MathiasVP commented May 18, 2021

Copy link
Copy Markdown
Contributor

The code changes LGTM!

I had a quick look on [LGTM](https://lgtm.com/query/1949112823619742050/), and most of the result changes look good. I noticed a couple of true positive results that we lose, though:

This line in php/php-src:

retval = do_des(rawl, rawr, &l_out, &r_out, count, data);

and this line on ffmpeg/ffmpeg:

av_des_mac(av_des, oc->sm_val, &enc_header[pos], (oc->i_size >> 3));

perhaps MAC should also be specified as additional evidence? Or do we not want to flag MAC calls?

^ Scratch that. I should've read your initial post more carefully. Having said that, though: should the string MACalso be specified as additional evidence?

I also notice we now flag lots of subsequent lines in the same member function on arvidn/libtorrent: https://lgtm.com/query/3984724980973679017/. It looks like we might be flagging an overloaded operator-> and a copy constructor?

@geoffw0

geoffw0 commented May 18, 2021

Copy link
Copy Markdown
Contributor Author

After some discussion on Slack nobody seems keen to make changes to the Macro library just for better performance on kamailio. I've slightly simplified getAGeneratedElement to getAnExpandedElement as it's all we really need and may perform better in future even without making anything inline.

should the string MAC also be specified as additional evidence?

Yes I think so. Done.

I also notice we now flag lots of subsequent lines in the same member function on arvidn/libtorrent: https://lgtm.com/query/3984724980973679017/. It looks like we might be flagging an overloaded operator-> and a copy constructor?

Well spotted! I think we have to be careful about getting both pieces of evidence - particularly the 'additional evidence' - from the declaring type to avoid problems like this. I've made a change which I think addresses this. Thanks for pointing this out.

All concerns addressed, though I think I'll do a bit more testing...

@geoffw0

geoffw0 commented May 18, 2021

Copy link
Copy Markdown
Contributor Author

New diff run: https://lgtm.com/query/3601742662122109162/ (looks very similar to the previous diff discussed in the PR description)

SAMATE run (without a few of the latest commits, but otherwise confirming the new version does just as well on the CWE-327 test cases as before): https://jenkins.internal.semmle.com/job/Security/job/SAMATE/job/SAMATE-cpp-detailed/83/

@MathiasVP

Copy link
Copy Markdown
Contributor

Do you know why we lose this result on curl/curl?

DES_ecb_encrypt((DES_cblock*) plaintext, (DES_cblock*) results, DESKEY(ks), DES_ENCRYPT);

All other results LGTM! 👍

@geoffw0

geoffw0 commented May 19, 2021

Copy link
Copy Markdown
Contributor Author

Do you know why we lose this result on curl/curl?

DES_ecb_encrypt((DES_cblock*) plaintext, (DES_cblock*) results, DESKEY(ks), DES_ENCRYPT);

As far as I can tell those three results are still reported on the function calls, just not duplicated on those macros any more.

@MathiasVP

MathiasVP commented May 19, 2021

Copy link
Copy Markdown
Contributor

As far as I can tell those three results are still reported on the function calls, just not duplicated on those macros any more.

Oh, you're very correct! I had missed that in the diff run you posted. Thanks for clearing up my confusion 👍 I'm very happy with all the changes now, then! I'll happily merge it as soon as the internal PR is ready (and the change-note I guess?).

MathiasVP
MathiasVP previously approved these changes May 19, 2021
@geoffw0

geoffw0 commented May 19, 2021

Copy link
Copy Markdown
Contributor Author

Internal PR: https://github.com/github/semmle-code/pull/39406 . Again we haven't actually lost a result, just a duplicate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants