Skip to content

Use filtering queries to do batched AI querying#2670

Merged
starcke merged 4 commits into
mainfrom
starcke/apply-slice-filter
Aug 7, 2023
Merged

Use filtering queries to do batched AI querying#2670
starcke merged 4 commits into
mainfrom
starcke/apply-slice-filter

Conversation

@starcke

@starcke starcke commented Aug 3, 2023

Copy link
Copy Markdown
Contributor

This implements batched LLM querying with the following logic:

  • From the external API methods in context, get a slice (size 20) of un-modeled methods to consider candidates.
  • Create an filter-pack for those candidates.
  • Run the automodel candidate queries with that filter-pack in context.
  • Send SARIF to LLM.
  • Post-progress the results from the LLM, and consider all candidates without an explicit classification as netutral elements.

Checklist

  • CHANGELOG.md has been updated to incorporate all user visible changes made by this pull request.
  • Issues have been created for any UI or other user-facing changes made by this pull request.
  • [Maintainers only] If this pull request makes user-facing changes that require documentation changes, open a corresponding docs pull request in the github/codeql repo and add the ready-for-doc-review label there.

@starcke starcke changed the title Use filtering queries to do batched AI quering. Use filtering queries to do batched AI querying Aug 3, 2023
@starcke starcke force-pushed the starcke/apply-slice-filter branch 4 times, most recently from db8d942 to edbcae8 Compare August 4, 2023 09:46
@starcke starcke force-pushed the starcke/apply-slice-filter branch from edbcae8 to f6c492d Compare August 4, 2023 11:10
@starcke starcke marked this pull request as ready for review August 4, 2023 12:15
@starcke starcke requested a review from a team as a code owner August 4, 2023 12:15

@charisk charisk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and works. Just some stylistic, optional suggestions.

Comment thread extensions/ql-vscode/test/unit-tests/data-extensions-editor/auto-model-v2.test.ts Outdated

describe("getCandidates", () => {
it("doesnt return methods that are already modelled", () => {
const externalApiUsages: ExternalApiUsage[] = [];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you're initialising and then doing push? Can just do it in one line.

const externalApiUsages: ExternalApiUsage[] = [ 
  {
    ...
  } 
];

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I probably just copied it from the limit case. I'll change it.

Comment thread extensions/ql-vscode/test/unit-tests/data-extensions-editor/auto-model-v2.test.ts Outdated
Comment thread extensions/ql-vscode/test/unit-tests/data-extensions-editor/auto-model-v2.test.ts Outdated
externalApiUsages.push({
library: "my.jar",
signature: `org.my.A#x${i}()`,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random empty line

method.methodParameters,
]);

const filter = {

@charisk charisk Aug 4, 2023

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a type we could use here to help out with some type safety? Same for the syntheticConfigPath

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to do this in a followup.

starcke and others added 2 commits August 4, 2023 15:35
Co-authored-by: Charis Kyriakou <charisk@users.noreply.github.com>

@adityasharad adityasharad left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the use of data extensions to configure the batch requests. Some suggestions on the pack aspects; I haven't looked closely at the rest.

Comment thread extensions/ql-vscode/src/data-extensions-editor/auto-model-codeml-queries.ts Outdated
extensions: [
{
addsTo: {
pack: `codeql/${language}-queries`,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future this will be the automodel queries pack.

* @param candidateMethods
* @returns
*/
export async function generateCandidateFilterPack(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible performance optimisation: would it make sense to generate the pack in a temp location once up front, at the start of using the data extensions editor, but update the data extensions within the pack with a different set of candidate methods each time you need to run a filter? That might be slightly faster.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance that there can be multiple runs happening at the same time? Also, need to make sure that the temp folder is different for each open vscode window.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we use this approach in a few other places, so I'll defer changes to a followup. I have created an issue to discuss this with the team.

…eml-queries.ts

Co-authored-by: Aditya Sharad <6874315+adityasharad@users.noreply.github.com>
@starcke starcke merged commit 234760e into main Aug 7, 2023
@starcke starcke deleted the starcke/apply-slice-filter branch August 7, 2023 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants