Use filtering queries to do batched AI querying by starcke · Pull Request #2670 · github/vscode-codeql

starcke · 2023-08-03T13:43:54Z

This implements batched LLM querying with the following logic:

From the external API methods in context, get a slice (size 20) of unmodeled methods to consider as candidates.
Create a filter pack for those candidates.
Run the automodel candidate queries with that filter pack in context.
Send the resulting SARIF to the LLM.
Post-process the results from the LLM, and treat all candidates without an explicit classification as neutral elements.
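
The candidate selection and post-processing steps above can be sketched roughly as follows. This is a minimal illustration rather than the code from this pull request: the `ApiMethod` shape, the `Classification` values, and the `selectCandidates` and `fillInNeutral` helpers are all hypothetical names chosen for the example.

```typescript
// Hypothetical, simplified stand-in for the extension's external API usage type.
interface ApiMethod {
  signature: string;
  isModeled: boolean;
}

// Assumed set of classifications; "neutral" is the default for unclassified candidates.
type Classification = "source" | "sink" | "summary" | "neutral";

const BATCH_SIZE = 20;

// Take the first `batchSize` methods that do not have a model yet.
function selectCandidates(
  methods: ApiMethod[],
  batchSize: number = BATCH_SIZE,
): ApiMethod[] {
  return methods.filter((m) => !m.isModeled).slice(0, batchSize);
}

// Keep the LLM's classification where one exists; otherwise mark the candidate as neutral.
function fillInNeutral(
  candidates: ApiMethod[],
  llmResults: Map<string, Classification>,
): Map<string, Classification> {
  const result = new Map<string, Classification>();
  for (const candidate of candidates) {
    result.set(
      candidate.signature,
      llmResults.get(candidate.signature) ?? "neutral",
    );
  }
  return result;
}
```

Defaulting to neutral means every candidate in a batch ends up with a classification, so it would not be picked up again as an unmodeled method in the next batch.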

Checklist

CHANGELOG.md has been updated to incorporate all user visible changes made by this pull request.
Issues have been created for any UI or other user-facing changes made by this pull request.
[Maintainers only] If this pull request makes user-facing changes that require documentation changes, open a corresponding docs pull request in the github/codeql repo and add the ready-for-doc-review label there.

charisk

Looks good to me and works. Just some stylistic, optional suggestions.

charisk · 2023-08-04T13:13:25Z

+
+describe("getCandidates", () => {
+  it("doesnt return methods that are already modelled", () => {
+    const externalApiUsages: ExternalApiUsage[] = [];


Any reason you're initialising and then doing push? Can just do it in one line.

const externalApiUsages: ExternalApiUsage[] = [ { ... } ];

Yeah I probably just copied it from the limit case. I'll change it.

charisk · 2023-08-04T13:15:39Z

+      externalApiUsages.push({
+        library: "my.jar",
+        signature: `org.my.A#x${i}()`,
+


Random empty line

charisk · 2023-08-04T13:20:01Z

+    method.methodParameters,
+  ]);
+
+  const filter = {


Do we have a type we could use here to help out with some type safety? Same for the syntheticConfigPath

We decided to do this in a followup.
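
For the follow-up, one possible shape for such a type is sketched below. It is an assumption based only on the fields visible in the diff (`extensions`, `addsTo`, `pack`) plus the general layout of CodeQL data extensions; the tuple layout, the `buildCandidateFilter` helper, and the extensible name are hypothetical.

```typescript
// Assumed row layout for a candidate: package, type, name, parameters.
type CandidateTuple = [string, string, string, string];

// One data extension entry: which extensible predicate it adds to, and the rows to add.
interface DataExtension<Row extends unknown[]> {
  addsTo: {
    pack: string;
    extensible: string;
  };
  data: Row[];
}

// Top-level contents of a data extension file.
interface DataExtensionFile<Row extends unknown[]> {
  extensions: Array<DataExtension<Row>>;
}

function buildCandidateFilter(
  language: string,
  rows: CandidateTuple[],
): DataExtensionFile<CandidateTuple> {
  return {
    extensions: [
      {
        addsTo: {
          pack: `codeql/${language}-queries`,
          // Placeholder predicate name, not confirmed by this pull request.
          extensible: "automodelCandidateFilter",
        },
        data: rows,
      },
    ],
  };
}
```

With a type like this, a misspelled key or a row with the wrong number of columns would be caught at compile time instead of when the query runs.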

adityasharad

I really like the use of data extensions to configure the batch requests. Some suggestions on the pack aspects; I haven't looked closely at the rest.

adityasharad · 2023-08-04T15:43:59Z

+    extensions: [
+      {
+        addsTo: {
+          pack: `codeql/${language}-queries`,


In future this will be the automodel queries pack.

adityasharad · 2023-08-04T15:45:50Z

+ * @param candidateMethods
+ * @returns
+ */
+export async function generateCandidateFilterPack(


Possible performance optimisation: would it make sense to generate the pack in a temp location once up front, at the start of using the data extensions editor, but update the data extensions within the pack with a different set of candidate methods each time you need to run a filter? That might be slightly faster.

Any chance that there can be multiple runs happening at the same time? Also, need to make sure that the temp folder is different for each open vscode window.

I think we use this approach in a few other places, so I'll defer changes to a followup. I have created an issue to discuss this with the team.
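
To make the suggestion concrete for the follow-up discussion, here is a rough sketch of a pack that is created once in a unique temporary directory and only has its data extension file rewritten per batch. The `ReusableFilterPack` class, the file names, and the manifest contents are assumptions for illustration, not the extension's actual implementation. JSON is written instead of YAML to avoid a YAML dependency, on the basis that JSON documents are also valid YAML.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

class ReusableFilterPack {
  private readonly dir: string;

  constructor(prefix = "candidate-filter-pack-") {
    // mkdtempSync appends random characters to the prefix, so every instance
    // (for example, one per VS Code window) gets its own directory.
    this.dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));

    // The manifest is written once up front. Its name and fields are hypothetical.
    const manifest = {
      name: "hypothetical/candidate-filter",
      version: "0.0.0",
      library: true,
      dataExtensions: ["filter.yml"],
    };
    fs.writeFileSync(
      path.join(this.dir, "codeql-pack.yml"),
      JSON.stringify(manifest),
    );
  }

  get location(): string {
    return this.dir;
  }

  // Overwrite only the data extension file with the next batch of candidates.
  update(extensionFile: unknown): void {
    fs.writeFileSync(
      path.join(this.dir, "filter.yml"),
      JSON.stringify(extensionFile),
    );
  }

  // Remove the temporary directory when the editor is closed.
  dispose(): void {
    fs.rmSync(this.dir, { recursive: true, force: true });
  }
}
```

A unique directory per instance addresses the multiple-window concern, but two runs sharing one instance would still overwrite the same `filter.yml`, so concurrent runs within a window would need either serialisation or one pack per run.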

starcke changed the title ~~Use filtering queries to do batched AI quering.~~ Use filtering queries to do batched AI querying Aug 3, 2023

starcke mentioned this pull request Aug 4, 2023

Draft PR showcasing how to use filtered queries in the extension. #2661

Closed

3 tasks

starcke force-pushed the starcke/apply-slice-filter branch 4 times, most recently from db8d942 to edbcae8 Compare August 4, 2023 09:46

Use filtering queries to do batched AI quering.

f6c492d

starcke force-pushed the starcke/apply-slice-filter branch from edbcae8 to f6c492d Compare August 4, 2023 11:10

starcke marked this pull request as ready for review August 4, 2023 12:15

starcke requested a review from a team as a code owner August 4, 2023 12:15

charisk approved these changes Aug 4, 2023

charisk reviewed Aug 4, 2023

starcke and others added 2 commits August 4, 2023 15:35

Apply suggestions from code review

12abf81

Co-authored-by: Charis Kyriakou <charisk@users.noreply.github.com>

Address comments.

d4137b2

adityasharad reviewed Aug 4, 2023

Update extensions/ql-vscode/src/data-extensions-editor/auto-model-cod…

9bd2286

…eml-queries.ts Co-authored-by: Aditya Sharad <6874315+adityasharad@users.noreply.github.com>

starcke merged commit 234760e into main Aug 7, 2023

starcke deleted the starcke/apply-slice-filter branch August 7, 2023 09:37

starcke mentioned this pull request Aug 3, 2023

Add option to filter automodel queries github/codeql#13852

Merged

4 participants
