diff --git a/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_pandas.ipynb b/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_pandas.ipynb
new file mode 100644
index 00000000..88b226ce
--- /dev/null
+++ b/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_pandas.ipynb
@@ -0,0 +1,324 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "dd3c3113",
+ "metadata": {},
+ "source": [
+ "## 0) Prerequisites\n",
+ "\n",
+ "* Environment: **Python 3.10.15 / pandas 2.2.2**\n",
+ "* **Follow the specified signature exactly**\n",
+ "\n",
+ "  * Function name: `project_employees`\n",
+ "  * Argument names: `project`, `employee`\n",
+ "  * Returned columns: `[\"project_id\", \"average_years\"]`\n",
+ "  * Column order: as listed above\n",
+ "* No I/O (files or standard output); do not use `print` or `sort_values`\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 1) Problem\n",
+ "\n",
+ "* For each project, compute the average years of experience (`experience_years`)\n",
+ "  of the employees assigned to that project,\n",
+ "  **rounded to 2 decimal places**.\n",
+ "\n",
+ "* Input DataFrames:\n",
+ "\n",
+ "  * `project: pd.DataFrame`\n",
+ "\n",
+ "    | column      | dtype |\n",
+ "    | ----------- | ----- |\n",
+ "    | project_id  | int   |\n",
+ "    | employee_id | int   |\n",
+ "\n",
+ "    Each row means that employee `employee_id` is assigned to project `project_id`.\n",
+ "\n",
+ "  * `employee: pd.DataFrame`\n",
+ "\n",
+ "    | column           | dtype  |\n",
+ "    | ---------------- | ------ |\n",
+ "    | employee_id      | int    |\n",
+ "    | name             | object |\n",
+ "    | experience_years | int    |\n",
+ "\n",
+ "    Each row describes one employee. `experience_years` contains no NULLs.\n",
+ "\n",
+ "* Output:\n",
+ "\n",
+ "  * Return value: `pd.DataFrame`\n",
+ "\n",
+ "  * Columns and meaning:\n",
+ "\n",
+ "    * `project_id`: project ID\n",
+ "    * `average_years`: mean `experience_years` of the employees on that project (rounded to 2 decimal places)\n",
+ "\n",
+ "  * One row per `project_id`\n",
+ "\n",
+ "  * Row order is arbitrary (no sorting, since `sort_values` is not allowed)\n",
+ "\n",
+ "---\n",
+ "\n",
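+ "## 2) Implementation (following the specified signature)\n",
+ "\n",
+ "> Process in the order `merge` → `groupby.mean` → `round`, carrying only the columns that are needed.\n",
+ "> No within-group ranking or conditional filtering is required here, so a plain aggregation is enough.\n",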
+ "\n",
+ "```python\n",
+ "import pandas as pd\n",
+ "\n",
+ "def project_employees(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:\n",
+ "    \"\"\"\n",
+ "    Compute the average years of experience for each project.\n",
+ "\n",
+ "    Args:\n",
+ "        project (pd.DataFrame): columns ['project_id', 'employee_id']\n",
+ "        employee (pd.DataFrame): columns ['employee_id', 'name', 'experience_years']\n",
+ "\n",
+ "    Returns:\n",
+ "        pd.DataFrame: columns, in this order, ['project_id', 'average_years']\n",
+ "    \"\"\"\n",
+ "    # 1) Column pruning: keep only the employee columns needed for the mean\n",
+ "    emp_exp = employee[[\"employee_id\", \"experience_years\"]]\n",
+ "\n",
+ "    # 2) JOIN: attach experience_years to each project row\n",
+ "    merged = project.merge(emp_exp, on=\"employee_id\", how=\"left\")\n",
+ "\n",
+ "    # 3) Compute the mean per project\n",
+ "    out = (\n",
+ "        merged\n",
+ "        .groupby(\"project_id\", as_index=False)[\"experience_years\"]\n",
+ "        .mean()\n",
+ "    )\n",
+ "\n",
+ "    # 4) Rename the column per the spec and round to 2 decimal places\n",
+ "    out = out.rename(columns={\"experience_years\": \"average_years\"})\n",
+ "    out[\"average_years\"] = out[\"average_years\"].round(2)\n",
+ "\n",
+ "    return out\n",
+ "\n",
+ "# LeetCode submission result:\n",
+ "#   Runtime: 283 ms\n",
+ "#   Beats 62.61% (runtime)\n",
+ "#   Memory: 69.14 MB\n",
+ "#   Beats 18.00% (memory)\n",
+ "\n",
+ "```\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 3) Algorithm\n",
+ "\n",
+ "### APIs used\n",
+ "\n",
+ "* `DataFrame[...]`\n",
+ "  → Takes a column subset for **column pruning** (less memory, since unneeded columns are not carried along).\n",
+ "* `DataFrame.merge`\n",
+ "  → Joins `project` and `employee` on `employee_id`, attaching each employee's years of experience to the project rows.\n",
+ "* `DataFrame.groupby` + `GroupBy.mean`\n",
+ "  → Computes the mean of `experience_years` per `project_id`.\n",
+ "* `DataFrame.rename`\n",
+ "  → Renames the output column to `average_years`, as the problem specifies.\n",
+ "* `Series.round`\n",
+ "  → Rounds the mean to 2 decimal places.\n",
+ "\n",
+ "### Handling of NULLs, duplicates, and dtypes\n",
+ "\n",
+ "* Per the problem statement, `experience_years` has **no NULLs**, so `mean()` needs no NULL handling.\n",
+ "* `(project_id, employee_id)` is the primary key of `project`, so\n",
+ "  the same employee never appears twice in the same project, and duplicates cannot cause double counting.\n",
+ "* `mean()` returns `float64`.\n",
+ "  `round(2)` rounds the value itself to 2 decimal places; whether `2.0` is shown as `2.00` depends on the pandas display settings.\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 4) Complexity (approximate)\n",
+ "\n",
+ "Let `N = len(project)` and `M = len(employee)`.\n",
+ "\n",
+ "* Column pruning: `employee[[\"employee_id\", \"experience_years\"]]`\n",
+ "  → **O(M)**\n",
+ "* `merge` (assuming a hash join): `project.merge(emp_exp, on=\"employee_id\")`\n",
+ "  → roughly **O(N + M)**\n",
+ "* `groupby(\"project_id\").mean()`\n",
+ "  → **O(N)** (hash-based group aggregation)\n",
+ "\n",
+ "Overall, the time is roughly **O(N + M)**, and memory is dominated by\n",
+ "the temporary DataFrame produced by the merge (about N rows, 3 columns).\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 5) Diagram (Mermaid, ultra-conservative version)\n",
+ "\n",
+ "```mermaid\n",
+ "flowchart TD\n",
+ "    A[project<br/>project_id, employee_id]\n",
+ "    B[employee<br/>employee_id, experience_years, name]\n",
+ "    C[Column pruning<br/>employee → employee_id, experience_years only]\n",
+ "    D[merge<br/>on employee_id]\n",
+ "    E[\"groupby project_id<br/>mean(experience_years)\"]\n",
+ "    F[\"Rename column + round(2)<br/>average_years\"]\n",
+ "    G[Output<br/>project_id, average_years]\n",
+ "\n",
+ " B --> C\n",
+ " A --> D\n",
+ " C --> D\n",
+ " D --> E\n",
+ " E --> F\n",
+ " F --> G\n",
+ "```\n",
+ "\n",
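+ "In this form, the code should work when pasted as-is into the pandas version of LeetCode's \"Project Employees I\".\n",
+ "\n",
+ "To state the conclusion first:\n",
+ "\n",
+ "* **In terms of complexity, this is already close to the limit, so a large speedup is unlikely**\n",
+ "* However, **switching from `merge` to `map` leaves some room to reduce memory and slightly improve runtime**\n",
+ "* In particular, to improve memory usage (Beats 18%) even a little, the key is to make the join step lighter\n",
+ "\n",
+ "That is the overall picture.\n",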
+ "\n",
+ "---\n",
+ "\n",
+ "## 1) Summary of the current approach\n",
+ "\n",
+ "Your current code (abridged) is:\n",
+ "\n",
+ "```python\n",
+ "emp_exp = employee[[\"employee_id\", \"experience_years\"]]\n",
+ "\n",
+ "merged = project.merge(emp_exp, on=\"employee_id\", how=\"left\")\n",
+ "\n",
+ "out = (\n",
+ " merged\n",
+ " .groupby(\"project_id\", as_index=False)[\"experience_years\"]\n",
+ " .mean()\n",
+ ")\n",
+ "\n",
+ "out = out.rename(columns={\"experience_years\": \"average_years\"})\n",
+ "out[\"average_years\"] = out[\"average_years\"].round(2)\n",
+ "```\n",
+ "\n",
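+ "This is entirely correct, and algorithmically it costs\n",
+ "\n",
+ "* Join: O(N + M)\n",
+ "* groupby: O(N)\n",
+ "\n",
+ "so **nothing from here can improve the asymptotic order**.\n",
+ "\n",
+ "The LeetCode figures of 283 ms / Beats 62% are also,\n",
+ "allowing for environment noise, on the \"good enough\" side.\n",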
+ "\n",
+ "---\n",
+ "\n",
+ "## 2) Improvement: lighten the join with `merge` → `map`\n",
+ "\n",
+ "`project` already contains `employee_id`, so\n",
+ "\n",
+ "> rather than building a new joined DataFrame with `merge`,\n",
+ "> **build an `employee_id → experience_years` mapping and apply `map`**\n",
+ "\n",
+ "which can be slightly better for memory.\n",
+ "\n",
+ "### Revised code (`map`-based)\n",
+ "\n",
+ "```python\n",
+ "import pandas as pd\n",
+ "\n",
+ "def project_employees_i(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:\n",
+ "    \"\"\"\n",
+ "    Compute the average years of experience for each project.\n",
+ "\n",
+ "    Args:\n",
+ "        project (pd.DataFrame): columns ['project_id', 'employee_id']\n",
+ "        employee (pd.DataFrame): columns ['employee_id', 'name', 'experience_years']\n",
+ "\n",
+ "    Returns:\n",
+ "        pd.DataFrame: columns, in this order, ['project_id', 'average_years']\n",
+ "    \"\"\"\n",
+ "    # 1) Build an employee_id -> experience_years map (column pruning + indexing)\n",
+ "    emp_exp = employee.set_index(\"employee_id\")[\"experience_years\"]\n",
+ "\n",
+ "    # 2) Attach the matching experience_years to project (map instead of merge)\n",
+ "    #    Column pruning: use only the needed columns on the project side too\n",
+ "    proj = project[[\"project_id\", \"employee_id\"]].copy()\n",
+ "    proj[\"experience_years\"] = proj[\"employee_id\"].map(emp_exp)\n",
+ "\n",
+ "    # 3) Drop employee_id, which the aggregation does not need (a small memory saving)\n",
+ "    proj = proj[[\"project_id\", \"experience_years\"]]\n",
+ "\n",
+ "    # 4) Compute the mean per project\n",
+ "    out = (\n",
+ "        proj\n",
+ "        .groupby(\"project_id\", as_index=False)[\"experience_years\"]\n",
+ "        .mean()\n",
+ "    )\n",
+ "\n",
+ "    # 5) Rename the column and round to 2 decimal places\n",
+ "    out = out.rename(columns={\"experience_years\": \"average_years\"})\n",
+ "    out[\"average_years\"] = out[\"average_years\"].round(2)\n",
+ "\n",
+ "    return out\n",
+ "\n",
+ "# LeetCode submission result:\n",
+ "#   Runtime: 283 ms\n",
+ "#   Beats 62.61% (runtime)\n",
+ "#   Memory: 68.14 MB\n",
+ "#   Beats 98.61% (memory)\n",
+ "\n",
+ "```\n",
+ "\n",
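+ "### What this version aims for\n",
+ "\n",
+ "* `merge` builds a new DataFrame that holds the columns of both the left and right sides, so it tends to use more memory\n",
+ "* `map` only looks up one column of values from a Series, so it is a comparatively light way to join\n",
+ "* The `employee` side is turned into a `Series` via `set_index`,\n",
+ "  which is the minimal form of a **key → value** map\n",
+ "\n",
+ "This may not cut memory dramatically, but\n",
+ "\n",
+ "* the temporary objects created are somewhat smaller\n",
+ "* runtime may also improve slightly\n",
+ "\n",
+ "so it may raise the LeetCode memory percentile by a few points.\n",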
+ "\n",
+ "---\n",
+ "\n",
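+ "## 3) How far should the optimization go?\n",
+ "\n",
+ "Frankly, for this problem\n",
+ "\n",
+ "* the input size is not extreme\n",
+ "* the algorithm tops out at \"JOIN + GROUP BY\"\n",
+ "\n",
+ "so the solution is **already about as efficient as this kind of query gets**.\n",
+ "\n",
+ "Figures like 283 ms / Beats 62% do not change\n",
+ "dramatically from minor code tweaks, so\n",
+ "\n",
+ "* readability\n",
+ "* straightforwardness (no odd tricks)\n",
+ "* resistance to bugs\n",
+ "\n",
+ "should come first; adding a light improvement such as the `map` version above puts the solution in a good place.\n",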
+ "\n",
+ "---\n",
+ "\n",
+ "If, later on,\n",
+ "\n",
+ "* other LeetCode pandas problems\n",
+ "* heavier group operations (multiple conditions / top-k / window-function-like processing)\n",
+ "\n",
+ "come up, they make good practice material for combining **`groupby.transform` / `rank` / `merge` strategies**.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_postgres.ipynb b/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_postgres.ipynb
new file mode 100644
index 00000000..6958b6d3
--- /dev/null
+++ b/SQL/Leetcode/Basic join/1075. Project Employees I/gpt 5.1 thinking customized/Project_Employees_I_postgres.ipynb
@@ -0,0 +1,374 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "e334eb64",
+ "metadata": {},
+ "source": [
+ "## 0) Prerequisites\n",
+ "\n",
+ "* Engine: **PostgreSQL 16.6+**\n",
+ "* Row order: arbitrary\n",
+ "* `NOT IN` is not needed for this problem (nor are `EXISTS` or `LEFT JOIN ... IS NULL`)\n",
+ "* Evaluation is per `project_id`; the output follows the spec\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 1) Problem\n",
+ "\n",
+ "* For each project, compute the **average years of experience (`experience_years`)** of the employees assigned to it,\n",
+ "  and output it rounded to **2 decimal places**.\n",
+ "\n",
+ "* Input tables:\n",
+ "\n",
+ "  **Project**\n",
+ "\n",
+ "  | Column Name | Type |\n",
+ "  | ----------- | ---- |\n",
+ "  | project_id  | int  |\n",
+ "  | employee_id | int  |\n",
+ "\n",
+ "  * `(project_id, employee_id)` is the primary key\n",
+ "  * `employee_id` is a foreign key to `Employee.employee_id`\n",
+ "\n",
+ "  **Employee**\n",
+ "\n",
+ "  | Column Name      | Type    |\n",
+ "  | ---------------- | ------- |\n",
+ "  | employee_id      | int     |\n",
+ "  | name             | varchar |\n",
+ "  | experience_years | int     |\n",
+ "\n",
+ "  * `employee_id` is the primary key\n",
+ "  * `experience_years` is NOT NULL\n",
+ "\n",
+ "* Output:\n",
+ "\n",
+ "  | Column Name   | Description |\n",
+ "  | ------------- | ----------- |\n",
+ "  | project_id    | Project ID |\n",
+ "  | average_years | Average years of experience of the employees on that project, rounded to 2 decimal places |\n",
+ "\n",
+ "  * One row per `project_id`\n",
+ "  * Row order is arbitrary\n",
+ "\n",
+ "---\n",
+ "\n",
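+ "## 2) Optimal solution (single query)\n",
+ "\n",
+ "> As instructed, this version is built with CTEs and a window function (a plain `GROUP BY` would also solve this problem, but this follows the window-based template).\n",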
+ "\n",
+ "```sql\n",
+ "WITH pre AS (\n",
+ " SELECT\n",
+ " p.project_id,\n",
+ " e.experience_years\n",
+ " FROM project AS p\n",
+ " JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id\n",
+ "),\n",
+ "win AS (\n",
+ " SELECT\n",
+ " project_id,\n",
+ " AVG(experience_years) OVER (\n",
+ " PARTITION BY project_id\n",
+ " ) AS avg_experience_years\n",
+ " FROM pre\n",
+ ")\n",
+ "SELECT DISTINCT\n",
+ " project_id,\n",
+ " ROUND(avg_experience_years::numeric, 2) AS average_years\n",
+ "FROM win;\n",
+ "\n",
+ "-- LeetCode result: Runtime 446 ms\n",
+ "-- Beats 64.43%\n",
+ "\n",
+ "```\n",
+ "\n",
+ "### (Reference) A more straightforward version\n",
+ "\n",
+ "Without the template constraint, this version is simpler and easier to read in practice.\n",
+ "\n",
+ "```sql\n",
+ "SELECT\n",
+ " p.project_id,\n",
+ " ROUND(AVG(e.experience_years)::numeric, 2) AS average_years\n",
+ "FROM project AS p\n",
+ "JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id\n",
+ "GROUP BY\n",
+ " p.project_id;\n",
+ "\n",
+ "-- LeetCode result: Runtime 439 ms\n",
+ "-- Beats 72.22%\n",
+ "\n",
+ "```\n",
+ "\n",
+ "---\n",
+ "\n",
+ "## 3) Key points\n",
+ "\n",
+ "1. **Attaching employees (`pre` CTE)**\n",
+ "   `project` and `employee` are joined on `employee_id` to produce\n",
+ "   detail rows of project × employee years of experience.\n",
+ "\n",
+ " ```sql\n",
+ " SELECT\n",
+ " p.project_id,\n",
+ " e.experience_years\n",
+ " FROM project AS p\n",
+ " JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id;\n",
+ " ```\n",
+ "\n",
+ "2. **Per-project average via window aggregation (`win` CTE)**\n",
+ "   With `AVG(experience_years) OVER (PARTITION BY project_id)`,\n",
+ "   every row of a given `project_id` receives the same average (the window aggregate result).\n",
+ "\n",
+ " ```sql\n",
+ " AVG(experience_years) OVER (\n",
+ " PARTITION BY project_id\n",
+ " ) AS avg_experience_years\n",
+ " ```\n",
+ "\n",
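+ "3. **Reduce to one row per project with `DISTINCT`**\n",
+ "   `win` has one row per employee on each project, so\n",
+ "   `SELECT DISTINCT project_id, avg_experience_years` collapses them into a single row.\n",
+ "\n",
+ "4. **Rounding to 2 decimal places (`ROUND`)**\n",
+ "   In PostgreSQL, the two-argument form `ROUND(value, 2)` is defined for `numeric`,\n",
+ "   so the window result is cast to `numeric` before rounding (`AVG` over an `int` column already returns `numeric`, so the cast is only a safeguard).\n",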
+ "\n",
+ " ```sql\n",
+ " ROUND(avg_experience_years::numeric, 2) AS average_years\n",
+ " ```\n",
+ "\n",
+ "---\n",
+ "\n",
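+ "## 4) Complexity (approximate)\n",
+ "\n",
+ "Assume `Project` has `n` rows and `Employee` has `m` rows.\n",
+ "\n",
+ "* `JOIN` (hash join or nested loop, assuming an index on `employee_id`)\n",
+ "  → roughly **O(n + m)** to **O(n log m)**\n",
+ "* Window function `AVG() OVER (PARTITION BY project_id)`\n",
+ "  → with `n_g` rows per project, the aggregation itself is **O(Σ n_g) = O(n)**; if the input must first be sorted by `project_id`, that single sort adds O(n log n)\n",
+ "* Overall, as long as the indexes are used, this should stay **close to linear, around O(n + m)**\n",
+ "\n",
+ "For a problem of this size (LeetCode level), either version is fast enough.\n",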
+ "\n",
+ "---\n",
+ "\n",
+ "## 5) Diagram (Mermaid, ultra-conservative version)\n",
+ "\n",
+ "```mermaid\n",
+ "flowchart TD\n",
+ "    A[Project<br/>project_id, employee_id]\n",
+ "    B[Employee<br/>employee_id, experience_years]\n",
+ "    C[CTE pre<br/>project_id + experience_years]\n",
+ "    D[\"CTE win<br/>AVG(...) OVER (PARTITION BY project_id)\"]\n",
+ "    E[\"SELECT DISTINCT<br/>project_id, ROUND(avg, 2)\"]\n",
+ "    F[Output<br/>project_id, average_years]\n",
+ "\n",
+ " A -->|JOIN employee_id| C\n",
+ " B -->|JOIN employee_id| C\n",
+ " C --> D\n",
+ " D --> E\n",
+ " E --> F\n",
+ "```\n",
+ "\n",
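+ "So the flow is: JOIN → average per project → round and output.\n",
+ "\n",
+ "To state the conclusion first:\n",
+ "\n",
+ "* **The query is already close to optimal**\n",
+ "* 446 ms vs 439 ms is **within the margin of error (execution-environment noise)**\n",
+ "* Any improvement would be about **simplicity and removing waste, not the algorithm**\n",
+ "\n",
+ "That is about all there is to it.\n",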
+ "\n",
+ "---\n",
+ "\n",
+ "## 1. Evaluating the two queries\n",
+ "\n",
+ "### ① CTE + window version\n",
+ "\n",
+ "```sql\n",
+ "WITH pre AS (\n",
+ " SELECT\n",
+ " p.project_id,\n",
+ " e.experience_years\n",
+ " FROM project AS p\n",
+ " JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id\n",
+ "),\n",
+ "win AS (\n",
+ " SELECT\n",
+ " project_id,\n",
+ " AVG(experience_years) OVER (\n",
+ " PARTITION BY project_id\n",
+ " ) AS avg_experience_years\n",
+ " FROM pre\n",
+ ")\n",
+ "SELECT DISTINCT\n",
+ " project_id,\n",
+ " ROUND(avg_experience_years::numeric, 2) AS average_years\n",
+ "FROM win;\n",
+ "```\n",
+ "\n",
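+ "* `AVG(...) OVER (PARTITION BY ...)` **attaches the same average to every row**\n",
+ "* `DISTINCT` then collapses the duplicates into one row\n",
+ "  → **This is functionally correct**, but since only one row per project is needed,\n",
+ "  it takes a detour: *replicate the value with a window function, then remove the copies with DISTINCT*.\n",
+ "\n",
+ "### ② GROUP BY version\n",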
+ "\n",
+ "```sql\n",
+ "SELECT\n",
+ " p.project_id,\n",
+ " ROUND(AVG(e.experience_years)::numeric, 2) AS average_years\n",
+ "FROM project AS p\n",
+ "JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id\n",
+ "GROUP BY\n",
+ " p.project_id;\n",
+ "```\n",
+ "\n",
+ "* This one **directly produces one row per group**\n",
+ "* The processing flow is shorter both logically and physically, and easier for the planner to optimize\n",
+ "\n",
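+ "The runtime difference is small most likely because the test data is small and fixed overhead dominates;\n",
+ "the window + DISTINCT version still computes the window aggregate for every row and then de-duplicates.\n",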
+ "\n",
+ "---\n",
+ "\n",
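+ "## 2. About the runtimes: 446 ms vs 439 ms\n",
+ "\n",
+ "LeetCode-style runtimes\n",
+ "\n",
+ "* **can easily vary by about ±10 to 20 ms** across repeated runs of the same query\n",
+ "* depend on load from other users and the state of the environment\n",
+ "\n",
+ "so **446 ms vs 439 ms can safely be treated as noise**.\n",
+ "\n",
+ "\"Beats 64% → 72%\" is likewise a relative comparison within the platform, and may mix in factors such as\n",
+ "\n",
+ "* different DB engines (MySQL, Oracle, etc.)\n",
+ "* different indexing assumptions\n",
+ "* differences in language implementation and driver overhead\n",
+ "\n",
+ "among others.\n",
+ "From a PostgreSQL standpoint, the current GROUP BY query is good enough.\n",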
+ "\n",
+ "---\n",
+ "\n",
+ "## 3. Small things that can be tidied up (PostgreSQL-specific)\n",
+ "\n",
+ "### 3.1 Remove the unnecessary cast\n",
+ "\n",
+ "In PostgreSQL,\n",
+ "\n",
+ "* `AVG(int)` **already returns `numeric`**\n",
+ "\n",
+ "so the `::numeric` cast is not actually needed.\n",
+ "\n",
+ "```sql\n",
+ "SELECT\n",
+ " p.project_id,\n",
+ " ROUND(AVG(e.experience_years), 2) AS average_years\n",
+ "FROM project AS p\n",
+ "JOIN employee AS e\n",
+ " ON e.employee_id = p.employee_id\n",
+ "GROUP BY\n",
+ " p.project_id;\n",
+ "```\n",
+ "\n",
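+ "The meaning is exactly the same.\n",
+ "(There is almost no performance difference, but it is cleaner in the sense of not doing unnecessary work.)\n",
+ "\n",
+ "### 3.2 Slightly more readable with `USING`\n",
+ "\n",
+ "PostgreSQL supports `JOIN ... USING (...)`, which I personally also like.\n",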
+ "\n",
+ "```sql\n",
+ "SELECT\n",
+ " p.project_id,\n",
+ " ROUND(AVG(e.experience_years), 2) AS average_years\n",
+ "FROM project AS p\n",
+ "JOIN employee AS e USING (employee_id)\n",
+ "GROUP BY\n",
+ " p.project_id;\n",
+ "\n",
+ "-- LeetCode result: Runtime 449 ms\n",
+ "-- Beats 61.92%\n",
+ "\n",
+ "```\n",
+ "\n",
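+ "Performance is about the same here too, with these advantages:\n",
+ "\n",
+ "* it is easier to read when the join condition is a single key\n",
+ "* the repetitive `ON p.employee_id = e.employee_id` goes away\n",
+ "\n",
+ "Both are about readability, not speed.\n",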
+ "\n",
+ "---\n",
+ "\n",
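+ "## 4. Is there any fundamental room for tuning?\n",
+ "\n",
+ "Given the structure of this problem, what the database can do mostly comes down to two points.\n",
+ "\n",
+ "1. **How efficiently the JOIN is performed**\n",
+ "2. **How efficiently the aggregation that computes the average per project_id is performed**\n",
+ "\n",
+ "And under these assumptions:\n",
+ "\n",
+ "* `Employee.employee_id` is the PK → **it has an index**\n",
+ "* `Project(project_id, employee_id)` is also the PK → **it has a (project_id, employee_id) index**\n",
+ "* The `GROUP BY project_id` column is already the leading column of that index\n",
+ "\n",
+ "So the query is well served in terms of execution plans, and\n",
+ "**any further runtime reduction falls into DB tuning territory (memory, parallelism, statistics, and so on)**.\n",
+ "Those cannot be touched in an environment like LeetCode, so there is almost no room left at the query level.\n",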
+ "\n",
+ "---\n",
+ "\n",
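+ "## 5. Summary\n",
+ "\n",
+ "* **CTE + window version**\n",
+ "\n",
+ "  * The logic is clean, but for this problem it takes a roundabout route\n",
+ "* **GROUP BY version**\n",
+ "\n",
+ "  * The most direct fit for the problem, and the form generally preferred in practice\n",
+ "  * Its measured runtime was also marginally better (within noise), and it is easier for the planner to handle\n",
+ "\n",
+ "As the final form, adopting this should be fine:\n",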
+ "\n",
+ "```sql\n",
+ "SELECT\n",
+ " p.project_id,\n",
+ " ROUND(AVG(e.experience_years), 2) AS average_years\n",
+ "FROM project AS p\n",
+ "JOIN employee AS e USING (employee_id)\n",
+ "GROUP BY\n",
+ " p.project_id;\n",
+ "```\n",
+ "\n",
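+ "To build a sharper sense of optimization as a next step, look at:\n",
+ "\n",
+ "* the output of `EXPLAIN (ANALYZE, BUFFERS)`\n",
+ "* `HashAggregate` / `GroupAggregate` and which indexes are used\n",
+ "* the runtime breakdown (is the JOIN or the aggregation the heavy part?)\n",
+ "\n",
+ "Practicing reading these builds skills that are closer to real-world work 💪"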
+ ]
+ }
+ ],
+ "metadata": {
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}