Groupby behaves differently depending on the order of the columns #396

@igonro

Describe the bug
Depending on the order in which the columns are given when creating a DataFrame, groupby() either works properly or throws an error.

To Reproduce
This column order works perfectly:

let data = {
    worker: ["david", "david", "john", "alice", "john", "david"],
    hours: [5, 6, 2, 8, 4, 3],
    day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};
let df = new dfd.DataFrame(data);

df.groupby(["day"]).col(["hours"]).sum().print()

// ╔════════════╤═══════════════════╤═══════════════════╗
// ║            │ day               │ hours_sum         ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 0          │ monday            │ 5                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 1          │ tuesday           │ 6                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 2          │ wednesday         │ 2                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 3          │ thursday          │ 8                 ║
// ╟────────────┼───────────────────┼───────────────────╢
// ║ 4          │ friday            │ 7                 ║
// ╚════════════╧═══════════════════╧═══════════════════╝

df.groupby(["worker"]).count().print()
// ╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
// ║            │ worker            │ hours_count       │ day_count         ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 0          │ david             │ 3                 │ 3                 ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 1          │ john              │ 2                 │ 2                 ║
// ╟────────────┼───────────────────┼───────────────────┼───────────────────╢
// ║ 2          │ alice             │ 1                 │ 1                 ║
// ╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╝

But when I move the hours column to the first position, both calls throw an error:

let data = {
    hours: [5, 6, 2, 8, 4, 3],
    worker: ["david", "david", "john", "alice", "john", "david"],
    day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};
let df = new dfd.DataFrame(data);

df.groupby(["day"]).col(["hours"]).sum().print()
// Uncaught Error: Can't perform math operation on column hours
//    arithemetic groupby.ts:266
//    operations groupby.ts:417
//    count groupby.ts:431

df.groupby(["worker"]).count().print()
// Uncaught Error: Can't perform math operation on column hours
//    arithemetic groupby.ts:266
//    operations groupby.ts:417
//    count groupby.ts:431

Expected behavior
I would expect the result to be the same regardless of the order of the columns.
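
To make the expected result concrete, here is a minimal plain-JavaScript sketch of an order-independent group-and-sum. It does not use danfo.js at all, and `groupBySum` is a hypothetical helper written only for this illustration, so it says nothing about how the library implements groupby(). It only shows that both column orders from the report should give the same sums.

```javascript
// Reference illustration only: plain JavaScript, no danfo.js involved.
// groupBySum is a hypothetical helper, not part of any library.
function groupBySum(data, keyCol, valueCol) {
  // A Map keeps the group keys in first-seen order.
  const totals = new Map();
  data[keyCol].forEach((key, i) => {
    totals.set(key, (totals.get(key) ?? 0) + data[valueCol][i]);
  });
  return Object.fromEntries(totals);
}

// Column order that works in the report.
const workerFirst = {
  worker: ["david", "david", "john", "alice", "john", "david"],
  hours: [5, 6, 2, 8, 4, 3],
  day: ["monday", "tuesday", "wednesday", "thursday", "friday", "friday"],
};

// Same columns with hours first, the order that fails in the report.
const hoursFirst = {
  hours: workerFirst.hours,
  worker: workerFirst.worker,
  day: workerFirst.day,
};

// Columns are looked up by name, so their position cannot matter.
console.log(groupBySum(workerFirst, "day", "hours"));
console.log(groupBySum(hoursFirst, "day", "hours"));
// both print { monday: 5, tuesday: 6, wednesday: 2, thursday: 8, friday: 7 }
```

The sums match the hours_sum table from the working example above.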

Desktop (please complete the following information):

  • OS: Windows 11
  • Browser: Firefox v97.0.1, Chrome v98.0.4758.102, Edge v98.0.1108.56
  • Version: -

Additional context
I'm using the browser version, not the Node.js one.
