
Conversation

@kylasa
Collaborator

@kylasa kylasa commented Mar 10, 2023

Description

Webgraph dataset edge files use '\t' as the delimiter, but the pipeline assumes ' ' by default and therefore fails to read webgraph files.
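
For illustration, a minimal sketch of how a tab delimiter might be declared for an edge type (the "format"/"delimiter" keys mirror the test setup in this PR; the format name below is hypothetical):

edges = {
    "n1:e1:n1": {
        "format": {
            "name": "csv",       # hypothetical format name
            "delimiter": "\t",   # webgraph edge files are tab-separated
        },
    },
}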

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small; read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@kylasa kylasa requested a review from Rhett-Ying March 10, 2023 23:19
@kylasa kylasa self-assigned this Mar 10, 2023
@dgl-bot
Collaborator

dgl-bot commented Mar 10, 2023

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Mar 11, 2023

Commit ID: 30192c4

Build ID: 1

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@frozenbugs frozenbugs requested review from frozenbugs and removed request for Rhett-Ying March 13, 2023 02:51
@dgl-bot
Collaborator

dgl-bot commented Mar 15, 2023

Commit ID: b928ddb6f8032295950eb2d6b9bdffded62e3d21

Build ID: 2

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 15, 2023

Commit ID: cb2d273

Build ID: 3

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

kylasa added 2 commits April 20, 2023 10:36
1. Reverting to the original pytest_utils.py. This removes the random delimiter used when creating edge files.
@dgl-bot
Collaborator

dgl-bot commented Apr 20, 2023

Commit ID: 9957ead6707b97478b307e106e64b666126b767d

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 20, 2023

Commit ID: 2b6094be13127c666b1873734d270bb0fe1dfacf

Build ID: 5

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 25, 2023

Commit ID: ad9abfbe8a3e83e3b648d01373a668cac448eff4

Build ID: 6

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 25, 2023

Commit ID: d01a283

Build ID: 7

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 27, 2023

Commit ID: aa80a6e

Build ID: 8

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

Contributor

@thvasilo thvasilo left a comment

I've mentioned this before: the amount of setup required for the tests is very large, and they seem to duplicate the implementation. As a result the tests are brittle, and refactoring the code will almost certainly force us to change the tests as well.

I'm also concerned that the values noted in the comments do not seem to follow an expected pattern, yet the tests pass. Are we sure we are testing the right things here?

Please take another look at the comments, fix what is needed, and we can take a second look.

Let's ask the question: what's the minimum amount of setup possible to test the functionality that this code change adds?

assert np.all(exp_etype_ids == edge_dict[constants.ETYPE_ID])

# validate edge_tids here.
assert edge_tids["n1:e1:n1"][0] == (
Contributor

For constants like "n1:e1:n1", it's better to define them as a module-level constant, e.g. EDGE_TYPE = "n1:e1:n1", and use the variable throughout the code.
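
For example (a sketch; expected_tid_range is a placeholder for the existing expected value):

EDGE_TYPE = "n1:e1:n1"  # module-level constant, reused across the test

assert edge_tids[EDGE_TYPE][0] == expected_tid_range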



def _validate_edges(
rank, world_size, num_chunks, edge_dict, edge_tids, schema_map
Contributor

edge_dict is not in the docstring

edge_feats.append(data)

if len(edge_feats) == 0:
actual_results = edge_features["n1:e1:n1/edge_feat_1/0"]
Contributor

Same here: define constants and use something like f"{EDGE_TYPE}/{FEATURE_NAME}/0".
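
For example (a sketch; the constant names are illustrative):

EDGE_TYPE = "n1:e1:n1"
FEATURE_NAME = "edge_feat_1"

actual_results = edge_features[f"{EDGE_TYPE}/{FEATURE_NAME}/0"]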

else:
edge_feats = np.concatenate(edge_feats)

# assert
Contributor

Remove

edge_feats = np.concatenate(edge_feats)

# assert
assert np.all(
Contributor

Use numpy.testing.assert_array_equal
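
For example (the variable names are placeholders for the arrays compared in the existing assert):

# Fails with a readable element-wise diff instead of a bare AssertionError.
np.testing.assert_array_equal(edge_feats, actual_results)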

Comment on lines +175 to +182
schema["edge_type"] = ["n1:e1:n1"]
schema["node_type"] = ["n1"]

edges = {}
edges["n1:e1:n1"] = {}
edges["n1:e1:n1"]["format"] = {}
edges["n1:e1:n1"]["format"]["name"] = edge_fmt
edges["n1:e1:n1"]["format"]["delimiter"] = edge_fmt_del
Contributor

Use constants
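
A sketch of the same setup with constants (names are illustrative):

EDGE_TYPE = "n1:e1:n1"
NODE_TYPE = "n1"

schema["edge_type"] = [EDGE_TYPE]
schema["node_type"] = [NODE_TYPE]

edges = {
    EDGE_TYPE: {
        "format": {"name": edge_fmt, "delimiter": edge_fmt_del},
    },
}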


# Make sure that the spawned process, mimicking ranks/workers, did
# not generate any errors or assertion failures
assert len(sh_dict) == 0, f"Spawned processes reported some errors !!!"
Contributor

How does the user know which errors were encountered?
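
One option (a sketch; this assumes sh_dict maps a rank/worker id to the error it recorded) is to include the collected errors in the assertion message:

assert len(sh_dict) == 0, (
    f"Spawned processes reported errors: {dict(sh_dict)}"
)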



@pytest.mark.parametrize(
"world_size, num_chunks, num_parts", [[1, 1, 4], [4, 1, 4]]
Contributor

Why do we keep num_chunks at 1? Can we add one case for [4, 4, 4]?
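
For example, adding the multi-chunk case alongside the existing ones:

@pytest.mark.parametrize(
    "world_size, num_chunks, num_parts",
    [[1, 1, 4], [4, 1, 4], [4, 4, 4]],
)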

Comment on lines +88 to +90
data = np.arange(10, dtype=np.int32)
for _ in range(9):
data = np.vstack((data, np.arange(10, dtype=np.int32) + 10 * idx))
Contributor

Is this correct?

idx = 0
In [40]: data
Out[40]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], dtype=int32)

idx = 1
In [43]: data
Out[43]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]], dtype=int32)

In [44]: idx = 2

In [46]: data
Out[46]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]], dtype=int32)
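
If the intent is that all ten rows of a chunk carry the same 10 * idx offset (an assumption; the PR does not say), a construction that avoids the un-shifted first row could look like:

# Every row gets the per-chunk offset, including the first one.
data = np.vstack(
    [np.arange(10, dtype=np.int32) + 10 * idx for _ in range(10)]
)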

Comment on lines +129 to +134
data = np.arange(100, 110, dtype=np.int64)
for _ in range(9):
data = np.vstack(
(data, np.arange(100, 110, dtype=np.int64) + 100 * idx)
)
node_feats.append(data)
Contributor

Is this correct?

idx = 0

In [48]: data = np.arange(100, 110, dtype=np.int64)
    ...: for _ in range(9):
    ...:     data = np.vstack(
    ...:         (data, np.arange(100, 110, dtype=np.int64) + 100 * idx)
    ...:     )
    ...: data
Out[48]:
array([[100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]])

In [49]: idx = 1

In [50]: data = np.arange(100, 110, dtype=np.int64)
    ...: for _ in range(9):
    ...:     data = np.vstack(
    ...:         (data, np.arange(100, 110, dtype=np.int64) + 100 * idx)
    ...:     )
    ...: data
Out[50]:
array([[100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209],
       [200, 201, 202, 203, 204, 205, 206, 207, 208, 209]])

In [51]: idx = 2

In [52]: data = np.arange(100, 110, dtype=np.int64)
    ...: for _ in range(9):
    ...:     data = np.vstack(
    ...:         (data, np.arange(100, 110, dtype=np.int64) + 100 * idx)
    ...:     )
    ...: data
Out[52]:
array([[100, 101, 102, 103, 104, 105, 106, 107, 108, 109],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309],
       [300, 301, 302, 303, 304, 305, 306, 307, 308, 309]])
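
The analogous change here, under the same assumption about the intended pattern, would offset every row by 100 * idx:

data = np.vstack(
    [np.arange(100, 110, dtype=np.int64) + 100 * idx for _ in range(10)]
)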
